| title | date | weight | geekdocRepo | geekdocEditPath | geekdocFilePath |
|---|---|---|---|---|---|
| Migration | 2021-03-16T16:17:00+01:00 | 41 | https://github.com/owncloud/ocis | edit/master/docs/ocis | migration.md |
The migration happens in successive stages while the service stays online. First, all users need to migrate to the new architecture; then the data on disk can be migrated user by user by switching the storage driver.
User Stories
- As an admin I need to avoid downtime.
- As an admin I want to migrate certain groups of users before others.
- As a user, I need a seamless migration without any risk of losing data.
Migration Stages
Stage-0
The pre-migration stage: a functional ownCloud 10 instance.
Stage-1
Introduce OpenID Connect to ownCloud 10 server and clients.
Stage-2
Install and introduce ownCloud Web and let users test it voluntarily.
Stage-3
Start the oCIS backend and run read-only tests against the existing data using the owncloud storage driver, which will read (and write)
- blobs from the same datadirectory layout as in ownCloud 10
- metadata from the ownCloud 10 database: The oCIS share manager will read share information from the ownCloud database as well.
Therefore we need:
- a share manager that can read from the ownCloud 10 database as well as from whatever new backend will be used for a pure oCIS setup. Currently, that would be the json file. Alternatively, the share data is migrated after all users have switched to oCIS. -- jfd
Stage-4
Test writing data with oCIS into the existing ownCloud 10 data directory using the owncloud storage driver.
Stage-5
Introduce reverse proxy and switch over early adopters, let admins gain trust in the new backend by comparing metrics of the two systems and having it running in parallel.
Stage-6
Voluntary transition period, followed by a hard deadline for all users.
Stage-7
Disable ownCloud 10 in the proxy, all requests are now handled by oCIS, shut down oc10 web servers and redis (or keep for calendar & contacts only? rip out files from oCIS?)
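The switch-over in Stage-5 to Stage-7 can be pictured as a per-user routing decision in the reverse proxy. The following is only a minimal sketch under the assumption that the proxy keeps a set of already migrated user ids; `MigrationRouter`, `migrated_users` and `oc10_enabled` are hypothetical names and do not describe the actual oCIS proxy configuration.

```python
from dataclasses import dataclass, field
from typing import Set

OC10 = "oc10"
OCIS = "ocis"


@dataclass
class MigrationRouter:
    """Hypothetical helper: picks the backend that serves a user's requests."""

    migrated_users: Set[str] = field(default_factory=set)
    oc10_enabled: bool = True  # flipped to False in Stage-7

    def migrate(self, user_id: str) -> None:
        # Stage-5 / Stage-6: early adopters and volunteers switch over one by one.
        self.migrated_users.add(user_id)

    def backend_for(self, user_id: str) -> str:
        # Stage-7: ownCloud 10 is disabled, every request goes to oCIS.
        if not self.oc10_enabled:
            return OCIS
        return OCIS if user_id in self.migrated_users else OC10


router = MigrationRouter()
router.migrate("einstein")
print(router.backend_for("einstein"), router.backend_for("marie"))
```

Keeping the decision per user is what allows both systems to run in parallel while admins compare their metrics.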
Stage-8
User by user storage migration from owncloud driver to ocis/s3ng/cephfs...
Stage-9
Migrate share data to <yet to determine> share manager backend and shut down ownCloud database
Stage-10
Profit! (db for file metadata no longer necessary, less maintenance effort)
Architectural differences
The fundamental difference between ownCloud 10 and oCIS is that the file metadata moves out of the database, where it lives in the oc_filecache table (which is misnamed, as it actually is an index), and into the storage provider, which can place metadata as close to the underlying storage system as possible. In effect, the file metadata is sharded over multiple specialized services.
Data that will be migrated
Currently, oCIS focuses on file sync and share use cases.
Blob data
In ownCloud 10 the files are laid out on disk in the data directory using the following layout:
data
├── einstein
│ ├── cache
│ ├── files
│ │ ├── Photos
│ │ │ └── Portugal.jpg
│ │ ├── Projects
│ │ │ └── Notes.md
│ │ └── ownCloud Manual.pdf
│ ├── files_external
│ ├── files_trashbin
│ │ ├── files
│ │ │ ├── Documents.d1564687985
│ │ │ ├── TODO.txt.d1565721976
│ │ │ └── welcome.txt.d1564775872
│ │ └── versions
│ │ │ ├── TODO.txt.v1564605543.d1565721976
│ │ │ └── TODO.txt.v1564775936.d1565721976
│ ├── files_versions
│ │ ├── Projects
│ │ │ ├── Notes.md.v1496912691
│ │ │ └── Notes.md.v1540305560
│ │ └── ownCloud Manual.pdf.v1396628249
│ ├── thumbnails
│ │ └── 123
│ │ │ ├── 2048-1536-max.png
│ │ │ └── 32-32.png // the file id, eg. of /Photos/Portugal.jpg
│ └── uploads
├── marie
│ ├── cache
│ ├── files
│ ├── files_external
│ ├── files_trashbin
│ ├── files_versions
│ └── thumbnails
│ …
├── moss
…
The data directory may also contain subfolders for ownCloud 10 applications like avatars, gallery, files_external and cache.
When an object storage is used as the primary storage all file blobs are stored by their file id and a prefix, eg.: urn:oid:<fileid>.
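As a small illustration of the object storage naming scheme, the sketch below builds and parses keys of the form urn:oid:<fileid>. The helper names `object_key` and `fileid_from_key` are made up for this example.

```python
OID_PREFIX = "urn:oid:"


def object_key(fileid: int) -> str:
    """Build the object key for a given oc10 file id."""
    return f"{OID_PREFIX}{fileid}"


def fileid_from_key(key: str) -> int:
    """Recover the file id from an object key, rejecting unrelated keys."""
    if not key.startswith(OID_PREFIX):
        raise ValueError(f"not a file blob key: {key!r}")
    return int(key[len(OID_PREFIX):])


print(object_key(123))
```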
The three types of blobs we need to migrate are stored in
- files for file blobs, the current file content,
- files_trashbin for trashed files (and their versions) and
- files_versions for file blobs of older versions.
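To make the three blob locations concrete, here is a rough sketch that classifies a path relative to the data directory, based only on the layout shown above. It assumes that `.v<timestamp>` marks a version and `.d<timestamp>` marks the deletion time of the top-level trashbin entry; the `Blob` type, the `classify` function and the kind names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Blob(NamedTuple):
    user: str
    kind: str  # "file", "version", "trashed_file" or "trashed_version"
    path: str  # path relative to the user's files root
    version_ts: Optional[int]
    deleted_ts: Optional[int]


_VERSION = re.compile(r"^(?P<name>.+)\.v(?P<ts>\d+)$")
_DELETED = re.compile(r"^(?P<name>.+)\.d(?P<ts>\d+)$")


def classify(relpath: str) -> Optional[Blob]:
    """Classify a path relative to the data directory; None if not migrated."""
    parts = relpath.strip("/").split("/")
    if len(parts) < 3:
        return None
    user, area, rest = parts[0], parts[1], parts[2:]
    if area == "files":
        return Blob(user, "file", "/".join(rest), None, None)
    if area == "files_versions":
        m = _VERSION.match("/".join(rest))
        return Blob(user, "version", m["name"], int(m["ts"]), None) if m else None
    if area == "files_trashbin" and len(rest) >= 2 and rest[0] in ("files", "versions"):
        # Assumption: only the top-level trashbin entry carries the .d suffix.
        d = _DELETED.match(rest[1])
        if not d:
            return None
        path = "/".join([d["name"], *rest[2:]])
        deleted = int(d["ts"])
        if rest[0] == "files":
            return Blob(user, "trashed_file", path, None, deleted)
        m = _VERSION.match(path)
        if not m:
            return None
        return Blob(user, "trashed_version", m["name"], int(m["ts"]), deleted)
    return None  # cache, thumbnails, uploads, ... are not migrated here


print(classify("einstein/files_versions/Projects/Notes.md.v1496912691"))
```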
Filecache table
In both cases the file metadata, including a full replication of the file tree, is stored in the oc_filecache table of an ownCloud 10 database. The primary key of a row is the file id. It is used to attach additional metadata like shares, tags, favorites or arbitrary file properties.
The filecache table itself has more metadata:
| Field | Type | Null | Key | Default | Extra | Comment | Migration |
|---|---|---|---|---|---|---|---|
| fileid | bigint(20) | NO | PRI | NULL | auto_increment | | MUST become the oCIS opaqueid of a file reference. ocis driver stores it in extended attributes and can use numbers as node ids on disk. for eos see note below table |
| storage | int(11) | NO | MUL | 0 | | the filecache holds metadata for multiple storages | corresponds to an oCIS storage space |
| path | varchar(4000) | YES | | NULL | | the path relative to the storages root | MUST become the path relative to the storage root. files prefix needs to be trimmed. |
| path_hash | varchar(32) | NO | | | | mysql once had problems indexing long paths, so we stored a hash for lookup by path. | - |
| parent | bigint(20) | NO | MUL | 0 | | used to implement the hierarchy and listing children of a folder by id. redundant with path | - |
| name | varchar(250) | YES | | NULL | | basename of path | - |
| mimetype | int(11) | NO | | 0 | | joined with the oc_mimetypes table. only relevant for object storage deployments | can be determined from blob / file extension |
| mimepart | int(11) | NO | | 0 | | same as for mimetype | can be determined from blob / file extension |
| size | bigint(20) | NO | | 0 | | same as blob size unless encryption is used | MAY become size, can be determined from blob |
| mtime | bigint(20) | NO | | 0 | | same as blob mtime | for files MAY become mtime (can be determined from blob as well), for directories MUST become tmtime |
| encrypted | int(11) | NO | | 0 | | encrypted flag | oCIS currently does not support encryption |
| etag | varchar(40) | YES | | NULL | | used to propagate changes in a tree | MUST be migrated (or calculated in the same way) to prevent clients from syncing unnecessarily |
| unencrypted_size | bigint(20) | NO | | 0 | | same as blob size | oCIS currently does not support encryption |
| storage_mtime | bigint(20) | NO | | 0 | | used to detect external storage changes | oCIS delegates that to the storage providers and drivers |
| permissions | int(11) | YES | | 0 | | used as the basis for permissions. synced from disk when running a file scan. | oCIS delegates that to the storage providers and drivers |
| checksum | varchar(255) | YES | | NULL | | same as blob checksum | SHOULD become the checksum in the storage provider. eos calculates it itself, ocis driver stores it in extended attributes |
Note: for EOS, a hot migration only works seamlessly if the file ids in oc10 are already read from EOS. Otherwise, either a mapping from the oc10 filecache file id to the new EOS file id has to be created, under the assumption that these id sets do not intersect, or files and their corresponding shares need to be exported and imported offline to generate a new set of ids. While this will preserve public links as well as user, group and even federated shares, old internal links may still point to different files because they contain the oc10 fileid.
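The Migration column of the filecache table can be read as a per-row transformation. The sketch below applies the MUST, SHOULD and MAY rules to a single row. It is not the oCIS implementation: the `FilecacheRow` type, the `is_dir` flag (in ownCloud 10 this would have to be derived from the mimetype) and the keys of the returned dictionary are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FilecacheRow:
    fileid: int
    path: str
    size: int
    mtime: int
    etag: Optional[str]
    checksum: Optional[str]
    is_dir: bool  # hypothetical: derived from the mimetype in oc10


def to_ocis_metadata(row: FilecacheRow) -> Optional[dict]:
    """Map one oc_filecache row to the metadata the storage provider needs."""
    # MUST: path relative to the storage root, with the files prefix trimmed.
    if row.path == "files":
        rel = ""
    elif row.path.startswith("files/"):
        rel = row.path[len("files/"):]
    else:
        return None  # thumbnails, cache, ... are not part of the file tree
    meta = {
        "opaque_id": str(row.fileid),  # MUST: fileid becomes the opaqueid
        "path": rel,
        "etag": row.etag,  # MUST: keeps clients from resyncing
    }
    if row.checksum:
        meta["checksum"] = row.checksum  # SHOULD
    if row.is_dir:
        meta["tmtime"] = row.mtime  # MUST for directories
    else:
        meta["mtime"] = row.mtime  # MAY, could also be read from the blob
        meta["size"] = row.size  # MAY, could also be read from the blob
    return meta


row = FilecacheRow(123, "files/Photos/Portugal.jpg", 243733, 1564687985, "abc", None, False)
print(to_ocis_metadata(row))
```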
share table
Used to store:
- Public links
- Private shares with users and groups
- Federated shares partly
- Guest shares
| Field | Type | Null | Key | Default | Extra | Comment | CS3 API |
|---|---|---|---|---|---|---|---|
| id | int(11) | NO | PRI | NULL | auto_increment | | ShareId.opaqueid string |
| share_type | smallint(6) | NO | | 0 | | in CS3 every type is handled by a dedicated API. See below the table | does NOT map to Share.ShareType TODO clarify |
| share_with | varchar(255) | YES | MUL | NULL | | | Share.grantee Grantee |
| uid_owner | varchar(64) | NO | | | | | ShareId.owner UserID |
| parent | int(11) | YES | | NULL | | | - |
| item_type | varchar(64) | NO | MUL | | | | Share.resource_id ResourceId |
| item_source | varchar(255) | YES | MUL | NULL | | | Share.resource_id ResourceId |
| item_target | varchar(255) | YES | | NULL | | | Share.resource_id ResourceId |
| file_source | bigint(20) | YES | MUL | NULL | | cannot store uuid style file ids from ocis. when all users have migrated to ocis the share manager needs to be updated / migrated to a version that does. | Share.resource_id ResourceId |
| file_target | varchar(512) | YES | | NULL | | | Share.resource_id ResourceId |
| permissions | smallint(6) | NO | | 0 | | | Share.Permissions SharePermissions |
| stime | bigint(20) | NO | | 0 | | | Share.ctime, Share.mtime |
| accepted | smallint(6) | NO | | 0 | | | ReceivedShare.ShareState ShareState |
| expiration | datetime | YES | | NULL | | only used for the Link API and storage provider api, currently cannot be added using the Collaboration or OCM API | Grant |
| token | varchar(32) | YES | MUL | NULL | | | PublicShare.token |
| mail_send | smallint(6) | NO | | 0 | | | - |
| uid_initiator | varchar(64) | YES | | NULL | | | ShareId.creator UserID |
| share_name | varchar(64) | YES | | NULL | | only exists for public shares | PublicShare.display_name |
| attributes | longtext | YES | | NULL | | additional share attributes | could be implemented using opaque data, but should be added to the CS3 api |
In the CS3 API
- public links are handled by the PublicShareProvider using the Link API
- internal shares are handled by the UserShareProvider using the Collaboration API. This covers user and group shares.
- federated shares are handled by the OcmShareProvider using the OCM Share Provider API, aka Open Cloud Mesh.
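Because share_type does not map to a single CS3 field, a migration has to dispatch each row to one of the three providers listed above. The sketch below shows one way to do that. The numeric values are the ownCloud 10 share type constants as commonly documented (user 0, group 1, link 3, guest 4, remote 6) and should be verified against the instance being migrated; `cs3_provider_for` is a hypothetical helper. Guest shares are deliberately left unmapped because this document does not say which provider handles them.

```python
from enum import IntEnum


class OC10ShareType(IntEnum):
    # Assumed ownCloud 10 constants; verify against your instance.
    USER = 0
    GROUP = 1
    LINK = 3
    GUEST = 4
    REMOTE = 6  # federated share


PROVIDERS = {
    OC10ShareType.USER: "UserShareProvider",    # Collaboration API
    OC10ShareType.GROUP: "UserShareProvider",   # Collaboration API
    OC10ShareType.LINK: "PublicShareProvider",  # Link API
    OC10ShareType.REMOTE: "OcmShareProvider",   # OCM
}


def cs3_provider_for(share_type: int) -> str:
    """Return the CS3 provider responsible for an oc_share row."""
    try:
        return PROVIDERS[OC10ShareType(share_type)]
    except (ValueError, KeyError):
        raise LookupError(f"no CS3 provider mapped for share_type {share_type}") from None


print(cs3_provider_for(3))
```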
share_external
Used to store additional metadata for federated shares.
| Field | Type | Null | Key | Default | Extra | Comment |
|---|---|---|---|---|---|---|
| id | bigint(20) | NO | PRI | NULL | auto_increment | |
| remote | varchar(512) | NO | | NULL | | Url of the remote owncloud instance |
| share_token | varchar(64) | NO | | NULL | | Public share token |
| password | varchar(64) | YES | | NULL | | Optional password for the public share |
| name | varchar(64) | NO | | NULL | | Original name on the remote server |
| owner | varchar(64) | NO | | NULL | | User that owns the public share on the remote server |
| user | varchar(64) | NO | MUL | NULL | | Local user which added the external share |
| mountpoint | varchar(4000) | NO | | NULL | | Full path where the share is mounted |
| mountpoint_hash | varchar(32) | NO | | NULL | | md5 hash of the mountpoint |
| remote_id | varchar(255) | NO | | -1 | | |
| accepted | int(11) | NO | | 0 | | |
TODO document how the reva OCM service currently persists the data
trusted_servers
Used to determine whether federated shares can be accepted automatically.
| Field | Type | Null | Key | Default | Extra | Comment |
|---|---|---|---|---|---|---|
| id | int(11) | NO | PRI | NULL | auto_increment | |
| url | varchar(512) | NO | | NULL | | Url of trusted server |
| url_hash | varchar(255) | NO | UNI | | | sha1 hash of the url without the protocol |
| token | varchar(128) | YES | | NULL | | token used to exchange the shared secret |
| shared_secret | varchar(256) | YES | | NULL | | shared secret used to authenticate |
| status | int(11) | NO | | 2 | | current status of the connection |
| sync_token | varchar(512) | YES | | NULL | | cardDav sync token |
TODO clarify how OCM handles this and where we store / configure this. It seems related to trusted IdPs
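The two hash columns described in the share_external and trusted_servers tables can be recomputed if rows have to be re-created during a migration. This is a sketch based only on the column comments: md5 of the mountpoint, and sha1 of the url without the protocol. Any further normalization ownCloud 10 may apply before hashing (for example trailing slashes) is not covered, so compare the results against existing rows before relying on them.

```python
import hashlib


def mountpoint_hash(mountpoint: str) -> str:
    """md5 hex digest of the mountpoint, fits the varchar(32) column."""
    return hashlib.md5(mountpoint.encode("utf-8")).hexdigest()


def url_hash(url: str) -> str:
    """sha1 hex digest of the url with the protocol stripped."""
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
            break
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


# Hypothetical host name, used for illustration only.
print(url_hash("https://cloud.example.org"))
```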
user data
Users are migrated in two steps:
- They should all be authenticated using OpenID Connect, which already moves them to a common identity management system.
- To search for share recipients, both ownCloud 10 and oCIS need access to the same user directory, e.g. via LDAP.
TODO: add state to CS3 API, so we can 'disable' users
TODO: how do we map (sub) admins? -> map to roles & permissions
accounts:
| Field | Type | Null | Key | Default | Extra | Comment |
|---|---|---|---|---|---|---|
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment | |
| email | varchar(255) | YES | MUL | NULL | | |
| user_id | varchar(255) | NO | UNI | NULL | | |
| lower_user_id | varchar(255) | NO | UNI | NULL | | |
| display_name | varchar(255) | YES | MUL | NULL | | |
| quota | varchar(32) | YES | | NULL | | |
| last_login | int(11) | NO | | 0 | | |
| backend | varchar(64) | NO | | NULL | | |
| home | varchar(1024) | NO | | NULL | | |
| state | smallint(6) | NO | | 0 | | |
users:
| Field | Type | Null | Key | Default | Extra | Comment |
|---|---|---|---|---|---|---|
| uid | varchar(64) | NO | PRI | | | |
| password | varchar(255) | NO | | | | |
| displayname | varchar(64) | YES | | NULL | | |
groups:
The groups table really only contains the group name.
| Field | Type | Null | Key | Default | Extra |
|---|---|---|---|---|---|
| gid | varchar(64) | NO | PRI | | |
LDAP
TODO clarify if metadata from ldap & user_shibboleth needs to be migrated
- the dn -> owncloud internal username mapping that currently lives in the oc_ldap_user_mapping table needs to move into a dedicated ownclouduuid attribute in the LDAP server. The IdP should send it as a claim so the proxy does not have to look up the user using LDAP again. The username cannot be changed in ownCloud 10, and the oCIS provisioning API will not allow changing it either. When we introduce the graph API we may allow changing usernames once all clients have moved to that API.
The problem is that the username in ownCloud 10 and in oCIS also needs to be the same, which might not be the case when the LDAP mapping used a different column. In that case we should add another owncloudusername attribute to the LDAP server ...
activities
dedicated service, not yet implemented, requires decisions about an event system -- jfd
| Field | Type | Null | Key | Default | Extra | Comment |
|---|---|---|---|---|---|---|
| activity_id | bigint(20) | NO | PRI | NULL | auto_increment | |
| timestamp | int(11) | NO | MUL | 0 | | |
| priority | int(11) | NO | | 0 | | |
| type | varchar(255) | YES | | NULL | | |
| user | varchar(64) | YES | | NULL | | |
| affecteduser | varchar(64) | NO | MUL | NULL | | |
| app | varchar(255) | NO | | NULL | | |
| subject | varchar(255) | NO | | NULL | | |
| subjectparams | longtext | NO | | NULL | | |
| message | varchar(255) | YES | | NULL | | |
| messageparams | longtext | YES | | NULL | | |
| file | varchar(4000) | YES | | NULL | | |
| link | varchar(4000) | YES | | NULL | | |
| object_type | varchar(255) | YES | MUL | NULL | | |
| object_id | bigint(20) | NO | | 0 | | |
Links
The data_exporter has logic that allows exporting and importing users, including shares. The model classes contain the exact mapping.