diff --git a/docs/extensions/storage/proposedchanges.md b/docs/extensions/storage/proposedchanges.md new file mode 100644 index 000000000..e8c1a14ce --- /dev/null +++ b/docs/extensions/storage/proposedchanges.md @@ -0,0 +1,114 @@ +--- +title: "Proposed Changes" +date: 2018-05-02T00:00:00+00:00 +weight: 18 +geekdocRepo: https://github.com/owncloud/ocis +geekdocEditPath: edit/master/docs/extensions/storage +geekdocFilePath: proposedchanges.md +--- + +Some architectural changes still need to be clarified or changed. Maybe an ADR is in order for all of the below. + +## Reva Gateway changes + +## A dedicated shares storage provider + +Currently, the *gateway* treats `/home/shares` differently from any other path: it will stat all children and calculate an etag to allow clients to discover changes in accepted shares. This requires the storage provider to cooperate and provide this special `/shares` folder in the root of a user's home when it is accessed as a home storage, enabled by a config flag that needs to be set for every storage driver. + +The `enable_home` flag will cause drivers to jail path based requests into a `` subfolder. In effect it divides a storage provider into multiple [*storage spaces*]({{< ref "#storage-spaces" >}}): when calling `CreateHome` a subfolder following the `` is created and marked as the root of a user's home. Both the eos and ocis storage drivers use extended attributes to mark the folder as the end of the size aggregation and tree mtime propagation mechanism. Even the quota can be set this way. All this literally is a [*storage space*]({{< ref "#storage-spaces" >}}). + +We can implement [ListStorageSpaces](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ListStorageSpacesRequest) by either +- iterating over the root of the storage and treating every folder following the `` as a `home` *storage space*, +- iterating over the root of the storage and treating every folder following a new `` as a `project` *storage space*, or +- iterating over the root of the storage and treating every folder following a generic `` as a *storage space* for a configurable space type, or +- we allow configuring a map of `space type` to `layout` (based on the [CreateStorageSpaceRequest](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.CreateStorageSpaceRequest)) which would allow things like +``` +home=/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}} +spaces=/spaces/var/lib/ocis/storage/projects/{{.Name}} +``` + +This would make the `GetHome()` call return the path to the *storage provider* including the relative path to the *storage space*. No need for a *storage provider* mounted at `/home`. This is just a UI alias for `/users/`. Just like a normal `/home/` on a Linux machine. + +But if we have no `/home`, where do we find the shares, and how can clients discover changes in accepted shares? + +The `/shares` namespace should be provided by a *shares storage provider* that lists all accepted shares for the current user... but what about copy-pasting links from the browser? Well, this storage is only really needed to have a path to ocm shares that actually reside on other instances. In the UI the shares would be listed by querying a *share manager*. It returns ResourceIds, which can be stat'ed to fetch a path that is then accessible in the CS3 global namespace. Two caveats: +- This only works for resources that are actually hosted by the current instance. For those, it would leak the parent path segments of a shared resource.
+- For accepted OCM shares there must be a path in the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}) that has to be the same for all users, otherwise they cannot copy and share those URLs. + +Work on this is done in https://github.com/cs3org/reva/pull/1846 + +### The gateway should be responsible for path transformations + +Currently, storage providers are aware of their mount point, coupling them tightly with the gateway. + +Tracked in https://github.com/cs3org/reva/issues/578 + +Work is done in https://github.com/cs3org/reva/pull/1866 + +## URL escaped string representation of a CS3 reference + +For the `/dav/spaces/` endpoint we need to encode the *reference* in a URL-compatible way. +1. We can separate the path using a `/`: `/dav/spaces//` +2. The `spaceid` currently is a CS3 resourceid, consisting of `` and ``. Since the nodeid might contain `/`, e.g. for the local driver, we have to urlencode the spaceid. + +To access resources by id we need to make the `/dav/meta/` endpoint able to list directories... Otherwise id based navigation first has to look up the path. Or we use the libregraph api for id based navigation. + +A *reference* is a logical concept. It identifies a [*resource*]({{< ref "#resources" >}}) and consists of a `` and a ``. A `` consists of a `` and a ``. They can be concatenated using the separators `!` and `:`: +``` +!: +``` +While all components are optional, only three cases are used: +| format | example | description | +|-|-|-| +| `!:` | `!:/absolute/path/to/file.ext` | absolute path | +| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:path/to/file.ext` | path relative to the root of the storage space | +| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!c3cf23bb-8f47-4719-a150-1d25a1f6fb56:to/file.ext` | path relative to the specified node in the storage space, used to reference resources without disclosing parent paths | + +`` should be a UUID to prevent references from breaking when a *user* or [*storage space*]({{< ref "#storage-spaces" >}}) gets renamed. But it can also be derived from a migration of an oc10 instance by concatenating an instance identifier and the numeric storage id from oc10, e.g. `oc10-instance-a$1234`. + +A reference will often start as an absolute/global path, e.g. `!:/home/Projects/Foo`. The gateway will look up the storage provider that is responsible for the path: + +| Name | Description | Who resolves it? | +|------|-------------|-| +| `!:/home/Projects/Foo` | the absolute path a client like davfs will use. | The gateway uses the storage registry to look up the responsible storage provider | +| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:/Projects/Foo` | the `storage_space` is the same as the `root`, the path becomes relative to the root | the storage provider can use this reference to identify this resource | + +Now, the same file is accessed as a share: +| Name | Description | +|------|-------------| +| `!:/users/Einstein/Projects/Foo` | `Foo` is the shared folder | +| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a` is the id of `Foo`, the path is empty | + + +The `:`, `!` and `$` are chosen from the set of [RFC 3986 sub-delimiters](https://tools.ietf.org/html/rfc3986#section-2.2) on purpose. They can be used in URLs without having to be encoded.
In some cases, a delimiter can be left out if a component is not set: +| reference | interpretation | +|-|-| +| `/absolute/path/to/file.ext` | absolute path, all delimiters omitted | +| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!path/to/file.ext` | relative path in the given storage space, root delimiter `:` omitted | +| `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:to/file.ext` | relative path in the given root node, storage space delimiter `!` omitted | +| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | node id in the given storage space, `:` must be present | +| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62` | root of the storage space, all delimiters omitted, can be distinguished from a path by the missing `/` | + +## Space providers +When looking up a resource by id, the reference must use a logical space id, not a CS3 resource id. Otherwise id based requests, which only carry a resourceid consisting of a storage id and a node id, cannot be routed to the correct storage provider if the storage has moved from one storage provider to another. + +If the registry routes based on the storageid AND the nodeid, it has to keep a cache of all nodeids in order to route all requests for a storage space (which consists of storage id + nodeid) to the correct storage provider. The correct resourceid for a node in a storage space would be `$!`. The `$` part allows the storage registry to route all id based requests to the correct storage provider. This becomes relevant when the storage space was moved from one storage provider to another. The storage space id remains the same, but the internal address and port change. + +TODO discuss to clarify further + +## Storage drivers + +### Allow clients to send a uuid on upload +iOS clients can only queue single requests to be executed in the background. They queue an upload and need to be able to identify the file after it has been uploaded to the server. The disconnected nature of the connection might cause workflows or manual user interaction with the file on the server to move the file to a different place or change its content while the device is offline. However, on the device users might have marked the file as favorite or added it to other iOS-specific collections. To be able to reliably identify the file, the client can generate a `uuid` and attach it to the file metadata during the upload. While it is not necessary to look up files by this `uuid`, having a second file id that serves exactly the same purpose as the `file id` is redundant. + +Another aspect of the `file id` / `uuid` is that it must be a logical identifier that can be set, at least by internal systems. Without a writeable fileid we cannot restore backups or migrate storage spaces from one storage provider to another storage provider. + +Technically, this means that every storage driver needs to have a map of a `uuid` to an internal resource identifier. This internal resource identifier can be +- an eos fileid, because eos can look up files by id +- an inode if the filesystem and the storage driver support looking up by inode +- a path if the storage driver has no way of looking up files by id. + - In this case other mechanisms like inotify, kernel audit or a fuse overlay might be used to keep the paths up to date.
+ - to prevent excessive writes when deep folders are renamed, a reverse map might be used: it will map the `uuid` to `:`, in order to trade writes for reads + - as a fallback, a sync job can read the file id from the metadata of the resources and populate the uuid-to-internal-id map. + +TUS uploads can carry this metadata; for PUT we might need a header. \ No newline at end of file diff --git a/docs/extensions/storage/static/spacesregistry.drawio.svg b/docs/extensions/storage/static/spacesregistry.drawio.svg new file mode 100644 index 000000000..e3a771b07 --- /dev/null +++ b/docs/extensions/storage/static/spacesregistry.drawio.svg @@ -0,0 +1,434 @@ + + + + + + + +
+
+
+ oCIS System +
+ [Software System] +
+
+
+
+ + oCIS System... + +
+
+ + + + + + +
+
+
+ + Einstein + +
+ [Person] +
+
+
+ End user +
+
+
+
+
+ + Einstein... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + +
+
+
+ + Storage Space Registry + +
+ [Container: golang, HTTP, libregraph] +
+
+
+ Manages spaces for users +
+
+
+
+
+ + Storage Space Registry... + +
+
+ + + + +
+
+
+ + Storage Provider + +
+ [Container: golang] +
+
+
+ Persists storage spaces using reva +
+
+
+
+
+ + Storage Provider... + +
+
+ + + + +
+
+
+ + Storage System + +
+ [Software System] +
+
+
+ provides persistent storage +
+
+
+
+
+ + Storage System... + +
+
+ + + + + + +
+
+
+ + Moss + +
+ [Person] +
+
+
+ Administrator +
+
+
+
+
+ + Moss... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [POSIX, S3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [WebDAV, libregraph, CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Manages the users Storage Spaces in + +
+
+ [libregraph] +
+
+
+
+
+
+ + Manages the users Storage Spac... + +
+
+ + + + + +
+
+
+
+
+ + Manages resources with + +
+
+ [Web UI or native clients] +
+
+
+
+
+
+ + Manages resources with... + +
+
+ + + + + +
+
+
+
+
+ + Registers itself at and +
+ sends space root etag changes to +
+
+
+ [CS3, libregraph?, PUSH] +
+
+
+
+
+
+ + Registers itself at and... + +
+
+ + + + + +
+
+
+
+
+ + Manages organizational Storage Spaces in + +
+
+ [WebDAV, libregraph, CS3, CLI] +
+
+
+
+
+
+ + Manages organizational Storage... + +
+
+ + + + +
+
+
+ + Identity Management System + +
+ [Software System] +
+
+
+ provides users and groups +
+
+
+
+
+ + Identity Management System... + +
+
+ + + + + +
+
+
+
+
+ + Authenticates users and searches recipients with + +
+
+ [OpenID Connect, LDAP, REST] +
+
+
+
+
+
+ + Authenticates users and search... + +
+
+ + + + +
+
+
+

+ C4 Container diagram for the oCIS System +

+

+ As a platform, the oCIS system may not only include web, mobile and desktop clients but also the underlying storage system or an identity management system

+

+ Date: 2021-07-22T16:43 +

+
+
+
+
+ + C4 Container diagram for the oCIS System... + +
+
+
+ + + + + Viewer does not support full SVG 1.1 + + + +
\ No newline at end of file diff --git a/docs/extensions/storage/static/storageprovider.drawio.svg b/docs/extensions/storage/static/storageprovider.drawio.svg index 4b88a71c1..e7ba5ea77 100644 --- a/docs/extensions/storage/static/storageprovider.drawio.svg +++ b/docs/extensions/storage/static/storageprovider.drawio.svg @@ -1,119 +1,345 @@ - + - - + -
-
-
- - CS3 -
- storage provider -
- API (GRPC) -
-
+
+
+
+ oCIS storage provider +
+ [Software System]
- - CS3... + + oCIS storage provider... - - - - - - - - - - + -
+
-
- storage provider +
+ + reva storage provider + +
+ [Component: golang] +
+
+
+ hosts multiple storage spaces using a storage driver +
- - storage provider + + reva storage provider... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + -
+
-
- - / - +
+ + reva gateway + +
+ [Component: golang] +
+
+
+ API facade for internal reva services +
- - / + + reva gateway... + + + + + + + +
+
+
+ + Storage System + +
+ [Software System] +
+
+
+ provides persistent storage +
+
+
+
+
+ + Storage System... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [POSIX, S3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + +
+
+
+ + reva frontend + +
+ [Component: golang] +
+
+
+ handles protocol translation +
+
+
+
+
+ + reva frontend... + +
+
+ + + + +
+
+
+ + oCIS proxy + +
+ [Component: golang] +
+
+
+ Routes requests to oc10 or ocis
+
+
+
+
+ + oCIS proxy... + +
+
+ + + + + +
+
+
+
+
+ + Mints an internal JWT +
+ and forwards requests to
+
+
+ [WebDAV, OCS, OCM, tus] +
+
+
+
+
+
+ + Mints an internal JWT... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, +
+ Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [WebDAV, libregraph, CS3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Forwards to + +
+
+ [CS3, storage registry] +
+
+
+
+
+
+ + Forwards to... + +
+
+ + + + +
+
+
+

+ C4 Component diagram for an oCIS storage provider +

+

+ An oCIS storage provider manages resources in storage spaces by persisting them with a specific storage driver in a storage system. +

+

+ Date: 2021-07-22T12:40 +

+
+
+
+
+ + C4 Component diagram for an oCIS storage provider...
- - - - - - - - diff --git a/docs/extensions/storage/static/storageregistry-spaces.drawio.svg b/docs/extensions/storage/static/storageregistry-spaces.drawio.svg deleted file mode 100644 index 3c2d49717..000000000 --- a/docs/extensions/storage/static/storageregistry-spaces.drawio.svg +++ /dev/null @@ -1,327 +0,0 @@ - - - - - - - - -
-
-
- The storage registry currently maps paths and storageids to the -
- - address:port - - of the corresponding storage provider -
-
-
-
- - The storage registry currently maps... - -
-
- - - - - - - -
-
-
- storage registry -
-
-
-
- - storage registry - -
-
- - - - - - - -
-
-
- storage providers -
-
-
-
- - storage providers - -
-
- - - - - - - - - - - - -
-
-
- The gateway uses the storage registry to look up the storage provider that is responsible for path and id based references in incoming requests. -
-
-
-
- - The gateway uses the storage regist... - -
-
- - - - - - - -
-
-
- gateway -
-
-
-
- - gateway - -
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - Viewer does not support full SVG 1.1 - - - -
\ No newline at end of file diff --git a/docs/extensions/storage/storages.md b/docs/extensions/storage/storages.md index 6172dc10e..59927fd93 100644 --- a/docs/extensions/storage/storages.md +++ b/docs/extensions/storage/storages.md @@ -78,7 +78,15 @@ The storage keeps an activity history, tracking the different actions that have ## Storage drivers -Reva currently has four storage driver implementations that can be used for *storage providers* an well as *data providers*. +Reva currently has several storage driver implementations that can be used for *storage providers* as well as *data providers*. + +### OCIS and S3NG Storage Drivers + +The oCIS storage driver is the default storage driver. It decomposes the metadata and persists it in a POSIX filesystem. Blobs are stored on the filesystem as well. The layout makes extensive use of symlinks and extended attributes. A filesystem like xfs or zfs without inode size limitations is recommended. We will evolve this to further integrate with file systems like cephfs or gpfs. + +The S3NG storage driver uses the same metadata layout on a POSIX storage as the oCIS driver, but it uses S3 as the blob storage. + +TODO add list of capabilities / tradeoffs ### Local Storage Driver diff --git a/docs/extensions/storage/terminology.md b/docs/extensions/storage/terminology.md index 0b3ce00a4..2b883b570 100644 --- a/docs/extensions/storage/terminology.md +++ b/docs/extensions/storage/terminology.md @@ -9,66 +9,50 @@ geekdocFilePath: terminology.md Communication is hard. And clear communication is even harder. You may encounter the following terms throughout the documentation, in the code or when talking to other developers. Just keep in mind that whenever you hear or read *storage*, that term needs to be clarified, because on its own it is too vague. PR welcome. -## Resources -A *resource* is a logical concept. Resources can be of [different types](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceType): +## Logical concepts + +### Resources +A *resource* is the basic building block that oCIS manages. It can be of [different types](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceType): - an actual *file* - a *container*, e.g. a folder or bucket - a *symlink*, or - a [*reference*]({{< ref "#references" >}}) which can point to a resource in another [*storage provider*]({{< ref "#storage-providers" >}}) -## References +### References -A *reference* is a logical concept that identifies a [*resource*]({{< ref "#resources" >}}). A [*CS3 reference*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.Reference) consists of either -- a *path* based reference, used to identify a [*resource*]({{< ref "#resources" >}}) in the [*namespace*]({{< ref "./namespaces.md" >}}) of a [*storage provider*]({{< ref "#storage-providers" >}}). It must start with a `/`. -- a [CS3 *id* based reference](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceId), uniquely identifying a [*resource*]({{< ref "#resources" >}}) in the [*namespace*]({{< ref "./namespaces.md" >}}) of a [*storage provider*]({{< ref "#storage-providers" >}}). It consists of a `storage provider id` and an `opaque id`. The `storage provider id` must NOT start with a `/`. - -{{< hint info >}} -The `/` is important because currently the static [*storage registry*]({{< ref "#storage-space-registries" >}}) uses a map to look up which [*storage provider*]({{< ref "#storage-providers" >}}) is responsible for the resource.
Paths must be prefixed with `/` so there can be no collisions between paths and storage provider ids in the same map. -{{< /hint >}} +A *reference* identifies a [*resource*]({{< ref "#resources" >}}). A [*CS3 reference*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.Reference) can carry a *path* and a [CS3 *resource id*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceId). References come in two flavors: absolute and combined. +Absolute references have either the *path* or the *resource id* set: +- An absolute *path* MUST start with a `/`. The *resource id* MUST be empty. +- An absolute *resource id* uniquely identifies a [*resource*]({{< ref "#resources" >}}) and is used as a stable identifier for sharing. The *path* MUST be empty. +Combined references have both *path* and *resource id* set: +- the *resource id* identifies the root [*resource*]({{< ref "#resources" >}}) +- the *path* is relative to that root. It MUST start with `.` -{{< hint warning >}} -### Alternative: reference triple #### -A *reference* is a logical concept. It identifies a [*resource*]({{< ref "#resources" >}}) and consists of -a `storage_space`, a `` and a `` -``` -!: -``` -While all components are optional, only three cases are used: -| format | example | description | -|-|-|-| -| `!:` | `!:/absolute/path/to/file.ext` | absolute path | -| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:path/to/file.ext` | path relative to the root of the storage space | -| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!c3cf23bb-8f47-4719-a150-1d25a1f6fb56:to/file.ext` | path relative to the specified node in the storage space, used to reference resources without disclosing parent paths | +### Storage Spaces +A *storage space* organizes a set of [*resources*]({{< ref "#resources" >}}) in a hierarchical tree. It has a single *owner* (*user* or *group*), +a *quota* and *permissions*, and is identified by a `storage space id`. -`` should be a UUID to prevent references from breaking when a *user* or [*storage space*]({{< ref "#storage-spaces" >}}) gets renamed. But it can also be derived from a migration of an oc10 instance by concatenating an instance identifier and the numeric storage id from oc10, e.g. `oc10-instance-a$1234`. +{{< svg src="extensions/storage/static/storagespace.drawio.svg" >}} -A reference will often start as an absolute/global path, e.g. `!:/home/Projects/Foo`. The gateway will look up the storage provider that is responsible for the path +Examples would be every user's personal storage space, project storage spaces or group storage spaces. While they all serve different purposes and may or may not have workflows like anti-virus scanning enabled, we need a way to identify and manage these subtrees in a generic way. By creating a dedicated concept for them this becomes easier and literally makes the codebase cleaner. A [*storage space registry*]({{< ref "#storage-space-registries" >}}) then allows listing the capabilities of [*storage spaces*]({{< ref "#storage-spaces" >}}), e.g. free space, quota, owner, syncable, root etag, upload workflow steps, ... -| Name | Description | Who resolves it? | -|------|-------------|-| -| `!:/home/Projects/Foo` | the absolute path a client like davfs will use.
| The gateway uses the storage registry to look up the responsible storage provider | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:/Projects/Foo` | the `storage_space` is the same as the `root`, the path becomes relative to the root | the storage provider can use this reference to identify this resource | +Finally, a logical `storage space id` is not tied to a specific [*storage provider*]({{< ref "#storage-providers" >}}). If the [*storage driver*]({{< ref "#storage-drivers" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move [*storage spaces*]({{< ref "#storage-spaces" >}}) between [*storage providers*]({{< ref "#storage-providers" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs. -Now, the same file is accessed as a share -| Name | Description | -|------|-------------| -| `!:/users/Einstein/Projects/Foo` | `Foo` is the shared folder | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a` is the id of `Foo`, the path is empty | +### Shares +*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavyweight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the user's home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even individual file shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.* -The `:`, `!` and `$` are chosen from the set of [RFC3986 sub delimiters](https://tools.ietf.org/html/rfc3986#section-2.2) on purpose. They can be used in URLs without having to be encoded. In some cases, a delimiter can be left out if a component is not set: -| reference | interpretation | -|-|-| -| `/absolute/path/to/file.ext` | absolute path, all delimiters omitted | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!path/to/file.ext` | relative path in the given storage space, root delimiter `:` omitted | -| `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:to/file.ext` | relative path in the given root node, storage space delimiter `!` omitted | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | node id in the given storage space, `:` must be present | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62` | root of the storage space, all delimiters omitted, can be distinguished by the `/` | +### Storage Space Registries -{{< /hint >}} +A *storage space registry* manages the [*namespace*]({{< ref "./namespaces.md" >}}) for a *user*: it is used by *clients* to look up the storage spaces a user has access to, the `/dav/spaces` endpoint to access them via WebDAV, and where the client should mount them in the user's personal namespace.
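+
+As a sketch of what such a lookup could return, a single listing entry might carry the following information (a Go illustration; the struct and field names are made up, not taken from the CS3 APIs or libregraph):
+
+```go
+// StorageSpace sketches one entry of a storage space listing as a client
+// might receive it from a storage space registry. All names are illustrative.
+type StorageSpace struct {
+	ID        string // stable storage space id, e.g. a UUID
+	Type      string // "home", "project", "share", ...
+	Owner     string // id of the owning user or group
+	WebDavURL string // per-space endpoint below /dav/spaces
+	MountPath string // where the client should mount the space in the user's namespace
+	QuotaB    uint64 // quota in bytes, 0 = unlimited
+	RootETag  string // etag of the space root, lets clients detect changes
+}
+```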
-## Storage Drivers +{{< svg src="extensions/storage/static/spacesregistry.drawio.svg" >}} + + +## Technical concepts + +### Storage Drivers A *storage driver* implements access to a [*storage system*]({{< ref "#storage-systems" >}}): @@ -77,38 +61,14 @@ It maps the *path* and *id* based CS3 *references* to an appropriate [*storage s - posix inodes or paths - deconstructed filesystem nodes -{{< hint warning >}} -**Proposed Change** -iOS clients can only queue single requests to be executed in the background. The queue an upload and need to be able to identify the uploaded file after it has been uploaded to the server. The disconnected nature of the connection might cause worksflows or manual user interaction with the file on the server to move the file to a different place or changing the content while the device is offline. However, on the device users might have marked the file as favorite or added it to other iOS specific collections. To be able to reliably identify the file the client can generate a `uuid` and attach it to the file metadata during the upload. While it is not necessary to look up files by this `uuid` having a second file id that serves exactly the same purpose as the `file id` is redundant. - -Another aspect for the `file id` / `uuid` is that it must be a logical identifier that can be set, at least by internal systems. Without a writeable fileid we cannot restore backups or migrate storage spaces from one storage provider to another storage provider. - -Technically, this means that every storage driler needs to have a map of a `uuid` to in internal resource identifier. This internal resource identifier can be -- an eos fileid, because eos can look up files by id -- an inode if the filesystem and the storage driver support lookung up by inode -- a path if the storage driver has no way of looking up files by id. - - In this case other mechanisms like inotify, kernel audit or a fuse overlay might be used to keep the paths up to date. - - to prevent excessive writes when deep folders are renamed a reverse map might be used: it will map the `uuid` to `:`, allowing to trade writes for reads - -{{< /hint >}} -## Storage Providers +### Storage Providers A *storage provider* manages [*resources*]({{< ref "#resources" >}}) identified by a [*reference*]({{< ref "#references" >}}) by accessing a [*storage system*]({{< ref "#storage-systems" >}}) with a [*storage driver*]({{< ref "#storage-drivers" >}}). {{< svg src="extensions/storage/static/storageprovider.drawio.svg" >}} -{{< hint warning >}} -**Proposed Change** -A *storage provider* manages multiple [*storage spaces*]({{< ref "#storage-space" >}}) -by accessing a [*storage system*]({{< ref "#storage-systems" >}}) with a [*storage driver*]({{< ref "#storage-drivers" >}}). - -{{< svg src="extensions/storage/static/storageprovider-spaces.drawio.svg" >}} - -By making [*storage providers*]({{< ref "#storage-providers" >}}) aware of [*storage spaces*]({{< ref "#storage-spaces" >}}) we can get rid of the current `enablehome` flag / hack in reva, which lead to the [spawn of `*home` drivers](https://github.com/cs3org/reva/tree/master/pkg/storage/fs). Furthermore, provisioning a new [*storage space*]({{< ref "#storage-space" >}}) becomes a generic operation, regardless of the need of provisioning a new user home or a new project space. 
-{{< /hint >}} - -## Storage Space Registries +### Storage Registry A *storage registry* manages the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}): It is used by the *gateway* @@ -117,65 +77,11 @@ that should handle a [*reference*]({{< ref "#references" >}}). {{< svg src="extensions/storage/static/storageregistry.drawio.svg" >}} -{{< hint warning >}} -**Proposed Change** -A *storage space registry* manages the [*namespace*]({{< ref "./namespaces.md" >}}) for a *user*: -It is used by the *gateway* -to look up `address` and `port` of the [*storage provider*]({{< ref "#storage-providers" >}}) -that is currently serving a [*storage space*]({{< ref "#storage-space" >}}). - -{{< svg src="extensions/storage/static/storageregistry-spaces.drawio.svg" >}} - -By making *storage registries* aware of [*storage spaces*]({{< ref "#storage-spaces" >}}) we can query them for a listing of all [*storage spaces*]({{< ref "#storage-spaces" >}}) a user has access to. Including his home, received shares, project folders or group drives. See [a WIP PR for spaces in the oCIS repo (#1827)](https://github.com/owncloud/ocis/pull/1827) for more info. -{{< /hint >}} - -## Storage Spaces -A *storage space* is a logical concept: -It is a tree of [*resources*]({{< ref "#resources" >}})*resources* -with a single *owner* (*user* or *group*), -a *quota* and *permissions*, identified by a `storage space id`. - -{{< svg src="extensions/storage/static/storagespace.drawio.svg" >}} - -Examples would be every user's home storage space, project storage spaces or group storage spaces. While they all serve different purposes and may or may not have workflows like anti virus scanning enabled, we need a way to identify and manage these subtrees in a generic way. By creating a dedicated concept for them this becomes easier and literally makes the codebase cleaner. A [*storage space registry*]({{< ref "#storage-space-registries" >}}) then allows listing the capabilities of [*storage spaces*]({{< ref "#storage-spaces" >}}), e.g. free space, quota, owner, syncable, root etag, upload workflow steps, ... - -Finally, a logical `storage space id` is not tied to a specific [*storage provider*]({{< ref "#storage-providers" >}}). If the [*storage driver*]({{< ref "#storage-drivers" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move [*storage spaces*]({{< ref "#storage-spaces" >}}) between [*storage providers*]({{< ref "#storage-providers" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs. - -## Shares -*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavywheight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the users home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even file indvidual shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. 
the ten best pictures from a large album.* - -## Storage Systems +### Storage Systems Every *storage system* has different native capabilities like id and path based lookups, recursive change time propagation, permissions, trash, versions, archival and more. A [*storage provider*]({{< ref "#storage-providers" >}}) makes the storage system available in the CS3 API by wrapping the capabilities as well as possible using a [*storage driver*]({{< ref "#storage-drivers" >}}). There might be multiple [*storage drivers*]({{< ref "#storage-drivers" >}}) for a *storage system*, implementing different tradeoffs to match varying requirements. -## Gateways +### Gateways A *gateway* acts as a facade to the storage related services. It authenticates and forwards API calls that are publicly accessible. - -{{< hint warning >}} -**Proposed Change** -Currently, the *gateway* treats `/home/shares` different than any other path: it will stat all children and calculate an etag to allow clients to discover changes in accepted shares. This requires the storage provider to cooperate and provide this special `/shares` folder in the root of a users home when it is accessed as a home storage, which is a config flag that needs to be set for every storage driver. - -The `enable_home` flag will cause drivers to jail path based requests into a `` subfolder. In effect it divides a storage provider into multiple [*storage spaces*]({{< ref "#storage-spaces" >}}): when calling `CreateHome` a subfolder following the `` is created and market as the root of a users home. Both, the eos and ocis storage drivers use extended attributes to mark the folder as the end of the size aggregation and tree mtime propagation mechanism. Even setting the quota is possible like that. All this literally is a [*storage space*]({{< ref "#storage-spaces" >}}). - -We can implement [ListStorageSpaces](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ListStorageSpacesRequest) by either -- iterating over the root of the storage and treating every folder following the `` as a `home` *storage space*, -- iterating over the root of the storage and treating every folder following a new `` as a `project` *storage space*, or -- iterating over the root of the storage and treating every folder following a generic `` as a *storage space* for a configurable space type, or -- we allow configuring a map of `space type` to `layout` (based on the [CreateStorageSpaceRequest](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.CreateStorageSpaceRequest)) which would allow things like -``` -home=/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}} -spaces=/spaces/var/lib/ocis/storage/projects/{{.Name}} -``` - -This would make the `GetHome()` call return the path to the *storage provider* including the relative path to the *storage space*. No need for a *storage provider* mounted at `/home`. This is just a UI alias for `/users/`. Just like a normal `/home/` on a linux machine. - -But if we have no `/home` where do we find the shares, and how can clients discover changes in accepted shares? - -The `/shares` namespace should be provided by a *storage provider* that lists all accepted shares for the current user... but what about copy pasting links from the browser? Well this storage is only really needed to have a path to ocm shares that actually reside on other instances. In the UI the shares would be listed by querying a *share manager*.
It returns ResourceIds, which can be stated to fetch a path that is then accessible in the CS3 global namespace. Two caveats: -- This only works for resources that are actually hosted by the current instance. For those it would leak the parent path segments to a shared resource. -- For accepted OCM shares there must be a path in the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}) that has to be the same for all users, otherwise they cannot copy and share those URLs. - -{{< /hint >}} \ No newline at end of file diff --git a/docs/extensions/storage/updating.md b/docs/extensions/storage/updating.md index 2a5133618..dd1920376 100644 --- a/docs/extensions/storage/updating.md +++ b/docs/extensions/storage/updating.md @@ -11,9 +11,9 @@ geekdocFilePath: updating.md ## Updating reva -1. Run `go get github.com/cs3org/reva@master` +1. Run `go get github.com/cs3org/reva@master` in all repos that depend on reva 2. Create a changelog entry containing changes that were done in [reva](https://github.com/cs3org/reva/commits/master) -3. Create a Pull Request to ocis-reva master with those changes +3. Create a Pull Request to ocis master with those changes 4. If test issues appear, you might need to adjust the tests 5. After the PR is merged, consider doing a [release of the storage submodule]({{< ref "releasing" >}}) diff --git a/docs/ocis/_index.md b/docs/ocis/_index.md index ed233878b..46f0b599a 100644 --- a/docs/ocis/_index.md +++ b/docs/ocis/_index.md @@ -10,18 +10,16 @@ geekdocFilePath: _index.md {{< figure class="floatright" src="/media/is.png" width="70%" height="auto" >}} ## ownCloud Infinite Scale - Welcome to oCIS, the modern file-sync and share platform, which is based on our knowledge and experience with the PHP based [ownCloud server](https://owncloud.com/#server). ### The idea of federated storage - To create a truly federated storage architecture oCIS breaks down the old ownCloud 10 user specific namespace, which is assembled on the server side, and makes the individual parts accessible to clients as storage spaces and storage space registries. The below diagram shows the core concepts that are the foundation for the new architecture: - End user devices can fetch the list of *storage spaces* a user has access to, by querying one or multiple *storage space registries*. The list contains a unique endpoint for every *storage space*. -- [*Storage space registries*]({{< ref "../extensions/storage/terminology#storage-space-registries" >}}) manage the list of storage spaces a user has access to. They may subscrible to *storage spaces* in order to receive notifications about changes on behalf of an end users mobile or desktop client. -- [*Storage spaces*]({{< ref "../extensions/storage/terminology#storage-spaces" >}}) represent a collection of files and folders. A users personal files are a *storage space*, a group or project drive is a *storage space*, and even incoming shares are treated and implemented as *storage spaces*. Each with properties like owners, permissions, quota and type. -- [*Storage providers*]({{< ref "../extensions/storage/terminology#storage-providers" >}}) can hold multiple *storage spaces*. At an oCIS instance, there might be a dedicated *storage provider* responsible for users personal storage spaces. There might be multiple, sharing the load or there might be just one, hosting all types of *storage spaces*.
+- [*Storage space registries*]({{< ref "../extensions/storage/terminology#storage-space-registries" >}}) manage the list of storage spaces a user has access to. They may subscribe to *storage spaces* in order to receive notifications about changes on behalf of an end user's mobile or desktop client. +- [*Storage spaces*]({{< ref "../extensions/storage/terminology#storage-spaces" >}}) represent a collection of files and folders. A user's personal files are contained in a *storage space*, a group or project drive is a *storage space*, and even incoming shares are treated and implemented as *storage spaces*, each with properties like owners, permissions, quota and type. +- [*Storage providers*]({{< ref "../extensions/storage/terminology#storage-providers" >}}) can hold multiple *storage spaces*. At an oCIS instance, there might be a dedicated *storage provider* responsible for users' personal storage spaces. There might be multiple, either to shard the load, provide different levels of redundancy or support custom workflows. Or there might be just one, hosting all types of *storage spaces*. {{< svg src="ocis/static/idea.drawio.svg" >}} @@ -35,19 +33,18 @@ Einstein copies the URL in the browser (or an email with the same URL is sent au When Marie enters that URL she will be presented with a login form on the `https://cloud.zurich.test` instance, because the share was created on that domain. If `https://cloud.zurich.test` trusts her OpenID Connect identity provider `https://idp.paris.test`, she can log in. This time, the *storage space registry* discovery will come up with `https://cloud.paris.test` though. Since that registry is different from the registry tied to `https://cloud.zurich.test`, oCIS web can look up the *storage space* `716199a6-00c0-4fec-93d2-7e00150b1c84` and register the WebDAV URL `https://cloud.zurich.test/dav/spaces/716199a6-00c0-4fec-93d2-7e00150b1c84/a/rel/path` in Marie's *storage space registry* at `https://cloud.paris.test`. When she accepts that share, her clients will be able to sync the new *storage space* at `https://cloud.zurich.test`. -### oCIS microservice runtime +Or in other words: _total world federation!_ +### oCIS microservice runtime The oCIS runtime allows us to dynamically manage services running in a single process. We use [suture](https://github.com/thejerf/suture) to create a supervisor tree that starts each service in a dedicated goroutine. By default oCIS will start all built-in oCIS extensions in a single process. Individual services can be moved to other nodes to scale-out and meet specific performance requirements. A [go-micro](https://github.com/asim/go-micro/blob/master/registry/registry.go) based registry allows services in multiple nodes to form a distributed microservice architecture. ### oCIS extensions - Every oCIS extension uses [ocis-pkg](https://github.com/owncloud/ocis/tree/master/ocis-pkg), which implements the [go-micro](https://go-micro.dev/) interfaces for [servers](https://github.com/asim/go-micro/blob/v3.5.0/server/server.go#L17-L37) to register and [clients](https://github.com/asim/go-micro/blob/v3.5.0/client/client.go#L11-L23) to look up nodes with a service [registry](https://github.com/asim/go-micro/blob/v3.5.0/registry/registry.go). We are following the [12 Factor](https://12factor.net/) methodology with oCIS. The uniformity of services also allows us to use the same command, logging and configuration mechanism. Configurations are forwarded from the oCIS runtime to the individual extensions.
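+
+To illustrate this uniformity, a minimal go-micro based service could look like the following sketch (the service name is made up; the import path assumes go-micro v3 as linked above):
+
+```go
+package main
+
+import (
+	micro "github.com/asim/go-micro/v3"
+)
+
+func main() {
+	// Create a service that announces itself to the configured registry
+	// (mdns by default) so other nodes can discover it.
+	service := micro.NewService(
+		micro.Name("works.owncloud.example"), // made-up service name
+	)
+	service.Init() // parse command line flags and apply options
+
+	if err := service.Run(); err != nil {
+		panic(err)
+	}
+}
+```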
### go-micro - While the [go-micro](https://go-micro.dev/) framework provides abstractions as well as implementations for the different components in a microservice architecture, it uses a more developer-focused runtime philosophy: It is used to download services from a repo, compile them on the fly and start them as individual processes. For oCIS we decided to use a more admin-friendly runtime: You can download a single binary and start the contained oCIS extensions with a single `bin/ocis server`. This also makes packaging easier. We use [ocis-pkg](https://github.com/owncloud/ocis/tree/master/ocis-pkg) to configure the default implementations for the go-micro [grpc server](https://github.com/asim/go-micro/tree/v3.5.0/plugins/server/grpc), [client](https://github.com/asim/go-micro/tree/v3.5.0/plugins/client/grpc) and [mdns registry](https://github.com/asim/go-micro/blob/v3.5.0/registry/mdns_registry.go), swapping them out as needed, e.g. to use the [kubernetes registry plugin](https://github.com/asim/go-micro/tree/v3.5.0/plugins/registry/kubernetes). @@ -62,7 +59,6 @@ Interacting with oCIS involves a multitude of APIs. The server and all clients r We run a huge [test suite](https://github.com/owncloud/core/tree/master/tests), which originated in ownCloud 10 and continues to grow. A detailed description can be found in the developer docs for [testing]({{< ref "development/testing" >}}). ### Architecture Overview - Running `bin/ocis server` will start the below services, all of which can be scaled and deployed on a single node or in a cloud native environment, as needed. {{< svg src="ocis/static/architecture-overview.drawio.svg" >}} diff --git a/docs/ocis/migration.md b/docs/ocis/migration.md index 12583248a..8fd1ecd41 100644 --- a/docs/ocis/migration.md +++ b/docs/ocis/migration.md @@ -9,14 +9,6 @@ geekdocFilePath: migration.md The migration happens in subsequent stages while the service is online. First all users need to migrate to the new architecture, then the global namespace needs to be introduced. Finally, the data on disk can be migrated user by user by switching the storage driver. -
- -{{< hint warning >}} -@jfd: It might be easier to introduce the spaces api in oc10 and then migrate to oCIS. We cannot migrate both at the same time, the architecture to oCIS (which will change fileids) and introduce a global namespace (which requires stable fileids to let clients handle moves without redownloading). Either we implement arbitrary mounting of shares in oCIS / reva or we make clients and oc10 spaces aware. -{{< /hint >}} - -
- ## Migration Stages ### Stage 0: pre migration @@ -56,7 +48,7 @@ The ownCloud 10 demo instance uses OAuth to obtain a token for ownCloud web and
-_TODO make oauth2 in oc10 trust the new web ui, based on `redirect_uri` and CSRF so no explicit consent is needed_ +_TODO make oauth2 in oc10 trust the new web ui, based on `redirect_uri` and CSRF so no explicit consent is needed?_ #### FAQ _Feel free to add your question as a PR to this document using the link at the top of this page!_ @@ -72,13 +64,12 @@ While SAML and Shibboleth are protocols that solve that problem, they are limite
-_TODO @butonic add ADR for OpenID Connect_ +_TODO @butonic add ADR for OpenID Connect and flesh out pros and cons of the above_
#### User impact -When introducing OpenID Connect, the clients will detect the new authentication scheme when their current way of authenticating returns an error. Users will then have to -reauthorize at the OpenID Connecd IdP, which again, may be configured to skip the consent step for trusted clients. +When introducing OpenID Connect, the clients will detect the new authentication scheme when their current way of authenticating returns an error. Users will then have to reauthorize at the OpenID Connect IdP, which, again, may be configured to skip the consent step for trusted clients. #### Steps 1. There are multiple products that can be used as an OpenID Connect IdP. We test with [LibreGraph Connect](https://github.com/libregraph/lico), which is also [embedded in oCIS](https://github.com/owncloud/web/). Other alternatives include [Keycloak](https://www.keycloak.org/) or [Ping](https://www.pingidentity.com/). Please refer to the corresponding setup instructions for the product you intend to use. @@ -106,7 +97,7 @@ Should there be problems with OpenID Connect at this point you can disable the a
Legacy clients relying on Basic auth or app passwords need to be migrated to OpenID Connect to work with oCIS. For a transition period Basic auth in oCIS can be enabled with `PROXY_ENABLE_BASIC_AUTH=true`, but we strongly recommend adopting OpenID Connect for other tools as well. -While OpenID Connect providers will send an `iss` and `sub` claim that relying parties (services like oCIS or ownCloud 10) can use to identify users we recommend introducing a dedicated, globally unique, persistent, non-reassignable user identifier like a UUID for every user. This `ownclouduuid` shold be sent as an additional claim to save additional lookups on the server side. It will become the user id in oCIS, e.g. when searching for recipients the `ownclouduuid` will be used to persist permissions with the share manager. It has a different purpose than the ownCloud 10 username, which is used to login. Using UUIDs we can not only mitigate username collisions when merging multiple instances but also allow renaming usernames after the migration to oCIS has been completed. +While OpenID Connect providers will send an `iss` and `sub` claim that relying parties (services like oCIS or ownCloud 10) can use to identify users, we recommend introducing a dedicated, globally unique, persistent, non-reassignable user identifier like a UUID for every user. This `ownclouduuid` should be sent as an additional claim to save additional lookups on the server side. It will become the user id in oCIS, e.g. when searching for recipients the `ownclouduuid` will be used to persist permissions with the share manager. It has a different purpose than the ownCloud 10 username, which is used to log in. Using UUIDs we can not only mitigate username collisions when merging multiple instances but also allow renaming usernames after the migration to oCIS has been completed.
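+
+For illustration, a relying party could map the token payload to a struct like this (a hedged Go sketch; `iss` and `sub` are standard OpenID Connect claims, `ownclouduuid` is the custom claim recommended above):
+
+```go
+package auth
+
+import "encoding/json"
+
+// Claims sketches the part of a verified ID token payload that a relying
+// party would inspect. The struct is illustrative.
+type Claims struct {
+	Issuer       string `json:"iss"`          // identifies the IdP
+	Subject      string `json:"sub"`          // IdP-local user identifier
+	OwnCloudUUID string `json:"ownclouduuid"` // recommended custom claim
+}
+
+// parseClaims decodes the JSON payload of an already verified ID token.
+func parseClaims(payload []byte) (Claims, error) {
+	var c Claims
+	err := json.Unmarshal(payload, &c)
+	return c, err
+}
+```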
@@ -117,9 +108,9 @@ _Feel free to add your question as a PR to this document using the link at the t
-### Stage 3: introduce oCIS interally +### Stage 3: introduce oCIS internally -Befor letting oCIS handle end user requests we will first make it available in the internal network. By subsequently adding services we can add functionality and verify the services work as intended. +Before letting oCIS handle end user requests we will first make it available in the internal network. By subsequently adding services we can add functionality and verify the services work as intended. Start oCIS backend and make read only tests on existing data using the `owncloudsql` storage driver which will read (and write) - blobs from the same datadirectory layout as in ownCloud 10 @@ -139,7 +130,7 @@ None, only administrators will be able to explore oCIS during this stage. #### Steps and verifications -We are going to run and explore a series of services that will together handle the same requests as ownCloud 10. For initial exploration the oCIS binary is recommended. The services can later be deployed using a single oCIS runtime or in multiple cotainers. +We are going to run and explore a series of services that will together handle the same requests as ownCloud 10. For initial exploration the oCIS binary is recommended. The services can later be deployed using a single oCIS runtime or in multiple containers. ##### Storage provider for file metadata @@ -172,7 +163,7 @@ Enable spaces API in oc10: {{< hint warning >}} **Alternative 2** -An additional `uuid` property used only to detect moves. A lookup by uuid is not necessary for this. The `/dav/meta` endpoint would still take the fileid. Clients would use the `uuid` to detect moves and set up new sync pairs when migrating to a global namespace. +An additional `uuid` property used only to detect moves. A lookup by uuid is not necessary for this. The `/dav/meta` endpoint would still take the fileid. Clients would use the `uuid` to detect moves and set up new sync pairs when migrating to a global namespace. ### Stage-3.1 Generate a `uuid` for every file as a file property. Clients can submit a `uuid` when creating files. The server will create a `uuid` if the client did not provide one. @@ -280,7 +271,7 @@ The IP address of the ownCloud host changes. There is no change for the file syn 2. Verify the requests are routed based on the ownCloud 10 routing policy `oc10` by default ##### Test user based routing -1. Change the routing policy for a user or an early adoptors group to `ocis`
_TODO @butonic currently, the migration selector will use the `ocis` policy for users that have been added to the accounts service. IMO we need to evaluate a claim from the IdP._
+1. Change the routing policy for a user or an early adopters group to `ocis`
_TODO @butonic currently, the migration selector will use the `ocis` policy for users that have been added to the accounts service. IMO we need to evaluate a claim from the IdP._
2. Verify the requests are routed based on the oCIS routing policy `ocis` for 'migrated' users. At this point you are ready to rock & roll! @@ -322,8 +313,8 @@ _TODO @butonic update performance comparisons nightly_ #### Steps There are several options to move users to the oCIS backend: -- Use a canary app to let users decide thamselves -- Use an early adoptors group with an opt in +- Use a canary app to let users decide themselves +- Use an early adopters group with an opt-in - Force migrate users in batch or one by one at the administrator's will #### Verification The same verification steps as for the internal testing stage apply. Just from t @@ -333,7 +324,7 @@ Until now, the oCIS configuration mimics ownCloud 10 and uses the old data directory layout and the ownCloud 10 database. Users can seamlessly be switched from ownCloud 10 to oCIS and back again.
-_TODO @butonic we need a canary app that allows users to decide for themself which backend to use_ +_TODO @butonic we need a canary app that allows users to decide for themselves which backend to use_
@@ -401,12 +392,12 @@ Noticeable performance improvements because we effectively shard the storage log _TODO @butonic implement `ownclouds3` based on `s3ng`_ _TODO @butonic implement tiered storage provider for seamless migration_ -_TODO @butonic document how to manually do that until the storge registry can discover that on its own._ +_TODO @butonic document how to manually do that until the storage registry can discover that on its own._
#### Verification -Start with a test user, then move to early adoptors and finally migrate all users. +Start with a test user, then move to early adopters and finally migrate all users. #### Rollback To switch the storage provider again, the same storage space migration can be performed: copy metadata and blob data using the CS3 API, then change the responsible storage provider in the storage registry. @@ -432,7 +423,7 @@ Migrate share data to _yet to determine_ share manager backend and shut down own The ownCloud 10 database still holds share information in the `oc_share` and `oc_share_external` tables. They are used to efficiently answer queries about who shared what with whom. In oCIS shares are persisted using a share manager and if desired these grants are also sent to the storage provider so it can set ACLs if possible. Only one system should be responsible for the shares, which in case of treating the storage as the primary source effectively turns the share manager into a cache. #### User impact -Depending on chosen the share manager provider some sharing requests should be faster: listing incoming and outgoing shares is no longer bound to the ownCloud 10 database but to whatever technology is used by the share provdier: +Depending on the chosen share manager provider, some sharing requests should be faster: listing incoming and outgoing shares is no longer bound to the ownCloud 10 database but to whatever technology is used by the share provider: - For non HA scenarios they can be served from memory, backed by a simple json file. - TODO: implement share manager with redis / nats / ... key value store backend: use the micro store interface please ... @@ -446,13 +437,13 @@ Depending on chosen the share manager provider some sharing requests should be f _TODO for HA implement share manager with redis / nats / ... key value store backend: use the micro store interface please ..._ _TODO for batch migration implement share data migration cli with progress that reads all shares via the cs3 api from one provider and writes them into another provider_ -_TODO for seamless migration implement tiered/chained share provider that reads share data from the old provider and writes newc shares to the new one_ +_TODO for seamless migration implement tiered/chained share provider that reads share data from the old provider and writes new shares to the new one_ _TODO for storage provider as source of truth persist ALL share data in the storage provider. Currently, part is stored in the share manager, part is in the storage provider. We can keep both, but the share manager should directly persist its metadata to the storage system used by the storage provider so metadata is kept in sync_
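+
+As a sketch of the 'use the micro store interface' idea from the TODOs above, a share manager backend could persist each share as a record, keeping the concrete key value store (memory backed by a json file, redis, nats, ...) swappable. The `Share` layout below is hypothetical:
+
+```go
+package shares
+
+import (
+	"encoding/json"
+
+	"github.com/asim/go-micro/v3/store"
+)
+
+// Share is a hypothetical, minimal share record.
+type Share struct {
+	ID         string `json:"id"`
+	ResourceID string `json:"resource_id"`
+	Grantee    string `json:"grantee"`
+	Permission string `json:"permission"`
+}
+
+// persistShare writes a share through the go-micro store interface so the
+// backing key value store stays swappable.
+func persistShare(s store.Store, sh Share) error {
+	b, err := json.Marshal(sh)
+	if err != nil {
+		return err
+	}
+	return s.Write(&store.Record{Key: sh.ID, Value: b})
+}
+```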
#### Verification -After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adoptors to the new gateway. When no problems occur you can stirt the desired number of share managers and roll out the change to all gateways. +After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adopters to the new gateway. When no problems occur you can start the desired number of share managers and roll out the change to all gateways.
@@ -461,12 +452,12 @@ _TODO let the gateway write updates to multiple share managers ... or rely on th
#### Rollback -To switch the share manager to the database one revert routing users to the new share manager. If you already shut down the old share manager start it again. Use the tiered/chained share manager provider in reverse configuration (new share provider as read only, old as write) and migrate the shares again. You can alse restore a database backup if needed. +To switch the share manager back to the database-backed one, revert routing users to the new share manager. If you already shut down the old share manager, start it again. Use the tiered/chained share manager provider in reverse configuration (new share provider as read only, old as write) and migrate the shares again. You can also restore a database backup if needed.
### Stage-10 -Profit! Well, on the one hand you do not need to maintain a clustered database setup and can rely on the storage system. On the other hand you are now in microservice wonderland and will have to relearn how to identify bottlenecks and scale oCIS accordingly. The good thing is that tools like jaeger and prometheus have evolved and will help you understand what is going on. But this is a different Topic. See you on the other side! +Profit! Well, on the one hand you do not need to maintain a clustered database setup and can rely on the storage system. On the other hand you are now in microservice wonderland and will have to relearn how to identify bottlenecks and scale oCIS accordingly. The good thing is that tools like jaeger and prometheus have evolved and will help you understand what is going on. But this is a different topic. See you on the other side! #### FAQ _Feel free to add your question as a PR to this document using the link at the top of this page!_ @@ -709,7 +700,7 @@
-The `dn` -> *owncloud internal username* mapping that currently lives in the `oc_ldap_user_mapping` table needs to move into a dedicated ownclouduuid attribute in the LDAP server. The idp should send it as a claim so the proxy does not have to look up the user using LDAP again. The username cannot be changed in ownCloud 10 and the oCIS provisioning API will not allow changing it as well. When we introduce the graph api we may allow changing usernames when all clients have moved to that api. +The `dn` -> *owncloud internal username* mapping that currently lives in the `oc_ldap_user_mapping` table needs to move into a dedicated `ownclouduuid` attribute in the LDAP server. The IdP should send it as a claim so the proxy does not have to look up the user using LDAP again. The username cannot be changed in ownCloud 10 and the oCIS provisioning API will not allow changing it either. When we introduce the graph api we may allow changing usernames when all clients have moved to that api. The problem is that the username in ownCloud 10 and in oCIS also needs to be the same, which might not be the case when the ldap mapping used a different column. In that case we should add another `owncloudusername` attribute to the ldap server.