From 93c833348e149f3f3db3531f12c0c585c6ff2c92 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?=
Date: Thu, 22 Jul 2021 15:36:48 +0000
Subject: [PATCH 1/4] separate proposed changes from terminology
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Jörn Friedrich Dreyer
---
 docs/extensions/storage/proposedchanges.md    | 114 +++++
 .../storage/static/spacesregistry.drawio.svg  | 434 ++++++++++++++++++
 .../storage/static/storageprovider.drawio.svg | 390 ++++++++++++----
 .../static/storageregistry-spaces.drawio.svg  | 327 -------------
 docs/extensions/storage/storages.md           |  10 +-
 docs/extensions/storage/terminology.md        | 158 ++-----
 docs/extensions/storage/updating.md           |   4 +-
 docs/ocis/_index.md                           |  14 +-
 docs/ocis/migration.md                        |  49 +-
 9 files changed, 924 insertions(+), 576 deletions(-)
 create mode 100644 docs/extensions/storage/proposedchanges.md
 create mode 100644 docs/extensions/storage/static/spacesregistry.drawio.svg
 delete mode 100644 docs/extensions/storage/static/storageregistry-spaces.drawio.svg

diff --git a/docs/extensions/storage/proposedchanges.md b/docs/extensions/storage/proposedchanges.md
new file mode 100644
index 0000000000..e8c1a14cec
--- /dev/null
+++ b/docs/extensions/storage/proposedchanges.md
@@ -0,0 +1,114 @@
+---
+title: "Proposed Changes"
+date: 2018-05-02T00:00:00+00:00
+weight: 18
+geekdocRepo: https://github.com/owncloud/ocis
+geekdocEditPath: edit/master/docs/extensions/storage
+geekdocFilePath: proposedchanges.md
+---
+
+Some architectural changes still need to be clarified or changed. Maybe an ADR is in order for all of the below.
+
+## Reva Gateway changes
+
+## A dedicated shares storage provider
+
+Currently, the *gateway* treats `/home/shares` differently from any other path: it will stat all children and calculate an etag to allow clients to discover changes in accepted shares.
This requires the storage provider to cooperate and provide this special `/shares` folder in the root of a user's home when it is accessed as a home storage, which is a config flag that needs to be set for every storage driver.
+
+The `enable_home` flag will cause drivers to jail path based requests into a `` subfolder. In effect it divides a storage provider into multiple [*storage spaces*]({{< ref "#storage-spaces" >}}): when calling `CreateHome` a subfolder following the `` is created and marked as the root of a user's home. Both the eos and ocis storage drivers use extended attributes to mark the folder as the end of the size aggregation and tree mtime propagation mechanism. Even setting the quota is possible this way. All this literally is a [*storage space*]({{< ref "#storage-spaces" >}}).
+
+We can implement [ListStorageSpaces](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ListStorageSpacesRequest) in one of several ways:
+- iterating over the root of the storage and treating every folder following the `` as a `home` *storage space*,
+- iterating over the root of the storage and treating every folder following a new `` as a `project` *storage space*,
+- iterating over the root of the storage and treating every folder following a generic `` as a *storage space* for a configurable space type, or
+- configuring a map of `space type` to `layout` (based on the [CreateStorageSpaceRequest](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.CreateStorageSpaceRequest)), which would allow things like
+```
+home=/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}}
+spaces=/spaces/var/lib/ocis/storage/projects/{{.Name}}
+```
+
+This would make the `GetHome()` call return the path to the *storage provider* including the relative path to the *storage space*. No need for a *storage provider* mounted at `/home`. This is just a UI alias for `/users/`. Just like a normal `/home/` on a Linux machine.
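The layout strings above read like Go `text/template` templates. As a minimal sketch of how such a configured layout could be expanded for a space owner — the `User` struct and the sprig-style `substr` helper registered here are illustrative assumptions, not the actual reva types:

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// User stands in for the owner object the layout templates reference;
// illustrative only, not the actual reva/CS3 user type.
type User struct {
	Username string
}

// renderLayout expands a space-type layout template, assuming a
// sprig-style "substr start end str" helper is registered.
func renderLayout(layout string, owner User) (string, error) {
	funcs := template.FuncMap{
		"substr": func(start, end int, s string) string { return s[start:end] },
	}
	tmpl, err := template.New("layout").Funcs(funcs).Parse(layout)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, struct{ Owner User }{owner}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	path, err := renderLayout(
		"/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}}",
		User{Username: "einstein"})
	if err != nil {
		panic(err)
	}
	fmt.Println(path) // /var/lib/ocis/storage/home/e/einstein
}
```

With a per-space-type layout map like this, provisioning a home and provisioning a project space could share one code path that differs only in the configured template.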
+
+But if we have no `/home` where do we find the shares, and how can clients discover changes in accepted shares?
+
+The `/shares` namespace should be provided by a *shares storage provider* that lists all accepted shares for the current user... but what about copy-pasting links from the browser? Well, this storage is only really needed to have a path to OCM shares that actually reside on other instances. In the UI the shares would be listed by querying a *share manager*. It returns ResourceIds, which can be stated to fetch a path that is then accessible in the CS3 global namespace. Two caveats:
+- This only works for resources that are actually hosted by the current instance. For those it would leak the parent path segments to a shared resource.
+- For accepted OCM shares there must be a path in the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}) that has to be the same for all users, otherwise they cannot copy and share those URLs.
+
+Work on this is done in https://github.com/cs3org/reva/pull/1846
+
+### The gateway should be responsible for path transformations
+
+Currently, storage providers are aware of their mount point, coupling them tightly with the gateway.
+
+Tracked in https://github.com/cs3org/reva/issues/578
+
+Work is done in https://github.com/cs3org/reva/pull/1866
+
+## URL escaped string representation of a CS3 reference
+
+For the `/dav/spaces/` endpoint we need to encode the *reference* in a URL-compatible way.
+1. We can separate the path using a `/`: `/dav/spaces//`
+2. The `spaceid` currently is a CS3 resourceid, consisting of `` and ``. Since the nodeid might contain `/`, e.g. for the local driver, we have to urlencode the spaceid.
+
+To access resources by id we need to make the `/dav/meta/` endpoint able to list directories... Otherwise id based navigation first has to look up the path. Or we use the libregraph API for id based navigation.
+
+A *reference* is a logical concept.
It identifies a [*resource*]({{< ref "#resources" >}}) and consists of a `` and a ``. A `` consists of a `` and a ``. They can be concatenated using the separators `!` and `:`:
+```
+!:
+```
+While all components are optional, only three cases are used:
+| format | example | description |
+|-|-|-|
+| `!:` | `!:/absolute/path/to/file.ext` | absolute path |
+| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:path/to/file.ext` | path relative to the root of the storage space |
+| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!c3cf23bb-8f47-4719-a150-1d25a1f6fb56:to/file.ext` | path relative to the specified node in the storage space, used to reference resources without disclosing parent paths |
+
+`` should be a UUID to prevent references from breaking when a *user* or [*storage space*]({{< ref "#storage-spaces" >}}) gets renamed. But it can also be derived from a migration of an oc10 instance by concatenating an instance identifier and the numeric storage id from oc10, e.g. `oc10-instance-a$1234`.
+
+A reference will often start as an absolute/global path, e.g. `!:/home/Projects/Foo`. The gateway will look up the storage provider that is responsible for the path:
+
+| Name | Description | Who resolves it? |
+|------|-------------|-|
+| `!:/home/Projects/Foo` | the absolute path a client like davfs will use. | The gateway uses the storage registry to look up the responsible storage provider |
+| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:/Projects/Foo` | the `storage_space` is the same as the `root`, the path becomes relative to the root | the storage provider can use this reference to identify this resource |
+
+Now the same file is accessed as a share:
+| Name | Description |
+|------|-------------|
+| `!:/users/Einstein/Projects/Foo` | `Foo` is the shared folder |
+| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a` is the id of `Foo`, the path is empty |
+
+The `:`, `!` and `$` are chosen from the set of [RFC3986 sub delimiters](https://tools.ietf.org/html/rfc3986#section-2.2) on purpose. They can be used in URLs without having to be encoded. In some cases, a delimiter can be left out if a component is not set:
+| reference | interpretation |
+|-|-|
+| `/absolute/path/to/file.ext` | absolute path, all delimiters omitted |
+| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!path/to/file.ext` | relative path in the given storage space, root delimiter `:` omitted |
+| `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:to/file.ext` | relative path in the given root node, storage space delimiter `!` omitted |
+| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | node id in the given storage space, `:` must be present |
+| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62` | root of the storage space, all delimiters omitted, can be distinguished from an absolute path by the missing leading `/` |
+
+## Space providers
+When looking up an id based resource, the reference must use a logical space id, not a CS3 resource id. Otherwise id based requests, which only have a resourceid consisting of a storage id and a node id, cannot be routed to the correct storage provider if the storage has moved from one storage provider to another.
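The delimiter rules in the tables above can be summarized in a small parser. This is a sketch only; the `Reference` struct and its field names are illustrative, not the actual CS3 message types:

```go
package main

import (
	"fmt"
	"strings"
)

// Reference mirrors the three reference components described above;
// field names are illustrative, not the CS3 API ones.
type Reference struct {
	StorageSpaceID string // empty for plain absolute paths
	RootNodeID     string // empty when the path is relative to the space root
	Path           string // absolute when it starts with "/"
}

// parseReference splits a "space!root:path" string following the
// delimiter rules above: "!" separates the storage space id, ":" the
// root node id, and either delimiter may be omitted.
func parseReference(s string) Reference {
	var r Reference
	if strings.HasPrefix(s, "/") {
		r.Path = s // absolute path, all delimiters omitted
		return r
	}
	rest := s
	if i := strings.Index(rest, "!"); i >= 0 {
		r.StorageSpaceID, rest = rest[:i], rest[i+1:]
	}
	if i := strings.Index(rest, ":"); i >= 0 {
		r.RootNodeID, r.Path = rest[:i], rest[i+1:]
	} else if r.StorageSpaceID != "" {
		r.Path = rest // relative path in the given storage space
	} else {
		r.StorageSpaceID = rest // bare id: root of a storage space
	}
	return r
}

func main() {
	for _, s := range []string{
		"/absolute/path/to/file.ext",
		"ee1687e5-ac7f-426d-a6c0-03fed91d5f62!path/to/file.ext",
		"ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:",
	} {
		fmt.Printf("%+v\n", parseReference(s))
	}
}
```

A bare id parses as the root of a storage space because, unlike an absolute path, it carries no leading `/` — which is exactly the distinction the last table row relies on.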
If the registry routes based on the storageid AND the nodeid, it has to keep a cache of all nodeids in order to route all requests for a storage space (which consists of storage id + nodeid) to the correct storage provider. The correct resourceid for a node in a storage space would be `$!`. The `$` part allows the storage registry to route all id based requests to the correct storage provider. This becomes relevant when the storage space was moved from one storage provider to another. The storage space id remains the same, but the internal address and port change.
+
+TODO discuss to clarify further
+
+## Storage drivers
+
+### Allow clients to send a uuid on upload
+iOS clients can only queue single requests to be executed in the background. They queue an upload and need to be able to identify the uploaded file after it has been uploaded to the server. The disconnected nature of the connection might cause workflows or manual user interaction with the file on the server to move the file to a different place or change the content while the device is offline. However, on the device users might have marked the file as favorite or added it to other iOS specific collections. To be able to reliably identify the file, the client can generate a `uuid` and attach it to the file metadata during the upload. While it is not necessary to look up files by this `uuid`, having a second file id that serves exactly the same purpose as the `file id` is redundant.
+
+Another aspect for the `file id` / `uuid` is that it must be a logical identifier that can be set, at least by internal systems. Without a writeable fileid we cannot restore backups or migrate storage spaces from one storage provider to another storage provider.
+
+Technically, this means that every storage driver needs to have a map of a `uuid` to an internal resource identifier.
This internal resource identifier can be +- an eos fileid, because eos can look up files by id +- an inode if the filesystem and the storage driver support looking up by inode +- a path if the storage driver has no way of looking up files by id. + - In this case other mechanisms like inotify, kernel audit or a fuse overlay might be used to keep the paths up to date. + - to prevent excessive writes when deep folders are renamed a reverse map might be used: it will map the `uuid` to `:`, in order to trade writes for reads + - as a fallback a sync job can read the file id from the metadata of the resources and populate the uuid to internal id map. + +The TUS upload can take metadata, for PUT we might need a header. \ No newline at end of file diff --git a/docs/extensions/storage/static/spacesregistry.drawio.svg b/docs/extensions/storage/static/spacesregistry.drawio.svg new file mode 100644 index 0000000000..e3a771b07e --- /dev/null +++ b/docs/extensions/storage/static/spacesregistry.drawio.svg @@ -0,0 +1,434 @@ + + + + + + + +
+
+
+ oCIS System +
+ [Software System] +
+
+
+
+ + oCIS System... + +
+
+ + + + + + +
+
+
+ + Einstein + +
+ [Person] +
+
+
+ End user +
+
+
+
+
+ + Einstein... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + +
+
+
+ + Storage Space Registry + +
+ [Container: golang, HTTP, libregraph] +
+
+
+ Manages spaces for users +
+
+
+
+
+ + Storage Space Registry... + +
+
+ + + + +
+
+
+ + Storage Provider + +
+ [Container: golang] +
+
+
+ Persists storage spaces using reva +
+
+
+
+
+ + Storage Provider... + +
+
+ + + + +
+
+
+ + Storage System + +
+ [Software System] +
+
+
+ provides persistent storage +
+
+
+
+
+ + Storage System... + +
+
+ + + + + + +
+
+
+ + Moss + +
+ [Person] +
+
+
+ Administrator +
+
+
+
+
+ + Moss... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [POSIX, S3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [WebDAV, libregraph, CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
Manages the user's Storage Spaces in
+
+ [libregraph] +
+
+
+
+
+
+ + Manages the users Storage Spac... + +
+
+ + + + + +
+
+
+
+
+ + Manages resources with + +
+
+ [Web UI or native clients] +
+
+
+
+
+
+ + Manages resources with... + +
+
+ + + + + +
+
+
+
+
+ + Registers itself at and +
+ sends space root etag changes to +
+
+
+ [CS3, libregraph?, PUSH] +
+
+
+
+
+
+ + Registers itself at and... + +
+
+ + + + + +
+
+
+
+
+ + Manages organizational Storage Spaces in + +
+
+ [WebDAV, libregraph, CS3, CLI] +
+
+
+
+
+
+ + Manages organizational Storage... + +
+
+ + + + +
+
+
+ + Identity Management System + +
+ [Software System] +
+
+
+ provides users and groups +
+
+
+
+
+ + Identity Management System... + +
+
+ + + + + +
+
+
+
+
+ + Authenticates users and searches recipients with + +
+
+ [OpenID Connect, LDAP, REST] +
+
+
+
+
+
+ + Authenticates users and search... + +
+
+ + + + +
+
+
+

+ C4 Container diagram for the oCIS System +

+

As a platform, the oCIS system may not only include web, mobile and desktop clients but also the underlying storage system or an identity management system

+

+ Date: 2021-07-22T16:43 +

+
+
+
+
+ + C4 Container diagram for the oCIS System... + +
+
+
+ + + + + Viewer does not support full SVG 1.1 + + + +
\ No newline at end of file diff --git a/docs/extensions/storage/static/storageprovider.drawio.svg b/docs/extensions/storage/static/storageprovider.drawio.svg index 4b88a71c17..e7ba5ea770 100644 --- a/docs/extensions/storage/static/storageprovider.drawio.svg +++ b/docs/extensions/storage/static/storageprovider.drawio.svg @@ -1,119 +1,345 @@ - + - - + -
-
-
- - CS3 -
- storage provider -
- API (GRPC) -
-
+
+
+
+ oCIS storage provider +
+ [Software System]
- - CS3... + + oCIS storage provider... - - - - - - - - - - + -
+
-
- storage provider +
+ + reva storage provider + +
+ [Component: golang] +
+
+
+ hosts multiple storage spaces using a storage driver +
- - storage provider + + reva storage provider... - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + -
+
-
- - / - +
+ + reva gateway + +
+ [Component: golang] +
+
+
+ API facade for internal reva services +
- - / + + reva gateway... + + + + + + + +
+
+
+ + Storage System + +
+ [Software System] +
+
+
+ provides persistent storage +
+
+
+
+
+ + Storage System... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [POSIX, S3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + +
+
+
+ + reva frontend + +
+ [Component: golang] +
+
+
+ handles protocol translation +
+
+
+
+
+ + reva frontend... + +
+
+ + + + +
+
+
+ + oCIS proxy + +
+ [Component: golang] +
+
+
Routes requests to oc10 or oCIS
+
+
+
+
+ + oCIS proxy... + +
+
+ + + + + +
+
+
+
+
+ + Mints an internal JWT +
and forwards requests to
+
+
+ [WebDAV, OCS, OCM, tus] +
+
+
+
+
+
+ + Mints an internal JWT... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, +
+ Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [WebDAV, libregraph, CS3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Forwards to + +
+
+ [CS3, storage registry] +
+
+
+
+
+
+ + Forwards to... + +
+
+ + + + +
+
+
+

+ C4 Component diagram for an oCIS storage provider +

+

+ An oCIS storage provider manages resources in storage spaces by persisting them with a specific storage driver in a storage system. +

+

+ Date: 2021-07-22T12:40 +

+
+
+
+
+ + C4 Component diagram for an oCIS storage provider...
- - - - - - - - diff --git a/docs/extensions/storage/static/storageregistry-spaces.drawio.svg b/docs/extensions/storage/static/storageregistry-spaces.drawio.svg deleted file mode 100644 index 3c2d49717f..0000000000 --- a/docs/extensions/storage/static/storageregistry-spaces.drawio.svg +++ /dev/null @@ -1,327 +0,0 @@ - - - - - - - - -
-
-
- The storage registry currently maps paths and storageids to the -
- - address:port - - of the corresponding storage provider -
-
-
-
- - The storage registry currently maps... - -
-
- - - - - - - -
-
-
- storage registry -
-
-
-
- - storage registry - -
-
- - - - - - - -
-
-
- storage providers -
-
-
-
- - storage providers - -
-
- - - - - - - - - - - - -
-
-
- The gateway uses the storage registry to look up the storage provider that is responsible for path and id based references in incoming requests. -
-
-
-
- - The gateway uses the storage regist... - -
-
- - - - - - - -
-
-
- gateway -
-
-
-
- - gateway - -
-
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - Viewer does not support full SVG 1.1 - - - -
\ No newline at end of file

diff --git a/docs/extensions/storage/storages.md b/docs/extensions/storage/storages.md
index 6172dc10ef..59927fd933 100644
--- a/docs/extensions/storage/storages.md
+++ b/docs/extensions/storage/storages.md
@@ -78,7 +78,15 @@ The storage keeps an activity history, tracking the different actions that have
 
 ## Storage drivers
 
-Reva currently has four storage driver implementations that can be used for *storage providers* an well as *data providers*.
+Reva currently has several storage driver implementations that can be used for *storage providers* as well as *data providers*.
+
+### OCIS and S3NG Storage Driver
+
+The oCIS storage driver is the default storage driver. It decomposes the metadata and persists it in a POSIX filesystem. Blobs are stored on the filesystem as well. The layout makes extensive use of symlinks and extended attributes. A filesystem like xfs or zfs without inode size limitations is recommended. We will evolve this to further integrate with file systems like cephfs or gpfs.
+
+The S3NG storage driver uses the same metadata layout on a POSIX storage as the oCIS driver, but it uses S3 as the blob storage.
+
+TODO add list of capabilities / tradeoffs
 
 ### Local Storage Driver
 
diff --git a/docs/extensions/storage/terminology.md b/docs/extensions/storage/terminology.md
index 0b3ce00a4e..2b883b5705 100644
--- a/docs/extensions/storage/terminology.md
+++ b/docs/extensions/storage/terminology.md
@@ -9,66 +9,50 @@ geekdocFilePath: terminology.md
 
 Communication is hard. And clear communication is even harder. You may encounter the following terms throughout the documentation, in the code or when talking to other developers. Just keep in mind that whenever you hear or read *storage*, that term needs to be clarified, because on its own it is too vague. PR welcome.
 
-## Resources
-A *resource* is a logical concept.
Resources can be of [different types](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceType): +## Logical concepts + +### Resources +A *resource* is the basic building block that oCIS manages. It can be of [different types](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceType): - an actual *file* - a *container*, e.g. a folder or bucket - a *symlink*, or - a [*reference*]({{< ref "#references" >}}) which can point to a resource in another [*storage provider*]({{< ref "#storage-providers" >}}) -## References +### References -A *reference* is a logical concept that identifies a [*resource*]({{< ref "#resources" >}}). A [*CS3 reference*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.Reference) consists of either -- a *path* based reference, used to identify a [*resource*]({{< ref "#resources" >}}) in the [*namespace*]({{< ref "./namespaces.md" >}}) of a [*storage provider*]({{< ref "#storage-providers" >}}). It must start with a `/`. -- a [CS3 *id* based reference](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceId), uniquely identifying a [*resource*]({{< ref "#resources" >}}) in the [*namespace*]({{< ref "./namespaces.md" >}}) of a [*storage provider*]({{< ref "#storage-providers" >}}). It consists of a `storage provider id` and an `opaque id`. The `storage provider id` must NOT start with a `/`. - -{{< hint info >}} -The `/` is important because currently the static [*storage registry*]({{< ref "#storage-space-registries" >}}) uses a map to look up which [*storage provider*]({{< ref "#storage-providers" >}}) is responsible for the resource. Paths must be prefixed with `/` so there can be no collisions between paths and storage provider ids in the same map. -{{< /hint >}} +A *reference* identifies a [*resource*]({{< ref "#resources" >}}). 
A [*CS3 reference*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.Reference) can carry a *path* and a [CS3 *resource id*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceId). The references come in two flavors: absolute and combined. +Absolute references have either the *path* or the *resource id* set: +- An absolute *path* MUST start with a `/`. The *resource id* MUST be empty. +- An absolute *resource id* uniquely identifies a [*resource*]({{< ref "#resources" >}}) and is used as a stable identifier for sharing. The *path* MUST be empty. +Combined references have both, *path* and *resource id* set: +- the *resource id* identifies the root [*resource*]({{< ref "#resources" >}}) +- the *path* is relative to that root. It MUST start with `.` -{{< hint warning >}} -### Alternative: reference triple #### -A *reference* is a logical concept. It identifies a [*resource*]({{< ref "#resources" >}}) and consists of -a `storage_space`, a `` and a `` -``` -!: -``` -While all components are optional, only three cases are used: -| format | example | description | -|-|-|-| -| `!:` | `!:/absolute/path/to/file.ext` | absolute path | -| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:path/to/file.ext` | path relative to the root of the storage space | -| `!:` | `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!c3cf23bb-8f47-4719-a150-1d25a1f6fb56:to/file.ext` | path relative to the specified node in the storage space, used to reference resources without disclosing parent paths | +### Storage Spaces +A *storage space* organizes a set of [*resources*]({{< ref "#resources" >}}) in a hierarchical tree. It has a single *owner* (*user* or *group*), +a *quota*, *permissions* and is identified by a `storage space id`. -`` should be a UUID to prevent references from breaking when a *user* or [*storage space*]({{< ref "#storage-spaces" >}}) gets renamed. 
But it can also be derived from a migration of an oc10 instance by concatenating an instance identifier and the numeric storage id from oc10, e.g. `oc10-instance-a$1234`. +{{< svg src="extensions/storage/static/storagespace.drawio.svg" >}} -A reference will often start as an absolute/global path, e.g. `!:/home/Projects/Foo`. The gateway will look up the storage provider that is responsible for the path +Examples would be every user's personal storage space, project storage spaces or group storage spaces. While they all serve different purposes and may or may not have workflows like anti virus scanning enabled, we need a way to identify and manage these subtrees in a generic way. By creating a dedicated concept for them this becomes easier and literally makes the codebase cleaner. A [*storage space registry*]({{< ref "#storage-space-registries" >}}) then allows listing the capabilities of [*storage spaces*]({{< ref "#storage-spaces" >}}), e.g. free space, quota, owner, syncable, root etag, upload workflow steps, ... -| Name | Description | Who resolves it? | -|------|-------------|-| -| `!:/home/Projects/Foo` | the absolute path a client like davfs will use. | The gateway uses the storage registry to look up the responsible storage provider | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!:/Projects/Foo` | the `storage_space` is the same as the `root`, the path becomes relative to the root | the storage provider can use this reference to identify this resource | +Finally, a logical `storage space id` is not tied to a specific [*storage provider*]({{< ref "#storage-providers" >}}). If the [*storage driver*]({{< ref "#storage-drivers" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move [*storage spaces*]({{< ref "#storage-spaces" >}}) between [*storage providers*]({{< ref "#storage-providers" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs. 
-Now, the same file is accessed as a share -| Name | Description | -|------|-------------| -| `!:/users/Einstein/Projects/Foo` | `Foo` is the shared folder | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a` is the id of `Foo`, the path is empty | +### Shares +*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavywheight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the users home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even file indvidual shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.* -The `:`, `!` and `$` are chosen from the set of [RFC3986 sub delimiters](https://tools.ietf.org/html/rfc3986#section-2.2) on purpose. They can be used in URLs without having to be encoded. 
In some cases, a delimiter can be left out if a component is not set: -| reference | interpretation | -|-|-| -| `/absolute/path/to/file.ext` | absolute path, all delimiters omitted | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!path/to/file.ext` | relative path in the given storage space, root delimiter `:` omitted | -| `56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:to/file.ext` | relative path in the given root node, storage space delimiter `!` omitted | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62!56f7ceca-e7f8-4530-9a7a-fe4b7ec8089a:` | node id in the given storage space, `:` must be present | -| `ee1687e5-ac7f-426d-a6c0-03fed91d5f62` | root of the storage space, all delimiters omitted, can be distinguished by the `/` | +### Storage Space Registries -{{< /hint >}} +A *storage space registry* manages the [*namespace*]({{< ref "./namespaces.md" >}}) for a *user*: it is used by *clients* to look up storage spaces a user has access to, the `/dav/spaces` endpoint to access it via WabDAV, and where the client should mount it in the users personal namespace. -## Storage Drivers +{{< svg src="extensions/storage/static/spacesregistry.drawio.svg" >}} + + +## Technical concepts + +### Storage Drivers A *storage driver* implements access to a [*storage system*]({{< ref "#storage-systems" >}}): @@ -77,38 +61,14 @@ It maps the *path* and *id* based CS3 *references* to an appropriate [*storage s - posix inodes or paths - deconstructed filesystem nodes -{{< hint warning >}} -**Proposed Change** -iOS clients can only queue single requests to be executed in the background. The queue an upload and need to be able to identify the uploaded file after it has been uploaded to the server. The disconnected nature of the connection might cause worksflows or manual user interaction with the file on the server to move the file to a different place or changing the content while the device is offline. 
However, on the device users might have marked the file as favorite or added it to other iOS specific collections. To be able to reliably identify the file the client can generate a `uuid` and attach it to the file metadata during the upload. While it is not necessary to look up files by this `uuid` having a second file id that serves exactly the same purpose as the `file id` is redundant. - -Another aspect for the `file id` / `uuid` is that it must be a logical identifier that can be set, at least by internal systems. Without a writeable fileid we cannot restore backups or migrate storage spaces from one storage provider to another storage provider. - -Technically, this means that every storage driler needs to have a map of a `uuid` to in internal resource identifier. This internal resource identifier can be -- an eos fileid, because eos can look up files by id -- an inode if the filesystem and the storage driver support lookung up by inode -- a path if the storage driver has no way of looking up files by id. - - In this case other mechanisms like inotify, kernel audit or a fuse overlay might be used to keep the paths up to date. - - to prevent excessive writes when deep folders are renamed a reverse map might be used: it will map the `uuid` to `:`, allowing to trade writes for reads - -{{< /hint >}} -## Storage Providers +### Storage Providers A *storage provider* manages [*resources*]({{< ref "#resources" >}}) identified by a [*reference*]({{< ref "#references" >}}) by accessing a [*storage system*]({{< ref "#storage-systems" >}}) with a [*storage driver*]({{< ref "#storage-drivers" >}}). {{< svg src="extensions/storage/static/storageprovider.drawio.svg" >}} -{{< hint warning >}} -**Proposed Change** -A *storage provider* manages multiple [*storage spaces*]({{< ref "#storage-space" >}}) -by accessing a [*storage system*]({{< ref "#storage-systems" >}}) with a [*storage driver*]({{< ref "#storage-drivers" >}}). 
- -{{< svg src="extensions/storage/static/storageprovider-spaces.drawio.svg" >}} - -By making [*storage providers*]({{< ref "#storage-providers" >}}) aware of [*storage spaces*]({{< ref "#storage-spaces" >}}) we can get rid of the current `enablehome` flag / hack in reva, which lead to the [spawn of `*home` drivers](https://github.com/cs3org/reva/tree/master/pkg/storage/fs). Furthermore, provisioning a new [*storage space*]({{< ref "#storage-space" >}}) becomes a generic operation, regardless of the need of provisioning a new user home or a new project space. -{{< /hint >}} - -## Storage Space Registries +### Storage Registry A *storage registry* manages the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}): It is used by the *gateway* @@ -117,65 +77,11 @@ that should handle a [*reference*]({{< ref "#references" >}}). {{< svg src="extensions/storage/static/storageregistry.drawio.svg" >}} -{{< hint warning >}} -**Proposed Change** -A *storage space registry* manages the [*namespace*]({{< ref "./namespaces.md" >}}) for a *user*: -It is used by the *gateway* -to look up `address` and `port` of the [*storage provider*]({{< ref "#storage-providers" >}}) -that is currently serving a [*storage space*]({{< ref "#storage-space" >}}). - -{{< svg src="extensions/storage/static/storageregistry-spaces.drawio.svg" >}} - -By making *storage registries* aware of [*storage spaces*]({{< ref "#storage-spaces" >}}) we can query them for a listing of all [*storage spaces*]({{< ref "#storage-spaces" >}}) a user has access to. Including his home, received shares, project folders or group drives. See [a WIP PR for spaces in the oCIS repo (#1827)](https://github.com/owncloud/ocis/pull/1827) for more info. -{{< /hint >}} - -## Storage Spaces -A *storage space* is a logical concept: -It is a tree of [*resources*]({{< ref "#resources" >}})*resources* -with a single *owner* (*user* or *group*), -a *quota* and *permissions*, identified by a `storage space id`. 
 - -{{< svg src="extensions/storage/static/storagespace.drawio.svg" >}} - -Examples would be every user's home storage space, project storage spaces or group storage spaces. While they all serve different purposes and may or may not have workflows like anti virus scanning enabled, we need a way to identify and manage these subtrees in a generic way. By creating a dedicated concept for them this becomes easier and literally makes the codebase cleaner. A [*storage space registry*]({{< ref "#storage-space-registries" >}}) then allows listing the capabilities of [*storage spaces*]({{< ref "#storage-spaces" >}}), e.g. free space, quota, owner, syncable, root etag, upload workflow steps, ... - -Finally, a logical `storage space id` is not tied to a specific [*storage provider*]({{< ref "#storage-providers" >}}). If the [*storage driver*]({{< ref "#storage-drivers" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move [*storage spaces*]({{< ref "#storage-spaces" >}}) between [*storage providers*]({{< ref "#storage-providers" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs. - -## Shares -*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavyweight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the user's home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even individual file shares that would be wrapped in a virtual collection.
It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.* - - -## Storage Systems +### Storage Systems Every *storage system* has different native capabilities like id and path based lookups, recursive change time propagation, permissions, trash, versions, archival and more. A [*storage provider*]({{< ref "#storage-providers" >}}) makes the storage system available in the CS3 API by wrapping the capabilities as well as possible using a [*storage driver*]({{< ref "#storage-drivers" >}}). There might be multiple [*storage drivers*]({{< ref "#storage-drivers" >}}) for a *storage system*, implementing different tradeoffs to match varying requirements. -## Gateways +### Gateways A *gateway* acts as a facade to the storage related services. It authenticates and forwards API calls that are publicly accessible. - -{{< hint warning >}} -**Proposed Change** -Currently, the *gateway* treats `/home/shares` differently from any other path: it will stat all children and calculate an etag to allow clients to discover changes in accepted shares. This requires the storage provider to cooperate and provide this special `/shares` folder in the root of a user's home when it is accessed as a home storage, which is a config flag that needs to be set for every storage driver. - -The `enable_home` flag will cause drivers to jail path based requests into a `` subfolder. In effect it divides a storage provider into multiple [*storage spaces*]({{< ref "#storage-spaces" >}}): when calling `CreateHome` a subfolder following the `` is created and marked as the root of a user's home. Both the eos and ocis storage drivers use extended attributes to mark the folder as the end of the size aggregation and tree mtime propagation mechanism. Even setting the quota is possible like that. All this literally is a [*storage space*]({{< ref "#storage-spaces" >}}).
 - -We can implement [ListStorageSpaces](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ListStorageSpacesRequest) by either -- iterating over the root of the storage and treating every folder following the `` as a `home` *storage space*, -- iterating over the root of the storage and treating every folder following a new `` as a `project` *storage space*, or -- iterating over the root of the storage and treating every folder following a generic `` as a *storage space* for a configurable space type, or -- we allow configuring a map of `space type` to `layout` (based on the [CreateStorageSpaceRequest](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.CreateStorageSpaceRequest)) which would allow things like -``` -home=/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}} -spaces=/spaces/var/lib/ocis/storage/projects/{{.Name}} -``` - -This would make the `GetHome()` call return the path to the *storage provider* including the relative path to the *storage space*. No need for a *storage provider* mounted at `/home`. This is just a UI alias for `/users/`. Just like a normal `/home/` on a Linux machine. - -But if we have no `/home` where do we find the shares, and how can clients discover changes in accepted shares? - -The `/shares` namespace should be provided by a *storage provider* that lists all accepted shares for the current user... but what about copy-pasting links from the browser? Well, this storage is only really needed to have a path to ocm shares that actually reside on other instances. In the UI the shares would be listed by querying a *share manager*. It returns ResourceIds, which can be statted to fetch a path that is then accessible in the CS3 global namespace. Two caveats: -- This only works for resources that are actually hosted by the current instance. For those it would leak the parent path segments to a shared resource.
 -- For accepted OCM shares there must be a path in the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}) that has to be the same for all users, otherwise they cannot copy and share those URLs. - -{{< /hint >}} \ No newline at end of file diff --git a/docs/extensions/storage/updating.md b/docs/extensions/storage/updating.md index 2a5133618c..dd19203766 100644 --- a/docs/extensions/storage/updating.md +++ b/docs/extensions/storage/updating.md @@ -11,9 +11,9 @@ geekdocFilePath: updating.md ## Updating reva -1. Run `go get github.com/cs3org/reva@master` +1. Run `go get github.com/cs3org/reva@master` in all repos that depend on reva 2. Create a changelog entry containing changes that were done in [reva](https://github.com/cs3org/reva/commits/master) -3. Create a Pull Request to ocis-reva master with those changes +3. Create a Pull Request to ocis master with those changes 4. If test issues appear, you might need to adjust the tests 5. After the PR is merged, consider doing a [release of the storage submodule]({{< ref "releasing" >}}) diff --git a/docs/ocis/_index.md b/docs/ocis/_index.md index ed233878b1..46f0b599a5 100644 --- a/docs/ocis/_index.md +++ b/docs/ocis/_index.md @@ -10,18 +10,16 @@ geekdocFilePath: _index.md {{< figure class="floatright" src="/media/is.png" width="70%" height="auto" >}} ## ownCloud Infinite Scale - Welcome to oCIS, the modern file-sync and share platform, which is based on our knowledge and experience with the PHP based [ownCloud server](https://owncloud.com/#server). ### The idea of federated storage - To create a truly federated storage architecture oCIS breaks down the old ownCloud 10 user specific namespace, which is assembled on the server side, and makes the individual parts accessible to clients as storage spaces and storage space registries.
The below diagram shows the core concepts that are the foundation for the new architecture: - End user devices can fetch the list of *storage spaces* a user has access to, by querying one or multiple *storage space registries*. The list contains a unique endpoint for every *storage space*. -- [*Storage space registries*]({{< ref "../extensions/storage/terminology#storage-space-registries" >}}) manage the list of storage spaces a user has access to. They may subscrible to *storage spaces* in order to receive notifications about changes on behalf of an end users mobile or desktop client. -- [*Storage spaces*]({{< ref "../extensions/storage/terminology#storage-spaces" >}}) represent a collection of files and folders. A users personal files are a *storage space*, a group or project drive is a *storage space*, and even incoming shares are treated and implemented as *storage spaces*. Each with properties like owners, permissions, quota and type. -- [*Storage providers*]({{< ref "../extensions/storage/terminology#storage-providers" >}}) can hold multiple *storage spaces*. At an oCIS instance, there might be a dedicated *storage provider* responsible for users personal storage spaces. There might be multiple, sharing the load or there might be just one, hosting all types of *storage spaces*. +- [*Storage space registries*]({{< ref "../extensions/storage/terminology#storage-space-registries" >}}) manage the list of storage spaces a user has access to. They may subscribe to *storage spaces* in order to receive notifications about changes on behalf of an end user's mobile or desktop client. +- [*Storage spaces*]({{< ref "../extensions/storage/terminology#storage-spaces" >}}) represent a collection of files and folders. A user's personal files are contained in a *storage space*, a group or project drive is a *storage space*, and even incoming shares are treated and implemented as *storage spaces*. Each with properties like owners, permissions, quota and type.
 +- [*Storage providers*]({{< ref "../extensions/storage/terminology#storage-providers" >}}) can hold multiple *storage spaces*. In an oCIS instance, there might be a dedicated *storage provider* responsible for users' personal storage spaces. There might be multiple, either to shard the load, provide different levels of redundancy or support custom workflows. Or there might be just one, hosting all types of *storage spaces*. {{< svg src="ocis/static/idea.drawio.svg" >}} @@ -35,19 +33,18 @@ Einstein copies the URL in the browser (or an email with the same URL is sent au When Marie enters that URL she will be presented with a login form on the `https://cloud.zurich.test` instance, because the share was created on that domain. If `https://cloud.zurich.test` trusts her OpenID Connect identity provider `https://idp.paris.test` she can log in. This time, the *storage space registry* discovery will come up with `https://cloud.paris.test` though. Since that registry is different from the registry tied to `https://cloud.zurich.test`, oCIS web can look up the *storage space* `716199a6-00c0-4fec-93d2-7e00150b1c84` and register the WebDAV URL `https://cloud.zurich.test/dav/spaces/716199a6-00c0-4fec-93d2-7e00150b1c84/a/rel/path` in Marie's *storage space registry* at `https://cloud.paris.test`. When she accepts that share her clients will be able to sync the new *storage space* at `https://cloud.zurich.test`. -### oCIS microservice runtime +Or in other words: _total world federation!_ +### oCIS microservice runtime The oCIS runtime allows us to dynamically manage services running in a single process. We use [suture](https://github.com/thejerf/suture) to create a supervisor tree that starts each service in a dedicated goroutine. By default oCIS will start all built-in oCIS extensions in a single process. Individual services can be moved to other nodes to scale out and meet specific performance requirements.
A [go-micro](https://github.com/asim/go-micro/blob/master/registry/registry.go) based registry allows services in multiple nodes to form a distributed microservice architecture. ### oCIS extensions - Every oCIS extension uses [ocis-pkg](https://github.com/owncloud/ocis/tree/master/ocis-pkg), which implements the [go-micro](https://go-micro.dev/) interfaces for [servers](https://github.com/asim/go-micro/blob/v3.5.0/server/server.go#L17-L37) to register and [clients](https://github.com/asim/go-micro/blob/v3.5.0/client/client.go#L11-L23) to look up nodes with a service [registry](https://github.com/asim/go-micro/blob/v3.5.0/registry/registry.go). We are following the [12 Factor](https://12factor.net/) methodology with oCIS. The uniformity of services also allows us to use the same command, logging and configuration mechanism. Configurations are forwarded from the oCIS runtime to the individual extensions. ### go-micro - While the [go-micro](https://go-micro.dev/) framework provides abstractions as well as implementations for the different components in a microservice architecture, it uses a more developer focused runtime philosophy: It is used to download services from a repo, compile them on the fly and start them as individual processes. For oCIS we decided to use a more admin friendly runtime: You can download a single binary and start the contained oCIS extensions with a single `bin/ocis server`. This also makes packaging easier. We use [ocis-pkg](https://github.com/owncloud/ocis/tree/master/ocis-pkg) to configure the default implementations for the go-micro [grpc server](https://github.com/asim/go-micro/tree/v3.5.0/plugins/server/grpc), [client](https://github.com/asim/go-micro/tree/v3.5.0/plugins/client/grpc) and [mdns registry](https://github.com/asim/go-micro/blob/v3.5.0/registry/mdns_registry.go), swapping them out as needed, e.g. to use the [kubernetes registry plugin](https://github.com/asim/go-micro/tree/v3.5.0/plugins/registry/kubernetes).
 @@ -62,7 +59,6 @@ Interacting with oCIS involves a multitude of APIs. The server and all clients r We run a huge [test suite](https://github.com/owncloud/core/tree/master/tests), which originated in ownCloud 10 and continues to grow. A detailed description can be found in the developer docs for [testing]({{< ref "development/testing" >}}). ### Architecture Overview - Running `bin/ocis server` will start the below services, all of which can be scaled and deployed on a single node or in a cloud native environment, as needed. {{< svg src="ocis/static/architecture-overview.drawio.svg" >}} diff --git a/docs/ocis/migration.md b/docs/ocis/migration.md index 12583248ac..8fd1ecd414 100644 --- a/docs/ocis/migration.md +++ b/docs/ocis/migration.md @@ -9,14 +9,6 @@ geekdocFilePath: migration.md The migration happens in subsequent stages while the service is online. First all users need to migrate to the new architecture, then the global namespace needs to be introduced. Finally, the data on disk can be migrated user by user by switching the storage driver. -
- -{{< hint warning >}} -@jfd: It might be easier to introduce the spaces api in oc10 and then migrate to oCIS. We cannot migrate both at the same time, the architecture to oCIS (which will change fileids) and introduce a global namespace (which requires stable fileids to let clients handle moves without redownloading). Either we implement arbitrary mounting of shares in oCIS / reva or we make clients and oc10 spaces aware. -{{< /hint >}} - -
- ## Migration Stages ### Stage 0: pre migration @@ -56,7 +48,7 @@ The ownCloud 10 demo instance uses OAuth to obtain a token for ownCloud web and
-_TODO make oauth2 in oc10 trust the new web ui, based on `redirect_uri` and CSRF so no explicit consent is needed_ +_TODO make oauth2 in oc10 trust the new web ui, based on `redirect_uri` and CSRF so no explicit consent is needed?_ #### FAQ _Feel free to add your question as a PR to this document using the link at the top of this page!_ @@ -72,13 +64,12 @@ While SAML and Shibboleth are protocols that solve that problem, they are limite
-_TODO @butonic add ADR for OpenID Connect_ +_TODO @butonic add ADR for OpenID Connect and flesh out pros and cons of the above_
 #### User impact -When introducing OpenID Connect, the clients will detect the new authentication scheme when their current way of authenticating returns an error. Users will then have to -reauthorize at the OpenID Connecd IdP, which again, may be configured to skip the consent step for trusted clients. +When introducing OpenID Connect, the clients will detect the new authentication scheme when their current way of authenticating returns an error. Users will then have to reauthorize at the OpenID Connect IdP, which, again, may be configured to skip the consent step for trusted clients. #### Steps 1. There are multiple products that can be used as an OpenID Connect IdP. We test with [LibreGraph Connect](https://github.com/libregraph/lico), which is also [embedded in oCIS](https://github.com/owncloud/web/). Other alternatives include [Keycloak](https://www.keycloak.org/) or [Ping](https://www.pingidentity.com/). Please refer to the corresponding setup instructions for the product you intend to use. @@ -106,7 +97,7 @@ Should there be problems with OpenID Connect at this point you can disable the a
 Legacy clients relying on Basic auth or app passwords need to be migrated to OpenID Connect to work with oCIS. For a transition period Basic auth in oCIS can be enabled with `PROXY_ENABLE_BASIC_AUTH=true`, but we strongly recommend adopting OpenID Connect for other tools as well. -While OpenID Connect providers will send an `iss` and `sub` claim that relying parties (services like oCIS or ownCloud 10) can use to identify users we recommend introducing a dedicated, globally unique, persistent, non-reassignable user identifier like a UUID for every user. This `ownclouduuid` shold be sent as an additional claim to save additional lookups on the server side. It will become the user id in oCIS, e.g. when searching for recipients the `ownclouduuid` will be used to persist permissions with the share manager. It has a different purpose than the ownCloud 10 username, which is used to login. Using UUIDs we can not only mitigate username collisions when merging multiple instances but also allow renaming usernames after the migration to oCIS has been completed. +While OpenID Connect providers will send an `iss` and `sub` claim that relying parties (services like oCIS or ownCloud 10) can use to identify users we recommend introducing a dedicated, globally unique, persistent, non-reassignable user identifier like a UUID for every user. This `ownclouduuid` should be sent as an additional claim to save additional lookups on the server side. It will become the user id in oCIS, e.g. when searching for recipients the `ownclouduuid` will be used to persist permissions with the share manager. It has a different purpose than the ownCloud 10 username, which is used to login. Using UUIDs we can not only mitigate username collisions when merging multiple instances but also allow renaming usernames after the migration to oCIS has been completed.
@@ -117,9 +108,9 @@ _Feel free to add your question as a PR to this document using the link at the t
-### Stage 3: introduce oCIS interally +### Stage 3: introduce oCIS internally -Befor letting oCIS handle end user requests we will first make it available in the internal network. By subsequently adding services we can add functionality and verify the services work as intended. +Before letting oCIS handle end user requests we will first make it available in the internal network. By subsequently adding services we can add functionality and verify the services work as intended. Start oCIS backend and make read only tests on existing data using the `owncloudsql` storage driver which will read (and write) - blobs from the same datadirectory layout as in ownCloud 10 @@ -139,7 +130,7 @@ None, only administrators will be able to explore oCIS during this stage. #### Steps and verifications -We are going to run and explore a series of services that will together handle the same requests as ownCloud 10. For initial exploration the oCIS binary is recommended. The services can later be deployed using a single oCIS runtime or in multiple cotainers. +We are going to run and explore a series of services that will together handle the same requests as ownCloud 10. For initial exploration the oCIS binary is recommended. The services can later be deployed using a single oCIS runtime or in multiple containers. ##### Storage provider for file metadata @@ -172,7 +163,7 @@ Enable spaces API in oc10: {{< hint warning >}} **Alternative 2** -An additional `uuid` property used only to detect moves. A lookup by uuid is not necessary for this. The `/dav/meta` endpoint would still take the fileid. Clients would use the `uuid` to detect moves and set up new sync pairs when migrating to a global namespace. +An additional `uuid` property used only to detect moves. A lookup by uuid is not necessary for this. The `/dav/meta` endpoint would still take the fileid. Clients would use the `uuid` to detect moves and set up new sync pairs when migrating to a global namespace. 
### Stage-3.1 Generate a `uuid` for every file as a file property. Clients can submit a `uuid` when creating files. The server will create a `uuid` if the client did not provide one. @@ -280,7 +271,7 @@ The IP address of the ownCloud host changes. There is no change for the file syn 2. Verify the requests are routed based on the ownCloud 10 routing policy `oc10` by default ##### Test user based routing -1. Change the routing policy for a user or an early adoptors group to `ocis`
_TODO @butonic currently, the migration selector will use the `ocis` policy for users that have been added to the accounts service. IMO we need to evaluate a claim from the IdP._
+1. Change the routing policy for a user or an early adopters group to `ocis`
_TODO @butonic currently, the migration selector will use the `ocis` policy for users that have been added to the accounts service. IMO we need to evaluate a claim from the IdP._
 2. Verify the requests are routed based on the oCIS routing policy `ocis` for 'migrated' users. At this point you are ready to rock & roll! @@ -322,8 +313,8 @@ _TODO @butonic update performance comparisons nightly_ #### Steps There are several options to move users to the oCIS backend: -- Use a canary app to let users decide thamselves -- Use an early adoptors group with an opt in +- Use a canary app to let users decide themselves +- Use an early adopters group with an opt in - Force migrate users in batch or one by one at the administrator's will #### Verification @@ -333,7 +324,7 @@ The same verification steps as for the internal testing stage apply. Just from t Until now, the oCIS configuration mimics ownCloud 10 and uses the old data directory layout and the ownCloud 10 database. Users can seamlessly be switched from ownCloud 10 to oCIS and back again.
-_TODO @butonic we need a canary app that allows users to decide for themself which backend to use_ +_TODO @butonic we need a canary app that allows users to decide for themselves which backend to use_
@@ -401,12 +392,12 @@ Noticeable performance improvements because we effectively shard the storage log _TODO @butonic implement `ownclouds3` based on `s3ng`_ _TODO @butonic implement tiered storage provider for seamless migration_ -_TODO @butonic document how to manually do that until the storge registry can discover that on its own._ +_TODO @butonic document how to manually do that until the storage registry can discover that on its own._
 #### Verification -Start with a test user, then move to early adoptors and finally migrate all users. +Start with a test user, then move to early adopters and finally migrate all users. #### Rollback To switch the storage provider back, the same storage space migration can be performed again: copy metadata and blob data using the CS3 api, then change the responsible storage provider in the storage registry. @@ -432,7 +423,7 @@ Migrate share data to _yet to determine_ share manager backend and shut down own The ownCloud 10 database still holds share information in the `oc_share` and `oc_share_external` tables. They are used to efficiently answer queries about who shared what with whom. In oCIS shares are persisted using a share manager and if desired these grants are also sent to the storage provider so it can set ACLs if possible. Only one system should be responsible for the shares, which in case of treating the storage as the primary source effectively turns the share manager into a cache. #### User impact -Depending on chosen the share manager provider some sharing requests should be faster: listing incoming and outgoing shares is no longer bound to the ownCloud 10 database but to whatever technology is used by the share provdier: +Depending on the chosen share manager provider some sharing requests should be faster: listing incoming and outgoing shares is no longer bound to the ownCloud 10 database but to whatever technology is used by the share provider: - For non HA scenarios they can be served from memory, backed by a simple json file. - TODO: implement share manager with redis / nats / ... key value store backend: use the micro store interface please ... @@ -446,13 +437,13 @@ Depending on chosen the share manager provider some sharing requests should be f _TODO for HA implement share manager with redis / nats / ... 
 key value store backend: use the micro store interface please ..._ _TODO for batch migration implement share data migration cli with progress that reads all shares via the cs3 api from one provider and writes them into another provider_ -_TODO for seamless migration implement tiered/chained share provider that reads share data from the old provider and writes newc shares to the new one_ +_TODO for seamless migration implement tiered/chained share provider that reads share data from the old provider and writes new shares to the new one_ _TODO for storage provider as source of truth persist ALL share data in the storage provider. Currently, part is stored in the share manager, part is in the storage provider. We can keep both, but the share manager should directly persist its metadata to the storage system used by the storage provider so metadata is kept in sync_
#### Verification -After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adoptors to the new gateway. When no problems occur you can stirt the desired number of share managers and roll out the change to all gateways. +After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adopters to the new gateway. When no problems occur you can start the desired number of share managers and roll out the change to all gateways.
@@ -461,12 +452,12 @@ _TODO let the gateway write updates to multiple share managers ... or rely on th
 #### Rollback -To switch the share manager to the database one revert routing users to the new share manager. If you already shut down the old share manager start it again. Use the tiered/chained share manager provider in reverse configuration (new share provider as read only, old as write) and migrate the shares again. You can alse restore a database backup if needed. +To switch the share manager back to the database-backed one, revert routing users to the new share manager. If you already shut down the old share manager, start it again. Use the tiered/chained share manager provider in reverse configuration (new share provider as read only, old as write) and migrate the shares again. You can also restore a database backup if needed.
 ### Stage-10 -Profit! Well, on the one hand you do not need to maintain a clustered database setup and can rely on the storage system. On the other hand you are now in microservice wonderland and will have to relearn how to identify bottlenecks and scale oCIS accordingly. The good thing is that tools like jaeger and prometheus have evolved and will help you understand what is going on. But this is a different Topic. See you on the other side! +Profit! Well, on the one hand you do not need to maintain a clustered database setup and can rely on the storage system. On the other hand you are now in microservice wonderland and will have to relearn how to identify bottlenecks and scale oCIS accordingly. The good thing is that tools like Jaeger and Prometheus have evolved and will help you understand what is going on. But this is a different topic. See you on the other side! #### FAQ _Feel free to add your question as a PR to this document using the link at the top of this page!_ @@ -709,7 +700,7 @@ _TODO clarify if metadata from ldap & user_shibboleth needs to be migrated_
 -The `dn` -> *owncloud internal username* mapping that currently lives in the `oc_ldap_user_mapping` table needs to move into a dedicated ownclouduuid attribute in the LDAP server. The idp should send it as a claim so the proxy does not have to look up the user using LDAP again. The username cannot be changed in ownCloud 10 and the oCIS provisioning API will not allow changing it as well. When we introduce the graph api we may allow changing usernames when all clients have moved to that api. +The `dn` -> *owncloud internal username* mapping that currently lives in the `oc_ldap_user_mapping` table needs to move into a dedicated `ownclouduuid` attribute in the LDAP server. The IdP should send it as a claim so the proxy does not have to look up the user using LDAP again. The username cannot be changed in ownCloud 10 and the oCIS provisioning API will not allow changing it either. When we introduce the graph api we may allow changing usernames when all clients have moved to that api. The problem is that the username in owncloud 10 and in oCIS also needs to be the same, which might not be the case when the LDAP mapping used a different column. In that case we should add another `owncloudusername` attribute to the LDAP server.
From fea6728e6b2a2a42b311e706e3a06c5c3cd9429b Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?= Date: Thu, 22 Jul 2021 19:28:46 +0000 Subject: [PATCH 2/4] add space id vs resource id vs storage id section MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jörn Friedrich Dreyer --- docs/extensions/storage/proposedchanges.md | 73 +++++++++++++++++++++- 1 file changed, 72 insertions(+), 1 deletion(-) diff --git a/docs/extensions/storage/proposedchanges.md b/docs/extensions/storage/proposedchanges.md index e8c1a14cec..29fe8c9d40 100644 --- a/docs/extensions/storage/proposedchanges.md +++ b/docs/extensions/storage/proposedchanges.md @@ -111,4 +111,75 @@ Technically, this means that every storage driver needs to have a map of a `uuid - to prevent excessive writes when deep folders are renamed a reverse map might be used: it will map the `uuid` to `:`, in order to trade writes for reads - as a fallback a sync job can read the file id from the metadata of the resources and populate the uuid to internal id map. -The TUS upload can take metadata, for PUT we might need a header. \ No newline at end of file +The TUS upload can take metadata, for PUT we might need a header. + +### Space id vs resource id vs storage id + +We have `/dav/meta/`, where the `fileid` is a string that was returned by a PROPFIND or by the `/graph/v1.0/me/drives/` endpoint. That endpoint returns a space id and the root drive item, which has an `id`. + +Does that `id` have a specific format? We currently concatenate as `!`. + +A request against `/dav/meta/fileid` will use the reva storage registry to look up a path. + +What if the storage space is moved to another storage provider? This happens during a migration: + +1. the current oc10 fileids need to be prefixed with at least the numeric storage id to shard them.
 + +`123` becomes `instanceprefix$345!123` if we use a custom prefix that identifies an instance (so we can merge multiple instances into one ocis instance) and append the numeric storageid `345`. The pattern is `$!`. + +Every `$` identifies a space. + +- [ ] the owncloudsql driver can return these spaceids when listing spaces. + +Why does it not work if we just use the fileid of the root node in the db? + +Say we have a space with three resources: +`$!` +`instanceprefix$345!1` +`instanceprefix$345!2` +`instanceprefix$345!3` + +All users have moved to ocis and the registry contains a regex to route all `instanceprefix.*` references to the storage provider with the owncloudsql driver. It is up to the driver to locate the correct resource by using the filecache table. In this case the numeric storage id is unnecessary. + +Now we migrate the space `345` to another storage driver: +- the storage registry contains a new entry for `instanceprefix$345` to send all resource ids for that space to the new storage provider +- the new storage driver has to take into account the full storageid because the nodeid may only be unique per storage space. + +If we now have to fetch the path on the `/dav/meta/` endpoint: +`/dav/meta/instanceprefix$345!1` +`/dav/meta/instanceprefix$345!2` +`/dav/meta/instanceprefix$345!3` + +This would work because the registry always sees `instanceprefix$345` as the storageid. + +Now if we use the fileids directly and leave out the numeric storageid: +`!` +`instanceprefix!1` +`instanceprefix!2` +`instanceprefix!3` + +This is the current `!` format. + +The reva storage registry contains an `instanceid` entry pointing to the storage provider with the owncloudsql driver. + +Resources can be looked up because the oc_filecache has a unique fileid over all storages.
+
+Now we again migrate the space `345` to another storage driver:
+- the storage registry contains a new entry for `instanceprefix!1`, so the storage space root now points to the new storage provider
+- the registry needs to be aware of node ids to route properly. This is a no-go: we don't want to keep a cache of *all* nodeids in the registry, only the root nodes of spaces.
+- the new storage driver only has a nodeid, which might collide with other nodeids from other storage spaces, e.g. when two instances are imported into one oCIS instance. Although it would be possible to just set up two storage providers, extra care would have to be taken to prevent nodeid collisions when importing a space.
+
+If we now have to fetch the path on the `/dav/meta/<fileid>` endpoint:
+`/dav/meta/instanceprefix!1` would work because it is the root of a space
+`/dav/meta/instanceprefix!2` would cause the gateway to poll all storage providers because the registry has no way to determine the responsible storage provider
+`/dav/meta/instanceprefix!3` same
+
+The problem is that without a part in the storageid that allows differentiating storage spaces we cannot route them individually.
+
+Now, we could use the nodeid of the root of a storage space as the spaceid ... if it is a uuid. If it is numeric it needs a prefix to distinguish it from other spaces.
+`<rootnodeid>!<nodeid>` would be easy for the decomposedfs.
+eos might use numeric ids: `<prefix>$<rootnodeid>!<nodeid>`, but it needs a custom prefix to distinguish multiple eos instances.
+
+Furthermore, when migrating spaces between storage providers we want to stay collision free, which is why we should recommend uuids.
+
+All this has implications for the decomposedfs, because it needs to split the nodes per space to prevent them from colliding.
From a022ab78f354c9c871f331f8267fa9614dfd91f4 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?= Date: Fri, 23 Jul 2021 15:17:18 +0000 Subject: [PATCH 3/4] rewrite with stronger spaces emphasis MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jörn Friedrich Dreyer --- docs/extensions/storage/_index.md | 29 +- docs/extensions/storage/namespaces.md | 8 +- docs/extensions/storage/releasing.md | 31 - docs/extensions/storage/spaces.md | 28 + .../{architecture.md => spacesprovider.md} | 41 +- docs/extensions/storage/spacesregistry.md | 21 + .../storage/static/spacesprovider.drawio.svg | 352 +++++++++++ .../storage/static/spacesregistry.drawio.svg | 595 +++++++++--------- .../storage/static/storage.drawio.svg | 434 +++++++++++++ .../{storages.md => storagedrivers.md} | 12 +- docs/extensions/storage/terminology.md | 52 +- docs/extensions/storage/updating.md | 19 - docs/extensions/storage/users.md | 14 +- docs/ocis/deployment/ocis_keycloak.md | 2 +- docs/ocis/deployment/ocis_traefik.md | 2 +- 15 files changed, 1177 insertions(+), 463 deletions(-) delete mode 100644 docs/extensions/storage/releasing.md create mode 100644 docs/extensions/storage/spaces.md rename docs/extensions/storage/{architecture.md => spacesprovider.md} (77%) create mode 100644 docs/extensions/storage/spacesregistry.md create mode 100644 docs/extensions/storage/static/spacesprovider.drawio.svg create mode 100644 docs/extensions/storage/static/storage.drawio.svg rename docs/extensions/storage/{storages.md => storagedrivers.md} (97%) delete mode 100644 docs/extensions/storage/updating.md diff --git a/docs/extensions/storage/_index.md b/docs/extensions/storage/_index.md index 1f8bcf5e50..1d5023ed34 100644 --- a/docs/extensions/storage/_index.md +++ b/docs/extensions/storage/_index.md @@ -8,29 +8,12 @@ geekdocFilePath: _index.md geekdocCollapseSection: true --- -## Abstract +## Overview -This service provides an oCIS extension that 
wraps [reva](https://github.com/cs3org/reva/) and adds an opinionated configuration to it. +The storage extension wraps [reva](https://github.com/cs3org/reva/) and adds an opinionated configuration to provide two core services for the oCIS platform: +1. A [*Spaces Registry*]({{< ref "./spacesregistry.md" >}}) that acts as a dictionary for storage *Spaces* and their metadata +2. A [*Spaces Provider*]({{< ref "./spacesprovider.md" >}}) that organizes *Resources* in storage *Spaces* and persists them in an underlying *Storage System* -## Architecture Overview +*Clients* will use the *Spaces Registry* to poll or get notified about changes in all *Spaces* a user has access to. Every *Space* has a dedicated `/dav/spaces/` WebDAV endpoint that is served by a *Spaces Provider* which uses a specific reva storage driver to wrap an underlying *Storage System*. -The below diagram shows the oCIS services and the contained reva services within as dashed boxes. In general: -1. A request comes in at the proxy and is authenticated using OIDC. -2. It is forwarded to the oCIS frontend which handles ocs and ocdav requests by talking to the reva gateway using the CS3 API. -3. The gateway acts as a facade to the actual CS3 services: storage providers, user providers, group providers and sharing providers. - -{{< svg src="extensions/storage/static/overview.drawio.svg" >}} - -The dashed lines in the diagram indicate requests that are made to authenticate requests or lookup the storage provider: -1. After authenticating a request, the proxy may either use the CS3 `userprovider` or the accounts service to fetch the user information that will be minted into the `x-access-token`. -2. The gateway will verify the JWT signature of the `x-access-token` or try to authenticate the request itself, e.g. using a public link token. 
- -{{< hint warning >}} -The bottom part is lighter because we will deprecate it in favor of using only the CS3 user and group providers after moving some account functionality into reva and glauth. The metadata storage is not registered in the reva gateway to seperate metadata necessary for running the service from data that is being served directly. -{{< /hint >}} - -## Endpoints and references - -In order to reason about the request flow, two aspects in the architecture need to be understood well: -1. What kind of [*namespaces*]({{< ref "./namespaces.md" >}}) are presented at the different WebDAV and CS3 endpoints? -2. What kind of [*resource*]({{< ref "./terminology.md#resources" >}}) [*references*]({{< ref "./terminology.md#references" >}}) are exposed or required: path or id based? +{{< svg src="extensions/storage/static/storage.drawio.svg" >}} diff --git a/docs/extensions/storage/namespaces.md b/docs/extensions/storage/namespaces.md index 3ca0f9981d..eb68887c73 100644 --- a/docs/extensions/storage/namespaces.md +++ b/docs/extensions/storage/namespaces.md @@ -12,7 +12,7 @@ In ownCloud 10 all paths are considered relative to the users home. The CS3 API {{< svg src="extensions/storage/static/namespaces.drawio.svg" >}} -The different paths in the namespaces need to be translated while passing [*references*]({{< ref "./terminology.md#references" >}}) from service to service. While the oc10 endpoints all work on paths we internally reference shared resources by id, so the shares don't break when a file is renamed or moved inside a [*storage space*]({{< ref "./terminology.md#storage-spaces" >}}). The following table lists the various namespaces, paths and id based references: +The different paths in the namespaces need to be translated while passing [*references*]({{< ref "./terminology.md#references" >}}) from service to service. 
While the oc10 endpoints all work on paths, we internally reference shared resources by id, so the shares don't break when a file is renamed or moved inside a storage [*space*]({{< ref "./spaces" >}}). The following table lists the various namespaces, paths and id based references:
 
 | oc10 namespace | CS3 global namespace | storage provider | reference | content |
 |----------------|----------------------|------------------|-----------|---------|
@@ -32,13 +32,13 @@ In the global CS3 namespaces we plan to move `/home/Shares`, which currently lis
 
 ## ownCloud namespaces
 
-In contrast to the global namespace of CS3, ownCloud always presented a user specific namespace on all endpoints. It will always list the users private files under `/`. Shares can be mounted at an arbitrary location in the users private spaces. See the [webdav]({{< ref "./architecture#webdav" >}}) and [ocs]({{< ref "./architecture#sharing" >}}) sections for more details end examples.
+In contrast to the global namespace of CS3, ownCloud always presented a user specific namespace on all endpoints. It will always list the user's private files under `/`. Shares can be mounted at an arbitrary location in the user's private spaces. See the [webdav]({{< ref "./spacesprovider#webdav" >}}) and [ocs]({{< ref "./spacesprovider#sharing" >}}) sections for more details and examples.
 
 With the spaces concept we are planning to introduce a global namespace to the ownCloud webdav endpoints. This will push the users private space down in the hierarchy: it will move from `/webdav` to `/webdav/home` or `/webdav/users/`. The related [migration stages]({{< ref "../../ocis/migration.md" >}}) are subject to change.
 
 ## CS3 global namespaces
 
-The *CS3 global namespace* in oCIS is configured in the [*storage space registry*]({{< ref "./terminology.md#storage-space-registries" >}}).
oCIS uses these defaults: +The *CS3 global namespace* in oCIS is configured in the storage [*spaces registry*]({{< ref "./spacesregistry" >}}). oCIS uses these defaults: | global namespace | description | |-|-| @@ -48,7 +48,7 @@ The *CS3 global namespace* in oCIS is configured in the [*storage space registry | `/public/` | a virtual folder listing public shares | | `/spaces/` | *TODO: project or group spaces* | -Technically, the `/home` namespace is not necessary: the [*storage space registry*]({{< ref "./terminology.md#storage-space-registries" >}}) knows the path to a users private space in the `/users` namespace and the gateway can forward the requests to the responsible storage provider. +Technically, the `/home` namespace is not necessary: the storage [*spaces registry*]({{< ref "./spacesregistry" >}}) knows the path to a users private space in the `/users` namespace and the gateway can forward the requests to the responsible storage provider. {{< hint warning >}} *@jfd: Why don't we use `/home/` instead of `/users/`. Then the paths would be consistent with most unix systems. diff --git a/docs/extensions/storage/releasing.md b/docs/extensions/storage/releasing.md deleted file mode 100644 index b55369d32e..0000000000 --- a/docs/extensions/storage/releasing.md +++ /dev/null @@ -1,31 +0,0 @@ ---- -title: "Releasing" -date: 2020-05-22T00:00:00+00:00 -weight: 60 -geekdocRepo: https://github.com/owncloud/ocis -geekdocEditPath: edit/master/docs/extensions/storage -geekdocFilePath: releasing.md ---- - -{{< toc >}} - -To release a new version of the storage submodule, you have to follow a few simple steps. - -## Preparation - -1. Before releasing, make sure that reva has been [updated to the desired version]({{< ref "updating" >}}) - -## Release -1. Check out master -{{< highlight txt >}} -git checkout master -git pull origin master -{{< / highlight >}} -2. Create a new tag (preferably signed) and replace the version number accordingly. 
Prefix the tag with the submodule `storage/v`. -{{< highlight txt >}} -git tag -s storage/vx.x.x -m "release vx.x.x" -git push origin storage/vx.x.x -{{< / highlight >}} -5. Wait for CI and check that the GitHub release was published. - -Congratulations, you just released the storage submodule! diff --git a/docs/extensions/storage/spaces.md b/docs/extensions/storage/spaces.md new file mode 100644 index 0000000000..f1342b4074 --- /dev/null +++ b/docs/extensions/storage/spaces.md @@ -0,0 +1,28 @@ +--- +title: "Spaces" +date: 2018-05-02T00:00:00+00:00 +weight: 3 +geekdocRepo: https://github.com/owncloud/ocis +geekdocEditPath: edit/master/docs/extensions/storage +geekdocFilePath: spaces.md +--- + +{{< hint warning >}} + +The current implementation in oCIS might not yet fully reflect this concept. Feel free to add links to ADRs, PRs and Issues in short warning boxes like this. + +{{< /hint >}} + +## Storage Spaces +A storage *space* is a logical concept. It organizes a set of [*resources*]({{< ref "#resources" >}}) in a hierarchical tree. It has a single *owner* (*user* or *group*), +a *quota*, *permissions* and is identified by a `storage space id`. + +{{< svg src="extensions/storage/static/storagespace.drawio.svg" >}} + +Examples would be every user's personal storage *space*, project storage *spaces* or group storage *spaces*. While they all serve different purposes and may or may not have workflows like anti virus scanning enabled, we need a way to identify and manage these subtrees in a generic way. By creating a dedicated concept for them this becomes easier and literally makes the codebase cleaner. A storage [*Spaces Registry*]({{< ref "./spacesregistry.md" >}}) then allows listing the capabilities of storage *spaces*, e.g. free space, quota, owner, syncable, root etag, upload workflow steps, ... + +Finally, a logical `storage space id` is not tied to a specific [*spaces provider*]({{< ref "./spacesprovider.md" >}}). 
If the [*storage driver*]({{< ref "./storagedrivers.md" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move storage *spaces* between [*spaces providers*]({{< ref "./spacesprovider.md" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs.
+
+## Shares
+*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavyweight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like storage [*spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the user's home or personal storage [*space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new storage [*space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even individual file shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.*
+
diff --git a/docs/extensions/storage/architecture.md b/docs/extensions/storage/spacesprovider.md
similarity index 77%
rename from docs/extensions/storage/architecture.md
rename to docs/extensions/storage/spacesprovider.md
index ab8683234b..2100a3fa47 100644
--- a/docs/extensions/storage/architecture.md
+++ b/docs/extensions/storage/spacesprovider.md
@@ -1,12 +1,25 @@
 ---
-title: "Architecture"
+title: "Spaces Provider"
 date: 2018-05-02T00:00:00+00:00
-weight: 10
+weight: 6
 geekdocRepo: https://github.com/owncloud/ocis
 geekdocEditPath: edit/master/docs/extensions/storage
-geekdocFilePath: architecture.md
+geekdocFilePath: spacesprovider.md
 ---
 
+{{< hint warning >}}
+
+The current implementation in oCIS might not yet fully reflect this concept.
Feel free to add links to ADRs, PRs and Issues in short warning boxes like this. + +{{< /hint >}} + +## Spaces Provider +A *storage provider* manages [*resources*]({{< ref "#resources" >}}) identified by a [*reference*]({{< ref "#references" >}}) +by accessing a [*storage system*]({{< ref "#storage-systems" >}}) with a [*storage driver*]({{< ref "./storagedrivers.md" >}}). + +{{< svg src="extensions/storage/static/spacesprovider.drawio.svg" >}} + + ## Frontend The oCIS frontend service starts all services that handle incoming HTTP requests: @@ -38,16 +51,17 @@ The ocdav service not only handles all WebDAV requests under `(remote.php/)(web) | `(remote.php/)webdav/users` | ocdav | storageprovider | `/users` | | | | `(remote.php/)dav/files/` | ocdav | storageprovider | `/users/` | | | | *Spaces concept also needs a new endpoint:* ||||| -| `(remote.php/)dav/spaces//` | ocdav | storageregistry & storageprovider | bypass path based namespace and directly talk to the responsible storage provider using a relative path | [spaces concept](https://github.com/owncloud/ocis/pull/1827) needs to point to [*storage spaces*]({{< ref "./terminology.md#storage-spaces" >}}) or a global endpoint | allow accessing spaces, listing is done by the graph api | +| `(remote.php/)dav/spaces//` | ocdav | storageregistry & storageprovider | bypass path based namespace and directly talk to the responsible storage provider using a relative path | [spaces concept](https://github.com/owncloud/ocis/pull/1827) needs to point to storage [*spaces*]({{< ref "./spaces.md" >}}) | allow accessing spaces, listing is done by the graph api | -The correct endpoint for a users home [*storage space*]({{< ref "./terminology.md#storage-spaces" >}}) in oc10 is `remote.php/dav/files/`. In oc10 All requests at this endpoint use a path based reference that is relative to the users home. In oCIS this can be configured and defaults to `/home` as well. 
Other API endpoints like ocs and the web UI still expect this to be the users home. +The correct endpoint for a users home storage [*space*]({{< ref "./spaces.md" >}}) in oc10 is `remote.php/dav/files/`. In oc10 all requests at this endpoint use a path based reference that is relative to the users home. In oCIS this can be configured and defaults to `/home` as well. Other API endpoints like ocs and the web UI still expect this to be the users home. In oc10 we originally had `remote.php/webdav` which would render the current users home [*storage space*]({{< ref "./terminology.md#storage-spaces" >}}). The early versions (pre OC7) would jail all received shares into a `remote.php/webdav/shares` subfolder. The semantics for syncing such a folder are [not trivially predictable](https://github.com/owncloud/core/issues/5349), which is why we made shares [freely mountable](https://github.com/owncloud/core/pull/8026) anywhere in the users home. The current reva implementation jails shares into a `remote.php/webdav/Shares` folder for performance reasons. Obviously, this brings back the [special semantics for syncing](https://github.com/owncloud/product/issues/7). In the future we will follow [a different solution](https://github.com/owncloud/product/issues/302) and jail the received shares into a dedicated `/shares` space, on the same level as `/home` and `/spaces`. We will add a dedicated [API to list all *storage spaces*](https://github.com/owncloud/ocis/pull/1827) a user has access to and where they are mounted in the users *namespace*. {{< hint warning >}} +TODO rewrite this hint with `/dav/spaces` Existing folder sync pairs in legacy clients will break when moving the user home down in the path hierarchy like CernBox did. 
For legacy clients the `remote.php/webdav` endpoint will no longer list the users home directly, but instead present the different types of storage spaces: - `remote.php/webdav/home`: the users home is pushed down into a new `home` [*storage space*]({{< ref "./terminology.md#storage-spaces" >}}) @@ -55,11 +69,6 @@ For legacy clients the `remote.php/webdav` endpoint will no longer list the user - `remote.php/webdav/spaces`: other [*storage spaces*]({{< ref "./terminology.md#storage-spaces" >}}) the user has access to, e.g. group or project drives {{< /hint >}} -{{< hint warning >}} -An alternative would be to introduce a new `remote.php/dav/spaces` or `remote.php/dav/global` endpoint. However, `remote.php/dav` properly follows the WebDAV RFCs strictly. To ensure that all resources under that [*namespace*]({{< ref "./terminology.md#namespaces" >}}) are scoped to the user the URL would have to include the principal like `remote.php/dav/spaces/`, a precondition for e.g. WebDAV [RFC5397](https://tools.ietf.org/html/rfc5397). For a history lesson start at [Replace WebDAV with REST -owncloud/core#12504](https://github.com/owncloud/core/issues/12504#issuecomment-65218491) which spawned [Add extra layer in DAV to accomodate for other services like versions, trashbin, etc owncloud/core#12543](https://github.com/owncloud/core/issues/12543) -{{< /hint >}} - ### Sharing @@ -92,12 +101,12 @@ The user and public share provider implementations identify the file using the [ The OCM API takes an id based reference on the CS3 api, even if the OCM HTTP endpoint takes a path argument. *@jfd: Why? Does it not need the owner? It only stores the owner of the share, which is always the currently looged in user, when creating a share. Afterwards only the owner can update a share ... so collaborative management of shares is not possible. 
At least for OCM shares.* {{< /hint >}} -### User and Group provisioning -In oc10 users are identified by a username, which cannot change, because it is used as a foreign key in several tables. For oCIS we are internally identifying users by a UUID, while using the username in the WebDAV and OCS APIs for backwards compatability. To distinguish this in the URLs we are using `` instead of ``. You may have encountered ``, which refers to a template that can be configured to build several path segments by filling in user properties, e.g. the first character of the username (`{{substr 0 1 .Username}}/{{.Username}}`), the identity provider (`{{.Id.Idp}}/{{.Username}}`) or the email (`{{.Mail}}`) +## REVA Storage Registry -{{< hint warning >}} -Make no mistake, the [OCS Provisioning API](https://doc.owncloud.com/server/developer_manual/core/apis/provisioning-api.html) uses `userid` while it actually is the username, because it is what you use to login. -{{< /hint >}} +The reva *storage registry* manages the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}): +It is used by the reva *gateway* +to look up `address` and `port` of the [*storage provider*]({{< ref "#storage-providers" >}}) +that should handle a [*reference*]({{< ref "#references" >}}). -We are currently working on adding [user management through the CS3 API](https://github.com/owncloud/ocis/pull/1930) to handle user and group provisioning (and deprovisioning). 
+
+{{< svg src="extensions/storage/static/storageregistry.drawio.svg" >}}
\ No newline at end of file
diff --git a/docs/extensions/storage/spacesregistry.md b/docs/extensions/storage/spacesregistry.md
new file mode 100644
index 0000000000..d5be48f8ab
--- /dev/null
+++ b/docs/extensions/storage/spacesregistry.md
@@ -0,0 +1,21 @@
+---
+title: "Spaces Registry"
+date: 2018-05-02T00:00:00+00:00
+weight: 9
+geekdocRepo: https://github.com/owncloud/ocis
+geekdocEditPath: edit/master/docs/extensions/storage
+geekdocFilePath: spacesregistry.md
+---
+
+{{< hint warning >}}
+
+The current implementation in oCIS might not yet fully reflect this concept. Feel free to add links to ADRs, PRs and Issues in short warning boxes like this.
+
+{{< /hint >}}
+
+## Storage Space Registries
+
+A storage *spaces registry* manages the [*namespace*]({{< ref "./namespaces.md" >}}) for a *user*: it is used by *clients* to look up the storage spaces a user has access to, the `/dav/spaces` endpoint to access them via WebDAV, and the place where the client should mount them in the user's personal namespace.
+
+{{< svg src="extensions/storage/static/spacesregistry.drawio.svg" >}}
+
diff --git a/docs/extensions/storage/static/spacesprovider.drawio.svg b/docs/extensions/storage/static/spacesprovider.drawio.svg
new file mode 100644
index 0000000000..d122c58f7a
--- /dev/null
+++ b/docs/extensions/storage/static/spacesprovider.drawio.svg
@@ -0,0 +1,352 @@
+
+
+
+
+
+
+
+
+
+ oCIS spaces provider +
+ [Software System] +
+
+
+
+ + oCIS spaces provider... + +
+
+ + + + +
+
+
+ + reva storage provider + +
+ [Component: golang] +
+
+
+ hosts multiple storage spaces using a storage driver +
+
+
+
+
+ + reva storage provider... + +
+
+ + + + +
+
+
+ + reva gateway + +
+ [Component: golang] +
+
+
+ API facade for internal reva services +
+
+
+
+
+ + reva gateway... + +
+
+ + + + +
+
+
+ + Storage System + +
+ [Software System] +
+
+
+ provides persistent storage +
+
+
+
+
+ + Storage System... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [POSIX, S3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + +
+
+
+ + reva frontend + +
+ [Component: golang] +
+
+
+ handles protocol translation +
+
+
+
+
+ + reva frontend... + +
+
+ + + + +
+
+
+ + oCIS proxy + +
+ [Component: golang] +
+
+
Routes requests to oc10 or oCIS +
+
+
+
+
+ + oCIS proxy... + +
+
+ + + + + +
+
+
+
+
+ + Mints an internal JWT +
and forwards requests to +
+
+
+ [WebDAV, OCS, OCM, tus] +
+
+
+
+
+
+ + Mints an internal JWT... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, +
+ Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [WebDAV, libregraph, CS3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Forwards to + +
+
+ [CS3, storage registry] +
+
+
+
+
+
+ + Forwards to... + +
+
+ + + + +
+
+
+

+ C4 Component diagram for an oCIS spaces provider +

+

+ An oCIS spaces provider manages resources in storage spaces by persisting them with a specific storage driver in a storage system. +

+

+ Date: 2021-07-22T12:40 +

+
+
+
+
+ + C4 Component diagram for an oCIS spaces provider... + +
+
+
+ + + + + Viewer does not support full SVG 1.1 + + + +
\ No newline at end of file diff --git a/docs/extensions/storage/static/spacesregistry.drawio.svg b/docs/extensions/storage/static/spacesregistry.drawio.svg index e3a771b07e..f716b0bb0d 100644 --- a/docs/extensions/storage/static/spacesregistry.drawio.svg +++ b/docs/extensions/storage/static/spacesregistry.drawio.svg @@ -1,136 +1,82 @@ - + - + -
+
- oCIS System + oCIS spaces registry
[Software System]
- - oCIS System... + + oCIS spaces registry... - - - + -
-
-
- - Einstein - -
- [Person] -
-
-
- End user -
-
-
-
-
- - Einstein... - -
-
- - - - -
+
- Client + reva storage registry
- [Container: C++, Kotlin, Swift or Vue] + [Component: golang]

- A desktop, mobile or web Client + manages and caches storage space metadata
- - Client... + + reva storage registry... - + -
+
- Storage Space Registry + reva gateway
- [Container: golang, HTTP, libregraph] + [Component: golang]

- Manages spaces for users + API facade for internal reva services
- - Storage Space Registry... + + reva gateway... - + -
-
-
- - Storage Provider - -
- [Container: golang] -
-
-
- Persists storage spaces using reva -
-
-
-
-
- - Storage Provider... - -
-
- - - - -
+
@@ -147,45 +93,285 @@
- + Storage System... - - - + + -
+
-
- - Moss - -
- [Person] -
-
-
- Administrator +
+
+
+ + Provisions and manages spaces in + +
+
+ [CS3] +
- - Moss... + + Provisions and manages spaces... - - + -
+
+
+
+ + reva frontend + +
+ [Component: golang] +
+
+
+ handles protocol translation +
+
+
+
+ + + reva frontend... + + + + + + + +
+
+
+ + oCIS proxy + +
+ [Component: golang] +
+
+
Routes requests to oc10 or oCIS +
+
+
+
+
+ + oCIS proxy... + +
+
+ + + + + +
+
+
+
+
+ + Mints an internal JWT +
and forwards requests to +
+
+
+ [libregraph] +
+
+
+
+
+
+ + Mints an internal JWT... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, +
+ Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + + +
+
+
+
+
+ + polls or gets notified about changes in + +
+
+ [libregraph] +
+
+
+
+
+
+ + polls or gets notified about c... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Lists spaces using + +
+
+ [CS3] +
+
+
+
+
+
+ + Lists spaces using... + +
+
+ + + + +
+
+
+

+ C4 Component diagram for an oCIS spaces registry +

+

+ An oCIS spaces registry manages the namespace for a user and lets clients look up the storage spaces they have access to. +

+

+ Date: 2021-07-22T12:40 +

+
+
+
+
+ + C4 Component diagram for an oCIS spaces registry... + +
+
+ + + + +
+
+
+ + reva storage provider + +
+ [Component: golang] +
+
+
+ hosts multiple storage spaces using a storage driver +
+
+
+
+
+ + reva storage provider... + +
+
+ + + + + +
@@ -202,226 +388,11 @@
- + Reads from and writes to... - - - - - -
-
-
-
-
- - Reads from and writes to - -
-
- [WebDAV, libregraph, CS3, tus] -
-
-
-
-
-
- - Reads from and writes to... - -
-
- - - - - -
-
-
-
-
- - Manages the users Storage Spaces in - -
-
- [libregraph] -
-
-
-
-
-
- - Manages the users Storage Spac... - -
-
- - - - - -
-
-
-
-
- - Manages resources with - -
-
- [Web UI or native clients] -
-
-
-
-
-
- - Manages resources with... - -
-
- - - - - -
-
-
-
-
- - Registers itself at and -
- sends space root etag changes to -
-
-
- [CS3, libregraph?, PUSH] -
-
-
-
-
-
- - Registers itself at and... - -
-
- - - - - -
-
-
-
-
- - Manages organizational Storage Spaces in - -
-
- [WebDAV, libregraph, CS3, CLI] -
-
-
-
-
-
- - Manages organizational Storage... - -
-
- - - - -
-
-
- - Identity Management System - -
- [Software System] -
-
-
- provides users and groups -
-
-
-
-
- - Identity Management System... - -
-
- - - - - -
-
-
-
-
- - Authenticates users and searches recipients with - -
-
- [OpenID Connect, LDAP, REST] -
-
-
-
-
-
- - Authenticates users and search... - -
-
- - - - -
-
-
-

- C4 Container diagram for the oCIS System -

-

- As a platform, the oCIS system may not only includes web, mobile and desktop clients but also the underlying storage system or an identity management system -

-

- Date: 2021-07-22T16:43 -

-
-
-
-
- - C4 Container diagram for the oCIS System... - -
-
diff --git a/docs/extensions/storage/static/storage.drawio.svg b/docs/extensions/storage/static/storage.drawio.svg new file mode 100644 index 0000000000..fd6e759cf3 --- /dev/null +++ b/docs/extensions/storage/static/storage.drawio.svg @@ -0,0 +1,434 @@ + + + + + + + +
+
+
+ oCIS System +
+ [Software System] +
+
+
+
+ + oCIS System... + +
+
+ + + + + + +
+
+
+ + Einstein + +
+ [Person] +
+
+
+ End user +
+
+
+
+
+ + Einstein... + +
+
+ + + + +
+
+
+ + Client + +
+ [Container: C++, Kotlin, Swift or Vue] +
+
+
+ A desktop, mobile or web Client +
+
+
+
+
+ + Client... + +
+
+ + + + +
+
+
+ + Storage Space Registry + +
+ [Container: golang, HTTP, libregraph] +
+
+
+ Manages spaces for users +
+
+
+
+
+ + Storage Space Registry... + +
+
+ + + + +
+
+
+ + Storage Space Provider + +
+ [Container: golang] +
+
+
+ Persists storage spaces using reva +
+
+
+
+
+ + Storage Space Provider... + +
+
+ + + + +
+
+
+ + Storage System + +
+ [Software System] +
+
+
+ provides persistent storage +
+
+
+
+
+ + Storage System... + +
+
+ + + + + + +
+
+
+ + Moss + +
+ [Person] +
+
+
+ Administrator +
+
+
+
+
+ + Moss... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [POSIX, S3] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Reads from and writes to + +
+
+ [WebDAV, libregraph, CS3, tus] +
+
+
+
+
+
+ + Reads from and writes to... + +
+
+ + + + + +
+
+
+
+
+ + Manages the users Storage Spaces in + +
+
+ [libregraph] +
+
+
+
+
+
+ + Manages the users Storage Spac... + +
+
+ + + + + +
+
+
+
+
+ + Manages resources with + +
+
+ [Web UI or native clients] +
+
+
+
+
+
+ + Manages resources with... + +
+
+ + + + + +
+
+
+
+
+ + Registers itself at and +
+ sends space root etag changes to +
+
+
+ [CS3, libregraph?, PUSH] +
+
+
+
+
+
+ + Registers itself at and... + +
+
+ + + + + +
+
+
+
+
+ + Manages organizational Storage Spaces in + +
+
+ [WebDAV, libregraph, CS3, CLI] +
+
+
+
+
+
+ + Manages organizational Storage... + +
+
+ + + + +
+
+
+ + Identity Management System + +
+ [Software System] +
+
+
+ provides users and groups +
+
+
+
+
+ + Identity Management System... + +
+
+ + + + + +
+
+
+
+
+ + Authenticates users and searches recipients with + +
+
+ [OpenID Connect, LDAP, REST] +
+
+
+
+
+
+ + Authenticates users and search... + +
+
+ + + + +
+
+
+

+ C4 Container diagram for the oCIS System +

+

+ As a platform, the oCIS system may include not only web, mobile and desktop clients but also the underlying storage system or an identity management system

+

+ Date: 2021-07-22T16:43 +

+
+
+
+
+ + C4 Container diagram for the oCIS System... + +
+
+
+ + + + + Viewer does not support full SVG 1.1 + + + +
\ No newline at end of file diff --git a/docs/extensions/storage/storages.md b/docs/extensions/storage/storagedrivers.md similarity index 97% rename from docs/extensions/storage/storages.md rename to docs/extensions/storage/storagedrivers.md index 59927fd933..9a9d7fb4fd 100644 --- a/docs/extensions/storage/storages.md +++ b/docs/extensions/storage/storagedrivers.md @@ -1,15 +1,18 @@ --- -title: "Storages" +title: "Storage drivers" date: 2020-04-27T18:46:00+01:00 -weight: 37 +weight: 12 geekdocRepo: https://github.com/owncloud/ocis geekdocEditPath: edit/master/docs/extensions/storage geekdocFilePath: storages.md --- -## Storage commands +A *storage driver* implements access to a [*storage system*]({{< ref "#storage-systems" >}}): -`storage` has multiple storage provider commands to preconfigure different default configurations for the reva *storage provider* service. While you could rerun `storage storage-oc` multiple times with different flags to get multiple instances we are giving the different commands the necessary default configuration to allow the `ocis` binary to simply start them and not deal with configuration. +It maps the *path* and *id* based CS3 *references* to an appropriate [*storage system*]({{< ref "#storage-systems" >}}) specific reference, e.g.: +- eos file ids +- posix inodes or paths +- deconstructed filesystem nodes ## Storage providers @@ -25,7 +28,6 @@ A lot of different storage technologies exist, ranging from general purpose file Unfortunately, no POSIX filesystem natively supports all storage aspects that ownCloud 10 requires: - ### A hierarchical file tree An important aspect of a filesystem is organizing files and directories in a file hierarchy, or tree. It allows you to create, move and delete nodes. Beside the name a node also has well known metadata like size and mtime that are persisted in the tree as well. 
diff --git a/docs/extensions/storage/terminology.md b/docs/extensions/storage/terminology.md index 2b883b5705..426c8448be 100644 --- a/docs/extensions/storage/terminology.md +++ b/docs/extensions/storage/terminology.md @@ -19,7 +19,6 @@ A *resource* is the basic building block that oCIS manages. It can be of [differ - a [*reference*]({{< ref "#references" >}}) which can point to a resource in another [*storage provider*]({{< ref "#storage-providers" >}}) ### References - A *reference* identifies a [*resource*]({{< ref "#resources" >}}). A [*CS3 reference*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.Reference) can carry a *path* and a [CS3 *resource id*](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceId). The references come in two flavors: absolute and combined. Absolute references have either the *path* or the *resource id* set: - An absolute *path* MUST start with a `/`. The *resource id* MUST be empty. @@ -28,59 +27,12 @@ Combined references have both, *path* and *resource id* set: - the *resource id* identifies the root [*resource*]({{< ref "#resources" >}}) - the *path* is relative to that root. It MUST start with `.` - -### Storage Spaces -A *storage space* organizes a set of [*resources*]({{< ref "#resources" >}}) in a hierarchical tree. It has a single *owner* (*user* or *group*), -a *quota*, *permissions* and is identified by a `storage space id`. - -{{< svg src="extensions/storage/static/storagespace.drawio.svg" >}} - -Examples would be every user's personal storage space, project storage spaces or group storage spaces. While they all serve different purposes and may or may not have workflows like anti virus scanning enabled, we need a way to identify and manage these subtrees in a generic way. By creating a dedicated concept for them this becomes easier and literally makes the codebase cleaner. 
A [*storage space registry*]({{< ref "#storage-space-registries" >}}) then allows listing the capabilities of [*storage spaces*]({{< ref "#storage-spaces" >}}), e.g. free space, quota, owner, syncable, root etag, upload workflow steps, ... - -Finally, a logical `storage space id` is not tied to a specific [*storage provider*]({{< ref "#storage-providers" >}}). If the [*storage driver*]({{< ref "#storage-drivers" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move [*storage spaces*]({{< ref "#storage-spaces" >}}) between [*storage providers*]({{< ref "#storage-providers" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs. - -### Shares -*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavywheight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the users home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even file indvidual shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.* - - -### Storage Space Registries - -A *storage space registry* manages the [*namespace*]({{< ref "./namespaces.md" >}}) for a *user*: it is used by *clients* to look up storage spaces a user has access to, the `/dav/spaces` endpoint to access it via WabDAV, and where the client should mount it in the users personal namespace. 
- -{{< svg src="extensions/storage/static/spacesregistry.drawio.svg" >}} - - ## Technical concepts -### Storage Drivers - -A *storage driver* implements access to a [*storage system*]({{< ref "#storage-systems" >}}): - -It maps the *path* and *id* based CS3 *references* to an appropriate [*storage system*]({{< ref "#storage-systems" >}}) specific reference, e.g.: -- eos file ids -- posix inodes or paths -- deconstructed filesystem nodes - -### Storage Providers - -A *storage provider* manages [*resources*]({{< ref "#resources" >}}) identified by a [*reference*]({{< ref "#references" >}}) -by accessing a [*storage system*]({{< ref "#storage-systems" >}}) with a [*storage driver*]({{< ref "#storage-drivers" >}}). - -{{< svg src="extensions/storage/static/storageprovider.drawio.svg" >}} - -### Storage Registry - -A *storage registry* manages the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}): -It is used by the *gateway* -to look up `address` and `port` of the [*storage provider*]({{< ref "#storage-providers" >}}) -that should handle a [*reference*]({{< ref "#references" >}}). - -{{< svg src="extensions/storage/static/storageregistry.drawio.svg" >}} - ### Storage Systems Every *storage system* has different native capabilities like id and path based lookups, recursive change time propagation, permissions, trash, versions, archival and more. -A [*storage provider*]({{< ref "#storage-providers" >}}) makes the storage system available in the CS3 API by wrapping the capabilities as good as possible using a [*storage driver*]({{< ref "#storage-drivers" >}}). -There migt be multiple [*storage drivers*]({{< ref "#storage-drivers" >}}) for a *storage system*, implementing different tradeoffs to match varying requirements. +A [*storage provider*]({{< ref "#storage-providers" >}}) makes the storage system available in the CS3 API by wrapping the capabilities as good as possible using a [*storage driver*]({{< ref "./storagedrivers.md" >}}). 
+There might be multiple [*storage drivers*]({{< ref "./storagedrivers.md" >}}) for a *storage system*, implementing different tradeoffs to match varying requirements. ### Gateways A *gateway* acts as a facade to the storage related services. It authenticates and forwards API calls that are publicly accessible. diff --git a/docs/extensions/storage/updating.md b/docs/extensions/storage/updating.md deleted file mode 100644 index dd19203766..0000000000 --- a/docs/extensions/storage/updating.md +++ /dev/null @@ -1,19 +0,0 @@ ---- -title: "Updating reva" -date: 2020-05-22T00:00:00+00:00 -weight: 50 -geekdocRepo: https://github.com/owncloud/ocis -geekdocEditPath: edit/master/docs/extensions/storage -geekdocFilePath: updating.md ---- - -{{< toc >}} - -## Updating reva - -1. Run `go get github.com/cs3org/reva@master` in all repos that depend on reva -2. Create a changelog entry containing changes that were done in [reva](https://github.com/cs3org/reva/commits/master) -3. Create a Pull Request to ocis master with those changes -4. If test issues appear, you might need to adjust the tests -5. After the PR is merged, consider doing a [release of the storage submodule]({{< ref "releasing" >}}) - diff --git a/docs/extensions/storage/users.md b/docs/extensions/storage/users.md index 4a5e716faf..8e5d034bad 100644 --- a/docs/extensions/storage/users.md +++ b/docs/extensions/storage/users.md @@ -1,12 +1,24 @@ --- title: "Users" date: 2020-01-16T00:00:00+00:00 -weight: 35 +weight: 17 geekdocRepo: https://github.com/owncloud/ocis geekdocEditPath: edit/master/docs/extensions/storage geekdocFilePath: users.md --- +TODO add this to the storage overview? or is this a different part? That should be started as a separate service ? And documented elsewhere, eg. in the accounts? + +### User and Group provisioning + +In oc10 users are identified by a username, which cannot change, because it is used as a foreign key in several tables. 
For oCIS we are internally identifying users by a UUID, while using the username in the WebDAV and OCS APIs for backwards compatibility. To distinguish this in the URLs we are using `` instead of ``. You may have encountered ``, which refers to a template that can be configured to build several path segments by filling in user properties, e.g. the first character of the username (`{{substr 0 1 .Username}}/{{.Username}}`), the identity provider (`{{.Id.Idp}}/{{.Username}}`) or the email (`{{.Mail}}`). + +{{< hint warning >}} +Make no mistake, the [OCS Provisioning API](https://doc.owncloud.com/server/developer_manual/core/apis/provisioning-api.html) uses `userid` while it actually is the username, because it is what you use to login. +{{< /hint >}} + +We are currently working on adding [user management through the CS3 API](https://github.com/owncloud/ocis/pull/1930) to handle user and group provisioning (and deprovisioning). + ### Demo driver This is a simple user driver for testing. It contains three users: diff --git a/docs/ocis/deployment/ocis_keycloak.md b/docs/ocis/deployment/ocis_keycloak.md index 46ef4ad061..4cc2be6cc0 100644 --- a/docs/ocis/deployment/ocis_keycloak.md +++ b/docs/ocis/deployment/ocis_keycloak.md @@ -21,7 +21,7 @@ The docker stack consists of 4 containers. One of them is Traefik, a proxy which is Keycloak adds two containers: Keycloak itself and a PostgreSQL database. Keycloak will be configured as oCIS' IDP instead of the internal IDP [LibreGraph Connect]({{< ref "../../extensions/idp" >}}) -The other container is oCIS itself running all extensions in one container. In this example oCIS uses [oCIS storage driver]({{< ref "../../extensions/storage/storages#storage-drivers" >}})
In this example oCIS uses the [oCIS storage driver]({{< ref "../../extensions/storage/storagedrivers" >}}) ## Server Deployment diff --git a/docs/ocis/deployment/ocis_traefik.md b/docs/ocis/deployment/ocis_traefik.md index 6bb3088776..cb44362fee 100644 --- a/docs/ocis/deployment/ocis_traefik.md +++ b/docs/ocis/deployment/ocis_traefik.md @@ -18,7 +18,7 @@ geekdocFilePath: ocis_traefik.md The docker stack consists of two containers. One of them is Traefik, a proxy which is terminating ssl and forwards the requests to oCIS in the internal docker network. -The other one is oCIS itself running all extensions in one container. In this example oCIS uses its internal IDP [LibreGraph Connect]({{< ref "../../extensions/idp" >}}) and the [oCIS storage driver]({{< ref "../../extensions/storage/storages#storage-drivers" >}}) +The other one is oCIS itself running all extensions in one container. In this example oCIS uses its internal IDP [LibreGraph Connect]({{< ref "../../extensions/idp" >}}) and the [oCIS storage driver]({{< ref "../../extensions/storage/storagedrivers" >}}) ## Server Deployment From 27863fdb4316e48c5e42dbfda94a5b51b98c17cc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?= Date: Mon, 13 Sep 2021 13:06:02 +0000 Subject: [PATCH 4/4] work MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jörn Friedrich Dreyer --- docs/extensions/storage/namespaces.md | 6 +- docs/extensions/storage/proposedchanges.md | 30 +++---- docs/extensions/storage/spaces.md | 11 +++ docs/extensions/storage/spacesprovider.md | 2 +- .../storage/static/namespaces.drawio.svg | 83 +++++++++---------- docs/extensions/storage/users.md | 2 +- 6 files changed, 67 insertions(+), 67 deletions(-) diff --git a/docs/extensions/storage/namespaces.md b/docs/extensions/storage/namespaces.md index eb68887c73..19864a6e7f 100644 --- a/docs/extensions/storage/namespaces.md +++ b/docs/extensions/storage/namespaces.md @@ -16,7 +16,7 @@ 
The different paths in the namespaces need to be translated while passing [*refe | oc10 namespace | CS3 global namespace | storage provider | reference | content | |--------------------------------------------------|----------------------------------------|------------------|-----------|---------| -| `/webdav/path/to/file.ext` `/dav/files//path/to/file.ext` | `/home/path/to/file.ext` | home | `//path/to/file.ext` | currently logged in users home | +| `/webdav/path/to/file.ext` `/dav/files//path/to/file.ext` | `/home/path/to/file.ext` | home | `//path/to/file.ext` | currently logged in users home | | `/webdav/Shares/foo` `/dav/files//Shares/foo` | `/home/Shares/foo` | users | id based access | all users, used to access collaborative shares | | `/dav/public-files//rel/path/to/file.ext` | `/public//rel/path/to/file.ext` | public | id based access | publicly shared files, used to access public links | @@ -43,7 +43,7 @@ The *CS3 global namespace* in oCIS is configured in the storage [*spaces registr | global namespace | description | |-|-| | `/home` | an alias for the currently logged in uses private space | -| `/users/` | user private spaces | +| `/users/` | user private spaces | | `/shares` | a virtual listing of share spaces a user has access to | | `/public/` | a virtual folder listing public shares | | `/spaces/` | *TODO: project or group spaces* | @@ -51,7 +51,7 @@ The *CS3 global namespace* in oCIS is configured in the storage [*spaces registr Technically, the `/home` namespace is not necessary: the storage [*spaces registry*]({{< ref "./spacesregistry" >}}) knows the path to a users private space in the `/users` namespace and the gateway can forward the requests to the responsible storage provider. {{< hint warning >}} -*@jfd: Why don't we use `/home/` instead of `/users/`. Then the paths would be consistent with most unix systems. +*@jfd: Why don't we use `/home/` instead of `/users/`. Then the paths would be consistent with most unix systems. 
{{< /hint >}} The `/shares` namespace is used to solve two problems: diff --git a/docs/extensions/storage/proposedchanges.md b/docs/extensions/storage/proposedchanges.md index 29fe8c9d40..c306a4005b 100644 --- a/docs/extensions/storage/proposedchanges.md +++ b/docs/extensions/storage/proposedchanges.md @@ -13,29 +13,21 @@ Some architectural changes still need to be clarified or changed. Maybe an ADR i ## A dedicated shares storage provider -Currently, the *gateway* treats `/home/shares` different than any other path: it will stat all children and calculate an etag to allow clients to discover changes in accepted shares. This requires the storage provider to cooperate and provide this special `/shares` folder in the root of a users home when it is accessed as a home storage, which is a config flag that needs to be set for every storage driver. +Currently, when a user accepts a share, a cs3 reference is created in the users `/home/shares` folder. This reference represents the mount point of a share and can be renamed, similar to the share jail in ownCloud 10. This spreads the metadata of a share in two places: +- the share is persisted in the *share manager* +- the mount point of a share is persisted in the home *storage provider* -The `enable_home` flag will cause drivers to jail path based requests into a `` subfolder. In effect it divides a storage provider into multiple [*storage spaces*]({{< ref "#storage-spaces" >}}): when calling `CreateHome` a subfolder following the `` is created and market as the root of a users home. Both, the eos and ocis storage drivers use extended attributes to mark the folder as the end of the size aggregation and tree mtime propagation mechanism. Even setting the quota is possible like that. All this literally is a [*storage space*]({{< ref "#storage-spaces" >}}). 
+Furthermore, the *gateway* treats `/home/shares` differently from any other path: it will stat all children and calculate an etag to allow clients to discover changes in accepted shares. This requires the storage provider to cooperate and provide this special `/shares` folder in the root of a user's home when it is accessed as a home storage. That is the origin of the `enable_home` config flag that needs to be implemented for every storage driver. -We can implement [ListStorageSpaces](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ListStorageSpacesRequest) by either -- iterating over the root of the storage and treating every folder following the `` as a `home` *storage space*, -- iterating over the root of the storage and treating every folder following a new `` as a `project` *storage space*, or -- iterating over the root of the storage and treating every folder following a generic `` as a *storage space* for a configurable space type, or -- we allow configuring a map of `space type` to `layout` (based on the [CreateStorageSpaceRequest](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.CreateStorageSpaceRequest)) which would allow things like -``` -home=/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}} -spaces=/spaces/var/lib/ocis/storage/projects/{{.Name}} -``` +In order to have a single source of truth we need to make the *share manager* aware of the mount point. We can then move all the logic that aggregates the etag in the share folder to a dedicated *shares storage provider* that uses the *share manager* for persistence. The *shares storage provider* would provide a `/shares` namespace outside of `/home` that lists all accepted shares for the current user. As a result the storage drivers no longer need to have an `enable_home` flag that jails users into their home. The `/home/shares` folder would move outside of `/home`.
In fact `/home` will no longer be needed, because the home folder concept can be implemented as a space: `CreateHome` would create a `personal` space on the. -This would make the `GetHome()` call return the path to the *storage provider* including the relative path to the *storage space*. No need for a *storage provider* mounted at `/home`. This is just a UI alias for `/users/`. Just like a normal `/home/` on a linux machine. +Work on this is done in https://github.com/cs3org/reva/pull/2023 -But if we have no `/home` where do we find the shares, and how can clients discover changes in accepted shares? - -The `/shares` namespace should be provided by a *shares storage provider* that lists all accepted shares for the current user... but what about copy pasting links from the browser? Well this storage is only really needed to have a path to ocm shares that actually reside on other instances. In the UI the shares would be listed by querying a *share manager*. It returns ResourceIds, which can be stated to fetch a path that is then accessible in the CS3 global namespace. Two caveats: +{{< hint warning >}} +What about copy pasting links from the browser? Well this storage is only really needed to have a path to ocm shares that actually reside on other instances. In the UI the shares would be listed by querying a *share manager*. It returns ResourceIds, which can be stated to fetch a path that is then accessible in the CS3 global namespace. Two caveats: - This only works for resources that are actually hosted by the current instance. For those it would leak the parent path segments to a shared resource. - For accepted OCM shares there must be a path in the [*CS3 global namespace*]({{< ref "./namespaces.md#cs3-global-namespaces" >}}) that has to be the same for all users, otherwise they cannot copy and share those URLs. 
- -Work on this is done in https://github.com/cs3org/reva/pull/1846 +{{< /hint >}} ### The gateway should be responsible for path transformations @@ -47,9 +39,9 @@ Work is done in https://github.com/cs3org/reva/pull/1866 ## URL escaped string representation of a CS3 reference -For the `/dav/spaces/` endpoint we need to encode the *reference* in a url compatible way. +For the spaces concept we introduced the `/dav/spaces/` endpoint. It encodes a cs3 *reference* in a URL compatible way. 1. We can separate the path using a `/`: `/dav/spaces//` -2. The `spaceid` currently is a cs3 resourceid, consisting of `` and ``. Since the nodeid might contain `/` eg. for the local driver we have to urlencode the spaceid. +2. The `spaceid` currently is a cs3 resourceid, consisting of `` and ``. Since the opaqueid might contain `/` eg. for the local driver we have to urlencode the spaceid. To access resources by id we need to make the `/dav/meta/` able to list directories... Otherwise id based navigation first has to look up the path. Or we use the libregraph api for id based navigation. diff --git a/docs/extensions/storage/spaces.md b/docs/extensions/storage/spaces.md index f1342b4074..3f87f3dd96 100644 --- a/docs/extensions/storage/spaces.md +++ b/docs/extensions/storage/spaces.md @@ -26,3 +26,14 @@ Finally, a logical `storage space id` is not tied to a specific [*spaces provide ## Shares *To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavywheight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like storage [*spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the users home or personal storage [*space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new storage [*space*]({{< ref "#storage-spaces" >}}) would be the same. 
This obviously also extends to user shares and even file individual shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.* +## Notes + +We can implement [ListStorageSpaces](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ListStorageSpacesRequest) by either +- iterating over the root of the storage and treating every folder following the `` as a `home` *storage space*, +- iterating over the root of the storage and treating every folder following a new `` as a `project` *storage space*, or +- iterating over the root of the storage and treating every folder following a generic `` as a *storage space* for a configurable space type, or +- we allow configuring a map of `space type` to `layout` (based on the [CreateStorageSpaceRequest](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.CreateStorageSpaceRequest)) which would allow things like +``` +home=/var/lib/ocis/storage/home/{{substr 0 1 .Owner.Username}}/{{.Owner.Username}} +spaces=/spaces/var/lib/ocis/storage/projects/{{.Name}} +``` diff --git a/docs/extensions/storage/spacesprovider.md b/docs/extensions/storage/spacesprovider.md index 2100a3fa47..a30c723780 100644 --- a/docs/extensions/storage/spacesprovider.md +++ b/docs/extensions/storage/spacesprovider.md @@ -49,7 +49,7 @@ The ocdav service not only handles all WebDAV requests under `(remote.php/)(web) | *Note: existing folder sync pairs in legacy clients will break when moving the user home down in the path hierarchy* ||||| | `(remote.php/)webdav/home` | ocdav | storageprovider | `/home` | | | | `(remote.php/)webdav/users` | ocdav | storageprovider | `/users` | | | -| `(remote.php/)dav/files/` | ocdav | storageprovider | `/users/` | | | +| `(remote.php/)dav/files/` | ocdav | storageprovider | `/users/` | | | | *Spaces concept also needs a new endpoint:* ||||| | `(remote.php/)dav/spaces//` | 
ocdav | storageregistry & storageprovider | bypass path based namespace and directly talk to the responsible storage provider using a relative path | [spaces concept](https://github.com/owncloud/ocis/pull/1827) needs to point to storage [*spaces*]({{< ref "./spaces.md" >}}) | allow accessing spaces, listing is done by the graph api | diff --git a/docs/extensions/storage/static/namespaces.drawio.svg b/docs/extensions/storage/static/namespaces.drawio.svg index 5440f46b43..b3baa5895a 100644 --- a/docs/extensions/storage/static/namespaces.drawio.svg +++ b/docs/extensions/storage/static/namespaces.drawio.svg @@ -1,4 +1,4 @@ - + @@ -35,7 +35,7 @@ /home
- /users/<userlayout> + /users/<user_layout>
/public @@ -52,13 +52,13 @@ - +
-
+
storage home
@@ -69,16 +69,15 @@ - - - - + + +
-
+
storageprovider
@@ -89,13 +88,13 @@ - +
-
+
dataprovider
@@ -106,13 +105,13 @@ - +
-
+
frontend
@@ -123,16 +122,15 @@ - - - - + + +
-
+
ocdav
@@ -143,13 +141,13 @@ - +
-
+
ocs
@@ -160,14 +158,14 @@ - - + +
-
+
/webdav
@@ -203,14 +201,14 @@ - - + +
-
+
/ocs/v1.php/apps/files_sharing/api/v1/shares @@ -223,13 +221,13 @@ - +
-
+
gateway
@@ -240,16 +238,15 @@ - - - - + + +
-
+
gateway
@@ -260,13 +257,13 @@ - +
-
+
authregistry
@@ -277,13 +274,13 @@ - +
-
+
storageregistry
@@ -294,15 +291,15 @@ - - - + + +
-
+
oc10 namespace
(all paths aere relative to the users home) @@ -315,13 +312,13 @@ - +
-
+
CS3 global namespace
diff --git a/docs/extensions/storage/users.md b/docs/extensions/storage/users.md index 8e5d034bad..ea92720f82 100644 --- a/docs/extensions/storage/users.md +++ b/docs/extensions/storage/users.md @@ -11,7 +11,7 @@ TODO add this to the storage overview? or is this a different part? That should ### User and Group provisioning -In oc10 users are identified by a username, which cannot change, because it is used as a foreign key in several tables. For oCIS we are internally identifying users by a UUID, while using the username in the WebDAV and OCS APIs for backwards compatability. To distinguish this in the URLs we are using `` instead of ``. You may have encountered ``, which refers to a template that can be configured to build several path segments by filling in user properties, e.g. the first character of the username (`{{substr 0 1 .Username}}/{{.Username}}`), the identity provider (`{{.Id.Idp}}/{{.Username}}`) or the email (`{{.Mail}}`) +In oc10 users are identified by a username, which cannot change, because it is used as a foreign key in several tables. For oCIS we are internally identifying users by a UUID, while using the username in the WebDAV and OCS APIs for backwards compatability. To distinguish this in the URLs we are using `` instead of ``. You may have encountered ``, which refers to a template that can be configured to build several path segments by filling in user properties, e.g. the first character of the username (`{{substr 0 1 .Username}}/{{.Username}}`), the identity provider (`{{.Id.Idp}}/{{.Username}}`) or the email (`{{.Mail}}`) {{< hint warning >}} Make no mistake, the [OCS Provisioning API](https://doc.owncloud.com/server/developer_manual/core/apis/provisioning-api.html) uses `userid` while it actually is the username, because it is what you use to login.