From 13beef2fd4cd9e3294d078d8b95b2649ba2b9e07 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?= Date: Mon, 13 Sep 2021 18:20:24 +0000 Subject: [PATCH 1/6] describe cephfs architecture and tradeoffs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Jörn Friedrich Dreyer --- docs/ocis/migration.md | 59 +++++++++++++++++++--------- docs/ocis/storage-backends/cephfs.md | 32 +++++++++++++++ 2 files changed, 72 insertions(+), 19 deletions(-) create mode 100644 docs/ocis/storage-backends/cephfs.md diff --git a/docs/ocis/migration.md b/docs/ocis/migration.md index 12583248ac..46a7362715 100644 --- a/docs/ocis/migration.md +++ b/docs/ocis/migration.md @@ -119,10 +119,10 @@ _Feel free to add your question as a PR to this document using the link at the t ### Stage 3: introduce oCIS interally -Befor letting oCIS handle end user requests we will first make it available in the internal network. By subsequently adding services we can add functionality and verify the services work as intended. +Before letting oCIS handle end user requests we will first make it available in the internal network. By subsequently adding services we can add functionality and verify the services work as intended. Start oCIS backend and make read only tests on existing data using the `owncloudsql` storage driver which will read (and write) -- blobs from the same datadirectory layout as in ownCloud 10 +- blobs from the same data directory layout as in ownCloud 10 - metadata from the ownCloud 10 database: The oCIS share manager will read share information from the ownCloud database using an `owncloud` driver as well. @@ -139,11 +139,11 @@ None, only administrators will be able to explore oCIS during this stage. #### Steps and verifications -We are going to run and explore a series of services that will together handle the same requests as ownCloud 10. For initial exploration the oCIS binary is recommended. The services can later be deployed using a single oCIS runtime or in multiple cotainers. +We are going to run and explore a series of services that will together handle the same requests as ownCloud 10. For initial exploration the oCIS binary is recommended. The services can later be deployed using a single oCIS runtime or in multiple containers. ##### Storage provider for file metadata -1. Deploy OCIS storage provider with owncloudsql driver. +1. Deploy OCIS storage provider with the `owncloudsql` driver. 2. Set `read_only: true` in the storage provider config.
_TODO @butonic add read only flag to storage drivers_
3. Use cli tool to list files using the CS3 api @@ -194,7 +194,7 @@ When reading the files from oCIS return the same `uuid`. It can be migrated to a 2. Use curl to list spaces using graph drives endpoint ##### owncloud flavoured WebDAV endpoint -1. Deploy Ocdav +1. Deploy ocdav 2. Use curl to send PROPFIND ##### data provider for up and download @@ -205,13 +205,13 @@ When reading the files from oCIS return the same `uuid`. It can be migrated to a Deploy ... ##### share manager -Deploy share manager with owncloud driver +Deploy share manager with ownCloud driver ##### reva gateway 1. Deploy gateway to authenticate requests? I guess we need that first... Or we need the to mint a token. Might be a good exercise. ##### automated deployment -Finally, deploy OCIS with a config to set up everything running in a single oCIS runtime or in multiple containers. +Finally, deploy oCIS with a config to set up everything running in a single oCIS runtime or in multiple containers. #### Rollback You can stop the oCIS process at any time. @@ -280,7 +280,7 @@ The IP address of the ownCloud host changes. There is no change for the file syn 2. Verify the requests are routed based on the ownCloud 10 routing policy `oc10` by default ##### Test user based routing -1. Change the routing policy for a user or an early adoptors group to `ocis`
_TODO @butonic currently, the migration selector will use the `ocis` policy for users that have been added to the accounts service. IMO we need to evaluate a claim from the IdP._
+1. Change the routing policy for a user or an early adopters group to `ocis`
_TODO @butonic currently, the migration selector will use the `ocis` policy for users that have been added to the accounts service. IMO we need to evaluate a claim from the IdP._
2. Verify the requests are routed based on the oCIS routing policy `oc10` for 'migrated' users. At this point you are ready to rock & roll! @@ -340,8 +340,7 @@ _TODO @butonic we need a canary app that allows users to decide for themself whi
#### Notes -Running the two systems in parallel stage -Try to keep the duration of this stage short. Until now we only added services and made the system more complex. oCIS aims to reduce the maintenance cost of an ownCloud instance. You will not get there if you keep both systems alive. +Running the two systems in parallel requires additional maintenance effort. Try to keep the duration of this stage short. Until now, we only added services and made the system more complex. oCIS aims to reduce the maintenance cost of an ownCloud instance. You will not get there if you keep both systems alive.
@@ -352,7 +351,29 @@ _Feel free to add your question as a PR to this document using the link at the t
-### Stage-7: shut down ownCloud 10
+### Stage-7: introduce spaces using oCIS
+To encourage users to switch you can promote the workspaces feature that is built into oCIS. The ownCloud 10 storage backend can be used for existing users. New users and group or project spaces can be provided by storage providers that better suit the underlying storage system.
+
+#### Steps
+First, the admin needs to
+- deploy a storage provider with the storage driver that best fits the underlying storage system and requirements.
+- register the storage in the storage registry with a new storage id (we recommend a uuid).
+
+Then a user with the necessary create storage space role can create a storage space and assign Managers.
+
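+A minimal sketch of that last step, assuming the graph drives endpoint also accepts `POST` requests to create a space (host, credentials and payload are examples, the exact route and body depend on the deployed oCIS version):
+
+```
+# create a project space on the newly registered storage provider
+curl -k -X POST https://ocis.example.test/graph/v1.0/drives \
+  -u manager:secret \
+  -H "Content-Type: application/json" \
+  -d '{ "name": "marketing", "driveType": "project" }'
+```
+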
+ +_TODO @butonic a user with management permission needs to be presented with a list of storage spaces where he can see the amount of free space and decide on which storage provider the storage space should be created. For now a config option for the default storage provider for a specific type might be good enough._ + +
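+
+For now the spaces that already exist, including their quota, can at least be listed directly from the graph drives endpoint, e.g. with the demo user (example host, the exact route prefix may differ between versions):
+
+```
+curl -k -u einstein:relativity https://ocis.example.test/graph/v1.0/drives
+```
+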
+
+#### Verification
+The new storage space should show up in the `/graph/drives` endpoint for the managers and the creator of the space.
+
+#### Notes
+Depending on the requirements and acceptable tradeoffs, a database-less deployment using the ocis or s3ng storage driver is possible. There is also a [cephfs driver](https://github.com/cs3org/reva/pull/1209) on the way that works directly against the Ceph API instead of going through POSIX.
+
+### Stage-8: shut down ownCloud 10
 Disable ownCloud 10 in the proxy, all requests are now handled by oCIS, shut down oc10 web servers and redis (or keep for calendar & contacts only? rip out files from oCIS?)
 
 #### User impact
@@ -387,7 +408,7 @@ _Feel free to add your question as a PR to this document using the link at the t
-### Stage 8: storage migration +### Stage 9: storage migration To get rid of the database we will move the metadata from the old ownCloud 10 database into dedicated storage providers. This can happen in a user by user fashion. group drives can properly be migrated to group, project or workspaces in this stage. #### User impact @@ -401,12 +422,12 @@ Noticeable performance improvements because we effectively shard the storage log _TODO @butonic implement `ownclouds3` based on `s3ng`_ _TODO @butonic implement tiered storage provider for seamless migration_ -_TODO @butonic document how to manually do that until the storge registry can discover that on its own._ +_TODO @butonic document how to manually do that until the storage registry can discover that on its own._
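+
+Until that is documented, a simple spot check for a migrated test user is to ask the WebDAV endpoint for the file id of a known file before and after the move and compare the two values, which must not change (example host, demo credentials and path):
+
+```
+curl -k -u einstein:relativity -X PROPFIND \
+  https://ocis.example.test/remote.php/dav/files/einstein/welcome.txt \
+  -H "Depth: 0" -H "Content-Type: text/xml" \
+  -d '<?xml version="1.0"?>
+      <d:propfind xmlns:d="DAV:" xmlns:oc="http://owncloud.org/ns">
+        <d:prop><oc:fileid/></d:prop>
+      </d:propfind>'
+```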
#### Verification -Start with a test user, then move to early adoptors and finally migrate all users. +Start with a test user, then move to early adopters and finally migrate all users. #### Rollback To switch the storage provider again the same storage space migration can be performed again: copy medatata and blob data using the CS3 api, then change the responsible storage provider in the storage registry. @@ -426,13 +447,13 @@ _Feel free to add your question as a PR to this document using the link at the t
-### Stage-9: share metadata migration
+### Stage-10: share metadata migration
 Migrate share data to _yet to determine_ share manager backend and shut down ownCloud database.
 
 The ownCloud 10 database still holds share information in the `oc_share` and `oc_share_external` tables. They are used to efficiently answer queries about who shared what with whom. In oCIS shares are persisted using a share manager and if desired these grants are also sent to the storage provider so it can set ACLs if possible. Only one system should be responsible for the shares, which in case of treating the storage as the primary source effectively turns the share manager into a cache.
 
 #### User impact
-Depending on chosen the share manager provider some sharing requests should be faster: listing incoming and outgoing shares is no longer bound to the ownCloud 10 database but to whatever technology is used by the share provdier:
+Depending on the chosen share manager provider some sharing requests should be faster: listing incoming and outgoing shares is no longer bound to the ownCloud 10 database but to whatever technology is used by the share provider:
 - For non HA scenarios they can be served from memory, backed by a simple json file.
 - TODO: implement share manager with redis / nats / ... key value store backend: use the micro store interface please ...
@@ -452,7 +473,7 @@ _TODO for storage provider as source of truth persist ALL share data in the stor
 
 #### Verification
-After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adoptors to the new gateway. When no problems occur you can stirt the desired number of share managers and roll out the change to all gateways.
+After copying all metadata, start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adopters to the new gateway. When no problems occur you can start the desired number of share managers and roll out the change to all gateways.
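+
+One way to check this is to dump both share lists for the test user via the OCS sharing API on the old and on the new gateway and compare the output (example host, demo credentials):
+
+```
+# outgoing shares
+curl -k -u einstein:relativity "https://ocis.example.test/ocs/v1.php/apps/files_sharing/api/v1/shares?format=json"
+# incoming shares
+curl -k -u einstein:relativity "https://ocis.example.test/ocs/v1.php/apps/files_sharing/api/v1/shares?shared_with_me=true&format=json"
+```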
@@ -465,8 +486,8 @@ To switch the share manager to the database one revert routing users to the new
-### Stage-10 -Profit! Well, on the one hand you do not need to maintain a clustered database setup and can rely on the storage system. On the other hand you are now in microservice wonderland and will have to relearn how to identify bottlenecks and scale oCIS accordingly. The good thing is that tools like jaeger and prometheus have evolved and will help you understand what is going on. But this is a different Topic. See you on the other side! +### Stage-11 +Profit! Well, on the one hand you do not need to maintain a clustered database setup and can rely on the storage system. On the other hand you are now in microservice wonderland and will have to relearn how to identify bottlenecks and scale oCIS accordingly. The good thing is that tools like jaeger and prometheus have evolved and will help you understand what is going on. But this is a different topic. See you on the other side! #### FAQ _Feel free to add your question as a PR to this document using the link at the top of this page!_ diff --git a/docs/ocis/storage-backends/cephfs.md b/docs/ocis/storage-backends/cephfs.md new file mode 100644 index 0000000000..9cc19ffd52 --- /dev/null +++ b/docs/ocis/storage-backends/cephfs.md @@ -0,0 +1,32 @@ +--- +title: "cephfs" +date: 2021-09-13T15:36:00+01:00 +weight: 30 +geekdocRepo: https://github.com/owncloud/ocis +geekdocEditPath: edit/master/docs/ocis/storage-backends/ +geekdocFilePath: cephfs.md +--- + +{{< toc >}} + +oCIS intends to make the aspects of existing storage systems available as transparently as possible, but the static sync algorithm of the desktop client relies on some form of recursive change time propagation on the server side to detect changes. While this can be bolted on top of existing file systems with inotify, the kernel audit or a fuse based overlay filesystem, a storage system that already implements this aspect is preferable. Aside from EOS, cephfs supports a recursive change time that oCIS can use to calculate an etag for the webdav API. + +## Development + +The cephfs development happens in a [reva branch](https://github.com/cs3org/reva/pull/1209) and is currently driven by CERN. + +## Architecture + +In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs. It would interface with it using the POSIX apis. This has been changed to direct Ceph API access using https://github.com/ceph/go-ceph. It allows using the ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks. + +It also uses the `.snap` folder built into Ceph to provide versions. + +Trash is not implemented, as cephfs has no native recycle bin. + +## Future work +- The spaces concept matches subvolumes, implement the CreateStorageSpace call with that, keep track of the list of storage spaces using symlings, like for the id based lookup +- The Share manager needs a persistence layer. + - currently we persist using a json file. An sqlite db would be more robust. + - As it basically provides two lists, *shared with me* and *shared with others* we could persist this directly on cephfs! + - To allow deprovisioning a user the data should by sharded by userid. + - Backups are then done using snapshots. 
\ No newline at end of file From 804612077ab7d9d9f55739ae38c6cad3c2824610 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?= Date: Mon, 13 Sep 2021 22:04:14 +0200 Subject: [PATCH 2/6] flesh out cephfs aspects and future work --- docs/ocis/storage-backends/cephfs.md | 27 ++++++++++++++++++--------- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/docs/ocis/storage-backends/cephfs.md b/docs/ocis/storage-backends/cephfs.md index 9cc19ffd52..fa29b092d7 100644 --- a/docs/ocis/storage-backends/cephfs.md +++ b/docs/ocis/storage-backends/cephfs.md @@ -17,16 +17,25 @@ The cephfs development happens in a [reva branch](https://github.com/cs3org/reva ## Architecture -In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs. It would interface with it using the POSIX apis. This has been changed to direct Ceph API access using https://github.com/ceph/go-ceph. It allows using the ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks. +In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs. It would interface with it using the POSIX apis. This has been changed to directly call the Ceph API using https://github.com/ceph/go-ceph. It allows using the ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks. -It also uses the `.snap` folder built into Ceph to provide versions. +## Implemented Aspects +The recursive change time built ino cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows ocis to pick up changes that have been made by external tools, bypassing any oCIS APIs. -Trash is not implemented, as cephfs has no native recycle bin. +Like other filesystems cephfs uses inodes and like most other filesystems inodes are reused. To get stable file identifiers the current cephfs driver assigns every node a file id and maintains a fileid to path mapping in a system directory. + +Versions are not file but snapshot based, a native feature of cephfs. The driver maps entries in the native cephfs `.snap` folder available in the web UI using the versions sidebar. + +Trash is not implemented, as cephfs has no native recycle bin and instead relies on the snapshot functionality that can be triggered by endusers. It should be possible to automatically create a snapshot before deleting a file. This needs to be explored. + +Shares can be mapped to ACLs supported by cephfs. The share manager is used to persist the intent of a share and can be used to periodically verify or reset the ACLs on cephfs. ## Future work -- The spaces concept matches subvolumes, implement the CreateStorageSpace call with that, keep track of the list of storage spaces using symlings, like for the id based lookup -- The Share manager needs a persistence layer. - - currently we persist using a json file. An sqlite db would be more robust. - - As it basically provides two lists, *shared with me* and *shared with others* we could persist this directly on cephfs! - - To allow deprovisioning a user the data should by sharded by userid. - - Backups are then done using snapshots. \ No newline at end of file +- The spaces concept matches cephfs subvolumes. We can implement the CreateStorageSpace call with that, keep track of the list of storage spaces using symlinks, like for the id based lookup. +- The share manager needs a persistence layer. 
+- Currently we persist using a single json file.
+- As it basically provides two lists, *shared with me* and *shared with others*, we could persist them directly on cephfs!
+- A good tradeoff would be a folder for each user with a json file for each list. That way, we only have to open and read a single file when the user wants to list the shares.
+- To allow deprovisioning a user the data should be sharded by userid.
+- Backups are then done using snapshots.
+

From 549d1058c5f370da6b74adb4086076e60da15610 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?=
Date: Mon, 13 Sep 2021 20:30:38 +0000
Subject: [PATCH 3/6] file layout examples
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Jörn Friedrich Dreyer
---
 docs/ocis/storage-backends/cephfs.md | 41 ++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/docs/ocis/storage-backends/cephfs.md b/docs/ocis/storage-backends/cephfs.md
index fa29b092d7..6b6953744c 100644
--- a/docs/ocis/storage-backends/cephfs.md
+++ b/docs/ocis/storage-backends/cephfs.md
@@ -22,7 +22,21 @@ In the original approach the driver was based on the localfs driver, relying on
 
 ## Implemented Aspects
 The recursive change time built ino cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows ocis to pick up changes that have been made by external tools, bypassing any oCIS APIs.
 
-Like other filesystems cephfs uses inodes and like most other filesystems inodes are reused. To get stable file identifiers the current cephfs driver assigns every node a file id and maintains a fileid to path mapping in a system directory.
+Like other filesystems cephfs uses inodes and like most other filesystems inodes are reused. To get stable file identifiers the current cephfs driver assigns every node a file id and maintains a fileid to path mapping in a system directory:
+```
+/tmp/cephfs $ tree -a
+.
+├── reva
+│   └── einstein
+│       ├── Pictures
+│       └── welcome.txt
+└── .reva_hidden
+    ├── .fileids
+    │   ├── 50BC39D364A4703A20C58ED50E4EADC3_570078 -> /tmp/cephfs/reva/einstein
+    │   ├── 571EFB3F0ACAE6762716889478E40156_570081 -> /tmp/cephfs/reva/einstein/Pictures
+    │   └── C7A1397524D0419B38D04D539EA531F8_588108 -> /tmp/cephfs/reva/einstein/welcome.txt
+    └── .uploads
+```
 
 Versions are not file but snapshot based, a native feature of cephfs. The driver maps entries in the native cephfs `.snap` folder available in the web UI using the versions sidebar.
@@ -35,7 +49,28 @@ Shares can be mapped to ACLs supported by cephfs. The share manager is used to p
 - The share manager needs a persistence layer.
 - Currently we persist using a single json file.
 - As it basically provides two lists, *shared with me* and *shared with others*, we could persist them directly on cephfs!
+ - If needed for redundancy, the share manager can be run multiple times, backed by the same cephfs
+ - To save disk io the data can be cached in memory, and invalidated using stat requests.
 - A good tradeoff would be a folder for each user with a json file for each list. That way, we only have to open and read a single file when the user wants to list the shares.
 - To allow deprovisioning a user the data should be sharded by userid.
-- Backups are then done using snapshots.
-
+- For consistency over metadata and file blob data, backups can be done using snapshots.
+- An example where einstein has sherad a file with marie would look like this on disk:
+```
+/tmp/cephfs $ tree -a
+.
+├── reva
+│   └── einstein
+│       ├── Pictures
+│       └── welcome.txt
+├── .reva_hidden
+│   ├── .fileids
+│   │   ├── 50BC39D364A4703A20C58ED50E4EADC3_570078 -> /tmp/cephfs/reva/einstein
+│   │   ├── 571EFB3F0ACAE6762716889478E40156_570081 -> /tmp/cephfs/reva/einstein/Pictures
+│   │   └── C7A1397524D0419B38D04D539EA531F8_588108 -> /tmp/cephfs/reva/einstein/welcome.txt
+│   └── .uploads
+└── .reva_share_manager
+    ├── einstein
+    │   └── sharedWithOthers.json
+    └── marie
+        └── sharedWithMe.json
+```

From c1b5e2f1cde5f255ed4e052e1e67581b8051e29e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?=
Date: Mon, 13 Sep 2021 20:38:52 +0000
Subject: [PATCH 4/6] final notes
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Jörn Friedrich Dreyer
---
 docs/ocis/storage-backends/cephfs.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/docs/ocis/storage-backends/cephfs.md b/docs/ocis/storage-backends/cephfs.md
index 6b6953744c..1b5da459d4 100644
--- a/docs/ocis/storage-backends/cephfs.md
+++ b/docs/ocis/storage-backends/cephfs.md
@@ -20,9 +20,9 @@ The cephfs development happens in a [reva branch](https://github.com/cs3org/reva
 In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs. It would interface with it using the POSIX apis. This has been changed to directly call the Ceph API using https://github.com/ceph/go-ceph. It allows using the ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks.
 
 ## Implemented Aspects
-The recursive change time built ino cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows ocis to pick up changes that have been made by external tools, bypassing any oCIS APIs.
+The recursive change time built into cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows oCIS to pick up changes that have been made by external tools, bypassing any oCIS APIs.
 
-Like other filesystems cephfs uses inodes and like most other filesystems inodes are reused. To get stable file identifiers the current cephfs driver assigns every node a file id and maintains a fileid to path mapping in a system directory:
+Like other filesystems cephfs uses inodes and like most other filesystems inodes are reused. To get stable file identifiers the current cephfs driver assigns every node a file id and maintains a custom fileid to path mapping in a system directory:
 ```
 /tmp/cephfs $ tree -a
 .
@@ -38,11 +38,11 @@ Like other filesystems cephfs uses inodes and like most other filesystems inodes
     └── .uploads
 ```
 
-Versions are not file but snapshot based, a native feature of cephfs. The driver maps entries in the native cephfs `.snap` folder available in the web UI using the versions sidebar.
+Versions are not file but snapshot based, a [native feature of cephfs](https://docs.ceph.com/en/latest/dev/cephfs-snapshots/). The driver maps entries in the native cephfs `.snap` folder to the CS3 api recycle bin concept and makes them available in the web UI using the versions sidebar. Snepshots cen be triggered by users themselves or on a schedule.
 
-Trash is not implemented, as cephfs has no native recycle bin and instead relies on the snapshot functionality that can be triggered by endusers. It should be possible to automatically create a snapshot before deleting a file. This needs to be explored.
+Trash is not implemented, as cephfs has no native recycle bin and instead relies on the snapshot functionality that can be triggered by end users. It should be possible to automatically create a snapshot before deleting a file. This needs to be explored.
 
-Shares can be mapped to ACLs supported by cephfs. The share manager is used to persist the intent of a share and can be used to periodically verify or reset the ACLs on cephfs.
+Shares [are mapped to ACLs](https://github.com/cs3org/reva/pull/1209/files#diff-5e532e61f99bffb5754263bc6ce75f84a30c6f507a58ba506b0b487a50eda1d9R168-R224) supported by cephfs. The share manager is used to persist the intent of a share and can be used to periodically verify or reset the ACLs on cephfs.
 
 ## Future work
 - The spaces concept matches cephfs subvolumes. We can implement the CreateStorageSpace call with that, keep track of the list of storage spaces using symlinks, like for the id based lookup.
 - The share manager needs a persistence layer.
@@ -54,7 +54,7 @@ Shares [are mapped to ACLs](https://github.com/cs3org/reva/pull/1209/files#di
 - A good tradeoff would be a folder for each user with a json file for each list. That way, we only have to open and read a single file when the user wants to list the shares.
 - To allow deprovisioning a user the data should be sharded by userid.
 - For consistency over metadata and file blob data, backups can be done using snapshots.
-- An example where einstein has sherad a file with marie would look like this on disk:
+- An example where einstein has shared a file with marie would look like this on disk:
 ```
 /tmp/cephfs $ tree -a
 .
@@ -74,3 +74,4 @@ Shares [are mapped to ACLs](https://github.com/cs3org/reva/pull/1209/files#di
     └── marie
         └── sharedWithMe.json
 ```
+- The fileids should [not be based on the path](https://github.com/cs3org/reva/pull/1209/files#diff-eba5c8b77ccdd1ac570c54ed86dfa7643b6b30e5625af191f789727874850172R125-R127) and instead use a uuid that is also persisted in the extended attributes to allow rebuilding the index from scratch if necessary.
\ No newline at end of file

From 6f9a9188186a09708bbe6eb331918fe2f157c148 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?=
Date: Tue, 14 Sep 2021 09:35:05 +0200
Subject: [PATCH 5/6] Apply suggestions from code review

Co-authored-by: Alex Unger <6905948+refs@users.noreply.github.com>
---
 docs/ocis/storage-backends/cephfs.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/ocis/storage-backends/cephfs.md b/docs/ocis/storage-backends/cephfs.md
index 1b5da459d4..b0859ea1c4 100644
--- a/docs/ocis/storage-backends/cephfs.md
+++ b/docs/ocis/storage-backends/cephfs.md
@@ -13,11 +13,11 @@ oCIS intends to make the aspects of existing storage systems available as transp
 
 ## Development
 
-The cephfs development happens in a [reva branch](https://github.com/cs3org/reva/pull/1209) and is currently driven by CERN.
+The cephfs development happens in a [Reva branch](https://github.com/cs3org/reva/pull/1209) and is currently driven by CERN.
 
 ## Architecture
 
-In the original approach the driver was based on the localfs driver, relying on a locally mounted cephfs. It would interface with it using the POSIX apis. This has been changed to directly call the Ceph API using https://github.com/ceph/go-ceph. It allows using the ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks.
+In the original approach the driver was based on the [localfs](https://github.com/cs3org/reva/blob/a8c61401b662d8e09175416c0556da8ef3ba8ed6/pkg/storage/utils/localfs/localfs.go) driver, relying on a locally mounted cephfs. It would interface with it using the POSIX apis. This has been changed to directly call the Ceph API using https://github.com/ceph/go-ceph. It allows using the ceph admin APIs to create subvolumes for user homes and maintain a file id to path mapping using symlinks.
 
 ## Implemented Aspects
 The recursive change time built into cephfs is used to implement the etag propagation expected by the ownCloud clients. This allows oCIS to pick up changes that have been made by external tools, bypassing any oCIS APIs.
@@ -38,7 +38,7 @@ Like other filesystems cephfs uses inodes and like most other filesystems inodes
     └── .uploads
 ```
 
-Versions are not file but snapshot based, a [native feature of cephfs](https://docs.ceph.com/en/latest/dev/cephfs-snapshots/). The driver maps entries in the native cephfs `.snap` folder to the CS3 api recycle bin concept and makes them available in the web UI using the versions sidebar. Snepshots cen be triggered by users themselves or on a schedule.
+Versions are not file but snapshot based, a [native feature of cephfs](https://docs.ceph.com/en/latest/dev/cephfs-snapshots/). The driver maps entries in the native cephfs `.snap` folder to the CS3 api recycle bin concept and makes them available in the web UI using the versions sidebar. Snapshots can be triggered by users themselves or on a schedule.
 
 Trash is not implemented, as cephfs has no native recycle bin and instead relies on the snapshot functionality that can be triggered by end users. It should be possible to automatically create a snapshot before deleting a file. This needs to be explored.

From 6bf80daf66d079745e036a25ed5107293764e65b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=B6rn=20Friedrich=20Dreyer?=
Date: Tue, 14 Sep 2021 09:51:23 +0200
Subject: [PATCH 6/6] explain sharding relationship with deprovisioning

---
 docs/ocis/storage-backends/cephfs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ocis/storage-backends/cephfs.md b/docs/ocis/storage-backends/cephfs.md
index b0859ea1c4..fa3de4357d 100644
--- a/docs/ocis/storage-backends/cephfs.md
+++ b/docs/ocis/storage-backends/cephfs.md
@@ -52,7 +52,7 @@ Shares [are mapped to ACLs](https://github.com/cs3org/reva/pull/1209/files#di
 - If needed for redundancy, the share manager can be run multiple times, backed by the same cephfs
 - To save disk io the data can be cached in memory, and invalidated using stat requests.
 - A good tradeoff would be a folder for each user with a json file for each list. That way, we only have to open and read a single file when the user wants to list the shares.
-- To allow deprovisioning a user the data should be sharded by userid.
+- To allow deprovisioning a user the data should be sharded by userid. That way all share information belonging to a user can easily be removed from the system. If necessary it can also be restored easily by copying the user-specific folder back in place.
 - For consistency over metadata and file blob data, backups can be done using snapshots.
 - An example where einstein has shared a file with marie would look like this on disk:
 ```