Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
Josh Soref
2021-09-17 05:21:38 -04:00
committed by Phil Davis
parent 806d224ed7
commit 55667a3ab3
87 changed files with 192 additions and 192 deletions

@@ -26,7 +26,7 @@ The dashed lines in the diagram indicate requests that are made to authenticate
2. The gateway will verify the JWT signature of the `x-access-token` or try to authenticate the request itself, e.g. using a public link token.
{{< hint warning >}}
The bottom part is lighter because we will deprecate it in favor of using only the CS3 user and group providers after moving some account functionality into reva and glauth. The metadata storage is not registered in the reva gateway to seperate metadata necessary for running the service from data that is being served directly.
The bottom part is lighter because we will deprecate it in favor of using only the CS3 user and group providers after moving some account functionality into reva and glauth. The metadata storage is not registered in the reva gateway to separate metadata necessary for running the service from data that is being served directly.
{{< /hint >}}
## Endpoints and references

@@ -89,7 +89,7 @@ The OCS service makes a stat request to the storage provider to get a [ResourceI
{{< hint >}}
The user and public share provider implementations identify the file using the [`ResourceId`](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceId). The [`ResourceInfo`](https://cs3org.github.io/cs3apis/#cs3.storage.provider.v1beta1.ResourceInfo) is passed so the share provider can also store who the owner of the resource is. The *path* is not part of the other API calls, e.g. when listing shares.
The OCM API takes an id based reference on the CS3 api, even if the OCM HTTP endpoint takes a path argument. *@jfd: Why? Does it not need the owner? It only stores the owner of the share, which is always the currently looged in user, when creating a share. Afterwards only the owner can update a share ... so collaborative management of shares is not possible. At least for OCM shares.*
The OCM API takes an id based reference on the CS3 api, even if the OCM HTTP endpoint takes a path argument. *@jfd: Why? Does it not need the owner? It only stores the owner of the share, which is always the currently logged in user, when creating a share. Afterwards only the owner can update a share ... so collaborative management of shares is not possible. At least for OCM shares.*
{{< /hint >}}
### User and Group provisioning

@@ -79,13 +79,13 @@ It maps the *path* and *id* based CS3 *references* to an appropriate [*storage s
{{< hint warning >}}
**Proposed Change**
iOS clients can only queue single requests to be executed in the background. The queue an upload and need to be able to identify the uploaded file after it has been uploaded to the server. The disconnected nature of the connection might cause worksflows or manual user interaction with the file on the server to move the file to a different place or changing the content while the device is offline. However, on the device users might have marked the file as favorite or added it to other iOS specific collections. To be able to reliably identify the file the client can generate a `uuid` and attach it to the file metadata during the upload. While it is not necessary to look up files by this `uuid` having a second file id that serves exactly the same purpose as the `file id` is redundant.
iOS clients can only queue single requests to be executed in the background. The queue an upload and need to be able to identify the uploaded file after it has been uploaded to the server. The disconnected nature of the connection might cause workflows or manual user interaction with the file on the server to move the file to a different place or changing the content while the device is offline. However, on the device users might have marked the file as favorite or added it to other iOS specific collections. To be able to reliably identify the file the client can generate a `uuid` and attach it to the file metadata during the upload. While it is not necessary to look up files by this `uuid` having a second file id that serves exactly the same purpose as the `file id` is redundant.
Another aspect for the `file id` / `uuid` is that it must be a logical identifier that can be set, at least by internal systems. Without a writeable fileid we cannot restore backups or migrate storage spaces from one storage provider to another storage provider.
Technically, this means that every storage driler needs to have a map of a `uuid` to in internal resource identifier. This internal resource identifier can be
Technically, this means that every storage driver needs to have a map of a `uuid` to in internal resource identifier. This internal resource identifier can be
- an eos fileid, because eos can look up files by id
- an inode if the filesystem and the storage driver support lookung up by inode
- an inode if the filesystem and the storage driver support looking up by inode
- a path if the storage driver has no way of looking up files by id.
- In this case other mechanisms like inotify, kernel audit or a fuse overlay might be used to keep the paths up to date.
- to prevent excessive writes when deep folders are renamed a reverse map might be used: it will map the `uuid` to `<parentuuid>:<childname>`, allowing to trade writes for reads
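
The reverse map mentioned in the last bullet can be sketched as follows. This is a minimal in-memory illustration, not the implementation of any storage driver: the names `ReverseMap`, `resolve_path` and `rename` as well as the sample uuids are assumptions made for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical reverse map: uuid -> (parent uuid, child name).
# The root entry has no parent and an empty name.
ReverseMap = Dict[str, Tuple[Optional[str], str]]


def resolve_path(reverse_map: ReverseMap, uuid: str) -> str:
    """Rebuild the path of `uuid` by walking up the parent links (one read per level)."""
    segments = []
    current: Optional[str] = uuid
    while current is not None:
        parent, name = reverse_map[current]
        segments.append(name)
        current = parent
    return "/" + "/".join(s for s in reversed(segments) if s)


def rename(reverse_map: ReverseMap, uuid: str, new_name: str) -> None:
    """A rename rewrites exactly one entry, no matter how many descendants exist."""
    parent, _ = reverse_map[uuid]
    reverse_map[uuid] = (parent, new_name)


# Illustrative data only.
reverse_map: ReverseMap = {
    "root": (None, ""),
    "a": ("root", "photos"),
    "b": ("a", "2021"),
    "c": ("b", "beach.jpg"),
}
```

Renaming `a` changes the resolved path of `c` without touching the entries of `b` or `c`. That is the trade described above: a single write on rename, paid for with one read per path segment on lookup.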
@@ -142,7 +142,7 @@ Examples would be every user's home storage space, project storage spaces or gro
Finally, a logical `storage space id` is not tied to a specific [*storage provider*]({{< ref "#storage-providers" >}}). If the [*storage driver*]({{< ref "#storage-drivers" >}}) supports it, we can import existing files including their `file id`, which makes it possible to move [*storage spaces*]({{< ref "#storage-spaces" >}}) between [*storage providers*]({{< ref "#storage-providers" >}}) to implement storage classes, e.g. with or without archival, workflows, on SSDs or HDDs.
## Shares
*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavywheight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the users home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even file indvidual shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.*
*To be clarified: we are aware that [*storage spaces*]({{< ref "#storage-spaces" >}}) may be too 'heavyweight' for ad hoc sharing with groups. That being said, there is no technical reason why group shares should not be treated like [*storage spaces*]({{< ref "#storage-spaces" >}}) that users can provision themselves. They would share the quota with the users home [*storage space*]({{< ref "#storage-spaces" >}}) and the share initiator would be the sole owner. Technically, the mechanism of treating a share like a new [*storage space*]({{< ref "#storage-spaces" >}}) would be the same. This obviously also extends to user shares and even file individual shares that would be wrapped in a virtual collection. It would also become possible to share collections of arbitrary files in a single storage space, e.g. the ten best pictures from a large album.*
## Storage Systems

@@ -15,11 +15,11 @@ Welcome to oCIS, the modern file-sync and share platform, which is based on our
### The idea of federated storage
To creata a truly federated storage architecture oCIS breaks down the old ownCloud 10 user specific namespace, which is assembled on the server side, and makes the individual parts accessible to clients as storage spaces and storage space registries.
To create a truly federated storage architecture oCIS breaks down the old ownCloud 10 user specific namespace, which is assembled on the server side, and makes the individual parts accessible to clients as storage spaces and storage space registries.
The below diagram shows the core conceps that are the foundation for the new architecture:
The below diagram shows the core concepts that are the foundation for the new architecture:
- End user devices can fetch the list of *storage spaces* a user has access to, by querying one or multiple *storage space registries*. The list contains a unique endpoint for every *storage space*.
- [*Storage space registries*]({{< ref "../extensions/storage/terminology#storage-space-registries" >}}) manage the list of storage spaces a user has access to. They may subscrible to *storage spaces* in order to receive notifications about changes on behalf of an end users mobile or desktop client.
- [*Storage space registries*]({{< ref "../extensions/storage/terminology#storage-space-registries" >}}) manage the list of storage spaces a user has access to. They may subscribe to *storage spaces* in order to receive notifications about changes on behalf of an end users mobile or desktop client.
- [*Storage spaces*]({{< ref "../extensions/storage/terminology#storage-spaces" >}}) represent a collection of files and folders. A users personal files are a *storage space*, a group or project drive is a *storage space*, and even incoming shares are treated and implemented as *storage spaces*. Each with properties like owners, permissions, quota and type.
- [*Storage providers*]({{< ref "../extensions/storage/terminology#storage-providers" >}}) can hold multiple *storage spaces*. At an oCIS instance, there might be a dedicated *storage provider* responsible for users personal storage spaces. There might be multiple, sharing the load or there might be just one, hosting all types of *storage spaces*.

@@ -69,7 +69,7 @@ Chosen option: "Move accounts functionality to GLAuth and name it accounts", by
### Negative Consequences
* If users want to store users in their IDM and at the same time guests in a seperate user management we need to implement GLAuth backends that support more than one LDAP server.
* If users want to store users in their IDM and at the same time guests in a separate user management we need to implement GLAuth backends that support more than one LDAP server.
## Pros and Cons of the Options

@@ -60,7 +60,7 @@ The migration happens while the service is offline. File metadata, blobs and sha
- Good, because oCIS can be tested in a staging system without writing to the production system.
- Good, because file layout on disk can be changed to support new storage driver capabilities.
- Bad, because the export and import might require significant amounts of storage.
- Bad, because a rollback to the state before the migration might cause data loss of the changes that happend in between.
- Bad, because a rollback to the state before the migration might cause data loss of the changes that happened in between.
- Bad, because the cold migration can mean significant downtime.
### Hot Migration

@@ -37,7 +37,7 @@ Chosen option: "Dynamic service registration". There were some drawbacks regardi
* Having dynamic service registration delegates the entire lifecycle of finding a process to the service registry.
* Removing a-priori knowledge of hostname + port for services.
* Marrying go-micro's registry and a newly defined registry abstraction on Reva.
* We will embrace go-micro interfaces by defining a third merger interface in order to marry go-micro registry and rega revistry.
* We will embrace go-micro interfaces by defining a third merger interface in order to marry go-micro registry and reva registry.
* The ability to fetch a service node relying only on its name (i.e: com.owncloud.proxy) and not on a tuple hostname + port that we rely on being preconfigured during runtime.
* Conceptually speaking, a better framework to tie all the services together. Referring to services by names is less overall confusing than having to add a service name + where it is running. A registry is agnostic to "where is it running" because it, by definition, keeps track of this specific question, so when speaking about design or functionality, it will ease communication.
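
To illustrate what "fetching a service node relying only on its name" means, here is a small in-memory sketch. It deliberately does not reproduce the go-micro or reva registry interfaces: `Registry`, `Node`, their methods and the addresses are hypothetical names chosen for this example.

```python
import random
from collections import defaultdict
from typing import DefaultDict, List, NamedTuple


class Node(NamedTuple):
    address: str  # "host:port"; callers never need to preconfigure this


class Registry:
    """Hypothetical in-memory registry keyed by service name."""

    def __init__(self) -> None:
        self._services: DefaultDict[str, List[Node]] = defaultdict(list)

    def register(self, name: str, node: Node) -> None:
        # Called by a service when it starts.
        self._services[name].append(node)

    def deregister(self, name: str, node: Node) -> None:
        # Called by a service when it shuts down.
        self._services[name].remove(node)

    def get_node(self, name: str) -> Node:
        # Resolve a service by name only; pick any of its registered nodes.
        nodes = self._services.get(name)
        if not nodes:
            raise LookupError(f"no node registered for service {name!r}")
        return random.choice(nodes)


registry = Registry()
registry.register("com.owncloud.proxy", Node("10.0.0.5:9200"))
proxy = registry.get_node("com.owncloud.proxy")
```

The caller only knows `com.owncloud.proxy`. Where that service runs is a question the registry answers at lookup time.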

@@ -23,11 +23,11 @@ There should be a way to impose certain limitations in areas of the code that re
## Considered Options
1. Build the evaluation engine in-house.
2. Use third party libraries such as Open Policy Agent (a CNCF aproved project written in Go)
2. Use third party libraries such as Open Policy Agent (a CNCF approved project written in Go)
## Decision Outcome
Chosen option: option 2; Use third party libraries such as Open Policy Agent (a CNCF aproved project written in Go)
Chosen option: option 2; Use third party libraries such as Open Policy Agent (a CNCF approved project written in Go)
### Positive Consequences

@@ -177,7 +177,7 @@ There is a customized ownCloud instance that uses path only based URLs:
{{< hint >}}
* `/#` is used by the current vue router.
* `/s` denotes that this is a space url.
* `<space_id>` and `<resource_id>` both consist of `<storage_id>:<node_id>`, but the `space_id` can be replaced with a shorter id or an alias. See furthor down below.
* `<space_id>` and `<resource_id>` both consist of `<storage_id>:<node_id>`, but the `space_id` can be replaced with a shorter id or an alias. See further down below.
* `<relative/path>` takes precedence over the `<resource_id>`, both are optional
{{< /hint >}}
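
The `<storage_id>:<node_id>` structure described in the hint can be parsed as sketched below. `ResourceId` and `parse_id` are names assumed for this example, and the sketch only covers the full form. A shortened `space_id` or an alias would need a separate lookup that is not shown here.

```python
from typing import NamedTuple


class ResourceId(NamedTuple):
    storage_id: str
    node_id: str


def parse_id(value: str) -> ResourceId:
    """Split a `<storage_id>:<node_id>` string at the first colon."""
    storage_id, sep, node_id = value.partition(":")
    if not sep or not storage_id or not node_id:
        raise ValueError(f"expected '<storage_id>:<node_id>', got {value!r}")
    return ResourceId(storage_id, node_id)


parsed = parse_id(
    "b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608"
)
```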
@@ -245,11 +245,11 @@ When every space has a namespaced alias and a relative path we can build a globa
| `https://demo.owncloud.com/files/personal/einstein/relative/path/to/resource?id=b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608` | sub folder `/relative/path/to/resource` |
| `https://demo.owncloud.com/files/shares/einstein/somesharename?id=b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608` | shared URL for `/relative/path/to/resource` |
| `https://demo.owncloud.com/files/personal/einstein/marie is stupid/and richard as well/resource?id=b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608` | sub folder `marie is stupid/and richard as well/resource` ... something einstein might not want to reveal |
| `https://demo.owncloud.com/files/shares/einstein/resource (2)?id=b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608` | named link URL for `/marie is stupid/and richard as well/resource`, does not disclose the actual hierarchy, has an appended counter to avaid a collision |
| `https://demo.owncloud.com/files/shares/einstein/resource (2)?id=b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608` | named link URL for `/marie is stupid/and richard as well/resource`, does not disclose the actual hierarchy, has an appended counter to avoid a collision |
| `https://demo.owncloud.com/files/shares/einstein/mybestfriends?id=b78c2044-5b51-446f-82f6-907a664d089c:194b4a97-597c-4461-ab56-afd4f5a21608` | named link URL for `/marie is stupid/and richard as well/resource`, does not disclose the actual hierarchy, has a custom alias for the share |
| `https://demo.owncloud.com/files/public/kcZVYaXr7oZ66bg/relative/path/to/resource` | sub folder `/relative/path/to/resource` in public link with token `kcZVYaXr7oZ66bg` |
| `https://demo.owncloud.com/files/public/kcZVYaXr7oZ66bg/relative/path/to/resource` | sub folder `/relative/path/to/resource` in public link with token `kcZVYaXr7oZ66bg` |
| `https://demo.owncloud.com/s/kcZVYaXr7oZ66bg/` | shortened link to a resource. This is needed to be able to copy a link to a resource whithout leaking any metadata. |
| `https://demo.owncloud.com/s/kcZVYaXr7oZ66bg/` | shortened link to a resource. This is needed to be able to copy a link to a resource without leaking any metadata. |
`</namespaced/alias></relative/path/to/resource>` is the global path in the CS3 api. The CS3 Storage Registry is responsible by managing the mount points.
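
One way to picture the mount point handling is a longest-prefix match that splits a global path into a space and a relative path. This is only a sketch under assumptions: the function `resolve`, the mount table and the space names `space-1` and `space-2` are invented for illustration and do not describe the actual CS3 Storage Registry.

```python
from typing import Dict, Tuple


def resolve(mounts: Dict[str, str], global_path: str) -> Tuple[str, str]:
    """Return (space, relative path) for the longest mount point matching `global_path`."""
    best = ""
    for mount_point in mounts:
        # Match only on whole path segments, so "/personal/ein" does not match.
        if global_path == mount_point or global_path.startswith(mount_point + "/"):
            if len(mount_point) > len(best):
                best = mount_point
    if not best:
        raise LookupError(f"no mount point for {global_path!r}")
    return mounts[best], global_path[len(best):]


# Hypothetical mount table: namespaced alias -> space.
mounts = {
    "/personal/einstein": "space-1",
    "/shares/einstein/somesharename": "space-2",
}
```

With this table, `/personal/einstein/relative/path/to/resource` resolves to `space-1` and the relative path `/relative/path/to/resource`.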

@@ -115,7 +115,7 @@ On Linux and macOS you can add them to your `/etc/hosts` files like this:
```
127.0.0.1 ocis.owncloud.test
127.0.0.1 traefik.owncloud.testt
127.0.0.1 traefik.owncloud.test
```
After that you're ready to start the application stack:

@@ -19,7 +19,7 @@ geekdocFilePath: ocis_keycloak.md
The docker stack consists 4 containers. One of them is Traefik, a proxy which is terminating ssl and forwards the requests to oCIS in the internal docker network.
Keykloak add two containers: Keycloak itself and a PostgreSQL as database. Keycloak will be configured as oCIS' IDP instead of the internal IDP [LibreGraph Connect]({{< ref "../../extensions/idp" >}})
Keycloak add two containers: Keycloak itself and a PostgreSQL as database. Keycloak will be configured as oCIS' IDP instead of the internal IDP [LibreGraph Connect]({{< ref "../../extensions/idp" >}})
The other container is oCIS itself running all extensions in one container. In this example oCIS uses [oCIS storage driver]({{< ref "../../extensions/storage/storages#storage-drivers" >}})

@@ -7,7 +7,7 @@ geekdocEditPath: edit/master/docs/ocis/development
geekdocFilePath: extensions.md
---
oCIS is all about files, sync and share - but most of the time there is more you want to do with your files, e.g. having a different view on your photo collection or editing your offices files in an online file editor. ownCloud 10 faced the same problem and solved it with `applications`, which can extend the functionality of ownCloud 10 in a wide range. Since oCIS is different in its architecture compared to ownCloud 10, we had to come up with a similiar (yet slightly different) solution. To extend the functionality of oCIS, you can write or install `extensions`. An extension is basically any running code which integrates into oCIS and provides functionality to oCIS and its users. Because extensions are just microservices providing an API, you can technically choose any programming language you like - a huge improvement to ownCloud 10, where it was nearly impossible to use a different programming language than PHP.
oCIS is all about files, sync and share - but most of the time there is more you want to do with your files, e.g. having a different view on your photo collection or editing your offices files in an online file editor. ownCloud 10 faced the same problem and solved it with `applications`, which can extend the functionality of ownCloud 10 in a wide range. Since oCIS is different in its architecture compared to ownCloud 10, we had to come up with a similar (yet slightly different) solution. To extend the functionality of oCIS, you can write or install `extensions`. An extension is basically any running code which integrates into oCIS and provides functionality to oCIS and its users. Because extensions are just microservices providing an API, you can technically choose any programming language you like - a huge improvement to ownCloud 10, where it was nearly impossible to use a different programming language than PHP.
We will now introduce you to the oCIS extension system and show you how you can create a custom extension yourself.

@@ -27,7 +27,7 @@ If you find tools needed besides the mentioned above, please feel free to open a
oCIS consists of multiple micro services, also called extensions. We started by having standalone repositories for each of them, but quickly noticed that this adds a time consuming overhead for developers. So we ended up with a monorepo housing all the extensions in one repository.
Each extension lives in a subfolder (eg. `accounts` or `settings`) within this respository as an independent Go module, following the [golang-standard project-layout](https://github.com/golang-standards/project-layout). They have common Makefile targets and can be used to change, build and run individual extensions. This allows us to version and release each extension independently.
Each extension lives in a subfolder (eg. `accounts` or `settings`) within this repository as an independent Go module, following the [golang-standard project-layout](https://github.com/golang-standards/project-layout). They have common Makefile targets and can be used to change, build and run individual extensions. This allows us to version and release each extension independently.
The `ocis` folder contains our [go-micro](https://github.com/asim/go-micro/) and [suture](https://github.com/thejerf/suture) based runtime. It is used to import all extensions and implements commands to manage them, similar to a small orchestrator. With the resulting oCIS binary you can start single extensions or even all extensions at the same time.

@@ -50,7 +50,7 @@ sequenceDiagram
end
proxy->>+accounts: TODO API call to exchange sub@iss with account UUID
Note over proxy,accounts: does not autoprovision users. They are explicitly provsioned later.
Note over proxy,accounts: does not autoprovision users. They are explicitly provisioned later.
alt account exists or has been migrated

@@ -78,7 +78,7 @@ _TODO @butonic add ADR for OpenID Connect_
#### User impact
When introducing OpenID Connect, the clients will detect the new authentication scheme when their current way of authenticating returns an error. Users will then have to
reauthorize at the OpenID Connecd IdP, which again, may be configured to skip the consent step for trusted clients.
reauthorize at the OpenID Connect IdP, which again, may be configured to skip the consent step for trusted clients.
#### Steps
1. There are multiple products that can be used as an OpenID Connect IdP. We test with [LibreGraph Connect](https://github.com/libregraph/lico), which is also [embedded in oCIS](https://github.com/owncloud/web/). Other alternatives include [Keycloak](https://www.keycloak.org/) or [Ping](https://www.pingidentity.com/). Please refer to the corresponding setup instructions for the product you intent to use.
@@ -106,7 +106,7 @@ Should there be problems with OpenID Connect at this point you can disable the a
<div style="break-after: avoid"></div>
Legacy clients relying on Basic auth or app passwords need to be migrated to OpenId Connect to work with oCIS. For a transition period Basic auth in oCIS can be enabled with `PROXY_ENABLE_BASIC_AUTH=true`, but we strongly recommend adopting OpenID Connect for other tools as well.
While OpenID Connect providers will send an `iss` and `sub` claim that relying parties (services like oCIS or ownCloud 10) can use to identify users we recommend introducing a dedicated, globally unique, persistent, non-reassignable user identifier like a UUID for every user. This `ownclouduuid` shold be sent as an additional claim to save additional lookups on the server side. It will become the user id in oCIS, e.g. when searching for recipients the `ownclouduuid` will be used to persist permissions with the share manager. It has a different purpose than the ownCloud 10 username, which is used to login. Using UUIDs we can not only mitigate username collisions when merging multiple instances but also allow renaming usernames after the migration to oCIS has been completed.
While OpenID Connect providers will send an `iss` and `sub` claim that relying parties (services like oCIS or ownCloud 10) can use to identify users we recommend introducing a dedicated, globally unique, persistent, non-reassignable user identifier like a UUID for every user. This `ownclouduuid` should be sent as an additional claim to save additional lookups on the server side. It will become the user id in oCIS, e.g. when searching for recipients the `ownclouduuid` will be used to persist permissions with the share manager. It has a different purpose than the ownCloud 10 username, which is used to login. Using UUIDs we can not only mitigate username collisions when merging multiple instances but also allow renaming usernames after the migration to oCIS has been completed.
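
The effect of sending `ownclouduuid` as an additional claim can be sketched like this. The function name `user_id_from_claims`, the `lookup` callback and the sample values are assumptions for illustration. The `sub@iss` key format follows the sequence diagram note elsewhere in this documentation and is not a confirmed API.

```python
from typing import Callable, Mapping


def user_id_from_claims(
    claims: Mapping[str, str],
    lookup: Callable[[str], str],
) -> str:
    """Prefer the `ownclouduuid` claim; otherwise fall back to a server-side lookup."""
    uuid = claims.get("ownclouduuid")
    if uuid:
        # No extra round trip: the IdP already told us the persistent id.
        return uuid
    # Hypothetical fallback: exchange sub@iss for the account UUID.
    return lookup(f"{claims['sub']}@{claims['iss']}")


# Illustrative stand-in for an accounts lookup.
accounts = {"einstein@https://idp.example.test": "uuid-from-lookup"}

with_claim = user_id_from_claims(
    {"iss": "https://idp.example.test", "sub": "einstein", "ownclouduuid": "uuid-from-claim"},
    accounts.__getitem__,
)
without_claim = user_id_from_claims(
    {"iss": "https://idp.example.test", "sub": "einstein"},
    accounts.__getitem__,
)
```

When the claim is present the lookup is never called, which is the saving on the server side described above.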
<div class="editpage">
@@ -322,8 +322,8 @@ _TODO @butonic update performance comparisons nightly_
#### Steps
There are several options to move users to the oCIS backend:
- Use a canary app to let users decide thamselves
- Use an early adoptors group with an opt in
- Use a canary app to let users decide themselves
- Use an early adopters group with an opt in
- Force migrate users in batch or one by one at the administrators will
#### Verification
@@ -333,7 +333,7 @@ The same verification steps as for the internal testing stage apply. Just from t
Until now, the oCIS configuration mimics ownCloud 10 and uses the old data directory layout and the ownCloud 10 database. Users can seamlessly be switched from ownCloud 10 to oCIS and back again.
<div class="editpage">
_TODO @butonic we need a canary app that allows users to decide for themself which backend to use_
_TODO @butonic we need a canary app that allows users to decide for themselves which backend to use_
</div>
@@ -430,7 +430,7 @@ _TODO @butonic document how to manually do that until the storage registry can d
Start with a test user, then move to early adopters and finally migrate all users.
#### Rollback
To switch the storage provider again the same storage space migration can be performed again: copy medatata and blob data using the CS3 api, then change the responsible storage provider in the storage registry.
To switch the storage provider again the same storage space migration can be performed again: copy metadata and blob data using the CS3 api, then change the responsible storage provider in the storage registry.
#### Notes
<div style="break-after: avoid"></div>
@@ -473,7 +473,7 @@ _TODO for storage provider as source of truth persist ALL share data in the stor
</div>
#### Verification
After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adoptors to the new gateway. When no problems occur you can start the desired number of share managers and roll out the change to all gateways.
After copying all metadata start a dedicated gateway and change the configuration to use the new share manager. Route a test user, a test group and early adopters to the new gateway. When no problems occur you can start the desired number of share managers and roll out the change to all gateways.
<div class="editpage">
@@ -568,7 +568,7 @@ The `filecache` table itself has more metadata:
| Field | Type | Null | Key | Default | Extra | Comment | Migration |
|--------------------|---------------|------|-----|---------|----------------|----------------|----------------|
| `fileid` | bigint(20) | NO | PRI | NULL | auto_increment | | MUST become the oCIS `opaqueid` of a file reference. `ocis` driver stores it in extendet attributes and can use numbers as node ids on disk. for eos see note below table |
| `fileid` | bigint(20) | NO | PRI | NULL | auto_increment | | MUST become the oCIS `opaqueid` of a file reference. `ocis` driver stores it in extended attributes and can use numbers as node ids on disk. for eos see note below table |
| `storage` | int(11) | NO | MUL | 0 | | *the filecache holds metadata for multiple storages* | corresponds to an oCIS *storage space* |
| `path` | varchar(4000) | YES | | NULL | | *the path relative to the storages root* | MUST become the `path` relative to the storage root. `files` prefix needs to be trimmed. |
| `path_hash` | varchar(32) | NO | | | | *mysql once had problems indexing long paths, so we stored a hash for lookup by path. | - |