diff --git a/docs/architecture/posixfs.md b/docs/architecture/posixfs.md
index 4b8ccd0bf..c4fcd1816 100644
--- a/docs/architecture/posixfs.md
+++ b/docs/architecture/posixfs.md
@@ -17,21 +17,21 @@ The scope of this document is to give a high level overview to the technical asp
 Posix FS is a backend component that manages files on the server utilizing a "real" file tree that represents the data with folders and files in the file system as users are used to it. That is the big difference compared to Decomposed FS which is the default storage driver in Infinite Scale.
 
-This does not mean that Infinite Scale is trading any of its benefits to this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds them transparently to chaches and search index, and it also features the full spaces concept as before, just to name a few example.
+This does not mean that Infinite Scale is trading any of its benefits for this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds it transparently to caches and search indexes, and it also features the full spaces concept as before, just to name a few examples.
 
-The architecture of Infinite Scale allows to configure different storage drivers for specific storage types and purposes on a space granularity. The Posix FS storage driver is an alternative to the default driver called Decomposed FS.
+The architecture of Infinite Scale allows configuring different storage drivers for specific storage types and purposes at space granularity. The Posix FS storage driver is an alternative to the default driver called Decomposed FS.
 
-However, the clarity of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes made to files outside of Infinite Scale are monitored and directly reflected in Infinite Scale.
+However, the clarity of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes made to files outside of Infinite Scale are monitored and directly reflected in Infinite Scale. For example, a scanner could store its output directly in the Infinite Scale file system, where it is immediately picked up by Infinite Scale.
 
-The first time ever with feature rich open source file sync & share systems, users can either choose to work with their data through the clients of the system, its APIs or even directly in the underlying file system on the server.
+For the first time among feature rich open source file sync & share systems, users can choose to work with their data either through the clients of the system, its APIs, or even directly in the underlying file system on the server.
 
-That is another powerful vector for integration and enables a new spectrum of use cases across all domains. Just imagine how much software can write files, and can now directly make them accessible real time in a convenient, secure and efficient way.
+That is another powerful vector for integration and enables a new spectrum of use cases across all domains.
 
 ## Technical Aspects
 
 The Posix FS technology uses a few features of the underlying file system, which are mandatory and directly contributing to the performance of the system.
 
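+One such feature can be illustrated with the `inotifywait` tool that the inotify based watcher builds on. This is only a sketch; the storage path is a placeholder for whatever storage root is configured:
+
+```
+# Watch the configured storage root recursively and print file events.
+# The Posix FS watcher consumes a comparable event stream to learn about
+# changes made directly in the file system.
+inotifywait --monitor --recursive \
+  --event create,modify,delete,moved_to,moved_from \
+  /path/to/posix-storage
+```
+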
-While the simplest form of Posix FS runs with default file systems of every modern Linux system, the full power of this unfolds with more capable file systems such as IBM Storage Scale or Ceph. These are recommended as reliable foundations for big installations of Infinite Scale.
+While the simplest form of Posix FS runs with the default file systems of every modern Linux system, which are directly mounted and thus support inotify, the full power of this technology unfolds with more capable file systems such as IBM Storage Scale or Ceph. These are recommended as reliable foundations for big installations of Infinite Scale.
 
 This chapter describes some technical aspects of the storage driver.
 
@@ -53,25 +53,25 @@ All indexing and caching of metadata is implemented in higher system levels than
 
 ### Monitoring
 
-To get information about changes such as new files added, files edited or removed, Infinite Sale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services as for example in Spectrum Scale (see [GPFS Specifics](#gpfs-specifics) for more details on GPFS file systems).
+To get information about changes such as new files added, files edited or removed, Infinite Scale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services, as for example in Spectrum Scale (see [GPFS Specifics](#gpfs-specifics) for more details on GPFS file systems).
 
 Based on the information transmitted by the watching service, Infinite Scale is able to "register" new or changed files into its own caches and internal management structures. This enables Infinite Scale to deliver resource changes through the "traditional" channels such as APIs and clients.
 
-Since the most important metadata is the file tree structure itself, the "split brain" situation between data and metadata is impossible to cause trouble.
+Since the most important metadata is the file tree structure itself, it is impossible for a "split brain" situation between data and metadata to cause trouble.
 
 ### Automatic ETag Propagation
 
-The ETag of a resource can be understood as a content fingerprint of any file- or folder resource in Infinite Scale. It is mainly used by clients to detect changes of resources. The rule is that if the content of a file changed the ETag has to change as well, as well as the ETag of all parent folders up to the root of the space.
+The ETag of a resource can be understood as a content fingerprint of any file- or folder resource in Infinite Scale. It is mainly used by clients to detect changes of resources. The rule is that if the content of a file changes, its ETag has to change as well, as do the ETags of all parent folders up to the root of the space.
 
 Infinite Scale uses a built in mechanism to maintain the ETag for each resource in the file meta data, and also propagates it automatically.
 
-In the future a sophisticated underlying file system could provide an attribute that fulfills this requirement and changes whenever content or metadata of a resource changes, and - which is most important - also changes the attribute of the parent resource and the parent of the parent etc.
+A sophisticated underlying file system could provide an attribute that fulfills this requirement and changes whenever content or metadata of a resource changes, and - which is most important - also changes the attribute of the parent resource and the parent of the parent etc.
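+
+As a worked example of this propagation, with made-up ETag values: editing a single file changes its own ETag and the ETag of every folder above it, while untouched siblings keep theirs.
+
+```
+space root/          ETag "a1" -> "a2"   (propagated)
+└── projects/        ETag "b7" -> "b8"   (propagated)
+    ├── report.txt   ETag "c3" -> "c4"   (content changed)
+    └── notes.txt    ETag "d9"           (unchanged)
+```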
 
 ### Automatic Tree Size Propagation
 
 Similar to the ETag propagation described before, Infinite Scale also tracks the accumulated tree size in all nodes of the file tree. A change to any file requires a re-calculation of the size attribute in all parent folders.
 
-In the future Infinite Scale could benefit from file systems with native tree size propagation.
+Infinite Scale would benefit from file systems with native tree size propagation.
 
 ### Quota
 
@@ -81,28 +81,28 @@ For example, IBM Spectrum Scale supports quota handling directly in the file sys
 
 Other systems store quota data in the metadata storage and implement propagation of used quota similar to the tree size propagation.
 
-### File Id Resolution
+### File ID Resolution
 
-Infinite Scale uses an Id based approach to work with resources, rather than a file path based mechanism. The reason for that is that Id based lookups can be done way more efficient compared to tree traversals, just to name one reason.
+Infinite Scale uses an ID based approach to work with resources, rather than a file path based mechanism. One reason is that ID based lookups can be done far more efficiently than tree traversals.
 
-The most important component of the Id is a unique file Id that identifies the resource within a space. Ideally the Inode of a file could be used here. However, some file systems re-use inodes which must be avoided. Infinite Scale thus does not use the file Inode, but generates a UUID instead.
+The most important component of the ID is a unique file ID that identifies the resource within a space. Ideally the Inode of a file could be used here. However, some file systems re-use inodes, which must be avoided. Infinite Scale thus does not use the file Inode, but generates a UUID instead.
 
-ID based lookups utilize an Id cache which needs to be shared between all storageprovider and dataprovider instances. During startup a scan of the whole file tree is performed to detect and cache new entities.
+ID based lookups utilize an ID cache which needs to be shared between all storageprovider and dataprovider instances. During startup a scan of the whole file tree is performed to detect and cache new entities.
 
 In the future a powerful underlying file system could support Infinite Scale by providing an API that
 
-1. Provides the Id for a given file path referenced resource
-2. Provides the path for a given Id.
+1. Provides the ID for a resource referenced by a file path
+2. Provides the path for a given ID.
 
 These two operations are very crucial for the performance of the entire system.
 
 ### User Management
 
-With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is an important question.
+With the requirement that data can be manipulated either through the filesystem or the Infinite Scale system, the question under which uid the manipulation happens is important.
 
 There are a few possible ways for user management:
 
-1. Changes can either be only accepted by the same user that Infinite Scale is running under, for example the user `ocis`. All manipulations in the filesystem have to be done by, and only by, this user.
-2. Group based: All users who should be able to manipulate files have to be in a unix group. The Infinite Scale user has also to be in there. The default umask in the directory used has to allow group writing all over the place.
+1. Changes are only accepted from the same user that Infinite Scale is running under, for example the user `ocis`. All manipulations in the filesystem have to be done by, and only by, this user.
+2. Group based: All users who should be able to manipulate files have to be in a unix group. The Infinite Scale user also has to be a member of that group. The default umask in the directory used has to allow group writing all over the place.
 3. Impersonation: Infinite Scale impersonates the user who owns the folder on the file system to mimic the access as the user.
 
 All possibilities have pros and cons for operations.
@@ -133,7 +133,7 @@ In the current state of the Posix FS, trash bin is not supported.
 
 ## Limitations
 
-As of Q2/2024 the Posix FS is in technical preview state which means that it is not officially supported.
+As of Q2/2024, the Posix FS is not officially supported and is in technical preview state.
 
 The tech preview comes with the following limitations:
 
@@ -141,6 +141,7 @@ The tech preview comes with the following limitations:
 1. Advanced features versioning and trashbin are not supported yet
 1. The space/project folders in the filesystem are named after the UUID, not the real space name
 1. No CephFS support yet
+1. Postprocessing (e.g. antivirus scanning) does not happen for file actions performed outside of Infinite Scale
 
 ## Setup
 
@@ -156,7 +157,7 @@ To run Posix FS, the following prerequisites have to be fulfilled:
 1. When using inotify, the storage must be local on the same machine. Network mounts do not work with inotify. `inotifywait` needs to be installed.
 1. The storage root path must be writeable and executable by the same user Infinite Scale is running under
 1. An appropiate version of Infinite Scale is installed, version number 5.0.5 and later
-1. Either redis or nats-js-kv cache service
+1. nats-js-kv is used as the cache service
 
 ### Setup Configuration
 
@@ -167,10 +168,8 @@ This is an example configuration with environment variables that configures Infi
 export STORAGE_USERS_DRIVER="posix"
 export STORAGE_USERS_POSIX_ROOT="/home/kf/tmp/posix-storage"
 export STORAGE_USERS_POSIX_WATCH_TYPE="inotifywait"
-export STORAGE_USERS_ID_CACHE_STORE="nats-js-kv" # for redis "redis"
-export STORAGE_USERS_ID_CACHE_STORE_NODES="localhost:9233" # for redis "127.0.0.1:6379"
-
-
+export STORAGE_USERS_ID_CACHE_STORE="nats-js-kv"
+export STORAGE_USERS_ID_CACHE_STORE_NODES="localhost:9233"
 
 # Optionally enable gid based space access
 export STORAGE_USERS_POSIX_USE_SPACE_GROUPS="true"
@@ -207,4 +206,4 @@ The gpfswatchfolder watcher connects to a kafka cluster which is being filled wi
 export STORAGE_USERS_POSIX_WATCH_TYPE="gpfswatchfolder"
 export STORAGE_USERS_POSIX_WATCH_PATH="fs1_audit" # the kafka topic to watch
 export STORAGE_USERS_POSIX_WATCH_FOLDER_KAFKA_BROKERS="192.168.1.180:29092"
-```
\ No newline at end of file
+```
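+
+To check that the audit topic configured above actually receives events, a quick look with the standard Kafka console consumer can help. This is a sketch only, assuming the Kafka CLI tools are available on a host that can reach the broker; broker address and topic name are taken from the example configuration:
+
+```
+# Hypothetical smoke test: tail the topic the gpfswatchfolder watcher consumes.
+# If file operations in the GPFS file system do not show up here, the watcher
+# has nothing to work with either.
+kafka-console-consumer.sh --bootstrap-server 192.168.1.180:29092 --topic fs1_audit
+```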