From 8fed32ead98ed0e87797e08c97ac9eabcfff6564 Mon Sep 17 00:00:00 2001
From: Klaas Freitag
Date: Tue, 4 Jun 2024 16:13:41 +0200
Subject: [PATCH] Added some more aspects from previous documentation about
 posix fs

Some additional cleanups
---
 docs/architecture/posixfs.md | 31 ++++++++++++++++++++++++-------
 1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/docs/architecture/posixfs.md b/docs/architecture/posixfs.md
index 6f57f6183..7904ce005 100644
--- a/docs/architecture/posixfs.md
+++ b/docs/architecture/posixfs.md
@@ -7,32 +7,40 @@ geekdocEditPath: edit/master/docs/architecture
 geekdocFilePath: posixfs.md
 ---
 
+{{< toc >}}
+
 Posix FS is the working name for the collaborative storage driver for Infinite Scale. The scope of this document is to give a high-level overview of the technical aspects of the Posix FS and to guide the setup.
 
-## A Clean File Tree
+## Introduction
 
 Posix FS is a backend component that manages files on the server utilizing a "real" file tree that represents the data with folders and files in the file system, just as users are used to. That is the big difference compared to Decomposed FS, which is the default storage driver in Infinite Scale.
 
 This does not mean that Infinite Scale is trading any of its benefits for this new feature: It still implements simplicity by running without a database, it continues to store metadata in the file system and adds it transparently to caches and the search index, and it also features the full spaces concept as before, just to name a few examples.
 
-Based on the great system architecture of Infinite Scale, which allows to add different storage drivers with specific attributes, the Posix FS shares a lot of code with the decompsed FS which contributes to the stability and uniformity of both components.
+The architecture of Infinite Scale allows configuring different storage drivers for specific storage types and purposes at space granularity. The Posix FS storage driver is an alternative to the default driver, called Decomposed FS.
 
-However, the clearance of the file structure in the underlying file system is not the only benefit of the Posix FS. Moreover, this new technology allows users to manipulate the data directly in the file system, and any changes to files made even outside of Infinite Scale are monitored and directly reflected in Infinite Scale.
+However, the clarity of the file structure in the underlying file system is not the only benefit of the Posix FS. This new technology allows users to manipulate the data directly in the file system, and any changes to files made outside of Infinite Scale are monitored and directly reflected in Infinite Scale.
 
-The first time ever with feature rich open source file synce & share systems, users can either choose to work with their data through the clients of the system, it's API's or directly in the underlying file system on the server.
+For the first time ever in feature-rich open source file sync & share systems, users can choose to work with their data through the clients of the system, its APIs, or even directly in the underlying file system on the server.
 
-That is another powerful vector for integration and enables a new universe of use cases across all domains. Just imagine how many software can write files, and can now directly make them accessible real time in a convenient, secure and efficient way.
+That is another powerful vector for integration and enables a new spectrum of use cases across all domains. Just imagine how much software can write files, and can now make those files directly accessible in real time in a convenient, secure and efficient way.
 
 ## Technical Aspects
 
-The PosixFS technology uses a few features of the underlying file system, which are directly contributing to the performance of the system.
+The Posix FS technology uses a few features of the underlying file system, which are mandatory and contribute directly to the performance of the system.
 
 While the simplest form of Posix FS runs with the default file systems of every modern Linux system, the full power of this technology unfolds with more capable file systems such as IBM Storage Scale or Ceph. These are recommended as reliable foundations for big installations of Infinite Scale.
 
 This chapter describes some technical aspects of the storage driver.
 
+### Path Locations
+
+The file tree that is used as the storage path for both data and metadata is located under a local path on the machine that is running Infinite Scale. That might either be a real local file system or a mounted network file system. It is expected that oCIS is the only consumer of that file tree, apart from users who work with the files in that tree, which is the expected behaviour of a collaborative file system.
+
+Underneath the Infinite Scale file system root, there is a collection of different folders containing Infinite Scale specific data, storing personal spaces, project spaces and indexes.
+
 ### Metadata
 
 Infinite Scale is highly dependent on the efficient usage of metadata, which is attached to file resources, but also to logical elements such as spaces.
 
 Metadata are stored in extended file attributes or in message pack files, as it
@@ -41,7 +49,7 @@ Metadata are stored in extended file attributes or in message pack files, as it
 
 ### Monitoring
 
-To get information about changes that are done directly on the file system, Infinte Sale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services as for example in Spectrum Scale.
+To get information about changes such as new files being added, or files being edited or removed, Infinite Scale uses a monitoring system to directly watch the file system. This starts with the Linux inotify system and ranges to much more sophisticated services, for example in Spectrum Scale.
 
 Based on the information transmitted by the watching service, Infinite Scale is able to "register" new or changed files into its own caches and internal management structures. That enables Infinite Scale to deliver resource changes through the "traditional" channels such as APIs and clients.
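+
+As an illustration of the watching principle, the following is a minimal, hypothetical sketch in Go using the fsnotify library, which wraps inotify on Linux. It is not the actual watcher implementation of Infinite Scale, and the watched path is made up:
+
+```go
+package main
+
+import (
+	"log"
+
+	"github.com/fsnotify/fsnotify"
+)
+
+func main() {
+	// Create a watcher; on Linux this is backed by inotify.
+	watcher, err := fsnotify.NewWatcher()
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer watcher.Close()
+
+	// Watch a hypothetical storage root. Note that inotify watches a
+	// single directory, not a whole tree; a real driver also has to
+	// add watches for subdirectories.
+	if err := watcher.Add("/var/lib/ocis/storage/users"); err != nil {
+		log.Fatal(err)
+	}
+
+	// React to events, similar to how the storage driver registers
+	// changes in its caches and internal management structures.
+	for {
+		select {
+		case event, ok := <-watcher.Events:
+			if !ok {
+				return
+			}
+			log.Printf("change detected: %s %s", event.Op, event.Name)
+		case err, ok := <-watcher.Errors:
+			if !ok {
+				return
+			}
+			log.Println("watch error:", err)
+		}
+	}
+}
+```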
 
@@ -61,6 +69,14 @@ Similar to the ETag propagation described before, Infinite Scale also tracks the
 
 If the file system supports that natively, that is a huge benefit.
 
+### Quota
+
+Each space has its own quota, which every storage driver implementation needs to take into account.
+
+For example, IBM Spectrum Scale supports quota handling directly in the file system.
+
+Other systems store quota data in the metadata storage and implement propagation of the used quota similar to the tree size propagation; the sketch below illustrates the idea.
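+
+The following is a simplified, hypothetical sketch of that propagation idea in Go. The in-memory map stands in for the metadata storage, and all paths and names are made up for illustration:
+
+```go
+package main
+
+import (
+	"fmt"
+	"path/filepath"
+)
+
+// quotaUsed is a stand-in for the per-node metadata storage; a real
+// driver would persist these counters, e.g. in extended attributes.
+var quotaUsed = map[string]int64{}
+
+// propagateSizeDiff adds a size delta to every ancestor of a changed
+// file up to the space root, mirroring the tree size propagation.
+func propagateSizeDiff(spaceRoot, changedPath string, delta int64) {
+	node := filepath.Dir(changedPath)
+	for {
+		quotaUsed[node] += delta
+		if node == spaceRoot || node == "/" || node == "." {
+			return
+		}
+		node = filepath.Dir(node)
+	}
+}
+
+func main() {
+	// A 4 KiB file was added somewhere inside the space.
+	propagateSizeDiff("/spaces/project-a", "/spaces/project-a/docs/report.txt", 4096)
+
+	// The used quota of the space root now reflects the change.
+	fmt.Println(quotaUsed["/spaces/project-a"]) // 4096
+}
+```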
+
 ### File Id Resolution
 
 Infinite Scale uses an Id based approach to work with resources, rather than a file path based mechanism. One reason is that Id based lookups can be done far more efficiently than tree traversals.
@@ -115,6 +131,7 @@ The tech preview comes with the following limitations:
 2. Only inotify and GPFS file system change notification methods are supported
 3. Advanced features such as versioning and trashbin are not supported yet
 4. The space/project folders in the filesystem are named after the UUID, not the real space name
+5. No CephFS support yet
 
 ## Setup