ternfs-XTXMarkets/docs/kmod-file-tracking.md
2025-09-17 18:20:23 +01:00

TernFS files are immutable: they are created once and never modified. This presents a challenge when writing a kernel module for it, since the VFS API very much assumes that files can be modified.

So our high-level strategy is: allow users to open a file for writing, keep that file transient (i.e. not visible in the directory tree) until we declare it "done", and do not allow modifications after that.
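The lifecycle described above can be sketched as a tiny userspace state machine. This is only a model: the names `model_file`, `model_write`, and `model_finish` are hypothetical, and returning `-EPERM` for a write after "done" is an assumption, not necessarily the error the kernel module reports.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical lifecycle states; the names are illustrative, not from TernFS. */
enum model_state { MODEL_TRANSIENT, MODEL_DONE };

struct model_file {
    enum model_state state;
    size_t size; /* bytes accepted so far */
};

/* A new file starts out transient, i.e. not linked into the directory tree. */
void model_create(struct model_file *f)
{
    assert(f != NULL);
    f->state = MODEL_TRANSIENT;
    f->size = 0;
}

/* Writes are accepted only while the file is transient.
 * Returns the number of bytes accepted, or -EPERM (an assumed error code)
 * once the file has been declared done. */
long model_write(struct model_file *f, size_t len)
{
    assert(f != NULL);
    if (f->state == MODEL_DONE)
        return -EPERM;
    f->size += len;
    return (long)len;
}

/* Declaring the file done makes it visible and immutable from then on. */
void model_finish(struct model_file *f)
{
    assert(f != NULL);
    f->state = MODEL_DONE;
}

/* Only done files show up in the directory tree. */
int model_visible(const struct model_file *f)
{
    assert(f != NULL);
    return f->state == MODEL_DONE;
}
```

The whole difficulty discussed in the rest of this note is deciding when the equivalent of `model_finish` should run.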

The main problem in implementing the strategy above is deciding when to declare the file "done". An attractive answer is "when the file is closed". However, one problem with that answer is that it's not clear when files are "consciously" closed through close(), and when they are closed because the process is winding down and all its open FDs are being closed.

The relevant VFS interface is flush in struct file_operations: exactly the same function gets called in the two situations above.
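To illustrate why the two cases are indistinguishable, here is a userspace mock of the dispatch. In the kernel the hook is declared as `int (*flush)(struct file *, fl_owner_t id)` in `struct file_operations`; everything prefixed `mock_` below is a hypothetical stand-in, and the two call paths are a simplified picture of what the kernel does on close() and on process exit.

```c
#include <assert.h>
#include <stddef.h>

struct mock_file;

/* Stand-in for struct file_operations, reduced to the one hook we care about. */
struct mock_file_operations {
    int (*flush)(struct mock_file *file);
};

struct mock_file {
    const struct mock_file_operations *f_op;
    int flush_calls; /* how many times the flush hook ran for this file */
};

/* The file system's flush hook: it receives no hint about why it was called. */
int mock_flush(struct mock_file *file)
{
    file->flush_calls++;
    return 0;
}

const struct mock_file_operations mock_fops = { .flush = mock_flush };

/* Common close path: run the flush hook if the file system provides one. */
int mock_filp_close(struct mock_file *file)
{
    assert(file != NULL);
    if (file->f_op && file->f_op->flush)
        return file->f_op->flush(file);
    return 0;
}

/* Path 1: the process explicitly calls close(fd). */
int mock_sys_close(struct mock_file *file)
{
    return mock_filp_close(file);
}

/* Path 2: the process exits and every open descriptor is closed for it. */
void mock_exit_files(struct mock_file **files, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (files[i])
            mock_filp_close(files[i]);
}
```

Both paths end in the same `mock_flush` call with the same arguments, so the hook alone cannot tell an intentional close from process teardown.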

If we just blindly declare the file done when flush is called, we're going to get a ton of false positives. Consider the classic fork + execve pattern:

  • A TernFS file is opened for writing by process A;
  • A, for unrelated reasons (perhaps in another thread), forks a process B (maybe to run another program through execve);
  • B inherits all FDs of A, including the open TernFS file;
  • B terminates before A has finished writing the TernFS file it opened;
  • The file is prematurely declared "done" and A can't finish writing it to completion.
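The sequence above can be reproduced in userspace against an ordinary temporary file (not TernFS), where it is expected to work. This is a minimal sketch of the behaviour applications rely on and that a naive "flush means done" policy would break; the function name `write_survives_child_exit` is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if process A can keep writing to its file after a forked
 * process B, which inherited the descriptor, has exited; -1 otherwise. */
int write_survives_child_exit(void)
{
    FILE *tmp = tmpfile(); /* ordinary temporary file, removed on close */
    if (!tmp)
        return -1;
    int fd = fileno(tmp);
    assert(fd >= 0);

    /* Process A starts writing. */
    if (write(fd, "head", 4) != 4) {
        fclose(tmp);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        fclose(tmp);
        return -1;
    }
    if (pid == 0) {
        /* Process B never touches fd. Exiting closes its inherited copy,
         * which makes the kernel call the file's flush hook, if it has one. */
        _exit(0);
    }

    /* Make sure B has terminated before A continues. */
    int status;
    if (waitpid(pid, &status, 0) != pid) {
        fclose(tmp);
        return -1;
    }

    /* Process A finishes writing; this must still succeed. */
    ssize_t n = write(fd, "tail", 4);
    off_t end = lseek(fd, 0, SEEK_CUR);
    fclose(tmp);
    return (n == 4 && end == 8) ? 0 : -1;
}
```

If the file system treated B's exit as the end of the file, the second `write` in A would fail even though A did nothing wrong.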

So we need a better way to recognize when a flush is intentional, so to speak. We achieve this as follows:

The above is quite dirty, but seems to be pretty solid [1]. However, trouble arises if files are created from inside the kernel. That is exactly what happens with NFS, and it is what prompted me to write down this explanation. We'll have to do something else for NFS to work.

Also note that we want to keep struct mm_struct around anyway to increase MM_FILEPAGES when we allocate new pages to write files. But that is more of a nice to have than a strict requirement.


  1. Note that the FUSE implementation is not as solid, given that we don't have access to the internals that we have access to in the kernel module.