Add Documentation/filesystems/famfs.rst and update MAINTAINERS
Signed-off-by: John Groves <john@groves.net>
---
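Note (illustration only, not part of the commit): from an application's
point of view, the read-only sharing pattern described in famfs.rst is an
ordinary read-only mmap. The Python sketch below uses nothing
famfs-specific. A temporary file stands in for a famfs file, which would
really be pre-allocated by the famfs user space tools under a famfs mount.
The helper name read_shared and the demo contents are hypothetical.

```python
import mmap
import os
import tempfile


def read_shared(path, offset, length):
    """Map a file read-only and copy out `length` bytes starting at `offset`."""
    with open(path, "rb") as f:
        # ACCESS_READ requests a read-only mapping. Length 0 maps the whole
        # file. Here the file is an ordinary temporary file, not famfs.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m[offset:offset + length])


# Stand-in for a pre-allocated famfs file (assumption: a real one would live
# under a famfs mount point and would never be resized by the application).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"header--payload")
    demo_path = tmp.name

payload = read_shared(demo_path, 8, 7)
os.unlink(demo_path)
print(payload)
```

Against the temporary file this only exercises the mmap API; the direct,
page-cache-free access described in the documentation applies only when
the path refers to a file on a mounted famfs instance.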
Documentation/filesystems/famfs.rst | 142 ++++++++++++++++++++++++++++
Documentation/filesystems/index.rst | 1 +
MAINTAINERS | 1 +
3 files changed, 144 insertions(+)
create mode 100644 Documentation/filesystems/famfs.rst
diff --git a/Documentation/filesystems/famfs.rst b/Documentation/filesystems/famfs.rst
new file mode 100644
index 000000000000..0d3c9ba9b7a8
--- /dev/null
+++ b/Documentation/filesystems/famfs.rst
@@ -0,0 +1,142 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _famfs_index:
+
+==================================================================
+famfs: The fabric-attached memory file system
+==================================================================
+
+- Copyright (C) 2024-2025 Micron Technology, Inc.
+
+Introduction
+============
+Compute Express Link (CXL) provides a mechanism for disaggregated or
+fabric-attached memory (FAM). This creates opportunities for data sharing;
+clustered apps that would otherwise have to shard or replicate data can
+share one copy in disaggregated memory.
+
+Famfs, which is not CXL-specific in any way, provides a mechanism for
+multiple hosts to concurrently access data in shared memory, by giving it
+a file system interface. With famfs, any app that understands files can
+access data sets in shared memory. Although famfs supports read and write,
+the real point is to support mmap, which provides direct (dax) access to
+the memory - either writable or read-only.
+
+Shared memory can pose complex coherency and synchronization issues, but
+there are also simple cases. Two simple and eminently useful patterns that
+occur frequently in data analytics and AI are:
+
+* Serial Sharing - Only one host or process at a time has access to a file
+* Read-only Sharing - Multiple hosts or processes share read-only access
+ to a file
+
+The famfs fuse file system is part of the famfs framework; user space
+components [1] handle metadata allocation and distribution, and provide a
+low-level fuse server to expose files that map directly to [presumably
+shared] memory.
+
+The famfs framework manages coherency of its own metadata and structures,
+but does not attempt to manage coherency for applications.
+
+Famfs also provides data isolation between files. That is, even though
+the host has access to an entire memory "device" (as a devdax device), apps
+cannot write to memory for which the file is read-only, and mapping one
+file provides isolation from the memory of all other files. This is pretty
+basic, but some experimental shared memory usage patterns provide no such
+isolation.
+
+Principles of Operation
+=======================
+
+Famfs is a file system that uses one or more devdax devices as first-class
+backing devices. Metadata maintenance and query operations happen
+entirely in user space.
+
+The famfs low-level fuse server daemon provides file maps (fmaps) and
+devdax device info to the fuse/famfs kernel component so that
+read/write/mapping faults can be handled without up-calls for all active
+files.
+
+The famfs user space is responsible for maintaining and distributing
+consistent metadata. This is currently handled via an append-only
+metadata log within the memory, but this is orthogonal to the fuse/famfs
+kernel code.
+
+Once instantiated, "the same file" on each host points to the same shared
+memory, but in-memory metadata (inodes, etc.) is ephemeral on each host
+that has a famfs instance mounted. Use cases are free to allow or not
+allow mutations to data on a file-by-file basis.
+
+When an app accesses a data object in a famfs file, there is no page cache
+involvement. The CPU cache is loaded directly from the shared memory. In
+some use cases, this is an enormous reduction in read amplification compared
+to loading an entire page into the page cache.
+
+
+Famfs is Not a Conventional File System
+---------------------------------------
+
+Famfs files can be accessed by conventional means, but there are
+limitations. The kernel component of fuse/famfs is not involved in the
+allocation of backing memory for files at all; the famfs user space
+creates files and responds as a low-level fuse server with fmaps and
+devdax device info upon request.
+
+Famfs differs in some important ways from conventional file systems:
+
+* Files must be pre-allocated by the famfs framework; allocation is never
+ performed on (or after) write.
+* Any operation that changes a file's size is considered to put the file
+ in an invalid state, disabling access to the data. It may be possible to
+ revisit this in the future. (Typically the famfs user space can restore
+ files to a valid state by replaying the famfs metadata log.)
+
+Famfs exists to apply the existing file system abstractions to shared
+memory so applications and workflows can more easily adapt to an
+environment with disaggregated shared memory.
+
+Memory Error Handling
+=====================
+
+Possible memory errors include timeouts, poison and unexpected
+reconfiguration of an underlying dax device. In all of these cases, famfs
+receives a call from the devdax layer via its iomap_ops->notify_failure()
+function. If any memory errors have been detected, access to the affected
+daxdev is disabled to avoid further errors or corruption.
+
+In all known cases, famfs can be unmounted cleanly. In most cases errors
+can be cleared by re-initializing the memory - at which point a new famfs
+file system can be created.
+
+Key Requirements
+================
+
+The primary requirements for famfs are:
+
+1. Must support a file system abstraction backed by sharable devdax memory
+2. Files must efficiently handle VMA faults
+3. Must support metadata distribution in a sharable way
+4. Must handle clients with a stale copy of metadata
+
+The famfs kernel component takes care of requirements 1 and 2 above by
+caching each file's mapping metadata in the kernel.
+
+Requirements 3 and 4 are handled by the user space components, and are
+largely orthogonal to the functionality of the famfs kernel module.
+
+Requirements 3 and 4 cannot be met by conventional fs-dax file systems
+(e.g. xfs) because they use write-back metadata; it is not valid to mount
+such a file system on two hosts from the same in-memory image.
+
+
+Famfs Usage
+===========
+
+Famfs usage is documented at [1].
+
+
+References
+==========
+
+- [1] Famfs user space repository and documentation
+ https://github.com/cxl-micron-reskit/famfs
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 2636f2a41bd3..5aad315206ee 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -90,6 +90,7 @@ Documentation for filesystem implementations.
ext3
ext4/index
f2fs
+ famfs
gfs2
gfs2-uevents
gfs2-glocks
diff --git a/MAINTAINERS b/MAINTAINERS
index 02688f27a4d0..faa7de4a43de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8814,6 +8814,7 @@ M: John Groves <John@Groves.net>
L: linux-cxl@vger.kernel.org
L: linux-fsdevel@vger.kernel.org
S: Supported
+F: Documentation/filesystems/famfs.rst
F: fs/fuse/famfs.c
F: fs/fuse/famfs_kfmap.h
--
2.49.0
On Thu, Jul 3, 2025 at 8:51 PM John Groves <John@groves.net> wrote:
>
> Add Documentation/filesystems/famfs.rst and update MAINTAINERS
>
> Signed-off-by: John Groves <john@groves.net>
> ---
> Documentation/filesystems/famfs.rst | 142 ++++++++++++++++++++++++++++
> Documentation/filesystems/index.rst | 1 +
> MAINTAINERS | 1 +
> 3 files changed, 144 insertions(+)
> create mode 100644 Documentation/filesystems/famfs.rst

Considering "Documentation: fuse: Consolidate FUSE docs into its own
subdirectory"
https://lore.kernel.org/linux-fsdevel/20250612032239.17561-1-bagasdotme@gmail.com/

I wonder if famfs and virtiofs should be moved into fuse subdir?
To me it makes more sense, but it's not a clear cut.

> [...]
On Fri, Jul 04, 2025 at 10:27:03AM +0200, Amir Goldstein wrote:
> On Thu, Jul 3, 2025 at 8:51 PM John Groves <John@groves.net> wrote:
> >
> > Add Documentation/filesystems/famfs.rst and update MAINTAINERS
> >
> > Signed-off-by: John Groves <john@groves.net>
> > ---
> > Documentation/filesystems/famfs.rst | 142 ++++++++++++++++++++++++++++
> > Documentation/filesystems/index.rst | 1 +
> > MAINTAINERS | 1 +
> > 3 files changed, 144 insertions(+)
> > create mode 100644 Documentation/filesystems/famfs.rst
> >
>
> Considering "Documentation: fuse: Consolidate FUSE docs into its own
> subdirectory"
> https://lore.kernel.org/linux-fsdevel/20250612032239.17561-1-bagasdotme@gmail.com/
>
> I wonder if famfs and virtiofs should be moved into fuse subdir?
> To me it makes more sense, but it's not a clear cut.
>

I guess these can stay in their place as-is for now. However, if we later
have more fuse-based filesystems (at least 3 or 4), placing them in
Documentation/filesystems/fuse-based might make sense (fuse subdir
documents fuse framework itself, though).

Thanks.

--
An old man doll... just what I always wanted! - Clara
On 7/3/25 11:50 AM, John Groves wrote:
> Add Documentation/filesystems/famfs.rst and update MAINTAINERS
>
> Signed-off-by: John Groves <john@groves.net>
> ---
> Documentation/filesystems/famfs.rst | 142 ++++++++++++++++++++++++++++
> Documentation/filesystems/index.rst | 1 +
> MAINTAINERS | 1 +
> 3 files changed, 144 insertions(+)
> create mode 100644 Documentation/filesystems/famfs.rst
>

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

--
~Randy
On Thu, Jul 03, 2025 at 01:50:32PM -0500, John Groves wrote:
> +Requirements 3 and 4 are handled by the user space components, and are
> +largely orthogonal to the functionality of the famfs kernel module.
> +
> +Requirements 3 and 4 cannot be met by conventional fs-dax file systems

"Such requirements, however, cannot be met by ..."

> +(e.g. xfs) because they use write-back metadata; it is not valid to mount
> +such a file system on two hosts from the same in-memory image.
> +

Thanks.

--
An old man doll... just what I always wanted! - Clara
Bagas Sanjaya <bagasdotme@gmail.com> writes:

> On Thu, Jul 03, 2025 at 01:50:32PM -0500, John Groves wrote:
>> +Requirements 3 and 4 are handled by the user space components, and are
>> +largely orthogonal to the functionality of the famfs kernel module.
>> +
>> +Requirements 3 and 4 cannot be met by conventional fs-dax file systems
>
> "Such requirements, however, cannot be met by ..."

Bagas. Stop.

John has written documentation, that is great. Do not add needless
friction to this process. Seriously.

Why do I have to keep telling you this?

jon
On Thu, Jul 03, 2025 at 08:22:58PM -0600, Jonathan Corbet wrote:
> Bagas. Stop.
>
> John has written documentation, that is great. Do not add needless
> friction to this process. Seriously.
>
> Why do I have to keep telling you this?

Cause I'm more of perfectionist (detail-oriented)...

--
An old man doll... just what I always wanted! - Clara
On Fri, Jul 04, 2025 at 10:53:23AM +0700, Bagas Sanjaya wrote:
> On Thu, Jul 03, 2025 at 08:22:58PM -0600, Jonathan Corbet wrote:
> > Bagas. Stop.
> >
> > John has written documentation, that is great. Do not add needless
> > friction to this process. Seriously.
> >
> > Why do I have to keep telling you this?
>
> Cause I'm more of perfectionist (detail-oriented)...

Reviews aren't about you. They're about producing a better patch.
Do your reviews produce better patches or do they make the perfect the
enemy of the good?
On Fri, Jul 04, 2025 at 07:58:28PM +0100, Matthew Wilcox wrote:
> On Fri, Jul 04, 2025 at 10:53:23AM +0700, Bagas Sanjaya wrote:
> > On Thu, Jul 03, 2025 at 08:22:58PM -0600, Jonathan Corbet wrote:
> > > Bagas. Stop.
> > >
> > > John has written documentation, that is great. Do not add needless
> > > friction to this process. Seriously.
> > >
> > > Why do I have to keep telling you this?
> >
> > Cause I'm more of perfectionist (detail-oriented)...
>
> Reviews aren't about you. They're about producing a better patch.
> Do your reviews produce better patches or do they make the perfect the
> enemy of the good?

I'm looking for any Sphinx warnings, but if there's none, I check for
better wording or improving the docs output.

--
An old man doll... just what I always wanted! - Clara
On Sat, Jul 05, 2025 at 06:29:03AM +0700, Bagas Sanjaya wrote:
> On Fri, Jul 04, 2025 at 07:58:28PM +0100, Matthew Wilcox wrote:
> > On Fri, Jul 04, 2025 at 10:53:23AM +0700, Bagas Sanjaya wrote:
> > > On Thu, Jul 03, 2025 at 08:22:58PM -0600, Jonathan Corbet wrote:
> > > > Bagas. Stop.
> > > >
> > > > John has written documentation, that is great. Do not add needless
> > > > friction to this process. Seriously.
> > > >
> > > > Why do I have to keep telling you this?
> > >
> > > Cause I'm more of perfectionist (detail-oriented)...
> >
> > Reviews aren't about you. They're about producing a better patch.
> > Do your reviews produce better patches or do they make the perfect the
> > enemy of the good?
>
> I'm looking for any Sphinx warnings, but if there's none, I check for
> better wording or improving the docs output.

That's appreciated. Really. But what you should be looking for is
unclear or misleading wording. Not "this should be 'may' instead of
'might'". The review you give is often closer to nitpicking than
serious review.
On Sat, Jul 05, 2025 at 12:43:18AM +0100, Matthew Wilcox wrote:
> On Sat, Jul 05, 2025 at 06:29:03AM +0700, Bagas Sanjaya wrote:
> > I'm looking for any Sphinx warnings, but if there's none, I check for
> > better wording or improving the docs output.
>
> That's appreciated. Really. But what you should be looking for is
> unclear or misleading wording. Not "this should be 'may' instead of
> 'might'". The review you give is often closer to nitpicking than
> serious review.

Thanks for the tip!

--
An old man doll... just what I always wanted! - Clara