[PATCH 6/8] add cleancache documentation

Suren Baghdasaryan posted 8 patches 2 months, 1 week ago
There is a newer version of this series
Posted by Suren Baghdasaryan 2 months, 1 week ago
Document cleancache, its APIs and sysfs interface.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 Documentation/mm/cleancache.rst | 112 ++++++++++++++++++++++++++++++++
 MAINTAINERS                     |   1 +
 2 files changed, 113 insertions(+)
 create mode 100644 Documentation/mm/cleancache.rst

diff --git a/Documentation/mm/cleancache.rst b/Documentation/mm/cleancache.rst
new file mode 100644
index 000000000000..deaf7de51829
--- /dev/null
+++ b/Documentation/mm/cleancache.rst
@@ -0,0 +1,112 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========
+Cleancache
+==========
+
+Motivation
+==========
+
+Cleancache is a feature to utilize unused reserved memory for extending
+the page cache.
+
+Cleancache can be thought of as a folio-granularity victim cache for clean
+file-backed pages that the kernel's pageframe replacement algorithm (PFRA)
+would like to keep around, but can't since there isn't enough memory. So
+when the PFRA "evicts" a folio, it stores the data contained in the folio
+into cleancache memory which is not directly accessible or addressable by
+the kernel (transcendent memory) and is of unknown and possibly
+time-varying size.
+
+Later, when a filesystem wishes to access a folio in a file on disk, it
+first checks cleancache to see if it already contains the required data; if
+it does, the folio data is copied into the kernel and a disk access is
+avoided.
+
+The memory cleancache uses is donated by other system components, which
+reserve memory not directly addressable by the kernel. By donating this
+memory to cleancache, the memory owner enables its utilization while it
+is not used. Memory donation is done using the cleancache backend API and any
+donated memory can be taken back at any time by its donor without delay
+and with guaranteed success. Since cleancache uses this memory only to
+store clean file-backed data, it can be dropped at any time and therefore
+the donor's request to take back the memory can always be satisfied.
+
+Implementation Overview
+=======================
+
+A cleancache "backend" (a donor that provides transcendent memory) registers
+itself with the cleancache "frontend" and receives a unique pool_id which it
+can use in all later API calls to identify the pool of folios it donates.
+Once registered, the backend can call cleancache_backend_put_folio() or
+cleancache_backend_put_folios() to donate memory to cleancache. Note that
+cleancache currently supports only 0-order folios and will not accept
+larger-order ones. Once the backend needs that memory back, it can get it
+by calling cleancache_backend_get_folio(). Only the original backend can
+take the folio it donated from the cleancache.
+
+The kernel uses cleancache by first calling cleancache_add_fs() to register
+each file system and then using a combination of cleancache_store_folio(),
+cleancache_restore_folio() and cleancache_invalidate_{folio|inode}() to
+store, restore and invalidate folio content.
+cleancache_{start|end}_inode_walk() are used to walk over folios inside
+an inode and cleancache_restore_from_inode() is used to restore folios
+during such walks.
+
+From the kernel's point of view, folios that are copied into cleancache
+have an indefinite lifetime which is unknowable by the kernel, so they
+may or may not still be in cleancache at any later time. Thus, as its name
+implies, cleancache is not suitable for dirty folios. Cleancache has
+complete discretion over which folios to preserve, which folios to discard,
+and when.
+
+Cleancache Performance Metrics
+==============================
+
+If CONFIG_CLEANCACHE_SYSFS is enabled, cleancache performance can be
+monitored via sysfs in the ``/sys/kernel/mm/cleancache`` directory.
+The effectiveness of cleancache can be measured (across all filesystems)
+with the provided stats.
+Global stats are published directly under ``/sys/kernel/mm/cleancache`` and
+include:
+
+``stored``
+	number of successful cleancache folio stores.
+
+``skipped``
+	number of folios skipped during cleancache store operation.
+
+``restored``
+	number of successful cleancache folio restore operations.
+
+``missed``
+	number of failed cleancache folio restore operations.
+
+``reclaimed``
+	number of folios reclaimed from the cleancache due to insufficient
+	memory.
+
+``recalled``
+	number of times cleancache folio content was discarded as a result
+	of the cleancache backend taking the folio back.
+
+``invalidated``
+	number of times cleancache folio content was discarded as a result
+	of invalidation.
+
+``cached``
+	number of folios currently cached in the cleancache.
+
+Per-pool stats are published under ``/sys/kernel/mm/cleancache/<pool name>``
+where "pool name" is the name the pool was registered under. These stats
+include:
+
+``size``
+	number of folios donated to this pool.
+
+``cached``
+	number of folios currently cached in the pool.
+
+``recalled``
+	number of times cleancache folio content was discarded as a result
+	of the cleancache backend taking the folio back from the pool.
diff --git a/MAINTAINERS b/MAINTAINERS
index 1c97227e7ffa..441e68c94177 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6053,6 +6053,7 @@ CLEANCACHE
 M:	Suren Baghdasaryan <surenb@google.com>
 L:	linux-mm@kvack.org
 S:	Maintained
+F:	Documentation/mm/cleancache.rst
 F:	include/linux/cleancache.h
 F:	mm/cleancache.c
 F:	mm/cleancache_sysfs.c
-- 
2.51.0.740.g6adb054d12-goog
Re: [PATCH 6/8] add cleancache documentation
Posted by SeongJae Park 2 months, 1 week ago
Hello Suren,

On Thu,  9 Oct 2025 18:19:49 -0700 Suren Baghdasaryan <surenb@google.com> wrote:

> Document cleancache, it's APIs and sysfs interface.
> 
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  Documentation/mm/cleancache.rst | 112 ++++++++++++++++++++++++++++++++
>  MAINTAINERS                     |   1 +

I think this great document is better to be linked on mm/index.rst.

Also, would it make sense to split the sysfs interface part and put under
Documentation/admin-guide/mm/ ?

>  2 files changed, 113 insertions(+)
>  create mode 100644 Documentation/mm/cleancache.rst
> 
> diff --git a/Documentation/mm/cleancache.rst b/Documentation/mm/cleancache.rst
> new file mode 100644
> index 000000000000..deaf7de51829
> --- /dev/null
> +++ b/Documentation/mm/cleancache.rst
> @@ -0,0 +1,112 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==========
> +Cleancache
> +==========
> +
> +Motivation
> +==========
> +
> +Cleancache is a feature to utilize unused reserved memory for extending
> +page cache.
> +
> +Cleancache can be thought of as a folio-granularity victim cache for clean
> +file-backed pages that the kernel's pageframe replacement algorithm (PFRA)
> +would like to keep around, but can't since there isn't enough memory. So
> +when the PFRA "evicts" a folio, it stores the data contained in the folio
> +into cleancache memory which is not directly accessible or addressable by
> +the kernel (transcendent memory) and is of unknown and possibly
> +time-varying size.

IMHO, "(transcendent memory)" would be better dropped, as it was removed by commit
814bbf49dcd0 ("xen: remove tmem driver").

> +
> +Later, when a filesystem wishes to access a folio in a file on disk, it
> +first checks cleancache to see if it already contains required data; if it
> +does, the folio data is copied into the kernel and a disk access is
> +avoided.
> +
> +The memory cleancache uses is donated by other system components, which
> +reserve memory not directly addressable by the kernel. By donating this
> +memory to cleancache, the memory owner enables its utilization while it
> +is not used. Memory donation is done using cleancache backend API and any
> +donated memory can be taken back at any time by its donor without no delay

"without delay" or "with no delay" ?

> +and with guarantees success. Since cleancache uses this memory only to
> +store clean file-backed data, it can be dropped at any time and therefore
> +the donor's request to take back the memory can be always satisfied.
> +
> +Implementation Overview
> +=======================
> +
> +Cleancache "backend" (donor that provides transcendent memory), registers

Again, "transcendent memory" part seems better to be dropped.

> +itself with cleancache "frontend" and received a unique pool_id which it
> +can use in all later API calls to identify the pool of folios it donates.
> +Once registered, backend can call cleancache_backend_put_folio() or
> +cleancache_backend_put_folios() to donate memory to cleancache. Note that
> +cleancache currently supports only 0-order folios and will not accept
> +larger-order ones. Once the backend needs that memory back, it can get it
> +by calling cleancache_backend_get_folio(). Only the original backend can
> +take the folio it donated from the cleancache.
> +
> +Kernel uses cleancache by first calling cleancache_add_fs() to register
> +each file system and then using a combination of cleancache_store_folio(),
> +cleancache_restore_folio(), cleancache_invalidate_{folio|inode} to store,
> +restore and invalidate folio content.
> +cleancache_{start|end}_inode_walk() are used to walk over folios inside
> +an inode and cleancache_restore_from_inode() is used to restore folios
> +during such walks.
> +
> +From kernel's point of view folios which are copied into cleancache have
> +an indefinite lifetime which is completely unknowable by the kernel and so
> +may or may not still be in cleancache at any later time. Thus, as its name
> +implies, cleancache is not suitable for dirty folios. Cleancache has
> +complete discretion over what folios to preserve and what folios to discard
> +and when.
> +
> +Cleancache Performance Metrics
> +==============================
> +
> +If CONFIG_CLEANCACHE_SYSFS is enabled, monitoring of cleancache performance
> +can be done via sysfs in the `/sys/kernel/mm/cleancache` directory.
> +The effectiveness of cleancache can be measured (across all filesystems)
> +with provided stats.
> +Global stats are published directly under `/sys/kernel/mm/cleancache` and
> +include:

``/sys/kernel/mm/cleancache`` ?

> +
> +``stored``
> +	number of successful cleancache folio stores.
> +
> +``skipped``
> +	number of folios skipped during cleancache store operation.
> +
> +``restored``
> +	number of successful cleancache folio restore operations.
> +
> +``missed``
> +	number of failed cleancache folio restore operations.
> +
> +``reclaimed``
> +	number of folios reclaimed from the cleancache due to insufficient
> +	memory.
> +
> +``recalled``
> +	number of times cleancache folio content was discarded as a result
> +	of the cleancache backend taking the folio back.
> +
> +``invalidated``
> +	number of times cleancache folio content was discarded as a result
> +	of invalidation.
> +
> +``cached``
> +	number of folios currently cached in the cleancache.
> +
> +Per-pool stats are published under `/sys/kernel/mm/cleancache/<pool name>`

``/sys/kernel/mm/cleancache/<pool name>`` ?

> +where "pool name" is the name pool was registered under. These stats
> +include:
> +
> +``size``
> +	number of folios donated to this pool.
> +
> +``cached``
> +	number of folios currently cached in the pool.
> +
> +``recalled``
> +	number of times cleancache folio content was discarded as a result
> +	of the cleancache backend taking the folio back from the pool.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1c97227e7ffa..441e68c94177 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6053,6 +6053,7 @@ CLEANCACHE
>  M:	Suren Baghdasaryan <surenb@google.com>
>  L:	linux-mm@kvack.org
>  S:	Maintained
> +F:	Documentation/mm/cleancache.rst
>  F:	include/linux/cleancache.h
>  F:	mm/cleancache.c
>  F:	mm/cleancache_sysfs.c
> -- 
> 2.51.0.740.g6adb054d12-goog


Thanks,
SJ
Re: [PATCH 6/8] add cleancache documentation
Posted by Suren Baghdasaryan 2 months, 1 week ago
On Fri, Oct 10, 2025 at 1:20 PM SeongJae Park <sj@kernel.org> wrote:
>
> Hello Suren,

Hi SJ!

>
> On Thu,  9 Oct 2025 18:19:49 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > Document cleancache, it's APIs and sysfs interface.
> >
> > Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> > ---
> >  Documentation/mm/cleancache.rst | 112 ++++++++++++++++++++++++++++++++
> >  MAINTAINERS                     |   1 +
>
> I think this great document is better to be linked on mm/index.rst.

Ack.

>
> Also, would it make sense to split the sysfs interface part and put under
> Documentation/admin-guide/mm/ ?

Hmm. I guess that makes sense.

>
> >  2 files changed, 113 insertions(+)
> >  create mode 100644 Documentation/mm/cleancache.rst
> >
> > diff --git a/Documentation/mm/cleancache.rst b/Documentation/mm/cleancache.rst
> > new file mode 100644
> > index 000000000000..deaf7de51829
> > --- /dev/null
> > +++ b/Documentation/mm/cleancache.rst
> > @@ -0,0 +1,112 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +==========
> > +Cleancache
> > +==========
> > +
> > +Motivation
> > +==========
> > +
> > +Cleancache is a feature to utilize unused reserved memory for extending
> > +page cache.
> > +
> > +Cleancache can be thought of as a folio-granularity victim cache for clean
> > +file-backed pages that the kernel's pageframe replacement algorithm (PFRA)
> > +would like to keep around, but can't since there isn't enough memory. So
> > +when the PFRA "evicts" a folio, it stores the data contained in the folio
> > +into cleancache memory which is not directly accessible or addressable by
> > +the kernel (transcendent memory) and is of unknown and possibly
> > +time-varying size.
>
> IMHO, "(transcendent memory)" better to be dropped, as it has removed by commit
> 814bbf49dcd0 ("xen: remove tmem driver").

Ah, good point. Will remove.

>
> > +
> > +Later, when a filesystem wishes to access a folio in a file on disk, it
> > +first checks cleancache to see if it already contains required data; if it
> > +does, the folio data is copied into the kernel and a disk access is
> > +avoided.
> > +
> > +The memory cleancache uses is donated by other system components, which
> > +reserve memory not directly addressable by the kernel. By donating this
> > +memory to cleancache, the memory owner enables its utilization while it
> > +is not used. Memory donation is done using cleancache backend API and any
> > +donated memory can be taken back at any time by its donor without no delay
>
> "without delay" or "with no delay" ?

Ack. Will change to "without delay"

>
> > +and with guarantees success. Since cleancache uses this memory only to
> > +store clean file-backed data, it can be dropped at any time and therefore
> > +the donor's request to take back the memory can be always satisfied.
> > +
> > +Implementation Overview
> > +=======================
> > +
> > +Cleancache "backend" (donor that provides transcendent memory), registers
>
> Again, "transcendent memory" part seems better to be dropped.

Ack.

>
> > +itself with cleancache "frontend" and received a unique pool_id which it
> > +can use in all later API calls to identify the pool of folios it donates.
> > +Once registered, backend can call cleancache_backend_put_folio() or
> > +cleancache_backend_put_folios() to donate memory to cleancache. Note that
> > +cleancache currently supports only 0-order folios and will not accept
> > +larger-order ones. Once the backend needs that memory back, it can get it
> > +by calling cleancache_backend_get_folio(). Only the original backend can
> > +take the folio it donated from the cleancache.
> > +
> > +Kernel uses cleancache by first calling cleancache_add_fs() to register
> > +each file system and then using a combination of cleancache_store_folio(),
> > +cleancache_restore_folio(), cleancache_invalidate_{folio|inode} to store,
> > +restore and invalidate folio content.
> > +cleancache_{start|end}_inode_walk() are used to walk over folios inside
> > +an inode and cleancache_restore_from_inode() is used to restore folios
> > +during such walks.
> > +
> > +From kernel's point of view folios which are copied into cleancache have
> > +an indefinite lifetime which is completely unknowable by the kernel and so
> > +may or may not still be in cleancache at any later time. Thus, as its name
> > +implies, cleancache is not suitable for dirty folios. Cleancache has
> > +complete discretion over what folios to preserve and what folios to discard
> > +and when.
> > +
> > +Cleancache Performance Metrics
> > +==============================
> > +
> > +If CONFIG_CLEANCACHE_SYSFS is enabled, monitoring of cleancache performance
> > +can be done via sysfs in the `/sys/kernel/mm/cleancache` directory.
> > +The effectiveness of cleancache can be measured (across all filesystems)
> > +with provided stats.
> > +Global stats are published directly under `/sys/kernel/mm/cleancache` and
> > +include:
>
> ``/sys/kernel/mm/cleancache`` ?

Ack.

>
> > +
> > +``stored``
> > +     number of successful cleancache folio stores.
> > +
> > +``skipped``
> > +     number of folios skipped during cleancache store operation.
> > +
> > +``restored``
> > +     number of successful cleancache folio restore operations.
> > +
> > +``missed``
> > +     number of failed cleancache folio restore operations.
> > +
> > +``reclaimed``
> > +     number of folios reclaimed from the cleancache due to insufficient
> > +     memory.
> > +
> > +``recalled``
> > +     number of times cleancache folio content was discarded as a result
> > +     of the cleancache backend taking the folio back.
> > +
> > +``invalidated``
> > +     number of times cleancache folio content was discarded as a result
> > +     of invalidation.
> > +
> > +``cached``
> > +     number of folios currently cached in the cleancache.
> > +
> > +Per-pool stats are published under `/sys/kernel/mm/cleancache/<pool name>`
>
> ``/sys/kernel/mm/cleancache/<pool name>`` ?

Ack.

>
> > +where "pool name" is the name pool was registered under. These stats
> > +include:
> > +
> > +``size``
> > +     number of folios donated to this pool.
> > +
> > +``cached``
> > +     number of folios currently cached in the pool.
> > +
> > +``recalled``
> > +     number of times cleancache folio content was discarded as a result
> > +     of the cleancache backend taking the folio back from the pool.
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 1c97227e7ffa..441e68c94177 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6053,6 +6053,7 @@ CLEANCACHE
> >  M:   Suren Baghdasaryan <surenb@google.com>
> >  L:   linux-mm@kvack.org
> >  S:   Maintained
> > +F:   Documentation/mm/cleancache.rst
> >  F:   include/linux/cleancache.h
> >  F:   mm/cleancache.c
> >  F:   mm/cleancache_sysfs.c
> > --
> > 2.51.0.740.g6adb054d12-goog
>
>
> Thanks,
> SJ

Thanks for the review!
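
For anyone wanting to try the global stats described in the patch, here is a
minimal userspace sketch of how they might be read. The directory and the
stat names come from the documentation above. Everything else is an
assumption for illustration: that each stat file holds a single decimal
integer, the helper names read_stats() and hit_ratio(), and the use of
restored / (restored + missed) as a rough effectiveness figure.

```python
from pathlib import Path

# Global counters named in the "Cleancache Performance Metrics" section.
GLOBAL_STATS = ("stored", "skipped", "restored", "missed",
                "reclaimed", "recalled", "invalidated", "cached")


def read_stats(base="/sys/kernel/mm/cleancache"):
    """Return {name: int} for each global stat file that exists and parses.

    Assumption: every stat file contains one decimal integer.
    """
    stats = {}
    for name in GLOBAL_STATS:
        try:
            stats[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            # Missing file (e.g. CONFIG_CLEANCACHE_SYSFS disabled) or
            # unexpected content: skip it rather than guess a value.
            continue
    return stats


def hit_ratio(stats):
    """restored / (restored + missed), or None if no restore was attempted.

    This ratio is an illustrative metric, not one defined by the patch.
    """
    restored = stats.get("restored", 0)
    missed = stats.get("missed", 0)
    total = restored + missed
    return restored / total if total else None


if __name__ == "__main__":
    stats = read_stats()
    print(stats, hit_ratio(stats))
```

The base directory is a parameter so the same helper can be pointed at a
per-pool directory or at a test fixture. Note that the per-pool directories
only expose size, cached and recalled, so a pool lookup with this name list
would return just cached and recalled.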