From: David Hildenbrand <david@redhat.com>
Update the huge folios policy for tmpfs and shmem.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Documentation/admin-guide/mm/transhuge.rst | 58 +++++++++++++++-------
1 file changed, 41 insertions(+), 17 deletions(-)
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 9ae775eaacbe..ba6edff728ed 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -358,8 +358,21 @@ default to ``never``.
Hugepages in tmpfs/shmem
========================
-You can control hugepage allocation policy in tmpfs with mount option
-``huge=``. It can have following values:
+Traditionally, tmpfs only supported a single huge page size ("PMD"). Today,
+it also supports smaller sizes just like anonymous memory, often referred
+to as "multi-size THP" (mTHP). Huge pages of any size are commonly
+represented in the kernel as "large folios".
+
+While there is fine control over the huge page sizes to use for the internal
+shmem mount (see below), ordinary tmpfs mounts will make use of all available
+huge page sizes without any control over the exact sizes, behaving more like
+other file systems.
+
+tmpfs mounts
+------------
+
+The THP allocation policy for tmpfs mounts can be adjusted using the mount
+option: ``huge=``. It can have following values:
always
Attempt to allocate huge pages every time we need a new page;
@@ -374,19 +387,19 @@ within_size
advise
Only allocate huge pages if requested with fadvise()/madvise();
-The default policy is ``never``.
+Remember, that the kernel may use huge pages of all available sizes, and
+that no fine control as for the internal tmpfs mount is available.
+
+The default policy in the past was ``never``, but it can now be adjusted
+using the kernel parameter ``transparent_hugepage_tmpfs=<policy>``.
``mount -o remount,huge= /mountpoint`` works fine after mount: remounting
``huge=never`` will not attempt to break up huge pages at all, just stop more
from being allocated.
-There's also sysfs knob to control hugepage allocation policy for internal
-shmem mount: /sys/kernel/mm/transparent_hugepage/shmem_enabled. The mount
-is used for SysV SHM, memfds, shared anonymous mmaps (of /dev/zero or
-MAP_ANONYMOUS), GPU drivers' DRM objects, Ashmem.
-
-In addition to policies listed above, shmem_enabled allows two further
-values:
+In addition to policies listed above, the sysfs knob
+/sys/kernel/mm/transparent_hugepage/shmem_enabled will affect the
+allocation policy of tmpfs mounts, when set to the following values:
deny
For use in emergencies, to force the huge option off from
@@ -394,13 +407,24 @@ deny
force
Force the huge option on for all - very useful for testing;
-Shmem can also use "multi-size THP" (mTHP) by adding a new sysfs knob to
-control mTHP allocation:
-'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled',
-and its value for each mTHP is essentially consistent with the global
-setting. An 'inherit' option is added to ensure compatibility with these
-global settings. Conversely, the options 'force' and 'deny' are dropped,
-which are rather testing artifacts from the old ages.
+shmem / internal tmpfs
+----------------------
+The mount internal tmpfs mount is used for SysV SHM, memfds, shared anonymous
+mmaps (of /dev/zero or MAP_ANONYMOUS), GPU drivers' DRM objects, Ashmem.
+
+To control the THP allocation policy for this internal tmpfs mount, the
+sysfs knob /sys/kernel/mm/transparent_hugepage/shmem_enabled and the knobs
+per THP size in
+'/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/shmem_enabled'
+can be used.
+
+The global knob has the same semantics as the ``huge=`` mount options
+for tmpfs mounts, except that the different huge page sizes can be controlled
+individually, and will only use the setting of the global knob when the
+per-size knob is set to 'inherit'.
+
+The options 'force' and 'deny' are dropped for the individual sizes, which
+are rather testing artifacts from the old ages.
always
Attempt to allocate <size> huge pages every time we need a new page;
--
2.39.3
Drop 'fadvise()' from the doc, since fadvise() has no HUGEPAGE advise
currently.
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Documentation/admin-guide/mm/transhuge.rst | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index ba6edff728ed..333958ef0d5f 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -382,10 +382,10 @@ never
within_size
Only allocate huge page if it will be fully within i_size.
- Also respect fadvise()/madvise() hints;
+ Also respect madvise() hints;
advise
- Only allocate huge pages if requested with fadvise()/madvise();
+ Only allocate huge pages if requested with madvise();
Remember, that the kernel may use huge pages of all available sizes, and
that no fine control as for the internal tmpfs mount is available.
@@ -438,10 +438,10 @@ never
within_size
Only allocate <size> huge page if it will be fully within i_size.
- Also respect fadvise()/madvise() hints;
+ Also respect madvise() hints;
advise
- Only allocate <size> huge pages if requested with fadvise()/madvise();
+ Only allocate <size> huge pages if requested with madvise();
Need of application restart
===========================
--
2.39.3
On Wed, Nov 13, 2024 at 7:57 PM Baolin Wang <baolin.wang@linux.alibaba.com> wrote: > > Drop 'fadvise()' from the doc, since fadvise() has no HUGEPAGE advise > currently. > > Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Barry Song <baohua@kernel.org> I couldn’t find any mention of HUGEPAGE in fadvise() either. FADV_NORMAL FADV_RANDOM FADV_SEQUENTIAL FADV_WILLNEED FADV_DONTNEED FADV_NOREUSE > --- > Documentation/admin-guide/mm/transhuge.rst | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst > index ba6edff728ed..333958ef0d5f 100644 > --- a/Documentation/admin-guide/mm/transhuge.rst > +++ b/Documentation/admin-guide/mm/transhuge.rst > @@ -382,10 +382,10 @@ never > > within_size > Only allocate huge page if it will be fully within i_size. > - Also respect fadvise()/madvise() hints; > + Also respect madvise() hints; > > advise > - Only allocate huge pages if requested with fadvise()/madvise(); > + Only allocate huge pages if requested with madvise(); > > Remember, that the kernel may use huge pages of all available sizes, and > that no fine control as for the internal tmpfs mount is available. > @@ -438,10 +438,10 @@ never > > within_size > Only allocate <size> huge page if it will be fully within i_size. > - Also respect fadvise()/madvise() hints; > + Also respect madvise() hints; > > advise > - Only allocate <size> huge pages if requested with fadvise()/madvise(); > + Only allocate <size> huge pages if requested with madvise(); > > Need of application restart > =========================== > -- > 2.39.3 >
On 20.11.24 22:35, Barry Song wrote: > On Wed, Nov 13, 2024 at 7:57 PM Baolin Wang > <baolin.wang@linux.alibaba.com> wrote: >> >> Drop 'fadvise()' from the doc, since fadvise() has no HUGEPAGE advise >> currently. >> >> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> > > Reviewed-by: Barry Song <baohua@kernel.org> > > I couldn’t find any mention of HUGEPAGE in fadvise() either. > > FADV_NORMAL > FADV_RANDOM > FADV_SEQUENTIAL > FADV_WILLNEED > FADV_DONTNEED > FADV_NOREUSE Probably it was forward-looking, and that change never happened. Acked-by: David Hildenbrand <david@redhat.com> -- Cheers, David / dhildenb
© 2016 - 2025 Red Hat, Inc.