Rather confusingly, setting all Transparent Huge Page sysfs settings to
"never" does not in fact result in THP being globally disabled.
Rather, it results in khugepaged being disabled, but one can still obtain
THP pages using madvise(..., MADV_COLLAPSE).
This is something that has remained poorly documented for some time, and it
is likely the received wisdom of most users of THP that never does, in
fact, mean never.
It is therefore important to highlight, very clearly, that this is not the
case.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index dff8d5985f0f..182519197ef7 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -107,7 +107,7 @@ sysfs
Global THP controls
-------------------
-Transparent Hugepage Support for anonymous memory can be entirely disabled
+Transparent Hugepage Support for anonymous memory can be disabled
(mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
regions (to avoid the risk of consuming more memory resources) or enabled
system wide. This can be achieved per-supported-THP-size with one of::
@@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
where <size> is the hugepage size being addressed, the available sizes
for which vary by system.
+.. note:: Setting "never" in all sysfs THP controls does **not** disable
+ Transparent Huge Pages globally. This is because ``madvise(...,
+ MADV_COLLAPSE)`` ignores these settings and collapses ranges to
+ PMD-sized huge pages unconditionally.
+
For example::
echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
@@ -187,7 +192,9 @@ madvise
behaviour.
never
- should be self-explanatory.
+ should be self-explanatory. Note that ``madvise(...,
+ MADV_COLLAPSE)`` can still cause transparent huge pages to be
+ obtained even if this mode is specified everywhere.
By default kernel tries to use huge, PMD-mappable zero page on read
page fault to anonymous mapping. It's possible to disable huge zero
--
2.50.1
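
For illustration only, and not part of the patch: a minimal sketch of the
behaviour the new note describes. It maps a private anonymous region, faults
it in, and asks the kernel to collapse one PMD-aligned range. Assumptions:
Linux 6.1 or later, a 2 MiB PMD size (e.g. x86-64 with 4 KiB pages), and the
value 25 for MADV_COLLAPSE as defined in the generic uapi header (Python's
mmap module is not assumed to export the constant). try_collapse() is a
hypothetical helper name. It only reports whether the madvise() call was
accepted; it does not verify that a huge page was actually installed.

```python
import ctypes
import mmap

MADV_COLLAPSE = 25           # assumed value from the Linux uapi headers (6.1+)
PMD_SIZE = 2 * 1024 * 1024   # assumed PMD size: 2 MiB


def try_collapse():
    """Return 0 if madvise(MADV_COLLAPSE) succeeded, else the errno."""
    # Private anonymous mapping, twice the PMD size so that a PMD-aligned
    # range of PMD_SIZE bytes is guaranteed to fit inside it.
    m = mmap.mmap(-1, 2 * PMD_SIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        # Find the mapping's address so we can pick a PMD-aligned offset.
        buf = ctypes.c_char.from_buffer(m)
        addr = ctypes.addressof(buf)
        del buf  # release the buffer export so m.close() can succeed

        offset = (-addr) % PMD_SIZE
        # Touch every page in the range so there is something to collapse.
        m[offset:offset + PMD_SIZE] = b"\x01" * PMD_SIZE
        try:
            m.madvise(MADV_COLLAPSE, offset, PMD_SIZE)
        except OSError as exc:
            # e.g. EINVAL on kernels without MADV_COLLAPSE support
            return exc.errno
        return 0
    finally:
        m.close()


if __name__ == "__main__":
    print("madvise(MADV_COLLAPSE) result:", try_collapse())
```

Running this with every sysfs control set to "never" is one way to observe
that the request is still honoured on kernels that support it.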
On 21.07.25 17:55, Lorenzo Stoakes wrote:
> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> "never" does not in fact result in THP being globally disabled.

[...]

Can we also somehow tone down or clarify the "entirely disabled"?

Something as simple as

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index bd49b46398c92..2267d22277238 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -107,7 +107,7 @@ sysfs
Global THP controls
-------------------
-Transparent Hugepage Support for anonymous memory can be entirely disabled
+Transparent Hugepage Support for anonymous memory can be mostly disabled
(mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
regions (to avoid the risk of consuming more memory resources) or enabled
system wide. This can be achieved per-supported-THP-size with one of::

--
Cheers,

David / dhildenb
On 22.07.25 09:20, David Hildenbrand wrote:
> On 21.07.25 17:55, Lorenzo Stoakes wrote:

[...]

> Can we also somehow tone down or clarify the "entirely disabled"?

Ah, missed that you touched that already.

Acked-by: David Hildenbrand <david@redhat.com>

--
Cheers,

David / dhildenb
Hi Andrew,

Could you apply this fix-patch? It adds the caveat regarding MADV_COLLAPSE in a
couple other places where the sysfs 'never' mode is mentioned.

Thanks, Lorenzo

----8<----
From 7c0bdda6a633bc38e7d5a3b0acf2cef7bdc961af Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Date: Tue, 22 Jul 2025 06:32:18 +0100
Subject: [PATCH] docs: update admin guide transhuge page to mention
 MADV_COLLAPSE everywhere

We previously missed a couple places where the 'never' mode was described,
so put the caveat regarding MADV_COLLAPSE in these locations also.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
Documentation/admin-guide/mm/transhuge.rst | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 182519197ef7..370fba113460 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -385,7 +385,9 @@ always
Attempt to allocate huge pages every time we need a new page;
never
- Do not allocate huge pages;
+ Do not allocate huge pages. Note that ``madvise(..., MADV_COLLAPSE)``
+ can still cause transparent huge pages to be obtained even if this mode
+ is specified everywhere;
within_size
Only allocate huge page if it will be fully within i_size.
@@ -441,7 +443,9 @@ inherit
have enabled="inherit" and all other hugepage sizes have enabled="never";
never
- Do not allocate <size> huge pages;
+ Do not allocate <size> huge pages. Note that ``madvise(...,
+ MADV_COLLAPSE)`` can still cause transparent huge pages to be obtained
+ even if this mode is specified everywhere;
within_size
Only allocate <size> huge page if it will be fully within i_size.
--
2.50.1
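
For illustration only, and not part of the fix-patch: a small sketch of how
one might check whether 'never' is "specified everywhere" for the anonymous
THP controls. Assumptions: the controls live under
/sys/kernel/mm/transparent_hugepage, each 'enabled' file marks the active
value in square brackets (e.g. "always madvise [never]"), and a per-size
value of "inherit" follows the top-level setting. The function names are
hypothetical, and the shmem_enabled controls are deliberately left out to
keep the sketch short.

```python
import glob
import os
import re

THP_ROOT = "/sys/kernel/mm/transparent_hugepage"


def selected_mode(text):
    """Return the bracketed (active) value, e.g. 'never' from
    'always madvise [never]', or None if no value is bracketed."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def thp_enabled_modes(root=THP_ROOT):
    """Map each readable 'enabled' control under root to its active value."""
    paths = [os.path.join(root, "enabled")]
    paths += sorted(glob.glob(os.path.join(root, "hugepages-*", "enabled")))
    modes = {}
    for path in paths:
        try:
            with open(path) as f:
                modes[path] = selected_mode(f.read())
        except OSError:
            continue  # control absent on this kernel; skip it
    return modes


def all_effectively_never(modes, root=THP_ROOT):
    """True if every control resolves to 'never' ('inherit' follows top level)."""
    top = modes.get(os.path.join(root, "enabled"))
    return bool(modes) and all(
        (top if mode == "inherit" else mode) == "never"
        for mode in modes.values()
    )


if __name__ == "__main__":
    current = thp_enabled_modes()
    for path, mode in current.items():
        print(f"{path}: {mode}")
    print("all 'never':", all_effectively_never(current))
```

Even when this reports that every control is 'never', the caveat in the
patch still applies: madvise(..., MADV_COLLAPSE) can produce huge pages.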
On Tue, Jul 22, 2025 at 1:34 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:

[...]

> Subject: [PATCH] docs: update admin guide transhuge page to mention
> MADV_COLLAPSE everywhere

[...]

> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

LGTM, thanks!

Reviewed-by: Barry Song <baohua@kernel.org>
On 2025/7/22 13:34, Lorenzo Stoakes wrote:

[...]

> Subject: [PATCH] docs: update admin guide transhuge page to mention
> MADV_COLLAPSE everywhere

[...]

> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Thanks.

Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
On 2025/7/21 23:55, Lorenzo Stoakes wrote:

[...]

> @@ -187,7 +192,9 @@ madvise
> behaviour.
>
> never
> - should be self-explanatory.
> + should be self-explanatory. Note that ``madvise(...,
> + MADV_COLLAPSE)`` can still cause transparent huge pages to be
> + obtained even if this mode is specified everywhere.

I hope this part of the explanation is also copy-pasted into the
'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks.
On Tue, Jul 22, 2025 at 09:30:18AM +0800, Baolin Wang wrote:

[...]

> I hope this part of the explanation is also copy-pasted into the 'Hugepages
> in tmpfs/shmem' section. Otherwise look good to me. Thanks.

Thanks, will send a fix-patch to add it there too!
On Tue, Jul 22, 2025 at 9:30 AM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:

[...]

> I hope this part of the explanation is also copy-pasted into the
> 'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks.

Apologies if this is a silly question, but regarding this patchset:
https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/

It looks like the intention is to disable hugepages even for
`MADV_COLLAPSE` when the user has set the policy to 'never'. However,
based on Lorenzo's documentation update, it seems we still want to allow
hugepages for `MADV_COLLAPSE` even if 'never' is set?

Could you clarify what the intended behavior is? It seems we've decided
to keep the existing behavior unchanged - am I understanding that
correctly?

Thanks
Barry
On Tue, Jul 22, 2025 at 10:23:39AM +0800, Barry Song wrote:

[...]

> Apologies if this is a silly question, but regarding this patchset:
> https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/
>
> It looks like the intention is to disable hugepages even for
> `MADV_COLLAPSE` when the user has set the policy to 'never'. However,
> based on Lorenzo's documentation update, it seems we still want to allow
> hugepages for `MADV_COLLAPSE` even if 'never' is set?
>
> Could you clarify what the intended behavior is? It seems we've decided
> to keep the existing behavior unchanged - am I understanding that
> correctly?

For now, see [0]: we have decided at this time that this series should not be
applied.

I again apologise sincerely to Baolin for this being such a back and forth and
him doing so much work here prior to this decision, but overall David and I
felt that _at this time_ we didn't want to risk breaking anybody by changing
this behaviour.

And so as I promised, this patch is my updating the documentation to reflect
the current (and I entirely agree - odd) reality that 'never' does not mean
never.

Cheers, Lorenzo

[0]:https://lore.kernel.org/linux-mm/573eb43a-8536-4206-a7c6-d0daa1fd7e70@lucifer.local/
On 2025/7/22 10:23, Barry Song wrote:

[...]

> Could you clarify what the intended behavior is? It seems we've decided
> to keep the existing behavior unchanged - am I understanding that
> correctly?

Yes, Hugh has already explicitly opposed the current changes to the
MADV_COLLAPSE logic[1], although there are still some disagreements that
cannot be resolved.

At least we reached the consensus to update the documentation to reflect
the current sysfs THP control logic first, to avoid the misunderstanding
that 'sysfs THP controls can disable Transparent Huge Pages globally'.

[1]
https://lore.kernel.org/linux-mm/75c02dbf-4189-958d-515e-fa80bb2187fc@google.com/
On Tue, Jul 22, 2025 at 10:33 AM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:

[...]

> Yes, Hugh has already explicitly opposed the current changes to the
> MADV_COLLAPSE logic[1], although there are still some disagreements that
> cannot be resolved.
>
> At least we reached the consensus to update the documentation to reflect
> the current sysfs THP control logic first, to avoid the misunderstanding
> that 'sysfs THP controls can disable Transparent Huge Pages globally'.

Nice, thanks! Personally, I prefer this approach as well. Updating the
man page feels a bit odd, since it's something people are already
familiar with and may have memorized.

>
> [1]
> https://lore.kernel.org/linux-mm/75c02dbf-4189-958d-515e-fa80bb2187fc@google.com/

Best regards
Barry
+cc Hugh since we're mentioning him here, and not-trimming for context - TL;DR
I am updating the docs to reflect the sysfs never 'doesn't mean never'
behaviour for THP.

On Tue, Jul 22, 2025 at 11:37:07AM +0800, Barry Song wrote:
> On Tue, Jul 22, 2025 at 10:33 AM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
> >
> >
> >
> > On 2025/7/22 10:23, Barry Song wrote:
> > > On Tue, Jul 22, 2025 at 9:30 AM Baolin Wang
> > > <baolin.wang@linux.alibaba.com> wrote:
> > >>
> > >>
> > >>
> > >> On 2025/7/21 23:55, Lorenzo Stoakes wrote:
> > >>> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> > >>> "never" does not in fact result in THP being globally disabled.
> > >>>
> > >>> Rather, it results in khugepaged being disabled, but one can still obtain
> > >>> THP pages using madvise(..., MADV_COLLAPSE).
> > >>>
> > >>> This is something that has remained poorly documented for some time, and it
> > >>> is likely the received wisdom of most users of THP that never does, in
> > >>> fact, mean never.
> > >>>
> > >>> It is therefore important to highlight, very clearly, that this is not the
> > >>> case.
> > >>>
> > >>> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > >>> ---
> > >>> Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
> > >>> 1 file changed, 9 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> > >>> index dff8d5985f0f..182519197ef7 100644
> > >>> --- a/Documentation/admin-guide/mm/transhuge.rst
> > >>> +++ b/Documentation/admin-guide/mm/transhuge.rst
> > >>> @@ -107,7 +107,7 @@ sysfs
> > >>> Global THP controls
> > >>> -------------------
> > >>>
> > >>> -Transparent Hugepage Support for anonymous memory can be entirely disabled
> > >>> +Transparent Hugepage Support for anonymous memory can be disabled
> > >>> (mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
> > >>> regions (to avoid the risk of consuming more memory resources) or enabled
> > >>> system wide. This can be achieved per-supported-THP-size with one of::
> > >>> @@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
> > >>> where <size> is the hugepage size being addressed, the available sizes
> > >>> for which vary by system.
> > >>>
> > >>> +.. note:: Setting "never" in all sysfs THP controls does **not** disable
> > >>> + Transparent Huge Pages globally. This is because ``madvise(...,
> > >>> + MADV_COLLAPSE)`` ignores these settings and collapses ranges to
> > >>> + PMD-sized huge pages unconditionally.
> > >>> +
> > >>> For example::
> > >>>
> > >>> echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
> > >>> @@ -187,7 +192,9 @@ madvise
> > >>> behaviour.
> > >>>
> > >>> never
> > >>> - should be self-explanatory.
> > >>> + should be self-explanatory. Note that ``madvise(...,
> > >>> + MADV_COLLAPSE)`` can still cause transparent huge pages to be
> > >>> + obtained even if this mode is specified everywhere.
> > >> > > >> I hope this part of the explanation is also copy-pasted into the > > >> 'Hugepages in tmpfs/shmem' section. Otherwise look good to me. Thanks. > > > > > > Apologies if this is a silly question, but regarding this patchset: > > > https://lore.kernel.org/linux-mm/cover.1750815384.git.baolin.wang@linux.alibaba.com/ > > > > > > It looks like the intention is to disable hugepages even for > > > `MADV_COLLAPSE` when the user has set the policy to 'never'. However, > > > based on Lorenzo's documentation update, it seems we still want to allow > > > hugepages for `MADV_COLLAPSE` even if 'never' is set? > > > > > > Could you clarify what the intended behavior is? It seems we've decided > > > to keep the existing behavior unchanged—am I understanding that > > > correctly? > > > > Yes, Hugh has already explicitly opposed the current changes to the > > MADV_COLLAPSE logic[1], although there are still some disagreements that > > cannot be resolved. > > > > At least we reached the consensus to update the documentation to reflect > > the current sysfs THP control logic first, to avoid the misunderstanding > > that 'sysfs THP controls can disable Transparent Huge Pages globally'. > > Nice, thanks! Personally, I prefer this approach as well. Updating the > man page feels a bit odd, since it's something people are already > familiar with and may have memorized. Indeed, Hugh's input was important here and gave pause for thought. This was not an easy decision, and I ended up changing my mind from initially supporting this chnage... :) We may return to it later, but for the time being this is the rather conservative approach we've decided upon. Re: man page - I _do_ intend to update the man page as I find it far too vague on this topic currently, so that patch will be coming soon. I will cc- the THP folks on that patch when I send it. > > > > > [1] > > https://lore.kernel.org/linux-mm/75c02dbf-4189-958d-515e-fa80bb2187fc@google.com/ > > Best regards > Barry Cheers, Lorenzo
On Tue, Jul 22, 2025 at 06:29:15AM +0100, Lorenzo Stoakes wrote:
> Re: man page - I _do_ intend to update the man page as I find it far too
> vague on this topic currently, so that patch will be coming soon.

Actually:

    MADV_COLLAPSE is independent of any sysfs (see sysfs(5)) setting
    under /sys/kernel/mm/transparent_hugepage, both in terms of
    determining THP eligibility, and allocation semantics. See Linux
    kernel source file Documentation/admin-guide/mm/transhuge.rst for
    more information. MADV_COLLAPSE also ignores huge= tmpfs mount when
    operating on tmpfs files. Allocation for the new hugepage may enter
    direct reclaim and/or compaction, regardless of VMA flags (though
    VM_NOHUGEPAGE is still respected).

I think this is clear enough to require no update. The confusion arose from
the doc page seemingly contradicting this.

By referring to the doc page which now makes things clear, we're all good! :)

Cheers, Lorenzo

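For anyone wanting to check a system against the behaviour described above,
here is a minimal Python sketch (not part of the patch or of this thread). It
reads the THP sysfs controls and reports whether every one of them is set to
"never". The helper names (selected_mode, thp_modes, all_never) are made up
for illustration; the only things taken from the kernel are the
/sys/kernel/mm/transparent_hugepage paths and the bracketed
"always madvise [never]" file format. As a simplifying assumption, only a
literal "never" counts, so a per-size "inherit" or a shmem "deny" is not
resolved to its effective value.

```python
import glob
import os
import re

THP_ROOT = "/sys/kernel/mm/transparent_hugepage"

# Control files to inspect, relative to the THP sysfs root.
CONTROL_PATTERNS = [
    "enabled",
    "shmem_enabled",
    "hugepages-*kB/enabled",
    "hugepages-*kB/shmem_enabled",
]


def selected_mode(contents):
    """Return the bracketed (active) value of a control file, or None."""
    match = re.search(r"\[([^\]]+)\]", contents)
    return match.group(1) if match else None


def thp_modes(root=THP_ROOT):
    """Map each readable control file under root to its active mode."""
    modes = {}
    for pattern in CONTROL_PATTERNS:
        for path in sorted(glob.glob(os.path.join(root, pattern))):
            try:
                with open(path) as f:
                    modes[path] = selected_mode(f.read())
            except OSError:
                # Unreadable file: skip it rather than guess.
                continue
    return modes


def all_never(modes):
    """True only if at least one control was found and all say 'never'."""
    return bool(modes) and all(mode == "never" for mode in modes.values())


if __name__ == "__main__":
    modes = thp_modes()
    for path, mode in modes.items():
        print(f"{path}: {mode}")
    if all_never(modes):
        print("All controls are 'never'; MADV_COLLAPSE can still create THPs.")
```

Even when this reports that every control is "never", a process calling
madvise(..., MADV_COLLAPSE) can still obtain huge pages, which is exactly
what the documentation update spells out.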
On 21 Jul 2025, at 11:55, Lorenzo Stoakes wrote:
> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> "never" does not in fact result in THP being globally disabled.
>
> Rather, it results in khugepaged being disabled, but one can still obtain
> THP pages using madvise(..., MADV_COLLAPSE).
>
> This is something that has remained poorly documented for some time, and it
> is likely the received wisdom of most users of THP that never does, in
> fact, mean never.
>
> It is therefore important to highlight, very clearly, that this is not the
> ase.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> Documentation/admin-guide/mm/transhuge.rst | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>

LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi

On Mon, 21 Jul 2025 16:55:30 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> Rather confusingly, setting all Transparent Huge Page sysfs settings to
> "never" does not in fact result in THP being globally disabled.
>
> Rather, it results in khugepaged being disabled, but one can still obtain
> THP pages using madvise(..., MADV_COLLAPSE).
>
> This is something that has remained poorly documented for some time, and it
> is likely the received wisdom of most users of THP that never does, in
> fact, mean never.
>
> It is therefore important to highlight, very clearly, that this is not the
> ase.

case?

>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Acked-by: SeongJae Park <sj@kernel.org>

Thanks,
SJ

[...]