Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
which implies ALLOC_MIN_RESERVE, is going to be used instead of
__GFP_ATOMIC for high atomic reserves.
Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
allocations of order higher than 0. It still used __GFP_ATOMIC, though.
Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
non-blocking allocations accesses reserves") just turned that check for
!__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
test for __GFP_HIGH.
This leads to high atomic reserves being added for high-order GFP_NOWAIT
allocations and others that clear __GFP_DIRECT_RECLAIM, which is
unexpected. Later, those reserves lead to 0-order allocations going to the
slow path and starting reclaim.
From /proc/pagetypeinfo, without the patch:
Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0
Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
With the patch:
Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Fixes: 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Tested-by: Helen Koike <koike@igalia.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: NeilBrown <neilb@suse.de>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
---
mm/page_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2ef3c07266b3..bf52e3bef626 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4219,7 +4219,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
 		if (!(gfp_mask & __GFP_NOMEMALLOC)) {
 			alloc_flags |= ALLOC_NON_BLOCK;
-			if (order > 0)
+			if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
 				alloc_flags |= ALLOC_HIGHATOMIC;
 		}
--
2.47.2
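The HighAtomic rows quoted above list free block counts per order, so the total number of base pages held in the reserve is the sum of count << order over the columns. A minimal sketch of that arithmetic follows, assuming 11 order columns as in the quoted output; the helper name highatomic_free_pages() is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 11 columns (orders 0..10), matching the quoted output. */
#define MAX_ORDERS 11

/*
 * Parse one /proc/pagetypeinfo row such as
 *   "Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0"
 * and return the number of free base pages it represents.
 * Returns -1 if the row is not a HighAtomic row.
 */
static long highatomic_free_pages(const char *line)
{
	const char *p;
	char type[32];
	int consumed = 0;
	long total = 0;

	assert(line != NULL);

	p = strstr(line, "type");
	if (!p || sscanf(p, "type %31s%n", type, &consumed) != 1)
		return -1;
	if (strcmp(type, "HighAtomic") != 0)
		return -1;
	p += consumed;

	for (int order = 0; order < MAX_ORDERS; order++) {
		long count;

		if (sscanf(p, "%ld%n", &count, &consumed) != 1)
			break;
		/* A free block of a given order spans 2^order base pages. */
		total += count << order;
		p += consumed;
	}
	return total;
}
```

Applied to the "without the patch" rows, this gives 337 pages for DMA32 and 192 pages for Normal; all rows are 0 with the patch.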
On 8/14/25 19:22, Thadeu Lima de Souza Cascardo wrote:
> Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
> ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
> which implies ALLOC_MIN_RESERVE, is going to be used instead of
> __GFP_ATOMIC for high atomic reserves.
>
> Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
> allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
> allocations of order higher than 0. It still used __GFP_ATOMIC, though.
>
> Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
> non-blocking allocations accesses reserves") just turned that check for
> !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
> test for __GFP_HIGH.
>
> This leads to high atomic reserves being added for high-order GFP_NOWAIT
> allocations and others that clear __GFP_DIRECT_RECLAIM, which is
> unexpected. Later, those reserves lead to 0-order allocations going to the
> slow path and starting reclaim.
>
> From /proc/pagetypeinfo, without the patch:
>
> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0
> Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
>
> With the patch:
>
> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
>
> Fixes: 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves")
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> Tested-by: Helen Koike <koike@igalia.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: NeilBrown <neilb@suse.de>
> Cc: Thierry Reding <thierry.reding@gmail.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
Agreed with others that this change matches the original intention and it
must have been an oversight. Also found nothing to the contrary in the
original threads.
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2ef3c07266b3..bf52e3bef626 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4219,7 +4219,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
> if (!(gfp_mask & __GFP_NOMEMALLOC)) {
> alloc_flags |= ALLOC_NON_BLOCK;
>
> - if (order > 0)
> + if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
> alloc_flags |= ALLOC_HIGHATOMIC;
> }
>
On 8/29/25 10:36, Vlastimil Babka wrote:
> On 8/14/25 19:22, Thadeu Lima de Souza Cascardo wrote:
>> Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
>> ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
>> which implies ALLOC_MIN_RESERVE, is going to be used instead of
>> __GFP_ATOMIC for high atomic reserves.
>>
>> Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
>> allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
>> allocations of order higher than 0. It still used __GFP_ATOMIC, though.
>>
>> Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
>> non-blocking allocations accesses reserves") just turned that check for
>> !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
>> test for __GFP_HIGH.
>>
>> This leads to high atomic reserves being added for high-order GFP_NOWAIT
>> allocations and others that clear __GFP_DIRECT_RECLAIM, which is
>> unexpected. Later, those reserves lead to 0-order allocations going to the
>> slow path and starting reclaim.
>>
>> From /proc/pagetypeinfo, without the patch:
>>
>> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
>> Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0
>> Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
>>
>> With the patch:
>>
>> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
>> Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
>> Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
>>
>> Fixes: 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves")
>> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
>> Tested-by: Helen Koike <koike@igalia.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: NeilBrown <neilb@suse.de>
>> Cc: Thierry Reding <thierry.reding@gmail.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>
> Agreed with others that this change matches the original intention and it
> must have been an oversight. Also found nothing to the contrary in the
> original threads.
Oops, forgot to add
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>> mm/page_alloc.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 2ef3c07266b3..bf52e3bef626 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4219,7 +4219,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
>> if (!(gfp_mask & __GFP_NOMEMALLOC)) {
>> alloc_flags |= ALLOC_NON_BLOCK;
>>
>> - if (order > 0)
>> + if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
>> alloc_flags |= ALLOC_HIGHATOMIC;
>> }
>>
>
On (25/08/29 10:38), Vlastimil Babka wrote:
> On 8/29/25 10:36, Vlastimil Babka wrote:
> > On 8/14/25 19:22, Thadeu Lima de Souza Cascardo wrote:
[..]
> > Agreed with others that this change matches the original intention and it
> > must have been an oversight. Also found nothing to the contrary in the
> > original threads.
>
> Oops, forgot to add
>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Thank you!
FWIW
Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
This needs Cc: stable@vger.kernel.org all the way down to 5.15,
as far as I can tell.
On 8/29/25 10:56, Sergey Senozhatsky wrote:
> On (25/08/29 10:38), Vlastimil Babka wrote:
>> On 8/29/25 10:36, Vlastimil Babka wrote:
>> > On 8/14/25 19:22, Thadeu Lima de Souza Cascardo wrote:
> [..]
>> > Agreed with others that this change matches the original intention and it
>> > must have been an oversight. Also found nothing to the contrary in the
>> > original threads.
>>
>> Oops, forgot to add
>>
>> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>
> Thank you!
>
> FWIW
> Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
>
> This needs Cc: stable@vger.kernel.org all the way down to 5.15,
> as far as I can tell.
If that problem bothers users of LTS kernels in that range, we can do that.
I wonder a bit how it was only found out now as a regression in the
browser/desktop environment test if it's that old? Is there another factor
i.e. some new frequent source of allocations that trigger it?
On Fri, Aug 29, 2025 at 11:30:17AM +0200, Vlastimil Babka wrote:
> On 8/29/25 10:56, Sergey Senozhatsky wrote:
> > On (25/08/29 10:38), Vlastimil Babka wrote:
> >> On 8/29/25 10:36, Vlastimil Babka wrote:
> >> > On 8/14/25 19:22, Thadeu Lima de Souza Cascardo wrote:
> > [..]
> >> > Agreed with others that this change matches the original intention and it
> >> > must have been an oversight. Also found nothing to the contrary in the
> >> > original threads.
> >>
> >> Oops, forgot to add
> >>
> >> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
> >
> > Thank you!
> >
> > FWIW
> > Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> >
> > This needs Cc: stable@vger.kernel.org all the way down to 5.15,
> > as far as I can tell.
>
> If that problem bothers users of LTS kernels in that range, we can do that.
> I wonder a bit how it was only found out now as a regression in the
> browser/desktop environment test if it's that old? Is there another factor
> i.e. some new frequent source of allocations that trigger it?
That is not. It is just that there was an upgrade all the way back from 5.4
and, then, I caught this while doing some code inspection and reviewing the
patchset I referred to.
Well, I also tested that it really happens and caught a unix socket skb
allocation triggering that as it masks off __GFP_DIRECT_RECLAIM.
Cascardo.
On Thu 14-08-25 14:22:45, Thadeu Lima de Souza Cascardo wrote:
> Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
> ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
> which implies ALLOC_MIN_RESERVE, is going to be used instead of
> __GFP_ATOMIC for high atomic reserves.
>
> Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
> allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
> allocations of order higher than 0. It still used __GFP_ATOMIC, though.
>
> Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
> non-blocking allocations accesses reserves") just turned that check for
> !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
> test for __GFP_HIGH.
>
> This leads to high atomic reserves being added for high-order GFP_NOWAIT
> allocations and others that clear __GFP_DIRECT_RECLAIM, which is
> unexpected. Later, those reserves lead to 0-order allocations going to the
> slow path and starting reclaim.
>
> From /proc/pagetypeinfo, without the patch:
>
> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0
> Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
>
> With the patch:
>
> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
>
> Fixes: 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves")
> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
> Tested-by: Helen Koike <koike@igalia.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: NeilBrown <neilb@suse.de>
> Cc: Thierry Reding <thierry.reding@gmail.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
Yes, this makes a lot of sense to me. GFP_NOWAIT allocations should be
opportunistic and quick to fail rather than dipping into memory
reserves. We must have overlooked that during the review.
Acked-by: Michal Hocko <mhocko@suse.com>
> ---
> mm/page_alloc.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2ef3c07266b3..bf52e3bef626 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4219,7 +4219,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
> if (!(gfp_mask & __GFP_NOMEMALLOC)) {
> alloc_flags |= ALLOC_NON_BLOCK;
>
> - if (order > 0)
> + if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
> alloc_flags |= ALLOC_HIGHATOMIC;
> }
>
> --
> 2.47.2
--
Michal Hocko
SUSE Labs
On (25/08/14 14:22), Thadeu Lima de Souza Cascardo wrote:
> Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
> ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
> which implies ALLOC_MIN_RESERVE, is going to be used instead of
> __GFP_ATOMIC for high atomic reserves.
>
> Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
> allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
> allocations of order higher than 0. It still used __GFP_ATOMIC, though.
>
> Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
> non-blocking allocations accesses reserves") just turned that check for
> !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
> test for __GFP_HIGH.
>
> This leads to high atomic reserves being added for high-order GFP_NOWAIT
> allocations and others that clear __GFP_DIRECT_RECLAIM, which is
> unexpected. Later, those reserves lead to 0-order allocations going to the
> slow path and starting reclaim.
>
> From /proc/pagetypeinfo, without the patch:
>
> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0
> Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
>
> With the patch:
>
> Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
> Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
[..]
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2ef3c07266b3..bf52e3bef626 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4219,7 +4219,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
> if (!(gfp_mask & __GFP_NOMEMALLOC)) {
> alloc_flags |= ALLOC_NON_BLOCK;
>
> - if (order > 0)
> + if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE))
> alloc_flags |= ALLOC_HIGHATOMIC;
> }
From my limited understanding, it does look like this was the
intention. Vlastimil, Mel, got a minute to take a look?
Hello Thadeu,
On Thu, Aug 14, 2025 at 02:22:45PM -0300, Thadeu Lima de Souza Cascardo wrote:
> Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
> ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
> which implies ALLOC_MIN_RESERVE, is going to be used instead of
> __GFP_ATOMIC for high atomic reserves.
>
> Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
> allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
> allocations of order higher than 0. It still used __GFP_ATOMIC, though.
>
> Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
> non-blocking allocations accesses reserves") just turned that check for
> !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
> test for __GFP_HIGH.
It indeed looks accidental. From the cover letter,
High-order atomic allocations are explicitly handled with the caveat that
no __GFP_ATOMIC flag means that any high-order allocation that specifies
GFP_HIGH and cannot enter direct reclaim will be treated as if it was
GFP_ATOMIC.
it sounds like the intent was what your patch does, and not to extend
those privileges to anybody who is !gfp_direct_reclaim.
> This leads to high atomic reserves being added for high-order GFP_NOWAIT
> allocations and others that clear __GFP_DIRECT_RECLAIM, which is
> unexpected. Later, those reserves lead to 0-order allocations going to the
> slow path and starting reclaim.
Can you please provide more background on the workload and the
environment in which you observed this?
Which GFP_NOWAIT requests did you see participating in the reserves, etc.?
I would feel better with Mel or Vlastimil chiming in as well, but your
fix looks correct to me.
On Thu, Aug 14, 2025 at 04:12:11PM -0400, Johannes Weiner wrote:
> Hello Thadeu,
>
> On Thu, Aug 14, 2025 at 02:22:45PM -0300, Thadeu Lima de Souza Cascardo wrote:
> > Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to
> > ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH,
> > which implies ALLOC_MIN_RESERVE, is going to be used instead of
> > __GFP_ATOMIC for high atomic reserves.
> >
> > Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic
> > allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such
> > allocations of order higher than 0. It still used __GFP_ATOMIC, though.
> >
> > Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH
> > non-blocking allocations accesses reserves") just turned that check for
> > !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to
> > test for __GFP_HIGH.
>
> It indeed looks accidental. From the cover letter,
>
> High-order atomic allocations are explicitly handled with the caveat that
> no __GFP_ATOMIC flag means that any high-order allocation that specifies
> GFP_HIGH and cannot enter direct reclaim will be treated as if it was
> GFP_ATOMIC.
>
> it sounds like the intent was what your patch does, and not to extend
> those privileges to anybody who is !gfp_direct_reclaim.
>
> > This leads to high atomic reserves being added for high-order GFP_NOWAIT
> > allocations and others that clear __GFP_DIRECT_RECLAIM, which is
> > unexpected. Later, those reserves lead to 0-order allocations going to the
> > slow path and starting reclaim.
>
> Can you please provide more background on the workload and the
> environment in which you observed this?
>
> Which GFP_NOWAIT requests did you see participating in the reserves, etc.?
>
> I would feel better with Mel or Vlastimil chiming in as well, but your
> fix looks correct to me.
Thanks for the review, Johannes.
This has been observed in a browser/desktop environment test, where we have
noticed some memory pressure regression. This change alone does not make
the regression go away entirely, but it improves it.
I noticed some unix skb allocation going on and I found this at
net/core/skbuff.c:alloc_skb_with_frags:
	page = alloc_pages((gfp_mask & ~__GFP_DIRECT_RECLAIM) |
			   __GFP_COMP |
			   __GFP_NOWARN,
			   order);
But I tested this on a simple VM with the simplest workload (no swap,
writing to tmpfs) and it triggered with xarrays. At lib/xarray.c:xas_alloc:
	gfp_t gfp = GFP_NOWAIT | __GFP_NOWARN;
	if (xas->xa->xa_flags & XA_FLAGS_ACCOUNT)
		gfp |= __GFP_ACCOUNT;
	node = kmem_cache_alloc_lru(radix_tree_node_cachep, xas->xa_lru, gfp);
Where radix_tree_node_cachep, on that VM, uses a 4-page slab.
I tested with something like:
	if (order > 0) {
		WARN_ON_ONCE(!(alloc_flags & ALLOC_MIN_RESERVE));
		alloc_flags |= ALLOC_HIGHATOMIC;
	}
Thanks.
Cascardo.
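The flag derivation discussed in this thread can be modelled in user space to see which callers end up with ALLOC_HIGHATOMIC before and after the patch. This is only a sketch under stated assumptions: every M_* name and bit value below is hypothetical, the masks keep only the bits relevant here (M_GFP_KERNEL omits __GFP_IO and __GFP_FS, for instance), and model_alloc_flags() mirrors just the branch touched by the patch, not the whole of gfp_to_alloc_flags().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen only for this model. */
#define M_GFP_HIGH            0x01u	/* stands in for __GFP_HIGH */
#define M_GFP_KSWAPD_RECLAIM  0x02u
#define M_GFP_DIRECT_RECLAIM  0x04u
#define M_GFP_NOMEMALLOC      0x08u

/* Simplified masks: only the bits that matter for this branch. */
#define M_GFP_NOWAIT  (M_GFP_KSWAPD_RECLAIM)
#define M_GFP_ATOMIC  (M_GFP_HIGH | M_GFP_KSWAPD_RECLAIM)
#define M_GFP_KERNEL  (M_GFP_DIRECT_RECLAIM | M_GFP_KSWAPD_RECLAIM)

#define M_ALLOC_MIN_RESERVE   0x10u
#define M_ALLOC_NON_BLOCK     0x20u
#define M_ALLOC_HIGHATOMIC    0x40u

/* Model of the patched branch; 'patched' selects the old or new condition. */
static unsigned int model_alloc_flags(unsigned int gfp, unsigned int order,
				      bool patched)
{
	unsigned int alloc_flags = 0;

	/* __GFP_HIGH implies ALLOC_MIN_RESERVE. */
	if (gfp & M_GFP_HIGH)
		alloc_flags |= M_ALLOC_MIN_RESERVE;

	if (!(gfp & M_GFP_DIRECT_RECLAIM) && !(gfp & M_GFP_NOMEMALLOC)) {
		alloc_flags |= M_ALLOC_NON_BLOCK;
		if (order > 0 &&
		    (!patched || (alloc_flags & M_ALLOC_MIN_RESERVE)))
			alloc_flags |= M_ALLOC_HIGHATOMIC;
	}
	return alloc_flags;
}

/* Usage example covering the callers mentioned in the thread. */
static void model_self_check(void)
{
	/* xarray-style GFP_NOWAIT on a 4-page (order-2) slab */
	assert(model_alloc_flags(M_GFP_NOWAIT, 2, false) & M_ALLOC_HIGHATOMIC);
	assert(!(model_alloc_flags(M_GFP_NOWAIT, 2, true) & M_ALLOC_HIGHATOMIC));

	/* skb-style GFP_KERNEL with direct reclaim masked off */
	unsigned int skb_gfp = M_GFP_KERNEL & ~M_GFP_DIRECT_RECLAIM;
	assert(!(model_alloc_flags(skb_gfp, 3, true) & M_ALLOC_HIGHATOMIC));

	/* Genuine high-order atomic allocations keep access to the reserve. */
	assert(model_alloc_flags(M_GFP_ATOMIC, 2, true) & M_ALLOC_HIGHATOMIC);
}
```

In this model, the unpatched condition marks any high-order request without direct reclaim as high atomic, while the patched condition additionally requires the min-reserve bit, so only the GFP_ATOMIC-style request qualifies.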