[PATCH v4] mm/page_alloc: prevent reporting pcp->batch = 0

Joshua Hahn posted 1 patch 1 day, 22 hours ago
mm/page_alloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
[PATCH v4] mm/page_alloc: prevent reporting pcp->batch = 0
Posted by Joshua Hahn 1 day, 22 hours ago
zone_batchsize returns the appropriate value that should be used for
pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
1M, however, it leads to some incorrect math.

In the case above, we will get an intermediary value of 1, which is then
rounded down to the nearest power of two, and 1 is subtracted from it.
Since 1 is already a power of two, we will get batch = 1-1 = 0:

        batch = rounddown_pow_of_two(batch + batch/2) - 1;

A pcp->batch value of 0 is nonsensical, for MMU systems. If this were
actually set, then functions like drain_zone_pages would become no-ops,
since they would free 0 pages at a time.

Of the two callers of zone_batchsize, the one that is actually used to
set pcp->batch works around this by setting pcp->batch to the maximum
of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
incorrectly prints out the batch size of the zone to be 0.

This is probably rare in a typical zone, but the DMA zone can often have
less than 4096 pages, which means it will print out "LIFO batch:0".

Before: [    0.001216]   DMA zone: 3998 pages, LIFO batch:0
After:  [    0.001210]   DMA zone: 3998 pages, LIFO batch:1

With all of this said, NOMMU differs in two ways. Semantically, it
should report that pcp->batch is 0. At the same time, it can never
really have a pcp->batch size of 0 since it will reach a deadlock in
pcp freeing functions. For this reason, zone_batchsize should still
report 0 for NOMMU, but zone_set_pageset_high_and_batch should still
interpret it as 1, meaning we cannot get rid of max(1, zone_batchsize())
in zone_set_pageset_high_and_batch.

Suggested-by: Daniel Palmer <daniel@0x0f.com>
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
Reviewers' note:

This patch was originally a part of the 6.19-rc1 pr, but Daniel Palmer
kindly reported that this patch causes an issue on NOMMU systems [1].
Thank you, Daniel! I wasn't sure how to credit here since it was a
report on an unmerged commit so I went with suggested-by. If this is
problematic please let me know and I will change the tag.

[1] https://lore.kernel.org/all/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/

Reviewer's note (to Andrew):

This replaces commit 2/2 of the series titled "mm/page_alloc: pcp->batch
cleanups" [2]. I thought it might be best to just send out this patch and not
the first since the other one is already in the pull request (and seemingly
without any issues). If you would like me to resubmit the series, please
let me know and I will do that!

[2] https://lore.kernel.org/all/20251009192933.3756712-3-joshua.hahnjy@gmail.com/

 mm/page_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f928d37eeb6a..a7448840c5fd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5888,8 +5888,8 @@ static int zone_batchsize(struct zone *zone)
 	 * and zone lock contention.
 	 */
 	batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
-	if (batch < 1)
-		batch = 1;
+	if (batch <= 1)
+		return 1;
 
 	/*
 	 * Clamp the batch to a 2^n - 1 value. Having a power

base-commit: 65dbdbbd25f928b8a56d9a27d97a6c668ce1ef08
-- 
2.47.3
Re: [PATCH v4] mm/page_alloc: prevent reporting pcp->batch = 0
Posted by Andrew Morton 1 day, 19 hours ago
On Tue, 16 Dec 2025 06:48:11 -0800 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:

> zone_batchsize returns the appropriate value that should be used for
> pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
> 1M, however, it leads to some incorrect math.
> 
> In the case above, we will get an intermediary value of 1, which is then
> rounded down to the nearest power of two, and 1 is subtracted from it.
> Since 1 is already a power of two, we will get batch = 1-1 = 0:
> 
>         batch = rounddown_pow_of_two(batch + batch/2) - 1;
> 
> A pcp->batch value of 0 is nonsensical, for MMU systems. If this were
> actually set, then functions like drain_zone_pages would become no-ops,
> since they would free 0 pages at a time.
> 
> Of the two callers of zone_batchsize, the one that is actually used to
> set pcp->batch works around this by setting pcp->batch to the maximum
> of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> incorrectly prints out the batch size of the zone to be 0.
> 
> This is probably rare in a typical zone, but the DMA zone can often have
> less than 4096 pages, which means it will print out "LIFO batch:0".
> 
> Before: [    0.001216]   DMA zone: 3998 pages, LIFO batch:0
> After:  [    0.001210]   DMA zone: 3998 pages, LIFO batch:1
> 
> With all of this said, NOMMU differs in two ways. Semantically, it
> should report that pcp->batch is 0. At the same time, it can never
> really have a pcp->batch size of 0 since it will reach a deadlock in
> pcp freeing functions. For this reason, zone_batchsize should still
> report 0 for NOMMU, but zone_set_pageset_high_and_batch should still
> interpret it as 1, meaning we cannot get rid of max(1, zone_batchsize())
> in zone_set_pageset_high_and_batch.
> 
> Suggested-by: Daniel Palmer <daniel@0x0f.com>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
> Reviewers' note:
> 
> This patch was originally a part of the 6.19-rc1 pr, but Daniel Palmer
> kindly reported that this patch causes an issue on NOMMU systems [1].
> Thank you, Daniel! I wasn't sure how to credit here since it was a
> report on an unmerged commit so I went with suggested-by. If this is
> problematic please let me know and I will change the tag.
> 
> [1] https://lore.kernel.org/all/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> 
> Reviewer's note (to Andrew):
> 
> This replaces commit 2/2 of the series titled "mm/page_alloc: pcp->batch
> cleanups" [2].

That series is in mainline.  2783088ef24e ("mm/page_alloc: prevent
reporting pcp->batch = 0").

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5888,8 +5888,8 @@ static int zone_batchsize(struct zone *zone)
>  	 * and zone lock contention.
>  	 */
>  	batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
> -	if (batch < 1)
> -		batch = 1;
> +	if (batch <= 1)
> +		return 1;
>  
>  	/*
>  	 * Clamp the batch to a 2^n - 1 value. Having a power
> 

So this doesn't work.  Please send along a fix for current -linus and include

Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")

and Reported-by:Daniel and the appropriate Closes:

Thanks.
Re: [PATCH v4] mm/page_alloc: prevent reporting pcp->batch = 0
Posted by Joshua Hahn 1 day, 8 hours ago
On Tue, 16 Dec 2025 10:22:05 -0800 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 16 Dec 2025 06:48:11 -0800 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> 
> > zone_batchsize returns the appropriate value that should be used for
> > pcp->batch. If it finds a zone with less than 4096 pages or PAGE_SIZE >
> > 1M, however, it leads to some incorrect math.
> > 
> > In the case above, we will get an intermediary value of 1, which is then
> > rounded down to the nearest power of two, and 1 is subtracted from it.
> > Since 1 is already a power of two, we will get batch = 1-1 = 0:
> > 
> >         batch = rounddown_pow_of_two(batch + batch/2) - 1;
> > 
> > A pcp->batch value of 0 is nonsensical, for MMU systems. If this were
> > actually set, then functions like drain_zone_pages would become no-ops,
> > since they would free 0 pages at a time.
> > 
> > Of the two callers of zone_batchsize, the one that is actually used to
> > set pcp->batch works around this by setting pcp->batch to the maximum
> > of 1 and zone_batchsize. However, the other caller, zone_pcp_init,
> > incorrectly prints out the batch size of the zone to be 0.
> > 
> > This is probably rare in a typical zone, but the DMA zone can often have
> > less than 4096 pages, which means it will print out "LIFO batch:0".
> > 
> > Before: [    0.001216]   DMA zone: 3998 pages, LIFO batch:0
> > After:  [    0.001210]   DMA zone: 3998 pages, LIFO batch:1
> > 
> > With all of this said, NOMMU differs in two ways. Semantically, it
> > should report that pcp->batch is 0. At the same time, it can never
> > really have a pcp->batch size of 0 since it will reach a deadlock in
> > pcp freeing functions. For this reason, zone_batchsize should still
> > report 0 for NOMMU, but zone_set_pageset_high_and_batch should still
> > interpret it as 1, meaning we cannot get rid of max(1, zone_batchsize())
> > in zone_set_pageset_high_and_batch.
> > 
> > Suggested-by: Daniel Palmer <daniel@0x0f.com>
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> > Reviewers' note:
> > 
> > This patch was originally a part of the 6.19-rc1 pr, but Daniel Palmer
> > kindly reported that this patch causes an issue on NOMMU systems [1].
> > Thank you, Daniel! I wasn't sure how to credit here since it was a
> > report on an unmerged commit so I went with suggested-by. If this is
> > problematic please let me know and I will change the tag.
> > 
> > [1] https://lore.kernel.org/all/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> > 
> > Reviewer's note (to Andrew):
> > 
> > This replaces commit 2/2 of the series titled "mm/page_alloc: pcp->batch
> > cleanups" [2].
> 
> That series is in mainline.  2783088ef24e ("mm/page_alloc: prevent
> reporting pcp->batch = 0").

Hello Andrew,

Sorry again, this mix-up was also avoidable. I'll be more careful in the future.
 
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -5888,8 +5888,8 @@ static int zone_batchsize(struct zone *zone)
> >  	 * and zone lock contention.
> >  	 */
> >  	batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
> > -	if (batch < 1)
> > -		batch = 1;
> > +	if (batch <= 1)
> > +		return 1;
> >  
> >  	/*
> >  	 * Clamp the batch to a 2^n - 1 value. Having a power
> > 
> 
> So this doesn't work.  Please send along a fix for current -linus and include
> 
> Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> 
> and Reported-by:Daniel and the appropriate Closes:
> 
> Thanks.

Will do, thank you for your patience. I hope you have a great day!
Joshua