mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

[PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Joshua Hahn 1 month, 3 weeks ago

Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
moved the error handling (0-handling) of zone_batchsize from its
callers to inside the function. However, the commit left out the error
handling for the NOMMU case, leading to deadlocks on NOMMU systems.

Since in the NOMMU case the reported-to-user batchsize should still be 0,
we would only like the error handling to exist in the callsites that
set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).

Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
on NOMMU systems.

Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
Reported-by: Daniel Palmer <daniel@thingy.jp>
Closes: https://lore.kernel.org/linux-mm/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/42143500-c380-41fe-815c-696c17241506@roeck-us.net/
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 822e05f1a964..10c1297fd3ea 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6045,7 +6045,7 @@ static void zone_set_pageset_high_and_batch(struct zone *zone, int cpu_online)
 {
 	int new_high_min, new_high_max, new_batch;
 
-	new_batch = zone_batchsize(zone);
+	new_batch = max(1, zone_batchsize(zone));
 	if (percpu_pagelist_high_fraction) {
 		new_high_min = zone_highsize(zone, new_batch, cpu_online,
 					     percpu_pagelist_high_fraction);

base-commit: 40fbbd64bba6c6e7a72885d2f59b6a3be9991eeb
-- 
2.47.3
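
The callsite clamp in the diff above can be modeled in plain userspace C. This is a minimal sketch, not kernel code: model_zone_batchsize(), its has_mmu flag, and model_new_batch() are hypothetical stand-ins, and the MMU branch assumes (going by the commit message) that the 0-handling already lives inside the function.

```c
/* Hypothetical stand-in for zone_batchsize(); has_mmu models CONFIG_MMU. */
static int model_zone_batchsize(int has_mmu, int computed)
{
	if (!has_mmu)
		return 0;	/* NOMMU value described in the thread */
	/* Assumed: the MMU path already guards against 0 internally. */
	return computed > 1 ? computed : 1;
}

/* Local helper so the sketch does not rely on the kernel's max() macro. */
static int max_int(int a, int b)
{
	return a > b ? a : b;
}

/* Mirrors the one-line change: clamp at the callsite, not in the helper. */
static int model_new_batch(int has_mmu, int computed)
{
	return max_int(1, model_zone_batchsize(has_mmu, computed));
}
```

With this shape the helper can still report 0 for NOMMU, while the value actually stored for the zone is never below 1.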

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Guenter Roeck 1 month, 3 weeks ago

On Tue, Dec 16, 2025 at 10:05:03PM -0800, Joshua Hahn wrote:
> Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> moved the error handling (0-handling) of zone_batchsize from its
> callers to inside the function. However, the commit left out the error
> handling for the NOMMU case, leading to deadlocks on NOMMU systems.
> 
> Since in the NOMMU case the reported-to-user batchsize should still be 0,
> we would only like the error handling to exist in the callsites that
> set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
> 
> Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
> on NOMMU systems.
> 
> Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> Reported-by: Daniel Palmer <daniel@thingy.jp>
> Closes: https://lore.kernel.org/linux-mm/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Closes: https://lore.kernel.org/all/42143500-c380-41fe-815c-696c17241506@roeck-us.net/
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>

For m68k:mcf5208evb boot tests with qemu:

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Joshua Hahn 1 month, 3 weeks ago

On Wed, 17 Dec 2025 08:47:12 -0800 Guenter Roeck <linux@roeck-us.net> wrote:

> On Tue, Dec 16, 2025 at 10:05:03PM -0800, Joshua Hahn wrote:
> > Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> > moved the error handling (0-handling) of zone_batchsize from its
> > callers to inside the function. However, the commit left out the error
> > handling for the NOMMU case, leading to deadlocks on NOMMU systems.
> > 
> > Since in the NOMMU case the reported-to-user batchsize should still be 0,
> > we would only like the error handling to exist in the callsites that
> > set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
> > 
> > Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
> > on NOMMU systems.
> > 
> > Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> > Reported-by: Daniel Palmer <daniel@thingy.jp>
> > Closes: https://lore.kernel.org/linux-mm/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> > Reported-by: Guenter Roeck <linux@roeck-us.net>
> > Closes: https://lore.kernel.org/all/42143500-c380-41fe-815c-696c17241506@roeck-us.net/
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> 
> For m68k:mcf5208evb boot tests with qemu:
> 
> Tested-by: Guenter Roeck <linux@roeck-us.net>
> 
> Guenter

Hello Guenter, thank you so much for spending the time to test my patch!

I've gone ahead and taken Vlastimil's suggestion, which just returns 1
for zone_batchsize for NOMMU systems. Functionally, it should be the same,
but I thought it might be best not to carry the Tested-by tag over, since
the implementation changed.

Thank you again for testing the code, and sorry for the bug in the first place.
Have a great day!
Joshua

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Vlastimil Babka 1 month, 3 weeks ago

On 12/17/25 07:05, Joshua Hahn wrote:
> Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> moved the error handling (0-handling) of zone_batchsize from its
> callers to inside the function. However, the commit left out the error
> handling for the NOMMU case, leading to deadlocks on NOMMU systems.
> 
> Since in the NOMMU case the reported-to-user batchsize should still be 0,

Should it? The value is effectively set to 1 despite what zone_batchsize()
returns, because of that adjustment this patch reinstates. Also does anyone
care, really?

> we would only like the error handling to exist in the callsites that
> set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
> 
> Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
> on NOMMU systems.

I would rather make zone_batchsize() for !CONFIG_MMU return 1 instead of 0.

> Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> Reported-by: Daniel Palmer <daniel@thingy.jp>
> Closes: https://lore.kernel.org/linux-mm/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
> Reported-by: Guenter Roeck <linux@roeck-us.net>
> Closes: https://lore.kernel.org/all/42143500-c380-41fe-815c-696c17241506@roeck-us.net/
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  mm/page_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 822e05f1a964..10c1297fd3ea 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6045,7 +6045,7 @@ static void zone_set_pageset_high_and_batch(struct zone *zone, int cpu_online)
>  {
>  	int new_high_min, new_high_max, new_batch;
>  
> -	new_batch = zone_batchsize(zone);
> +	new_batch = max(1, zone_batchsize(zone));
>  	if (percpu_pagelist_high_fraction) {
>  		new_high_min = zone_highsize(zone, new_batch, cpu_online,
>  					     percpu_pagelist_high_fraction);
> 
> base-commit: 40fbbd64bba6c6e7a72885d2f59b6a3be9991eeb

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Vlastimil Babka 1 month, 3 weeks ago

On 12/17/25 12:12, Vlastimil Babka wrote:
> On 12/17/25 07:05, Joshua Hahn wrote:
>> Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
>> moved the error handling (0-handling) of zone_batchsize from its
>> callers to inside the function. However, the commit left out the error
>> handling for the NOMMU case, leading to deadlocks on NOMMU systems.
>> 
>> Since in the NOMMU case the reported-to-user batchsize should still be 0,
> 
> Should it? The value is effectively set to 1 despite what zone_batchsize()
> returns, because of that adjustment this patch reinstates. Also does anyone
> care, really?
> 
>> we would only like the error handling to exist in the callsites that
>> set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
>> 
>> Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
>> on NOMMU systems.
> 
> I would rather make zone_batchsize() for !CONFIG_MMU return 1 instead of 0.

Ah looks like you considered it too, initially:
https://lore.kernel.org/all/20251211225947.822866-1-joshua.hahnjy@gmail.com/

It makes more sense to me than doing effectively two fixups in the MMU case.

>> Fixes: 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
>> Reported-by: Daniel Palmer <daniel@thingy.jp>
>> Closes: https://lore.kernel.org/linux-mm/CAFr9PX=_HaM3_xPtTiBn5Gw5-0xcRpawpJ02NStfdr0khF2k7g@mail.gmail.com/
>> Reported-by: Guenter Roeck <linux@roeck-us.net>
>> Closes: https://lore.kernel.org/all/42143500-c380-41fe-815c-696c17241506@roeck-us.net/
>> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> ---
>>  mm/page_alloc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 822e05f1a964..10c1297fd3ea 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6045,7 +6045,7 @@ static void zone_set_pageset_high_and_batch(struct zone *zone, int cpu_online)
>>  {
>>  	int new_high_min, new_high_max, new_batch;
>>  
>> -	new_batch = zone_batchsize(zone);
>> +	new_batch = max(1, zone_batchsize(zone));
>>  	if (percpu_pagelist_high_fraction) {
>>  		new_high_min = zone_highsize(zone, new_batch, cpu_online,
>>  					     percpu_pagelist_high_fraction);
>> 
>> base-commit: 40fbbd64bba6c6e7a72885d2f59b6a3be9991eeb
>
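
The alternative suggested in this reply, having zone_batchsize() itself return 1 when there is no MMU, might look roughly like the following userspace sketch. MODEL_CONFIG_MMU, alt_zone_batchsize(), and alt_new_batch() are hypothetical names standing in for CONFIG_MMU and the real functions, and the MMU-side value is a passed-in placeholder rather than the kernel's actual computation.

```c
/* Define MODEL_CONFIG_MMU at build time to model an MMU kernel. */
#ifdef MODEL_CONFIG_MMU
static int alt_zone_batchsize(int computed)
{
	/* Assumed: the MMU variant guards against 0 internally. */
	return computed > 1 ? computed : 1;
}
#else
static int alt_zone_batchsize(int computed)
{
	(void)computed;	/* no computation in the NOMMU variant */
	return 1;	/* was 0; 1 needs no fixup at the callsite */
}
#endif

/* The callsite can then use the value directly, with no max(1, ...). */
static int alt_new_batch(int computed)
{
	return alt_zone_batchsize(computed);
}
```

Here every caller sees a value of at least 1, so no second fixup is needed at the callsite.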

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Joshua Hahn 1 month, 3 weeks ago

On Wed, 17 Dec 2025 12:23:58 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 12/17/25 12:12, Vlastimil Babka wrote:
> > On 12/17/25 07:05, Joshua Hahn wrote:
> >> Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> >> moved the error handling (0-handling) of zone_batchsize from its
> >> callers to inside the function. However, the commit left out the error
> >> handling for the NOMMU case, leading to deadlocks on NOMMU systems.
> >> 
> >> Since in the NOMMU case the reported-to-user batchsize should still be 0,
> > 
> > Should it? The value is effectively set to 1 despite what zone_batchsize()
> > returns, because of that adjustment this patch reinstates. Also does anyone
> > care, really?
> > 
> >> we would only like the error handling to exist in the callsites that
> >> set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
> >> 
> >> Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
> >> on NOMMU systems.
> > 
> > I would rather make zone_batchsize() for !CONFIG_MMU return 1 instead of 0.
> 
> Ah looks like you considered it too, initially:
> https://lore.kernel.org/all/20251211225947.822866-1-joshua.hahnjy@gmail.com/
> 
> It makes more sense to me than doing effectively two fixups in the MMU case.

Hi Vlastimil,

Thank you for your review as always.

Yes, I had also considered returning 1 for the !MMU case, since I think it
would make it a lot simpler as well (It would also make my original patch
function as intended).

However, I was unsure if changing this user-facing behavior for one line of
simplification would be worth it. I am not a NOMMU user, so I have very
little experience here, but I imagine that there is someone out there who
looks at zone_batchsize() returning 0 for NOMMU and interprets it as
"there is no batching" as opposed to "there is batching, and it processes
1 page at a time" (which actually isn't even true anyway, because of the
bitshift). Maybe an option is to just make batchsize not visible
in the NOMMU case, in addition to always returning 1, to avoid confusion.

Anyway, back to your original question of "does anyone care"...

I am not sure :-)

For me, both solutions work, and in fact I prefer the original solution of
always returning 1 for !MMU. Maybe some NOMMU users like Daniel and Guenter
can comment on whether this change really matters?

Thank you again for your review and follow-up. I hope you have a great day!
Joshua

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Vlastimil Babka 1 month, 3 weeks ago

On 12/17/25 14:02, Joshua Hahn wrote:
> On Wed, 17 Dec 2025 12:23:58 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
> 
>> On 12/17/25 12:12, Vlastimil Babka wrote:
>> > On 12/17/25 07:05, Joshua Hahn wrote:
>> >> Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
>> >> moved the error handling (0-handling) of zone_batchsize from its
>> >> callers to inside the function. However, the commit left out the error
>> >> handling for the NOMMU case, leading to deadlocks on NOMMU systems.
>> >> 
>> >> Since in the NOMMU case the reported-to-user batchsize should still be 0,
>> > 
>> > Should it? The value is effectively set to 1 despite what zone_batchsize()
>> > returns, because of that adjustment this patch reinstates. Also does anyone
>> > care, really?
>> > 
>> >> we would only like the error handling to exist in the callsites that
>> >> set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
>> >> 
>> >> Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
>> >> on NOMMU systems.
>> > 
>> > I would rather make zone_batchsize() for !CONFIG_MMU return 1 instead of 0.
>> 
>> Ah looks like you considered it too, initially:
>> https://lore.kernel.org/all/20251211225947.822866-1-joshua.hahnjy@gmail.com/
>> 
>> It makes more sense to me than doing effectively two fixups in the MMU case.
> 
> Hi Vlastimil,
> 
> Thank you for your review as always.
> 
> Yes, I had also considered returning 1 for the !MMU case, since I think it
> would make it a lot simpler as well (It would also make my original patch
> function as intended).
> 
> However, I was unsure if changing this user-facing behavior for one line of
> simplification would be worth it. I am not a NOMMU user, so I have very
> little experience here, but I imagine that there is someone out there who
> looks at zone_batchsize() returning 0 for NOMMU and interprets it as
> "there is no batching" as opposed to "there is batching, and it processes
> 1 page at a time" (which actually isn't even true anyway, because of the
> bitshift). Maybe an option is to just make batchsize not visible
> in the NOMMU case, in addition to always returning 1, to avoid confusion.
> 
> Anyway, back to your original question of "does anyone care"...
> 
> I am not sure :-)

It's a pr_debug(), it's not even being printed by default, nothing can
possibly break by changing 0 to 1 there. So I really wouldn't overthink this...

> For me, both solutions work, and in fact I prefer the original solution of
> always returning 1 for !MMU. Maybe some NOMMU users like Daniel and Guenter
> can comment on whether this change really matters?

I would be surprised if they were aware of that pr_debug() in the first place :)

> Thank you again for your review and follow-up. I hope you have a great day!
> Joshua
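
For context on why the reported value is rarely seen: in the kernel, pr_debug() produces output only when DEBUG is defined for the file or dynamic debug enables the callsite. A rough userspace analogue follows, with MODEL_DEBUG, model_pr_debug(), and report_batch() as hypothetical names that are not part of the kernel.

```c
#include <stdio.h>

/* Build with -DMODEL_DEBUG to see output; otherwise the call is a no-op. */
#ifdef MODEL_DEBUG
#define model_pr_debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define model_pr_debug(...) (0)
#endif

/* Returns the number of characters printed, 0 when debugging is off. */
static int report_batch(int batch)
{
	(void)batch;	/* keeps the parameter used when the macro drops it */
	return model_pr_debug("batch:%d\n", batch);
}
```

In a default build nothing is printed, so changing the reported number from 0 to 1 is invisible unless someone has explicitly enabled the message.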

Re: [PATCH] mm/page_alloc: restore 0-handling to zone_set_pageset_high_and_batch

Posted by Joshua Hahn 1 month, 3 weeks ago

On Wed, 17 Dec 2025 14:19:47 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:

> On 12/17/25 14:02, Joshua Hahn wrote:
> > On Wed, 17 Dec 2025 12:23:58 +0100 Vlastimil Babka <vbabka@suse.cz> wrote:
> > 
> >> On 12/17/25 12:12, Vlastimil Babka wrote:
> >> > On 12/17/25 07:05, Joshua Hahn wrote:
> >> >> Commit 2783088ef24e ("mm/page_alloc: prevent reporting pcp->batch = 0")
> >> >> moved the error handling (0-handling) of zone_batchsize from its
> >> >> callers to inside the function. However, the commit left out the error
> >> >> handling for the NOMMU case, leading to deadlocks on NOMMU systems.
> >> >> 
> >> >> Since in the NOMMU case the reported-to-user batchsize should still be 0,
> >> > 
> >> > Should it? The value is effectively set to 1 despite what zone_batchsize()
> >> > returns, because of that adjustment this patch reinstates. Also does anyone
> >> > care, really?
> >> > 
> >> >> we would only like the error handling to exist in the callsites that
> >> >> set the internal value for the zone (i.e. zone_set_pageset_high_and_batch).
> >> >> 
> >> >> Restore max(1, zone_batchsize(zone)) to the callsite to prevent errors
> >> >> on NOMMU systems.
> >> > 
> >> > I would rather make zone_batchsize() for !CONFIG_MMU return 1 instead of 0.
> >> 
> >> Ah looks like you considered it too, initially:
> >> https://lore.kernel.org/all/20251211225947.822866-1-joshua.hahnjy@gmail.com/
> >> 
> >> It makes more sense to me than doing effectively two fixups in the MMU case.
> > 
> > Hi Vlastimil,
> > 
> > Thank you for your review as always.
> > 
> > Yes, I had also considered returning 1 for the !MMU case, since I think it
> > would make it a lot simpler as well (It would also make my original patch
> > function as intended).
> > 
> > However, I was unsure if changing this user-facing behavior for one line of
> > simplification would be worth it. I am not a NOMMU user, so I have very
> > little experience here, but I imagine that there is someone out there who
> > looks at zone_batchsize() returning 0 for NOMMU and interprets it as
> > "there is no batching" as opposed to "there is batching, and it processes
> > 1 page at a time" (which actually isn't even true anyway, because of the
> > bitshift). Maybe an option is to just make batchsize not visible
> > in the NOMMU case, in addition to always returning 1, to avoid confusion.
> > 
> > Anyway, back to your original question of "does anyone care"...
> > 
> > I am not sure :-)
> 
> It's a pr_debug(), it's not even being printed by default, nothing can
> possibly break by changing 0 to 1 there. So I really wouldn't overthink this...
> 
> > For me, both solutions work, and in fact I prefer the original solution of
> > always returning 1 for !MMU. Maybe some NOMMU users like Daniel and Guenter
> > can comment on whether this change really matters?
> 
> I would be surprised if they were aware of that pr_debug() in the first place :)

You are right :-) I do think I was overthinking it.
Happy to send a v2 that just returns 1 for NOMMU. I'll wait for a day or so
to see if anyone has any strong objections to this.

Thank you again! Have a great day!
Joshua