[PATCH v2 1/2] mm/page_alloc: Clarify batch tuning in zone_batchsize

Joshua Hahn posted 2 patches 2 months, 1 week ago
Posted by Joshua Hahn 2 months, 1 week ago
Recently while working on another patch about batching
free_pcppages_bulk [1], I was curious why pcp->batch was always 63 on my
machine. This led me to zone_batchsize(), where I found this set of
lines to determine what the batch size should be for the host:

	batch = min(zone_managed_pages(zone) >> 10, SZ_1M / PAGE_SIZE);
	batch /= 4;		/* We effectively *= 4 below */
	if (batch < 1)
		batch = 1;

All of this is good, except the comment above which says "We effectively
*= 4 below". Nowhere else in zone_batchsize() is there a corresponding
multiplication by 4. Looking into the history of this, it seems like
Dave Hansen had also noticed this back in 2013 [2]. Turns out there
*used* to be a corresponding *= 4, which was turned into a *= 6 later on
to be used in pageset_setup_from_batch_size(), which no longer exists.

Despite this mismatch not being corrected in the comments, it seems that
getting rid of the /= 4 leads to a performance regression on machines
with less than 250G memory and 176 processors. As such, let us preserve
the functionality but clean up the comments.

Fold the /= 4 into the calculation above: shift right by 10+2=12 bits,
cap at 256KB (1MB / 4) instead of 1MB, and adjust the comments
accordingly. No functional change intended.
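
As a sanity check of the "no functional change" claim, here is a minimal
userspace sketch comparing the two forms. It is not kernel code:
old_batch(), new_batch(), min_ul() and count_mismatches() are hypothetical
helpers written for this illustration, SZ_1M and SZ_256K are redefined
locally, and the page size is passed as a parameter instead of using
PAGE_SIZE.

```c
#define SZ_1M   (1024UL * 1024UL)
#define SZ_256K (256UL * 1024UL)

/* Hypothetical stand-in for the kernel's min() on unsigned longs. */
static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Old form: min(~0.1% of the zone, 1MB worth of pages), then /= 4. */
static int old_batch(unsigned long managed_pages, unsigned long page_size)
{
	int batch = (int)min_ul(managed_pages >> 10, SZ_1M / page_size);

	batch /= 4;
	if (batch < 1)
		batch = 1;
	return batch;
}

/* New form: the /= 4 is folded into the shift and into the cap. */
static int new_batch(unsigned long managed_pages, unsigned long page_size)
{
	int batch = (int)min_ul(managed_pages >> 12, SZ_256K / page_size);

	if (batch < 1)
		batch = 1;
	return batch;
}

/* Count zone sizes in [0, max_pages] where the two forms disagree. */
static unsigned long count_mismatches(unsigned long page_size,
				      unsigned long max_pages)
{
	unsigned long pages, mismatches = 0;

	for (pages = 0; pages <= max_pages; pages++)
		if (old_batch(pages, page_size) != new_batch(pages, page_size))
			mismatches++;
	return mismatches;
}
```

With 4KB pages both forms cap at 64 pages, and the two agree for every
zone size because integer division by 4 commutes with min() here:
(x >> 10) / 4 == x >> 12, and SZ_1M / 4 == SZ_256K.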

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>

[1] https://lore.kernel.org/all/20251002204636.4016712-1-joshua.hahnjy@gmail.com/
[2] https://lore.kernel.org/linux-mm/20131015203547.8724C69C@viggo.jf.intel.com/
---
 mm/page_alloc.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 600d9e981c23..39368cdc953d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5860,13 +5860,12 @@ static int zone_batchsize(struct zone *zone)
 	int batch;
 
 	/*
-	 * The number of pages to batch allocate is either ~0.1%
-	 * of the zone or 1MB, whichever is smaller. The batch
+	 * The number of pages to batch allocate is either ~0.025%
+	 * of the zone or 256KB, whichever is smaller. The batch
 	 * size is striking a balance between allocation latency
 	 * and zone lock contention.
 	 */
-	batch = min(zone_managed_pages(zone) >> 10, SZ_1M / PAGE_SIZE);
-	batch /= 4;		/* We effectively *= 4 below */
+	batch = min(zone_managed_pages(zone) >> 12, SZ_256K / PAGE_SIZE);
 	if (batch < 1)
 		batch = 1;
 
-- 
2.47.3
Re: [PATCH v2 1/2] mm/page_alloc: Clarify batch tuning in zone_batchsize
Posted by Vlastimil Babka 2 months ago
On 10/9/25 21:29, Joshua Hahn wrote:
> Recently while working on another patch about batching
> free_pcppages_bulk [1], I was curious why pcp->batch was always 63 on my
> machine. This led me to zone_batchsize(), where I found this set of
> lines to determine what the batch size should be for the host:
> 
> 	batch = min(zone_managed_pages(zone) >> 10, SZ_1M / PAGE_SIZE);
> 	batch /= 4;		/* We effectively *= 4 below */
> 	if (batch < 1)
> 		batch = 1;
> 
> All of this is good, except the comment above which says "We effectively
> *= 4 below". Nowhere else in zone_batchsize() is there a corresponding
> multiplication by 4. Looking into the history of this, it seems like
> Dave Hansen had also noticed this back in 2013 [2]. Turns out there
> *used* to be a corresponding *= 4, which was turned into a *= 6 later on
> to be used in pageset_setup_from_batch_size(), which no longer exists.
> 
> Despite this mismatch not being corrected in the comments, it seems that
> getting rid of the /= 4 leads to a performance regression on machines
> with less than 250G memory and 176 processors. As such, let us preserve
> the functionality but clean up the comments.
> 
> Fold the /= 4 into the calculation above: shift right by 10+2=12 bits,
> cap at 256KB (1MB / 4) instead of 1MB, and adjust the comments
> accordingly. No functional change intended.
> 
> Suggested-by: Dave Hansen <dave.hansen@intel.com>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>