Commit 9726891fe753 ("mm/page_alloc: recalculate zone reserve pages
when managed pages change") moved setup_per_zone_lowmem_reserve() into
adjust_managed_page_count(), so zone reserve recalculation can now be
triggered from paths that run concurrently on different CPUs.

setup_per_zone_lowmem_reserve() updates zone->lowmem_reserve[],
pgdat->totalreserve_pages and the global totalreserve_pages as one
logical operation, but adjust_managed_page_count() does not serialize
those updates. Concurrent callers can therefore interleave the reserve
recalculation and leave the reserve accounting temporarily inconsistent.

This race was identified by code inspection rather than by a reported
runtime failure. However, these reserve counters are used by the page
allocator and reclaim paths to make allocation and watermark decisions,
so it is preferable to avoid publishing inconsistent values.
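
Serialize the recalculation triggered from adjust_managed_page_count()
with a file-scope zone_reserve_lock, taken in
setup_per_zone_lowmem_reserve() and shared with setup_per_zone_wmarks(),
so that each reserve recalculation observes and publishes a consistent
state.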
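
For illustration only, here is a minimal userspace sketch of the pattern
described above. It is not kernel code: managed[], reserve[],
total_reserve, reserve_lock, the divisor 32 and all function names are
hypothetical stand-ins, and a pthread mutex replaces the spinlock. Each
worker bumps an atomic per-zone counter and then recomputes the derived
values under one lock, so the published total always matches the
per-zone values.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ZONES   4
#define NR_THREADS 8
#define NR_ROUNDS  1000

/* Hypothetical stand-ins for the kernel's accounting, not its layout. */
static _Atomic long managed[NR_ZONES];	/* input: managed pages per zone */
static long reserve[NR_ZONES];		/* derived: per-zone reserve */
static long total_reserve;		/* derived: sum of reserve[] */
static pthread_mutex_t reserve_lock = PTHREAD_MUTEX_INITIALIZER;

/* Recompute every derived value as one unit under reserve_lock. */
static void recalc_reserves(void)
{
	long total = 0;

	pthread_mutex_lock(&reserve_lock);
	for (size_t i = 0; i < NR_ZONES; i++) {
		reserve[i] = atomic_load(&managed[i]) / 32; /* arbitrary ratio */
		total += reserve[i];
	}
	total_reserve = total;
	pthread_mutex_unlock(&reserve_lock);
}

/* The counter update is atomic; only the recalculation is serialized. */
static void adjust_managed(size_t zone, long delta)
{
	atomic_fetch_add(&managed[zone], delta);
	recalc_reserves();
}

static void *worker(void *arg)
{
	size_t zone = (size_t)(uintptr_t)arg % NR_ZONES;

	for (int i = 0; i < NR_ROUNDS; i++)
		adjust_managed(zone, 64);
	return NULL;
}

/* Returns the final total if it equals the sum of reserve[], else -1. */
static long run_demo(void)
{
	pthread_t tids[NR_THREADS];
	long sum = 0;

	for (size_t i = 0; i < NR_ZONES; i++)
		atomic_store(&managed[i], 0);

	for (size_t i = 0; i < NR_THREADS; i++) {
		int err = pthread_create(&tids[i], NULL, worker,
					 (void *)(uintptr_t)i);
		assert(err == 0);
		(void)err;
	}
	for (size_t i = 0; i < NR_THREADS; i++)
		pthread_join(tids[i], NULL);

	for (size_t i = 0; i < NR_ZONES; i++)
		sum += reserve[i];
	return sum == total_reserve ? total_reserve : -1;
}
```

Calling run_demo() from a small main() (built with -pthread) exercises
the sketch. Without the mutex, two recalculations could interleave their
stores to reserve[] and total_reserve, which is the userspace analogue of
the inconsistency described above.
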
Fixes: 9726891fe753 ("mm/page_alloc: recalculate zone reserve pages when managed pages change")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
v1->v2:
- expand the changelog to explain why the theoretical race matters
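
Note on the locking helpers used in the diff: guard() and scoped_guard()
release the lock automatically when the enclosing scope (or the guarded
statement) ends, and are built on the compiler's cleanup attribute. A
rough userspace analogue, assuming GCC or Clang, is sketched below.
MUTEX_GUARD, mutex_guard_release, demo_lock, unlock_count and
guarded_increment are hypothetical names for illustration, not kernel
APIs.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int unlock_count;	/* counts releases, for demonstration only */

/* Cleanup handler: called with a pointer to the guard variable. */
static void mutex_guard_release(pthread_mutex_t **lockp)
{
	unlock_count++;		/* still under the lock here */
	pthread_mutex_unlock(*lockp);
}

/*
 * Hypothetical analogue of guard(): take the lock now, and release it
 * when the guard variable goes out of scope, on every exit path.
 */
#define MUTEX_GUARD(lock)						\
	pthread_mutex_t *guard_ptr_					\
		__attribute__((cleanup(mutex_guard_release), unused)) =	\
		(pthread_mutex_lock(lock), (lock))

/* Increment *value unless it has reached limit; returns 0 or -1. */
static int guarded_increment(int *value, int limit)
{
	MUTEX_GUARD(&demo_lock);

	if (*value >= limit)
		return -1;	/* early return: the lock is still released */
	(*value)++;
	return 0;
}
```

Both return paths of guarded_increment() drop the lock without an
explicit unlock call, which is why the converted functions in the diff
need no spin_unlock().
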
---
mm/page_alloc.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3a56825a7fc5..0989067da588 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6384,6 +6384,8 @@ static void calculate_totalreserve_pages(void)
trace_mm_calculate_totalreserve_pages(totalreserve_pages);
}
+static DEFINE_SPINLOCK(zone_reserve_lock);
+
/*
* setup_per_zone_lowmem_reserve - called whenever
* sysctl_lowmem_reserve_ratio changes. Ensures that each zone
@@ -6394,6 +6396,8 @@ static void setup_per_zone_lowmem_reserve(void)
{
struct pglist_data *pgdat;
enum zone_type i, j;
+
+ guard(spinlock_irqsave)(&zone_reserve_lock);
/*
* For a given zone node_zones[i], lowmem_reserve[j] (j > i)
* represents how many pages in zone i must effectively be kept
@@ -6509,11 +6513,9 @@ static void __setup_per_zone_wmarks(void)
void setup_per_zone_wmarks(void)
{
struct zone *zone;
- static DEFINE_SPINLOCK(lock);
- spin_lock(&lock);
- __setup_per_zone_wmarks();
- spin_unlock(&lock);
+ scoped_guard(spinlock_irqsave, &zone_reserve_lock)
+ __setup_per_zone_wmarks();
/*
* The watermark size have changed so update the pcpu batch
base-commit: e98d21c170b01ddef366f023bbfcf6b31509fa83
--
2.54.0