[PATCH v7 1/2] mm/memory hotplug: fix zone->contiguous always false when hotplug

Tianyou Li posted 2 patches 1 month, 2 weeks ago
There is a newer version of this series
Posted by Tianyou Li 1 month, 2 weeks ago
Function set_zone_contiguous used __pageblock_pfn_to_page to
check the whole pageblock is in the same zone. One assumption is
the memory section must online, otherwise the __pageblock_pfn_to_page
will return NULL, then the set_zone_contiguous will be false.
When move_pfn_range_to_zone invoked set_zone_contiguous, since the
memory section did not online, the return value will always be false.

To fix this issue, we removed the set_zone_contiguous from the
move_pfn_range_to_zone, and place it after memory section onlined.

Function remove_pfn_range_from_zone did not have this issue because
memory section remains online at the time set_zone_contiguous invoked.

Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
Signed-off-by: Tianyou Li <tianyou.li@intel.com>
---
 mm/memory_hotplug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a63ec679d861..ce6caf8674a5 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -782,8 +782,6 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
 	memmap_init_range(nr_pages, nid, zone_idx(zone), start_pfn, 0,
 			 MEMINIT_HOTPLUG, altmap, migratetype,
 			 isolate_pageblock);
-
-	set_zone_contiguous(zone);
 }
 
 struct auto_movable_stats {
@@ -1115,6 +1113,7 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 	if (nr_pages >= PAGES_PER_SECTION)
 	        online_mem_sections(pfn, ALIGN_DOWN(end_pfn, PAGES_PER_SECTION));
 
+	set_zone_contiguous(zone);
 	return ret;
 }
 
@@ -1205,6 +1204,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 	}
 
 	online_pages_range(pfn, nr_pages);
+	set_zone_contiguous(zone);
 	adjust_present_page_count(pfn_to_page(pfn), group, nr_pages);
 
 	if (node_arg.nid >= 0)
-- 
2.47.1
Re: [PATCH v7 1/2] mm/memory hotplug: fix zone->contiguous always false when hotplug
Posted by David Hildenbrand (Red Hat) 1 month ago
On 12/22/25 15:58, Tianyou Li wrote:
> Function set_zone_contiguous used __pageblock_pfn_to_page to
> check the whole pageblock is in the same zone. One assumption is
> the memory section must online, otherwise the __pageblock_pfn_to_page
> will return NULL, then the set_zone_contiguous will be false.
> When move_pfn_range_to_zone invoked set_zone_contiguous, since the
> memory section did not online, the return value will always be false.
> 
> To fix this issue, we removed the set_zone_contiguous from the
> move_pfn_range_to_zone, and place it after memory section onlined.
> 
> Function remove_pfn_range_from_zone did not have this issue because
> memory section remains online at the time set_zone_contiguous invoked.

The description is a bit hard to follow. Let me try:


"set_zone_contiguous() uses __pageblock_pfn_to_page() to detect 
pageblocks that either do not exist (hole) or that do not belong to the 
same zone.

__pageblock_pfn_to_page(), however, relies on pfn_to_online_page(), 
effectively always returning NULL for memory ranges that were not 
onlined yet. So when called on a range-to-be-onlined, it indicates a 
memory hole to set_zone_contiguous().

Consequently, the set_zone_contiguous() call in 
move_pfn_range_to_zone(), which happens early during memory onlining, 
will never detect a zone as being contiguous. Bad.

To fix the issue, move the set_zone_contiguous() call to a later stage
in memory onlining, where pfn_to_online_page() will succeed: after we
mark the memory sections to be online"


Now, there is no need to add the handling to 
mhp_init_memmap_on_memory(). Note how mhp_init_memmap_on_memory() in 
memory_block_online() is always followed by online_pages().

So, it's sufficient to move it after the online_pages_range(). I would 
also add a comment there saying something like:

/*
 * Now that the ranges are indicated as online, check whether the whole
 * zone is contiguous.
 */


Can we find some Fixes: tag (which commit introduced the regression)? 
Likely we want to CC stable.

-- 
Cheers

David
Re: [PATCH v7 1/2] mm/memory hotplug: fix zone->contiguous always false when hotplug
Posted by Li, Tianyou 1 month ago
Thank you very much for your review, David! Your suggestions are clear,
and the code and comments you posted are well formatted; I can even copy
and paste them without modification. Thanks.


Regards,

Tianyou

On 1/7/2026 4:03 AM, David Hildenbrand (Red Hat) wrote:
> On 12/22/25 15:58, Tianyou Li wrote:
>> Function set_zone_contiguous used __pageblock_pfn_to_page to
>> check the whole pageblock is in the same zone. One assumption is
>> the memory section must online, otherwise the __pageblock_pfn_to_page
>> will return NULL, then the set_zone_contiguous will be false.
>> When move_pfn_range_to_zone invoked set_zone_contiguous, since the
>> memory section did not online, the return value will always be false.
>>
>> To fix this issue, we removed the set_zone_contiguous from the
>> move_pfn_range_to_zone, and place it after memory section onlined.
>>
>> Function remove_pfn_range_from_zone did not have this issue because
>> memory section remains online at the time set_zone_contiguous invoked.
>
> The description is a bit hard to follow. Let me try:
>
>
> "set_zone_contiguous() uses __pageblock_pfn_to_page() to detect 
> pageblocks that either do not exist (hole) or that do not belong to 
> the same zone.
>
> __pageblock_pfn_to_page(), however, relies on pfn_to_online_page(), 
> effectively always returning NULL for memory ranges that were not 
> onlined yet. So when called on a range-to-be-onlined, it indicates a 
> memory hole to set_zone_contiguous().
>
> Consequently, the set_zone_contiguous() call in 
> move_pfn_range_to_zone(), which happens early during memory onlining, 
> will never detect a zone as being contiguous. Bad.
>
> To fix the issue, move the set_zone_contiguous() call to a later stage
> in memory onlining, where pfn_to_online_page() will succeed: after we
> mark the memory sections to be online"
>
Will change accordingly.  Thanks.


>
> Now, there is no need to add the handling to 
> mhp_init_memmap_on_memory(). Note how mhp_init_memmap_on_memory() in 
> memory_block_online() is always followed by online_pages().
>
Plus, set_zone_contiguous() no longer depends on the previous zone 
contiguous state, so it totally makes sense to remove the 
set_zone_contiguous() call in mhp_init_memmap_on_memory() as you suggested.


> So, it's sufficient to move it after the online_pages_range(). I would 
> also add a comment there saying something like:
>
> /*
>  * Now that the ranges are indicated as online, check whether the whole
>  * zone is contiguous.
>  */
>
Will change accordingly. Thanks.


>
> Can we find some Fixes: tag (which commit introduced the regression)? 
> Likely we want to CC stable.
>
Yes, we can probably add the tags below. That commit changed 
pfn_to_page() to pfn_to_online_page() in __pageblock_pfn_to_page().

Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes")
Cc: Michal Hocko <mhocko@suse.com>