[PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages()

David Hildenbrand (Arm) posted 15 patches 2 weeks ago
Posted by David Hildenbrand (Arm) 2 weeks ago
If a hugetlb folio gets freed while we are in scan_movable_pages(),
folio_nr_pages() could return 0. We would then OR "0 - 1 = -1" into
the PFN and end up with PFN = -1. We're not holding any locks or
references that would prevent that.

for_each_valid_pfn() would then search for the next valid PFN, and could
return a PFN that is outside of the originally requested range.
do_migrate_range() would then try to migrate quite a big range, which is
certainly undesirable.
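
To make the wrap-around concrete, here is a minimal userspace sketch of
the skip step. It is not kernel code: skip_to_folio_end() and
check_skip_examples() are hypothetical names used only for illustration,
and nr_pages simply stands in for whatever folio_nr_pages() returned.

```c
#include <assert.h>

/*
 * Userspace model of the skip step "pfn |= nr_pages - 1".
 * Hypothetical helper, not part of the kernel.
 */
static unsigned long skip_to_folio_end(unsigned long pfn,
				       unsigned long nr_pages)
{
	/* Unsigned arithmetic: for nr_pages == 0 this wraps to ~0UL. */
	return pfn | (nr_pages - 1);
}

/* Self-check covering a sane folio size and the racy zero case. */
static void check_skip_examples(void)
{
	/* 512-page folio starting at PFN 0x1000: land on its last PFN. */
	assert(skip_to_folio_end(0x1000UL, 512UL) == 0x11ffUL);
	/* Scanning from the middle of an 8-page folio works as well. */
	assert(skip_to_folio_end(0x1003UL, 8UL) == 0x1007UL);
	/* nr_pages == 0: every bit gets set, i.e. PFN = -1. */
	assert(skip_to_folio_end(0x1000UL, 0UL) == ~0UL);
}
```

With a power-of-two size and a naturally aligned folio, the OR lands on
the last PFN of the folio. With a size of 0 it sets every bit instead.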

To fix it, simply test for valid folio_nr_pages() values. While at it,
as PageHuge() really just does a page_folio() internally, we can just
use folio_test_hugetlb() on the folio directly.

scan_movable_pages() is expected to be fast, and we try to avoid taking
locks or grabbing references. We cannot use folio_try_get() as that does
not work for free hugetlb folios. We could grab the hugetlb_lock, but
that just adds complexity.

The race is unlikely to trigger in practice, so we won't be CCing
stable.

Fixes: 16540dae959d ("mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()")
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
---
 mm/memory_hotplug.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 86d3faf50453..969cd7ddf68f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1747,6 +1747,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 	unsigned long pfn;
 
 	for_each_valid_pfn(pfn, start, end) {
+		unsigned long nr_pages;
 		struct page *page;
 		struct folio *folio;
 
@@ -1763,9 +1764,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 		if (PageOffline(page) && page_count(page))
 			return -EBUSY;
 
-		if (!PageHuge(page))
-			continue;
 		folio = page_folio(page);
+		if (!folio_test_hugetlb(folio))
+			continue;
 		/*
 		 * This test is racy as we hold no reference or lock.  The
 		 * hugetlb page could have been free'ed and head is no longer
@@ -1775,7 +1776,11 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
 		 */
 		if (folio_test_hugetlb_migratable(folio))
 			goto found;
-		pfn |= folio_nr_pages(folio) - 1;
+		nr_pages = folio_nr_pages(folio);
+		if (unlikely(nr_pages < 1 || nr_pages > MAX_FOLIO_NR_PAGES ||
+			     !is_power_of_2(nr_pages)))
+			continue;
+		pfn |= nr_pages - 1;
 	}
 	return -ENOENT;
 found:

-- 
2.43.0
Re: [PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages()
Posted by Lorenzo Stoakes (Oracle) 1 week, 4 days ago
On Fri, Mar 20, 2026 at 11:13:33PM +0100, David Hildenbrand (Arm) wrote:
> If a hugetlb folio gets freed while we are in scan_movable_pages(),
> folio_nr_pages() could return 0, resulting in or'ing "0 - 1 = -1"
> to the PFN, resulting in PFN = -1. We're not holding any locks or
> references that would prevent that.
>
> for_each_valid_pfn() would then search for the next valid PFN, and could
> return a PFN that is outside of the range of the original requested
> range. do_migrate_page() would then try to migrate quite a big range,
> which is certainly undesirable.
>
> To fix it, simply test for valid folio_nr_pages() values. While at it,
> as PageHuge() really just does a page_folio() internally, we can just
> use folio_test_hugetlb() on the folio directly.
>
> scan_movable_pages() is expected to be fast, and we try to avoid taking
> locks or grabbing references. We cannot use folio_try_get() as that does
> not work for free hugetlb folios. We could grab the hugetlb_lock, but
> that just adds complexity.
>
> The race is unlikely to trigger in practice, so we won't be CCing
> stable.
>
> Fixes: 16540dae959d ("mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()")
> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>

Logic looks right to me, though some nits below. With those accounted for:

Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

> ---
>  mm/memory_hotplug.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 86d3faf50453..969cd7ddf68f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1747,6 +1747,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  	unsigned long pfn;
>
>  	for_each_valid_pfn(pfn, start, end) {
> +		unsigned long nr_pages;
>  		struct page *page;
>  		struct folio *folio;
>
> @@ -1763,9 +1764,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  		if (PageOffline(page) && page_count(page))
>  			return -EBUSY;
>
> -		if (!PageHuge(page))

Yeah interesting to see this is folio_test_hugetlb(page_folio(page)) :))

So this is a nice change for sure.

> -			continue;
>  		folio = page_folio(page);
> +		if (!folio_test_hugetlb(folio))
> +			continue;
>  		/*
>  		 * This test is racy as we hold no reference or lock.  The
>  		 * hugetlb page could have been free'ed and head is no longer
> @@ -1775,7 +1776,11 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>  		 */
>  		if (folio_test_hugetlb_migratable(folio))
>  			goto found;
> -		pfn |= folio_nr_pages(folio) - 1;
> +		nr_pages = folio_nr_pages(folio);
> +		if (unlikely(nr_pages < 1 || nr_pages > MAX_FOLIO_NR_PAGES ||

NIT: since nr_pages is an unsigned long, would this be better as !nr_pages || ...?

> +			     !is_power_of_2(nr_pages)))

Could the latter two conditions ever really happen? I guess some weird tearing
or something maybe?

It would also be nice to maybe separate this out as is_valid_nr_pages() or
something, but then again, I suppose given this is a rare case of us
checking this under circumstances where the value might not be valid, maybe
not worth it.
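
For illustration only, such a helper might look roughly like the
userspace sketch below. All sketch_* names are hypothetical, and
SKETCH_MAX_FOLIO_NR_PAGES is an assumed placeholder value rather than the
kernel's MAX_FOLIO_NR_PAGES, which depends on the configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed limit for this sketch only; not the kernel's definition. */
#define SKETCH_MAX_FOLIO_NR_PAGES (1UL << 18)

/* Userspace stand-in for a power-of-two test: exactly one bit set. */
static bool sketch_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper mirroring the three conditions in the patch:
 * at least one page, not above the maximum, and a power of two.
 */
static bool sketch_is_valid_nr_pages(unsigned long nr_pages)
{
	return nr_pages >= 1 &&
	       nr_pages <= SKETCH_MAX_FOLIO_NR_PAGES &&
	       sketch_is_power_of_2(nr_pages);
}

/* Self-check for the cases discussed in this thread. */
static void sketch_check_nr_pages(void)
{
	assert(sketch_is_valid_nr_pages(1UL));
	assert(sketch_is_valid_nr_pages(512UL));
	assert(!sketch_is_valid_nr_pages(0UL));	/* freed folio */
	assert(!sketch_is_valid_nr_pages(3UL));	/* not a power of two */
	assert(!sketch_is_valid_nr_pages(SKETCH_MAX_FOLIO_NR_PAGES << 1));
}
```

The caller would then continue with the next PFN whenever the helper
returns false, as the open-coded check in the patch does.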

> +			continue;
> +		pfn |= nr_pages - 1;
>  	}
>  	return -ENOENT;
>  found:
>
> --
> 2.43.0
>

Cheers, Lorenzo
Re: [PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages()
Posted by David Hildenbrand (Arm) 1 week, 4 days ago
On 3/23/26 14:26, Lorenzo Stoakes (Oracle) wrote:
> On Fri, Mar 20, 2026 at 11:13:33PM +0100, David Hildenbrand (Arm) wrote:
>> If a hugetlb folio gets freed while we are in scan_movable_pages(),
>> folio_nr_pages() could return 0, resulting in or'ing "0 - 1 = -1"
>> to the PFN, resulting in PFN = -1. We're not holding any locks or
>> references that would prevent that.
>>
>> for_each_valid_pfn() would then search for the next valid PFN, and could
>> return a PFN that is outside of the range of the original requested
>> range. do_migrate_page() would then try to migrate quite a big range,
>> which is certainly undesirable.
>>
>> To fix it, simply test for valid folio_nr_pages() values. While at it,
>> as PageHuge() really just does a page_folio() internally, we can just
>> use folio_test_hugetlb() on the folio directly.
>>
>> scan_movable_pages() is expected to be fast, and we try to avoid taking
>> locks or grabbing references. We cannot use folio_try_get() as that does
>> not work for free hugetlb folios. We could grab the hugetlb_lock, but
>> that just adds complexity.
>>
>> The race is unlikely to trigger in practice, so we won't be CCing
>> stable.
>>
>> Fixes: 16540dae959d ("mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()")
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> 
> Logic looks right to me, though some nits below. With those accounted for:
> 
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> 
>> ---
>>  mm/memory_hotplug.c | 11 ++++++++---
>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 86d3faf50453..969cd7ddf68f 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1747,6 +1747,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  	unsigned long pfn;
>>
>>  	for_each_valid_pfn(pfn, start, end) {
>> +		unsigned long nr_pages;
>>  		struct page *page;
>>  		struct folio *folio;
>>
>> @@ -1763,9 +1764,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		if (PageOffline(page) && page_count(page))
>>  			return -EBUSY;
>>
>> -		if (!PageHuge(page))
> 
> Yeah interesting to see this is folio_test_hugetlb(page_folio(page)) :))
> 
> So this is a nice change for sure.
> 
>> -			continue;
>>  		folio = page_folio(page);
>> +		if (!folio_test_hugetlb(folio))
>> +			continue;
>>  		/*
>>  		 * This test is racy as we hold no reference or lock.  The
>>  		 * hugetlb page could have been free'ed and head is no longer
>> @@ -1775,7 +1776,11 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
>>  		 */
>>  		if (folio_test_hugetlb_migratable(folio))
>>  			goto found;
>> -		pfn |= folio_nr_pages(folio) - 1;
>> +		nr_pages = folio_nr_pages(folio);
>> +		if (unlikely(nr_pages < 1 || nr_pages > MAX_FOLIO_NR_PAGES ||
> 
> NIT: since nr_pages is an unsigned long, would this be better as !nr_pages || ...?

It's easier on the brain when spotting that only a given range is
allowed, without having to remember the exact type of the variable :)

So I guess it doesn't really make a difference in the end.

> 
>> +			     !is_power_of_2(nr_pages)))
> 
> Could the latter two conditions ever really happen? I guess some weird tearing
> or something maybe?

Yes, or when the field gets reused for something else.

> 
> It would also be nice to maybe separate this out as is_valid_nr_pages() or
> something, but then again, I suppose given this is a rare case of us
> checking this under circumstances where the value might not be valid, maybe
> not worth it.

I had the same thought. But this code is so special regarding
raciness that I hope nobody else will really require this ... and if
they do, they might be doing something wrong :)

-- 
Cheers,

David
Re: [PATCH v2 01/15] mm/memory_hotplug: fix possible race in scan_movable_pages()
Posted by Lorenzo Stoakes (Oracle) 1 week, 4 days ago
On Mon, Mar 23, 2026 at 02:40:16PM +0100, David Hildenbrand (Arm) wrote:
> On 3/23/26 14:26, Lorenzo Stoakes (Oracle) wrote:
> > On Fri, Mar 20, 2026 at 11:13:33PM +0100, David Hildenbrand (Arm) wrote:
> >> If a hugetlb folio gets freed while we are in scan_movable_pages(),
> >> folio_nr_pages() could return 0, resulting in or'ing "0 - 1 = -1"
> >> to the PFN, resulting in PFN = -1. We're not holding any locks or
> >> references that would prevent that.
> >>
> >> for_each_valid_pfn() would then search for the next valid PFN, and could
> >> return a PFN that is outside of the range of the original requested
> >> range. do_migrate_page() would then try to migrate quite a big range,
> >> which is certainly undesirable.
> >>
> >> To fix it, simply test for valid folio_nr_pages() values. While at it,
> >> as PageHuge() really just does a page_folio() internally, we can just
> >> use folio_test_hugetlb() on the folio directly.
> >>
> >> scan_movable_pages() is expected to be fast, and we try to avoid taking
> >> locks or grabbing references. We cannot use folio_try_get() as that does
> >> not work for free hugetlb folios. We could grab the hugetlb_lock, but
> >> that just adds complexity.
> >>
> >> The race is unlikely to trigger in practice, so we won't be CCing
> >> stable.
> >>
> >> Fixes: 16540dae959d ("mm/hugetlb: mm/memory_hotplug: use a folio in scan_movable_pages()")
> >> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> >
> > Logic looks right to me, though some nits below. With those accounted for:
> >
> > Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> >
> >> ---
> >>  mm/memory_hotplug.c | 11 ++++++++---
> >>  1 file changed, 8 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >> index 86d3faf50453..969cd7ddf68f 100644
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -1747,6 +1747,7 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
> >>  	unsigned long pfn;
> >>
> >>  	for_each_valid_pfn(pfn, start, end) {
> >> +		unsigned long nr_pages;
> >>  		struct page *page;
> >>  		struct folio *folio;
> >>
> >> @@ -1763,9 +1764,9 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
> >>  		if (PageOffline(page) && page_count(page))
> >>  			return -EBUSY;
> >>
> >> -		if (!PageHuge(page))
> >
> > Yeah interesting to see this is folio_test_hugetlb(page_folio(page)) :))
> >
> > So this is a nice change for sure.
> >
> >> -			continue;
> >>  		folio = page_folio(page);
> >> +		if (!folio_test_hugetlb(folio))
> >> +			continue;
> >>  		/*
> >>  		 * This test is racy as we hold no reference or lock.  The
> >>  		 * hugetlb page could have been free'ed and head is no longer
> >> @@ -1775,7 +1776,11 @@ static int scan_movable_pages(unsigned long start, unsigned long end,
> >>  		 */
> >>  		if (folio_test_hugetlb_migratable(folio))
> >>  			goto found;
> >> -		pfn |= folio_nr_pages(folio) - 1;
> >> +		nr_pages = folio_nr_pages(folio);
> >> +		if (unlikely(nr_pages < 1 || nr_pages > MAX_FOLIO_NR_PAGES ||
> >
> > NIT: since nr_pages is an unsigned long, would this be better as !nr_pages || ...?
>
> It's easier on the brain when spotting that only a given range is
> allowed, without having to remember the exact type of the variable :)

Yeah it's not a big deal!

>
> So I guess it doesn't really make a difference in the end.
>
> >
> >> +			     !is_power_of_2(nr_pages)))
> >
> > Could the latter two conditions ever really happen? I guess some weird tearing
> > or something maybe?
>
> Yes, or when the fields gets reused for something else.
>
> >
> > It would also be nice to maybe separate this out as is_valid_nr_pages() or
> > something, but then again, I suppose given this is a rare case of us
> > checking this under circumstances where the value might not be valid, maybe
> > not worth it.
>
> I had the same thought. But this code is way too special regarding
> raciness that I hope nobody else will really require this ... and if
> they do, they might be doing something wrong :)

Yeah for sure, it does seem unique to this situation, so probably not worth
it!

>
> --
> Cheers,
>
> David

Cheers, Lorenzo