[RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Gregory Price 3 months, 2 weeks ago

We presently skip regions with hugepages entirely when trying to do
contiguous page allocation.  Instead, if hugepage migration is enabled,
consider regions with hugepages smaller than the requested allocation.

Compaction `isolate_migrate_pages_block()` already expects requests
with hugepages to originate from alloc_contig, and hugetlb code also
does a migratable check when isolating in `folio_isolate_hugetlb()`.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
 mm/page_alloc.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 600d9e981c23..da2e65bf63e3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
 		if (PageReserved(page))
 			return false;
 
-		if (PageHuge(page))
-			return false;
+		if (PageHuge(page)) {
+			unsigned int order;
+
+			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
+				return false;
+
+			/* Don't consider moving same size/larger pages */
+			page = compound_head(page);
+			order = compound_order(page);
+			if ((order >= MAX_PAGE_ORDER) ||
+			    (nr_pages < (1 << order)))
+				return false;
+		}
 	}
 	return true;
 }
-- 
2.51.0

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by SeongJae Park 3 months, 2 weeks ago

On Mon, 20 Oct 2025 17:08:16 -0400 Gregory Price <gourry@gourry.net> wrote:

> We presently skip regions with hugepages entirely when trying to do
> contiguous page allocation.  Instead, if hugepage migration is enabled,
> consider regions with hugepages smaller than the requested allocation.
> 
> Compaction `isolate_migrate_pages_block()` already expects requests
> with hugepages to originate from alloc_contig, and hugetlb code also
> does a migratable check when isolating in `folio_isolate_hugetlb()`.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>  mm/page_alloc.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 600d9e981c23..da2e65bf63e3 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>  		if (PageReserved(page))
>  			return false;
>  
> -		if (PageHuge(page))
> -			return false;
> +		if (PageHuge(page)) {
> +			unsigned int order;
> +
> +			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
> +				return false;
> +
> +			/* Don't consider moving same size/larger pages */
> +			page = compound_head(page);
> +			order = compound_order(page);
> +			if ((order >= MAX_PAGE_ORDER) ||
> +			    (nr_pages < (1 << order)))
> +				return false;

Shouldn't the comparison of 'nr_pages' against '1 << order' use '<=' instead of
'<', to match the commit description?
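
To make the boundary case concrete, here is a minimal user-space sketch, not kernel code. The helper names `skip_range_v2()` and `skip_range_suggested()` are hypothetical; they model only the size comparison from the hunk above and assume `order` is the hugepage order and `nr_pages` the size of the contiguous request.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical user-space model of the size comparison; not the kernel function. */

/* Posted v2 check: skip only when the hugepage is strictly larger than the request. */
static bool skip_range_v2(unsigned long nr_pages, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT); /* keep the shift defined */
	return nr_pages < (1UL << order);
}

/* Suggested check: also skip when the hugepage is the same size as the request. */
static bool skip_range_suggested(unsigned long nr_pages, unsigned int order)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT); /* keep the shift defined */
	return nr_pages <= (1UL << order);
}
```

For a 512-page request and an order-9 hugepage (512 pages), the v2 form does not skip the range while the suggested form does, which is what the "same size/larger" comment and the commit description ask for. The two forms agree for every other size.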

> +		}
>  	}
>  	return true;
>  }
> -- 
> 2.51.0


Thanks,
SJ

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Gregory Price 3 months, 2 weeks ago

On Tue, Oct 21, 2025 at 06:40:57PM -0700, SeongJae Park wrote:
> On Mon, 20 Oct 2025 17:08:16 -0400 Gregory Price <gourry@gourry.net> wrote:
> > +
> > +			/* Don't consider moving same size/larger pages */
> > +			page = compound_head(page);
> > +			order = compound_order(page);
> > +			if ((order >= MAX_PAGE_ORDER) ||
> > +			    (nr_pages < (1 << order)))
> > +				return false;
> 
> Shouldn't the comparison of 'nr_pages' against '1 << order' use '<=' instead of
> '<', to match the commit description?
> 

Good catch, got lost in the shuffle.  Thank you!

~Gregory

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by David Hildenbrand 3 months, 2 weeks ago

On 20.10.25 23:08, Gregory Price wrote:
> We presently skip regions with hugepages entirely when trying to do
> contiguous page allocation.  Instead, if hugepage migration is enabled,
> consider regions with hugepages smaller than the requested allocation.
> 
> Compaction `isolate_migrate_pages_block()` already expects requests

Please, let's not talk about "compaction" here. It's confusing to use 
that term for something that is not compaction and merely reuses some of 
its primitives (because they are not properly separated yet).

Just say "isolate_migrate_pages_block() already expects ..."

> with hugepages to originate from alloc_contig, and hugetlb code also
> does a migratable check when isolating in `folio_isolate_hugetlb()`.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>   mm/page_alloc.c | 15 +++++++++++++--
>   1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 600d9e981c23..da2e65bf63e3 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>   		if (PageReserved(page))
>   			return false;
>   
> -		if (PageHuge(page))
> -			return false;
> +		if (PageHuge(page)) {
> +			unsigned int order;
> +
> +			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
> +				return false;
> +
> +			/* Don't consider moving same size/larger pages */
> +			page = compound_head(page);
> +			order = compound_order(page);
> +			if ((order >= MAX_PAGE_ORDER) ||
> +			    (nr_pages < (1 << order)))
> +				return false;

This is roughly what we do in pageblock_skip_persistent(), just with a 
hardcoded pageblock size.

I'm not sure about the MAX_PAGE_ORDER check, though. If an arch supports 
two hugetlb sizes that exceed MAX_PAGE_ORDER, it would not work as expected.

Doesn't arm64 support that with cont-PMD vs. PUD hugetlb folios? 
MAX_FOLIO_ORDER would be better.
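
A rough user-space sketch of the concern, under stated assumptions: the constants below are illustrative values for arm64 with 4K base pages (cont-PMD hugetlb at order 13, PUD hugetlb at order 18, buddy limit at order 10), and `TOY_LARGER_CAP` is only a stand-in for a cap such as MAX_FOLIO_ORDER, whose real value depends on architecture and configuration. `toy_hugepage_ok()` is a hypothetical helper that mirrors the posted check with the order cap made a parameter.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Illustrative values only (assumed: arm64, 4K base pages, default buddy
 * limit). Real values depend on architecture and kernel configuration.
 */
#define TOY_MAX_PAGE_ORDER  10u            /* largest buddy allocator order */
#define TOY_CONT_PMD_ORDER  13u            /* 32M hugetlb folio */
#define TOY_PUD_ORDER       18u            /* 1G hugetlb folio */
#define TOY_LARGER_CAP      TOY_PUD_ORDER  /* stand-in for a larger cap */

/*
 * Returns true when a range holding a hugepage of the given order may still
 * be considered. Mirrors the posted check, with the order limit passed in.
 */
static bool toy_hugepage_ok(unsigned long nr_pages, unsigned int order,
			    unsigned int order_cap)
{
	assert(order < sizeof(unsigned long) * CHAR_BIT); /* keep the shift defined */
	if (order >= order_cap)
		return false;
	if (nr_pages < (1UL << order))
		return false;
	return true;
}
```

With these values, a 1G request (`1UL << TOY_PUD_ORDER` pages) over a range containing a cont-PMD hugepage is rejected when the cap is `TOY_MAX_PAGE_ORDER`, even though the hugepage is smaller than the request. It is accepted once the cap is raised to `TOY_LARGER_CAP`.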

-- 
Cheers

David / dhildenb

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Gregory Price 3 months, 2 weeks ago

On Tue, Oct 21, 2025 at 11:30:48AM +0200, David Hildenbrand wrote:
> > Compaction `isolate_migrate_pages_block()` already expects requests
> 
> Please, let's not talk about "compaction" here, it's just confusing to talk
> about compaction for something that is not compaction but uses some
> primitives (because not properly separated yet)
> 
> Just say "isolate_migrate_pages_block() already expects ..."
>

will do

> > +			/* Don't consider moving same size/larger pages */
> > +			page = compound_head(page);
> > +			order = compound_order(page);
> > +			if ((order >= MAX_PAGE_ORDER) ||
> > +			    (nr_pages < (1 << order)))
> > +				return false;
> 
> This is roughly what we do in pageblock_skip_persistent(), just with a
> hardcoded pageblock size.
> 
> I'm not sure about the MAX_PAGE_ORDER check, though. If an arch supports two
> hugetlb sizes that exceed MAX_PAGE_ORDER, it would not work as expected.
> 
> Doesn't arm64 support that with cont-PMD vs. PUD hugetlb folios?
> MAX_FOLIO_ORDER would be better.

Will adjust to MAX_FOLIO_ORDER, makes sense.  Missed that.  Will take a
look at the pageblock code, I think I was looking at the compaction and
some other code for inspiration here.

~Gregory

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Wei Yang 3 months, 2 weeks ago

On Mon, Oct 20, 2025 at 05:08:16PM -0400, Gregory Price wrote:
>We presently skip regions with hugepages entirely when trying to do
>contiguous page allocation.  Instead, if hugepage migration is enabled,
>consider regions with hugepages smaller than the requested allocation.
>
>Compaction `isolate_migrate_pages_block()` already expects requests

isolate_migratepages_block() ?

>with hugepages to originate from alloc_contig, and hugetlb code also
>does a migratable check when isolating in `folio_isolate_hugetlb()`.
>
>Suggested-by: David Hildenbrand <david@redhat.com>
>Signed-off-by: Gregory Price <gourry@gourry.net>
>---
> mm/page_alloc.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
>
>diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>index 600d9e981c23..da2e65bf63e3 100644
>--- a/mm/page_alloc.c
>+++ b/mm/page_alloc.c
>@@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
> 		if (PageReserved(page))
> 			return false;
> 
>-		if (PageHuge(page))
>-			return false;
>+		if (PageHuge(page)) {
>+			unsigned int order;
>+
>+			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
>+				return false;
>+
>+			/* Don't consider moving same size/larger pages */
>+			page = compound_head(page);
>+			order = compound_order(page);

How about using folio_large_order(page_folio(page)) ?

>+			if ((order >= MAX_PAGE_ORDER) ||
>+			    (nr_pages < (1 << order)))
>+				return false;
>+		}
> 	}
> 	return true;
> }
>-- 
>2.51.0
>

Generally LGTM

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

-- 
Wei Yang
Help you, Help me

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Wei Yang 3 months, 2 weeks ago

On Tue, Oct 21, 2025 at 01:25:33AM +0000, Wei Yang wrote:
>On Mon, Oct 20, 2025 at 05:08:16PM -0400, Gregory Price wrote:
>>We presently skip regions with hugepages entirely when trying to do
>>contiguous page allocation.  Instead, if hugepage migration is enabled,
>>consider regions with hugepages smaller than the requested allocation.
>>
>>Compaction `isolate_migrate_pages_block()` already expects requests
>
>isolate_migratepages_block() ?
>
>>with hugepages to originate from alloc_contig, and hugetlb code also
>>does a migratable check when isolating in `folio_isolate_hugetlb()`.
>>
>>Suggested-by: David Hildenbrand <david@redhat.com>
>>Signed-off-by: Gregory Price <gourry@gourry.net>
>>---
>> mm/page_alloc.c | 15 +++++++++++++--
>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>
>>diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>index 600d9e981c23..da2e65bf63e3 100644
>>--- a/mm/page_alloc.c
>>+++ b/mm/page_alloc.c
>>@@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>> 		if (PageReserved(page))
>> 			return false;
>> 
>>-		if (PageHuge(page))
>>-			return false;
>>+		if (PageHuge(page)) {
>>+			unsigned int order;
>>+
>>+			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
>>+				return false;
>>+
>>+			/* Don't consider moving same size/larger pages */
>>+			page = compound_head(page);
>>+			order = compound_order(page);
>
>How about using folio_large_order(page_folio(page)) ?
>

Hmm... I see the discussion. Forget about it.

>>+			if ((order >= MAX_PAGE_ORDER) ||
>>+			    (nr_pages < (1 << order)))
>>+				return false;
>>+		}
>> 	}
>> 	return true;
>> }
>>-- 
>>2.51.0
>>
>
>Generally LGTM
>
>Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>
>-- 
>Wei Yang
>Help you, Help me

-- 
Wei Yang
Help you, Help me

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Zi Yan 3 months, 2 weeks ago

On 20 Oct 2025, at 21:25, Wei Yang wrote:

> On Mon, Oct 20, 2025 at 05:08:16PM -0400, Gregory Price wrote:
>> We presently skip regions with hugepages entirely when trying to do
>> contiguous page allocation.  Instead, if hugepage migration is enabled,
>> consider regions with hugepages smaller than the requested allocation.
>>
>> Compaction `isolate_migrate_pages_block()` already expects requests
>
> isolate_migratepages_block() ?
>
>> with hugepages to originate from alloc_contig, and hugetlb code also
>> does a migratable check when isolating in `folio_isolate_hugetlb()`.
>>
>> Suggested-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Gregory Price <gourry@gourry.net>
>> ---
>> mm/page_alloc.c | 15 +++++++++++++--
>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 600d9e981c23..da2e65bf63e3 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>> 		if (PageReserved(page))
>> 			return false;
>>
>> -		if (PageHuge(page))
>> -			return false;
>> +		if (PageHuge(page)) {
>> +			unsigned int order;
>> +
>> +			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
>> +				return false;
>> +
>> +			/* Don't consider moving same size/larger pages */
>> +			page = compound_head(page);
>> +			order = compound_order(page);
>
> How about using folio_large_order(page_folio(page)) ?

See the discussion at https://lore.kernel.org/linux-mm/609E7E01-33A6-4931-AC89-1F4B2944FB64@nvidia.com/ about reading folio order without locking.

>
>> +			if ((order >= MAX_PAGE_ORDER) ||
>> +			    (nr_pages < (1 << order)))
>> +				return false;
>> +		}
>> 	}
>> 	return true;
>> }
>> -- 
>> 2.51.0
>>
>
> Generally LGTM
>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>
> -- 
> Wei Yang
> Help you, Help me


--
Best Regards,
Yan, Zi

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc

Posted by Wei Yang 3 months, 2 weeks ago

On Mon, Oct 20, 2025 at 09:28:18PM -0400, Zi Yan wrote:
>On 20 Oct 2025, at 21:25, Wei Yang wrote:
>
>> On Mon, Oct 20, 2025 at 05:08:16PM -0400, Gregory Price wrote:
>>> We presently skip regions with hugepages entirely when trying to do
>>> contiguous page allocation.  Instead, if hugepage migration is enabled,
>>> consider regions with hugepages smaller than the requested allocation.
>>>
>>> Compaction `isolate_migrate_pages_block()` already expects requests
>>
>> isolate_migratepages_block() ?
>>
>>> with hugepages to originate from alloc_contig, and hugetlb code also
>>> does a migratable check when isolating in `folio_isolate_hugetlb()`.
>>>
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Gregory Price <gourry@gourry.net>
>>> ---
>>> mm/page_alloc.c | 15 +++++++++++++--
>>> 1 file changed, 13 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 600d9e981c23..da2e65bf63e3 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -7048,8 +7048,19 @@ static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
>>> 		if (PageReserved(page))
>>> 			return false;
>>>
>>> -		if (PageHuge(page))
>>> -			return false;
>>> +		if (PageHuge(page)) {
>>> +			unsigned int order;
>>> +
>>> +			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
>>> +				return false;
>>> +
>>> +			/* Don't consider moving same size/larger pages */
>>> +			page = compound_head(page);
>>> +			order = compound_order(page);
>>
>> How about using folio_large_order(page_folio(page)) ?
>
>See the discussion at https://lore.kernel.org/linux-mm/609E7E01-33A6-4931-AC89-1F4B2944FB64@nvidia.com/ about reading folio order without locking.
>

I took another look at this.

The above code is expanded to:

  page = _compound_head(page);
  order = compound_order(page);
      folio = page;
      test_bit(PG_head, &folio->flags.f);
      return folio_large_order(folio);

If we use

  order = folio_order(page_folio(page))

It is expanded to:

  folio = _compound_head(page);
  folio_test_large(folio)
      folio_test_head(folio)
          test_bit(PG_head, &folio->flags.f)
  return folio_large_order(folio);

These two seem to expand to the same code.

Maybe we can use folio_order() here?

>>
>>> +			if ((order >= MAX_PAGE_ORDER) ||
>>> +			    (nr_pages < (1 << order)))
>>> +				return false;
>>> +		}
>>> 	}
>>> 	return true;
>>> }
>>> -- 
>>> 2.51.0
>>>
>>
>> Generally LGTM
>>
>> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
>>
>> -- 
>> Wei Yang
>> Help you, Help me
>
>
>--
>Best Regards,
>Yan, Zi

-- 
Wei Yang
Help you, Help me

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc

Posted by Gregory Price 3 months, 2 weeks ago

On Tue, Oct 21, 2025 at 09:14:43AM +0000, Wei Yang wrote:
> 
> The above code is expanded to:
> 
>   page = _compound_head(page);
>   order = compound_order(page);
>       folio = page;
>       test_bit(PG_head, &folio->flags.f);
>       return folio_large_order(folio);
> 
> If we use
> 
>   order = folio_order(page_folio(page))
> 
> It is expanded to:
> 
>   folio = _compound_head(page);
>   folio_test_large(folio)
>       folio_test_head(folio)
>           test_bit(PG_head, &folio->flags.f)
>   return folio_large_order(folio);
> 
> These two seems expanded to the same code.
> 

folio_test_large() drops into const_folio_flags(), which does
a VM_BUG_ON_PGFLAGS(), so the logic is not the same here.
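
The difference can be modelled in user space roughly as follows. This is a toy sketch, not the kernel definitions: `struct toy_page`, `toy_compound_order()`, `toy_folio_order()` and `toy_debug_hits` are all hypothetical. It also assumes the debug assertion fires when the page turns out to be a tail page; the thread does not spell out that condition. The model counts the hit instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins; none of these are kernel definitions. */
struct toy_page {
	bool head;           /* models PG_head */
	bool tail;           /* models "this page is a tail page" */
	unsigned int order;  /* models the stored large-folio order */
};

/* Counts how often the modelled debug assertion would have fired. */
static unsigned int toy_debug_hits;

/* Models the compound_order() expansion: a plain flag test, no assertion. */
static unsigned int toy_compound_order(const struct toy_page *p)
{
	assert(p);
	if (!p->head)
		return 0;
	return p->order;
}

/* Models the folio_order() expansion: a debug check precedes the flag test. */
static unsigned int toy_folio_order(const struct toy_page *p)
{
	assert(p);
	if (p->tail)
		toy_debug_hits++;  /* where the debug assertion would trigger */
	if (!p->head)
		return 0;
	return p->order;
}
```

Both helpers return the same order for a head page and 0 for a tail page. Only the second records a debug hit on the tail page, which matters here because the order is read without holding a reference or lock.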

> Maybe we can use folio_order() here?
>

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc

Posted by Wei Yang 3 months, 2 weeks ago

On Tue, Oct 21, 2025 at 12:03:39PM -0400, Gregory Price wrote:
>On Tue, Oct 21, 2025 at 09:14:43AM +0000, Wei Yang wrote:
>> 
>> The above code is expanded to:
>> 
>>   page = _compound_head(page);
>>   order = compound_order(page);
>>       folio = page;
>>       test_bit(PG_head, &folio->flags.f);
>>       return folio_large_order(folio);
>> 
>> If we use
>> 
>>   order = folio_order(page_folio(page))
>> 
>> It is expanded to:
>> 
>>   folio = _compound_head(page);
>>   folio_test_large(folio)
>>       folio_test_head(folio)
>>           test_bit(PG_head, &folio->flags.f)
>>   return folio_large_order(folio);
>> 
>> These two seems expanded to the same code.
>> 
>
>folio_test_large drops into const_folio_flags which does
>a VM_BUG_ON_PGFLAGS, so not the same logic here.
>

Thanks, you are right.

>> Maybe we can use folio_order() here?
>> 

-- 
Wei Yang
Help you, Help me

Re: [RFC PATCH v2] page_alloc: allow migration of smaller hugepages during contig_alloc.

Posted by Zi Yan 3 months, 2 weeks ago

On 20 Oct 2025, at 17:08, Gregory Price wrote:

> We presently skip regions with hugepages entirely when trying to do
> contiguous page allocation.  Instead, if hugepage migration is enabled,
> consider regions with hugepages smaller than the requested allocation.
>
> Compaction `isolate_migrate_pages_block()` already expects requests
> with hugepages to originate from alloc_contig, and hugetlb code also
> does a migratable check when isolating in `folio_isolate_hugetlb()`.
>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>  mm/page_alloc.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
LGTM. Reviewed-by: Zi Yan <ziy@nvidia.com>

--
Best Regards,
Yan, Zi