[v2] Improve folio split related functions

[PATCH v2 3/4] mm/huge_memory: make min_order_for_split() always return an order

Posted by Zi Yan 2 months, 2 weeks ago

min_order_for_split() returns -EBUSY when the folio is truncated and cannot
be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
split_huge_page*() target order silently"), memory_failure() does not
handle this error and passes -EBUSY to try_to_split_thp_page() directly.
try_to_split_thp_page() then returns -EINVAL, because -EBUSY becomes
0xfffffff0 once it is converted to the unsigned int new_order in
__folio_split(), and this large new_order is rejected as invalid input.
The split still fails, so the code does not cause a bug.
soft_offline_in_use_page() also uses min_order_for_split() but it always
passes 0 as new_order for split.

Fix it by making min_order_for_split() always return an order. When the
given folio is truncated, namely folio->mapping == NULL, return 0 and let
a subsequent split function handle the situation and return -EBUSY.

Add kernel-doc to min_order_for_split() to clarify its use.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/huge_mm.h |  6 +++---
 mm/huge_memory.c        | 25 +++++++++++++++++++------
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 1ecaeccf39c9..9b3a4e2b0668 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -372,7 +372,7 @@ enum split_type {
 int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		unsigned int new_order);
 int folio_split_unmapped(struct folio *folio, unsigned int new_order);
-int min_order_for_split(struct folio *folio);
+unsigned int min_order_for_split(struct folio *folio);
 int split_folio_to_list(struct folio *folio, struct list_head *list);
 int folio_check_splittable(struct folio *folio, unsigned int new_order,
 			   enum split_type split_type, bool warns);
@@ -634,10 +634,10 @@ static inline int split_huge_page(struct page *page)
 	return -EINVAL;
 }
 
-static inline int min_order_for_split(struct folio *folio)
+static inline unsigned int min_order_for_split(struct folio *folio)
 {
 	VM_WARN_ON_ONCE_FOLIO(1, folio);
-	return -EINVAL;
+	return 0;
 }
 
 static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6c821c1c0ac3..ebc3ba0907fd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4230,16 +4230,29 @@ int folio_split(struct folio *folio, unsigned int new_order,
 			     SPLIT_TYPE_NON_UNIFORM);
 }
 
-int min_order_for_split(struct folio *folio)
+/**
+ * min_order_for_split() - get the minimum order @folio can be split to
+ * @folio: folio to split
+ *
+ * min_order_for_split() tells the minimum order @folio can be split to.
+ * If a file-backed folio is truncated, 0 will be returned. Any subsequent
+ * split attempt should get -EBUSY from split checking code.
+ *
+ * Return: @folio's minimum order for split
+ */
+unsigned int min_order_for_split(struct folio *folio)
 {
 	if (folio_test_anon(folio))
 		return 0;
 
-	if (!folio->mapping) {
-		if (folio_test_pmd_mappable(folio))
-			count_vm_event(THP_SPLIT_PAGE_FAILED);
-		return -EBUSY;
-	}
+	/*
+	 * If the folio got truncated, we don't know the previous mapping and
+	 * consequently the old min order. But it doesn't matter, as any split
+	 * attempt will immediately fail with -EBUSY as the folio cannot get
+	 * split until freed.
+	 */
+	if (!folio->mapping)
+		return 0;
 
 	return mapping_min_folio_order(folio->mapping);
 }
-- 
2.51.0

Re: [PATCH v2 3/4] mm/huge_memory: make min_order_for_split() always return an order

Posted by Lorenzo Stoakes 2 months, 2 weeks ago

On Fri, Nov 21, 2025 at 09:55:28PM -0500, Zi Yan wrote:
> min_order_for_split() returns -EBUSY when the folio is truncated and cannot
> be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
> split_huge_page*() target order silently"), memory_failure() does not
> handle it and pass -EBUSY to try_to_split_thp_page() directly.
> try_to_split_thp_page() returns -EINVAL since -EBUSY becomes 0xfffffff0 as
> new_order is unsigned int in __folio_split() and this large new_order is
> rejected as an invalid input. The code does not cause a bug.

Yikes!

This class of bug is all too common... 'unexpectedly returning an error the
caller wasn't prepared for'.
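
For anyone following along, here is a minimal userspace sketch of how the conversion described in the commit message plays out. It is not kernel code: old_min_order(), demo_split(), demo_caller() and DEMO_MAX_ORDER are hypothetical names made up for illustration, and the only thing taken from the patch is the shape of the problem (an int errno flowing into an unsigned int order parameter).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* The 0xfffffff0 value quoted in the commit message assumes this width. */
static_assert(sizeof(unsigned int) == 4, "expects 32-bit unsigned int");

/* Hypothetical stand-in for the largest order a split would accept. */
#define DEMO_MAX_ORDER 10u

/* Old-style helper: returns an order, or a negative errno on failure. */
static int old_min_order(bool truncated)
{
	return truncated ? -EBUSY : 0;
}

/* Split entry point that takes an unsigned order, like __folio_split(). */
static int demo_split(unsigned int new_order)
{
	if (new_order > DEMO_MAX_ORDER)
		return -EINVAL;	/* the wrapped errno lands here */
	return 0;
}

/* The int -> unsigned int conversion happens silently at this call. */
static int demo_caller(bool truncated)
{
	return demo_split(old_min_order(truncated));
}
```

With a truncated folio, demo_caller() reports -EINVAL rather than -EBUSY, which matches the behaviour the commit message describes: the split is still refused, but for the wrong stated reason.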

> soft_offline_in_use_page() also uses min_order_for_split() but it always
> passes 0 as new_order for split.
>
> Fix it by making min_order_for_split() always return an order. When the
> given folio is truncated, namely folio->mapping == NULL, return 0 and let
> a subsequent split function handle the situation and return -EBUSY.

OK so we allow the split essentially or rather give a return value that is
essentially 'we don't care' because any attempt at the split will run into
something like:

		anon_vma = folio_get_anon_vma(folio);
		if (!anon_vma) {
			ret = -EBUSY;
			goto out;
		}

In __folio_split() right?

>
> Add kernel-doc to min_order_for_split() to clarify its use.

Nice.

>
> Signed-off-by: Zi Yan <ziy@nvidia.com>

LGTM, so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

> ---
>  include/linux/huge_mm.h |  6 +++---
>  mm/huge_memory.c        | 25 +++++++++++++++++++------
>  2 files changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index 1ecaeccf39c9..9b3a4e2b0668 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -372,7 +372,7 @@ enum split_type {
>  int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>  		unsigned int new_order);
>  int folio_split_unmapped(struct folio *folio, unsigned int new_order);
> -int min_order_for_split(struct folio *folio);
> +unsigned int min_order_for_split(struct folio *folio);
>  int split_folio_to_list(struct folio *folio, struct list_head *list);
>  int folio_check_splittable(struct folio *folio, unsigned int new_order,
>  			   enum split_type split_type, bool warns);
> @@ -634,10 +634,10 @@ static inline int split_huge_page(struct page *page)
>  	return -EINVAL;
>  }
>
> -static inline int min_order_for_split(struct folio *folio)
> +static inline unsigned int min_order_for_split(struct folio *folio)
>  {
>  	VM_WARN_ON_ONCE_FOLIO(1, folio);
> -	return -EINVAL;
> +	return 0;
>  }
>
>  static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 6c821c1c0ac3..ebc3ba0907fd 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -4230,16 +4230,29 @@ int folio_split(struct folio *folio, unsigned int new_order,
>  			     SPLIT_TYPE_NON_UNIFORM);
>  }
>
> -int min_order_for_split(struct folio *folio)
> +/**
> + * min_order_for_split() - get the minimum order @folio can be split to
> + * @folio: folio to split
> + *
> + * min_order_for_split() tells the minimum order @folio can be split to.
> + * If a file-backed folio is truncated, 0 will be returned. Any subsequent
> + * split attempt should get -EBUSY from split checking code.
> + *
> + * Return: @folio's minimum order for split
> + */
> +unsigned int min_order_for_split(struct folio *folio)
>  {
>  	if (folio_test_anon(folio))
>  		return 0;
>
> -	if (!folio->mapping) {
> -		if (folio_test_pmd_mappable(folio))
> -			count_vm_event(THP_SPLIT_PAGE_FAILED);
> -		return -EBUSY;
> -	}
> +	/*
> +	 * If the folio got truncated, we don't know the previous mapping and
> +	 * consequently the old min order. But it doesn't matter, as any split
> +	 * attempt will immediately fail with -EBUSY as the folio cannot get
> +	 * split until freed.
> +	 */

Nice to have a comment here to clarify this!

> +	if (!folio->mapping)
> +		return 0;
>
>  	return mapping_min_folio_order(folio->mapping);
>  }
> --
> 2.51.0
>

Cheers, Lorenzo

Re: [PATCH v2 3/4] mm/huge_memory: make min_order_for_split() always return an order

Posted by Zi Yan 2 months, 2 weeks ago

On 24 Nov 2025, at 10:18, Lorenzo Stoakes wrote:

> On Fri, Nov 21, 2025 at 09:55:28PM -0500, Zi Yan wrote:
>> min_order_for_split() returns -EBUSY when the folio is truncated and cannot
>> be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
>> split_huge_page*() target order silently"), memory_failure() does not
>> handle it and pass -EBUSY to try_to_split_thp_page() directly.
>> try_to_split_thp_page() returns -EINVAL since -EBUSY becomes 0xfffffff0 as
>> new_order is unsigned int in __folio_split() and this large new_order is
>> rejected as an invalid input. The code does not cause a bug.
>
> Yikes!
>
> This class of bug is all too common... 'unexpectedly returning an error the
> caller wasn't prepared for'.
>
>> soft_offline_in_use_page() also uses min_order_for_split() but it always
>> passes 0 as new_order for split.
>>
>> Fix it by making min_order_for_split() always return an order. When the
>> given folio is truncated, namely folio->mapping == NULL, return 0 and let
>> a subsequent split function handle the situation and return -EBUSY.
>
> OK so we allow the split essentially or rather give a return value that is
> essentially 'we don't care' because any attempt at the split will run into
> something like:
>
> 		anon_vma = folio_get_anon_vma(folio);
> 		if (!anon_vma) {
> 			ret = -EBUSY;
> 			goto out;
> 		}
>
> In __folio_split() right?

Not this one for the issue I mentioned above, since this is for anon folios
and min_order_for_split() returns 0 for all anon folios. anon_vma == NULL
does not mean folio->mapping == NULL, since folio->mapping still has
FOLIO_MAPPING_ANON set. The fun never ends, right? :)

The above issue is handled by

        /*
         * Folios that just got truncated cannot get split. Signal to the
         * caller that there was a race.
         *
         * TODO: this will also currently refuse shmem folios that are in the
         * swapcache.
         */
        if (!folio_test_anon(folio) && !folio->mapping)
                return -EBUSY;
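
To make the division of labour concrete, here is a rough userspace model of the flow after this patch. Everything prefixed with demo_ is a hypothetical stand-in and not the kernel's definition; in particular the "anon" flag only approximates folio_test_anon(), and a NULL mapping on a non-anon folio approximates a truncated file-backed folio.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel structures. */
struct demo_mapping {
	unsigned int min_order;		/* models mapping_min_folio_order() */
};

struct demo_folio {
	bool anon;			/* models folio_test_anon() */
	struct demo_mapping *mapping;	/* NULL models a truncated file folio */
};

/* After the patch: always returns an order, never an errno. */
static unsigned int demo_min_order_for_split(const struct demo_folio *folio)
{
	assert(folio != NULL);
	if (folio->anon)
		return 0;
	if (!folio->mapping)
		return 0;	/* the split check reports the failure */
	return folio->mapping->min_order;
}

/* Models only the truncation check quoted above. */
static int demo_check_splittable(const struct demo_folio *folio)
{
	if (!folio->anon && !folio->mapping)
		return -EBUSY;
	return 0;
}

/* Caller: ask for the order first, then let the check decide. */
static int demo_try_split(const struct demo_folio *folio,
			  unsigned int *order_out)
{
	*order_out = demo_min_order_for_split(folio);
	return demo_check_splittable(folio);
}
```

In this model the order helper never carries an error, so the truncated case surfaces as -EBUSY from the check, while anon folios get order 0 and pass the truncation check regardless of the mapping field.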

>
>>
>> Add kernel-doc to min_order_for_split() to clarify its use.
>
> Nice.
>
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>
> LGTM, so:
>
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Thanks.

>
>> ---
>>  include/linux/huge_mm.h |  6 +++---
>>  mm/huge_memory.c        | 25 +++++++++++++++++++------
>>  2 files changed, 22 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>> index 1ecaeccf39c9..9b3a4e2b0668 100644
>> --- a/include/linux/huge_mm.h
>> +++ b/include/linux/huge_mm.h
>> @@ -372,7 +372,7 @@ enum split_type {
>>  int __split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>  		unsigned int new_order);
>>  int folio_split_unmapped(struct folio *folio, unsigned int new_order);
>> -int min_order_for_split(struct folio *folio);
>> +unsigned int min_order_for_split(struct folio *folio);
>>  int split_folio_to_list(struct folio *folio, struct list_head *list);
>>  int folio_check_splittable(struct folio *folio, unsigned int new_order,
>>  			   enum split_type split_type, bool warns);
>> @@ -634,10 +634,10 @@ static inline int split_huge_page(struct page *page)
>>  	return -EINVAL;
>>  }
>>
>> -static inline int min_order_for_split(struct folio *folio)
>> +static inline unsigned int min_order_for_split(struct folio *folio)
>>  {
>>  	VM_WARN_ON_ONCE_FOLIO(1, folio);
>> -	return -EINVAL;
>> +	return 0;
>>  }
>>
>>  static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 6c821c1c0ac3..ebc3ba0907fd 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -4230,16 +4230,29 @@ int folio_split(struct folio *folio, unsigned int new_order,
>>  			     SPLIT_TYPE_NON_UNIFORM);
>>  }
>>
>> -int min_order_for_split(struct folio *folio)
>> +/**
>> + * min_order_for_split() - get the minimum order @folio can be split to
>> + * @folio: folio to split
>> + *
>> + * min_order_for_split() tells the minimum order @folio can be split to.
>> + * If a file-backed folio is truncated, 0 will be returned. Any subsequent
>> + * split attempt should get -EBUSY from split checking code.
>> + *
>> + * Return: @folio's minimum order for split
>> + */
>> +unsigned int min_order_for_split(struct folio *folio)
>>  {
>>  	if (folio_test_anon(folio))
>>  		return 0;
>>
>> -	if (!folio->mapping) {
>> -		if (folio_test_pmd_mappable(folio))
>> -			count_vm_event(THP_SPLIT_PAGE_FAILED);
>> -		return -EBUSY;
>> -	}
>> +	/*
>> +	 * If the folio got truncated, we don't know the previous mapping and
>> +	 * consequently the old min order. But it doesn't matter, as any split
>> +	 * attempt will immediately fail with -EBUSY as the folio cannot get
>> +	 * split until freed.
>> +	 */
>
> Nice to have a comment here to clarify this!
>
>> +	if (!folio->mapping)
>> +		return 0;
>>
>>  	return mapping_min_folio_order(folio->mapping);
>>  }
>> --
>> 2.51.0
>>
>
> Cheers, Lorenzo


Best Regards,
Yan, Zi

Re: [PATCH v2 3/4] mm/huge_memory: make min_order_for_split() always return an order

Posted by David Hildenbrand (Red Hat) 2 months, 2 weeks ago

On 11/22/25 03:55, Zi Yan wrote:
> min_order_for_split() returns -EBUSY when the folio is truncated and cannot
> be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
> split_huge_page*() target order silently"), memory_failure() does not
> handle it and pass -EBUSY to try_to_split_thp_page() directly.
> try_to_split_thp_page() returns -EINVAL since -EBUSY becomes 0xfffffff0 as
> new_order is unsigned int in __folio_split() and this large new_order is
> rejected as an invalid input. The code does not cause a bug.
> soft_offline_in_use_page() also uses min_order_for_split() but it always
> passes 0 as new_order for split.
> 
> Fix it by making min_order_for_split() always return an order. When the
> given folio is truncated, namely folio->mapping == NULL, return 0 and let
> a subsequent split function handle the situation and return -EBUSY.
> 
> Add kernel-doc to min_order_for_split() to clarify its use.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>

Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

-- 
Cheers

David

Re: [PATCH v2 3/4] mm/huge_memory: make min_order_for_split() always return an order

Posted by Wei Yang 2 months, 2 weeks ago

On Fri, Nov 21, 2025 at 09:55:28PM -0500, Zi Yan wrote:
>min_order_for_split() returns -EBUSY when the folio is truncated and cannot
>be split. In commit 77008e1b2ef7 ("mm/huge_memory: do not change
>split_huge_page*() target order silently"), memory_failure() does not
>handle it and pass -EBUSY to try_to_split_thp_page() directly.
>try_to_split_thp_page() returns -EINVAL since -EBUSY becomes 0xfffffff0 as
>new_order is unsigned int in __folio_split() and this large new_order is
>rejected as an invalid input. The code does not cause a bug.
>soft_offline_in_use_page() also uses min_order_for_split() but it always
>passes 0 as new_order for split.
>
>Fix it by making min_order_for_split() always return an order. When the
>given folio is truncated, namely folio->mapping == NULL, return 0 and let
>a subsequent split function handle the situation and return -EBUSY.
>
>Add kernel-doc to min_order_for_split() to clarify its use.
>
>Signed-off-by: Zi Yan <ziy@nvidia.com>

LGTM, Thanks

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

-- 
Wei Yang
Help you, Help me

[PATCH v2 1/4] mm/huge_memory: change folio_split_supported() to folio_check_splittable()
[PATCH v2 2/4] mm/huge_memory: replace can_split_folio() with direct refcount calculation
[PATCH v2 3/4] mm/huge_memory: make min_order_for_split() always return an order
[PATCH v2 4/4] mm/huge_memory: fix folio split stats counting