[v3] __folio_split() clean up.

[PATCH v3 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.

Posted by Zi Yan 2 months, 3 weeks ago

Instead of open coding the ref_count calculation, use
folio_expected_ref_count().

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
---
 mm/huge_memory.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a7ee731f974f..31b5c4e61a57 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3735,6 +3735,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (folio_ref_freeze(folio, 1 + extra_pins)) {
 		struct address_space *swap_cache = NULL;
 		struct lruvec *lruvec;
+		int expected_refs;
 
 		if (folio_order(folio) > 1 &&
 		    !list_empty(&folio->_deferred_list)) {
@@ -3805,11 +3806,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		     new_folio = next) {
 			next = folio_next(new_folio);
 
-			folio_ref_unfreeze(
-				new_folio,
-				1 + ((mapping || swap_cache) ?
-					     folio_nr_pages(new_folio) :
-					     0));
+			expected_refs = folio_expected_ref_count(new_folio) + 1;
+			folio_ref_unfreeze(new_folio, expected_refs);
 
 			lru_add_split_folio(folio, new_folio, lruvec, list);
 
@@ -3839,8 +3837,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		 * Otherwise, a parallel folio_try_get() can grab origin_folio
 		 * and its caller can see stale page cache entries.
 		 */
-		folio_ref_unfreeze(folio, 1 +
-			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
+		expected_refs = folio_expected_ref_count(folio) + 1;
+		folio_ref_unfreeze(folio, expected_refs);
 
 		unlock_page_lruvec(lruvec);
 
-- 
2.47.2

Re: [PATCH v3 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.

Posted by Lorenzo Stoakes 2 months, 3 weeks ago

On Mon, Jul 14, 2025 at 01:18:23PM -0400, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().

You really should put something here about why it is that the open-coded
value and the value returned from folio_expected_ref_count() would be
expected to be the same. See comment below inline with code.

>
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Acked-by: Balbir Singh <balbirs@nvidia.com>
> Acked-by: David Hildenbrand <david@redhat.com>

Ah haha you're literally addresing some of my code review here from the
last patch :) I love it when that happens :P

I'd like you to improve the commit message, but that's a nit so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

See below for some analysis of the folio_expected_ref_count().

> ---
>  mm/huge_memory.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a7ee731f974f..31b5c4e61a57 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3735,6 +3735,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  	if (folio_ref_freeze(folio, 1 + extra_pins)) {
>  		struct address_space *swap_cache = NULL;
>  		struct lruvec *lruvec;
> +		int expected_refs;
>
>  		if (folio_order(folio) > 1 &&
>  		    !list_empty(&folio->_deferred_list)) {
> @@ -3805,11 +3806,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  		     new_folio = next) {
>  			next = folio_next(new_folio);
>
> -			folio_ref_unfreeze(
> -				new_folio,
> -				1 + ((mapping || swap_cache) ?
> -					     folio_nr_pages(new_folio) :
> -					     0));
> +			expected_refs = folio_expected_ref_count(new_folio) + 1;


So digging in:

static inline int folio_expected_ref_count(const struct folio *folio)
{
	const int order = folio_order(folio);
	int ref_count = 0;

	...

	if (folio_test_anon(folio)) {
		/* One reference per page from the swapcache. */
		ref_count += folio_test_swapcache(folio) << order;
	} else {
		/* One reference per page from the pagecache. */
		ref_count += !!folio->mapping << order;

^---- these are covered off by (mapping || swap_cache) ? folio_nr_pages(folio)

		/* One reference from PG_private. */
		ref_count += folio_test_private(folio);

This one is trickier.

OK so looking through the logic, the can_split_folio() function will
already assert that the only pins you have are the swapcache/page cache
ones on the 'origin' folio (the mapcount bit used in the freeze doesn't matter
as you're dealing with split, not-yet-mapped 'sub'-folios).

So this precludes an elevated refcount from PG_private, therefore this will
naturally be 0.

	}

	/* One reference per page table mapping. */
	return ref_count + folio_mapcount(folio);

folio_mapcount() will be zero for these split folios, until remapped.

}

You add the + 1 to account for the folio pin of course.

TL;DR - this is correct AFAICT.


> +			folio_ref_unfreeze(new_folio, expected_refs);
>
>  			lru_add_split_folio(folio, new_folio, lruvec, list);
>
> @@ -3839,8 +3837,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>  		 * Otherwise, a parallel folio_try_get() can grab origin_folio
>  		 * and its caller can see stale page cache entries.
>  		 */
> -		folio_ref_unfreeze(folio, 1 +
> -			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
> +		expected_refs = folio_expected_ref_count(folio) + 1;
> +		folio_ref_unfreeze(folio, expected_refs);
>
>  		unlock_page_lruvec(lruvec);
>
> --
> 2.47.2
>

Re: [PATCH v3 2/2] mm/huge_memory: use folio_expected_ref_count() to calculate ref_count.

Posted by Baolin Wang 2 months, 3 weeks ago


On 2025/7/15 01:18, Zi Yan wrote:
> Instead of open coding the ref_count calculation, use
> folio_expected_ref_count().
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Acked-by: Balbir Singh <balbirs@nvidia.com>
> Acked-by: David Hildenbrand <david@redhat.com>
> ---

Looks more readable. Thanks.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>

>   mm/huge_memory.c | 12 +++++-------
>   1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index a7ee731f974f..31b5c4e61a57 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3735,6 +3735,7 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>   	if (folio_ref_freeze(folio, 1 + extra_pins)) {
>   		struct address_space *swap_cache = NULL;
>   		struct lruvec *lruvec;
> +		int expected_refs;
>   
>   		if (folio_order(folio) > 1 &&
>   		    !list_empty(&folio->_deferred_list)) {
> @@ -3805,11 +3806,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>   		     new_folio = next) {
>   			next = folio_next(new_folio);
>   
> -			folio_ref_unfreeze(
> -				new_folio,
> -				1 + ((mapping || swap_cache) ?
> -					     folio_nr_pages(new_folio) :
> -					     0));
> +			expected_refs = folio_expected_ref_count(new_folio) + 1;
> +			folio_ref_unfreeze(new_folio, expected_refs);
>   
>   			lru_add_split_folio(folio, new_folio, lruvec, list);
>   
> @@ -3839,8 +3837,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
>   		 * Otherwise, a parallel folio_try_get() can grab origin_folio
>   		 * and its caller can see stale page cache entries.
>   		 */
> -		folio_ref_unfreeze(folio, 1 +
> -			((mapping || swap_cache) ? folio_nr_pages(folio) : 0));
> +		expected_refs = folio_expected_ref_count(folio) + 1;
> +		folio_ref_unfreeze(folio, expected_refs);
>   
>   		unlock_page_lruvec(lruvec);
>