[PATCH] mm/migrate: fix stale partially_mapped arg to deferred_split_folio()

Deepanshu Kartikey posted 1 patch 12 hours ago
[PATCH] mm/migrate: fix stale partially_mapped arg to deferred_split_folio()
Posted by Deepanshu Kartikey 12 hours ago
In migrate_folio_move(), src_partially_mapped is sampled from the source
folio before move_to_new_folio() is called:

  if (folio_order(src) > 1 &&
      !data_race(list_empty(&src->_deferred_list))) {
          src_deferred_split = true;
          src_partially_mapped = folio_test_partially_mapped(src);
  }

A concurrent thread can unmap pages from the source folio between this
read and the actual migration, making the sampled value stale.

After move_to_new_folio() succeeds, __folio_migrate_mapping() has already
copied all folio flags including PG_partially_mapped from src to dst.
Passing the stale src_partially_mapped=false to deferred_split_folio(dst)
while dst already has PG_partially_mapped=true triggers the invariant
check in deferred_split_folio():

  VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio)

at mm/huge_memory.c:4371, because the argument contradicts the flag
already set on the folio.

Fix this by removing the src_partially_mapped variable entirely and
reading PG_partially_mapped directly from dst after move_to_new_folio()
completes, where it is authoritative and race-free.

Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085
Tested-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/migrate.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 05cb408846f2..11236779e910 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1361,7 +1361,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 	int old_page_state = 0;
 	struct anon_vma *anon_vma = NULL;
 	bool src_deferred_split = false;
-	bool src_partially_mapped = false;
 	struct list_head *prev;
 
 	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
@@ -1378,7 +1377,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 	if (folio_order(src) > 1 &&
 	    !data_race(list_empty(&src->_deferred_list))) {
 		src_deferred_split = true;
-		src_partially_mapped = folio_test_partially_mapped(src);
 	}
 
 	rc = move_to_new_folio(dst, src, mode);
@@ -1404,11 +1402,13 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 	/*
 	 * Requeue the destination folio on the deferred split queue if
 	 * the source was on the queue.  The source is unqueued in
-	 * __folio_migrate_mapping(), so we recorded the state from
-	 * before move_to_new_folio().
+	 * __folio_migrate_mapping(). Read PG_partially_mapped directly from
+	 * dst: move_to_new_folio() copies all flags from src to dst, so dst
+	 * now holds the correct authoritative state. The pre-migration value
+	 * sampled from src is racy and must not be used.
 	 */
 	if (src_deferred_split)
-		deferred_split_folio(dst, src_partially_mapped);
+		deferred_split_folio(dst, folio_test_partially_mapped(dst));
 
 out_unlock_both:
 	folio_unlock(dst);
-- 
2.43.0
Re: [PATCH] mm/migrate: fix stale partially_mapped arg to deferred_split_folio()
Posted by David Hildenbrand (Arm) 10 hours ago
On 4/1/26 10:41, Deepanshu Kartikey wrote:
> In migrate_folio_move(), src_partially_mapped is sampled from the source
> folio before move_to_new_folio() is called:
> 
>   if (folio_order(src) > 1 &&
>       !data_race(list_empty(&src->_deferred_list))) {
>           src_deferred_split = true;
>           src_partially_mapped = folio_test_partially_mapped(src);
>   }
> 
> A concurrent thread can unmap pages from the source folio between this
> read and the actual migration, making the sampled value stale.

Trying to make sense of this.

In migrate_folio_move() don't we have the folio completely unmapped
because there are only migration entries referencing the folio?

See migrate_folio_unmap(), where we check !folio_mapped().

Why should we suddenly have a mapped folio here? Something is off.

Unmapping a migration entry will not involve rmap code and not mess with
the partially-mapped flag.

-- 
Cheers,

David
Re: [PATCH] mm/migrate: fix stale partially_mapped arg to deferred_split_folio()
Posted by David Hildenbrand (Arm) 10 hours ago
On 4/1/26 12:10, David Hildenbrand (Arm) wrote:
> On 4/1/26 10:41, Deepanshu Kartikey wrote:
>> In migrate_folio_move(), src_partially_mapped is sampled from the source
>> folio before move_to_new_folio() is called:
>>
>>   if (folio_order(src) > 1 &&
>>       !data_race(list_empty(&src->_deferred_list))) {
>>           src_deferred_split = true;
>>           src_partially_mapped = folio_test_partially_mapped(src);
>>   }
>>
>> A concurrent thread can unmap pages from the source folio between this
>> read and the actual migration, making the sampled value stale.
> 
> Trying to make sense of this.
> 
> In migrate_folio_move() don't we have the folio completely unmapped
> because there are only migration entries referencing the folio?
> 
> See migrate_folio_unmap(), where we check !folio_mapped().
> 
> Why should we suddenly have a mapped folio here? Something is off.
> 
> Unmapping a migration entry will not involve rmap code and not mess with
> the partially-mapped flag.
> 

Okay, concluding that the above reasoning is all wrong.

Let's discuss it with Lance's proposal, which makes more sense.

https://lore.kernel.org/r/20260401085932.20945-1-lance.yang@linux.dev

-- 
Cheers,

David
Re: [PATCH] mm/migrate: fix stale partially_mapped arg to deferred_split_folio()
Posted by David Hildenbrand (Arm) 11 hours ago
On 4/1/26 10:41, Deepanshu Kartikey wrote:
> In migrate_folio_move(), src_partially_mapped is sampled from the source
> folio before move_to_new_folio() is called:
> 
>   if (folio_order(src) > 1 &&
>       !data_race(list_empty(&src->_deferred_list))) {
>           src_deferred_split = true;
>           src_partially_mapped = folio_test_partially_mapped(src);
>   }
> 
> A concurrent thread can unmap pages from the source folio between this
> read and the actual migration, making the sampled value stale.

Indeed. We could see "partially_mapped" getting set concurrently.

> 
> After move_to_new_folio() succeeds, __folio_migrate_mapping() has already
> copied all folio flags including PG_partially_mapped from src to dst.
> Passing the stale src_partially_mapped=false to deferred_split_folio(dst)
> while dst already has PG_partially_mapped=true triggers the invariant
> check in deferred_split_folio():
> 
>   VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio)
> 
> at mm/huge_memory.c:4371, because the argument contradicts the flag
> already set on the folio.
> 
> Fix this by removing the src_partially_mapped variable entirely and
> reading PG_partially_mapped directly from dst after move_to_new_folio()
> completes, where it is authoritative and race-free.
> 
> Fixes: 8a8ca142a488 ("mm: migrate: requeue destination folio on deferred split queue")
> Reported-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=a7067a757858ac8eb085
> Tested-by: syzbot+a7067a757858ac8eb085@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
>  mm/migrate.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 05cb408846f2..11236779e910 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1361,7 +1361,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>  	int old_page_state = 0;
>  	struct anon_vma *anon_vma = NULL;
>  	bool src_deferred_split = false;
> -	bool src_partially_mapped = false;
>  	struct list_head *prev;
>  
>  	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
> @@ -1378,7 +1377,6 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>  	if (folio_order(src) > 1 &&
>  	    !data_race(list_empty(&src->_deferred_list))) {
>  		src_deferred_split = true;
> -		src_partially_mapped = folio_test_partially_mapped(src);
>  	}
>  
>  	rc = move_to_new_folio(dst, src, mode);
> @@ -1404,11 +1402,13 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>  	/*
>  	 * Requeue the destination folio on the deferred split queue if
>  	 * the source was on the queue.  The source is unqueued in
> -	 * __folio_migrate_mapping(), so we recorded the state from
> -	 * before move_to_new_folio().
> +	 * __folio_migrate_mapping(). Read PG_partially_mapped directly from
> +	 * dst: move_to_new_folio() copies all flags from src to dst, so dst
> +	 * now holds the correct authoritative state.

Where are we copying PG_partially_mapped exactly?

I don't see that happening in folio_migrate_flags().


Also, I wonder what happens if the folio becomes partially mapped after
migration entries have already been installed (when zapping migration entries).

This is all extremely shaky.

-- 
Cheers,

David