mm/swapfile: validate swap offset in unuse_pte_range()

[PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by Deepanshu Kartikey 2 months, 1 week ago

syzbot reported a WARNING in __swap_offset_to_cluster() triggered by
an invalid swap offset during swapoff:

  WARNING: CPU: 0 PID: 9861 at mm/swap.h:87 swap_cache_get_folio+0x186/0x200

The issue occurs because unuse_pte_range() extracts a swap entry from
a PTE and uses the offset without validating it is within bounds of
the swap area.

While the existing swp_type() check filters entries for other swap
areas, it cannot catch cases where the type bits are valid but the
offset is corrupted or stale - for example, due to a race condition
during PTE updates or memory corruption.

Add validation to ensure offset < si->max before using the swap entry.

Reported-by: syzbot+d7bc9ec4a100437aa7a2@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d7bc9ec4a100437aa7a2
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 mm/swapfile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 46d2008e4b99..fdf358df7116 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2277,6 +2277,8 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 			continue;
 
 		offset = swp_offset(entry);
+		if (offset >= si->max)
+			continue;
 		pte_unmap(pte);
 		pte = NULL;
 
-- 
2.43.0
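
As a rough illustration of why the swp_type() check alone cannot catch a bad offset, here is a minimal userspace sketch in C. Every name prefixed with model_ is hypothetical, and the 6-bit type / 58-bit offset split is an assumption made for this example, not the kernel's actual swp_entry_t encoding. Only the final comparison mirrors the two lines the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout for illustration only: 6 type bits, 58 offset bits. */
#define MODEL_TYPE_BITS   6u
#define MODEL_OFFSET_BITS 58u
#define MODEL_OFFSET_MASK ((1ULL << MODEL_OFFSET_BITS) - 1)

/* Stand-in for the swap area being turned off. */
struct model_swap_info {
	unsigned int type;       /* which swap area this is */
	unsigned long long max;  /* valid offsets are [0, max) */
};

typedef struct { unsigned long long val; } model_entry_t;

/* Pack a type and an offset into one value, as a PTE-stored entry would be. */
static model_entry_t model_entry(unsigned int type, unsigned long long offset)
{
	model_entry_t e;

	assert(type < (1u << MODEL_TYPE_BITS));
	e.val = ((unsigned long long)type << MODEL_OFFSET_BITS) |
		(offset & MODEL_OFFSET_MASK);
	return e;
}

static unsigned int model_type(model_entry_t e)
{
	return (unsigned int)(e.val >> MODEL_OFFSET_BITS);
}

static unsigned long long model_offset(model_entry_t e)
{
	return e.val & MODEL_OFFSET_MASK;
}

/* Returns true if a scan loop modelled on the patch would process the entry. */
static bool model_should_unuse(const struct model_swap_info *si, model_entry_t e)
{
	if (model_type(e) != si->type)
		return false;  /* entry belongs to another swap area */
	if (model_offset(e) >= si->max)
		return false;  /* the bounds check the patch adds */
	return true;
}
```

In this model an entry such as model_entry(1, 1024) against an area with type 1 and max 1024 passes the type comparison and is rejected only by the offset comparison.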

Re: [PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by YoungJun Park 2 months, 1 week ago

On Mon, Dec 01, 2025 at 03:07:41PM +0530, Deepanshu Kartikey wrote:
> syzbot reported a WARNING in __swap_offset_to_cluster() triggered by
> an invalid swap offset during swapoff:
> 
>   WARNING: CPU: 0 PID: 9861 at mm/swap.h:87 swap_cache_get_folio+0x186/0x200
> 
> The issue occurs because unuse_pte_range() extracts a swap entry from
> a PTE and uses the offset without validating it is within bounds of
> the swap area.
> 
> While the existing swp_type() check filters entries for other swap
> areas, it cannot catch cases where the type bits are valid but the
> offset is corrupted or stale - for example, due to a race condition
> during PTE updates or memory corruption.

Since this indicates a system-level issue (race/corruption), simply
avoiding the crash seems to be the goal here.
Should we at least add a WARN_ON or something?
(Unless this corruption is expected to be reported elsewhere
beforehand, in which case a silent skip is fine, I think.)

It also looks like swap_vma_readahead() shares similar logic.
The difference is that it intentionally allows entries from different
swap devices (to support VMA readahead).

If the offset is corrupted or invalid in those paths, wouldn't they
suffer from similar issues? Do you think we should add the boundary
check (offset >= si->max) there as well?

Best Regards,
Youngjun Park

> 
> Reported-by: syzbot+d7bc9ec4a100437aa7a2@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=d7bc9ec4a100437aa7a2
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
>  mm/swapfile.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 46d2008e4b99..fdf358df7116 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2277,6 +2277,8 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>  			continue;
>  
>  		offset = swp_offset(entry);
> +		if (offset >= si->max)
> +			continue;
>  		pte_unmap(pte);
>  		pte = NULL;
>  
> -- 
> 2.43.0
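
One way to picture the readahead concern raised above is a path that accepts entries from any swap device, so it has to validate the type and the offset together before touching per-device data. The sketch below is a userspace model under that assumption. The names model_swap_dev, model_lookup and MODEL_MAX_DEVICES are hypothetical and do not correspond to the kernel's swap_vma_readahead() implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_DEVICES 4u  /* hypothetical table size */

/* Stand-in for one swap device slot. */
struct model_swap_dev {
	int in_use;          /* non-zero if the slot holds an active device */
	unsigned long max;   /* valid offsets are [0, max) */
};

/*
 * Resolve (type, offset) to a device, or NULL if the pair cannot be valid.
 * Unlike a swapoff scan, any active device is acceptable here, so the
 * offset must be checked against the max of whichever device matched.
 */
static const struct model_swap_dev *
model_lookup(const struct model_swap_dev *devs, unsigned int type,
	     unsigned long offset)
{
	assert(devs != NULL);

	if (type >= MODEL_MAX_DEVICES)
		return NULL;  /* type bits do not name a table slot */
	if (!devs[type].in_use)
		return NULL;  /* no device registered for this type */
	if (offset >= devs[type].max)
		return NULL;  /* offset outside this device's area */
	return &devs[type];
}
```

A caller in this model would skip the entry whenever model_lookup() returns NULL, which is the same outcome as the `continue` in the patch.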

Re: [PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by Kairui Song 2 months, 1 week ago

On Mon, Dec 1, 2025 at 5:39 PM Deepanshu Kartikey <kartikey406@gmail.com> wrote:
>
> syzbot reported a WARNING in __swap_offset_to_cluster() triggered by
> an invalid swap offset during swapoff:
>
>   WARNING: CPU: 0 PID: 9861 at mm/swap.h:87 swap_cache_get_folio+0x186/0x200
>
> The issue occurs because unuse_pte_range() extracts a swap entry from
> a PTE and uses the offset without validating it is within bounds of
> the swap area.
>
> While the existing swp_type() check filters entries for other swap
> areas, it cannot catch cases where the type bits are valid but the
> offset is corrupted or stale - for example, due to a race condition
> during PTE updates or memory corruption.
>
> Add validation to ensure offset < si->max before using the swap entry.
>
> Reported-by: syzbot+d7bc9ec4a100437aa7a2@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=d7bc9ec4a100437aa7a2

Thanks for posting a fix!

But it seems the report is no longer triggering after the softleaf v3
change, right?

Checking the syzbot link, the last reproduction was on 11/11, and my
analysis was posted here:
https://lore.kernel.org/all/CAMgjq7B=OizLoqKca3RjeV0h3p0GQ4uen+gDo3=WdAxQ1gfxnw@mail.gmail.com/

Then softleaf v3 was merged, and the warning is gone.

Your analysis:
> for example, due to a race condition
> during PTE updates or memory corruption.

What kind of race would lead to an invalid swap entry in the page table?
During swapoff no one can allocate any swap entry from this swap
device, and the swap type can't be used by other swap devices, so any
swap entry still in the page table must be a valid swap entry that was
allocated from this swap device before swapoff started. And we do not
release the swap_map or si->cluster_info until swapoff is done, so there
seems to be no risk of OOB or UAF.

Memory corruption may cause it indeed, but memory corruption can also
cause failures in too many ways.

I'm not against a sanity check like this though, just want to double
check before we proceed.

> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
>  mm/swapfile.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 46d2008e4b99..fdf358df7116 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -2277,6 +2277,8 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
>                         continue;
>
>                 offset = swp_offset(entry);
> +               if (offset >= si->max)
> +                       continue;
>                 pte_unmap(pte);
>                 pte = NULL;
>
> --
> 2.43.0
>
>

Re: [PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by Deepanshu Kartikey 2 months, 1 week ago

Hi Kairui,

Thank you for the detailed feedback!

> But it seems the report is no longer triggering after the softleaf v3
> change right? Checking the syzbot link, last reproduce was 11/11

You're right - I should have checked the syzbot status more carefully.
If softleaf v3 has already fixed this, then this patch may not be
needed.

Could you point me to which specific change in softleaf v3 fixed it?
I'd like to understand the root cause better.

> What kind of race would lead to an invalid swap entry in the page table?

You make a good point. I was speculating about possible causes without
concrete evidence.

> I'm not against a sanity check like this though, just want to double
> check before we proceed.

If softleaf v3 has fixed the underlying issue, I can withdraw this
patch. Or if you think a defensive sanity check still has value, I can
update the commit message to reflect that it is defensive hardening
rather than a fix for an active bug.

Please let me know how you'd like to proceed.

Thanks,
Deepanshu

Re: [PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by Kairui Song 2 months ago

On Mon, Dec 1, 2025 at 6:48 PM Deepanshu Kartikey <kartikey406@gmail.com> wrote:
>
> Hi Kairui,
>
> Thank you for the detailed feedback!

You are welcome :),

> > But it seems the report is no longer triggering after the softleaf v3
> > change right? Checking the syzbot link, last reproduce was 11/11
>
> You're right - I should have checked the syzbot status more carefully.
> If softleaf v3 has already fixed this, then this patch may not be
> needed.
>
> Could you point me to which specific change in softleaf v3 fixed it?
> I'd like to understand the root cause better.

This one, I think Lorenzo included it or a similar fix along with
another fix in swapfile.c:
https://lore.kernel.org/all/CAMgjq7AP383YfU3L5ZxJ9U3x-vRPnEkEUtmnPdXD29HiNC8OrA@mail.gmail.com/

>
> > What kind of race would lead to an invalid swap entry in the page table?
>
> You make a good point. I was speculating about possible causes without
> concrete evidence.
>
> > I'm not against a sanity check like this though, just want to double
> > check before we proceed.
>
> If softleaf v3 has fixed the underlying issue, I can withdraw this
> patch. Or if you think a defensive sanity check still has value, I can
> update the commit message to reflect that it is defensive hardening
> rather than a fix for an active bug.

A sanity check here is acceptable since swapoff is cold and the
overhead is hardly visible. No strong opinion on this one.

Re: [PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by Deepanshu Kartikey 2 months ago

On Wed, Dec 3, 2025 at 8:24 AM Kairui Song <ryncsn@gmail.com> wrote:
> > If softleaf v3 has fixed the underlying issue, I can withdraw this
> > patch. Or if you think a defensive sanity check still has value, I can
> > update the commit message to reflect that it is defensive hardening
> > rather than a fix for an active bug.
>
> A sanity check here is acceptable since swapoff is cold and the
> overhead is hardly visible. No strong opinion on this one.

Hi Kairui,

Thank you for the link and clarification!

I'll study Lorenzo's fix to understand the root cause better.

Since you mentioned a sanity check is acceptable here, should I update
the commit message to frame this as defensive hardening rather than a
bug fix? Something like:

    mm/swapfile: add defensive bounds check in unuse_pte_range()

    Add a sanity check to validate the swap offset is within bounds
    before using it. While there is no known code path that can
    trigger an out-of-bounds offset, this provides defense against
    potential edge cases or memory corruption.

    The overhead is negligible since swapoff is a cold path.

Thanks,
Deepanshu

Re: [PATCH] mm/swapfile: validate swap offset in unuse_pte_range()

Posted by Chris Li 2 months ago

On Sat, Dec 6, 2025 at 4:28 PM Deepanshu Kartikey <kartikey406@gmail.com> wrote:
>
> On Wed, Dec 3, 2025 at 8:24 AM Kairui Song <ryncsn@gmail.com> wrote:
> > > If softleaf v3 has fixed the underlying issue, I can withdraw this
> > > patch. Or if you think a defensive sanity check still has value, I can
> > > update the commit message to reflect that it is defensive hardening
> > > rather than a fix for an active bug.
> >
> > A sanity check here is acceptable since swapoff is cold and the
> > overhead is hardly visible. No strong opinion on this one.
>
> Hi Kairui,
>
> Thank you for the link and clarification!

Thanks for working on this.

>
> I'll study Lorenzo's fix to understand the root cause better.
>
> Since you mentioned a sanity check is acceptable here, should I update
> the commit message to frame this as defensive hardening rather than a
> bug fix? Something like:
>
>     mm/swapfile: add defensive bounds check in unuse_pte_range()
>
>     Add a sanity check to validate the swap offset is within bounds
>     before using it. While there is no known code path that can
>     trigger an out-of-bounds offset, this provides defense against
>     potential edge cases or memory corruption.

Adding defensive hardening is justified. Basically, the kernel
shouldn't put an invalid swap entry in the pte. If the swap entry read
back from the pte is invalid (out of range), that means there is some
kind of bug that might be able to cause data corruption. Bail out and
add a WARN_ONCE, or the preferred equivalent, when this triggers, to
expose the possible data corruption.

I understand there is some opinion about avoiding the use of WARN. I
will leave it to others to comment on the best way to report the
possible data corruption. Because it is possible data corruption we are
talking about, silently skipping over it is worse.

That is just my take on this.

>
>     The overhead is negligible since swapoff is a cold path.

Ack.

Chris
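
To make the "bail out and warn once" suggestion concrete, here is a small userspace sketch of the control flow. It is not kernel code: fprintf() stands in for WARN_ONCE, and model_offset_ok(), model_scan() and model_warnings_emitted are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Number of warnings actually printed; exposed so the behaviour is observable. */
static unsigned int model_warnings_emitted;

/* Returns true if offset is in [0, max); reports the first violation only. */
static bool model_offset_ok(unsigned long offset, unsigned long max)
{
	static bool warned;  /* plays the role of WARN_ONCE's one-shot state */

	if (offset < max)
		return true;

	if (!warned) {
		warned = true;
		model_warnings_emitted++;
		fprintf(stderr, "model: swap offset %lu out of range (max %lu)\n",
			offset, max);
	}
	return false;
}

/* Walks a list of offsets and counts how many a scan would process. */
static size_t model_scan(const unsigned long *offsets, size_t n,
			 unsigned long max)
{
	size_t processed = 0;

	assert(offsets != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!model_offset_ok(offsets[i], max))
			continue;  /* skip the bad entry, but not silently */
		processed++;
	}
	return processed;
}
```

The point of the one-shot flag is that a corrupted page table with many bad entries still produces a visible report without flooding the log, while the scan itself carries on with the remaining entries.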