From: Kairui Song <kasong@tencent.com>
The allocator will reclaim cached slots while scanning. Currently, it
will try again if the reclaim found a folio that is already removed from
the swap cache due to a race. But the following lookup will be using the
wrong index. It won't cause any OOB issue since the swap cache index is
truncated upon lookup, but it may lead to reclaiming of an irrelevant
folio.
This should not cause a measurable issue, but we should fix it.
Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache")
Signed-off-by: Kairui Song <kasong@tencent.com>
---
mm/swapfile.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4b8ab2cb49ca..4c63fc62f4cb 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -240,13 +240,13 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
* Offset could point to the middle of a large folio, or folio
* may no longer point to the expected offset before it's locked.
*/
- entry = folio->swap;
- if (offset < swp_offset(entry) || offset >= swp_offset(entry) + nr_pages) {
+ if (offset < swp_offset(folio->swap) ||
+ offset >= swp_offset(folio->swap) + nr_pages) {
folio_unlock(folio);
folio_put(folio);
goto again;
}
- offset = swp_offset(entry);
+ offset = swp_offset(folio->swap);
need_reclaim = ((flags & TTRS_ANYWAY) ||
((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) ||
--
2.51.0
On 05.09.25 21:13, Kairui Song wrote: > From: Kairui Song <kasong@tencent.com> Subject: s/cahe/cache/ > > The allocator will reclaim cached slots while scanning. Currently, it > will try again if the reclaim found a folio that is already removed from s/the reclaim/reclaim/ > the swap cache due to a race. But the following lookup will be using the > wrong index. It won't cause any OOB issue since the swap cache index is > truncated upon lookup, but it may lead to reclaiming of an irrelevant > folio. > > This should not cause a measurable issue, but we should fix it. > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > Signed-off-by: Kairui Song <kasong@tencent.com> > --- > mm/swapfile.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 4b8ab2cb49ca..4c63fc62f4cb 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -240,13 +240,13 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, > * Offset could point to the middle of a large folio, or folio > * may no longer point to the expected offset before it's locked. > */ > - entry = folio->swap; > - if (offset < swp_offset(entry) || offset >= swp_offset(entry) + nr_pages) { > + if (offset < swp_offset(folio->swap) || > + offset >= swp_offset(folio->swap) + nr_pages) { > folio_unlock(folio); > folio_put(folio); > goto again; > } > - offset = swp_offset(entry); > + offset = swp_offset(folio->swap); > > need_reclaim = ((flags & TTRS_ANYWAY) || > ((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) || Acked-by: David Hildenbrand <david@redhat.com> -- Cheers David / dhildenb
On 2025/9/6 03:13, Kairui Song wrote: > From: Kairui Song <kasong@tencent.com> > > The allocator will reclaim cached slots while scanning. Currently, it > will try again if the reclaim found a folio that is already removed from > the swap cache due to a race. But the following lookup will be using the > wrong index. It won't cause any OOB issue since the swap cache index is > truncated upon lookup, but it may lead to reclaiming of an irrelevant > folio. > > This should not cause a measurable issue, but we should fix it. > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > Signed-off-by: Kairui Song <kasong@tencent.com> > --- Good catch. LGTM. Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > mm/swapfile.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 4b8ab2cb49ca..4c63fc62f4cb 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -240,13 +240,13 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, > * Offset could point to the middle of a large folio, or folio > * may no longer point to the expected offset before it's locked. > */ > - entry = folio->swap; > - if (offset < swp_offset(entry) || offset >= swp_offset(entry) + nr_pages) { > + if (offset < swp_offset(folio->swap) || > + offset >= swp_offset(folio->swap) + nr_pages) { > folio_unlock(folio); > folio_put(folio); > goto again; > } > - offset = swp_offset(entry); > + offset = swp_offset(folio->swap); > > need_reclaim = ((flags & TTRS_ANYWAY) || > ((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) ||
Hi Kairui, The patch looks obviously correct to me with some very minor nitpicks following. Acked-by: Chris Li <chrisl@kernel.org> On Fri, Sep 5, 2025 at 12:14 PM Kairui Song <ryncsn@gmail.com> wrote: > > From: Kairui Song <kasong@tencent.com> > > The allocator will reclaim cached slots while scanning. Currently, it > will try again if the reclaim found a folio that is already removed from > the swap cache due to a race. But the following lookup will be using the > wrong index. It won't cause any OOB issue since the swap cache index is > truncated upon lookup, but it may lead to reclaiming of an irrelevant > folio. > > This should not cause a measurable issue, but we should fix it. > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > Signed-off-by: Kairui Song <kasong@tencent.com> > --- > mm/swapfile.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 4b8ab2cb49ca..4c63fc62f4cb 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -240,13 +240,13 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, > * Offset could point to the middle of a large folio, or folio > * may no longer point to the expected offset before it's locked. > */ > - entry = folio->swap; Nitpick: This and the following reuse the folio->swap dereference and swp_offset() many times. You can use some local variables to cache the value into a register and less function calls. I haven't looked into if the compiler will do the same expression elimination on this, a good compiler should. The following looks less busy and doesn't need the compiler to optimize it for you. fe = folio->swap; eoffset = swp_offset(fe); if (offset < eoffset ) || offset >= eoffset + nr_pages) { ... } offset = eoffset; This might generate better code due to less function code. If the compiler does the perfect jobs the original code can generate the same optimized code as well. > - if (offset < swp_offset(entry) || offset >= swp_offset(entry) + nr_pages) { > + if (offset < swp_offset(folio->swap) || > + offset >= swp_offset(folio->swap) + nr_pages) { > folio_unlock(folio); > folio_put(folio); > goto again; > } > - offset = swp_offset(entry); > + offset = swp_offset(folio->swap); So the first entry is only assigned once in the function and never changed? You can use const to declare it. Chris > > need_reclaim = ((flags & TTRS_ANYWAY) || > ((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) || > -- > 2.51.0 > >
On Sat, Sep 6, 2025 at 11:19 AM Chris Li <chrisl@kernel.org> wrote: > > Hi Kairui, > > The patch looks obviously correct to me with some very minor nitpicks following. > > Acked-by: Chris Li <chrisl@kernel.org> > > On Fri, Sep 5, 2025 at 12:14 PM Kairui Song <ryncsn@gmail.com> wrote: > > > > From: Kairui Song <kasong@tencent.com> > > > > The allocator will reclaim cached slots while scanning. Currently, it > > will try again if the reclaim found a folio that is already removed from > > the swap cache due to a race. But the following lookup will be using the > > wrong index. It won't cause any OOB issue since the swap cache index is > > truncated upon lookup, but it may lead to reclaiming of an irrelevant > > folio. > > > > This should not cause a measurable issue, but we should fix it. > > > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > > Signed-off-by: Kairui Song <kasong@tencent.com> > > --- > > mm/swapfile.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/mm/swapfile.c b/mm/swapfile.c > > index 4b8ab2cb49ca..4c63fc62f4cb 100644 > > --- a/mm/swapfile.c > > +++ b/mm/swapfile.c > > @@ -240,13 +240,13 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, > > * Offset could point to the middle of a large folio, or folio > > * may no longer point to the expected offset before it's locked. > > */ > > - entry = folio->swap; > Nitpick: > This and the following reuse the folio->swap dereference and > swp_offset() many times. > You can use some local variables to cache the value into a register > and less function calls. I haven't looked into if the compiler will do > the same expression elimination on this, a good compiler should. The > following looks less busy and doesn't need the compiler to optimize it > for you. > > fe = folio->swap; > eoffset = swp_offset(fe); > if (offset < eoffset ) || offset >= eoffset + nr_pages) { > ... > } > offset = eoffset; > > This might generate better code due to less function code. If the > compiler does the perfect jobs the original code can generate the same > optimized code as well. Right, this part of the code will be gone soon so I think maybe better to keep the change minimal, and it's not a hot path. > > > - if (offset < swp_offset(entry) || offset >= swp_offset(entry) + nr_pages) { > > + if (offset < swp_offset(folio->swap) || > > + offset >= swp_offset(folio->swap) + nr_pages) { > > folio_unlock(folio); > > folio_put(folio); > > goto again; > > } > > - offset = swp_offset(entry); > > + offset = swp_offset(folio->swap); > > So the first entry is only assigned once in the function and never changed? > > You can use const to declare it. That's a very good point, thanks! > > Chris > > > > > need_reclaim = ((flags & TTRS_ANYWAY) || > > ((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) || > > -- > > 2.51.0 > > > > >
On Fri, Sep 5, 2025 at 11:29 PM Kairui Song <ryncsn@gmail.com> wrote: > > On Sat, Sep 6, 2025 at 11:19 AM Chris Li <chrisl@kernel.org> wrote: > > > > Hi Kairui, > > > > The patch looks obviously correct to me with some very minor nitpicks following. > > > > Acked-by: Chris Li <chrisl@kernel.org> > > > > On Fri, Sep 5, 2025 at 12:14 PM Kairui Song <ryncsn@gmail.com> wrote: > > > > > > From: Kairui Song <kasong@tencent.com> > > > > > > The allocator will reclaim cached slots while scanning. Currently, it > > > will try again if the reclaim found a folio that is already removed from > > > the swap cache due to a race. But the following lookup will be using the > > > wrong index. It won't cause any OOB issue since the swap cache index is > > > truncated upon lookup, but it may lead to reclaiming of an irrelevant > > > folio. > > > > > > This should not cause a measurable issue, but we should fix it. > > > > > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > > > Signed-off-by: Kairui Song <kasong@tencent.com> > > > --- > > > mm/swapfile.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/mm/swapfile.c b/mm/swapfile.c > > > index 4b8ab2cb49ca..4c63fc62f4cb 100644 > > > --- a/mm/swapfile.c > > > +++ b/mm/swapfile.c > > > @@ -240,13 +240,13 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si, > > > * Offset could point to the middle of a large folio, or folio > > > * may no longer point to the expected offset before it's locked. > > > */ > > > - entry = folio->swap; > > Nitpick: > > This and the following reuse the folio->swap dereference and > > swp_offset() many times. > > You can use some local variables to cache the value into a register > > and less function calls. I haven't looked into if the compiler will do > > the same expression elimination on this, a good compiler should. The > > following looks less busy and doesn't need the compiler to optimize it > > for you. > > > > fe = folio->swap; > > eoffset = swp_offset(fe); > > if (offset < eoffset ) || offset >= eoffset + nr_pages) { > > ... > > } > > offset = eoffset; > > > > This might generate better code due to less function code. If the > > compiler does the perfect jobs the original code can generate the same > > optimized code as well. > > Right, this part of the code will be gone soon so I think maybe better > to keep the change minimal, and it's not a hot path. Ack. It is nitpick anyway. Most likely doesn't make a difference to modern compilers anyway. Chris
On Fri, Sep 5, 2025 at 12:14 PM Kairui Song <ryncsn@gmail.com> wrote: > > From: Kairui Song <kasong@tencent.com> > > The allocator will reclaim cached slots while scanning. Currently, it > will try again if the reclaim found a folio that is already removed from > the swap cache due to a race. But the following lookup will be using the > wrong index. It won't cause any OOB issue since the swap cache index is > truncated upon lookup, but it may lead to reclaiming of an irrelevant I mean if there is a race, folio->swap could literally be anything right? Can the following happen: between the filemap_get_folio() lookup and the locking, the folio can have its swap slot freed up, then obtain a new swap slot, potentially from an entirely different swapfile (i.e different swp_type(folio->swap)). It is very unlikely, and in many setups you only have one swapfile. Still... > folio. > > This should not cause a measurable issue, but we should fix it. > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > Signed-off-by: Kairui Song <kasong@tencent.com> Yeah that's pretty nuanced lol. It is unlikely to cause any issue indeed - we're just occasionally swap-cache-reclaim some rando folio haha. Anyway, FWIW: Acked-by: Nhat Pham <nphamcs@gmail.com>
On Sat, Sep 6, 2025 at 7:12 AM Nhat Pham <nphamcs@gmail.com> wrote: > > On Fri, Sep 5, 2025 at 12:14 PM Kairui Song <ryncsn@gmail.com> wrote: > > > > From: Kairui Song <kasong@tencent.com> > > > > The allocator will reclaim cached slots while scanning. Currently, it > > will try again if the reclaim found a folio that is already removed from > > the swap cache due to a race. But the following lookup will be using the > > wrong index. It won't cause any OOB issue since the swap cache index is > > truncated upon lookup, but it may lead to reclaiming of an irrelevant > > I mean if there is a race, folio->swap could literally be anything > right? Can the following happen: between the filemap_get_folio() > lookup and the locking, the folio can have its swap slot freed up, > then obtain a new swap slot, potentially from an entirely different > swapfile (i.e different swp_type(folio->swap)). > > It is very unlikely, and in many setups you only have one swapfile. Still... Yeah, but fortunately nothing under the `again:` will touch the address_space, here so a random value only causes a random lookup offset in a valid addresss_space, which is completely fine. > > > folio. > > > > This should not cause a measurable issue, but we should fix it. > > > > Fixes: fae8595505313 ("mm, swap: avoid reclaiming irrelevant swap cache") > > Signed-off-by: Kairui Song <kasong@tencent.com> > > Yeah that's pretty nuanced lol. It is unlikely to cause any issue > indeed - we're just occasionally swap-cache-reclaim some rando folio > haha. > > Anyway, FWIW: > > Acked-by: Nhat Pham <nphamcs@gmail.com> Thanks.
© 2016 - 2025 Red Hat, Inc.