From: Kairui Song <kasong@tencent.com>

Currently, none of the folio_contains callers will encounter swap
cache folios.

For fs/ callers, swap cache folios are never part of their workflow.

For filemap and truncate, folio_contains is only used for sanity checks
to verify the folio index matches the expected lookup/invalidation target.
The swap cache does not utilize filemap or truncate helpers in ways that
would trigger these checks, as it mostly implements its own cache management.

Shmem won't interact with these sanity checks either unless something
went wrong, in which case it would directly trigger a BUG, because swap
cache indices are unrelated to shmem indices and would almost certainly
mismatch (barring a collision).

While some filemap helpers do work on the swap cache space, the swap
cache differs from the page cache in many ways, so this helper is
unlikely to work in a helpful way for swap cache folios.

So make it explicit here that folio_contains should not be used for
swap cache folios. This helps avoid misuse and removes the folio_index
usage here.
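
As an illustration only (not part of the diff below; `mapping` and
`index` stand for whatever the caller looked up), the expected caller
pattern is roughly:

	/*
	 * The folio lock prevents (eg) shmem from moving the folio
	 * between the page cache and swap cache and changing its index
	 * in the middle of the operation; checking the mapping first
	 * ensures a swap cache folio is never handed to folio_contains().
	 */
	folio_lock(folio);
	if (folio->mapping == mapping)
		VM_BUG_ON_FOLIO(!folio_contains(folio, index), folio);
	folio_unlock(folio);
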
Signed-off-by: Kairui Song <kasong@tencent.com>
---
include/linux/pagemap.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index af25fb640463..a0bed4568c66 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -935,14 +935,14 @@ static inline struct page *folio_file_page(struct folio *folio, pgoff_t index)
  * @folio: The folio.
  * @index: The page index within the file.
  *
- * Context: The caller should have the page locked in order to prevent
- * (eg) shmem from moving the page between the page cache and swap cache
- * and changing its index in the middle of the operation.
+ * Context: The caller should ensure folio->index is stable and it's
+ * not added to the swap cache.
  * Return: true or false.
  */
 static inline bool folio_contains(struct folio *folio, pgoff_t index)
 {
-	return index - folio_index(folio) < folio_nr_pages(folio);
+	VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
+	return index - folio->index < folio_nr_pages(folio);
 }
 
 unsigned filemap_get_folios(struct address_space *mapping, pgoff_t *start,
--
2.49.0
On Mon, Apr 28, 2025 at 02:59:06AM +0800, Kairui Song wrote:
> For filemap and truncate, folio_contains is only used for sanity checks
> to verify the folio index matches the expected lookup/invalidation target.
> The swap cache does not utilize filemap or truncate helpers in ways that
> would trigger these checks, as it mostly implements its own cache management.
>
> Shmem won't interact with these sanity checks either unless something
> went wrong, in which case it would directly trigger a BUG, because swap
> cache indices are unrelated to shmem indices and would almost certainly
> mismatch (barring a collision).

It does happen though. If shmem is writing the folio to swap at the
same time that the file containing the folio is being truncated, we
can hit this.

> - * Context: The caller should have the page locked in order to prevent
> - * (eg) shmem from moving the page between the page cache and swap cache
> - * and changing its index in the middle of the operation.
> + * Context: The caller should ensure folio->index is stable and it's
> + * not added to the swap cache.

I do think we need to keep the part about the folio being locked.
On Mon, Apr 28, 2025 at 8:44 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Mon, Apr 28, 2025 at 02:59:06AM +0800, Kairui Song wrote:
> > For filemap and truncate, folio_contains is only used for sanity checks
> > to verify the folio index matches the expected lookup/invalidation target.
> > The swap cache does not utilize filemap or truncate helpers in ways that
> > would trigger these checks, as it mostly implements its own cache management.
> >
> > Shmem won't interact with these sanity checks either unless something
> > went wrong, in which case it would directly trigger a BUG, because swap
> > cache indices are unrelated to shmem indices and would almost certainly
> > mismatch (barring a collision).
>
> It does happen though. If shmem is writing the folio to swap at the
> same time that the file containing the folio is being truncated, we
> can hit this.

Thanks for the info! I didn't check it in detail because that would
likely trigger a BUG_ON but so far I didn't see any BUG_ON commit from
there.

Just checked there are two users in truncate:

One will lock the folio and check if `folio->mapping != mapping`
first, on swapout shmem removes the folio from shmem mapping so this
check will skip the folio_contains check.

Another seems might hit the check, the time window is extremely tiny
though, only if the truncate's `xa_is_value(folio)` check passed while
another CPU is running between `folio_alloc_swap` and
`shmem_delete_from_page_cache` in shmem_writepage, then the next
VM_BUG_ON_FOLIO(!folio_contains) will fail as folio is now a
swap cache, not shmem folio anymore. Let me double check if this needs
another fix.
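
To spell out that window, here is a simplified picture of the
interleaving as I read it (the interleaving itself is my
reconstruction, not a claim about the exact upstream code):

	/*
	 *   CPU A: shmem_writepage()          CPU B: truncate
	 *   ------------------------          -----------------------------
	 *   folio_alloc_swap(folio)           xa_is_value(folio) is false, so
	 *                                     the folio is still treated as a
	 *                                     normal shmem page cache folio
	 *   add the folio to the swap cache
	 *   shmem_delete_from_page_cache()
	 *                                     VM_BUG_ON_FOLIO(!folio_contains(...))
	 *                                     fires: the folio is now a swap cache
	 *                                     folio whose index is a swap offset,
	 *                                     unrelated to the shmem index
	 */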
On Mon, Apr 28, 2025 at 10:58 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> On Mon, Apr 28, 2025 at 8:44 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Mon, Apr 28, 2025 at 02:59:06AM +0800, Kairui Song wrote:
> > > For filemap and truncate, folio_contains is only used for sanity checks
> > > to verify the folio index matches the expected lookup/invalidation target.
> > > The swap cache does not utilize filemap or truncate helpers in ways that
> > > would trigger these checks, as it mostly implements its own cache management.
> > >
> > > Shmem won't interact with these sanity checks either unless something
> > > went wrong, in which case it would directly trigger a BUG, because swap
> > > cache indices are unrelated to shmem indices and would almost certainly
> > > mismatch (barring a collision).
> >
> > It does happen though. If shmem is writing the folio to swap at the
> > same time that the file containing the folio is being truncated, we
> > can hit this.
>
> Thanks for the info! I didn't check it in detail because that would
> likely trigger a BUG_ON but so far I didn't see any BUG_ON commit from
> there.
>
> Just checked there are two users in truncate:
>
> One will lock the folio and check if `folio->mapping != mapping`
> first, on swapout shmem removes the folio from shmem mapping so this
> check will skip the folio_contains check.
>
> Another seems might hit the check, the time window is extremely tiny
> though, only if the truncate's `xa_is_value(folio)` check passed while
> another CPU is running between `folio_alloc_swap` and
> `shmem_delete_from_page_cache` in shmem_writepage, then the next
> VM_BUG_ON_FOLIO(!folio_contains) will fail as folio is now a
> swap cache, not shmem folio anymore. Let me double check if this needs
> another fix.

Checking all the code paths, shmem manages to avoid all possible ways
of calling into truncate_inode_pages_range, which is the only function
that looks like it may call folio_contains with a swap cache folio
(except tiny-shmem, which uses this function directly for truncation;
we can ignore that as it's basically just ramfs).

For truncation, shmem needs to either zap a whole (large) swap/folio
entry, zero part of a folio, or swap in a large folio so that part of
it can be zeroed (using shmem_get_partial_folio). The swapin part is a
bit special, so calling generic truncate helpers might cause unexpected
behaviour. It's a similar story for filemap lookups.

So shmem won't call into the truncate helpers here that risk calling
folio_contains with a swap cache folio.

Even if it somehow does, this commit won't change the BUG_ON behaviour,
except that now it tells the user the folio shouldn't be in the swap
cache at all, instead of reporting a bogus index. So I think this
commit is good to have, to make the swap cache less exposed.
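
For reference, here is a trimmed sketch of how shmem wires in its own
callbacks instead of the generic paths annotated in the call chain
list below (from my reading of mm/shmem.c, with the field lists
shortened):

	static const struct inode_operations shmem_inode_operations = {
		.setattr	= shmem_setattr,	/* instead of simple_setattr */
		/* ... */
	};

	static const struct super_operations shmem_ops = {
		/* evict() then skips the generic truncate_inode_pages_final() path */
		.evict_inode	= shmem_evict_inode,
		/* ... */
	};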
---
List of potential call chains that may reach the truncate helpers
discussed here and are not initiated from other FS / block code; none
of them will be used by shmem:
do_dentry_open /* filemap_nr_thps always 0 for shmem */
  truncate_inode_pages
    truncate_inode_pages_range

dquot_disable /* No quota file for shmem */
  truncate_inode_pages
    truncate_inode_pages_range

dquot_quota_sync /* No quota file for shmem */
  truncate_inode_pages
    truncate_inode_pages_range

truncate_inode_pages_final /* Override by shmem_evict_inode */
  truncate_inode_pages
    truncate_inode_pages_range

simple_setattr /* Override by shmem_setattr */
  truncate_setsize
    truncate_pagecache
      truncate_inode_pages
        truncate_inode_pages_range

put_aio_ring_file /* AIO calls it for private file */
  truncate_setsize
    truncate_pagecache
      truncate_inode_pages
        truncate_inode_pages_range

truncate_pagecache /* No user except other fs */
  truncate_inode_pages
    truncate_inode_pages_range

truncate_pagecache_range /* No user except other fs */
  truncate_inode_pages_range
---
invalidate_inode_pages2 /* No user except other fs */
  invalidate_inode_pages2_range

filemap_invalidate_pages /* only used by block / direct io */
  invalidate_inode_pages2_range

filemap_invalidate_inode /* No user except other fs */
  invalidate_inode_pages2_range

kiocb_invalidate_post_direct_write /* only used by block / direct io */
  invalidate_inode_pages2_range