From: Yang Yang <yang.yang29@zte.com.cn>
shadow_nodes is used by workingset handling to reclaim shadow nodes.
It has been updated on page cache insertion and deletion ever since
workingset only supported the page cache. But when workingset gained
support for anonymous page detection, we missed updating shadow_nodes
for it. As a result, shadow nodes of anonymous pages are never
reclaimed by scan_shadow_nodes(), even when they use a lot of memory
and the system is under memory pressure.
So update shadow_nodes for anonymous pages when entries are added to
or deleted from the swap cache, by calling
xas_set_update(..workingset_update_node).
Fixes: aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
---
change for v3
- Rewrite the commit log to describe the change in imperative mood. Thanks to
Bagas Sanjaya.
change for v2
- Include a description of the user-visible effect. Add Fixes tag. Modify comments.
Also call workingset_update_node() in clear_shadow_from_swap_cache(). Thanks
to Matthew Wilcox.
---
include/linux/xarray.h | 3 ++-
mm/swap_state.c | 6 ++++++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 44dd6d6e01bc..5cc1f718fec9 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1643,7 +1643,8 @@ static inline void xas_set_order(struct xa_state *xas, unsigned long index,
  * @update: Function to call when updating a node.
  *
  * The XArray can notify a caller after it has updated an xa_node.
- * This is advanced functionality and is only needed by the page cache.
+ * This is advanced functionality and is only needed by the page cache
+ * and swap cache.
  */
 static inline void xas_set_update(struct xa_state *xas, xa_update_node_t update)
 {
diff --git a/mm/swap_state.c b/mm/swap_state.c
index cb9aaa00951d..7a003d8abb37 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -94,6 +94,8 @@ int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
 	unsigned long i, nr = folio_nr_pages(folio);
 	void *old;
 
+	xas_set_update(&xas, workingset_update_node);
+
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_swapbacked(folio), folio);
@@ -145,6 +147,8 @@ void __delete_from_swap_cache(struct folio *folio,
 	pgoff_t idx = swp_offset(entry);
 	XA_STATE(xas, &address_space->i_pages, idx);
 
+	xas_set_update(&xas, workingset_update_node);
+
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_swapcache(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
@@ -252,6 +256,8 @@ void clear_shadow_from_swap_cache(int type, unsigned long begin,
 		struct address_space *address_space = swap_address_space(entry);
 		XA_STATE(xas, &address_space->i_pages, curr);
 
+		xas_set_update(&xas, workingset_update_node);
+
 		xa_lock_irq(&address_space->i_pages);
 		xas_for_each(&xas, old, end) {
 			if (!xa_is_value(old))
--
2.15.2
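
To illustrate why the missing xas_set_update() call matters, here is a minimal user-space sketch of the callback pattern, not the kernel implementation. Every name in it (model_node, model_state, model_store() and so on) is hypothetical, and the rule "track a node only while all of its occupied slots hold shadow entries" is an assumption about how workingset_update_node() decides. The point it shows: when no update callback is registered, a node full of shadow entries never reaches the list that the shrinker scans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an xa_node: only the fields the model needs. */
struct model_node {
	unsigned int count;      /* occupied slots */
	unsigned int nr_values;  /* slots holding shadow (value) entries */
	bool on_shadow_list;     /* is the node tracked for reclaim? */
};

typedef void (*model_update_fn)(struct model_node *node);

/* Hypothetical stand-in for an xa_state: it carries the optional callback. */
struct model_state {
	model_update_fn update;  /* NULL unless model_set_update() was called */
};

/* Length of the modelled shadow_nodes list. */
static unsigned long model_shadow_list_len;

/* Assumed rule: track a node only while it holds nothing but shadow entries. */
static void model_workingset_update(struct model_node *node)
{
	bool only_shadows = node->count && node->count == node->nr_values;

	if (only_shadows && !node->on_shadow_list) {
		node->on_shadow_list = true;
		model_shadow_list_len++;
	} else if (!only_shadows && node->on_shadow_list) {
		node->on_shadow_list = false;
		model_shadow_list_len--;
	}
}

static void model_set_update(struct model_state *st, model_update_fn fn)
{
	st->update = fn;
}

/* Change a node's contents, then notify the callback if one is registered. */
static void model_store(struct model_state *st, struct model_node *node,
			unsigned int count, unsigned int nr_values)
{
	assert(nr_values <= count);
	node->count = count;
	node->nr_values = nr_values;
	if (st->update)
		st->update(node);
}
```

In this model, calling model_store() on a state whose callback was never set leaves model_shadow_list_len at zero no matter how many shadow entries the node holds. That mirrors the situation the commit message describes for the swap cache before this patch.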
On Fri, Jan 13, 2023 at 05:36:45PM +0800, yang.yang29@zte.com.cn wrote:
> From: Yang Yang <yang.yang29@zte.com.cn>
>
> Shadow_nodes is for shadow nodes reclaiming of workingset handling,
> it is updated when page cache add or delete since long time ago
> workingset only supported page cache. But when workingset supports
> anonymous page detection, we missied updating shadow nodes for
> it. This caused that shadow nodes of anonymous page will never be
> reclaimd by scan_shadow_nodes() even they use much memory and
> system memory is tense.
>
> So update shadow_nodes of anonymous page when swap cache is
> add or delete by calling xas_set_update(..workingset_update_node).

What testing did you do of this?  I have this crash in today's testing:

04304 BUG: kernel NULL pointer dereference, address: 0000000000000080
04304 #PF: supervisor read access in kernel mode
04304 #PF: error_code(0x0000) - not-present page
04304 PGD 0 P4D 0
04304 Oops: 0000 [#1] PREEMPT SMP NOPTI
04304 CPU: 4 PID: 3219629 Comm: sh Kdump: loaded Not tainted 6.2.0-rc4-next-20230116-00016-gd289d3de8ce5-dirty #69
04304 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
04304 RIP: 0010:_raw_spin_trylock+0x12/0x50
04304 Code: e0 41 5c 5d c3 89 c6 48 89 df e8 89 06 00 00 4c 89 e0 5b 41 5c 5d c3 90 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 be 5b 71 ff <8b> 03 85 c0 75 16 ba 01 00 00 00 f0 0f b1 13 b8 01 00 00 00 75 06
04304 RSP: 0018:ffff888059afbbb8 EFLAGS: 00010093
04304 RAX: 0000000000000003 RBX: 0000000000000080 RCX: 0000000000000000
04304 RDX: 0000000000000000 RSI: ffff8880033e24c8 RDI: 0000000000000001
04304 RBP: ffff888059afbbc0 R08: 0000000000000000 R09: ffff888059afbd68
04304 R10: ffff88807d9db868 R11: 0000000000000000 R12: ffff8880033e24c0
04304 R13: ffff88800a1d8008 R14: ffff8880033e24c8 R15: ffff8880033e24c0
04304 FS: 00007feeeabc6740(0000) GS:ffff88807d900000(0000) knlGS:0000000000000000
04304 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
04304 CR2: 0000000000000080 CR3: 0000000059830003 CR4: 0000000000770ea0
04304 PKRU: 55555554
04304 Call Trace:
04304 <TASK>
04304 shadow_lru_isolate+0x3a/0x120
04304 __list_lru_walk_one+0xa3/0x190
04304 ? memcg_list_lru_alloc+0x330/0x330
04304 ? memcg_list_lru_alloc+0x330/0x330
04304 list_lru_walk_one_irq+0x59/0x80
04304 scan_shadow_nodes+0x27/0x30
04304 do_shrink_slab+0x13b/0x2e0
04304 shrink_slab+0x92/0x250
04304 drop_slab+0x41/0x90
04304 drop_caches_sysctl_handler+0x70/0x80
04304 proc_sys_call_handler+0x162/0x210
04304 proc_sys_write+0xe/0x10
04304 vfs_write+0x1c7/0x3a0
04304 ksys_write+0x57/0xd0
04304 __x64_sys_write+0x14/0x20
04304 do_syscall_64+0x34/0x80
04304 entry_SYSCALL_64_after_hwframe+0x63/0xcd
04304 RIP: 0033:0x7feeeacc1190

Decoding it, shadow_lru_isolate+0x3a/0x120 maps back to this line:

        if (!spin_trylock(&mapping->host->i_lock)) {

i_lock is at offset 128 of struct inode, so that matches the dump.
I believe that swapper_spaces never have ->host set, so I don't
believe you've tested this patch since 51b8c1fe250d went in
back in 2021.
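
The decoding above rests on simple pointer arithmetic: taking the address of a member through a NULL base pointer yields the member's offset, which is why the faulting address (CR2 and RBX) is 0x80 = 128. The following user-space sketch reproduces that arithmetic under loud assumptions. model_inode and model_mapping are hypothetical stand-ins, and the 128 bytes of padding only mimic the offset quoted in the mail; they are not the real struct inode layout. Nothing is dereferenced, the address is only computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: padding places i_lock at offset 128, as in the mail. */
struct model_inode {
	unsigned char earlier_fields[128];
	int i_lock;
};

/* Hypothetical stand-in for struct address_space: only ->host is modelled. */
struct model_mapping {
	struct model_inode *host;
};

/*
 * Compute the address that &mapping->host->i_lock would refer to,
 * using integer arithmetic only, so a NULL host is safe here.
 */
static uintptr_t model_i_lock_address(const struct model_mapping *mapping)
{
	assert(mapping != NULL);
	return (uintptr_t)mapping->host + offsetof(struct model_inode, i_lock);
}
```

With host left NULL, as the mail says is the case for swapper_spaces, model_i_lock_address() returns 0x80 on platforms where a null pointer converts to integer zero, matching the address in the oops.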
> i_lock is at offset 128 of struct inode, so that matches the dump.
> I believe that swapper_spaces never have ->host set, so I don't
> believe you've tested this patch since 51b8c1fe250d went in
> back in 2021.

You are totally right. I reproduced the panic on linux-next and fixed it
in patch v4. I should have been more careful: I tested the patch on
Linux 5.14, which was a mistake. My apologies for the time wasted.

Thanks.
> What testing did you do of this?  I have this crash in today's testing:

My test is this:
1. Configure zram for swap.
2. Run some programs that malloc and access a large amount of memory,
   making sure they cause swapping.
3. Add printk() to count_shadow_nodes() and shadow_lru_isolate() and
   watch the output to make sure that shadow_nodes are really shrinking.

Really sorry for the inadequate testing. I will run more tests, including
drop_caches via sysctl.
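
Step 2 of the test procedure above could be driven by a helper along these lines. This is a sketch under assumptions, not the program actually used for the test: alloc_and_touch() is a hypothetical name, and the helper only allocates anonymous memory and writes one byte per page so that every page is faulted in. Whether that leads to swapping depends on choosing a size larger than the available RAM and on swap (for example zram) being configured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of anonymous memory and write one byte into every page
 * so that each page is really faulted in. Returns the buffer (the caller
 * frees it) and stores the number of pages written in *pages_touched.
 * Returns NULL if the page size is unavailable, bytes is 0, or malloc fails.
 */
static unsigned char *alloc_and_touch(size_t bytes, size_t *pages_touched)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	size_t off;

	assert(pages_touched != NULL);
	*pages_touched = 0;
	if (page_size <= 0 || bytes == 0)
		return NULL;

	buf = malloc(bytes);
	if (!buf)
		return NULL;

	for (off = 0; off < bytes; off += (size_t)page_size) {
		buf[off] = (unsigned char)(off / (size_t)page_size);
		(*pages_touched)++;
	}
	return buf;
}
```

For a real run, the caller would pass a size well above physical memory and keep the buffer alive while watching the printk() output from step 3; freeing it right away releases the pages before reclaim has much to do.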