From: Leno Hou via B4 Relay
Date: Mon, 16 Mar 2026 02:18:29 +0800
Subject: [PATCH v3 2/2] mm/mglru: maintain workingset refault context across state transitions
Message-Id: <20260316-b4-switch-mglru-v2-v3-2-c846ce9a2321@gmail.com>
References: <20260316-b4-switch-mglru-v2-v3-0-c846ce9a2321@gmail.com>
In-Reply-To: <20260316-b4-switch-mglru-v2-v3-0-c846ce9a2321@gmail.com>
To: Andrew Morton, Axel Rasmussen, Yuanchu Xie, Wei Xu, Jialing Wang,
 Yafang Shao, Yu Zhao, Kairui Song, Bingfang Guo, Barry Song
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Leno Hou
Reply-To: lenohou@gmail.com

From: Leno Hou

When MGLRU state is toggled dynamically, existing shadow entries (eviction
tokens) lose their context. Traditional LRU and MGLRU handle workingset
refaults using different logic.
Without context, shadow entries processed by the "wrong" reclaim logic
trigger excessive page activations (pgactivate) and system thrashing,
because the kernel cannot tell whether a refaulted page was originally
managed by MGLRU or by the traditional LRU.

This patch introduces shadow entry context tracking:

- Encode MGLRU origin: Introduce WORKINGSET_MGLRU_SHIFT into the shadow
  entry (eviction token) encoding. This adds an 'is_mglru' bit to shadow
  entries, allowing the kernel to correctly identify the originating
  reclaim logic for a page even after the global MGLRU state has been
  toggled.

- Refault logic dispatch: Use this 'is_mglru' bit in workingset_refault()
  and workingset_test_recent() to dispatch refault events to the correct
  handler (lru_gen_refault() vs. traditional workingset refault).

This ensures that refaulted pages are handled by the appropriate reclaim
logic regardless of the current MGLRU enabled state, preventing
unnecessary thrashing and state-inconsistent refault activations during
state transitions.
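As a rough userspace illustration of the two points above (not kernel code
and not part of this patch), the sketch below packs an origin bit into the
lowest bit of a token and dispatches on it at refault time. All demo_*
names and the simplified layout are hypothetical: the real shadow entry
also carries the node and memcg IDs and is wrapped with xa_mk_value().

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified layout (assumption for illustration only):
 *   bit 0      origin flag (1 = evicted by MGLRU)
 *   bit 1      workingset flag
 *   bits 2..   eviction token
 */
#define DEMO_MGLRU_BITS		1
#define DEMO_WORKINGSET_BITS	1
#define DEMO_MGLRU_MASK		((1UL << DEMO_MGLRU_BITS) - 1)

enum demo_handler { DEMO_HANDLER_CLASSIC, DEMO_HANDLER_MGLRU };

/* Pack the token, then the workingset flag, then the origin flag last. */
static unsigned long demo_pack(unsigned long token, bool workingset,
			       bool is_mglru)
{
	unsigned long entry = token;

	entry = (entry << DEMO_WORKINGSET_BITS) | workingset;
	entry = (entry << DEMO_MGLRU_BITS) | is_mglru;
	return entry;
}

/* The origin flag is tested with a mask derived from its bit width. */
static bool demo_is_mglru(unsigned long entry)
{
	return entry & DEMO_MGLRU_MASK;
}

/* Dispatch on the origin recorded at eviction, not on any global state. */
static enum demo_handler demo_dispatch(unsigned long entry)
{
	return demo_is_mglru(entry) ? DEMO_HANDLER_MGLRU : DEMO_HANDLER_CLASSIC;
}

/* Strip both flag fields to recover the token. */
static unsigned long demo_token(unsigned long entry)
{
	return entry >> (DEMO_MGLRU_BITS + DEMO_WORKINGSET_BITS);
}
```

In this sketch the handler is chosen from the entry alone, so an entry
packed with is_mglru set still reaches the MGLRU handler after the global
switch has been flipped.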
To: Andrew Morton
To: Axel Rasmussen
To: Yuanchu Xie
To: Wei Xu
To: Barry Song <21cnbao@gmail.com>
To: Jialing Wang
To: Yafang Shao
To: Yu Zhao
To: Kairui Song
To: Bingfang Guo
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Leno Hou
---
 include/linux/swap.h |  2 +-
 mm/vmscan.c          | 17 ++++++++++++-----
 mm/workingset.c      | 22 +++++++++++++++-------
 3 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7a09df6977a5..5f7d3f08d840 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -297,7 +297,7 @@ static inline swp_entry_t page_swap_entry(struct page *page)
 bool workingset_test_recent(void *shadow, bool file, bool *workingset,
 				bool flush);
 void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
-void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg);
+void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg, bool lru_gen);
 void workingset_refault(struct folio *folio, void *shadow);
 void workingset_activation(struct folio *folio);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index bcefd8db9c03..de21343b5cd2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -180,6 +180,9 @@ struct scan_control {
 
 	/* for recording the reclaimed slab by now */
 	struct reclaim_state reclaim_state;
+
+	/* whether in lru gen scan context */
+	unsigned int lru_gen:1;
 };
 
 #ifdef ARCH_HAS_PREFETCHW
@@ -685,7 +688,7 @@ static pageout_t pageout(struct folio *folio, struct address_space *mapping,
  * gets returned with a refcount of 0.
  */
 static int __remove_mapping(struct address_space *mapping, struct folio *folio,
-			    bool reclaimed, struct mem_cgroup *target_memcg)
+			    bool reclaimed, struct mem_cgroup *target_memcg, struct scan_control *sc)
 {
 	int refcount;
 	void *shadow = NULL;
@@ -739,7 +742,7 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio,
 		swp_entry_t swap = folio->swap;
 
 		if (reclaimed && !mapping_exiting(mapping))
-			shadow = workingset_eviction(folio, target_memcg);
+			shadow = workingset_eviction(folio, target_memcg, sc->lru_gen);
 		memcg1_swapout(folio, swap);
 		__swap_cache_del_folio(ci, folio, swap, shadow);
 		swap_cluster_unlock_irq(ci);
@@ -765,7 +768,7 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio,
 		 */
 		if (reclaimed && folio_is_file_lru(folio) &&
 		    !mapping_exiting(mapping) && !dax_mapping(mapping))
-			shadow = workingset_eviction(folio, target_memcg);
+			shadow = workingset_eviction(folio, target_memcg, sc->lru_gen);
 		__filemap_remove_folio(folio, shadow);
 		xa_unlock_irq(&mapping->i_pages);
 		if (mapping_shrinkable(mapping))
@@ -802,7 +805,7 @@ static int __remove_mapping(struct address_space *mapping, struct folio *folio,
  */
 long remove_mapping(struct address_space *mapping, struct folio *folio)
 {
-	if (__remove_mapping(mapping, folio, false, NULL)) {
+	if (__remove_mapping(mapping, folio, false, NULL, NULL)) {
 		/*
 		 * Unfreezing the refcount with 1 effectively
 		 * drops the pagecache ref for us without requiring another
@@ -1499,7 +1502,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			count_vm_events(PGLAZYFREED, nr_pages);
 			count_memcg_folio_events(folio, PGLAZYFREED, nr_pages);
 		} else if (!mapping || !__remove_mapping(mapping, folio, true,
-							 sc->target_mem_cgroup))
+							 sc->target_mem_cgroup, sc))
 			goto keep_locked;
 
 		folio_unlock(folio);
@@ -1599,6 +1602,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
 		.may_unmap = 1,
+		.lru_gen = lru_gen_enabled(),
 	};
 	struct reclaim_stat stat;
 	unsigned int nr_reclaimed;
@@ -1993,6 +1997,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	if (nr_taken == 0)
 		return 0;
 
+	sc->lru_gen = 0;
 	nr_reclaimed = shrink_folio_list(&folio_list, pgdat, sc, &stat, false,
 					 lruvec_memcg(lruvec));
 
@@ -2167,6 +2172,7 @@ static unsigned int reclaim_folio_list(struct list_head *folio_list,
 		.may_unmap = 1,
 		.may_swap = 1,
 		.no_demotion = 1,
+		.lru_gen = lru_gen_enabled(),
 	};
 
 	nr_reclaimed = shrink_folio_list(folio_list, pgdat, &sc, &stat, true, NULL);
@@ -4864,6 +4870,7 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 	if (list_empty(&list))
 		return scanned;
 retry:
+	sc->lru_gen = 1;
 	reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false, memcg);
 	sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
 	sc->nr_reclaimed += reclaimed;
diff --git a/mm/workingset.c b/mm/workingset.c
index 07e6836d0502..3764a4a68c2c 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -181,8 +181,10 @@
  * refault distance will immediately activate the refaulting page.
  */
 
+#define WORKINGSET_MGLRU_SHIFT	1
 #define WORKINGSET_SHIFT 1
 #define EVICTION_SHIFT	((BITS_PER_LONG - BITS_PER_XA_VALUE) + \
+			 WORKINGSET_MGLRU_SHIFT + \
 			 WORKINGSET_SHIFT + NODES_SHIFT + \
 			 MEM_CGROUP_ID_SHIFT)
 #define EVICTION_SHIFT_ANON	(EVICTION_SHIFT + SWAP_COUNT_SHIFT)
@@ -200,12 +202,13 @@
 static unsigned int bucket_order[ANON_AND_FILE] __read_mostly;
 
 static void *pack_shadow(int memcgid, pg_data_t *pgdat, unsigned long eviction,
-			 bool workingset, bool file)
+			 bool workingset, bool file, bool is_mglru)
 {
 	eviction &= file ? EVICTION_MASK : EVICTION_MASK_ANON;
 	eviction = (eviction << MEM_CGROUP_ID_SHIFT) | memcgid;
 	eviction = (eviction << NODES_SHIFT) | pgdat->node_id;
 	eviction = (eviction << WORKINGSET_SHIFT) | workingset;
+	eviction = (eviction << WORKINGSET_MGLRU_SHIFT) | is_mglru;
 
 	return xa_mk_value(eviction);
 }
@@ -217,6 +220,7 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 	int memcgid, nid;
 	bool workingset;
 
+	entry >>= WORKINGSET_MGLRU_SHIFT;
 	workingset = entry & ((1UL << WORKINGSET_SHIFT) - 1);
 	entry >>= WORKINGSET_SHIFT;
 	nid = entry & ((1UL << NODES_SHIFT) - 1);
@@ -263,7 +267,7 @@ static void *lru_gen_eviction(struct folio *folio)
 	memcg_id = mem_cgroup_private_id(memcg);
 	rcu_read_unlock();
 
-	return pack_shadow(memcg_id, pgdat, token, workingset, type);
+	return pack_shadow(memcg_id, pgdat, token, workingset, type, true);
 }
 
 /*
@@ -387,7 +391,8 @@ void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages)
  * Return: a shadow entry to be stored in @folio->mapping->i_pages in place
  * of the evicted @folio so that a later refault can be detected.
  */
-void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg)
+void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg,
+				bool lru_gen)
 {
 	struct pglist_data *pgdat = folio_pgdat(folio);
 	int file = folio_is_file_lru(folio);
@@ -400,7 +405,7 @@ void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg)
 	VM_BUG_ON_FOLIO(folio_ref_count(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
-	if (lru_gen_enabled())
+	if (lru_gen)
 		return lru_gen_eviction(folio);
 
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
@@ -410,7 +415,7 @@ void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg)
 	eviction >>= bucket_order[file];
 	workingset_age_nonresident(lruvec, folio_nr_pages(folio));
 	return pack_shadow(memcgid, pgdat, eviction,
-				folio_test_workingset(folio), file);
+				folio_test_workingset(folio), file, false);
 }
 
 /**
@@ -436,8 +441,10 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset,
 	int memcgid;
 	struct pglist_data *pgdat;
 	unsigned long eviction;
+	unsigned long entry = xa_to_value(shadow);
+	bool is_mglru = entry & ((1UL << WORKINGSET_MGLRU_SHIFT) - 1);
 
-	if (lru_gen_enabled()) {
+	if (is_mglru) {
 		bool recent;
 
 		rcu_read_lock();
@@ -550,10 +557,11 @@ void workingset_refault(struct folio *folio, void *shadow)
 	struct lruvec *lruvec;
 	bool workingset;
 	long nr;
+	unsigned long entry = xa_to_value(shadow);
 
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 
-	if (lru_gen_enabled()) {
+	if (entry & ((1UL << WORKINGSET_MGLRU_SHIFT) - 1)) {
 		lru_gen_refault(folio, shadow);
 		return;
 	}
-- 
2.52.0