From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C97453E6398 for ; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; cv=none; b=JmUrHoVvix8tUfkzljWkehQkIJd9dT9h9TZE6NOp8OAAGixzszP7Ntbr7WV0LgqUTRsdOz8mb7O/XAVuyVMmp3i+UPBRlIn5AQuuQj/sy1NJjvJpzKQmYEZNVtduWH4KDF6i5bF8jolOaMvu5q2wDPSswQW9qQo3ErDgc0tAzpQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; c=relaxed/simple; bh=d6Ta1iP5mPFPkgxiD9u25lIVE7V40hNv4qpb8cszj/E=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=jAF+fat3phlHCuZ0LEUZHnrlxV9bSyP/sU1FpDZ1FQ2JEgTQuN+OVrj0eduVwfz396YHgmhF5wnel1m1oKRMLs5SSgCvJDLDGrlM7EbWA77YlH2aKzEvFlW/JlNm3wB3qtMXRYdPa0rCydDj+OTedaOMRWKLL9kCF18RDtosaR4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HKaDsTsU; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HKaDsTsU" Received: by smtp.kernel.org (Postfix) with ESMTPS id 7EB96C2BCB4; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313219; bh=d6Ta1iP5mPFPkgxiD9u25lIVE7V40hNv4qpb8cszj/E=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=HKaDsTsUBY+o3SmG696k6yt9SglM1TGcIBuBtoGo5xVCIr9tMIWzG81Ev4aGECdab pa4NU441zXdVQiwkoStsmNTuytCgXCCmmMTvNUp7Zimc9M6SxcJhnM5s/pO08hOEZe MPOeHovesJGpuCNs9VzkgN5XFYlQrQm9DJY3NTjS/CmKh1KthXM68BioqyGbQ2dFp8 
RIixHhnJtNvEu1HTmXWsoMU5+V0QdkuuvHB3LK2AFi7n7CJWNWp2EOxzAqHAK4iY0I 3MWeH92H4ssy/mdS/SP2ux4AwfQvSHd2ThAIrvscwno70i6TEubphZdwLeesBREb6l kP8WPrK2ulF1A== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76E6BFF8868; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:52 +0800 Subject: [PATCH v7 01/15] mm/mglru: consolidate common code for retrieving evictable size Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-1-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=3107; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=Y3kkkb3C3sOmTNRwB81JWRKI8V4j68zDrNpkhlE+boo=; b=0CL/lEP4LfHtocAYd24RRpXvBtZg+qnGZk3sapZjnVvC/jbObXoa8fpMb8U4ycnX0017WnoJl 8+JVaUBlFe4DX35z+D/F4kUuqs3ratkMEnO/emJ6E4lGATKi+qpT3GL X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Merge commonly used code for counting 
evictable folios in a lruvec. No behavior change. Acked-by: Yuanchu Xie Reviewed-by: Barry Song Reviewed-by: Chen Ridong Reviewed-by: Axel Rasmussen Reviewed-by: Baolin Wang Signed-off-by: Kairui Song Acked-by: Shakeel Butt --- mm/vmscan.c | 36 ++++++++++++++---------------------- 1 file changed, 14 insertions(+), 22 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index b2d89ed69d22..b80fbc4fc285 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4084,27 +4084,33 @@ static void set_initial_priority(struct pglist_data= *pgdat, struct scan_control sc->priority =3D clamp(priority, DEF_PRIORITY / 2, DEF_PRIORITY); } =20 -static bool lruvec_is_sizable(struct lruvec *lruvec, struct scan_control *= sc) +static unsigned long lruvec_evictable_size(struct lruvec *lruvec, int swap= piness) { int gen, type, zone; - unsigned long total =3D 0; - int swappiness =3D get_swappiness(lruvec, sc); + unsigned long seq, total =3D 0; struct lru_gen_folio *lrugen =3D &lruvec->lrugen; - struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); DEFINE_MAX_SEQ(lruvec); DEFINE_MIN_SEQ(lruvec); =20 for_each_evictable_type(type, swappiness) { - unsigned long seq; - for (seq =3D min_seq[type]; seq <=3D max_seq; seq++) { gen =3D lru_gen_from_seq(seq); - for (zone =3D 0; zone < MAX_NR_ZONES; zone++) total +=3D max(READ_ONCE(lrugen->nr_pages[gen][type][zone]), 0L); } } =20 + return total; +} + +static bool lruvec_is_sizable(struct lruvec *lruvec, struct scan_control *= sc) +{ + unsigned long total; + int swappiness =3D get_swappiness(lruvec, sc); + struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); + + total =3D lruvec_evictable_size(lruvec, swappiness); + /* whether the size is big enough to be helpful */ return mem_cgroup_online(memcg) ? 
(total >> sc->priority) : total; } @@ -4909,9 +4915,6 @@ static int evict_folios(unsigned long nr_to_scan, str= uct lruvec *lruvec, static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, int swappiness, unsigned long *nr_to_scan) { - int gen, type, zone; - unsigned long size =3D 0; - struct lru_gen_folio *lrugen =3D &lruvec->lrugen; DEFINE_MIN_SEQ(lruvec); =20 *nr_to_scan =3D 0; @@ -4919,18 +4922,7 @@ static bool should_run_aging(struct lruvec *lruvec, = unsigned long max_seq, if (evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS > max_seq) return true; =20 - for_each_evictable_type(type, swappiness) { - unsigned long seq; - - for (seq =3D min_seq[type]; seq <=3D max_seq; seq++) { - gen =3D lru_gen_from_seq(seq); - - for (zone =3D 0; zone < MAX_NR_ZONES; zone++) - size +=3D max(READ_ONCE(lrugen->nr_pages[gen][type][zone]), 0L); - } - } - - *nr_to_scan =3D size; + *nr_to_scan =3D lruvec_evictable_size(lruvec, swappiness); /* better to run aging even though eviction is still possible */ return evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS =3D=3D max_se= q; } --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CE2C73E7165 for ; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; cv=none; b=QVIkqadJnhKj0m0EuivP7rGuJGQgE6KEj0qzosZbDAQhXO76JdmxH4YirsU9Gdw0lYcWxAb3MtWP7NRwxB0t1HNSHHMBzVPHSQXJwyD0i+GTPkeQJ1R3aZT4weACvw1mftFzHDbRM7Nbs5I8hCkEoXkNeUPGqjO3E6BfaiHSQOo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; c=relaxed/simple; bh=1YOeQEMsoIFWUlmmBdc5ofmMnY6LcY7a0cvpJ2cw9hQ=; 
h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=PNjCY1sdM8P7eUEbIO51BqdU+sliYo4AB/9hrEu5z7TzCYODgiXc2R4M44Ghgsnx6klHsIrwChPUPiM6SXj9NaQSdwj9yRgEahdA1k6RJMB3jEmJIAzwOk1mrvABgnMsUJi7cTeMXQHJhvVy5/GobGIPExi9uvq4rLnlvd9Vcms= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Qf1NlhNj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Qf1NlhNj" Received: by smtp.kernel.org (Postfix) with ESMTPS id 934FBC2BCB7; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313219; bh=1YOeQEMsoIFWUlmmBdc5ofmMnY6LcY7a0cvpJ2cw9hQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=Qf1NlhNjMxWgdJ9/hNjBGEbJgT5N6caTYKuMNJY35U12SxhHv12B9pwCLqvXev8hC w/jK5DDsE2gd7AU+ykE79FMoDeaxm00OANL60Jb4Pi7MZ4eK4xcjEc3rqvIPjm9b45 p6kMRlAfoOf4OKr1UVHcFQUmRnujQGsCQZb1o/Q1pcwQmODsHrgjXQ8kiOdijKqrOJ EVMQ19xUFQAgFQMBUKO27xJDT5AXmWbTdg1vuSQZEC8fO7eE8Wrj6RUo4t37LPnNxM seiIo8+KmcaZ+pNRWyJhpbQYjyO9Fyojr5SqNwd0ts/hJFqK08tOIlg5HQkL27Duf7 bTWaVJK7k5VsQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8978EFF8869; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:53 +0800 Subject: [PATCH v7 02/15] mm/mglru: rename variables related to aging and rotation Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-2-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: 
<20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=2995; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=l0/ENeR1KIeJ87GpiJHayLU6igooeuma0gin50pY+k4=; b=u1dVE5TAmHFfJL1qnXcusicdetBNFOfQxBq0Bpg4qA0VOA+gkKEIUWTffNES3eyyzd1vrRD29 RzjmUIb5Ci1DmT1Iby976126yCWEdp6egFcQc5r5Ii9BLh6gFv9PpaQ X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song The current variable name isn't helpful. Make the variable names more meaningful. Only naming change, no behavior change. 
Suggested-by: Barry Song Reviewed-by: Baolin Wang Reviewed-by: Chen Ridong Reviewed-by: Barry Song Reviewed-by: Axel Rasmussen Signed-off-by: Kairui Song Acked-by: Shakeel Butt --- mm/vmscan.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index b80fbc4fc285..7f011ff4c478 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4934,7 +4934,7 @@ static bool should_run_aging(struct lruvec *lruvec, u= nsigned long max_seq, */ static long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc,= int swappiness) { - bool success; + bool need_aging; unsigned long nr_to_scan; struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); DEFINE_MAX_SEQ(lruvec); @@ -4942,7 +4942,7 @@ static long get_nr_to_scan(struct lruvec *lruvec, str= uct scan_control *sc, int s if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg)) return -1; =20 - success =3D should_run_aging(lruvec, max_seq, swappiness, &nr_to_scan); + need_aging =3D should_run_aging(lruvec, max_seq, swappiness, &nr_to_scan); =20 /* try to scrape all its memory if this memcg was deleted */ if (nr_to_scan && !mem_cgroup_online(memcg)) @@ -4951,7 +4951,7 @@ static long get_nr_to_scan(struct lruvec *lruvec, str= uct scan_control *sc, int s nr_to_scan =3D apply_proportional_protection(memcg, sc, nr_to_scan); =20 /* try to get away with not aging at the default priority */ - if (!success || sc->priority =3D=3D DEF_PRIORITY) + if (!need_aging || sc->priority =3D=3D DEF_PRIORITY) return nr_to_scan >> sc->priority; =20 /* stop scanning this lruvec as it's low on cold folios */ @@ -5040,7 +5040,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) =20 static int shrink_one(struct lruvec *lruvec, struct scan_control *sc) { - bool success; + bool need_rotate; unsigned long scanned =3D sc->nr_scanned; unsigned long reclaimed =3D sc->nr_reclaimed; struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); @@ -5058,7 +5058,7 @@ static int shrink_one(struct 
lruvec *lruvec, struct s= can_control *sc) memcg_memory_event(memcg, MEMCG_LOW); } =20 - success =3D try_to_shrink_lruvec(lruvec, sc); + need_rotate =3D try_to_shrink_lruvec(lruvec, sc); =20 shrink_slab(sc->gfp_mask, pgdat->node_id, memcg, sc->priority); =20 @@ -5068,10 +5068,10 @@ static int shrink_one(struct lruvec *lruvec, struct= scan_control *sc) =20 flush_reclaim_state(sc); =20 - if (success && mem_cgroup_online(memcg)) + if (need_rotate && mem_cgroup_online(memcg)) return MEMCG_LRU_YOUNG; =20 - if (!success && lruvec_is_sizable(lruvec, sc)) + if (!need_rotate && lruvec_is_sizable(lruvec, sc)) return 0; =20 /* one retry if offlined or too small */ --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C51343E4C62 for ; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; cv=none; b=Qo5/ZsWx27yjsqe0gkeswh+ZXSQ6ajPpNUWf3uiku0QH3FdR3ztrF8M0qaqpZd4wTQf+A7x86ICTrF1Iqo9+k4+3GMRByoX9MIVlTq0fBKXAwKier8YrMSiyxnl4H2QyYJQe5hU4HyYwYEaZqJceL9ihiHrVX52KicWhva6ZQ4c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; c=relaxed/simple; bh=L926ZYKmv08z/xxUtsdt7bPrPbk3jRIyRF1yuMzjHzs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tOaVq640BY8srUBRWorRX6w7T85h3QH9d99Ei3BVcwd6kq5P8R0lgkKG6EBy7331to004X6pin+RtZe+WbEjj2H+iPi4rx/+GbF7Hs44mdPoWHAVk9+9jayb8zbJdSgyYlXz9Ic+oZXnKZKBh1q4YDtIY9uPr/XDT++x1qUVMWc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uVUTfkgG; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uVUTfkgG" Received: by smtp.kernel.org (Postfix) with ESMTPS id A35ECC2BCF4; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313219; bh=L926ZYKmv08z/xxUtsdt7bPrPbk3jRIyRF1yuMzjHzs=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=uVUTfkgG55sUFJETSBEd7/pzr4vc1A/Juhfodsn86V+mohBD6YWVAAbc+nkR9/r72 RFxbjwjYcOIO4XzOQpenbMvPxbVm+MN/Ho2l4BMrQXuprCZuOO6LvZHfMxZ3eX4UrT eHhreSpF5Svch5aQx/eeMqJlzCnezRu7MtYdABMae7IQYAt4Pgvra4/ciiH5wZFd5c HK/ZtN22FdboVkpxGR6KF0ZgDWou1du2r+8ldogi9bY9236pK5NpWjipja4fGtjctR a9VY3ixTY7VTpmuFQLf9tmowQ8tFNYmUQEIl6pcdy0puVauHsEwP2D9isK8v6y3IT2 5NENOHdYoMGZg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B037FF8860; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:54 +0800 Subject: [PATCH v7 03/15] mm/mglru: relocate the LRU scan batch limit to callers Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-3-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: 
b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=3314; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=2ZDrAewbf1TxtvY6gQRB6CyLSpuIbi1jF9zQ5rQYaJQ=; b=qtDGt5Rx+hfSvSnpA4X68Gvowz5paVgNR8eqDxVAI+zX3U1b0iPwl9P+ZT3/x0wgPHCAJkalF 6hntboypL6CCk1n4rvrr6bXC31pfeHNbW+SN/1wJmIY0sYYaZnDirX+ X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Like the active / inactive LRU, MGLRU isolates and scans folios in batches. The batch split is currently hidden deep inside the helper, which makes the code harder to follow. The helper's arguments are also confusing: callers usually request more folios than the batch size, so the helper almost never processes the full requested amount. Move the batch splitting into the top-level loop to make this cleaner. There should be no behavior change.
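The caller-side batch splitting described above can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: sketch_evict(), sketch_shrink() and SKETCH_MAX_BATCH are hypothetical names, the batch size is an arbitrary stand-in for MAX_LRU_BATCH, and the assert() merely plays the role that VM_WARN_ON_ONCE() plays in the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for MAX_LRU_BATCH; the value is arbitrary. */
#define SKETCH_MAX_BATCH 64L

static long sketch_min(long a, long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper that handles at most one batch. The caller is now
 * responsible for clamping the request, so an oversized request is a bug.
 */
static long sketch_evict(long nr_batch, long *available)
{
	long done;

	assert(nr_batch <= SKETCH_MAX_BATCH);

	done = sketch_min(nr_batch, *available);
	*available -= done;
	return done;
}

/* Top-level loop: the batch split is visible here, not inside the helper. */
static long sketch_shrink(long nr_to_scan, long available, long *calls)
{
	long scanned = 0;

	*calls = 0;
	while (scanned < nr_to_scan) {
		long nr_batch = sketch_min(nr_to_scan - scanned, SKETCH_MAX_BATCH);
		long delta = sketch_evict(nr_batch, &available);

		(*calls)++;
		if (!delta)
			break;
		scanned += delta;
	}
	return scanned;
}
```

With this shape, the helper's argument always matches the amount it is actually allowed to process, which is the readability point the patch is making.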
Reviewed-by: Axel Rasmussen Reviewed-by: Baolin Wang Reviewed-by: Barry Song Reviewed-by: Chen Ridong Signed-off-by: Kairui Song Acked-by: Shakeel Butt --- mm/vmscan.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 7f011ff4c478..a011733a6392 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4695,10 +4695,10 @@ static int scan_folios(unsigned long nr_to_scan, st= ruct lruvec *lruvec, int scanned =3D 0; int isolated =3D 0; int skipped =3D 0; - int scan_batch =3D min(nr_to_scan, MAX_LRU_BATCH); - int remaining =3D scan_batch; + unsigned long remaining =3D nr_to_scan; struct lru_gen_folio *lrugen =3D &lruvec->lrugen; =20 + VM_WARN_ON_ONCE(nr_to_scan > MAX_LRU_BATCH); VM_WARN_ON_ONCE(!list_empty(list)); =20 if (get_nr_gens(lruvec, type) =3D=3D MIN_NR_GENS) @@ -4751,7 +4751,7 @@ static int scan_folios(unsigned long nr_to_scan, stru= ct lruvec *lruvec, mod_lruvec_state(lruvec, item, isolated); mod_lruvec_state(lruvec, PGREFILL, sorted); mod_lruvec_state(lruvec, PGSCAN_ANON + type, isolated); - trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, scan_batch, + trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scanned, skipped, isolated, type ? 
LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); if (type =3D=3D LRU_GEN_FILE) @@ -4987,7 +4987,7 @@ static bool should_abort_scan(struct lruvec *lruvec, = struct scan_control *sc) =20 static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_contro= l *sc) { - long nr_to_scan; + long nr_batch, nr_to_scan; unsigned long scanned =3D 0; int swappiness =3D get_swappiness(lruvec, sc); =20 @@ -4998,7 +4998,8 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) if (nr_to_scan <=3D 0) break; =20 - delta =3D evict_folios(nr_to_scan, lruvec, sc, swappiness); + nr_batch =3D min(nr_to_scan, MAX_LRU_BATCH); + delta =3D evict_folios(nr_batch, lruvec, sc, swappiness); if (!delta) break; =20 @@ -5623,6 +5624,7 @@ static int run_aging(struct lruvec *lruvec, unsigned = long seq, static int run_eviction(struct lruvec *lruvec, unsigned long seq, struct s= can_control *sc, int swappiness, unsigned long nr_to_reclaim) { + int nr_batch; DEFINE_MAX_SEQ(lruvec); =20 if (seq + MIN_NR_GENS > max_seq) @@ -5639,8 +5641,8 @@ static int run_eviction(struct lruvec *lruvec, unsign= ed long seq, struct scan_co if (sc->nr_reclaimed >=3D nr_to_reclaim) return 0; =20 - if (!evict_folios(nr_to_reclaim - sc->nr_reclaimed, lruvec, sc, - swappiness)) + nr_batch =3D min(nr_to_reclaim - sc->nr_reclaimed, MAX_LRU_BATCH); + if (!evict_folios(nr_batch, lruvec, sc, swappiness)) return 0; =20 cond_resched(); --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8B9D3E717D for ; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; cv=none; 
b=dMktG1PKds8sxjsPBI4d6EAk58wzkpyGnYK67tCJ37+GC6cfv1CFKXJdlgJsaP+a6NbXd2lb4baxTy9lmbJ5nIo0caTNYv+U1uT9HsVIMIGO5+Y4aSwV14Xl8c5hOwTSI+x0mTlotXJUqqpXpnPO0R70r+pWOzJrePv2SsuXskg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313219; c=relaxed/simple; bh=nCSsnBFCq6QE97j9zl/tHdIeIyGvS/fXh9XcG+jT73Q=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=V9qG2KQD8e19EfxHaMRG34P1YMvIYQKV8zVNKq/a9BsDZlSPYsPr+01CDdA9/9hP4ozpJo7UkxIcKLq/ERKoMElMnzFg7aqwBbgDD4KZlEWqRkAOHqNh1hKNcHi5hX0yF9usMOO2J8Sox8qX0vMSxGyJxKISYg7FFxjllnFwaiM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=lPgcfLjP; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="lPgcfLjP" Received: by smtp.kernel.org (Postfix) with ESMTPS id B6E4CC2BCF5; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313219; bh=nCSsnBFCq6QE97j9zl/tHdIeIyGvS/fXh9XcG+jT73Q=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=lPgcfLjPYHeqBf1jKJB6NZym0wlvcZrjo2/L3TKrzchMEmSoWyVpLAIE6GA9LLmWR iQ3UtZLnMfd0wp/Pl2TOzB+jxdvww9E/3W7yNGTuYZ2h99VxqV0Vlyt8orPoS/xkYL yNbkRK2f51g8BOh03vuFMcst9y1Q54Lz9vgfO/OmqHjTIPU73QyZtRuxZvzpGWtwGJ 07DshyapH5h6+acQ8FuM/y3vY+LyaL76kkWdb8M/wB9EGfqt/jYVXrkCeSgbwrvW3a 0WE5E5cGljZYoiP9H1X5RqNx+d3bPsrhYSCyWP1trqInTmj1Zfi8YxL/MXDTBCUGQA +xYll01PpkTYw== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD739FF886B; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:55 +0800 Subject: [PATCH v7 04/15] mm/mglru: restructure the reclaim loop Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-4-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=6529; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=ua5CZKU4fAwh1GAS55cyyf6f9byuwkTxQPOo0BUpUcY=; b=KQ/l2GgwGMdA2ubdQo08aDL2lCBqKfBZhxHNSzDBL6dva3CAPO9X+iqlM1DrYBhmTfOOD36ld kv/iOZ0uSorC2oHyy4mdavY/Q3PDOGTx5F4Mlf1t3DhrJhZXAItQcF0 X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song The current loop recalculates the scan number on each iteration. The number of folios to scan is based on the LRU length, with some unclear behaviors: e.g., the scan number is only shifted by the reclaim priority when aging is not needed or when at the default priority, and the calculation is coupled with aging and rotation. Simplify this and decouple aging and rotation: calculate the scan number once at the beginning of the reclaim, always respect the reclaim priority, and make the aging and rotation more explicit.
This slightly changes how aging and offline memcg reclaim work:

Previously, aging was skipped at DEF_PRIORITY even when eviction was no longer possible, so the reclaimer wasted an iteration until the priority escalated. Now aging runs immediately whenever it is needed to make progress; the DEF_PRIORITY skip only applies when eviction is still viable. This may avoid wasted iterations that over-reclaim slab and break reclaim balance in multi-cgroup setups.

The same applies to offline memcgs. Previously, an offline memcg was not aged unless it had no evictable folios left. Now it may also be aged when it has only 3 generations, which should be fine. For one thing, an offline memcg might still hold long-term folios; in fact, a long-existing offline memcg must be pinned by some long-term folios such as shmem. These folios might be used by other memcgs, so aging them as for an ordinary memcg seems correct. Besides, aging enables further reclaim of an offlined memcg, which will certainly happen if we keep shrinking it. Offline memcgs might also soon stop being an issue with reparenting.

Overall, the memcg LRU rotation, as described in mmzone.h, remains the same.

Note that because the scan budget is now pinned at loop entry, a tiny lruvec might skip this reclaim pass, and aging along with it. This could be beneficial, since aging does not help when the lruvec would still be unreclaimable afterwards. Reclaim will go on as usual once the priority escalates.
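As a rough illustration of the rules described above, the aging decision and the one-time scan budget can be modeled in userspace C. This is a simplified sketch under stated assumptions, not the kernel implementation: the sketch_* names are hypothetical, the constants are assumed to mirror MIN_NR_GENS and DEF_PRIORITY, a single min_seq stands in for evictable_min_seq(), and proportional protection is left out entirely.

```c
#include <stdbool.h>

/* Assumed to mirror MIN_NR_GENS and DEF_PRIORITY; illustrative only. */
#define SKETCH_MIN_NR_GENS	2
#define SKETCH_DEF_PRIORITY	12

/* Model of the reworked aging decision. */
static bool sketch_should_run_aging(unsigned long min_seq,
				    unsigned long max_seq, int priority)
{
	/* Eviction is not possible anymore: aging is required. */
	if (min_seq + SKETCH_MIN_NR_GENS > max_seq)
		return true;

	/* Eviction is still viable: stay gentle at the default priority. */
	if (priority == SKETCH_DEF_PRIORITY)
		return false;

	/* Otherwise age early, once only three generations are left. */
	return min_seq + SKETCH_MIN_NR_GENS == max_seq;
}

/* Model of the scan budget, computed once before the reclaim loop. */
static unsigned long sketch_nr_to_scan(unsigned long evictable, bool online,
				       int priority)
{
	/* An offline memcg gets its whole evictable size as the budget. */
	if (!online)
		return evictable;

	/* Always respect the reclaim priority. */
	return evictable >> priority;
}
```

In this model, an online lruvec with 1000 evictable folios gets a budget of 0 at priority 12, so the loop body (including aging) is never entered, while the same lruvec in an offline memcg gets the full 1000.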
Reviewed-by: Axel Rasmussen Signed-off-by: Kairui Song Acked-by: Shakeel Butt --- mm/vmscan.c | 72 ++++++++++++++++++++++++++++++---------------------------= ---- 1 file changed, 36 insertions(+), 36 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index a011733a6392..b247f216f28b 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4913,49 +4913,37 @@ static int evict_folios(unsigned long nr_to_scan, s= truct lruvec *lruvec, } =20 static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, - int swappiness, unsigned long *nr_to_scan) + struct scan_control *sc, int swappiness) { DEFINE_MIN_SEQ(lruvec); =20 - *nr_to_scan =3D 0; /* have to run aging, since eviction is not possible anymore */ if (evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS > max_seq) return true; =20 - *nr_to_scan =3D lruvec_evictable_size(lruvec, swappiness); + /* try to avoid aging, do gentle reclaim at the default priority */ + if (sc->priority =3D=3D DEF_PRIORITY) + return false; + /* better to run aging even though eviction is still possible */ return evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS =3D=3D max_se= q; } =20 -/* - * For future optimizations: - * 1. Defer try_to_inc_max_seq() to workqueues to reduce latency for memcg - * reclaim. 
- */ -static long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc,= int swappiness) +static long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc, + struct mem_cgroup *memcg, int swappiness) { - bool need_aging; - unsigned long nr_to_scan; - struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); - DEFINE_MAX_SEQ(lruvec); + unsigned long nr_to_scan, evictable; =20 - if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg)) - return -1; - - need_aging =3D should_run_aging(lruvec, max_seq, swappiness, &nr_to_scan); + evictable =3D lruvec_evictable_size(lruvec, swappiness); =20 /* try to scrape all its memory if this memcg was deleted */ - if (nr_to_scan && !mem_cgroup_online(memcg)) - return nr_to_scan; - - nr_to_scan =3D apply_proportional_protection(memcg, sc, nr_to_scan); + if (!mem_cgroup_online(memcg)) + return evictable; =20 - /* try to get away with not aging at the default priority */ - if (!need_aging || sc->priority =3D=3D DEF_PRIORITY) - return nr_to_scan >> sc->priority; + nr_to_scan =3D apply_proportional_protection(memcg, sc, evictable); + nr_to_scan >>=3D sc->priority; =20 - /* stop scanning this lruvec as it's low on cold folios */ - return try_to_inc_max_seq(lruvec, max_seq, swappiness, false) ? -1 : 0; + return nr_to_scan; } =20 static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *= sc) @@ -4985,31 +4973,44 @@ static bool should_abort_scan(struct lruvec *lruvec= , struct scan_control *sc) return true; } =20 +/* + * For future optimizations: + * 1. Defer try_to_inc_max_seq() to workqueues to reduce latency for memcg + * reclaim. 
+ */ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_contro= l *sc) { + bool need_rotate =3D false; long nr_batch, nr_to_scan; - unsigned long scanned =3D 0; int swappiness =3D get_swappiness(lruvec, sc); + struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); =20 - while (true) { + nr_to_scan =3D get_nr_to_scan(lruvec, sc, memcg, swappiness); + while (nr_to_scan > 0) { int delta; + DEFINE_MAX_SEQ(lruvec); =20 - nr_to_scan =3D get_nr_to_scan(lruvec, sc, swappiness); - if (nr_to_scan <=3D 0) + if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg)) { + need_rotate =3D true; break; + } + + if (should_run_aging(lruvec, max_seq, sc, swappiness)) { + if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false)) + need_rotate =3D true; + /* stop scanning as it's low on cold folios */ + break; + } =20 nr_batch =3D min(nr_to_scan, MAX_LRU_BATCH); delta =3D evict_folios(nr_batch, lruvec, sc, swappiness); if (!delta) break; =20 - scanned +=3D delta; - if (scanned >=3D nr_to_scan) - break; - if (should_abort_scan(lruvec, sc)) break; =20 + nr_to_scan -=3D delta; cond_resched(); } =20 @@ -5035,8 +5036,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); } =20 - /* whether this lruvec should be rotated */ - return nr_to_scan < 0; + return need_rotate; } =20 static int shrink_one(struct lruvec *lruvec, struct scan_control *sc) --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E92563E866B for ; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; 
b=tFIMZeCC9QOPCX+SM3N0Ziv7juUClLyHvT0+XecJWbZblNdKjHxR4MLH/K7K66woKwaFHV2YudVfRyfurqkPzamzP9zWRkLOwBKwH5UVY0sRpQJRWzEfJ3e+Ki4sXjVuDevgNV/e4UK0LTY0xdOUcV1S0X8/N6C9hpPTR5PV5TE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=F69AkOfNOrc8HFY5AD6s9a9y2jE0zbVIRmHPwZMejVA=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=dJYyaVUr+9/VxpucaNIiMeGwf8Q+1Lt0vehAsQukYxM9uM/8bxONFe/hcNtQDclsOUJLd8wWKO/Fk/C5z/Nc90Jcl+306zDOcLBgXHiv5EsprVMDIFP9zIGZ7+UIoGqwagyxGCxnMxZJuwSd9r4sgFeQrMCabFKGCUNc7efVzx4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=hlfO/9X5; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="hlfO/9X5" Received: by smtp.kernel.org (Postfix) with ESMTPS id C75A3C2BD00; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313219; bh=F69AkOfNOrc8HFY5AD6s9a9y2jE0zbVIRmHPwZMejVA=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=hlfO/9X5aRPgLXqsraOK7OAb/r4MEAlg415q7Ijt+LMG3QHYhfLXt8H079zXaD+qD Oyp2lt480bafLv3/QAW+OK8ZzvGqfBAxRvBK6a8pT7TLTx3PGWJese3EijH1H+eUvF G4POdCKfLCBDeG3kaffyfjb4JxrOl2T1lXdBMlvncxjmz+nlsBWnAm9Ma7UmewQu/M e9hM8wM3p3Eg6PZKdByeEHRLq1yzfyaQl4lQft4QUxfrnIylw0gTLKBeOwGvWu8dDo l62VmulPbpvoxMxRUtPeU+VvGMTK/2cyT97MzReYdcNBG0F1RjDmwdCdELvZFBx8al AkbWgyGBkxnhA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF82CFF8860; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:56 +0800 Subject: [PATCH v7 05/15] mm/mglru: scan and count the exact number of folios Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-5-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=7211; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=hdXl21AaB9u9Yf68HFt0OG5tzCbMPEW7u+ust52iMoc=; b=Ek/aSI4Ik07jOoqPRLdVRrWbg7fHUbDxBwk0wQQ6ARGgZEo7/A/ptlqqlLw9BWFxfjPQ9Xdy+ J1Tyu5zwyHWBtFsf9HYAHOxiiNG/yHxNJ2j31nrQxGFWDTgJlr3Piwa X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Make the scan helpers return the exact number of folios being scanned or isolated. Since the reclaim loop now has a natural scan budget that controls the scan progress, returning the scan number and consuming the budget makes the scan more accurate and easier to follow. The number of scanned folios for each iteration is always larger than 0, unless the reclaim must stop for a forced aging, so there is no more need for any special handling when there is no progress made: - `return isolated || !remaining ? 
scanned : 0` in scan_folios: both the function and its caller now just return the exact scan count. Combined with the scan budget introduced in the previous commit, this avoids livelock or under-scanning.
- `scanned +=3D try_to_inc_min_seq` in evict_folios: adding a bool to a scan count was confusing and is no longer needed, as the scan count should never be zero as long as there are still evictable gens. An empty old gen may still yield a scan count of 0; to avoid that, call try_to_inc_min_seq before isolation, which has little to no overhead in most cases.
- `evictable_min_seq + MIN_NR_GENS > max_seq` guard in evict_folios: the per-type get_nr_gens =3D=3D MIN_NR_GENS check in scan_folios naturally returns 0 when only two gens remain, which breaks the loop.

Also change try_to_inc_min_seq to return void, as its return value is no longer used by any caller. Call it before isolate_folios to flush any empty gens left by external folio freeing, and again after isolate_folios, when scanning moved or protected folios may have emptied the oldest gen.

The scan still stops if only two gens are left, as the scan count will be zero. This matches the previous behavior. This forced gen protection may be removed or softened later to improve reclaim further.
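The new return contract can be sketched in a small userspace model. This is only an illustration under stated assumptions, not the kernel code: struct toy_type, toy_min(), toy_scan() and toy_isolate() are hypothetical names, and each LRU type is reduced to two counters. The sketch shows the scan helper returning the exact scan count while reporting the isolated count through a pointer, and the caller summing the scans of both types.

```c
#include <assert.h>

/* Hypothetical stand-in for one LRU type (0 = anon, 1 = file). */
struct toy_type {
	int scannable;   /* folios the scan can still visit */
	int isolatable;  /* of those, how many can be isolated */
};

static int toy_min(int a, int b)
{
	return a < b ? a : b;
}

/* Modeled on the new scan_folios() contract: exact scan count is returned,
 * the isolated count goes out through a pointer. */
static int toy_scan(struct toy_type *t, int nr_to_scan, int *isolatedp)
{
	int scanned = toy_min(t->scannable, nr_to_scan);
	int isolated = toy_min(t->isolatable, scanned);

	assert(scanned >= 0 && isolated <= scanned);
	t->scannable -= scanned;
	t->isolatable -= isolated;
	*isolatedp = isolated;
	return scanned;
}

/* Modeled on isolate_folios() in this patch: sum the scans over both types
 * and stop as soon as something was isolated. */
static int toy_isolate(struct toy_type types[2], int type, int nr_to_scan,
		       int *isolated, int *isolate_type, int *isolate_scanned)
{
	int total_scanned = 0;

	for (int i = 0; i < 2; i++) {
		int scanned = toy_scan(&types[type], nr_to_scan, isolated);

		total_scanned += scanned;
		if (*isolated) {
			*isolate_type = type;
			*isolate_scanned = scanned;
			break;
		}
		type = !type;
	}
	return total_scanned;
}
```

In this model the total is what a caller would charge against its scan budget, while the per-type count is what would be reported for the type that actually produced isolated folios.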
Reviewed-by: Axel Rasmussen Reviewed-by: Chen Ridong Reviewed-by: Baolin Wang Signed-off-by: Kairui Song --- mm/vmscan.c | 58 +++++++++++++++++++++++++++++----------------------------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index b247f216f28b..2dbd39e29dfc 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3878,10 +3878,9 @@ static bool inc_min_seq(struct lruvec *lruvec, int t= ype, int swappiness) return true; } =20 -static bool try_to_inc_min_seq(struct lruvec *lruvec, int swappiness) +static void try_to_inc_min_seq(struct lruvec *lruvec, int swappiness) { int gen, type, zone; - bool success =3D false; bool seq_inc_flag =3D false; struct lru_gen_folio *lrugen =3D &lruvec->lrugen; DEFINE_MIN_SEQ(lruvec); @@ -3907,11 +3906,10 @@ static bool try_to_inc_min_seq(struct lruvec *lruve= c, int swappiness) =20 /* * If min_seq[type] of both anonymous and file is not increased, - * we can directly return false to avoid unnecessary checking - * overhead later. + * return here to avoid unnecessary checking overhead later. */ if (!seq_inc_flag) - return success; + return; =20 /* see the comment on lru_gen_folio */ if (swappiness && swappiness <=3D MAX_SWAPPINESS) { @@ -3929,10 +3927,7 @@ static bool try_to_inc_min_seq(struct lruvec *lruvec= , int swappiness) =20 reset_ctrl_pos(lruvec, type, true); WRITE_ONCE(lrugen->min_seq[type], min_seq[type]); - success =3D true; } - - return success; } =20 static bool inc_max_seq(struct lruvec *lruvec, unsigned long seq, int swap= piness) @@ -4686,7 +4681,7 @@ static bool isolate_folio(struct lruvec *lruvec, stru= ct folio *folio, struct sca =20 static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec, struct scan_control *sc, int type, int tier, - struct list_head *list) + struct list_head *list, int *isolatedp) { int i; int gen; @@ -4756,11 +4751,9 @@ static int scan_folios(unsigned long nr_to_scan, str= uct lruvec *lruvec, type ? 
LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); if (type =3D=3D LRU_GEN_FILE) sc->nr.file_taken +=3D isolated; - /* - * There might not be eligible folios due to reclaim_idx. Check the - * remaining to prevent livelock if it's not making progress. - */ - return isolated || !remaining ? scanned : 0; + + *isolatedp =3D isolated; + return scanned; } =20 static int get_tier_idx(struct lruvec *lruvec, int type) @@ -4804,33 +4797,36 @@ static int get_type_to_scan(struct lruvec *lruvec, = int swappiness) =20 static int isolate_folios(unsigned long nr_to_scan, struct lruvec *lruvec, struct scan_control *sc, int swappiness, - int *type_scanned, struct list_head *list) + struct list_head *list, int *isolated, + int *isolate_type, int *isolate_scanned) { int i; + int total_scanned =3D 0; int type =3D get_type_to_scan(lruvec, swappiness); =20 for_each_evictable_type(i, swappiness) { int scanned; int tier =3D get_tier_idx(lruvec, type); =20 - *type_scanned =3D type; + scanned =3D scan_folios(nr_to_scan, lruvec, sc, + type, tier, list, isolated); =20 - scanned =3D scan_folios(nr_to_scan, lruvec, sc, type, tier, list); - if (scanned) - return scanned; + total_scanned +=3D scanned; + if (*isolated) { + *isolate_type =3D type; + *isolate_scanned =3D scanned; + break; + } =20 type =3D !type; } =20 - return 0; + return total_scanned; } =20 static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec, struct scan_control *sc, int swappiness) { - int type; - int scanned; - int reclaimed; LIST_HEAD(list); LIST_HEAD(clean); struct folio *folio; @@ -4838,19 +4834,23 @@ static int evict_folios(unsigned long nr_to_scan, s= truct lruvec *lruvec, enum node_stat_item item; struct reclaim_stat stat; struct lru_gen_mm_walk *walk; + int scanned, reclaimed; + int isolated =3D 0, type, type_scanned; bool skip_retry =3D false; - struct lru_gen_folio *lrugen =3D &lruvec->lrugen; struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); struct pglist_data *pgdat =3D lruvec_pgdat(lruvec); =20 
lruvec_lock_irq(lruvec); =20 - scanned =3D isolate_folios(nr_to_scan, lruvec, sc, swappiness, &type, &li= st); + /* In case folio deletion left empty old gens, flush them */ + try_to_inc_min_seq(lruvec, swappiness); =20 - scanned +=3D try_to_inc_min_seq(lruvec, swappiness); + scanned =3D isolate_folios(nr_to_scan, lruvec, sc, swappiness, + &list, &isolated, &type, &type_scanned); =20 - if (evictable_min_seq(lrugen->min_seq, swappiness) + MIN_NR_GENS > lrugen= ->max_seq) - scanned =3D 0; + /* Scanning may have emptied the oldest gen, flush it */ + if (scanned) + try_to_inc_min_seq(lruvec, swappiness); =20 lruvec_unlock_irq(lruvec); =20 @@ -4861,7 +4861,7 @@ static int evict_folios(unsigned long nr_to_scan, str= uct lruvec *lruvec, sc->nr.unqueued_dirty +=3D stat.nr_unqueued_dirty; sc->nr_reclaimed +=3D reclaimed; trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, - scanned, reclaimed, &stat, sc->priority, + type_scanned, reclaimed, &stat, sc->priority, type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); =20 list_for_each_entry_safe_reverse(folio, next, &list, lru) { --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 093063E8C4F for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=br6LwPCyIawZtC/eyTAmg7IrTibVb3Fs1My+LMkzSqZn2QmtizAX5fA+okCIF6pQ14FIY/pZRaJNSEXhQ9HagGvHAT3LMa1vzOmtJhVBXqTb9jgHKByjwN+jM5HZlxhOD9UHtF2XPw9du8EOzukMnOCiDiP0V73bPTNbXt1x/n0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=qWXxnOnPKfbsgfdB3Dwc3NGWTEnlJEnolSGba/CZXrs=; 
h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tpSwYLZJkK+GLxsXF+QN8xLpZjgmaarH8QK8wmwKuK2K40G9nTE9IopFDmXLIH8Zul76mcVaI1ngs2H8lSThSN+Gd7rtSsHsUHyL5+64zKndd2sBeI5MaxW+VWl/dB8ctETba8iIgyzDOu84Xig3ypLSlyttImz8piUmf9ls43c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=gTwWRFPy; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="gTwWRFPy" Received: by smtp.kernel.org (Postfix) with ESMTPS id DA8A1C2BCFD; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313219; bh=qWXxnOnPKfbsgfdB3Dwc3NGWTEnlJEnolSGba/CZXrs=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=gTwWRFPyUOULpywF/HVBNioGbbTaT1EMkMrI7wQ5LfUYTpXNoRm8Hc9hG1apyWqm1 GEBgdjxaQ0EH/I/gzXRAFJZt7TpzESMoVvFAEIgOJqYj5M2IHwx9AzKfZLBpJF8ywG 8zv7E1n9hFesk28G3KIn6VGGGGKQXugnvmQu4AidfLIUKTd4S0t3xX4znOewyG9SKJ 0OT/3Hs9y+tD/q1k73uj6Ro131wpYGPnlL7kJx39WlWiJ6S15w6gWynJmdgkEpfFMD Gc0femPldjpIrJrYRe09xKPXcCsJccggDe/upocV42WKpdqvkDaaPuhvw4LEycQOKc /Asx3mLvLOpPA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id D15A4FF8868; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:57 +0800 Subject: [PATCH v7 06/15] mm/mglru: avoid reclaim type fall back when isolation makes no progress Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-6-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: 
<20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=1509; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=AV4QwhcpwdcCenfYKd1bClgIwnLcHYhA5p31fWGzsBc=; b=RPW2xoKjwCBGjBsqPc3JbY06T1rqff5twu3k3Hrf74NgrDi5y3NM+KsVtZEFYncyjGQRyHYN5 LBoBHSYdPm5D6qX7Dp4Tr7taNsz+8t35dxHiq4sIzFQaSK4wDcmyNdq X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: "Barry Song (Xiaomi)" While isolation makes no progress in scan_folios(), we quickly fall back to the other type in isolate_folios(). This is incorrect, as the current type may still have sufficient folios. Falling back can undermine the positive_ctrl_err() result from get_type_to_scan(), which is derived from swappiness. So just continue scanning this type for another round. Worth noting if the cold generations are all reclaimed, scan will no longer make any progress either, which may undermine the swappiness again. This is not a new issue and hence better be fixed later [1]. 
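The resulting decision after each scan round can be sketched as a tiny standalone helper. This is an illustrative model only: enum toy_next and toy_after_scan() are hypothetical names that do not exist in mm/vmscan.c, and the real loop is still bounded by for_each_evictable_type().

```c
#include <assert.h>

/* Hypothetical outcome of one scan round in isolate_folios(). */
enum toy_next {
	TOY_STOP,             /* something was isolated, go evict it */
	TOY_RETRY_SAME_TYPE,  /* scanned but isolated nothing: stay on this type */
	TOY_SWITCH_TYPE,      /* nothing left to scan: try the other type */
};

static enum toy_next toy_after_scan(int scanned, int isolated)
{
	assert(isolated <= scanned);

	if (isolated)
		return TOY_STOP;
	/* The behavior change: a non-zero scan no longer flips the type. */
	if (scanned)
		return TOY_RETRY_SAME_TYPE;
	return TOY_SWITCH_TYPE;
}
```

Only the middle case differs from the previous behavior, where any round without isolation switched to the other type.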
Link: https://lore.kernel.org/linux-mm/CAGsJ_4zjdOYEtuO6gNjABm7NDxW0skzBFNR= Nee-k2D6VwsYEQA@mail.gmail.com/ [1] Signed-off-by: Barry Song (Xiaomi) Reviewed-by: Kairui Song Signed-off-by: Kairui Song --- mm/vmscan.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 2dbd39e29dfc..ac9d2d4f8e65 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4817,8 +4817,13 @@ static int isolate_folios(unsigned long nr_to_scan, = struct lruvec *lruvec, *isolate_scanned =3D scanned; break; } - - type =3D !type; + /* + * If scanned > 0 and isolated =3D=3D 0, avoid falling back to the + * other type, as this type remains sufficient. Falling back + * too readily can disrupt the positive_ctrl_err() bias. + */ + if (!scanned) + type =3D !type; } =20 return total_scanned; --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17C3935C182 for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=Efln5Jcyju6syHsqfz4cbYFZil4YVym5XMMBtS5iUQuT/MLMlfgp1vJ75M2apxTE/dt2ceMXL+qJOgFHRHpVau9oDxu30/WEa3B1HPQiEpvUqHR+kskZYzKFSGdTigtqRjzfnwhw60G+v1PCfAoRj2NMSM+ROL+Hfcqn8lHDmxc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=rcBrmg0snM6JI0g903MCsIhVcJ+GJOO20n6dAbfvTy0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=WQNGqGBIXHmS2t1AIKm7QHk3IyiwUe0J7NzUjT6cMcBKN8OnvA6D0Yx6LvN4loBkMJoNstKVOpsbkdFj6niKmmt8ed1PhPl9u9FRTSXGZ/WYtl5dJPwlH7MRQr6dGyznC2CmXMWFWox2CsUYFlX03WOqLwSz5pmIue7EpdX9tYk= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jYI6TOMG; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jYI6TOMG" Received: by smtp.kernel.org (Postfix) with ESMTPS id EC8E3C2BD05; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=rcBrmg0snM6JI0g903MCsIhVcJ+GJOO20n6dAbfvTy0=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=jYI6TOMG+JJTVphq6SzcRN0+kuuK17v6dp12rIVsEjkH8BoNttmLvEWpaXGRb82In ZjB1f0SHkUDd2zQjgy6jQTllh5+N/zyHfv2dgv+lT5BNJdJCFJi6f5XZvFThcD51Eb 2yc/JR/3Bnhri/SYdwbmna8iOTIdANE7pOiTdttNseDTPnmKQhwoqC4mQ5BitkZ5bz oavkbFV1qzlIRblwG//vuT6/U7gns3VPDymcsTOxGUhY8Ef4WKYUNH2j1Ec1qXt3Ee jtlce/MVMlmAhDvrHEWb2WiT3hzUG0paSMirVziqFdL7VPCW+vdoynwwrB3UtBE2kn urRZBIW8FhtxA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4E26FF8869; Mon, 27 Apr 2026 18:06:59 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:58 +0800 Subject: [PATCH v7 07/15] mm/mglru: use a smaller batch for reclaim Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-7-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , 
Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=1039; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=REAUkY7P+qbS2gt7Q4kH3cv5QJxrLEt1oxBV4cmZIDU=; b=Oe+X0XFDd2tUVhSD10mdWdex1S7xloMQEpmYk/axubPPkIr3rJZfc/oUwWtu9DGdy5Qlw3ZEC 4smVANROV5ABXZXUZVsrrMBRAPr2kqTe6kPcqS5wA9E6mPJw7w/BTeA X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song With a fixed number to reclaim calculated at the beginning, making each following step smaller should reduce the lock contention and avoid over-aggressive reclaim of folios, as it will abort earlier when the number of folios to be reclaimed is reached. Reviewed-by: Axel Rasmussen Reviewed-by: Chen Ridong Reviewed-by: Baolin Wang Reviewed-by: Barry Song Signed-off-by: Kairui Song --- mm/vmscan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index ac9d2d4f8e65..2a607546277c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -5007,7 +5007,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) break; } =20 - nr_batch =3D min(nr_to_scan, MAX_LRU_BATCH); + nr_batch =3D min(nr_to_scan, MIN_LRU_BATCH); delta =3D evict_folios(nr_batch, lruvec, sc, swappiness); if (!delta) break; --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F8243E9295 for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=AoM6vkF/d49HgpDGk6YqHr+SbUC0y6DUM9bxbLi8oBrmTNEPlpZTEkFubkhqGZnXFPRgBL2Lqg8Y1pEmN1w4GAp8Zuh72GoMpZ0znF8NqvIFjxPjcbRjBWLW+5jGnre9ZxOJIoo33pSthIIDT/OHzUoTwkwwr2p0eH3rDdwHV1w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=DOK386NUsZ520wbu/04MmYJXpiyWky1LXL0L0xJEEnc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=P8sRNZEFD4yENz8HeQsw6m97gFZxmUGqc9EzSuIfXnNO9XROUt6xElIKIREpURitxW+2zw+JPxD+i+maaSFTcFkkU3RloV3ZiQfDJwDopuAvQIAogFjoP5M1CmRB/De3WNZ+FV5FRx2/4s/RRZ2qT5ueCqydKZAAhBrIQnM4KyE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=tPlZw8Of; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="tPlZw8Of" Received: by smtp.kernel.org (Postfix) with ESMTPS id 10672C4AF48; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=DOK386NUsZ520wbu/04MmYJXpiyWky1LXL0L0xJEEnc=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=tPlZw8Ofwjb8af2XblCqKdAKB7LmCBVHmNvkiU/er7VW3ayN1Phj7jbxos7h2vU5F B2O26VMCg+/P5uWd+cYILVsOVddrHZFGdOSBu5ZjC5zqkgokpumT5le8t5AUqXmplG 1ve8gSs0YFz7baabg+aeCIe/oIw8fguJk+VXbgX5iTr9fjtssBv1gVYF38/5Cf5muj d15RwwY3DE7Y4xWx/TJFLl3J/ItwtawPm4zTRF2JfVocc0zm21jV4X6irLTu24EzXC 3KidLtUFtuVMtt2r91Ciarvd/I9m616a4V3WbkbonreezLFAN9Lf73r9OH8qTNNKZR OujIASZR4ja5w== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 057DDFF8865; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:06:59 +0800 Subject: [PATCH v7 
08/15] mm/mglru: don't abort scan immediately right after aging Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-8-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=3743; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=tJwMbabzOAzU1rtsLiIyfOmdKQT6hOnnnbW6fs5AP+4=; b=dCW6tVzk20OP2vvexHeQXlVKV4wetE3MP2wgKaZEtwI5Ts+//V2gyFAgFaL8XShlYlbC0jmm8 omaErK+i6lTDe2wBWtWYaX9o1gMRvQYyAss21vpoE/YZLzCCT7522jo X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Right now, if eviction triggers aging, the reclaimer will abort. This is not the optimal strategy for several reasons. Aborting the reclaim early wastes a reclaim cycle when under pressure, and for concurrent reclaim, if the LRU is under aging, all concurrent reclaimers might fail. And if the age has just finished, new cold folios exposed by the aging are not reclaimed until the next reclaim iteration. 
What's more, the current aging trigger is quite lenient: having 3 gens with a reclaim priority lower than default triggers aging and blocks reclaim from that memcg. This easily wastes reclaim retry cycles. In the worst case, if reclaim is making slow progress and all following attempts fail because they are blocked by aging, it triggers an unexpected early OOM.

And if a lruvec requires aging, that doesn't mean it is hot. Instead, the lruvec could have been idle for quite a while, and hence it might contain lots of cold folios to be reclaimed.

While it's helpful to rotate the memcg LRU after aging for global reclaim, as global reclaim fairness is coupled with the rotation in shrink_many, memcg fairness is instead handled by cgroup iteration in shrink_node_memcgs. So, for memcg level pressure, this abort is not the key part of keeping fairness. And in most cases, there is no need to age, and fairness must be achieved by upper-level reclaim control.

So instead, just keep the scanning going unless one whole batch of folios failed to be isolated or enough folios have been scanned, which is signaled by evict_folios returning 0. And only abort for global reclaim after one batch, so when there are fewer memcgs, progress is still made, and the fairness mechanism described above still works fine.

And in most cases, the one extra batch attempt for global reclaim might be just enough to satisfy what the reclaimer needs, hence improving global reclaim performance by reducing reclaim retry cycles.

Rotation still happens after the reclaim is done, which still follows the comment in mmzone.h. And fairness still looks good.
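The loop control described above can be modeled in userspace roughly as follows. This is a simplified sketch under stated assumptions, not the kernel implementation: struct toy_lruvec, toy_evict() and toy_shrink() are hypothetical, aging is reduced to a flag, and the memcg protection and should_abort_scan() checks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lruvec. */
struct toy_lruvec {
	long cold;         /* folios an eviction pass can still scan */
	bool needs_aging;  /* what should_run_aging() would report */
};

/* Stand-in for evict_folios(): returns how many folios were scanned. */
static long toy_evict(struct toy_lruvec *v, long nr_batch)
{
	long delta = v->cold < nr_batch ? v->cold : nr_batch;

	v->cold -= delta;
	return delta;
}

/* Returns the total scanned, to make the effect of the early break visible. */
static long toy_shrink(struct toy_lruvec *v, long nr_to_scan, long batch,
		       bool root_reclaim)
{
	bool should_age = false;
	long total = 0;

	assert(batch > 0);
	while (nr_to_scan > 0) {
		long nr_batch, delta;

		if (v->needs_aging) {
			v->needs_aging = false;  /* pretend aging ran */
			should_age = true;       /* remember it, but keep going */
		}

		nr_batch = nr_to_scan < batch ? nr_to_scan : batch;
		delta = toy_evict(v, nr_batch);
		if (!delta)
			break;
		total += delta;

		/* Root reclaim stops after one batch once aging was needed. */
		if (root_reclaim && should_age)
			break;

		nr_to_scan -= delta;
	}
	return total;
}
```

With a budget of 128 and a batch of 64, the model scans one batch for root reclaim after aging and the full budget for cgroup reclaim, instead of scanning nothing as the old unconditional break would.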
Reviewed-by: Axel Rasmussen Reviewed-by: Chen Ridong Reviewed-by: Barry Song Signed-off-by: Kairui Song --- mm/vmscan.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 2a607546277c..42ccc6eb0748 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4985,7 +4985,7 @@ static bool should_abort_scan(struct lruvec *lruvec, = struct scan_control *sc) */ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_contro= l *sc) { - bool need_rotate =3D false; + bool need_rotate =3D false, should_age =3D false; long nr_batch, nr_to_scan; int swappiness =3D get_swappiness(lruvec, sc); struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); @@ -5003,8 +5003,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) if (should_run_aging(lruvec, max_seq, sc, swappiness)) { if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false)) need_rotate =3D true; - /* stop scanning as it's low on cold folios */ - break; + should_age =3D true; } =20 nr_batch =3D min(nr_to_scan, MIN_LRU_BATCH); @@ -5015,6 +5014,13 @@ static bool try_to_shrink_lruvec(struct lruvec *lruv= ec, struct scan_control *sc) if (should_abort_scan(lruvec, sc)) break; =20 + /* + * Root reclaim needs rotation when low on cold folio for better + * fairness. Cgroup reclaim gets fairness from the iterator. 
+ */ + if (root_reclaim(sc) && should_age) + break; + nr_to_scan -=3D delta; cond_resched(); } --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 858743E929C for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=CxzHjkPox50MWs5Xs9ZoWgwUl5yxBxKU2d0GjiZ7+fxqEZSfFwYiAiI87lbbIsJc9iMe2PYW3ojoAIRBPBjQItXvKrZBzwXPh42zXQ3v5IylU4LKtzvViQNbAr14YB4GsPU6Otq1SIuAlBxv72BBIkKFujN9Ign51TeSggl6b64= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=ZfS24WSAF5BaTElfUTuRXGGO0zKEsVChuXuWM96dvwE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=VYxrc0wNcyv5K2xeoL6Jb/+8ikxT6r+PtmAlO2DMXq5iMeAQ/aCiagZm3w21hm4LFIeL7528nZ5bN+IQJeSEkSdFRIICJC1FlrbNle6lBq12RtdJ18OLjY7CochMKAZE7Rb+QeeHRMfdXSeuew94jj4Ug5SYknbh1IlyemhJQRc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MEuLA6HU; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MEuLA6HU" Received: by smtp.kernel.org (Postfix) with ESMTPS id 22D93C4AF49; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=ZfS24WSAF5BaTElfUTuRXGGO0zKEsVChuXuWM96dvwE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=MEuLA6HUYZXObFY9Naw8Q8ooiuidv4TF1l3tpKPY92AkNAstaHUXz1Vntf7NK6+zc LLbklRBnM3G7sbCVkXd1W6Lb7d75WOkFOqwhoFVSkecECetHSD9z4Y3LBGl447SKMA 
JnKB/DAteT47EcGWF5H30TeFMijh+F5Y+w8QfA19dC/xEKwxlKCrKsajnBc1jligpq hqS8i35hmE2U3trYzLScd+MW/ATlzhQ+F9+6AZq0UzsDNbcS9Jhl+IrEuKLBWhlBcd NspUeQMTwINfr6zMeDZOQw4tIhu6oyN1+Jd2smk/697wgWqvB6/qWJavzRhpCap+Cc pBU11XNlDY+hg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1B19CFF8860; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:00 +0800 Subject: [PATCH v7 09/15] mm/mglru: remove redundant swap constrained check upon isolation Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-9-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=1862; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=3kVEaa+69q2QQdWMI3d9+Nc1UFdcO1MQkSA+3AlCcvY=; b=G0qlbP8TdGe9GvyLqb1tlEmx425wLBr5K4r5niqGUlo9Pwy5K0n28djEMUxQFw2UmH4bI6Tn4 IONoSmPMaMfDBaJ3v78FFlusOvvvbq0AlGOER8YZb3XgTRX1ViCR3xB X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: 
kasong@tencent.com From: Kairui Song

Remove the swap-constrained early reject check upon isolation. This check is a micro optimization for when swap IO is not allowed, so that folios are rejected early. But it is redundant and overly broad, since shrink_folio_list() already handles all these cases with proper granularity.

Notably, this check wrongly rejected lazyfree folios, and it doesn't cover all rejection cases. shrink_folio_list() uses may_enter_fs(), which distinguishes non-SWP_FS_OPS devices from filesystem-backed swap and does all the checks after the folio is locked, so flags like swap cache are stable.

This check also covers dirty file folios, which are not a problem now since sort_folio() already bumps dirty file folios to the next generation, but it causes trouble for unifying dirty folio writeback handling.

And there should be no performance impact from removing it. We may have lost a micro optimization, but we have unblocked lazyfree reclaim for NOIO contexts, which are not a common case in the first place.
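To make the lazyfree case concrete, the removed condition can be restated over plain booleans. This is an illustrative sketch only: struct toy_folio and the toy_* helpers are hypothetical, and describing a lazyfree folio as a clean anonymous folio that is neither swap backed nor in the swap cache is an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flattened view of the folio flags involved. */
struct toy_folio {
	bool dirty;
	bool anon;
	bool swapcache;
	bool swapbacked;
};

/* The condition this patch deletes from isolate_folio(), over booleans. */
static bool toy_old_early_reject(bool gfp_io, const struct toy_folio *f)
{
	return !gfp_io && (f->dirty || (f->anon && !f->swapcache));
}

/* Assumption: lazyfree means anonymous and not swap backed. */
static bool toy_is_lazyfree(const struct toy_folio *f)
{
	return f->anon && !f->swapbacked;
}

/* Build the clean lazyfree folio used in the discussion above. */
static struct toy_folio toy_make_lazyfree(void)
{
	struct toy_folio f = {
		.dirty = false,
		.anon = true,
		.swapcache = false,
		.swapbacked = false,
	};

	assert(toy_is_lazyfree(&f));
	return f;
}
```

In this model, a folio from toy_make_lazyfree() is rejected whenever gfp_io is false, even though discarding it needs no swap IO, because the old condition never looked at the swap backed state.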
Reviewed-by: Axel Rasmussen Reviewed-by: Baolin Wang Reviewed-by: Chen Ridong Reviewed-by: Barry Song Signed-off-by: Kairui Song --- mm/vmscan.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 42ccc6eb0748..ea86297b604c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4650,12 +4650,6 @@ static bool isolate_folio(struct lruvec *lruvec, str= uct folio *folio, struct sca { bool success; =20 - /* swap constrained */ - if (!(sc->gfp_mask & __GFP_IO) && - (folio_test_dirty(folio) || - (folio_test_anon(folio) && !folio_test_swapcache(folio)))) - return false; - /* raced with release_pages() */ if (!folio_try_get(folio)) return false; --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 855D73E9299 for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=pox0luTnpNrRS0HL7fjdpFaozq4dTlyLNXvHvHIV4HmEzQ07RQ7DJ5FDfurH0wsxrqi9/ANj5R5rbgE0OY+IkX8PwxDWvOD2nIWKZhbQIxDAjouvZGhLlAeSbUaskarpDvJ3CQvKcDzokUmssUZmdU0qLmd/ZteyjcJ9gIJHwUo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=mfCiqor+VVrtAxYwGEGeUEVf/Pi3BKvbjBUnqOQUaPE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XFj8X8Fkze5idQPac6eAG85mly1YuH6xevYOyrblL3DACTn/9oUVs6Y7qf/KgeOp3o+uOH0YLK/AkEyZ/0ThFYF+otOG7JiID4HQVBgE5jcYejXSZmKosKTVDml7V0EgrRbEtMgMMEVIpLJjFVaAymFtz745lz4TgBjkhfb7GgU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JaU0rWeJ; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JaU0rWeJ" Received: by smtp.kernel.org (Postfix) with ESMTPS id 60160C4DDE6; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=mfCiqor+VVrtAxYwGEGeUEVf/Pi3BKvbjBUnqOQUaPE=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=JaU0rWeJHka/nwMu5j7tPyoiBOJQqs7PHa1ZQwXmorpqFBI/tGgJ/FE50pnT6wju9 1zXmb2S0bYAatmAG1NmBNHW+PCUjXswebhSBM/rsmR3RLtnQhhvd2q/1DqanhbS97C sEoK+Vw7zvvGnZvUA/+H7D3xIT8tRePCPkmMfJbwleUWNxRVTiyjY+rAc2l3V7Og2e 1/Of3SC27Cn7wkwR+u6tM9lkdY9x2IivbckeKOSnw+3dHBxWPxW9BNw+S0gqfk64JS PrwwLdgky837ZUg/c7B9rbatIJjq9r1bwixSrthptJK0EVWe7NA4HVFt20tqFclVMp M71MNYJAkjt3A== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 543ADFF8868; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:01 +0800 Subject: [PATCH v7 10/15] mm/mglru: use the common routine for dirty/writeback reactivation Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-10-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi 
Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=3255; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=XRKrUmNoX/1BhqYmqpem5sAID4oP2CHmPQ5jLKoIQdw=; b=+HxbLcfJ19SoORyG4Y2OLPDeNrxicgrtnsrc/6FJCgoKX2cLEPX61RNMFPk5HJmQvkHrBNbSr u4v9lgVufb7Aek6O+dUkL62DzBR3aENw2GgiNw8QbH0UDfCAUqpgJtM X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Currently MGLRU moves dirty and writeback folios to the second oldest gen instead of reactivating them like the classical LRU. This might help to reduce LRU contention, as it skips the isolation. But as a result we will see these folios at the LRU tail more frequently, leading to inefficient reclaim. Besides, the dirty / writeback check after isolation in shrink_folio_list() is more accurate and covers more cases. So instead, just drop the special handling for dirty and writeback folios, use the common routine, and reactivate them like the classical LRU. This should in theory improve the scan efficiency. These folios will be rotated back to the LRU tail once writeback is done, so there is no risk of hotness inversion. And now each reclaim loop will have a higher success rate. This also prepares for unifying the writeback and throttling mechanism with the classical LRU: we keep these folios far from the tail, so detecting the tail batch will follow a pattern similar to the classical LRU. The micro optimization that avoids LRU contention by skipping the isolation is gone, which should be fine. Compared to IO and writeback cost, the isolation overhead is trivial. Using the common routine also keeps the folio's referenced bits (tier bits), which could improve metrics in the long term. There is also no more need to clear the reclaim bit, as the common routine will make use of it.
Note that the common routine updates a few throttling and writeback counters, which are not used in the MGLRU case and never have been. We will start making use of these in later commits. Reviewed-by: Axel Rasmussen Reviewed-by: Barry Song Reviewed-by: Baolin Wang Signed-off-by: Kairui Song --- mm/vmscan.c | 19 ------------------- 1 file changed, 19 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index ea86297b604c..bb7e2cecf48e 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4578,7 +4578,6 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c int tier_idx) { bool success; - bool dirty, writeback; int gen = folio_lru_gen(folio); int type = folio_is_file_lru(folio); int zone = folio_zonenum(folio); @@ -4628,21 +4627,6 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c return true; } - dirty = folio_test_dirty(folio); - writeback = folio_test_writeback(folio); - if (type == LRU_GEN_FILE && dirty) { - sc->nr.file_taken += delta; - if (!writeback) - sc->nr.unqueued_dirty += delta; - } - - /* waiting for writeback */ - if (writeback || (type == LRU_GEN_FILE && dirty)) { - gen = folio_inc_gen(lruvec, folio, true); - list_move(&folio->lru, &lrugen->folios[gen][type][zone]); - return true; - } - return false; } @@ -4664,9 +4648,6 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca if (!folio_test_referenced(folio)) set_mask_bits(&folio->flags.f, LRU_REFS_MASK, 0); - /* for shrink_folio_list() */ - folio_clear_reclaim(folio); - success = lru_gen_del_folio(lruvec, folio, true); VM_WARN_ON_ONCE_FOLIO(!success, folio); -- 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 935F43E92B1 for ; Mon, 
27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=et4YEUlH2ggHcdlAouxFTI3JK8bu18o9IvdjWzVXHVXj93UrgqSO7tUOfmYPUEGP0eq+hJzRRAbrAQjRNeMnDYuHIHwAIr5G8YAs3LPLSBWdtLsO1gZNznrPA2bosOygte6zMdRU8nJa7Bg3WmRj0ajppSHqCqoTs6BNuAqXz68= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=fqzS+yy4Zz4+CewbZmc3fyi0n4VbFRk8JdMTnpxI1hU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=pJDWv33yyOOB7NujGg5yFF7hVq4gfbKOYc5qYaeaAmmxQ+FUCEM3n2aj9haTFK7YksC4BOmjwY5nR7jKajONq3OHYkZl7Mo6ApZ/JbArTEqqCGV7DdE+f7x4nRxlkDnEXHSgyUA/yxgXwfb0Moz96ItjcB/PJLMXoiNaigXOHPo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=akxen9tU; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="akxen9tU" Received: by smtp.kernel.org (Postfix) with ESMTPS id 72E69C4DDED; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=fqzS+yy4Zz4+CewbZmc3fyi0n4VbFRk8JdMTnpxI1hU=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=akxen9tU4EByj85HgnE/ncmuVpWc4mxVqF4HknmOgIIZxGNmh1ZfqdGmp6mikegXn P2KnQjH/3pyBnnXbqojjmXr7PRQp/A+uAeH135OHBNtz7/FXbZIDpLA9+lSuuv4c8q 3qgMXHdz+N3A0YuQ79ZqNQtss+eP/u0SzOCGZsaT8NfpfZ2HOkUMK2g+My/5LQgMuS FO8G+jGFfh6fDDPpduTuEW/t5bGOQ18Ynlzz1IjYfkIaO/rEib53tIhxvEEbrDuwIZ uQttiXzKlOye3Y4NX5Vtd2ac1jfH0D5ALsOFsA8z4FnRWLsJ7gxxGc489MZCWsyWax Lqln4NcUkH/8A== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 69882FF886C; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: 
Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:02 +0800 Subject: [PATCH v7 11/15] mm/mglru: simplify and improve dirty writeback handling Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-11-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=4415; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=cckLEz79VV56ZiUG6KqsWF5FQTcZVMYl+1QmaOX61Aw=; b=Msq+P7v0Wqp375X++8WEZ0cAbRUha+ShxY+d+gfHvEzcaVzUnT+EYNvFKDbjwPUNYw4YUcDgw IwxsBmE1TjZDAfal83WdqP9VxTv/KDEZe5o9XxCvkeeMGhS7ofcfow6 X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Right now the flusher wakeup mechanism for MGLRU is less responsive and less likely to trigger than that of the classical LRU. The classical LRU wakes the flusher if one batch of folios passed to shrink_folio_list() is unevictable due to being under writeback. MGLRU instead checks and handles this only after the whole reclaim loop is done.
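The difference between a per-batch check and a single check after the whole loop can be sketched in user space as follows. This is a simplified model, not kernel code: struct batch, wakes_per_batch() and wakes_after_loop() are hypothetical names for this example, and each batch is assumed to carry only the number of folios taken and the number found dirty but not yet queued for writeback.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-batch result: folios taken vs. unqueued dirty folios. */
struct batch {
	unsigned long taken;
	unsigned long unqueued_dirty;
};

/* Per-batch style: wake as soon as one whole batch is unqueued dirty. */
static bool wakes_per_batch(const struct batch *b, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (b[i].taken && b[i].unqueued_dirty == b[i].taken)
			return true;
	return false;
}

/* End-of-loop style: accumulate over all batches, compare once. */
static bool wakes_after_loop(const struct batch *b, size_t n)
{
	unsigned long taken = 0, dirty = 0;

	for (size_t i = 0; i < n; i++) {
		taken += b[i].taken;
		dirty += b[i].unqueued_dirty;
	}
	return dirty && dirty == taken;
}
```

In this model, one fully dirty batch followed by one clean batch triggers the per-batch check but not the accumulated one, because the clean batch dilutes the totals. That is one way the end-of-loop check can end up less likely to trigger.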
We previously even saw OOM problems due to a passive flusher, which were fixed, though the fix is still not perfect [1]. We have just unified the dirty folio counting and activation routine, so now just move the dirty flush into the loop, right after shrink_folio_list(). This improves performance a lot for workloads involving heavy writeback and prepares for throttling too. Testing with YCSB workloadb showed a major performance improvement:

Before this series:
  Throughput(ops/sec): 62485.02962831822
  AverageLatency(us): 500.9746963330107
  pgpgin 159347462
  workingset_refault_file 34522071

After this commit:
  Throughput(ops/sec): 80857.08510208207
  AverageLatency(us): 386.653262968934
  pgpgin 112233121
  workingset_refault_file 19516246

The performance is a lot better, with significantly fewer refaults. We also observed similar or higher performance gains for other real-world workloads. One concern was that the dirty flush could cause more SSD wear. That should not be a problem here, since the wakeup condition is that dirty folios have been pushed to the tail of the LRU, which indicates that memory pressure is so high that writeback is already blocking the workload. Reviewed-by: Axel Rasmussen Link: https://lore.kernel.org/linux-mm/20241026115714.1437435-1-jingxiangzeng.cas@gmail.com/ [1] Reviewed-by: Baolin Wang Signed-off-by: Kairui Song --- mm/vmscan.c | 41 ++++++++++++++++------------------------- 1 file changed, 16 insertions(+), 25 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index bb7e2cecf48e..244cdae99573 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4724,8 +4724,6 @@ static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec, trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scanned, skipped, isolated, type ? 
LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); - if (type == LRU_GEN_FILE) - sc->nr.file_taken += isolated; *isolatedp = isolated; return scanned; @@ -4838,12 +4836,27 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec, return scanned; retry: reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false, memcg); - sc->nr.unqueued_dirty += stat.nr_unqueued_dirty; sc->nr_reclaimed += reclaimed; trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, type_scanned, reclaimed, &stat, sc->priority, type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); + /* + * If too many file cache in the coldest generation can't be evicted + * due to being dirty, wake up the flusher. + */ + if (stat.nr_unqueued_dirty == isolated) { + wakeup_flusher_threads(WB_REASON_VMSCAN); + + /* + * For cgroupv1 dirty throttling is achieved by waking up + * the kernel flusher here and later waiting on folios + * which are in writeback to finish (see shrink_folio_list()). + */ + if (!writeback_throttling_sane(sc)) + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); + } + list_for_each_entry_safe_reverse(folio, next, &list, lru) { DEFINE_MIN_SEQ(lruvec); @@ -5000,28 +5013,6 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc) cond_resched(); } - /* - * If too many file cache in the coldest generation can't be evicted - * due to being dirty, wake up the flusher. - */ - if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty == sc->nr.file_taken) { - struct pglist_data *pgdat = lruvec_pgdat(lruvec); - - wakeup_flusher_threads(WB_REASON_VMSCAN); - - /* - * For cgroupv1 dirty throttling is achieved by waking up - * the kernel flusher here and later waiting on folios - * which are in writeback to finish (see shrink_folio_list()). - * - * Flusher may not be able to issue writeback quickly - * enough for cgroupv1 writeback throttling to work - * on a large system. 
- */ - if (!writeback_throttling_sane(sc)) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); - } - return need_rotate; } =20 --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A43BD3B47C5 for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=eIrgFR4Uv7Y1stofg1+thpQfIRzGJt4yWSX3XosUMeZHuNsYpQ+BzsyAFHGVC9wu3qWXsZOybiOqB5gHj+YvLFlsbM/ynJUJI2k6eMpQ3xvD35HkrI9NVFwIQytdFfl2yxkIMYQkLA0FLa45yN75yPa5bj1na4C97oAggefYiWU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=MBJDdRh3tqnWkjaabvjBfwwcCLCcoj+BlqMmh8jqgUY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=kVYoAg/o0PkPwPT9omj1PJ24C6AxJEbrtxgSPmiaupxz62Wiq4WWdIqVqZGL/CDjHku7zrmqYCdfdLjCKsn0jVew8VwMWIbZALFMfeHVZlUquYZ8jhvh014WwBae1ufhZ2pOqGSnDhvVYwL0m4msYFEg2h+dF1WjUkgoIlVgFy4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uecVsYXV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uecVsYXV" Received: by smtp.kernel.org (Postfix) with ESMTPS id 86E37C4DDF4; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=MBJDdRh3tqnWkjaabvjBfwwcCLCcoj+BlqMmh8jqgUY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=uecVsYXVPsfU3dimrq4dMrG2CKVmGRwOBl+VoH3TnG0c5x5+EsB7PybwMKcFaAaFi 
FfBkVLtUw/Knp+ZzVtaya++435mY9e5DCuUu99GnoEAtSsowP1XyIaMM10ebzQ+ouz oh8Ci7LSQXDK/vZ+pV9sDsGPYbnh4tRS9xr6qbrrIxe5quPlCyaAFmEssB961ZgRYY nf6+8hfLNZtp97fcXzXmAMfDUfvb5L3rmE+LrK4z2jgC/AtuTAkoYvChWHQmHURQJE nQ/PTm4IjxSzG5ru0F5TiFp8qiuu//9wW4abKwGe9GIbegxqqu9mEOF5zJxew3ywAM fVEi6HvnEaMlA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F910FF8868; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:03 +0800 Subject: [PATCH v7 12/15] mm/mglru: remove no longer used reclaim argument for folio protection Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-12-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=2629; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=5MkZqaWFeDlbbVjT1CAX+BljQEXWptawPTTuIO8j3f0=; b=mR+MxcXz4S/4V8MORLJC3tJ+UhzfA/ZNwM7r0rjIyEpR04Q4YSg09TK+L9kniBOJLY+GydXK8 qEetAauTeq2CMCvcNQx+IyWlefD3FlE0+1gzgD7+yxUNUDEp6CbOc18 X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for 
kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Dirty folios under reclaim are now handled after isolation rather than before, since dirty reactivation must take the folio off the LRU first, which also helps to unify the dirty handling logic. So the reclaiming argument of folio_inc_gen() is no longer needed. Just remove it. Reviewed-by: Axel Rasmussen Signed-off-by: Kairui Song --- mm/vmscan.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 244cdae99573..eb7eb2ed1830 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3220,7 +3220,7 @@ static int folio_update_gen(struct folio *folio, int gen) } /* protect pages accessed multiple times through file descriptors */ -static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclaiming) +static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio) { int type = folio_is_file_lru(folio); struct lru_gen_folio *lrugen = &lruvec->lrugen; @@ -3239,9 +3239,6 @@ static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclai new_flags = old_flags & ~(LRU_GEN_MASK | LRU_REFS_FLAGS); new_flags |= (new_gen + 1UL) << LRU_GEN_PGOFF; - /* for folio_end_writeback() */ - if (reclaiming) - new_flags |= BIT(PG_reclaim); } while (!try_cmpxchg(&folio->flags.f, &old_flags, new_flags)); lru_gen_update_size(lruvec, folio, old_gen, new_gen); @@ -3855,7 +3852,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness) VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio); VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio); - new_gen = folio_inc_gen(lruvec, folio, false); + new_gen = folio_inc_gen(lruvec, folio); list_move_tail(&folio->lru, &lrugen->folios[new_gen][type][zone]); /* don't count the workingset being lazily promoted */ @@ -4607,7 +4604,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct 
scan_c =20 /* protected */ if (tier > tier_idx || refs + workingset =3D=3D BIT(LRU_REFS_WIDTH) + 1) { - gen =3D folio_inc_gen(lruvec, folio, false); + gen =3D folio_inc_gen(lruvec, folio); list_move(&folio->lru, &lrugen->folios[gen][type][zone]); =20 /* don't count the workingset being lazily promoted */ @@ -4622,7 +4619,7 @@ static bool sort_folio(struct lruvec *lruvec, struct = folio *folio, struct scan_c =20 /* ineligible */ if (zone > sc->reclaim_idx) { - gen =3D folio_inc_gen(lruvec, folio, false); + gen =3D folio_inc_gen(lruvec, folio); list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]); return true; } --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B9E903E9585 for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; b=MuC2imo1bBGZSaa/cyktMhEcLN5qQG3ga07UTmqkXNO8tpwfzUBtV7prqgaJkOtPfD+2WLdYN8mVAU3kcOt0+k0AMXgTqDBZPWS5ucs2Sr0jW2UV9tNuLa6vr4CYaDISvCGjK2/6bZVG1PupxX0cX264Th5Mt7GS+z/tywKt+zE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=kmvumb0PGIIBcHlg6/hochvDVLMcvKAdlTkAkPkvOBM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=XiUK3FUrT6V0pUthCjYMierQPWUlvxm7LEqrV40WzU06ye0YLbGapBXcZNBxzLtZ8+g78B0Bf/gOTNuOuix+431Vs6GGPE1LMvoFjKH96IKPDZwNy4RlSa56e6Tcy+7zfhqnTht7/Abrjd0l4iUM/8JlCk6yWsoa8XYL4j0cfdc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=B7Orrjcp; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; 
dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="B7Orrjcp" Received: by smtp.kernel.org (Postfix) with ESMTPS id 9C4ECC4DDFB; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=kmvumb0PGIIBcHlg6/hochvDVLMcvKAdlTkAkPkvOBM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=B7OrrjcpGvn39NWAoXBKJRoZ2J5TfqgKaK45aSA+SPslYxPxj9tsY7XC2wBYlloKf HXg5Q8PvoY7UMhWA1jLuzwcwaXW3yGwV+UG/5Dcx9ieHRRkpp/6BZfitvEfhk6w6Bp bgdCmbDxYcjLPtocar/W3qSGPDE7lB/MDBonus4h9OQUnHnYMdyTE79snwuBZQ8yFJ fzdSQd137BMIJRV6rcP2vWoKQcUPptwQELhK+zDbVURO40ynd0KKQPKrOXkZw2BnZa ihZOMykUj7lblAak04jH8nvfQBv544C3O4N89IenmKp7ypCIZIZEfUVFFOXSVmIXP7 hbGneh8BzEJyQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9336CFF886D; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:04 +0800 Subject: [PATCH v7 13/15] mm/vmscan: remove sc->file_taken Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-13-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; 
l=1018; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=9c7VfboiDatsWWazTVjLAi/CqFbpKc1QmkcqxwyMdzA=; b=X7lkAO3FySIMO8Ir2SgVmPCmLlS03OhHuBnFLWiLfH9TXS5FnfC4TIDCkRKFWCR6uUEz+MA28 qtDKliaN/SyAqUO1bJzHN6YDr9WsQ07RwN/t/2dmsEDV49tzrMpmKeD X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song No one is using it now, just remove it. Reviewed-by: Axel Rasmussen Reviewed-by: Baolin Wang Reviewed-by: Chen Ridong Signed-off-by: Kairui Song --- mm/vmscan.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index eb7eb2ed1830..a071f7444232 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -173,7 +173,6 @@ struct scan_control { unsigned int congested; unsigned int writeback; unsigned int immediate; - unsigned int file_taken; unsigned int taken; } nr; =20 @@ -2040,8 +2039,6 @@ static unsigned long shrink_inactive_list(unsigned lo= ng nr_to_scan, sc->nr.writeback +=3D stat.nr_writeback; sc->nr.immediate +=3D stat.nr_immediate; sc->nr.taken +=3D nr_taken; - if (file) - sc->nr.file_taken +=3D nr_taken; =20 trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, nr_scanned, nr_reclaimed, &stat, sc->priority, file); --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CFBF03E958C for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; cv=none; 
b=nsR/toCyysABZ+Ozcy36/tFZ8j7GceKQMrxpg/tqdP0orIk2K51yDWWhkBprLo2M22tDcGoHMFrjf9ckECN8gcD59igW1teym5m5+TtJaFvypnhD+eG1v6SybH63vPN3fE3NEmw0eMZsqLF6bZMTmK5Rc+9Jz1RpJMT+iZZp8gg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313220; c=relaxed/simple; bh=sXAj14EE+Y6fcBwNhkAfFMLO/N6XGozNJ2a53+EUB28=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=ZtRFkePkPpJg/coRR/CevMrj0RodZ3JTWFY28imITujU4SzsL/ymhmJnSfzzcol8HH2y3ynfhqvhjrK8CtOIZ1aH4qIOmGFo5889qY3sRQk/YxcikX0t+IYtli6uuEsoYftH1TI9AQtlkk/daodU9CGgbX1Tlol33YPQ2U40JUY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LwzQdNsz; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="LwzQdNsz" Received: by smtp.kernel.org (Postfix) with ESMTPS id B0600C4DE01; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=sXAj14EE+Y6fcBwNhkAfFMLO/N6XGozNJ2a53+EUB28=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=LwzQdNszgApHjndc4uf/XtSyXXCtvtxz5VxWGtRNgUI8kkm9G26Hqv/Q2YeFh7Fqq gCX8ZTVpdAnM3dzqQgeHYVKxZ1P3XxqUdhlo3dkwCxrx1Xg4yMMIiC6CFlxjssecdu 7dAY1coxIYLO+0LPmxZRzp15NAXFoKwgklPJTtpJCD3Hbr1qD4XdgbbXWsum40O9iA kna1FMztSXc8Zgs927+/cOaAQAxvpyPQFpdFeRx41nNq2n+vEDup1zMK9P4UaistaU enaUbH33yfSRj9CQOlL6sJwtNcV/sscQi6z9ilH+N89UebTBJSanvWfX1VDFkaHRIx TeCQWkzm5QUQw== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id A7B9AFF8860; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:05 +0800 Subject: [PATCH v7 14/15] mm/vmscan: remove sc->unqueued_dirty Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-14-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=1092; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=nYfNvdV9CEi02RTXoIeJTz6Ua3OVwuJC4SglYki/XDU=; b=ymQtFeT3iVPi0UYmSf84OL3PLltMC+dD/RNFaNXKtoh8vcDiFERF6hhVpYQ7u4jhPRG38S25S XkVONkI9iGKAMo+/vc31PW17puEG7cruJSdfIWRNYsLpn1bIsSOWqAa X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song No one is using it now, just remove it. 
Suggested-by: Axel Rasmussen Reviewed-by: Baolin Wang Reviewed-by: Axel Rasmussen Reviewed-by: Barry Song Reviewed-by: Chen Ridong Signed-off-by: Kairui Song --- mm/vmscan.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index a071f7444232..902ca52ca381 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -169,7 +169,6 @@ struct scan_control { =20 struct { unsigned int dirty; - unsigned int unqueued_dirty; unsigned int congested; unsigned int writeback; unsigned int immediate; @@ -2035,7 +2034,6 @@ static unsigned long shrink_inactive_list(unsigned lo= ng nr_to_scan, =20 sc->nr.dirty +=3D stat.nr_dirty; sc->nr.congested +=3D stat.nr_congested; - sc->nr.unqueued_dirty +=3D stat.nr_unqueued_dirty; sc->nr.writeback +=3D stat.nr_writeback; sc->nr.immediate +=3D stat.nr_immediate; sc->nr.taken +=3D nr_taken; --=20 2.54.0 From nobody Thu Jun 11 19:01:02 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E30B53E9592 for ; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313221; cv=none; b=m+/2Al6qpz3D0aJl5cLJPp8a6O2Bf/rsArrg4VNYMAZikIBNMkt6WVloW1p/Ic8VG7CaIwW/a9INVcQN8UrElT8MYAYUGG1jLaC+Fsyv5NKrqtQ9W4iWitcmQBV2px3SqDD1cREknJeRdWqPa9Arizn9sCLL1Z0XNK6QV/bWNEI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777313221; c=relaxed/simple; bh=TltjEqiK8mp2zGBFuhVqWMZLycNs4HGZe1tGT42589U=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=CgY0rwZPS8+F9BNkscJeKLI5CM1guK8YvpJMUnbqOTrqLpviTymLGyg7JLBgmmdptiQfdmUhneqRV4BHz2hg3ypwXhe5Xr0KBsfOqmxH/BGVYuC9sSPU5i6SjK88naJlQei9DIuxniF3jMlYACTyefewJnGEWn9K6UppuFv0SOI= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IxvlEBdN; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IxvlEBdN" Received: by smtp.kernel.org (Postfix) with ESMTPS id C1C54C4DDFF; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777313220; bh=TltjEqiK8mp2zGBFuhVqWMZLycNs4HGZe1tGT42589U=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=IxvlEBdNl5ezRo0sFBZOSuOwdFXqZ1J3+ehS9WAX/Xx1DS/g7KaXRbNAQWYmCY8LX tbTDOudCf8+a6vDgnbuvAiLZubbbivtkAQ/2Wz+LyWGBMx9Z/nPBrZK6ofPoybPYo0 TH3q6sAadEPid/3Uu3HqQjpxJBBW0BNxlt2X6E/EX533cykCnl4pVLC6GCuCU/KSNO V6eyXbdN1LEG4jwp5FJDm5u8QB4Ag3ATbQMz9j16TlOkD5b8336t91N9VvPkN7fkUs yyEb3kjpuycxOialuAE+KPuDqWXKqlmnXSfSCGz2LTA8x+BWzmXsU2p+PNVDG9BnR6 T3F3ZS+OYoRDg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA33AFF8868; Mon, 27 Apr 2026 18:07:00 +0000 (UTC) From: Kairui Song via B4 Relay Date: Tue, 28 Apr 2026 02:07:06 +0800 Subject: [PATCH v7 15/15] mm/vmscan: unify writeback reclaim statistic and throttling Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260428-mglru-reclaim-v7-15-02fabb92dc43@tencent.com> References: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> In-Reply-To: <20260428-mglru-reclaim-v7-0-02fabb92dc43@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Kairui Song , Johannes Weiner , David Hildenbrand , Michal Hocko , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu 
Zhao , Zicheng Wang , Baolin Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, linux-kernel@vger.kernel.org, Kairui Song , Qi Zheng X-Mailer: b4 0.15.2 X-Developer-Signature: v=1; a=ed25519-sha256; t=1777313217; l=6821; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=cQ7SJbHtgzouLQNheQ5U8x6eH/mNrNkWUSaVWupAfY4=; b=bdzV3avOotE9stSy0GuDAjEjDar1QjhkCFUp7c862T4KjZU7Njhl5t31hyzGzCOP3EKeKIVRA UrIdHvmVLqRD8A/Dful197cVNiCL/UaQUSUDE1zlD0jfM38jNuWAhHT X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song

Currently, MGLRU and non-MGLRU handle reclaim statistics and writeback
very differently, especially throttling. MGLRU basically ignores the
throttling part.

Unify this part by using a helper to deduplicate the code, so that both
setups share the same behavior.

Tested with the following bash reproducer:

echo "Setup a slow device using dm delay"
dd if=3D/dev/zero of=3D/var/tmp/backing bs=3D1M count=3D2048
LOOP=3D$(losetup --show -f /var/tmp/backing)
mkfs.ext4 -q $LOOP
echo "0 $(blockdev --getsz $LOOP) delay $LOOP 0 0 $LOOP 0 1000" | \
  dmsetup create slow_dev
mkdir -p /mnt/slow && mount /dev/mapper/slow_dev /mnt/slow

echo "Start writeback pressure"
sync && echo 3 > /proc/sys/vm/drop_caches
mkdir /sys/fs/cgroup/test_wb
echo 128M > /sys/fs/cgroup/test_wb/memory.max
(echo $BASHPID > /sys/fs/cgroup/test_wb/cgroup.procs && \
  dd if=3D/dev/zero of=3D/mnt/slow/testfile bs=3D1M count=3D192)

echo "Clean up"
echo "0 $(blockdev --getsz $LOOP) error" | dmsetup load slow_dev
dmsetup resume slow_dev
umount -l /mnt/slow && sync
dmsetup remove slow_dev

Before this commit, `dd` gets OOM killed immediately if MGLRU is
enabled. Classic LRU is fine.
After this commit, throttling is effective: reclaim no longer spins on
the LRU and there is no premature OOM. Stress tests on other workloads
also look good.

Global throttling is not covered yet; we will fix that separately later.

Suggested-by: Chen Ridong
Tested-by: Leno Hou
Reviewed-by: Axel Rasmussen
Reviewed-by: Baolin Wang
Signed-off-by: Kairui Song
---
 mm/vmscan.c | 92 +++++++++++++++++++++++++++++----------------------------=
----
 1 file changed, 43 insertions(+), 49 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c index 902ca52ca381..e452cb043d46 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1942,6 +1942,44 @@ static int current_may_throttle(void) return !(current->flags & PF_LOCAL_THROTTLE); } =20 +static void handle_reclaim_writeback(unsigned long nr_taken, + struct pglist_data *pgdat, + struct scan_control *sc, + struct reclaim_stat *stat) +{ + /* + * If dirty folios are scanned that are not queued for IO, it + * implies that flushers are not doing their job. This can + * happen when memory pressure pushes dirty folios to the end of + * the LRU before the dirty limits are breached and the dirty + * data has expired. It can also happen when the proportion of + * dirty folios grows not through writes but through memory + * pressure reclaiming all the clean cache. And in some cases, + * the flushers simply cannot keep up with the allocation + * rate. Nudge the flusher threads in case they are asleep. + */ + if (stat->nr_unqueued_dirty =3D=3D nr_taken) { + wakeup_flusher_threads(WB_REASON_VMSCAN); + /* + * For cgroupv1 dirty throttling is achieved by waking up + * the kernel flusher here and later waiting on folios + * which are in writeback to finish (see shrink_folio_list()). + * + * Flusher may not be able to issue writeback quickly + * enough for cgroupv1 writeback throttling to work + * on a large system.
+ */ + if (!writeback_throttling_sane(sc)) + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); + } + + sc->nr.dirty +=3D stat->nr_dirty; + sc->nr.congested +=3D stat->nr_congested; + sc->nr.writeback +=3D stat->nr_writeback; + sc->nr.immediate +=3D stat->nr_immediate; + sc->nr.taken +=3D nr_taken; +} + /* * shrink_inactive_list() is a helper for shrink_node(). It returns the n= umber * of reclaimed pages @@ -2005,39 +2043,7 @@ static unsigned long shrink_inactive_list(unsigned l= ong nr_to_scan, lruvec_lock_irq(lruvec); lru_note_cost_unlock_irq(lruvec, file, stat.nr_pageout, nr_scanned - nr_reclaimed); - - /* - * If dirty folios are scanned that are not queued for IO, it - * implies that flushers are not doing their job. This can - * happen when memory pressure pushes dirty folios to the end of - * the LRU before the dirty limits are breached and the dirty - * data has expired. It can also happen when the proportion of - * dirty folios grows not through writes but through memory - * pressure reclaiming all the clean cache. And in some cases, - * the flushers simply cannot keep up with the allocation - * rate. Nudge the flusher threads in case they are asleep. - */ - if (stat.nr_unqueued_dirty =3D=3D nr_taken) { - wakeup_flusher_threads(WB_REASON_VMSCAN); - /* - * For cgroupv1 dirty throttling is achieved by waking up - * the kernel flusher here and later waiting on folios - * which are in writeback to finish (see shrink_folio_list()). - * - * Flusher may not be able to issue writeback quickly - * enough for cgroupv1 writeback throttling to work - * on a large system. 
- */ - if (!writeback_throttling_sane(sc)) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); - } - - sc->nr.dirty +=3D stat.nr_dirty; - sc->nr.congested +=3D stat.nr_congested; - sc->nr.writeback +=3D stat.nr_writeback; - sc->nr.immediate +=3D stat.nr_immediate; - sc->nr.taken +=3D nr_taken; - + handle_reclaim_writeback(nr_taken, pgdat, sc, &stat); trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, nr_scanned, nr_reclaimed, &stat, sc->priority, file); return nr_reclaimed; @@ -4829,26 +4835,13 @@ static int evict_folios(unsigned long nr_to_scan, s= truct lruvec *lruvec, retry: reclaimed =3D shrink_folio_list(&list, pgdat, sc, &stat, false, memcg); sc->nr_reclaimed +=3D reclaimed; + /* Retry pass is only meant for clean folios without new isolation */ + if (isolated) + handle_reclaim_writeback(isolated, pgdat, sc, &stat); trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, type_scanned, reclaimed, &stat, sc->priority, type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); =20 - /* - * If too many file cache in the coldest generation can't be evicted - * due to being dirty, wake up the flusher. - */ - if (stat.nr_unqueued_dirty =3D=3D isolated) { - wakeup_flusher_threads(WB_REASON_VMSCAN); - - /* - * For cgroupv1 dirty throttling is achieved by waking up - * the kernel flusher here and later waiting on folios - * which are in writeback to finish (see shrink_folio_list()). - */ - if (!writeback_throttling_sane(sc)) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); - } - list_for_each_entry_safe_reverse(folio, next, &list, lru) { DEFINE_MIN_SEQ(lruvec); =20 @@ -4891,6 +4884,7 @@ static int evict_folios(unsigned long nr_to_scan, str= uct lruvec *lruvec, =20 if (!list_empty(&list)) { skip_retry =3D true; + isolated =3D 0; goto retry; } =20 --=20 2.54.0