From nobody Thu Apr 2 12:41:27 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 987B8EEB3 for ; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; cv=none; b=DX/3prk8k0QDxgy48YLvLdMUI7HjQiQQy6+UrHnLerl/avr9KJ4RHKESSRJ/BT8bF34ykI1B/mL2CvQC62Ol1On/zTJXpi/Q/JQrWbnY4cIM73437uxnVomRwM4lDu8tyZvDX2WJTXclL/1vbP2VTrjkIChc8BIgZWruvO0gA+o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; c=relaxed/simple; bh=pu/4aSrd7KSa0bxFp6wbeBj72fjrUtTNvGBmWMpcIZ0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=iYl7uMKxxm9Z+5Rxq+IcdUKaqnrH/WukrvWjBAHhSlZrYg7c802cGoV9Cgz7ScWY5O14sD+sXq5oAXOAMVJzN02Ol+rEE0euNa0nTuku+8pJexHOhmwNwljH2RmvKcwe0iyhkmRlPMEDFR39O3yCwMFxMucj4aksrY6/GYR61Ao= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=aJT1dYUy; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="aJT1dYUy" Received: by smtp.kernel.org (Postfix) with ESMTPS id 3FFADC2BC9E; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774727551; bh=pu/4aSrd7KSa0bxFp6wbeBj72fjrUtTNvGBmWMpcIZ0=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=aJT1dYUyNuZFIazRsyNJ/qM06xTk8MMsVQ54NlxE8kIYlPSS1XP2RRRKd5tK8aQBo vqYiLoBQ0/34wZOK1FPuyAFC5uOpEcDPU+L3+2zOjVfIYfecQjPAX3HOpL6391S0BF 3RRgB4TqQHW4Vr5mGEr+j9m2WDstmphTlZkgbt1hnU2Kh0lPMEeRpD74CfbXk0yu+o 
fML2o9so/dudx3hdW86dNLB9swnbnvuhah0CYqoTO5QANAPzT7tPcbcI/qmIsoz+SM IrTkXPCvLGTjAxcVG/DCvS9q3sLAQQMpyoPkhdeZomr4TF/E/bl8+xb40FV0zLfs+H tTx2VA/dObO3A== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27A7D10F3DF4; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) From: Kairui Song via B4 Relay Date: Sun, 29 Mar 2026 03:52:27 +0800 Subject: [PATCH v2 01/12] mm/mglru: consolidate common code for retrieving evictable size Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260329-mglru-reclaim-v2-1-b53a3678513c@tencent.com> References: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Johannes Weiner , David Hildenbrand , Michal Hocko , Qi Zheng , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, Qi Zheng , Baolin Wang , Kairui Song X-Mailer: b4 0.15.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1774727549; l=3179; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=lovGNHm1LZYPq5LXk6kp/WVa26J2TdTx/242vg8RH/A=; b=hj21ORrtZUdpOHmPYaiMvHKxSRm7TyFCW8gCpjAv36+6LS9wHgaZodmMBJOM/fXMIulXhR2HX 3PH781p7nFRA0yC9EulTtuByYWOmLHfk+hMTzmS1fYftr7+Eg03L4XR X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Merge commonly used code for counting evictable folios in a lruvec. 
No behavior change. Return unsigned long instead of long as suggested [ Axel Rasmussen ] Acked-by: Yuanchu Xie Reviewed-by: Barry Song Reviewed-by: Chen Ridong Reviewed-by: Axel Rasmussen Reviewed-by: Baolin Wang Signed-off-by: Kairui Song --- mm/vmscan.c | 36 ++++++++++++++---------------------- 1 file changed, 14 insertions(+), 22 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 5a8c8fcccbfc..adc07501a137 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4084,27 +4084,33 @@ static void set_initial_priority(struct pglist_data= *pgdat, struct scan_control sc->priority =3D clamp(priority, DEF_PRIORITY / 2, DEF_PRIORITY); } =20 -static bool lruvec_is_sizable(struct lruvec *lruvec, struct scan_control *= sc) +static unsigned long lruvec_evictable_size(struct lruvec *lruvec, int swap= piness) { int gen, type, zone; - unsigned long total =3D 0; - int swappiness =3D get_swappiness(lruvec, sc); + unsigned long seq, total =3D 0; struct lru_gen_folio *lrugen =3D &lruvec->lrugen; - struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); DEFINE_MAX_SEQ(lruvec); DEFINE_MIN_SEQ(lruvec); =20 for_each_evictable_type(type, swappiness) { - unsigned long seq; - for (seq =3D min_seq[type]; seq <=3D max_seq; seq++) { gen =3D lru_gen_from_seq(seq); - for (zone =3D 0; zone < MAX_NR_ZONES; zone++) total +=3D max(READ_ONCE(lrugen->nr_pages[gen][type][zone]), 0L); } } =20 + return total; +} + +static bool lruvec_is_sizable(struct lruvec *lruvec, struct scan_control *= sc) +{ + unsigned long total; + int swappiness =3D get_swappiness(lruvec, sc); + struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); + + total =3D lruvec_evictable_size(lruvec, swappiness); + /* whether the size is big enough to be helpful */ return mem_cgroup_online(memcg) ? 
(total >> sc->priority) : total; } @@ -4909,9 +4915,6 @@ static int evict_folios(unsigned long nr_to_scan, str= uct lruvec *lruvec, static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, int swappiness, unsigned long *nr_to_scan) { - int gen, type, zone; - unsigned long size =3D 0; - struct lru_gen_folio *lrugen =3D &lruvec->lrugen; DEFINE_MIN_SEQ(lruvec); =20 *nr_to_scan =3D 0; @@ -4919,18 +4922,7 @@ static bool should_run_aging(struct lruvec *lruvec, = unsigned long max_seq, if (evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS > max_seq) return true; =20 - for_each_evictable_type(type, swappiness) { - unsigned long seq; - - for (seq =3D min_seq[type]; seq <=3D max_seq; seq++) { - gen =3D lru_gen_from_seq(seq); - - for (zone =3D 0; zone < MAX_NR_ZONES; zone++) - size +=3D max(READ_ONCE(lrugen->nr_pages[gen][type][zone]), 0L); - } - } - - *nr_to_scan =3D size; + *nr_to_scan =3D lruvec_evictable_size(lruvec, swappiness); /* better to run aging even though eviction is still possible */ return evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS =3D=3D max_se= q; } --=20 2.53.0 From nobody Thu Apr 2 12:41:27 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 987FD1E868 for ; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; cv=none; b=OfOh/pe2FP3P+9j+SkL9CQDJ8o9hUJ5QnZaRbfHddk3dZ0JgTQcyU0wp8xaL3CiD212vk90JuNWDnCvR9YpaflDVTDxS878GmjSNJMUm6qy/Cx8mZogySYGabLJ8g8q4UYBNoSE1WJKDnvzKLkd1HN7zPW90TiR3f7NCkX/TdXo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; c=relaxed/simple; bh=qMplLn8Oud7+IXw7Jqz7Q0ci6PUtsFK5BHg8NJ7v5bw=; 
h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=KPlqPkZm1OBnFpXVFmpq71GGqLqNPJRk/jrv2pednuvXiUPsqS+cKX2YUUfMWKL8YJqp+dZPSxvluOWbifyY/EilKHq6nmjaILKy0XGnu0rAV9j63TWTRzhl/s9W+WsBVQAi7YtzUMNxZRGvVBj5ZsMyCmNGtEz0ZLyhjn2hyYE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=MfSYtNrV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="MfSYtNrV" Received: by smtp.kernel.org (Postfix) with ESMTPS id 46CE6C2BCB0; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774727551; bh=qMplLn8Oud7+IXw7Jqz7Q0ci6PUtsFK5BHg8NJ7v5bw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=MfSYtNrVdAb9CNV1YiWdPTrNQu7qn0K5wAF1PUFmhUD5bXa8W03nZLYDK2SlbxFDH mgufnpN7O3tHPWcSe9ncrvmx8B2lgTpIQrQxWUSWVN0jDte4nFlhWtO4EPs3I6RZT8 3yFS31+bqXuHP5BtJfoddsQe73x5LwA2TtMg06XOXrFtKOIojMK/gMusP5wOqAnjRK sC4x7XxgXxwGXGnGvJMdA8rzuVpnZS/+NR50pklxzCG97wPSPokxum9sLGROE5j+6M 4TrGGckf/+do38KgKWZcV6VJQDRuEMY+mhXQotdu8CG5IPxDQpwG7MIgnF4SpZnnGk wvu0AMO57JbNA== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39A0810F3DF5; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) From: Kairui Song via B4 Relay Date: Sun, 29 Mar 2026 03:52:28 +0800 Subject: [PATCH v2 02/12] mm/mglru: rename variables related to aging and rotation Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260329-mglru-reclaim-v2-2-b53a3678513c@tencent.com> References: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> In-Reply-To: 
<20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Johannes Weiner , David Hildenbrand , Michal Hocko , Qi Zheng , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, Qi Zheng , Baolin Wang , Kairui Song X-Mailer: b4 0.15.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1774727549; l=2780; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=/S4DxXkZbkmPymDZ4txoaa0S870DUGxVp3pt8OKH86o=; b=pAXljMxVpl/JontxkdHTDUsp9N8qbT9jY4ejGTIS5od5hogsVJMNr2Ts2tYOn2yIXqguzcG76 SJV0Ty4ZYC2AXCeVrwYGiT0H3sxlzJDkoGBIVKBtITYnqp2xFDzeTQY X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song The current variable name isn't helpful. Make the variable names more meaningful. Only naming change, no behavior change. 
Suggested-by: Barry Song Signed-off-by: Kairui Song Reviewed-by: Baolin Wang Reviewed-by: Barry Song Reviewed-by: Chen Ridong --- mm/vmscan.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index adc07501a137..f336f89a2de6 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4934,7 +4934,7 @@ static bool should_run_aging(struct lruvec *lruvec, u= nsigned long max_seq, */ static long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc,= int swappiness) { - bool success; + bool need_aging; unsigned long nr_to_scan; struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); DEFINE_MAX_SEQ(lruvec); @@ -4942,7 +4942,7 @@ static long get_nr_to_scan(struct lruvec *lruvec, str= uct scan_control *sc, int s if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg)) return -1; =20 - success =3D should_run_aging(lruvec, max_seq, swappiness, &nr_to_scan); + need_aging =3D should_run_aging(lruvec, max_seq, swappiness, &nr_to_scan); =20 /* try to scrape all its memory if this memcg was deleted */ if (nr_to_scan && !mem_cgroup_online(memcg)) @@ -4951,7 +4951,7 @@ static long get_nr_to_scan(struct lruvec *lruvec, str= uct scan_control *sc, int s nr_to_scan =3D apply_proportional_protection(memcg, sc, nr_to_scan); =20 /* try to get away with not aging at the default priority */ - if (!success || sc->priority =3D=3D DEF_PRIORITY) + if (!need_aging || sc->priority =3D=3D DEF_PRIORITY) return nr_to_scan >> sc->priority; =20 /* stop scanning this lruvec as it's low on cold folios */ @@ -5040,7 +5040,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) =20 static int shrink_one(struct lruvec *lruvec, struct scan_control *sc) { - bool success; + bool need_rotate; unsigned long scanned =3D sc->nr_scanned; unsigned long reclaimed =3D sc->nr_reclaimed; struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); @@ -5058,7 +5058,7 @@ static int shrink_one(struct lruvec *lruvec, struct s= can_control *sc) 
memcg_memory_event(memcg, MEMCG_LOW); } =20 - success =3D try_to_shrink_lruvec(lruvec, sc); + need_rotate =3D try_to_shrink_lruvec(lruvec, sc); =20 shrink_slab(sc->gfp_mask, pgdat->node_id, memcg, sc->priority); =20 @@ -5068,10 +5068,10 @@ static int shrink_one(struct lruvec *lruvec, struct= scan_control *sc) =20 flush_reclaim_state(sc); =20 - if (success && mem_cgroup_online(memcg)) + if (need_rotate && mem_cgroup_online(memcg)) return MEMCG_LRU_YOUNG; =20 - if (!success && lruvec_is_sizable(lruvec, sc)) + if (!need_rotate && lruvec_is_sizable(lruvec, sc)) return 0; =20 /* one retry if offlined or too small */ --=20 2.53.0 From nobody Thu Apr 2 12:41:27 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A73DC33BBB1 for ; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; cv=none; b=jZdWUZ4XbSotduEoTlZjsRKiSkHZs7wJ4Uacdx+hsFkJkAAnuBVpt5tLD7i+M2VUQvt+J4hTWkNGqp9zordZX0o4a2HC3dlsb6dtiJdhysTOEIl/zhcxKjCj2RgoB8dDSSdyGOoWiZLkXysyAPbQFFdOhcVWFUKBqZp58XP2k08= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; c=relaxed/simple; bh=V+u/1ghTYrgsgZAfHoL8/JWPdvg7AFyTZRkLhwRadE8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=SMBOXDdvZI7/dnR3uJpNCegCqrmXLHBlPt+vSXsmX2AR2t3MR4XWVnFEZw5PyqaP/bn1pogPlv63XoL/P7oLsnFTUcXM7oQovVtl3/XWEE+zyPuifKMXEaugbo+Luw6IAePhyPPEOicKyhG3wF44QdfL493MzMwyIWdhxA/fe1k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HHpOUNrF; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HHpOUNrF" Received: by smtp.kernel.org (Postfix) with ESMTPS id 5FA35C2BCB3; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774727551; bh=V+u/1ghTYrgsgZAfHoL8/JWPdvg7AFyTZRkLhwRadE8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=HHpOUNrFFJaqCL69NLU86O/jJ9w2xAaAjGbTCQ7OdmuyyvOBCb6yI+HtYo3lgh0rD w2jFfUSnaapZr7GPScDbG41DXMpSSNbz9aen++HR4SbfqcVLzSX2X72fb45svUC70Y aouKEhNmuU1px3JbsChmNZ1lUc26NSKAfY8R+jXUd+0oFdb9hG42873XpEeYD8vfQn A/5n95vWQtWFMaTyGciGiTc33aUYLrQyFyEcsp/Q0W5AmtEash7RmQ453DsUPVmlzT ke3wwlzLBrHF+z1NdQKDJPc4PkQk0+Tik0aPNAJbvYwaFkgocG0AFjUeTC03IxfIss HTtn7HsjHIJbw== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BC8D10F3DF8; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) From: Kairui Song via B4 Relay Date: Sun, 29 Mar 2026 03:52:29 +0800 Subject: [PATCH v2 03/12] mm/mglru: relocate the LRU scan batch limit to callers Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260329-mglru-reclaim-v2-3-b53a3678513c@tencent.com> References: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Johannes Weiner , David Hildenbrand , Michal Hocko , Qi Zheng , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, Qi Zheng , Baolin Wang , Kairui Song X-Mailer: b4 0.15.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1774727549; l=3157; 
i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=4QFLBbTOMAuh+ZOeAXX+ybk8/f1xj2wdgM33A3nTdsE=; b=sb434GG68neBgavvN9RF+c/IZlt6aLeFt5v1BLAVw88W8kUhrfTWqgPklM9E2FF+SOStm/2tW g4HlPTF6IdBAKLmenUvKBiUiE5/2hi06XqPfY2dbZ+qyx02+g3i0r2m X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song Same as active / inactive LRU, MGLRU isolates and scans folios in batches. The batch split is done hidden deep in the helper, which makes the code harder to follow. The helper's arguments are also confusing since callers usually request more folios than the batch size, so the helper almost never processes the full requested amount. Move the batch splitting into the top loop to make it cleaner, there should be no behavior change. Reviewed-by: Axel Rasmussen Signed-off-by: Kairui Song Reviewed-by: Baolin Wang Reviewed-by: Barry Song --- mm/vmscan.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index f336f89a2de6..963362523782 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4695,10 +4695,10 @@ static int scan_folios(unsigned long nr_to_scan, st= ruct lruvec *lruvec, int scanned =3D 0; int isolated =3D 0; int skipped =3D 0; - int scan_batch =3D min(nr_to_scan, MAX_LRU_BATCH); - int remaining =3D scan_batch; + unsigned long remaining =3D nr_to_scan; struct lru_gen_folio *lrugen =3D &lruvec->lrugen; =20 + VM_WARN_ON_ONCE(nr_to_scan > MAX_LRU_BATCH); VM_WARN_ON_ONCE(!list_empty(list)); =20 if (get_nr_gens(lruvec, type) =3D=3D MIN_NR_GENS) @@ -4751,7 +4751,7 @@ static int scan_folios(unsigned long nr_to_scan, stru= ct lruvec *lruvec, mod_lruvec_state(lruvec, item, isolated); mod_lruvec_state(lruvec, PGREFILL, sorted); mod_lruvec_state(lruvec, PGSCAN_ANON + type, isolated); - 
trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, scan_batch, + trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scanned, skipped, isolated, type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); if (type =3D=3D LRU_GEN_FILE) @@ -4987,7 +4987,7 @@ static bool should_abort_scan(struct lruvec *lruvec, = struct scan_control *sc) =20 static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_contro= l *sc) { - long nr_to_scan; + long nr_batch, nr_to_scan; unsigned long scanned =3D 0; int swappiness =3D get_swappiness(lruvec, sc); =20 @@ -4998,7 +4998,8 @@ static bool try_to_shrink_lruvec(struct lruvec *lruve= c, struct scan_control *sc) if (nr_to_scan <=3D 0) break; =20 - delta =3D evict_folios(nr_to_scan, lruvec, sc, swappiness); + nr_batch =3D min(nr_to_scan, MAX_LRU_BATCH); + delta =3D evict_folios(nr_batch, lruvec, sc, swappiness); if (!delta) break; =20 @@ -5623,6 +5624,7 @@ static int run_aging(struct lruvec *lruvec, unsigned = long seq, static int run_eviction(struct lruvec *lruvec, unsigned long seq, struct s= can_control *sc, int swappiness, unsigned long nr_to_reclaim) { + int nr_batch; DEFINE_MAX_SEQ(lruvec); =20 if (seq + MIN_NR_GENS > max_seq) @@ -5639,8 +5641,8 @@ static int run_eviction(struct lruvec *lruvec, unsign= ed long seq, struct scan_co if (sc->nr_reclaimed >=3D nr_to_reclaim) return 0; =20 - if (!evict_folios(nr_to_reclaim - sc->nr_reclaimed, lruvec, sc, - swappiness)) + nr_batch =3D min(nr_to_reclaim - sc->nr_reclaimed, MAX_LRU_BATCH); + if (!evict_folios(nr_batch, lruvec, sc, swappiness)) return 0; =20 cond_resched(); --=20 2.53.0 From nobody Thu Apr 2 12:41:27 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A039633ADB1 for ; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; cv=none; b=RrL17791VXWyHxUb5PAxjbEoj1Thn1KoJb1GERVw33zFEzLH3C/4zKUOirO9ixV/llu5fYCN3AE4S359sEpEW8VLoXAyEQadEcFKeDuH/lPGL0FL14phmyBSal60cGUqd03Mxrj6PwVKdtTpYtI7rbp2rHQRaoeBFG3BCpRtx40= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727551; c=relaxed/simple; bh=t6MsHbcxvfU7IqiZkN8IgPyQhTVvlGzqYWvOIjmPUCQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=b2v/J0bbwWYwmGVFkuQfJsICxtGs1zmvjpILajPXn1Gud8g4HuS/k6w55Ev2VF8i1YueecpcxpXbVxKU1ewrBxiwhnqd6tgAJMBnwI4tnsA4zBuxdg3uTjQIFt/e8wdwBwydx/8E3ek5XvGB5vYaTgXuGV6PliGAh9H2I54FsZg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=s8VTAgzS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="s8VTAgzS" Received: by smtp.kernel.org (Postfix) with ESMTPS id 68DD8C2BCC4; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774727551; bh=t6MsHbcxvfU7IqiZkN8IgPyQhTVvlGzqYWvOIjmPUCQ=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=s8VTAgzSmPwtDN1GP79+hO4kjdVsFxC779L2/EL8SJA5psrugNCdmqwQK2Z9NdxfI VT+1rXTihRb5YMDNZh0Vtrk1gXiVxSMeaCTdaKIApTP/aB9nKPay3YQDveGnt3cTmj mCF028ALIwxn//NTAG6mpFmmVVCCJruocg6XTdcQQxereYnYlZJoB8u0qx7wnpJJ8e SJWvS4OBEZHGA0pkwq9i65qRX0XGTVJk8/LjKsJUg733sCkcegOBjNoOO7FnxsHuLt 9De/3lBkbg/bdvCOMeok77vsIel74diDOK71ApiSrZmdBNZ+nfhnEIEyAtIN4jGcIS ahuUzfNcJMOvg== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DD5910F3DF6; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) From: Kairui Song via B4 Relay Date: Sun, 29 Mar 2026 03:52:30 +0800 Subject: [PATCH v2 
04/12] mm/mglru: restructure the reclaim loop Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260329-mglru-reclaim-v2-4-b53a3678513c@tencent.com> References: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Johannes Weiner , David Hildenbrand , Michal Hocko , Qi Zheng , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, Qi Zheng , Baolin Wang , Kairui Song X-Mailer: b4 0.15.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1774727549; l=5791; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=NTEcn1j1VzD/wA0hpGeUq8Ux67UjJ/BHDr4z4JuB2Ak=; b=ZR4spFHD3fszxHo09h9YABEgsRpcCC2VBGLvaWxsV5L0Ba2zGd1rjDJYWfE8ZKU5gqpLtXv/9 g9nmZviuqhGBRsBsBhJ7eNpMLRCSnRaHYI+odCb0I2qqilJp8/Rcezy X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 X-Original-From: Kairui Song Reply-To: kasong@tencent.com From: Kairui Song The current loop will calculate the scan number on each iteration. The number of folios to scan is based on the LRU length, with some unclear behaviors, eg, it only shifts the scan number by reclaim priority at the default priority, and it couples the number calculation with aging and rotation. Adjust, simplify it, and decouple aging and rotation. Just calculate the scan number for once at the beginning of the reclaim, always respect the reclaim priority, and make the aging and rotation more explicit. 
This slightly changes how offline memcg aging works: previously, offline memcg wouldn't be aged unless it didn't have any evictable folios. Now, we might age it if it has only 3 generations and the reclaim priority is less than DEF_PRIORITY, which should be fine. On one hand, offline memcg might still hold long-term folios, and in fact, a long-existing offline memcg must be pinned by some long-term folios like shmem. These folios might be used by other memcg, so aging them as ordinary memcg doesn't seem wrong. And besides, aging enables further reclaim of an offlined memcg, which will certainly happen if we keep shrinking it. And offline memcg might soon be no longer an issue once reparenting is all ready. Overall, the memcg LRU rotation, as described in mmzone.h, remains the same. Reviewed-by: Axel Rasmussen Signed-off-by: Kairui Song --- mm/vmscan.c | 70 +++++++++++++++++++++++++++++++--------------------------= ---- 1 file changed, 36 insertions(+), 34 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 963362523782..ab81ffdb241a 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4913,49 +4913,40 @@ static int evict_folios(unsigned long nr_to_scan, s= truct lruvec *lruvec, } =20 static bool should_run_aging(struct lruvec *lruvec, unsigned long max_seq, - int swappiness, unsigned long *nr_to_scan) + struct scan_control *sc, int swappiness) { DEFINE_MIN_SEQ(lruvec); =20 - *nr_to_scan =3D 0; /* have to run aging, since eviction is not possible anymore */ if (evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS > max_seq) return true; =20 - *nr_to_scan =3D lruvec_evictable_size(lruvec, swappiness); + /* try to get away with not aging at the default priority */ + if (sc->priority =3D=3D DEF_PRIORITY) + return false; + /* better to run aging even though eviction is still possible */ return evictable_min_seq(min_seq, swappiness) + MIN_NR_GENS =3D=3D max_se= q; } =20 -/* - * For future optimizations: - * 1. 
Defer try_to_inc_max_seq() to workqueues to reduce latency for memcg - * reclaim. - */ -static long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc,= int swappiness) +static long get_nr_to_scan(struct lruvec *lruvec, struct scan_control *sc, + struct mem_cgroup *memcg, int swappiness) { - bool need_aging; unsigned long nr_to_scan; - struct mem_cgroup *memcg =3D lruvec_memcg(lruvec); - DEFINE_MAX_SEQ(lruvec); - - if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg)) - return -1; - - need_aging =3D should_run_aging(lruvec, max_seq, swappiness, &nr_to_scan); =20 + nr_to_scan =3D lruvec_evictable_size(lruvec, swappiness); /* try to scrape all its memory if this memcg was deleted */ - if (nr_to_scan && !mem_cgroup_online(memcg)) + if (!mem_cgroup_online(memcg)) return nr_to_scan; =20 nr_to_scan =3D apply_proportional_protection(memcg, sc, nr_to_scan); =20 - /* try to get away with not aging at the default priority */ - if (!need_aging || sc->priority =3D=3D DEF_PRIORITY) - return nr_to_scan >> sc->priority; - - /* stop scanning this lruvec as it's low on cold folios */ - return try_to_inc_max_seq(lruvec, max_seq, swappiness, false) ? -1 : 0; + /* + * Always respect scan priority, minimally target + * SWAP_CLUSTER_MAX pages to keep reclaim moving forwards. + */ + nr_to_scan >>=3D sc->priority; + return max(nr_to_scan, SWAP_CLUSTER_MAX); } =20 static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *= sc) @@ -4985,31 +4976,43 @@ static bool should_abort_scan(struct lruvec *lruvec= , struct scan_control *sc) return true; } =20 +/* + * For future optimizations: + * 1. Defer try_to_inc_max_seq() to workqueues to reduce latency for memcg + * reclaim. 
+ */
 static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
+	bool need_rotate = false;
 	long nr_batch, nr_to_scan;
-	unsigned long scanned = 0;
 	int swappiness = get_swappiness(lruvec, sc);
+	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
 
-	while (true) {
+	nr_to_scan = get_nr_to_scan(lruvec, sc, memcg, swappiness);
+	while (nr_to_scan > 0) {
 		int delta;
+		DEFINE_MAX_SEQ(lruvec);
 
-		nr_to_scan = get_nr_to_scan(lruvec, sc, swappiness);
-		if (nr_to_scan <= 0)
+		if (mem_cgroup_below_min(sc->target_mem_cgroup, memcg)) {
+			need_rotate = true;
 			break;
+		}
+
+		if (should_run_aging(lruvec, max_seq, sc, swappiness)) {
+			if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false))
+				need_rotate = true;
+			break;
+		}
 
 		nr_batch = min(nr_to_scan, MAX_LRU_BATCH);
 		delta = evict_folios(nr_batch, lruvec, sc, swappiness);
 		if (!delta)
 			break;
 
-		scanned += delta;
-		if (scanned >= nr_to_scan)
-			break;
-
 		if (should_abort_scan(lruvec, sc))
 			break;
 
+		nr_to_scan -= delta;
 		cond_resched();
 	}
 
@@ -5035,8 +5038,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
 	}
 
-	/* whether this lruvec should be rotated */
-	return nr_to_scan < 0;
+	return need_rotate;
 }
 
 static int shrink_one(struct lruvec *lruvec, struct scan_control *sc)
-- 
2.53.0

From nobody Thu Apr 2 12:41:27 2026
From: Kairui Song via B4 Relay
Reply-To: kasong@tencent.com
Date: Sun, 29 Mar 2026 03:52:31 +0800
Subject: [PATCH v2 05/12] mm/mglru: scan and count the exact number of folios
Message-Id: <20260329-mglru-reclaim-v2-5-b53a3678513c@tencent.com>
In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com>
To: linux-mm@kvack.org

From: Kairui Song

Make the scan helpers return the exact number of folios scanned or
isolated. Since the reclaim loop now has a natural scan budget that
controls the scan progress, returning the scan count directly makes the
scan more accurate and easier to follow.

The number of folios scanned in each iteration is always positive
unless the reclaim must stop for a forced aging, so there is no further
need for special handling when no progress is made:

- `return isolated || !remaining ?
scanned : 0` in scan_folios: both the function and the caller now
  just return the exact scan count, combined with the scan budget
  introduced in the previous commit to avoid livelock or under-scanning.

- `scanned += try_to_inc_min_seq` in evict_folios: adding a bool as a
  scan count was confusing and is no longer needed, as the scan count
  will never be zero even if none of the folios in the oldest
  generation are isolated.

- `evictable_min_seq + MIN_NR_GENS > max_seq` guard in evict_folios:
  the per-type `get_nr_gens == MIN_NR_GENS` check in scan_folios
  naturally returns 0 when only two gens remain and breaks the loop.

Also move try_to_inc_min_seq before isolate_folios, so that any empty
gens created by external folio freeing are also skipped. The scan still
stops if only two gens are left, as the scan count will be zero; this
behavior is the same as before. This forced gen protection may be
removed or softened later to improve reclaim a bit more.

Signed-off-by: Kairui Song
---
 mm/vmscan.c | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ab81ffdb241a..c5361efa6776 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4686,7 +4686,7 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
 
 static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 		       struct scan_control *sc, int type, int tier,
-		       struct list_head *list)
+		       struct list_head *list, int *isolatedp)
 {
 	int i;
 	int gen;
@@ -4756,11 +4756,9 @@ static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 				type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
 	if (type == LRU_GEN_FILE)
 		sc->nr.file_taken += isolated;
-	/*
-	 * There might not be eligible folios due to reclaim_idx. Check the
-	 * remaining to prevent livelock if it's not making progress.
-	 */
-	return isolated || !remaining ? scanned : 0;
+
+	*isolatedp = isolated;
+	return scanned;
 }
 
 static int get_tier_idx(struct lruvec *lruvec, int type)
@@ -4804,33 +4802,36 @@ static int get_type_to_scan(struct lruvec *lruvec, int swappiness)
 
 static int isolate_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 			  struct scan_control *sc, int swappiness,
-			  int *type_scanned, struct list_head *list)
+			  struct list_head *list, int *isolated,
+			  int *isolate_type, int *isolate_scanned)
 {
 	int i;
+	int scanned = 0;
 	int type = get_type_to_scan(lruvec, swappiness);
 
 	for_each_evictable_type(i, swappiness) {
-		int scanned;
+		int type_scan;
 		int tier = get_tier_idx(lruvec, type);
 
-		*type_scanned = type;
+		type_scan = scan_folios(nr_to_scan, lruvec, sc,
+					type, tier, list, isolated);
 
-		scanned = scan_folios(nr_to_scan, lruvec, sc, type, tier, list);
-		if (scanned)
-			return scanned;
+		scanned += type_scan;
+		if (*isolated) {
+			*isolate_type = type;
+			*isolate_scanned = type_scan;
+			break;
+		}
 
 		type = !type;
 	}
 
-	return 0;
+	return scanned;
 }
 
 static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 			struct scan_control *sc, int swappiness)
 {
-	int type;
-	int scanned;
-	int reclaimed;
 	LIST_HEAD(list);
 	LIST_HEAD(clean);
 	struct folio *folio;
@@ -4838,19 +4839,18 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 	enum node_stat_item item;
 	struct reclaim_stat stat;
 	struct lru_gen_mm_walk *walk;
+	int scanned, reclaimed;
+	int isolated = 0, type, type_scanned;
 	bool skip_retry = false;
-	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 
 	lruvec_lock_irq(lruvec);
 
-	scanned = isolate_folios(nr_to_scan, lruvec, sc, swappiness, &type, &list);
-
-	scanned += try_to_inc_min_seq(lruvec, swappiness);
+	try_to_inc_min_seq(lruvec, swappiness);
 
-	if (evictable_min_seq(lrugen->min_seq, swappiness) + MIN_NR_GENS > lrugen->max_seq)
-		scanned = 0;
+	scanned = isolate_folios(nr_to_scan, lruvec, sc, swappiness,
+				 &list, &isolated, &type, &type_scanned);
 
 	lruvec_unlock_irq(lruvec);
 
@@ -4861,7 +4861,7 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 	sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
 	sc->nr_reclaimed += reclaimed;
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
-			scanned, reclaimed, &stat, sc->priority,
+			type_scanned, reclaimed, &stat, sc->priority,
 			type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
 
 	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
-- 
2.53.0

From nobody Thu Apr 2 12:41:27 2026
From: Kairui Song via B4 Relay
Reply-To: kasong@tencent.com
Date: Sun, 29 Mar 2026 03:52:32 +0800
Subject: [PATCH v2 06/12] mm/mglru: use a smaller batch for reclaim
Message-Id: <20260329-mglru-reclaim-v2-6-b53a3678513c@tencent.com>
In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com>
To: linux-mm@kvack.org

From: Kairui Song

With a fixed number to reclaim calculated at the beginning, making each
following step smaller reduces lock contention and avoids
over-aggressive reclaim, as the loop aborts earlier once the target
number of folios has been reclaimed.

Reviewed-by: Axel Rasmussen
Reviewed-by: Chen Ridong
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c5361efa6776..e3ca38d0c4cd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5004,7 +5004,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 			break;
 		}
 
-		nr_batch = min(nr_to_scan, MAX_LRU_BATCH);
+		nr_batch = min(nr_to_scan, MIN_LRU_BATCH);
 		delta = evict_folios(nr_batch, lruvec, sc, swappiness);
 		if (!delta)
 			break;
-- 
2.53.0

From nobody Thu Apr 2 12:41:27 2026
From: Kairui Song via B4 Relay
Reply-To: kasong@tencent.com
Date: Sun, 29 Mar 2026 03:52:33 +0800
Subject: [PATCH v2 07/12] mm/mglru: don't abort scan immediately right after aging
Message-Id: <20260329-mglru-reclaim-v2-7-b53a3678513c@tencent.com>
In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com>
To: linux-mm@kvack.org

From: Kairui Song

Right now, if eviction triggers aging, the reclaimer aborts. This is
not the optimal strategy, for several reasons. Aborting early wastes a
reclaim cycle when under pressure, and with concurrent reclaim, if the
LRU is under aging, all concurrent reclaimers might fail. If the aging
has just finished, the new cold folios it exposed are not reclaimed
until the next reclaim iteration.

What's more, the current aging trigger is quite lenient: having 3 gens
with a reclaim priority lower than the default triggers aging and
blocks reclaiming from one memcg. This easily wastes reclaim retry
cycles.

In the worst case, if reclaim is making slow progress and all following
attempts fail because they are blocked by aging, it triggers an
unexpected early OOM.

A lruvec that requires aging is not necessarily hot. It could have been
idle for quite a while and hence contain many cold folios to be
reclaimed.

While it is helpful to rotate the memcg LRU after aging for global
reclaim, since global reclaim fairness is coupled with the rotation in
shrink_many, memcg fairness is instead handled by the cgroup iteration
in shrink_node_memcgs. So for memcg-level pressure, this abort is not
the key part of keeping fairness. In most cases there is no need to
age, and fairness must be achieved by upper-level reclaim control.

So instead, keep the scan going unless one whole batch of folios failed
to be isolated or enough folios have been scanned, which is signaled by
evict_folios returning 0. Only abort for global reclaim after one more
batch, so that when there are fewer memcgs, progress is still made and
the fairness mechanism described above still works. In most cases, that
one extra batch for global reclaim is just enough to satisfy what the
reclaimer needs, improving global reclaim performance by reducing
reclaim retry cycles.

Rotation still happens after the reclaim is done, which still follows
the comment in mmzone.h, and fairness still looks good.
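The loop behavior described above can be condensed into a small userspace model (a schematic sketch only, not kernel code: `MIN_LRU_BATCH` is given an arbitrary stand-in value here, `evict_fn` stands in for evict_folios, and `shrink_model`/`evict_all` are hypothetical names):

```c
#include <assert.h>

#define MIN_LRU_BATCH 64	/* stand-in value, not the kernel's definition */

typedef long (*evict_fn)(long nr_batch);

/*
 * Schematic model of the reclaim loop described above: a scan budget is
 * computed once; each iteration evicts at most one small batch; the loop
 * stops early only when a whole batch fails to make progress, or, for
 * global (root) reclaim, after one more batch once aging is needed.
 */
static long shrink_model(long nr_to_scan, int global_reclaim,
			 int should_age, evict_fn evict)
{
	long total = 0;

	while (nr_to_scan > 0) {
		long nr_batch = nr_to_scan < MIN_LRU_BATCH ?
				nr_to_scan : MIN_LRU_BATCH;
		long delta = evict(nr_batch);

		if (!delta)
			break;	/* a whole batch failed to isolate anything */
		total += delta;
		if (global_reclaim && should_age)
			break;	/* yield after one batch to keep fairness */
		nr_to_scan -= delta;
	}
	return total;
}

/* Toy evictor for demonstration: every batch fully succeeds. */
static long evict_all(long nr_batch)
{
	return nr_batch;
}
```

With `evict_all`, memcg reclaim (`global_reclaim == 0`) consumes the whole budget even when aging is pending, while global reclaim stops after a single batch.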
Signed-off-by: Kairui Song
---
 mm/vmscan.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index e3ca38d0c4cd..8de5c8d5849e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4983,7 +4983,7 @@ static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *sc)
  */
 static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
-	bool need_rotate = false;
+	bool need_rotate = false, should_age = false;
 	long nr_batch, nr_to_scan;
 	int swappiness = get_swappiness(lruvec, sc);
 	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
@@ -5001,7 +5001,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		if (should_run_aging(lruvec, max_seq, sc, swappiness)) {
 			if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false))
 				need_rotate = true;
-			break;
+			should_age = true;
 		}
 
 		nr_batch = min(nr_to_scan, MIN_LRU_BATCH);
@@ -5012,6 +5012,10 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		if (should_abort_scan(lruvec, sc))
 			break;
 
+		/* Cgroup reclaim fairness not guarded by rotate */
+		if (root_reclaim(sc) && should_age)
+			break;
+
 		nr_to_scan -= delta;
 		cond_resched();
 	}
-- 
2.53.0

From nobody Thu Apr 2 12:41:27 2026
From: Kairui Song via B4 Relay
Reply-To: kasong@tencent.com
Date: Sun, 29 Mar 2026 03:52:34 +0800
Subject: [PATCH v2 08/12] mm/mglru: simplify and improve dirty writeback handling
Message-Id: <20260329-mglru-reclaim-v2-8-b53a3678513c@tencent.com>
In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com>
To: linux-mm@kvack.org

From: Kairui Song

The current handling of dirty writeback folios does not work well for
file-page-heavy workloads: dirty folios are protected and moved to the
next gen upon isolation, instead of being throttled or reactivated upon
pageout (shrink_folio_list). This might reduce LRU lock contention
slightly, but as a result, the ping-pong effect of folios between the
head and tail of the last two gens is serious, as the shrinker runs
into protected dirty writeback folios far more often than activation
does.

The dirty flush wakeup condition is also much more passive compared to
the active/inactive LRU. The active/inactive LRU wakes the flusher if
one batch of folios passed to shrink_folio_list is unevictable due to
being under writeback, but MGLRU instead has to check this after the
whole reclaim loop is done, and then compare the isolation protection
count against the total reclaim count. We previously saw OOM problems
with it too, which were fixed but still not perfect [1].

So instead, drop the special handling for dirty writeback and simply
reactivate such folios, like the active/inactive LRU does. Also move
the dirty-flush wakeup check right after shrink_folio_list. This should
improve both throttling and performance.

A test with YCSB workloadb showed a major performance improvement:

Before this series:
Throughput(ops/sec): 61642.78008938203
AverageLatency(us): 507.11127774145166
pgpgin 158190589
pgpgout 5880616
workingset_refault 7262988

After this commit:
Throughput(ops/sec): 80216.04855744806 (+30.1%, higher is better)
AverageLatency(us): 388.17633477268913 (-23.5%, lower is better)
pgpgin 101871227 (-35.6%, lower is better)
pgpgout 5770028
workingset_refault 3418186 (-52.9%, lower is better)

The refault rate is ~50% lower and throughput is ~30% higher, which is
a huge gain. We also observed significant performance gains for other
real-world workloads.

We were concerned that the dirty flush could cause more wear on SSDs,
but that should not be a problem here: the wakeup condition is that the
dirty folios have been pushed to the tail of the LRU, which indicates
memory pressure is already so high that writeback is blocking the
workload.
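As a sanity check, the relative changes quoted above can be recomputed from the raw benchmark values with a throwaway helper (not part of the patch; `pct_change` is a hypothetical name):

```c
#include <assert.h>
#include <math.h>

/* Percentage change from "before" to "after"; negative means a decrease. */
static double pct_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Plugging in the YCSB numbers reproduces the quoted deltas within rounding: throughput +30.1%, latency -23.5%, pgpgin -35.6%, workingset_refault -52.9%.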
Reviewed-by: Axel Rasmussen Link: https://lore.kernel.org/linux-mm/20241026115714.1437435-1-jingxiangze= ng.cas@gmail.com/ [1] Signed-off-by: Kairui Song --- mm/vmscan.c | 57 ++++++++++++++++----------------------------------------- 1 file changed, 16 insertions(+), 41 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 8de5c8d5849e..17b5318fad39 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4583,7 +4583,6 @@ static bool sort_folio(struct lruvec *lruvec, struct = folio *folio, struct scan_c int tier_idx) { bool success; - bool dirty, writeback; int gen =3D folio_lru_gen(folio); int type =3D folio_is_file_lru(folio); int zone =3D folio_zonenum(folio); @@ -4633,21 +4632,6 @@ static bool sort_folio(struct lruvec *lruvec, struct= folio *folio, struct scan_c return true; } =20 - dirty =3D folio_test_dirty(folio); - writeback =3D folio_test_writeback(folio); - if (type =3D=3D LRU_GEN_FILE && dirty) { - sc->nr.file_taken +=3D delta; - if (!writeback) - sc->nr.unqueued_dirty +=3D delta; - } - - /* waiting for writeback */ - if (writeback || (type =3D=3D LRU_GEN_FILE && dirty)) { - gen =3D folio_inc_gen(lruvec, folio, true); - list_move(&folio->lru, &lrugen->folios[gen][type][zone]); - return true; - } - return false; } =20 @@ -4754,8 +4738,6 @@ static int scan_folios(unsigned long nr_to_scan, stru= ct lruvec *lruvec, trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scanned, skipped, isolated, type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); - if (type =3D=3D LRU_GEN_FILE) - sc->nr.file_taken +=3D isolated; =20 *isolatedp =3D isolated; return scanned; @@ -4858,12 +4840,27 @@ static int evict_folios(unsigned long nr_to_scan, s= truct lruvec *lruvec, return scanned; retry: reclaimed =3D shrink_folio_list(&list, pgdat, sc, &stat, false, memcg); - sc->nr.unqueued_dirty +=3D stat.nr_unqueued_dirty; sc->nr_reclaimed +=3D reclaimed; trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, type_scanned, reclaimed, &stat, sc->priority, type ? 
LRU_INACTIVE_FILE : LRU_INACTIVE_ANON); =20 + /* + * If too many file cache in the coldest generation can't be evicted + * due to being dirty, wake up the flusher. + */ + if (stat.nr_unqueued_dirty =3D=3D isolated) { + wakeup_flusher_threads(WB_REASON_VMSCAN); + + /* + * For cgroupv1 dirty throttling is achieved by waking up + * the kernel flusher here and later waiting on folios + * which are in writeback to finish (see shrink_folio_list()). + */ + if (!writeback_throttling_sane(sc)) + reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); + } + list_for_each_entry_safe_reverse(folio, next, &list, lru) { DEFINE_MIN_SEQ(lruvec); =20 @@ -5020,28 +5017,6 @@ static bool try_to_shrink_lruvec(struct lruvec *lruv= ec, struct scan_control *sc) cond_resched(); } =20 - /* - * If too many file cache in the coldest generation can't be evicted - * due to being dirty, wake up the flusher. - */ - if (sc->nr.unqueued_dirty && sc->nr.unqueued_dirty =3D=3D sc->nr.file_tak= en) { - struct pglist_data *pgdat =3D lruvec_pgdat(lruvec); - - wakeup_flusher_threads(WB_REASON_VMSCAN); - - /* - * For cgroupv1 dirty throttling is achieved by waking up - * the kernel flusher here and later waiting on folios - * which are in writeback to finish (see shrink_folio_list()). - * - * Flusher may not be able to issue writeback quickly - * enough for cgroupv1 writeback throttling to work - * on a large system. 
- */ - if (!writeback_throttling_sane(sc)) - reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK); - } - return need_rotate; } =20 --=20 2.53.0 From nobody Thu Apr 2 12:41:27 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E0EA134028F for ; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727552; cv=none; b=Zc0kZxjlQ6VACPDuYFWZXpHiP21ESeqZ6vjkpv6dTpIKH0czXLZZCYNNzWNW7nX4veGoU4STsDILVopwycw84zJDigSp2muet8CDbDiwm5VIutExcykIUs/R1DWE9yCKX7+nvyfW8/XQQKGPBGRYuOVKVYb/WIHYjz0HlH54dTk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774727552; c=relaxed/simple; bh=IwhdDUeDoXiHUP+X3n17xP8ze8d44nHV7UUvvAvMpxw=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=UwBl2Agl+GrQl61KBkZokrR1oqXLAIXrx5AuJNgrNA29iRhm1dmv7umfE4vwFf7fqh1ponJmXir2QcK1mbBc2147C8wZ74twk7zCHknmHvYZhsJPdv2AA5aonhJKW1jzZWISCi8fefieAnbxBqu4FAa3qOgkbid7beOUz0eXZxA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YS40uPFA; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YS40uPFA" Received: by smtp.kernel.org (Postfix) with ESMTPS id BD3C3C2BCB0; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1774727551; bh=IwhdDUeDoXiHUP+X3n17xP8ze8d44nHV7UUvvAvMpxw=; h=From:Date:Subject:References:In-Reply-To:To:Cc:Reply-To:From; b=YS40uPFAU79qQQ9M5w9pAuFc3tpBNWYbUrmd0tg3GBQ3NImITQ6FBUd6ImKngQemS 
7fwCvgWYzt2F/DrpDf9NyVttbDzQNQTZQ1jGv0N5PgKRb+VfqXgh7ujHddHYsopDRy k4x5x5YVMqoRnEJs+MF77VCX0TZvX/RXyv+DgxKu+GIpo3kV/VEr8Ee8frev9yunbP SuaTHVLavhLJ8nM6gBjRxZY7M32asDC31NQEv/LrlZgbYQdfG2p97Yu3QzSTXUZK4m M5gmIPCYg58/hHUt5nWyAJu1lrMuti0recoz7OqWFfsPiPgbWFbjIs18Mt0WYEzYBu yfYCUTb1/GMoQ== Received: from aws-us-west-2-korg-lkml-1.web.codeaurora.org (localhost.localdomain [127.0.0.1]) by smtp.lore.kernel.org (Postfix) with ESMTP id B430710F3DFC; Sat, 28 Mar 2026 19:52:31 +0000 (UTC) From: Kairui Song via B4 Relay Date: Sun, 29 Mar 2026 03:52:35 +0800 Subject: [PATCH v2 09/12] mm/mglru: remove no longer used reclaim argument for folio protection Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260329-mglru-reclaim-v2-9-b53a3678513c@tencent.com> References: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> In-Reply-To: <20260329-mglru-reclaim-v2-0-b53a3678513c@tencent.com> To: linux-mm@kvack.org Cc: Andrew Morton , Axel Rasmussen , Yuanchu Xie , Wei Xu , Johannes Weiner , David Hildenbrand , Michal Hocko , Qi Zheng , Shakeel Butt , Lorenzo Stoakes , Barry Song , David Stevens , Chen Ridong , Leno Hou , Yafang Shao , Yu Zhao , Zicheng Wang , Kalesh Singh , Suren Baghdasaryan , Chris Li , Vernon Yang , linux-kernel@vger.kernel.org, Qi Zheng , Baolin Wang , Kairui Song X-Mailer: b4 0.15.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1774727549; l=2572; i=kasong@tencent.com; s=kasong-sign-tencent; h=from:subject:message-id; bh=VpUQiKqa6L0u8xULiUTRcyn/TqUWANh3/7f3AE47AkY=; b=q400/17aQbvl8jZf/Gp51sktJ8SQrx2dNyisncTbri1+ECcL8rZw34QRp/+UkWaPL+IjLN9m/ nRgqEbu1/AODiG24HJLF2jv8ZoR50YVLzSb2nYc8GGGtkgQVhCb7vMF X-Developer-Key: i=kasong@tencent.com; a=ed25519; pk=kCdoBuwrYph+KrkJnrr7Sm1pwwhGDdZKcKrqiK8Y1mI= X-Endpoint-Received: by B4 Relay for kasong@tencent.com/kasong-sign-tencent with auth_id=562 
From: Kairui Song

Dirty folios are now handled after isolation rather than before, since
dirty reactivation must take the folio off the LRU first; this also
helps unify the dirty handling logic. The "reclaiming" argument is
therefore no longer needed, so remove it.

Signed-off-by: Kairui Song
---
 mm/vmscan.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 17b5318fad39..07667649c5e2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3220,7 +3220,7 @@ static int folio_update_gen(struct folio *folio, int gen)
 }
 
 /* protect pages accessed multiple times through file descriptors */
-static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclaiming)
+static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio)
 {
 	int type = folio_is_file_lru(folio);
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
@@ -3239,9 +3239,6 @@ static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclai
 
 		new_flags = old_flags & ~(LRU_GEN_MASK | LRU_REFS_FLAGS);
 		new_flags |= (new_gen + 1UL) << LRU_GEN_PGOFF;
-		/* for folio_end_writeback() */
-		if (reclaiming)
-			new_flags |= BIT(PG_reclaim);
 	} while (!try_cmpxchg(&folio->flags.f, &old_flags, new_flags));
 
 	lru_gen_update_size(lruvec, folio, old_gen, new_gen);
@@ -3855,7 +3852,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
 	VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio);
 	VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
 
-	new_gen = folio_inc_gen(lruvec, folio, false);
+	new_gen = folio_inc_gen(lruvec, folio);
 	list_move_tail(&folio->lru, &lrugen->folios[new_gen][type][zone]);
 
 	/* don't count the workingset being lazily promoted */
@@ -4612,7 +4609,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 
 	/* protected */
 	if (tier > tier_idx || refs + workingset == BIT(LRU_REFS_WIDTH) + 1) {
-		gen = folio_inc_gen(lruvec, folio, false);
+		gen = folio_inc_gen(lruvec, folio);
 		list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
 
 		/* don't count the workingset being lazily promoted */
@@ -4627,7 +4624,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 
 	/* ineligible */
 	if (zone > sc->reclaim_idx) {
-		gen = folio_inc_gen(lruvec, folio, false);
+		gen = folio_inc_gen(lruvec, folio);
 		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
 		return true;
 	}
-- 
2.53.0
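[Editorial note: the folio_inc_gen() loop touched by the patch above is a classic lock-free read-modify-write retry. As an illustration only, the pattern can be sketched with C11 atomics; the GEN_PGOFF/GEN_MASK/REFS_MASK constants are made-up stand-ins for the kernel's LRU_GEN_PGOFF and friends, not real kernel API.]

```c
#include <stdatomic.h>

/* Hypothetical bit layout standing in for LRU_GEN_PGOFF/LRU_GEN_MASK. */
#define GEN_PGOFF 8
#define GEN_MASK  (0xfUL << GEN_PGOFF)
#define REFS_MASK 0xffUL

/*
 * Sketch of the cmpxchg retry loop folio_inc_gen() uses: recompute the
 * new flags word from the latest snapshot until the swap succeeds, so
 * concurrent updates to other bits in the word are never lost.
 * On failure, atomic_compare_exchange_weak() reloads `old` for us.
 */
unsigned long inc_gen(_Atomic unsigned long *flags, unsigned long gen)
{
	unsigned long old = atomic_load(flags);
	unsigned long new;

	do {
		/* clear generation and reference bits, then store gen + 1 */
		new = old & ~(GEN_MASK | REFS_MASK);
		new |= (gen + 1UL) << GEN_PGOFF;
	} while (!atomic_compare_exchange_weak(flags, &old, new));

	return new;
}
```

The weak variant may fail spuriously, which is fine here because the loop retries anyway; the kernel's try_cmpxchg() plays the same role.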
From: Kairui Song via B4 Relay
Date: Sun, 29 Mar 2026 03:52:36 +0800
Subject: [PATCH v2 10/12] mm/vmscan: remove sc->file_taken
Message-Id: <20260329-mglru-reclaim-v2-10-b53a3678513c@tencent.com>
From: Kairui Song

No one is using it now, just remove it.

Reviewed-by: Axel Rasmussen
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/vmscan.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 07667649c5e2..603be5ef3ef2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -173,7 +173,6 @@ struct scan_control {
 		unsigned int congested;
 		unsigned int writeback;
 		unsigned int immediate;
-		unsigned int file_taken;
 		unsigned int taken;
 	} nr;
 
@@ -2040,8 +2039,6 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	sc->nr.writeback += stat.nr_writeback;
 	sc->nr.immediate += stat.nr_immediate;
 	sc->nr.taken += nr_taken;
-	if (file)
-		sc->nr.file_taken += nr_taken;
 
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, nr_scanned,
 			nr_reclaimed, &stat, sc->priority, file);
-- 
2.53.0
From: Kairui Song via B4 Relay
Date: Sun, 29 Mar 2026 03:52:37 +0800
Subject: [PATCH v2 11/12] mm/vmscan: remove sc->unqueued_dirty
Message-Id:
<20260329-mglru-reclaim-v2-11-b53a3678513c@tencent.com>
From: Kairui Song

No one is using it now, just remove it.

Suggested-by: Axel Rasmussen
Signed-off-by: Kairui Song
Reviewed-by: Baolin Wang
---
 mm/vmscan.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 603be5ef3ef2..1783da54ada1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -169,7 +169,6 @@ struct scan_control {
 
 	struct {
 		unsigned int dirty;
-		unsigned int unqueued_dirty;
 		unsigned int congested;
 		unsigned int writeback;
 		unsigned int immediate;
@@ -2035,7 +2034,6 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 
 	sc->nr.dirty += stat.nr_dirty;
 	sc->nr.congested += stat.nr_congested;
-	sc->nr.unqueued_dirty += stat.nr_unqueued_dirty;
 	sc->nr.writeback += stat.nr_writeback;
 	sc->nr.immediate += stat.nr_immediate;
 	sc->nr.taken += nr_taken;
-- 
2.53.0
From: Kairui Song via B4 Relay
Date: Sun, 29 Mar 2026 03:52:38 +0800
Subject: [PATCH v2 12/12] mm/vmscan: unify writeback reclaim statistic and throttling
Message-Id: <20260329-mglru-reclaim-v2-12-b53a3678513c@tencent.com>
From: Kairui Song

Currently, MGLRU and the classic LRU handle reclaim statistics and
writeback throttling very differently: MGLRU simply skips the
throttling part. Unify this by moving the shared logic into a helper,
so both setups share the same behavior.

Also remove the folio_clear_reclaim() call in isolate_folio(), which
was actively defeating congestion control. PG_reclaim is now handled
by shrink_folio_list(), so keeping it in isolate_folio() is not
helpful.
Tested with the following bash reproducer:

  echo "Setup a slow device using dm-delay"
  dd if=/dev/zero of=/var/tmp/backing bs=1M count=2048
  LOOP=$(losetup --show -f /var/tmp/backing)
  mkfs.ext4 -q $LOOP
  echo "0 $(blockdev --getsz $LOOP) delay $LOOP 0 0 $LOOP 0 1000" | \
      dmsetup create slow_dev
  mkdir -p /mnt/slow && mount /dev/mapper/slow_dev /mnt/slow

  echo "Start writeback pressure"
  sync && echo 3 > /proc/sys/vm/drop_caches
  mkdir /sys/fs/cgroup/test_wb
  echo 128M > /sys/fs/cgroup/test_wb/memory.max
  (echo $BASHPID > /sys/fs/cgroup/test_wb/cgroup.procs && \
      dd if=/dev/zero of=/mnt/slow/testfile bs=1M count=192)

  echo "Clean up"
  echo "0 $(blockdev --getsz $LOOP) error" | dmsetup load slow_dev
  dmsetup resume slow_dev
  umount -l /mnt/slow && sync
  dmsetup remove slow_dev

Before this commit, dd is OOM killed immediately when MGLRU is enabled,
while the classic LRU is fine. After this commit, congestion control is
effective: no more spinning on the LRU and no premature OOM. Stress
tests on other workloads also look good.

Suggested-by: Chen Ridong
Signed-off-by: Kairui Song
Tested-by: Leno Hou
---
 mm/vmscan.c | 93 +++++++++++++++++++++++++++----------------------------------
 1 file changed, 41 insertions(+), 52 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1783da54ada1..83c8fdf8fdc4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1942,6 +1942,44 @@ static int current_may_throttle(void)
 	return !(current->flags & PF_LOCAL_THROTTLE);
 }
 
+static void handle_reclaim_writeback(unsigned long nr_taken,
+				     struct pglist_data *pgdat,
+				     struct scan_control *sc,
+				     struct reclaim_stat *stat)
+{
+	/*
+	 * If dirty folios are scanned that are not queued for IO, it
+	 * implies that flushers are not doing their job. This can
+	 * happen when memory pressure pushes dirty folios to the end of
+	 * the LRU before the dirty limits are breached and the dirty
+	 * data has expired. It can also happen when the proportion of
+	 * dirty folios grows not through writes but through memory
+	 * pressure reclaiming all the clean cache. And in some cases,
+	 * the flushers simply cannot keep up with the allocation
+	 * rate. Nudge the flusher threads in case they are asleep.
+	 */
+	if (stat->nr_unqueued_dirty == nr_taken && nr_taken) {
+		wakeup_flusher_threads(WB_REASON_VMSCAN);
+		/*
+		 * For cgroupv1 dirty throttling is achieved by waking up
+		 * the kernel flusher here and later waiting on folios
+		 * which are in writeback to finish (see shrink_folio_list()).
+		 *
+		 * Flusher may not be able to issue writeback quickly
+		 * enough for cgroupv1 writeback throttling to work
+		 * on a large system.
+		 */
+		if (!writeback_throttling_sane(sc))
+			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
+	}
+
+	sc->nr.dirty += stat->nr_dirty;
+	sc->nr.congested += stat->nr_congested;
+	sc->nr.writeback += stat->nr_writeback;
+	sc->nr.immediate += stat->nr_immediate;
+	sc->nr.taken += nr_taken;
+}
+
 /*
  * shrink_inactive_list() is a helper for shrink_node(). It returns the number
  * of reclaimed pages
@@ -2005,39 +2043,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	lruvec_lock_irq(lruvec);
 	lru_note_cost_unlock_irq(lruvec, file, stat.nr_pageout,
 				 nr_scanned - nr_reclaimed);
-
-	/*
-	 * If dirty folios are scanned that are not queued for IO, it
-	 * implies that flushers are not doing their job. This can
-	 * happen when memory pressure pushes dirty folios to the end of
-	 * the LRU before the dirty limits are breached and the dirty
-	 * data has expired. It can also happen when the proportion of
-	 * dirty folios grows not through writes but through memory
-	 * pressure reclaiming all the clean cache. And in some cases,
-	 * the flushers simply cannot keep up with the allocation
-	 * rate. Nudge the flusher threads in case they are asleep.
-	 */
-	if (stat.nr_unqueued_dirty == nr_taken) {
-		wakeup_flusher_threads(WB_REASON_VMSCAN);
-		/*
-		 * For cgroupv1 dirty throttling is achieved by waking up
-		 * the kernel flusher here and later waiting on folios
-		 * which are in writeback to finish (see shrink_folio_list()).
-		 *
-		 * Flusher may not be able to issue writeback quickly
-		 * enough for cgroupv1 writeback throttling to work
-		 * on a large system.
-		 */
-		if (!writeback_throttling_sane(sc))
-			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
-	}
-
-	sc->nr.dirty += stat.nr_dirty;
-	sc->nr.congested += stat.nr_congested;
-	sc->nr.writeback += stat.nr_writeback;
-	sc->nr.immediate += stat.nr_immediate;
-	sc->nr.taken += nr_taken;
-
+	handle_reclaim_writeback(nr_taken, pgdat, sc, &stat);
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id, nr_scanned,
 			nr_reclaimed, &stat, sc->priority, file);
 	return nr_reclaimed;
@@ -4651,9 +4657,6 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
 	if (!folio_test_referenced(folio))
 		set_mask_bits(&folio->flags.f, LRU_REFS_MASK, 0);
 
-	/* for shrink_folio_list() */
-	folio_clear_reclaim(folio);
-
 	success = lru_gen_del_folio(lruvec, folio, true);
 	VM_WARN_ON_ONCE_FOLIO(!success, folio);
 
@@ -4833,26 +4836,11 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 retry:
 	reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false, memcg);
 	sc->nr_reclaimed += reclaimed;
+	handle_reclaim_writeback(isolated, pgdat, sc, &stat);
 	trace_mm_vmscan_lru_shrink_inactive(pgdat->node_id,
 			type_scanned, reclaimed, &stat, sc->priority,
 			type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
 
-	/*
-	 * If too many file cache in the coldest generation can't be evicted
-	 * due to being dirty, wake up the flusher.
-	 */
-	if (stat.nr_unqueued_dirty == isolated) {
-		wakeup_flusher_threads(WB_REASON_VMSCAN);
-
-		/*
-		 * For cgroupv1 dirty throttling is achieved by waking up
-		 * the kernel flusher here and later waiting on folios
-		 * which are in writeback to finish (see shrink_folio_list()).
-		 */
-		if (!writeback_throttling_sane(sc))
-			reclaim_throttle(pgdat, VMSCAN_THROTTLE_WRITEBACK);
-	}
-
 	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
 		DEFINE_MIN_SEQ(lruvec);
 
@@ -4895,6 +4883,7 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
 
 	if (!list_empty(&list)) {
 		skip_retry = true;
+		isolated = 0;
 		goto retry;
 	}
 
-- 
2.53.0
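[Editorial note: the unified flusher-wakeup condition above can be modeled in isolation. The sketch below is illustrative only — the struct and function names are made up, not kernel API — but it captures the trigger logic of handle_reclaim_writeback(), including the "&& nr_taken" guard that matters once evict_folios() retries with isolated set to 0.]

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct reclaim_stat counters. */
struct reclaim_stat_model {
	unsigned long nr_unqueued_dirty;
};

/*
 * Wake the flushers only when *every* folio taken off the LRU in this
 * cycle turned out to be dirty but not yet queued for IO. The extra
 * "&& nr_taken" guard avoids a spurious 0 == 0 wakeup when a retry
 * pass runs with nothing newly isolated.
 */
static bool should_wake_flushers(unsigned long nr_taken,
				 const struct reclaim_stat_model *stat)
{
	return stat->nr_unqueued_dirty == nr_taken && nr_taken;
}
```

With this shape, the cgroupv1 reclaim_throttle() fallback naturally hangs off the same condition for both LRU implementations, which is exactly what the unification buys.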