From: Kairui Song via B4 Relay
Date: Tue, 07 Apr 2026 19:57:36 +0800
Subject: [PATCH v4 07/14] mm/mglru: don't abort scan immediately right after aging
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260407-mglru-reclaim-v4-7-98cf3dc69519@tencent.com>
References: <20260407-mglru-reclaim-v4-0-98cf3dc69519@tencent.com>
In-Reply-To: <20260407-mglru-reclaim-v4-0-98cf3dc69519@tencent.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Axel Rasmussen, Yuanchu Xie, Wei Xu, Johannes Weiner,
 David Hildenbrand, Michal Hocko, Qi Zheng, Shakeel Butt, Lorenzo Stoakes,
 Barry Song, David Stevens, Chen Ridong, Leno Hou, Yafang Shao, Yu Zhao,
 Zicheng Wang, Kalesh Singh, Suren Baghdasaryan, Chris Li, Vernon Yang,
 linux-kernel@vger.kernel.org, Qi Zheng, Baolin Wang, Kairui Song
Reply-To: kasong@tencent.com

From: Kairui Song

Right now, if eviction triggers aging, the reclaimer will abort.
This is not the optimal strategy for several reasons.

Aborting the reclaim early wastes a reclaim cycle when under pressure,
and for concurrent reclaim, if the LRU is under aging, all concurrent
reclaimers might fail. And if the aging has just finished, new cold
folios exposed by the aging are not reclaimed until the next reclaim
iteration.

What's more, the current aging trigger is quite lenient: having 3 gens
with a reclaim priority lower than default will trigger aging and block
reclaiming from one memcg. This wastes reclaim retry cycles easily. And
in the worst case, if the reclaim is making slower progress and all
following attempts fail due to being blocked by aging, it triggers
unexpected early OOM.

And if a lruvec requires aging, it doesn't mean it's hot. Instead, the
lruvec could be idle for quite a while, and hence it might contain lots
of cold folios to be reclaimed.

While it's helpful to rotate memcg LRU after aging for global reclaim,
as global reclaim fairness is coupled with the rotation in shrink_many,
memcg fairness is instead handled by cgroup iteration in
shrink_node_memcgs. So, for memcg level pressure, this abort is not the
key part for keeping the fairness. And in most cases, there is no need
to age, and fairness must be achieved by upper-level reclaim control.

So instead, just keep the scanning going unless one whole batch of
folios failed to be isolated or enough folios have been scanned, which
is triggered by evict_folios returning 0. And only abort for global
reclaim after one batch, so when there are fewer memcgs, progress is
still made, and the fairness mechanism described above still works
fine. And in most cases, the one more batch attempt for global reclaim
might just be enough to satisfy what the reclaimer needs, hence
improving global reclaim performance by reducing reclaim retry cycles.

Rotation is still there after the reclaim is done, which still follows
the comment in mmzone.h. And fairness still looks good.

Reviewed-by: Axel Rasmussen
Signed-off-by: Kairui Song
---
 mm/vmscan.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c673830f4ba8..354c6fef3c42 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4979,7 +4979,7 @@ static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *sc)
  */
 static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
-	bool need_rotate = false;
+	bool need_rotate = false, should_age = false;
 	long nr_batch, nr_to_scan;
 	int swappiness = get_swappiness(lruvec, sc);
 	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
@@ -5000,7 +5000,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		if (should_run_aging(lruvec, max_seq, sc, swappiness)) {
 			if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false))
 				need_rotate = true;
-			break;
+			should_age = true;
 		}
 
 		nr_batch = min(nr_to_scan, MIN_LRU_BATCH);
@@ -5011,6 +5011,10 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		if (should_abort_scan(lruvec, sc))
 			break;
 
+		/* For cgroup reclaim, fairness is handled by iterator, not rotation */
+		if (root_reclaim(sc) && should_age)
+			break;
+
 		nr_to_scan -= delta;
 		cond_resched();
 	}
-- 
2.53.0
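
For readers who want to see the loop control in isolation, the change
can be sketched as a small user-space model. This is only an
illustration and not kernel code: struct toy_lruvec, toy_age(),
toy_evict(), toy_shrink() and TOY_BATCH are hypothetical stand-ins, and
the model assumes aging simply exposes a fixed pool of cold folios. The
"patched" flag switches between breaking right after aging and the new
should_age handling.

```c
#include <stdbool.h>

/* Toy stand-ins, NOT kernel code: every type and helper is hypothetical. */
struct toy_lruvec {
	long cold;        /* folios that can be evicted right now */
	long hidden_cold; /* cold folios only exposed once aging runs */
	bool needs_aging; /* stand-in for should_run_aging() being true */
};

#define TOY_BATCH 64L /* stand-in for MIN_LRU_BATCH; value is arbitrary */

/* Stand-in for try_to_inc_max_seq(): aging exposes the hidden folios. */
static void toy_age(struct toy_lruvec *v)
{
	v->cold += v->hidden_cold;
	v->hidden_cold = 0;
	v->needs_aging = false;
}

/* Stand-in for evict_folios(): evict up to one batch, return the count. */
static long toy_evict(struct toy_lruvec *v, long nr_batch)
{
	long n = v->cold < nr_batch ? v->cold : nr_batch;

	v->cold -= n;
	return n;
}

/*
 * Model of the scan loop.
 * patched == false: break right after aging (behaviour before the patch).
 * patched == true:  keep scanning; only root reclaim stops after one batch.
 * Returns the number of folios reclaimed by this call.
 */
long toy_shrink(struct toy_lruvec *v, long nr_to_scan,
		bool root_reclaim, bool patched)
{
	bool should_age = false;
	long reclaimed = 0;

	while (nr_to_scan > 0) {
		long nr_batch, delta;

		if (v->needs_aging) {
			toy_age(v);
			if (!patched)
				break;
			should_age = true;
		}

		nr_batch = nr_to_scan < TOY_BATCH ? nr_to_scan : TOY_BATCH;
		delta = toy_evict(v, nr_batch);
		if (!delta)
			break;
		reclaimed += delta;

		/* Root reclaim relies on rotation, so stop after one batch. */
		if (root_reclaim && should_age)
			break;

		nr_to_scan -= delta;
	}
	return reclaimed;
}
```

In this model, a lruvec with cold = 0, hidden_cold = 200 and
needs_aging set, scanned with nr_to_scan = 128, reclaims nothing in the
unpatched mode, 128 folios for patched cgroup reclaim, and one batch of
64 folios for patched root reclaim.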