From: Kairui Song
Date: Mon, 15 Dec 2025 14:45:43 +0800
Subject: [PATCH RFC] mm/memcontrol: make lru_zone_size atomic and simplify sanity check
Message-Id: <20251215-mz-lru-size-cleanup-v1-1-95deb4d5e90f@tencent.com>
To: linux-mm@kvack.org, Johannes Weiner, Hugh Dickins, Michal Hocko,
	Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Kairui Song
X-Mailer: b4 0.14.3

commit ca707239e8a7 ("mm: update_lru_size warn and reset bad lru_size")
introduced a sanity check to catch memcg counter underflow, which is
more like a workaround for another bug: lru_zone_size is
unsigned, so an underflow wraps it around to an enormously large number,
and the memcg shrinker then loops almost forever because the calculated
number of folios to shrink is huge. That commit also checks that a zero
value matches an empty LRU list, so we have to hold the LRU lock and
apply the counter update differently depending on whether nr_pages is
negative.

But the later commit b4536f0c829c ("mm, memcg: fix the active list aging
for lowmem requests when memcg is enabled") already removed the LRU
emptiness check, so splitting the update by sign is meaningless now. And
if we simply turn the counter into an atomic long, underflow is no
longer a big issue either, as it can be checked on the reader side.

The reader side is called much less frequently than the updater, so
let's turn the counter into an atomic long and check it on the reader
side instead, which has a smaller overhead. Using an atomic also avoids
potential locking issues.

The underflow correction is removed. This should be fine: if the LRU
size counter is leaking massively, something else has also gone very
wrong, and the leaking site should be fixed instead.

For now, still keep the LRU lock context. In theory it could be removed
too, since the update is atomic, if we can tolerate a temporarily
inaccurate reading, but there is currently no benefit to doing so.

Signed-off-by: Kairui Song
---
 include/linux/memcontrol.h |  9 +++++++--
 mm/memcontrol.c            | 18 +-----------------
 2 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0651865a4564..197f48faa8ba 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -112,7 +112,7 @@ struct mem_cgroup_per_node {
 	/* Fields which get updated often at the end. */
 	struct lruvec lruvec;
 	CACHELINE_PADDING(_pad2_);
-	unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
+	atomic_long_t lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
 	struct mem_cgroup_reclaim_iter iter;
 
 #ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
@@ -903,10 +903,15 @@ static inline
 unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
 		enum lru_list lru, int zone_idx)
 {
+	long val;
 	struct mem_cgroup_per_node *mz;
 
 	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	return READ_ONCE(mz->lru_zone_size[zone_idx][lru]);
+	val = atomic_long_read(&mz->lru_zone_size[zone_idx][lru]);
+	if (WARN_ON_ONCE(val < 0))
+		return 0;
+
+	return val;
 }
 
 void __mem_cgroup_handle_over_high(gfp_t gfp_mask);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9b07db2cb232..d5da09fbe43e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1273,28 +1273,12 @@ void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
 			       int zid, int nr_pages)
 {
 	struct mem_cgroup_per_node *mz;
-	unsigned long *lru_size;
-	long size;
 
 	if (mem_cgroup_disabled())
 		return;
 
 	mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	lru_size = &mz->lru_zone_size[zid][lru];
-
-	if (nr_pages < 0)
-		*lru_size += nr_pages;
-
-	size = *lru_size;
-	if (WARN_ONCE(size < 0,
-		"%s(%p, %d, %d): lru_size %ld\n",
-		__func__, lruvec, lru, nr_pages, size)) {
-		VM_BUG_ON(1);
-		*lru_size = 0;
-	}
-
-	if (nr_pages > 0)
-		*lru_size += nr_pages;
+	atomic_long_add(nr_pages, &mz->lru_zone_size[zid][lru]);
 }
 
 /**

---
base-commit: 1ef4e3be45a85a103a667cc39fd68c3826e6acb9
change-id: 20251211-mz-lru-size-cleanup-c81deccfd5d7

Best regards,
-- 
Kairui Song