From nobody Fri Dec 19 12:29:57 2025
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, Tejun Heo, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
Date: Sun, 13 Apr 2025 22:12:48 -0400
Message-ID: <20250414021249.3232315-2-longman@redhat.com>
In-Reply-To: <20250414021249.3232315-1-longman@redhat.com>
References: <20250414021249.3232315-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The test_memcontrol selftest consistently fails its test_memcg_low
sub-test because two of its test child cgroups, which have a memory.low
of 0 or an effective memory.low of 0, still have low events generated
for them, since mem_cgroup_below_low() uses the ">=" operator when
comparing to elow.

The two failed use cases are as follows:

1) memory.low is set to 0, but low events can still be triggered and
   so the cgroup may have a non-zero low event count.

2) memory.low is set to a non-zero value but the cgroup has no task in
   it so that it has an effective low value of 0. Again it may have a
   non-zero low event count if memory reclaim happens. This is probably
   not a result expected by users, and it is really doubtful that
   users will check an empty cgroup with no task in it and expect
   some non-zero event counts.

In the first case, even though memory.low isn't set, the cgroup may
still have some low protection if memory.low is set in the parent and
the cgroup2 memory_recursiveprot mount option is enabled. So low events
may still be recorded. The test_memcontrol.c test has to be modified to
account for that.

For the second case, it really doesn't make sense to have a non-zero
low event count if the cgroup has 0 usage. So we need to skip this
corner case in shrink_node_memcgs() by skipping the !usage case.

With this patch applied, the test_memcg_low sub-test finishes
successfully without failure in most cases, though both the
test_memcg_low and test_memcg_min sub-tests may still fail occasionally
if the memory.current values fall outside of the expected ranges.

Suggested-by: Johannes Weiner
Suggested-by: Michal Koutný
Signed-off-by: Waiman Long
---
 mm/internal.h                                    |  9 +++++++++
 mm/memcontrol-v1.h                               |  2 --
 mm/vmscan.c                                      |  4 ++++
 tools/testing/selftests/cgroup/test_memcontrol.c | 16 +++++++++++-----
 4 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 50c2f590b2d0..c06fb0e8d75c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1535,6 +1535,15 @@ void __meminit __init_page_from_nid(unsigned long pfn, int nid);
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 			  int priority);
 
+#ifdef CONFIG_MEMCG
+unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap);
+#else
+static inline unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
+{
+	return 1UL;
+}
+#endif
+
 #ifdef CONFIG_SHRINKER_DEBUG
 static inline __printf(2, 0) int shrinker_debugfs_name_alloc(
 			struct shrinker *shrinker, const char *fmt, va_list ap)
diff --git a/mm/memcontrol-v1.h b/mm/memcontrol-v1.h
index 6358464bb416..e92b21af92b1 100644
--- a/mm/memcontrol-v1.h
+++ b/mm/memcontrol-v1.h
@@ -22,8 +22,6 @@
 	     iter != NULL;				\
 	     iter = mem_cgroup_iter(NULL, iter, NULL))
 
-unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap);
-
 void drain_all_stock(struct mem_cgroup *root_memcg);
 
 unsigned long memcg_events(struct mem_cgroup *memcg, int event);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b620d74b0f66..a771a0145a12 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -5963,6 +5963,10 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 
 		mem_cgroup_calculate_protection(target_memcg, memcg);
 
+		/* Skip memcg with no usage */
+		if (!mem_cgroup_usage(memcg, false))
+			continue;
+
 		if (mem_cgroup_below_min(target_memcg, memcg)) {
 			/*
 			 * Hard protection.
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 16f5d74ae762..5a5dcbe57b56 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -380,10 +380,10 @@ static bool reclaim_until(const char *memcg, long goal);
  *
  * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
- * A/B/C  memory.current ~= 29M
- * A/B/D  memory.current ~= 21M
- * A/B/E  memory.current ~= 0
- * A/B/F  memory.current  = 0
+ * A/B/C  memory.current ~= 29M [memory.events:low > 0]
+ * A/B/D  memory.current ~= 21M [memory.events:low > 0]
+ * A/B/E  memory.current ~= 0   [memory.events:low == 0 if !memory_recursiveprot, > 0 otherwise]
+ * A/B/F  memory.current  = 0   [memory.events:low == 0]
  * (for origin of the numbers, see model in memcg_protection.m.)
  *
  * After that it tries to allocate more than there is
@@ -525,8 +525,14 @@ static int test_memcg_protection(const char *root, bool min)
 		goto cleanup;
 	}
 
+	/*
+	 * Child 2 has memory.low=0, but some low protection is still being
+	 * distributed down from its parent with memory.low=50M if cgroup2
+	 * memory_recursiveprot mount option is enabled. So the low event
+	 * count will be non-zero in this case.
+	 */
 	for (i = 0; i < ARRAY_SIZE(children); i++) {
-		int no_low_events_index = 1;
+		int no_low_events_index = has_recursiveprot ? 2 : 1;
 		long low, oom;
 
 		oom = cg_read_key_long(children[i], "memory.events", "oom ");
-- 
2.48.1

From nobody Fri Dec 19 12:29:57 2025
From: Waiman Long
To: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, Tejun Heo, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()
Date: Sun, 13 Apr 2025 22:12:49 -0400
Message-ID: <20250414021249.3232315-3-longman@redhat.com>
In-Reply-To: <20250414021249.3232315-1-longman@redhat.com>
References: <20250414021249.3232315-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The test_memcg_protection() function is used for the test_memcg_min and
test_memcg_low sub-tests. This function generates a set of parent/child
cgroups like:

  parent:  memory.min/low = 50M
  child 0: memory.min/low = 75M,  memory.current = 50M
  child 1: memory.min/low = 25M,  memory.current = 50M
  child 2: memory.min/low = 0,    memory.current = 50M

After applying memory pressure, the function expects the following
actual memory usages.

  parent:  memory.current ~= 50M
  child 0: memory.current ~= 29M
  child 1: memory.current ~= 21M
  child 2: memory.current ~= 0

In reality, the actual memory usages can differ quite a bit from the
expected values. The test allows an error tolerance of 10% with the
values_close() helper.

Both the test_memcg_min and test_memcg_low sub-tests can fail
sporadically because the actual memory usage exceeds the 10% error
tolerance. Below is a sample of the usage data from test runs that
failed.

  Child   Actual usage    Expected usage    %err
  -----   ------------    --------------    ----
    1       16990208         22020096      -12.9%
    1       17252352         22020096      -12.1%
    0       37699584         30408704      +10.7%
    1       14368768         22020096      -21.0%
    1       16871424         22020096      -13.2%

The current 10% error tolerance might have been right at the time
test_memcontrol.c was first introduced in the v4.18 kernel, but memory
reclaim has certainly evolved quite a bit since then, which may result
in a bit more run-to-run variation than previously expected.

Increase the error tolerance to 15% for child 0 and 20% for child 1 to
minimize the chance of this type of failure.
The tolerance is bigger for child 1 because an upswing in child 0
corresponds to a smaller %err than a similar downswing in child 1 due
to the way %err is used in values_close().

Before this patch, 100 test runs of test_memcontrol produced the
following results:

     17 not ok 1 test_memcg_min
     22 not ok 2 test_memcg_low

After applying this patch, there were no test failures for
test_memcg_min and test_memcg_low in 100 test runs. However, these
tests may still fail once in a while if the memory usage goes beyond
the newly extended range.

Signed-off-by: Waiman Long
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 5a5dcbe57b56..2ef07b8fa718 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -495,10 +495,10 @@ static int test_memcg_protection(const char *root, bool min)
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(29), 10))
+	if (!values_close(c[0], MB(29), 15))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(21), 10))
+	if (!values_close(c[1], MB(21), 20))
 		goto cleanup;
 
 	if (c[3] != 0)
-- 
2.48.1