From nobody Mon Jun 8 05:26:06 2026
From: Joshua Hahn
To: Johannes Weiner, Michal Hocko
Cc: Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
 David Hildenbrand, Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 1/6 v3] mm/page_counter: introduce per-page_counter stock
Date: Fri, 5 Jun 2026 08:35:57 -0700
Message-ID: <20260605153603.234296-2-joshua.hahnjy@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260605153603.234296-1-joshua.hahnjy@gmail.com>
References: <20260605153603.234296-1-joshua.hahnjy@gmail.com>

In order to avoid expensive hierarchy walks on every memcg charge and
limit check, memcontrol uses per-cpu stocks (memcg_stock_pcp) to cache
pre-charged pages and introduce a fast path to try_charge_memcg.
However, there are a few quirks with the current implementation that
could be improved upon.

First, each memcg_stock_pcp can only cache the charges of 7 memcgs
(defined as NR_MEMCG_STOCK), which means that once a CPU starts handling
the charging of more than 7 memcgs, it randomly selects a victim memcg
to evict and drain from the cpu, which can cause unnecessarily increased
latencies and thrashing as memcgs continually evict each other's stock.

Second, stock is tightly coupled with memcg, which means that all page
counters in a memcg share the same resource. This may simplify some of
the charging logic, but it prevents new page counters from being added
and using a separate stock.

We can address these concerns by pushing the concept of stock down to
the page_counter level, which addresses the random eviction problem by
getting rid of the 7 slot limit, and makes enabling separate stock
caches for other page_counters simpler.

Introduce a generic per-cpu stock directly in struct page_counter.
Stock can optionally be enabled per-page_counter, limiting the overhead
increase for page_counters that do not benefit greatly from caching
charges.

This patch introduces the page_counter_stock struct and its
alloc/disable/free functions, but does not use these yet.

Suggested-by: Johannes Weiner
Signed-off-by: Joshua Hahn
---
 include/linux/page_counter.h | 13 +++++++++
 mm/page_counter.c            | 53 ++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+)

diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
index d649b6bbbc871..c92bb2ee2a581 100644
--- a/include/linux/page_counter.h
+++ b/include/linux/page_counter.h
@@ -5,8 +5,15 @@
 #include
 #include
 #include
+#include
+#include
 #include
 
+struct page_counter_stock {
+	local_trylock_t lock;
+	unsigned long nr_pages;
+};
+
 struct page_counter {
 	/*
 	 * Make sure 'usage' does not share cacheline with any other field in
@@ -41,6 +48,8 @@ struct page_counter {
 	unsigned long high;
 	unsigned long max;
 	struct page_counter *parent;
+	struct page_counter_stock __percpu *stock;
+	unsigned int batch;
 } ____cacheline_internodealigned_in_smp;
 
 #if BITS_PER_LONG == 32
@@ -99,6 +108,10 @@ static inline void page_counter_reset_watermark(struct page_counter *counter)
 	counter->watermark = usage;
 }
 
+int page_counter_alloc_stock(struct page_counter *counter, unsigned int batch);
+void page_counter_disable_stock(struct page_counter *counter);
+void page_counter_free_stock(struct page_counter *counter);
+
 #if IS_ENABLED(CONFIG_MEMCG) || IS_ENABLED(CONFIG_CGROUP_DMEM)
 void page_counter_calculate_protection(struct page_counter *root,
 				       struct page_counter *counter,
diff --git a/mm/page_counter.c b/mm/page_counter.c
index 661e0f2a5127a..9f3e3f8d896c4 100644
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -289,6 +290,58 @@ int page_counter_memparse(const char *buf, const char *max,
 	return 0;
 }
 
+int page_counter_alloc_stock(struct page_counter *counter, unsigned int batch)
+{
+	struct page_counter_stock __percpu *stock;
+	int cpu;
+
+	stock = alloc_percpu(struct page_counter_stock);
+	if (!stock)
+		return -ENOMEM;
+
+	for_each_possible_cpu(cpu) {
+		struct page_counter_stock *s = per_cpu_ptr(stock, cpu);
+
+		local_trylock_init(&s->lock);
+	}
+	counter->stock = stock;
+	counter->batch = batch;
+
+	return 0;
+}
+
+void page_counter_disable_stock(struct page_counter *counter)
+{
+	if (!counter->stock)
+		return;
+
+	/* This prevents future charges from trying to deposit pages */
+	WRITE_ONCE(counter->batch, 0);
+}
+
+void page_counter_free_stock(struct page_counter *counter)
+{
+	unsigned long stock_to_drain = 0;
+	int cpu;
+
+	if (!counter->stock)
+		return;
+
+	for_each_possible_cpu(cpu) {
+		struct page_counter_stock *stock;
+
+		stock = per_cpu_ptr(counter->stock, cpu);
+		stock_to_drain += stock->nr_pages;
+		stock->nr_pages = 0;
+	}
+
+	if (stock_to_drain)
+		page_counter_uncharge(counter, stock_to_drain);
+
+	free_percpu(counter->stock);
+	counter->stock = NULL;
+}
+
 
 #if IS_ENABLED(CONFIG_MEMCG) || IS_ENABLED(CONFIG_CGROUP_DMEM)
 /*
-- 
2.53.0-Meta

From nobody Mon Jun 8 05:26:06 2026
From: Joshua Hahn
To: Johannes Weiner, Michal Hocko
Cc: Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
 cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 2/6 v3] mm/page_counter: use page_counter_stock in page_counter_try_charge
Date: Fri, 5 Jun 2026 08:35:58 -0700
Message-ID: <20260605153603.234296-3-joshua.hahnjy@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260605153603.234296-1-joshua.hahnjy@gmail.com>
References: <20260605153603.234296-1-joshua.hahnjy@gmail.com>

Make page_counter_try_charge() stock-aware. We preserve the same
semantics as the existing stock handling logic in try_charge_memcg:

1. Limit-check against the stock. If there is enough, charge to the
   stock (non-hierarchical) and return immediately.
2. Greedily attempt to fulfill the charge request and fill the stock up
   at the same time via a hierarchical charge.
3. If we fail with this charge, retry again (once) with the exact number
   of pages requested.
4. If we succeed with the greedy attempt, then try to add those extra
   pages to the stock. If that fails (trylock), then uncharge those
   surplus pages hierarchically.

As of this patch, the page_counter_stock is unused, as it has not been
enabled on any memcg yet. No functional changes intended.

Suggested-by: Johannes Weiner
Signed-off-by: Joshua Hahn
---
 mm/page_counter.c | 42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)

diff --git a/mm/page_counter.c b/mm/page_counter.c
index 9f3e3f8d896c4..1a71de4f43fd0 100644
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -121,9 +121,25 @@ bool page_counter_try_charge(struct page_counter *counter,
 			     struct page_counter **fail)
 {
 	struct page_counter *c;
+	unsigned long charge = nr_pages;
+	unsigned long batch = READ_ONCE(counter->batch);
 	bool protection = track_protection(counter);
 	bool track_failcnt = counter->track_failcnt;
 
+	if (counter->stock && local_trylock(&counter->stock->lock)) {
+		struct page_counter_stock *stock = this_cpu_ptr(counter->stock);
+
+		if (stock->nr_pages >= charge) {
+			stock->nr_pages -= charge;
+			local_unlock(&counter->stock->lock);
+			return true;
+		}
+		local_unlock(&counter->stock->lock);
+	}
+
+	charge = max_t(unsigned long, batch, nr_pages);
+
+retry:
 	for (c = counter; c; c = c->parent) {
 		long new;
 		/*
@@ -140,9 +156,9 @@ bool page_counter_try_charge(struct page_counter *counter,
 		 * we either see the new limit or the setter sees the
 		 * counter has changed and retries.
 		 */
-		new = atomic_long_add_return(nr_pages, &c->usage);
+		new = atomic_long_add_return(charge, &c->usage);
 		if (new > c->max) {
-			atomic_long_sub(nr_pages, &c->usage);
+			atomic_long_sub(charge, &c->usage);
 			/*
 			 * This is racy, but we can live with some
 			 * inaccuracy in the failcnt which is only used
@@ -163,11 +179,31 @@ bool page_counter_try_charge(struct page_counter *counter,
 				WRITE_ONCE(c->watermark, new);
 		}
 	}
+
+	/* charge > nr_pages implies this page_counter has stock enabled */
+	if (charge > nr_pages) {
+		if (local_trylock(&counter->stock->lock)) {
+			struct page_counter_stock *stock;
+
+			stock = this_cpu_ptr(counter->stock);
+			stock->nr_pages += charge - nr_pages;
+			local_unlock(&counter->stock->lock);
+		} else {
+			page_counter_uncharge(counter, charge - nr_pages);
+		}
+	}
+
 	return true;
 
 failed:
 	for (c = counter; c != *fail; c = c->parent)
-		page_counter_cancel(c, nr_pages);
+		page_counter_cancel(c, charge);
+
+	if (charge > nr_pages) {
+		/* Retry without trying to grab extra pages to refill stock */
+		charge = nr_pages;
+		goto retry;
+	}
 
 	return false;
 }
-- 
2.53.0-Meta

From nobody Mon Jun 8 05:26:06 2026
From: Joshua Hahn
To: Johannes Weiner, Michal Hocko
Cc: Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
 David Hildenbrand, Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 3/6 v3] mm/page_counter: introduce stock drain APIs
Date: Fri, 5 Jun 2026 08:35:59 -0700
Message-ID: <20260605153603.234296-4-joshua.hahnjy@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260605153603.234296-1-joshua.hahnjy@gmail.com>
References: <20260605153603.234296-1-joshua.hahnjy@gmail.com>

Introduce page_counter variants to replace memcg stock draining
functions.

page_counter_drain_stock_local() drains the stock of the local CPU,
taking a local stock lock to serialize against concurrent charges.

page_counter_drain_stock_cpu() does the same, but without taking a
local lock. This is possible because it will only be called from the
CPU hotplug path, where the CPU is dead and there cannot be any more
charges.

Suggested-by: Johannes Weiner
Signed-off-by: Joshua Hahn
---
 include/linux/page_counter.h |  3 +++
 mm/page_counter.c            | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/include/linux/page_counter.h b/include/linux/page_counter.h
index c92bb2ee2a581..4a88874019af0 100644
--- a/include/linux/page_counter.h
+++ b/include/linux/page_counter.h
@@ -111,6 +111,9 @@ static inline void page_counter_reset_watermark(struct page_counter *counter)
 int page_counter_alloc_stock(struct page_counter *counter, unsigned int batch);
 void page_counter_disable_stock(struct page_counter *counter);
 void page_counter_free_stock(struct page_counter *counter);
+void page_counter_drain_stock_local(struct page_counter *counter);
+void page_counter_drain_stock_cpu(struct page_counter *counter,
+				  unsigned int cpu);
 
 #if IS_ENABLED(CONFIG_MEMCG) || IS_ENABLED(CONFIG_CGROUP_DMEM)
 void page_counter_calculate_protection(struct page_counter *root,
diff --git a/mm/page_counter.c b/mm/page_counter.c
index 1a71de4f43fd0..7e7eb683472d9 100644
--- a/mm/page_counter.c
+++ b/mm/page_counter.c
@@ -378,6 +378,40 @@ void page_counter_free_stock(struct page_counter *counter)
 	counter->stock = NULL;
 }
 
+void page_counter_drain_stock_local(struct page_counter *counter)
+{
+	struct page_counter_stock *stock;
+	unsigned long nr_pages;
+
+	if (!counter->stock)
+		return;
+
+	local_lock(&counter->stock->lock);
+	stock = this_cpu_ptr(counter->stock);
+	nr_pages = stock->nr_pages;
+	stock->nr_pages = 0;
+	local_unlock(&counter->stock->lock);
+
+	if (nr_pages)
+		page_counter_uncharge(counter, nr_pages);
+}
+
+void page_counter_drain_stock_cpu(struct page_counter *counter,
+				  unsigned int cpu)
+{
+	struct page_counter_stock *stock;
+	unsigned long nr_pages;
+
+	if (!counter->stock)
+		return;
+
+	stock = per_cpu_ptr(counter->stock, cpu);
+	nr_pages = stock->nr_pages;
+	if (nr_pages) {
+		stock->nr_pages = 0;
+		page_counter_uncharge(counter, nr_pages);
+	}
+}
 
 #if IS_ENABLED(CONFIG_MEMCG) || IS_ENABLED(CONFIG_CGROUP_DMEM)
 /*
-- 
2.53.0-Meta

From nobody Mon Jun 8 05:26:06 2026
From: Joshua Hahn
To: Johannes Weiner, Michal Hocko
Cc: Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton,
 cgroups@vger.kernel.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 4/6 v3] mm/memcontrol: convert memcg to use page_counter_stock
Date: Fri, 5 Jun 2026 08:36:00 -0700
Message-ID: <20260605153603.234296-5-joshua.hahnjy@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To: <20260605153603.234296-1-joshua.hahnjy@gmail.com>
References: <20260605153603.234296-1-joshua.hahnjy@gmail.com>

Now with all of the memcg_stock handling logic replicated in
page_counter_stock, switch memcg to use the page_counter_stock.

There are a few details that have changed:

First, the old special-casing for the !allow_spinning check to avoid
refilling and flushing of the old stock is removed. This special casing
was important previously, because refilling the stock could do a lot of
extra work by evicting one of 7 random victim memcgs in the percpu
memcg_stock slots. In the new per-counter design, refilling stock just
adds pages to the counter's own local cache without affecting other
memcgs, so the original reason for the special case no longer applies.

Also, we can now fail during page_counter_alloc_stock(), if there is not
enough memory to allocate a percpu page_counter_stock. This failure is
rare and nonfatal; the system can continue to operate, with the page
counter working without stock and falling back to walking the hierarchy.

Finally, drain_all_stock is restructured to iterate CPUs in the outer
loop (rather than memcgs) to be able to schedule draining all memcgs via
a single work_on_cpu call. It reduces the number of synchronous per-CPU
work calls from O(memcgs * CPUs) to just O(CPUs). We also skip isolated
CPUs, as schedule_drain_work() did before. We don't need its guard(rcu)
here though; that rcu section existed to order async work scheduling
against cpumask updates and workqueue flushes, which would lead to drain
work pending. Since all work here is synchronous, we don't leave any
work behind.

Note that obj_stock remains untouched by these changes, and that memsw
stock will be handled in the next patch.

Suggested-by: Johannes Weiner
Signed-off-by: Joshua Hahn
---
 mm/memcontrol.c | 80 ++++++++++++++++++++++---------------------------
 1 file changed, 36 insertions(+), 44 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 56cd4af082326..562ed9301f5a4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2269,6 +2269,17 @@ static void schedule_drain_work(int cpu, struct work_struct *work)
 		queue_work_on(cpu, memcg_wq, work);
 }
 
+static long drain_stock_on_cpu(void *arg)
+{
+	struct mem_cgroup *root_memcg = arg;
+	struct mem_cgroup *memcg;
+
+	for_each_mem_cgroup_tree(memcg, root_memcg)
+		page_counter_drain_stock_local(&memcg->memory);
+
+	return 0;
+}
+
 /*
  * Drains all per-CPU charge caches for given root_memcg resp. subtree
  * of the hierarchy under it.
@@ -2280,28 +2291,18 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 	/* If someone's already draining, avoid adding running more workers. */
 	if (!mutex_trylock(&percpu_charge_mutex))
 		return;
-	/*
-	 * Notify other cpus that system-wide "drain" is running
-	 * We do not care about races with the cpu hotplug because cpu down
-	 * as well as workers from this path always operate on the local
-	 * per-cpu data. CPU up doesn't touch memcg_stock at all.
-	 */
+
+	for_each_online_cpu(cpu) {
+		if (!cpu_is_isolated(cpu))
+			work_on_cpu(cpu, drain_stock_on_cpu, root_memcg);
+	}
+
+	/* Drain obj_stock on all online CPUs */
 	migrate_disable();
 	curcpu = smp_processor_id();
 	for_each_online_cpu(cpu) {
-		struct memcg_stock_pcp *memcg_st = &per_cpu(memcg_stock, cpu);
 		struct obj_stock_pcp *obj_st = &per_cpu(obj_stock, cpu);
 
-		if (!test_bit(FLUSHING_CACHED_CHARGE, &memcg_st->flags) &&
-		    is_memcg_drain_needed(memcg_st, root_memcg) &&
-		    !test_and_set_bit(FLUSHING_CACHED_CHARGE,
-				      &memcg_st->flags)) {
-			if (cpu == curcpu)
-				drain_local_memcg_stock(&memcg_st->work);
-			else
-				schedule_drain_work(cpu, &memcg_st->work);
-		}
-
 		if (!test_bit(FLUSHING_CACHED_CHARGE, &obj_st->flags) &&
 		    obj_stock_flush_required(obj_st, root_memcg) &&
 		    !test_and_set_bit(FLUSHING_CACHED_CHARGE,
@@ -2318,9 +2319,13 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
+	struct mem_cgroup *memcg;
+
 	/* no need for the local lock */
 	drain_obj_stock(&per_cpu(obj_stock, cpu));
-	drain_stock_fully(&per_cpu(memcg_stock, cpu));
+
+	for_each_mem_cgroup(memcg)
+		page_counter_drain_stock_cpu(&memcg->memory, cpu);
 
 	return 0;
 }
@@ -2595,7 +2600,6 @@ void __mem_cgroup_handle_over_high(gfp_t gfp_mask)
 static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 			    unsigned int nr_pages)
 {
-	unsigned int batch = max(MEMCG_CHARGE_BATCH, nr_pages);
 	int nr_retries = MAX_RECLAIM_RETRIES;
 	struct mem_cgroup *mem_over_limit;
 	struct page_counter *counter;
@@ -2608,31 +2612,19 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	bool allow_spinning = gfpflags_allow_spinning(gfp_mask);
 
 retry:
-	if (consume_stock(memcg, nr_pages))
-		return 0;
-
-	if (!allow_spinning)
-		/* Avoid the refill and flush of the older stock */
-		batch = nr_pages;
-
 	reclaim_options = MEMCG_RECLAIM_MAY_SWAP;
 	if (!do_memsw_account() ||
-	    page_counter_try_charge(&memcg->memsw, batch, &counter)) {
-		if (page_counter_try_charge(&memcg->memory, batch, &counter))
+	    page_counter_try_charge(&memcg->memsw, nr_pages, &counter)) {
+		if (page_counter_try_charge(&memcg->memory, nr_pages, &counter))
 			goto done_restock;
 		if (do_memsw_account())
-			page_counter_uncharge(&memcg->memsw, batch);
+			page_counter_uncharge(&memcg->memsw, nr_pages);
 		mem_over_limit = mem_cgroup_from_counter(counter, memory);
 	} else {
 		mem_over_limit = mem_cgroup_from_counter(counter, memsw);
 		reclaim_options &= ~MEMCG_RECLAIM_MAY_SWAP;
 	}
 
-	if (batch > nr_pages) {
-		batch = nr_pages;
-		goto retry;
-	}
-
 	/*
 	 * Prevent unbounded recursion when reclaim operations need to
 	 * allocate memory. This might exceed the limits temporarily,
@@ -2729,9 +2721,6 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	return 0;
 
 done_restock:
-	if (batch > nr_pages)
-		refill_stock(memcg, batch - nr_pages);
-
 	/*
 	 * If the hierarchy is above the normal consumption range, schedule
 	 * reclaim on returning to userland. We can perform reclaim here
@@ -2768,7 +2757,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
 			 * and distribute reclaim work and delay penalties
 			 * based on how much each task is actually allocating.
*/ - current->memcg_nr_pages_over_high +=3D batch; + current->memcg_nr_pages_over_high +=3D nr_pages; set_notify_resume(current); break; } @@ -3073,7 +3062,7 @@ static void obj_cgroup_uncharge_pages(struct obj_cgro= up *objcg, account_kmem_nmi_safe(memcg, -nr_pages); memcg1_account_kmem(memcg, -nr_pages); if (!mem_cgroup_is_root(memcg)) - refill_stock(memcg, nr_pages); + memcg_uncharge(memcg, nr_pages); =20 css_put(&memcg->css); } @@ -4077,6 +4066,8 @@ static void __mem_cgroup_free(struct mem_cgroup *memc= g) =20 static void mem_cgroup_free(struct mem_cgroup *memcg) { + page_counter_free_stock(&memcg->memory); + page_counter_free_stock(&memcg->memsw); lru_gen_exit_memcg(memcg); memcg_wb_domain_exit(memcg); __mem_cgroup_free(memcg); @@ -4244,6 +4235,9 @@ static int mem_cgroup_css_online(struct cgroup_subsys= _state *css) refcount_set(&memcg->id.ref, 1); css_get(css); =20 + /* failure is nonfatal, charges fall back to direct hierarchy */ + page_counter_alloc_stock(&memcg->memory, MEMCG_CHARGE_BATCH); + /* * Ensure mem_cgroup_from_private_id() works once we're fully online. 
* @@ -4304,6 +4298,7 @@ static void mem_cgroup_css_offline(struct cgroup_subs= ys_state *css) wb_memcg_offline(memcg); lru_gen_offline_memcg(memcg); =20 + page_counter_disable_stock(&memcg->memory); drain_all_stock(memcg); =20 mem_cgroup_private_id_put(memcg, 1); @@ -5499,7 +5494,7 @@ void mem_cgroup_sk_uncharge(const struct sock *sk, un= signed int nr_pages) =20 mod_memcg_state(memcg, MEMCG_SOCK, -nr_pages); =20 - refill_stock(memcg, nr_pages); + page_counter_uncharge(&memcg->memory, nr_pages); } =20 void mem_cgroup_flush_workqueue(void) @@ -5552,12 +5547,9 @@ int __init mem_cgroup_init(void) memcg_wq =3D alloc_workqueue("memcg", WQ_PERCPU, 0); WARN_ON(!memcg_wq); =20 - for_each_possible_cpu(cpu) { - INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work, - drain_local_memcg_stock); + for_each_possible_cpu(cpu) INIT_WORK(&per_cpu_ptr(&obj_stock, cpu)->work, drain_local_obj_stock); - } =20 memcg_size =3D struct_size_t(struct mem_cgroup, nodeinfo, nr_node_ids); memcg_cachep =3D kmem_cache_create("mem_cgroup", memcg_size, 0, --=20 2.53.0-Meta From nobody Mon Jun 8 05:26:06 2026 Received: from mail-oo1-f42.google.com (mail-oo1-f42.google.com [209.85.161.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 58BA33B4439 for ; Fri, 5 Jun 2026 15:36:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.161.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780673776; cv=none; b=akA5NR2iKvLIS8xIo7O1l6Soy9MEcOKE4Rpfn/qGFv1BNZwteGNZkplPeK1yt5M6fOgv1wqgV0u/CdM1cnqTM5Vyv9X02K3dmKGTI/bCwmcpKVabR4eDNO0MIGm6lthauXJ9+G9oX6v9yhaJynAD/B4oFrZcALFryGDjPZZF/v0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780673776; c=relaxed/simple; bh=bIEWn8F7ak6CcMSj5/RHqMkAoowSiMeeFwKRoqLy4oI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=CqjJjoX04s1KMqPXkXDUJRyds2Chey029DUpGJaLvvfjoiFWOIGFk14KRtJY3jjMQS08/sowMSoXbNKjZdqGKmcZOsY4c6EMLGWxplIEaLpZJPRbGGTu7ZaFIMS+5MXjN4Dux4e/jxLGfzr7jU/O9VXyp6G0FCI57w4cfNGupm0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=hGkRBjoo; arc=none smtp.client-ip=209.85.161.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="hGkRBjoo" Received: by mail-oo1-f42.google.com with SMTP id 006d021491bc7-69e59978deeso912130eaf.0 for ; Fri, 05 Jun 2026 08:36:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780673771; x=1781278571; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2POwf/K9X70lHx5WEcIKkTy5o0DDbStiw7GuBjFdd58=; b=hGkRBjoorE08vkSWw9XWZ9ybn3p19x/5BgvaFxUHZbac7XlPN7TU+dCnYRP2d5vlqE jSGjQ20foDou1QRiS9gvDh+8xX3BjVli9Q0EesOxw/TXdzDUeIjdTgDP2UssKhgawIJm CUMQ4T1u5BD1QmQJuQBSuvZCrVF+0KWstQflK4InW4/NxiH8ot41DWFfpBlKSjvRBKNd qkTiZASqL/Tsk9WPoHoF5kfU72hwLA7EKJQKDVVolysmE1EqgOteED938DwT3BZHG+ai vaQOGpOFhZ2ceVIQ4bMjLi795TCjfYR2ArXZchV4c//Zgf8wZOQ8URwIUJ0us6/RBkZp DxWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780673771; x=1781278571; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=2POwf/K9X70lHx5WEcIKkTy5o0DDbStiw7GuBjFdd58=; b=izCBOWEfwgiHRIPrThIiVQD+1fvfkOPh4qfanWwWO0AaYodKeB9J2q7PfJEAj43Mja 
toqskTwRuenC+SXcDjyKd0F3RS0fw+dKNr4gqqO8trqpCAa2NMmCobw1D7zX2Sa1LwrO 9jAgeSBC1S37YBn1Pa9hAquEEbBvsFZQRTF9h/BBaJVxnO4pCBlcVcrpHJsCy97ICgXP GgWmbIi5vJLE442/E8ppQeb5S7/whlvABBCnj6CTfrSGm941R8txNvVQqNoj6BLG3Vj3 +/pzTK/eBVwYWdZB+so2+ZInsAb1ymnw5e6TTrGbh8OFd4DkBLpWwap7hQfJDuAFWhKq pnzQ== X-Forwarded-Encrypted: i=1; AFNElJ8zVLFVq/bPbIxA2Wxjw4/DtOwlihvINzhkjP1w+zJWEJI7jABLqTeZcCpTp7dFOFBr4XLxXmGG55SaKbM=@vger.kernel.org X-Gm-Message-State: AOJu0YxA8aROM5RVIXbNkLN3AaY2LAo+ySSy4c7AWe2zd9emmJsesj9T wVN8fSUYonpROsOwS1wBy7ZqL6EhHFQc34DfpFyy0nE6jLBJX2L7+ShD X-Gm-Gg: Acq92OHdjLCUsQoOJR9ktlEfSSjdxEqDE5wGJxA/7jcmz5o1XlVXVZ8loPXP4fOiz8P siUs/G/E8zmy//P3CaIOxNRR+DAXUpM0eAHVsgvVHeLrEKPmyUU8aJ21YWdFP4nT330oDsR0B5N eAEKnvVLozBXLNOaS4MDXiSSHH4bAuSyNjRG1GH0xcOuEUR+F6GYHgeq6D6T/nE1Dljc8a6SoCF c2jb0xsbjuRSxJR/2iYSKnXDWzHBvrrmLt8J8qd+M1IhYsaJlBlzxkOWDswdeUawkSbjJZi26t2 ffM6BmNdzn7p5yi9Rd9zuOhbumq+9CncaEv2bO/xaZDXD00WqrAvy8tXWb9Yv1vQIFpluf/j2Jq bPGikzNfG4gUaN3kahUZoTZUzqOP58O3f9aOqiyL03pKXP1GBNd45xS8k52kP8c5JJRH5B8p7hz sISM4tm9IGkYzgPOjQE6hoE3gJx4+lKt84ARyV/7dPS7H/WARlOCPBDz9FzqlYvu2tatWgZQ0aV w== X-Received: by 2002:a05:6820:4b03:b0:69e:59ec:f70b with SMTP id 006d021491bc7-69e68c19e22mr2043956eaf.37.1780673770989; Fri, 05 Jun 2026 08:36:10 -0700 (PDT) Received: from localhost ([2a03:2880:10ff:3::]) by smtp.gmail.com with ESMTPSA id 006d021491bc7-69e4620b0casm5398168eaf.3.2026.06.05.08.36.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 05 Jun 2026 08:36:10 -0700 (PDT) From: Joshua Hahn To: Johannes Weiner , Michal Hocko Cc: Roman Gushchin , Shakeel Butt , Muchun Song , Andrew Morton , cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com Subject: [PATCH 5/6 v3] mm/memcontrol: optimize memsw stock for cgroup v1 Date: Fri, 5 Jun 2026 08:36:01 -0700 Message-ID: <20260605153603.234296-6-joshua.hahnjy@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260605153603.234296-1-joshua.hahnjy@gmail.com> References: 
 <20260605153603.234296-1-joshua.hahnjy@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Previously, each memcg had its own stock, which was shared by all page
counters within it. Specifically, in try_charge_memcg() the stock check
occurred before the memsw and memory page_counters were charged
hierarchically.

Now that the memcg stock has been folded into the page_counter level, and
try_charge_memcg()'s stock check has been replaced by the memory
page_counter's stock, there is no fast path left for cgroup v1's memsw
check.

Introduce a new stock for the memsw page_counter, charged independently
from the memory page_counter. This provides better caching on cgroup v1:

The best case scenario is when both the memsw and memory page_counters
can be served from their cached stock; this matches the old behavior.

The halfway scenario is when one of the memsw or memory page_counters
can be served from its stock, but the other cannot. This requires one
hierarchical charge.

The worst case scenario is when neither the memsw nor the memory
page_counter can be served from its stock, and both page_counter
hierarchies must be walked. This is the same as the old behavior.

By introducing an independent stock for memsw, we can avoid the worst
case scenario more often, and the memsw charge can fail or succeed
separately from the memory page counter.

One user-visible change is that reported memsw usage may transiently be
lower than memory usage. This happens because each counter batches its
stock charges independently, so the visible values can differ by up to
the stock batch size (MEMCG_CHARGE_BATCH) in pages.
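
The three scenarios above, and the transient skew between the two
counters, can be sketched with a small userspace model. This is a toy
under stated assumptions rather than the page_counter_stock
implementation: toy_counter, toy_charge(), toy_drain() and
toy_charge_both() are hypothetical names, TOY_BATCH stands in for
MEMCG_CHARGE_BATCH (taken to be 64 here), there is a single CPU, and
limits, and therefore charge failures, are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_BATCH 64U	/* stands in for MEMCG_CHARGE_BATCH */

/* Hypothetical stand-in for a page_counter with its own stock. */
struct toy_counter {
	unsigned long usage;	/* pages charged to the hierarchy (visible) */
	unsigned int stock;	/* pre-charged pages cached locally */
};

/* Charge nr_pages; returns true if the cached stock alone covered it. */
static bool toy_charge(struct toy_counter *c, unsigned int nr_pages)
{
	unsigned int batch = nr_pages > TOY_BATCH ? nr_pages : TOY_BATCH;

	if (c->stock >= nr_pages) {
		c->stock -= nr_pages;
		return true;
	}

	/* Slow path: charge a whole batch and cache the surplus. */
	c->usage += batch;
	c->stock += batch - nr_pages;
	return false;
}

/* Hand the cached pages back, as a drain would. */
static void toy_drain(struct toy_counter *c)
{
	assert(c->usage >= c->stock);
	c->usage -= c->stock;
	c->stock = 0;
}

/* Returns how many hierarchical charges were needed: 0, 1 or 2. */
static unsigned int toy_charge_both(struct toy_counter *memsw,
				    struct toy_counter *memory,
				    unsigned int nr_pages)
{
	unsigned int walks = 0;

	if (!toy_charge(memsw, nr_pages))
		walks++;
	if (!toy_charge(memory, nr_pages))
		walks++;
	return walks;
}
```

Starting from two empty counters, a 4-page charge takes two slow paths
(worst case) and the next one takes none (best case). If only the memory
stock is then drained, a further 4-page charge takes one slow path
(halfway case) and leaves memsw reporting 64 pages while memory reports
72, which is the kind of transient skew described above.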

Signed-off-by: Joshua Hahn
---
 mm/memcontrol.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 562ed9301f5a4..d0da2f842e2d4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2274,8 +2274,11 @@ static long drain_stock_on_cpu(void *arg)
 	struct mem_cgroup *root_memcg =3D arg;
 	struct mem_cgroup *memcg;
=20
-	for_each_mem_cgroup_tree(memcg, root_memcg)
+	for_each_mem_cgroup_tree(memcg, root_memcg) {
 		page_counter_drain_stock_local(&memcg->memory);
+		if (do_memsw_account())
+			page_counter_drain_stock_local(&memcg->memsw);
+	}
=20
 	return 0;
 }
@@ -2324,8 +2327,11 @@ static int memcg_hotplug_cpu_dead(unsigned int cpu)
 	/* no need for the local lock */
 	drain_obj_stock(&per_cpu(obj_stock, cpu));
=20
-	for_each_mem_cgroup(memcg)
+	for_each_mem_cgroup(memcg) {
 		page_counter_drain_stock_cpu(&memcg->memory, cpu);
+		if (do_memsw_account())
+			page_counter_drain_stock_cpu(&memcg->memsw, cpu);
+	}
=20
 	return 0;
 }
@@ -4237,6 +4243,8 @@ static int mem_cgroup_css_online(struct cgroup_subsys=
_state *css)
=20
 	/* failure is nonfatal, charges fall back to direct hierarchy */
 	page_counter_alloc_stock(&memcg->memory, MEMCG_CHARGE_BATCH);
+	if (do_memsw_account())
+		page_counter_alloc_stock(&memcg->memsw, MEMCG_CHARGE_BATCH);
=20
 	/*
 	 * Ensure mem_cgroup_from_private_id() works once we're fully online.
@@ -4299,6 +4307,8 @@ static void mem_cgroup_css_offline(struct cgroup_subs= ys_state *css) lru_gen_offline_memcg(memcg); =20 page_counter_disable_stock(&memcg->memory); + if (do_memsw_account()) + page_counter_disable_stock(&memcg->memsw); drain_all_stock(memcg); =20 mem_cgroup_private_id_put(memcg, 1); --=20 2.53.0-Meta From nobody Mon Jun 8 05:26:06 2026 Received: from mail-oo1-f54.google.com (mail-oo1-f54.google.com [209.85.161.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9CB6D3B8950 for ; Fri, 5 Jun 2026 15:36:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.161.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780673780; cv=none; b=BN4j29ZXAdGqDteXrE8JJZZvdfD1hT421/3f1kcoiaec44eie6vI+grq5fuwESvGdxCqvHkNB9YHbcalZCwb36k25f35kp1l0mq/SWvVdKnbGK6EFHAQHKxKMMCKfCQE6/u82qj6S3glm8Qvdb3CTe0+Gp2krxQoiCOgRmW6bzU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780673780; c=relaxed/simple; bh=AlnunryOiTLhLF6/0+wT6IlypoE163q0qespXBG7MNo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=V3oazH9hlIW/kqtO9m1jDSrQRk6Tq/7JcDcBF2R2jKEiIKkk/mQMAFaUV4Q53wIbuFameuEj8My6CYRsJMErkwnnMxivxdtGIk70dDaePE1dQsVqDk+leLef9GxwRkxzp3thtacsynOmdmd7Y6DLVqXxr43drIWW30usQavv4Pk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Z6OrchWC; arc=none smtp.client-ip=209.85.161.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com 
header.b="Z6OrchWC" Received: by mail-oo1-f54.google.com with SMTP id 006d021491bc7-69d92dbc420so1419418eaf.3 for ; Fri, 05 Jun 2026 08:36:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780673772; x=1781278572; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cDYBxVf6/SnI4wuLSeXSHzzLh62y/xO5UdAB8iPGGOc=; b=Z6OrchWCG+nUnXECqeCzx2tNEDY2e26Ji8n6N2Cd6cilFZLfqm/+vOcOBUgYWz+u3w VRe7lyILe/8YpAEJvOYAs3GoYFWTawNUejjQVoIFjM1YnhGPbtEFksUkhyDdtJJFO983 jyVyM+Vhpz/Ycb6OAfkRtbjE4npJsoAheGyVkPbNTUNe1TbI34X9eTBqs/kj8RoYEjhW X/ygXvp77lTV7a+c4yiIN4+EuAlf+3NNByjCJfN3H0JOyapnkPthMxkvlUnwqt2IHeGn z9m5xHPginnGnJ/j0Sg3vxUJTd9WQvdXPgfLi5SZI45oaTAC2kUmDAbaCuQpzpxx1ceA 8JEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780673772; x=1781278572; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=cDYBxVf6/SnI4wuLSeXSHzzLh62y/xO5UdAB8iPGGOc=; b=oFZRFF7mogXT9LBVJmWh0iTYSz40ytY/0svHjCc2zIL2jc3+AuTLhk1jVH0Mj9fQMl H/on3+rgXT+P29oSvm897pyQy83zjmSLMui9Y8nS0WUX3Gi81S257wVs0L7SRlxvm5cM TyqbWOskHppBKOFrne2Fy1Ofj4N27nObM7Pw5kFBvL0i7xenHOCgLJvM6pLWCv6ESNFl i04ACXC8Kx2HG35h+EnM1zjJBD1fgrvsqS/OgHrNsNgjQ9QeASBCRhL1sHZTxMqkOBNt UKm03Avc+25kGdkAL5IHQsnWjTpzcg0SENSxohkj+XM5/zXCaCKF975MYt2SRKY/iUG5 TrYA== X-Forwarded-Encrypted: i=1; AFNElJ/oD6Wy8Z0QzJFfE8FNu12qDdWH5cmwVdxL9MuaoHa6+F9kHpbrIe8GLtFVEBoFbknq8QQ1GqYlnkL47ek=@vger.kernel.org X-Gm-Message-State: AOJu0YwcHIG0GPM+bRdYPwxOXspNRfPHPETSs5M+HBnb+bx9Z+TVWV4T FMR7U7NlxbhphkMomGELvUWEbfepXhv1hZceFNPEqxQLMCdNOVXGSGue X-Gm-Gg: Acq92OGQz/enMPyh7yZw7Y1jheJEgUieJhn+vtWkeLfUvoZMWFgeJTxAMiSosIRmTPd ck8y222xzEfvHK+5AIEqiflXc7VegkcDVFW6I79Ue4fDRS4V7dlgSJagHzFc/vWoySJMlPYlUV2 
MLRTalKfrsJIuFnYPe+Q7PfQkhBXCGpWZcpf30eGKxXOTqLtlwKbPNo7p5+jFPXSm1+9AGTRk4y wHYtQM1E7LQYJSloCqD0cHMqjVlAUHYu58A19fWkm7oO9QuuN2+5RSPDsop77vOtJel13St7YXy 8gKtmFSK+t7T1N1JCXLgiRiIi9s7yOT7lIBYL81TSRiXVW5mVKPkYLyy7zvk/s5Lo0EgjGVDPB2 vlF0d/+vEE0cHXMYjwnhey/tA7R3yBBTERk6SHIrxbEwVOx60Lj1a56sLoliOiR7bBa+2c6wzcF 2APUjlO5JWU0NX03uxsFJqSdXniACkio44B9VkbZnMng24ZILX/l9tVbpb8FbMM7UH X-Received: by 2002:a05:6820:2d0b:b0:69e:71c8:193f with SMTP id 006d021491bc7-69e71c81e2dmr1257008eaf.8.1780673772254; Fri, 05 Jun 2026 08:36:12 -0700 (PDT) Received: from localhost ([2a03:2880:10ff:55::]) by smtp.gmail.com with ESMTPSA id 586e51a60fabf-440d8500f0esm7271409fac.18.2026.06.05.08.36.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 05 Jun 2026 08:36:11 -0700 (PDT) From: Joshua Hahn To: Johannes Weiner , Michal Hocko Cc: Roman Gushchin , Shakeel Butt , Muchun Song , Andrew Morton , cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com Subject: [PATCH 6/6 v3] mm/memcontrol: remove unused memcg_stock code Date: Fri, 5 Jun 2026 08:36:02 -0700 Message-ID: <20260605153603.234296-7-joshua.hahnjy@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260605153603.234296-1-joshua.hahnjy@gmail.com> References: <20260605153603.234296-1-joshua.hahnjy@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Now that all memcg_stock logic has been moved to page_counter_stock, we can remove all code related to handling memcg_stock. Note that obj_stock is untouched and is still needed. FLUSHING_CACHED_CHARGE is preserved so that it can be used by obj_stock as well. 
Suggested-by: Johannes Weiner Signed-off-by: Joshua Hahn --- mm/memcontrol.c | 186 ------------------------------------------------ 1 file changed, 186 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d0da2f842e2d4..3e3f8fbd19a48 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1998,25 +1998,7 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *m= emcg) pr_cont(" are going to be killed due to memory.oom.group set\n"); } =20 -/* - * The value of NR_MEMCG_STOCK is selected to keep the cached memcgs and t= heir - * nr_pages in a single cacheline. This may change in future. - */ -#define NR_MEMCG_STOCK 7 #define FLUSHING_CACHED_CHARGE 0 -struct memcg_stock_pcp { - local_trylock_t lock; - uint8_t nr_pages[NR_MEMCG_STOCK]; - struct mem_cgroup *cached[NR_MEMCG_STOCK]; - - struct work_struct work; - unsigned long flags; - uint8_t drain_idx; -}; - -static DEFINE_PER_CPU_ALIGNED(struct memcg_stock_pcp, memcg_stock) =3D { - .lock =3D INIT_LOCAL_TRYLOCK(lock), -}; =20 /* * NR_OBJ_STOCK is sized so the entire hot path of obj_stock_pcp @@ -2065,47 +2047,6 @@ static void drain_obj_stock(struct obj_stock_pcp *st= ock); static bool obj_stock_flush_required(struct obj_stock_pcp *stock, struct mem_cgroup *root_memcg); =20 -/** - * consume_stock: Try to consume stocked charge on this cpu. - * @memcg: memcg to consume from. - * @nr_pages: how many pages to charge. - * - * Consume the cached charge if enough nr_pages are present otherwise retu= rn - * failure. Also return failure for charge request larger than - * MEMCG_CHARGE_BATCH or if the local lock is already taken. - * - * returns true if successful, false otherwise. 
- */ -static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages) -{ - struct memcg_stock_pcp *stock; - uint8_t stock_pages; - bool ret =3D false; - int i; - - if (nr_pages > MEMCG_CHARGE_BATCH || - !local_trylock(&memcg_stock.lock)) - return ret; - - stock =3D this_cpu_ptr(&memcg_stock); - - for (i =3D 0; i < NR_MEMCG_STOCK; ++i) { - if (memcg !=3D READ_ONCE(stock->cached[i])) - continue; - - stock_pages =3D READ_ONCE(stock->nr_pages[i]); - if (stock_pages >=3D nr_pages) { - WRITE_ONCE(stock->nr_pages[i], stock_pages - nr_pages); - ret =3D true; - } - break; - } - - local_unlock(&memcg_stock.lock); - - return ret; -} - static void memcg_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages) { page_counter_uncharge(&memcg->memory, nr_pages); @@ -2113,51 +2054,6 @@ static void memcg_uncharge(struct mem_cgroup *memcg,= unsigned int nr_pages) page_counter_uncharge(&memcg->memsw, nr_pages); } =20 -/* - * Returns stocks cached in percpu and reset cached information. - */ -static void drain_stock(struct memcg_stock_pcp *stock, int i) -{ - struct mem_cgroup *old =3D READ_ONCE(stock->cached[i]); - uint8_t stock_pages; - - if (!old) - return; - - stock_pages =3D READ_ONCE(stock->nr_pages[i]); - if (stock_pages) { - memcg_uncharge(old, stock_pages); - WRITE_ONCE(stock->nr_pages[i], 0); - } - - css_put(&old->css); - WRITE_ONCE(stock->cached[i], NULL); -} - -static void drain_stock_fully(struct memcg_stock_pcp *stock) -{ - int i; - - for (i =3D 0; i < NR_MEMCG_STOCK; ++i) - drain_stock(stock, i); -} - -static void drain_local_memcg_stock(struct work_struct *dummy) -{ - struct memcg_stock_pcp *stock; - - if (WARN_ONCE(!in_task(), "drain in non-task context")) - return; - - local_lock(&memcg_stock.lock); - - stock =3D this_cpu_ptr(&memcg_stock); - drain_stock_fully(stock); - clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags); - - local_unlock(&memcg_stock.lock); -} - static void drain_local_obj_stock(struct work_struct *dummy) { struct obj_stock_pcp *stock; @@ 
-2174,88 +2070,6 @@ static void drain_local_obj_stock(struct work_struct= *dummy) local_unlock(&obj_stock.lock); } =20 -static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) -{ - struct memcg_stock_pcp *stock; - struct mem_cgroup *cached; - uint8_t stock_pages; - bool success =3D false; - int empty_slot =3D -1; - int i; - - /* - * For now limit MEMCG_CHARGE_BATCH to 127 and less. In future if we - * decide to increase it more than 127 then we will need more careful - * handling of nr_pages[] in struct memcg_stock_pcp. - */ - BUILD_BUG_ON(MEMCG_CHARGE_BATCH > S8_MAX); - - VM_WARN_ON_ONCE(mem_cgroup_is_root(memcg)); - - if (nr_pages > MEMCG_CHARGE_BATCH || - !local_trylock(&memcg_stock.lock)) { - /* - * In case of larger than batch refill or unlikely failure to - * lock the percpu memcg_stock.lock, uncharge memcg directly. - */ - memcg_uncharge(memcg, nr_pages); - return; - } - - stock =3D this_cpu_ptr(&memcg_stock); - for (i =3D 0; i < NR_MEMCG_STOCK; ++i) { - cached =3D READ_ONCE(stock->cached[i]); - if (!cached && empty_slot =3D=3D -1) - empty_slot =3D i; - if (memcg =3D=3D READ_ONCE(stock->cached[i])) { - stock_pages =3D READ_ONCE(stock->nr_pages[i]) + nr_pages; - WRITE_ONCE(stock->nr_pages[i], stock_pages); - if (stock_pages > MEMCG_CHARGE_BATCH) - drain_stock(stock, i); - success =3D true; - break; - } - } - - if (!success) { - i =3D empty_slot; - if (i =3D=3D -1) { - i =3D stock->drain_idx++; - if (stock->drain_idx =3D=3D NR_MEMCG_STOCK) - stock->drain_idx =3D 0; - drain_stock(stock, i); - } - css_get(&memcg->css); - WRITE_ONCE(stock->cached[i], memcg); - WRITE_ONCE(stock->nr_pages[i], nr_pages); - } - - local_unlock(&memcg_stock.lock); -} - -static bool is_memcg_drain_needed(struct memcg_stock_pcp *stock, - struct mem_cgroup *root_memcg) -{ - struct mem_cgroup *memcg; - bool flush =3D false; - int i; - - rcu_read_lock(); - for (i =3D 0; i < NR_MEMCG_STOCK; ++i) { - memcg =3D READ_ONCE(stock->cached[i]); - if (!memcg) - continue; - - if 
(READ_ONCE(stock->nr_pages[i]) && - mem_cgroup_is_descendant(memcg, root_memcg)) { - flush =3D true; - break; - } - } - rcu_read_unlock(); - return flush; -} - static void schedule_drain_work(int cpu, struct work_struct *work) { /* --=20 2.53.0-Meta