From nobody Wed Dec 17 17:25:09 2025 Received: from out-187.mta1.migadu.com (out-187.mta1.migadu.com [95.215.58.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A6C701CD3F for ; Fri, 4 Apr 2025 01:39:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.187 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730777; cv=none; b=ERGUIHp8Eu8KAPm2DWGnHujSWz5JafoptwcqONy4PIGMEfoySmAANEkOsjo8t3gLxcR5O6bKm07YTl1mAYESi9MwStxDi9WcE5/4x9Ob/Vi0Xh5ji2ypMCJo1uZsqHP5318MbNZgCjW5uL+iWz8zntBT3F8YIFt++nygPCie89w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730777; c=relaxed/simple; bh=Wt9frfyEWA8VJjXPXQvYvV1auqH17c8eWrrofBDNj+w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bsnsbbvPyIXrXZ/8OF6zYUPD/TyARXMaI0yfKd/8w5v/hhBtHU9t1IlHcptCIKMrLvMlF6YXr73kq+Ut/zTMmZqc6drrui0FAOLN0qkYc14MuOs309GG6HeSfxqogTGbQ4WXUiBXoiMrKAoTSM2Kg0NNWimf037B5TTVR3IsB1c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=Gtp/lT7A; arc=none smtp.client-ip=95.215.58.187 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="Gtp/lT7A" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730773; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Umv4btRy/nEoNx2OIY9ZLc4o2g+KibZrAp3iooZEy6Y=; b=Gtp/lT7ARKgjwmrfUuMxwEW+dvted/jx4y+oaBXV12gQbAtjbE8JtTP3C+4yGfOpV4rVeI zGYBza0gUb2vsRxmh6rV2dHte5ZmqbcQmI0Clp8KzTFduq0UEOnqRA/K9SlVURO2f6fy75 aLWZys/FK/LwV7mCWlcNJVI/EbRZjuk= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 1/9] memcg: remove root memcg check from refill_stock Date: Thu, 3 Apr 2025 18:39:05 -0700 Message-ID: <20250404013913.1663035-2-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8"

refill_stock() cannot be called with the root memcg, so there is no need
to check for it. Instead, add a warning in case the root memcg is ever
passed to it.

Reviewed-by: Roman Gushchin
Acked-by: Vlastimil Babka
Signed-off-by: Shakeel Butt
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b16b5b807d7c..ae1e953cead7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1893,13 +1893,13 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	unsigned long flags;
 
+	VM_WARN_ON_ONCE(mem_cgroup_is_root(memcg));
+
 	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
 		/*
 		 * In case of unlikely failure to lock percpu stock_lock
 		 * uncharge memcg directly.
 		 */
-		if (mem_cgroup_is_root(memcg))
-			return;
 		page_counter_uncharge(&memcg->memory, nr_pages);
 		if (do_memsw_account())
 			page_counter_uncharge(&memcg->memsw, nr_pages);
-- 
2.47.1

From nobody Wed Dec 17 17:25:09 2025 Received: from out-188.mta0.migadu.com (out-188.mta0.migadu.com [91.218.175.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B946B81720 for ; Fri, 4 Apr 2025 01:39:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730781; cv=none; b=iAQN3QgZSjjXA+VL3GtosA3JRTTXR+HBTjgxIMy3Z0OXL5hnWhowvfycyYGKFg+B1PqgcxOj43tJz2vSpI0j2Tu6C9oX8eX0WkEB5zsxV1d5n6VnqgM9qhGvF0t1dPPmjN0EzcRsezAhj/RFJUvyQJSnT0ovONHqqeXtBpqMdMQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730781; c=relaxed/simple; bh=dcmHxkwnLg3d2Z86cYpTujufTUr95JJ7pE3Jf4au6eE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kct/Jt7X9tZHw8wBBxpHLKAoFjvaWWktnRea/EmFOfPr0zV2hFmw0vIT5oSA6999L0MicZgRXJANQJVLa/ukB3vxrwgOQCc92t7e7kDo2yY7tSOfVULx3SgikF+fyk31N/lWbpFn7OmJmHuWVpKZgIqNa8JBTOeFiLxmo520PmM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none)
header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=IAblp7+f; arc=none smtp.client-ip=91.218.175.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="IAblp7+f" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730777; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZwESqd7AZkcvjy2svN3vjPj4mb4ksMSsdpBE9pRlLtk=; b=IAblp7+fpjYEGUbuAy7stmx7sCATwj3MRNpK/SMhFEatyJbqyWryEzuHgIZ2GtmJPc1RUU fBBkXAT4axjSGU6L3payjnYBz+SQJ9SZtwFI5fgpfm4gq1gn3gcsZpDifoUpAsMKgMJPqQ LpBDY0nLoiac5uIBq6HqyzrixSBCTBI= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 2/9] memcg: decouple drain_obj_stock from local stock Date: Thu, 3 Apr 2025 18:39:06 -0700 Message-ID: <20250404013913.1663035-3-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Currently drain_obj_stock() can potentially call __refill_stock which accesses local cpu stock and thus 
requires memcg stock's local_lock. However, if we look at the code paths
leading to drain_obj_stock(), there is never a good reason to refill the
memcg stock from it at all.

At the moment, drain_obj_stock() can be called from reclaim, hotplug cpu
teardown, mod_objcg_state() and refill_obj_stock(). For reclaim and
hotplug there is no need to refill. For the other two paths, the newly
switched objcg will most probably be used in the near future, so there is
no need to refill the stock with the older objcg.

In addition, __refill_stock() from drain_obj_stock() happens only in rare
cases, so performance is not really an issue. Let's just uncharge directly
instead of refilling, which also decouples drain_obj_stock() from the
local cpu stock and the local_lock requirements.

Acked-by: Vlastimil Babka
Reviewed-by: Roman Gushchin
Signed-off-by: Shakeel Butt
---
 mm/memcontrol.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ae1e953cead7..52be78515d70 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2876,7 +2876,12 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 
 			mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages);
 			memcg1_account_kmem(memcg, -nr_pages);
-			__refill_stock(memcg, nr_pages);
+			if (!mem_cgroup_is_root(memcg)) {
+				page_counter_uncharge(&memcg->memory, nr_pages);
+				if (do_memsw_account())
+					page_counter_uncharge(&memcg->memsw,
+							      nr_pages);
+			}
 
 			css_put(&memcg->css);
 		}
-- 
2.47.1

From nobody Wed Dec 17 17:25:09 2025 Received: from out-189.mta0.migadu.com (out-189.mta0.migadu.com [91.218.175.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB31F22F01 for ; Fri, 4 Apr 2025 01:39:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730786; cv=none;
b=db5TcrX9x2qn5zOBxdZ39PcfaV6fQW0qXa6i+lr8amOKCitLyoMjqyYmkdMA/7B1bRxp8NIBDOOsgwuXYjlQQFU4e8nTUNQ/OxlGEtm0+909kdr06deRc3j36jxtBpCRPlRALZHv102UnpNvUHgQcCYIuJ5KV9iCyvH0CZ3V4HM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730786; c=relaxed/simple; bh=ctz754vKWw7u6LnsnYjGytOfKItBbMHZdJseJBd6xZU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WTCzUQH3W3kvDyQFBiC4fjOZgFbUQkkPuH/jfFfkYc3VseN1ysewpXcXyOIPK3jOivSYBJfQtn28toe6hepST4A3dDRh2xh4Fk23ZCNZJOfwWI9KGPkT9Kta3Kc+JcFyPasQ+xHm8D1CjqhIMfX4/v1dC1KDUO7XSQI24wg6EWA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=GBGlTaaO; arc=none smtp.client-ip=91.218.175.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="GBGlTaaO" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730780; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x+VJqH57jlWrbThx984SIHQEcQAexzgLOXZKVlgElzs=; b=GBGlTaaOzOic8e54sSOr6B4iFeU2rPT3sI99/sfkGEvhKiXeEyqfuK0bvpBaWdvsDxDQZv XelY9nVeMRu7iMo1LxItUpyalrLdPwSU0UJPnLhiXAd8eCnsIuNtfNLnp8wiOMwWOhEJ6W PyvcCmYAn1lsfiVMDrkkk3ensDKlTt4= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 3/9] memcg: introduce memcg_uncharge Date: Thu, 3 Apr 2025 18:39:07 -0700 Message-ID: <20250404013913.1663035-4-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" At multiple places in memcontrol.c, the memory and memsw page counters are being uncharged. This is error-prone. Let's move the functionality to a newly introduced memcg_uncharge and call it from all those places. 
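
For readers following along outside the kernel tree, the consolidation described above can be sketched in user space roughly as follows. This is only an illustration under stated assumptions: the struct layouts are stand-ins, the `memsw_accounting` flag is a hypothetical substitute for do_memsw_account(), and this page_counter_uncharge() is a flat subtraction rather than the kernel's page counter code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
struct page_counter {
	long usage;
};

struct mem_cgroup {
	struct page_counter memory;	/* memory counter */
	struct page_counter memsw;	/* memory+swap counter (cgroup v1) */
};

/* Assumption: models the result of do_memsw_account(). */
static bool memsw_accounting = true;

/* Simplified model: the real function is not reproduced here. */
static void page_counter_uncharge(struct page_counter *counter,
				  unsigned long nr_pages)
{
	counter->usage -= (long)nr_pages;
}

/*
 * One helper uncharges both counters, so a caller cannot forget the
 * memsw half when the memory half is uncharged.
 */
void memcg_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages)
{
	page_counter_uncharge(&memcg->memory, nr_pages);
	if (memsw_accounting)
		page_counter_uncharge(&memcg->memsw, nr_pages);
}
```

In this model every former open-coded pair of uncharge calls collapses to a single memcg_uncharge() call, which is the shape of the change in the diff that follows.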
Acked-by: Vlastimil Babka
Reviewed-by: Roman Gushchin
Signed-off-by: Shakeel Butt
---
 mm/memcontrol.c | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 52be78515d70..dfb3f14c1178 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1822,6 +1822,13 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
 	return ret;
 }
 
+static void memcg_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages)
+{
+	page_counter_uncharge(&memcg->memory, nr_pages);
+	if (do_memsw_account())
+		page_counter_uncharge(&memcg->memsw, nr_pages);
+}
+
 /*
  * Returns stocks cached in percpu and reset cached information.
  */
@@ -1834,10 +1841,7 @@ static void drain_stock(struct memcg_stock_pcp *stock)
 		return;
 
 	if (stock_pages) {
-		page_counter_uncharge(&old->memory, stock_pages);
-		if (do_memsw_account())
-			page_counter_uncharge(&old->memsw, stock_pages);
-
+		memcg_uncharge(old, stock_pages);
 		WRITE_ONCE(stock->nr_pages, 0);
 	}
 
@@ -1900,9 +1904,7 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 		 * In case of unlikely failure to lock percpu stock_lock
 		 * uncharge memcg directly.
 		 */
-		page_counter_uncharge(&memcg->memory, nr_pages);
-		if (do_memsw_account())
-			page_counter_uncharge(&memcg->memsw, nr_pages);
+		memcg_uncharge(memcg, nr_pages);
 		return;
 	}
 	__refill_stock(memcg, nr_pages);
@@ -2876,12 +2878,8 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 
 			mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages);
 			memcg1_account_kmem(memcg, -nr_pages);
-			if (!mem_cgroup_is_root(memcg)) {
-				page_counter_uncharge(&memcg->memory, nr_pages);
-				if (do_memsw_account())
-					page_counter_uncharge(&memcg->memsw,
-							      nr_pages);
-			}
+			if (!mem_cgroup_is_root(memcg))
+				memcg_uncharge(memcg, nr_pages);
 
 			css_put(&memcg->css);
 		}
@@ -4702,9 +4700,7 @@ static inline void uncharge_gather_clear(struct uncharge_gather *ug)
 static void uncharge_batch(const struct uncharge_gather *ug)
 {
 	if (ug->nr_memory) {
-		page_counter_uncharge(&ug->memcg->memory, ug->nr_memory);
-		if (do_memsw_account())
-			page_counter_uncharge(&ug->memcg->memsw, ug->nr_memory);
+		memcg_uncharge(ug->memcg, ug->nr_memory);
 		if (ug->nr_kmem) {
 			mod_memcg_state(ug->memcg, MEMCG_KMEM, -ug->nr_kmem);
 			memcg1_account_kmem(ug->memcg, -ug->nr_kmem);
-- 
2.47.1

From nobody Wed Dec 17 17:25:09 2025 Received: from out-179.mta1.migadu.com (out-179.mta1.migadu.com [95.215.58.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFF8C481B1 for ; Fri, 4 Apr 2025 01:39:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730794; cv=none; b=oZYblHNxdIT5BWgY/X4NMGu7GcdAlLwcVdTQ3eSOYNEMv9YNpH2jQqUyueChkRDd5K+yDmBWrmXsUy3ftMKzzbX7E3ctJkHAGfJYIRkIRhBUlt8zNiCLTFyHN8jAphHmLwkYRr+gTA3Fu2OQZmrcj+j1F1QvLK6O5K5vjFoeFqA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730794; c=relaxed/simple;
bh=lcWUrcEBO0JvrNLGhirw00pS+WGEyz/jppfKcA2v70I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=taNaw88WnnqyPYzqP9bcmJg0aDlW3HcL3I03owqw0BtIfVHRFb8IFq52+z5mQ6J5sp23Igm/bvKJ3+Xeov5Lu/x8eW9BJy5AKK34Nlo0Tv/RN/OLbq4qr70erJMLdFku8bwhA6qKumZmbp6S70vmipku8IsvJbH5AsyjLntid5g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=p8V0zKcN; arc=none smtp.client-ip=95.215.58.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="p8V0zKcN" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730790; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RVDoE6eFCjqBV3S3ZF9n6fC4QSKAYTJ/v3+Mxoye00w=; b=p8V0zKcNaP0A77aZrgaDTTYwvVlJQkDLPvmdgXWObqyzSJbqMjREH/50MV5cOOavF/gVvm y7RVw3kZklBuLqlN/sFz6gYdix2TatsRyA/riEY0zv7KheYl44KhAGWzTNNR5wFoaKgDNw cq+jh+19OszpG+bh/5kPun+qIJ2EbW4= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 4/9] memcg: manually inline __refill_stock Date: Thu, 3 Apr 2025 18:39:08 -0700 Message-ID: <20250404013913.1663035-5-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: 
<20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8"

__refill_stock() no longer has multiple callers, so simply inline it into
refill_stock().

Acked-by: Vlastimil Babka
Reviewed-by: Roman Gushchin
Signed-off-by: Shakeel Butt
---
 mm/memcontrol.c | 34 +++++++++++++---------------------
 1 file changed, 13 insertions(+), 21 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dfb3f14c1178..03a2be6d4a67 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1871,14 +1871,22 @@ static void drain_local_stock(struct work_struct *dummy)
 	obj_cgroup_put(old);
 }
 
-/*
- * Cache charges(val) to local per_cpu area.
- * This will be consumed by consume_stock() function, later.
- */
-static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
+static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 {
 	struct memcg_stock_pcp *stock;
 	unsigned int stock_pages;
+	unsigned long flags;
+
+	VM_WARN_ON_ONCE(mem_cgroup_is_root(memcg));
+
+	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
+		/*
+		 * In case of unlikely failure to lock percpu stock_lock
+		 * uncharge memcg directly.
+		 */
+		memcg_uncharge(memcg, nr_pages);
+		return;
+	}
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (READ_ONCE(stock->cached) != memcg) { /* reset if necessary */
@@ -1891,23 +1899,7 @@ static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
 
 	if (stock_pages > MEMCG_CHARGE_BATCH)
 		drain_stock(stock);
-}
-
-static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
-{
-	unsigned long flags;
-
-	VM_WARN_ON_ONCE(mem_cgroup_is_root(memcg));
 
-	if (!local_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
-		/*
-		 * In case of unlikely failure to lock percpu stock_lock
-		 * uncharge memcg directly.
-		 */
-		memcg_uncharge(memcg, nr_pages);
-		return;
-	}
-	__refill_stock(memcg, nr_pages);
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 }
 
-- 
2.47.1

From nobody Wed Dec 17 17:25:09 2025 Received: from out-188.mta0.migadu.com (out-188.mta0.migadu.com [91.218.175.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7267614A614 for ; Fri, 4 Apr 2025 01:39:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730798; cv=none; b=UREyyxAk2XizOivOSNSf/NquI/cCc26Zow2JYN7sme1ClFBBxFgQJxwQ2cqaoacM3MM62SXifGHqnzMLKfz5KtsD4ZHbR+z197jYtToMy1XTPrnO4KJjeUa6BI25+qKzWcEQ1td++WL9vOgyly9vVy5IK50FUR9TEUgcN6z9Cdw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730798; c=relaxed/simple; bh=v8b9Fi3MhEJftF/PX1M+/Bkc3YxAYhK3++UVDF+VxEs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DstN+i3XvqJzIPB6qLmafmy6F4cKHZSeZDw6NJkfOl6oqO0otbK2UC0gGTGUk+CrvJged+AS061I7veK4yN2g9UOi1bTyTevBV0Eu1Chs07Y7CU/zOgc4vKQEEgSfd1ddeG9DZ51bNxiGWuChuPVOdmAzuxDa9ZgBL1TI++O9IE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none)
header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=lJr61OEX; arc=none smtp.client-ip=91.218.175.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="lJr61OEX" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730794; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zBVDPRxC7zGlf74XAtJaE4SLulN0IfYjtlemuQUW//g=; b=lJr61OEX4xx/9USfw1sohnqhrgFqNhWyygUyfhavSMJTjmL+JpCYHdT4JzRjko3+YFDyoA o3FvDnCdezFhTPYkauE78iHGzVxwYtbUgoDBBCLUDjGvT++701sTPl2fBbo0WZl525k4sd c3sIlCIGT1ABcv3k91f6uLqX/LRI9r0= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 5/9] memcg: no refilling stock from obj_cgroup_release Date: Thu, 3 Apr 2025 18:39:09 -0700 Message-ID: <20250404013913.1663035-6-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" obj_cgroup_release is called when all the references to the objcg have been released i.e. 
no more memory objects are pointing to it. Most probably objcg->memcg will
be pointing to some ancestor memcg. In obj_cgroup_release(), the kernel
calls obj_cgroup_uncharge_pages(), which refills the local stock.

There is no need to refill the local stock with some ancestor memcg and
flush the local stock in the process. Let's decouple obj_cgroup_release()
from the local stock by uncharging instead of refilling. One additional
benefit of this change is that it removes the requirement to only call
obj_cgroup_put() outside of local_lock.

Acked-by: Vlastimil Babka
Reviewed-by: Roman Gushchin
Signed-off-by: Shakeel Butt
---
 mm/memcontrol.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 03a2be6d4a67..df52084e90f4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -129,8 +129,7 @@ bool mem_cgroup_kmem_disabled(void)
 	return cgroup_memory_nokmem;
 }
 
-static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg,
-				      unsigned int nr_pages);
+static void memcg_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages);
 
 static void obj_cgroup_release(struct percpu_ref *ref)
 {
@@ -163,8 +162,16 @@ static void obj_cgroup_release(struct percpu_ref *ref)
 	WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1));
 	nr_pages = nr_bytes >> PAGE_SHIFT;
 
-	if (nr_pages)
-		obj_cgroup_uncharge_pages(objcg, nr_pages);
+	if (nr_pages) {
+		struct mem_cgroup *memcg;
+
+		memcg = get_mem_cgroup_from_objcg(objcg);
+		mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages);
+		memcg1_account_kmem(memcg, -nr_pages);
+		if (!mem_cgroup_is_root(memcg))
+			memcg_uncharge(memcg, nr_pages);
+		mem_cgroup_put(memcg);
+	}
 
 	spin_lock_irqsave(&objcg_lock, flags);
 	list_del(&objcg->list);
-- 
2.47.1

From nobody Wed Dec 17 17:25:09 2025 Received: from out-173.mta0.migadu.com (out-173.mta0.migadu.com [91.218.175.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix)
with ESMTPS id 62F38154C0D for ; Fri, 4 Apr 2025 01:39:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730801; cv=none; b=m5l8QA4QuyFDzda3qpxMdKCpWPaIGbe81Dk7k1xBTumrKpdeuPGCxyS3HfOb3vRE1XGsWgHlS/ZOTYlpzDdJk/oDAZRpTrmFHyIDPkpKHySni2xGS3sR84zMUhFYbJuOEFmCurHVnXmw/HCJXYc78wkYTQjQneYZZRN39S9bjcc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730801; c=relaxed/simple; bh=yJ9zpS6/GVspKG5iIl+QWslNeiMC4U4tnmXp81yyKVw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=O6IBovoiUDm83p+NdboaTMQvAoBtUobHkEkuqX1y1Rn9LSPGsk48My3uGMSez0TbUQ8VhXkv0aQlcB62M/K9E0bO7rjpAJe0zCHfFHmE5vfKKzkQtiT6n6vUQ7qynYWp7aUAJZ7QoQuJIqhYWsUhOnZxXXTApaWRaPxXGVUj98o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=wuy3jajn; arc=none smtp.client-ip=91.218.175.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="wuy3jajn" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730797; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SPcJbZY/0G2u+L0qrxlZJ2y8LLOHW+AMha0Dw+104jw=; b=wuy3jajnDZ7Ek+Id4OsrZGBafMEeu1h0zfmikYHmWilsNXTfvonHYmMbFo/nxU8AMZYc76 czhi5qcby2jXTDXKb+TzOuNa5D5pZPWarz3OxmUPPjtKJB/8yNmVx8Uj7EyokzG0KDpTLW JU4vIFP6qzSocknGhQBCPM020VjBrm4= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 6/9] memcg: do obj_cgroup_put inside drain_obj_stock Date: Thu, 3 Apr 2025 18:39:10 -0700 Message-ID: <20250404013913.1663035-7-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8"

Previously we could not call obj_cgroup_put() inside the local lock
because, when the last reference was put, the release function
obj_cgroup_release() could try to re-acquire the local lock. That chain
has now been broken, so simply do obj_cgroup_put() inside
drain_obj_stock() instead of returning the old objcg.

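
To illustrate why the put can now happen under the lock, here is a small user-space model. It is a sketch under stated assumptions, not kernel code: the plain integer refcount, the `stock_lock_held` flag and the `released_under_lock` field are hypothetical bookkeeping, and obj_cgroup_release() is reduced to the one property that matters here, namely that it no longer touches the per-cpu stock or its lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct obj_cgroup {
	int refcnt;			/* models the percpu_ref */
	bool released;
	bool released_under_lock;	/* bookkeeping for this sketch only */
};

struct memcg_stock_pcp {
	struct obj_cgroup *cached_objcg;
	unsigned int nr_bytes;
};

/* Assumption: a flag that models "memcg_stock.stock_lock is held". */
static bool stock_lock_held;

/*
 * In this model the release path only records what happened. It never
 * tries to take the stock lock, which is the property that makes a put
 * under the lock safe.
 */
static void obj_cgroup_release(struct obj_cgroup *objcg)
{
	objcg->released = true;
	objcg->released_under_lock = stock_lock_held;
}

static void obj_cgroup_put(struct obj_cgroup *objcg)
{
	if (--objcg->refcnt == 0)
		obj_cgroup_release(objcg);
}

/* Drops the cached reference itself instead of returning it. */
static void drain_obj_stock(struct memcg_stock_pcp *stock)
{
	struct obj_cgroup *old = stock->cached_objcg;

	if (!old)
		return;

	stock->nr_bytes = 0;
	stock->cached_objcg = NULL;
	obj_cgroup_put(old);
}

/* Caller side: no local 'old' variable and no put after the unlock. */
void drain_local_stock(struct memcg_stock_pcp *stock)
{
	stock_lock_held = true;		/* models local_lock_irqsave() */
	drain_obj_stock(stock);
	stock_lock_held = false;	/* models local_unlock_irqrestore() */
}
```

With this shape, callers of drain_obj_stock() no longer need to carry the old objcg across the unlock, which accounts for most of the deletions in the diff below.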
Reviewed-by: Roman Gushchin
Signed-off-by: Shakeel Butt
---
 mm/memcontrol.c | 37 +++++++++++--------------------------
 1 file changed, 11 insertions(+), 26 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index df52084e90f4..7988a42b29bf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1785,7 +1785,7 @@ static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = {
 };
 static DEFINE_MUTEX(percpu_charge_mutex);
 
-static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock);
+static void drain_obj_stock(struct memcg_stock_pcp *stock);
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg);
 
@@ -1859,7 +1859,6 @@ static void drain_stock(struct memcg_stock_pcp *stock)
 static void drain_local_stock(struct work_struct *dummy)
 {
 	struct memcg_stock_pcp *stock;
-	struct obj_cgroup *old = NULL;
 	unsigned long flags;
 
 	/*
@@ -1870,12 +1869,11 @@ static void drain_local_stock(struct work_struct *dummy)
 	local_lock_irqsave(&memcg_stock.stock_lock, flags);
 
 	stock = this_cpu_ptr(&memcg_stock);
-	old = drain_obj_stock(stock);
+	drain_obj_stock(stock);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
-	obj_cgroup_put(old);
 }
 
 static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
@@ -1958,18 +1956,16 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
 	struct memcg_stock_pcp *stock;
-	struct obj_cgroup *old;
 	unsigned long flags;
 
 	stock = &per_cpu(memcg_stock, cpu);
 
 	/* drain_obj_stock requires stock_lock */
 	local_lock_irqsave(&memcg_stock.stock_lock, flags);
-	old = drain_obj_stock(stock);
+	drain_obj_stock(stock);
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
 
 	drain_stock(stock);
-	obj_cgroup_put(old);
 
 	return 0;
 }
@@ -2766,24 +2762,20 @@ void __memcg_kmem_uncharge_page(struct page *page, int order)
 }
 
 /* Replace the stock objcg with objcg, return the old objcg */
-static struct obj_cgroup *replace_stock_objcg(struct memcg_stock_pcp *stock,
-					     struct obj_cgroup *objcg)
+static void replace_stock_objcg(struct memcg_stock_pcp *stock,
+				struct obj_cgroup *objcg)
 {
-	struct obj_cgroup *old = NULL;
-
-	old = drain_obj_stock(stock);
+	drain_obj_stock(stock);
 	obj_cgroup_get(objcg);
 	stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes)
 			? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0;
 	WRITE_ONCE(stock->cached_objcg, objcg);
-	return old;
 }
 
 static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		     enum node_stat_item idx, int nr)
 {
 	struct memcg_stock_pcp *stock;
-	struct obj_cgroup *old = NULL;
 	unsigned long flags;
 	int *bytes;
 
@@ -2796,7 +2788,7 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 	 * changes.
 	 */
 	if (READ_ONCE(stock->cached_objcg) != objcg) {
-		old = replace_stock_objcg(stock, objcg);
+		replace_stock_objcg(stock, objcg);
 		stock->cached_pgdat = pgdat;
 	} else if (stock->cached_pgdat != pgdat) {
 		/* Flush the existing cached vmstat data */
@@ -2837,7 +2829,6 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
 		__mod_objcg_mlstate(objcg, pgdat, idx, nr);
 
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
-	obj_cgroup_put(old);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
@@ -2859,12 +2850,12 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
 	return ret;
 }
 
-static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
+static void drain_obj_stock(struct memcg_stock_pcp *stock)
 {
 	struct obj_cgroup *old = READ_ONCE(stock->cached_objcg);
 
 	if (!old)
-		return NULL;
+		return;
 
 	if (stock->nr_bytes) {
 		unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT;
@@ -2917,11 +2908,7 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 	}
 
 	WRITE_ONCE(stock->cached_objcg, NULL);
-	/*
-	 * The `old' objects needs to be released by the caller via
-	 * obj_cgroup_put() outside of memcg_stock_pcp::stock_lock.
-	 */
-	return old;
+	obj_cgroup_put(old);
 }
 
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
@@ -2943,7 +2930,6 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 			     bool allow_uncharge)
 {
 	struct memcg_stock_pcp *stock;
-	struct obj_cgroup *old = NULL;
 	unsigned long flags;
 	unsigned int nr_pages = 0;
 
@@ -2951,7 +2937,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 
 	stock = this_cpu_ptr(&memcg_stock);
 	if (READ_ONCE(stock->cached_objcg) != objcg) { /* reset if necessary */
-		old = replace_stock_objcg(stock, objcg);
+		replace_stock_objcg(stock, objcg);
 		allow_uncharge = true;	/* Allow uncharge when objcg changes */
 	}
 	stock->nr_bytes += nr_bytes;
@@ -2962,7 +2948,6 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 	}
 
 	local_unlock_irqrestore(&memcg_stock.stock_lock, flags);
-	obj_cgroup_put(old);
 
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);
-- 
2.47.1

From nobody Wed Dec 17 17:25:09 2025 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 614F68633F for ; Fri, 4 Apr 2025 01:40:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730806; cv=none; b=kcdqPpxW/V+1tn007767Y9YkfYLXJnS5kq02lbEzshyaW01L+l8BU2620epP774IaJHrAGlpdpEF2EF3SDx12VhHq7vykwMjC4dVpC0Qi3sskx35dNf9FmFmJoF8T9EU7WwFcSwKePiAN5fO38PyPz4gnGJMHiLoLNAcfxGwQiM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730806;
c=relaxed/simple; bh=ApEjdjytrgizlDO/HJPlE1/U020o/eLJy1AJRVwvHi4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IbGyqFuzFiDm28pJ8ztvYFsJMnkp+/gp8lvlyFpSoV0eYVqcho2y+/6O2gAspVyvbunvol6T1BKDxXS1Gu0yspTCFODKPX2Armw/Dq/GVFlUF/uP7yw17AeH3m9yKd/Sc5C2dguLjmgHKvf5n7QixNG7v+Y0TWcOaO2GcsES06E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=GpZWu2DG; arc=none smtp.client-ip=91.218.175.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="GpZWu2DG" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730802; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5u+sGrpirO29d1KihT1tXkrq4oUxzHcwoOXUn/c4rxU=; b=GpZWu2DG35VYXPGZhdFuTR+e+p9ZnkPyl3xCS2z4XczcVBbXrTjmdDatecs8Z3InFGsO6m b/G9QZFGWTsRdvp8vgZQ7i9qVB+FhSS2T9ZuxyYQWMcowKCC84xKlbPgOFphFlxWb7A2d/ Nn+F5vsLhhPGaCYx6WQdqaUTaFypiVU= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 7/9] memcg: use __mod_memcg_state in drain_obj_stock Date: Thu, 3 Apr 2025 18:39:11 -0700 Message-ID: <20250404013913.1663035-8-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> 
References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" For non-PREEMPT_RT kernels, drain_obj_stock() is always called with irq disabled, so we can use __mod_memcg_state() instead of mod_memcg_state(). For PREEMPT_RT, we need to add memcg_stats_[un]lock in __mod_memcg_state(). Reviewed-by: Sebastian Andrzej Siewior Reviewed-by: Roman Gushchin Acked-by: Vlastimil Babka Signed-off-by: Shakeel Butt --- mm/memcontrol.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7988a42b29bf..33aeddfff0ba 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -710,10 +710,12 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum= memcg_stat_item idx, if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, id= x)) return; =20 + memcg_stats_lock(); __this_cpu_add(memcg->vmstats_percpu->state[i], val); val =3D memcg_state_val_in_pages(idx, val); memcg_rstat_updated(memcg, val); trace_mod_memcg_state(memcg, idx, val); + memcg_stats_unlock(); } =20 #ifdef CONFIG_MEMCG_V1 @@ -2866,7 +2868,7 @@ static void drain_obj_stock(struct memcg_stock_pcp *s= tock) =20 memcg =3D get_mem_cgroup_from_objcg(old); =20 - mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); + __mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); memcg1_account_kmem(memcg, -nr_pages); if (!mem_cgroup_is_root(memcg)) memcg_uncharge(memcg, nr_pages); --=20 2.47.1 From nobody Wed Dec 17 17:25:09 2025 Received: from out-176.mta1.migadu.com (out-176.mta1.migadu.com [95.215.58.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0242E1C68F for ; Fri, 4 Apr 2025 01:40:12 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730817; cv=none; b=KuanHcjdZ0yPqdrONgGLWWsS+XYG4hgU9kIRpurVxs/bTg5z9yKyOVGYr3eYKXNji9TkGfdo+jflZbR/1ehdklafO948qvyqHkh6nia1s3cp0AaYilfxuM4qDIQDdHyM1RAw3ue+5H+SvdKOIq2N0bRbZ99uIzrxXyAx3dEaw+w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730817; c=relaxed/simple; bh=64ozTxNy9/NL2y4BP7GPrE/FG/Mi0eKug5/IuaJS+oA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Yd1PgTzZhr1Qm3XUL+/z92+CRW05i7KxnvDdr9vHUE2L8fqUxZS/u7Y7ywMZDh5s0slPVHHqT4f5Qz3NGOhYxqkbMPE5MZ11HpwlM4OKTAz0+u1vRlH7bjfnB7yWzs2ZAq+j0JWpO8Ds5WRFHI6YKgrCnyJEYJifnASV+YHDCX4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=KPfX6ZHI; arc=none smtp.client-ip=95.215.58.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="KPfX6ZHI" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730811; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LinbyHoyL8uDgkz9AbaMIU/FLYmhkY77RaZleKlI73I=; b=KPfX6ZHIo7QLJYz3cLSaUbA9qUfdz6+B466oIXkRVPsjNMHLcENHjHfJsbZm+4sbWA5td8 1lLF2cpN8C2iPDJDTYMBpMF5OXvKXUAyN91ExOYDInH31LiE00A4zfzwBlcq1lPGB7Uyte VvydzWMFlqbGHQkrfNiqQ6U8NBAcNrc= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 8/9] memcg: combine slab obj stock charging and accounting Date: Thu, 3 Apr 2025 18:39:12 -0700 Message-ID: <20250404013913.1663035-9-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: <20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" From: Vlastimil Babka When handling slab objects, we use obj_cgroup_[un]charge() for (un)charging and mod_objcg_state() to account NR_SLAB_[UN]RECLAIMABLE_B. All these operations use the percpu stock for performance. However, with the calls being separate, the stock_lock is taken twice in each case. By refactoring the code, we can turn mod_objcg_state() into __account_obj_stock(), which is called on a stock that's already locked and validated. On the charging side we can call this function from consume_obj_stock() when it succeeds, and from refill_obj_stock() in the fallback. We just expand parameters of these functions as necessary.
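As a rough illustration of the locking change described above, here is a single-threaded user-space sketch. It is not the kernel code in this patch: struct stock_model, stock_lock()/stock_unlock(), account_locked() and the two charge helpers are hypothetical stand-ins, and the "lock" is only a flag plus a counter so that the number of acquisitions per charged object can be compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU stock; not the kernel's struct. */
struct stock_model {
	bool locked;              /* models stock_lock being held */
	unsigned long lock_count; /* number of lock acquisitions */
	unsigned int nr_bytes;    /* pre-charged bytes left in the stock */
	long acct_bytes;          /* bytes accounted to the slab counter */
};

void stock_lock(struct stock_model *s)
{
	assert(!s->locked);
	s->locked = true;
	s->lock_count++;
}

void stock_unlock(struct stock_model *s)
{
	assert(s->locked);
	s->locked = false;
}

/* Plays the role of __account_obj_stock(): caller already holds the lock. */
void account_locked(struct stock_model *s, long nr)
{
	assert(s->locked);
	s->acct_bytes += nr;
}

/* Old shape: charge and account are separate calls, each taking the lock. */
bool charge_then_account(struct stock_model *s, unsigned int size)
{
	bool ok = false;

	stock_lock(s);
	if (s->nr_bytes >= size) {
		s->nr_bytes -= size;
		ok = true;
	}
	stock_unlock(s);
	if (!ok)
		return false;

	stock_lock(s);
	account_locked(s, size);
	stock_unlock(s);
	return true;
}

/* New shape: accounting happens inside the charge's critical section. */
bool charge_and_account(struct stock_model *s, unsigned int size, bool account)
{
	bool ok = false;

	stock_lock(s);
	if (s->nr_bytes >= size) {
		s->nr_bytes -= size;
		ok = true;
		if (account)
			account_locked(s, size);
	}
	stock_unlock(s);
	return ok;
}
```

For one charged object, the first helper takes the model lock twice and the second takes it once, while both leave the stock and the accounted bytes in the same state. The `account` flag is an assumption of this sketch that loosely mirrors passing a NULL pgdat to skip the accounting step.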
The uncharge side from __memcg_slab_free_hook() is just the call to refill_obj_stock(). Other callers of obj_cgroup_[un]charge() (i.e. not slab) simply pass the extra parameters as NULL/zeroes to skip the __account_obj_stock() operation. In __memcg_slab_post_alloc_hook() we now charge each object separately, but that's not a problem as we did call mod_objcg_state() for each object separately, and most allocations are non-bulk anyway. This could be improved by batching all operations until slab_pgdat(slab) changes. Some preliminary benchmarking with a kfree(kmalloc()) loop of 10M iterations with/without __GFP_ACCOUNT: Before the patch: kmalloc/kfree !memcg: 581390144 cycles kmalloc/kfree memcg: 783689984 cycles After the patch: kmalloc/kfree memcg: 658723808 cycles More than half of the overhead of __GFP_ACCOUNT relative to non-accounted case seems eliminated. Signed-off-by: Vlastimil Babka Reviewed-by: Roman Gushchin Signed-off-by: Shakeel Butt --- mm/memcontrol.c | 77 +++++++++++++++++++++++++++++-------------------- 1 file changed, 46 insertions(+), 31 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 33aeddfff0ba..3bb02f672e39 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2774,25 +2774,17 @@ static void replace_stock_objcg(struct memcg_stock_= pcp *stock, WRITE_ONCE(stock->cached_objcg, objcg); } =20 -static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *= pgdat, - enum node_stat_item idx, int nr) +static void __account_obj_stock(struct obj_cgroup *objcg, + struct memcg_stock_pcp *stock, int nr, + struct pglist_data *pgdat, enum node_stat_item idx) { - struct memcg_stock_pcp *stock; - unsigned long flags; int *bytes; =20 - local_lock_irqsave(&memcg_stock.stock_lock, flags); - stock =3D this_cpu_ptr(&memcg_stock); - /* * Save vmstat data in stock and skip vmstat array update unless - * accumulating over a page of vmstat data or when pgdat or idx - * changes. 
+ * accumulating over a page of vmstat data or when pgdat changes. */ - if (READ_ONCE(stock->cached_objcg) !=3D objcg) { - replace_stock_objcg(stock, objcg); - stock->cached_pgdat =3D pgdat; - } else if (stock->cached_pgdat !=3D pgdat) { + if (stock->cached_pgdat !=3D pgdat) { /* Flush the existing cached vmstat data */ struct pglist_data *oldpg =3D stock->cached_pgdat; =20 @@ -2829,11 +2821,10 @@ static void mod_objcg_state(struct obj_cgroup *objc= g, struct pglist_data *pgdat, } if (nr) __mod_objcg_mlstate(objcg, pgdat, idx, nr); - - local_unlock_irqrestore(&memcg_stock.stock_lock, flags); } =20 -static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_by= tes) +static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_by= tes, + struct pglist_data *pgdat, enum node_stat_item idx) { struct memcg_stock_pcp *stock; unsigned long flags; @@ -2845,6 +2836,9 @@ static bool consume_obj_stock(struct obj_cgroup *objc= g, unsigned int nr_bytes) if (objcg =3D=3D READ_ONCE(stock->cached_objcg) && stock->nr_bytes >=3D n= r_bytes) { stock->nr_bytes -=3D nr_bytes; ret =3D true; + + if (pgdat) + __account_obj_stock(objcg, stock, nr_bytes, pgdat, idx); } =20 local_unlock_irqrestore(&memcg_stock.stock_lock, flags); @@ -2929,7 +2923,8 @@ static bool obj_stock_flush_required(struct memcg_sto= ck_pcp *stock, } =20 static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_byt= es, - bool allow_uncharge) + bool allow_uncharge, int nr_acct, struct pglist_data *pgdat, + enum node_stat_item idx) { struct memcg_stock_pcp *stock; unsigned long flags; @@ -2944,6 +2939,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg= , unsigned int nr_bytes, } stock->nr_bytes +=3D nr_bytes; =20 + if (pgdat) + __account_obj_stock(objcg, stock, nr_acct, pgdat, idx); + if (allow_uncharge && (stock->nr_bytes > PAGE_SIZE)) { nr_pages =3D stock->nr_bytes >> PAGE_SHIFT; stock->nr_bytes &=3D (PAGE_SIZE - 1); @@ -2955,12 +2953,13 @@ static void 
refill_obj_stock(struct obj_cgroup *obj= cg, unsigned int nr_bytes, obj_cgroup_uncharge_pages(objcg, nr_pages); } =20 -int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size) +static int obj_cgroup_charge_account(struct obj_cgroup *objcg, gfp_t gfp, = size_t size, + struct pglist_data *pgdat, enum node_stat_item idx) { unsigned int nr_pages, nr_bytes; int ret; =20 - if (consume_obj_stock(objcg, size)) + if (likely(consume_obj_stock(objcg, size, pgdat, idx))) return 0; =20 /* @@ -2993,15 +2992,21 @@ int obj_cgroup_charge(struct obj_cgroup *objcg, gfp= _t gfp, size_t size) nr_pages +=3D 1; =20 ret =3D obj_cgroup_charge_pages(objcg, gfp, nr_pages); - if (!ret && nr_bytes) - refill_obj_stock(objcg, PAGE_SIZE - nr_bytes, false); + if (!ret && (nr_bytes || pgdat)) + refill_obj_stock(objcg, nr_bytes ? PAGE_SIZE - nr_bytes : 0, + false, size, pgdat, idx); =20 return ret; } =20 +int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size) +{ + return obj_cgroup_charge_account(objcg, gfp, size, NULL, 0); +} + void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size) { - refill_obj_stock(objcg, size, true); + refill_obj_stock(objcg, size, true, 0, NULL, 0); } =20 static inline size_t obj_full_size(struct kmem_cache *s) @@ -3053,23 +3058,32 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache= *s, struct list_lru *lru, return false; } =20 - if (obj_cgroup_charge(objcg, flags, size * obj_full_size(s))) - return false; - for (i =3D 0; i < size; i++) { slab =3D virt_to_slab(p[i]); =20 if (!slab_obj_exts(slab) && alloc_slab_obj_exts(slab, s, flags, false)) { - obj_cgroup_uncharge(objcg, obj_full_size(s)); continue; } =20 + /* + * if we fail and size is 1, memcg_alloc_abort_single() will + * just free the object, which is ok as we have not assigned + * objcg to its obj_ext yet + * + * for larger sizes, kmem_cache_free_bulk() will uncharge + * any objects that were already charged and obj_ext assigned + * + * TODO: we could batch this until 
slab_pgdat(slab) changes + * between iterations, with a more complicated undo + */ + if (obj_cgroup_charge_account(objcg, flags, obj_full_size(s), + slab_pgdat(slab), cache_vmstat_idx(s))) + return false; + off =3D obj_to_index(s, slab, p[i]); obj_cgroup_get(objcg); slab_obj_exts(slab)[off].objcg =3D objcg; - mod_objcg_state(objcg, slab_pgdat(slab), - cache_vmstat_idx(s), obj_full_size(s)); } =20 return true; @@ -3078,6 +3092,8 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache *= s, struct list_lru *lru, void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p, int objects, struct slabobj_ext *obj_exts) { + size_t obj_size =3D obj_full_size(s); + for (int i =3D 0; i < objects; i++) { struct obj_cgroup *objcg; unsigned int off; @@ -3088,9 +3104,8 @@ void __memcg_slab_free_hook(struct kmem_cache *s, str= uct slab *slab, continue; =20 obj_exts[off].objcg =3D NULL; - obj_cgroup_uncharge(objcg, obj_full_size(s)); - mod_objcg_state(objcg, slab_pgdat(slab), cache_vmstat_idx(s), - -obj_full_size(s)); + refill_obj_stock(objcg, obj_size, true, -obj_size, + slab_pgdat(slab), cache_vmstat_idx(s)); obj_cgroup_put(objcg); } } --=20 2.47.1 From nobody Wed Dec 17 17:25:09 2025 Received: from out-180.mta0.migadu.com (out-180.mta0.migadu.com [91.218.175.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 173C65789D for ; Fri, 4 Apr 2025 01:40:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730819; cv=none; b=R7e6txoM/8uxLtb99nwf0snmRt7bi/FxDYPH4DaZEONyeOoA301Ou4wr/lZ9p9xqyzNFpb4x3ALtLWuQP2hl02ijKmb88q9QJLHbWxJIs+j0FU5xMANRDUoxK9dpXWxX5JjPFIKmloG30ZcWT8IU2YsnO7NBrywvPuqVzbhlyE0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1743730819; c=relaxed/simple; 
bh=06gXiVx5lKSFFQekv0CghOGVFlT7QaHEhE3pSVzC+e0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=i8uKv+Ql4cDYK0BHcKHN8CQCPWYgSSDydhNZGTgS0xUUsuWvRFJnHHkvgfvXkqb2rUjX39bvs7C/A/WLm7WXE2jpsqGyVnk8HgC7VyE75XcQIftS48jbMfF25GTSFi7nL8IOLdyygqjLsw9cIELKZYdxP3tpLTjKHesJgFBb2rE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=YVIdQTwo; arc=none smtp.client-ip=91.218.175.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="YVIdQTwo" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1743730814; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BaL+YTe2+GRQTxvSOsA8F0slQZP0gsLtQVxtzdV+oik=; b=YVIdQTwo2vNEeKkxMFLZpTiRY9Tw49WcPNXtYb9sRqKpBiDZDA6NoG3Z2N2wJFJSP6T989 6fGtuqPuZcGBUzJ/81KT5TR2UWOQVfZDwQjl3oesbA1Zu4hGcO/aXcHyoqQ+2nfVWlgWya tDhVUf4+urqwi6rscCd66ZwobItacxM= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 9/9] memcg: manually inline replace_stock_objcg Date: Thu, 3 Apr 2025 18:39:13 -0700 Message-ID: <20250404013913.1663035-10-shakeel.butt@linux.dev> In-Reply-To: <20250404013913.1663035-1-shakeel.butt@linux.dev> References: 
<20250404013913.1663035-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" replace_stock_objcg() is called only by refill_obj_stock(), so manually inline it. Reviewed-by: Roman Gushchin Acked-by: Vlastimil Babka Signed-off-by: Shakeel Butt --- mm/memcontrol.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 3bb02f672e39..aebb1f2c8657 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2763,17 +2763,6 @@ void __memcg_kmem_uncharge_page(struct page *page, i= nt order) obj_cgroup_put(objcg); } =20 -/* Replace the stock objcg with objcg, return the old objcg */ -static void replace_stock_objcg(struct memcg_stock_pcp *stock, - struct obj_cgroup *objcg) -{ - drain_obj_stock(stock); - obj_cgroup_get(objcg); - stock->nr_bytes =3D atomic_read(&objcg->nr_charged_bytes) - ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0; - WRITE_ONCE(stock->cached_objcg, objcg); -} - static void __account_obj_stock(struct obj_cgroup *objcg, struct memcg_stock_pcp *stock, int nr, struct pglist_data *pgdat, enum node_stat_item idx) @@ -2934,7 +2923,12 @@ static void refill_obj_stock(struct obj_cgroup *objc= g, unsigned int nr_bytes, =20 stock =3D this_cpu_ptr(&memcg_stock); if (READ_ONCE(stock->cached_objcg) !=3D objcg) { /* reset if necessary */ - replace_stock_objcg(stock, objcg); + drain_obj_stock(stock); + obj_cgroup_get(objcg); + stock->nr_bytes =3D atomic_read(&objcg->nr_charged_bytes) + ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0; + WRITE_ONCE(stock->cached_objcg, objcg); + allow_uncharge =3D true; /* Allow uncharge when objcg changes */ } stock->nr_bytes +=3D nr_bytes; --=20 2.47.1