From nobody Wed Dec 17 17:23:08 2025 Received: from out-189.mta1.migadu.com (out-189.mta1.migadu.com [95.215.58.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5D8C191F60 for ; Sat, 15 Mar 2025 17:49:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060987; cv=none; b=n9IwW6MSota52JIqXp1Vjpc7GetI+5f7bmglZvPjT4foxqODmptEHNX/D+BmpYblFMuOQDCLRzCDQQ/XN4zmd7/SniUPNEIv9Yh8cp8cLi/pa32u5PydiBsb38nuCEE45JAtlwONMpcg7WQTR6JGh4qGQZZARv/F3tkP6VXeGA0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060987; c=relaxed/simple; bh=rR4VRO/qCn9tibn4IiHZRnKTQxDzvVTiAx2JBG5cZJY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MxX2gQMUJRBx2gQWP7v9WehCFgLXjSpYfWXy036Ibt1rrIog1R1m/RahVVbkBRqVxb7Xz3SZxsip6i95UUFXRSizdgJP1J3VP9m+4xhlaRRwfZ4GSutiXIEQxnYDHgO996Gfv4eWnADGM+Ru5gv/XQLegwzGXtH1+UDQncI1tQM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=LVtOFiDf; arc=none smtp.client-ip=95.215.58.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="LVtOFiDf" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742060984; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cmS5hv6xwjGOhhAkz0IaR50zSoZuuoSBRO/S5eqp4mA=; b=LVtOFiDfJTfbraclra86x+fCaQarFRpDuvgbtcbJPyfEGkmeTvyYet4jd9Nwe7MYVclJnn uuJLQQVD62W9ZBqiSKhDpcCrLRrD//QYT9PyfmzobpzgN56PeZJewgqDV+tc22Dfa0cBft 3TEADM+PDuzpeOAVrq4zauWky81Tgxs= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 1/9] memcg: remove root memcg check from refill_stock Date: Sat, 15 Mar 2025 10:49:22 -0700 Message-ID: <20250315174930.1769599-2-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" refill_stock() cannot be called with the root memcg, so there is no need to check for it. Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b29433eb17fa..c09a32e93d39 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1883,6 +1883,7 @@ static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) drain_stock(stock); } +/* Should never be called with root_mem_cgroup.
*/ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) { unsigned long flags; @@ -1892,8 +1893,6 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) * In case of unlikely failure to lock percpu stock_lock * uncharge memcg directly. */ - if (mem_cgroup_is_root(memcg)) - return; page_counter_uncharge(&memcg->memory, nr_pages); if (do_memsw_account()) page_counter_uncharge(&memcg->memsw, nr_pages); -- 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-184.mta0.migadu.com (out-184.mta0.migadu.com [91.218.175.184]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 303DA2054E3 for ; Sat, 15 Mar 2025 17:49:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.184 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060991; cv=none; b=Do/qfy73U5VO5UsMQ5wQcbSkOV5jLzSyTiIvpnHSU6S/hFBiIKND5qd/v9KUkPT0/Rj7Uap/PHWfivpR5pUZEohq5cw9Mt4yIw6eDtTaqq5Pn4dpQgt/783ZOSSeHgSDtn2ba4eic+ux+vJg/50rDPn/LAGA+J9pp4Osf+kDxVY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060991; c=relaxed/simple; bh=7GVz3MJvICmIWhsoo81qI+uaCmHIr1hzLPA6oLWv2RY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=W+4Dme6vTeIPhQZ52ehy363accltNSVxIfdudxgZTQ8SRydGDUgG+OpUJIAKV40SysUDrG4N6brKC82r+P1gdoEgK2ByZtZK4pE5OXXBzSltlXQpCjC8eOO48cXNGSRoGnYczd4NJG3I6+euYgEzLMagXRlnUOsCw0i0gp/DbrA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=Ejy9vqLu; arc=none smtp.client-ip=91.218.175.184 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass
smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="Ejy9vqLu" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742060986; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oUGjl5IUOeohrue8CflY3qReHtC2T034o0a04sP/4ac=; b=Ejy9vqLuhMtkzwYa4BWZ2IiChp1s5sstyd4OAgBMSkax1T0pVT20KV+VkvWvY5R0mCrrCA CWNQcrKeNlCZoz9SnQEPbL3cjNDbBmqiM/2TtzFlraiYrRnTqDXLQBiBdgjMSa88A7UtW7 qGO2/7k9JcdwLvf0v1aCNfGdPeArjgs= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 2/9] memcg: decouple drain_obj_stock from local stock Date: Sat, 15 Mar 2025 10:49:23 -0700 Message-ID: <20250315174930.1769599-3-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Currently drain_obj_stock() can potentially call __refill_stock which accesses local cpu stock and thus requires memcg stock's local_lock. However if we look at the code paths leading to drain_obj_stock(), there is never a good reason to refill the memcg stock at all from it. At the moment, drain_obj_stock can be called from reclaim, hotplug cpu teardown, mod_objcg_state() and refill_obj_stock(). For reclaim and hotplug there is no need to refill. 
For the other two paths, the newly switched objcg will most likely be used in the near future, so there is no need to refill the stock with the older objcg. In addition, __refill_stock() is called from drain_obj_stock() only in rare cases, so performance is not really an issue. Let's just uncharge directly instead of refilling, which also decouples drain_obj_stock() from the local cpu stock and local_lock requirements. Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index c09a32e93d39..28cb75b5bc66 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2855,7 +2855,12 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock) mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); memcg1_account_kmem(memcg, -nr_pages); - __refill_stock(memcg, nr_pages); + if (!mem_cgroup_is_root(memcg)) { + page_counter_uncharge(&memcg->memory, nr_pages); + if (do_memsw_account()) + page_counter_uncharge(&memcg->memsw, + nr_pages); + } css_put(&memcg->css); } -- 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-183.mta0.migadu.com (out-183.mta0.migadu.com [91.218.175.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E62CA205516 for ; Sat, 15 Mar 2025 17:49:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.183 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060993; cv=none; b=Ih7aG/qgua/J0dPdWZfqZDoXkpS3IC6Ft5YUxcFxzBKSKue4peN4SVP0+JkL6yVzusERZxLWvbSRq9oxS4ijnkUnkyFx6l5StXKuYKkmz0h4hXgHIqNDNUNleTVjYjNxa0QUSlSqMtOc4WUTyh1Tqh11iZ40fY9GYHlmoPktVIs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060993; c=relaxed/simple; bh=0ftGgLW4/MlnZh9Fmz7q/GevfyJlJJyM4N09cbLOa4I=;
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sx7B8pETvAxUYggL2D3ne7uE0W4ci/N33A1TShfb/6DCjTf4Ipxq4Oc1uZcVAsqTiAv8GZtFTLaN20aJe2ynuxTgELIMGsyuId2+cjcBKNIbi1et0NEk0qSPSluYq1fTWIZYSvHUbmCCdbnKN5a72Fe4WgFDW2+9TMPluoUDefc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=EsH21tpo; arc=none smtp.client-ip=91.218.175.183 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="EsH21tpo" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742060990; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=D9SGS5ZGVTJYUs7bi8rBcO4GG6XSRL+h8cbbja/hnd4=; b=EsH21tpoGivxemQBYrsxMd+ru7bHcE8LYPnGChI+hOtYbRl8sO0Czs0Q7Br3PmMb9k9bpa gce6M48m5W+CSQy02N7oSlHbmeiv0YDYjFi2vfK6zdqUHBTGYUDd4M1RiDkREBX83mGRH4 YD4a7IZYjimlw7tBXcpiVnolqcmcIyI= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 3/9] memcg: introduce memcg_uncharge Date: Sat, 15 Mar 2025 10:49:24 -0700 Message-ID: <20250315174930.1769599-4-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" The memory and memsw page counters are uncharged together at multiple places in memcontrol.c. Repeating that open-coded sequence is error-prone. Let's move the functionality to a newly introduced memcg_uncharge() and call it from all those places. Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 28 ++++++++++++---------------- 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 28cb75b5bc66..b54e3a1d23bd 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1816,6 +1816,13 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages, return ret; } +static void memcg_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages) +{ + page_counter_uncharge(&memcg->memory, nr_pages); + if (do_memsw_account()) + page_counter_uncharge(&memcg->memsw, nr_pages); +} + /* * Returns stocks cached in percpu and reset cached information. */ @@ -1828,10 +1835,7 @@ static void drain_stock(struct memcg_stock_pcp *stock) return; if (stock_pages) { - page_counter_uncharge(&old->memory, stock_pages); - if (do_memsw_account()) - page_counter_uncharge(&old->memsw, stock_pages); - + memcg_uncharge(old, stock_pages); WRITE_ONCE(stock->nr_pages, 0); } @@ -1893,9 +1897,7 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) * In case of unlikely failure to lock percpu stock_lock * uncharge memcg directly.
*/ - page_counter_uncharge(&memcg->memory, nr_pages); - if (do_memsw_account()) - page_counter_uncharge(&memcg->memsw, nr_pages); + memcg_uncharge(memcg, nr_pages); return; } __refill_stock(memcg, nr_pages); @@ -2855,12 +2857,8 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock) mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); memcg1_account_kmem(memcg, -nr_pages); - if (!mem_cgroup_is_root(memcg)) { - page_counter_uncharge(&memcg->memory, nr_pages); - if (do_memsw_account()) - page_counter_uncharge(&memcg->memsw, - nr_pages); - } + if (!mem_cgroup_is_root(memcg)) + memcg_uncharge(memcg, nr_pages); css_put(&memcg->css); } @@ -4689,9 +4687,7 @@ static inline void uncharge_gather_clear(struct uncharge_gather *ug) static void uncharge_batch(const struct uncharge_gather *ug) { if (ug->nr_memory) { - page_counter_uncharge(&ug->memcg->memory, ug->nr_memory); - if (do_memsw_account()) - page_counter_uncharge(&ug->memcg->memsw, ug->nr_memory); + memcg_uncharge(ug->memcg, ug->nr_memory); if (ug->nr_kmem) { mod_memcg_state(ug->memcg, MEMCG_KMEM, -ug->nr_kmem); memcg1_account_kmem(ug->memcg, -ug->nr_kmem); -- 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-180.mta0.migadu.com (out-180.mta0.migadu.com [91.218.175.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 61140205E35 for ; Sat, 15 Mar 2025 17:49:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060998; cv=none; b=bB2/JI21JTzQ+os3YwpJujIAlfFEEnBsTX1Eh1qYRJhFFlX6FQp/7LhCn9GJ4NB/ivwpzUM8r+GjCZytoQjJmwum7GV7cc2kUeiL2as6lunoq7Gj5AaCg7dkyODhrohahrXYVV5IlX8l36WU7EgX8OKjD3PYR8bLwnRbmphKsI0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742060998; c=relaxed/simple;
bh=DBl+8SmuUNvZkuieHKG5QGJI7Ytycb/wKCyDKUFwDHY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UMqYTjTwSJxCpF4K9Nodm8v4CSG7ADy7CmQl575hS+X0+P9NJglwl8bBLSX7ilMHQ/RfdL8crk/yXYIOp4z/tfag1ZG7Aa/GhbD0WG35PZApJF3n4WXv+SW12TtPiNlP9fDP7HqhQBwlT26XgXzQzAHZsjHgYDwDFD0pvhXIIng= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=UCQl96Wa; arc=none smtp.client-ip=91.218.175.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="UCQl96Wa" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742060994; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Bc6pQ9m45TBV5JfKg0PdI6TDpBclbZtCDqtLw9XzNuU=; b=UCQl96Wa1/yEqZlafoPIygsX+2cWELzvlzoYrINa1G0wvMDEnFDPIjjN7q4Jw2b+9PzjIQ IdojMmemgziZOSXNmvSCv6u3VhSKT8RUTTHb5RfhZ0lRiieq8JAC75DNFjhgeKcfJA0DEC /WgjCitdjgMpbaZKjihNY1nTs+Dk+Tc= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 4/9] memcg: manually inline __refill_stock Date: Sat, 15 Mar 2025 10:49:25 -0700 Message-ID: <20250315174930.1769599-5-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: 
<20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" __refill_stock() now has only a single caller, so simply inline it into refill_stock(). Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 32 ++++++++++++-------------------- 1 file changed, 12 insertions(+), 20 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b54e3a1d23bd..7054b0ebd207 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1865,14 +1865,21 @@ static void drain_local_stock(struct work_struct *dummy) obj_cgroup_put(old); } -/* - * Cache charges(val) to local per_cpu area. - * This will be consumed by consume_stock() function, later. - */ -static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) +/* Should never be called with root_mem_cgroup. */ +static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) { struct memcg_stock_pcp *stock; unsigned int stock_pages; + unsigned long flags; + + if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) { + /* + * In case of unlikely failure to lock percpu stock_lock + * uncharge memcg directly. + */ + memcg_uncharge(memcg, nr_pages); + return; + } stock = this_cpu_ptr(&memcg_stock); if (READ_ONCE(stock->cached) != memcg) { /* reset if necessary */ @@ -1885,22 +1892,7 @@ static void __refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) if (stock_pages > MEMCG_CHARGE_BATCH) drain_stock(stock); -} -/* Should never be called with root_mem_cgroup.
*/ -static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages) -{ - unsigned long flags; - - if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) { - /* - * In case of unlikely failure to lock percpu stock_lock - * uncharge memcg directly. - */ - memcg_uncharge(memcg, nr_pages); - return; - } - __refill_stock(memcg, nr_pages); localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); } -- 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-179.mta0.migadu.com (out-179.mta0.migadu.com [91.218.175.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 46DAC2040A8 for ; Sat, 15 Mar 2025 17:49:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061001; cv=none; b=K1LpFFfxuymIMlq7RE8Xo3iVRfAzdkrP+xgck3InkbkKQ8/ilcQ7HGAfr41OKnmcbtEL4EWaGH44Z5QJOQydkeV2/9V9KSg9DdKtfla7eCT7Ny4tM/6fW4FNJXcC/euYPLHMddLjpNNSYHAZ4Uv/p5hK9jkO8jUKZfk6VMNHBTg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061001; c=relaxed/simple; bh=rnw3vswnDrsPjRWC5QnmP1Un7aiy5k4mZRxD7W7IODg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=oGW7NBOEugHhC7iSKOWikiydu9NgFay0+SJ+Lu/czNh24djg2XyjAzJKYZqqXUHcv4zaSZvZZYina+UiqhSGyuocrkLwJ327a+TcSswzsgQ2EYMfwXSmtvX4dn8sX4y/2Bi2upy5WKjpVF07uAatv24LKtK7cx0wd7uRj+cshWk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=e/E/nuyl; arc=none smtp.client-ip=91.218.175.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="e/E/nuyl" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742060997; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6mWjPqT680ajW18j1m0PlnxhJw6Nvx3pVB+w31Bj7so=; b=e/E/nuylkpGXEhqHZNEobS6G+bqcssykgUckBO75NlFgK87SEgIBgQphktZTteBqNnHTbd x6QCYzHUjbLCNKPRZeaMdVcSSLOj6sGoyZmm+1mDUtiEh8IruX1fojGN/FiLHsrwdIn2s5 f2gone2vSuXSW8gupsMmoVL5SC2w8ho= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 5/9] memcg: no refilling stock from obj_cgroup_release Date: Sat, 15 Mar 2025 10:49:26 -0700 Message-ID: <20250315174930.1769599-6-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" obj_cgroup_release is called when all the references to the objcg have been released i.e. no more memory objects are pointing to it. Most probably objcg->memcg will be pointing to some ancestor memcg. In obj_cgroup_release(), the kernel calls obj_cgroup_uncharge_pages() which refills the local stock. There is no need to refill the local stock with some ancestor memcg and flush the local stock. 
Let's decouple obj_cgroup_release() from the local stock by uncharging instead of refilling. One additional benefit of this change is that it removes the requirement to only call obj_cgroup_put() outside of local_lock. Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7054b0ebd207..83db180455a1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -129,8 +129,7 @@ bool mem_cgroup_kmem_disabled(void) return cgroup_memory_nokmem; } -static void obj_cgroup_uncharge_pages(struct obj_cgroup *objcg, - unsigned int nr_pages); +static void memcg_uncharge(struct mem_cgroup *memcg, unsigned int nr_pages); static void obj_cgroup_release(struct percpu_ref *ref) { @@ -163,8 +162,16 @@ static void obj_cgroup_release(struct percpu_ref *ref) WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1)); nr_pages = nr_bytes >> PAGE_SHIFT; - if (nr_pages) - obj_cgroup_uncharge_pages(objcg, nr_pages); + if (nr_pages) { + struct mem_cgroup *memcg; + + memcg = get_mem_cgroup_from_objcg(objcg); + mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); + memcg1_account_kmem(memcg, -nr_pages); + if (!mem_cgroup_is_root(memcg)) + memcg_uncharge(memcg, nr_pages); + css_put(&memcg->css); + } spin_lock_irqsave(&objcg_lock, flags); list_del(&objcg->list); -- 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-188.mta1.migadu.com (out-188.mta1.migadu.com [95.215.58.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3BD32205514 for ; Sat, 15 Mar 2025 17:50:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061014; cv=none;
b=JfYSerQotVXbJ+gUjMZsfpv4WWVibmXbu+HKU4ca9nG74KiruezuHYQEXtGe3lSAfbbnV5lQpweU2revd6Yv1TK/5YG9SRGT9QV7u6+9qewmVr3zEVkGTZBj+qjH7Ap3yj95m3H0mJXXDIdpQPVFzLejOdRC5Zvkde1FOxW0tyM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061014; c=relaxed/simple; bh=LloJnKkIR/QdFvsz1WTvP10su9GsOgJ3e0RXRBD9G0s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FSJI/H51pZxiCcwZOn3xfvGSv0lLgwmI4T24aFoYnpEruCpX5A5eL2ULCIfBnEQgniKTe7G6FNKcCHXFh8R4IhRFPFAz+5BrmCVZB+gC2XcMzC7hL630s5hRJZlgqtrOFeZzDx8/5ln3x4nTU6aOKeQsDwustDelnRZLb4mf9xM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=kZK6wnzZ; arc=none smtp.client-ip=95.215.58.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="kZK6wnzZ" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742061003; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tHvct7NUO2otKVtcPkikbwH9IxeqmMsC6tyfV+hMsCI=; b=kZK6wnzZBdB6KoWZYB2gbmXAmEglH6MpMFhomC5Kcno08qRkhhrz6/f/yhMsfRdCd8x3BY ZJHkxF4tenCzo+3OAxNHXlFxu7B6frqfODK2qZdld+O7uCsw0kDPPXntdcw72Ksl54uqxd 5fjRV2fVuE5qN1Q9kOdIaE6i/nHy4Gc= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 6/9] memcg: do obj_cgroup_put inside drain_obj_stock Date: Sat, 15 Mar 2025 10:49:27 -0700 Message-ID: <20250315174930.1769599-7-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Previously we could not call obj_cgroup_put() while holding the local lock because, when the last reference is put, the release function obj_cgroup_release() could try to re-acquire the local lock. That chain has now been broken, so simply do obj_cgroup_put() inside drain_obj_stock() instead of returning the old objcg.
Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 37 +++++++++++-------------------------- 1 file changed, 11 insertions(+), 26 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 83db180455a1..3c4de384b5a0 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -1778,7 +1778,7 @@ static DEFINE_PER_CPU(struct memcg_stock_pcp, memcg_stock) = { }; static DEFINE_MUTEX(percpu_charge_mutex); -static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock); +static void drain_obj_stock(struct memcg_stock_pcp *stock); static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, struct mem_cgroup *root_memcg); @@ -1853,7 +1853,6 @@ static void drain_stock(struct memcg_stock_pcp *stock) static void drain_local_stock(struct work_struct *dummy) { struct memcg_stock_pcp *stock; - struct obj_cgroup *old = NULL; unsigned long flags; /* @@ -1864,12 +1863,11 @@ static void drain_local_stock(struct work_struct *dummy) localtry_lock_irqsave(&memcg_stock.stock_lock, flags); stock = this_cpu_ptr(&memcg_stock); - old = drain_obj_stock(stock); + drain_obj_stock(stock); drain_stock(stock); clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags); localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); - obj_cgroup_put(old); } /* Should never be called with root_mem_cgroup.
*/ @@ -1951,18 +1949,16 @@ void drain_all_stock(struct mem_cgroup *root_memcg) static int memcg_hotplug_cpu_dead(unsigned int cpu) { struct memcg_stock_pcp *stock; - struct obj_cgroup *old; unsigned long flags; stock = &per_cpu(memcg_stock, cpu); /* drain_obj_stock requires stock_lock */ localtry_lock_irqsave(&memcg_stock.stock_lock, flags); - old = drain_obj_stock(stock); + drain_obj_stock(stock); localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); drain_stock(stock); - obj_cgroup_put(old); return 0; } @@ -2745,24 +2741,20 @@ void __memcg_kmem_uncharge_page(struct page *page, int order) } /* Replace the stock objcg with objcg, return the old objcg */ -static struct obj_cgroup *replace_stock_objcg(struct memcg_stock_pcp *stock, - struct obj_cgroup *objcg) +static void replace_stock_objcg(struct memcg_stock_pcp *stock, + struct obj_cgroup *objcg) { - struct obj_cgroup *old = NULL; - - old = drain_obj_stock(stock); + drain_obj_stock(stock); obj_cgroup_get(objcg); stock->nr_bytes = atomic_read(&objcg->nr_charged_bytes) ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0; WRITE_ONCE(stock->cached_objcg, objcg); - return old; } static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, enum node_stat_item idx, int nr) { struct memcg_stock_pcp *stock; - struct obj_cgroup *old = NULL; unsigned long flags; int *bytes; @@ -2775,7 +2767,7 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, * changes.
*/ if (READ_ONCE(stock->cached_objcg) != objcg) { - old = replace_stock_objcg(stock, objcg); + replace_stock_objcg(stock, objcg); stock->cached_pgdat = pgdat; } else if (stock->cached_pgdat != pgdat) { /* Flush the existing cached vmstat data */ @@ -2816,7 +2808,6 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat, __mod_objcg_mlstate(objcg, pgdat, idx, nr); localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); - obj_cgroup_put(old); } static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) @@ -2838,12 +2829,12 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes) return ret; } -static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock) +static void drain_obj_stock(struct memcg_stock_pcp *stock) { struct obj_cgroup *old = READ_ONCE(stock->cached_objcg); if (!old) - return NULL; + return; if (stock->nr_bytes) { unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT; @@ -2896,11 +2887,7 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock) } WRITE_ONCE(stock->cached_objcg, NULL); - /* - * The `old' objects needs to be released by the caller via - * obj_cgroup_put() outside of memcg_stock_pcp::stock_lock.
- */ - return old; + obj_cgroup_put(old); } static bool obj_stock_flush_required(struct memcg_stock_pcp *stock, @@ -2922,7 +2909,6 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes, bool allow_uncharge) { struct memcg_stock_pcp *stock; - struct obj_cgroup *old = NULL; unsigned long flags; unsigned int nr_pages = 0; @@ -2930,7 +2916,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes, stock = this_cpu_ptr(&memcg_stock); if (READ_ONCE(stock->cached_objcg) != objcg) { /* reset if necessary */ - old = replace_stock_objcg(stock, objcg); + replace_stock_objcg(stock, objcg); allow_uncharge = true; /* Allow uncharge when objcg changes */ } stock->nr_bytes += nr_bytes; @@ -2941,7 +2927,6 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes, } localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); - obj_cgroup_put(old); if (nr_pages) obj_cgroup_uncharge_pages(objcg, nr_pages); -- 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17925205517 for ; Sat, 15 Mar 2025 17:50:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061011; cv=none; b=NkCa9/FfbAHHbmoRb+fdui8YpVH4gth5NoGrkGFDApNHO50iV9Zqj5EloN8+wBqVNhEXgBhj8y4d0vdohEiYVD/xjTWLZ5S5VRH2ZkcUE4FxWMXY0zwWGwLPCAumoJ5m4ciNaegYd+riJvjrs0qU8gjpKol0BZikO42duO9noJE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061011; c=relaxed/simple; bh=S6vcgpernq7BZCM9jFiY2z0KoP4egAYn+qrdgf+x40k=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=AF7lB3GZbxni/r+85LvyiOviMQ2IprXFuOvayet/uyJ6g1/e8xoANVnnv8XVTe/WPiINRXDYKYY5tp8qLtECB/oHNClDHYGvT+6qo0LUJbFyBDqhvJ8CGjIWv/qBvlUesisrOf6d7hDaJm7dUpyoOdvc9UwdFeFj2VDtnfcTYW4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=YP6Oq/yv; arc=none smtp.client-ip=91.218.175.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="YP6Oq/yv" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742061007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sgD6RlJd90bDD1Zbd+KYFHxVJHSovRtS79ksTm6yv/M=; b=YP6Oq/yvNHnoJl3NkSZbo35do6cddRVpTfrN2VvwcssF29KZdMDWRXmLsbJ2Qk2XdDfhR7 d+GAaMirhcUzjsJW5CNYi9dJnKGqdyXAP23/6ynvJ0i7VDeFib1ziRJApLhc7gYJQcJN+k orO43WdiKhpvPg1EWV48jRPfM/a8IVk= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 7/9] memcg: use __mod_memcg_state in drain_obj_stock Date: Sat, 15 Mar 2025 10:49:28 -0700 Message-ID: <20250315174930.1769599-8-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: 
List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" For non-PREEMPT_RT kernels, drain_obj_stock() is always called with irq disabled, so we can use __mod_memcg_state() instead of mod_memcg_state(). For PREEMPT_RT, we need to add memcg_stats_[un]lock in __mod_memcg_state(). Signed-off-by: Shakeel Butt Reviewed-by: Sebastian Andrzej Siewior Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 3c4de384b5a0..dfe9c2eb7816 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -707,10 +707,12 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum= memcg_stat_item idx, if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, id= x)) return; =20 + memcg_stats_lock(); __this_cpu_add(memcg->vmstats_percpu->state[i], val); val =3D memcg_state_val_in_pages(idx, val); memcg_rstat_updated(memcg, val); trace_mod_memcg_state(memcg, idx, val); + memcg_stats_unlock(); } =20 #ifdef CONFIG_MEMCG_V1 @@ -2845,7 +2847,7 @@ static void drain_obj_stock(struct memcg_stock_pcp *s= tock) =20 memcg =3D get_mem_cgroup_from_objcg(old); =20 - mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); + __mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); memcg1_account_kmem(memcg, -nr_pages); if (!mem_cgroup_is_root(memcg)) memcg_uncharge(memcg, nr_pages); --=20 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-183.mta0.migadu.com (out-183.mta0.migadu.com [91.218.175.183]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 26EF6205AA8 for ; Sat, 15 Mar 2025 17:50:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.183 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1742061014; cv=none; b=JkHaoZYOofidtQeNt8xw/mx9fEkGluqudqtG+503MvK/2iOA9/uvvZUi6DGe+iAWQtd4IC1oCN3T3Gtfpy3KWTs8kbvQlhDsk3OQPT+3/9eXGPz3lbGg86hZfQgmFUZPG1CjgyzMosGplJrlMxN6hI6ykLovsb1GlWld7RTHpa4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061014; c=relaxed/simple; bh=Cv3THES77lUgLh8esbgj1YwwbLYIxyafSiP9r0Y5/+Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dNUFfOGIT2PXTzC9qCUUPFzF+hiCpawm9fK+nz4BYC05oB19aP457eMc0n5JvYIb5zjg/T9Xz+Ju93BFsErvyij6hUbqIfsmYUqRkq7UxxYLLFPV0PuAJ/LUgMbK52gYqgT+zWENIBiYojfE+kZ6qSQgDmMDCvtyRLt7xf8U/Dw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=eqqFYRay; arc=none smtp.client-ip=91.218.175.183 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="eqqFYRay" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742061010; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o4xUih43AY3K2la6xaLoSlR0RMvoWQOKhCt2j9dxoEk=; b=eqqFYRay+fQyoV6n5jswA6nCcb5FmqTbDPLFlZ2zp0kXlAMkjJwG98vDbE3BTw/x9vWnlN AIIUxGky5BNUCIqwWFkVJ8TuF2dvZuQgv+zD3kE9FReetGUYiptNLv9fON3ko4pdWauiGw 9o7QOfY/CwY1Uuiipxm1wber65uJISc= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 8/9] memcg: combine slab obj stock charging and accounting Date: Sat, 15 Mar 2025 10:49:29 -0700 Message-ID: <20250315174930.1769599-9-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: <20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" From: Vlastimil Babka When handling slab objects, we use obj_cgroup_[un]charge() for (un)charging and mod_objcg_state() to account NR_SLAB_[UN]RECLAIMABLE_B. All these operations use the percpu stock for performance. However, with the calls being separate, the stock_lock is taken twice in each case. By refactoring the code, we can turn mod_objcg_state() into __account_obj_stock(), which is called on a stock that's already locked and validated. On the charging side we can call this function from consume_obj_stock() when it succeeds, and refill_obj_stock() in the fallback. We just expand parameters of these functions as necessary.
The uncharge side from __memcg_slab_free_hook() is just the call to refill_obj_stock(). Other callers of obj_cgroup_[un]charge() (i.e. not slab) simply pass the extra parameters as NULL/zeroes to skip the __account_obj_stock() operation.

In __memcg_slab_post_alloc_hook() we now charge each object separately, but that's not a problem as we did call mod_objcg_state() for each object separately, and most allocations are non-bulk anyway. This could be improved by batching all operations until slab_pgdat(slab) changes.

Some preliminary benchmarking with a kfree(kmalloc()) loop of 10M iterations with/without __GFP_ACCOUNT:

Before the patch:
kmalloc/kfree !memcg: 581390144 cycles
kmalloc/kfree memcg: 783689984 cycles

After the patch:
kmalloc/kfree memcg: 658723808 cycles

More than half of the overhead of __GFP_ACCOUNT relative to non-accounted case seems eliminated.

Signed-off-by: Vlastimil Babka Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 77 +++++++++++++++++++++++++++++-------------------- 1 file changed, 46 insertions(+), 31 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index dfe9c2eb7816..553eb1d7250a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2753,25 +2753,17 @@ static void replace_stock_objcg(struct memcg_stock_= pcp *stock, WRITE_ONCE(stock->cached_objcg, objcg); } =20 -static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *= pgdat, - enum node_stat_item idx, int nr) +static void __account_obj_stock(struct obj_cgroup *objcg, + struct memcg_stock_pcp *stock, int nr, + struct pglist_data *pgdat, enum node_stat_item idx) { - struct memcg_stock_pcp *stock; - unsigned long flags; int *bytes; =20 - localtry_lock_irqsave(&memcg_stock.stock_lock, flags); - stock =3D this_cpu_ptr(&memcg_stock); - /* * Save vmstat data in stock and skip vmstat array update unless - * accumulating over a page of vmstat data or when pgdat or idx - * changes.
+ * accumulating over a page of vmstat data or when pgdat changes. */ - if (READ_ONCE(stock->cached_objcg) !=3D objcg) { - replace_stock_objcg(stock, objcg); - stock->cached_pgdat =3D pgdat; - } else if (stock->cached_pgdat !=3D pgdat) { + if (stock->cached_pgdat !=3D pgdat) { /* Flush the existing cached vmstat data */ struct pglist_data *oldpg =3D stock->cached_pgdat; =20 @@ -2808,11 +2800,10 @@ static void mod_objcg_state(struct obj_cgroup *objc= g, struct pglist_data *pgdat, } if (nr) __mod_objcg_mlstate(objcg, pgdat, idx, nr); - - localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); } =20 -static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_by= tes) +static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_by= tes, + struct pglist_data *pgdat, enum node_stat_item idx) { struct memcg_stock_pcp *stock; unsigned long flags; @@ -2824,6 +2815,9 @@ static bool consume_obj_stock(struct obj_cgroup *objc= g, unsigned int nr_bytes) if (objcg =3D=3D READ_ONCE(stock->cached_objcg) && stock->nr_bytes >=3D n= r_bytes) { stock->nr_bytes -=3D nr_bytes; ret =3D true; + + if (pgdat) + __account_obj_stock(objcg, stock, nr_bytes, pgdat, idx); } =20 localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags); @@ -2908,7 +2902,8 @@ static bool obj_stock_flush_required(struct memcg_sto= ck_pcp *stock, } =20 static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_byt= es, - bool allow_uncharge) + bool allow_uncharge, int nr_acct, struct pglist_data *pgdat, + enum node_stat_item idx) { struct memcg_stock_pcp *stock; unsigned long flags; @@ -2923,6 +2918,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg= , unsigned int nr_bytes, } stock->nr_bytes +=3D nr_bytes; =20 + if (pgdat) + __account_obj_stock(objcg, stock, nr_acct, pgdat, idx); + if (allow_uncharge && (stock->nr_bytes > PAGE_SIZE)) { nr_pages =3D stock->nr_bytes >> PAGE_SHIFT; stock->nr_bytes &=3D (PAGE_SIZE - 1); @@ -2934,12 +2932,13 @@ static void 
refill_obj_stock(struct obj_cgroup *obj= cg, unsigned int nr_bytes, obj_cgroup_uncharge_pages(objcg, nr_pages); } =20 -int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size) +static int obj_cgroup_charge_account(struct obj_cgroup *objcg, gfp_t gfp, = size_t size, + struct pglist_data *pgdat, enum node_stat_item idx) { unsigned int nr_pages, nr_bytes; int ret; =20 - if (consume_obj_stock(objcg, size)) + if (likely(consume_obj_stock(objcg, size, pgdat, idx))) return 0; =20 /* @@ -2972,15 +2971,21 @@ int obj_cgroup_charge(struct obj_cgroup *objcg, gfp= _t gfp, size_t size) nr_pages +=3D 1; =20 ret =3D obj_cgroup_charge_pages(objcg, gfp, nr_pages); - if (!ret && nr_bytes) - refill_obj_stock(objcg, PAGE_SIZE - nr_bytes, false); + if (!ret && (nr_bytes || pgdat)) + refill_obj_stock(objcg, nr_bytes ? PAGE_SIZE - nr_bytes : 0, + false, size, pgdat, idx); =20 return ret; } =20 +int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size) +{ + return obj_cgroup_charge_account(objcg, gfp, size, NULL, 0); +} + void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size) { - refill_obj_stock(objcg, size, true); + refill_obj_stock(objcg, size, true, 0, NULL, 0); } =20 static inline size_t obj_full_size(struct kmem_cache *s) @@ -3032,23 +3037,32 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache= *s, struct list_lru *lru, return false; } =20 - if (obj_cgroup_charge(objcg, flags, size * obj_full_size(s))) - return false; - for (i =3D 0; i < size; i++) { slab =3D virt_to_slab(p[i]); =20 if (!slab_obj_exts(slab) && alloc_slab_obj_exts(slab, s, flags, false)) { - obj_cgroup_uncharge(objcg, obj_full_size(s)); continue; } =20 + /* + * if we fail and size is 1, memcg_alloc_abort_single() will + * just free the object, which is ok as we have not assigned + * objcg to its obj_ext yet + * + * for larger sizes, kmem_cache_free_bulk() will uncharge + * any objects that were already charged and obj_ext assigned + * + * TODO: we could batch this until 
slab_pgdat(slab) changes + * between iterations, with a more complicated undo + */ + if (obj_cgroup_charge_account(objcg, flags, obj_full_size(s), + slab_pgdat(slab), cache_vmstat_idx(s))) + return false; + off =3D obj_to_index(s, slab, p[i]); obj_cgroup_get(objcg); slab_obj_exts(slab)[off].objcg =3D objcg; - mod_objcg_state(objcg, slab_pgdat(slab), - cache_vmstat_idx(s), obj_full_size(s)); } =20 return true; @@ -3057,6 +3071,8 @@ bool __memcg_slab_post_alloc_hook(struct kmem_cache *= s, struct list_lru *lru, void __memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, void **p, int objects, struct slabobj_ext *obj_exts) { + size_t obj_size =3D obj_full_size(s); + for (int i =3D 0; i < objects; i++) { struct obj_cgroup *objcg; unsigned int off; @@ -3067,9 +3083,8 @@ void __memcg_slab_free_hook(struct kmem_cache *s, str= uct slab *slab, continue; =20 obj_exts[off].objcg =3D NULL; - obj_cgroup_uncharge(objcg, obj_full_size(s)); - mod_objcg_state(objcg, slab_pgdat(slab), cache_vmstat_idx(s), - -obj_full_size(s)); + refill_obj_stock(objcg, obj_size, true, -obj_size, + slab_pgdat(slab), cache_vmstat_idx(s)); obj_cgroup_put(objcg); } } --=20 2.47.1 From nobody Wed Dec 17 17:23:08 2025 Received: from out-173.mta0.migadu.com (out-173.mta0.migadu.com [91.218.175.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C32D62066E3 for ; Sat, 15 Mar 2025 17:50:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061016; cv=none; b=IadPsUI855C6od77OhhlZScpAj4YKfibrj6NYD1pO7PHkTNVoLrTIvOl0zX/jCfvUI7tvcRjXXODNO9u4KBwwSdEm+xBpGDoHR7mUS9GyXFAlwjyQOLeUdhyChFWl0Crt8qvbKdZzQX/Yli7c5ShLpTZVcP/rIvRrsiA1vcsxMI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742061016; c=relaxed/simple; 
bh=pX6LjMd9E3wFWwByfNIBXNtx/iaZ5LpUzpJ93/Soi6I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Puw8o575U8x894dKnMpLQV9IPaD+R2eXofj70BTc01EIXAi+CkjEInIbAXAXgqrP2oO+MRubRwsKRmOzihJxTC8d0tTWWgaOQ4uSWWgB1CfNgWNOC6vBC3pFAZHdbyraUz2RVcFvv2mYhQKwsMcZMshldy5L4p/vukXO0rrwU9o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=LHDQm3SA; arc=none smtp.client-ip=91.218.175.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="LHDQm3SA" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1742061013; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TYGBaqcauUTYU40tZGtRDDBbg7pHPWXhJB6/pipNZSg=; b=LHDQm3SAQHbOdOc/lx7xNRctT2V94M44B935v1mp0JKaFZD4ihZxYWRfQk1mXWZlM0YnV8 mUJPtgGxHSicxTjlkyDNWob2m9rowYzZQ0N2FbX6cTsr0uOiJu8tvTqevoPkB6VEYHtP9L MnOPP/rrbzJKtED4KZRq6GaquCbHc40= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Sebastian Andrzej Siewior , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH 9/9] memcg: manually inline replace_stock_objcg Date: Sat, 15 Mar 2025 10:49:30 -0700 Message-ID: <20250315174930.1769599-10-shakeel.butt@linux.dev> In-Reply-To: <20250315174930.1769599-1-shakeel.butt@linux.dev> References: 
<20250315174930.1769599-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" replace_stock_objcg() is called only by refill_obj_stock(), so manually inline it. Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka Reviewed-by: Roman Gushchin --- mm/memcontrol.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 553eb1d7250a..f6e3fc418866 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2742,17 +2742,6 @@ void __memcg_kmem_uncharge_page(struct page *page, i= nt order) obj_cgroup_put(objcg); } =20 -/* Replace the stock objcg with objcg, return the old objcg */ -static void replace_stock_objcg(struct memcg_stock_pcp *stock, - struct obj_cgroup *objcg) -{ - drain_obj_stock(stock); - obj_cgroup_get(objcg); - stock->nr_bytes =3D atomic_read(&objcg->nr_charged_bytes) - ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0; - WRITE_ONCE(stock->cached_objcg, objcg); -} - static void __account_obj_stock(struct obj_cgroup *objcg, struct memcg_stock_pcp *stock, int nr, struct pglist_data *pgdat, enum node_stat_item idx) @@ -2913,7 +2902,12 @@ static void refill_obj_stock(struct obj_cgroup *objc= g, unsigned int nr_bytes, =20 stock =3D this_cpu_ptr(&memcg_stock); if (READ_ONCE(stock->cached_objcg) !=3D objcg) { /* reset if necessary */ - replace_stock_objcg(stock, objcg); + drain_obj_stock(stock); + obj_cgroup_get(objcg); + stock->nr_bytes =3D atomic_read(&objcg->nr_charged_bytes) + ? atomic_xchg(&objcg->nr_charged_bytes, 0) : 0; + WRITE_ONCE(stock->cached_objcg, objcg); + allow_uncharge =3D true; /* Allow uncharge when objcg changes */ } stock->nr_bytes +=3D nr_bytes; --=20 2.47.1
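Note on patch 8/9: the locking change it describes can be illustrated with a small userspace sketch. This is not the kernel code. Every name here (struct demo_stock, demo_lock(), demo_consume(), demo_account(), demo_consume_account()) is hypothetical, a counter stands in for localtry_lock_irqsave(), and a bool flag stands in for the "pgdat != NULL" check. It only shows, under those assumptions, how folding the accounting step into the consume step halves the number of lock acquisitions on the fast path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-CPU stock, not the kernel struct. */
struct demo_stock {
	const void *cached_owner;	/* plays the role of stock->cached_objcg */
	unsigned int nr_bytes;		/* pre-charged bytes still available */
	long acct_bytes;		/* accumulated stat delta awaiting flush */
	unsigned int lock_count;	/* number of lock acquisitions so far */
};

/* A counter stands in for the real lock so the cost is observable. */
static void demo_lock(struct demo_stock *s)
{
	assert(s != NULL);
	s->lock_count++;
}

static void demo_unlock(struct demo_stock *s)
{
	(void)s;
}

/* Old shape, step 1: charging takes the lock on its own. */
static bool demo_consume(struct demo_stock *s, const void *owner,
			 unsigned int nr_bytes)
{
	bool ret = false;

	demo_lock(s);
	if (s->cached_owner == owner && s->nr_bytes >= nr_bytes) {
		s->nr_bytes -= nr_bytes;
		ret = true;
	}
	demo_unlock(s);
	return ret;
}

/* Old shape, step 2: accounting takes the same lock a second time. */
static void demo_account(struct demo_stock *s, int nr)
{
	demo_lock(s);
	s->acct_bytes += nr;
	demo_unlock(s);
}

/*
 * New shape: accounting runs while the lock is already held.
 * `account` is an assumed stand-in for the "pgdat != NULL" test;
 * callers that only charge pass false and skip the accounting.
 */
static bool demo_consume_account(struct demo_stock *s, const void *owner,
				 unsigned int nr_bytes, bool account)
{
	bool ret = false;

	demo_lock(s);
	if (s->cached_owner == owner && s->nr_bytes >= nr_bytes) {
		s->nr_bytes -= nr_bytes;
		ret = true;
		if (account)
			s->acct_bytes += (long)nr_bytes;
	}
	demo_unlock(s);
	return ret;
}
```

With a stock that holds enough bytes for the same owner, calling demo_consume() followed by demo_account() bumps lock_count twice, while a single demo_consume_account() call reaches the same nr_bytes and acct_bytes with one acquisition. When the consume fails, no accounting happens in the combined helper, which mirrors the patch leaving that case to the refill_obj_stock() fallback.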