From nobody Fri Dec 19 17:34:03 2025 Received: from out-189.mta1.migadu.com (out-189.mta1.migadu.com [95.215.58.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A3DB92882AD for ; Wed, 14 May 2025 18:42:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=95.215.58.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747248144; cv=none; b=pbTAxuB57yWFb7+lqagK/CekFmXimyvLnyqMJ0BQj10BAdOekzRndFycf0ek2liPkdgKUjQwE34MGULu8n61P+/fCBnLmgA3x0GITfBvhtuUFRTKZOnbX+FN6q3njAPdyyocB54yBZfo2AjxevitsE+ZDUIP+nkIGoVP4xP/wbk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747248144; c=relaxed/simple; bh=3Fr0cQ+G90nKuUSZg94me3gC2Oyt2UO3Hef1WWjs74U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bz/sJnw7RY74a1msRCDFvW2q6mqCpIZO6sGfRe8HkDzmGtv3ETrhM3eK4hYCdBS/ezcCblbCL/nWw/O/I3cQFUus18+XJQjpmja3GDgwssBEnC3criZPDn0s9snZTQ2UzUSSqkqQXAcqj5FhffwQx93jpvAc3KkJhMiAAr2cYIk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=L9zMuLin; arc=none smtp.client-ip=95.215.58.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="L9zMuLin" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1;
	t=1747248139;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=I1JZROuIroIvEzIUKiaMX+NW2Feg9mUr1VdRm7ju+Rc=;
	b=L9zMuLin+KdSVhzEN0SGMJKsSLDXu2kWmM+sbQ2yv1PrTje8O/eF0Gy1pQg2iugldzHZ+T
	 M7psiGTwbuAYe7PTX0V1VJcEOxef1/R4hP0pUoImfGXbfRl58KPC0yIkyYMzoNlqhdjxFf
	 e8EZzcrUZShSAZ0loYLOk0N7rzBcYl8=
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
	Harry Yoo, Yosry Ahmed, bpf@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Meta kernel team
Subject: [PATCH v2 1/7] memcg: memcg_rstat_updated re-entrant safe against irqs
Date: Wed, 14 May 2025 11:41:52 -0700
Message-ID: <20250514184158.3471331-2-shakeel.butt@linux.dev>
In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev>
References: <20250514184158.3471331-1-shakeel.butt@linux.dev>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Content-Type: text/plain; charset="utf-8"

The function memcg_rstat_updated() tracks memcg stat updates in order to
optimize the flushes. At the moment, it is not re-entrant safe, and its
callers disable irqs before calling it. However, to achieve the goal of
updating memcg stats without disabling irqs, memcg_rstat_updated() needs
to be re-entrant safe against irqs. This patch makes
memcg_rstat_updated() re-entrant safe using this_cpu_* ops. On archs
with CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS, this also makes
memcg_rstat_updated() nmi safe.
Signed-off-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
Tested-by: Alexei Starovoitov
---
 mm/memcontrol.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 89476a71a18d..2464a58fbf17 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -505,8 +505,8 @@ struct memcg_vmstats_percpu {
 	unsigned int stats_updates;

 	/* Cached pointers for fast iteration in memcg_rstat_updated() */
-	struct memcg_vmstats_percpu *parent;
-	struct memcg_vmstats *vmstats;
+	struct memcg_vmstats_percpu __percpu *parent_pcpu;
+	struct memcg_vmstats *vmstats;

 	/* The above should fit a single cacheline for memcg_rstat_updated() */

@@ -588,16 +588,21 @@ static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats)

 static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
 {
+	struct memcg_vmstats_percpu __percpu *statc_pcpu;
 	struct memcg_vmstats_percpu *statc;
-	int cpu = smp_processor_id();
+	int cpu;
 	unsigned int stats_updates;

 	if (!val)
 		return;

+	/* Don't assume callers have preemption disabled. */
+	cpu = get_cpu();
+
 	cgroup_rstat_updated(memcg->css.cgroup, cpu);
-	statc = this_cpu_ptr(memcg->vmstats_percpu);
-	for (; statc; statc = statc->parent) {
+	statc_pcpu = memcg->vmstats_percpu;
+	for (; statc_pcpu; statc_pcpu = statc->parent_pcpu) {
+		statc = this_cpu_ptr(statc_pcpu);
 		/*
 		 * If @memcg is already flushable then all its ancestors are
 		 * flushable as well and also there is no need to increase
@@ -606,14 +611,15 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
 		if (memcg_vmstats_needs_flush(statc->vmstats))
 			break;

-		stats_updates = READ_ONCE(statc->stats_updates) + abs(val);
-		WRITE_ONCE(statc->stats_updates, stats_updates);
+		stats_updates = this_cpu_add_return(statc_pcpu->stats_updates,
+						    abs(val));
 		if (stats_updates < MEMCG_CHARGE_BATCH)
 			continue;

+		stats_updates = this_cpu_xchg(statc_pcpu->stats_updates, 0);
 		atomic64_add(stats_updates, &statc->vmstats->stats_updates);
-		WRITE_ONCE(statc->stats_updates, 0);
 	}
+	put_cpu();
 }

 static void __mem_cgroup_flush_stats(struct mem_cgroup *memcg, bool force)
@@ -3691,7 +3697,7 @@ static void mem_cgroup_free(struct mem_cgroup *memcg)

 static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)
 {
-	struct memcg_vmstats_percpu *statc, *pstatc;
+	struct memcg_vmstats_percpu *statc, __percpu *pstatc_pcpu;
 	struct mem_cgroup *memcg;
 	int node, cpu;
 	int __maybe_unused i;
@@ -3722,9 +3728,9 @@ static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)

 	for_each_possible_cpu(cpu) {
 		if (parent)
-			pstatc = per_cpu_ptr(parent->vmstats_percpu, cpu);
+			pstatc_pcpu = parent->vmstats_percpu;
 		statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
-		statc->parent = parent ? pstatc : NULL;
+		statc->parent_pcpu = parent ? pstatc_pcpu : NULL;
 		statc->vmstats = memcg->vmstats;
 	}

-- 
2.47.1

From nobody Fri Dec 19 17:34:03 2025
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
	Harry Yoo, Yosry Ahmed, bpf@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Meta kernel team
Subject: [PATCH v2 2/7] memcg: move preempt disable to callers of memcg_rstat_updated
Date: Wed, 14 May 2025 11:41:53 -0700
Message-ID: <20250514184158.3471331-3-shakeel.butt@linux.dev>
In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev>
References: <20250514184158.3471331-1-shakeel.butt@linux.dev>

Let's move the explicit preempt-disable code to the callers of
memcg_rstat_updated() and remove memcg_stats_lock() and related
functions, which ensured that callers of the stats update functions had
disabled preemption, because the stats update functions themselves now
disable preemption explicitly.
Signed-off-by: Shakeel Butt
Acked-by: Vlastimil Babka
---
 mm/memcontrol.c | 74 +++++++++++++------------------------------------
 1 file changed, 19 insertions(+), 55 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2464a58fbf17..1750d86012f3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -557,48 +557,22 @@ static u64 flush_last_time;

 #define FLUSH_TIME (2UL*HZ)

-/*
- * Accessors to ensure that preemption is disabled on PREEMPT_RT because it can
- * not rely on this as part of an acquired spinlock_t lock. These functions are
- * never used in hardirq context on PREEMPT_RT and therefore disabling preemtion
- * is sufficient.
- */
-static void memcg_stats_lock(void)
-{
-	preempt_disable_nested();
-	VM_WARN_ON_IRQS_ENABLED();
-}
-
-static void __memcg_stats_lock(void)
-{
-	preempt_disable_nested();
-}
-
-static void memcg_stats_unlock(void)
-{
-	preempt_enable_nested();
-}
-
-
 static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats)
 {
 	return atomic64_read(&vmstats->stats_updates) >
 		MEMCG_CHARGE_BATCH * num_online_cpus();
 }

-static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
+static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val,
+				       int cpu)
 {
 	struct memcg_vmstats_percpu __percpu *statc_pcpu;
 	struct memcg_vmstats_percpu *statc;
-	int cpu;
 	unsigned int stats_updates;

 	if (!val)
 		return;

-	/* Don't assume callers have preemption disabled. */
-	cpu = get_cpu();
-
 	cgroup_rstat_updated(memcg->css.cgroup, cpu);
 	statc_pcpu = memcg->vmstats_percpu;
 	for (; statc_pcpu; statc_pcpu = statc->parent_pcpu) {
@@ -619,7 +593,6 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val)
 		stats_updates = this_cpu_xchg(statc_pcpu->stats_updates, 0);
 		atomic64_add(stats_updates, &statc->vmstats->stats_updates);
 	}
-	put_cpu();
 }

 static void __mem_cgroup_flush_stats(struct mem_cgroup *memcg, bool force)
@@ -717,6 +690,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx,
 		       int val)
 {
 	int i = memcg_stats_index(idx);
+	int cpu;

 	if (mem_cgroup_disabled())
 		return;

@@ -724,12 +698,14 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx,
 	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return;

-	memcg_stats_lock();
+	cpu = get_cpu();
+
 	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
 	val = memcg_state_val_in_pages(idx, val);
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, val, cpu);
 	trace_mod_memcg_state(memcg, idx, val);
-	memcg_stats_unlock();
+
+	put_cpu();
 }

 #ifdef CONFIG_MEMCG_V1
@@ -758,6 +734,7 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	struct mem_cgroup_per_node *pn;
 	struct mem_cgroup *memcg;
 	int i = memcg_stats_index(idx);
+	int cpu;

 	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return;

@@ -765,24 +742,7 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
 	memcg = pn->memcg;

-	/*
-	 * The caller from rmap relies on disabled preemption because they never
-	 * update their counter from in-interrupt context. For these two
-	 * counters we check that the update is never performed from an
-	 * interrupt context while other caller need to have disabled interrupt.
-	 */
-	__memcg_stats_lock();
-	if (IS_ENABLED(CONFIG_DEBUG_VM)) {
-		switch (idx) {
-		case NR_ANON_MAPPED:
-		case NR_FILE_MAPPED:
-		case NR_ANON_THPS:
-			WARN_ON_ONCE(!in_task());
-			break;
-		default:
-			VM_WARN_ON_IRQS_ENABLED();
-		}
-	}
+	cpu = get_cpu();

 	/* Update memcg */
 	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
@@ -791,9 +751,10 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	__this_cpu_add(pn->lruvec_stats_percpu->state[i], val);

 	val = memcg_state_val_in_pages(idx, val);
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, val, cpu);
 	trace_mod_memcg_lruvec_state(memcg, idx, val);
-	memcg_stats_unlock();
+
+	put_cpu();
 }

 /**
@@ -873,6 +834,7 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 		unsigned long count)
 {
 	int i = memcg_events_index(idx);
+	int cpu;

 	if (mem_cgroup_disabled())
 		return;

@@ -880,11 +842,13 @@ void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 	if (WARN_ONCE(BAD_STAT_IDX(i), "%s: missing stat item %d\n", __func__, idx))
 		return;

-	memcg_stats_lock();
+	cpu = get_cpu();
+
 	__this_cpu_add(memcg->vmstats_percpu->events[i], count);
-	memcg_rstat_updated(memcg, count);
+	memcg_rstat_updated(memcg, count, cpu);
 	trace_count_memcg_events(memcg, idx, count);
-	memcg_stats_unlock();
+
+	put_cpu();
 }

 unsigned long memcg_events(struct mem_cgroup *memcg, int event)
-- 
2.47.1

From nobody Fri Dec 19 17:34:03 2025
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
	Harry Yoo, Yosry Ahmed, bpf@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Meta kernel team
Subject: [PATCH v2 3/7] memcg: make mod_memcg_state re-entrant safe against irqs
Date: Wed, 14 May 2025 11:41:54 -0700
Message-ID: <20250514184158.3471331-4-shakeel.butt@linux.dev>
In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev>
References: <20250514184158.3471331-1-shakeel.butt@linux.dev>

Let's make mod_memcg_state() re-entrant safe against irqs. The only
change needed is to convert the usage of __this_cpu_add() to
this_cpu_add(). In addition, with re-entrant safety there is no need to
disable irqs around it. mod_memcg_state() is not safe against nmi, so
add a warning if someone tries to call it in nmi context.
Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka --- include/linux/memcontrol.h | 20 ++------------------ mm/memcontrol.c | 8 ++++---- 2 files changed, 6 insertions(+), 22 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 9ed75f82b858..92861ff3c43f 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -903,19 +903,9 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct tas= k_struct *victim, struct mem_cgroup *oom_domain); void mem_cgroup_print_oom_group(struct mem_cgroup *memcg); =20 -void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx, - int val); - /* idx can be of type enum memcg_stat_item or node_stat_item */ -static inline void mod_memcg_state(struct mem_cgroup *memcg, - enum memcg_stat_item idx, int val) -{ - unsigned long flags; - - local_irq_save(flags); - __mod_memcg_state(memcg, idx, val); - local_irq_restore(flags); -} +void mod_memcg_state(struct mem_cgroup *memcg, + enum memcg_stat_item idx, int val); =20 static inline void mod_memcg_page_state(struct page *page, enum memcg_stat_item idx, int val) @@ -1375,12 +1365,6 @@ static inline void mem_cgroup_print_oom_group(struct= mem_cgroup *memcg) { } =20 -static inline void __mod_memcg_state(struct mem_cgroup *memcg, - enum memcg_stat_item idx, - int nr) -{ -} - static inline void mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx, int nr) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 1750d86012f3..c5a835071610 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -681,12 +681,12 @@ static int memcg_state_val_in_pages(int idx, int val) } =20 /** - * __mod_memcg_state - update cgroup memory statistics + * mod_memcg_state - update cgroup memory statistics * @memcg: the memory cgroup * @idx: the stat item - can be enum memcg_stat_item or enum node_stat_item * @val: delta to add to the counter, can be negative */ -void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx, +void 
mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx, int val) { int i =3D memcg_stats_index(idx); @@ -700,7 +700,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, enum m= emcg_stat_item idx, =20 cpu =3D get_cpu(); =20 - __this_cpu_add(memcg->vmstats_percpu->state[i], val); + this_cpu_add(memcg->vmstats_percpu->state[i], val); val =3D memcg_state_val_in_pages(idx, val); memcg_rstat_updated(memcg, val, cpu); trace_mod_memcg_state(memcg, idx, val); @@ -2920,7 +2920,7 @@ static void drain_obj_stock(struct obj_stock_pcp *sto= ck) =20 memcg =3D get_mem_cgroup_from_objcg(old); =20 - __mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); + mod_memcg_state(memcg, MEMCG_KMEM, -nr_pages); memcg1_account_kmem(memcg, -nr_pages); if (!mem_cgroup_is_root(memcg)) memcg_uncharge(memcg, nr_pages); --=20 2.47.1 From nobody Fri Dec 19 17:34:03 2025 Received: from out-179.mta0.migadu.com (out-179.mta0.migadu.com [91.218.175.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3A4A288510 for ; Wed, 14 May 2025 18:42:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747248165; cv=none; b=OQcRA+7bLfTJ+l3/l7gGyLRJaoXFrltprwmFlKUTPzI1ux1tvMNenHVd3au8Tu3xFMyjFzbQ698kFPhcdLGSJvT0mslChR6YKOqmOD8kJPkarKa7cSvR9B1Wl+hknJmg8YpSh5766vrvs/JRDsGQowSpAtdFS5FkgW8PxBSpLyA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747248165; c=relaxed/simple; bh=9a9EAI7HjHhqk+KWb+uIby+4wcCto83ZQTHrDewpE5Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Mo7K//L2MYogqoqFw2e9dQx7CBWY1rCxkv6tcXXwJ6lLzqC7i857yPugcZWmIIpxiYJ7LOmVdxFK+5ZXtMpjrVmCS2tX7lxhZdK75p85IcULZVNvlqDY+36u059e0zzxt057KHOD5PAL74Dyob0srjjntlDlDOscDHm1smNElzo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none 
dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=HPtxLwPj; arc=none smtp.client-ip=91.218.175.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="HPtxLwPj" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1747248160; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SUZYKygsw0cKUEN+JGdm3r/cNIBlAjnoiVTSGdy45nM=; b=HPtxLwPjDj5tqjCFtwX92LOzUI2qCAAx7nVFvpOlYx50xbs4s/Fy13Y9s9Qvw+8xVxiY8H V2WCjEYTPudskCJM673GcxuVhRiR162USn+/2ztfctd5+lkECGf8FDLJSGd2TjK79GGOBf cCXBYez5Bq6M6ZySrPh5GHB1M/dPDqc= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Vlastimil Babka , Alexei Starovoitov , Sebastian Andrzej Siewior , Harry Yoo , Yosry Ahmed , bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team Subject: [PATCH v2 4/7] memcg: make count_memcg_events re-entrant safe against irqs Date: Wed, 14 May 2025 11:41:55 -0700 Message-ID: <20250514184158.3471331-5-shakeel.butt@linux.dev> In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev> References: <20250514184158.3471331-1-shakeel.butt@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Let's make 
count_memcg_events re-entrant safe against irqs. The only thing needed is to convert the usage of __this_cpu_add() to this_cpu_add(). In addition, with re-entrant safety, there is no need to disable irqs. Also add warnings for in_nmi() as it is not safe against nmi context. Signed-off-by: Shakeel Butt Acked-by: Vlastimil Babka --- include/linux/memcontrol.h | 21 ++------------------- mm/memcontrol-v1.c | 6 +++--- mm/memcontrol.c | 6 +++--- mm/swap.c | 8 ++++---- mm/vmscan.c | 14 +++++++------- 5 files changed, 19 insertions(+), 36 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 92861ff3c43f..f7848f73f41c 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -942,19 +942,8 @@ static inline void mod_lruvec_kmem_state(void *p, enum= node_stat_item idx, local_irq_restore(flags); } =20 -void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, - unsigned long count); - -static inline void count_memcg_events(struct mem_cgroup *memcg, - enum vm_event_item idx, - unsigned long count) -{ - unsigned long flags; - - local_irq_save(flags); - __count_memcg_events(memcg, idx, count); - local_irq_restore(flags); -} +void count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, + unsigned long count); =20 static inline void count_memcg_folio_events(struct folio *folio, enum vm_event_item idx, unsigned long nr) @@ -1418,12 +1407,6 @@ static inline void mod_lruvec_kmem_state(void *p, en= um node_stat_item idx, } =20 static inline void count_memcg_events(struct mem_cgroup *memcg, - enum vm_event_item idx, - unsigned long count) -{ -} - -static inline void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, unsigned long count) { diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c index 54c49cbfc968..4b94731305b9 100644 --- a/mm/memcontrol-v1.c +++ b/mm/memcontrol-v1.c @@ -512,9 +512,9 @@ static void memcg1_charge_statistics(struct mem_cgroup = *memcg, int nr_pages) { /* 
pagein of a big page is an event. So, ignore page size */ if (nr_pages > 0) - __count_memcg_events(memcg, PGPGIN, 1); + count_memcg_events(memcg, PGPGIN, 1); else { - __count_memcg_events(memcg, PGPGOUT, 1); + count_memcg_events(memcg, PGPGOUT, 1); nr_pages =3D -nr_pages; /* for event */ } =20 @@ -689,7 +689,7 @@ void memcg1_uncharge_batch(struct mem_cgroup *memcg, un= signed long pgpgout, unsigned long flags; =20 local_irq_save(flags); - __count_memcg_events(memcg, PGPGOUT, pgpgout); + count_memcg_events(memcg, PGPGOUT, pgpgout); __this_cpu_add(memcg->events_percpu->nr_page_events, nr_memory); memcg1_check_events(memcg, nid); local_irq_restore(flags); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index c5a835071610..0923072386c2 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -825,12 +825,12 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_= item idx, int val) } =20 /** - * __count_memcg_events - account VM events in a cgroup + * count_memcg_events - account VM events in a cgroup * @memcg: the memory cgroup * @idx: the event item * @count: the number of events that occurred */ -void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, +void count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx, unsigned long count) { int i =3D memcg_events_index(idx); @@ -844,7 +844,7 @@ void __count_memcg_events(struct mem_cgroup *memcg, enu= m vm_event_item idx, =20 cpu =3D get_cpu(); =20 - __this_cpu_add(memcg->vmstats_percpu->events[i], count); + this_cpu_add(memcg->vmstats_percpu->events[i], count); memcg_rstat_updated(memcg, count, cpu); trace_count_memcg_events(memcg, idx, count); =20 diff --git a/mm/swap.c b/mm/swap.c index 77b2d5997873..4fc322f7111a 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -309,7 +309,7 @@ static void lru_activate(struct lruvec *lruvec, struct = folio *folio) trace_mm_lru_activate(folio); =20 __count_vm_events(PGACTIVATE, nr_pages); - __count_memcg_events(lruvec_memcg(lruvec), PGACTIVATE, nr_pages); + 
count_memcg_events(lruvec_memcg(lruvec), PGACTIVATE, nr_pages); } =20 #ifdef CONFIG_SMP @@ -581,7 +581,7 @@ static void lru_deactivate_file(struct lruvec *lruvec, = struct folio *folio) =20 if (active) { __count_vm_events(PGDEACTIVATE, nr_pages); - __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, + count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_pages); } } @@ -599,7 +599,7 @@ static void lru_deactivate(struct lruvec *lruvec, struc= t folio *folio) lruvec_add_folio(lruvec, folio); =20 __count_vm_events(PGDEACTIVATE, nr_pages); - __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_pages); + count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_pages); } =20 static void lru_lazyfree(struct lruvec *lruvec, struct folio *folio) @@ -625,7 +625,7 @@ static void lru_lazyfree(struct lruvec *lruvec, struct = folio *folio) lruvec_add_folio(lruvec, folio); =20 __count_vm_events(PGLAZYFREE, nr_pages); - __count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE, nr_pages); + count_memcg_events(lruvec_memcg(lruvec), PGLAZYFREE, nr_pages); } =20 /* diff --git a/mm/vmscan.c b/mm/vmscan.c index 5efd939d8c76..f86d264558f5 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2028,7 +2028,7 @@ static unsigned long shrink_inactive_list(unsigned lo= ng nr_to_scan, item =3D PGSCAN_KSWAPD + reclaimer_offset(sc); if (!cgroup_reclaim(sc)) __count_vm_events(item, nr_scanned); - __count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned); + count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned); __count_vm_events(PGSCAN_ANON + file, nr_scanned); =20 spin_unlock_irq(&lruvec->lru_lock); @@ -2048,7 +2048,7 @@ static unsigned long shrink_inactive_list(unsigned lo= ng nr_to_scan, item =3D PGSTEAL_KSWAPD + reclaimer_offset(sc); if (!cgroup_reclaim(sc)) __count_vm_events(item, nr_reclaimed); - __count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed); + count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed); __count_vm_events(PGSTEAL_ANON + file, nr_reclaimed); 
spin_unlock_irq(&lruvec->lru_lock); =20 @@ -2138,7 +2138,7 @@ static void shrink_active_list(unsigned long nr_to_sc= an, =20 if (!cgroup_reclaim(sc)) __count_vm_events(PGREFILL, nr_scanned); - __count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned); + count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned); =20 spin_unlock_irq(&lruvec->lru_lock); =20 @@ -2195,7 +2195,7 @@ static void shrink_active_list(unsigned long nr_to_sc= an, nr_deactivate =3D move_folios_to_lru(lruvec, &l_inactive); =20 __count_vm_events(PGDEACTIVATE, nr_deactivate); - __count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate); + count_memcg_events(lruvec_memcg(lruvec), PGDEACTIVATE, nr_deactivate); =20 __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken); spin_unlock_irq(&lruvec->lru_lock); @@ -4616,8 +4616,8 @@ static int scan_folios(unsigned long nr_to_scan, stru= ct lruvec *lruvec, __count_vm_events(item, isolated); __count_vm_events(PGREFILL, sorted); } - __count_memcg_events(memcg, item, isolated); - __count_memcg_events(memcg, PGREFILL, sorted); + count_memcg_events(memcg, item, isolated); + count_memcg_events(memcg, PGREFILL, sorted); __count_vm_events(PGSCAN_ANON + type, isolated); trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, MAX_LRU_BATCH, scanned, skipped, isolated, @@ -4769,7 +4769,7 @@ static int evict_folios(unsigned long nr_to_scan, str= uct lruvec *lruvec, item =3D PGSTEAL_KSWAPD + reclaimer_offset(sc); if (!cgroup_reclaim(sc)) __count_vm_events(item, reclaimed); - __count_memcg_events(memcg, item, reclaimed); + count_memcg_events(memcg, item, reclaimed); __count_vm_events(PGSTEAL_ANON + type, reclaimed); =20 spin_unlock_irq(&lruvec->lru_lock); --=20 2.47.1 From nobody Fri Dec 19 17:34:03 2025 Received: from out-172.mta0.migadu.com (out-172.mta0.migadu.com [91.218.175.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS 
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo, Yosry Ahmed, bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH v2 5/7] memcg: make __mod_memcg_lruvec_state re-entrant safe against irqs
Date: Wed, 14 May 2025 11:41:56 -0700
Message-ID: <20250514184158.3471331-6-shakeel.butt@linux.dev>
In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev>

Let's make __mod_memcg_lruvec_state() re-entrant safe and rename it to
mod_memcg_lruvec_state(). The only change needed is converting its
__this_cpu_add() calls to this_cpu_add(). One of its two callers,
__mod_objcg_mlstate(), becomes re-entrant safe as well, so rename it to
mod_objcg_mlstate(). The other caller, __mod_lruvec_state(), still calls
__mod_node_page_state(), which is not re-entrant safe yet, so keep its
name as is.
Signed-off-by: Shakeel Butt
Acked-by: Vlastimil Babka
---
 mm/memcontrol.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0923072386c2..1071db0b1df8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -727,7 +727,7 @@ unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
 }
 #endif
 
-static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
+static void mod_memcg_lruvec_state(struct lruvec *lruvec,
 				     enum node_stat_item idx,
 				     int val)
 {
@@ -745,10 +745,10 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	cpu = get_cpu();
 
 	/* Update memcg */
-	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
+	this_cpu_add(memcg->vmstats_percpu->state[i], val);
 
 	/* Update lruvec */
-	__this_cpu_add(pn->lruvec_stats_percpu->state[i], val);
+	this_cpu_add(pn->lruvec_stats_percpu->state[i], val);
 
 	val = memcg_state_val_in_pages(idx, val);
 	memcg_rstat_updated(memcg, val, cpu);
@@ -775,7 +775,7 @@ void __mod_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 
 	/* Update memcg and lruvec */
 	if (!mem_cgroup_disabled())
-		__mod_memcg_lruvec_state(lruvec, idx, val);
+		mod_memcg_lruvec_state(lruvec, idx, val);
 }
 
 void __lruvec_stat_mod_folio(struct folio *folio, enum node_stat_item idx,
@@ -2527,7 +2527,7 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
 	folio->memcg_data = (unsigned long)memcg;
 }
 
-static inline void __mod_objcg_mlstate(struct obj_cgroup *objcg,
+static inline void mod_objcg_mlstate(struct obj_cgroup *objcg,
 				       struct pglist_data *pgdat,
 				       enum node_stat_item idx, int nr)
 {
@@ -2537,7 +2537,7 @@ static inline void __mod_objcg_mlstate(struct obj_cgroup *objcg,
 	rcu_read_lock();
 	memcg = obj_cgroup_memcg(objcg);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
-	__mod_memcg_lruvec_state(lruvec, idx, nr);
+	mod_memcg_lruvec_state(lruvec, idx, nr);
 	rcu_read_unlock();
 }
 
@@ -2847,12 +2847,12 @@ static void __account_obj_stock(struct obj_cgroup *objcg,
 		struct pglist_data *oldpg = stock->cached_pgdat;
 
 		if (stock->nr_slab_reclaimable_b) {
-			__mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
+			mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
 					stock->nr_slab_reclaimable_b);
 			stock->nr_slab_reclaimable_b = 0;
 		}
 		if (stock->nr_slab_unreclaimable_b) {
-			__mod_objcg_mlstate(objcg, oldpg, NR_SLAB_UNRECLAIMABLE_B,
+			mod_objcg_mlstate(objcg, oldpg, NR_SLAB_UNRECLAIMABLE_B,
 					stock->nr_slab_unreclaimable_b);
 			stock->nr_slab_unreclaimable_b = 0;
 		}
@@ -2878,7 +2878,7 @@ static void __account_obj_stock(struct obj_cgroup *objcg,
 		}
 	}
 	if (nr)
-		__mod_objcg_mlstate(objcg, pgdat, idx, nr);
+		mod_objcg_mlstate(objcg, pgdat, idx, nr);
 }
 
 static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
@@ -2947,13 +2947,13 @@ static void drain_obj_stock(struct obj_stock_pcp *stock)
 	 */
 	if (stock->nr_slab_reclaimable_b || stock->nr_slab_unreclaimable_b) {
 		if (stock->nr_slab_reclaimable_b) {
-			__mod_objcg_mlstate(old, stock->cached_pgdat,
+			mod_objcg_mlstate(old, stock->cached_pgdat,
 					NR_SLAB_RECLAIMABLE_B,
 					stock->nr_slab_reclaimable_b);
 			stock->nr_slab_reclaimable_b = 0;
 		}
 		if (stock->nr_slab_unreclaimable_b) {
-			__mod_objcg_mlstate(old, stock->cached_pgdat,
+			mod_objcg_mlstate(old, stock->cached_pgdat,
 					NR_SLAB_UNRECLAIMABLE_B,
 					stock->nr_slab_unreclaimable_b);
 			stock->nr_slab_unreclaimable_b = 0;
-- 
2.47.1
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo, Yosry Ahmed, bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH v2 6/7] memcg: no stock lock for cpu hot-unplug
Date: Wed, 14 May 2025 11:41:57 -0700
Message-ID: <20250514184158.3471331-7-shakeel.butt@linux.dev>
In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev>

Previously, on cpu hot-unplug, the kernel called drain_obj_stock() with
the objcg local lock held. The lock was not needed for mutual exclusion,
since the stock being accessed belongs to a dead cpu, but it was kept to
disable irqs, because drain_obj_stock() may call mod_objcg_mlstate(),
which used to require irqs disabled. Now that mod_objcg_mlstate() no
longer needs irqs disabled, the local lock can be removed altogether
from the cpu hot-unplug path.
Signed-off-by: Shakeel Butt
Acked-by: Vlastimil Babka
---
 mm/memcontrol.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1071db0b1df8..04d756be708b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2025,17 +2025,8 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
 
 static int memcg_hotplug_cpu_dead(unsigned int cpu)
 {
-	struct obj_stock_pcp *obj_st;
-	unsigned long flags;
-
-	obj_st = &per_cpu(obj_stock, cpu);
-
-	/* drain_obj_stock requires objstock.lock */
-	local_lock_irqsave(&obj_stock.lock, flags);
-	drain_obj_stock(obj_st);
-	local_unlock_irqrestore(&obj_stock.lock, flags);
-
+	/* no need for the local lock */
+	drain_obj_stock(&per_cpu(obj_stock, cpu));
 	drain_stock_fully(&per_cpu(memcg_stock, cpu));
 
 	return 0;
-- 
2.47.1
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior, Harry Yoo, Yosry Ahmed, bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH v2 7/7] memcg: objcg stock trylock without irq disabling
Date: Wed, 14 May 2025 11:41:58 -0700
Message-ID: <20250514184158.3471331-8-shakeel.butt@linux.dev>
In-Reply-To: <20250514184158.3471331-1-shakeel.butt@linux.dev>

There is no need to
disable irqs to use the objcg per-cpu stock, so let's stop doing that.
consume_obj_stock() and refill_obj_stock() now need to use a trylock
instead, to avoid deadlocking against an irq. One consequence of this
change is that charge requests from irq context may take the slowpath
more often, but that should be rare.

Signed-off-by: Shakeel Butt
Acked-by: Vlastimil Babka
---
 mm/memcontrol.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 04d756be708b..e17b698f6243 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1882,18 +1882,17 @@ static void drain_local_memcg_stock(struct work_struct *dummy)
 static void drain_local_obj_stock(struct work_struct *dummy)
 {
 	struct obj_stock_pcp *stock;
-	unsigned long flags;
 
 	if (WARN_ONCE(!in_task(), "drain in non-task context"))
 		return;
 
-	local_lock_irqsave(&obj_stock.lock, flags);
+	local_lock(&obj_stock.lock);
 
 	stock = this_cpu_ptr(&obj_stock);
 	drain_obj_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
 
-	local_unlock_irqrestore(&obj_stock.lock, flags);
+	local_unlock(&obj_stock.lock);
 }
 
 static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
@@ -2876,10 +2875,10 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 			      struct pglist_data *pgdat, enum node_stat_item idx)
 {
 	struct obj_stock_pcp *stock;
-	unsigned long flags;
 	bool ret = false;
 
-	local_lock_irqsave(&obj_stock.lock, flags);
+	if (!local_trylock(&obj_stock.lock))
+		return ret;
 
 	stock = this_cpu_ptr(&obj_stock);
 	if (objcg == READ_ONCE(stock->cached_objcg) && stock->nr_bytes >= nr_bytes) {
@@ -2890,7 +2889,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		__account_obj_stock(objcg, stock, nr_bytes, pgdat, idx);
 	}
 
-	local_unlock_irqrestore(&obj_stock.lock, flags);
+	local_unlock(&obj_stock.lock);
 
 	return ret;
 }
@@ -2979,10 +2978,16 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 			     enum node_stat_item idx)
 {
 	struct obj_stock_pcp *stock;
-	unsigned long flags;
 	unsigned int nr_pages = 0;
 
-	local_lock_irqsave(&obj_stock.lock, flags);
+	if (!local_trylock(&obj_stock.lock)) {
+		if (pgdat)
+			mod_objcg_mlstate(objcg, pgdat, idx, nr_bytes);
+		nr_pages = nr_bytes >> PAGE_SHIFT;
+		nr_bytes = nr_bytes & (PAGE_SIZE - 1);
+		atomic_add(nr_bytes, &objcg->nr_charged_bytes);
+		goto out;
+	}
 
 	stock = this_cpu_ptr(&obj_stock);
 	if (READ_ONCE(stock->cached_objcg) != objcg) { /* reset if necessary */
@@ -3004,8 +3009,8 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
 		stock->nr_bytes &= (PAGE_SIZE - 1);
 	}
 
-	local_unlock_irqrestore(&obj_stock.lock, flags);
-
+	local_unlock(&obj_stock.lock);
+out:
 	if (nr_pages)
 		obj_cgroup_uncharge_pages(objcg, nr_pages);
 }
-- 
2.47.1