From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/8] memcg: reduce memory size of mem_cgroup_events_index
Date: Mon, 29 Apr 2024 23:06:05 -0700
Message-ID: <20240430060612.2171650-2-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>

mem_cgroup_events_index is a translation table used to get the index of the
memcg-relevant entry for a given vm_event_item. At the moment it is defined
as an array of int. However, on a typical system the maximum entry of
vm_event_item (NR_VM_EVENT_ITEMS) is 113, so int is wider than needed as the
storage type of the array. For now use int8_t as the type and add a
BUILD_BUG_ON(); switch to short once NR_VM_EVENT_ITEMS reaches 127.

Another benefit of this change is that the translation table now fits in
2 cachelines, while previously it required 8 cachelines (assuming 64-byte
cachelines).

Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Reviewed-by: T.J. Mercier
Reviewed-by: Yosry Ahmed
---
Changes since v2:
- Used S8_MAX instead of 127
- Update commit message based on Yosry's feedback.

 mm/memcontrol.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
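As an illustration of the translation-table scheme described above, here is a
minimal, standalone userspace sketch (hypothetical names throughout, not the
kernel code): the sparse item id indexes a small int8_t table whose entries
hold the dense index plus one, so a zero-initialized slot means "not tracked"
and the lookup subtracts one.

/* Sketch of a dense-index translation table: one byte per possible item. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum my_event_item { EV_A, EV_B, EV_C, EV_D, EV_E, NR_MY_EVENTS };

/* Only a subset of the items is tracked. */
static const unsigned int my_tracked_events[] = { EV_B, EV_D };
#define NR_TRACKED (sizeof(my_tracked_events) / sizeof(my_tracked_events[0]))

static int8_t my_events_index[NR_MY_EVENTS];

static void init_events_index(void)
{
	/* Equivalent of the BUILD_BUG_ON(): int8_t must be able to hold the index. */
	static_assert(NR_MY_EVENTS < INT8_MAX, "switch to a wider type");

	for (size_t i = 0; i < NR_TRACKED; ++i)
		my_events_index[my_tracked_events[i]] = (int8_t)(i + 1);
}

static int events_index(int ev)
{
	return my_events_index[ev] - 1;	/* -1 means "not tracked" */
}

int main(void)
{
	init_events_index();
	printf("EV_D -> %d, EV_A -> %d\n", events_index(EV_D), events_index(EV_A));
	return 0;
}

Storing index+1 is what lets a zero-filled table double as "not present",
which is why the kernel table only needs to be initialized for tracked items.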
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 602ad5faad4d..c146187cda9c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -607,11 +607,13 @@ static const unsigned int memcg_vm_event_stat[] = {
 };
 
 #define NR_MEMCG_EVENTS ARRAY_SIZE(memcg_vm_event_stat)
-static int mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly;
+static int8_t mem_cgroup_events_index[NR_VM_EVENT_ITEMS] __read_mostly;
 
 static void init_memcg_events(void)
 {
-	int i;
+	int8_t i;
+
+	BUILD_BUG_ON(NR_VM_EVENT_ITEMS >= S8_MAX);
 
 	for (i = 0; i < NR_MEMCG_EVENTS; ++i)
		mem_cgroup_events_index[memcg_vm_event_stat[i]] = i + 1;
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 2/8] memcg: dynamically allocate lruvec_stats
Date: Mon, 29 Apr 2024 23:06:06 -0700
Message-ID: <20240430060612.2171650-3-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>
To decouple the dependency of lruvec_stats on NR_VM_NODE_STAT_ITEMS, we need
to dynamically allocate lruvec_stats in the mem_cgroup_per_node structure.
Also move the definitions of lruvec_stats_percpu and lruvec_stats, and the
related functions, into memcontrol.c to facilitate later patches. No
functional changes intended.

Signed-off-by: Shakeel Butt
Reviewed-by: Yosry Ahmed
Reviewed-by: T.J. Mercier
---
Changes since v2:
- N/A

 include/linux/memcontrol.h | 62 +++------------------------
 mm/memcontrol.c            | 87 ++++++++++++++++++++++++++++++++------
 2 files changed, 81 insertions(+), 68 deletions(-)
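The allocation/cleanup pattern the patch switches to is sketched below in
plain userspace C (made-up struct and function names, not the kernel code):
the per-node structure now holds a pointer to separately allocated stats, and
every failure path funnels through one cleanup label, relying on free(NULL)
being a no-op just like kfree(NULL).

#include <stdlib.h>

struct my_stats { long state[16]; };

struct my_per_node {
	struct my_stats *stats;		/* was: struct my_stats stats; */
	struct my_stats *percpu_stats;	/* stand-in for the percpu allocation */
};

static int alloc_per_node(struct my_per_node **out)
{
	struct my_per_node *pn = calloc(1, sizeof(*pn));

	if (!pn)
		return 1;

	pn->stats = calloc(1, sizeof(*pn->stats));
	if (!pn->stats)
		goto fail;

	pn->percpu_stats = calloc(1, sizeof(*pn->percpu_stats));
	if (!pn->percpu_stats)
		goto fail;

	*out = pn;
	return 0;
fail:
	free(pn->stats);	/* free(NULL) is a no-op */
	free(pn);
	return 1;
}

int main(void)
{
	struct my_per_node *pn;

	if (alloc_per_node(&pn))
		return 1;
	free(pn->percpu_stats);
	free(pn->stats);
	free(pn);
	return 0;
}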
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 9aba0d0462ca..ab8a6e884375 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -83,6 +83,8 @@ enum mem_cgroup_events_target {
 
 struct memcg_vmstats_percpu;
 struct memcg_vmstats;
+struct lruvec_stats_percpu;
+struct lruvec_stats;
 
 struct mem_cgroup_reclaim_iter {
 	struct mem_cgroup *position;
@@ -90,25 +92,6 @@ struct mem_cgroup_reclaim_iter {
 	unsigned int generation;
 };
 
-struct lruvec_stats_percpu {
-	/* Local (CPU and cgroup) state */
-	long state[NR_VM_NODE_STAT_ITEMS];
-
-	/* Delta calculation for lockless upward propagation */
-	long state_prev[NR_VM_NODE_STAT_ITEMS];
-};
-
-struct lruvec_stats {
-	/* Aggregated (CPU and subtree) state */
-	long state[NR_VM_NODE_STAT_ITEMS];
-
-	/* Non-hierarchical (CPU aggregated) state */
-	long state_local[NR_VM_NODE_STAT_ITEMS];
-
-	/* Pending child counts during tree propagation */
-	long state_pending[NR_VM_NODE_STAT_ITEMS];
-};
-
 /*
  * per-node information in memory controller.
  */
@@ -116,7 +99,7 @@ struct mem_cgroup_per_node {
 	struct lruvec lruvec;
 
 	struct lruvec_stats_percpu __percpu *lruvec_stats_percpu;
-	struct lruvec_stats lruvec_stats;
+	struct lruvec_stats *lruvec_stats;
 
 	unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
 
@@ -1037,42 +1020,9 @@ static inline void mod_memcg_page_state(struct page *page,
 }
 
 unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx);
-
-static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
-					      enum node_stat_item idx)
-{
-	struct mem_cgroup_per_node *pn;
-	long x;
-
-	if (mem_cgroup_disabled())
-		return node_page_state(lruvec_pgdat(lruvec), idx);
-
-	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	x = READ_ONCE(pn->lruvec_stats.state[idx]);
-#ifdef CONFIG_SMP
-	if (x < 0)
-		x = 0;
-#endif
-	return x;
-}
-
-static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
-						    enum node_stat_item idx)
-{
-	struct mem_cgroup_per_node *pn;
-	long x = 0;
-
-	if (mem_cgroup_disabled())
-		return node_page_state(lruvec_pgdat(lruvec), idx);
-
-	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	x = READ_ONCE(pn->lruvec_stats.state_local[idx]);
-#ifdef CONFIG_SMP
-	if (x < 0)
-		x = 0;
-#endif
-	return x;
-}
+unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx);
+unsigned long lruvec_page_state_local(struct lruvec *lruvec,
+				      enum node_stat_item idx);
 
 void mem_cgroup_flush_stats(struct mem_cgroup *memcg);
 void mem_cgroup_flush_stats_ratelimited(struct mem_cgroup *memcg);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c146187cda9c..7126459ec56a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -576,6 +576,60 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
 	return mz;
 }
 
+struct lruvec_stats_percpu {
+	/* Local (CPU and cgroup) state */
+	long state[NR_VM_NODE_STAT_ITEMS];
+
+	/* Delta calculation for lockless upward propagation */
+	long state_prev[NR_VM_NODE_STAT_ITEMS];
+};
+
+struct lruvec_stats {
+	/* Aggregated (CPU and subtree) state */
+	long state[NR_VM_NODE_STAT_ITEMS];
+
+	/* Non-hierarchical (CPU aggregated) state */
+	long state_local[NR_VM_NODE_STAT_ITEMS];
+
+	/* Pending child counts during tree propagation */
+	long state_pending[NR_VM_NODE_STAT_ITEMS];
+};
+
+unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
+{
+	struct mem_cgroup_per_node *pn;
+	long x;
+
+	if (mem_cgroup_disabled())
+		return node_page_state(lruvec_pgdat(lruvec), idx);
+
+	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+	x = READ_ONCE(pn->lruvec_stats->state[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
+}
+
+unsigned long lruvec_page_state_local(struct lruvec *lruvec,
+				      enum node_stat_item idx)
+{
+	struct mem_cgroup_per_node *pn;
+	long x = 0;
+
+	if (mem_cgroup_disabled())
+		return node_page_state(lruvec_pgdat(lruvec), idx);
+
+	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+	x = READ_ONCE(pn->lruvec_stats->state_local[idx]);
+#ifdef CONFIG_SMP
+	if (x < 0)
+		x = 0;
+#endif
+	return x;
+}
+
 /* Subset of vm_event_item to report for memcg event stats */
 static const unsigned int memcg_vm_event_stat[] = {
 	PGPGIN,
@@ -5491,18 +5545,25 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
 	if (!pn)
 		return 1;
 
+	pn->lruvec_stats = kzalloc_node(sizeof(struct lruvec_stats), GFP_KERNEL,
+					node);
+	if (!pn->lruvec_stats)
+		goto fail;
+
 	pn->lruvec_stats_percpu = alloc_percpu_gfp(struct lruvec_stats_percpu,
 						   GFP_KERNEL_ACCOUNT);
-	if (!pn->lruvec_stats_percpu) {
-		kfree(pn);
-		return 1;
-	}
+	if (!pn->lruvec_stats_percpu)
+		goto fail;
 
 	lruvec_init(&pn->lruvec);
 	pn->memcg = memcg;
 
 	memcg->nodeinfo[node] = pn;
 	return 0;
+fail:
+	kfree(pn->lruvec_stats);
+	kfree(pn);
+	return 1;
 }
 
 static void free_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
@@ -5513,6 +5574,7 @@ static void free_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
 		return;
 
 	free_percpu(pn->lruvec_stats_percpu);
+	kfree(pn->lruvec_stats);
 	kfree(pn);
 }
 
@@ -5865,18 +5927,19 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 
 	for_each_node_state(nid, N_MEMORY) {
 		struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid];
-		struct mem_cgroup_per_node *ppn = NULL;
+		struct lruvec_stats *lstats = pn->lruvec_stats;
+		struct lruvec_stats *plstats = NULL;
 		struct lruvec_stats_percpu *lstatc;
 
 		if (parent)
-			ppn = parent->nodeinfo[nid];
+			plstats = parent->nodeinfo[nid]->lruvec_stats;
 
 		lstatc = per_cpu_ptr(pn->lruvec_stats_percpu, cpu);
 
 		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
-			delta = pn->lruvec_stats.state_pending[i];
+			delta = lstats->state_pending[i];
 			if (delta)
-				pn->lruvec_stats.state_pending[i] = 0;
+				lstats->state_pending[i] = 0;
 
 			delta_cpu = 0;
 			v = READ_ONCE(lstatc->state[i]);
@@ -5887,12 +5950,12 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 			}
 
 			if (delta_cpu)
-				pn->lruvec_stats.state_local[i] += delta_cpu;
+				lstats->state_local[i] += delta_cpu;
 
 			if (delta) {
-				pn->lruvec_stats.state[i] += delta;
-				if (ppn)
-					ppn->lruvec_stats.state_pending[i] += delta;
+				lstats->state[i] += delta;
+				if (plstats)
+					plstats->state_pending[i] += delta;
 			}
 		}
 	}
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 3/8] mm: memcg: account memory used for memcg vmstats and lruvec stats
Date: Mon, 29 Apr 2024 23:06:07 -0700
Message-ID: <20240430060612.2171650-4-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>
From: Roman Gushchin

The percpu memory used by memcg's memory statistics is already accounted.
For consistency, let's enable accounting for vmstats and lruvec stats as
well.

Signed-off-by: Roman Gushchin
Signed-off-by: Shakeel Butt
Reviewed-by: T.J. Mercier
Reviewed-by: Yosry Ahmed
---
 mm/memcontrol.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
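A rough sketch of what the change below amounts to (illustrative wrapper
name; only the gfp flag is the point): passing GFP_KERNEL_ACCOUNT, i.e.
GFP_KERNEL | __GFP_ACCOUNT, makes the allocation itself be charged to the
allocating task's memcg, so the stat structures show up in the cgroup's own
memory footprint.

static int example_alloc_vmstats(struct mem_cgroup *memcg)
{
	/* Charged to the memcg instead of being "invisible" kernel memory. */
	memcg->vmstats = kzalloc(sizeof(struct memcg_vmstats), GFP_KERNEL_ACCOUNT);
	if (!memcg->vmstats)
		return -ENOMEM;
	return 0;
}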
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7126459ec56a..434cff91b65e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5545,8 +5545,8 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
 	if (!pn)
 		return 1;
 
-	pn->lruvec_stats = kzalloc_node(sizeof(struct lruvec_stats), GFP_KERNEL,
-					node);
+	pn->lruvec_stats = kzalloc_node(sizeof(struct lruvec_stats),
+					GFP_KERNEL_ACCOUNT, node);
 	if (!pn->lruvec_stats)
 		goto fail;
 
@@ -5617,7 +5617,8 @@ static struct mem_cgroup *mem_cgroup_alloc(struct mem_cgroup *parent)
 		goto fail;
 	}
 
-	memcg->vmstats = kzalloc(sizeof(struct memcg_vmstats), GFP_KERNEL);
+	memcg->vmstats = kzalloc(sizeof(struct memcg_vmstats),
+				 GFP_KERNEL_ACCOUNT);
 	if (!memcg->vmstats)
 		goto fail;
 
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 4/8] memcg: reduce memory for the lruvec and memcg stats
Date: Mon, 29 Apr 2024 23:06:08 -0700
Message-ID: <20240430060612.2171650-5-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>

At the moment, the amount of memory allocated for the stats-related structs
in mem_cgroup corresponds to the size of enum node_stat_item. However, not
all fields in enum node_stat_item have corresponding memcg stats. So, let's
use an indirection mechanism similar to the one used for memcg vmstats
management.

For a given x86_64 config, the size of the stats structs with and without
the patch is:

  structs                      size in bytes
                                 w/o   with
  struct lruvec_stats            1128    648
  struct lruvec_stats_percpu      752    432
  struct memcg_vmstats           1832   1352
  struct memcg_vmstats_percpu    1280    960

The memory savings are further compounded by the fact that these structs are
allocated for each CPU and for each node. To be precise, for each memcg the
memory saved would be:

  Memory saved = ((21 * 3 * NR_NODES) + (21 * 2 * NR_NODES * NR_CPUS) +
		  (21 * 3) + (21 * 2 * NR_CPUS)) * sizeof(long)

where 21 is the number of fields eliminated.

Signed-off-by: Shakeel Butt
---
Changes since v2:
- N/A

 mm/memcontrol.c | 138 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 115 insertions(+), 23 deletions(-)
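To get a feel for the formula, here is a small standalone program that plugs
in illustrative values; the node and CPU counts (2 and 64) are made up for
the example, only the formula itself comes from the commit message. With
these numbers it prints roughly 64 KiB saved per memcg.

#include <stdio.h>

int main(void)
{
	const long eliminated = 21;	/* fields dropped from the memcg stats */
	const long nr_nodes = 2;	/* hypothetical */
	const long nr_cpus = 64;	/* hypothetical */
	const long word = sizeof(long);	/* 8 on x86_64 */

	long saved = ((eliminated * 3 * nr_nodes) +
		      (eliminated * 2 * nr_nodes * nr_cpus) +
		      (eliminated * 3) +
		      (eliminated * 2 * nr_cpus)) * word;

	printf("bytes saved per memcg: %ld (~%ld KiB)\n", saved, saved / 1024);
	return 0;
}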
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 434cff91b65e..f424c5b2ba9b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -576,35 +576,105 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
 	return mz;
 }
 
+/* Subset of node_stat_item for memcg stats */
+static const unsigned int memcg_node_stat_items[] = {
+	NR_INACTIVE_ANON,
+	NR_ACTIVE_ANON,
+	NR_INACTIVE_FILE,
+	NR_ACTIVE_FILE,
+	NR_UNEVICTABLE,
+	NR_SLAB_RECLAIMABLE_B,
+	NR_SLAB_UNRECLAIMABLE_B,
+	WORKINGSET_REFAULT_ANON,
+	WORKINGSET_REFAULT_FILE,
+	WORKINGSET_ACTIVATE_ANON,
+	WORKINGSET_ACTIVATE_FILE,
+	WORKINGSET_RESTORE_ANON,
+	WORKINGSET_RESTORE_FILE,
+	WORKINGSET_NODERECLAIM,
+	NR_ANON_MAPPED,
+	NR_FILE_MAPPED,
+	NR_FILE_PAGES,
+	NR_FILE_DIRTY,
+	NR_WRITEBACK,
+	NR_SHMEM,
+	NR_SHMEM_THPS,
+	NR_FILE_THPS,
+	NR_ANON_THPS,
+	NR_KERNEL_STACK_KB,
+	NR_PAGETABLE,
+	NR_SECONDARY_PAGETABLE,
+#ifdef CONFIG_SWAP
+	NR_SWAPCACHE,
+#endif
+};
+
+static const unsigned int memcg_stat_items[] = {
+	MEMCG_SWAP,
+	MEMCG_SOCK,
+	MEMCG_PERCPU_B,
+	MEMCG_VMALLOC,
+	MEMCG_KMEM,
+	MEMCG_ZSWAP_B,
+	MEMCG_ZSWAPPED,
+};
+
+#define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
+#define NR_MEMCG_STATS (NR_MEMCG_NODE_STAT_ITEMS + ARRAY_SIZE(memcg_stat_items))
+static int8_t mem_cgroup_stats_index[MEMCG_NR_STAT] __read_mostly;
+
+static void init_memcg_stats(void)
+{
+	int8_t i, j = 0;
+
+	/* Switch to short once this failure occurs. */
+	BUILD_BUG_ON(NR_MEMCG_STATS >= 127 /* INT8_MAX */);
+
+	for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; ++i)
+		mem_cgroup_stats_index[memcg_node_stat_items[i]] = ++j;
+
+	for (i = 0; i < ARRAY_SIZE(memcg_stat_items); ++i)
+		mem_cgroup_stats_index[memcg_stat_items[i]] = ++j;
+}
+
+static inline int memcg_stats_index(int idx)
+{
+	return mem_cgroup_stats_index[idx] - 1;
+}
+
 struct lruvec_stats_percpu {
 	/* Local (CPU and cgroup) state */
-	long state[NR_VM_NODE_STAT_ITEMS];
+	long state[NR_MEMCG_NODE_STAT_ITEMS];
 
 	/* Delta calculation for lockless upward propagation */
-	long state_prev[NR_VM_NODE_STAT_ITEMS];
+	long state_prev[NR_MEMCG_NODE_STAT_ITEMS];
 };
 
 struct lruvec_stats {
 	/* Aggregated (CPU and subtree) state */
-	long state[NR_VM_NODE_STAT_ITEMS];
+	long state[NR_MEMCG_NODE_STAT_ITEMS];
 
 	/* Non-hierarchical (CPU aggregated) state */
-	long state_local[NR_VM_NODE_STAT_ITEMS];
+	long state_local[NR_MEMCG_NODE_STAT_ITEMS];
 
 	/* Pending child counts during tree propagation */
-	long state_pending[NR_VM_NODE_STAT_ITEMS];
+	long state_pending[NR_MEMCG_NODE_STAT_ITEMS];
 };
 
 unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
 {
 	struct mem_cgroup_per_node *pn;
-	long x;
+	long x = 0;
+	int i;
 
 	if (mem_cgroup_disabled())
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
-	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	x = READ_ONCE(pn->lruvec_stats->state[idx]);
+	i = memcg_stats_index(idx);
+	if (i >= 0) {
+		pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+		x = READ_ONCE(pn->lruvec_stats->state[i]);
+	}
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -617,12 +687,16 @@ unsigned long lruvec_page_state_local(struct lruvec *lruvec,
 {
 	struct mem_cgroup_per_node *pn;
 	long x = 0;
+	int i;
 
 	if (mem_cgroup_disabled())
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
-	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-	x = READ_ONCE(pn->lruvec_stats->state_local[idx]);
+	i = memcg_stats_index(idx);
+	if (i >= 0) {
+		pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+		x = READ_ONCE(pn->lruvec_stats->state_local[i]);
+	}
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -689,11 +763,11 @@ struct memcg_vmstats_percpu {
 	/* The above should fit a single cacheline for memcg_rstat_updated() */
 
 	/* Local (CPU and cgroup) page state & events */
-	long state[MEMCG_NR_STAT];
+	long state[NR_MEMCG_STATS];
 	unsigned long events[NR_MEMCG_EVENTS];
 
 	/* Delta calculation for lockless upward propagation */
-	long state_prev[MEMCG_NR_STAT];
+	long state_prev[NR_MEMCG_STATS];
 	unsigned long events_prev[NR_MEMCG_EVENTS];
 
 	/* Cgroup1: threshold notifications & softlimit tree updates */
@@ -703,15 +777,15 @@ struct memcg_vmstats_percpu {
 
 struct memcg_vmstats {
 	/* Aggregated (CPU and subtree) page state & events */
-	long state[MEMCG_NR_STAT];
+	long state[NR_MEMCG_STATS];
 	unsigned long events[NR_MEMCG_EVENTS];
 
 	/* Non-hierarchical (CPU aggregated) page state & events */
-	long state_local[MEMCG_NR_STAT];
+	long state_local[NR_MEMCG_STATS];
 	unsigned long events_local[NR_MEMCG_EVENTS];
 
 	/* Pending child counts during tree propagation */
-	long state_pending[MEMCG_NR_STAT];
+	long state_pending[NR_MEMCG_STATS];
 	unsigned long events_pending[NR_MEMCG_EVENTS];
 
 	/* Stats updates since the last flush */
@@ -844,7 +918,13 @@ static void flush_memcg_stats_dwork(struct work_struct *w)
 
 unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 {
-	long x = READ_ONCE(memcg->vmstats->state[idx]);
+	long x;
+	int i = memcg_stats_index(idx);
+
+	if (i < 0)
+		return 0;
+
+	x = READ_ONCE(memcg->vmstats->state[i]);
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -876,18 +956,25 @@ static int memcg_state_val_in_pages(int idx, int val)
  */
 void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 {
-	if (mem_cgroup_disabled())
+	int i = memcg_stats_index(idx);
+
+	if (mem_cgroup_disabled() || i < 0)
 		return;
 
-	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
+	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
 	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 }
 
 /* idx can be of type enum memcg_stat_item or node_stat_item. */
 static unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
 {
-	long x = READ_ONCE(memcg->vmstats->state_local[idx]);
+	long x;
+	int i = memcg_stats_index(idx);
+
+	if (i < 0)
+		return 0;
 
+	x = READ_ONCE(memcg->vmstats->state_local[i]);
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -901,6 +988,10 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 {
 	struct mem_cgroup_per_node *pn;
 	struct mem_cgroup *memcg;
+	int i = memcg_stats_index(idx);
+
+	if (i < 0)
+		return;
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
 	memcg = pn->memcg;
@@ -930,10 +1021,10 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	}
 
 	/* Update memcg */
-	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
+	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
 
 	/* Update lruvec */
-	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
+	__this_cpu_add(pn->lruvec_stats_percpu->state[i], val);
 
 	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 	memcg_stats_unlock();
@@ -5702,6 +5793,7 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 		page_counter_init(&memcg->kmem, &parent->kmem);
 		page_counter_init(&memcg->tcpmem, &parent->tcpmem);
 	} else {
+		init_memcg_stats();
 		init_memcg_events();
 		page_counter_init(&memcg->memory, NULL);
 		page_counter_init(&memcg->swap, NULL);
@@ -5873,7 +5965,7 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 
 	statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
 
-	for (i = 0; i < MEMCG_NR_STAT; i++) {
+	for (i = 0; i < NR_MEMCG_STATS; i++) {
 		/*
 		 * Collect the aggregated propagation counts of groups
 		 * below us. We're in a per-cpu loop here and this is
@@ -5937,7 +6029,7 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 
 		lstatc = per_cpu_ptr(pn->lruvec_stats_percpu, cpu);
 
-		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
+		for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; i++) {
 			delta = lstats->state_pending[i];
 			if (delta)
 				lstats->state_pending[i] = 0;
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 5/8] memcg: cleanup __mod_memcg_lruvec_state
Date: Mon, 29 Apr 2024 23:06:09 -0700
Message-ID: <20240430060612.2171650-6-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>
There are no memcg-specific stats for NR_SHMEM_PMDMAPPED and
NR_FILE_PMDMAPPED. Let's remove them.

Signed-off-by: Shakeel Butt
Reviewed-by: Yosry Ahmed
Reviewed-by: Roman Gushchin
Reviewed-by: T.J. Mercier
---
Changes since v2:
- N/A

 mm/memcontrol.c | 2 --
 1 file changed, 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f424c5b2ba9b..df94abc0088f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1008,8 +1008,6 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	case NR_ANON_MAPPED:
 	case NR_FILE_MAPPED:
 	case NR_ANON_THPS:
-	case NR_SHMEM_PMDMAPPED:
-	case NR_FILE_PMDMAPPED:
 		if (WARN_ON_ONCE(!in_task()))
 			pr_warn("stat item index: %d\n", idx);
 		break;
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 6/8] mm: cleanup WORKINGSET_NODES in workingset
Date: Mon, 29 Apr 2024 23:06:10 -0700
Message-ID: <20240430060612.2171650-7-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>
WORKINGSET_NODES is not exposed in the memcg stats, so there is no need to
use the memcg-specific stat update functions for it. If we decide to expose
WORKINGSET_NODES in the memcg stats in the future, we can revert this patch.

Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Reviewed-by: T.J. Mercier
---
Changes since v2:
- N/A

 mm/workingset.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index f2a0ecaf708d..c22adb93622a 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -618,6 +618,7 @@ struct list_lru shadow_nodes;
 void workingset_update_node(struct xa_node *node)
 {
 	struct address_space *mapping;
+	struct page *page = virt_to_page(node);
 
 	/*
 	 * Track non-empty nodes that contain only shadow entries;
@@ -633,12 +634,12 @@ void workingset_update_node(struct xa_node *node)
 	if (node->count && node->count == node->nr_values) {
 		if (list_empty(&node->private_list)) {
 			list_lru_add_obj(&shadow_nodes, &node->private_list);
-			__inc_lruvec_kmem_state(node, WORKINGSET_NODES);
+			__inc_node_page_state(page, WORKINGSET_NODES);
 		}
 	} else {
 		if (!list_empty(&node->private_list)) {
 			list_lru_del_obj(&shadow_nodes, &node->private_list);
-			__dec_lruvec_kmem_state(node, WORKINGSET_NODES);
+			__dec_node_page_state(page, WORKINGSET_NODES);
 		}
 	}
 }
@@ -742,7 +743,7 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
 	}
 
 	list_lru_isolate(lru, item);
-	__dec_lruvec_kmem_state(node, WORKINGSET_NODES);
+	__dec_node_page_state(virt_to_page(node), WORKINGSET_NODES);
 
 	spin_unlock(lru_lock);
 
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 7/8] memcg: warn for unexpected events and stats
Date: Mon, 29 Apr 2024 23:06:11 -0700
Message-ID: <20240430060612.2171650-8-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>
To reduce the memory usage of memcg events and stats, the kernel uses an
indirection table and only allocates the stats and events which are actually
used by the memcg code. To make this more robust, let's add warnings where
unexpected stats and events indexes are used.

Signed-off-by: Shakeel Butt
---
Changes since v2:
- Based on feedback from Johannes, switched to WARN_ONCE() from
  pr_warn_once().

 mm/memcontrol.c | 55 ++++++++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 23 deletions(-)
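The guard pattern applied throughout the diff below can be sketched as
follows (example_read_stat is a hypothetical wrapper, not a function added by
this patch): the translated index is checked once at the top, and a negative
value, i.e. an item with no memcg counterpart, triggers a one-time warning
and an early bail-out instead of an out-of-bounds access.

static unsigned long example_read_stat(struct mem_cgroup *memcg, int idx)
{
	int i = memcg_stats_index(idx);	/* -1 if idx has no memcg slot */

	/* WARN_ONCE() evaluates to true when the condition holds, so warn and bail. */
	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
		return 0;

	return READ_ONCE(memcg->vmstats->state[i]);
}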
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index df94abc0088f..72e36977a96e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -664,17 +664,18 @@ struct lruvec_stats {
 unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
 {
 	struct mem_cgroup_per_node *pn;
-	long x = 0;
+	long x;
 	int i;
 
 	if (mem_cgroup_disabled())
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
 	i = memcg_stats_index(idx);
-	if (i >= 0) {
-		pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-		x = READ_ONCE(pn->lruvec_stats->state[i]);
-	}
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+		return 0;
+
+	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+	x = READ_ONCE(pn->lruvec_stats->state[i]);
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -686,17 +687,18 @@ unsigned long lruvec_page_state_local(struct lruvec *lruvec,
 					      enum node_stat_item idx)
 {
 	struct mem_cgroup_per_node *pn;
-	long x = 0;
+	long x;
 	int i;
 
 	if (mem_cgroup_disabled())
 		return node_page_state(lruvec_pgdat(lruvec), idx);
 
 	i = memcg_stats_index(idx);
-	if (i >= 0) {
-		pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
-		x = READ_ONCE(pn->lruvec_stats->state_local[i]);
-	}
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
+		return 0;
+
+	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+	x = READ_ONCE(pn->lruvec_stats->state_local[i]);
 #ifdef CONFIG_SMP
 	if (x < 0)
 		x = 0;
@@ -921,7 +923,7 @@ unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 	long x;
 	int i = memcg_stats_index(idx);
 
-	if (i < 0)
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
 		return 0;
 
 	x = READ_ONCE(memcg->vmstats->state[i]);
@@ -958,7 +960,10 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 {
 	int i = memcg_stats_index(idx);
 
-	if (mem_cgroup_disabled() || i < 0)
+	if (mem_cgroup_disabled())
+		return;
+
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[i], val);
@@ -971,7 +976,7 @@ static unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
 	long x;
 	int i = memcg_stats_index(idx);
 
-	if (i < 0)
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
 		return 0;
 
 	x = READ_ONCE(memcg->vmstats->state_local[i]);
@@ -990,7 +995,7 @@ static void __mod_memcg_lruvec_state(struct lruvec *lruvec,
 	struct mem_cgroup *memcg;
 	int i = memcg_stats_index(idx);
 
-	if (i < 0)
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
 		return;
 
 	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
@@ -1104,34 +1109,38 @@ void __mod_lruvec_kmem_state(void *p, enum node_stat_item idx, int val)
 void __count_memcg_events(struct mem_cgroup *memcg, enum vm_event_item idx,
 			  unsigned long count)
 {
-	int index = memcg_events_index(idx);
+	int i = memcg_events_index(idx);
 
-	if (mem_cgroup_disabled() || index < 0)
+	if (mem_cgroup_disabled())
+		return;
+
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, idx))
 		return;
 
 	memcg_stats_lock();
-	__this_cpu_add(memcg->vmstats_percpu->events[index], count);
+	__this_cpu_add(memcg->vmstats_percpu->events[i], count);
 	memcg_rstat_updated(memcg, count);
 	memcg_stats_unlock();
 }
 
 static unsigned long memcg_events(struct mem_cgroup *memcg, int event)
 {
-	int index = memcg_events_index(event);
+	int i = memcg_events_index(event);
 
-	if (index < 0)
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, event))
 		return 0;
-	return READ_ONCE(memcg->vmstats->events[index]);
+
+	return READ_ONCE(memcg->vmstats->events[i]);
 }
 
 static unsigned long memcg_events_local(struct mem_cgroup *memcg, int event)
 {
-	int index = memcg_events_index(event);
+	int i = memcg_events_index(event);
 
-	if (index < 0)
+	if (WARN_ONCE(i < 0, "%s: missing stat item %d\n", __func__, event))
 		return 0;
 
-	return READ_ONCE(memcg->vmstats->events_local[index]);
+	return READ_ONCE(memcg->vmstats->events_local[i]);
 }
 
 static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
-- 
2.43.0

From nobody Thu Dec 18 18:00:23 2025
From: Shakeel Butt
To: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
    Yosry Ahmed, T. J. Mercier
Cc: kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 8/8] memcg: use proper type for mod_memcg_state
Date: Mon, 29 Apr 2024 23:06:12 -0700
Message-ID: <20240430060612.2171650-9-shakeel.butt@linux.dev>
In-Reply-To: <20240430060612.2171650-1-shakeel.butt@linux.dev>
References: <20240430060612.2171650-1-shakeel.butt@linux.dev>
The memcg stats update functions currently take an arbitrary int for the
index, but the only inputs that make sense are enum memcg_stat_item values,
and we don't want these functions called with anything else. So replace the
parameter type with enum memcg_stat_item so that the compiler can warn if
the memcg stat update functions are called with an incorrect index value.

Signed-off-by: Shakeel Butt
Reviewed-by: T.J. Mercier
---
Change since v2:
- Fixed whitespace issue based on TJ's suggestion.

 include/linux/memcontrol.h | 13 +++++++------
 mm/memcontrol.c            |  3 ++-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index ab8a6e884375..030d34e9d117 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -974,7 +974,8 @@ void mem_cgroup_print_oom_group(struct mem_cgroup *memcg);
 void folio_memcg_lock(struct folio *folio);
 void folio_memcg_unlock(struct folio *folio);
 
-void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val);
+void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx,
+		       int val);
 
 /* try to stablize folio_memcg() for all the pages in a memcg */
 static inline bool mem_cgroup_trylock_pages(struct mem_cgroup *memcg)
@@ -995,7 +996,7 @@ static inline void mem_cgroup_unlock_pages(void)
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
 static inline void mod_memcg_state(struct mem_cgroup *memcg,
-				   int idx, int val)
+				   enum memcg_stat_item idx, int val)
 {
 	unsigned long flags;
 
@@ -1005,7 +1006,7 @@ static inline void mod_memcg_state(struct mem_cgroup *memcg,
 }
 
 static inline void mod_memcg_page_state(struct page *page,
-					int idx, int val)
+					enum memcg_stat_item idx, int val)
 {
 	struct mem_cgroup *memcg;
 
@@ -1491,19 +1492,19 @@ static inline void mem_cgroup_print_oom_group(struct mem_cgroup *memcg)
 }
 
 static inline void __mod_memcg_state(struct mem_cgroup *memcg,
-				     int idx,
+				     enum memcg_stat_item idx,
 				     int nr)
 {
 }
 
 static inline void mod_memcg_state(struct mem_cgroup *memcg,
-				   int idx,
+				   enum memcg_stat_item idx,
 				   int nr)
 {
 }
 
 static inline void mod_memcg_page_state(struct page *page,
-					int idx, int val)
+					enum memcg_stat_item idx, int val)
 {
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 72e36977a96e..f5fc16b918ba 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -956,7 +956,8 @@ static int memcg_state_val_in_pages(int idx, int val)
  * @idx: the stat item - can be enum memcg_stat_item or enum node_stat_item
  * @val: delta to add to the counter, can be negative
  */
-void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
+void __mod_memcg_state(struct mem_cgroup *memcg, enum memcg_stat_item idx,
+		       int val)
 {
 	int i = memcg_stats_index(idx);
 
-- 
2.43.0
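To illustrate why the enum-typed parameter helps, here is a small userspace
sketch (hypothetical enum and function names, not the kernel's): with
distinct enumeration types, a call site that passes the wrong kind of index
can be flagged by diagnostics such as -Wenum-conversion, which a plain int
parameter can never catch.

enum my_memcg_stat_item { MY_MEMCG_SWAP, MY_MEMCG_SOCK };
enum my_node_stat_item  { MY_NR_FILE_PAGES, MY_NR_WRITEBACK };

static long counters[2];

static void mod_my_stat(enum my_memcg_stat_item idx, int val)
{
	counters[idx] += val;
}

int main(void)
{
	mod_my_stat(MY_MEMCG_SOCK, 1);		/* fine */
	mod_my_stat(MY_NR_WRITEBACK, 1);	/* still compiles, but -Wenum-conversion can flag it */
	return 0;
}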