[PATCH] mm: memcg: dump memcg protection info on oom or alloc failures

Posted by Shakeel Butt 3 months ago
Currently the kernel dumps memory state on oom and allocation failures.
One of the questions usually raised about those dumps is why the kernel
has not reclaimed the reclaimable memory instead of triggering oom. One
potential reason is the use of memory protection provided by memcg.
So, let's also dump the memory protected by the memcg in such reports to
ease debugging.

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 include/linux/memcontrol.h |  5 +++++
 mm/memcontrol.c            | 13 +++++++++++++
 mm/oom_kill.c              |  1 +
 mm/page_alloc.c            |  1 +
 4 files changed, 20 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 8d2e250535a8..6861f0ff02b5 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1779,6 +1779,7 @@ static inline bool memcg_is_dying(struct mem_cgroup *memcg)
 	return memcg ? css_is_dying(&memcg->css) : false;
 }
 
+void mem_cgroup_show_protected_memory(struct mem_cgroup *memcg);
 #else
 static inline bool mem_cgroup_kmem_disabled(void)
 {
@@ -1850,6 +1851,10 @@ static inline bool memcg_is_dying(struct mem_cgroup *memcg)
 {
 	return false;
 }
+
+static inline void mem_cgroup_show_protected_memory(struct mem_cgroup *memcg)
+{
+}
 #endif /* CONFIG_MEMCG */
 
 #if defined(CONFIG_MEMCG) && defined(CONFIG_ZSWAP)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c34029e92bab..623446821b00 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5636,3 +5636,16 @@ bool mem_cgroup_node_allowed(struct mem_cgroup *memcg, int nid)
 {
 	return memcg ? cpuset_node_allowed(memcg->css.cgroup, nid) : true;
 }
+
+void mem_cgroup_show_protected_memory(struct mem_cgroup *memcg)
+{
+	if (mem_cgroup_disabled() || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
+		return;
+
+	if (!memcg)
+		memcg = root_mem_cgroup;
+
+	pr_warn("Memory cgroup min protection %lukB -- low protection %lukB",
+		K(atomic_long_read(&memcg->memory.children_min_usage)*PAGE_SIZE),
+		K(atomic_long_read(&memcg->memory.children_low_usage)*PAGE_SIZE));
+}
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index c145b0feecc1..5eb11fbba704 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -472,6 +472,7 @@ static void dump_header(struct oom_control *oc)
 		if (should_dump_unreclaim_slab())
 			dump_unreclaimable_slab();
 	}
+	mem_cgroup_show_protected_memory(oc->memcg);
 	if (sysctl_oom_dump_tasks)
 		dump_tasks(oc);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e4efda1158b2..26be5734253f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3977,6 +3977,7 @@ static void warn_alloc_show_mem(gfp_t gfp_mask, nodemask_t *nodemask)
 		filter &= ~SHOW_MEM_FILTER_NODES;
 
 	__show_mem(filter, nodemask, gfp_zone(gfp_mask));
+	mem_cgroup_show_protected_memory(NULL);
 }
 
 void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
-- 
2.47.3
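
A userspace cross-check may help when reading the new line in an oom or
allocation-failure report. The sketch below is not part of the patch. It
parses the cgroup v2 memory.min and memory.low interface files, which hold
either a byte count or the literal "max", and prints the value in kB. Note
that these files report the configured protection, which is related to but
not the same as the children_min_usage and children_low_usage counters the
patch prints. The helper names parse_protection() and print_protection()
are hypothetical, and the cgroup directory passed in (for example one
under /sys/fs/cgroup) is an assumption about the system's mount layout.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the contents of a cgroup v2 memory.min or memory.low file.
 * The file holds either a byte count or the literal "max".
 * Returns 0 on success, -1 on malformed input. "max" is reported
 * through *is_max, with *bytes left at 0.
 */
static int parse_protection(const char *text, unsigned long long *bytes,
			    int *is_max)
{
	char *end;

	*bytes = 0;
	*is_max = 0;
	if (strncmp(text, "max", 3) == 0 &&
	    (text[3] == '\n' || text[3] == '\0')) {
		*is_max = 1;
		return 0;
	}
	/* Reject anything that does not start with a digit (sign, space). */
	if (text[0] < '0' || text[0] > '9')
		return -1;
	errno = 0;
	*bytes = strtoull(text, &end, 10);
	if (errno || (*end != '\n' && *end != '\0'))
		return -1;
	return 0;
}

/* Read one protection file, e.g. "<cgroup_dir>/memory.min", print in kB. */
static int print_protection(const char *cgroup_dir, const char *name)
{
	char path[4096], buf[64];
	unsigned long long bytes;
	int is_max;
	FILE *f;

	snprintf(path, sizeof(path), "%s/%s", cgroup_dir, name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	if (parse_protection(buf, &bytes, &is_max))
		return -1;
	if (is_max)
		printf("%s: max\n", name);
	else
		printf("%s: %llukB\n", name, bytes / 1024);
	return 0;
}
```

A caller would invoke print_protection(dir, "memory.min") and
print_protection(dir, "memory.low") for each cgroup of interest and compare
the totals against the reclaimable memory shown in the report.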
Re: [PATCH] mm: memcg: dump memcg protection info on oom or alloc failures
Posted by Vlastimil Babka 2 months, 4 weeks ago
On 11/8/25 00:40, Shakeel Butt wrote:
> Currently kernel dumps memory state on oom and allocation failures. One
> of the question usually raised on those dumps is why the kernel has not
> reclaimed the reclaimable memory instead of triggering oom. One
> potential reason is the usage of memory protection provided by memcg.
> So, let's also dump the memory protected by the memcg in such reports to
> ease the debugging.
> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Re: [PATCH] mm: memcg: dump memcg protection info on oom or alloc failures
Posted by Michal Hocko 2 months, 4 weeks ago
On Fri 07-11-25 15:40:41, Shakeel Butt wrote:
> Currently kernel dumps memory state on oom and allocation failures. One
> of the question usually raised on those dumps is why the kernel has not
> reclaimed the reclaimable memory instead of triggering oom. One
> potential reason is the usage of memory protection provided by memcg.
> So, let's also dump the memory protected by the memcg in such reports to
> ease the debugging.

Makes sense to me.

> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>

Acked-by: Michal Hocko <mhocko@suse.com>
Thanks!

-- 
Michal Hocko
SUSE Labs
Re: [PATCH] mm: memcg: dump memcg protection info on oom or alloc failures
Posted by SeongJae Park 3 months ago
On Fri,  7 Nov 2025 15:40:41 -0800 Shakeel Butt <shakeel.butt@linux.dev> wrote:

> Currently kernel dumps memory state on oom and allocation failures. One
> of the question usually raised on those dumps is why the kernel has not
> reclaimed the reclaimable memory instead of triggering oom. One
> potential reason is the usage of memory protection provided by memcg.
> So, let's also dump the memory protected by the memcg in such reports to
> ease the debugging.
> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
[...]
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c34029e92bab..623446821b00 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5636,3 +5636,16 @@ bool mem_cgroup_node_allowed(struct mem_cgroup *memcg, int nid)
>  {
>  	return memcg ? cpuset_node_allowed(memcg->css.cgroup, nid) : true;
>  }
> +
> +void mem_cgroup_show_protected_memory(struct mem_cgroup *memcg)
> +{
> +	if (mem_cgroup_disabled() || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		return;
> +
> +	if (!memcg)
> +		memcg = root_mem_cgroup;
> +
> +	pr_warn("Memory cgroup min protection %lukB -- low protection %lukB",
> +		K(atomic_long_read(&memcg->memory.children_min_usage)*PAGE_SIZE),
> +		K(atomic_long_read(&memcg->memory.children_low_usage)*PAGE_SIZE));
> +}

I didn't expect this function to show the information by calling pr_warn().
To me, "show" feels like something for file operations, like memory_min_show().

What about s/show/dump/ on the name?  It would make it more consistent with the
subject of this patch, and with other similar functions like dump_page().

No strong opinion.  The current name is also ok for me; I'm just curious about
your thoughts.


Thanks,
SJ

[...]
Re: [PATCH] mm: memcg: dump memcg protection info on oom or alloc failures
Posted by Shakeel Butt 3 months ago
On Fri, Nov 07, 2025 at 06:26:38PM -0800, SeongJae Park wrote:
> On Fri,  7 Nov 2025 15:40:41 -0800 Shakeel Butt <shakeel.butt@linux.dev> wrote:
> 
> > Currently kernel dumps memory state on oom and allocation failures. One
> > of the question usually raised on those dumps is why the kernel has not
> > reclaimed the reclaimable memory instead of triggering oom. One
> > potential reason is the usage of memory protection provided by memcg.
> > So, let's also dump the memory protected by the memcg in such reports to
> > ease the debugging.
> > 
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > ---
> [...]
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index c34029e92bab..623446821b00 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5636,3 +5636,16 @@ bool mem_cgroup_node_allowed(struct mem_cgroup *memcg, int nid)
> >  {
> >  	return memcg ? cpuset_node_allowed(memcg->css.cgroup, nid) : true;
> >  }
> > +
> > +void mem_cgroup_show_protected_memory(struct mem_cgroup *memcg)
> > +{
> > +	if (mem_cgroup_disabled() || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
> > +		return;
> > +
> > +	if (!memcg)
> > +		memcg = root_mem_cgroup;
> > +
> > +	pr_warn("Memory cgroup min protection %lukB -- low protection %lukB",
> > +		K(atomic_long_read(&memcg->memory.children_min_usage)*PAGE_SIZE),
> > +		K(atomic_long_read(&memcg->memory.children_low_usage)*PAGE_SIZE));
> > +}
> 
> I didn't expect this function is showing the information by calling pr_warn().
> To me, "show" feels like something for file operations, like memory_min_show().
> 
> What about s/show/dump/ on the name?  It makes it more consistent with the
> subject of this patch, and other similar functions like dump_page() ?
> 
> No strong opinion.  The current name is also ok for me, but I'm just curious your thought.
> 

I just took inspiration from show_mem(). Initially I was trying to
put these pr_warn() calls in show_mem(), but noticed that it is called
from more places than where I intend to print this info, so I decided
to have a separate function.

Thanks for taking a look.
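
The design choice described in this reply can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not
the kernel code: every sketch_* name is hypothetical, SKETCH_MEMCG stands
in for CONFIG_MEMCG, and the printed strings are placeholders. It shows a
general dump routine with several callers, a separate protection helper
invoked only from the oom and allocation-failure paths, and an empty
static inline stub so callers need no #ifdef when the feature is
compiled out.

```c
#include <stdio.h>

/* Hypothetical switch standing in for CONFIG_MEMCG. */
#define SKETCH_MEMCG 1

/* Counts protection dumps; exists only to make the sketch checkable. */
static int protected_dumps;

#ifdef SKETCH_MEMCG
static void sketch_show_protected_memory(void)
{
	protected_dumps++;
	printf("[memcg protection info]\n");
}
#else
/* Empty stub: call sites compile unchanged when the feature is off. */
static inline void sketch_show_protected_memory(void)
{
}
#endif

/* General memory dump, used by every caller below. */
static void sketch_show_mem(void)
{
	printf("[general memory info]\n");
}

/* oom path: general dump plus protection info. */
static void sketch_oom_dump_header(void)
{
	sketch_show_mem();
	sketch_show_protected_memory();
}

/* Allocation-failure path: general dump plus protection info. */
static void sketch_warn_alloc(void)
{
	sketch_show_mem();
	sketch_show_protected_memory();
}

/* Any other caller gets the general dump only. */
static void sketch_other_caller(void)
{
	sketch_show_mem();
}
```

Keeping the helper out of the general dump routine means the extra line
appears only in the two reports where it is wanted.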