[PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info

Yueyang Pan posted 2 patches 1 month ago
There is a newer version of this series
[PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Yueyang Pan 1 month ago
In production, show_mem() can be called concurrently from two
different entities, for example one from oom_kill_process()
another from __alloc_pages_slowpath from another kthread. This
patch adds a mutex and invokes trylock before printing out the
kernel alloc info in show_mem(). This way two alloc info won't
interleave with each other, which then makes parsing easier.

Signed-off-by: Yueyang Pan <pyyjason@gmail.com>
---
 mm/show_mem.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/show_mem.c b/mm/show_mem.c
index b71e222fde86..8814b5f8a7dc 100644
--- a/mm/show_mem.c
+++ b/mm/show_mem.c
@@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
 unsigned long totalreserve_pages __read_mostly;
 unsigned long totalcma_pages __read_mostly;
 
+static DEFINE_MUTEX(mem_alloc_profiling_mutex);
+
 static inline void show_node(struct zone *zone)
 {
 	if (IS_ENABLED(CONFIG_NUMA))
@@ -419,7 +421,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
 	printk("%lu pages hwpoisoned\n", atomic_long_read(&num_poisoned_pages));
 #endif
 #ifdef CONFIG_MEM_ALLOC_PROFILING
-	if (mem_alloc_profiling_enabled()) {
+	if (mem_alloc_profiling_enabled() && mutex_trylock(&mem_alloc_profiling_mutex)) {
 		struct codetag_bytes tags[10];
 		size_t i, nr;
 
@@ -445,6 +447,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
 						  ct->lineno, ct->function);
 			}
 		}
+		mutex_unlock(&mem_alloc_profiling_mutex);
 	}
 #endif
 }
-- 
2.47.3
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Andrew Morton 1 month ago
On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:

> In production, show_mem() can be called concurrently from two
> different entities, for example one from oom_kill_process()
> another from __alloc_pages_slowpath from another kthread. This
> patch adds a mutex and invokes trylock before printing out the
> kernel alloc info in show_mem(). This way two alloc info won't
> interleave with each other, which then makes parsing easier.
> 

Fair enough, I guess.

> --- a/mm/show_mem.c
> +++ b/mm/show_mem.c
> @@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
>  unsigned long totalreserve_pages __read_mostly;
>  unsigned long totalcma_pages __read_mostly;
>  
> +static DEFINE_MUTEX(mem_alloc_profiling_mutex);

It would be a bit neater to make this local to __show_mem() - it didn't
need file scope.

Also, mutex_unlock() isn't to be used from interrupt context, so
problem.

Something like atomic cmpxchg or test_and_set_bit could be used and
wouldn't involve mutex_unlock()'s wakeup logic, which isn't needed
here.

>  static inline void show_node(struct zone *zone)
>  {
>  	if (IS_ENABLED(CONFIG_NUMA))
> @@ -419,7 +421,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
>  	printk("%lu pages hwpoisoned\n", atomic_long_read(&num_poisoned_pages));
>  #endif
>  #ifdef CONFIG_MEM_ALLOC_PROFILING
> -	if (mem_alloc_profiling_enabled()) {
> +	if (mem_alloc_profiling_enabled() && mutex_trylock(&mem_alloc_profiling_mutex)) {
>  		struct codetag_bytes tags[10];
>  		size_t i, nr;
>  
> @@ -445,6 +447,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
>  						  ct->lineno, ct->function);
>  			}
>  		}
> +		mutex_unlock(&mem_alloc_profiling_mutex);
>  	}

If we're going to suppress the usual output then how about we let
people know this happened, rather than silently dropping it?

pr_notice("memory allocation output suppressed due to show_mem() contention\n")

or something like that?
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Yueyang Pan 1 month ago
On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
> 
> > In production, show_mem() can be called concurrently from two
> > different entities, for example one from oom_kill_process()
> > another from __alloc_pages_slowpath from another kthread. This
> > patch adds a mutex and invokes trylock before printing out the
> > kernel alloc info in show_mem(). This way two alloc info won't
> > interleave with each other, which then makes parsing easier.
> > 
> 
> Fair enough, I guess.
> 
> > --- a/mm/show_mem.c
> > +++ b/mm/show_mem.c
> > @@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
> >  unsigned long totalreserve_pages __read_mostly;
> >  unsigned long totalcma_pages __read_mostly;
> >  
> > +static DEFINE_MUTEX(mem_alloc_profiling_mutex);
> 
> It would be a bit neater to make this local to __show_mem() - it didn't
> need file scope.

Thanks for your feedback, Andrew. I will move it the next version.

> 
> Also, mutex_unlock() isn't to be used from interrupt context, so
> problem.
> 
> Something like atomic cmpxchg or test_and_set_bit could be used and
> wouldn't involve mutex_unlock()'s wakeup logic, which isn't needed
> here.

I was not aware of interrupt context before. I will change to test-and-set 
lock in the next version.

> 
> >  static inline void show_node(struct zone *zone)
> >  {
> >  	if (IS_ENABLED(CONFIG_NUMA))
> > @@ -419,7 +421,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
> >  	printk("%lu pages hwpoisoned\n", atomic_long_read(&num_poisoned_pages));
> >  #endif
> >  #ifdef CONFIG_MEM_ALLOC_PROFILING
> > -	if (mem_alloc_profiling_enabled()) {
> > +	if (mem_alloc_profiling_enabled() && mutex_trylock(&mem_alloc_profiling_mutex)) {
> >  		struct codetag_bytes tags[10];
> >  		size_t i, nr;
> >  
> > @@ -445,6 +447,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
> >  						  ct->lineno, ct->function);
> >  			}
> >  		}
> > +		mutex_unlock(&mem_alloc_profiling_mutex);
> >  	}
> 
> If we're going to suppress the usual output then how about we let
> people know this happened, rather than silently dropping it?
> 
> pr_notice("memory allocation output suppressed due to show_mem() contention\n")
> 
> or something like that?

For this point, I am sort of on Shakeel's side. Probably I won't call 
it suppressed as two concurrent printers is actually sharing this 
global information.

Thanks,
Pan
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Vlastimil Babka 1 month ago
On 8/28/25 10:34, Yueyang Pan wrote:
> On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
>> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
>> 
>> > In production, show_mem() can be called concurrently from two
>> > different entities, for example one from oom_kill_process()
>> > another from __alloc_pages_slowpath from another kthread. This
>> > patch adds a mutex and invokes trylock before printing out the
>> > kernel alloc info in show_mem(). This way two alloc info won't
>> > interleave with each other, which then makes parsing easier.

What about the rest of the information printed by show_mem() being interleaved?

>> > 
>> 
>> Fair enough, I guess.
>> 
>> > --- a/mm/show_mem.c
>> > +++ b/mm/show_mem.c
>> > @@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
>> >  unsigned long totalreserve_pages __read_mostly;
>> >  unsigned long totalcma_pages __read_mostly;
>> >  
>> > +static DEFINE_MUTEX(mem_alloc_profiling_mutex);
>> 
>> It would be a bit neater to make this local to __show_mem() - it didn't
>> need file scope.
> 
> Thanks for your feedback, Andrew. I will move it the next version.
> 
>> 
>> Also, mutex_unlock() isn't to be used from interrupt context, so
>> problem.
>> 
>> Something like atomic cmpxchg or test_and_set_bit could be used and
>> wouldn't involve mutex_unlock()'s wakeup logic, which isn't needed
>> here.
> 
> I was not aware of interrupt context before. I will change to test-and-set 
> lock in the next version.

Perhaps simply spinlock_t with spin_trylock()?
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Shakeel Butt 1 month ago
On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
> > 
> > I was not aware of interrupt context before. I will change to test-and-set 
> > lock in the next version.
> 
> Perhaps simply spinlock_t with spin_trylock()?
> 

Will lockdep complain that this spinlock is taken in non-irq and irq
context without disabling irqs?
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Vlastimil Babka 1 month ago
On 8/28/25 18:35, Shakeel Butt wrote:
> On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
>> > 
>> > I was not aware of interrupt context before. I will change to test-and-set 
>> > lock in the next version.
>> 
>> Perhaps simply spinlock_t with spin_trylock()?
>> 
> 
> Will lockdep complain that this spinlock is taken in non-irq and irq
> context without disabling irqs?

If we only use spin_trylock then it won't.
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Yueyang Pan 1 month ago
On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
> On 8/28/25 10:34, Yueyang Pan wrote:
> > On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
> >> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
> >> 
> >> > In production, show_mem() can be called concurrently from two
> >> > different entities, for example one from oom_kill_process()
> >> > another from __alloc_pages_slowpath from another kthread. This
> >> > patch adds a mutex and invokes trylock before printing out the
> >> > kernel alloc info in show_mem(). This way two alloc info won't
> >> > interleave with each other, which then makes parsing easier.
> 
> What about the rest of the information printed by show_mem() being interleaved?

Thanks for your feedback, Vlastimil. We cannot use trylock for the rest 
part as node filter can be different.
Do you think we need a lock to prevent the whole show_mem() from being 
interleaved and to acquire it at the very beginning? Will it be too 
heavy?

> 
> >> > 
> >> 
> >> Fair enough, I guess.
> >> 
> >> > --- a/mm/show_mem.c
> >> > +++ b/mm/show_mem.c
> >> > @@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
> >> >  unsigned long totalreserve_pages __read_mostly;
> >> >  unsigned long totalcma_pages __read_mostly;
> >> >  
> >> > +static DEFINE_MUTEX(mem_alloc_profiling_mutex);
> >> 
> >> It would be a bit neater to make this local to __show_mem() - it didn't
> >> need file scope.
> > 
> > Thanks for your feedback, Andrew. I will move it the next version.
> > 
> >> 
> >> Also, mutex_unlock() isn't to be used from interrupt context, so
> >> problem.
> >> 
> >> Something like atomic cmpxchg or test_and_set_bit could be used and
> >> wouldn't involve mutex_unlock()'s wakeup logic, which isn't needed
> >> here.
> > 
> > I was not aware of interrupt context before. I will change to test-and-set 
> > lock in the next version.
> 
> Perhaps simply spinlock_t with spin_trylock()?
> 

Agreed.

Thanks
Pan
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Vlastimil Babka 1 month ago
On 8/28/25 10:47, Yueyang Pan wrote:
> On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
>> On 8/28/25 10:34, Yueyang Pan wrote:
>> > On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
>> >> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
>> >> 
>> >> > In production, show_mem() can be called concurrently from two
>> >> > different entities, for example one from oom_kill_process()
>> >> > another from __alloc_pages_slowpath from another kthread. This
>> >> > patch adds a mutex and invokes trylock before printing out the
>> >> > kernel alloc info in show_mem(). This way two alloc info won't
>> >> > interleave with each other, which then makes parsing easier.
>> 
>> What about the rest of the information printed by show_mem() being interleaved?
> 
> Thanks for your feedback, Vlastimil. We cannot use trylock for the rest 
> part as node filter can be different.

Right.

> Do you think we need a lock to prevent the whole show_mem() from being 
> interleaved and to acquire it at the very beginning? Will it be too 
> heavy?

It might be risky so perhaps let's not. Guess we can disentangle by dmesg
showing the thread id prefix.
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Yueyang Pan 1 month ago
On Thu, Aug 28, 2025 at 10:53:01AM +0200, Vlastimil Babka wrote:
> On 8/28/25 10:47, Yueyang Pan wrote:
> > On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
> >> On 8/28/25 10:34, Yueyang Pan wrote:
> >> > On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
> >> >> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
> >> >> 
> >> >> > In production, show_mem() can be called concurrently from two
> >> >> > different entities, for example one from oom_kill_process()
> >> >> > another from __alloc_pages_slowpath from another kthread. This
> >> >> > patch adds a mutex and invokes trylock before printing out the
> >> >> > kernel alloc info in show_mem(). This way two alloc info won't
> >> >> > interleave with each other, which then makes parsing easier.
> >> 
> >> What about the rest of the information printed by show_mem() being interleaved?
> > 
> > Thanks for your feedback, Vlastimil. We cannot use trylock for the rest 
> > part as node filter can be different.
> 
> Right.
> 
> > Do you think we need a lock to prevent the whole show_mem() from being 
> > interleaved and to acquire it at the very beginning? Will it be too 
> > heavy?
> 
> It might be risky so perhaps let's not. Guess we can disentangle by dmesg
> showing the thread id prefix.

I have thought about this. Since each line can interleave with another, we 
would end up adding tid to each line. Not sure if this is acceptable.

Thanks
Pan
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Vlastimil Babka 1 month ago
On 8/28/25 11:51, Yueyang Pan wrote:
> On Thu, Aug 28, 2025 at 10:53:01AM +0200, Vlastimil Babka wrote:
>> On 8/28/25 10:47, Yueyang Pan wrote:
>> > On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
>> >> On 8/28/25 10:34, Yueyang Pan wrote:
>> >> > On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
>> >> >> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
>> >> >> 
>> >> >> > In production, show_mem() can be called concurrently from two
>> >> >> > different entities, for example one from oom_kill_process()
>> >> >> > another from __alloc_pages_slowpath from another kthread. This
>> >> >> > patch adds a mutex and invokes trylock before printing out the
>> >> >> > kernel alloc info in show_mem(). This way two alloc info won't
>> >> >> > interleave with each other, which then makes parsing easier.
>> >> 
>> >> What about the rest of the information printed by show_mem() being interleaved?
>> > 
>> > Thanks for your feedback, Vlastimil. We cannot use trylock for the rest 
>> > part as node filter can be different.
>> 
>> Right.
>> 
>> > Do you think we need a lock to prevent the whole show_mem() from being 
>> > interleaved and to acquire it at the very beginning? Will it be too 
>> > heavy?
>> 
>> It might be risky so perhaps let's not. Guess we can disentangle by dmesg
>> showing the thread id prefix.
> 
> I have thought about this. Since each line can interleave with another, we 
> would end up adding tid to each line. Not sure if this is acceptable.

I meant that printk/dmesg already does that so it's fine.

> Thanks
> Pan
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Yueyang Pan 1 month ago
On Thu, Aug 28, 2025 at 11:54:58AM +0200, Vlastimil Babka wrote:
> On 8/28/25 11:51, Yueyang Pan wrote:
> > On Thu, Aug 28, 2025 at 10:53:01AM +0200, Vlastimil Babka wrote:
> >> On 8/28/25 10:47, Yueyang Pan wrote:
> >> > On Thu, Aug 28, 2025 at 10:41:23AM +0200, Vlastimil Babka wrote:
> >> >> On 8/28/25 10:34, Yueyang Pan wrote:
> >> >> > On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
> >> >> >> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
> >> >> >> 
> >> >> >> > In production, show_mem() can be called concurrently from two
> >> >> >> > different entities, for example one from oom_kill_process()
> >> >> >> > another from __alloc_pages_slowpath from another kthread. This
> >> >> >> > patch adds a mutex and invokes trylock before printing out the
> >> >> >> > kernel alloc info in show_mem(). This way two alloc info won't
> >> >> >> > interleave with each other, which then makes parsing easier.
> >> >> 
> >> >> What about the rest of the information printed by show_mem() being interleaved?
> >> > 
> >> > Thanks for your feedback, Vlastimil. We cannot use trylock for the rest 
> >> > part as node filter can be different.
> >> 
> >> Right.
> >> 
> >> > Do you think we need a lock to prevent the whole show_mem() from being 
> >> > interleaved and to acquire it at the very beginning? Will it be too 
> >> > heavy?
> >> 
> >> It might be risky so perhaps let's not. Guess we can disentangle by dmesg
> >> showing the thread id prefix.
> > 
> > I have thought about this. Since each line can interleave with another, we 
> > would end up adding tid to each line. Not sure if this is acceptable.
> 
> I meant that printk/dmesg already does that so it's fine.

Cool. Then I will do this for the previous part before memory allocation info.

> 
> > Thanks
> > Pan
>
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Shakeel Butt 1 month ago
On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
> On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
> 
> > In production, show_mem() can be called concurrently from two
> > different entities, for example one from oom_kill_process()
> > another from __alloc_pages_slowpath from another kthread. This
> > patch adds a mutex and invokes trylock before printing out the
> > kernel alloc info in show_mem(). This way two alloc info won't
> > interleave with each other, which then makes parsing easier.
> > 
> 
> Fair enough, I guess.
> 
> > --- a/mm/show_mem.c
> > +++ b/mm/show_mem.c
> > @@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
> >  unsigned long totalreserve_pages __read_mostly;
> >  unsigned long totalcma_pages __read_mostly;
> >  
> > +static DEFINE_MUTEX(mem_alloc_profiling_mutex);
> 
> It would be a bit neater to make this local to __show_mem() - it didn't
> need file scope.

+1, something static to __show_mem().

> 
> Also, mutex_unlock() isn't to be used from interrupt context, so
> problem.
> 
> Something like atomic cmpxchg or test_and_set_bit could be used and
> wouldn't involve mutex_unlock()'s wakeup logic, which isn't needed
> here.

+1

> 
> >  static inline void show_node(struct zone *zone)
> >  {
> >  	if (IS_ENABLED(CONFIG_NUMA))
> > @@ -419,7 +421,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
> >  	printk("%lu pages hwpoisoned\n", atomic_long_read(&num_poisoned_pages));
> >  #endif
> >  #ifdef CONFIG_MEM_ALLOC_PROFILING
> > -	if (mem_alloc_profiling_enabled()) {
> > +	if (mem_alloc_profiling_enabled() && mutex_trylock(&mem_alloc_profiling_mutex)) {
> >  		struct codetag_bytes tags[10];
> >  		size_t i, nr;
> >  
> > @@ -445,6 +447,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
> >  						  ct->lineno, ct->function);
> >  			}
> >  		}
> > +		mutex_unlock(&mem_alloc_profiling_mutex);
> >  	}
> 
> If we're going to suppress the usual output then how about we let
> people know this happened, rather than silently dropping it?
> 
> pr_notice("memory allocation output suppressed due to show_mem() contention\n")
> 
> or something like that?

Personally I think this is not needed as this patch is suppressing only
the memory allocation profiling output which is global, will be same
for all the consumers and context does not matter. All consumers will
get the memory allocation profiling data eventually.
Re: [PATCH v1 2/2] mm/show_mem: Add trylock while printing alloc info
Posted by Yueyang Pan 1 month ago
On Wed, Aug 27, 2025 at 03:28:41PM -0700, Shakeel Butt wrote:
> On Wed, Aug 27, 2025 at 03:06:19PM -0700, Andrew Morton wrote:
> > On Wed, 27 Aug 2025 11:34:23 -0700 Yueyang Pan <pyyjason@gmail.com> wrote:
> > 
> > > In production, show_mem() can be called concurrently from two
> > > different entities, for example one from oom_kill_process()
> > > another from __alloc_pages_slowpath from another kthread. This
> > > patch adds a mutex and invokes trylock before printing out the
> > > kernel alloc info in show_mem(). This way two alloc info won't
> > > interleave with each other, which then makes parsing easier.
> > > 
> > 
> > Fair enough, I guess.
> > 
> > > --- a/mm/show_mem.c
> > > +++ b/mm/show_mem.c
> > > @@ -23,6 +23,8 @@ EXPORT_SYMBOL(_totalram_pages);
> > >  unsigned long totalreserve_pages __read_mostly;
> > >  unsigned long totalcma_pages __read_mostly;
> > >  
> > > +static DEFINE_MUTEX(mem_alloc_profiling_mutex);
> > 
> > It would be a bit neater to make this local to __show_mem() - it didn't
> > need file scope.
> 
> +1, something static to __show_mem().

Thanks for your feedback, Shakeel. See my reply to Andrew for this.

> 
> > 
> > Also, mutex_unlock() isn't to be used from interrupt context, so
> > problem.
> > 
> > Something like atomic cmpxchg or test_and_set_bit could be used and
> > wouldn't involve mutex_unlock()'s wakeup logic, which isn't needed
> > here.
> 
> +1

Again, see my reply to Andrew.

> 
> > 
> > >  static inline void show_node(struct zone *zone)
> > >  {
> > >  	if (IS_ENABLED(CONFIG_NUMA))
> > > @@ -419,7 +421,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
> > >  	printk("%lu pages hwpoisoned\n", atomic_long_read(&num_poisoned_pages));
> > >  #endif
> > >  #ifdef CONFIG_MEM_ALLOC_PROFILING
> > > -	if (mem_alloc_profiling_enabled()) {
> > > +	if (mem_alloc_profiling_enabled() && mutex_trylock(&mem_alloc_profiling_mutex)) {
> > >  		struct codetag_bytes tags[10];
> > >  		size_t i, nr;
> > >  
> > > @@ -445,6 +447,7 @@ void __show_mem(unsigned int filter, nodemask_t *nodemask, int max_zone_idx)
> > >  						  ct->lineno, ct->function);
> > >  			}
> > >  		}
> > > +		mutex_unlock(&mem_alloc_profiling_mutex);
> > >  	}
> > 
> > If we're going to suppress the usual output then how about we let
> > people know this happened, rather than silently dropping it?
> > 
> > pr_notice("memory allocation output suppressed due to show_mem() contention\n")
> > 
> > or something like that?
> 
> Personally I think this is not needed as this patch is suppressing only
> the memory allocation profiling output which is global, will be same
> for all the consumers and context does not matter. All consumers will
> get the memory allocation profiling data eventually.

For this point, I sort of agree with you. Wait for others' opinions?

Thanks
Pan