[PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Feng Tang 9 months ago

When working on kernel stability issues, we frequently run into panics,
hung tasks and software/hardware lockups. To debug them, users may need
a lot of system information from that moment, such as task call stacks,
lock info, memory info etc.

The panic case already has panic_print_sys_info() for this purpose, along
with a 'panic_print' bitmask to control which kinds of information are
printed. The same information is also helpful for debugging hung-task and
lockup cases.

So this patchset extracts that function and makes it usable for other
cases which also need system info for debugging.
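
A minimal user-space sketch of the bitmask idea, not the code from this
patchset: each bit selects one kind of information, and one shared helper
walks the mask. The bit values follow the documented 'panic_print' layout as
I understand it (only a subset is shown); the SKETCH_* names, the helper
show_sys_info_sketch() and its string output are assumptions made up for
illustration.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative names; values mirror the documented panic_print bits. */
#define SKETCH_TASK_INFO      0x01u
#define SKETCH_MEM_INFO       0x02u
#define SKETCH_TIMER_INFO     0x04u
#define SKETCH_LOCK_INFO      0x08u
#define SKETCH_FTRACE_INFO    0x10u
#define SKETCH_ALL_PRINTK_MSG 0x20u
#define SKETCH_ALL_CPU_BT     0x40u

/*
 * Hypothetical stand-in for a shared dump helper. Instead of printing real
 * kernel state, it writes a comma-separated list of the selected section
 * names into 'out' and returns how many sections were selected.
 */
static int show_sys_info_sketch(unsigned int mask, char *out, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} sections[] = {
		{ SKETCH_TASK_INFO,      "tasks" },
		{ SKETCH_MEM_INFO,       "mem" },
		{ SKETCH_TIMER_INFO,     "timers" },
		{ SKETCH_LOCK_INFO,      "locks" },
		{ SKETCH_FTRACE_INFO,    "ftrace" },
		{ SKETCH_ALL_PRINTK_MSG, "printk-replay" },
		{ SKETCH_ALL_CPU_BT,     "all-cpu-bt" },
	};
	int selected = 0;

	if (len == 0)
		return 0;
	out[0] = '\0';

	for (size_t i = 0; i < sizeof(sections) / sizeof(sections[0]); i++) {
		if (!(mask & sections[i].bit))
			continue;
		/* strncat() always NUL-terminates; leave room for that byte. */
		if (selected)
			strncat(out, ",", len - strlen(out) - 1);
		strncat(out, sections[i].name, len - strlen(out) - 1);
		selected++;
	}
	return selected;
}
```

With this layout, a hung-task or lockup path could pass its own mask to the
same helper: a mask of 0x41 would select the task list and the all-CPU
backtrace, while a mask of 0 selects nothing.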

We have used these patches locally when chasing stability issues, and
they were helpful.

Please help to review, thanks!

- Feng

Changelog:

  Since RFC:
     * Don't print all-CPU backtraces if 'sysctl_hung_task_all_cpu_backtrace'
       is 'false' (Lance Yang)
     * Change the names of the 2 new kernel control knobs to have 'mask' inside,
       and add kernel documentation and code comments for them (Lance Yang)
     * Make sys_show_info() support printk message replay and all-CPU backtrace.

Feng Tang (3):
  kernel/panic: generalize panic_print's function to show sys info
  kernel/hung_task: add option to dump system info when hung task
    detected
  kernel/watchdog: add option to dump system info when system is locked
    up

 .../admin-guide/kernel-parameters.txt         | 10 ++++
 include/linux/panic.h                         | 11 +++++
 kernel/hung_task.c                            | 42 ++++++++++++-----
 kernel/panic.c                                | 47 ++++++++++---------
 kernel/watchdog.c                             | 20 ++++++++
 5 files changed, 95 insertions(+), 35 deletions(-)

-- 
2.39.5 (Apple Git-154)

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Andrew Morton 9 months ago

On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:

> When working on kernel stability issues, panic, task-hung and 
> software/hardware lockup are frequently met. And to debug them, user
> may need lots of system information at that time, like task call stacks,
> lock info, memory info etc. 
> 
> panic case already has panic_print_sys_info() for this purpose, and has
> a 'panic_print' bitmask to control what kinds of information is needed,
> which is also helpful to debug other task-hung and lockup cases.
> 
> So this patchset extract the function out, and make it usable for other
> cases which also need system info for debugging. 
> 
> Locally these have been used in our bug chasing for stablility issues
> and was helpful.

Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
quite poorly organized.  Some effort to clean up (and document!) all of
this sounds good.

My vote is to permit the display of every scrap of information we can
think of in all situations.  And then to permit users to select which of
that information is to be displayed under each situation.

As for this patchset - sounds good to me.  For now I'll await input
from reviewers.

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Feng Tang 9 months ago

Hi Andrew,

Thanks for the review!

On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> 
> > When working on kernel stability issues, panic, task-hung and 
> > software/hardware lockup are frequently met. And to debug them, user
> > may need lots of system information at that time, like task call stacks,
> > lock info, memory info etc. 
> > 
> > panic case already has panic_print_sys_info() for this purpose, and has
> > a 'panic_print' bitmask to control what kinds of information is needed,
> > which is also helpful to debug other task-hung and lockup cases.
> > 
> > So this patchset extract the function out, and make it usable for other
> > cases which also need system info for debugging. 
> > 
> > Locally these have been used in our bug chasing for stablility issues
> > and was helpful.
> 
> Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> quite poorly organized.  Some effort to clean up (and document!) all of
> this sounds good.
> 
> My vote is to permit the display of every scrap of information we can
> think of in all situations.  And then to permit users to select which of
> that information is to be displayed under each situation.

Good point! Maybe one future todo is to add a global system info dump
function with ONE global knob for selecting different kinds of information,
which could be embedded into some of the cases you mentioned above.

> As for this patchset - sounds good to me.  For now I'll await input
> from reviewers.

Thank you!

- Feng

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Lance Yang 9 months ago


On 2025/5/12 11:14, Feng Tang wrote:
> Hi Andrew,
> 
> Thanks for the review!
> 
> On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
>> On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
>>
>>> When working on kernel stability issues, panic, task-hung and
>>> software/hardware lockup are frequently met. And to debug them, user
>>> may need lots of system information at that time, like task call stacks,
>>> lock info, memory info etc.
>>>
>>> panic case already has panic_print_sys_info() for this purpose, and has
>>> a 'panic_print' bitmask to control what kinds of information is needed,
>>> which is also helpful to debug other task-hung and lockup cases.
>>>
>>> So this patchset extract the function out, and make it usable for other
>>> cases which also need system info for debugging.
>>>
>>> Locally these have been used in our bug chasing for stablility issues
>>> and was helpful.
>>
>> Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
>> quite poorly organized.  Some effort to clean up (and document!) all of
>> this sounds good.
>>
>> My vote is to permit the display of every scrap of information we can
>> think of in all situations.  And then to permit users to select which of
>> that information is to be displayed under each situation.

Completely agreed. The tricky part is making a global knob that works for
all situations without breaking userspace, but it's a better system-wide
approach ;)

> 
> Good point! Maybe one future todo is to add a gloabl system info dump
> function with ONE global knob for selecting different kinds of information,
> which could be embedded into some cases you mentioned above.

IMHO, for features with their own knobs, we need:
a) The global knob (if enabled) turns on all related feature-level knobs,
b) while still allowing users to manually override individual knobs.

Something like:

If SYS_PRINT_ALL_CPU_BT (global knob) is on, it automatically enables
hung_task_all_cpu_backtrace for the hung-task situation. But users can still
disable it via hung_task_all_cpu_backtrace.

Anyway, the global knob (when set) controls all feature-level knobs, but
they can override it if explicitly set ;)
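
A rough sketch of that precedence rule in plain C, to make the proposal
concrete. Everything here is hypothetical: the SYS_PRINT_ALL_CPU_BT bit
value, the tri-state 'knob_state' type and the helper name are assumptions.
In particular, the existing hung_task_all_cpu_backtrace sysctl is, as far as
I know, a plain 0/1 value, so distinguishing "never set" from "explicitly
off" would need some extra state like the one modelled below.

```c
#include <stdbool.h>

/* Hypothetical bit in a global system-info mask. */
#define SYS_PRINT_ALL_CPU_BT 0x40u

/* Hypothetical tri-state for a feature-level knob. */
enum knob_state {
	KNOB_UNSET = -1,	/* user never touched the feature knob */
	KNOB_OFF   = 0,		/* user explicitly disabled it */
	KNOB_ON    = 1,		/* user explicitly enabled it */
};

/*
 * Resolve whether the hung-task path should dump all-CPU backtraces:
 * an explicit feature-level setting wins, otherwise the global mask decides.
 */
static bool effective_all_cpu_bt(unsigned int global_mask,
				 enum knob_state feature_knob)
{
	if (feature_knob != KNOB_UNSET)
		return feature_knob == KNOB_ON;

	return (global_mask & SYS_PRINT_ALL_CPU_BT) != 0;
}
```

So a global mask with SYS_PRINT_ALL_CPU_BT set plus an untouched feature
knob yields true, while the same mask with the feature knob explicitly off
yields false.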

Thanks,
Lance

> 
>> As for this patchset - sounds good to me.  For now I'll await input
>> from reviewers.
> 
> Thank you!
> 
> - Feng

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Petr Mladek 9 months ago

On Mon 2025-05-12 16:23:30, Lance Yang wrote:
> 
> 
> On 2025/5/12 11:14, Feng Tang wrote:
> > Hi Andrew,
> > 
> > Thanks for the review!
> > 
> > On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> > > On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> > > 
> > > > When working on kernel stability issues, panic, task-hung and
> > > > software/hardware lockup are frequently met. And to debug them, user
> > > > may need lots of system information at that time, like task call stacks,
> > > > lock info, memory info etc.
> > > > 
> > > > panic case already has panic_print_sys_info() for this purpose, and has
> > > > a 'panic_print' bitmask to control what kinds of information is needed,
> > > > which is also helpful to debug other task-hung and lockup cases.
> > > > 
> > > > So this patchset extract the function out, and make it usable for other
> > > > cases which also need system info for debugging.
> > > > 
> > > > Locally these have been used in our bug chasing for stablility issues
> > > > and was helpful.
> > > 
> > > Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> > > quite poorly organized.  Some effort to clean up (and document!) all of
> > > this sounds good.
> > > 
> > > My vote is to permit the display of every scrap of information we can
> > > think of in all situations.  And then to permit users to select which of
> > > that information is to be displayed under each situation.
> 
> Completely agreed. The tricky part is making a global knob that works for
> all situations without breaking userspace, but it's a better system-wide
> approach ;)
> 
> > 
> > Good point! Maybe one future todo is to add a gloabl system info dump
> > function with ONE global knob for selecting different kinds of information,
> > which could be embedded into some cases you mentioned above.
> 
> IMHO, for features with their own knobs, we need:
> a) The global knob (if enabled) turns on all related feature-level knobs,
> b) while still allowing users to manually override individual knobs.
> 
> Something like:
> 
> If SYS_PRINT_ALL_CPU_BT (global knob) is on, it enables
> hung_task_all_cpu_backtrace
> for hung-task situation automatically. But users can still disable it via
> hung_task_all_cpu_backtrace.

I am all for unifying the options for printing debug information
in various emergency situations. I am just not sure whether we really
want to do the same in all situations.

Some lockup detectors try to be more clever, for example:

  + RCU stall detector prints backtraces only from CPUs which are
    involved in the stall, see print_other_cpu_stall().

  + Workqueues watchdog shows backtraces from tasks which are
    preventing forward progress, see show_cpu_pool_hog().

And stalls are about scheduling (disabled preemption, disabled IRQ,
deadlocks, too long uninterruptible sleep). OOM is about memory
usage. Oops is about an invalid memory access. WARNs() are
completely random stuff.

Also I am afraid of printing too much information when the system
is supposed to continue running. It would make sense to print it in
nbcon_cpu_emergency_enter()/exit() context which disables
preemption. And it might cause softlockups on its own.

Finally, I wonder whether ftrace_dump() might cause a livelock when ftrace
is adding new messages in parallel.

The situation is much easier during panic() because the system is
going to die() anyway, non-panic CPUs are stopped, ...

That said, I could understand that people might want to see as much
information as possible when the console is fast and the range of
possible problems is big.

Anyway, I have added a few more people to Cc who are interested in
the various watchdogs.

And there is a parallel initiative which tries to unify the loglevel or
somehow make the filtering easier, see
https://lore.kernel.org/r/20250424070436.2380215-1-senozhatsky@chromium.org

Best Regards,
Petr

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Feng Tang 8 months, 4 weeks ago

On Tue, May 13, 2025 at 03:27:33PM +0200, Petr Mladek wrote:
> On Mon 2025-05-12 16:23:30, Lance Yang wrote:
> > 
> > 
> > On 2025/5/12 11:14, Feng Tang wrote:
> > > Hi Andrew,
> > > 
> > > Thanks for the review!
> > > 
> > > On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> > > > On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> > > > 
> > > > > When working on kernel stability issues, panic, task-hung and
> > > > > software/hardware lockup are frequently met. And to debug them, user
> > > > > may need lots of system information at that time, like task call stacks,
> > > > > lock info, memory info etc.
> > > > > 
> > > > > panic case already has panic_print_sys_info() for this purpose, and has
> > > > > a 'panic_print' bitmask to control what kinds of information is needed,
> > > > > which is also helpful to debug other task-hung and lockup cases.
> > > > > 
> > > > > So this patchset extract the function out, and make it usable for other
> > > > > cases which also need system info for debugging.
> > > > > 
> > > > > Locally these have been used in our bug chasing for stablility issues
> > > > > and was helpful.
> > > > 
> > > > Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> > > > quite poorly organized.  Some effort to clean up (and document!) all of
> > > > this sounds good.
> > > > 
> > > > My vote is to permit the display of every scrap of information we can
> > > > think of in all situations.  And then to permit users to select which of
> > > > that information is to be displayed under each situation.
> > 
> > Completely agreed. The tricky part is making a global knob that works for
> > all situations without breaking userspace, but it's a better system-wide
> > approach ;)
> > 
> > > 
> > > Good point! Maybe one future todo is to add a gloabl system info dump
> > > function with ONE global knob for selecting different kinds of information,
> > > which could be embedded into some cases you mentioned above.
> > 
> > IMHO, for features with their own knobs, we need:
> > a) The global knob (if enabled) turns on all related feature-level knobs,
> > b) while still allowing users to manually override individual knobs.
> > 
> > Something like:
> > 
> > If SYS_PRINT_ALL_CPU_BT (global knob) is on, it enables
> > hung_task_all_cpu_backtrace
> > for hung-task situation automatically. But users can still disable it via
> > hung_task_all_cpu_backtrace.
> 
> I am all for unifying the options for printing debug information
> in various emergency situations. I am just not sure whether we really
> want to do the same in all situations.

Yes, valid concern.

> Some lockup detectors tries to be more clever, for example:
> 
>   + RCU stall detector prints backtraces only from CPUs which are
>     involved in the stall, see print_other_cpu_stall().
> 
>   + Workqueues watchdog shows backtraces from tasks which are
>     preventing forward progress, see show_cpu_pool_hog().
> 
> And stalls are about scheduling (disabled preemption, disabled IRQ,
> deadlocks, too long uninterruptible sleep). OOM is about memory
> usage. Oops is about an invalid memory access. WARNs() are
> completely random stuff.

Agreed. I noticed RCU has special handling and I skipped "RCU stall"
case in this patchset on purpose :)

> Also I am afraid of printing too much information when the system
> is supposed to continue running. It would make sense to print it in
> nbcon_cpu_emergency_enter()/exit() context which disables
> preemption. And it might cause softlockups on its own.

Yes. And for the global knob, my thought is that it is 0 (disabled) by
default, which equals doing nothing. Its users should be experienced
developers who know precisely what information they need, and who set
it at runtime or on the kernel command line.

For 'panic_print', which we use frequently in bug chasing, we still
leave it at 0 even in our debug kernels, and only enable it while
debugging.

As for the patch set, I tried not to change existing behavior, and
just added an option for users to get more info when needed.

> Finally, I wonder whether ftrace_dump() might cause a livelock when ftrace
> is adding new messages in parallel.

IIUC, ftrace_dump_one() will turn off tracing during the dump, so it
should be safe.

> The situation is much easier during panic() because the system is
> going to die() anyway, non-panic CPUs are stopped, ...

Yes.

> That said, I could understand that people might want to see as much
> information as possible when the console is fast and the range of
> possible problems is big.
 
I agree with you that more is not always better; it should be based
on real needs, case by case.

> Anyway, I have added few more people into Cc who are interested into
> the various watchdogs.
> 
> And there is parallel initiative which tries to unify the loglevel or
> somehow make the filtering easier, see
> https://lore.kernel.org/r/20250424070436.2380215-1-senozhatsky@chromium.org

Thanks for involving more people and sharing the link.

Thanks,
Feng

> Best Regards,
> Petr

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Paul E. McKenney 9 months ago

On Tue, May 13, 2025 at 03:27:33PM +0200, Petr Mladek wrote:
> On Mon 2025-05-12 16:23:30, Lance Yang wrote:
> > 
> > 
> > On 2025/5/12 11:14, Feng Tang wrote:
> > > Hi Andrew,
> > > 
> > > Thanks for the review!
> > > 
> > > On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> > > > On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> > > > 
> > > > > When working on kernel stability issues, panic, task-hung and
> > > > > software/hardware lockup are frequently met. And to debug them, user
> > > > > may need lots of system information at that time, like task call stacks,
> > > > > lock info, memory info etc.
> > > > > 
> > > > > panic case already has panic_print_sys_info() for this purpose, and has
> > > > > a 'panic_print' bitmask to control what kinds of information is needed,
> > > > > which is also helpful to debug other task-hung and lockup cases.
> > > > > 
> > > > > So this patchset extract the function out, and make it usable for other
> > > > > cases which also need system info for debugging.
> > > > > 
> > > > > Locally these have been used in our bug chasing for stablility issues
> > > > > and was helpful.
> > > > 
> > > > Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> > > > quite poorly organized.  Some effort to clean up (and document!) all of
> > > > this sounds good.
> > > > 
> > > > My vote is to permit the display of every scrap of information we can
> > > > think of in all situations.  And then to permit users to select which of
> > > > that information is to be displayed under each situation.
> > 
> > Completely agreed. The tricky part is making a global knob that works for
> > all situations without breaking userspace, but it's a better system-wide
> > approach ;)
> > 
> > > 
> > > Good point! Maybe one future todo is to add a gloabl system info dump
> > > function with ONE global knob for selecting different kinds of information,
> > > which could be embedded into some cases you mentioned above.
> > 
> > IMHO, for features with their own knobs, we need:
> > a) The global knob (if enabled) turns on all related feature-level knobs,
> > b) while still allowing users to manually override individual knobs.
> > 
> > Something like:
> > 
> > If SYS_PRINT_ALL_CPU_BT (global knob) is on, it enables
> > hung_task_all_cpu_backtrace
> > for hung-task situation automatically. But users can still disable it via
> > hung_task_all_cpu_backtrace.
> 
> I am all for unifying the options for printing debug information
> in various emergency situations. I am just not sure whether we really
> want to do the same in all situations.
> 
> Some lockup detectors tries to be more clever, for example:
> 
>   + RCU stall detector prints backtraces only from CPUs which are
>     involved in the stall, see print_other_cpu_stall().
> 
>   + Workqueues watchdog shows backtraces from tasks which are
>     preventing forward progress, see show_cpu_pool_hog().
> 
> And stalls are about scheduling (disabled preemption, disabled IRQ,
> deadlocks, too long uninterruptible sleep). OOM is about memory
> usage. Oops is about an invalid memory access. WARNs() are
> completely random stuff.
> 
> Also I am afraid of printing too much information when the system
> is supposed to continue running. It would make sense to print it in
> nbcon_cpu_emergency_enter()/exit() context which disables
> preemption. And it might cause softlockups on its own.

And we did do some of the cleverness that Petr points out because of
problems caused by flooding the console log.  We first ran into this
sort of thing on embedded systems with slow serial consoles (where 115K
baud is now way slow), but it also shows up in other environments, for
example, those committing large numbers of console logs to stable storage,
multiplexing large numbers of logs across networks that sometimes get
congested, and so on.

So I second the call for individual knobs, either in addition to or
instead of the global knob.

> Finally, I wonder whether ftrace_dump() might cause a livelock when ftrace
> is adding new messages in parallel.

It definitely can cause problems, and me learning this the hard way is
why rcutorture calls tracing_off() before calling ftrace_dump().
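
A toy user-space model of why the ordering matters. This does not model the
real ftrace ring buffer; 'struct trace_model' and both helpers are invented
for illustration, and the "producer adds one entry per entry consumed" rule
is a deliberately pessimistic assumption. The point is only that a dump which
drains a buffer cannot finish while something keeps refilling it, and that
stopping the producer first (analogous to calling tracing_off() before
ftrace_dump()) lets the drain terminate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented model: a count of pending entries plus a producer on/off flag. */
struct trace_model {
	size_t pending;		/* entries waiting to be dumped */
	bool tracing_on;	/* is the producer still adding entries? */
};

/*
 * Drain the buffer, giving up after max_steps. While tracing is on, the
 * modelled producer adds one entry for each entry consumed, so the dump
 * never catches up. Returns true if the buffer was fully drained.
 */
static bool dump_model(struct trace_model *t, size_t max_steps)
{
	for (size_t step = 0; step < max_steps && t->pending > 0; step++) {
		t->pending--;			/* consume one entry */
		if (t->tracing_on)
			t->pending++;		/* producer refills it */
	}
	return t->pending == 0;
}

/* Stop the modelled producer first, then drain. */
static bool dump_with_tracing_off(struct trace_model *t, size_t max_steps)
{
	t->tracing_on = false;
	return dump_model(t, max_steps);
}
```

In this model, draining three pending entries with the producer still on
fails no matter how large the step budget is, while turning the producer off
first drains them in three steps.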

> The situation is much easier during panic() because the system is
> going to die() anyway, non-panic CPUs are stopped, ...
> 
> That said, I could understand that people might want to see as much
> information as possible when the console is fast and the range of
> possible problems is big.

No argument here.

							Thanx, Paul

> Anyway, I have added few more people into Cc who are interested into
> the various watchdogs.
> 
> And there is parallel initiative which tries to unify the loglevel or
> somehow make the filtering easier, see
> https://lore.kernel.org/r/20250424070436.2380215-1-senozhatsky@chromium.org
> 
> Best Regards,
> Petr

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Feng Tang 8 months, 4 weeks ago

On Tue, May 13, 2025 at 10:09:51AM -0700, Paul E. McKenney wrote:
> On Tue, May 13, 2025 at 03:27:33PM +0200, Petr Mladek wrote:
> > On Mon 2025-05-12 16:23:30, Lance Yang wrote:
> > > 
> > > 
> > > On 2025/5/12 11:14, Feng Tang wrote:
> > > > Hi Andrew,
> > > > 
> > > > Thanks for the review!
> > > > 
> > > > On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> > > > > On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> > > > > 
> > > > > > When working on kernel stability issues, panic, task-hung and
> > > > > > software/hardware lockup are frequently met. And to debug them, user
> > > > > > may need lots of system information at that time, like task call stacks,
> > > > > > lock info, memory info etc.
> > > > > > 
> > > > > > panic case already has panic_print_sys_info() for this purpose, and has
> > > > > > a 'panic_print' bitmask to control what kinds of information is needed,
> > > > > > which is also helpful to debug other task-hung and lockup cases.
> > > > > > 
> > > > > > So this patchset extract the function out, and make it usable for other
> > > > > > cases which also need system info for debugging.
> > > > > > 
> > > > > > Locally these have been used in our bug chasing for stablility issues
> > > > > > and was helpful.
> > > > > 
> > > > > Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> > > > > quite poorly organized.  Some effort to clean up (and document!) all of
> > > > > this sounds good.
> > > > > 
> > > > > My vote is to permit the display of every scrap of information we can
> > > > > think of in all situations.  And then to permit users to select which of
> > > > > that information is to be displayed under each situation.
> > > 
> > > Completely agreed. The tricky part is making a global knob that works for
> > > all situations without breaking userspace, but it's a better system-wide
> > > approach ;)
> > > 
> > > > 
> > > > Good point! Maybe one future todo is to add a gloabl system info dump
> > > > function with ONE global knob for selecting different kinds of information,
> > > > which could be embedded into some cases you mentioned above.
> > > 
> > > IMHO, for features with their own knobs, we need:
> > > a) The global knob (if enabled) turns on all related feature-level knobs,
> > > b) while still allowing users to manually override individual knobs.
> > > 
> > > Something like:
> > > 
> > > If SYS_PRINT_ALL_CPU_BT (global knob) is on, it enables
> > > hung_task_all_cpu_backtrace
> > > for hung-task situation automatically. But users can still disable it via
> > > hung_task_all_cpu_backtrace.
> > 
> > I am all for unifying the options for printing debug information
> > in various emergency situations. I am just not sure whether we really
> > want to do the same in all situations.
> > 
> > Some lockup detectors tries to be more clever, for example:
> > 
> >   + RCU stall detector prints backtraces only from CPUs which are
> >     involved in the stall, see print_other_cpu_stall().
> > 
> >   + Workqueues watchdog shows backtraces from tasks which are
> >     preventing forward progress, see show_cpu_pool_hog().
> > 
> > And stalls are about scheduling (disabled preemption, disabled IRQ,
> > deadlocks, too long uninterruptible sleep). OOM is about memory
> > usage. Oops is about an invalid memory access. WARNs() are
> > completely random stuff.
> > 
> > Also I am afraid of printing too much information when the system
> > is supposed to continue running. It would make sense to print it in
> > nbcon_cpu_emergency_enter()/exit() context which disables
> > preemption. And it might cause softlockups on its own.
> 
> And we did do some of the cleverness that Petr points out because of
> problems caused by flooding the console log.  We first ran into this
> sort of thing on embedded systems with slow serial consoles (where 115K
> baud is now way slow), but it also shows up in other environments, for
> example, those committing large numbers of console logs to stable storage,
> multiplexing large numbers of logs across networks that sometimes get
> congested, and so on.
> 
> So I second the call for individual knobs, either in addition to or
> instead of the global knob.

Thanks for the detailed elaboration! The RCU stall case is also a main
target in the stability issues I have worked on, besides
panic/taskhung/lockup. I noticed it has its own mature handling, and did
not dare to touch it in this patchset :)

Thanks,
Feng

> 
> > Finally, I wonder whether ftrace_dump() might cause a livelock when ftrace
> > is adding new messages in parallel.
> 
> It definitely can cause problems, and me learning this the hard way is
> why rcutorture calls tracing_off() before calling ftrace_dump().
> 
> > The situation is much easier during panic() because the system is
> > going to die() anyway, non-panic CPUs are stopped, ...
> > 
> > That said, I could understand that people might want to see as much
> > information as possible when the console is fast and the range of
> > possible problems is big.
> 
> No argument here.
> 
> 							Thanx, Paul
> 
> > Anyway, I have added few more people into Cc who are interested into
> > the various watchdogs.
> > 
> > And there is parallel initiative which tries to unify the loglevel or
> > somehow make the filtering easier, see
> > https://lore.kernel.org/r/20250424070436.2380215-1-senozhatsky@chromium.org
> > 
> > Best Regards,
> > Petr

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Paul E. McKenney 8 months, 4 weeks ago

On Wed, May 14, 2025 at 11:33:23AM +0800, Feng Tang wrote:
> On Tue, May 13, 2025 at 10:09:51AM -0700, Paul E. McKenney wrote:
> > On Tue, May 13, 2025 at 03:27:33PM +0200, Petr Mladek wrote:
> > > On Mon 2025-05-12 16:23:30, Lance Yang wrote:
> > > > 
> > > > 
> > > > On 2025/5/12 11:14, Feng Tang wrote:
> > > > > Hi Andrew,
> > > > > 
> > > > > Thanks for the review!
> > > > > 
> > > > > On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> > > > > > On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> > > > > > 
> > > > > > > When working on kernel stability issues, panic, task-hung and
> > > > > > > software/hardware lockup are frequently met. And to debug them, user
> > > > > > > may need lots of system information at that time, like task call stacks,
> > > > > > > lock info, memory info etc.
> > > > > > > 
> > > > > > > panic case already has panic_print_sys_info() for this purpose, and has
> > > > > > > a 'panic_print' bitmask to control what kinds of information is needed,
> > > > > > > which is also helpful to debug other task-hung and lockup cases.
> > > > > > > 
> > > > > > > So this patchset extract the function out, and make it usable for other
> > > > > > > cases which also need system info for debugging.
> > > > > > > 
> > > > > > > Locally these have been used in our bug chasing for stablility issues
> > > > > > > and was helpful.
> > > > > > 
> > > > > > Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> > > > > > quite poorly organized.  Some effort to clean up (and document!) all of
> > > > > > this sounds good.
> > > > > > 
> > > > > > My vote is to permit the display of every scrap of information we can
> > > > > > think of in all situations.  And then to permit users to select which of
> > > > > > that information is to be displayed under each situation.
> > > > 
> > > > Completely agreed. The tricky part is making a global knob that works for
> > > > all situations without breaking userspace, but it's a better system-wide
> > > > approach ;)
> > > > 
> > > > > 
> > > > > Good point! Maybe one future todo is to add a gloabl system info dump
> > > > > function with ONE global knob for selecting different kinds of information,
> > > > > which could be embedded into some cases you mentioned above.
> > > > 
> > > > IMHO, for features with their own knobs, we need:
> > > > a) The global knob (if enabled) turns on all related feature-level knobs,
> > > > b) while still allowing users to manually override individual knobs.
> > > > 
> > > > Something like:
> > > > 
> > > > If SYS_PRINT_ALL_CPU_BT (global knob) is on, it enables
> > > > hung_task_all_cpu_backtrace
> > > > for hung-task situation automatically. But users can still disable it via
> > > > hung_task_all_cpu_backtrace.
> > > 
> > > I am all for unifying the options for printing debug information
> > > in various emergency situations. I am just not sure whether we really
> > > want to do the same in all situations.
> > > 
> > > Some lockup detectors tries to be more clever, for example:
> > > 
> > >   + RCU stall detector prints backtraces only from CPUs which are
> > >     involved in the stall, see print_other_cpu_stall().
> > > 
> > >   + Workqueues watchdog shows backtraces from tasks which are
> > >     preventing forward progress, see show_cpu_pool_hog().
> > > 
> > > And stalls are about scheduling (disabled preemption, disabled IRQ,
> > > deadlocks, too long uninterruptible sleep). OOM is about memory
> > > usage. Oops is about an invalid memory access. WARNs() are
> > > completely random stuff.
> > > 
> > > Also I am afraid of printing too much information when the system
> > > is supposed to continue running. It would make sense to print it in
> > > nbcon_cpu_emergency_enter()/exit() context which disables
> > > preemption. And it might cause softlockups on its own.
> > 
> > And we did do some of the cleverness that Petr points out because of
> > problems caused by flooding the console log.  We first ran into this
> > sort of thing on embedded systems with slow serial consoles (where 115K
> > baud is now way slow), but it also shows up in other environments, for
> > example, those committing large numbers of console logs to stable storage,
> > multiplexing large numbers of logs across networks that sometimes get
> > congested, and so on.
> > 
> > So I second the call for individual knobs, either in addition to or
> > instead of the global knob.
> 
> Thanks for the detailed elaboration! The RCU stall case is also a main
> target in the stability issues I have worked on, besides
> panic/taskhung/lockup. I noticed it has its own mature handling, and
> dare not touch it in this patchset :)

Uhhh...

Please do dare to touch it, as that is the only way that it can possibly
improve.  Just please also be very careful *how* you touch it.  ;-)

							Thanx, Paul

> Thanks,
> Feng
> 
> > 
> > > Finally, I wonder whether ftrace_dump() might cause a livelock when ftrace
> > > is adding new messages in parallel.
> > 
> > It definitely can cause problems, and me learning this the hard way is
> > why rcutorture calls tracing_off() before calling ftrace_dump().
> > 
> > > The situation is much easier during panic() because the system is
> > > going to die() anyway, non-panic CPUs are stopped, ...
> > > 
> > > That said, I could understand that people might want to see as much
> > > information as possible when the console is fast and the range of
> > > possible problems is big.
> > 
> > No argument here.
> > 
> > 							Thanx, Paul
> > 
> > > Anyway, I have added a few more people to Cc who are interested in
> > > the various watchdogs.
> > > 
> > > And there is a parallel initiative which tries to unify the loglevel or
> > > somehow make the filtering easier, see
> > > https://lore.kernel.org/r/20250424070436.2380215-1-senozhatsky@chromium.org
> > > 
> > > Best Regards,
> > > Petr

Re: [PATCH v1 0/3] generalize panic_print's dump function to be used by other kernel parts

Posted by Feng Tang 9 months ago

On Mon, May 12, 2025 at 04:23:30PM +0800, Lance Yang wrote:
> 
> 
> On 2025/5/12 11:14, Feng Tang wrote:
> > Hi Andrew,
> > 
> > Thanks for the review!
> > 
> > On Sun, May 11, 2025 at 06:46:17PM -0700, Andrew Morton wrote:
> > > On Sun, 11 May 2025 16:52:51 +0800 Feng Tang <feng.tang@linux.alibaba.com> wrote:
> > > 
> > > > When working on kernel stability issues, panic, task-hung and
> > > > software/hardware lockup are frequently met. And to debug them, user
> > > > may need lots of system information at that time, like task call stacks,
> > > > lock info, memory info etc.
> > > > 
> > > > panic case already has panic_print_sys_info() for this purpose, and has
> > > > a 'panic_print' bitmask to control what kinds of information is needed,
> > > > which is also helpful to debug other task-hung and lockup cases.
> > > > 
> > > > > So this patchset extracts the function out, and makes it usable for other
> > > > > cases which also need system info for debugging.
> > > > 
> > > > > Locally these have been used in our bug chasing for stability issues
> > > > > and were helpful.
> > > 
> > > Truth.  Our responses to panics, oopses, WARNs, BUGs, OOMs etc seem
> > > quite poorly organized.  Some effort to clean up (and document!) all of
> > > this sounds good.
> > > 
> > > My vote is to permit the display of every scrap of information we can
> > > think of in all situations.  And then to permit users to select which of
> > > that information is to be displayed under each situation.
> 
> Completely agreed. The tricky part is making a global knob that works for
> all situations without breaking userspace, but it's a better system-wide
> approach ;)
> 
> > 
> > Good point! Maybe one future todo is to add a global system info dump
> > function with ONE global knob for selecting different kinds of information,
> > which could be embedded into some cases you mentioned above.
> 
> IMHO, for features with their own knobs, we need:
> a) The global knob (if enabled) turns on all related feature-level knobs,
> b) while still allowing users to manually override individual knobs.
> 
> Something like:
> 
> If SYS_PRINT_ALL_CPU_BT (global knob) is on, it enables
> hung_task_all_cpu_backtrace for hung-task situation automatically.
> But users can still disable it via hung_task_all_cpu_backtrace.
> 
> Anyway, the global knob (when set) controls all feature-level knobs, but
> they can override it if explicitly set ;)

Yes, it makes sense for parts which already have their own user space
control knob.
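
To make the override rule concrete, here is a rough userspace sketch of how
a feature-level knob could be resolved against the global one. It is not
code from this patchset: the tri-state enum, knob_effective() and the bit
value chosen for SYS_PRINT_ALL_CPU_BT are assumptions for illustration only.
Only the SYS_PRINT_ALL_CPU_BT name is taken from the proposal quoted above.

```c
#include <stdbool.h>

/* Assumed bit value; only the name comes from the discussion above. */
#define SYS_PRINT_ALL_CPU_BT	(1UL << 0)

/* Hypothetical tri-state for a feature-level knob (not a kernel API). */
enum knob_state {
	KNOB_UNSET = -1,	/* user never wrote the knob: follow the global mask */
	KNOB_OFF   = 0,		/* user explicitly disabled the feature */
	KNOB_ON    = 1,		/* user explicitly enabled the feature */
};

/*
 * Resolve the effective setting of one feature-level knob.
 * An explicit local setting always wins; otherwise the matching
 * bit in the global mask decides.
 */
bool knob_effective(unsigned long global_mask, unsigned long bit,
		    enum knob_state local)
{
	if (local != KNOB_UNSET)
		return local == KNOB_ON;

	return (global_mask & bit) != 0;
}
```

With the global bit set, knob_effective() returns true for an unset local
knob and false once the user explicitly disables it. The catch is that the
existing knobs are plain 0/1 values, so telling "unset" apart from
"explicitly 0" would need extra state, which is where the userspace
compatibility concern comes from.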

What I proposed is a todo mostly for parts other than panic/hung-task
in this patchset, as these parts require some special handling, e.g.
panic needs to handle printk replay for the kexec case.

Thanks,
Feng