When a panic happens, it blocks the CPU, which may
trigger the hardlockup detector if some dump is slow.
So call hardlockup_detector_perf_stop() to disable
the hardlockup detector.
Signed-off-by: Wang Jinchao <wangjinchao600@gmail.com>
---
kernel/panic.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/panic.c b/kernel/panic.c
index b0b9a8bf4560..52a1ac4ad447 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -339,6 +339,7 @@ void panic(const char *fmt, ...)
 	 */
 	local_irq_disable();
 	preempt_disable_notrace();
+	hardlockup_detector_perf_stop();
 
 	/*
 	 * It's possible to come here directly from a panic-assertion and
--
2.43.0
On Wed 2025-07-30 11:06:33, Wang Jinchao wrote:
> When a panic happens, it blocks the CPU, which may
> trigger the hardlockup detector if some dump is slow.
> So call hardlockup_detector_perf_stop() to disable
> the hardlockup detector.

Could you please provide more details, especially the log showing
the problem?

I wonder if this is similar to
https://lore.kernel.org/all/SN6PR02MB4157A4C5E8CB219A75263A17D46DA@SN6PR02MB4157.namprd02.prod.outlook.com/

There was a problem that a non-panic CPU might get stuck in
pl011_console_write_thread() or any other con->write_thread()
callback because nbcon_reacquire_nobuf(wctxt) ended in an infinite
loop.

It was a real lockup. It was recently fixed in 6.17-rc1 by
commit 571c1ea91a73db56bd94 ("printk: nbcon: Allow reacquire
during panic"), see
https://patch.msgid.link/20250606185549.900611-1-john.ogness@linutronix.de
It is possible that it fixed your problem as well.

That said, it might make sense to disable the hardlockup
detector during panic. But I do not like the proposed way,
see below.

> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -339,6 +339,7 @@ void panic(const char *fmt, ...)
>  	 */
>  	local_irq_disable();
>  	preempt_disable_notrace();
> +	hardlockup_detector_perf_stop();

I see the following in kernel/watchdog_perf.c:

/**
 * hardlockup_detector_perf_stop - Globally stop watchdog events
 *
 * Special interface for x86 to handle the perf HT bug.
 */
void __init hardlockup_detector_perf_stop(void)
{
	[...]
	lockdep_assert_cpus_held();
	[...]
}

1. It is suspicious to see an x86-specific "hacky" function called in
   the generic panic().

   Is this safe?
   What about other hardlockup detectors?

2. I expect that lockdep_assert_cpus_held() would complain
   when CONFIG_LOCKDEP was enabled.

Anyway, it does not look safe. panic() might be called in any context,
including NMI, and I see:

  + hardlockup_detector_perf_stop()
    + perf_event_disable()
      + perf_event_ctx_lock()
        + mutex_lock_nested()

This would definitely cause a deadlock when called in NMI context.

Alternative:

A conservative approach would be to update watchdog_hardlockup_check()
so that it does nothing when panic_in_progress() returns true. It
would even work for both hardlockup detector implementations.

Best Regards,
Petr
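A minimal sketch of the conservative alternative described above, assuming panic_in_progress() (currently a static helper in kernel/printk/printk.c) were made visible to the watchdog code; this is an illustration of the idea, not an actual patch:

```c
/* kernel/watchdog.c -- sketch only, assuming panic_in_progress() is visible */
void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs)
{
	/*
	 * The panic CPU legitimately runs with interrupts disabled for
	 * a long time (console flush, kmsg dumpers, kdump). Treating
	 * that as a hardlockup -- possibly triggering a nested panic --
	 * would only destroy the original panic report.
	 */
	if (panic_in_progress())
		return;

	/* ... existing hardlockup detection logic unchanged ... */
}
```

Because both the perf-based and the buddy detector funnel into watchdog_hardlockup_check(), a check here covers both implementations, and it takes no locks, so it stays NMI-safe.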
On 8/19/25 23:01, Petr Mladek wrote:
> On Wed 2025-07-30 11:06:33, Wang Jinchao wrote:
>> When a panic happens, it blocks the CPU, which may
>> trigger the hardlockup detector if some dump is slow.
>> So call hardlockup_detector_perf_stop() to disable
>> the hardlockup detector.
>
> Could you please provide more details, especially the log showing
> the problem?

Here's what happened: I configured the kernel to use efi-pstore for
kdump logging while enabling the perf hardlockup detector (NMI).
Perhaps efi-pstore was slow and there were too many logs. When the
first panic was triggered, the pstore dump callback in
kmsg_dump()->dumper->dump() took a long time, which triggered the
NMI watchdog. Then emergency_restart() restarted the machine before
the efi-pstore operation finished. The function call flow looked
like this:

```c
real panic() {
    kmsg_dump() {
        ...
        pstore_dump() {
            start_dump();
            ... // long-running operation triggers the NMI watchdog
            nmi panic() {
                ...
                emergency_restart(); // pstore unfinished
            }
            ...
            finish_dump(); // never reached
        }
    }
}
```

This created a nested panic situation where the second panic
interrupted the crash dump process, causing the loss of the original
panic information.

>
> I wonder if this is similar to
> https://lore.kernel.org/all/SN6PR02MB4157A4C5E8CB219A75263A17D46DA@SN6PR02MB4157.namprd02.prod.outlook.com/
>
> There was a problem that a non-panic CPU might get stuck in
> pl011_console_write_thread() or any other con->write_thread()
> callback because nbcon_reacquire_nobuf(wctxt) ended in an infinite
> loop.
>
> It was a real lockup. It was recently fixed in 6.17-rc1 by
> commit 571c1ea91a73db56bd94 ("printk: nbcon: Allow reacquire
> during panic"), see
> https://patch.msgid.link/20250606185549.900611-1-john.ogness@linutronix.de
> It is possible that it fixed your problem as well.
>
> That said, it might make sense to disable the hardlockup
> detector during panic. But I do not like the proposed way,
> see below.
>
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -339,6 +339,7 @@ void panic(const char *fmt, ...)
>>  	 */
>>  	local_irq_disable();
>>  	preempt_disable_notrace();
>> +	hardlockup_detector_perf_stop();
>
> I see the following in kernel/watchdog_perf.c:
>
> /**
>  * hardlockup_detector_perf_stop - Globally stop watchdog events
>  *
>  * Special interface for x86 to handle the perf HT bug.
>  */
> void __init hardlockup_detector_perf_stop(void)
> {
> 	[...]
> 	lockdep_assert_cpus_held();
> 	[...]
> }
>
> 1. It is suspicious to see an x86-specific "hacky" function called in
>    the generic panic().
>
>    Is this safe?
>    What about other hardlockup detectors?
>
> 2. I expect that lockdep_assert_cpus_held() would complain
>    when CONFIG_LOCKDEP was enabled.
>
> Anyway, it does not look safe. panic() might be called in any context,
> including NMI, and I see:
>
>   + hardlockup_detector_perf_stop()
>     + perf_event_disable()
>       + perf_event_ctx_lock()
>         + mutex_lock_nested()
>
> This would definitely cause a deadlock when called in NMI context.
>
> Alternative:
>
> A conservative approach would be to update watchdog_hardlockup_check()
> so that it does nothing when panic_in_progress() returns true. It
> would even work for both hardlockup detector implementations.

Yes, I think it is a better solution. I didn't find
panic_in_progress() but found hardlockup_detector_perf_stop()
available instead :) I will send another patch.

>
> Best Regards,
> Petr

--
Best regards,
Jinchao
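For reference, the helper Petr mentions does exist in recent mainline, but as a static function in kernel/printk/printk.c, which is presumably why it did not show up in a symbol search. A sketch of the check it performs (worth verifying against the actual tree):

```c
/* kernel/printk/printk.c -- static helper, shown here for reference */
static bool panic_in_progress(void)
{
	/* panic_cpu holds the id of the panicking CPU, or PANIC_CPU_INVALID */
	return unlikely(atomic_read(&panic_cpu) != PANIC_CPU_INVALID);
}
```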
Adding Peter Zijlstra into Cc.

The nested panic() should return. But panic() was never supposed to
return. It seems that it is not marked as noreturn, but I am not sure
whether some tricks are hidden somewhere, in objtool, or...

On Wed 2025-08-20 14:22:52, Jinchao Wang wrote:
> On 8/19/25 23:01, Petr Mladek wrote:
> > On Wed 2025-07-30 11:06:33, Wang Jinchao wrote:
> > > When a panic happens, it blocks the CPU, which may
> > > trigger the hardlockup detector if some dump is slow.
> > > So call hardlockup_detector_perf_stop() to disable
> > > the hardlockup detector.
> >
> > Could you please provide more details, especially the log showing
> > the problem?
>
> Here's what happened: I configured the kernel to use efi-pstore for
> kdump logging while enabling the perf hardlockup detector (NMI).
> Perhaps efi-pstore was slow and there were too many logs. When the
> first panic was triggered, the pstore dump callback in
> kmsg_dump()->dumper->dump() took a long time, which triggered the
> NMI watchdog. Then emergency_restart() restarted the machine before
> the efi-pstore operation finished. The function call flow looked
> like this:
>
> ```c
> real panic() {
>     kmsg_dump() {
>         ...
>         pstore_dump() {
>             start_dump();
>             ... // long-running operation triggers the NMI watchdog
>             nmi panic() {
>                 ...
>                 emergency_restart(); // pstore unfinished
>             }
>             ...
>             finish_dump(); // never reached
>         }
>     }
> }
> ```
>
> This created a nested panic situation where the second panic
> interrupted the crash dump process, causing the loss of the original
> panic information.

I believe that we should prevent the nested panic() in the first
place. There already is the following code:

void vpanic(const char *fmt, va_list args)
{
[...]
	 * Only one CPU is allowed to execute the panic code from here. For
	 * multiple parallel invocations of panic, all other CPUs either
	 * stop themself or will wait until they are stopped by the 1st CPU
	 * with smp_send_stop().
	 *
	 * cmpxchg success means this is the 1st CPU which comes here,
	 * so go ahead.
	 * `old_cpu == this_cpu' means we came from nmi_panic() which sets
	 * panic_cpu to this CPU. In this case, this is also the 1st CPU.
	 */
	old_cpu = PANIC_CPU_INVALID;
	this_cpu = raw_smp_processor_id();

	/* atomic_try_cmpxchg updates old_cpu on failure */
	if (atomic_try_cmpxchg(&panic_cpu, &old_cpu, this_cpu)) {
		/* go ahead */
	} else if (old_cpu != this_cpu)
		panic_smp_self_stop();

We should improve it to detect a nested panic() call as well,
something like:

	this_cpu = raw_smp_processor_id();
	/* Bail out in a nested panic(). Let the outer one finish the job. */
	if (this_cpu == atomic_read(&panic_cpu))
		return;

	/* atomic_try_cmpxchg updates old_cpu on failure */
	old_cpu = PANIC_CPU_INVALID;
	if (atomic_try_cmpxchg(&panic_cpu, &old_cpu, this_cpu)) {
		/* go ahead */
	} else if (old_cpu != this_cpu)
		panic_smp_self_stop();

> > That said, it might make sense to disable the hardlockup
> > detector during panic. But I do not like the proposed way,
> > see below.
> >
> > > --- a/kernel/panic.c
> > > +++ b/kernel/panic.c
> > > @@ -339,6 +339,7 @@ void panic(const char *fmt, ...)
> > >  	 */
> > >  	local_irq_disable();
> > >  	preempt_disable_notrace();
> > > +	hardlockup_detector_perf_stop();
> >
> > Anyway, it does not look safe. panic() might be called in any context,
> > including NMI, and I see:
> >
> >   + hardlockup_detector_perf_stop()
> >     + perf_event_disable()
> >       + perf_event_ctx_lock()
> >         + mutex_lock_nested()
> >
> > This would definitely cause a deadlock when called in NMI context.
> >
> > Alternative:
> >
> > A conservative approach would be to update watchdog_hardlockup_check()
> > so that it does nothing when panic_in_progress() returns true. It
> > would even work for both hardlockup detector implementations.
>
> Yes, I think it is a better solution.
> I didn't find panic_in_progress() but found
> hardlockup_detector_perf_stop() available instead :)
> I will send another patch.

OK.

Best Regards,
Petr
On 8/20/25 18:22, Petr Mladek wrote:
> Adding Peter Zijlstra into Cc.
>
> The nested panic() should return. But panic() was never supposed to
> return. It seems that it is not marked as noreturn, but I am not sure
> whether some tricks are hidden somewhere, in objtool, or...
>
> On Wed 2025-08-20 14:22:52, Jinchao Wang wrote:
>> On 8/19/25 23:01, Petr Mladek wrote:
>>> On Wed 2025-07-30 11:06:33, Wang Jinchao wrote:
>>>> When a panic happens, it blocks the CPU, which may
>>>> trigger the hardlockup detector if some dump is slow.
>>>> So call hardlockup_detector_perf_stop() to disable
>>>> the hardlockup detector.
>>>
>>> Could you please provide more details, especially the log showing
>>> the problem?
>>
>> Here's what happened: I configured the kernel to use efi-pstore for
>> kdump logging while enabling the perf hardlockup detector (NMI).
>> Perhaps efi-pstore was slow and there were too many logs. When the
>> first panic was triggered, the pstore dump callback in
>> kmsg_dump()->dumper->dump() took a long time, which triggered the
>> NMI watchdog. Then emergency_restart() restarted the machine before
>> the efi-pstore operation finished. The function call flow looked
>> like this:
>>
>> ```c
>> real panic() {
>>     kmsg_dump() {
>>         ...
>>         pstore_dump() {
>>             start_dump();
>>             ... // long-running operation triggers the NMI watchdog
>>             nmi panic() {
>>                 ...
>>                 emergency_restart(); // pstore unfinished
>>             }
>>             ...
>>             finish_dump(); // never reached
>>         }
>>     }
>> }
>> ```
>>
>> This created a nested panic situation where the second panic
>> interrupted the crash dump process, causing the loss of the original
>> panic information.
>
> I believe that we should prevent the nested panic() in the first
> place. There already is the following code:
>
> void vpanic(const char *fmt, va_list args)
> {
> [...]
> 	 * Only one CPU is allowed to execute the panic code from here. For
> 	 * multiple parallel invocations of panic, all other CPUs either
> 	 * stop themself or will wait until they are stopped by the 1st CPU
> 	 * with smp_send_stop().
> 	 *
> 	 * cmpxchg success means this is the 1st CPU which comes here,
> 	 * so go ahead.
> 	 * `old_cpu == this_cpu' means we came from nmi_panic() which sets
> 	 * panic_cpu to this CPU. In this case, this is also the 1st CPU.
> 	 */
> 	old_cpu = PANIC_CPU_INVALID;
> 	this_cpu = raw_smp_processor_id();
>
> 	/* atomic_try_cmpxchg updates old_cpu on failure */
> 	if (atomic_try_cmpxchg(&panic_cpu, &old_cpu, this_cpu)) {
> 		/* go ahead */
> 	} else if (old_cpu != this_cpu)
> 		panic_smp_self_stop();
>
> We should improve it to detect a nested panic() call as well,
> something like:
>
> 	this_cpu = raw_smp_processor_id();
> 	/* Bail out in a nested panic(). Let the outer one finish the job. */
> 	if (this_cpu == atomic_read(&panic_cpu))
> 		return;
>
> 	/* atomic_try_cmpxchg updates old_cpu on failure */
> 	old_cpu = PANIC_CPU_INVALID;
> 	if (atomic_try_cmpxchg(&panic_cpu, &old_cpu, this_cpu)) {
> 		/* go ahead */
> 	} else if (old_cpu != this_cpu)
> 		panic_smp_self_stop();
>
Agree with you. Please see my patchset
"panic: introduce panic status function family":
https://lore.kernel.org/all/20250820091702.512524-1-wangjinchao600@gmail.com/
For this nested panic problem, see patch 9.

>
>>> That said, it might make sense to disable the hardlockup
>>> detector during panic. But I do not like the proposed way,
>>> see below.
>>>
>>>> --- a/kernel/panic.c
>>>> +++ b/kernel/panic.c
>>>> @@ -339,6 +339,7 @@ void panic(const char *fmt, ...)
>>>>  	 */
>>>>  	local_irq_disable();
>>>>  	preempt_disable_notrace();
>>>> +	hardlockup_detector_perf_stop();
>>>
>>> Anyway, it does not look safe. panic() might be called in any context,
>>> including NMI, and I see:
>>>
>>>   + hardlockup_detector_perf_stop()
>>>     + perf_event_disable()
>>>       + perf_event_ctx_lock()
>>>         + mutex_lock_nested()
>>>
>>> This would definitely cause a deadlock when called in NMI context.
>>>
>>> Alternative:
>>>
>>> A conservative approach would be to update watchdog_hardlockup_check()
>>> so that it does nothing when panic_in_progress() returns true. It
>>> would even work for both hardlockup detector implementations.
>> Yes, I think it is a better solution.
>> I didn't find panic_in_progress() but found
>> hardlockup_detector_perf_stop() available instead :)
>> I will send another patch.
>
> OK.
>
> Best Regards,
> Petr

--
Best regards,
Jinchao
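The linked patchset is not quoted in this thread; purely as an illustration of the shape such a "panic status function family" could take on top of the existing panic_cpu atomic, the names below are hypothetical and not necessarily those used in the patchset:

```c
/*
 * Hypothetical sketch only -- the real API lives in the patchset
 * linked above. Both helpers build on the existing panic_cpu atomic
 * from kernel/panic.c.
 */
static inline bool panic_in_progress(void)
{
	return atomic_read(&panic_cpu) != PANIC_CPU_INVALID;
}

/* True when the current CPU is the one executing panic(). */
static inline bool panic_on_this_cpu(void)
{
	return atomic_read(&panic_cpu) == raw_smp_processor_id();
}
```

With a helper like panic_on_this_cpu(), the nested-panic bail-out Petr sketched reduces to an early return at the top of vpanic(), and watchdog_hardlockup_check() can use panic_in_progress() for the conservative alternative.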