kernel/sched/sched.h | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
A deadlock is observed when WARN_ONCE() uses printk() inside the scheduler
logic. printk_deferred_once() is a WARN_ONCE() similar special printk
facility for the scheduler to avoid unnecessary deadlocks.
One of the deadlock stacks looks like this:
queued_spin_lock_slowpath() --
//waiting on uart_port_lock which is already hold by current cpu
queued_spin_lock(inline)
do_raw_spin_lock(inline)
__raw_spin_lock_irqsave(inline)
_raw_spin_lock_irqsave()
qcom_geni_serial_console_write()
call_console_driver(inline)
console_emit_next_record()
console_flush_all(inline)
console_unlock()
vprintk_emit()
vprintk_default()
vprintk()
_printk()
__warn_printk()
rq_pin_lock(inline)
rq_lock_irqsave(inline)
.
wake_up_process(inline)
wake_up_worker(inline)
insert_work(inline)
__queue_work()
queue_work_on()
queue_work(inline)
tty_flip_buffer_push()
handle_rx_console()
qcom_geni_serial_handle_rx(inline)
qcom_geni_serial_isr() --
//acquired uart_port_lock
__handle_irq_event_percpu()
handle_irq_event_percpu(inline)
handle_irq_event()
.
After the change, the SCHED_WARN_ON(x) printout appears as follows:
[ x.xxxxxx] WARNING: ------------[ cut here ]------------
[ x.xxxxxx] WARNING: x
Fixes: 9148a3a10e0b ("sched/debug: Add SCHED_WARN_ON()")
Signed-off-by: Maria Yu <quic_aiquny@quicinc.com>
---
kernel/sched/sched.h | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index aef716c41edba8a6ec7d88db121fe0a9934b5264..863004630ad793c21272566f186d2fbee7fcc95b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -92,7 +92,16 @@ struct cpuidle_state;
#include "cpudeadline.h"
#ifdef CONFIG_SCHED_DEBUG
-# define SCHED_WARN_ON(x) WARN_ONCE(x, #x)
+# define deferred_warn_on(fmt, ...) \
+ printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
+# define SCHED_WARN_ON(x) ({\
+ int __ret_warn_on = !!(x); \
+ if (unlikely(__ret_warn_on)) { \
+ deferred_warn_on(CUT_HERE); \
+ deferred_warn_on(#x); \
+ } \
+ unlikely(__ret_warn_on); \
+})
#else
# define SCHED_WARN_ON(x) ({ (void)(x), 0; })
#endif
---
base-commit: af98d8a36a963e758e84266d152b92c7b51d4ecb
change-id: 20241226-sched-use-printk_deferred_once-2275b3a2575e
Best regards,
--
Maria Yu <quic_aiquny@quicinc.com>
On Fri, Dec 27, 2024 at 05:27:10PM +0800, Maria Yu wrote: > A deadlock is observed when WARN_ONCE() uses printk() inside the scheduler > logic. printk_deferred_once() is a WARN_ONCE() similar special printk > facility for the scheduler to avoid unnecessary deadlocks. No, problem is with printk. Using delayed will make it so that you'll never see the output if it dies. Only use delayed for printk()s that are expected and non fatal.
On Mon, Jan 13, 2025 at 02:12:43PM +0100, Peter Zijlstra wrote: > On Fri, Dec 27, 2024 at 05:27:10PM +0800, Maria Yu wrote: > > A deadlock is observed when WARN_ONCE() uses printk() inside the scheduler > > logic. printk_deferred_once() is a WARN_ONCE() similar special printk > > facility for the scheduler to avoid unnecessary deadlocks. > > No, problem is with printk. Using delayed will make it so that you'll > never see the output if it dies. > > Only use delayed for printk()s that are expected and non fatal. Also, if you trip WARN in scheduler, correct thing to do is fix WARN, not make WARN run 'better'. No WARN, no problem.
On 1/13/2025 9:16 PM, Peter Zijlstra wrote: > On Mon, Jan 13, 2025 at 02:12:43PM +0100, Peter Zijlstra wrote: >> On Fri, Dec 27, 2024 at 05:27:10PM +0800, Maria Yu wrote: >>> A deadlock is observed when WARN_ONCE() uses printk() inside the scheduler >>> logic. printk_deferred_once() is a WARN_ONCE() similar special printk >>> facility for the scheduler to avoid unnecessary deadlocks. >> >> No, problem is with printk. Using delayed will make it so that you'll >> never see the output if it dies. printk_deferred() only defers the console print, while it still ensures that the log buffer is logged immediately and correctly. There may be a side effect of showing fewer logs in a corner case if the kernel crashes before the first console_unlock is called. However, printk_deferred() will have the entire sched warn log in the printk log buffer, and the log buffer content is a common method to check the crash root cause. Therefore, in my opinion, in scenarios where the console-enabled printk is likely to crash, printk_deferred() will be more useful since the entire log information can be captured in the log buffer. >> >> Only use delayed for printk()s that are expected and non fatal. > > Also, if you trip WARN in scheduler, correct thing to do is fix WARN, > not make WARN run 'better'. No WARN, no problem. I agree that the real issue caused SCHED_WARN_ON() should be addressed. The actual sched logic problem is also being handled separately. However, it is still useful to use the correct method to print the information needed for debugging. Therefore, this patch aims to fix the debug method of SCHED_WARN_ON(). We've encountered various crash issues that exhibited the similar deadlock problem, but with different SCHED_WARN_ON() actually. Anyway, we don't want SCHED_WARN_ON() to sometimes be a warning in the log and other times cause a BUG_ON() behavior that crashes the device. -- Thx and BRs, Aiqun(Maria) Yu
© 2016 - 2026 Red Hat, Inc.