kernel/printk/internal.h | 8 ++-- kernel/printk/nbcon.c | 9 ++++- kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++-------- 3 files changed, 78 insertions(+), 20 deletions(-)
This is v2 of a series to address multiple reports [0][1] (+ 2 offlist) of suspend failing when NBCON console drivers are in use. With the help of NXP and NVIDIA we were able to isolate the problem and verify the fix. v1 is here [2]. The first NBCON drivers appeared in 6.13, so currently there is no LTS kernel that requires this series. But it should go into 6.17.x and 6.18. The changes since v1: - For printk_trigger_flush() add support for all flush types that are available. This will prevent printk_trigger_flush() from trying to inappropriately queue irq_work after this series is applied. - Add WARN_ON_ONCE() to the printk irq_work queueing functions in case they are called when irq_work is blocked. There should never be (and currently are no) such callers, but these functions are externally available. John Ogness [0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com [1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com [2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de John Ogness (2): printk: Allow printk_trigger_flush() to flush all types printk: Avoid scheduling irq_work on suspend kernel/printk/internal.h | 8 ++-- kernel/printk/nbcon.c | 9 ++++- kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++-------- 3 files changed, 78 insertions(+), 20 deletions(-) base-commit: e9a6fb0bcdd7609be6969112f3fbfcce3b1d4a7c -- 2.47.3
On Thu, Nov 13, 2025 at 05:09:46PM +0106, John Ogness wrote: > This is v2 of a series to address multiple reports [0][1] > (+ 2 offlist) of suspend failing when NBCON console drivers are > in use. With the help of NXP and NVIDIA we were able to isolate > the problem and verify the fix. > > v1 is here [2]. > > The first NBCON drivers appeared in 6.13, so currently there is > no LTS kernel that requires this series. But it should go into > 6.17.x and 6.18. > > The changes since v1: > > - For printk_trigger_flush() add support for all flush types > that are available. This will prevent printk_trigger_flush() > from trying to inappropriately queue irq_work after this > series is applied. > > - Add WARN_ON_ONCE() to the printk irq_work queueing functions > in case they are called when irq_work is blocked. There > should never be (and currently are no) such callers, but > these functions are externally available. > > John Ogness > > [0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com > [1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com > [2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de > > John Ogness (2): > printk: Allow printk_trigger_flush() to flush all types > printk: Avoid scheduling irq_work on suspend > > kernel/printk/internal.h | 8 ++-- > kernel/printk/nbcon.c | 9 ++++- > kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++-------- > 3 files changed, 78 insertions(+), 20 deletions(-) Sorry, I'm a bit late, but this seems to solve all the issues we were seeing on Tegra boards, so for the record: Tested-by: Thierry Reding <treding@nvidia.com>
On Thu 2025-11-13 17:09:46, John Ogness wrote: > This is v2 of a series to address multiple reports [0][1] > (+ 2 offlist) of suspend failing when NBCON console drivers are > in use. With the help of NXP and NVIDIA we were able to isolate > the problem and verify the fix. > > The first NBCON drivers appeared in 6.13, so currently there is > no LTS kernel that requires this series. But it should go into > 6.17.x and 6.18. > > John Ogness (2): > printk: Allow printk_trigger_flush() to flush all types > printk: Avoid scheduling irq_work on suspend > > kernel/printk/internal.h | 8 ++-- > kernel/printk/nbcon.c | 9 ++++- > kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++-------- > 3 files changed, 78 insertions(+), 20 deletions(-) JFYI, the patchset has been committed into printk/linux.git, branch rework/suspend-fixes. Best Regards, Petr
Hi Petr,
On 2025-11-19, Petr Mladek <pmladek@suse.com> wrote:
> JFYI, the patchset has been committed into printk/linux.git,
> branch rework/suspend-fixes.
While doing more testing I hit the new WARN_ON_ONCE() in
__wake_up_klogd():
[ 125.306075][ T92] Timekeeping suspended for 9.749 seconds
[ 125.306093][ T92] ------------[ cut here ]------------
[ 125.306108][ T92] WARNING: CPU: 0 PID: 92 at kernel/printk/printk.c:4539 vprintk_emit+0x134/0x2e8
[ 125.306151][ T92] Modules linked in: pm33xx ti_emif_sram wkup_m3_ipc wkup_m3_rproc omap_mailbox rtc_omap
[ 125.306249][ T92] CPU: 0 UID: 0 PID: 92 Comm: rtcwake Not tainted 6.18.0-rc5-00005-g3d7d27fc1b14 #162 PREEMPT
[ 125.306276][ T92] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 125.306290][ T92] Call trace:
[ 125.306308][ T92] unwind_backtrace from show_stack+0x18/0x1c
[ 125.306356][ T92] show_stack from dump_stack_lvl+0x50/0x64
[ 125.306398][ T92] dump_stack_lvl from __warn+0x7c/0x160
[ 125.306433][ T92] __warn from warn_slowpath_fmt+0x158/0x1f0
[ 125.306459][ T92] warn_slowpath_fmt from vprintk_emit+0x134/0x2e8
[ 125.306487][ T92] vprintk_emit from _printk_deferred+0x44/0x84
[ 125.306520][ T92] _printk_deferred from tk_debug_account_sleep_time+0x78/0x88
[ 125.306574][ T92] tk_debug_account_sleep_time from timekeeping_inject_sleeptime64+0x3c/0x6c
[ 125.306624][ T92] timekeeping_inject_sleeptime64 from rtc_resume.part.0+0x158/0x178
[ 125.306666][ T92] rtc_resume.part.0 from rtc_resume+0x54/0x64
[ 125.306705][ T92] rtc_resume from dpm_run_callback+0x68/0x1d4
[ 125.306747][ T92] dpm_run_callback from device_resume+0xc8/0x200
[ 125.306779][ T92] device_resume from dpm_resume+0x208/0x304
[ 125.306813][ T92] dpm_resume from dpm_resume_end+0x14/0x24
[ 125.306846][ T92] dpm_resume_end from suspend_devices_and_enter+0x1e8/0x8a4
[ 125.306892][ T92] suspend_devices_and_enter from pm_suspend+0x328/0x3c0
[ 125.306924][ T92] pm_suspend from state_store+0x70/0xd0
[ 125.306955][ T92] state_store from kernfs_fop_write_iter+0x124/0x1e4
[ 125.307001][ T92] kernfs_fop_write_iter from vfs_write+0x1f0/0x2bc
[ 125.307049][ T92] vfs_write from ksys_write+0x68/0xe8
[ 125.307085][ T92] ksys_write from ret_fast_syscall+0x0/0x58
[ 125.307113][ T92] Exception stack(0xd025dfa8 to 0xd025dff0)
[ 125.307137][ T92] dfa0: 00000004 bed09f71 00000004 bed09f71 00000003 00000001
[ 125.307157][ T92] dfc0: 00000004 bed09f71 00000003 00000004 00510bd4 00000000 00000000 0050e634
[ 125.307172][ T92] dfe0: 00000004 bed09bd8 b6ebc20b b6e35616
[ 125.307185][ T92] ---[ end trace 0000000000000000 ]---
It is due to a use of printk_deferred(). This goes through the special
case of "level == LOGLEVEL_SCHED" in vprintk_emit(). Originally I had
patched this code as well, but then later removed it thinking that it
was not needed. But it is needed. :-/ Something like:
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b1c0d35cf3ca..c27fc7fc64eb 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2393,7 +2393,7 @@ asmlinkage int vprintk_emit(int facility, int level,
/* If called from the scheduler, we can not call up(). */
if (level == LOGLEVEL_SCHED) {
level = LOGLEVEL_DEFAULT;
- ft.legacy_offload |= ft.legacy_direct;
+ ft.legacy_offload |= ft.legacy_direct && !console_irqwork_blocked;
ft.legacy_direct = false;
}
Is this solution ok for you? Do you prefer a follow-up patch or a v3?
John
On Thu 2025-11-20 12:09:43, John Ogness wrote:
> Hi Petr,
>
> On 2025-11-19, Petr Mladek <pmladek@suse.com> wrote:
> > JFYI, the patchset has been committed into printk/linux.git,
> > branch rework/suspend-fixes.
>
> While doing more testing I hit the new WARN_ON_ONCE() in
> __wake_up_klogd():
>
> [ 125.306075][ T92] Timekeeping suspended for 9.749 seconds
> [ 125.306093][ T92] ------------[ cut here ]------------
> [ 125.306108][ T92] WARNING: CPU: 0 PID: 92 at kernel/printk/printk.c:4539 vprintk_emit+0x134/0x2e8
> [ 125.306151][ T92] Modules linked in: pm33xx ti_emif_sram wkup_m3_ipc wkup_m3_rproc omap_mailbox rtc_omap
> [ 125.306249][ T92] CPU: 0 UID: 0 PID: 92 Comm: rtcwake Not tainted 6.18.0-rc5-00005-g3d7d27fc1b14 #162 PREEMPT
> [ 125.306276][ T92] Hardware name: Generic AM33XX (Flattened Device Tree)
> [ 125.306290][ T92] Call trace:
> [ 125.306308][ T92] unwind_backtrace from show_stack+0x18/0x1c
> [ 125.306356][ T92] show_stack from dump_stack_lvl+0x50/0x64
> [ 125.306398][ T92] dump_stack_lvl from __warn+0x7c/0x160
> [ 125.306433][ T92] __warn from warn_slowpath_fmt+0x158/0x1f0
> [ 125.306459][ T92] warn_slowpath_fmt from vprintk_emit+0x134/0x2e8
> [ 125.306487][ T92] vprintk_emit from _printk_deferred+0x44/0x84
> [ 125.306520][ T92] _printk_deferred from tk_debug_account_sleep_time+0x78/0x88
> [ 125.306574][ T92] tk_debug_account_sleep_time from timekeeping_inject_sleeptime64+0x3c/0x6c
> [ 125.306624][ T92] timekeeping_inject_sleeptime64 from rtc_resume.part.0+0x158/0x178
> [ 125.306666][ T92] rtc_resume.part.0 from rtc_resume+0x54/0x64
> [ 125.306705][ T92] rtc_resume from dpm_run_callback+0x68/0x1d4
> [ 125.306747][ T92] dpm_run_callback from device_resume+0xc8/0x200
> [ 125.306779][ T92] device_resume from dpm_resume+0x208/0x304
> [ 125.306813][ T92] dpm_resume from dpm_resume_end+0x14/0x24
> [ 125.306846][ T92] dpm_resume_end from suspend_devices_and_enter+0x1e8/0x8a4
> [ 125.306892][ T92] suspend_devices_and_enter from pm_suspend+0x328/0x3c0
> [ 125.306924][ T92] pm_suspend from state_store+0x70/0xd0
> [ 125.306955][ T92] state_store from kernfs_fop_write_iter+0x124/0x1e4
> [ 125.307001][ T92] kernfs_fop_write_iter from vfs_write+0x1f0/0x2bc
> [ 125.307049][ T92] vfs_write from ksys_write+0x68/0xe8
> [ 125.307085][ T92] ksys_write from ret_fast_syscall+0x0/0x58
> [ 125.307113][ T92] Exception stack(0xd025dfa8 to 0xd025dff0)
> [ 125.307137][ T92] dfa0: 00000004 bed09f71 00000004 bed09f71 00000003 00000001
> [ 125.307157][ T92] dfc0: 00000004 bed09f71 00000003 00000004 00510bd4 00000000 00000000 0050e634
> [ 125.307172][ T92] dfe0: 00000004 bed09bd8 b6ebc20b b6e35616
> [ 125.307185][ T92] ---[ end trace 0000000000000000 ]---
>
> It is due to a use of printk_deferred(). This goes through the special
> case of "level == LOGLEVEL_SCHED" in vprintk_emit(). Originally I had
> patched this code as well, but then later removed it thinking that it
> was not needed. But it is needed. :-/ Something like:
Great catch!
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index b1c0d35cf3ca..c27fc7fc64eb 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2393,7 +2393,7 @@ asmlinkage int vprintk_emit(int facility, int level,
> /* If called from the scheduler, we can not call up(). */
> if (level == LOGLEVEL_SCHED) {
> level = LOGLEVEL_DEFAULT;
> - ft.legacy_offload |= ft.legacy_direct;
> + ft.legacy_offload |= ft.legacy_direct && !console_irqwork_blocked;
> ft.legacy_direct = false;
> }
>
> Is this solution ok for you? Do you prefer a follow-up patch or a v3?
Nothing better comes to my mind ;-) A follow-up patch would be
lovely. Please, go ahead.
Best Regards,
Petr
On Thu 2025-11-13 17:09:46, John Ogness wrote: > This is v2 of a series to address multiple reports [0][1] > (+ 2 offlist) of suspend failing when NBCON console drivers are > in use. With the help of NXP and NVIDIA we were able to isolate > the problem and verify the fix. > > v1 is here [2]. > > The first NBCON drivers appeared in 6.13, so currently there is > no LTS kernel that requires this series. But it should go into > 6.17.x and 6.18. > > The changes since v1: > > - For printk_trigger_flush() add support for all flush types > that are available. This will prevent printk_trigger_flush() > from trying to inappropriately queue irq_work after this > series is applied. > > - Add WARN_ON_ONCE() to the printk irq_work queueing functions > in case they are called when irq_work is blocked. There > should never be (and currently are no) such callers, but > these functions are externally available. > > John Ogness > > [0] https://lore.kernel.org/lkml/80b020fc-c18a-4da4-b222-16da1cab2f4c@nvidia.com > [1] https://lore.kernel.org/lkml/DB9PR04MB8429E7DDF2D93C2695DE401D92C4A@DB9PR04MB8429.eurprd04.prod.outlook.com > [2] https://lore.kernel.org/lkml/20251111144328.887159-1-john.ogness@linutronix.de > > John Ogness (2): > printk: Allow printk_trigger_flush() to flush all types > printk: Avoid scheduling irq_work on suspend > > kernel/printk/internal.h | 8 ++-- > kernel/printk/nbcon.c | 9 ++++- > kernel/printk/printk.c | 81 ++++++++++++++++++++++++++++++++-------- > 3 files changed, 78 insertions(+), 20 deletions(-) The patchset seems to be ready for linux-next from my POV. I am going to wait few more days for potential feedback. I'll push it later the following week unless anyone complains. Best Regards, Petr
© 2016 - 2026 Red Hat, Inc.