[PATCH 3/3] printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record

Petr Mladek posted 3 patches 5 days, 9 hours ago
[PATCH 3/3] printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record
Posted by Petr Mladek 5 days, 9 hours ago
printk() tries to flush messages with NBCON_PRIO_EMERGENCY on
nbcon consoles immediately. It might take seconds to flush all
pending lines on slow serial consoles. Note that there might be
hundreds of messages, for example:

[    3.771531][    T1] pci 0000:3e:08.1: [8086:324
** replaying previous printk message **
[    3.771531][    T1] pci 0000:3e:08.1: [8086:3246] type 00 class 0x088000 PCIe Root Complex Integrated Endpoint
[ ... more than 2000 lines, about 200kB messages ... ]
[    3.837752][    T1] pci 0000:20:01.0: Adding to iommu group 18
[    3.837851][    T
** replaying previous printk message **
[    3.837851][    T1] pci 0000:20:03.0: Adding to iommu group 19
[    3.837946][    T1] pci 0000:20:05.0: Adding to iommu group 20
[ ... more than 500 messages for iommu groups 21-590 ...]
[    3.912932][    T1] pci 0000:f6:00.1: Adding to iommu group 591
[    3.913070][    T1] pci 0000:f6:00.2: Adding to iommu group 592
[    3.913243][    T1] DMAR: Intel(R) Virtualization Technology for Directed I/O
[    3.913245][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[    3.913245][    T1] software IO TLB: mapped [mem 0x000000004f000000-0x0000000053000000] (64MB)
[    3.913324][    T1] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
[    3.913325][    T1] RAPL PMU: hw unit of domain package 2^-14 Joules
[    3.913326][    T1] RAPL PMU: hw unit of domain dram 2^-14 Joules
[    3.913327][    T1] RAPL PMU: hw unit of domain psys 2^-0 Joules
[    3.933486][    T1] ------------[ cut here ]------------
[    3.933488][    T1] WARNING: CPU: 2 PID: 1 at arch/x86/events/intel/uncore.c:1156 uncore_pci_pmu_register+0x15e/0x180
[    3.930291][    C0] watchdog: Watchdog detected hard LOCKUP on cpu 0
[    3.930291][    C0] Kernel panic - not syncing: Hard LOCKUP
[...]
[    3.930291][    C0] CPU: 0 UID: 0 PID: 18 Comm: pr/ttyS0 Not tainted...
[...]
[    3.930291][    C0] RIP: 0010:nbcon_reacquire_nobuf+0x11/0x50
[    3.930291][    C0] Call Trace:
[...]
[    3.930291][    C0]  <TASK>
[    3.930291][    C0]  serial8250_console_write+0x16d/0x5c0
[    3.930291][    C0]  nbcon_emit_next_record+0x22c/0x250
[    3.930291][    C0]  nbcon_emit_one+0x93/0xe0
[    3.930291][    C0]  nbcon_kthread_func+0x13c/0x1c0

The are visible two takeovers of the console ownership:

  - The 1st one is triggered by the "WARNING: CPU: 2 PID: 1 at
    arch/x86/..." line printed with NBCON_PRIO_EMERGENCY.

  - The 2nd one is triggered by the "Kernel panic - not syncing:
    Hard LOCKUP" line printed with NBCON_PRIO_PANIC.

There are more than 2500 lines, at about 240kB, emitted between
the takeover and the 1st "WARNING" line in the emergency context.
This amount of pending messages had to be flushed by
nbcon_atomic_flush_pending() when WARN() printed its first line.

The atomic flush was holding the nbcon console context for too long so
that it triggered hard lockup on the CPU running the printk kthread
"pr/ttyS0". The kthread needed to reacquire the console ownership
for restoring the original serial port state in serial8250_console_write().

Prevent the hardlockup by releasing the nbcon console ownership after
each emitted record.

Note that __nbcon_atomic_flush_pending_con() used to hold the console
ownership all the time because it blocked the printk kthread. Otherwise
the kthread tried to flush the messages in parallel which caused repeated
takeovers and more replayed messages.

It is not longer a problem because the repeated takeovers are blocked
by the counter of emergency contexts, see nbcon_cpu_emergency_cnt.

Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
Signed-off-by: Petr Mladek <pmladek@suse.com>
---
 kernel/printk/nbcon.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 219ae0c8b5ed..e298346111b2 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1532,10 +1532,10 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
 	ctxt->prio			= nbcon_get_default_prio();
 	ctxt->allow_unsafe_takeover	= allow_unsafe_takeover;
 
-	if (!nbcon_context_try_acquire(ctxt, false))
-		return -EPERM;
-
 	while (nbcon_seq_read(con) < stop_seq) {
+		if (!nbcon_context_try_acquire(ctxt, false))
+			return -EPERM;
+
 		/*
 		 * nbcon_emit_next_record() returns false when the console was
 		 * handed over or taken over. In both cases the context is no
@@ -1544,6 +1544,8 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
 		if (!nbcon_emit_next_record(&wctxt, true))
 			return -EAGAIN;
 
+		nbcon_context_release(ctxt);
+
 		if (!ctxt->backlog) {
 			/* Are there reserved but not yet finalized records? */
 			if (nbcon_seq_read(con) < stop_seq)
@@ -1552,7 +1554,6 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
 		}
 	}
 
-	nbcon_context_release(ctxt);
 	return err;
 }
 
-- 
2.51.0
Re: [PATCH 3/3] printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record
Posted by Andrew Murray 2 days, 13 hours ago
On Fri, 26 Sept 2025 at 13:50, Petr Mladek <pmladek@suse.com> wrote:
>
> printk() tries to flush messages with NBCON_PRIO_EMERGENCY on
> nbcon consoles immediately. It might take seconds to flush all
> pending lines on slow serial consoles. Note that there might be
> hundreds of messages, for example:
>
> [    3.771531][    T1] pci 0000:3e:08.1: [8086:324
> ** replaying previous printk message **
> [    3.771531][    T1] pci 0000:3e:08.1: [8086:3246] type 00 class 0x088000 PCIe Root Complex Integrated Endpoint
> [ ... more than 2000 lines, about 200kB messages ... ]
> [    3.837752][    T1] pci 0000:20:01.0: Adding to iommu group 18
> [    3.837851][    T
> ** replaying previous printk message **
> [    3.837851][    T1] pci 0000:20:03.0: Adding to iommu group 19
> [    3.837946][    T1] pci 0000:20:05.0: Adding to iommu group 20
> [ ... more than 500 messages for iommu groups 21-590 ...]
> [    3.912932][    T1] pci 0000:f6:00.1: Adding to iommu group 591
> [    3.913070][    T1] pci 0000:f6:00.2: Adding to iommu group 592
> [    3.913243][    T1] DMAR: Intel(R) Virtualization Technology for Directed I/O
> [    3.913245][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> [    3.913245][    T1] software IO TLB: mapped [mem 0x000000004f000000-0x0000000053000000] (64MB)
> [    3.913324][    T1] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
> [    3.913325][    T1] RAPL PMU: hw unit of domain package 2^-14 Joules
> [    3.913326][    T1] RAPL PMU: hw unit of domain dram 2^-14 Joules
> [    3.913327][    T1] RAPL PMU: hw unit of domain psys 2^-0 Joules
> [    3.933486][    T1] ------------[ cut here ]------------
> [    3.933488][    T1] WARNING: CPU: 2 PID: 1 at arch/x86/events/intel/uncore.c:1156 uncore_pci_pmu_register+0x15e/0x180
> [    3.930291][    C0] watchdog: Watchdog detected hard LOCKUP on cpu 0
> [    3.930291][    C0] Kernel panic - not syncing: Hard LOCKUP
> [...]
> [    3.930291][    C0] CPU: 0 UID: 0 PID: 18 Comm: pr/ttyS0 Not tainted...
> [...]
> [    3.930291][    C0] RIP: 0010:nbcon_reacquire_nobuf+0x11/0x50
> [    3.930291][    C0] Call Trace:
> [...]
> [    3.930291][    C0]  <TASK>
> [    3.930291][    C0]  serial8250_console_write+0x16d/0x5c0
> [    3.930291][    C0]  nbcon_emit_next_record+0x22c/0x250
> [    3.930291][    C0]  nbcon_emit_one+0x93/0xe0
> [    3.930291][    C0]  nbcon_kthread_func+0x13c/0x1c0
>
> The are visible two takeovers of the console ownership:
>
>   - The 1st one is triggered by the "WARNING: CPU: 2 PID: 1 at
>     arch/x86/..." line printed with NBCON_PRIO_EMERGENCY.
>
>   - The 2nd one is triggered by the "Kernel panic - not syncing:
>     Hard LOCKUP" line printed with NBCON_PRIO_PANIC.
>
> There are more than 2500 lines, at about 240kB, emitted between
> the takeover and the 1st "WARNING" line in the emergency context.
> This amount of pending messages had to be flushed by
> nbcon_atomic_flush_pending() when WARN() printed its first line.
>
> The atomic flush was holding the nbcon console context for too long so
> that it triggered hard lockup on the CPU running the printk kthread
> "pr/ttyS0". The kthread needed to reacquire the console ownership
> for restoring the original serial port state in serial8250_console_write().
>
> Prevent the hardlockup by releasing the nbcon console ownership after
> each emitted record.
>
> Note that __nbcon_atomic_flush_pending_con() used to hold the console
> ownership all the time because it blocked the printk kthread. Otherwise
> the kthread tried to flush the messages in parallel which caused repeated
> takeovers and more replayed messages.
>
> It is not longer a problem because the repeated takeovers are blocked
> by the counter of emergency contexts, see nbcon_cpu_emergency_cnt.
>
> Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> ---
>  kernel/printk/nbcon.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> index 219ae0c8b5ed..e298346111b2 100644
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1532,10 +1532,10 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>         ctxt->prio                      = nbcon_get_default_prio();
>         ctxt->allow_unsafe_takeover     = allow_unsafe_takeover;
>
> -       if (!nbcon_context_try_acquire(ctxt, false))
> -               return -EPERM;
> -
>         while (nbcon_seq_read(con) < stop_seq) {
> +               if (!nbcon_context_try_acquire(ctxt, false))
> +                       return -EPERM;
> +
>                 /*
>                  * nbcon_emit_next_record() returns false when the console was
>                  * handed over or taken over. In both cases the context is no
> @@ -1544,6 +1544,8 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>                 if (!nbcon_emit_next_record(&wctxt, true))
>                         return -EAGAIN;
>
> +               nbcon_context_release(ctxt);
> +
>                 if (!ctxt->backlog) {
>                         /* Are there reserved but not yet finalized records? */
>                         if (nbcon_seq_read(con) < stop_seq)
> @@ -1552,7 +1554,6 @@ static int __nbcon_atomic_flush_pending_con(struct console *con, u64 stop_seq,
>                 }
>         }
>
> -       nbcon_context_release(ctxt);
>         return err;
>  }
>
> --
> 2.51.0
>

Reviewed-by: Andrew Murray <amurray@thegoodpenguin.co.uk>

Thanks,

Andrew Murray
Re: [PATCH 3/3] printk/nbcon: Release nbcon consoles ownership in atomic flush after each emitted record
Posted by John Ogness 5 days, 7 hours ago
On 2025-09-26, Petr Mladek <pmladek@suse.com> wrote:
> printk() tries to flush messages with NBCON_PRIO_EMERGENCY on
> nbcon consoles immediately. It might take seconds to flush all
> pending lines on slow serial consoles. Note that there might be
> hundreds of messages, for example:
>
> [    3.771531][    T1] pci 0000:3e:08.1: [8086:324
> ** replaying previous printk message **
> [    3.771531][    T1] pci 0000:3e:08.1: [8086:3246] type 00 class 0x088000 PCIe Root Complex Integrated Endpoint
> [ ... more than 2000 lines, about 200kB messages ... ]
> [    3.837752][    T1] pci 0000:20:01.0: Adding to iommu group 18
> [    3.837851][    T
> ** replaying previous printk message **
> [    3.837851][    T1] pci 0000:20:03.0: Adding to iommu group 19
> [    3.837946][    T1] pci 0000:20:05.0: Adding to iommu group 20
> [ ... more than 500 messages for iommu groups 21-590 ...]
> [    3.912932][    T1] pci 0000:f6:00.1: Adding to iommu group 591
> [    3.913070][    T1] pci 0000:f6:00.2: Adding to iommu group 592
> [    3.913243][    T1] DMAR: Intel(R) Virtualization Technology for Directed I/O
> [    3.913245][    T1] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> [    3.913245][    T1] software IO TLB: mapped [mem 0x000000004f000000-0x0000000053000000] (64MB)
> [    3.913324][    T1] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 655360 ms ovfl timer
> [    3.913325][    T1] RAPL PMU: hw unit of domain package 2^-14 Joules
> [    3.913326][    T1] RAPL PMU: hw unit of domain dram 2^-14 Joules
> [    3.913327][    T1] RAPL PMU: hw unit of domain psys 2^-0 Joules
> [    3.933486][    T1] ------------[ cut here ]------------
> [    3.933488][    T1] WARNING: CPU: 2 PID: 1 at arch/x86/events/intel/uncore.c:1156 uncore_pci_pmu_register+0x15e/0x180
> [    3.930291][    C0] watchdog: Watchdog detected hard LOCKUP on cpu 0
> [    3.930291][    C0] Kernel panic - not syncing: Hard LOCKUP
> [...]
> [    3.930291][    C0] CPU: 0 UID: 0 PID: 18 Comm: pr/ttyS0 Not tainted...
> [...]
> [    3.930291][    C0] RIP: 0010:nbcon_reacquire_nobuf+0x11/0x50
> [    3.930291][    C0] Call Trace:
> [...]
> [    3.930291][    C0]  <TASK>
> [    3.930291][    C0]  serial8250_console_write+0x16d/0x5c0
> [    3.930291][    C0]  nbcon_emit_next_record+0x22c/0x250
> [    3.930291][    C0]  nbcon_emit_one+0x93/0xe0
> [    3.930291][    C0]  nbcon_kthread_func+0x13c/0x1c0
>
> The are visible two takeovers of the console ownership:
>
>   - The 1st one is triggered by the "WARNING: CPU: 2 PID: 1 at
>     arch/x86/..." line printed with NBCON_PRIO_EMERGENCY.
>
>   - The 2nd one is triggered by the "Kernel panic - not syncing:
>     Hard LOCKUP" line printed with NBCON_PRIO_PANIC.
>
> There are more than 2500 lines, at about 240kB, emitted between
> the takeover and the 1st "WARNING" line in the emergency context.
> This amount of pending messages had to be flushed by
> nbcon_atomic_flush_pending() when WARN() printed its first line.
>
> The atomic flush was holding the nbcon console context for too long so
> that it triggered hard lockup on the CPU running the printk kthread
> "pr/ttyS0". The kthread needed to reacquire the console ownership
> for restoring the original serial port state in serial8250_console_write().
>
> Prevent the hardlockup by releasing the nbcon console ownership after
> each emitted record.
>
> Note that __nbcon_atomic_flush_pending_con() used to hold the console
> ownership all the time because it blocked the printk kthread. Otherwise
> the kthread tried to flush the messages in parallel which caused repeated
> takeovers and more replayed messages.
>
> It is not longer a problem because the repeated takeovers are blocked
> by the counter of emergency contexts, see nbcon_cpu_emergency_cnt.
>
> Link: https://lore.kernel.org/all/aNQO-zl3k1l4ENfy@pathway.suse.cz
> Signed-off-by: Petr Mladek <pmladek@suse.com>

Reviewed-by: John Ogness <john.ogness@linutronix.de>

Looks good and performs as advertised.