[PATCH 0/3] printk/nbcon: Prevent hardlockup reports caused by atomic nbcon flush

Posted by Petr Mladek 4 months, 2 weeks ago
This patchset should solve the problem which was discussed
at https://lore.kernel.org/all/aNFR45fL2L4PavNc@pathway.suse.cz

__nbcon_atomic_flush_pending_con() holds the nbcon console
ownership for the whole time it is flushing pending messages.
This might take a long time with slow serial consoles.

It might trigger a hardlockup report on another CPU which is
busy waiting for the nbcon console ownership, for example,
in nbcon_reacquire_nobuf() or __uart_port_nbcon_acquire().

The problem is solved by the 3rd patch. It releases the console
context ownership after each record.
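The idea of releasing the ownership after each record can be illustrated
with a small userspace model. This is only a sketch, not the code from the
3rd patch: struct model_console, model_try_acquire(), model_release(),
model_atomic_flush() and model_waiter() are hypothetical names, and the
window() callback merely stands in for another CPU which is busy waiting
for the ownership.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NO_OWNER (-1)

/* Hypothetical stand-in for an nbcon console; not the kernel's struct console. */
struct model_console {
	atomic_int owner;	/* id of the owning CPU, or MODEL_NO_OWNER */
	int next_seq;		/* sequence number of the next record to emit */
};

/* Try to take ownership; fails when another CPU already owns the console. */
static bool model_try_acquire(struct model_console *con, int cpu)
{
	int expected = MODEL_NO_OWNER;

	return atomic_compare_exchange_strong(&con->owner, &expected, cpu);
}

/* Give up ownership; only the current owner may call this. */
static void model_release(struct model_console *con, int cpu)
{
	assert(atomic_load(&con->owner) == cpu);
	atomic_store(&con->owner, MODEL_NO_OWNER);
}

/*
 * Emit records up to stop_seq, releasing ownership after each one.
 * window() runs while nobody owns the console and models the chance
 * given to a CPU that is busy waiting for the ownership.
 * Returns the number of records emitted by this caller.
 */
static int model_atomic_flush(struct model_console *con, int cpu, int stop_seq,
			      void (*window)(struct model_console *con))
{
	int emitted = 0;

	while (con->next_seq < stop_seq) {
		if (!model_try_acquire(con, cpu))
			break;	/* the current owner has to continue flushing */
		con->next_seq++;	/* "write" exactly one record */
		emitted++;
		model_release(con, cpu);
		if (window)
			window(con);
	}
	return emitted;
}

static int model_waiter_wins;

/* A waiting CPU (id 1) that grabs the console briefly whenever it is free. */
static void model_waiter(struct model_console *con)
{
	if (model_try_acquire(con, 1)) {
		model_waiter_wins++;
		model_release(con, 1);
	}
}
```

In this model the waiter gets one chance per emitted record. If the
flusher kept the ownership for the whole backlog instead, the waiter
would get no chance until the last record was written.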

The 3rd patch alone would increase the risk of takeovers and repeated
lines. This is prevented by the 1st patch, which blocks the printk kthread
when any CPU is in an emergency context.
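The blocking rule can be sketched in userspace as a global counter of
CPUs in an emergency context which the kthread checks before printing.
This is a simplified model under assumptions, not the kernel code:
model_emergency_cnt, model_emergency_enter(), model_emergency_exit() and
model_kthread_should_print() are hypothetical names, and the real check
in the patch may look at more conditions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of CPUs currently in an emergency context (model only). */
static atomic_int model_emergency_cnt = 0;

/* Called when a CPU enters an emergency context. */
static void model_emergency_enter(void)
{
	atomic_fetch_add(&model_emergency_cnt, 1);
}

/* Called when a CPU leaves an emergency context. */
static void model_emergency_exit(void)
{
	atomic_fetch_sub(&model_emergency_cnt, 1);
}

/*
 * The kthread prints only when there is a backlog and no CPU is in
 * an emergency context. Otherwise it stays asleep and leaves the
 * console to the atomic flush.
 */
static bool model_kthread_should_print(bool has_backlog)
{
	if (atomic_load(&model_emergency_cnt) > 0)
		return false;

	return has_backlog;
}
```

With the kthread parked this way, the short windows in which the atomic
flush does not own the console are not used by the kthread to take the
console over.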

The 2nd patch allows blocking the printk kthread in panic as well.
It is not important. It is just an obvious update of the check
for emergency contexts.

Note: The patchset applies against current Linus' tree (v6.17-rc7).

      The 2nd patch would need an update after the consolidation of
      the panic state API gets merged via -mm tree,
      see https://lore.kernel.org/r/20250825022947.1596226-2-wangjinchao600@gmail.com

Petr Mladek (3):
  printk/nbcon: Block printk kthreads when any CPU is in an emergency
    context
  printk/nbcon/panic: Allow printk kthread to sleep when the system is
    in panic
  printk/nbcon: Release nbcon consoles ownership in atomic flush after
    each emitted record

 kernel/printk/internal.h |  1 +
 kernel/printk/nbcon.c    | 43 +++++++++++++++++++++++++++++++++++-----
 kernel/printk/printk.c   |  2 +-
 3 files changed, 40 insertions(+), 6 deletions(-)

-- 
2.51.0
Re: [PATCH 0/3] printk/nbcon: Prevent hardlockup reports caused by atomic nbcon flush
Posted by Petr Mladek 3 months, 1 week ago
On Fri 2025-09-26 14:49:09, Petr Mladek wrote:
> This patchset should solve problem which was being discussed
> at https://lore.kernel.org/all/aNFR45fL2L4PavNc@pathway.suse.cz
> 
> __nbcon_atomic_flush_pending_con() preserves the nbcon console
> ownership all the time when flushing pending messages. It might
> take a long time with slow serial consoles.
> 
> It might trigger a hardlockup report on another CPU which is
> busy waiting for the nbcon console ownership, for example,
> in nbcon_reacquire_nobuf() or __uart_port_nbcon_acquire().
> 
> The problem is solved by the 3rd patch. It releases the console
> context ownership after each record.
> 
> The 3rd patch alone would increase the risk of takeovers and repeated
> lines. It is prevented by the 1st patch which blocks the printk kthread
> when any CPU is in an emergency context.
> 
> The 2nd patch allows to block the printk kthread also in panic.
> It is not important. It is just an obvious update of the check
> for emergency contexts.
> 
> Note: The patchset applies against current Linus' tree (v6.17-rc7).
> 
>       The 2nd patch would need an update after the consolisation of
>       the panic state API gets merged via -mm tree,
>       see https://lore.kernel.org/r/20250825022947.1596226-2-wangjinchao600@gmail.com
> 
> Petr Mladek (3):
>   printk/nbcon: Block printk kthreads when any CPU is in an emergency
>     context
>   printk/nbcon/panic: Allow printk kthread to sleep when the system is
>     in panic
>   printk/nbcon: Release nbcon consoles ownership in atomic flush after
>     each emitted record
> 
>  kernel/printk/internal.h |  1 +
>  kernel/printk/nbcon.c    | 43 +++++++++++++++++++++++++++++++++++-----
>  kernel/printk/printk.c   |  2 +-
>  3 files changed, 40 insertions(+), 6 deletions(-)

JFYI, the patchset has been committed into printk/linux.git,
branch rework/atomic-flush-hardlockup[1].

It is queued for 6.19.

Note that I did the following modifications:

  + Added the changes proposed by John[2] into the 1st patch, namely:
     + initialize nbcon_cpu_emergency_cnt and make it static.
     + call nbcon_kthreads_wake() only when printk_get_console_flush_type()
       sets ft.nbcon_offload.

  + Rebased the 2nd patch on top of 6.18-rc1 (panic_in_progress() moved to
    linux/panic.h).
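
The wake-up condition from the first item above can be sketched as
follows. This is a userspace model only: struct model_flush_type,
model_kthreads_wake() and model_wake_if_offloaded() are hypothetical
stand-ins, and the sketch makes no claim about where exactly the kernel
performs this check.

```c
#include <stdbool.h>

/* Hypothetical subset of the flush-type flags; only nbcon_offload is modeled. */
struct model_flush_type {
	bool nbcon_offload;	/* nbcon printing is offloaded to the kthreads */
};

static int model_wake_calls;

/* Stand-in for waking the printk kthreads; it only counts the calls. */
static void model_kthreads_wake(void)
{
	model_wake_calls++;
}

/* Wake the kthreads only when printing is offloaded to them. */
static void model_wake_if_offloaded(const struct model_flush_type *ft)
{
	if (ft->nbcon_offload)
		model_kthreads_wake();
}
```

The point of the condition is to skip a needless wake-up when the
kthreads would not be the ones printing anyway.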

[1] https://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/log/?h=rework/atomic-flush-hardlockup
[2] https://lore.kernel.org/all/841pnti8k2.fsf@jogness.linutronix.de/

Best Regards,
Petr

PS: I thought about sending v2. But v1 already got enough Acks and
    I added the requested changes by cut&paste.