This RFC proposes enabling netconsole on the NBCON infrastructure.
Context:
=======
Mike[1] reported a netconsole HARDIRQ-safe → HARDIRQ-unsafe lock
warning a while ago. The root cause involved IRQ-unsafe locks
being called within the console lock context. These IRQ-unsafe locks are
on some very specific network drivers TX path (ieee80211 as in Mike's
report).
A possible solution is to mark these devices as NOT supported by
netpoll (aka IFF_DISABLE_NETPOLL). Another solution is to send "most" of
the netconsole messages from non-atomic contexts (aka thread in nbcon
parlance), and only rely on atomic context when the host is crashing.
On top of that, nbcon is a much modern console implementation, which
brings others benefits to netconsole, so, this patches move netconsole
to NBCON.
Until recently, NBCON lacked support for non-atomic consoles
(CON_NBCON_ATOMIC_UNSAFE), thus, this port was not possible so far.
John recently implemented CON_NBCON_ATOMIC_UNSAFE in commit 187de7c212e5
("printk: nbcon: Allow unsafe write_atomic() for panic"), enabling
netconsole to use nbcon framework.
The patchset implements NBCON support in 2 phases:
1. Refactoring: Extract message fragmentation logic into a reusable helper
function.
2. Extended console support: Introduce CONFIG_NETCONSOLE_NBCON for consoles
implementing device lock/unlock callbacks
Backward Compatibility
======================
When CONFIG_NETCONSOLE_NBCON is disabled (the default), both extended
and basic consoles continue using the legacy console infrastructure,
ensuring full backward compatibility.
Current Limitations
===================
Netconsole continues to call netpoll and network TX helpers with interrupts
disabled. The network xmit callbacks are called with IRQ disabled
(target_list_lock is an IRQ safe spinlock)
spin_lock_irqsave(&target_list_lock, *flags)
list_for_each_entry(nt, &target_list, list)
netpoll_send_udp();
__netpoll_send_skb()
lockdep_assert_irqs_disabled()
While this patchset doesn't fully resolve the issue in [1], it removes
one layer of the problem and, moves the problem into the network domain,
which is a huge win.
Also, the commit 187de7c212e5 ("printk: nbcon: Allow unsafe
write_atomic() for panic") is still not on net-next, thus, NIPA will
fail for this RFC. Also, this patch is based on linux-next as of
20251121 instead of net-next.
Next steps
==========
1) Move the target_list_lock to RCU
2) Assess if __netpoll_send_skb() can be called with IRQ enabled
3) Mark devices that rely on IRQ unsafe contexts with IFF_DISABLE_NETPOLL
4) Use CON_NBCON_ATOMIC_UNSAFE only if the netpoll device has
IFF_DISABLE_NETPOLL, otherwise, unset CON_NBCON_ATOMIC_UNSAFE and be
a more normal NBCON user.
[1] https://lore.kernel.org/all/b2qps3uywhmjaym4mht2wpxul4yqtuuayeoq4iv4k3zf5wdgh3@tocu6c7mj4lt/
---
Breno Leitao (2):
netconsole: extract message fragmentation into write_msg_target()
netconsole: add CONFIG_NETCONSOLE_NBCON for nbcon support
drivers/net/Kconfig | 14 ++++++++
drivers/net/netconsole.c | 94 ++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 98 insertions(+), 10 deletions(-)
---
base-commit: d724c6f85e80a23ed46b7ebc6e38b527c09d64f5
change-id: 20251117-nbcon-f24477ca9f3e
Best regards,
--
Breno Leitao <leitao@debian.org>