[PATCH RFC net-next 0/2] netconsole: NBCON Infrastructure Support

Breno Leitao posted 2 patches 1 week, 3 days ago
drivers/net/Kconfig      | 14 ++++++++
drivers/net/netconsole.c | 94 ++++++++++++++++++++++++++++++++++++++++++------
2 files changed, 98 insertions(+), 10 deletions(-)
This RFC proposes enabling netconsole on the NBCON infrastructure.

Context:
=======

Mike[1] reported a netconsole HARDIRQ-safe → HARDIRQ-unsafe lock
warning a while ago. The root cause was IRQ-unsafe locks being
acquired within the console lock context. These IRQ-unsafe locks are
in the TX path of a few specific network drivers (ieee80211 in Mike's
report).

A possible solution is to mark these devices as NOT supported by
netpoll (aka IFF_DISABLE_NETPOLL). Another solution is to send "most" of
the netconsole messages from non-atomic contexts (aka thread in nbcon
parlance), and only rely on atomic context when the host is crashing.

On top of that, nbcon is a much more modern console implementation, which
brings other benefits to netconsole, so this patchset moves netconsole
to NBCON.

Until recently, NBCON lacked support for non-atomic consoles
(CON_NBCON_ATOMIC_UNSAFE), so this port was not possible until now.

John recently implemented CON_NBCON_ATOMIC_UNSAFE in commit 187de7c212e5
("printk: nbcon: Allow unsafe write_atomic() for panic"), enabling
netconsole to use the nbcon framework.

The patchset implements NBCON support in 2 phases:

1. Refactoring: Extract message fragmentation logic into a reusable helper
function.

2. Extended console support: Introduce CONFIG_NETCONSOLE_NBCON for consoles
implementing device lock/unlock callbacks.
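
As a rough userspace illustration of what phase 1 factors out, the sketch
below splits a message into bounded chunks and hands each chunk to a send
callback. It is not the code from the patch: every demo_* name, the chunk
size, and the callback signature are assumptions made for this example, and
the real write_msg_target() may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk limit for this sketch; not netconsole's real value. */
#define DEMO_MAX_CHUNK 8

/* Callback that transmits one fragment; stands in for the real send path. */
typedef void (*demo_send_fn)(void *ctx, const char *buf, size_t len);

/*
 * Split msg into pieces of at most DEMO_MAX_CHUNK bytes and hand each
 * piece to send(). Returns the number of fragments emitted.
 */
static size_t demo_write_msg_fragments(const char *msg, size_t len,
				       demo_send_fn send, void *ctx)
{
	size_t nfrags = 0;

	assert(msg != NULL && send != NULL);

	while (len > 0) {
		size_t chunk = len < DEMO_MAX_CHUNK ? len : DEMO_MAX_CHUNK;

		send(ctx, msg, chunk);
		msg += chunk;
		len -= chunk;
		nfrags++;
	}
	return nfrags;
}

/* Example sink: appends fragments to a buffer and counts the calls. */
struct demo_sink {
	char buf[64];
	size_t used;
	size_t calls;
};

static void demo_sink_send(void *ctx, const char *buf, size_t len)
{
	struct demo_sink *s = ctx;

	assert(s->used + len <= sizeof(s->buf));
	memcpy(s->buf + s->used, buf, len);
	s->used += len;
	s->calls++;
}
```

Keeping the splitting loop independent of how a fragment is sent is what
would let the legacy and nbcon write paths share it.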

Backward Compatibility
======================

When CONFIG_NETCONSOLE_NBCON is disabled (the default), both extended
and basic consoles continue using the legacy console infrastructure,
ensuring full backward compatibility.

Current Limitations
===================

Netconsole continues to call netpoll and network TX helpers with interrupts
disabled. The network xmit callbacks are called with IRQs disabled
(target_list_lock is an IRQ-safe spinlock):

spin_lock_irqsave(&target_list_lock, *flags)
	list_for_each_entry(nt, &target_list, list)
		netpoll_send_udp();
			__netpoll_send_skb()
				lockdep_assert_irqs_disabled()

While this patchset doesn't fully resolve the issue in [1], it removes
one layer of the problem and moves the rest into the network domain,
which is a huge win.

Also, commit 187de7c212e5 ("printk: nbcon: Allow unsafe
write_atomic() for panic") is not yet in net-next, so NIPA will
fail for this RFC. This series is based on linux-next as of
20251121 instead of net-next.

Next steps
==========

1) Move the target_list_lock to RCU
2) Assess if __netpoll_send_skb() can be called with IRQ enabled
3) Mark devices that rely on IRQ-unsafe contexts with IFF_DISABLE_NETPOLL
4) Use CON_NBCON_ATOMIC_UNSAFE only if the netpoll device has
   IFF_DISABLE_NETPOLL, otherwise, unset CON_NBCON_ATOMIC_UNSAFE and be
   a more normal NBCON user.
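
Step 4 could be sketched roughly as follows, in plain userspace C. The
DEMO_* bit values are placeholders chosen for this example (the kernel
defines the real CON_NBCON, CON_NBCON_ATOMIC_UNSAFE and IFF_DISABLE_NETPOLL
values), and demo_console_flags() is a hypothetical helper, not something
in this series.

```c
/* Placeholder bit values for this sketch only; the kernel defines the real ones. */
#define DEMO_CON_NBCON               (1u << 0)
#define DEMO_CON_NBCON_ATOMIC_UNSAFE (1u << 1)
#define DEMO_IFF_DISABLE_NETPOLL     (1u << 0)

/*
 * Pick console flags from the device's private flags: always register as
 * an nbcon console, and add the atomic-unsafe marker only when the device
 * carries the netpoll-disabled flag, as step 4 describes.
 */
static unsigned int demo_console_flags(unsigned int dev_priv_flags)
{
	unsigned int flags = DEMO_CON_NBCON;

	if (dev_priv_flags & DEMO_IFF_DISABLE_NETPOLL)
		flags |= DEMO_CON_NBCON_ATOMIC_UNSAFE;

	return flags;
}
```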

[1] https://lore.kernel.org/all/b2qps3uywhmjaym4mht2wpxul4yqtuuayeoq4iv4k3zf5wdgh3@tocu6c7mj4lt/

---
Breno Leitao (2):
      netconsole: extract message fragmentation into write_msg_target()
      netconsole: add CONFIG_NETCONSOLE_NBCON for nbcon support

 drivers/net/Kconfig      | 14 ++++++++
 drivers/net/netconsole.c | 94 ++++++++++++++++++++++++++++++++++++++++++------
 2 files changed, 98 insertions(+), 10 deletions(-)
---
base-commit: d724c6f85e80a23ed46b7ebc6e38b527c09d64f5
change-id: 20251117-nbcon-f24477ca9f3e

Best regards,
--  
Breno Leitao <leitao@debian.org>