These helpers will be used when calling console->write_atomic on
KDB code in the next patch. It's basically the same implementaion
as nbcon_device_try_acquire, but using NBCON_PORIO_EMERGENCY when
acquiring the context.
For release we need to flush the console, since some messages could be
added before the context was acquired, as KDB emits the messages using
con->{write,write_atomic} instead of storing them on the ring buffer.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
---
include/linux/console.h | 6 +++++
kernel/printk/nbcon.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 72 insertions(+)
diff --git a/include/linux/console.h b/include/linux/console.h
index 67af483574727c00eea1d5a1eacc994755c92607..b34c5a0b86303e2fb4583fa467d8be43761cf756 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -605,6 +605,9 @@ extern bool nbcon_can_proceed(struct nbcon_write_context *wctxt);
extern bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt);
extern bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt);
extern void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt);
+extern bool nbcon_kdb_try_acquire(struct console *con,
+ struct nbcon_write_context *wctxt);
+extern void nbcon_kdb_release(struct nbcon_write_context *wctxt);
/*
* Check if the given console is currently capable and allowed to print
@@ -654,6 +657,9 @@ static inline bool nbcon_can_proceed(struct nbcon_write_context *wctxt) { return
static inline bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt) { return false; }
static inline bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt) { return false; }
static inline void nbcon_reacquire_nobuf(struct nbcon_write_context *wctxt) { }
+static inline bool nbcon_kdb_try_acquire(struct console *con,
+ struct nbcon_write_context *wctxt) { return false; }
+static inline void nbcon_kdb_release(struct console *con) { }
static inline bool console_is_usable(struct console *con, short flags,
bool use_atomic) { return false; }
#endif
diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 646801813415f0abe40cabf2f28ca9e30664f028..ff218e95a505fd10521c2c4dfb00ad5ec5773953 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1855,3 +1855,69 @@ void nbcon_device_release(struct console *con)
console_srcu_read_unlock(cookie);
}
EXPORT_SYMBOL_GPL(nbcon_device_release);
+
+/**
+ * nbcon_kdb_try_acquire - Try to acquire nbcon console, enter unsafe
+ * section, and initialized nbcon write context
+ * @con: The nbcon console to acquire
+ * @wctxt: The nbcon write context to be used on success
+ *
+ * Context: Under console_srcu_read_lock() for emiting a single kdb message
+ * using the given con->write_atomic() callback. Can be called
+ * only when the console is usable at the moment.
+ *
+ * Return: True if the console was acquired. False otherwise.
+ *
+ * kdb emits messages on consoles registered for printk() without
+ * storing them into the ring buffer. It has to acquire the console
+ * ownerhip so that it could call con->write_atomic() callback a safe way.
+ *
+ * This function acquires the nbcon console using priority NBCON_PRIO_EMERGENCY
+ * and marks it unsafe for handover/takeover.
+ */
+bool nbcon_kdb_try_acquire(struct console *con,
+ struct nbcon_write_context *wctxt)
+{
+ struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+
+ memset(ctxt, 0, sizeof(*ctxt));
+ ctxt->console = con;
+ ctxt->prio = NBCON_PRIO_EMERGENCY;
+
+ if (!nbcon_context_try_acquire(ctxt, false))
+ return false;
+
+ if (!nbcon_context_enter_unsafe(ctxt))
+ return false;
+
+ return true;
+}
+
+/**
+ * nbcon_kdb_release - Exit unsafe section and release the nbcon console
+ *
+ * @wctxt: The nbcon write context initialized by a successful
+ * nbcon_kdb_try_acquire()
+ *
+ * Context: Under console_srcu_read_lock() for emiting a single kdb message
+ * using the given con->write_atomic() callback. Can be called
+ * only when the console is usable at the moment.
+ */
+void nbcon_kdb_release(struct nbcon_write_context *wctxt)
+{
+ struct nbcon_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+
+ if (!nbcon_context_exit_unsafe(ctxt))
+ return;
+
+ nbcon_context_release(ctxt);
+
+ /*
+ * Flush any new printk() messages added when the console was blocked.
+ * Only the console used by the given write context was blocked.
+ * The console was locked only when the write_atomic() callback
+ * was usable.
+ */
+ __nbcon_atomic_flush_pending_con(ctxt->console,
+ prb_next_reserve_seq(prb), false);
+}
--
2.50.0
On Tue 2025-09-02 15:33:53, Marcos Paulo de Souza wrote: > These helpers will be used when calling console->write_atomic on > KDB code in the next patch. It's basically the same implementaion > as nbcon_device_try_acquire, but using NBCON_PORIO_EMERGENCY when > acquiring the context. > > For release we need to flush the console, since some messages could be > added before the context was acquired, as KDB emits the messages using > con->{write,write_atomic} instead of storing them on the ring buffer. I am a bit confused by the last paragraph. It is a very long sentence. Sigh, I wanted to propose a simple and clear alternative. But I ended in a rabbit hole and with a rather complex text: <proposal> The atomic flush in the release function is questionable. vkdb_printf() is primary called only when other CPUs are quiescent in kdb_main_loop() and do not call the classic printk(). But, for example, the write_atomic() callback might print debug messages. Or there is one kdb_printf() called in kgdb_panic() before other CPUs are quiescent. So the flush might be useful. Especially, when the kdb code fails to quiescent the CPUs and returns early. Let's keep it simple and just call __nbcon_atomic_flush_pending_con(). It uses write_atomic() callback which is used by the locked kdb code anyway. The legacy loop (console_trylock()/console_unlock()) is not usable in kdb context. It might make sense to trigger the flush via the printk kthread. But it would not work in panic() where is the only known kdb_printf() called when other CPUs are not quiescent. So, it does not look worth it. </proposal> What do you think? My opinion: Honestly, I think that the flush is not much important because it will most offten have nothing to do. I am just not sure whether it is better to have it there or avoid it. It might be better to remove it after all. And just document the decision. Best Regards, Petr
On 2025-09-05, Petr Mladek <pmladek@suse.com> wrote: > On Tue 2025-09-02 15:33:53, Marcos Paulo de Souza wrote: >> These helpers will be used when calling console->write_atomic on >> KDB code in the next patch. It's basically the same implementaion >> as nbcon_device_try_acquire, but using NBCON_PORIO_EMERGENCY when >> acquiring the context. >> >> For release we need to flush the console, since some messages could be >> added before the context was acquired, as KDB emits the messages using >> con->{write,write_atomic} instead of storing them on the ring buffer. > > I am a bit confused by the last paragraph. It is a very long sentence. > > Sigh, I wanted to propose a simple and clear alternative. But I ended > in a rabbit hole and with a rather complex text: > > <proposal> > The atomic flush in the release function is questionable. vkdb_printf() > is primary called only when other CPUs are quiescent in kdb_main_loop() > and do not call the classic printk(). But, for example, the > write_atomic() callback might print debug messages. Or there is > one kdb_printf() called in kgdb_panic() before other CPUs are > quiescent. So the flush might be useful. Especially, when > the kdb code fails to quiescent the CPUs and returns early. > > Let's keep it simple and just call __nbcon_atomic_flush_pending_con(). > It uses write_atomic() callback which is used by the locked kdb code > anyway. > > The legacy loop (console_trylock()/console_unlock()) is not > usable in kdb context. > > It might make sense to trigger the flush via the printk kthread. > But it would not work in panic() where is the only known kdb_printf() > called when other CPUs are not quiescent. So, it does not look > worth it. > </proposal> > > What do you think? > > My opinion: > > Honestly, I think that the flush is not much important because > it will most offten have nothing to do. > > I am just not sure whether it is better to have it there > or avoid it. It might be better to remove it after all. > And just document the decision. IMHO keeping the flush is fine. There are cases where there might be something to print. And since a printing kthread will get no chance to print as long as kdb is alive, we should have kdb flushing that console. Note that this is the only console that will actually see the new messages immediately as all the other CPUs and quiesced. For this reason we probably want to use __nbcon_atomic_flush_pending() to try to flush _all_ the consoles. As to the last paragraph of the commit message, I would keep it simple: After release try to flush all consoles since there may be a backlog of messages in the ringbuffer. The kthread console printers do not get a chance to run while kdb is active. John
On Mon 2025-09-08 14:15:08, John Ogness wrote: > On 2025-09-05, Petr Mladek <pmladek@suse.com> wrote: > > On Tue 2025-09-02 15:33:53, Marcos Paulo de Souza wrote: > >> These helpers will be used when calling console->write_atomic on > >> KDB code in the next patch. It's basically the same implementaion > >> as nbcon_device_try_acquire, but using NBCON_PORIO_EMERGENCY when > >> acquiring the context. > >> > >> For release we need to flush the console, since some messages could be > >> added before the context was acquired, as KDB emits the messages using > >> con->{write,write_atomic} instead of storing them on the ring buffer. > > > > I am a bit confused by the last paragraph. It is a very long sentence. > > > > Sigh, I wanted to propose a simple and clear alternative. But I ended > > in a rabbit hole and with a rather complex text: > > > > <proposal> > > The atomic flush in the release function is questionable. vkdb_printf() > > is primary called only when other CPUs are quiescent in kdb_main_loop() > > and do not call the classic printk(). But, for example, the > > write_atomic() callback might print debug messages. Or there is > > one kdb_printf() called in kgdb_panic() before other CPUs are > > quiescent. So the flush might be useful. Especially, when > > the kdb code fails to quiescent the CPUs and returns early. > > > > Let's keep it simple and just call __nbcon_atomic_flush_pending_con(). > > It uses write_atomic() callback which is used by the locked kdb code > > anyway. > > > > The legacy loop (console_trylock()/console_unlock()) is not > > usable in kdb context. > > > > It might make sense to trigger the flush via the printk kthread. > > But it would not work in panic() where is the only known kdb_printf() > > called when other CPUs are not quiescent. So, it does not look > > worth it. > > </proposal> > > > > What do you think? > > > > My opinion: > > > > Honestly, I think that the flush is not much important because > > it will most offten have nothing to do. > > > > I am just not sure whether it is better to have it there > > or avoid it. It might be better to remove it after all. > > And just document the decision. > > IMHO keeping the flush is fine. There are cases where there might be > something to print. And since a printing kthread will get no chance to > print as long as kdb is alive, we should have kdb flushing that > console. > > Note that this is the only console that will actually see the new > messages immediately as all the other CPUs and quiesced. I do not understand this argument. IMHO, this new try_acquire()/release() API should primary flush only the console which was (b)locked by this API. It will be called in kdb_msg_write() which tries to write to all registered consoles. So the other nbcon consoles will get flushed when the try_acquire() succeeds on them. And the legacy conosles were never flushed. > For this reason > we probably want to use __nbcon_atomic_flush_pending() to try to flush > _all_ the consoles. I would prefer to keep __nbcon_atomic_flush_pending_con(). I mean to flush only the console which was blocked. Note that we would need to increment oops_in_progress if we wanted to flush legacy consoles in this context... which would spread the mess into nbcon code... > As to the last paragraph of the commit message, I would keep it simple: > > After release try to flush all consoles since there may be a backlog of > messages in the ringbuffer. The kthread console printers do not get a > chance to run while kdb is active. I like this text. Best Regards, Petr
On 2025-09-09, Petr Mladek <pmladek@suse.com> wrote: >> > Honestly, I think that the flush is not much important because >> > it will most offten have nothing to do. >> > >> > I am just not sure whether it is better to have it there >> > or avoid it. It might be better to remove it after all. >> > And just document the decision. >> >> IMHO keeping the flush is fine. There are cases where there might be >> something to print. And since a printing kthread will get no chance to >> print as long as kdb is alive, we should have kdb flushing that >> console. >> >> Note that this is the only console that will actually see the new >> messages immediately as all the other CPUs and quiesced. > > I do not understand this argument. IMHO, this new > try_acquire()/release() API should primary flush only > the console which was (b)locked by this API. > > It will be called in kdb_msg_write() which tries to write > to all registered consoles. So the other nbcon consoles will > get flushed when the try_acquire() succeeds on them. And the > legacy conosles were never flushed. Right. I oversaw that it acquires each of the nbcon's. > I would prefer to keep __nbcon_atomic_flush_pending_con(). > I mean to flush only the console which was blocked. Agreed. >> After release try to flush all consoles since there may be a backlog of >> messages in the ringbuffer. The kthread console printers do not get a >> chance to run while kdb is active. > > I like this text. OK, but then change it to talk only about the one console. After release try to flush the console since there may be a backlog of messages in the ringbuffer. The kthread console printers do not get a chance to run while kdb is active. John
© 2016 - 2025 Red Hat, Inc.