From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
 linux-kernel@vger.kernel.org, Jason Wessel, Daniel Thompson,
 Douglas Anderson, Aaron Tomlin, Luis Chamberlain,
 kgdb-bugreport@lists.sourceforge.net, Greg Kroah-Hartman,
 linux-fsdevel@vger.kernel.org, Andrew Morton, "Guilherme G. Piccoli",
 David Gow, Tiezhu Yang, Daniel Vetter, tangmeng, "Paul E. McKenney",
 Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett, Mathieu Desnoyers,
 Lai Jiangshan, Joel Fernandes, rcu@vger.kernel.org
Subject: [PATCH printk v1 00/18] serial: 8250: implement non-BKL console
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
Date: Thu, 02 Mar 2023 21:04:50 +0106
Message-ID: <87wn3zsz5x.fsf@jogness.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Implement the necessary callbacks to allow the 8250 console driver
to perform as a non-BKL console. Remove the implementation of the
legacy console callback (write), add implementations of the non-BKL
console callbacks (write_atomic, write_thread, port_lock), and add
CON_NO_BKL to the initial flags.

This is an all-in-one commit meant only for testing the new printk
non-BKL infrastructure. It is not meant to be included in mainline in
this form. In particular, it includes mainline driver fixes that need
to be submitted individually.

Although non-BKL consoles can coexist with legacy consoles, you will
only receive all the benefits of the non-BKL consoles if this console
driver is the only console. That means no netconsole, no tty1, no
earlyprintk, no earlycon. Just the uart8250.
For example:

    console=ttyS0,115200

Signed-off-by: John Ogness
Tested-by: Daniel Thompson

diff --git a/drivers/tty/serial/8250/8250.h b/drivers/tty/serial/8250/8250.h
index 287153d32536..d8da34bb9ae3 100644
--- a/drivers/tty/serial/8250/8250.h
+++ b/drivers/tty/serial/8250/8250.h
@@ -177,12 +177,154 @@ static inline void serial_dl_write(struct uart_8250_port *up, int value)
 	up->dl_write(up, value);
 }
 
+static inline bool serial8250_is_console(struct uart_port *port)
+{
+	return uart_console(port) && !hlist_unhashed_lockless(&port->cons->node);
+}
+
+static inline void serial8250_init_wctxt(struct cons_write_context *wctxt,
+					 struct console *cons)
+{
+	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+
+	memset(wctxt, 0, sizeof(*wctxt));
+	ctxt->console = cons;
+	ctxt->prio = CONS_PRIO_NORMAL;
+	/* Both require the port lock, so they cannot clobber each other. */
+	ctxt->thread = 1;
+}
+
+static inline void serial8250_console_acquire(struct cons_write_context *wctxt,
+					      struct console *cons)
+{
+	serial8250_init_wctxt(wctxt, cons);
+	while (!console_try_acquire(wctxt)) {
+		cpu_relax();
+		serial8250_init_wctxt(wctxt, cons);
+	}
+}
+
+static inline void serial8250_enter_unsafe(struct uart_8250_port *up)
+{
+	struct uart_port *port = &up->port;
+
+	lockdep_assert_held_once(&port->lock);
+
+	for (;;) {
+		up->cookie = console_srcu_read_lock();
+
+		serial8250_console_acquire(&up->wctxt, port->cons);
+
+		if (console_enter_unsafe(&up->wctxt))
+			break;
+
+		console_srcu_read_unlock(up->cookie);
+		cpu_relax();
+	}
+}
+
+static inline void serial8250_exit_unsafe(struct uart_8250_port *up)
+{
+	struct uart_port *port = &up->port;
+
+	lockdep_assert_held_once(&port->lock);
+
+	/*
+	 * FIXME: The 8250 driver does not support hostile takeovers
+	 * in the unsafe section.
+	 */
+	if (!WARN_ON_ONCE(!console_exit_unsafe(&up->wctxt)))
+		WARN_ON_ONCE(!console_release(&up->wctxt));
+
+	console_srcu_read_unlock(up->cookie);
+}
+
+static inline int serial8250_in_IER(struct uart_8250_port *up)
+{
+	struct uart_port *port = &up->port;
+	bool is_console;
+	int ier;
+
+	is_console = serial8250_is_console(port);
+
+	if (is_console)
+		serial8250_enter_unsafe(up);
+
+	ier = serial_in(up, UART_IER);
+
+	if (is_console)
+		serial8250_exit_unsafe(up);
+
+	return ier;
+}
+
+static inline bool __serial8250_set_IER(struct uart_8250_port *up,
+					struct cons_write_context *wctxt,
+					int ier)
+{
+	if (wctxt && !console_can_proceed(wctxt))
+		return false;
+	serial_out(up, UART_IER, ier);
+	return true;
+}
+
+static inline void serial8250_set_IER(struct uart_8250_port *up, int ier)
+{
+	struct uart_port *port = &up->port;
+	bool is_console;
+
+	is_console = serial8250_is_console(port);
+
+	if (is_console) {
+		serial8250_enter_unsafe(up);
+		__serial8250_set_IER(up, &up->wctxt, ier);
+		serial8250_exit_unsafe(up);
+	} else {
+		__serial8250_set_IER(up, NULL, ier);
+	}
+}
+
+static inline bool __serial8250_clear_IER(struct uart_8250_port *up,
+					  struct cons_write_context *wctxt,
+					  int *prior)
+{
+	unsigned int clearval = 0;
+
+	if (up->capabilities & UART_CAP_UUE)
+		clearval = UART_IER_UUE;
+
+	*prior = serial_in(up, UART_IER);
+	if (wctxt && !console_can_proceed(wctxt))
+		return false;
+	serial_out(up, UART_IER, clearval);
+	return true;
+}
+
+static inline int serial8250_clear_IER(struct uart_8250_port *up)
+{
+	struct uart_port *port = &up->port;
+	bool is_console;
+	int prior;
+
+	is_console = serial8250_is_console(port);
+
+	if (is_console) {
+		serial8250_enter_unsafe(up);
+		__serial8250_clear_IER(up, &up->wctxt, &prior);
+		serial8250_exit_unsafe(up);
+	} else {
+		__serial8250_clear_IER(up, NULL, &prior);
+	}
+
+	return prior;
+}
+
 static inline bool serial8250_set_THRI(struct uart_8250_port *up)
 {
 	if (up->ier & UART_IER_THRI)
 		return false;
 	up->ier |= UART_IER_THRI;
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 	return true;
 }
 
@@ -191,7 +333,7 @@ static inline bool serial8250_clear_THRI(struct uart_8250_port *up)
 	if (!(up->ier & UART_IER_THRI))
 		return false;
 	up->ier &= ~UART_IER_THRI;
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 	return true;
 }
 
diff --git a/drivers/tty/serial/8250/8250_aspeed_vuart.c b/drivers/tty/serial/8250/8250_aspeed_vuart.c
index 9d2a7856784f..7cc6b527c088 100644
--- a/drivers/tty/serial/8250/8250_aspeed_vuart.c
+++ b/drivers/tty/serial/8250/8250_aspeed_vuart.c
@@ -278,7 +278,7 @@ static void __aspeed_vuart_set_throttle(struct uart_8250_port *up,
 	up->ier &= ~irqs;
 	if (!throttle)
 		up->ier |= irqs;
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 }
 static void aspeed_vuart_set_throttle(struct uart_port *port, bool throttle)
 {
diff --git a/drivers/tty/serial/8250/8250_bcm7271.c b/drivers/tty/serial/8250/8250_bcm7271.c
index ed5a94747692..adb1a3247807 100644
--- a/drivers/tty/serial/8250/8250_bcm7271.c
+++ b/drivers/tty/serial/8250/8250_bcm7271.c
@@ -606,8 +606,10 @@ static int brcmuart_startup(struct uart_port *port)
 	 * Disable the Receive Data Interrupt because the DMA engine
 	 * will handle this.
 	 */
+	spin_lock_irq(&port->lock);
 	up->ier &= ~UART_IER_RDI;
-	serial_port_out(port, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
+	spin_unlock_irq(&port->lock);
 
 	priv->tx_running = false;
 	priv->dma.rx_dma = NULL;
@@ -787,6 +789,12 @@ static int brcmuart_handle_irq(struct uart_port *p)
 		spin_lock_irqsave(&p->lock, flags);
 		status = serial_port_in(p, UART_LSR);
 		if ((status & UART_LSR_DR) == 0) {
+			bool is_console;
+
+			is_console = serial8250_is_console(port);
+
+			if (is_console)
+				serial8250_enter_unsafe(p);
 
 			ier = serial_port_in(p, UART_IER);
 			/*
@@ -807,6 +815,9 @@ static int brcmuart_handle_irq(struct uart_port *p)
 				serial_port_in(p, UART_RX);
 			}
 
+			if (is_console)
+				serial8250_exit_unsafe(p);
+
 			handled = 1;
 		}
 		spin_unlock_irqrestore(&p->lock, flags);
@@ -844,12 +855,22 @@ static enum hrtimer_restart brcmuart_hrtimer_func(struct hrtimer *t)
 	/* re-enable receive unless upper layer has disabled it */
 	if ((up->ier & (UART_IER_RLSI | UART_IER_RDI)) ==
 	    (UART_IER_RLSI | UART_IER_RDI)) {
+		bool is_console;
+
+		is_console = serial8250_is_console(port);
+
+		if (is_console)
+			serial8250_enter_unsafe(p);
+
 		status = serial_port_in(p, UART_IER);
 		status |= (UART_IER_RLSI | UART_IER_RDI);
 		serial_port_out(p, UART_IER, status);
 		status = serial_port_in(p, UART_MCR);
 		status |= UART_MCR_RTS;
 		serial_port_out(p, UART_MCR, status);
+
+		if (is_console)
+			serial8250_exit_unsafe(p);
 	}
 	spin_unlock_irqrestore(&p->lock, flags);
 	return HRTIMER_NORESTART;
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index ab63c308be0a..688ecfc6e1d5 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -256,6 +256,7 @@ static void serial8250_timeout(struct timer_list *t)
 static void serial8250_backup_timeout(struct timer_list *t)
 {
 	struct uart_8250_port *up = from_timer(up, t, timer);
+	struct uart_port *port = &up->port;
 	unsigned int iir, ier = 0, lsr;
 	unsigned long flags;
 
@@ -266,8 +267,18 @@ static void serial8250_backup_timeout(struct timer_list *t)
 	 * based handler.
 	 */
 	if (up->port.irq) {
+		bool is_console;
+
+		is_console = serial8250_is_console(port);
+
+		if (is_console)
+			serial8250_enter_unsafe(up);
+
 		ier = serial_in(up, UART_IER);
 		serial_out(up, UART_IER, 0);
+
+		if (is_console)
+			serial8250_exit_unsafe(up);
 	}
 
 	iir = serial_in(up, UART_IIR);
@@ -290,7 +301,7 @@ static void serial8250_backup_timeout(struct timer_list *t)
 		serial8250_tx_chars(up);
 
 	if (up->port.irq)
-		serial_out(up, UART_IER, ier);
+		serial8250_set_IER(up, ier);
 
 	spin_unlock_irqrestore(&up->port.lock, flags);
 
@@ -576,12 +587,30 @@ serial8250_register_ports(struct uart_driver *drv, struct device *dev)
 
 #ifdef CONFIG_SERIAL_8250_CONSOLE
 
-static void univ8250_console_write(struct console *co, const char *s,
-				   unsigned int count)
+static void univ8250_console_port_lock(struct console *con, bool do_lock, unsigned long *flags)
+{
+	struct uart_8250_port *up = &serial8250_ports[con->index];
+
+	if (do_lock)
+		spin_lock_irqsave(&up->port.lock, *flags);
+	else
+		spin_unlock_irqrestore(&up->port.lock, *flags);
+}
+
+static bool univ8250_console_write_atomic(struct console *co,
+					  struct cons_write_context *wctxt)
+{
+	struct uart_8250_port *up = &serial8250_ports[co->index];
+
+	return serial8250_console_write_atomic(up, wctxt);
+}
+
+static bool univ8250_console_write_thread(struct console *co,
+					  struct cons_write_context *wctxt)
 {
 	struct uart_8250_port *up = &serial8250_ports[co->index];
 
-	serial8250_console_write(up, s, count);
+	return serial8250_console_write_thread(up, wctxt);
 }
 
 static int univ8250_console_setup(struct console *co, char *options)
@@ -669,12 +698,14 @@ static int univ8250_console_match(struct console *co, char *name, int idx,
 
 static struct console univ8250_console = {
 	.name		= "ttyS",
-	.write		= univ8250_console_write,
+	.write_atomic	= univ8250_console_write_atomic,
+	.write_thread	= univ8250_console_write_thread,
+	.port_lock	= univ8250_console_port_lock,
 	.device		= uart_console_device,
 	.setup		= univ8250_console_setup,
 	.exit		= univ8250_console_exit,
 	.match		= univ8250_console_match,
-	.flags		= CON_PRINTBUFFER | CON_ANYTIME,
+	.flags		= CON_PRINTBUFFER | CON_ANYTIME | CON_NO_BKL,
 	.index		= -1,
 	.data		= &serial8250_reg,
 };
@@ -962,7 +993,7 @@ static void serial_8250_overrun_backoff_work(struct work_struct *work)
 	spin_lock_irqsave(&port->lock, flags);
 	up->ier |= UART_IER_RLSI | UART_IER_RDI;
 	up->port.read_status_mask |= UART_LSR_DR;
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 	spin_unlock_irqrestore(&port->lock, flags);
 }
 
diff --git a/drivers/tty/serial/8250/8250_exar.c b/drivers/tty/serial/8250/8250_exar.c
index 64770c62bbec..ccb70b20b1f4 100644
--- a/drivers/tty/serial/8250/8250_exar.c
+++ b/drivers/tty/serial/8250/8250_exar.c
@@ -185,6 +185,10 @@ static void xr17v35x_set_divisor(struct uart_port *p, unsigned int baud,
 
 static int xr17v35x_startup(struct uart_port *port)
 {
+	struct uart_8250_port *up = up_to_u8250p(port);
+
+	spin_lock_irq(&port->lock);
+
 	/*
 	 * First enable access to IER [7:5], ISR [5:4], FCR [5:4],
 	 * MCR [7:5] and MSR [7:0]
@@ -195,7 +199,9 @@ static int xr17v35x_startup(struct uart_port *port)
 	 * Make sure all interrups are masked until initialization is
 	 * complete and the FIFOs are cleared
 	 */
-	serial_port_out(port, UART_IER, 0);
+	serial8250_set_IER(up, 0);
+
+	spin_unlock_irq(&port->lock);
 
 	return serial8250_do_startup(port);
 }
diff --git a/drivers/tty/serial/8250/8250_fsl.c b/drivers/tty/serial/8250/8250_fsl.c
index 8aad15622a2e..74bb85b705e7 100644
--- a/drivers/tty/serial/8250/8250_fsl.c
+++ b/drivers/tty/serial/8250/8250_fsl.c
@@ -58,7 +58,8 @@ int fsl8250_handle_irq(struct uart_port *port)
 	if ((orig_lsr & UART_LSR_OE) && (up->overrun_backoff_time_ms > 0)) {
 		unsigned long delay;
 
-		up->ier = port->serial_in(port, UART_IER);
+		up->ier = serial8250_in_IER(up);
+
 		if (up->ier & (UART_IER_RLSI | UART_IER_RDI)) {
 			port->ops->stop_rx(port);
 		} else {
diff --git a/drivers/tty/serial/8250/8250_ingenic.c b/drivers/tty/serial/8250/8250_ingenic.c
index 617b8ce60d6b..548904c3d11b 100644
--- a/drivers/tty/serial/8250/8250_ingenic.c
+++ b/drivers/tty/serial/8250/8250_ingenic.c
@@ -171,6 +171,7 @@ OF_EARLYCON_DECLARE(x1000_uart, "ingenic,x1000-uart",
 
 static void ingenic_uart_serial_out(struct uart_port *p, int offset, int value)
 {
+	struct uart_8250_port *up = up_to_u8250p(p);
 	int ier;
 
 	switch (offset) {
@@ -192,7 +193,7 @@ static void ingenic_uart_serial_out(struct uart_port *p, int offset, int value)
 		 * If we have enabled modem status IRQs we should enable
 		 * modem mode.
 		 */
-		ier = p->serial_in(p, UART_IER);
+		ier = serial8250_in_IER(up);
 
 		if (ier & UART_IER_MSI)
 			value |= UART_MCR_MDCE | UART_MCR_FCM;
diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c
index fb1d5ec0940e..bf7ab55c8923 100644
--- a/drivers/tty/serial/8250/8250_mtk.c
+++ b/drivers/tty/serial/8250/8250_mtk.c
@@ -222,12 +222,38 @@ static void mtk8250_shutdown(struct uart_port *port)
 
 static void mtk8250_disable_intrs(struct uart_8250_port *up, int mask)
 {
-	serial_out(up, UART_IER, serial_in(up, UART_IER) & (~mask));
+	struct uart_port *port = &up->port;
+	bool is_console;
+	int ier;
+
+	is_console = serial8250_is_console(port);
+
+	if (is_console)
+		serial8250_enter_unsafe(up);
+
+	ier = serial_in(up, UART_IER);
+	serial_out(up, UART_IER, ier & (~mask));
+
+	if (is_console)
+		serial8250_exit_unsafe(up);
 }
 
 static void mtk8250_enable_intrs(struct uart_8250_port *up, int mask)
 {
-	serial_out(up, UART_IER, serial_in(up, UART_IER) | mask);
+	struct uart_port *port = &up->port;
+	bool is_console;
+	int ier;
+
+	is_console = serial8250_is_console(port);
+
+	if (is_console)
+		serial8250_enter_unsafe(up);
+
+	ier = serial_in(up, UART_IER);
+	serial_out(up, UART_IER, ier | mask);
+
+	if (is_console)
+		serial8250_exit_unsafe(up);
 }
 
 static void mtk8250_set_flow_ctrl(struct uart_8250_port *up, int mode)
diff --git a/drivers/tty/serial/8250/8250_omap.c b/drivers/tty/serial/8250/8250_omap.c
index 734f092ef839..bfa50a26349d 100644
--- a/drivers/tty/serial/8250/8250_omap.c
+++ b/drivers/tty/serial/8250/8250_omap.c
@@ -334,8 +334,7 @@ static void omap8250_restore_regs(struct uart_8250_port *up)
 
 	/* drop TCR + TLR access, we setup XON/XOFF later */
 	serial8250_out_MCR(up, mcr);
-
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 
 	serial_out(up, UART_LCR, UART_LCR_CONF_MODE_B);
 	serial_dl_write(up, priv->quot);
@@ -523,16 +522,21 @@ static void omap_8250_pm(struct uart_port *port, unsigned int state,
 	u8 efr;
 
 	pm_runtime_get_sync(port->dev);
+
+	spin_lock_irq(&port->lock);
+
 	serial_out(up, UART_LCR, UART_LCR_CONF_MODE_B);
 	efr = serial_in(up, UART_EFR);
 	serial_out(up, UART_EFR, efr | UART_EFR_ECB);
 	serial_out(up, UART_LCR, 0);
 
-	serial_out(up, UART_IER, (state != 0) ? UART_IERX_SLEEP : 0);
+	serial8250_set_IER(up, (state != 0) ? UART_IERX_SLEEP : 0);
 	serial_out(up, UART_LCR, UART_LCR_CONF_MODE_B);
 	serial_out(up, UART_EFR, efr);
 	serial_out(up, UART_LCR, 0);
 
+	spin_unlock_irq(&port->lock);
+
 	pm_runtime_mark_last_busy(port->dev);
 	pm_runtime_put_autosuspend(port->dev);
 }
@@ -649,7 +653,8 @@ static irqreturn_t omap8250_irq(int irq, void *dev_id)
 	if ((lsr & UART_LSR_OE) && up->overrun_backoff_time_ms > 0) {
 		unsigned long delay;
 
-		up->ier = port->serial_in(port, UART_IER);
+		spin_lock(&port->lock);
+		up->ier = serial8250_in_IER(up);
 		if (up->ier & (UART_IER_RLSI | UART_IER_RDI)) {
 			port->ops->stop_rx(port);
 		} else {
@@ -658,6 +663,7 @@ static irqreturn_t omap8250_irq(int irq, void *dev_id)
 			 */
 			cancel_delayed_work(&up->overrun_backoff);
 		}
+		spin_unlock(&port->lock);
 
 		delay = msecs_to_jiffies(up->overrun_backoff_time_ms);
 		schedule_delayed_work(&up->overrun_backoff, delay);
@@ -707,8 +713,10 @@ static int omap_8250_startup(struct uart_port *port)
 	if (ret < 0)
 		goto err;
 
+	spin_lock_irq(&port->lock);
 	up->ier = UART_IER_RLSI | UART_IER_RDI;
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
+	spin_unlock_irq(&port->lock);
 
 #ifdef CONFIG_PM
 	up->capabilities |= UART_CAP_RPM;
@@ -748,8 +756,10 @@ static void omap_8250_shutdown(struct uart_port *port)
 	if (priv->habit & UART_HAS_EFR2)
 		serial_out(up, UART_OMAP_EFR2, 0x0);
 
+	spin_lock_irq(&port->lock);
 	up->ier = 0;
-	serial_out(up, UART_IER, 0);
+	serial8250_set_IER(up, 0);
+	spin_unlock_irq(&port->lock);
 
 	if (up->dma)
 		serial8250_release_dma(up);
@@ -797,7 +807,7 @@ static void omap_8250_unthrottle(struct uart_port *port)
 		up->dma->rx_dma(up);
 	up->ier |= UART_IER_RLSI | UART_IER_RDI;
 	port->read_status_mask |= UART_LSR_DR;
-	serial_out(up, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 	spin_unlock_irqrestore(&port->lock, flags);
 
 	pm_runtime_mark_last_busy(port->dev);
@@ -956,7 +966,7 @@ static void __dma_rx_complete(void *param)
 	__dma_rx_do_complete(p);
 	if (!priv->throttled) {
 		p->ier |= UART_IER_RLSI | UART_IER_RDI;
-		serial_out(p, UART_IER, p->ier);
+		serial8250_set_IER(p, p->ier);
 		if (!(priv->habit & UART_HAS_EFR2))
 			omap_8250_rx_dma(p);
 	}
@@ -1013,7 +1023,7 @@ static int omap_8250_rx_dma(struct uart_8250_port *p)
 			 * callback to run.
 			 */
 			p->ier &= ~(UART_IER_RLSI | UART_IER_RDI);
-			serial_out(p, UART_IER, p->ier);
+			serial8250_set_IER(p, p->ier);
 		}
 		goto out;
 	}
@@ -1226,12 +1236,12 @@ static void am654_8250_handle_rx_dma(struct uart_8250_port *up, u8 iir,
 		 * periodic timeouts, re-enable interrupts.
 		 */
 		up->ier &= ~(UART_IER_RLSI | UART_IER_RDI);
-		serial_out(up, UART_IER, up->ier);
+		serial8250_set_IER(up, up->ier);
 		omap_8250_rx_dma_flush(up);
 		serial_in(up, UART_IIR);
 		serial_out(up, UART_OMAP_EFR2, 0x0);
 		up->ier |= UART_IER_RLSI | UART_IER_RDI;
-		serial_out(up, UART_IER, up->ier);
+		serial8250_set_IER(up, up->ier);
 	}
 }
 
@@ -1717,12 +1727,16 @@ static int omap8250_runtime_resume(struct device *dev)
 
 	up = serial8250_get_port(priv->line);
 
+	spin_lock_irq(&up->port.lock);
+
 	if (omap8250_lost_context(up))
 		omap8250_restore_regs(up);
 
 	if (up->dma && up->dma->rxchan && !(priv->habit & UART_HAS_EFR2))
 		omap_8250_rx_dma(up);
 
+	spin_unlock_irq(&up->port.lock);
+
 	priv->latency = priv->calc_latency;
 	schedule_work(&priv->qos_work);
 	return 0;
diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
index fa43df05342b..f1976d9a8a38 100644
--- a/drivers/tty/serial/8250/8250_port.c
+++ b/drivers/tty/serial/8250/8250_port.c
@@ -744,6 +744,7 @@ static void serial8250_set_sleep(struct uart_8250_port *p, int sleep)
 	serial8250_rpm_get(p);
 
 	if (p->capabilities & UART_CAP_SLEEP) {
+		spin_lock_irq(&p->port.lock);
 		if (p->capabilities & UART_CAP_EFR) {
 			lcr = serial_in(p, UART_LCR);
 			efr = serial_in(p, UART_EFR);
@@ -751,25 +752,18 @@ static void serial8250_set_sleep(struct uart_8250_port *p, int sleep)
 			serial_out(p, UART_EFR, UART_EFR_ECB);
 			serial_out(p, UART_LCR, 0);
 		}
-		serial_out(p, UART_IER, sleep ? UART_IERX_SLEEP : 0);
+		serial8250_set_IER(p, sleep ? UART_IERX_SLEEP : 0);
 		if (p->capabilities & UART_CAP_EFR) {
 			serial_out(p, UART_LCR, UART_LCR_CONF_MODE_B);
 			serial_out(p, UART_EFR, efr);
 			serial_out(p, UART_LCR, lcr);
 		}
+		spin_unlock_irq(&p->port.lock);
 	}
 
 	serial8250_rpm_put(p);
 }
 
-static void serial8250_clear_IER(struct uart_8250_port *up)
-{
-	if (up->capabilities & UART_CAP_UUE)
-		serial_out(up, UART_IER, UART_IER_UUE);
-	else
-		serial_out(up, UART_IER, 0);
-}
-
 #ifdef CONFIG_SERIAL_8250_RSA
 /*
  * Attempts to turn on the RSA FIFO.  Returns zero on failure.
@@ -1033,8 +1027,10 @@ static int broken_efr(struct uart_8250_port *up)
  */
 static void autoconfig_16550a(struct uart_8250_port *up)
 {
+	struct uart_port *port = &up->port;
 	unsigned char status1, status2;
 	unsigned int iersave;
+	bool is_console;
 
 	up->port.type = PORT_16550A;
 	up->capabilities |= UART_CAP_FIFO;
@@ -1150,6 +1146,11 @@ static void autoconfig_16550a(struct uart_8250_port *up)
 		return;
 	}
 
+	is_console = serial8250_is_console(port);
+
+	if (is_console)
+		serial8250_enter_unsafe(up);
+
 	/*
 	 * Try writing and reading the UART_IER_UUE bit (b6).
 	 * If it works, this is probably one of the Xscale platform's
@@ -1185,6 +1186,9 @@ static void autoconfig_16550a(struct uart_8250_port *up)
 	}
 	serial_out(up, UART_IER, iersave);
 
+	if (is_console)
+		serial8250_exit_unsafe(up);
+
 	/*
 	 * We distinguish between 16550A and U6 16550A by counting
 	 * how many bytes are in the FIFO.
@@ -1226,6 +1230,13 @@ static void autoconfig(struct uart_8250_port *up)
 	up->bugs = 0;
 
 	if (!(port->flags & UPF_BUGGY_UART)) {
+		bool is_console;
+
+		is_console = serial8250_is_console(port);
+
+		if (is_console)
+			serial8250_enter_unsafe(up);
+
 		/*
 		 * Do a simple existence test first; if we fail this,
 		 * there's no point trying anything else.
@@ -1255,6 +1266,10 @@ static void autoconfig(struct uart_8250_port *up)
 #endif
 		scratch3 = serial_in(up, UART_IER) & UART_IER_ALL_INTR;
 		serial_out(up, UART_IER, scratch);
+
+		if (is_console)
+			serial8250_exit_unsafe(up);
+
 		if (scratch2 != 0 || scratch3 != UART_IER_ALL_INTR) {
 			/*
 			 * We failed; there's nothing here
@@ -1376,6 +1391,7 @@ static void autoconfig_irq(struct uart_8250_port *up)
 	unsigned char save_ICP = 0;
 	unsigned int ICP = 0;
 	unsigned long irqs;
+	bool is_console;
 	int irq;
 
 	if (port->flags & UPF_FOURPORT) {
@@ -1385,8 +1401,12 @@ static void autoconfig_irq(struct uart_8250_port *up)
 		inb_p(ICP);
 	}
 
-	if (uart_console(port))
+	is_console = serial8250_is_console(port);
+
+	if (is_console) {
 		console_lock();
+		serial8250_enter_unsafe(up);
+	}
 
 	/* forget possible initially masked and pending IRQ */
 	probe_irq_off(probe_irq_on());
@@ -1418,8 +1438,10 @@ static void autoconfig_irq(struct uart_8250_port *up)
 	if (port->flags & UPF_FOURPORT)
 		outb_p(save_ICP, ICP);
 
-	if (uart_console(port))
+	if (is_console) {
+		serial8250_exit_unsafe(up);
 		console_unlock();
+	}
 
 	port->irq = (irq > 0) ? irq : 0;
 }
@@ -1432,7 +1454,7 @@ static void serial8250_stop_rx(struct uart_port *port)
 
 	up->ier &= ~(UART_IER_RLSI | UART_IER_RDI);
 	up->port.read_status_mask &= ~UART_LSR_DR;
-	serial_port_out(port, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 
 	serial8250_rpm_put(up);
 }
@@ -1462,7 +1484,7 @@ void serial8250_em485_stop_tx(struct uart_8250_port *p)
 		serial8250_clear_and_reinit_fifos(p);
 
 		p->ier |= UART_IER_RLSI | UART_IER_RDI;
-		serial_port_out(&p->port, UART_IER, p->ier);
+		serial8250_set_IER(p, p->ier);
 	}
 }
 EXPORT_SYMBOL_GPL(serial8250_em485_stop_tx);
@@ -1709,7 +1731,7 @@ static void serial8250_disable_ms(struct uart_port *port)
 	mctrl_gpio_disable_ms(up->gpios);
 
 	up->ier &= ~UART_IER_MSI;
-	serial_port_out(port, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 }
 
 static void serial8250_enable_ms(struct uart_port *port)
@@ -1725,7 +1747,7 @@ static void serial8250_enable_ms(struct uart_port *port)
 	up->ier |= UART_IER_MSI;
 
 	serial8250_rpm_get(up);
-	serial_port_out(port, UART_IER, up->ier);
+	serial8250_set_IER(up, up->ier);
 	serial8250_rpm_put(up);
 }
 
@@ -2160,9 +2182,10 @@ static void serial8250_put_poll_char(struct uart_port *port,
 	serial8250_rpm_get(up);
 	/*
 	 *	First save the IER then disable the interrupts
+	 *
+	 *	Best-effort IER access because other CPUs are quiesced.
*/ - ier =3D serial_port_in(port, UART_IER); - serial8250_clear_IER(up); + __serial8250_clear_IER(up, NULL, &ier); =20 wait_for_xmitr(up, UART_LSR_BOTH_EMPTY); /* @@ -2175,7 +2198,7 @@ static void serial8250_put_poll_char(struct uart_port= *port, * and restore the IER */ wait_for_xmitr(up, UART_LSR_BOTH_EMPTY); - serial_port_out(port, UART_IER, ier); + __serial8250_set_IER(up, NULL, ier); serial8250_rpm_put(up); } =20 @@ -2186,6 +2209,7 @@ int serial8250_do_startup(struct uart_port *port) struct uart_8250_port *up =3D up_to_u8250p(port); unsigned long flags; unsigned char iir; + bool is_console; int retval; u16 lsr; =20 @@ -2203,21 +2227,25 @@ int serial8250_do_startup(struct uart_port *port) serial8250_rpm_get(up); if (port->type =3D=3D PORT_16C950) { /* Wake up and initialize UART */ + spin_lock_irqsave(&port->lock, flags); up->acr =3D 0; serial_port_out(port, UART_LCR, UART_LCR_CONF_MODE_B); serial_port_out(port, UART_EFR, UART_EFR_ECB); - serial_port_out(port, UART_IER, 0); + serial8250_set_IER(up, 0); serial_port_out(port, UART_LCR, 0); serial_icr_write(up, UART_CSR, 0); /* Reset the UART */ serial_port_out(port, UART_LCR, UART_LCR_CONF_MODE_B); serial_port_out(port, UART_EFR, UART_EFR_ECB); serial_port_out(port, UART_LCR, 0); + spin_unlock_irqrestore(&port->lock, flags); } =20 if (port->type =3D=3D PORT_DA830) { /* Reset the port */ - serial_port_out(port, UART_IER, 0); + spin_lock_irqsave(&port->lock, flags); + serial8250_set_IER(up, 0); serial_port_out(port, UART_DA830_PWREMU_MGMT, 0); + spin_unlock_irqrestore(&port->lock, flags); mdelay(10); =20 /* Enable Tx, Rx and free run mode */ @@ -2315,6 +2343,8 @@ int serial8250_do_startup(struct uart_port *port) if (retval) goto out; =20 + is_console =3D serial8250_is_console(port); + if (port->irq && !(up->port.flags & UPF_NO_THRE_TEST)) { unsigned char iir1; =20 @@ -2331,6 +2361,9 @@ int serial8250_do_startup(struct uart_port *port) */ spin_lock_irqsave(&port->lock, flags); =20 + if (is_console) + 
serial8250_enter_unsafe(up); + wait_for_xmitr(up, UART_LSR_THRE); serial_port_out_sync(port, UART_IER, UART_IER_THRI); udelay(1); /* allow THRE to set */ @@ -2341,6 +2374,9 @@ int serial8250_do_startup(struct uart_port *port) iir =3D serial_port_in(port, UART_IIR); serial_port_out(port, UART_IER, 0); =20 + if (is_console) + serial8250_exit_unsafe(up); + spin_unlock_irqrestore(&port->lock, flags); =20 if (port->irqflags & IRQF_SHARED) @@ -2395,10 +2431,14 @@ int serial8250_do_startup(struct uart_port *port) * Do a quick test to see if we receive an interrupt when we enable * the TX irq. */ + if (is_console) + serial8250_enter_unsafe(up); serial_port_out(port, UART_IER, UART_IER_THRI); lsr =3D serial_port_in(port, UART_LSR); iir =3D serial_port_in(port, UART_IIR); serial_port_out(port, UART_IER, 0); + if (is_console) + serial8250_exit_unsafe(up); =20 if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT) { if (!(up->bugs & UART_BUG_TXEN)) { @@ -2430,7 +2470,7 @@ int serial8250_do_startup(struct uart_port *port) if (up->dma) { const char *msg =3D NULL; =20 - if (uart_console(port)) + if (is_console) msg =3D "forbid DMA for kernel console"; else if (serial8250_request_dma(up)) msg =3D "failed to request DMA"; @@ -2481,7 +2521,7 @@ void serial8250_do_shutdown(struct uart_port *port) */ spin_lock_irqsave(&port->lock, flags); up->ier =3D 0; - serial_port_out(port, UART_IER, 0); + serial8250_set_IER(up, 0); spin_unlock_irqrestore(&port->lock, flags); =20 synchronize_irq(port->irq); @@ -2847,7 +2887,7 @@ serial8250_do_set_termios(struct uart_port *port, str= uct ktermios *termios, if (up->capabilities & UART_CAP_RTOIE) up->ier |=3D UART_IER_RTOIE; =20 - serial_port_out(port, UART_IER, up->ier); + serial8250_set_IER(up, up->ier); =20 if (up->capabilities & UART_CAP_EFR) { unsigned char efr =3D 0; @@ -3312,12 +3352,21 @@ EXPORT_SYMBOL_GPL(serial8250_set_defaults); =20 #ifdef CONFIG_SERIAL_8250_CONSOLE =20 -static void serial8250_console_putchar(struct uart_port *port, unsigned ch= 
ar ch) +static bool serial8250_console_putchar(struct uart_port *port, unsigned ch= ar ch, + struct cons_write_context *wctxt) { struct uart_8250_port *up =3D up_to_u8250p(port); =20 wait_for_xmitr(up, UART_LSR_THRE); + if (!console_can_proceed(wctxt)) + return false; serial_port_out(port, UART_TX, ch); + if (ch =3D=3D '\n') + up->console_newline_needed =3D false; + else + up->console_newline_needed =3D true; + + return true; } =20 /* @@ -3346,33 +3395,134 @@ static void serial8250_console_restore(struct uart= _8250_port *up) serial8250_out_MCR(up, up->mcr | UART_MCR_DTR | UART_MCR_RTS); } =20 +static bool __serial8250_console_write(struct uart_port *port, struct cons= _write_context *wctxt, + const char *s, unsigned int count, + bool (*putchar)(struct uart_port *, unsigned char, struct cons_write_con= text *)) +{ + bool finished =3D false; + unsigned int i; + + for (i =3D 0; i < count; i++, s++) { + if (*s =3D=3D '\n') { + if (!putchar(port, '\r', wctxt)) + goto out; + } + if (!putchar(port, *s, wctxt)) + goto out; + } + finished =3D true; +out: + return finished; +} + +static bool serial8250_console_write(struct uart_port *port, struct cons_w= rite_context *wctxt, + const char *s, unsigned int count, + bool (*putchar)(struct uart_port *, unsigned char, struct cons_write_con= text *)) +{ + return __serial8250_console_write(port, wctxt, s, count, putchar); +} + +static bool atomic_print_line(struct uart_8250_port *up, + struct cons_write_context *wctxt) +{ + struct uart_port *port =3D &up->port; + char buf[4]; + + if (up->console_newline_needed && + !__serial8250_console_write(port, wctxt, "\n", 1, serial8250_console_= putchar)) { + return false; + } + + sprintf(buf, "A%d", raw_smp_processor_id()); + if (!__serial8250_console_write(port, wctxt, buf, strlen(buf), serial8250= _console_putchar)) + return false; + + return __serial8250_console_write(port, wctxt, wctxt->outbuf, wctxt->len, + serial8250_console_putchar); +} + +static void atomic_console_reacquire(struct 
cons_write_context *wctxt,
+				     struct cons_write_context *wctxt_init)
+{
+	memcpy(wctxt, wctxt_init, sizeof(*wctxt));
+	while (!console_try_acquire(wctxt)) {
+		cpu_relax();
+		memcpy(wctxt, wctxt_init, sizeof(*wctxt));
+	}
+}
+
 /*
- * Print a string to the serial port using the device FIFO
- *
- * It sends fifosize bytes and then waits for the fifo
- * to get empty.
+ * It should be possible to support a hostile takeover in an unsafe
+ * section if it is write_atomic() that is being taken over. But where
+ * to put this policy?
  */
-static void serial8250_console_fifo_write(struct uart_8250_port *up,
-					  const char *s, unsigned int count)
+bool serial8250_console_write_atomic(struct uart_8250_port *up,
+				     struct cons_write_context *wctxt)
 {
-	int i;
-	const char *end = s + count;
-	unsigned int fifosize = up->tx_loadsz;
-	bool cr_sent = false;
-
-	while (s != end) {
-		wait_for_lsr(up, UART_LSR_THRE);
-
-		for (i = 0; i < fifosize && s != end; ++i) {
-			if (*s == '\n' && !cr_sent) {
-				serial_out(up, UART_TX, '\r');
-				cr_sent = true;
-			} else {
-				serial_out(up, UART_TX, *s++);
-				cr_sent = false;
-			}
+	struct cons_write_context wctxt_init = {};
+	struct cons_context *ctxt_init = &ACCESS_PRIVATE(&wctxt_init, ctxt);
+	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+	bool can_print = true;
+	unsigned int ier;
+
+	/* With write_atomic, another context may hold the port->lock. */
+
+	ctxt_init->console = ctxt->console;
+	ctxt_init->prio = ctxt->prio;
+	ctxt_init->thread = ctxt->thread;
+
+	touch_nmi_watchdog();
+
+	/*
+	 * Enter unsafe in order to disable interrupts. If the console is
+	 * lost before the interrupts are disabled, bail out because another
+	 * context took over the printing. If the console is lost after the
+	 * interrupts are disabled, the console must be reacquired in order
+	 * to re-enable the interrupts. However in that case no printing is
+	 * allowed because another context took over the printing.
+ */ + + if (!console_enter_unsafe(wctxt)) + return false; + + if (!__serial8250_clear_IER(up, wctxt, &ier)) + return false; + + if (console_exit_unsafe(wctxt)) { + can_print =3D atomic_print_line(up, wctxt); + if (!can_print) + atomic_console_reacquire(wctxt, &wctxt_init); + + if (can_print) { + can_print =3D console_can_proceed(wctxt); + if (can_print) + wait_for_xmitr(up, UART_LSR_BOTH_EMPTY); + else + atomic_console_reacquire(wctxt, &wctxt_init); + } + } else { + atomic_console_reacquire(wctxt, &wctxt_init); + } + + /* + * Enter unsafe in order to enable interrupts. If the console is + * lost before the interrupts are enabled, the console must be + * reacquired in order to re-enable the interrupts. + */ + + for (;;) { + if (console_enter_unsafe(wctxt) && + __serial8250_set_IER(up, wctxt, ier)) { + break; } + + /* HW-IRQs still disabled. Reacquire to enable them. */ + atomic_console_reacquire(wctxt, &wctxt_init); } + + console_exit_unsafe(wctxt); + + return can_print; } =20 /* @@ -3384,64 +3534,54 @@ static void serial8250_console_fifo_write(struct ua= rt_8250_port *up, * Doing runtime PM is really a bad idea for the kernel console. * Thus, we assume the function is called when device is powered up. 
*/ -void serial8250_console_write(struct uart_8250_port *up, const char *s, - unsigned int count) +bool serial8250_console_write_thread(struct uart_8250_port *up, + struct cons_write_context *wctxt) { struct uart_8250_em485 *em485 =3D up->em485; struct uart_port *port =3D &up->port; - unsigned long flags; - unsigned int ier, use_fifo; - int locked =3D 1; - - touch_nmi_watchdog(); - - if (oops_in_progress) - locked =3D spin_trylock_irqsave(&port->lock, flags); - else - spin_lock_irqsave(&port->lock, flags); + unsigned int count =3D wctxt->len; + const char *s =3D wctxt->outbuf; + bool finished =3D false; + unsigned int ier; + char buf[4]; =20 /* * First save the IER then disable the interrupts */ - ier =3D serial_port_in(port, UART_IER); - serial8250_clear_IER(up); + if (!console_enter_unsafe(wctxt) || + !__serial8250_clear_IER(up, wctxt, &ier)) { + goto out; + } + if (!console_exit_unsafe(wctxt)) + goto out; =20 /* check scratch reg to see if port powered off during system sleep */ if (up->canary && (up->canary !=3D serial_port_in(port, UART_SCR))) { + if (!console_enter_unsafe(wctxt)) + goto out; serial8250_console_restore(up); + if (!console_exit_unsafe(wctxt)) + goto out; up->canary =3D 0; } =20 if (em485) { - if (em485->tx_stopped) + if (em485->tx_stopped) { + if (!console_enter_unsafe(wctxt)) + goto out; up->rs485_start_tx(up); - mdelay(port->rs485.delay_rts_before_send); + if (!console_exit_unsafe(wctxt)) + goto out; + } + mdelay(port->rs485.delay_rts_before_send); /* WTF?! Seriously?! */ } =20 - use_fifo =3D (up->capabilities & UART_CAP_FIFO) && - /* - * BCM283x requires to check the fifo - * after each byte. - */ - !(up->capabilities & UART_CAP_MINI) && - /* - * tx_loadsz contains the transmit fifo size - */ - up->tx_loadsz > 1 && - (up->fcr & UART_FCR_ENABLE_FIFO) && - port->state && - test_bit(TTY_PORT_INITIALIZED, &port->state->port.iflags) && - /* - * After we put a data in the fifo, the controller will send - * it regardless of the CTS state. 
Therefore, only use fifo - * if we don't use control flow. - */ - !(up->port.flags & UPF_CONS_FLOW); + sprintf(buf, "T%d", raw_smp_processor_id()); + if (serial8250_console_write(port, wctxt, buf, strlen(buf), serial8250_co= nsole_putchar)) + finished =3D serial8250_console_write(port, wctxt, s, count, serial8250_= console_putchar); =20 - if (likely(use_fifo)) - serial8250_console_fifo_write(up, s, count); - else - uart_console_write(port, s, count, serial8250_console_putchar); + if (!finished) + goto out; =20 /* * Finally, wait for transmitter to become empty @@ -3450,12 +3590,20 @@ void serial8250_console_write(struct uart_8250_port= *up, const char *s, wait_for_xmitr(up, UART_LSR_BOTH_EMPTY); =20 if (em485) { - mdelay(port->rs485.delay_rts_after_send); - if (em485->tx_stopped) + mdelay(port->rs485.delay_rts_after_send); /* WTF?! Seriously?! */ + if (em485->tx_stopped) { + if (!console_enter_unsafe(wctxt)) + goto out; up->rs485_stop_tx(up); + if (!console_exit_unsafe(wctxt)) + goto out; + } } - - serial_port_out(port, UART_IER, ier); + if (!console_enter_unsafe(wctxt)) + goto out; + WARN_ON_ONCE(!__serial8250_set_IER(up, wctxt, ier)); + if (!console_exit_unsafe(wctxt)) + goto out; =20 /* * The receive handling will happen properly because the @@ -3464,11 +3612,15 @@ void serial8250_console_write(struct uart_8250_port= *up, const char *s, * call it if we have saved something in the saved flags * while processing with interrupts off. 
*/ - if (up->msr_saved_flags) + if (up->msr_saved_flags) { + if (!console_enter_unsafe(wctxt)) + goto out; serial8250_modem_status(up); - - if (locked) - spin_unlock_irqrestore(&port->lock, flags); + if (!console_exit_unsafe(wctxt)) + goto out; + } +out: + return finished; } =20 static unsigned int probe_baud(struct uart_port *port) @@ -3488,6 +3640,7 @@ static unsigned int probe_baud(struct uart_port *port) =20 int serial8250_console_setup(struct uart_port *port, char *options, bool p= robe) { + struct uart_8250_port *up =3D up_to_u8250p(port); int baud =3D 9600; int bits =3D 8; int parity =3D 'n'; @@ -3497,6 +3650,8 @@ int serial8250_console_setup(struct uart_port *port, = char *options, bool probe) if (!port->iobase && !port->membase) return -ENODEV; =20 + up->console_newline_needed =3D false; + if (options) uart_parse_options(options, &baud, &parity, &bits, &flow); else if (probe) diff --git a/drivers/tty/serial/8250/Kconfig b/drivers/tty/serial/8250/Kcon= fig index 978dc196c29b..22656e8370ea 100644 --- a/drivers/tty/serial/8250/Kconfig +++ b/drivers/tty/serial/8250/Kconfig @@ -9,6 +9,7 @@ config SERIAL_8250 depends on !S390 select SERIAL_CORE select SERIAL_MCTRL_GPIO if GPIOLIB + select HAVE_ATOMIC_CONSOLE help This selects whether you want to include the driver for the standard serial ports. The standard answer is Y. People who might say N diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_c= ore.c index 2bd32c8ece39..9901f916dc1a 100644 --- a/drivers/tty/serial/serial_core.c +++ b/drivers/tty/serial/serial_core.c @@ -2336,8 +2336,11 @@ int uart_suspend_port(struct uart_driver *drv, struc= t uart_port *uport) * able to Re-start_rx later. 
*/ if (!console_suspend_enabled && uart_console(uport)) { - if (uport->ops->start_rx) + if (uport->ops->start_rx) { + spin_lock_irq(&uport->lock); uport->ops->stop_rx(uport); + spin_unlock_irq(&uport->lock); + } goto unlock; } =20 @@ -2430,8 +2433,11 @@ int uart_resume_port(struct uart_driver *drv, struct= uart_port *uport) if (console_suspend_enabled) uart_change_pm(state, UART_PM_STATE_ON); uport->ops->set_termios(uport, &termios, NULL); - if (!console_suspend_enabled && uport->ops->start_rx) + if (!console_suspend_enabled && uport->ops->start_rx) { + spin_lock_irq(&uport->lock); uport->ops->start_rx(uport); + spin_unlock_irq(&uport->lock); + } if (console_suspend_enabled) console_start(uport->cons); } diff --git a/include/linux/serial_8250.h b/include/linux/serial_8250.h index 19376bee9667..9055a22992ed 100644 --- a/include/linux/serial_8250.h +++ b/include/linux/serial_8250.h @@ -125,6 +125,8 @@ struct uart_8250_port { #define MSR_SAVE_FLAGS UART_MSR_ANY_DELTA unsigned char msr_saved_flags; =20 + bool console_newline_needed; + struct uart_8250_dma *dma; const struct uart_8250_ops *ops; =20 @@ -139,6 +141,9 @@ struct uart_8250_port { /* Serial port overrun backoff */ struct delayed_work overrun_backoff; u32 overrun_backoff_time_ms; + + struct cons_write_context wctxt; + int cookie; }; =20 static inline struct uart_8250_port *up_to_u8250p(struct uart_port *up) @@ -178,8 +183,10 @@ void serial8250_tx_chars(struct uart_8250_port *up); unsigned int serial8250_modem_status(struct uart_8250_port *up); void serial8250_init_port(struct uart_8250_port *up); void serial8250_set_defaults(struct uart_8250_port *up); -void serial8250_console_write(struct uart_8250_port *up, const char *s, - unsigned int count); +bool serial8250_console_write_atomic(struct uart_8250_port *up, + struct cons_write_context *wctxt); +bool serial8250_console_write_thread(struct uart_8250_port *up, + struct cons_write_context *wctxt); int serial8250_console_setup(struct uart_port *port, char 
*options, bool p= robe); int serial8250_console_exit(struct uart_port *port); From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22917C7EE39 for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230075AbjCBT6D (ORCPT ); Thu, 2 Mar 2023 14:58:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229541AbjCBT5p (ORCPT ); Thu, 2 Mar 2023 14:57:45 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58EF54743A for ; Thu, 2 Mar 2023 11:57:44 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787062; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bn4hK/pc8yfuTHf2lVQ6K39OCmfHLmDwBEQ4LD+VvX8=; b=sVHSx/LE7UqM7m9QqJqsI82I3YhkAegnS1wj0KV1SpI9PJKxnb/I07juk3E23hRkXfc2D6 E93VCEpd5hBOfkTcqtKMqdM9Bqq4/eZi6MzoPvWe4psYtIkpoQu60kDH0sLmcaiU7QkY0X oDq5R/XP1V0TGr6dCS4tfdp9e3DMGiTaUD1jcFG32b1bHe9GxpBVT+NeVHY6xsi7tXZV5h z81xAF0OvvZ4rAgVg7Rm83bheDM1no0axFc4D6eWG0otXxz5IZgHm1Xf85D/23B5Hs4Njo BlNGym9rw1vlZ2oDc+dInihAYTmNb2UuXy5Ssxs8wXPTFtkEMuf3GwXXZdlUQA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787062; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=bn4hK/pc8yfuTHf2lVQ6K39OCmfHLmDwBEQ4LD+VvX8=; b=95Ms7ku4PtEVQkCu9sJ1bRajCtfp7nzS53jwqnhi/CkI9LIirtg2epuND9iUDCiYb5LQd8 TSNC5vetA4XLnrDA== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Jason Wessel , Daniel Thompson , Douglas Anderson , Aaron Tomlin , Luis Chamberlain , kgdb-bugreport@lists.sourceforge.net Subject: [PATCH printk v1 01/18] kdb: do not assume write() callback available Date: Thu, 2 Mar 2023 21:02:01 +0106 Message-Id: <20230302195618.156940-2-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" It is allowed for consoles to provide no write() callback. For example ttynull does this. Check if a write() callback is available before using it. 
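
For illustration only (not part of the patch): a minimal userspace sketch of the check described above. The demo_* names are made up for this example and only model the idea of skipping a console that has no write() callback; the real loop in kdb_msg_write() does more than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct console; only the fields used here. */
struct demo_console {
	const char *name;
	void (*write)(struct demo_console *con, const char *s, unsigned int len);
	unsigned int bytes_written;
};

/* Hypothetical write() callback that only counts the bytes it was given. */
static void demo_write(struct demo_console *con, const char *s, unsigned int len)
{
	(void)s;
	con->bytes_written += len;
}

/*
 * Send a message to every console that provides a write() callback.
 * Consoles without one (ttynull is the example from the commit message)
 * are skipped instead of being called through a NULL pointer.
 * Returns the number of consoles that were written to.
 */
static unsigned int demo_msg_write(struct demo_console *cons, size_t n,
				   const char *msg, unsigned int len)
{
	unsigned int used = 0;
	size_t i;

	assert(cons != NULL);

	for (i = 0; i < n; i++) {
		if (!cons[i].write)
			continue;
		cons[i].write(&cons[i], msg, len);
		used++;
	}

	return used;
}
```

With one console lacking write() and one providing it, only the second one receives the message.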
Signed-off-by: John Ogness
Acked-by: Daniel Thompson
Reviewed-by: Daniel Thompson
Reviewed-by: Douglas Anderson
Reviewed-by: Petr Mladek
Tested-by: Daniel Thompson
---
 kernel/debug/kdb/kdb_io.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 5c7e9ba7cd6b..e9139dfc1f0a 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -576,6 +576,8 @@ static void kdb_msg_write(const char *msg, int msg_len)
 			continue;
 		if (c == dbg_io_ops->cons)
 			continue;
+		if (!c->write)
+			continue;
 		/*
 		 * Set oops_in_progress to encourage the console drivers to
 		 * disregard their internal spin locks: in the current calling
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F271CC7EE36 for ; Thu, 2 Mar 2023 19:58:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230049AbjCBT6A (ORCPT ); Thu, 2 Mar 2023 14:58:00 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54792 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229779AbjCBT5p (ORCPT ); Thu, 2 Mar 2023 14:57:45 -0500
Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61787474D7 for ; Thu, 2 Mar 2023 11:57:44 -0800 (PST)
From: John Ogness
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787062; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+RiY4OM6UJRNz/WSBX/5KTZe1S/q7JPH46lONkiG6+g=; b=Y//5m69smqt/3rvjKkJp+AlCiSiSrJ2hIBzpHRCpeJXg/jOSGUYgrqzG6OPRuopLBUgI93
drFPLWbNUW/fRZlcONLMavFXa5Ibf1mkU0drw9z2KBEmFfJsksGvn1r5BQExYV/wcjj/D9 Lhx/bq9/xUikhZhko9k7557OS/3qwp2w3O2W9RgfQBrbQwwcZUi7nOQVxdn6PbbK0ZVTNR p9N6Y1bTFLq74dYQ9u4Iv62cwBG27Gy0I7yiG8WBav5uqHe7as38pYd49SanoYvV+CQl1N f4tx5rN4Cndz1RU2q5bwq6uoZMmD4xhgtDWyLhUarNHhzL1TuU+ycaxQ+E++iQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787062; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+RiY4OM6UJRNz/WSBX/5KTZe1S/q7JPH46lONkiG6+g=; b=5mkIXmg8E1wb4LxyOtqUWqr3pPJWzahz+3aLJVXlzhMk3YTSTUrSC36xyMS9fCgNetkQGg J14n8LFF4jHNUMBg== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org Subject: [PATCH printk v1 02/18] printk: Add NMI check to down_trylock_console_sem() Date: Thu, 2 Mar 2023 21:02:02 +0106 Message-Id: <20230302195618.156940-3-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The printk path is NMI safe because it only adds content to the buffer and then triggers the delayed output via irq_work. If the console is flushed or unblanked (on panic) from NMI then it can deadlock in down_trylock_console_sem() because the semaphore is not NMI safe. Avoid try-locking the console from NMI and assume it failed. 
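
For illustration only (not part of the patch): a small userspace model of the guard described above. The demo_* names are hypothetical, demo_in_nmi stands in for in_nmi(), and the counter is a toy replacement for the console semaphore. As in __down_trylock_console_sem(), a return value of 1 means the lock was not taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy semaphore: count > 0 means it can be taken. */
struct demo_sem {
	int count;
};

/* Hypothetical stand-in for in_nmi(). */
static bool demo_in_nmi;

/*
 * Try to take the semaphore without blocking.
 * Returns 0 on success and 1 on failure.
 */
static int demo_down_trylock(struct demo_sem *sem)
{
	assert(sem != NULL);

	/* Never touch the semaphore from NMI context; report failure. */
	if (demo_in_nmi)
		return 1;

	if (sem->count <= 0)
		return 1;

	sem->count--;
	return 0;
}
```

The point of the early return is that the semaphore is not even inspected in NMI context, so the caller always falls back to its "lock not taken" path there.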
Signed-off-by: John Ogness Tested-by: Daniel Thompson --- kernel/printk/printk.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 40c5f4170ac7..84af038292d9 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -318,6 +318,10 @@ static int __down_trylock_console_sem(unsigned long ip) int lock_failed; unsigned long flags; =20 + /* Semaphores are not NMI-safe. */ + if (in_nmi()) + return 1; + /* * Here and in __up_console_sem() we need to be in safe mode, * because spindump/WARN/etc from under console ->lock will --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D74DC7EE43 for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229679AbjCBT6H (ORCPT ); Thu, 2 Mar 2023 14:58:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54794 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229794AbjCBT5q (ORCPT ); Thu, 2 Mar 2023 14:57:46 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 62044474DE for ; Thu, 2 Mar 2023 11:57:44 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=P2i344YuQotLWLLh6A7EkhSEbuMCwNrp9Bq5Z1X7/c0=; b=YMCQ3Q5j0mLRMU57xKPv0P3mSbWo0Ywjx2iCdMhsDuTXc1VGeoD7OhwB3PZioa1Q6rBV5x op9kGRs7DNY2tbQ+m6gUZxo8isQ4uWQzwqGlv4d2PFhAfuJbrKmW0oigkvHuDDS0mEYedn 
K09hhIZChvrQnP6jg13NnsoxEYMUXKke8zh/k3+wNURrPYkk1hC1Wp9WTHo2lLesvw8qiq 2yzsCgLNrX30Ebq2zr+YcKqFBwTlbsBWY1TltUhSJqP5LyVMzHRBX1djhklF0Q5sCoHXcH CTFE6ChYtFsBfkrRFpCI9O/wU7BaD4JQkDUqKq9HeQuz8WjxVSm6Pz53lSXeHA==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=P2i344YuQotLWLLh6A7EkhSEbuMCwNrp9Bq5Z1X7/c0=; b=6LDiB0rmVnvPOZP6uR2CnybeIo6zVvSY0985uqwkaHrhDpNtH84j+OHtP7kKDCT1pBWMmC Vrc8je4gESeoJrDg==
To: Petr Mladek
Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org
Subject: [PATCH printk v1 03/18] printk: Consolidate console deferred printing
Date: Thu, 2 Mar 2023 21:02:03 +0106
Message-Id: <20230302195618.156940-4-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Printing to consoles can be deferred for several reasons:

- explicitly with printk_deferred()
- printk() in NMI context
- recursive printk() calls

The current implementation is not consistent. For printk_deferred(), irq work is scheduled twice. For the NMI and recursive cases, panic CPU suppression and caller delays are not properly enforced.

Correct these inconsistencies by consolidating the deferred printing code so that vprintk_deferred() is the toplevel function for deferred printing and vprintk_emit() will perform whichever irq_work queueing is appropriate.

Also add kerneldoc for wake_up_klogd() and defer_console_output() to clarify their differences and appropriate usage.
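
For illustration only (not part of the patch): a userspace sketch of the consolidated control flow. The demo_* names are hypothetical and the counters merely stand in for the two kinds of irq_work requests. Storing the record and direct console printing are left out. The level values mirror LOGLEVEL_SCHED (-2) and LOGLEVEL_DEFAULT (-1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_LOGLEVEL_SCHED	(-2)
#define DEMO_LOGLEVEL_DEFAULT	(-1)

/* Counts how often each kind of deferred work was requested. */
struct demo_counters {
	unsigned int wake_only;		/* models wake_up_klogd() */
	unsigned int deferred_output;	/* models defer_console_output() */
};

/*
 * Model of the tail of vprintk_emit(): exactly one request is queued
 * per message, chosen by whether the caller asked for deferral.
 */
static int demo_emit(struct demo_counters *c, int level, int len)
{
	bool in_sched = (level == DEMO_LOGLEVEL_SCHED);

	assert(c != NULL);

	if (in_sched)
		c->deferred_output++;
	else
		c->wake_only++;

	return len;
}

/* Model of vprintk_deferred(): a thin wrapper, no second request. */
static int demo_deferred(struct demo_counters *c, int len)
{
	return demo_emit(c, DEMO_LOGLEVEL_SCHED, len);
}
```

In this model a deferred message queues a single request, which is the double-scheduling problem the commit message says it removes.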
Signed-off-by: John Ogness Tested-by: Daniel Thompson --- kernel/printk/printk.c | 31 ++++++++++++++++++++++++------- kernel/printk/printk_safe.c | 9 ++------- 2 files changed, 26 insertions(+), 14 deletions(-) diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 84af038292d9..bdeaf12e0bd2 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2321,7 +2321,10 @@ asmlinkage int vprintk_emit(int facility, int level, preempt_enable(); } =20 - wake_up_klogd(); + if (in_sched) + defer_console_output(); + else + wake_up_klogd(); return printed_len; } EXPORT_SYMBOL(vprintk_emit); @@ -3811,11 +3814,30 @@ static void __wake_up_klogd(int val) preempt_enable(); } =20 +/** + * wake_up_klogd - Wake kernel logging daemon + * + * Use this function when new records have been added to the ringbuffer + * and the console printing for those records is handled elsewhere. In + * this case only the logging daemon needs to be woken. + * + * Context: Any context. + */ void wake_up_klogd(void) { __wake_up_klogd(PRINTK_PENDING_WAKEUP); } =20 +/** + * defer_console_output - Wake kernel logging daemon and trigger + * console printing in a deferred context + * + * Use this function when new records have been added to the ringbuffer + * but the current context is unable to perform the console printing. + * This function also wakes the logging daemon. + * + * Context: Any context. + */ void defer_console_output(void) { /* @@ -3832,12 +3854,7 @@ void printk_trigger_flush(void) =20 int vprintk_deferred(const char *fmt, va_list args) { - int r; - - r =3D vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); - defer_console_output(); - - return r; + return vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); } =20 int _printk_deferred(const char *fmt, ...) 
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index ef0f9a2044da..6d10927a07d8 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -38,13 +38,8 @@ asmlinkage int vprintk(const char *fmt, va_list args) * Use the main logbuf even in NMI. But avoid calling console * drivers that might have their own locks. */ - if (this_cpu_read(printk_context) || in_nmi()) { - int len; - - len =3D vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args); - defer_console_output(); - return len; - } + if (this_cpu_read(printk_context) || in_nmi()) + return vprintk_deferred(fmt, args); =20 /* No obstacles. */ return vprintk_default(fmt, args); --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8168DC7EE3A for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230160AbjCBT6J (ORCPT ); Thu, 2 Mar 2023 14:58:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229805AbjCBT5q (ORCPT ); Thu, 2 Mar 2023 14:57:46 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3CEE474E3 for ; Thu, 2 Mar 2023 11:57:44 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=R9tOYyS2PzrZZNPqHz2RZVi/VNmEwSgQ/BXnraEltaw=; b=t7LjSSSEITvx8kbiv5CVXv2TXdNm6VXkiTcPhTMnFJX4uIr7wF06FV5+xsBpNoi8RF6APL 
UO2YbBOrbzGF/RMFHD+y8k1VuMeEzXW42wTe98I556VXsXrPWorh0w3dwAcO2Rwf4Vxgkd I3enyGsYKtohM4C8uiKqw3hL//CQ8usus0d4Il4sQZK30JSxgfQ8QbCpygTMbOsDtRJTeJ Fk+vKkKIejQ3e6s8f8TnxJ/kazi1uJky+fO9DpowvQunD4NR/+aCzJ+p7peDZ37PlzKSH9 mCK0vzr1g86tZ39IPMVc2Vo3iU3nFyewTWe7vUwFXdm+rNqrX4szqlm9ejBNtA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=R9tOYyS2PzrZZNPqHz2RZVi/VNmEwSgQ/BXnraEltaw=; b=JM39akAot7pRoHec+7jJQXSabDYbtHF8CZ6R6xuPk+UspGwDe0FQfGbU+lqt/abGpZprEb eBv9KM06PZPGPvDA== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH printk v1 04/18] printk: Add per-console suspended state Date: Thu, 2 Mar 2023 21:02:04 +0106 Message-Id: <20230302195618.156940-5-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Currently the global @console_suspended is used to determine if consoles are in a suspended state. Its primary purpose is to allow usage of the console_lock when suspended without causing console printing. It is synchronized by the console_lock. Rather than relying on the console_lock to determine suspended state, make it an official per-console state that is set within console->flags. This allows the state to be queried via SRCU. @console_suspended will continue to exist, but now only to implement the console_lock/console_unlock trickery and _not_ to represent the suspend state of a particular console. 
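
For illustration only (not part of the patch): a userspace sketch of a per-console suspended flag. The demo_* names are hypothetical. DEMO_CON_SUSPENDED uses bit 7 as in the patch; the bit chosen for DEMO_CON_ENABLED is an assumption of this sketch. The console list locking and the SRCU synchronization done by the real code are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CON_ENABLED	(1u << 2)	/* assumed bit for this sketch */
#define DEMO_CON_SUSPENDED	(1u << 7)	/* BIT(7), as in the patch */

/* Hypothetical stand-in for struct console; only the flags matter. */
struct demo_console {
	unsigned int flags;
};

/* Set or clear the suspended flag on every console in the array. */
static void demo_set_suspended(struct demo_console *cons, size_t n, bool suspend)
{
	size_t i;

	assert(cons != NULL);

	for (i = 0; i < n; i++) {
		if (suspend)
			cons[i].flags |= DEMO_CON_SUSPENDED;
		else
			cons[i].flags &= ~DEMO_CON_SUSPENDED;
	}
}

/* A console may print only if it is enabled and not suspended. */
static bool demo_console_is_usable(const struct demo_console *con)
{
	if (!(con->flags & DEMO_CON_ENABLED))
		return false;

	if (con->flags & DEMO_CON_SUSPENDED)
		return false;

	return true;
}
```

Because the state lives in each console's flags, a reader only needs that one flags word to decide whether printing is allowed.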
Signed-off-by: John Ogness Tested-by: Daniel Thompson --- include/linux/console.h | 3 +++ kernel/printk/printk.c | 46 ++++++++++++++++++++++++++++++++--------- 2 files changed, 39 insertions(+), 10 deletions(-) diff --git a/include/linux/console.h b/include/linux/console.h index 1e36958aa656..f7967fb238e0 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -153,6 +153,8 @@ static inline int con_debug_leave(void) * receiving the printk spam for obvious reasons. * @CON_EXTENDED: The console supports the extended output format of * /dev/kmesg which requires a larger output buffer. + * @CON_SUSPENDED: Indicates if a console is suspended. If true, the + * printing callbacks must not be called. */ enum cons_flags { CON_PRINTBUFFER =3D BIT(0), @@ -162,6 +164,7 @@ enum cons_flags { CON_ANYTIME =3D BIT(4), CON_BRL =3D BIT(5), CON_EXTENDED =3D BIT(6), + CON_SUSPENDED =3D BIT(7), }; =20 /** diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index bdeaf12e0bd2..626d467c7e9b 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2563,10 +2563,26 @@ MODULE_PARM_DESC(console_no_auto_verbose, "Disable = console loglevel raise to hig */ void suspend_console(void) { + struct console *con; + if (!console_suspend_enabled) return; pr_info("Suspending console(s) (use no_console_suspend to debug)\n"); pr_flush(1000, true); + + console_list_lock(); + for_each_console(con) + console_srcu_write_flags(con, con->flags | CON_SUSPENDED); + console_list_unlock(); + + /* + * Ensure that all SRCU list walks have completed. All printing + * contexts must be able to see that they are suspended so that it + * is guaranteed that all printing has stopped when this function + * completes. 
+ */ + synchronize_srcu(&console_srcu); + console_lock(); console_suspended =3D 1; up_console_sem(); @@ -2574,11 +2590,26 @@ void suspend_console(void) =20 void resume_console(void) { + struct console *con; + if (!console_suspend_enabled) return; down_console_sem(); console_suspended =3D 0; console_unlock(); + + console_list_lock(); + for_each_console(con) + console_srcu_write_flags(con, con->flags & ~CON_SUSPENDED); + console_list_unlock(); + + /* + * Ensure that all SRCU list walks have completed. All printing + * contexts must be able to see they are no longer suspended so + * that they are guaranteed to wake up and resume printing. + */ + synchronize_srcu(&console_srcu); + pr_flush(1000, true); } =20 @@ -2681,6 +2712,9 @@ static inline bool console_is_usable(struct console *= con) if (!(flags & CON_ENABLED)) return false; =20 + if ((flags & CON_SUSPENDED)) + return false; + if (!con->write) return false; =20 @@ -3695,8 +3729,7 @@ static bool __pr_flush(struct console *con, int timeo= ut_ms, bool reset_on_progre =20 /* * Hold the console_lock to guarantee safe access to - * console->seq and to prevent changes to @console_suspended - * until all consoles have been processed. + * console->seq. */ console_lock(); =20 @@ -3712,14 +3745,7 @@ static bool __pr_flush(struct console *con, int time= out_ms, bool reset_on_progre } console_srcu_read_unlock(cookie); =20 - /* - * If consoles are suspended, it cannot be expected that they - * make forward progress, so timeout immediately. @diff is - * still used to return a valid flush status. 
- */ - if (console_suspended) - remaining =3D 0; - else if (diff !=3D last_diff && reset_on_progress) + if (diff !=3D last_diff && reset_on_progress) remaining =3D timeout_ms; =20 console_unlock(); --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 92699C83003 for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230175AbjCBT6L (ORCPT ); Thu, 2 Mar 2023 14:58:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54832 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229952AbjCBT5t (ORCPT ); Thu, 2 Mar 2023 14:57:49 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 087C94743A; Thu, 2 Mar 2023 11:57:47 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Yj+YRLcxasSK2Pr36OxlyCW12ZJIKbcJmPt2KN23fnk=; b=q0zx+CQAWzAd3CsFCV0mps1oEewIWbGVkVJHEYLq3UwUGl0le4p55y3jpO8RtTEJCo1Xu3 h/nIH3TRoOQRc78cCLN24mxnCZh4PWeeA5Ow/9tP4tNC5r4DJssuYylMYNE/kVNWR/h4Bz gJHmefuvh69DrctBafZISV1c0YYxZwkznh3GRZMG++mOyCK10+pqvUUC4Zb+frHnsAq0wl JCpTdfDavMFMMFYKxGWy5IEc2t6TyL+bcpaHoUzvnHdAd0ac79Za7OuWtWoSLcLX8g2fYh 9ZeNeRwtsFioL6DxLwJfYZNaMIR+ai6g2oCPR5th2FoqJKpLKUIFhx5wu8z/IQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787063; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Yj+YRLcxasSK2Pr36OxlyCW12ZJIKbcJmPt2KN23fnk=; b=e4u+AHjyLLx27mBhuY/PR6fdYBpXDMV1DR6iEafBPhGU4LP9j8mdaBPYZ6VuNlGlHww0b2 91WcrEWlX45z9ZCQ==
To: Petr Mladek
Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman , linux-fsdevel@vger.kernel.org
Subject: [PATCH printk v1 05/18] printk: Add non-BKL console basic infrastructure
Date: Thu, 2 Mar 2023 21:02:05 +0106
Message-Id: <20230302195618.156940-6-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

The current console/printk subsystem is protected by a Big Kernel Lock (aka console_lock), which has ill-defined semantics and is more or less stateless. This puts severe limitations on the console subsystem and makes forced takeover and output in emergency and panic situations a fragile endeavour which is based on try and pray.

The goal of non-BKL consoles is to break out of the console lock jail and to provide a new infrastructure that avoids the pitfalls and allows console drivers to be gradually converted over.

The proposed infrastructure aims for the following properties:

- Per console locking instead of global locking
- Per console state which allows making informed decisions
- Stateful handover and takeover

As a first step, state is added to struct console. The per console state is an atomic_long_t with a 32bit bit field and on 64bit also a 32bit sequence for tracking the last printed ringbuffer sequence number. On 32bit the sequence is kept separate from the state because atomic_long_t has no room for it there, which requires handling a few extra race conditions.
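
For illustration only (not part of the patch): a userspace sketch of the 64bit case, where state bits and sequence share one word that is updated with a single compare-and-exchange. The demo_* names are hypothetical, C11 atomics stand in for atomic_long_t, and the choice of which half holds the sequence is an assumption of this sketch rather than the layout of struct cons_state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 32 state bits and a 32bit sequence into one 64bit word. */
static uint64_t demo_pack(uint32_t bits, uint32_t seq)
{
	return ((uint64_t)seq << 32) | bits;
}

static uint32_t demo_bits(uint64_t atom)
{
	return (uint32_t)atom;
}

static uint32_t demo_seq(uint64_t atom)
{
	return (uint32_t)(atom >> 32);
}

/*
 * Replace the state only if it still equals @expected. Both halves
 * change together or not at all, so a reader never sees new bits
 * combined with an old sequence.
 */
static bool demo_try_update(_Atomic uint64_t *state, uint64_t expected,
			    uint64_t desired)
{
	assert(state != NULL);

	return atomic_compare_exchange_strong(state, &expected, desired);
}
```

A caller reads the word, prepares a modified copy and retries if the exchange fails because another context changed the state in between.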
Reserve state bits, which will be populated later in the series. Wire
it up into the console register/unregister functionality and exclude
such consoles from being handled in the console BKL mechanisms. Since
the non-BKL consoles will not depend on the console lock/unlock dance
for printing, only perform said dance if a BKL console is registered.

The decision to use a bitfield was made as using a plain u32 with
mask/shift operations turned out to result in incomprehensible code.

Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 fs/proc/consoles.c           |   1 +
 include/linux/console.h      |  33 +++++++++
 kernel/printk/Makefile       |   2 +-
 kernel/printk/internal.h     |  10 +++
 kernel/printk/printk.c       |  50 +++++++++++--
 kernel/printk/printk_nobkl.c | 137 +++++++++++++++++++++++++++++++++++
 6 files changed, 226 insertions(+), 7 deletions(-)
 create mode 100644 kernel/printk/printk_nobkl.c

diff --git a/fs/proc/consoles.c b/fs/proc/consoles.c
index e0758fe7936d..9ce506866e60 100644
--- a/fs/proc/consoles.c
+++ b/fs/proc/consoles.c
@@ -21,6 +21,7 @@ static int show_console_dev(struct seq_file *m, void *v)
 		{ CON_ENABLED,		'E' },
 		{ CON_CONSDEV,		'C' },
 		{ CON_BOOT,		'B' },
+		{ CON_NO_BKL,		'N' },
 		{ CON_PRINTBUFFER,	'p' },
 		{ CON_BRL,		'b' },
 		{ CON_ANYTIME,		'a' },
diff --git a/include/linux/console.h b/include/linux/console.h
index f7967fb238e0..b9d2ad580128 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -155,6 +155,8 @@ static inline int con_debug_leave(void)
  *			/dev/kmesg which requires a larger output buffer.
  * @CON_SUSPENDED:	Indicates if a console is suspended. If true, the
  *			printing callbacks must not be called.
+ * @CON_NO_BKL:		Console can operate outside of the BKL style console_lock
+ *			constraints.
*/ enum cons_flags { CON_PRINTBUFFER =3D BIT(0), @@ -165,6 +167,32 @@ enum cons_flags { CON_BRL =3D BIT(5), CON_EXTENDED =3D BIT(6), CON_SUSPENDED =3D BIT(7), + CON_NO_BKL =3D BIT(8), +}; + +/** + * struct cons_state - console state for NOBKL consoles + * @atom: Compound of the state fields for atomic operations + * @seq: Sequence for record tracking (64bit only) + * @bits: Compound of the state bits below + * + * To be used for state read and preparation of atomic_long_cmpxchg() + * operations. + */ +struct cons_state { + union { + unsigned long atom; + struct { +#ifdef CONFIG_64BIT + u32 seq; +#endif + union { + u32 bits; + struct { + }; + }; + }; + }; }; =20 /** @@ -186,6 +214,8 @@ enum cons_flags { * @dropped: Number of unreported dropped ringbuffer records * @data: Driver private data * @node: hlist node for the console list + * + * @atomic_state: State array for NOBKL consoles; real and handover */ struct console { char name[16]; @@ -205,6 +235,9 @@ struct console { unsigned long dropped; void *data; struct hlist_node node; + + /* NOBKL console specific members */ + atomic_long_t __private atomic_state[2]; }; =20 #ifdef CONFIG_LOCKDEP diff --git a/kernel/printk/Makefile b/kernel/printk/Makefile index f5b388e810b9..b36683bd2f82 100644 --- a/kernel/printk/Makefile +++ b/kernel/printk/Makefile @@ -1,6 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only obj-y =3D printk.o -obj-$(CONFIG_PRINTK) +=3D printk_safe.o +obj-$(CONFIG_PRINTK) +=3D printk_safe.o printk_nobkl.o obj-$(CONFIG_A11Y_BRAILLE_CONSOLE) +=3D braille.o obj-$(CONFIG_PRINTK_INDEX) +=3D index.o =20 diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 2a17704136f1..da380579263b 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -3,6 +3,7 @@ * internal.h - printk internal definitions */ #include +#include =20 #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL) void __init printk_sysctl_init(void); @@ -61,6 +62,10 @@ void defer_console_output(void); =20 u16 
printk_parse_prefix(const char *text, int *level, enum printk_info_flags *flags); + +void cons_nobkl_cleanup(struct console *con); +void cons_nobkl_init(struct console *con); + #else =20 #define PRINTK_PREFIX_MAX 0 @@ -76,8 +81,13 @@ u16 printk_parse_prefix(const char *text, int *level, #define printk_safe_exit_irqrestore(flags) local_irq_restore(flags) =20 static inline bool printk_percpu_data_ready(void) { return false; } +static inline void cons_nobkl_init(struct console *con) { } +static inline void cons_nobkl_cleanup(struct console *con) { } + #endif /* CONFIG_PRINTK */ =20 +extern bool have_boot_console; + /** * struct printk_buffers - Buffers to read/format/output printk messages. * @outbuf: After formatting, contains text to output. diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 626d467c7e9b..b2c7c92c3d79 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -446,6 +446,19 @@ static int console_msg_format =3D MSG_FORMAT_DEFAULT; /* syslog_lock protects syslog_* variables and write access to clear_seq. = */ static DEFINE_MUTEX(syslog_lock); =20 +/* + * Specifies if a BKL console was ever registered. Used to determine if the + * console lock/unlock dance is needed for console printing. + */ +static bool have_bkl_console; + +/* + * Specifies if a boot console is registered. Used to determine if NOBKL + * consoles may be used since NOBKL consoles cannot synchronize with boot + * consoles. + */ +bool have_boot_console; + #ifdef CONFIG_PRINTK DECLARE_WAIT_QUEUE_HEAD(log_wait); /* All 3 protected by @syslog_lock. */ @@ -2301,7 +2314,7 @@ asmlinkage int vprintk_emit(int facility, int level, printed_len =3D vprintk_store(facility, level, dev_info, fmt, args); =20 /* If called from the scheduler, we can not call up(). */ - if (!in_sched) { + if (!in_sched && have_bkl_console) { /* * The caller may be holding system-critical or * timing-sensitive locks. 
Disable preemption during @@ -2624,7 +2637,7 @@ void resume_console(void) */ static int console_cpu_notify(unsigned int cpu) { - if (!cpuhp_tasks_frozen) { + if (!cpuhp_tasks_frozen && have_bkl_console) { /* If trylock fails, someone else is doing the printing */ if (console_trylock()) console_unlock(); @@ -3098,6 +3111,9 @@ void console_unblank(void) struct console *c; int cookie; =20 + if (!have_bkl_console) + return; + /* * Stop console printing because the unblank() callback may * assume the console is not within its write() callback. @@ -3135,6 +3151,9 @@ void console_unblank(void) */ void console_flush_on_panic(enum con_flush_mode mode) { + if (!have_bkl_console) + return; + /* * If someone else is holding the console lock, trylock will fail * and may_schedule may be set. Ignore and proceed to unlock so @@ -3310,9 +3329,10 @@ static void try_enable_default_console(struct consol= e *newcon) newcon->flags |=3D CON_CONSDEV; } =20 -#define con_printk(lvl, con, fmt, ...) \ - printk(lvl pr_fmt("%sconsole [%s%d] " fmt), \ - (con->flags & CON_BOOT) ? "boot" : "", \ +#define con_printk(lvl, con, fmt, ...) \ + printk(lvl pr_fmt("%s%sconsole [%s%d] " fmt), \ + (con->flags & CON_NO_BKL) ? "" : "legacy ", \ + (con->flags & CON_BOOT) ? "boot" : "", \ con->name, con->index, ##__VA_ARGS__) =20 static void console_init_seq(struct console *newcon, bool bootcon_register= ed) @@ -3472,6 +3492,14 @@ void register_console(struct console *newcon) newcon->dropped =3D 0; console_init_seq(newcon, bootcon_registered); =20 + if (!(newcon->flags & CON_NO_BKL)) + have_bkl_console =3D true; + else + cons_nobkl_init(newcon); + + if (newcon->flags & CON_BOOT) + have_boot_console =3D true; + /* * Put this console in the list - keep the * preferred driver at the head of the list. @@ -3515,6 +3543,9 @@ void register_console(struct console *newcon) if (con->flags & CON_BOOT) unregister_console_locked(con); } + + /* All boot consoles have been unregistered. 
*/ + have_boot_console =3D false; } unlock: console_list_unlock(); @@ -3563,6 +3594,9 @@ static int unregister_console_locked(struct console *= console) */ synchronize_srcu(&console_srcu); =20 + if (console->flags & CON_NO_BKL) + cons_nobkl_cleanup(console); + console_sysfs_notify(); =20 if (console->exit) @@ -3866,11 +3900,15 @@ void wake_up_klogd(void) */ void defer_console_output(void) { + int val =3D PRINTK_PENDING_WAKEUP; + /* * New messages may have been added directly to the ringbuffer * using vprintk_store(), so wake any waiters as well. */ - __wake_up_klogd(PRINTK_PENDING_WAKEUP | PRINTK_PENDING_OUTPUT); + if (have_bkl_console) + val |=3D PRINTK_PENDING_OUTPUT; + __wake_up_klogd(val); } =20 void printk_trigger_flush(void) diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c new file mode 100644 index 000000000000..8df3626808dd --- /dev/null +++ b/kernel/printk/printk_nobkl.c @@ -0,0 +1,137 @@ +// SPDX-License-Identifier: GPL-2.0-only +// Copyright (C) 2022 Linutronix GmbH, John Ogness +// Copyright (C) 2022 Intel, Thomas Gleixner + +#include +#include +#include "internal.h" +/* + * Printk implementation for consoles that do not depend on the BKL style + * console_lock mechanism. + * + * Console is locked on a CPU when state::locked is set and state:cpu =3D= =3D + * current CPU. This is valid for the current execution context. + * + * Nesting execution contexts on the same CPU can carefully take over + * if the driver allows reentrancy via state::unsafe =3D false. When the + * interrupted context resumes it checks the state before entering + * an unsafe region and aborts the operation if it detects a takeover. + * + * In case of panic or emergency the nesting context can take over the + * console forcefully. 
The write callback is then invoked with the unsafe
+ * flag set in the write context data, which allows the driver side to avoid
+ * locks and to evaluate the driver state so it can use an emergency path
+ * or repair the state instead of blindly assuming that it works.
+ *
+ * If the interrupted context touches the assigned record buffer after
+ * takeover, it does not cause harm because at the same execution level
+ * there is no concurrency on the same CPU. A threaded printer always has
+ * its own record buffer so it can never interfere with any of the per CPU
+ * record buffers.
+ *
+ * A concurrent writer on a different CPU can request to take over the
+ * console by:
+ *
+ *	1) Carefully writing the desired state into state[REQ]
+ *	   if there is no same or higher priority request pending.
+ *	   This locks state[REQ] except for higher priority
+ *	   waiters.
+ *
+ *	2) Setting state[CUR].req_prio unless a same or higher
+ *	   priority waiter won the race.
+ *
+ *	3) Carefully spin on state[CUR] until that is locked with the
+ *	   expected state. When the state is not the expected one then it
+ *	   has to verify that state[REQ] is still the same and that
+ *	   state[CUR] has not been taken over or unlocked.
+ *
+ *	The unlocker hands over to state[REQ], but only if state[CUR]
+ *	matches.
+ *
+ * In case that the owner does not react on the request and does not make
+ * observable progress, the waiter will timeout and can then decide to do
+ * a hostile takeover.
+ */
+
+#define copy_full_state(_dst, _src)	do { _dst = _src; } while (0)
+#define copy_bit_state(_dst, _src)	do { _dst.bits = _src.bits; } while (0)
+
+#ifdef CONFIG_64BIT
+#define copy_seq_state64(_dst, _src)	do { _dst.seq = _src.seq; } while (0)
+#else
+#define copy_seq_state64(_dst, _src)	do { } while (0)
+#endif
+
+enum state_selector {
+	CON_STATE_CUR,
+	CON_STATE_REQ,
+};
+
+/**
+ * cons_state_set - Helper function to set the console state
+ * @con:	Console to update
+ * @which:	Selects real state or handover state
+ * @new:	The new state to write
+ *
+ * Only to be used when the console is not yet or no longer visible in the
+ * system. Otherwise use cons_state_try_cmpxchg().
+ */
+static inline void cons_state_set(struct console *con, enum state_selector which,
+				  struct cons_state *new)
+{
+	atomic_long_set(&ACCESS_PRIVATE(con, atomic_state[which]), new->atom);
+}
+
+/**
+ * cons_state_read - Helper function to read the console state
+ * @con:	Console to read
+ * @which:	Selects real state or handover state
+ * @state:	The state to store the result
+ */
+static inline void cons_state_read(struct console *con, enum state_selector which,
+				   struct cons_state *state)
+{
+	state->atom = atomic_long_read(&ACCESS_PRIVATE(con, atomic_state[which]));
+}
+
+/**
+ * cons_state_try_cmpxchg() - Helper function for atomic_long_try_cmpxchg() on console state
+ * @con:	Console to update
+ * @which:	Selects real state or handover state
+ * @old:	Old/expected state
+ * @new:	New state
+ *
+ * Returns: True on success, false on fail
+ */
+static inline bool cons_state_try_cmpxchg(struct console *con,
+					  enum state_selector which,
+					  struct cons_state *old,
+					  struct cons_state *new)
+{
+	return atomic_long_try_cmpxchg(&ACCESS_PRIVATE(con, atomic_state[which]),
+				       &old->atom, new->atom);
+}
+
+/**
+ * cons_nobkl_init - Initialize the NOBKL console specific data
+ * @con:	Console to initialize
+ */
+void cons_nobkl_init(struct console *con)
+{
+	struct
cons_state state =3D { }; + + cons_state_set(con, CON_STATE_CUR, &state); + cons_state_set(con, CON_STATE_REQ, &state); +} + +/** + * cons_nobkl_cleanup - Cleanup the NOBKL console specific data + * @con: Console to cleanup + */ +void cons_nobkl_cleanup(struct console *con) +{ + struct cons_state state =3D { }; + + cons_state_set(con, CON_STATE_CUR, &state); + cons_state_set(con, CON_STATE_REQ, &state); +} --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A45E4C7EE45 for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230189AbjCBT6O (ORCPT ); Thu, 2 Mar 2023 14:58:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54836 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229953AbjCBT5t (ORCPT ); Thu, 2 Mar 2023 14:57:49 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0BF37474D7 for ; Thu, 2 Mar 2023 11:57:47 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787064; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2CZAbAfVRcCZykreSseF4BxE7wfvpnXRuXtzB7UnPI0=; b=gj2EekBu62fenhSoE9+Y9XeOTSAa3N3PZvcS2tVf960lxHYOwZrOpLb+8gU1xuqGl9Ltmy C69IOP98cKyez0YNrZMeLNd2aF9QUg0xHzLflDl+lqy7a8j5dQHJzQnOheo9R+eekeesi2 S6clEMuG697P3rCDwOqSAwqup96ghEPTSiKYjpd1NZ0cP5G75footxE/YC2282iDNuk5kO QyIlKaQzy1VNgMVUDGYmj0lF1WjliDJoyUat2QuWkxRAB1v0Nz+52wSoVUzay7T8por5Gy goIY+llQwZBYwG0UCISfmuhTAmXXKXEFmc+IqWZRoPTYxRAMwUOpBnbXjKnXfg== 
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787064; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2CZAbAfVRcCZykreSseF4BxE7wfvpnXRuXtzB7UnPI0=; b=hiSTFYVmmzLXgk6SuAxKWIIEc26zdd6CUaxwE2Lp1z4pHMXdtOsy74gfHb920MRVV8eqIA zcPILMQyj99MGcCw==
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org, Greg Kroah-Hartman
Subject: [PATCH printk v1 06/18] printk: nobkl: Add acquire/release logic
Date: Thu, 2 Mar 2023 21:02:06 +0106
Message-Id: <20230302195618.156940-7-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

Add per console acquire/release functionality. The console 'locked'
state is a combination of several state fields:

  - The 'locked' bit

  - The 'cpu' field that denotes on which CPU the console is locked

  - The 'cur_prio' field that contains the severity of the printk
    context that owns the console. This field is used for decisions
    whether to attempt friendly handovers and also prevents takeovers
    from a less severe context, e.g. to protect the panic CPU.
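
As a rough illustration of how these fields combine, the following
user-space sketch checks ownership and whether a context is severe enough
to displace the current owner. It only models the idea described above;
the demo_* names are hypothetical and the strict "higher priority only"
rule is an assumption made for the sketch, not a statement about the
exact policy in the acquire code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the priority levels used by the series. */
enum demo_prio {
	DEMO_PRIO_NONE = 0,
	DEMO_PRIO_NORMAL,
	DEMO_PRIO_EMERGENCY,
	DEMO_PRIO_PANIC,
};

/* Hypothetical subset of the state bits; field widths follow the patch. */
struct demo_state {
	unsigned int locked   : 1;
	unsigned int unsafe   : 1;
	unsigned int cur_prio : 2;
	unsigned int req_prio : 2;
	unsigned int cpu      : 18;
};

/* "Locked by me" means the locked bit is set and the cpu field matches. */
static bool demo_is_owner(struct demo_state s, unsigned int cpu)
{
	return s.locked && s.cpu == cpu;
}

/*
 * Assumed rule for this sketch: an unlocked console can always be taken,
 * a locked one only by a strictly more severe context. This is what keeps
 * a normal printk from displacing the panic CPU.
 */
static bool demo_may_displace(struct demo_state s, enum demo_prio prio)
{
	if (!s.locked)
		return true;

	return (unsigned int)prio > s.cur_prio;
}
```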
The acquire mechanism comes with several flavours:

  - Straightforward acquire when the console is not contended

  - Friendly handover mechanism based on a request/grant handshake

    The requesting context:

      1) Puts the desired handover state (CPU nr, prio)
         into a separate handover state

      2) Sets the 'req_prio' field in the real console state

      3) Waits (with a timeout) for the owning context to hand over

    The owning context:

      1) Observes the 'req_prio' field set

      2) Hands the console over to the requesting context by
         switching the console state to the handover state that was
         provided by the requester

  - Hostile takeover

      The new owner takes the console over without a handshake.

      This is required when friendly handovers are not possible,
      i.e. the higher priority context interrupted the owning context
      on the same CPU or the owning context is not able to make
      progress on a remote CPU.

The release is the counterpart which either releases the console
directly or hands it gracefully over to a requester.

All operations on console::atomic_state[CUR|REQ] are atomic
cmpxchg based to handle concurrency.

The acquire/release functions implement only minimal policies:

  - Preference for higher priority contexts
  - Protection of the panic CPU

All other policy decisions have to be made at the call sites.

The design allows implementing the well known:

    acquire()
    output_one_line()
    release()

algorithm, but also allows avoiding the per line acquire/release for
e.g. panic situations by doing the acquire once and then relying on
the panic CPU protection for the rest.
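
The request/grant handshake described above can be sketched in user space
as follows. This is a simplified model under stated assumptions, not the
code from this patch: it uses C11 atomics, a 32bit state word, no
timeouts, no panic handling and no hostile takeover, and every demo_*
name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the console state; not the kernel layout. */
union demo_state {
	uint32_t atom;
	struct {
		unsigned int locked   : 1;
		unsigned int cur_prio : 2;
		unsigned int req_prio : 2;
		unsigned int cpu      : 18;
	};
};

static _Atomic uint32_t demo_cur;	/* models atomic_state[CUR] */
static _Atomic uint32_t demo_req;	/* models atomic_state[REQ] */

static union demo_state demo_read(_Atomic uint32_t *p)
{
	union demo_state s = { .atom = atomic_load(p) };

	return s;
}

/* Direct acquire: succeeds only while the console is unlocked. */
static bool demo_try_acquire(unsigned int cpu, unsigned int prio)
{
	union demo_state old = demo_read(&demo_cur);
	union demo_state next = { .atom = 0 };

	next.locked = 1;
	next.cur_prio = prio;
	next.cpu = cpu;
	do {
		if (old.locked)
			return false;
	} while (!atomic_compare_exchange_strong(&demo_cur, &old.atom, next.atom));
	return true;
}

/* Requester, steps 1 and 2: publish the wanted state, then flag the owner. */
static bool demo_request_handover(unsigned int cpu, unsigned int prio)
{
	union demo_state req = demo_read(&demo_req);
	union demo_state cur, next, want = { .atom = 0 };

	want.locked = 1;
	want.cur_prio = prio;
	want.cpu = cpu;
	do {
		if (req.cur_prio >= prio)	/* same or higher priority waiter */
			return false;
	} while (!atomic_compare_exchange_strong(&demo_req, &req.atom, want.atom));

	cur = demo_read(&demo_cur);
	do {
		if (!cur.locked || cur.req_prio >= prio) {
			/* Give up: withdraw the request if it is still ours. */
			atomic_compare_exchange_strong(&demo_req, &want.atom, 0);
			return false;
		}
		next = cur;
		next.req_prio = prio;
	} while (!atomic_compare_exchange_strong(&demo_cur, &cur.atom, next.atom));
	return true;
}

/*
 * Owner side. Returns true on a plain unlock, false if ownership was lost
 * or the console was handed over to a waiting requester.
 */
static bool demo_release(unsigned int cpu)
{
	union demo_state old = demo_read(&demo_cur);
	union demo_state next;

	for (;;) {
		if (!old.locked || old.cpu != cpu)
			return false;

		next = demo_read(&demo_req);
		if (old.req_prio && next.atom) {
			/* Friendly handover: install the requester's state. */
			if (atomic_compare_exchange_strong(&demo_cur, &old.atom, next.atom))
				return false;
		} else {
			next.atom = 0;
			if (atomic_compare_exchange_strong(&demo_cur, &old.atom, next.atom))
				return true;
		}
	}
}
```

In this model the requester would poll demo_cur until its own CPU shows up
as the owner (step 3) and then clear demo_req so that later waiters can
queue. A caller following the acquire/output/release pattern would stop
printing as soon as demo_release() returns false, since the console then
belongs to another context.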
Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Tested-by: Daniel Thompson --- include/linux/console.h | 82 ++++++ kernel/printk/printk_nobkl.c | 531 +++++++++++++++++++++++++++++++++++ 2 files changed, 613 insertions(+) diff --git a/include/linux/console.h b/include/linux/console.h index b9d2ad580128..2c95fcc765e6 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -176,8 +176,20 @@ enum cons_flags { * @seq: Sequence for record tracking (64bit only) * @bits: Compound of the state bits below * + * @locked: Console is locked by a writer + * @unsafe: Console is busy in a non takeover region + * @cur_prio: The priority of the current output + * @req_prio: The priority of a handover request + * @cpu: The CPU on which the writer runs + * * To be used for state read and preparation of atomic_long_cmpxchg() * operations. + * + * The @req_prio field is particularly important to allow spin-waiting to + * timeout and give up without the risk of it being assigned the lock + * after giving up. The @req_prio field has a nice side-effect that it + * also makes it possible for a single read+cmpxchg in the common case of + * acquire and release. */ struct cons_state { union { @@ -189,12 +201,79 @@ struct cons_state { union { u32 bits; struct { + u32 locked : 1; + u32 unsafe : 1; + u32 cur_prio : 2; + u32 req_prio : 2; + u32 cpu : 18; }; }; }; }; }; =20 +/** + * cons_prio - console writer priority for NOBKL consoles + * @CONS_PRIO_NONE: Unused + * @CONS_PRIO_NORMAL: Regular printk + * @CONS_PRIO_EMERGENCY: Emergency output (WARN/OOPS...) + * @CONS_PRIO_PANIC: Panic output + * + * Emergency output can carefully takeover the console even without consent + * of the owner, ideally only when @cons_state::unsafe is not set. Panic + * output can ignore the unsafe flag as a last resort. If panic output is + * active no takeover is possible until the panic output releases the + * console. 
+ */ +enum cons_prio { + CONS_PRIO_NONE =3D 0, + CONS_PRIO_NORMAL, + CONS_PRIO_EMERGENCY, + CONS_PRIO_PANIC, +}; + +struct console; + +/** + * struct cons_context - Context for console acquire/release + * @console: The associated console + * @state: The state at acquire time + * @old_state: The old state when try_acquire() failed for analysis + * by the caller + * @hov_state: The handover state for spin and cleanup + * @req_state: The request state for spin and cleanup + * @spinwait_max_us: Limit for spinwait acquire + * @prio: Priority of the context + * @hostile: Hostile takeover requested. Cleared on normal + * acquire or friendly handover + * @spinwait: Spinwait on acquire if possible + */ +struct cons_context { + struct console *console; + struct cons_state state; + struct cons_state old_state; + struct cons_state hov_state; + struct cons_state req_state; + unsigned int spinwait_max_us; + enum cons_prio prio; + unsigned int hostile : 1; + unsigned int spinwait : 1; +}; + +/** + * struct cons_write_context - Context handed to the write callbacks + * @ctxt: The core console context + * @outbuf: Pointer to the text buffer for output + * @len: Length to write + * @unsafe: Invoked in unsafe state due to force takeover + */ +struct cons_write_context { + struct cons_context __private ctxt; + char *outbuf; + unsigned int len; + bool unsafe; +}; + /** * struct console - The console descriptor structure * @name: The name of the console driver @@ -364,6 +443,9 @@ static inline bool console_is_registered(const struct c= onsole *con) lockdep_assert_console_list_lock_held(); \ hlist_for_each_entry(con, &console_list, node) =20 +extern bool console_try_acquire(struct cons_write_context *wctxt); +extern bool console_release(struct cons_write_context *wctxt); + extern int console_set_on_cmdline; extern struct console *early_console; =20 diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c index 8df3626808dd..78136347a328 100644 --- 
a/kernel/printk/printk_nobkl.c +++ b/kernel/printk/printk_nobkl.c @@ -4,6 +4,7 @@ =20 #include #include +#include #include "internal.h" /* * Printk implementation for consoles that do not depend on the BKL style @@ -112,6 +113,536 @@ static inline bool cons_state_try_cmpxchg(struct cons= ole *con, &old->atom, new->atom); } =20 +/** + * cons_state_full_match - Check whether the full state matches + * @cur: The state to check + * @prev: The previous state + * + * Returns: True if matching, false otherwise. + * + * Check the full state including state::seq on 64bit. For take over + * detection. + */ +static inline bool cons_state_full_match(struct cons_state cur, + struct cons_state prev) +{ + /* + * req_prio can be set by a concurrent writer for friendly + * handover. Ignore it in the comparison. + */ + cur.req_prio =3D prev.req_prio; + return cur.atom =3D=3D prev.atom; +} + +/** + * cons_state_bits_match - Check for matching state bits + * @cur: The state to check + * @prev: The previous state + * + * Returns: True if state matches, false otherwise. + * + * Contrary to cons_state_full_match this checks only the bits and ignores + * a sequence change on 64bits. On 32bit the two functions are identical. + */ +static inline bool cons_state_bits_match(struct cons_state cur, struct con= s_state prev) +{ + /* + * req_prio can be set by a concurrent writer for friendly + * handover. Ignore it in the comparison. + */ + cur.req_prio =3D prev.req_prio; + return cur.bits =3D=3D prev.bits; +} + +/** + * cons_check_panic - Check whether a remote CPU is in panic + * + * Returns: True if a remote CPU is in panic, false otherwise. 
+ */ +static inline bool cons_check_panic(void) +{ + unsigned int pcpu =3D atomic_read(&panic_cpu); + + return pcpu !=3D PANIC_CPU_INVALID && pcpu !=3D smp_processor_id(); +} + +/** + * cons_cleanup_handover - Cleanup a handover request + * @ctxt: Pointer to acquire context + * + * @ctxt->hov_state contains the state to clean up + */ +static void cons_cleanup_handover(struct cons_context *ctxt) +{ + struct console *con =3D ctxt->console; + struct cons_state new; + + /* + * No loop required. Either hov_state is still the same or + * not. + */ + new.atom =3D 0; + cons_state_try_cmpxchg(con, CON_STATE_REQ, &ctxt->hov_state, &new); +} + +/** + * cons_setup_handover - Setup a handover request + * @ctxt: Pointer to acquire context + * + * Returns: True if a handover request was setup, false otherwise. + * + * On success @ctxt->hov_state contains the requested handover state + * + * On failure this context is not allowed to request a handover from the + * current owner. Reasons would be priority too low or a remote CPU in pan= ic. + * In both cases this context should give up trying to acquire the console. + */ +static bool cons_setup_handover(struct cons_context *ctxt) +{ + unsigned int cpu =3D smp_processor_id(); + struct console *con =3D ctxt->console; + struct cons_state old; + struct cons_state hstate =3D { + .locked =3D 1, + .cur_prio =3D ctxt->prio, + .cpu =3D cpu, + }; + + /* + * Try to store hstate in @con->atomic_state[REQ]. This might + * race with a higher priority waiter. + */ + cons_state_read(con, CON_STATE_REQ, &old); + do { + if (cons_check_panic()) + return false; + + /* Same or higher priority waiter exists? 
*/ + if (old.cur_prio >=3D ctxt->prio) + return false; + + } while (!cons_state_try_cmpxchg(con, CON_STATE_REQ, &old, &hstate)); + + /* Save that state for comparison in spinwait */ + copy_full_state(ctxt->hov_state, hstate); + return true; +} + +/** + * cons_setup_request - Setup a handover request in state[CUR] + * @ctxt: Pointer to acquire context + * @old: The state that was used to make the decision to spin wait + * + * Returns: True if a handover request was setup in state[CUR], false + * otherwise. + * + * On success @ctxt->req_state contains the request state that was set in + * state[CUR] + * + * On failure this context encountered unexpected state values. This + * context should retry the full handover request setup process (the + * handover request setup by cons_setup_handover() is now invalidated + * and must be performed again). + */ +static bool cons_setup_request(struct cons_context *ctxt, struct cons_stat= e old) +{ + struct console *con =3D ctxt->console; + struct cons_state cur; + struct cons_state new; + + /* Now set the request in state[CUR] */ + cons_state_read(con, CON_STATE_CUR, &cur); + do { + if (cons_check_panic()) + goto cleanup; + + /* Bit state changed vs. the decision to spinwait? */ + if (!cons_state_bits_match(cur, old)) + goto cleanup; + + /* + * A higher or equal priority context already setup a + * request? + */ + if (cur.req_prio >=3D ctxt->prio) + goto cleanup; + + /* Setup a request for handover. 
*/
+		copy_full_state(new, cur);
+		new.req_prio = ctxt->prio;
+	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &cur, &new));
+
+	/* Save that state for comparison in spinwait */
+	copy_bit_state(ctxt->req_state, new);
+	return true;
+
+cleanup:
+	cons_cleanup_handover(ctxt);
+	return false;
+}
+
+/**
+ * cons_try_acquire_spin - Complete the spinwait attempt
+ * @ctxt:	Pointer to an acquire context that contains
+ *		all information about the acquire mode
+ *
+ * @ctxt->hov_state contains the handover state that was set in
+ * state[REQ]
+ * @ctxt->req_state contains the request state that was set in
+ * state[CUR]
+ *
+ * Returns: 0 if successfully locked. -EBUSY on timeout. -EAGAIN on
+ * unexpected state values.
+ *
+ * On success @ctxt->state contains the new state that was set in
+ * state[CUR]
+ *
+ * On -EBUSY failure this context timed out. This context should either
+ * give up or attempt a hostile takeover.
+ *
+ * On -EAGAIN failure this context encountered unexpected state values.
+ * This context should retry the full handover request setup process (the
+ * handover request setup by cons_setup_handover() is now invalidated and
+ * must be performed again).
+ */
+static int cons_try_acquire_spin(struct cons_context *ctxt)
+{
+	struct console *con = ctxt->console;
+	struct cons_state cur;
+	struct cons_state new;
+	int err = -EAGAIN;
+	int timeout;
+
+	/* Now wait for the other side to hand over */
+	for (timeout = ctxt->spinwait_max_us; timeout >= 0; timeout--) {
+		/* Timeout immediately if a remote panic is detected. */
+		if (cons_check_panic())
+			break;
+
+		cons_state_read(con, CON_STATE_CUR, &cur);
+
+		/*
+		 * If the real state of the console matches the handover state
+		 * that this context setup, then the handover was a success
+		 * and this context is now the owner.
+		 *
+		 * Note that this might have raced with a new higher priority
+		 * requester coming in after the lock was handed over.
+ * However, that requester will see that the owner changes and + * setup a new request for the current owner (this context). + */ + if (cons_state_bits_match(cur, ctxt->hov_state)) + goto success; + + /* + * If state changed since the request was made, give up as + * it is no longer consistent. This must include + * state::req_prio since there could be a higher priority + * request available. + */ + if (cur.bits !=3D ctxt->req_state.bits) + goto cleanup; + + /* + * Finally check whether the handover state is still + * the same. + */ + cons_state_read(con, CON_STATE_REQ, &cur); + if (cur.atom !=3D ctxt->hov_state.atom) + goto cleanup; + + /* Account time */ + if (timeout > 0) + udelay(1); + } + + /* + * Timeout. Cleanup the handover state and carefully try to reset + * req_prio in the real state. The reset is important to ensure + * that the owner does not hand over the lock after this context + * has given up waiting. + */ + cons_cleanup_handover(ctxt); + + cons_state_read(con, CON_STATE_CUR, &cur); + do { + /* + * The timeout might have raced with the owner coming late + * and handing it over gracefully. + */ + if (cons_state_bits_match(cur, ctxt->hov_state)) + goto success; + + /* + * Validate that the state matches with the state at request + * time. If this check fails, there is already a higher + * priority context waiting or the owner has changed (either + * by higher priority or by hostile takeover). In all fail + * cases this context is no longer in line for a handover to + * take place, so no reset is necessary. + */ + if (cur.bits !=3D ctxt->req_state.bits) + goto cleanup; + + copy_full_state(new, cur); + new.req_prio =3D 0; + } while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &cur, &new)); + /* Reset worked. Report timeout. 
*/ + return -EBUSY; + +success: + /* Store the real state */ + copy_full_state(ctxt->state, cur); + ctxt->hostile =3D false; + err =3D 0; + +cleanup: + cons_cleanup_handover(ctxt); + return err; +} + +/** + * __cons_try_acquire - Try to acquire the console for printk output + * @ctxt: Pointer to an acquire context that contains + * all information about the acquire mode + * + * Returns: True if the acquire was successful. False on fail. + * + * In case of success @ctxt->state contains the acquisition + * state. + * + * In case of fail @ctxt->old_state contains the state + * that was read from @con->state for analysis by the caller. + */ +static bool __cons_try_acquire(struct cons_context *ctxt) +{ + unsigned int cpu =3D smp_processor_id(); + struct console *con =3D ctxt->console; + short flags =3D console_srcu_read_flags(con); + struct cons_state old; + struct cons_state new; + int err; + + if (WARN_ON_ONCE(!(flags & CON_NO_BKL))) + return false; +again: + cons_state_read(con, CON_STATE_CUR, &old); + + /* Preserve it for the caller and for spinwait */ + copy_full_state(ctxt->old_state, old); + + if (cons_check_panic()) + return false; + + /* Set up the new state for takeover */ + copy_full_state(new, old); + new.locked =3D 1; + new.cur_prio =3D ctxt->prio; + new.req_prio =3D CONS_PRIO_NONE; + new.cpu =3D cpu; + + /* Attempt to acquire it directly if unlocked */ + if (!old.locked) { + if (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new)) + goto again; + + ctxt->hostile =3D false; + copy_full_state(ctxt->state, new); + goto success; + } + + /* + * If the active context is on the same CPU then there is + * obviously no handshake possible. + */ + if (old.cpu =3D=3D cpu) + goto check_hostile; + + /* + * If a handover request with same or higher priority is already + * pending then this context cannot setup a handover request. 
+	 */
+	if (old.req_prio >= ctxt->prio)
+		goto check_hostile;
+
+	/*
+	 * If the caller did not request spin-waiting then performing a
+	 * handover is not an option.
+	 */
+	if (!ctxt->spinwait)
+		goto check_hostile;
+
+	/*
+	 * Setup the request in state[REQ]. If this fails then this
+	 * context is not allowed to setup a handover request.
+	 */
+	if (!cons_setup_handover(ctxt))
+		goto check_hostile;
+
+	/*
+	 * Setup the request in state[CUR]. Hand in the state that was
+	 * used to make the decision to spinwait above, for comparison. If
+	 * this fails then unexpected state values were encountered and the
+	 * full request setup process is retried.
+	 */
+	if (!cons_setup_request(ctxt, old))
+		goto again;
+
+	/*
+	 * Spin-wait to acquire the console. If this fails then unexpected
+	 * state values were encountered (for example, a hostile takeover by
+	 * another context) and the full request setup process is retried.
+	 */
+	err = cons_try_acquire_spin(ctxt);
+	if (err) {
+		if (err == -EAGAIN)
+			goto again;
+		goto check_hostile;
+	}
+success:
+	/* Common updates on success */
+	return true;
+
+check_hostile:
+	if (!ctxt->hostile)
+		return false;
+
+	if (cons_check_panic())
+		return false;
+
+	if (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new))
+		goto again;
+
+	copy_full_state(ctxt->state, new);
+	goto success;
+}
+
+/**
+ * cons_try_acquire - Try to acquire the console for printk output
+ * @ctxt: Pointer to an acquire context that contains
+ *        all information about the acquire mode
+ *
+ * Returns: True if the acquire was successful. False on fail.
+ *
+ * In case of success @ctxt->state contains the acquisition
+ * state.
+ *
+ * In case of fail @ctxt->old_state contains the state
+ * that was read from @con->state for analysis by the caller.
+ */
+static bool cons_try_acquire(struct cons_context *ctxt)
+{
+	if (__cons_try_acquire(ctxt))
+		return true;
+
+	ctxt->state.atom = 0;
+	return false;
+}
+
+/**
+ * __cons_release - Release the console after output is done
+ * @ctxt: The acquire context that contains the state
+ *        at cons_try_acquire()
+ *
+ * Returns: True if the release was regular
+ *
+ * False if the console is in unusable state or was handed over
+ * with handshake or taken over hostile without handshake.
+ *
+ * The return value tells the caller whether it needs to evaluate further
+ * printing.
+ */
+static bool __cons_release(struct cons_context *ctxt)
+{
+	struct console *con = ctxt->console;
+	short flags = console_srcu_read_flags(con);
+	struct cons_state hstate;
+	struct cons_state old;
+	struct cons_state new;
+
+	if (WARN_ON_ONCE(!(flags & CON_NO_BKL)))
+		return false;
+
+	cons_state_read(con, CON_STATE_CUR, &old);
+again:
+	if (!cons_state_bits_match(old, ctxt->state))
+		return false;
+
+	/* Release it directly when no handover request is pending. */
+	if (!old.req_prio)
+		goto unlock;
+
+	/* Read the handover target state */
+	cons_state_read(con, CON_STATE_REQ, &hstate);
+
+	/* If the waiter gave up hstate is 0 */
+	if (!hstate.atom)
+		goto unlock;
+
+	/*
+	 * If a higher priority waiter raced against a lower priority
+	 * waiter then unlock instead of handing over to either. The
+	 * higher priority waiter will notice the updated state and
+	 * retry.
+	 */
+	if (hstate.cur_prio != old.req_prio)
+		goto unlock;
+
+	/* Switch the state and preserve the sequence on 64bit */
+	copy_bit_state(new, hstate);
+	copy_seq_state64(new, old);
+	if (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new))
+		goto again;
+
+	return true;
+
+unlock:
+	/* Clear the state and preserve the sequence on 64bit */
+	new.atom = 0;
+	copy_seq_state64(new, old);
+	if (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new))
+		goto again;
+
+	return true;
+}
+
+/**
+ * cons_release - Release the console after output is done
+ * @ctxt: The acquire context that contains the state
+ *        at cons_try_acquire()
+ *
+ * Returns: True if the release was regular
+ *
+ * False if the console is in unusable state or was handed over
+ * with handshake or taken over hostile without handshake.
+ *
+ * The return value tells the caller whether it needs to evaluate further
+ * printing.
+ */
+static bool cons_release(struct cons_context *ctxt)
+{
+	bool ret = __cons_release(ctxt);
+
+	ctxt->state.atom = 0;
+	return ret;
+}
+
+bool console_try_acquire(struct cons_write_context *wctxt)
+{
+	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+
+	return cons_try_acquire(ctxt);
+}
+EXPORT_SYMBOL(console_try_acquire);
+
+bool console_release(struct cons_write_context *wctxt)
+{
+	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+
+	return cons_release(ctxt);
+}
+EXPORT_SYMBOL(console_release);
+
 /**
  * cons_nobkl_init - Initialize the NOBKL console specific data
  * @con: Console to initialize
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org, Greg Kroah-Hartman
Subject: [PATCH printk v1 07/18] printk: nobkl: Add buffer management
Date: Thu, 2 Mar 2023 21:02:07 +0106
Message-Id: <20230302195618.156940-8-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References:
 <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

In case of hostile takeovers it must be ensured that the previous
owner cannot scribble over the output buffer of the emergency/panic
context. This is achieved by:

 - Adding a global output buffer instance for early boot (pre per CPU
   data being available).

 - Allocating an output buffer per console for threaded printers once
   printer threads become available.

 - Allocating per CPU output buffers per console for printing from
   all contexts not covered by the other buffers.

 - Choosing the appropriate buffer is handled in the acquire/release
   functions.

The output buffer is wrapped into a separate data structure so other
context related fields can be added in later steps.

Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 include/linux/console.h      | 13 ++++++
 kernel/printk/internal.h     | 22 +++++++--
 kernel/printk/printk.c       | 26 +++++++----
 kernel/printk/printk_nobkl.c | 90 +++++++++++++++++++++++++++++++++++-
 4 files changed, 137 insertions(+), 14 deletions(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 2c95fcc765e6..3d989104240f 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -178,6 +178,7 @@ enum cons_flags {
  *
  * @locked: Console is locked by a writer
  * @unsafe: Console is busy in a non takeover region
+ * @thread: Current owner is the printk thread
  * @cur_prio: The priority of the current output
  * @req_prio: The priority of a handover request
  * @cpu: The CPU on which the writer runs
@@ -203,6 +204,7 @@ struct cons_state {
 					struct {
 						u32 locked : 1;
 						u32 unsafe : 1;
+						u32 thread : 1;
 						u32 cur_prio : 2;
 						u32 req_prio : 2;
 						u32 cpu : 18;
@@ -233,6 +235,7 @@ enum cons_prio {
 };
 
 struct console;
+struct printk_buffers;
 
 /**
  * struct cons_context - Context for console acquire/release
@@ -244,6 +247,8 @@ struct console;
  * @req_state: The request state for spin and cleanup
  * @spinwait_max_us: Limit for spinwait acquire
  * @prio: Priority of the context
+ * @pbufs: Pointer to the text buffer for this context
+ * @thread: The acquire is printk thread context
  * @hostile: Hostile takeover requested. Cleared on normal
  *           acquire or friendly handover
  * @spinwait: Spinwait on acquire if possible
@@ -256,6 +261,8 @@ struct cons_context {
 	struct cons_state req_state;
 	unsigned int spinwait_max_us;
 	enum cons_prio prio;
+	struct printk_buffers *pbufs;
+	unsigned int thread : 1;
 	unsigned int hostile : 1;
 	unsigned int spinwait : 1;
 };
@@ -274,6 +281,8 @@ struct cons_write_context {
 	bool unsafe;
 };
 
+struct cons_context_data;
+
 /**
  * struct console - The console descriptor structure
  * @name: The name of the console driver
@@ -295,6 +304,8 @@ struct cons_write_context {
  * @node: hlist node for the console list
  *
  * @atomic_state: State array for NOBKL consoles; real and handover
+ * @thread_pbufs: Pointer to thread private buffer
+ * @pcpu_data: Pointer to percpu context data
  */
 struct console {
 	char name[16];
@@ -317,6 +328,8 @@ struct console {
 
 	/* NOBKL console specific members */
 	atomic_long_t __private atomic_state[2];
+	struct printk_buffers *thread_pbufs;
+	struct cons_context_data __percpu *pcpu_data;
 };
 
 #ifdef CONFIG_LOCKDEP
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index da380579263b..61ecdde5c872 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -13,8 +13,13 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
 #define printk_sysctl_init() do { } while (0)
 #endif
 
-#ifdef CONFIG_PRINTK
+#define con_printk(lvl, con, fmt, ...) \
+	printk(lvl pr_fmt("%s%sconsole [%s%d] " fmt), \
+		(con->flags & CON_NO_BKL) ? "" : "legacy ", \
+		(con->flags & CON_BOOT) ? "boot" : "", \
+		con->name, con->index, ##__VA_ARGS__)
 
+#ifdef CONFIG_PRINTK
 #ifdef CONFIG_PRINTK_CALLER
 #define PRINTK_PREFIX_MAX 48
 #else
@@ -64,7 +69,8 @@ u16 printk_parse_prefix(const char *text, int *level,
 			enum printk_info_flags *flags);
 
 void cons_nobkl_cleanup(struct console *con);
-void cons_nobkl_init(struct console *con);
+bool cons_nobkl_init(struct console *con);
+bool cons_alloc_percpu_data(struct console *con);
 
 #else
 
@@ -81,7 +87,7 @@ void cons_nobkl_init(struct console *con);
 #define printk_safe_exit_irqrestore(flags) local_irq_restore(flags)
 
 static inline bool printk_percpu_data_ready(void) { return false; }
-static inline void cons_nobkl_init(struct console *con) { }
+static inline bool cons_nobkl_init(struct console *con) { return true; }
 static inline void cons_nobkl_cleanup(struct console *con) { }
 
 #endif /* CONFIG_PRINTK */
@@ -113,3 +119,13 @@ struct printk_message {
 	u64 seq;
 	unsigned long dropped;
 };
+
+/**
+ * struct cons_context_data - console context data
+ * @pbufs: Buffer for storing the text
+ *
+ * Used for early boot and for per CPU data.
+ */
+struct cons_context_data {
+	struct printk_buffers pbufs;
+};
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b2c7c92c3d79..3abefdead7ae 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -459,6 +459,8 @@ static bool have_bkl_console;
  */
 bool have_boot_console;
 
+static int unregister_console_locked(struct console *console);
+
 #ifdef CONFIG_PRINTK
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
 /* All 3 protected by @syslog_lock. */
@@ -1117,7 +1119,19 @@ static inline void log_buf_add_cpu(void) {}
 
 static void __init set_percpu_data_ready(void)
 {
+	struct hlist_node *tmp;
+	struct console *con;
+
+	console_list_lock();
+
+	hlist_for_each_entry_safe(con, tmp, &console_list, node) {
+		if (!cons_alloc_percpu_data(con))
+			unregister_console_locked(con);
+	}
+
 	__printk_percpu_data_ready = true;
+
+	console_list_unlock();
 }
 
 static unsigned int __init add_to_rb(struct printk_ringbuffer *rb,
@@ -3329,12 +3343,6 @@ static void try_enable_default_console(struct console *newcon)
 	newcon->flags |= CON_CONSDEV;
 }
 
-#define con_printk(lvl, con, fmt, ...) \
-	printk(lvl pr_fmt("%s%sconsole [%s%d] " fmt), \
-		(con->flags & CON_NO_BKL) ? "" : "legacy ", \
-		(con->flags & CON_BOOT) ? "boot" : "", \
-		con->name, con->index, ##__VA_ARGS__)
-
 static void console_init_seq(struct console *newcon, bool bootcon_registered)
 {
 	struct console *con;
@@ -3399,8 +3407,6 @@ static void console_init_seq(struct console *newcon, bool bootcon_registered)
 #define console_first() \
 	hlist_entry(console_list.first, struct console, node)
 
-static int unregister_console_locked(struct console *console);
-
 /*
  * The console driver calls this routine during kernel initialization
  * to register the console printing procedure with printk() and to
@@ -3494,8 +3500,8 @@ void register_console(struct console *newcon)
 
 	if (!(newcon->flags & CON_NO_BKL))
 		have_bkl_console = true;
-	else
-		cons_nobkl_init(newcon);
+	else if (!cons_nobkl_init(newcon))
+		goto unlock;
 
 	if (newcon->flags & CON_BOOT)
 		have_boot_console = true;
diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c
index 78136347a328..7db56ffd263a 100644
--- a/kernel/printk/printk_nobkl.c
+++ b/kernel/printk/printk_nobkl.c
@@ -166,6 +166,47 @@ static inline bool cons_check_panic(void)
 	return pcpu != PANIC_CPU_INVALID && pcpu != smp_processor_id();
 }
 
+static struct cons_context_data early_cons_ctxt_data __initdata;
+
+/**
+ * cons_context_set_pbufs - Set the output text buffer for the current context
+ * @ctxt: Pointer to the acquire context
+ *
+ * Buffer selection:
+ *   1) Early boot uses the global (initdata) buffer
+ *   2) Printer threads use the dynamically allocated per-console buffers
+ *   3) All other contexts use the per CPU buffers
+ *
+ * This guarantees that there is no concurrency on the output records ever.
+ * Early boot and per CPU nesting is not a problem. The takeover logic
+ * tells the interrupted context that the buffer has been overwritten.
+ *
+ * There are two critical regions that matter:
+ *
+ * 1) Context is filling the buffer with a record. After interruption
+ *    it continues to sprintf() the record and before it goes to
+ *    write it out, it checks the state, notices the takeover, discards
+ *    the content and backs out.
+ *
+ * 2) Context is in a unsafe critical region in the driver. After
+ *    interruption it might read overwritten data from the output
+ *    buffer. When it leaves the critical region it notices and backs
+ *    out. Hostile takeovers in driver critical regions are best effort
+ *    and there is not much that can be done about that.
+ */
+static __ref void cons_context_set_pbufs(struct cons_context *ctxt)
+{
+	struct console *con = ctxt->console;
+
+	/* Thread context or early boot? */
+	if (ctxt->thread)
+		ctxt->pbufs = con->thread_pbufs;
+	else if (!con->pcpu_data)
+		ctxt->pbufs = &early_cons_ctxt_data.pbufs;
+	else
+		ctxt->pbufs = &(this_cpu_ptr(con->pcpu_data)->pbufs);
+}
+
 /**
  * cons_cleanup_handover - Cleanup a handover request
  * @ctxt: Pointer to acquire context
@@ -501,6 +542,7 @@ static bool __cons_try_acquire(struct cons_context *ctxt)
 	}
 success:
 	/* Common updates on success */
+	cons_context_set_pbufs(ctxt);
 	return true;
 
 check_hostile:
@@ -623,6 +665,9 @@ static bool cons_release(struct cons_context *ctxt)
 {
 	bool ret = __cons_release(ctxt);
 
+	/* Invalidate the buffer pointer. It is no longer valid. */
+	ctxt->pbufs = NULL;
+
 	ctxt->state.atom = 0;
 	return ret;
 }
@@ -643,16 +688,58 @@ bool console_release(struct cons_write_context *wctxt)
 }
 EXPORT_SYMBOL(console_release);
 
+/**
+ * cons_alloc_percpu_data - Allocate percpu data for a console
+ * @con: Console to allocate for
+ *
+ * Returns: True on success. False otherwise and the console cannot be used.
+ *
+ * If it is not yet possible to allocate per CPU data, success is returned.
+ * When per CPU data becomes possible, set_percpu_data_ready() will call
+ * this function again for all registered consoles.
+ */
+bool cons_alloc_percpu_data(struct console *con)
+{
+	if (!printk_percpu_data_ready())
+		return true;
+
+	con->pcpu_data = alloc_percpu(typeof(*con->pcpu_data));
+	if (con->pcpu_data)
+		return true;
+
+	con_printk(KERN_WARNING, con, "failed to allocate percpu buffers\n");
+	return false;
+}
+
+/**
+ * cons_free_percpu_data - Free percpu data of a console on unregister
+ * @con: Console to clean up
+ */
+static void cons_free_percpu_data(struct console *con)
+{
+	if (!con->pcpu_data)
+		return;
+
+	free_percpu(con->pcpu_data);
+	con->pcpu_data = NULL;
+}
+
 /**
  * cons_nobkl_init - Initialize the NOBKL console specific data
  * @con: Console to initialize
+ *
+ * Returns: True on success. False otherwise and the console cannot be used.
  */
-void cons_nobkl_init(struct console *con)
+bool cons_nobkl_init(struct console *con)
 {
 	struct cons_state state = { };
 
+	if (!cons_alloc_percpu_data(con))
+		return false;
+
 	cons_state_set(con, CON_STATE_CUR, &state);
 	cons_state_set(con, CON_STATE_REQ, &state);
+	return true;
 }
 
 /**
@@ -665,4 +752,5 @@ void cons_nobkl_cleanup(struct console *con)
 
 	cons_state_set(con, CON_STATE_CUR, &state);
 	cons_state_set(con, CON_STATE_REQ, &state);
+	cons_free_percpu_data(con);
 }
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org, Greg Kroah-Hartman
Subject: [PATCH printk v1 08/18] printk: nobkl: Add sequence handling
Date: Thu, 2 Mar 2023 21:02:08 +0106
Message-Id: <20230302195618.156940-9-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

On 64bit systems the sequence tracking is embedded into the atomic
console state, on 32bit it has to be stored in a separate atomic
member. The latter needs to handle the non-atomicity in hostile
takeover cases, while 64bit can completely rely on the state
atomicity.

The ringbuffer sequence number is 64bit, but having a 32bit
representation in the console is sufficient. If a console ever gets
more than 2^31 records behind the ringbuffer then this is the least
of the problems.

On acquire() the atomic 32bit sequence number is expanded to 64 bit
by folding the ringbuffer's sequence into it carefully.
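
For readers who want to try the folding arithmetic outside the kernel, here is a minimal user-space sketch of the expansion step. It mirrors the expression used by cons_expand_seq() in this patch, but the function name expand_seq() and the explicit rb_next_seq parameter are assumptions made for illustration; the kernel helper reads that value via prb_next_seq(prb) instead. The sketch assumes the console is never ahead of the ringbuffer.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for cons_expand_seq(): expand the 32bit console
 * sequence to 64bit. The ringbuffer's next sequence number is passed in
 * as a parameter here (an assumption for illustration only).
 */
static uint64_t expand_seq(uint32_t seq32, uint64_t rb_next_seq)
{
	/*
	 * Number of records the console is behind the ringbuffer head,
	 * computed modulo 2^32 so that a wrap of the lower 32 bits
	 * between the two values is handled.
	 */
	uint32_t behind = (uint32_t)rb_next_seq - seq32;

	/* Step back from the 64bit head by that distance. */
	return rb_next_seq - behind;
}
```

Because the distance is computed modulo 2^32, a console value of 0xFFFFFFF0 with the ringbuffer at 0x100000010 still resolves to 0xFFFFFFF0 rather than to a sequence in the wrong 32bit epoch.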

Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 include/linux/console.h      |   8 ++
 kernel/printk/internal.h     |   4 +
 kernel/printk/printk.c       |  61 +++++++---
 kernel/printk/printk_nobkl.c | 224 +++++++++++++++++++++++++++++++++++
 4 files changed, 280 insertions(+), 17 deletions(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 3d989104240f..942cc7f57798 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -246,6 +246,8 @@ struct printk_buffers;
  * @hov_state: The handover state for spin and cleanup
  * @req_state: The request state for spin and cleanup
  * @spinwait_max_us: Limit for spinwait acquire
+ * @oldseq: The sequence number at acquire()
+ * @newseq: The sequence number for progress
  * @prio: Priority of the context
  * @pbufs: Pointer to the text buffer for this context
  * @thread: The acquire is printk thread context
@@ -259,6 +261,8 @@ struct cons_context {
 	struct cons_state old_state;
 	struct cons_state hov_state;
 	struct cons_state req_state;
+	u64 oldseq;
+	u64 newseq;
 	unsigned int spinwait_max_us;
 	enum cons_prio prio;
 	struct printk_buffers *pbufs;
@@ -304,6 +308,7 @@ struct cons_context_data;
  * @node: hlist node for the console list
  *
  * @atomic_state: State array for NOBKL consoles; real and handover
+ * @atomic_seq: Sequence for record tracking (32bit only)
  * @thread_pbufs: Pointer to thread private buffer
  * @pcpu_data: Pointer to percpu context data
  */
@@ -328,6 +333,9 @@ struct console {
 
 	/* NOBKL console specific members */
 	atomic_long_t __private atomic_state[2];
+#ifndef CONFIG_64BIT
+	atomic_t __private atomic_seq;
+#endif
 	struct printk_buffers *thread_pbufs;
 	struct cons_context_data __percpu *pcpu_data;
 };
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 61ecdde5c872..15a412065327 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -4,6 +4,7 @@
  */
 #include
 #include
+#include "printk_ringbuffer.h"
 
 #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
 void __init printk_sysctl_init(void);
@@ -41,6 +42,8 @@ enum printk_info_flags {
 	LOG_CONT = 8, /* text is a fragment of a continuation line */
 };
 
+extern struct printk_ringbuffer *prb;
+
 __printf(4, 0)
 int vprintk_store(int facility, int level,
 		  const struct dev_printk_info *dev_info,
@@ -68,6 +71,7 @@ void defer_console_output(void);
 u16 printk_parse_prefix(const char *text, int *level,
 			enum printk_info_flags *flags);
 
+u64 cons_read_seq(struct console *con);
 void cons_nobkl_cleanup(struct console *con);
 bool cons_nobkl_init(struct console *con);
 bool cons_alloc_percpu_data(struct console *con);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 3abefdead7ae..21b31183ff2b 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -511,7 +511,7 @@ _DEFINE_PRINTKRB(printk_rb_static, CONFIG_LOG_BUF_SHIFT - PRB_AVGBITS,
 
 static struct printk_ringbuffer printk_rb_dynamic;
 
-static struct printk_ringbuffer *prb = &printk_rb_static;
+struct printk_ringbuffer *prb = &printk_rb_static;
 
 /*
  * We cannot access per-CPU data (e.g. per-CPU flush irq_work) before
@@ -2728,30 +2728,39 @@ static bool abandon_console_lock_in_panic(void)
 
 /*
  * Check if the given console is currently capable and allowed to print
- * records.
- *
- * Requires the console_srcu_read_lock.
+ * records. If the caller only works with certain types of consoles, the
+ * caller is responsible for checking the console type before calling
+ * this function.
  */
-static inline bool console_is_usable(struct console *con)
+static inline bool console_is_usable(struct console *con, short flags)
 {
-	short flags = console_srcu_read_flags(con);
-
 	if (!(flags & CON_ENABLED))
 		return false;
 
 	if ((flags & CON_SUSPENDED))
 		return false;
 
-	if (!con->write)
-		return false;
-
 	/*
-	 * Console drivers may assume that per-cpu resources have been
-	 * allocated. So unless they're explicitly marked as being able to
-	 * cope (CON_ANYTIME) don't call them until this CPU is officially up.
+	 * The usability of a console varies depending on whether
+	 * it is a NOBKL console or not.
 	 */
-	if (!cpu_online(raw_smp_processor_id()) && !(flags & CON_ANYTIME))
-		return false;
+
+	if (flags & CON_NO_BKL) {
+		if (have_boot_console)
+			return false;
+
+	} else {
+		if (!con->write)
+			return false;
+		/*
+		 * Console drivers may assume that per-cpu resources have
+		 * been allocated. So unless they're explicitly marked as
+		 * being able to cope (CON_ANYTIME) don't call them until
+		 * this CPU is officially up.
+		 */
+		if (!cpu_online(raw_smp_processor_id()) && !(flags & CON_ANYTIME))
+			return false;
+	}
 
 	return true;
 }
@@ -3001,9 +3010,14 @@ static bool console_flush_all(bool do_cond_resched, u64 *next_seq, bool *handove
 
 	cookie = console_srcu_read_lock();
 	for_each_console_srcu(con) {
+		short flags = console_srcu_read_flags(con);
 		bool progress;
 
-		if (!console_is_usable(con))
+		/* console_flush_all() is only for legacy consoles. */
+		if (flags & CON_NO_BKL)
+			continue;
+
+		if (!console_is_usable(con, flags))
 			continue;
 		any_usable = true;
 
@@ -3775,10 +3789,23 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
 
 	cookie = console_srcu_read_lock();
 	for_each_console_srcu(c) {
+		short flags;
+
 		if (con && con != c)
 			continue;
-		if (!console_is_usable(c))
+
+		flags = console_srcu_read_flags(c);
+
+		if (!console_is_usable(c, flags))
 			continue;
+
+		/*
+		 * Since the console is locked, use this opportunity
+		 * to update console->seq for NOBKL consoles.
+		 */
+		if (flags & CON_NO_BKL)
+			c->seq = cons_read_seq(c);
+
 		printk_seq = c->seq;
 		if (printk_seq < seq)
 			diff += seq - printk_seq;
diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c
index 7db56ffd263a..7184a93a5b0d 100644
--- a/kernel/printk/printk_nobkl.c
+++ b/kernel/printk/printk_nobkl.c
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include "printk_ringbuffer.h"
 #include "internal.h"
 /*
  * Printk implementation for consoles that do not depend on the BKL style
@@ -207,6 +208,227 @@ static __ref void cons_context_set_pbufs(struct cons_context *ctxt)
 		ctxt->pbufs = &(this_cpu_ptr(con->pcpu_data)->pbufs);
 }
 
+/**
+ * cons_seq_init - Helper function to initialize the console sequence
+ * @con: Console to work on
+ *
+ * Set @con->atomic_seq to the starting record, or if that record no
+ * longer exists, the oldest available record. For init only. Do not
+ * use for runtime updates.
+ */
+static void cons_seq_init(struct console *con)
+{
+	u32 seq = (u32)max_t(u64, con->seq, prb_first_valid_seq(prb));
+#ifdef CONFIG_64BIT
+	struct cons_state state;
+
+	cons_state_read(con, CON_STATE_CUR, &state);
+	state.seq = seq;
+	cons_state_set(con, CON_STATE_CUR, &state);
+#else
+	atomic_set(&ACCESS_PRIVATE(con, atomic_seq), seq);
+#endif
+}
+
+static inline u64 cons_expand_seq(u64 seq)
+{
+	u64 rbseq;
+
+	/*
+	 * The provided sequence is only the lower 32bits of the ringbuffer
+	 * sequence. It needs to be expanded to 64bit. Get the next sequence
+	 * number from the ringbuffer and fold it.
+	 */
+	rbseq = prb_next_seq(prb);
+	seq = rbseq - ((u32)rbseq - (u32)seq);
+
+	return seq;
+}
+
+/**
+ * cons_read_seq - Read the current console sequence
+ * @con: Console to read the sequence of
+ *
+ * Returns: Sequence number of the next record to print on @con.
+ */
+u64 cons_read_seq(struct console *con)
+{
+	u64 seq;
+#ifdef CONFIG_64BIT
+	struct cons_state state;
+
+	cons_state_read(con, CON_STATE_CUR, &state);
+	seq = state.seq;
+#else
+	seq = atomic_read(&ACCESS_PRIVATE(con, atomic_seq));
+#endif
+	return cons_expand_seq(seq);
+}
+
+/**
+ * cons_context_set_seq - Setup the context with the next sequence to print
+ * @ctxt: Pointer to an acquire context that contains
+ *        all information about the acquire mode
+ *
+ * On return the retrieved sequence number is stored in ctxt->oldseq.
+ *
+ * The sequence number is safe in forceful takeover situations.
+ *
+ * Either the writer succeeded to update before it got interrupted
+ * or it failed. In the latter case the takeover will print the
+ * same line again.
+ *
+ * The sequence is only the lower 32bits of the ringbuffer sequence. The
+ * ringbuffer must be 2^31 records ahead to get out of sync. This needs
+ * some care when starting a console, i.e setting the sequence to 0 is
+ * wrong. It has to be set to the oldest valid sequence in the ringbuffer
+ * as that cannot be more than 2^31 records away
+ *
+ * On 64bit the 32bit sequence is part of console::state, which is saved
+ * in @ctxt->state. This prevents the 32bit update race.
+ */
+static void cons_context_set_seq(struct cons_context *ctxt)
+{
+#ifdef CONFIG_64BIT
+	ctxt->oldseq = ctxt->state.seq;
+#else
+	ctxt->oldseq = atomic_read(&ACCESS_PRIVATE(ctxt->console, atomic_seq));
+#endif
+	ctxt->oldseq = cons_expand_seq(ctxt->oldseq);
+	ctxt->newseq = ctxt->oldseq;
+}
+
+/**
+ * cons_seq_try_update - Try to update the console sequence number
+ * @ctxt: Pointer to an acquire context that contains
+ *        all information about the acquire mode
+ *
+ * Returns: True if the console sequence was updated, false otherwise.
+ *
+ * Internal helper as the logic is different on 32bit and 64bit.
+ *
+ * On 32 bit the sequence is separate from state and therefore
+ * subject to a subtle race in the case of hostile takeovers.
+ *
+ * On 64 bit the sequence is part of the state and therefore safe
+ * vs. hostile takeovers.
+ *
+ * In case of fail the console has been taken over and @ctxt is
+ * invalid. Caller has to reacquire the console.
+ */
+#ifdef CONFIG_64BIT
+static bool __maybe_unused cons_seq_try_update(struct cons_context *ctxt)
+{
+	struct console *con = ctxt->console;
+	struct cons_state old;
+	struct cons_state new;
+
+	cons_state_read(con, CON_STATE_CUR, &old);
+	do {
+		/* Make sure this context is still the owner. */
+		if (!cons_state_bits_match(old, ctxt->state))
+			return false;
+
+		/* Preserve bit state */
+		copy_bit_state(new, old);
+		new.seq = ctxt->newseq;
+
+		/*
+		 * Can race with hostile takeover or with a handover
+		 * request.
+		 */
+	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new));
+
+	copy_full_state(ctxt->state, new);
+	ctxt->oldseq = ctxt->newseq;
+
+	return true;
+}
+#else
+static bool cons_release(struct cons_context *ctxt);
+static bool __maybe_unused cons_seq_try_update(struct cons_context *ctxt)
+{
+	struct console *con = ctxt->console;
+	struct cons_state state;
+	int pcpu;
+	u32 old;
+	u32 new;
+
+	/*
+	 * There is a corner case that needs to be considered here:
+	 *
+	 * CPU0                         CPU1
+	 * printk()
+	 *  acquire()                   -> emergency
+	 *  write()                        acquire()
+	 *  update_seq()
+	 *    state == OK
+	 * --> NMI
+	 *                                 takeover()
+	 * <---                              write()
+	 *  cmpxchg() succeeds               update_seq()
+	 *                                   cmpxchg() fails
+	 *
+	 * There is nothing that can be done about this other than having
+	 * yet another state bit that needs to be tracked and analyzed,
+	 * but fails to cover the problem completely.
+	 *
+	 * No other scenarios expose such a problem. On same CPU takeovers
+	 * the cmpxchg() always fails on the interrupted context after the
+	 * interrupting context finished printing, but that's fine as it
+	 * does not own the console anymore. The state check after the
+	 * failed cmpxchg prevents that.
+	 */
+	cons_state_read(con, CON_STATE_CUR, &state);
+	/* Make sure this context is still the owner. */
+	if (!cons_state_bits_match(state, ctxt->state))
+		return false;
+
+	/*
+	 * Get the original sequence number that was retrieved
+	 * from @con->atomic_seq. @con->atomic_seq should be still
+	 * the same. 32bit truncates. See cons_context_set_seq().
+	 */
+	old = (u32)ctxt->oldseq;
+	new = (u32)ctxt->newseq;
+	if (atomic_try_cmpxchg(&ACCESS_PRIVATE(con, atomic_seq), &old, new)) {
+		ctxt->oldseq = ctxt->newseq;
+		return true;
+	}
+
+	/*
+	 * Reread the state. If this context does not own the console anymore
+	 * then it cannot touch the sequence again.
+	 */
+	cons_state_read(con, CON_STATE_CUR, &state);
+	if (!cons_state_bits_match(state, ctxt->state))
+		return false;
+
+	pcpu = atomic_read(&panic_cpu);
+	if (pcpu == smp_processor_id()) {
+		/*
+		 * This is the panic CPU. Emitting a warning here does not
+		 * help at all. The callchain is clear and the priority is
+		 * to get the messages out. In the worst case duplicated
+		 * ones. That's a job for postprocessing.
+		 */
+		atomic_set(&ACCESS_PRIVATE(con, atomic_seq), new);
+		ctxt->oldseq = ctxt->newseq;
+		return true;
+	}
+
+	/*
+	 * Only emit a warning when this happens outside of a panic
+	 * situation as on panic it's neither useful nor helping to let the
+	 * panic CPU get the important stuff out.
+ */ + WARN_ON_ONCE(pcpu =3D=3D PANIC_CPU_INVALID); + + cons_release(ctxt); + return false; +} +#endif + /** * cons_cleanup_handover - Cleanup a handover request * @ctxt: Pointer to acquire context @@ -542,6 +764,7 @@ static bool __cons_try_acquire(struct cons_context *ctx= t) } success: /* Common updates on success */ + cons_context_set_seq(ctxt); cons_context_set_pbufs(ctxt); return true; =20 @@ -739,6 +962,7 @@ bool cons_nobkl_init(struct console *con) =20 cons_state_set(con, CON_STATE_CUR, &state); cons_state_set(con, CON_STATE_REQ, &state); + cons_seq_init(con); return true; } =20 --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8C1FC87FDD for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230213AbjCBT6V (ORCPT ); Thu, 2 Mar 2023 14:58:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230000AbjCBT5u (ORCPT ); Thu, 2 Mar 2023 14:57:50 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EA2CF474E8 for ; Thu, 2 Mar 2023 11:57:47 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787065; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j2fMZu7G+YK9eneabidoKIltYZn7KfVS0Zt0k2ab0y0=; b=3nCVLeHqSTGTltSXJiVNkoE1q8fdIPyjeuYWyzSd4c7S1+AztCDDJIZRwc0oNa7iLmCJQm k1Er7/aw3qWpZ5X9i3hP/KFf5rWAlU1SVOB9NQiRAwwJFny8SQTVfRzkBeGk4Tk1GBEeL9 
+R6vWhYToMqDU/WTBI2pnHuRXnFZ8xwCV+0FIV1mnDERyiqPfG8Itln0C9xQM0Av/Kn8Ij jvbWmqO7Ukglg/lBmB3MTAUNFLn4gYrqvQlJwL3lw7+gFlvXL5UhB8rA42LBRfSdCmyyxI thEG1ez8IbCg8z+lLVV51R7KoWAVTGhI1v6bPzmDNeL4n+F36DR5mxSgxHQl9A== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787065; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=j2fMZu7G+YK9eneabidoKIltYZn7KfVS0Zt0k2ab0y0=; b=NbLLJtoEj1oiWEL7cz5tkc94m1maoG84ijey5UzBqQUkgzAHCQZ1fbPMsrx18wvdzgYufs BPYawCdSurW+GCBw== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH printk v1 09/18] printk: nobkl: Add print state functions Date: Thu, 2 Mar 2023 21:02:09 +0106 Message-Id: <20230302195618.156940-10-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Thomas Gleixner Provide three functions which are related to the safe handover mechanism and allow console drivers to denote takeover unsafe sections: - console_can_proceed() Invoked by a console driver to check whether a handover request is pending or whether the console was taken over in a hostile fashion. - console_enter/exit_unsafe() Invoked by a console driver to denote that the driver output function is about to enter or to leave a critical region where a hostile takeover is unsafe. These functions are also cancellation points. The unsafe state is stored in the console state and allows a takeover attempt to make informed decisions whether to take over and/or output on such a console at all. 
The unsafe state is also available to the driver in the write context for the write_atomic() output function so the driver can make informed decisions about the required actions or take a special emergency path. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Tested-by: Daniel Thompson --- include/linux/console.h | 3 + kernel/printk/printk_nobkl.c | 139 +++++++++++++++++++++++++++++++++++ 2 files changed, 142 insertions(+) diff --git a/include/linux/console.h b/include/linux/console.h index 942cc7f57798..0779757cb917 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -464,6 +464,9 @@ static inline bool console_is_registered(const struct c= onsole *con) lockdep_assert_console_list_lock_held(); \ hlist_for_each_entry(con, &console_list, node) =20 +extern bool console_can_proceed(struct cons_write_context *wctxt); +extern bool console_enter_unsafe(struct cons_write_context *wctxt); +extern bool console_exit_unsafe(struct cons_write_context *wctxt); extern bool console_try_acquire(struct cons_write_context *wctxt); extern bool console_release(struct cons_write_context *wctxt); =20 diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c index 7184a93a5b0d..3318a79a150a 100644 --- a/kernel/printk/printk_nobkl.c +++ b/kernel/printk/printk_nobkl.c @@ -947,6 +947,145 @@ static void cons_free_percpu_data(struct console *con) con->pcpu_data =3D NULL; } =20 +/** + * console_can_proceed - Check whether printing can proceed + * @wctxt: The write context that was handed to the write function + * + * Returns: True if the state is correct. False if a handover + * has been requested or if the console was taken + * over. + * + * Must be invoked after the record was dumped into the assigned record + * buffer and at appropriate safe places in the driver. For unsafe driver + * sections see console_enter_unsafe(). 
+ * + * When this function returns false then the calling context is not allowed + * to go forward and has to back out immediately and carefully. The buffer + * content is no longer trusted either and the console lock is no longer + * held. + */ +bool console_can_proceed(struct cons_write_context *wctxt) +{ + struct cons_context *ctxt =3D &ACCESS_PRIVATE(wctxt, ctxt); + struct console *con =3D ctxt->console; + struct cons_state state; + + cons_state_read(con, CON_STATE_CUR, &state); + /* Store it for analysis or reuse */ + copy_full_state(ctxt->old_state, state); + + /* Make sure this context is still the owner. */ + if (!cons_state_full_match(state, ctxt->state)) + return false; + + /* + * Having a safe point for take over and eventually a few + * duplicated characters or a full line is way better than a + * hostile takeover. Post processing can take care of the garbage. + * Continue if the requested priority is not sufficient. + */ + if (state.req_prio <=3D state.cur_prio) + return true; + + /* + * A console printer within an unsafe region is allowed to continue. + * It can perform the handover when exiting the unsafe region. Otherwise + * a hostile takeover will be necessary. + */ + if (state.unsafe) + return true; + + /* Release and hand over */ + cons_release(ctxt); + /* + * This does not check whether the handover succeeded. The + * outermost callsite has to make the final decision whether printing + * should continue or not (via reacquire, possibly hostile). The + * console is unlocked already so go back all the way instead of + * trying to implement heuristics in tons of places. + */ + return false; +} + +/** + * __console_update_unsafe - Update the unsafe bit in @con->atomic_state + * @wctxt: The write context that was handed to the write function + * + * Returns: True if the state is correct. False if a handover + * has been requested or if the console was taken + * over. + * + * Must be invoked before an unsafe driver section is entered. 
+ * + * When this function returns false then the calling context is not allowed + * to go forward and has to back out immediately and carefully. The buffer + * content is no longer trusted either and the console lock is no longer + * held. + * + * Internal helper to avoid duplicated code + */ +static bool __console_update_unsafe(struct cons_write_context *wctxt, bool= unsafe) +{ + struct cons_context *ctxt =3D &ACCESS_PRIVATE(wctxt, ctxt); + struct console *con =3D ctxt->console; + struct cons_state new; + + do { + if (!console_can_proceed(wctxt)) + return false; + /* + * console_can_proceed() saved the real state in + * ctxt->old_state + */ + copy_full_state(new, ctxt->old_state); + new.unsafe =3D unsafe; + + } while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &ctxt->old_state, &n= ew)); + + copy_full_state(ctxt->state, new); + return true; +} + +/** + * console_enter_unsafe - Enter an unsafe region in the driver + * @wctxt: The write context that was handed to the write function + * + * Returns: True if the state is correct. False if a handover + * has been requested or if the console was taken + * over. + * + * Must be invoked before an unsafe driver section is entered. + * + * When this function returns false then the calling context is not allowed + * to go forward and has to back out immediately and carefully. The buffer + * content is no longer trusted either and the console lock is no longer + * held. + */ +bool console_enter_unsafe(struct cons_write_context *wctxt) +{ + return __console_update_unsafe(wctxt, true); +} + +/** + * console_exit_unsafe - Exit an unsafe region in the driver + * @wctxt: The write context that was handed to the write function + * + * Returns: True if the state is correct. False if a handover + * has been requested or if the console was taken + * over. + * + * Must be invoked before an unsafe driver section is exited. 
+ * + * When this function returns false then the calling context is not allowed + * to go forward and has to back out immediately and carefully. The buffer + * content is no longer trusted either and the console lock is no longer + * held. + */ +bool console_exit_unsafe(struct cons_write_context *wctxt) +{ + return __console_update_unsafe(wctxt, false); +} + /** * cons_nobkl_init - Initialize the NOBKL console specific data * @con: Console to initialize --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B57EDC8300C for ; Thu, 2 Mar 2023 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230199AbjCBT6Q (ORCPT ); Thu, 2 Mar 2023 14:58:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229957AbjCBT5u (ORCPT ); Thu, 2 Mar 2023 14:57:50 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E660C474D0 for ; Thu, 2 Mar 2023 11:57:47 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787065; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1U46EmVNN9TnCO3B4brlRZe5YwAS3YvC6c1avL6o62U=; b=E6l68efT6omso5tbfgrByAF4g9A0Zve+auKMhKFOudl2Ov5osLZcTEDSZjFBxI9u4AkmzU 9OBoTMGuBH/UwX1Vcw2GGJtnjMpQEE7KSltB2YH34Dna9084ObAYx1gbQNfv/PoXFu/BCz MFKnDsJ3EeTaxLLHi6oh0b3lCtCJyWJCvuSdz4g7whEd0CUtPBmgO4YdmPj6P8V3xdoDKm fZDX+6G3LtUeRcTwd/clkFGx4et8KAnBI2LZdsJQ9jIsxPisKUNggL7o7j3nLKH69DCOTw 
6yxvhhLovuMg4yL9e73yAou2uU4m3b0B+cmh/1bt8oSVBnd/XZ/E8WyL3ct/6g== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787065; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1U46EmVNN9TnCO3B4brlRZe5YwAS3YvC6c1avL6o62U=; b=Xdd82s+bmvpLoHZz1SKKV6YdK9jMrOoOnLvcaRY94voySdrwV8w00Tq++CjoESTFYKvg6f KPyZP4m/3kf2VODQ== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH printk v1 10/18] printk: nobkl: Add emit function and callback functions for atomic printing Date: Thu, 2 Mar 2023 21:02:10 +0106 Message-Id: <20230302195618.156940-11-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Thomas Gleixner Implement an emit function for non-BKL consoles to output printk messages. It utilizes the lockless printk_get_next_message() and console_prepend_dropped() functions to retrieve/build the output message. The emit function includes the required safety points to check for handover/takeover and calls a new write_atomic callback of the console driver to output the message. It also includes proper handling for updating the non-BKL console sequence number. 
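
The emit flow described above can be sketched in plain userspace C. This is only an illustrative model under stated assumptions, not the code from this patch: every sketch_* name, the three-record log, and the boolean owner flag are hypothetical stand-ins for the real ringbuffer, the console state, and the write_atomic() callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for the acquire/write context. */
struct sketch_ctxt {
	uint64_t oldseq;	/* last sequence committed for the console */
	uint64_t newseq;	/* sequence currently being worked on */
	bool owner;		/* false once the console was taken over */
	bool backlog;		/* set by emit: was a record available? */
	const char *outbuf;	/* NULL: skip output but still advance seq */
};

#define SKETCH_NR_RECORDS 3

/* Assumed toy "ringbuffer"; the NULL entry models a suppressed record. */
static const char *sketch_log[SKETCH_NR_RECORDS] = {
	"first\n", NULL, "third\n"
};

/* Counts records actually handed to the output callback. */
static int sketch_written;

/* Models fetching the next record; false means nothing is pending. */
static bool sketch_get_record(struct sketch_ctxt *c)
{
	if (c->newseq >= SKETCH_NR_RECORDS)
		return false;
	c->outbuf = sketch_log[c->newseq];
	return true;
}

/* Models the safety point: only the current owner may continue. */
static bool sketch_can_proceed(const struct sketch_ctxt *c)
{
	return c->owner;
}

/* Models a driver output callback; false means the write was aborted. */
static bool sketch_write_atomic(struct sketch_ctxt *c)
{
	assert(c->outbuf != NULL);
	if (!sketch_can_proceed(c))
		return false;
	fputs(c->outbuf, stdout);
	sketch_written++;
	return true;
}

/*
 * Models one emit step: fetch, safety point, output, sequence update.
 * Returns false only when ownership was lost; in that case the
 * sequence is left untouched so the record can be printed again.
 */
static bool sketch_emit_record(struct sketch_ctxt *c)
{
	c->backlog = sketch_get_record(c);
	if (!c->backlog)
		return true;

	if (!sketch_can_proceed(c))
		return false;

	/* Skipped records bypass the output but still advance the seq. */
	if (c->outbuf && !sketch_write_atomic(c))
		return false;

	c->newseq++;
	c->oldseq = c->newseq;
	return true;
}

/* Emits until the log is drained or ownership is lost. */
static int sketch_flush(struct sketch_ctxt *c)
{
	int consumed = 0;

	while (sketch_emit_record(c) && c->backlog)
		consumed++;
	return consumed;
}
```

In this model a caller would initialize a context with owner set and call sketch_flush() until it stops. The point of the ordering is that the sequence only advances after the output step succeeded, so an aborted write leads to a duplicated line rather than a lost one.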
Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Tested-by: Daniel Thompson --- include/linux/console.h | 8 +++ kernel/printk/internal.h | 9 +++ kernel/printk/printk.c | 12 ++-- kernel/printk/printk_nobkl.c | 121 ++++++++++++++++++++++++++++++++++- 4 files changed, 141 insertions(+), 9 deletions(-) diff --git a/include/linux/console.h b/include/linux/console.h index 0779757cb917..15f71ccfcd9d 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -250,10 +250,12 @@ struct printk_buffers; * @newseq: The sequence number for progress * @prio: Priority of the context * @pbufs: Pointer to the text buffer for this context + * @dropped: Dropped counter for the current context * @thread: The acquire is printk thread context * @hostile: Hostile takeover requested. Cleared on normal * acquire or friendly handover * @spinwait: Spinwait on acquire if possible + * @backlog: Ringbuffer has pending records */ struct cons_context { struct console *console; @@ -266,9 +268,11 @@ struct cons_context { unsigned int spinwait_max_us; enum cons_prio prio; struct printk_buffers *pbufs; + unsigned long dropped; unsigned int thread : 1; unsigned int hostile : 1; unsigned int spinwait : 1; + unsigned int backlog : 1; }; =20 /** @@ -310,6 +314,7 @@ struct cons_context_data; * @atomic_state: State array for NOBKL consoles; real and handover * @atomic_seq: Sequence for record tracking (32bit only) * @thread_pbufs: Pointer to thread private buffer + * @write_atomic: Write callback for atomic context * @pcpu_data: Pointer to percpu context data */ struct console { @@ -337,6 +342,9 @@ struct console { atomic_t __private atomic_seq; #endif struct printk_buffers *thread_pbufs; + + bool (*write_atomic)(struct console *con, struct cons_write_context *wctx= t); + struct cons_context_data __percpu *pcpu_data; }; =20 diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 15a412065327..13dd0ce23c37 100644 --- 
a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -133,3 +133,12 @@ struct printk_message { struct cons_context_data { struct printk_buffers pbufs; }; + +#ifdef CONFIG_PRINTK + +bool printk_get_next_message(struct printk_message *pmsg, u64 seq, + bool is_extended, bool may_supress); +void console_prepend_dropped(struct printk_message *pmsg, + unsigned long dropped); + +#endif diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 21b31183ff2b..eab0358baa6f 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -715,9 +715,6 @@ static ssize_t msg_print_ext_body(char *buf, size_t siz= e, return len; } =20 -static bool printk_get_next_message(struct printk_message *pmsg, u64 seq, - bool is_extended, bool may_supress); - /* /dev/kmsg - userspace message inject/listen interface */ struct devkmsg_user { atomic64_t seq; @@ -2786,7 +2783,7 @@ static void __console_unlock(void) * If @pmsg->pbufs->outbuf is modified, @pmsg->outbuf_len is updated. */ #ifdef CONFIG_PRINTK -static void console_prepend_dropped(struct printk_message *pmsg, unsigned = long dropped) +void console_prepend_dropped(struct printk_message *pmsg, unsigned long dr= opped) { struct printk_buffers *pbufs =3D pmsg->pbufs; const size_t scratchbuf_sz =3D sizeof(pbufs->scratchbuf); @@ -2818,7 +2815,8 @@ static void console_prepend_dropped(struct printk_mes= sage *pmsg, unsigned long d pmsg->outbuf_len +=3D len; } #else -#define console_prepend_dropped(pmsg, dropped) +static inline void console_prepend_dropped(struct printk_message *pmsg, + unsigned long dropped) { } #endif /* CONFIG_PRINTK */ =20 /* @@ -2840,8 +2838,8 @@ static void console_prepend_dropped(struct printk_mes= sage *pmsg, unsigned long d * of @pmsg are valid. (See the documentation of struct printk_message * for information about the @pmsg fields.) 
*/ -static bool printk_get_next_message(struct printk_message *pmsg, u64 seq, - bool is_extended, bool may_suppress) +bool printk_get_next_message(struct printk_message *pmsg, u64 seq, + bool is_extended, bool may_suppress) { static int panic_console_dropped; =20 diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c index 3318a79a150a..5c591bced1be 100644 --- a/kernel/printk/printk_nobkl.c +++ b/kernel/printk/printk_nobkl.c @@ -317,7 +317,7 @@ static void cons_context_set_seq(struct cons_context *c= txt) * invalid. Caller has to reacquire the console. */ #ifdef CONFIG_64BIT -static bool __maybe_unused cons_seq_try_update(struct cons_context *ctxt) +static bool cons_seq_try_update(struct cons_context *ctxt) { struct console *con =3D ctxt->console; struct cons_state old; @@ -346,7 +346,7 @@ static bool __maybe_unused cons_seq_try_update(struct c= ons_context *ctxt) } #else static bool cons_release(struct cons_context *ctxt); -static bool __maybe_unused cons_seq_try_update(struct cons_context *ctxt) +static bool cons_seq_try_update(struct cons_context *ctxt) { struct console *con =3D ctxt->console; struct cons_state state; @@ -1086,6 +1086,123 @@ bool console_exit_unsafe(struct cons_write_context = *wctxt) return __console_update_unsafe(wctxt, false); } =20 +/** + * cons_get_record - Fill the buffer with the next pending ringbuffer reco= rd + * @wctxt: The write context which will be handed to the write function + * + * Returns: True if there are records available. If the next record should + * be printed, the output buffer is filled and @wctxt->outbuf + * points to the text to print. If @wctxt->outbuf is NULL after + * the call, the record should not be printed but the caller must + * still update the console sequence number. + * + * False means that there are no pending records anymore and the + * printing can stop. 
+ */ +static bool cons_get_record(struct cons_write_context *wctxt) +{ + struct cons_context *ctxt =3D &ACCESS_PRIVATE(wctxt, ctxt); + struct console *con =3D ctxt->console; + bool is_extended =3D console_srcu_read_flags(con) & CON_EXTENDED; + struct printk_message pmsg =3D { + .pbufs =3D ctxt->pbufs, + }; + + if (!printk_get_next_message(&pmsg, ctxt->newseq, is_extended, true)) + return false; + + ctxt->newseq =3D pmsg.seq; + ctxt->dropped +=3D pmsg.dropped; + + if (pmsg.outbuf_len =3D=3D 0) { + wctxt->outbuf =3D NULL; + } else { + if (ctxt->dropped && !is_extended) + console_prepend_dropped(&pmsg, ctxt->dropped); + wctxt->outbuf =3D &pmsg.pbufs->outbuf[0]; + } + + wctxt->len =3D pmsg.outbuf_len; + + return true; +} + +/** + * cons_emit_record - Emit record in the acquired context + * @wctxt: The write context that will be handed to the write function + * + * Returns: False if the operation was aborted (takeover or handover). + * True otherwise + * + * When false is returned, the caller is not allowed to touch console stat= e. + * The console is owned by someone else. If the caller wants to print more + * it has to reacquire the console first. + * + * When true is returned, @wctxt->ctxt.backlog indicates whether there are + * still records pending in the ringbuffer, + */ +static int __maybe_unused cons_emit_record(struct cons_write_context *wctx= t) +{ + struct cons_context *ctxt =3D &ACCESS_PRIVATE(wctxt, ctxt); + struct console *con =3D ctxt->console; + bool done =3D false; + + /* + * @con->dropped is not protected in case of hostile takeovers so + * the update below is racy. Annotate it accordingly. + */ + ctxt->dropped =3D data_race(READ_ONCE(con->dropped)); + + /* Fill the output buffer with the next record */ + ctxt->backlog =3D cons_get_record(wctxt); + if (!ctxt->backlog) + return true; + + /* Safety point. 
Don't touch state in case of takeover */ + if (!console_can_proceed(wctxt)) + return false; + + /* Counterpart to the read above */ + WRITE_ONCE(con->dropped, ctxt->dropped); + + /* + * In case of skipped records, Update sequence state in @con. + */ + if (!wctxt->outbuf) + goto update; + + /* Tell the driver about potential unsafe state */ + wctxt->unsafe =3D ctxt->state.unsafe; + + if (!ctxt->thread && con->write_atomic) { + done =3D con->write_atomic(con, wctxt); + } else { + cons_release(ctxt); + WARN_ON_ONCE(1); + return false; + } + + /* If not done, the write was aborted due to takeover */ + if (!done) + return false; + + /* If there was a dropped message, it has now been output. */ + if (ctxt->dropped) { + ctxt->dropped =3D 0; + /* Counterpart to the read above */ + WRITE_ONCE(con->dropped, ctxt->dropped); + } +update: + ctxt->newseq++; + /* + * The sequence update attempt is not part of console_release() + * because in panic situations the console is not released by + * the panic CPU until all records are written. On 32bit the + * sequence is separate from state anyway. 
+ */ + return cons_seq_try_update(ctxt); +} + /** * cons_nobkl_init - Initialize the NOBKL console specific data * @con: Console to initialize --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18AA8C87FE1 for ; Thu, 2 Mar 2023 19:58:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230229AbjCBT6Y (ORCPT ); Thu, 2 Mar 2023 14:58:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230019AbjCBT5v (ORCPT ); Thu, 2 Mar 2023 14:57:51 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1A146474EB for ; Thu, 2 Mar 2023 11:57:48 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787065; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Y+tfDj0YbYrshmUSH6s5mWBls3FqcK+wPi39QFNGqNY=; b=pW9iDDImWizt7l6Yp36l+xW2lyyUcO8NSnNTQCD+n8H//EbHA9icdh91NRGNwG8yDgGth1 SS3ixk+1KbEY7yvuyLTtGFaWpY9mlxOfrav7zCrIA5vKBBwqHrbP3UzUcBQkL2gibHHiiv GDcvA5uwGuoGs2TBgxCej2Ofj0eqaFeGty+x2cyWzV3CH+42lF3d0QBPOEoc/wwd54UZr6 KXhfrmZkm6eQch35DNVxnlxAACAci9xoRhyuMX8uxy0AqYaF2P9tBvqvYthvij3/rTGAVs q8FYH/SURTjzZjVNO/osEi55Lh5g4MJo1HvVaZOjyXo62PWF/xDG4IAENPOaMw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787065; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=Y+tfDj0YbYrshmUSH6s5mWBls3FqcK+wPi39QFNGqNY=; b=SEDDKnYjnEkHYb0raNoJEINmESrcMLXBfI4Aqhx+MQm/p/LJhBtyLkXvP+pCEMwUanYkCm uZ1B7X/5775OZgCw== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH printk v1 11/18] printk: nobkl: Introduce printer threads Date: Thu, 2 Mar 2023 21:02:11 +0106 Message-Id: <20230302195618.156940-12-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Thomas Gleixner Add the infrastructure to create a printer thread per console along with the required thread function, which is takeover/handover aware. Co-developed-by: John Ogness Signed-off-by: John Ogness Signed-off-by: Thomas Gleixner (Intel) Tested-by: Daniel Thompson --- include/linux/console.h | 11 ++ kernel/printk/internal.h | 54 ++++++++ kernel/printk/printk.c | 52 ++----- kernel/printk/printk_nobkl.c | 259 ++++++++++++++++++++++++++++++++++- 4 files changed, 336 insertions(+), 40 deletions(-) diff --git a/include/linux/console.h b/include/linux/console.h index 15f71ccfcd9d..2c120c3f3c6e 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -17,6 +17,7 @@ #include #include #include +#include #include =20 struct vc_data; @@ -314,7 +315,12 @@ struct cons_context_data; * @atomic_state: State array for NOBKL consoles; real and handover * @atomic_seq: Sequence for record tracking (32bit only) * @thread_pbufs: Pointer to thread private buffer + * @kthread: Pointer to kernel thread + * @rcuwait: RCU wait for the kernel thread + * @kthread_waiting: Indicator whether the kthread is waiting to be woken * @write_atomic: Write callback for atomic context + * 
@write_thread: Write callback for printk threaded printing + * @port_lock: Callback to lock/unlock the port lock * @pcpu_data: Pointer to percpu context data */ struct console { @@ -342,8 +348,13 @@ struct console { atomic_t __private atomic_seq; #endif struct printk_buffers *thread_pbufs; + struct task_struct *kthread; + struct rcuwait rcuwait; + atomic_t kthread_waiting; =20 bool (*write_atomic)(struct console *con, struct cons_write_context *wctx= t); + bool (*write_thread)(struct console *con, struct cons_write_context *wctx= t); + void (*port_lock)(struct console *con, bool do_lock, unsigned long *flags= ); =20 struct cons_context_data __percpu *pcpu_data; }; diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 13dd0ce23c37..8856beed65da 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -44,6 +44,8 @@ enum printk_info_flags { =20 extern struct printk_ringbuffer *prb; =20 +extern bool have_boot_console; + __printf(4, 0) int vprintk_store(int facility, int level, const struct dev_printk_info *dev_info, @@ -75,6 +77,55 @@ u64 cons_read_seq(struct console *con); void cons_nobkl_cleanup(struct console *con); bool cons_nobkl_init(struct console *con); bool cons_alloc_percpu_data(struct console *con); +void cons_kthread_create(struct console *con); + +/* + * Check if the given console is currently capable and allowed to print + * records. If the caller only works with certain types of consoles, the + * caller is responsible for checking the console type before calling + * this function. + */ +static inline bool console_is_usable(struct console *con, short flags) +{ + if (!(flags & CON_ENABLED)) + return false; + + if ((flags & CON_SUSPENDED)) + return false; + + /* + * The usability of a console varies depending on whether + * it is a NOBKL console or not. 
+ */ + + if (flags & CON_NO_BKL) { + if (have_boot_console) + return false; + + } else { + if (!con->write) + return false; + /* + * Console drivers may assume that per-cpu resources have + * been allocated. So unless they're explicitly marked as + * being able to cope (CON_ANYTIME) don't call them until + * this CPU is officially up. + */ + if (!cpu_online(raw_smp_processor_id()) && !(flags & CON_ANYTIME)) + return false; + } + + return true; +} + +/** + * cons_kthread_wake - Wake up a printk thread + * @con: Console to operate on + */ +static inline void cons_kthread_wake(struct console *con) +{ + rcuwait_wake_up(&con->rcuwait); +} =20 #else =20 @@ -82,6 +133,9 @@ bool cons_alloc_percpu_data(struct console *con); #define PRINTK_MESSAGE_MAX 0 #define PRINTKRB_RECORD_MAX 0 =20 +static inline void cons_kthread_wake(struct console *con) { } +static inline void cons_kthread_create(struct console *con) { } + /* * In !PRINTK builds we still export console_sem * semaphore and some of console functions (console_unlock()/etc.), so diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index eab0358baa6f..4c6abb033ec1 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2723,45 +2723,6 @@ static bool abandon_console_lock_in_panic(void) return atomic_read(&panic_cpu) !=3D raw_smp_processor_id(); } =20 -/* - * Check if the given console is currently capable and allowed to print - * records. If the caller only works with certain types of consoles, the - * caller is responsible for checking the console type before calling - * this function. - */ -static inline bool console_is_usable(struct console *con, short flags) -{ - if (!(flags & CON_ENABLED)) - return false; - - if ((flags & CON_SUSPENDED)) - return false; - - /* - * The usability of a console varies depending on whether - * it is a NOBKL console or not. 
- */ - - if (flags & CON_NO_BKL) { - if (have_boot_console) - return false; - - } else { - if (!con->write) - return false; - /* - * Console drivers may assume that per-cpu resources have - * been allocated. So unless they're explicitly marked as - * being able to cope (CON_ANYTIME) don't call them until - * this CPU is officially up. - */ - if (!cpu_online(raw_smp_processor_id()) && !(flags & CON_ANYTIME)) - return false; - } - - return true; -} - static void __console_unlock(void) { console_locked =3D 0; @@ -3573,10 +3534,14 @@ EXPORT_SYMBOL(register_console); /* Must be called under console_list_lock(). */ static int unregister_console_locked(struct console *console) { + struct console *c; + bool is_boot_con; int res; =20 lockdep_assert_console_list_lock_held(); =20 + is_boot_con =3D console->flags & CON_BOOT; + con_printk(KERN_INFO, console, "disabled\n"); =20 res =3D _braille_unregister_console(console); @@ -3620,6 +3585,15 @@ static int unregister_console_locked(struct console = *console) if (console->exit) res =3D console->exit(console); =20 + /* + * Each time a boot console unregisters, try to start up the printing + * threads. They will only start if this was the last boot console. 
+ */ + if (is_boot_con) { + for_each_console(c) + cons_kthread_create(c); + } + return res; } =20 diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c index 5c591bced1be..bc3b69223897 100644 --- a/kernel/printk/printk_nobkl.c +++ b/kernel/printk/printk_nobkl.c @@ -5,6 +5,8 @@ #include #include #include +#include +#include #include "printk_ringbuffer.h" #include "internal.h" /* @@ -700,6 +702,7 @@ static bool __cons_try_acquire(struct cons_context *ctx= t) /* Set up the new state for takeover */ copy_full_state(new, old); new.locked =3D 1; + new.thread =3D ctxt->thread; new.cur_prio =3D ctxt->prio; new.req_prio =3D CONS_PRIO_NONE; new.cpu =3D cpu; @@ -714,6 +717,14 @@ static bool __cons_try_acquire(struct cons_context *ct= xt) goto success; } =20 + /* + * A threaded printer context will never spin or perform a + * hostile takeover. The atomic writer will wake the thread + * when it is done with the important output. + */ + if (ctxt->thread) + return false; + /* * If the active context is on the same CPU then there is * obviously no handshake possible. 
@@ -871,6 +882,9 @@ static bool __cons_release(struct cons_context *ctxt) return true; } =20 +static bool printk_threads_enabled __ro_after_init; +static bool printk_force_atomic __initdata; + /** * cons_release - Release the console after output is done * @ctxt: The acquire context that contains the state @@ -1141,7 +1155,7 @@ static bool cons_get_record(struct cons_write_context= *wctxt) * When true is returned, @wctxt->ctxt.backlog indicates whether there are * still records pending in the ringbuffer, */ -static int __maybe_unused cons_emit_record(struct cons_write_context *wctx= t) +static bool cons_emit_record(struct cons_write_context *wctxt) { struct cons_context *ctxt =3D &ACCESS_PRIVATE(wctxt, ctxt); struct console *con =3D ctxt->console; @@ -1176,6 +1190,8 @@ static int __maybe_unused cons_emit_record(struct con= s_write_context *wctxt) =20 if (!ctxt->thread && con->write_atomic) { done =3D con->write_atomic(con, wctxt); + } else if (ctxt->thread && con->write_thread) { + done =3D con->write_thread(con, wctxt); } else { cons_release(ctxt); WARN_ON_ONCE(1); @@ -1203,6 +1219,243 @@ static int __maybe_unused cons_emit_record(struct c= ons_write_context *wctxt) return cons_seq_try_update(ctxt); } =20 +/** + * cons_kthread_should_wakeup - Check whether the printk thread should wak= eup + * @con: Console to operate on + * @ctxt: The acquire context that contains the state + * at console_acquire() + * + * Returns: True if the thread should shutdown or if the console is allowe= d to + * print and a record is available. False otherwise + * + * After the thread wakes up, it must first check if it should shutdown be= fore + * attempting any printing. 
+ */ +static bool cons_kthread_should_wakeup(struct console *con, struct cons_co= ntext *ctxt) +{ + bool is_usable; + short flags; + int cookie; + + if (kthread_should_stop()) + return true; + + cookie =3D console_srcu_read_lock(); + flags =3D console_srcu_read_flags(con); + is_usable =3D console_is_usable(con, flags); + console_srcu_read_unlock(cookie); + + if (!is_usable) + return false; + + /* This reads state and sequence on 64bit. On 32bit only state */ + cons_state_read(con, CON_STATE_CUR, &ctxt->state); + + /* + * Atomic printing is running on some other CPU. The owner + * will wake the console thread on unlock if necessary. + */ + if (ctxt->state.locked) + return false; + + /* Bring the sequence in @ctxt up to date */ + cons_context_set_seq(ctxt); + + return prb_read_valid(prb, ctxt->oldseq, NULL); +} + +/** + * cons_kthread_func - The printk thread function + * @__console: Console to operate on + */ +static int cons_kthread_func(void *__console) +{ + struct console *con =3D __console; + struct cons_write_context wctxt =3D { + .ctxt.console =3D con, + .ctxt.prio =3D CONS_PRIO_NORMAL, + .ctxt.thread =3D 1, + }; + struct cons_context *ctxt =3D &ACCESS_PRIVATE(&wctxt, ctxt); + unsigned long flags; + short con_flags; + bool backlog; + int cookie; + int ret; + + for (;;) { + atomic_inc(&con->kthread_waiting); + + /* + * Provides a full memory barrier vs. cons_kthread_wake(). + */ + ret =3D rcuwait_wait_event(&con->rcuwait, + cons_kthread_should_wakeup(con, ctxt), + TASK_INTERRUPTIBLE); + + atomic_dec(&con->kthread_waiting); + + if (kthread_should_stop()) + break; + + /* Wait was interrupted by a spurious signal, go back to sleep */ + if (ret) + continue; + + for (;;) { + cookie =3D console_srcu_read_lock(); + + /* + * Ensure this stays on the CPU to make handover and + * takeover possible. + */ + if (con->port_lock) + con->port_lock(con, true, &flags); + else + migrate_disable(); + + /* + * Try to acquire the console without attempting to + * take over. 
If an atomic printer wants to hand + * back to the thread it simply wakes it up. + */ + if (!cons_try_acquire(ctxt)) + break; + + con_flags =3D console_srcu_read_flags(con); + + if (console_is_usable(con, con_flags)) { + /* + * If the emit fails, this context is no + * longer the owner. Abort the processing and + * wait for new records to print. + */ + if (!cons_emit_record(&wctxt)) + break; + backlog =3D ctxt->backlog; + } else { + backlog =3D false; + } + + /* + * If the release fails, this context was not the + * owner. Abort the processing and wait for new + * records to print. + */ + if (!cons_release(ctxt)) + break; + + /* Backlog done? */ + if (!backlog) + break; + + if (con->port_lock) + con->port_lock(con, false, &flags); + else + migrate_enable(); + + console_srcu_read_unlock(cookie); + + cond_resched(); + } + if (con->port_lock) + con->port_lock(con, false, &flags); + else + migrate_enable(); + + console_srcu_read_unlock(cookie); + } + return 0; +} + +/** + * cons_kthread_stop - Stop a printk thread + * @con: Console to operate on + */ +static void cons_kthread_stop(struct console *con) +{ + lockdep_assert_console_list_lock_held(); + + if (!con->kthread) + return; + + kthread_stop(con->kthread); + con->kthread =3D NULL; + + kfree(con->thread_pbufs); + con->thread_pbufs =3D NULL; +} + +/** + * cons_kthread_create - Create a printk thread + * @con: Console to operate on + * + * If it fails, let the console proceed. The atomic part might + * be usable and useful. + */ +void cons_kthread_create(struct console *con) +{ + struct task_struct *kt; + struct console *c; + + lockdep_assert_console_list_lock_held(); + + if (!(con->flags & CON_NO_BKL) || !con->write_thread) + return; + + if (!printk_threads_enabled || con->kthread) + return; + + /* + * Printer threads cannot be started as long as any boot console is + * registered because there is no way to synchronize the hardware + * registers between boot console code and regular console code. 
+ */ + for_each_console(c) { + if (c->flags & CON_BOOT) + return; + } + have_boot_console =3D false; + + con->thread_pbufs =3D kmalloc(sizeof(*con->thread_pbufs), GFP_KERNEL); + if (!con->thread_pbufs) { + con_printk(KERN_ERR, con, "failed to allocate printing thread buffers\n"= ); + return; + } + + kt =3D kthread_run(cons_kthread_func, con, "pr/%s%d", con->name, con->ind= ex); + if (IS_ERR(kt)) { + con_printk(KERN_ERR, con, "failed to start printing thread\n"); + kfree(con->thread_pbufs); + con->thread_pbufs =3D NULL; + return; + } + + con->kthread =3D kt; + + /* + * It is important that console printing threads are scheduled + * shortly after a printk call and with generous runtime budgets. + */ + sched_set_normal(con->kthread, -20); +} + +static int __init printk_setup_threads(void) +{ + struct console *con; + + if (printk_force_atomic) + return 0; + + console_list_lock(); + printk_threads_enabled =3D true; + for_each_console(con) + cons_kthread_create(con); + console_list_unlock(); + return 0; +} +early_initcall(printk_setup_threads); + /** * cons_nobkl_init - Initialize the NOBKL console specific data * @con: Console to initialize @@ -1216,9 +1469,12 @@ bool cons_nobkl_init(struct console *con) if (!cons_alloc_percpu_data(con)) return false; =20 + rcuwait_init(&con->rcuwait); + atomic_set(&con->kthread_waiting, 0); cons_state_set(con, CON_STATE_CUR, &state); cons_state_set(con, CON_STATE_REQ, &state); cons_seq_init(con); + cons_kthread_create(con); return true; } =20 @@ -1230,6 +1486,7 @@ void cons_nobkl_cleanup(struct console *con) { struct cons_state state =3D { }; =20 + cons_kthread_stop(con); cons_state_set(con, CON_STATE_CUR, &state); cons_state_set(con, CON_STATE_REQ, &state); cons_free_percpu_data(con); --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org 
(Postfix) with ESMTP id 3B24AC87FE7 for ; Thu, 2 Mar 2023 19:58:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230240AbjCBT61 (ORCPT ); Thu, 2 Mar 2023 14:58:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230023AbjCBT5v (ORCPT ); Thu, 2 Mar 2023 14:57:51 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1AA17474F3 for ; Thu, 2 Mar 2023 11:57:48 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=J+6Uo9a3gUV/uuNaER8XdIymGyF2GY1924++ShHc6Ps=; b=RSRBYZTTYaC3LqqH5o7zHh4Atx/RzX5EKyzZAAfcuCZznQPTbYdupyci1tRR188esFhNwD BiL2RbhsOBPoHfD/GS0+WegM8gE0/7NTbY/efehrYwfkTD2GOBbVTHo31uhXag4sBlR+jJ +cEssFYpPDyaX2tavQb75eaiMwincI2pu8SmhmMV9z4iEftYIaQtCC3jJBat5IgIdpEBFH 5j5auv4w92gmaUpq/LltX0tRRc4qPbx9aTkuZlDdttl+zmYIxydm8NZh0CEvxrRWxlWm3K YcIF9bTFXTRWNKlU4zz9p+HE3bhC9j/9zizAskHjQqpHsH8eDME+ogDU2GcgHQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=J+6Uo9a3gUV/uuNaER8XdIymGyF2GY1924++ShHc6Ps=; b=Nr5ZVS+lu4p8NFKGgEoRNUu8vjta4e+sJ0nnmeiaNvlChK2fxp5g9GfUx7JbDAntyHWLZq rWH7UBL1slMBzQDw== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH printk v1 12/18] printk: nobkl: Add printer thread wakeups Date: Thu, 2 
Mar 2023 21:02:12 +0106
Message-Id: <20230302195618.156940-13-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

Add a function to wake up the printer threads. Use the new function
when:

  - records are added to the printk ringbuffer
  - consoles are started
  - consoles are resumed

The actual waking is performed via irq_work so that the wakeup can be
triggered from any context.

Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 include/linux/console.h      |  3 +++
 kernel/printk/internal.h     |  1 +
 kernel/printk/printk.c       | 26 ++++++++++++++++++++++++++
 kernel/printk/printk_nobkl.c | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 62 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index 2c120c3f3c6e..710f1e72cd0f 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -16,6 +16,7 @@
 
 #include
 #include
+#include
 #include
 #include
 #include
@@ -317,6 +318,7 @@ struct cons_context_data;
  * @thread_pbufs:	Pointer to thread private buffer
  * @kthread:		Pointer to kernel thread
  * @rcuwait:		RCU wait for the kernel thread
+ * @irq_work:		IRQ work for thread wakeup
  * @kthread_waiting:	Indicator whether the kthread is waiting to be woken
  * @write_atomic:	Write callback for atomic context
  * @write_thread:	Write callback for printk threaded printing
@@ -350,6 +352,7 @@ struct console {
 	struct printk_buffers	*thread_pbufs;
 	struct task_struct	*kthread;
 	struct rcuwait		rcuwait;
+	struct irq_work		irq_work;
 	atomic_t		kthread_waiting;
 
 	bool (*write_atomic)(struct console *con, struct cons_write_context *wctxt);
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index
8856beed65da..a72402c1ac93 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -78,6 +78,7 @@ void cons_nobkl_cleanup(struct console *con); bool cons_nobkl_init(struct console *con); bool cons_alloc_percpu_data(struct console *con); void cons_kthread_create(struct console *con); +void cons_wake_threads(void); =20 /* * Check if the given console is currently capable and allowed to print diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 4c6abb033ec1..19f682fcae10 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2345,6 +2345,7 @@ asmlinkage int vprintk_emit(int facility, int level, preempt_enable(); } =20 + cons_wake_threads(); if (in_sched) defer_console_output(); else @@ -2615,6 +2616,8 @@ void suspend_console(void) void resume_console(void) { struct console *con; + short flags; + int cookie; =20 if (!console_suspend_enabled) return; @@ -2634,6 +2637,14 @@ void resume_console(void) */ synchronize_srcu(&console_srcu); =20 + cookie =3D console_srcu_read_lock(); + for_each_console_srcu(con) { + flags =3D console_srcu_read_flags(con); + if (flags & CON_NO_BKL) + cons_kthread_wake(con); + } + console_srcu_read_unlock(cookie); + pr_flush(1000, true); } =20 @@ -3226,9 +3237,23 @@ EXPORT_SYMBOL(console_stop); =20 void console_start(struct console *console) { + short flags; + console_list_lock(); console_srcu_write_flags(console, console->flags | CON_ENABLED); + flags =3D console->flags; console_list_unlock(); + + /* + * Ensure that all SRCU list walks have completed. The related + * printing context must be able to see it is enabled so that + * it is guaranteed to wake up and resume printing. 
+ */ + synchronize_srcu(&console_srcu); + + if (flags & CON_NO_BKL) + cons_kthread_wake(console); + __pr_flush(console, 1000, true); } EXPORT_SYMBOL(console_start); @@ -3918,6 +3943,7 @@ void defer_console_output(void) =20 void printk_trigger_flush(void) { + cons_wake_threads(); defer_console_output(); } =20 diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c index bc3b69223897..890fc8d44f1d 100644 --- a/kernel/printk/printk_nobkl.c +++ b/kernel/printk/printk_nobkl.c @@ -1368,6 +1368,37 @@ static int cons_kthread_func(void *__console) return 0; } =20 +/** + * cons_irq_work - irq work to wake printk thread + * @irq_work: The irq work to operate on + */ +static void cons_irq_work(struct irq_work *irq_work) +{ + struct console *con =3D container_of(irq_work, struct console, irq_work); + + cons_kthread_wake(con); +} + +/** + * cons_wake_threads - Wake up printing threads + * + * A printing thread is only woken if it is within the @kthread_waiting + * block. If it is not within the block (or enters the block later), it + * will see any new records and continue printing on its own. 
+ */ +void cons_wake_threads(void) +{ + struct console *con; + int cookie; + + cookie =3D console_srcu_read_lock(); + for_each_console_srcu(con) { + if (con->kthread && atomic_read(&con->kthread_waiting)) + irq_work_queue(&con->irq_work); + } + console_srcu_read_unlock(cookie); +} + /** * cons_kthread_stop - Stop a printk thread * @con: Console to operate on @@ -1471,6 +1502,7 @@ bool cons_nobkl_init(struct console *con) =20 rcuwait_init(&con->rcuwait); atomic_set(&con->kthread_waiting, 0); + init_irq_work(&con->irq_work, cons_irq_work); cons_state_set(con, CON_STATE_CUR, &state); cons_state_set(con, CON_STATE_REQ, &state); cons_seq_init(con); --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62AA6C6FA8E for ; Thu, 2 Mar 2023 19:59:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229500AbjCBT6b (ORCPT ); Thu, 2 Mar 2023 14:58:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230009AbjCBT5v (ORCPT ); Thu, 2 Mar 2023 14:57:51 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 50DDF48E2C for ; Thu, 2 Mar 2023 11:57:48 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SvKO9QqGXwAvVE6Dd+q4rL71MFOGdvNMhIjt6NWB61k=; b=xFmMnktcYtR/SAZFBChYvEhtDZGN55+DFpfBQ+HJzArF1j0HI23kXWoElF+Gvt3BtANk/+ 
V6XnfIrpIEk7ejTRN5LumMy+rIXq3tK3qfLvCFaoaJRbKMqMMB5W4D1bK86Mw5RfVQyzJg 2a1DMgF6egBp0+/MrTz5W1AqGuw1ZTk5EahdmeCO6pQTgCGViY9n31TiApFUUELd4GJpiL tMaBN2Sl5BA5ipbeifYsl4upBKT+pYhgAHlKXO6PCrMig28NwnMcZmDzQbwld8wNNen8aB tQgRcGGob4kxd3V/aZf7IPn5Rwj55/dTl22xE3jVDfZz8AW1BTefuw/XNzP5WA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SvKO9QqGXwAvVE6Dd+q4rL71MFOGdvNMhIjt6NWB61k=; b=hBrxnupy7aV7oa6D6dncweUttM+48UeatpUZwYCzkqwHf63e8pswDSzMZvamlboPdEXVu6 Yzjl1f7QfkTMSvBw== To: Petr Mladek Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH printk v1 13/18] printk: nobkl: Add write context storage for atomic writes Date: Thu, 2 Mar 2023 21:02:13 +0106 Message-Id: <20230302195618.156940-14-john.ogness@linutronix.de> In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de> References: <20230302195618.156940-1-john.ogness@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Thomas Gleixner The number of consoles is unknown at compile time and allocating write contexts on stack in emergency/panic situations is not desired either. Allocate a write context array (one for each priority level) along with the per CPU output buffers, thus allowing atomic contexts on multiple CPUs and priority levels to execute simultaneously without clobbering each other's write context. 
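For illustration only, the storage layout described above can be
modelled in plain userspace C. This is a rough sketch, not the kernel
code: the sketch_* names, the fixed SKETCH_NR_CPUS array that stands in
for real per CPU data, and the members of the sketch structs are all
assumptions made for the example. Only the enum values and the idea of
one write context per priority level come from this series.

```c
#include <assert.h>

/* Priority levels as used by this series; CONS_PRIO_MAX is the count. */
enum cons_prio {
	CONS_PRIO_NONE = 0,
	CONS_PRIO_NORMAL,
	CONS_PRIO_EMERGENCY,
	CONS_PRIO_PANIC,
	CONS_PRIO_MAX,
};

/* Hypothetical stand-in for the number of CPUs (not a kernel symbol). */
#define SKETCH_NR_CPUS 4

/* Hypothetical stand-in for struct cons_write_context. */
struct sketch_write_context {
	unsigned int len;
};

/* Models struct cons_context_data: one write context per priority. */
struct sketch_context_data {
	struct sketch_write_context wctxt[CONS_PRIO_MAX];
	char pbufs[64];	/* stand-in for struct printk_buffers */
};

/* A plain array models the per CPU allocation of the real code. */
static struct sketch_context_data sketch_pcpu_data[SKETCH_NR_CPUS];

/*
 * Look up the preallocated write context for a CPU and priority, so
 * that nothing has to live on the stack of the warn/panic path.
 */
static struct sketch_write_context *sketch_get_wctxt(unsigned int cpu,
						     enum cons_prio prio)
{
	assert(cpu < SKETCH_NR_CPUS);
	assert(prio > CONS_PRIO_NONE && prio < CONS_PRIO_MAX);

	return &sketch_pcpu_data[cpu].wctxt[prio];
}
```

In this model, an emergency context that interrupts a normal priority
writer on the same CPU gets a different slot, and two CPUs printing at
panic priority never share a slot, which is the property the commit
message asks for.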
Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 include/linux/console.h  | 2 ++
 kernel/printk/internal.h | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index 710f1e72cd0f..089a94a3dd8d 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -222,6 +222,7 @@ struct cons_state {
  * @CONS_PRIO_NORMAL:		Regular printk
  * @CONS_PRIO_EMERGENCY:	Emergency output (WARN/OOPS...)
  * @CONS_PRIO_PANIC:		Panic output
+ * @CONS_PRIO_MAX:		The number of priority levels
  *
  * Emergency output can carefully takeover the console even without consent
  * of the owner, ideally only when @cons_state::unsafe is not set. Panic
@@ -234,6 +235,7 @@ enum cons_prio {
 	CONS_PRIO_NORMAL,
 	CONS_PRIO_EMERGENCY,
 	CONS_PRIO_PANIC,
+	CONS_PRIO_MAX,
 };
 
 struct console;
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index a72402c1ac93..a417e3992b7a 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -181,11 +181,16 @@ struct printk_message {
 
 /**
  * struct cons_context_data - console context data
+ * @wctxt:	Write context per priority level
  * @pbufs:	Buffer for storing the text
  *
  * Used for early boot and for per CPU data.
+ *
+ * The write contexts are allocated to avoid having them on stack, e.g. in
+ * warn() or panic().
*/ struct cons_context_data { + struct cons_write_context wctxt[CONS_PRIO_MAX]; struct printk_buffers pbufs; }; =20 --=20 2.30.2 From nobody Sat Apr 11 07:08:25 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8DBE4C7EE36 for ; Thu, 2 Mar 2023 19:59:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229639AbjCBT6k (ORCPT ); Thu, 2 Mar 2023 14:58:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230026AbjCBT5w (ORCPT ); Thu, 2 Mar 2023 14:57:52 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5081E48E00 for ; Thu, 2 Mar 2023 11:57:48 -0800 (PST) From: John Ogness DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1677787066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RQaoWu8ziTxaQYkBHImtYGFT53DoX5McXDIC69zOSXc=; b=vibtepaI9AQNACjD92h8htstYfM9lD/KWP1JQ1XJ3hDVDY86RX43hZvBOksmIXv7fT+9cF LmOlLc8Dia6xYHOxJ940DMENQH5+5lpSgOcfnEtogxaUasq4yMGESGxKXCFRaygDG7pl98 eOoR1wp+WSj1pju5NqKI3Y00cIF1mZHBXDSyPoxGZYzNSrKQbhpIGjIWLznMfouQmnHYBA yOnpmsVDjZVJTG+gvTQEhEz1ENW32/Vk2j2C5KivaWstvQhp/824zUgG2OYJ4+gxhq6Rht YG5XwiigT3MT0ETCyXCzBAjnU3HaTngiepoRPvLtQx/zKVCXBitqJUssw2ar1Q== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1677787066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references;
	bh=RQaoWu8ziTxaQYkBHImtYGFT53DoX5McXDIC69zOSXc=;
	b=liDqf5RNe8L2vRWh6DMg0qe8spuli+CLZsev5BebWhi6aIE0/2dTfJCB3HiJP55XfbUNvq
	fj2V3mycATkGbXDg==
To: Petr Mladek
Cc: Sergey Senozhatsky , Steven Rostedt , Thomas Gleixner , linux-kernel@vger.kernel.org, Greg Kroah-Hartman
Subject: [PATCH printk v1 14/18] printk: nobkl: Provide functions for atomic write enforcement
Date: Thu, 2 Mar 2023 21:02:14 +0106
Message-Id: <20230302195618.156940-15-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

Threaded printk is the preferred mechanism to tame the noisiness of
printk, but WARN/OOPS/PANIC require printing out immediately since the
printer threads might not be able to run.

Add per CPU state to denote the priority/urgency of the output and
provide functions to flush the printk backlog for priority elevated
contexts and when the printing threads are not available (such as
early boot).

Note that when a CPU is in a priority elevated state, flushing only
occurs when dropping back to a lower priority. This allows the full
set of printk records (WARN/OOPS/PANIC output) to be stored in the
ringbuffer before beginning to flush the backlog.
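For illustration only, the enter/exit bookkeeping described above can
be modelled in plain userspace C. This is a rough sketch under stated
assumptions, not the kernel implementation: it models a single CPU, the
sketch_* names are hypothetical, and the "flushes" counter merely
stands in for the real backlog flush. What it tries to show is that
printk() calls inside an elevated section do not flush on their own,
and that only the outermost exit at the elevated priority flushes.

```c
#include <assert.h>

/* Priority levels as used by this series; CONS_PRIO_MAX is the count. */
enum cons_prio {
	CONS_PRIO_NONE = 0,
	CONS_PRIO_NORMAL,
	CONS_PRIO_EMERGENCY,
	CONS_PRIO_PANIC,
	CONS_PRIO_MAX,
};

/* Models the per CPU state; "flushes" is a hypothetical counter. */
struct sketch_cpu_state {
	enum cons_prio prio;
	int nesting[CONS_PRIO_MAX];
	int flushes;
};

/* Single-CPU model, zero-initialized (prio == CONS_PRIO_NONE). */
static struct sketch_cpu_state sketch_state;

/* Models the printk() path: nonzero if the caller would flush now. */
static int sketch_printk_flushes_now(void)
{
	/* Elevated sections only store records; flushing is deferred. */
	return sketch_state.prio <= CONS_PRIO_NORMAL;
}

/* Raise the priority if needed and count the nesting at that level. */
static enum cons_prio sketch_atomic_enter(enum cons_prio prio)
{
	enum cons_prio prev_prio = sketch_state.prio;

	assert(prio > CONS_PRIO_NONE && prio < CONS_PRIO_MAX);

	if (prev_prio < prio)
		sketch_state.prio = prio;
	sketch_state.nesting[sketch_state.prio]++;

	return prev_prio;
}

/* Flush from the outermost section only, then restore the priority. */
static void sketch_atomic_exit(enum cons_prio prev_prio)
{
	if (sketch_state.prio <= CONS_PRIO_NORMAL ||
	    sketch_state.nesting[sketch_state.prio] == 1)
		sketch_state.flushes++;

	sketch_state.nesting[sketch_state.prio]--;
	sketch_state.prio = prev_prio;
}
```

A caller pairs the two functions by feeding the value returned from
sketch_atomic_enter() into the matching sketch_atomic_exit(). A WARN
nested inside another WARN therefore leaves the backlog alone until the
outer one exits.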
Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 include/linux/console.h      |   8 ++
 include/linux/printk.h       |   9 ++
 kernel/printk/printk.c       |  35 +++--
 kernel/printk/printk_nobkl.c | 240 +++++++++++++++++++++++++++++++++++
 4 files changed, 283 insertions(+), 9 deletions(-)

diff --git a/include/linux/console.h b/include/linux/console.h
index 089a94a3dd8d..afc683e722bb 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -494,6 +494,14 @@ extern bool console_exit_unsafe(struct cons_write_context *wctxt);
 extern bool console_try_acquire(struct cons_write_context *wctxt);
 extern bool console_release(struct cons_write_context *wctxt);
 
+#ifdef CONFIG_PRINTK
+extern enum cons_prio cons_atomic_enter(enum cons_prio prio);
+extern void cons_atomic_exit(enum cons_prio prio, enum cons_prio prev_prio);
+#else
+static inline enum cons_prio cons_atomic_enter(enum cons_prio prio) { return CONS_PRIO_NONE; }
+static inline void cons_atomic_exit(enum cons_prio prio, enum cons_prio prev_prio) { }
+#endif
+
 extern int console_set_on_cmdline;
 extern struct console *early_console;
 
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 8ef499ab3c1e..d2aafc79b611 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -139,6 +139,7 @@ void early_printk(const char *s, ...)
{ } #endif =20 struct dev_printk_info; +struct cons_write_context; =20 #ifdef CONFIG_PRINTK asmlinkage __printf(4, 0) @@ -192,6 +193,8 @@ void show_regs_print_info(const char *log_lvl); extern asmlinkage void dump_stack_lvl(const char *log_lvl) __cold; extern asmlinkage void dump_stack(void) __cold; void printk_trigger_flush(void); +extern void cons_atomic_flush(struct cons_write_context *printk_caller_wct= xt, + bool skip_unsafe); #else static inline __printf(1, 0) int vprintk(const char *s, va_list args) @@ -271,6 +274,12 @@ static inline void dump_stack(void) static inline void printk_trigger_flush(void) { } + +static inline void cons_atomic_flush(struct cons_write_context *printk_cal= ler_wctxt, + bool skip_unsafe) +{ +} + #endif =20 #ifdef CONFIG_SMP diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 19f682fcae10..015c240f9f04 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2304,6 +2304,7 @@ asmlinkage int vprintk_emit(int facility, int level, const struct dev_printk_info *dev_info, const char *fmt, va_list args) { + struct cons_write_context wctxt =3D { }; int printed_len; bool in_sched =3D false; =20 @@ -2324,16 +2325,25 @@ asmlinkage int vprintk_emit(int facility, int level, =20 printed_len =3D vprintk_store(facility, level, dev_info, fmt, args); =20 + /* + * The caller may be holding system-critical or + * timing-sensitive locks. Disable preemption during + * printing of all remaining records to all consoles so that + * this context can return as soon as possible. Hopefully + * another printk() caller will take over the printing. + */ + preempt_disable(); + + /* + * Flush the non-BKL consoles. This only leads to direct atomic + * printing for non-BKL consoles that do not have a printer + * thread available. Otherwise the printer thread will perform + * the printing. + */ + cons_atomic_flush(&wctxt, true); + /* If called from the scheduler, we can not call up(). 
*/ if (!in_sched && have_bkl_console) { - /* - * The caller may be holding system-critical or - * timing-sensitive locks. Disable preemption during - * printing of all remaining records to all consoles so that - * this context can return as soon as possible. Hopefully - * another printk() caller will take over the printing. - */ - preempt_disable(); /* * Try to acquire and then immediately release the console * semaphore. The release will print out buffers. With the @@ -2342,9 +2352,10 @@ asmlinkage int vprintk_emit(int facility, int level, */ if (console_trylock_spinning()) console_unlock(); - preempt_enable(); } =20 + preempt_enable(); + cons_wake_threads(); if (in_sched) defer_console_output(); @@ -3943,6 +3954,12 @@ void defer_console_output(void) =20 void printk_trigger_flush(void) { + struct cons_write_context wctxt =3D { }; + + preempt_disable(); + cons_atomic_flush(&wctxt, true); + preempt_enable(); + cons_wake_threads(); defer_console_output(); } diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c index 890fc8d44f1d..001a1ca9793f 100644 --- a/kernel/printk/printk_nobkl.c +++ b/kernel/printk/printk_nobkl.c @@ -1399,6 +1399,246 @@ void cons_wake_threads(void) console_srcu_read_unlock(cookie); } =20 +/** + * struct cons_cpu_state - Per CPU printk context state + * @prio: The current context priority level + * @nesting: Per priority nest counter + */ +struct cons_cpu_state { + enum cons_prio prio; + int nesting[CONS_PRIO_MAX]; +}; + +static DEFINE_PER_CPU(struct cons_cpu_state, cons_pcpu_state); +static struct cons_cpu_state early_cons_pcpu_state __initdata; + +/** + * cons_get_cpu_state - Get the per CPU console state pointer + * + * Returns either a pointer to the per CPU state of the current CPU or to + * the init data state during early boot. 
+ */ +static __ref struct cons_cpu_state *cons_get_cpu_state(void) +{ + if (!printk_percpu_data_ready()) + return &early_cons_pcpu_state; + + return this_cpu_ptr(&cons_pcpu_state); +} + +/** + * cons_get_wctxt - Get the write context for atomic printing + * @con: Console to operate on + * @prio: Priority of the context + * + * Returns either the per CPU context or the builtin context for + * early boot. + */ +static __ref struct cons_write_context *cons_get_wctxt(struct console *con, + enum cons_prio prio) +{ + if (!con->pcpu_data) + return &early_cons_ctxt_data.wctxt[prio]; + + return &this_cpu_ptr(con->pcpu_data)->wctxt[prio]; +} + +/** + * cons_atomic_try_acquire - Try to acquire the console for atomic printing + * @con: The console to acquire + * @ctxt: The console context instance to work on + * @prio: The priority of the current context + */ +static bool cons_atomic_try_acquire(struct console *con, struct cons_conte= xt *ctxt, + enum cons_prio prio, bool skip_unsafe) +{ + memset(ctxt, 0, sizeof(*ctxt)); + ctxt->console =3D con; + ctxt->spinwait_max_us =3D 2000; + ctxt->prio =3D prio; + ctxt->spinwait =3D 1; + + /* Try to acquire it directly or via a friendly handover */ + if (cons_try_acquire(ctxt)) + return true; + + /* Investigate whether a hostile takeover is due */ + if (ctxt->old_state.cur_prio >=3D prio) + return false; + + if (!ctxt->old_state.unsafe || !skip_unsafe) + ctxt->hostile =3D 1; + return cons_try_acquire(ctxt); +} + +/** + * cons_atomic_flush_con - Flush one console in atomic mode + * @wctxt: The write context struct to use for this context + * @con: The console to flush + * @prio: The priority of the current context + * @skip_unsafe: True, to avoid unsafe hostile takeovers + */ +static void cons_atomic_flush_con(struct cons_write_context *wctxt, struct= console *con, + enum cons_prio prio, bool skip_unsafe) +{ + struct cons_context *ctxt =3D &ACCESS_PRIVATE(wctxt, ctxt); + bool wake_thread =3D false; + short flags; + + if 
(!cons_atomic_try_acquire(con, ctxt, prio, skip_unsafe)) + return; + + do { + flags =3D console_srcu_read_flags(con); + + if (!console_is_usable(con, flags)) + break; + + /* + * For normal prio messages let the printer thread handle + * the printing if it is available. + */ + if (prio <=3D CONS_PRIO_NORMAL && con->kthread) { + wake_thread =3D true; + break; + } + + /* + * cons_emit_record() returns false when the console was + * handed over or taken over. In both cases the context is + * no longer valid. + */ + if (!cons_emit_record(wctxt)) + return; + } while (ctxt->backlog); + + cons_release(ctxt); + + if (wake_thread && atomic_read(&con->kthread_waiting)) + irq_work_queue(&con->irq_work); +} + +/** + * cons_atomic_flush - Flush consoles in atomic mode if required + * @printk_caller_wctxt: The write context struct to use for this + * context (for printk() context only) + * @skip_unsafe: True, to avoid unsafe hostile takeovers + */ +void cons_atomic_flush(struct cons_write_context *printk_caller_wctxt, boo= l skip_unsafe) +{ + struct cons_write_context *wctxt; + struct cons_cpu_state *cpu_state; + struct console *con; + short flags; + int cookie; + + cpu_state =3D cons_get_cpu_state(); + + /* + * When in an elevated priority, the printk() calls are not + * individually flushed. This is to allow the full output to + * be dumped to the ringbuffer before starting with printing + * the backlog. + */ + if (cpu_state->prio > CONS_PRIO_NORMAL && printk_caller_wctxt) + return; + + /* + * Let the outermost write of this priority print. This avoids + * nasty hackery for nested WARN() where the printing itself + * generates one. + * + * cpu_state->prio <=3D CONS_PRIO_NORMAL is not subject to nesting + * and can proceed in order to allow atomic printing when consoles + * do not have a printer thread. 
+	 */
+	if (cpu_state->prio > CONS_PRIO_NORMAL &&
+	    cpu_state->nesting[cpu_state->prio] != 1)
+		return;
+
+	cookie = console_srcu_read_lock();
+	for_each_console_srcu(con) {
+		if (!con->write_atomic)
+			continue;
+
+		flags = console_srcu_read_flags(con);
+
+		if (!console_is_usable(con, flags))
+			continue;
+
+		if (cpu_state->prio > CONS_PRIO_NORMAL || !con->kthread) {
+			if (printk_caller_wctxt)
+				wctxt = printk_caller_wctxt;
+			else
+				wctxt = cons_get_wctxt(con, cpu_state->prio);
+			cons_atomic_flush_con(wctxt, con, cpu_state->prio, skip_unsafe);
+		}
+	}
+	console_srcu_read_unlock(cookie);
+}
+
+/**
+ * cons_atomic_enter - Enter a context that enforces atomic printing
+ * @prio: Priority of the context
+ *
+ * Returns: The previous priority that needs to be fed into
+ *	    the corresponding cons_atomic_exit()
+ */
+enum cons_prio cons_atomic_enter(enum cons_prio prio)
+{
+	struct cons_cpu_state *cpu_state;
+	enum cons_prio prev_prio;
+
+	migrate_disable();
+	cpu_state = cons_get_cpu_state();
+
+	prev_prio = cpu_state->prio;
+	if (prev_prio < prio)
+		cpu_state->prio = prio;
+
+	/*
+	 * Increment the nesting on @cpu_state->prio so a WARN()
+	 * nested into a panic printout does not attempt to
+	 * scribble state.
+	 */
+	cpu_state->nesting[cpu_state->prio]++;
+
+	return prev_prio;
+}
+
+/**
+ * cons_atomic_exit - Exit a context that enforces atomic printing
+ * @prio: Priority of the context to leave
+ * @prev_prio: Priority of the previous context for restore
+ *
+ * @prev_prio is the priority returned by the corresponding cons_atomic_enter().
+ */
+void cons_atomic_exit(enum cons_prio prio, enum cons_prio prev_prio)
+{
+	struct cons_cpu_state *cpu_state;
+
+	cons_atomic_flush(NULL, true);
+
+	cpu_state = cons_get_cpu_state();
+
+	if (cpu_state->prio == CONS_PRIO_PANIC)
+		cons_atomic_flush(NULL, false);
+
+	/*
+	 * Undo the nesting of cons_atomic_enter() at the CPU state
+	 * priority.
+	 */
+	cpu_state->nesting[cpu_state->prio]--;
+
+	/*
+	 * Restore the previous priority, which was returned by
+	 * cons_atomic_enter().
+	 */
+	cpu_state->prio = prev_prio;
+
+	migrate_enable();
+}
+
 /**
  * cons_kthread_stop - Stop a printk thread
  * @con: Console to operate on
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org
Subject: [PATCH printk v1 15/18] printk: nobkl: Stop threads on shutdown/reboot
Date: Thu, 2 Mar 2023 21:02:15 +0106
Message-Id: <20230302195618.156940-16-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>

Register a syscore_ops shutdown function to stop all threaded
printers on shutdown/reboot. This allows printk to transition back
to atomic printing in order to provide a robust mechanism for
outputting the final messages.

Signed-off-by: John Ogness
Tested-by: Daniel Thompson
---
 kernel/printk/printk_nobkl.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c
index 001a1ca9793f..53989c8f1dbc 100644
--- a/kernel/printk/printk_nobkl.c
+++ b/kernel/printk/printk_nobkl.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include "printk_ringbuffer.h"
 #include "internal.h"
 /*
@@ -1763,3 +1764,33 @@ void cons_nobkl_cleanup(struct console *con)
 	cons_state_set(con, CON_STATE_REQ, &state);
 	cons_free_percpu_data(con);
 }
+
+/**
+ * printk_kthread_shutdown - shutdown all threaded printers
+ *
+ * On system shutdown all threaded printers are stopped. This allows printk
+ * to transition back to atomic printing, thus providing a robust mechanism
+ * for the final shutdown/reboot messages to be output.
+ */
+static void printk_kthread_shutdown(void)
+{
+	struct console *con;
+
+	console_list_lock();
+	for_each_console(con) {
+		if (con->flags & CON_NO_BKL)
+			cons_kthread_stop(con);
+	}
+	console_list_unlock();
+}
+
+static struct syscore_ops printk_syscore_ops = {
+	.shutdown = printk_kthread_shutdown,
+};
+
+static int __init printk_init_ops(void)
+{
+	register_syscore_ops(&printk_syscore_ops);
+	return 0;
+}
+device_initcall(printk_init_ops);
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org, Andrew Morton, "Guilherme G. Piccoli", Luis Chamberlain, David Gow, Tiezhu Yang, Daniel Vetter, tangmeng
Subject: [PATCH printk v1 16/18] kernel/panic: Add atomic write enforcement to warn/panic
Date: Thu, 2 Mar 2023 21:02:16 +0106
Message-Id: <20230302195618.156940-17-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>

From: Thomas Gleixner

Invoke the atomic write enforcement functions for warn/panic to
ensure that the information gets out to the consoles.

For the panic case, add explicit intermediate atomic flush calls to
ensure immediate flushing at important points. Otherwise the atomic
flushing only occurs when dropping out of the elevated priority,
which for panic may never happen.

It is important to note that if there are any legacy consoles
registered, they will be attempting to directly print from the
printk-caller context, which may jeopardize the reliability of the
atomic consoles.
Optimally there should be no legacy consoles registered.

Co-developed-by: John Ogness
Signed-off-by: John Ogness
Signed-off-by: Thomas Gleixner (Intel)
Tested-by: Daniel Thompson
---
 kernel/panic.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/panic.c b/kernel/panic.c
index da323209f583..db9834fbdf26 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -209,6 +209,7 @@ static void panic_print_sys_info(bool console_flush)
  */
 void panic(const char *fmt, ...)
 {
+	enum cons_prio prev_prio;
 	static char buf[1024];
 	va_list args;
 	long i, i_next = 0, len;
@@ -256,6 +257,8 @@ void panic(const char *fmt, ...)
 	if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
 		panic_smp_self_stop();
 
+	prev_prio = cons_atomic_enter(CONS_PRIO_PANIC);
+
 	console_verbose();
 	bust_spinlocks(1);
 	va_start(args, fmt);
@@ -329,6 +332,8 @@ void panic(const char *fmt, ...)
 	if (_crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
+	cons_atomic_flush(NULL, true);
+
 	console_unblank();
 
 	/*
@@ -353,6 +358,7 @@ void panic(const char *fmt, ...)
 		 * We can't use the "normal" timers since we just panicked.
 		 */
 		pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
+		cons_atomic_flush(NULL, true);
 
 		for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) {
 			touch_nmi_watchdog();
@@ -371,6 +377,7 @@ void panic(const char *fmt, ...)
 		 */
 		if (panic_reboot_mode != REBOOT_UNDEFINED)
 			reboot_mode = panic_reboot_mode;
+		cons_atomic_flush(NULL, true);
 		emergency_restart();
 	}
 #ifdef __sparc__
@@ -383,12 +390,16 @@ void panic(const char *fmt, ...)
 	}
 #endif
 #if defined(CONFIG_S390)
+	cons_atomic_flush(NULL, true);
 	disabled_wait();
 #endif
 	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
 
 	/* Do not scroll important messages printed above */
 	suppress_printk = 1;
+
+	cons_atomic_exit(CONS_PRIO_PANIC, prev_prio);
+
 	local_irq_enable();
 	for (i = 0; ; i += PANIC_TIMER_STEP) {
 		touch_softlockup_watchdog();
@@ -599,6 +610,10 @@ struct warn_args {
 void __warn(const char *file, int line, void *caller, unsigned taint,
 	    struct pt_regs *regs, struct warn_args *args)
 {
+	enum cons_prio prev_prio;
+
+	prev_prio = cons_atomic_enter(CONS_PRIO_EMERGENCY);
+
 	disable_trace_on_warning();
 
 	if (file)
@@ -630,6 +645,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
 
 	/* Just a warning, don't kill lockdep. */
 	add_taint(taint, LOCKDEP_STILL_OK);
+
+	cons_atomic_exit(CONS_PRIO_EMERGENCY, prev_prio);
 }
 
 #ifndef __WARN_FLAGS
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org, "Paul E. McKenney", Frederic Weisbecker, Neeraj Upadhyay, Josh Triplett, Mathieu Desnoyers, Lai Jiangshan, Joel Fernandes, rcu@vger.kernel.org
Subject: [PATCH printk v1 17/18] rcu: Add atomic write enforcement for rcu stalls
Date: Thu, 2 Mar 2023 21:02:17 +0106
Message-Id: <20230302195618.156940-18-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>

Invoke the atomic write enforcement functions for rcu stalls to
ensure that the information gets out to the consoles.

It is important to note that if there are any legacy consoles
registered, they will be attempting to directly print from the
printk-caller context, which may jeopardize the reliability of the
atomic consoles. Optimally there should be no legacy consoles
registered.

Signed-off-by: John Ogness
Tested-by: Daniel Thompson
---
 kernel/rcu/tree_stall.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 5653560573e2..25207a213e7a 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -8,6 +8,7 @@
  */
 
 #include
+#include
 
 //////////////////////////////////////////////////////////////////////////////
 //
@@ -551,6 +552,7 @@ static void rcu_check_gp_kthread_expired_fqs_timer(void)
 
 static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 {
+	enum cons_prio prev_prio;
 	int cpu;
 	unsigned long flags;
 	unsigned long gpa;
@@ -566,6 +568,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	if (rcu_stall_is_suppressed())
 		return;
 
+	prev_prio = cons_atomic_enter(CONS_PRIO_EMERGENCY);
+
 	/*
 	 * OK, time to rat on our buddy...
 	 * See Documentation/RCU/stallwarn.rst for info on how to debug
@@ -620,6 +624,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	panic_on_rcu_stall();
 
 	rcu_force_quiescent_state(); /* Kick them all. */
+
+	cons_atomic_exit(CONS_PRIO_EMERGENCY, prev_prio);
 }
 
 static void print_cpu_stall(unsigned long gps)
-- 
2.30.2

From nobody Sat Apr 11 07:08:25 2026
From: John Ogness
To: Petr Mladek
Cc: Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner, linux-kernel@vger.kernel.org
Subject: [PATCH printk v1 18/18] printk: Perform atomic flush in console_flush_on_panic()
Date: Thu, 2 Mar 2023 21:02:18 +0106
Message-Id: <20230302195618.156940-19-john.ogness@linutronix.de>
In-Reply-To: <20230302195618.156940-1-john.ogness@linutronix.de>
References: <20230302195618.156940-1-john.ogness@linutronix.de>

Typically the panic() function will take care of atomic flushing the
non-BKL consoles on panic. However, there are several users of
console_flush_on_panic() outside of panic().

Also perform atomic flushing in console_flush_on_panic(). A new
function cons_force_seq() is implemented to support the
mode=CONSOLE_REPLAY_ALL feature.

Signed-off-by: John Ogness
Tested-by: Daniel Thompson
---
 kernel/printk/internal.h     |  2 ++
 kernel/printk/printk.c       | 28 ++++++++++++++++++++++------
 kernel/printk/printk_nobkl.c | 24 ++++++++++++++++++++++++
 3 files changed, 48 insertions(+), 6 deletions(-)

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index a417e3992b7a..f147ca386afa 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -79,6 +79,7 @@ bool cons_nobkl_init(struct console *con);
 bool cons_alloc_percpu_data(struct console *con);
 void cons_kthread_create(struct console *con);
 void cons_wake_threads(void);
+void cons_force_seq(struct console *con, u64 seq);
 
 /*
  * Check if the given console is currently capable and allowed to print
@@ -148,6 +149,7 @@ static inline void cons_kthread_create(struct console *con) { }
 static inline bool printk_percpu_data_ready(void) { return false; }
 static inline bool cons_nobkl_init(struct console *con) { return true; }
 static inline void cons_nobkl_cleanup(struct console *con) { }
+static inline void cons_force_seq(struct console *con, u64 seq) { }
 
 #endif /* CONFIG_PRINTK */
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 015c240f9f04..9a8ba8b3dca5 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3160,6 +3160,28 @@ void console_unblank(void)
  */
 void console_flush_on_panic(enum con_flush_mode mode)
 {
+	struct console *c;
+	short flags;
+	int cookie;
+	u64 seq;
+
+	seq = prb_first_valid_seq(prb);
+
+	/*
+	 * Safely flush the atomic consoles before trying to flush any
+	 * BKL/legacy consoles.
+	 */
+	if (mode == CONSOLE_REPLAY_ALL) {
+		cookie = console_srcu_read_lock();
+		for_each_console_srcu(c) {
+			flags = console_srcu_read_flags(c);
+			if (flags & CON_NO_BKL)
+				cons_force_seq(c, seq);
+		}
+		console_srcu_read_unlock(cookie);
+	}
+	cons_atomic_flush(NULL, true);
+
 	if (!have_bkl_console)
 		return;
 
@@ -3174,12 +3196,6 @@ void console_flush_on_panic(enum con_flush_mode mode)
 	console_may_schedule = 0;
 
 	if (mode == CONSOLE_REPLAY_ALL) {
-		struct console *c;
-		int cookie;
-		u64 seq;
-
-		seq = prb_first_valid_seq(prb);
-
 		cookie = console_srcu_read_lock();
 		for_each_console_srcu(c) {
 			/*
diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c
index 53989c8f1dbc..ac2ba785500e 100644
--- a/kernel/printk/printk_nobkl.c
+++ b/kernel/printk/printk_nobkl.c
@@ -233,6 +233,30 @@ static void cons_seq_init(struct console *con)
 #endif
 }
 
+/**
+ * cons_force_seq - Force a specified sequence number for a console
+ * @con: Console to work on
+ * @seq: Sequence number to force
+ *
+ * This function is only intended to be used in emergency situations. In
+ * particular: console_flush_on_panic(CONSOLE_REPLAY_ALL)
+ */
+void cons_force_seq(struct console *con, u64 seq)
+{
+#ifdef CONFIG_64BIT
+	struct cons_state old;
+	struct cons_state new;
+
+	do {
+		cons_state_read(con, CON_STATE_CUR, &old);
+		copy_bit_state(new, old);
+		new.seq = seq;
+	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &old, &new));
+#else
+	atomic_set(&ACCESS_PRIVATE(con, atomic_seq), seq);
+#endif
+}
+
 static inline u64 cons_expand_seq(u64 seq)
 {
 	u64 rbseq;
-- 
2.30.2
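
Illustration only, not part of the series: the 64-bit branch of
cons_force_seq() above is a read/modify/compare-exchange retry loop that
replaces the sequence field while preserving the other state bits. A
minimal userspace sketch of that pattern with C11 atomics might look like
the following. The packed layout (flag bits in the low 32 bits, sequence
in the high 32 bits) and all demo_* names are assumptions made for this
sketch; the real struct cons_state layout and the cons_state_read(),
copy_bit_state() and cons_state_try_cmpxchg() helpers are defined
elsewhere in the series and are not reproduced here.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical layout for illustration only: the low 32 bits hold flag
 * bits, the high 32 bits hold the sequence number.
 */
#define DEMO_SEQ_SHIFT	32
#define DEMO_BITS_MASK	UINT64_C(0xffffffff)

struct demo_console {
	_Atomic uint64_t state;
};

/* Pack flag bits and a sequence number into one 64-bit state word. */
static uint64_t demo_pack(uint32_t bits, uint32_t seq)
{
	return ((uint64_t)seq << DEMO_SEQ_SHIFT) | bits;
}

static uint32_t demo_bits(uint64_t state)
{
	return (uint32_t)(state & DEMO_BITS_MASK);
}

static uint32_t demo_seq(uint64_t state)
{
	return (uint32_t)(state >> DEMO_SEQ_SHIFT);
}

/*
 * Force a new sequence number while keeping the flag bits intact.
 * On failure, atomic_compare_exchange_weak() stores the current value
 * in 'old', so each retry rebuilds 'next' from fresh flag bits.
 */
static void demo_force_seq(struct demo_console *con, uint32_t seq)
{
	uint64_t old = atomic_load(&con->state);
	uint64_t next;

	do {
		next = demo_pack(demo_bits(old), seq);
	} while (!atomic_compare_exchange_weak(&con->state, &old, next));
}
```

The retry loop matters because another context may change the flag bits
between the read and the exchange; rebuilding the new value from the
freshly observed state avoids overwriting such a concurrent update.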