Threaded console printing does not take into consideration that boot
consoles may be accessing the same hardware as normal consoles and thus
must not be called in parallel.
Since it is currently not possible to identify which consoles are
accessing the same hardware, delay threaded console printing activation
until it is known that there are no boot consoles registered.
Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
Link: https://lore.kernel.org/r/2a82eae7-a256-f70c-fd82-4e510750906e@samsung.com
Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
Reported-by: Marek Behún <kabel@kernel.org>
[john.ogness@linutronix.de: Better description of the problem.]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Tested-by: Marek Behún <kabel@kernel.org>
---
Changes against v1:
+ Updated comment and commit message [Linus,John]
+ Added Tested-by [Marek]
kernel/printk/printk.c | 39 ++++++++++++++++++++++++++-------------
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b095fb5f5f61..157a3e8c01bb 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3551,6 +3551,19 @@ void __init console_init(void)
 	}
 }

+static int __init printk_activate_kthreads(void)
+{
+	struct console *con;
+
+	console_lock();
+	printk_kthreads_available = true;
+	for_each_console(con)
+		printk_start_kthread(con);
+	console_unlock();
+
+	return 0;
+}
+
 /*
  * Some boot consoles access data that is in the init section and which will
  * be discarded after the initcalls have been run. To make sure that no code
@@ -3567,6 +3580,7 @@ void __init console_init(void)
  */
 static int __init printk_late_init(void)
 {
+	bool no_bootcon = true;
 	struct console *con;
 	int ret;

@@ -3588,7 +3602,10 @@ static int __init printk_late_init(void)
 			pr_warn("bootconsole [%s%d] uses init memory and must be disabled even before the real one is ready\n",
 				con->name, con->index);
 			unregister_console(con);
+			continue;
 		}
+
+		no_bootcon = false;
 	}
 	ret = cpuhp_setup_state_nocalls(CPUHP_PRINTK_DEAD, "printk:dead", NULL,
 					console_cpu_notify);
@@ -3597,23 +3614,19 @@ static int __init printk_late_init(void)
 					console_cpu_notify, NULL);
 	WARN_ON(ret < 0);
 	printk_sysctl_init();
-	return 0;
-}
-late_initcall(printk_late_init);
-
-static int __init printk_activate_kthreads(void)
-{
-	struct console *con;

-	console_lock();
-	printk_kthreads_available = true;
-	for_each_console(con)
-		printk_start_kthread(con);
-	console_unlock();
+	/*
+	 * Boot consoles may be accessing the same hardware as normal
+	 * consoles and thus must not be called in parallel. Therefore
+	 * only activate threaded console printing if it is known that
+	 * there are no boot consoles registered.
+	 */
+	if (no_bootcon)
+		printk_activate_kthreads();

 	return 0;
 }
-early_initcall(printk_activate_kthreads);
+late_initcall(printk_late_init);

 #if defined CONFIG_PRINTK
 /* If @con is specified, only wait for that console. Otherwise wait for all. */
--
2.35.3
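For readers who want to follow the decision logic outside the kernel tree, here is a minimal userspace sketch of what the patch does at the end of printk_late_init(). It is only a model under stated assumptions: `struct model_console`, its fields, and `model_late_init()` are hypothetical stand-ins and not kernel APIs. The booleans replace the real CON_BOOT flag test and the init-section check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct console; only what the model needs. */
struct model_console {
	const char *name;
	bool is_boot;		/* models (flags & CON_BOOT) */
	bool uses_init_mem;	/* models the init-section intersection check */
	bool registered;
	bool has_kthread;
};

/*
 * Models the patched printk_late_init(): boot consoles that use init
 * memory are unregistered, and any boot console that stays registered
 * blocks the activation of the printing kthreads.
 * Returns true if the kthreads were started.
 */
static bool model_late_init(struct model_console *cons, size_t n)
{
	bool no_bootcon = true;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!cons[i].registered || !cons[i].is_boot)
			continue;
		if (cons[i].uses_init_mem) {
			cons[i].registered = false;	/* models unregister_console() */
			continue;
		}
		no_bootcon = false;	/* a boot console is still registered */
	}

	if (!no_bootcon)
		return false;

	/* Models printk_activate_kthreads(): one kthread per console. */
	for (i = 0; i < n; i++)
		if (cons[i].registered)
			cons[i].has_kthread = true;
	return true;
}
```

In this model, a boot console that survives the init-memory check keeps every console in direct printing mode, which is the conservative behavior the commit message describes.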
Hi Petr,
On Tue, Jun 21, 2022 at 11:09 AM Petr Mladek <pmladek@suse.com> wrote:
> Threaded console printing does not take into consideration that boot
> consoles may be accessing the same hardware as normal consoles and thus
> must not be called in parallel.
>
> Since it is currently not possible to identify which consoles are
> accessing the same hardware, delay threaded console printing activation
> until it is known that there are no boot consoles registered.
>
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Link: https://lore.kernel.org/r/2a82eae7-a256-f70c-fd82-4e510750906e@samsung.com
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Reported-by: Marek Behún <kabel@kernel.org>
> [john.ogness@linutronix.de: Better description of the problem.]
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> Tested-by: Marek Behún <kabel@kernel.org>
Thanks, this restores the lost printing of "38000000.serial: ttySIF0
at MMIO ..." on SiPEED MAiXBiT, so
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
Hi Petr,
On Tue, 21 Jun 2022 at 18:09, Petr Mladek <pmladek@suse.com> wrote:
>
> Threaded console printing does not take into consideration that boot
> consoles may be accessing the same hardware as normal consoles and thus
> must not be called in parallel.
>
> Since it is currently not possible to identify which consoles are
> accessing the same hardware, delay threaded console printing activation
> until it is known that there are no boot consoles registered.
>
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Link: https://lore.kernel.org/r/2a82eae7-a256-f70c-fd82-4e510750906e@samsung.com
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Reported-by: Marek Behún <kabel@kernel.org>
> [john.ogness@linutronix.de: Better description of the problem.]
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> Tested-by: Marek Behún <kabel@kernel.org>
> ---
> Changes against v1:
The lockups on boot seem to be gone on my boards with this patch.
So FWIW:
Tested-by: Daniel Palmer <daniel@thingy.jp>
On Tue, Jun 21, 2022 at 6:42 AM Daniel Palmer <daniel@0x0f.com> wrote:
>
> The lockups on boot seem to be gone on my boards with this patch.
Good.
Petr, was this all the reports sorted out? Sounds like we can keep the
kernel thread model.
Linus
On Tue 2022-06-21 07:45:09, Linus Torvalds wrote:
> On Tue, Jun 21, 2022 at 6:42 AM Daniel Palmer <daniel@0x0f.com> wrote:
> >
> > The lockups on boot seem to be gone on my boards with this patch.
>
> Good.
>
> Petr, was this all the reports sorted out? Sounds like we can keep the
> kernel thread model.

Yes, it seems that we fixed all the reports where boot failed or
the console was messed up or silent.

There is one more issue, see
https://lore.kernel.org/r/YqyANveL50uxupfQ@zx2c4.com
It is about synchronization between messages printed by userspace
and kernel consoles. The synchronization was never guaranteed.
I think that it is not an argument to remove the kthreads. They are
really needed, especially for huge systems, noisy debugging, or RT
where softlockups really hurt.

My opinion is that we might easily support 3 printk modes,
switched on the command line:

1. Use printk console kthreads when the system is normally
   running. It makes printk() predictable and safe. We do our
   best to switch to the direct mode when the kthreads are not
   reliable, for example, panic, suspend, reboot.

   IMHO, it should be the default, especially for production systems.

2. Use an atomic console in fully synchronous mode. It is
   inspired by a patch from Peter Zijlstra. It calls
   the (serial) console directly from printk() and uses
   a CPU-reentrant lock to serialize the messages between CPUs.

   AFAIK, Peter and some others use this approach to debug
   some nasty bugs in the scheduler, NMI, and early boot, when
   even the legacy code using console_lock() is not reliable
   enough.

   John Ogness is working on the atomic serial console. It
   would allow integrating this mode in a clean way.

   It is not usable for production because printk() might
   slow down the entire system.

3. Use the legacy code that tries to call consoles from
   printk() via console_trylock().

   We need this code anyway for the early boot, suspend, reboot,
   panic.

   It would be used for debugging nasty bugs, like the 2nd mode,
   on systems without a serial console. It will be pretty hard
   to create a lockless variant for more complicated consoles.

I am not happy that we need more modes. But I think that they
all have a good justification.

Best Regards,
Petr
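A rough illustration of how the three modes above could be selected from a single command-line value. This is only a sketch: the enum, the function name, and the strings "atomic" and "legacy" are hypothetical, since the mail describes the modes but does not name a parameter or an interface.

```c
#include <string.h>

/* Hypothetical names; the mail only describes the modes themselves. */
enum printk_mode {
	PRINTK_MODE_KTHREAD,	/* 1: console kthreads, direct mode in emergencies */
	PRINTK_MODE_ATOMIC,	/* 2: atomic console, fully synchronous */
	PRINTK_MODE_LEGACY,	/* 3: consoles called via console_trylock() */
};

/*
 * Map a hypothetical command-line value to a mode. A missing or
 * unknown value falls back to the kthread mode, which the mail
 * suggests as the default.
 */
static enum printk_mode parse_printk_mode(const char *val)
{
	if (val && strcmp(val, "atomic") == 0)
		return PRINTK_MODE_ATOMIC;
	if (val && strcmp(val, "legacy") == 0)
		return PRINTK_MODE_LEGACY;
	return PRINTK_MODE_KTHREAD;
}
```

Falling back to the kthread mode on unknown input mirrors the stated preference that it be the default for production systems.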
On 2022-06-21, Petr Mladek <pmladek@suse.com> wrote:
> Threaded console printing does not take into consideration that boot
> consoles may be accessing the same hardware as normal consoles and thus
> must not be called in parallel.
>
> Since it is currently not possible to identify which consoles are
> accessing the same hardware, delay threaded console printing activation
> until it is known that there are no boot consoles registered.
>
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Link: https://lore.kernel.org/r/2a82eae7-a256-f70c-fd82-4e510750906e@samsung.com
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Reported-by: Marek Behún <kabel@kernel.org>
> [john.ogness@linutronix.de: Better description of the problem.]
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> Tested-by: Marek Behún <kabel@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
On (22/06/21 11:09), Petr Mladek wrote:
> Threaded console printing does not take into consideration that boot
> consoles may be accessing the same hardware as normal consoles and thus
> must not be called in parallel.
>
> Since it is currently not possible to identify which consoles are
> accessing the same hardware, delay threaded console printing activation
> until it is known that there are no boot consoles registered.
>
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Link: https://lore.kernel.org/r/2a82eae7-a256-f70c-fd82-4e510750906e@samsung.com
> Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> Reported-by: Marek Behún <kabel@kernel.org>
> [john.ogness@linutronix.de: Better description of the problem.]
> Signed-off-by: Petr Mladek <pmladek@suse.com>
> Tested-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
[..]
> +static int __init printk_activate_kthreads(void)
> +{
> + struct console *con;
> +
> + console_lock();
> + printk_kthreads_available = true;
> + for_each_console(con)
> + printk_start_kthread(con);
> + console_unlock();
> +
> + return 0;
> +}
> +
> /*
> * Some boot consoles access data that is in the init section and which will
> * be discarded after the initcalls have been run. To make sure that no code
> @@ -3567,6 +3580,7 @@ void __init console_init(void)
> */
> static int __init printk_late_init(void)
> {
> + bool no_bootcon = true;
> struct console *con;
> int ret;
>
> @@ -3588,7 +3602,10 @@ static int __init printk_late_init(void)
> pr_warn("bootconsole [%s%d] uses init memory and must be disabled even before the real one is ready\n",
> con->name, con->index);
> unregister_console(con);
> + continue;
> }
> +
> + no_bootcon = false;
> }
> ret = cpuhp_setup_state_nocalls(CPUHP_PRINTK_DEAD, "printk:dead", NULL,
> console_cpu_notify);
> @@ -3597,23 +3614,19 @@ static int __init printk_late_init(void)
> console_cpu_notify, NULL);
> WARN_ON(ret < 0);
> printk_sysctl_init();
> - return 0;
> -}
> -late_initcall(printk_late_init);
> -
> -static int __init printk_activate_kthreads(void)
> -{
> - struct console *con;
>
> - console_lock();
> - printk_kthreads_available = true;
> - for_each_console(con)
> - printk_start_kthread(con);
> - console_unlock();
> + /*
> + * Boot consoles may be accessing the same hardware as normal
> + * consoles and thus must not be called in parallel. Therefore
> + * only activate threaded console printing if it is known that
> + * there are no boot consoles registered.
> + */
> + if (no_bootcon)
> + printk_activate_kthreads();
A quick question. Here we can still have a bootcon which can be unregistered
later, right? Do you think it would make sense to check whether the printing
kthreads can be safely started, and start them if so (if no CON_BOOT is found
and the kthreads are not already created), at the end of unregister_console()?
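To make the idea concrete, a userspace sketch of the check could look like the following. All names here (`struct sketch_console`, `sketch_try_activate()`, `sketch_unregister()`) are hypothetical stand-ins, not kernel functions, and the model ignores locking entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct console. */
struct sketch_console {
	bool is_boot;		/* models (flags & CON_BOOT) */
	bool registered;
	bool has_kthread;
};

/* Models printk_kthreads_available. */
static bool sketch_kthreads_available;

/* Start the kthreads only once, and only if no boot console remains. */
static void sketch_try_activate(struct sketch_console *cons, size_t n)
{
	size_t i;

	if (sketch_kthreads_available)
		return;		/* kthreads already created */

	for (i = 0; i < n; i++)
		if (cons[i].registered && cons[i].is_boot)
			return;	/* a CON_BOOT console is still registered */

	sketch_kthreads_available = true;
	for (i = 0; i < n; i++)
		if (cons[i].registered)
			cons[i].has_kthread = true;
}

/* Models the proposed check at the end of unregister_console(). */
static void sketch_unregister(struct sketch_console *cons, size_t n, size_t idx)
{
	cons[idx].registered = false;
	sketch_try_activate(cons, n);
}
```

With this shape, the kthreads would start as soon as the last boot console goes away, instead of only when none is registered at late_initcall time.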
On Tue 2022-06-21 18:16:18, Sergey Senozhatsky wrote:
> On (22/06/21 11:09), Petr Mladek wrote:
> > Threaded console printing does not take into consideration that boot
> > consoles may be accessing the same hardware as normal consoles and thus
> > must not be called in parallel.
> >
> > Since it is currently not possible to identify which consoles are
> > accessing the same hardware, delay threaded console printing activation
> > until it is known that there are no boot consoles registered.
> >
> > Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> > Link: https://lore.kernel.org/r/2a82eae7-a256-f70c-fd82-4e510750906e@samsung.com
> > Link: https://lore.kernel.org/r/20220619204949.50d9154d@thinkpad
> > Reported-by: Marek Behún <kabel@kernel.org>
> > [john.ogness@linutronix.de: Better description of the problem.]
> > Signed-off-by: Petr Mladek <pmladek@suse.com>
> > Tested-by: Marek Behún <kabel@kernel.org>
>
> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Thanks.
> > -static int __init printk_activate_kthreads(void)
> > -{
> > - struct console *con;
> >
> > - console_lock();
> > - printk_kthreads_available = true;
> > - for_each_console(con)
> > - printk_start_kthread(con);
> > - console_unlock();
> > + /*
> > + * Boot consoles may be accessing the same hardware as normal
> > + * consoles and thus must not be called in parallel. Therefore
> > + * only activate threaded console printing if it is known that
> > + * there are no boot consoles registered.
> > + */
> > + if (no_bootcon)
> > + printk_activate_kthreads();
>
> A quick question. Here we can still have a bootcon which can be unregistered
> later, right? Do you think it would make sense to check whether the printing
> kthreads can be safely started, and start them if so (if no CON_BOOT is found
> and the kthreads are not already created), at the end of unregister_console()?
Yeah, that's my plan how to optimize it in the future. I just
wanted to do something simple and be on the safe side for 5.19.
Best Regards,
Petr
On (22/06/21 13:19), Petr Mladek wrote:
> > > -static int __init printk_activate_kthreads(void)
> > > -{
> > > - struct console *con;
> > >
> > > - console_lock();
> > > - printk_kthreads_available = true;
> > > - for_each_console(con)
> > > - printk_start_kthread(con);
> > > - console_unlock();
> > > + /*
> > > + * Boot consoles may be accessing the same hardware as normal
> > > + * consoles and thus must not be called in parallel. Therefore
> > > + * only activate threaded console printing if it is known that
> > > + * there are no boot consoles registered.
> > > + */
> > > + if (no_bootcon)
> > > + printk_activate_kthreads();
> >
> > A quick question. Here we can still have a bootcon which can be unregistered
> > later, right? Do you think it would make sense to check whether the printing
> > kthreads can be safely started, and start them if so (if no CON_BOOT is found
> > and the kthreads are not already created), at the end of unregister_console()?
>
> Yeah, that's my plan how to optimize it in the future. I just
> wanted to do something simple and be on the safe side for 5.19.
Sounds good.