[PATCH 1/2] printk: Introduce LOUD_CON flag

Marcos Paulo de Souza posted 2 patches 1 month, 1 week ago
[PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by Marcos Paulo de Souza 1 month, 1 week ago
Introduce LOUD_CON flag to printk. The new flag will make it possible to
create a context where printk messages will never be suppressed. This
new context information will be stored in the already existing
printk_context per-CPU variable. This variable was changed from 'int' to
'unsigned int' to avoid issues with automatic casting.

This mechanism will be used in the next patch to create a loud_console
context on sysrq handling, removing an existing workaround on the
loglevel global variable. The workaround existed to make sure that sysrq
header messages were sent to all consoles.

Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
---
 include/linux/printk.h      |  3 +++
 kernel/printk/internal.h    |  3 +++
 kernel/printk/printk.c      |  8 +++++++-
 kernel/printk/printk_safe.c | 27 +++++++++++++++++++++++++--
 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index eca9bb2ee637..0ac9879d94ac 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -166,6 +166,9 @@ __printf(1, 2) __cold int _printk_deferred(const char *fmt, ...);
 extern void __printk_deferred_enter(void);
 extern void __printk_deferred_exit(void);
 
+extern void printk_loud_console_enter(void);
+extern void printk_loud_console_exit(void);
+
 /*
  * The printk_deferred_enter/exit macros are available only as a hack for
  * some code paths that need to defer all printk console printing. Interrupts
diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index 3fcb48502adb..2649e8a66cef 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -53,6 +53,8 @@ int devkmsg_sysctl_set_loglvl(const struct ctl_table *table, int write,
 
 /* Flags for a single printk record. */
 enum printk_info_flags {
+	/* always show on console, ignore console_loglevel */
+	LOG_LOUD_CON	= 1,
 	LOG_NEWLINE	= 2,	/* text ended with a newline */
 	LOG_CONT	= 8,	/* text is a fragment of a continuation line */
 };
@@ -90,6 +92,7 @@ bool printk_percpu_data_ready(void);
 
 void defer_console_output(void);
 bool is_printk_legacy_deferred(void);
+bool is_printk_console_loud(void);
 
 u16 printk_parse_prefix(const char *text, int *level,
 			enum printk_info_flags *flags);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index beb808f4c367..b893825fe21d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1321,6 +1321,7 @@ static void boot_delay_msec(int level)
 	unsigned long timeout;
 
 	if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING)
+		|| is_printk_console_loud()
 		|| suppress_message_printing(level)) {
 		return;
 	}
@@ -2273,6 +2274,9 @@ int vprintk_store(int facility, int level,
 	if (dev_info)
 		flags |= LOG_NEWLINE;
 
+	if (is_printk_console_loud())
+		flags |= LOG_LOUD_CON;
+
 	if (flags & LOG_CONT) {
 		prb_rec_init_wr(&r, reserve_size);
 		if (prb_reserve_in_last(&e, prb, &r, caller_id, PRINTKRB_RECORD_MAX)) {
@@ -2947,6 +2951,7 @@ bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
 	struct printk_info info;
 	struct printk_record r;
 	size_t len = 0;
+	bool loud_con;
 
 	/*
 	 * Formatting extended messages requires a separate buffer, so use the
@@ -2965,9 +2970,10 @@ bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
 
 	pmsg->seq = r.info->seq;
 	pmsg->dropped = r.info->seq - seq;
+	loud_con = r.info->flags & LOG_LOUD_CON;
 
 	/* Skip record that has level above the console loglevel. */
-	if (may_suppress && suppress_message_printing(r.info->level))
+	if (!loud_con && may_suppress && suppress_message_printing(r.info->level))
 		goto out;
 
 	if (is_extended) {
diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
index 2b35a9d3919d..4618988baeea 100644
--- a/kernel/printk/printk_safe.c
+++ b/kernel/printk/printk_safe.c
@@ -12,7 +12,30 @@
 
 #include "internal.h"
 
-static DEFINE_PER_CPU(int, printk_context);
+static DEFINE_PER_CPU(unsigned int, printk_context);
+
+#define PRINTK_SAFE_CONTEXT_MASK		0x0000ffffU
+#define PRINTK_LOUD_CONSOLE_CONTEXT_MASK	0xffff0000U
+#define PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET	0x00010000U
+
+void noinstr printk_loud_console_enter(void)
+{
+	cant_migrate();
+	this_cpu_add(printk_context, PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
+}
+
+void noinstr printk_loud_console_exit(void)
+{
+	cant_migrate();
+	this_cpu_sub(printk_context, PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
+}
+
+/* Safe in any context. CPU migration is always disabled when set. */
+bool is_printk_console_loud(void)
+{
+	return !!(this_cpu_read(printk_context) &
+			PRINTK_LOUD_CONSOLE_CONTEXT_MASK);
+}
 
 /* Can be preempted by NMI. */
 void __printk_safe_enter(void)
@@ -45,7 +68,7 @@ bool is_printk_legacy_deferred(void)
 	 * context. CPU migration is always disabled when set.
 	 */
 	return (force_legacy_kthread() ||
-		this_cpu_read(printk_context) ||
+		!!(this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) ||
 		in_nmi());
 }
 

-- 
2.46.1
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by kernel test robot 1 month, 1 week ago
Hi Marcos,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 1d227fcc72223cbdd34d0ce13541cbaab5e0d72f]

url:    https://github.com/intel-lab-lkp/linux/commits/Marcos-Paulo-de-Souza/printk-Introduce-LOUD_CON-flag/20241017-010521
base:   1d227fcc72223cbdd34d0ce13541cbaab5e0d72f
patch link:    https://lore.kernel.org/r/20241016-printk-loud-con-v1-1-065e4dad6632%40suse.com
patch subject: [PATCH 1/2] printk: Introduce LOUD_CON flag
config: x86_64-randconfig-001-20241018 (https://download.01.org/0day-ci/archive/20241019/202410190915.WgFGTQdE-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241019/202410190915.WgFGTQdE-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410190915.WgFGTQdE-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> vmlinux.o: warning: objtool: printk_loud_console_enter+0x11: call to __cant_migrate() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: printk_loud_console_exit+0x11: call to __cant_migrate() leaves .noinstr.text section

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by John Ogness 1 month, 1 week ago
Hi Marcus,

On 2024-10-16, Marcos Paulo de Souza <mpdesouza@suse.com> wrote:
> Introduce LOUD_CON flag to printk.

Generally speaking, I do not like the name "LOUD_CON". The flag is
related to records, not consoles. Something like "NO_SUPPRESS" or
"FORCE_PRINT" might be more appropriate. Note that naming is not my
strength.

> The new flag will make it possible to
> create a context where printk messages will never be suppressed. This
> new context information will be stored in the already existing
> printk_context per-CPU variable. This variable was changed from 'int' to
> 'unsigned int' to avoid issues with automatic casting.
>
> This mechanism will be used in the next patch to create a loud_console
> context on sysrq handling, removing an existing workaround on the
> loglevel global variable. The workaround existed to make sure that sysrq
> header messages were sent to all consoles.

IMO the more interesting aspect is that the "loud" flag is stored in the
ringbuffer so that the message is not suppressed, even if printed later
(for example because it was deferred). This actually even fixes a bug
since the current workaround will not perform as expected if the sysrq
records are deferred (for example due to threaded printing or consoles
that are registered later).

> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index beb808f4c367..b893825fe21d 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -1321,6 +1321,7 @@ static void boot_delay_msec(int level)
>  	unsigned long timeout;
>  
>  	if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING)
> +		|| is_printk_console_loud()
>  		|| suppress_message_printing(level)) {

I do not think "loud" should be a reason to skip the delays. The delays
are there to slow down printing. I would think that for "loud" messages,
this is even more important. I suppose this function (as well as
printk_delay()) would need a new boolean parameter whether it is a
"loud" message. Then:

	|| (!loud_con && suppress_message_printing(level))

> @@ -2273,6 +2274,9 @@ int vprintk_store(int facility, int level,
>  	if (dev_info)
>  		flags |= LOG_NEWLINE;
>  
> +	if (is_printk_console_loud())
> +		flags |= LOG_LOUD_CON;
> +
>  	if (flags & LOG_CONT) {
>  		prb_rec_init_wr(&r, reserve_size);
>  		if (prb_reserve_in_last(&e, prb, &r, caller_id, PRINTKRB_RECORD_MAX)) {

I guess LOG_LOUD_CON should also be set in the LOG_CONT case (like
LOG_NEWLINE does).

> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index 2b35a9d3919d..4618988baeea 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -12,7 +12,30 @@
>  
>  #include "internal.h"
>  
> -static DEFINE_PER_CPU(int, printk_context);
> +static DEFINE_PER_CPU(unsigned int, printk_context);
> +
> +#define PRINTK_SAFE_CONTEXT_MASK		0x0000ffffU
> +#define PRINTK_LOUD_CONSOLE_CONTEXT_MASK	0xffff0000U
> +#define PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET	0x00010000U
> +
> +void noinstr printk_loud_console_enter(void)
> +{
> +	cant_migrate();
> +	this_cpu_add(printk_context, PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
> +}

Have you tested this with lockdep? AFAICT, the write_sysrq_trigger()
path can migrate since it is only using rcu_read_lock() in
__handle_sysrq().

John Ogness
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by Marcos Paulo de Souza 1 month, 1 week ago
On Wed, 2024-10-16 at 20:17 +0206, John Ogness wrote:
> Hi Marcus,
> 
> On 2024-10-16, Marcos Paulo de Souza <mpdesouza@suse.com> wrote:
> > Introduce LOUD_CON flag to printk.
> 
> Generally speaking, I do not like the name "LOUD_CON". The flag is
> related to records, not consoles. Something like "NO_SUPPRESS" or
> "FORCE_PRINT" might be more appropriate. Note that naming is not my
> strength.

Makes sense. I'll talk with Petr to check which better name we case,
but personally speaking I liked FORCE_CON that he suggested.

> 
> > The new flag will make it possible to
> > create a context where printk messages will never be suppressed.
> > This
> > new context information will be stored in the already existing
> > printk_context per-CPU variable. This variable was changed from
> > 'int' to
> > 'unsigned int' to avoid issues with automatic casting.
> > 
> > This mechanism will be used in the next patch to create a
> > loud_console
> > context on sysrq handling, removing an existing workaround on the
> > loglevel global variable. The workaround existed to make sure that
> > sysrq
> > header messages were sent to all consoles.
> 
> IMO the more interesting aspect is that the "loud" flag is stored in
> the
> ringbuffer so that the message is not suppressed, even if printed
> later
> (for example because it was deferred). This actually even fixes a bug
> since the current workaround will not perform as expected if the
> sysrq
> records are deferred (for example due to threaded printing or
> consoles
> that are registered later).

Indeed, I'll describe that this new behavior fixes a problem.

> 
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index beb808f4c367..b893825fe21d 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1321,6 +1321,7 @@ static void boot_delay_msec(int level)
> >  	unsigned long timeout;
> >  
> >  	if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING)
> > +		|| is_printk_console_loud()
> >  		|| suppress_message_printing(level)) {
> 
> I do not think "loud" should be a reason to skip the delays. The
> delays
> are there to slow down printing. I would think that for "loud"
> messages,
> this is even more important. I suppose this function (as well as
> printk_delay()) would need a new boolean parameter whether it is a
> "loud" message. Then:
> 
> 	|| (!loud_con && suppress_message_printing(level))

Done locally, thanks!

> 
> > @@ -2273,6 +2274,9 @@ int vprintk_store(int facility, int level,
> >  	if (dev_info)
> >  		flags |= LOG_NEWLINE;
> >  
> > +	if (is_printk_console_loud())
> > +		flags |= LOG_LOUD_CON;
> > +
> >  	if (flags & LOG_CONT) {
> >  		prb_rec_init_wr(&r, reserve_size);
> >  		if (prb_reserve_in_last(&e, prb, &r, caller_id,
> > PRINTKRB_RECORD_MAX)) {
> 
> I guess LOG_LOUD_CON should also be set in the LOG_CONT case (like
> LOG_NEWLINE does).
> 
> > diff --git a/kernel/printk/printk_safe.c
> > b/kernel/printk/printk_safe.c
> > index 2b35a9d3919d..4618988baeea 100644
> > --- a/kernel/printk/printk_safe.c
> > +++ b/kernel/printk/printk_safe.c
> > @@ -12,7 +12,30 @@
> >  
> >  #include "internal.h"
> >  
> > -static DEFINE_PER_CPU(int, printk_context);
> > +static DEFINE_PER_CPU(unsigned int, printk_context);
> > +
> > +#define PRINTK_SAFE_CONTEXT_MASK		0x0000ffffU
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_MASK	0xffff0000U
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET	0x00010000U
> > +
> > +void noinstr printk_loud_console_enter(void)
> > +{
> > +	cant_migrate();
> > +	this_cpu_add(printk_context,
> > PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
> > +}
> 
> Have you tested this with lockdep? AFAICT, the write_sysrq_trigger()
> path can migrate since it is only using rcu_read_lock() in
> __handle_sysrq().

Now I have and I found the error. Let me investigate how I can proceed
here.

Thanks a lot for the review!

> 
> John Ogness
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by Petr Mladek 1 month, 1 week ago
On Wed 2024-10-16 20:17:48, John Ogness wrote:
> Hi Marcus,
> 
> On 2024-10-16, Marcos Paulo de Souza <mpdesouza@suse.com> wrote:
> > Introduce LOUD_CON flag to printk.
> 
> Generally speaking, I do not like the name "LOUD_CON". The flag is
> related to records, not consoles. Something like "NO_SUPPRESS" or
> "FORCE_PRINT" might be more appropriate. Note that naming is not my
> strength.

I see. I would prefer to avoid names with double negatives.
So I prefer the "FORCE_PRINT" variant.

That said, I think that "FORCE_CON" might also be good.
Aka, the message forces consoles to show it.

> > The new flag will make it possible to
> > create a context where printk messages will never be suppressed. This
> > new context information will be stored in the already existing
> > printk_context per-CPU variable. This variable was changed from 'int' to
> > 'unsigned int' to avoid issues with automatic casting.
> >
> > This mechanism will be used in the next patch to create a loud_console
> > context on sysrq handling, removing an existing workaround on the
> > loglevel global variable. The workaround existed to make sure that sysrq
> > header messages were sent to all consoles.
> 
> IMO the more interesting aspect is that the "loud" flag is stored in the
> ringbuffer so that the message is not suppressed, even if printed later
> (for example because it was deferred). This actually even fixes a bug
> since the current workaround will not perform as expected if the sysrq
> records are deferred (for example due to threaded printing or consoles
> that are registered later).

Yeah. I think that the fix of this race is the main motivation for
the change. We should make it more clear in the commit message.

> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index beb808f4c367..b893825fe21d 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -1321,6 +1321,7 @@ static void boot_delay_msec(int level)
> >  	unsigned long timeout;
> >  
> >  	if ((boot_delay == 0 || system_state >= SYSTEM_RUNNING)
> > +		|| is_printk_console_loud()
> >  		|| suppress_message_printing(level)) {
> 
> I do not think "loud" should be a reason to skip the delays. The delays
> are there to slow down printing. I would think that for "loud" messages,
> this is even more important. I suppose this function (as well as
> printk_delay()) would need a new boolean parameter whether it is a
> "loud" message. Then:
> 
> 	|| (!loud_con && suppress_message_printing(level))

Yes, this makes more sense.

> > @@ -2273,6 +2274,9 @@ int vprintk_store(int facility, int level,
> >  	if (dev_info)
> >  		flags |= LOG_NEWLINE;
> >  
> > +	if (is_printk_console_loud())
> > +		flags |= LOG_LOUD_CON;
> > +
> >  	if (flags & LOG_CONT) {
> >  		prb_rec_init_wr(&r, reserve_size);
> >  		if (prb_reserve_in_last(&e, prb, &r, caller_id, PRINTKRB_RECORD_MAX)) {
> 
> I guess LOG_LOUD_CON should also be set in the LOG_CONT case (like
> LOG_NEWLINE does).

Great catch! Yes, it would make sense to set it even in the re-opened record.

> > diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> > index 2b35a9d3919d..4618988baeea 100644
> > --- a/kernel/printk/printk_safe.c
> > +++ b/kernel/printk/printk_safe.c
> > @@ -12,7 +12,30 @@
> >  
> >  #include "internal.h"
> >  
> > -static DEFINE_PER_CPU(int, printk_context);
> > +static DEFINE_PER_CPU(unsigned int, printk_context);
> > +
> > +#define PRINTK_SAFE_CONTEXT_MASK		0x0000ffffU
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_MASK	0xffff0000U
> > +#define PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET	0x00010000U
> > +
> > +void noinstr printk_loud_console_enter(void)
> > +{
> > +	cant_migrate();
> > +	this_cpu_add(printk_context, PRINTK_LOUD_CONSOLE_CONTEXT_OFFSET);
> > +}
> 
> Have you tested this with lockdep? AFAICT, the write_sysrq_trigger()
> path can migrate since it is only using rcu_read_lock() in
> __handle_sysrq().

Interesting. I have always had connected rcu_read_lock() with
preempt_disable(). But I see that there actually exists preemptive
variant with CONFIG_PREEMPT_RCU. And I guess that preemptive RCU
read lock is a must-to-have on RT.

And indeed:

# echo h >/proc/sysrq-trigger

produced:

[   53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
[   53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
[   53.669931] 2 locks held by bash/1637:
[   53.669936]  #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
[   53.669968]  #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
[   53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
[   53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
[   53.670020] Call Trace:
[   53.670026]  <TASK>
[   53.670045]  dump_stack_lvl+0x6c/0xa0
[   53.670064]  __cant_migrate.cold+0x7c/0x89
[   53.670080]  printk_loud_console_enter+0x15/0x30
[   53.670088]  __handle_sysrq+0x60/0x120
[   53.670104]  write_sysrq_trigger+0x6a/0xa0
[   53.670120]  proc_reg_write+0x5f/0xb0
[   53.670132]  vfs_write+0xf9/0x540
[   53.670147]  ? __lock_release.isra.0+0x1a6/0x2c0
[   53.670172]  ? do_user_addr_fault+0x38c/0x720
[   53.670197]  ksys_write+0x6e/0xf0
[   53.670220]  do_syscall_64+0x79/0x190
[   53.670238]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   53.670252] RIP: 0033:0x7fa4247d9b13
[   53.670269] Code: 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 f3 c3 0f 1f 00 41 54 55 49 89 d4 53 48 89
[   53.670275] RSP: 002b:00007ffe62690f38 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   53.670282] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fa4247d9b13
[   53.670288] RDX: 0000000000000002 RSI: 00005557ac516ec0 RDI: 0000000000000001
[   53.670291] RBP: 00005557ac516ec0 R08: 000000000000000a R09: 0000000000000000
[   53.670295] R10: 00007fa4246da468 R11: 0000000000000246 R12: 00007fa424abd500
[   53.670298] R13: 0000000000000002 R14: 00007fa424ac2c00 R15: 0000000000000002
[   53.670350]  </TASK>
[   53.670358] sysrq: HELP : loglevel(0-9) reboot(b) crash(c) show-all-locks(d) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) dump-ftrace-buffer(z) replay-kernel-logs(R) 


IMHO, the best solution would be to call migrate_disable()/enable()
in printk_loud_console_enter()/exit().

We should not disable preemption because the output could
be long.

Best Regards,
Petr
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by John Ogness 1 month, 1 week ago
On 2024-10-17, Petr Mladek <pmladek@suse.com> wrote:
> # echo h >/proc/sysrq-trigger
>
> produced:
>
> [   53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
> [   53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
> [   53.669931] 2 locks held by bash/1637:
> [   53.669936]  #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
> [   53.669968]  #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
> [   53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
> [   53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
> [   53.670020] Call Trace:
> [   53.670026]  <TASK>
> [   53.670045]  dump_stack_lvl+0x6c/0xa0
> [   53.670064]  __cant_migrate.cold+0x7c/0x89
> [   53.670080]  printk_loud_console_enter+0x15/0x30
> [   53.670088]  __handle_sysrq+0x60/0x120
> [   53.670104]  write_sysrq_trigger+0x6a/0xa0
> [   53.670120]  proc_reg_write+0x5f/0xb0
> [   53.670132]  vfs_write+0xf9/0x540
> [   53.670147]  ? __lock_release.isra.0+0x1a6/0x2c0
> [   53.670172]  ? do_user_addr_fault+0x38c/0x720
> [   53.670197]  ksys_write+0x6e/0xf0
> [   53.670220]  do_syscall_64+0x79/0x190
> [   53.670238]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> IMHO, the best solution would be to call migrate_disable()/enable()
> in printk_loud_console_enter()/exit().

That will not work because migrate_enable() can only be called from
can_sleep context. Instead, the migrate_disable()/enable() should be at
the few (one?) call sites where printk_loud_console_enter()/exit() is
used from task context.

This is also how printk_deferred_enter()/exit() works, relying on the
caller to disable migration if necessary.

John
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by Petr Mladek 1 month ago
On Fri 2024-10-18 09:20:19, John Ogness wrote:
> On 2024-10-17, Petr Mladek <pmladek@suse.com> wrote:
> > # echo h >/proc/sysrq-trigger
> >
> > produced:
> >
> > [   53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
> > [   53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
> > [   53.669931] 2 locks held by bash/1637:
> > [   53.669936]  #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
> > [   53.669968]  #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
> > [   53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
> > [   53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
> > [   53.670020] Call Trace:
> > [   53.670026]  <TASK>
> > [   53.670045]  dump_stack_lvl+0x6c/0xa0
> > [   53.670064]  __cant_migrate.cold+0x7c/0x89
> > [   53.670080]  printk_loud_console_enter+0x15/0x30
> > [   53.670088]  __handle_sysrq+0x60/0x120
> > [   53.670104]  write_sysrq_trigger+0x6a/0xa0
> > [   53.670120]  proc_reg_write+0x5f/0xb0
> > [   53.670132]  vfs_write+0xf9/0x540
> > [   53.670147]  ? __lock_release.isra.0+0x1a6/0x2c0
> > [   53.670172]  ? do_user_addr_fault+0x38c/0x720
> > [   53.670197]  ksys_write+0x6e/0xf0
> > [   53.670220]  do_syscall_64+0x79/0x190
> > [   53.670238]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > IMHO, the best solution would be to call migrate_disable()/enable()
> > in printk_loud_console_enter()/exit().
> 
> That will not work because migrate_enable() can only be called from
> can_sleep context. Instead, the migrate_disable()/enable() should be at
> the few (one?) call sites where printk_loud_console_enter()/exit() is
> used from task context.

Hmm, if I get it correctly, we could not use migrate_disable() in
__handle_sysrq() because it can be called also in atomic context,
for example:

  + pl010_int()
    + pl010_rx_chars()
      + uart_handle_sysrq_char()
	+ handle_sysrq()
	  + __handle_sysrq()

I do not see any easy way how to distinguish whether it was called in
an atomic context or not.

So, I see three possibilities:

  1. Explicitly call preempt_disable() in __handle_sysrq().

     It would be just around the the single line or the help. But still,
     I do not like it much.


  2. Avoid the per-CPU variable. Force adding the LOUD_CON/FORCE_CON
     flag using a global variable, e.g. printk_force_console.

     The problem is that it might affect also messages printed by
     other CPUs. And there might be many.

     Well, console_loglevel is a global variable. The original code
     had a similar problem.


  3. Add the LOUD_CON/FLUSH_CON flag via a parameter. For example,
     by a special LOGLEVEL_FORCE_CON, similar to LOGLEVEL_SCHED.

     I might work well for __handle_sysrq() which calls the affected
     printk() directly.

     But it won't work, for example, for kdb_show_stack(). It wants
     to show messages printed by a nested functions.


I personally prefer the 2nd variant. It fixes the problem and it
should not make things worse.

Best Regards,
Petr
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by John Ogness 1 month ago
On 2024-10-21, Petr Mladek <pmladek@suse.com> wrote:
>> That will not work because migrate_enable() can only be called from
>> can_sleep context. Instead, the migrate_disable()/enable() should be at
>> the few (one?) call sites where printk_loud_console_enter()/exit() is
>> used from task context.
>
> Hmm, if I get it correctly, we could not use migrate_disable() in
> __handle_sysrq() because it can be called also in atomic context,

I am talking about callers of __handle_sysrq() and/or their callers.

For example write_sysrq_trigger() could do:

	migrate_disable();
	__handle_sysrq(c, false);
	migrate_enable();

Or a new wrapper could be introduced for this purpose:

static inline void wrapper handle_sysrq_task(u8 key, bool check_mask)
{
	migrate_disable();
	__handle_sysrq(key, check_mask);
	migrate_enable();
}

A quick grep shows about 25 call sites to check.

> I do not see any easy way how to distinguish whether it was called in
> an atomic context or not.

There is no clean way to do that. If this information is needed, it must
be tracked by the call chain.

> So, I see three possibilities:
>
>   1. Explicitly call preempt_disable() in __handle_sysrq().
>
>      It would be just around the the single line or the help. But still,
>      I do not like it much.

Not acceptable for PREEMPT_RT since sysrq is exposed to external inputs.

>   2. Avoid the per-CPU variable. Force adding the LOUD_CON/FORCE_CON
>      flag using a global variable, e.g. printk_force_console.
>
>      The problem is that it might affect also messages printed by
>      other CPUs. And there might be many.
>
>      Well, console_loglevel is a global variable. The original code
>      had a similar problem.

If disabling migration is not an option for you, this would be my second
choice. I assume tagging too many messages is better than not tagging
enough. And, as you say, this is effectively what the current code is
trying to do.

>   3. Add the LOUD_CON/FLUSH_CON flag via a parameter. For example,
>      by a special LOGLEVEL_FORCE_CON, similar to LOGLEVEL_SCHED.
>
>      I might work well for __handle_sysrq() which calls the affected
>      printk() directly.
>
>      But it won't work, for example, for kdb_show_stack(). It wants
>      to show messages printed by a nested functions.

Right, this has limited usefulness and might miss the important things,
which tend to be within helper functions.

John
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by Marcos Paulo de Souza 1 month ago
On Mon, 2024-10-21 at 16:17 +0206, John Ogness wrote:
> On 2024-10-21, Petr Mladek <pmladek@suse.com> wrote:
> > > That will not work because migrate_enable() can only be called
> > > from
> > > can_sleep context. Instead, the migrate_disable()/enable() should
> > > be at
> > > the few (one?) call sites where
> > > printk_loud_console_enter()/exit() is
> > > used from task context.
> > 
> > Hmm, if I get it correctly, we could not use migrate_disable() in
> > __handle_sysrq() because it can be called also in atomic context,
> 
> I am talking about callers of __handle_sysrq() and/or their callers.
> 
> For example write_sysrq_trigger() could do:
> 
> 	migrate_disable();
> 	__handle_sysrq(c, false);
> 	migrate_enable();
> 
> Or a new wrapper could be introduced for this purpose:
> 
> static inline void wrapper handle_sysrq_task(u8 key, bool check_mask)
> {
> 	migrate_disable();
> 	__handle_sysrq(key, check_mask);
> 	migrate_enable();
> }
> 
> A quick grep shows about 25 call sites to check.
> 
> > I do not see any easy way how to distinguish whether it was called
> > in
> > an atomic context or not.
> 
> There is no clean way to do that. If this information is needed, it
> must
> be tracked by the call chain.
> 
> > So, I see three possibilities:
> > 
> >   1. Explicitly call preempt_disable() in __handle_sysrq().
> > 
> >      It would be just around the the single line or the help. But
> > still,
> >      I do not like it much.
> 
> Not acceptable for PREEMPT_RT since sysrq is exposed to external
> inputs.
> 
> >   2. Avoid the per-CPU variable. Force adding the
> > LOUD_CON/FORCE_CON
> >      flag using a global variable, e.g. printk_force_console.
> > 
> >      The problem is that it might affect also messages printed by
> >      other CPUs. And there might be many.
> > 
> >      Well, console_loglevel is a global variable. The original code
> >      had a similar problem.
> 
> If disabling migration is not an option for you, this would be my
> second
> choice. I assume tagging too many messages is better than not tagging
> enough. And, as you say, this is effectively what the current code is
> trying to do.

Thanks for your input John. I talked with Petr and he suggested to
follow this option. I'll prepare the changes and send them after
reviewing them with Petr.

Thanks,
  Marcos

> 
> >   3. Add the LOUD_CON/FLUSH_CON flag via a parameter. For example,
> >      by a special LOGLEVEL_FORCE_CON, similar to LOGLEVEL_SCHED.
> > 
> >      I might work well for __handle_sysrq() which calls the
> > affected
> >      printk() directly.
> > 
> >      But it won't work, for example, for kdb_show_stack(). It wants
> >      to show messages printed by a nested functions.
> 
> Right, this has limited usefulness and might miss the important
> things,
> which tend to be within helper functions.
> 
> John
Re: [PATCH 1/2] printk: Introduce LOUD_CON flag
Posted by Petr Mladek 1 month ago
On Wed 2024-10-23 17:36:04, Marcos Paulo de Souza wrote:
> On Mon, 2024-10-21 at 16:17 +0206, John Ogness wrote:
> > On 2024-10-21, Petr Mladek <pmladek@suse.com> wrote:
> > > > That will not work because migrate_enable() can only be called
> > > > from
> > > > can_sleep context. Instead, the migrate_disable()/enable() should
> > > > be at
> > > > the few (one?) call sites where
> > > > printk_loud_console_enter()/exit() is
> > > > used from task context.
> > > 
> > > Hmm, if I get it correctly, we could not use migrate_disable() in
> > > __handle_sysrq() because it can be called also in atomic context,
> > 
> > I am talking about callers of __handle_sysrq() and/or their callers.
> > 
> > For example write_sysrq_trigger() could do:
> > 
> > 	migrate_disable();
> > 	__handle_sysrq(c, false);
> > 	migrate_enable();
> > 
> > Or a new wrapper could be introduced for this purpose:
> > 
> > static inline void wrapper handle_sysrq_task(u8 key, bool check_mask)
> > {
> > 	migrate_disable();
> > 	__handle_sysrq(key, check_mask);
> > 	migrate_enable();
> > }
> > 
> > A quick grep shows about 25 call sites to check.
> >
> > > I do not see any easy way how to distinguish whether it was called
> > > in
> > > an atomic context or not.
> > 
> > There is no clean way to do that. If this information is needed, it
> > must
> > be tracked by the call chain.
> > 
> > > So, I see three possibilities:
> > > 
> > >   1. Explicitly call preempt_disable() in __handle_sysrq().
> > > 
> > >      It would be just around the the single line or the help. But
> > > still,
> > >      I do not like it much.
> > 
> > Not acceptable for PREEMPT_RT since sysrq is exposed to external
> > inputs.
> > 
> > >   2. Avoid the per-CPU variable. Force adding the
> > > LOUD_CON/FORCE_CON
> > >      flag using a global variable, e.g. printk_force_console.
> > > 
> > >      The problem is that it might affect also messages printed by
> > >      other CPUs. And there might be many.
> > > 
> > >      Well, console_loglevel is a global variable. The original code
> > >      had a similar problem.
> > 
> > If disabling migration is not an option for you, this would be my
> > second
> > choice. I assume tagging too many messages is better than not tagging
> > enough. And, as you say, this is effectively what the current code is
> > trying to do.
> 
> Thanks for your input John. I talked with Petr and he suggested to
> follow this option. I'll prepare the changes and send them after
> reviewing them with Petr.

Just for record. I propose this way because disabling migration would
require checking all callers (25 as mentioned above).
migrate_disable() is needed and can be called only in the task
context.

I do not think that it is worth the effort. And it would be error
prone (hard to maintain).

Best Regards,
Petr