printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by Donghyeok Choe 1 year, 1 month ago

Hello.

In 'Commit 779dbc2e78d7', a feature was included in which the message
from non panic cpu was ignored. I understand that panic cpu's message
is the most important feature in the SW livelockup issue.

However, SOC vendor's linux kernel device driver developers
in environments where hw reliability is not guaranteed The message
from non panic cpu is also important.

In the process of performing a panic when a specific cpu is fault,
another cpu has the ability to output important err information
through interrupt. If a particular cpu has an incorrect sfr access,
the debugging device driver's ISR behavior to debug it helps you
identify the problem easily.

I can output non panic cpu's backtrace through Commit bcc954c6caba,
but this seems to be not enough for me.
I would like to print out the message of non panic cpu as it is.
Can I use early_param to selectively disable that feature?

For example, as below.
I'm sending you an inquiry instead of a patch because I need to share
more about my problem situation.
Please forgive me if I broke the mailing etiquette.

ex)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fb242739aec8..3f420e8bdb2c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2368,6 +2368,17 @@ void printk_legacy_allow_panic_sync(void)
        }
 }

+static bool __read_mostly keep_printk_all_cpu_in_panic;
+
+static int __init keep_printk_all_cpu_in_panic_setup(char *str)
+{
+       keep_printk_all_cpu_in_panic = true;
+       pr_info("printk: keep printk all cpu in panic.\n");
+
+       return 0;
+}
+early_param("keep_printk_all_cpu_in_panic", keep_printk_all_cpu_in_panic_setup);
+
 asmlinkage int vprintk_emit(int facility, int level,
                            const struct dev_printk_info *dev_info,
                            const char *fmt, va_list args)
@@ -2379,13 +2390,15 @@ asmlinkage int vprintk_emit(int facility, int level,
        if (unlikely(suppress_printk))
                return 0;

-       /*
-        * The messages on the panic CPU are the most important. If
-        * non-panic CPUs are generating any messages, they will be
-        * silently dropped.
-        */
-       if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
-               return 0;
+       if (!keep_printk_all_cpu_in_panic) {
+               /*
+                * The messages on the panic CPU are the most important. If
+                * non-panic CPUs are generating any messages, they will be
+                * silently dropped.
+                */
+               if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
+                       return 0;
+       }

Thank you.

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by John Ogness 1 year, 1 month ago

Hi Donghyeok,

On 2025-02-26, Donghyeok Choe <d7271.choe@samsung.com> wrote:
> I would like to print out the message of non panic cpu as it is.
> Can I use early_param to selectively disable that feature?

I have no issues about allowing this type of feature for debugging
purposes. I do not know if early_param is the best approach. I expect
Petr will offer good insight here.

> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index fb242739aec8..3f420e8bdb2c 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2368,6 +2368,17 @@ void printk_legacy_allow_panic_sync(void)
>         }
>  }
>
> +static bool __read_mostly keep_printk_all_cpu_in_panic;
> +
> +static int __init keep_printk_all_cpu_in_panic_setup(char *str)
> +{
> +       keep_printk_all_cpu_in_panic = true;
> +       pr_info("printk: keep printk all cpu in panic.\n");
> +
> +       return 0;
> +}
> +early_param("keep_printk_all_cpu_in_panic", keep_printk_all_cpu_in_panic_setup);

Quite a long argument. I am horrible at naming. I expect Petr would have
a good suggestion (if early_param is the way to go).

> +
>  asmlinkage int vprintk_emit(int facility, int level,
>                             const struct dev_printk_info *dev_info,
>                             const char *fmt, va_list args)
> @@ -2379,13 +2390,15 @@ asmlinkage int vprintk_emit(int facility, int level,
>         if (unlikely(suppress_printk))
>                 return 0;
>
> -       /*
> -        * The messages on the panic CPU are the most important. If
> -        * non-panic CPUs are generating any messages, they will be
> -        * silently dropped.
> -        */
> -       if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
> -               return 0;
> +       if (!keep_printk_all_cpu_in_panic) {
> +               /*
> +                * The messages on the panic CPU are the most important. If
> +                * non-panic CPUs are generating any messages, they will be
> +                * silently dropped.
> +                */
> +               if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
> +                       return 0;
> +       }

I would not nest it. Just something like:

	/*
	 * The messages on the panic CPU are the most important. If
	 * non-panic CPUs are generating any messages, they may be
	 * silently dropped.
	 */
	if (!keep_printk_all_cpu_in_panic &&
	    !panic_triggering_all_cpu_backtrace &&
	    other_cpu_in_panic()) {
		return 0;
	}

John

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by Petr Mladek 1 year, 1 month ago

On Wed 2025-02-26 05:31:53, John Ogness wrote:
> Hi Donghyeok,
> 
> On 2025-02-26, Donghyeok Choe <d7271.choe@samsung.com> wrote:
> > I would like to print out the message of non panic cpu as it is.
> > Can I use early_param to selectively disable that feature?
> 
> I have no issues about allowing this type of feature for debugging
> purposes.

Yes. It makes sense. Another scenario might be when
panic_other_cpus_shutdown() is not able to stop some CPUs.
It might be useful to see messages from the problematic ones.

> I do not know if early_param is the best approach. I expect
> Petr will offer good insight here.

early_param() looks good to me. There are already similar early
parameters, for example, "ignore_loglevel".


> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index fb242739aec8..3f420e8bdb2c 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2368,6 +2368,17 @@ void printk_legacy_allow_panic_sync(void)
> >         }
> >  }
> >
> > +static bool __read_mostly keep_printk_all_cpu_in_panic;
> > +
> > +static int __init keep_printk_all_cpu_in_panic_setup(char *str)
> > +{
> > +       keep_printk_all_cpu_in_panic = true;
> > +       pr_info("printk: keep printk all cpu in panic.\n");
> > +
> > +       return 0;
> > +}
> > +early_param("keep_printk_all_cpu_in_panic", keep_printk_all_cpu_in_panic_setup);
> 
> Quite a long argument. I am horrible at naming. I expect Petr would have
> a good suggestion (if early_param is the way to go).

Heh. It seems to be hard to find a good name ;-)

Anyway, I would use "printk_" prefix to make it clear that
it is printk-related. The following comes to my mind:

  + printk_allow_non_panic_cpus
  + printk_keep_non_panic_cpus
  + printk_debug_non_panic_cpus

I prefer "printk_debug_non_panic_cpus", see below.


> >  asmlinkage int vprintk_emit(int facility, int level,
> >                             const struct dev_printk_info *dev_info,
> >                             const char *fmt, va_list args)
> > @@ -2379,13 +2390,15 @@ asmlinkage int vprintk_emit(int facility, int level,
> >         if (unlikely(suppress_printk))
> >                 return 0;
> >
> > -       /*
> > -        * The messages on the panic CPU are the most important. If
> > -        * non-panic CPUs are generating any messages, they will be
> > -        * silently dropped.
> > -        */
> > -       if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
> > -               return 0;
> > +       if (!keep_printk_all_cpu_in_panic) {
> > +               /*
> > +                * The messages on the panic CPU are the most important. If
> > +                * non-panic CPUs are generating any messages, they will be
> > +                * silently dropped.
> > +                */
> > +               if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
> > +                       return 0;
> > +       }
> 
> I would not nest it. Just something like:
> 
> 	/*
> 	 * The messages on the panic CPU are the most important. If
> 	 * non-panic CPUs are generating any messages, they may be
> 	 * silently dropped.
> 	 */
> 	if (!keep_printk_all_cpu_in_panic &&
> 	    !panic_triggering_all_cpu_backtrace &&
> 	    other_cpu_in_panic()) {
> 		return 0;
> 	}

I would prefer this form as well.

Thinking loudly:

I wonder if this is actually safe. I recall that we simplified the
design somewhere because we expected that non-panic CPUs will not
add messages. I am not sure that I found all locations. But
we might want to revise:


1st problem: _prb_read_valid() skips non-finalized records on non-panic CPUs.

   opinion: We should not do it in this case.


2nd problem: Is _prb_read_valid() actually safe when
	panic_triggering_all_cpu_backtrace is true?

   opinion: It should be safe because the backtraces from different CPUs
	are serialized via printk_cpu_sync_get_irqsave().


3rd problem: nbcon_get_default_prio() returns NBCON_PRIO_NORMAL on
	non-panic CPUs. As a result, printk_get_console_flush_type()
	would suggest flushing like when the system works as expected.

	But the legacy-loop will bail out after flushing one
	message on one console, see console_flush_all(). It is weird
	behavior.

	Another question is who would flush the messages when the panic()
	CPU does not reach the explicit flush.

   opinion: We should probably try to flush the messages on non-panic
	CPUs in this mode when safe. This is why I prefer the name
	"printk_debug_non_panic_cpus".

	We should update console_flush_all() to do not bail out when
	the new option is set.

	We should call nbcon_atomic_flush_pending() on non-panic CPUs
	when the new option is set. printk_get_console_flush_type()
	should behave like with NBCON_PRIO_EMERGENCY.

	Maybe, nbcon_get_default_prio() should actually return
	NBCON_PRIO_EMERGENCY on non-panic CPUs when this option is set.
	It allow the non-panic CPUs to take over the nbcon context
	from the potentially frozen kthread.


Best Regards,
Petr

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by John Ogness 1 year, 1 month ago

On 2025-02-26, Petr Mladek <pmladek@suse.com> wrote:
> I wonder if this is actually safe. I recall that we simplified the
> design somewhere because we expected that non-panic CPUs will not
> add messages.

Perhaps you are recalling commit 7412dc6d55ee ("dump_stack: Do not get
cpu_sync for panic CPU").

> I am not sure that I found all locations. But
> we might want to revise:
>
>
> 1st problem: _prb_read_valid() skips non-finalized records on non-panic CPUs.
>
>    opinion: We should not do it in this case.

I don't know. This could result in seeing even less output if some CPU
didn't finalize a record.

> 2nd problem: Is _prb_read_valid() actually safe when
> 	panic_triggering_all_cpu_backtrace is true?
>
>    opinion: It should be safe because the backtraces from different CPUs
> 	are serialized via printk_cpu_sync_get_irqsave().

To clarify, by "safe" you mean it does not skip backtrace records from
other CPUs.

It does not skip their records because trigger_all_cpu_backtrace() waits
(up to 10 seconds) for the other CPUs to finish storing their backtraces
before continuing.

The use of the printk_cpu_sync is only to avoid interweaving the
multiple non-panic CPU backtraces.

> 3rd problem: nbcon_get_default_prio() returns NBCON_PRIO_NORMAL on
> 	non-panic CPUs. As a result, printk_get_console_flush_type()
> 	would suggest flushing like when the system works as expected.
>
> 	But the legacy-loop will bail out after flushing one
> 	message on one console, see console_flush_all(). It is weird
> 	behavior.

I believe you are talking about commit 8ebc476fd51e ("printk: Drop
console_sem during panic")? And also be aware of commit 51a1d258e50e
("printk: Keep non-panic-CPUs out of console lock").

> 	Another question is who would flush the messages when the panic()
> 	CPU does not reach the explicit flush.

Nobody. That is by design.

>    opinion: We should probably try to flush the messages on non-panic
> 	CPUs in this mode when safe. This is why I prefer the name
> 	"printk_debug_non_panic_cpus".
>
> 	We should update console_flush_all() to do not bail out when
> 	the new option is set.

It is not that simple. Legacy printing involves acquiring the
console_lock, which currently is not possible for non-panic CPUs during
panic.

> 	We should call nbcon_atomic_flush_pending() on non-panic CPUs
> 	when the new option is set. printk_get_console_flush_type()
> 	should behave like with NBCON_PRIO_EMERGENCY.

Note that non-panic CPUs during panic are also forbidden from acquiring
console ownership.

> 	Maybe, nbcon_get_default_prio() should actually return
> 	NBCON_PRIO_EMERGENCY on non-panic CPUs when this option is set.
> 	It allow the non-panic CPUs to take over the nbcon context
> 	from the potentially frozen kthread.

Note that nbcon_waiter_matches() requires that "Only one CPU is allowed
to request PANIC priority."

IMO it is opening up a huge can of worms to start allowing non-panic
CPUs to acquire the console_lock, take over console ownership, and print
(possibly with PANIC priority).

You could argue that this is just a debug mode. But I don't think that
is justification to allow things to explode and possibly deadlock, when
they otherwise would not have.

Perhaps Donghyeok might be happy enough if the debug option simply
allowed the non-panic CPUs to store records. There are plenty of tools
available to get at the dmesg buffer.

John

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by Petr Mladek 1 year ago

On Fri 2025-02-28 15:26:34, John Ogness wrote:
> On 2025-02-26, Petr Mladek <pmladek@suse.com> wrote:
> > I wonder if this is actually safe. I recall that we simplified the
> > design somewhere because we expected that non-panic CPUs will not
> > add messages.
> 
> Perhaps you are recalling commit 7412dc6d55ee ("dump_stack: Do not get
> cpu_sync for panic CPU").

Yeah.

> > I am not sure that I found all locations. But
> > we might want to revise:
> >
> >
> > 1st problem: _prb_read_valid() skips non-finalized records on non-panic CPUs.
> >
> >    opinion: We should not do it in this case.
> 
> I don't know. This could result in seeing even less output if some CPU
> didn't finalize a record.

But it might also drop messages which are just being added.
And the person enabling the new option is explicitly interested
into the messages from non-panic CPUs.

To make it clear. I do not want to revert the change. But I would
avoid the skip when the new option is used.

The messages might be skipped later when CPUs are stopped.
The information is already available via the @legacy_allow_panic_sync
variable.

I mean something like:

--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -2143,7 +2143,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			 * But it would have the sequence number returned
 			 * by "prb_next_reserve_seq() - 1".
 			 */
-			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
+			if (this_cpu_in_panic() &&
+			    (!printk_debug_non_panic_cpus || legacy_allow_panic_sync)
+			    ((*seq + 1) < prb_next_reserve_seq(rb)))
 				(*seq)++;
 			else
 				return false;

Sigh, I am not much happy with the complexity. But if we did not
do it then people might have hard times to realize why the messages
from non-panic CPUs are skipped. Especially because this behavior is
pretty hidden in the log buffer code...

Another motivation is the thread about processing panic notifiers
before stopping CPU. It is still not clear whether it is really
needed. But it is another case where panic() did not work as
expected because a non-panic CPU did not finish a work,
see https://lore.kernel.org/all/20250221022328.47078-1-ryotkkr98@gmail.com/

> > 2nd problem: Is _prb_read_valid() actually safe when
> > 	panic_triggering_all_cpu_backtrace is true?
> >
> >    opinion: It should be safe because the backtraces from different CPUs
> > 	are serialized via printk_cpu_sync_get_irqsave().
> 
> To clarify, by "safe" you mean it does not skip backtrace records from
> other CPUs.
>
> It does not skip their records because trigger_all_cpu_backtrace() waits
> (up to 10 seconds) for the other CPUs to finish storing their backtraces
> before continuing.

OK, I propose the following changes:

  + rename the option to "printk_debug_non_panic_cpus"

  + do not skip the messages in _prb_read_valid() when this option
    is used before the non-panic CPUs are stopped.

Best Regards,
Petr

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by John Ogness 1 year ago

On 2025-03-04, Petr Mladek <pmladek@suse.com> wrote:
> I mean something like:
>
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -2143,7 +2143,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
>  			 * But it would have the sequence number returned
>  			 * by "prb_next_reserve_seq() - 1".
>  			 */
> -			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
> +			if (this_cpu_in_panic() &&
> +			    (!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
> +			    ((*seq + 1) < prb_next_reserve_seq(rb)))
>  				(*seq)++;
>  			else
>  				return false;

Ah, OK. Thanks for the clarification

> OK, I propose the following changes:
>
>   + rename the option to "printk_debug_non_panic_cpus"
>
>   + do not skip the messages in _prb_read_valid() when this option
>     is used before the non-panic CPUs are stopped.

And of course:

    + allow non-panic CPUs in panic to store messages when this option
      is set

I would also keep the dump_stack_lvl() implementation as it is, even if
it could lead to interweaving of backtraces. Anyone using
printk_debug_non_panic_cpus should have CONFIG_PRINTK_CALLER enabled.

John

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by Petr Mladek 1 year ago

On Tue 2025-03-04 15:05:52, John Ogness wrote:
> On 2025-03-04, Petr Mladek <pmladek@suse.com> wrote:
> > I mean something like:
> >
> > --- a/kernel/printk/printk_ringbuffer.c
> > +++ b/kernel/printk/printk_ringbuffer.c
> > @@ -2143,7 +2143,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
> >  			 * But it would have the sequence number returned
> >  			 * by "prb_next_reserve_seq() - 1".
> >  			 */
> > -			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
> > +			if (this_cpu_in_panic() &&
> > +			    (!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
> > +			    ((*seq + 1) < prb_next_reserve_seq(rb)))
> >  				(*seq)++;
> >  			else
> >  				return false;
> 
> Ah, OK. Thanks for the clarification
> 
> > OK, I propose the following changes:
> >
> >   + rename the option to "printk_debug_non_panic_cpus"
> >
> >   + do not skip the messages in _prb_read_valid() when this option
> >     is used before the non-panic CPUs are stopped.
> 
> And of course:
> 
>     + allow non-panic CPUs in panic to store messages when this option
>       is set

Yes.

> I would also keep the dump_stack_lvl() implementation as it is, even if
> it could lead to interweaving of backtraces. Anyone using
> printk_debug_non_panic_cpus should have CONFIG_PRINTK_CALLER enabled.

I agree.

Best Regards,
Petr

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by Donghyeok Choe 1 year ago

On Tue, Mar 04, 2025 at 03:15:46PM +0100, Petr Mladek wrote:
> On Tue 2025-03-04 15:05:52, John Ogness wrote:
> > On 2025-03-04, Petr Mladek <pmladek@suse.com> wrote:
> > > I mean something like:
> > >
> > > --- a/kernel/printk/printk_ringbuffer.c
> > > +++ b/kernel/printk/printk_ringbuffer.c
> > > @@ -2143,7 +2143,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
> > >  			 * But it would have the sequence number returned
> > >  			 * by "prb_next_reserve_seq() - 1".
> > >  			 */
> > > -			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
> > > +			if (this_cpu_in_panic() &&
> > > +			    (!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
> > > +			    ((*seq + 1) < prb_next_reserve_seq(rb)))
> > >  				(*seq)++;
> > >  			else
> > >  				return false;
> > 
> > Ah, OK. Thanks for the clarification
> > 
> > > OK, I propose the following changes:
> > >
> > >   + rename the option to "printk_debug_non_panic_cpus"
> > >
> > >   + do not skip the messages in _prb_read_valid() when this option
> > >     is used before the non-panic CPUs are stopped.
> > 
> > And of course:
> > 
> >     + allow non-panic CPUs in panic to store messages when this option
> >       is set
> > I would also keep the dump_stack_lvl() implementation as it is, even if
> > it could lead to interweaving of backtraces. Anyone using
> > printk_debug_non_panic_cpus should have CONFIG_PRINTK_CALLER enabled.


Hi Petr Mladek,

This is gentle remind.

I did upload the patch containing the above contents.
Link: https://lore.kernel.org/all/20250305044046.1249972-1-d7271.choe@samsung.com/

Even if it's not necessarily this patch, I hope the content will be applied.
When can the 'printk_debug_non_panic_cpus' function be reflected
in mainline source code?


Thanks.

Re: printk: selective deactivation of feature ignoring non panic cpu's messages

Posted by Donghyeok Choe 1 year ago

On Fri, Feb 28, 2025 at 03:26:34PM +0106, John Ogness wrote:
> Perhaps Donghyeok might be happy enough if the debug option simply
> allowed the non-panic CPUs to store records. There are plenty of tools
> available to get at the dmesg buffer.

That's right. Just the addition of debug options can make me very happy.
If I find any problems using debug option on, I'll share them with you
immediately.

First, Shall we reflect only the patch that adds our debug option?
I will also study the problems raised. I want to be able to
participate in your discussion someday.

Thank you.