[PATCH] printk/panic: Add option allow non panic cpus logging to ringbuffer

Donghyeok Choe posted 1 patch 11 months, 1 week ago
kernel/printk/internal.h          |  1 +
kernel/printk/printk.c            | 17 ++++++++++++++++-
kernel/printk/printk_ringbuffer.c |  4 +++-
3 files changed, 20 insertions(+), 2 deletions(-)
[PATCH] printk/panic: Add option allow non panic cpus logging to ringbuffer
Posted by Donghyeok Choe 11 months, 1 week ago
commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing
to ringbuffer") disabled non-panic CPUs to further write messages to
ringbuffer after panicked.

commit bcc954c6caba ("printk/panic: Allow cpu backtraces to
be written into ringbuffer during panic") allows non-panicked CPUs
to write backtrace only.

Since that, there was a problem with not being able to check the
important logs of the non-panicked CPUs.

Fix the issue by adding debug option(printk_debug_non_panic_cpus) to
output the non-paniced CPUs' message.

Link: https://lore.kernel.org/all/Z8cLEkqLL2IOyNIj@pathway/
Signed-off-by: Donghyeok Choe <d7271.choe@samsung.com>
---
 kernel/printk/internal.h          |  1 +
 kernel/printk/printk.c            | 17 ++++++++++++++++-
 kernel/printk/printk_ringbuffer.c |  4 +++-
 3 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index a91bdf802967..22d193907bb8 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -64,6 +64,7 @@ struct dev_printk_info;
 
 extern struct printk_ringbuffer *prb;
 extern bool printk_kthreads_running;
+extern bool printk_debug_non_panic_cpus;
 
 __printf(4, 0)
 int vprintk_store(int facility, int level,
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 07668433644b..7631fd8ba1c7 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2375,6 +2375,19 @@ void printk_legacy_allow_panic_sync(void)
 	}
 }
 
+bool __read_mostly printk_debug_non_panic_cpus;
+
+#ifdef CONFIG_PRINTK_CALLER
+static int __init printk_debug_non_panic_cpus_setup(char *str)
+{
+	printk_debug_non_panic_cpus = true;
+	pr_info("printk: keep printk all cpu in panic.\n");
+
+	return 0;
+}
+early_param("printk_debug_non_panic_cpus", printk_debug_non_panic_cpus_setup);
+#endif
+
 asmlinkage int vprintk_emit(int facility, int level,
 			    const struct dev_printk_info *dev_info,
 			    const char *fmt, va_list args)
@@ -2391,7 +2404,9 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * non-panic CPUs are generating any messages, they will be
 	 * silently dropped.
 	 */
-	if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
+	if (!printk_debug_non_panic_cpus &&
+	    !panic_triggering_all_cpu_backtrace &&
+	    other_cpu_in_panic())
 		return 0;
 
 	printk_get_console_flush_type(&ft);
diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 88e8f3a61922..ffab5f6c1855 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -2143,7 +2143,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			 * But it would have the sequence number returned
 			 * by "prb_next_reserve_seq() - 1".
 			 */
-			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
+			if (this_cpu_in_panic() &&
+				(!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
+				((*seq + 1) < prb_next_reserve_seq(rb)))
 				(*seq)++;
 			else
 				return false;
-- 
2.48.0
Re: [PATCH] printk/panic: Add option allow non panic cpus logging to ringbuffer
Posted by Petr Mladek 10 months, 3 weeks ago
Hi,

first, I am sorry for the late review. I have been snowed under many
other tasks.

Second, the patch looks fine. I just would like to do few cosmetic
improvements.

Let's start with the Subject. It has few small grammatical mistakes.
I suggest something like:

  "printk/panic: Add option to allow non-panic CPUs to write
   to the ring buffer."

On Wed 2025-03-05 13:40:46, Donghyeok Choe wrote:
> commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing
> to ringbuffer") disabled non-panic CPUs to further write messages to
> ringbuffer after panicked.
> 
> commit bcc954c6caba ("printk/panic: Allow cpu backtraces to
> be written into ringbuffer during panic") allows non-panicked CPUs
> to write backtrace only.
> 
> Since that, there was a problem with not being able to check the
> important logs of the non-panicked CPUs.
> 
> Fix the issue by adding debug option(printk_debug_non_panic_cpus) to
> output the non-paniced CPUs' message.

I would slightly rewrite it. I took inspiration
in the first version of this patch, see
https://lore.kernel.org/r/20250226031628.GB592457@tiffany
and asked "Gemini" to help:

<proposal>
Commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
aimed to isolate panic-related messages. However, when panic() itself
malfunctions, messages from non-panic CPUs become crucial for debugging.

While commit bcc954c6caba ("printk/panic: Allow cpu backtraces to
be written into ringbuffer during panic") enables non-panic CPU
backtraces, it may not provide sufficient diagnostic information.

Introduce the "printk_debug_non_panic_cpus" command-line option,
enabling non-panic CPU messages to be stored in the ring buffer during
a panic. This also prevents discarding non-finalized messages from
non-panic CPUs during console flushing, providing a more comprehensive
view of system state during critical failures.
</proposal>

> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2375,6 +2375,19 @@ void printk_legacy_allow_panic_sync(void)
>  	}
>  }
>  
> +bool __read_mostly printk_debug_non_panic_cpus;
> +
> +#ifdef CONFIG_PRINTK_CALLER
> +static int __init printk_debug_non_panic_cpus_setup(char *str)
> +{
> +	printk_debug_non_panic_cpus = true;
> +	pr_info("printk: keep printk all cpu in panic.\n");

I would update the message:

	pr_info("printk: allow messages from non-panic CPUs in panic()\n");


> +
> +	return 0;
> +}
> +early_param("printk_debug_non_panic_cpus", printk_debug_non_panic_cpus_setup);
> +#endif
> +
>  asmlinkage int vprintk_emit(int facility, int level,
>  			    const struct dev_printk_info *dev_info,
>  			    const char *fmt, va_list args)
> @@ -2391,7 +2404,9 @@ asmlinkage int vprintk_emit(int facility, int level,
>  	 * non-panic CPUs are generating any messages, they will be
>  	 * silently dropped.
>  	 */
> -	if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
> +	if (!printk_debug_non_panic_cpus &&
> +	    !panic_triggering_all_cpu_backtrace &&
> +	    other_cpu_in_panic())

I would switch the order:

	if (other_cpu_in_panic() &&
	    !printk_debug_non_panic_cpus &&
	    !panic_triggering_all_cpu_backtrace)

IMHO, it is more logical. Also both "!printk_debug_non_panic_cpus"
and "!panic_triggering_all_cpu_backtrace" are always true by default.

>  		return 0;
>  
>  	printk_get_console_flush_type(&ft);
> diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
> index 88e8f3a61922..ffab5f6c1855 100644
> --- a/kernel/printk/printk_ringbuffer.c
> +++ b/kernel/printk/printk_ringbuffer.c
> @@ -2143,7 +2143,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
>  			 * But it would have the sequence number returned
>  			 * by "prb_next_reserve_seq() - 1".
>  			 */
> -			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
> +			if (this_cpu_in_panic() &&
> +				(!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
> +				((*seq + 1) < prb_next_reserve_seq(rb)))
>  				(*seq)++;
>  			else
>  				return false;

The indendantation does not help much to understand where the if
(condition) ends. Also I would udate the comment. I suggest something
like:

--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -2133,9 +2133,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			 * there may be other finalized records beyond that
 			 * need to be printed for a panic situation. If this
 			 * is the panic CPU, skip this
-			 * non-existent/non-finalized record unless it is
-			 * at or beyond the head, in which case it is not
-			 * possible to continue.
+			 * non-existent/non-finalized record unless non-panic
+			 * CPUs are still running and their debugging is
+			 * explicitly enabled.
 			 *
 			 * Note that new messages printed on panic CPU are
 			 * finalized when we are here. The only exception
@@ -2143,10 +2143,13 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			 * But it would have the sequence number returned
 			 * by "prb_next_reserve_seq() - 1".
 			 */
-			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
+			if (this_cpu_in_panic() &&
+			    (!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
+			    ((*seq + 1) < prb_next_reserve_seq(rb))) {
 				(*seq)++;
-			else
+			} else {
 				return false;
+			}
 		}
 	}
 

With the above changes:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
Re: [PATCH] printk/panic: Add option allow non panic cpus logging to ringbuffer
Posted by John Ogness 10 months, 3 weeks ago
On 2025-03-17, Petr Mladek <pmladek@suse.com> wrote:
>> +#ifdef CONFIG_PRINTK_CALLER
>> +static int __init printk_debug_non_panic_cpus_setup(char *str)
>> +{
>> +	printk_debug_non_panic_cpus = true;
>> +	pr_info("printk: keep printk all cpu in panic.\n");
>
> I would update the message:
>
> 	pr_info("printk: allow messages from non-panic CPUs in panic()\n");

Note that every printk message in printk.c is automatically prepended
with "printk: " (see #define pr_fmt(fmt) at the top of printk.c) so
please just use:

	pr_info("allow messages from non-panic CPUs in panic()\n");

John Ogness