[PATCH v2] printk/panic: Add option to allow non-panic CPUs to write to the ring buffer.

Donghyeok Choe posted 1 patch 9 months ago
kernel/printk/internal.h          |  1 +
kernel/printk/printk.c            | 17 ++++++++++++++++-
kernel/printk/printk_ringbuffer.c | 13 ++++++++-----
3 files changed, 25 insertions(+), 6 deletions(-)
[PATCH v2] printk/panic: Add option to allow non-panic CPUs to write to the ring buffer.
Posted by Donghyeok Choe 9 months ago
Commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
aimed to isolate panic-related messages. However, when panic() itself
malfunctions, messages from non-panic CPUs become crucial for debugging.

While commit bcc954c6caba ("printk/panic: Allow cpu backtraces to
be written into ringbuffer during panic") enables non-panic CPU
backtraces, it may not provide sufficient diagnostic information.

Introduce the "printk_debug_non_panic_cpus" command-line option,
enabling non-panic CPU messages to be stored in the ring buffer during
a panic. This also prevents discarding non-finalized messages from
non-panic CPUs during console flushing, providing a more comprehensive
view of system state during critical failures.

Link: https://lore.kernel.org/all/Z8cLEkqLL2IOyNIj@pathway/
Signed-off-by: Donghyeok Choe <d7271.choe@samsung.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/all/20250305044046.1249972-1-d7271.choe@samsung.com/
---
 kernel/printk/internal.h          |  1 +
 kernel/printk/printk.c            | 17 ++++++++++++++++-
 kernel/printk/printk_ringbuffer.c | 13 ++++++++-----
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h
index a91bdf802967..22d193907bb8 100644
--- a/kernel/printk/internal.h
+++ b/kernel/printk/internal.h
@@ -64,6 +64,7 @@ struct dev_printk_info;
 
 extern struct printk_ringbuffer *prb;
 extern bool printk_kthreads_running;
+extern bool printk_debug_non_panic_cpus;
 
 __printf(4, 0)
 int vprintk_store(int facility, int level,
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 07668433644b..6f81a1718f9d 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2375,6 +2375,19 @@ void printk_legacy_allow_panic_sync(void)
 	}
 }
 
+bool __read_mostly printk_debug_non_panic_cpus;
+
+#ifdef CONFIG_PRINTK_CALLER
+static int __init printk_debug_non_panic_cpus_setup(char *str)
+{
+	printk_debug_non_panic_cpus = true;
+	pr_info("allow messages from non-panic CPUs in panic()\n");
+
+	return 0;
+}
+early_param("printk_debug_non_panic_cpus", printk_debug_non_panic_cpus_setup);
+#endif
+
 asmlinkage int vprintk_emit(int facility, int level,
 			    const struct dev_printk_info *dev_info,
 			    const char *fmt, va_list args)
@@ -2391,7 +2404,9 @@ asmlinkage int vprintk_emit(int facility, int level,
 	 * non-panic CPUs are generating any messages, they will be
 	 * silently dropped.
 	 */
-	if (other_cpu_in_panic() && !panic_triggering_all_cpu_backtrace)
+	if (other_cpu_in_panic() &&
+	    !printk_debug_non_panic_cpus &&
+	    !panic_triggering_all_cpu_backtrace)
 		return 0;
 
 	printk_get_console_flush_type(&ft);
diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index 88e8f3a61922..928dd8c7ec2a 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -2133,9 +2133,9 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			 * there may be other finalized records beyond that
 			 * need to be printed for a panic situation. If this
 			 * is the panic CPU, skip this
-			 * non-existent/non-finalized record unless it is
-			 * at or beyond the head, in which case it is not
-			 * possible to continue.
+			 * non-existent/non-finalized record unless non-panic
+			 * CPUs are still running and their debugging is
+			 * explicitly enabled.
 			 *
 			 * Note that new messages printed on panic CPU are
 			 * finalized when we are here. The only exception
@@ -2143,10 +2143,13 @@ static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			 * But it would have the sequence number returned
 			 * by "prb_next_reserve_seq() - 1".
 			 */
-			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
+			if (this_cpu_in_panic() &&
+			    (!printk_debug_non_panic_cpus || legacy_allow_panic_sync) &&
+			    ((*seq + 1) < prb_next_reserve_seq(rb))) {
 				(*seq)++;
-			else
+			} else {
 				return false;
+			}
 		}
 	}
 
-- 
2.48.0
Re: [PATCH v2] printk/panic: Add option to allow non-panic CPUs to write to the ring buffer.
Posted by Petr Mladek 9 months ago
On Tue 2025-03-18 11:23:20, Donghyeok Choe wrote:
> Commit 779dbc2e78d7 ("printk: Avoid non-panic CPUs writing to ringbuffer")
> aimed to isolate panic-related messages. However, when panic() itself
> malfunctions, messages from non-panic CPUs become crucial for debugging.
> 
> While commit bcc954c6caba ("printk/panic: Allow cpu backtraces to
> be written into ringbuffer during panic") enables non-panic CPU
> backtraces, it may not provide sufficient diagnostic information.
> 
> Introduce the "printk_debug_non_panic_cpus" command-line option,
> enabling non-panic CPU messages to be stored in the ring buffer during
> a panic. This also prevents discarding non-finalized messages from
> non-panic CPUs during console flushing, providing a more comprehensive
> view of system state during critical failures.
> 
> Link: https://lore.kernel.org/all/Z8cLEkqLL2IOyNIj@pathway/
> Signed-off-by: Donghyeok Choe <d7271.choe@samsung.com>
> Reviewed-by: Petr Mladek <pmladek@suse.com>
> Link: https://lore.kernel.org/all/20250305044046.1249972-1-d7271.choe@samsung.com/

I was about to push this patch but I have realized that we forgot
to update Documentation/admin-guide/kernel-parameters.txt

Also I found that the prefix "printk_" did not fit well with
the existing options. There were only module parameters
which started with "printk." prefix.

So, I decided to remove the "printk_" prefix and add also
the module_paramter, so that we have both:

early_param("debug_non_panic_cpus", debug_non_panic_cpus_setup);
module_param(debug_non_panic_cpus, bool, 0644);
MODULE_PARM_DESC(debug_non_panic_cpus,
		 "allow messages from non-panic CPUs in panic()");

It is similar to "ignore_loglevel".

I had a dilemma whether to send it for review or just push it.
I decided to push it because nobody did really care about
the name and the changes were straightforward. Also it did not
make much sense to miss the upcoming merge window just
because of these cosmetic issues.


Result:

The patch has been comitted into printk/linux.git,
branch for-6.15, see
https://web.git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/commit/?h=for-6.15&id=c1aa3daa517292303d98ff61f0440c354669f948

Feel free to ask me take it back. I am going to wait few days
before sending pull request for 6.15.

Best Regards,
Petr
Re: [PATCH v2] printk/panic: Add option to allow non-panic CPUs to write to the ring buffer.
Posted by Donghyeok Choe 8 months, 4 weeks ago
On Thu, Mar 20, 2025 at 04:27:49PM +0100, Petr Mladek wrote:
> Result:
> 
> The patch has been comitted into printk/linux.git,
> branch for-6.15, see
> https://web.git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git/commit/?h=for-6.15&id=c1aa3daa517292303d98ff61f0440c354669f948
> 
> Feel free to ask me take it back. I am going to wait few days
> before sending pull request for 6.15.

Thank you for providing detailed explanations about your modifications.
I really appreciate the clarity, and I have learned a lot from your changes.
Your approach makes perfect sense, and I have no objections to
what you are planning to do.

Thank you.