[PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic

Masami Hiramatsu (Google) posted 3 patches 1 month, 1 week ago
There is a newer version of this series
Posted by Masami Hiramatsu (Google) 1 month, 1 week ago
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

On real hardware, a panic followed by a machine reboot may not write
the CPU caches back to memory. The persistent ring buffer relies on
memory being in a coherent state, so events still held in the cache
may never reach the buffer and are lost. Moreover, the counters used
to validate the integrity of the persistent ring buffer may become
inconsistent, which may cause all data to be discarded.

To avoid this issue, stop recording of the ring buffer on panic and
flush the cache of the ring buffer's memory.

Fixes: e645535a954a ("tracing: Add option to use memmapped memory for trace boot instance")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v5:
   - Use ring_buffer_record_off() instead of ring_buffer_record_disable().
   - Use flush_cache_all() to ensure that all caches are flushed.
 Changes in v3:
   - Update patch description.
---
 kernel/trace/ring_buffer.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
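
For readers unfamiliar with the pattern, the approach in the commit message
(embed a notifier block in the buffer, register it only for persistent
buffers, and recover the buffer with container_of() in the callback) can be
sketched in plain userspace C. This is only a model under stated assumptions:
every model_* and chain_* name below is hypothetical, the "flush" is a flag
rather than a real cache operation, and none of it is the kernel's notifier
API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a notifier block; hypothetical, not the kernel type. */
struct notifier {
	int (*call)(struct notifier *nb);
	struct notifier *next;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Model of a ring buffer that embeds its own panic notifier. */
struct model_buffer {
	bool recording;
	bool flushed;
	struct notifier flush_nb;
};

/* Model of the panic notifier list: a singly linked chain. */
static struct notifier *panic_chain;

static void chain_register(struct notifier *nb)
{
	nb->next = panic_chain;
	panic_chain = nb;
}

static void chain_unregister(struct notifier *nb)
{
	struct notifier **p;

	for (p = &panic_chain; *p; p = &(*p)->next) {
		if (*p == nb) {
			*p = nb->next;
			return;
		}
	}
}

/* Callback: stop recording first, then "flush" (modelled as a flag). */
static int model_flush_cb(struct notifier *nb)
{
	struct model_buffer *buf = container_of(nb, struct model_buffer, flush_nb);

	buf->recording = false;
	buf->flushed = true;
	return 0;
}

/* Model of a panic: walk the chain and invoke every callback. */
static void model_panic(void)
{
	for (struct notifier *nb = panic_chain; nb; nb = nb->next)
		nb->call(nb);
}

/* Only buffers backed by a fixed address range register the callback. */
static void model_buffer_init(struct model_buffer *buf,
			      unsigned long start, unsigned long end)
{
	buf->recording = true;
	buf->flushed = false;
	buf->flush_nb.call = model_flush_cb;
	buf->flush_nb.next = NULL;
	if (start && end)	/* both bounds must be nonzero */
		chain_register(&buf->flush_nb);
}
```

In this model, a buffer initialised with a nonzero range stops recording and
is marked flushed after model_panic(), while a buffer initialised with
(0, 0) or one that was unregistered beforehand is left untouched.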

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index f16f053ef77d..0eb6e6595f37 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -6,6 +6,7 @@
  */
 #include <linux/sched/isolation.h>
 #include <linux/trace_recursion.h>
+#include <linux/panic_notifier.h>
 #include <linux/trace_events.h>
 #include <linux/ring_buffer.h>
 #include <linux/trace_clock.h>
@@ -589,6 +590,7 @@ struct trace_buffer {
 
 	unsigned long			range_addr_start;
 	unsigned long			range_addr_end;
+	struct notifier_block		flush_nb;
 
 	struct ring_buffer_meta		*meta;
 
@@ -2471,6 +2473,15 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer)
 	kfree(cpu_buffer);
 }
 
+static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
+{
+	struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
+
+	ring_buffer_record_off(buffer);
+	flush_cache_all();
+	return NOTIFY_DONE;
+}
+
 static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
 					 int order, unsigned long start,
 					 unsigned long end,
@@ -2590,6 +2601,12 @@ static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags,
 
 	mutex_init(&buffer->mutex);
 
+	/* Persistent ring buffer needs to flush cache before reboot. */
+	if (start && end) {
+		buffer->flush_nb.notifier_call = rb_flush_buffer_cb;
+		atomic_notifier_chain_register(&panic_notifier_list, &buffer->flush_nb);
+	}
+
 	return_ptr(buffer);
 
  fail_free_buffers:
@@ -2677,6 +2694,9 @@ ring_buffer_free(struct trace_buffer *buffer)
 {
 	int cpu;
 
+	if (buffer->range_addr_start && buffer->range_addr_end)
+		atomic_notifier_chain_unregister(&panic_notifier_list, &buffer->flush_nb);
+
 	cpuhp_state_remove_instance(CPUHP_TRACE_RB_PREPARE, &buffer->node);
 
 	irq_work_sync(&buffer->irq_work.work);
Re: [PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic
Posted by kernel test robot 1 month, 1 week ago
Hi Masami,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v7.0-rc1 next-20260226]
[cannot apply to trace/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Masami-Hiramatsu-Google/ring-buffer-Flush-and-stop-persistent-ring-buffer-on-panic/20260226-222418
base:   linus/master
patch link:    https://lore.kernel.org/r/177211311593.419230.2212568977306190482.stgit%40mhiramat.tok.corp.google.com
patch subject: [PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic
config: sparc-randconfig-001-20260227 (https://download.01.org/0day-ci/archive/20260227/202602270244.3JWhusi4-lkp@intel.com/config)
compiler: sparc64-linux-gcc (GCC) 11.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602270244.3JWhusi4-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602270244.3JWhusi4-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/trace/ring_buffer.c: In function 'rb_flush_buffer_cb':
>> kernel/trace/ring_buffer.c:2481:9: error: implicit declaration of function 'flush_cache_all'; did you mean 'flush_cache_page'? [-Werror=implicit-function-declaration]
    2481 |         flush_cache_all();
         |         ^~~~~~~~~~~~~~~
         |         flush_cache_page
   cc1: some warnings being treated as errors


vim +2481 kernel/trace/ring_buffer.c

  2475	
  2476	static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
  2477	{
  2478		struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
  2479	
  2480		ring_buffer_record_off(buffer);
> 2481		flush_cache_all();
  2482		return NOTIFY_DONE;
  2483	}
  2484	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic
Posted by Masami Hiramatsu (Google) 1 month, 1 week ago
On Fri, 27 Feb 2026 02:48:26 +0800
kernel test robot <lkp@intel.com> wrote:

> Hi Masami,
> 
> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v7.0-rc1 next-20260226]
> [cannot apply to trace/for-next]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Masami-Hiramatsu-Google/ring-buffer-Flush-and-stop-persistent-ring-buffer-on-panic/20260226-222418
> base:   linus/master
> patch link:    https://lore.kernel.org/r/177211311593.419230.2212568977306190482.stgit%40mhiramat.tok.corp.google.com
> patch subject: [PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic
> config: sparc-randconfig-001-20260227 (https://download.01.org/0day-ci/archive/20260227/202602270244.3JWhusi4-lkp@intel.com/config)
> compiler: sparc64-linux-gcc (GCC) 11.5.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602270244.3JWhusi4-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202602270244.3JWhusi4-lkp@intel.com/
> 
> All errors (new ones prefixed by >>):
> 
>    kernel/trace/ring_buffer.c: In function 'rb_flush_buffer_cb':
> >> kernel/trace/ring_buffer.c:2481:9: error: implicit declaration of function 'flush_cache_all'; did you mean 'flush_cache_page'? [-Werror=implicit-function-declaration]
>     2481 |         flush_cache_all();
>          |         ^~~~~~~~~~~~~~~
>          |         flush_cache_page
>    cc1: some warnings being treated as errors

I guess sparc is simply missing flush_cache_all(). Anyway, I will add
asm/ring_buffer.h to define a wrapper macro. At the very least I need to sync
the cache on arm64, but flush_cache_all() is a no-op there, so I would like to
use dcache_clean_pop() instead.
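
A rough userspace sketch of the wrapper idea described above: an
architecture that has a usable range flush defines the macro, and a generic
fallback supplies a no-op otherwise. All names here (MODEL_ARCH_HAS_RANGE_FLUSH,
model_arch_rb_flush_range, model_dcache_clean_range, model_rb_flush) are
hypothetical, and the "flush" only records its arguments; this is not the
content of any actual asm/ring_buffer.h.

```c
/* Bookkeeping so the effect of the "flush" can be observed in userspace. */
static unsigned long flushed_start, flushed_end;
static int flush_calls;

/*
 * "Arch header" part: an architecture with a range-based cache clean
 * provides the wrapper. Set to 0 to model an architecture without one.
 */
#define MODEL_ARCH_HAS_RANGE_FLUSH 1

#if MODEL_ARCH_HAS_RANGE_FLUSH
static void model_dcache_clean_range(unsigned long start, unsigned long end)
{
	flushed_start = start;
	flushed_end = end;
	flush_calls++;
}
#define model_arch_rb_flush_range(start, end) \
	model_dcache_clean_range(start, end)
#endif

/* "Generic header" part: fall back to a no-op if the arch defined nothing. */
#ifndef model_arch_rb_flush_range
#define model_arch_rb_flush_range(start, end) \
	do { (void)(start); (void)(end); } while (0)
#endif

/* Caller side: the ring buffer code only ever uses the wrapper. */
static void model_rb_flush(unsigned long start, unsigned long end)
{
	model_arch_rb_flush_range(start, end);
}
```

With this layout the common code never names an architecture-specific
function directly, so a missing primitive on one architecture becomes a
no-op rather than a build failure.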

Thanks,

> 
> 
> vim +2481 kernel/trace/ring_buffer.c
> 
>   2475	
>   2476	static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
>   2477	{
>   2478		struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
>   2479	
>   2480		ring_buffer_record_off(buffer);
> > 2481		flush_cache_all();
>   2482		return NOTIFY_DONE;
>   2483	}
>   2484	
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>
Re: [PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic
Posted by kernel test robot 1 month, 1 week ago
Hi Masami,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v7.0-rc1 next-20260226]
[cannot apply to trace/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Masami-Hiramatsu-Google/ring-buffer-Flush-and-stop-persistent-ring-buffer-on-panic/20260226-222418
base:   linus/master
patch link:    https://lore.kernel.org/r/177211311593.419230.2212568977306190482.stgit%40mhiramat.tok.corp.google.com
patch subject: [PATCH v5 1/3] ring-buffer: Flush and stop persistent ring buffer on panic
config: sparc64-defconfig (https://download.01.org/0day-ci/archive/20260227/202602270132.zddhkLDS-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260227/202602270132.zddhkLDS-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202602270132.zddhkLDS-lkp@intel.com/

All errors (new ones prefixed by >>):

>> kernel/trace/ring_buffer.c:2481:2: error: call to undeclared function 'flush_cache_all'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
    2481 |         flush_cache_all();
         |         ^
   kernel/trace/ring_buffer.c:2481:2: note: did you mean 'flush_dcache_page'?
   arch/sparc/include/asm/cacheflush_64.h:51:20: note: 'flush_dcache_page' declared here
      51 | static inline void flush_dcache_page(struct page *page)
         |                    ^
   1 error generated.


vim +/flush_cache_all +2481 kernel/trace/ring_buffer.c

  2475	
  2476	static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data)
  2477	{
  2478		struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb);
  2479	
  2480		ring_buffer_record_off(buffer);
> 2481		flush_cache_all();
  2482		return NOTIFY_DONE;
  2483	}
  2484	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki