Coredump is a generally useful and interesting event in the lifetime
of a process. Add a tracepoint so it can be monitored through the
standard kernel tracing infrastructure.
BPF-based crash monitoring is an advanced approach that
allows real-time crash interception: by attaching a BPF program at
this point, tools can use bpf_get_stack() with BPF_F_USER_STACK to
capture the user-space stack trace at the exact moment of the crash,
before the process is fully terminated, without waiting for a
coredump file to be written and parsed.
However, there is currently no stable kernel API for this use case.
Existing tools rely on attaching fentry probes to do_coredump(),
which is an internal function whose signature changes across kernel
versions, breaking these tools.
Add a stable tracepoint that fires at the beginning of
do_coredump(), providing BPF programs a reliable attachment point.
At tracepoint time, the crashing process context is still live, so
BPF programs can call bpf_get_stack() with BPF_F_USER_STACK to
extract the user-space backtrace.
The tracepoint records:
- sig: signal number that triggered the coredump
- comm: process name
- pid: process PID
Example output:
$ echo 1 > /sys/kernel/tracing/events/coredump/coredump/enable
$ sleep 999 &
$ kill -SEGV $!
$ cat /sys/kernel/tracing/trace
# TASK-PID CPU# ||||| TIMESTAMP FUNCTION
# | | | ||||| | |
sleep-634 [036] ..... 145.222206: coredump: sig=11 comm=sleep pid=634
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
fs/coredump.c | 5 +++++
include/trace/events/coredump.h | 47 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 52 insertions(+)
diff --git a/fs/coredump.c b/fs/coredump.c
index 29df8aa19e2e7..bb6fdb1f458e9 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -63,6 +63,9 @@
#include <trace/events/sched.h>
+#define CREATE_TRACE_POINTS
+#include <trace/events/coredump.h>
+
static bool dump_vma_snapshot(struct coredump_params *cprm);
static void free_vma_snapshot(struct coredump_params *cprm);
@@ -1090,6 +1093,8 @@ static inline bool coredump_skip(const struct coredump_params *cprm,
static void do_coredump(struct core_name *cn, struct coredump_params *cprm,
size_t **argv, int *argc, const struct linux_binfmt *binfmt)
{
+ trace_coredump(cprm->siginfo->si_signo);
+
if (!coredump_parse(cn, cprm, argv, argc)) {
coredump_report_failure("format_corename failed, aborting core");
return;
diff --git a/include/trace/events/coredump.h b/include/trace/events/coredump.h
new file mode 100644
index 0000000000000..59617eba3dbcf
--- /dev/null
+++ b/include/trace/events/coredump.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (c) 2026 Meta Platforms, Inc. and affiliates.
+ * Copyright (c) 2026 Breno Leitao <leitao@debian.org>
+ */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM coredump
+
+#if !defined(_TRACE_COREDUMP_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_COREDUMP_H
+
+#include <linux/sched.h>
+#include <linux/tracepoint.h>
+
+/**
+ * coredump - called when a coredump starts
+ * @sig: signal number that triggered the coredump
+ *
+ * This tracepoint fires at the beginning of a coredump attempt,
+ * providing a stable interface for monitoring coredump events.
+ */
+TRACE_EVENT(coredump,
+
+ TP_PROTO(int sig),
+
+ TP_ARGS(sig),
+
+ TP_STRUCT__entry(
+ __field(int, sig)
+ __array(char, comm, TASK_COMM_LEN)
+ __field(pid_t, pid)
+ ),
+
+ TP_fast_assign(
+ __entry->sig = sig;
+ memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
+ __entry->pid = current->pid;
+ ),
+
+ TP_printk("sig=%d comm=%s pid=%d",
+ __entry->sig, __entry->comm, __entry->pid)
+);
+
+#endif /* _TRACE_COREDUMP_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
---
base-commit: b5d083a3ed1e2798396d5e491432e887da8d4a06
change-id: 20260320-coredump_tracepoint-4de4399ce1b6
Best regards,
--
Breno Leitao <leitao@debian.org>
On Fri, 20 Mar 2026 05:33:34 -0700, Breno Leitao wrote:
> Coredump is a generally useful and interesting event in the lifetime
> of a process. Add a tracepoint so it can be monitored through the
> standard kernel tracing infrastructure.
>
> BPF-based crash monitoring is an advanced approach that
> allows real-time crash interception: by attaching a BPF program at
> this point, tools can use bpf_get_stack() with BPF_F_USER_STACK to
> capture the user-space stack trace at the exact moment of the crash,
> before the process is fully terminated, without waiting for a
> coredump file to be written and parsed.
>
> [...]
"stable" with a grain of salt. We make no such guarantees that it won't be
moved around if needed.
---
Applied to the vfs-7.1.misc branch of the vfs/vfs.git tree.
Patches in the vfs-7.1.misc branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-7.1.misc
[1/1] coredump: add tracepoint for coredump events
https://git.kernel.org/vfs/vfs/c/8e69edaf49bc
On Fri, Mar 20, 2026 at 02:21:48PM +0100, Christian Brauner wrote:
> On Fri, 20 Mar 2026 05:33:34 -0700, Breno Leitao wrote:
> > Coredump is a generally useful and interesting event in the lifetime
> > of a process. Add a tracepoint so it can be monitored through the
> > standard kernel tracing infrastructure.
> >
> > BPF-based crash monitoring is an advanced approach that
> > allows real-time crash interception: by attaching a BPF program at
> > this point, tools can use bpf_get_stack() with BPF_F_USER_STACK to
> > capture the user-space stack trace at the exact moment of the crash,
> > before the process is fully terminated, without waiting for a
> > coredump file to be written and parsed.
> >
> > [...]
>
> "stable" with a grain of salt. We make no such guarantees that it won't be
> moved around if needed.
Ack. At least tracepoints offer more stability compared to
fentry/function-based approaches which can be inlined, renamed, or
otherwise modified.
Thanks for reviewing this.
--breno
On Fri, Mar 20, 2026 at 05:33:34AM -0700, Breno Leitao wrote:
> Coredump is a generally useful and interesting event in the lifetime
> of a process. Add a tracepoint so it can be monitored through the
> standard kernel tracing infrastructure.
>
> BPF-based crash monitoring is an advanced approach that
> allows real-time crash interception: by attaching a BPF program at
> this point, tools can use bpf_get_stack() with BPF_F_USER_STACK to
> capture the user-space stack trace at the exact moment of the crash,
> before the process is fully terminated, without waiting for a
> coredump file to be written and parsed.
>
> However, there is currently no stable kernel API for this use case.
> Existing tools rely on attaching fentry probes to do_coredump(),
> which is an internal function whose signature changes across kernel
> versions, breaking these tools.
>
> Add a stable tracepoint that fires at the beginning of
> do_coredump(), providing BPF programs a reliable attachment point.
> At tracepoint time, the crashing process context is still live, so
> BPF programs can call bpf_get_stack() with BPF_F_USER_STACK to
> extract the user-space backtrace.
>
> The tracepoint records:
> - sig: signal number that triggered the coredump
> - comm: process name
> - pid: process PID
>
> Example output:
>
> $ echo 1 > /sys/kernel/tracing/events/coredump/coredump/enable
> $ sleep 999 &
> $ kill -SEGV $!
> $ cat /sys/kernel/tracing/trace
> # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
> # | | | ||||| | |
> sleep-634 [036] ..... 145.222206: coredump: sig=11 comm=sleep pid=634
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> fs/coredump.c | 5 +++++
> include/trace/events/coredump.h | 47 +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 52 insertions(+)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 29df8aa19e2e7..bb6fdb1f458e9 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -63,6 +63,9 @@
>
> #include <trace/events/sched.h>
>
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/coredump.h>
> +
> static bool dump_vma_snapshot(struct coredump_params *cprm);
> static void free_vma_snapshot(struct coredump_params *cprm);
>
> @@ -1090,6 +1093,8 @@ static inline bool coredump_skip(const struct coredump_params *cprm,
> static void do_coredump(struct core_name *cn, struct coredump_params *cprm,
> size_t **argv, int *argc, const struct linux_binfmt *binfmt)
> {
> + trace_coredump(cprm->siginfo->si_signo);
> +
> if (!coredump_parse(cn, cprm, argv, argc)) {
> coredump_report_failure("format_corename failed, aborting core");
> return;
> diff --git a/include/trace/events/coredump.h b/include/trace/events/coredump.h
> new file mode 100644
> index 0000000000000..59617eba3dbcf
> --- /dev/null
> +++ b/include/trace/events/coredump.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright (c) 2026 Meta Platforms, Inc. and affiliates.
> + * Copyright (c) 2026 Breno Leitao <leitao@debian.org>
> + */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM coredump
> +
> +#if !defined(_TRACE_COREDUMP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_COREDUMP_H
> +
> +#include <linux/sched.h>
> +#include <linux/tracepoint.h>
> +
> +/**
> + * coredump - called when a coredump starts
> + * @sig: signal number that triggered the coredump
> + *
> + * This tracepoint fires at the beginning of a coredump attempt,
> + * providing a stable interface for monitoring coredump events.
> + */
> +TRACE_EVENT(coredump,
> +
> + TP_PROTO(int sig),
> +
> + TP_ARGS(sig),
> +
> + TP_STRUCT__entry(
> + __field(int, sig)
> + __array(char, comm, TASK_COMM_LEN)
> + __field(pid_t, pid)
> + ),
> +
> + TP_fast_assign(
> + __entry->sig = sig;
> + memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
> + __entry->pid = current->pid;
That's the TID as seen in the global pid namespace.
I assume this is what you want but worth noting.
On Fri, 20 Mar 2026 14:21:23 +0100
Christian Brauner <brauner@kernel.org> wrote:
> > +TRACE_EVENT(coredump,
> > +
> > + TP_PROTO(int sig),
> > +
> > + TP_ARGS(sig),
> > +
> > + TP_STRUCT__entry(
> > + __field(int, sig)
> > + __array(char, comm, TASK_COMM_LEN)
> > + __field(pid_t, pid)
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->sig = sig;
> > + memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
> > + __entry->pid = current->pid;
>
> That's the TID as seen in the global pid namespace.
> I assume this is what you want but worth noting.
Not to mention the pid is saved in all trace events and is available for
perf and bpf too. Even the change log showed it:
sleep-634 [036] ..... 145.222206: coredump: sig=11 comm=sleep pid=634
      ^^^                                                         ^^^
So it should not be included. It's duplicate and only wastes space. Now if
you wanted to save the name space pid, that may be useful.
-- Steve
On Fri, Mar 20, 2026 at 02:48:34PM -0400, Steven Rostedt wrote:
> On Fri, 20 Mar 2026 14:21:23 +0100
> Christian Brauner <brauner@kernel.org> wrote:
>
> > > +TRACE_EVENT(coredump,
> > > +
> > > + TP_PROTO(int sig),
> > > +
> > > + TP_ARGS(sig),
> > > +
> > > + TP_STRUCT__entry(
> > > + __field(int, sig)
> > > + __array(char, comm, TASK_COMM_LEN)
> > > + __field(pid_t, pid)
> > > + ),
> > > +
> > > + TP_fast_assign(
> > > + __entry->sig = sig;
> > > + memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
> > > + __entry->pid = current->pid;
> >
> > That's the TID as seen in the global pid namespace.
> > I assume this is what you want but worth noting.
>
> Not to mention the pid is saved in all trace events and is available for
> perf and bpf too. Even the change log showed it:
>
> sleep-634 [036] ..... 145.222206: coredump: sig=11 comm=sleep pid=634
>       ^^^                                                         ^^^
>
> So it should not be included. It's duplicate and only wastes space. Now if
> you wanted to save the name space pid, that may be useful.
In my use case, I don't need the namespace pid since I'm primarily
focused on system-wide monitoring, and the global pid is sufficient for
my purposes.
Thanks for the heads-up. I'll update the patch to remove the pid field.
Thanks,
--breno