A use-after-free issue reported by syzbot exists in __bpf_trace_run():

BUG: KASAN: slab-use-after-free in __bpf_trace_run kernel/trace/bpf_trace.c:2075 [inline]
-> struct bpf_prog *prog = link->link.prog;

The link (struct bpf_raw_tp_link) was freed before link->link.prog was
accessed.

The root cause: when bpf_probe_unregister() is called, tasks may already
have entered the old tp_probes array (an RCU read-side section) before
rcu_assign_pointer() updates tp->funcs. Those tasks can still reach the
link through the old array. Without synchronization, the link can be
freed via call_rcu() from bpf_link_free() after bpf_probe_unregister()
returns, leading to a use-after-free in __bpf_trace_run().
CPU 0 (free link)                               CPU 1 (enter old tp probe)
─────────────────                               ──────────────────────────
                                                rcu_read_lock()
                                                old_funcs = tp->funcs
bpf_raw_tp_link_release()
  bpf_probe_unregister()
    rcu_assign_pointer(tp->funcs, new)
    call_srcu/call_rcu_tasks_trace(old_tp)
  ...
  call_rcu/call_rcu_tasks_trace(&link->rcu, ...)
(RCU grace period)
kfree(link)
                                                __bpf_trace_run(link, ...)
                                                  access link->link.prog
                                                  UAF!
Fix by calling tracepoint_synchronize_unregister() to ensure all
in-flight tracepoint callbacks have completed, so the link is no
longer reachable before it is freed.
The issue was introduced by commit d4dfc5700e86 ("bpf:
pass whole link instead of prog when triggering raw tracepoint"),
which changed tracepoint callbacks to receive bpf_raw_tp_link pointers
instead of bpf_prog pointers.
Prior to this commit, this issue did not occur because the bpf_prog was
directly used and protected by reference counting.
Fixes: d4dfc5700e86 ("bpf: pass whole link instead of prog when triggering raw tracepoint")
Reported-by: syzbot+b4c5ad098c821bf8d8bc@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=b4c5ad098c821bf8d8bc
Tested-by: syzbot+b4c5ad098c821bf8d8bc@syzkaller.appspotmail.com
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
Changes in v2:
- Reworded the commit message based on bpf-ci AI review.
- Link to v1: https://lore.kernel.org/all/20260304070927.178464-1-wangqing7171@gmail.com/T/
kernel/bpf/syscall.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0378e83b4099..dd491bc35027 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3783,6 +3783,13 @@ static void bpf_raw_tp_link_release(struct bpf_link *link)
 
 	bpf_probe_unregister(raw_tp->btp, raw_tp);
 	bpf_put_raw_tracepoint(raw_tp->btp);
+
+	/*
+	 * Wait for all in-flight tracepoint callbacks to complete so the
+	 * link is no longer reachable through tp_probes. This prevents
+	 * use-after-free in __bpf_trace_run() when a tracepoint fires.
+	 */
+	tracepoint_synchronize_unregister();
 }
 
 static void bpf_raw_tp_link_dealloc(struct bpf_link *link)
--
2.34.1
On Wed, Mar 04, 2026 at 05:23:45PM +0800, Qing Wang wrote:
> A use-after-free issue reported from syzbot exists in __bpf_trace_run().
>
> BUG: KASAN: slab-use-after-free in __bpf_trace_run kernel/trace/bpf_trace.c:2075 [inline]
> -> struct bpf_prog *prog = link->link.prog;
>
> The link(struct bpf_raw_tp_link) was freed before accessing
> link->link.prog.
>
> The root cause is that: When bpf_probe_unregister() is called, tasks may
> have already entered the old tp_probes array (RCU read-side section)
> before rcu_assign_pointer() updates tp->funcs. These tasks can access the
> link through the old array. Without synchronization, the link can be freed
> via call_rcu() after bpf_probe_unregister() in bpf_link_free(), leading to
> use-after-free in __bpf_trace_run().
>
> CPU 0 (free link) CPU 1 (enter old tp probe)
> ───────────────── ────────────────────────
>
> rcu_read_lock()
> old_funcs = tp->funcs
> bpf_raw_tp_link_release()
> bpf_probe_unregister()
> rcu_assign_pointer(tp->funcs, new)
> call_srcu/call_rcu_tasks_trace(old_tp)
> ...
> call_rcu/call_rcu_tasks_trace(&link->rcu, ...)
If CPU 1 is in an RCU read-side section, then call_rcu would wait for
the RCU GP anyway before freeing the link in question.
> (RCU grace period)
> kfree(link)
> __bpf_trace_run(link, ...)
> access link->link.prog
> UAF!
>
> Fix by calling tracepoint_synchronize_unregister() to ensure all
> in-flight tracepoint callbacks have completed, so the link is no
> longer reachable before it is freed.
It looks like tracepoint_synchronize_unregister() just calls
synchronize_rcu_tasks_trace() and synchronize_rcu(), but it should also
be sufficient to use call_rcu() or call_rcu_tasks_trace() to ensure that
the appropriate grace period elapses for that tracepoint. Is the extra
delay just masking the problem instead of fixing the root cause?
> The issue was introduced by commit d4dfc5700e86 ("bpf:
> pass whole link instead of prog when triggering raw tracepoint"),
> which changed tracepoint callbacks to receive bpf_raw_tp_link pointers
> instead of bpf_prog pointers.
Did you run a bisect?
> Prior to this commit, this issue did not occur because the bpf_prog was
> directly used and protected by reference counting.
>
> Fixes: d4dfc5700e86 ("bpf: pass whole link instead of prog when triggering raw tracepoint")
> Reported-by: syzbot+b4c5ad098c821bf8d8bc@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=b4c5ad098c821bf8d8bc
> Tested-by: syzbot+b4c5ad098c821bf8d8bc@syzkaller.appspotmail.com
> Signed-off-by: Qing Wang <wangqing7171@gmail.com>
> ---
> Changes in v2:
> - Modified commit message from bpf-ci AI reviewed.
> - Link to v1: https://lore.kernel.org/all/20260304070927.178464-1-wangqing7171@gmail.com/T/
>
> kernel/bpf/syscall.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 0378e83b4099..dd491bc35027 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3783,6 +3783,13 @@ static void bpf_raw_tp_link_release(struct bpf_link *link)
>
> bpf_probe_unregister(raw_tp->btp, raw_tp);
> bpf_put_raw_tracepoint(raw_tp->btp);
> +
> + /*
> + * Wait for all in-flight tracepoint callbacks to complete so the
> + * link is no longer reachable through tp_probes. This prevents
> + * use-after-free in __bpf_trace_run() when a tracepoint fires.
> + */
> + tracepoint_synchronize_unregister();
> }
>
> static void bpf_raw_tp_link_dealloc(struct bpf_link *link)
> --
> 2.34.1
>
Jordan
On Thu, 05 Mar 2026 at 09:38, Jordan Rife <jrife@google.com> wrote:
> > A use-after-free issue reported from syzbot exists in __bpf_trace_run().
> >
> > BUG: KASAN: slab-use-after-free in __bpf_trace_run kernel/trace/bpf_trace.c:2075 [inline]
> > -> struct bpf_prog *prog = link->link.prog;
> >
> > The link(struct bpf_raw_tp_link) was freed before accessing
> > link->link.prog.
> >
> > The root cause is that: When bpf_probe_unregister() is called, tasks may
> > have already entered the old tp_probes array (RCU read-side section)
> > before rcu_assign_pointer() updates tp->funcs. These tasks can access the
> > link through the old array. Without synchronization, the link can be freed
> > via call_rcu() after bpf_probe_unregister() in bpf_link_free(), leading to
> > use-after-free in __bpf_trace_run().
> >
> > CPU 0 (free link) CPU 1 (enter old tp probe)
> > ───────────────── ────────────────────────
> >
> > rcu_read_lock()
> > old_funcs = tp->funcs
> > bpf_raw_tp_link_release()
> > bpf_probe_unregister()
> > rcu_assign_pointer(tp->funcs, new)
> > call_srcu/call_rcu_tasks_trace(old_tp)
> > ...
> > call_rcu/call_rcu_tasks_trace(&link->rcu, ...)
>
> If CPU 1 is in an RCU read-side section, then call_rcu would wait for
> the RCU GP anyway before freeing the link in question.
Sorry, my mistake: it should be 'srcu_read_lock(&tracepoint_srcu)'[0],
not rcu_read_lock(), so the diagram may have misled you. The reader only
holds the tracepoint SRCU read lock, so call_rcu() waits for the normal
RCU grace period but not for that SRCU read-side section.
[0]
include/linux/tracepoint.h:279
#define __DECLARE_TRACE(name, proto, args, cond, data_proto) \
__DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), PARAMS(data_proto)) \
static inline void __do_trace_##name(proto) \
{ \
TRACEPOINT_CHECK(name) \
if (cond) { \
guard(srcu_fast_notrace)(&tracepoint_srcu); \ <----
__DO_TRACE_CALL(name, TP_ARGS(args)); \
} \
}
> > (RCU grace period)
> > kfree(link)
> > __bpf_trace_run(link, ...)
> > access link->link.prog
> > UAF!
> >
> > Fix by calling tracepoint_synchronize_unregister() to ensure all
> > in-flight tracepoint callbacks have completed, so the link is no
> > longer reachable before it is freed.
>
> It looks like tracepoint_synchronize_unregister() just calls
> synchronize_rcu_tasks_trace() and synchronize_rcu(), but it should also
> be sufficient to use call_rcu() or call_rcu_tasks_trace() to ensure that
> the appopriate grace period elapses for that tracepoint. Is the extra
> delay just masking the problem instead of fixing the root cause?
I think synchronize_srcu(&tracepoint_srcu) is enough to ensure that
readers which entered the old tp_probes array have exited their
srcu_read_lock() sections before kfree(link). Whether
tracepoint_synchronize_unregister() is the right interface to use needs
further discussion.
> > The issue was introduced by commit d4dfc5700e86 ("bpf:
> > pass whole link instead of prog when triggering raw tracepoint"),
> > which changed tracepoint callbacks to receive bpf_raw_tp_link pointers
> > instead of bpf_prog pointers.
>
> Did you run a bisect?
I'm trying to run a bisect, but I haven't reproduced the issue yet.
--
Qing