bpf: Add BPF program type for overriding tracepoint probes

[RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Fuyu Zhao 2 weeks, 1 day ago

Hi everyone,

This patchset introduces a new BPF program type that allows overriding
a tracepoint probe function registered via register_trace_*.

Motivation
----------
Tracepoint probe functions registered via register_trace_* in the kernel
cannot be dynamically modified; changing a probe function requires recompiling
the kernel and rebooting. Nor can BPF programs change an existing
probe function.

Overriding a tracepoint probe provides a way to apply patches to the kernel
quickly (security fixes, for example) through predefined static tracepoints,
without waiting for upstream integration.

This patchset demonstrates one way to override probe functions with a BPF program.

Overview
--------
This patchset adds the BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type.
When this type of BPF program attaches, it overrides the target tracepoint
probe function.

It also introduces a new struct type, "tracepoint_func_snapshot", which extends
the tracepoint structure. The snapshot records the original probe function
registered by the kernel when the BPF program is attached, so that the probe
can be restored after detachment.

Critical steps
--------------

1. Attach: Attach programs via the raw_tracepoint_open syscall.
2. Override:
   (a) Locate the target probe by `probe_name`.
   (b) Override the target probe with the BPF program.
   (c) Save the BPF program and the target probe function into "tracepoint_func_snapshot".
3. Restore: When the BPF program is detached, automatically restore
   the original probe function from the previously saved snapshot.
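
The attach/override/restore sequence above can be modeled in plain user-space
C. This is only a sketch under stated assumptions, not the code from the
patches: struct model_tracepoint, struct model_snapshot, model_attach() and
model_detach() are hypothetical names, a function pointer stands in for the
BPF program, and the model keeps a single probe per tracepoint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef int (*probe_fn)(int arg);

/* Hypothetical stand-in for the snapshot described above. */
struct model_snapshot {
    probe_fn original;    /* probe registered by the kernel */
    probe_fn replacement; /* stands in for the attached BPF program */
};

/* Hypothetical, heavily simplified tracepoint with a single probe. */
struct model_tracepoint {
    const char *probe_name;          /* used to locate the target probe */
    probe_fn probe;                  /* probe that runs when the tracepoint fires */
    struct model_snapshot *snapshot; /* NULL unless an override is attached */
};

/* Step 2: locate the probe by name, record it in a snapshot, override it. */
int model_attach(struct model_tracepoint *tp, const char *probe_name,
                 probe_fn replacement)
{
    struct model_snapshot *snap;

    if (strcmp(tp->probe_name, probe_name) != 0)
        return -1; /* no probe with that name */
    if (tp->snapshot)
        return -2; /* already overridden */

    snap = malloc(sizeof(*snap));
    if (!snap)
        return -3;

    snap->original = tp->probe;
    snap->replacement = replacement;
    tp->snapshot = snap;
    tp->probe = replacement;
    return 0;
}

/* Step 3: restore the original probe and free the snapshot. */
void model_detach(struct model_tracepoint *tp)
{
    if (!tp->snapshot)
        return;
    tp->probe = tp->snapshot->original;
    free(tp->snapshot);
    tp->snapshot = NULL;
}

int kernel_probe(int arg) { return arg + 1; }    /* original behavior */
int override_probe(int arg) { return arg * 10; } /* replacement behavior */

/* Walks through attach, override, and restore; returns 0 on success. */
int model_demo(void)
{
    struct model_tracepoint tp = { "kernel_probe", kernel_probe, NULL };

    assert(tp.probe(4) == 5);
    assert(model_attach(&tp, "kernel_probe", override_probe) == 0);
    assert(tp.probe(4) == 40);
    model_detach(&tp);
    assert(tp.probe(4) == 5);
    return 0;
}
```

In this model the snapshot is allocated only while an override is attached,
but the pointer to it is part of every tracepoint.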

Future work
-----------
This patchset is intended as a first step toward supporting BPF programs
that can override tracepoint probes. The current implementation may not yet
cover all use cases or handle every corner case.

I welcome feedback and suggestions from the community, and will continue to
refine and improve the design based on comments and real-world requirements.

Thanks!
Fuyu

Fuyu Zhao (3):
  bpf: Introduce BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE
  libbpf: Add support for BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE
  selftests/bpf: Add selftest for "raw_tp.o"

 include/linux/bpf_types.h                     |   2 +
 include/linux/trace_events.h                  |   9 +
 include/linux/tracepoint-defs.h               |   6 +
 include/linux/tracepoint.h                    |   3 +
 include/uapi/linux/bpf.h                      |   2 +
 kernel/bpf/syscall.c                          |  35 +++-
 kernel/trace/bpf_trace.c                      |  31 +++
 kernel/tracepoint.c                           | 190 +++++++++++++++++-
 tools/include/uapi/linux/bpf.h                |   2 +
 tools/lib/bpf/bpf.c                           |   1 +
 tools/lib/bpf/bpf.h                           |   3 +-
 tools/lib/bpf/libbpf.c                        |  27 ++-
 tools/lib/bpf/libbpf.h                        |   3 +-
 .../bpf/prog_tests/raw_tp_override_test_run.c |  23 +++
 .../bpf/progs/test_raw_tp_override_test_run.c |  20 ++
 .../selftests/bpf/test_kmods/bpf_testmod.c    |   7 +
 16 files changed, 352 insertions(+), 12 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/raw_tp_override_test_run.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_raw_tp_override_test_run.c

-- 
2.43.0

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Song Liu 2 weeks ago

On Wed, Sep 17, 2025 at 12:23 AM Fuyu Zhao <zhaofuyu@vivo.com> wrote:
>
> Hi everyone,
>
> This patchset introduces a new BPF program type that allows overriding
> a tracepoint probe function registered via register_trace_*.
>
> Motivation
> ----------
> Tracepoint probe functions registered via register_trace_* in the kernel
> cannot be dynamically modified, changing a probe function requires recompiling
> the kernel and rebooting. Nor can BPF programs change an existing
> probe function.
>
> Overriding tracepoint supports a way to apply patches into kernel quickly
> (such as applying security ones), through predefined static tracepoints,
> without waiting for upstream integration.

IIUC, this work solves the same problem as raw tracepoint (raw_tp) or raw
tracepoint with btf (tp_btf).

Did I miss something?

Thanks,
Song

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Fuyu Zhao 2 weeks ago


On 9/18/2025 4:02 AM, Song Liu wrote:
> On Wed, Sep 17, 2025 at 12:23 AM Fuyu Zhao <zhaofuyu@vivo.com> wrote:
>>
>> Hi everyone,
>>
>> This patchset introduces a new BPF program type that allows overriding
>> a tracepoint probe function registered via register_trace_*.
>>
>> Motivation
>> ----------
>> Tracepoint probe functions registered via register_trace_* in the kernel
>> cannot be dynamically modified, changing a probe function requires recompiling
>> the kernel and rebooting. Nor can BPF programs change an existing
>> probe function.
>>
>> Overriding tracepoint supports a way to apply patches into kernel quickly
>> (such as applying security ones), through predefined static tracepoints,
>> without waiting for upstream integration.
> 
> IIUC, this work solves the same problem as raw tracepoint (raw_tp) or raw
> tracepoint with btf (tp_btf).
> 
> Did I miss something?
> 
> Thanks,
> Song

As I understand it, raw tracepoints (raw_tp) and raw tracepoints with BTF (tp_btf)
are designed mainly for tracing the kernel. The goal of this work is to
provide a way to override the tracepoint callback, so that kernel behavior
can be adjusted dynamically.

Thanks,
Fuyu

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Jiri Olsa 2 weeks ago

On Thu, Sep 18, 2025 at 04:05:51PM +0800, Fuyu Zhao wrote:
> 
> 
> On 9/18/2025 4:02 AM, Song Liu wrote:
> > On Wed, Sep 17, 2025 at 12:23 AM Fuyu Zhao <zhaofuyu@vivo.com> wrote:
> >>
> >> Hi everyone,
> >>
> >> This patchset introduces a new BPF program type that allows overriding
> >> a tracepoint probe function registered via register_trace_*.
> >>
> >> Motivation
> >> ----------
> >> Tracepoint probe functions registered via register_trace_* in the kernel
> >> cannot be dynamically modified, changing a probe function requires recompiling
> >> the kernel and rebooting. Nor can BPF programs change an existing
> >> probe function.
> >>
> >> Overriding tracepoint supports a way to apply patches into kernel quickly
> >> (such as applying security ones), through predefined static tracepoints,
> >> without waiting for upstream integration.
> > 
> > IIUC, this work solves the same problem as raw tracepoint (raw_tp) or raw
> > tracepoint with btf (tp_btf).
> > 
> > Did I miss something?
> > 
> > Thanks,
> > Song
> 
> As I understand it, raw tracepoints (raw_tp) and raw tracepoints with BTF (tp_btf)
> are designed mainly for tracing the kernel. The goal of this work is to
> provide a way to override the tracepoint callback, so that kernel behavior
> can be adjusted dynamically.

hi,
what's the use case for this? also I'd think you can do that just by
unregistering the callback you want to override and registering a new one?

thanks,
jirka

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Fuyu Zhao 1 week, 6 days ago


On 9/18/2025 4:47 PM, Jiri Olsa wrote:
> On Thu, Sep 18, 2025 at 04:05:51PM +0800, Fuyu Zhao wrote:
>>
>>
>> On 9/18/2025 4:02 AM, Song Liu wrote:
>>> On Wed, Sep 17, 2025 at 12:23 AM Fuyu Zhao <zhaofuyu@vivo.com> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> This patchset introduces a new BPF program type that allows overriding
>>>> a tracepoint probe function registered via register_trace_*.
>>>>
>>>> Motivation
>>>> ----------
>>>> Tracepoint probe functions registered via register_trace_* in the kernel
>>>> cannot be dynamically modified, changing a probe function requires recompiling
>>>> the kernel and rebooting. Nor can BPF programs change an existing
>>>> probe function.
>>>>
>>>> Overriding tracepoint supports a way to apply patches into kernel quickly
>>>> (such as applying security ones), through predefined static tracepoints,
>>>> without waiting for upstream integration.
>>>
>>> IIUC, this work solves the same problem as raw tracepoint (raw_tp) or raw
>>> tracepoint with btf (tp_btf).
>>>
>>> Did I miss something?
>>>
>>> Thanks,
>>> Song
>>
>> As I understand it, raw tracepoints (raw_tp) and raw tracepoints with BTF (tp_btf)
>> are designed mainly for tracing the kernel. The goal of this work is to
>> provide a way to override the tracepoint callback, so that kernel behavior
>> can be adjusted dynamically.
> 
> hi,
> what's the use case for this? also I'd think you can do that just by
> unregister the callback you want to override and register new one?
> 
> thanks,
> jirka

At this moment, I don't have a real-world example. However, I mentioned one
possible use case in my reply to Steven:

> One possible use case is CPU core selection under certain scenarios. For example,
> developers may want to experiment with alternative strategies for deciding
> which CPU a task should run on to improve performance.
>  
> If a tracepoint is added as a hook point in this path, then overriding its
> function callback could make it possible to dynamically adjust the
> cpu-selection logic without rebuilding and rebooting the kernel.

As for the reason not to unregister and register a new callback:
callbacks registered directly inside the kernel cannot be unregistered from
user space. From user space, we can only attach additional callbacks
with BPF programs, but cannot remove or replace the ones already
registered in the kernel. Therefore, an override mechanism is needed.
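
To illustrate the distinction, here is a minimal user-space sketch, assuming
a tracepoint is modeled as a list of callbacks that all run when it fires.
The names (struct model_hook, model_add_callback(), model_fire()) are
hypothetical and do not correspond to kernel or libbpf APIs. Attaching only
appends a callback, so the one registered in the kernel keeps running:

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_CALLBACKS 4

typedef void (*callback_fn)(int *state);

/* Hypothetical model: every registered callback runs on each firing. */
struct model_hook {
    callback_fn callbacks[MODEL_MAX_CALLBACKS];
    size_t nr_callbacks;
};

/* Models attaching from user space: only appending is possible. */
int model_add_callback(struct model_hook *hook, callback_fn fn)
{
    if (hook->nr_callbacks == MODEL_MAX_CALLBACKS)
        return -1;
    hook->callbacks[hook->nr_callbacks++] = fn;
    return 0;
}

/* Fires the hook: callbacks run in registration order. */
void model_fire(const struct model_hook *hook, int *state)
{
    assert(hook != NULL && state != NULL);
    for (size_t i = 0; i < hook->nr_callbacks; i++)
        hook->callbacks[i](state);
}

void in_kernel_callback(int *state) { *state += 1; }  /* registered in the kernel */
void attached_callback(int *state) { *state += 100; } /* stands in for a BPF program */

/* Fires once, optionally with the extra callback attached; returns the state. */
int model_fire_once(int attach_extra)
{
    struct model_hook hook = { { in_kernel_callback }, 1 };
    int state = 0;

    if (attach_extra && model_add_callback(&hook, attached_callback) != 0)
        return -1;
    model_fire(&hook, &state);
    return state;
}
```

Here model_fire_once(0) returns 1 and model_fire_once(1) returns 101: the
in-kernel callback contributes its effect in both cases, and nothing in this
interface can remove or replace it.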

Thanks,
Fuyu

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Steven Rostedt 1 week, 6 days ago

On Thu, 18 Sep 2025 21:15:57 +0800
Fuyu Zhao <zhaofuyu@vivo.com> wrote:

> As for the reason not to unregister and register a new callback:
> callbacks registered directly inside the kernel cannot be unregistered from
> user space. From user space, we can only attach additional callbacks
> with BPF programs, but can not remove or replace the ones already
> registered in the kernel. Therefore, an override mechanism is needed.

The fact that user space cannot unregister or override the current
callbacks, to me is a feature and not a bug.

-- Steve

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Fuyu Zhao 1 week, 5 days ago


On 9/18/2025 11:32 PM, Steven Rostedt wrote:
> On Thu, 18 Sep 2025 21:15:57 +0800
> Fuyu Zhao <zhaofuyu@vivo.com> wrote:
> 
>> As for the reason not to unregister and register a new callback:
>> callbacks registered directly inside the kernel cannot be unregistered from
>> user space. From user space, we can only attach additional callbacks
>> with BPF programs, but can not remove or replace the ones already
>> registered in the kernel. Therefore, an override mechanism is needed.
> 
> The fact that user space cannot unregister or override the current
> callbacks, to me is a feature and not a bug.
> 
> -- Steve

I see, thank you for sharing your view — I’ll keep it in mind.

Sincerely,
Fuyu

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Steven Rostedt 2 weeks ago

On Wed, 17 Sep 2025 15:22:39 +0800
Fuyu Zhao <zhaofuyu@vivo.com> wrote:

> Hi everyone,
> 
> This patchset introduces a new BPF program type that allows overriding
> a tracepoint probe function registered via register_trace_*.
> 
> Motivation
> ----------
> Tracepoint probe functions registered via register_trace_* in the kernel
> cannot be dynamically modified, changing a probe function requires recompiling
> the kernel and rebooting. Nor can BPF programs change an existing
> probe function.

I'm confused by what you mean by "tracepoint probe function"?

You mean the function callback that gets called via the "register_trace_*()"?

> 
> Overriding tracepoint supports a way to apply patches into kernel quickly
> (such as applying security ones), through predefined static tracepoints,
> without waiting for upstream integration.

This sounds way out of scope for tracepoints. Please provide a solid
example for this.

> 
> This patchset demonstrates the way to override probe functions by BPF program.
> 
> Overview
> --------
> This patchset adds BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type.
> When this type of BPF program attaches, it overrides the target tracepoint
> probe function.
> 
> And it also extends a new struct type "tracepoint_func_snapshot", which extends
> the tracepoint structure. It is used to record the original probe function
> registered by kernel after BPF program being attached and restore from it
> after detachment. 

The tracepoint structure exists for every tracepoint in the kernel. By
adding a pointer to it, you just increased the size of the tracepoint. I'm
already complaining that each tracepoint causes around 5K of memory
overhead, and I'd like to make it smaller.
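
The size argument can be sketched with a little arithmetic. The two structs
below are stand-in layouts, not the real struct tracepoint, and the helper
names are hypothetical; the only point is that one extra pointer is paid for
by every tracepoint, attached or not.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in layout before the change (NOT the real struct tracepoint). */
struct model_tp_before {
    const char *name;
    int key;
    void *funcs;
};

/* Same stand-in layout with the additional snapshot pointer. */
struct model_tp_after {
    const char *name;
    int key;
    void *funcs;
    void *snapshot; /* the added pointer, present in every instance */
};

/* Growth of one tracepoint instance caused by the extra pointer. */
size_t extra_bytes_per_tracepoint(void)
{
    return sizeof(struct model_tp_after) - sizeof(struct model_tp_before);
}

/* Growth summed over all tracepoints, whether or not any is overridden. */
size_t extra_bytes_total(size_t nr_tracepoints)
{
    return nr_tracepoints * extra_bytes_per_tracepoint();
}
```

Assuming, purely for illustration, 2000 tracepoints on a build where pointers
are 8 bytes, extra_bytes_total(2000) is 16000 bytes.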

-- Steve

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Fuyu Zhao 1 week, 6 days ago

Sorry, I just realized that I forgot to include the CC list in my first reply.
Resending with CCs. Apologies to Steven for the extra noise.

On 9/18/2025 3:30 AM, Steven Rostedt wrote:
> On Wed, 17 Sep 2025 15:22:39 +0800
> Fuyu Zhao <zhaofuyu@vivo.com> wrote:
> 
>> Hi everyone,
>>
>> This patchset introduces a new BPF program type that allows overriding
>> a tracepoint probe function registered via register_trace_*.
>>
>> Motivation
>> ----------
>> Tracepoint probe functions registered via register_trace_* in the kernel
>> cannot be dynamically modified, changing a probe function requires recompiling
>> the kernel and rebooting. Nor can BPF programs change an existing
>> probe function.
> 
> I'm confused by what you mean by "tracepoint probe function"?
> 
> You mean the function callback that gets called via the "register_trace_*()"?
> 

Yes, that’s correct.
My earlier wording was not very precise — thanks for pointing that out.

>>
>> Overriding tracepoint supports a way to apply patches into kernel quickly
>> (such as applying security ones), through predefined static tracepoints,
>> without waiting for upstream integration.
> 
> This sounds way out of scope for tracepoints. Please provide a solid
> example for this.
> 

I appreciate your comment. The example I gave about security patches probably
wasn’t a good one here — I just meant to show the idea of changing kernel
behavior at runtime. Sorry for the confusion.

At the moment, I don’t have a solid real-world example to provide.
This work is still in an exploratory stage.

One possible use case is CPU core selection under certain scenarios. For example,
developers may want to experiment with alternative strategies for deciding
which CPU a task should run on to improve performance.

If a tracepoint is added as a hook point in this path, then overriding its
function callback could make it possible to dynamically adjust the
cpu-selection logic without rebuilding and rebooting the kernel.

The same mechanism could also be applied in other kernel paths where
developers want to make quick changes from user space.

>>
>> This patchset demonstrates the way to override probe functions by BPF program.
>>
>> Overview
>> --------
>> This patchset adds BPF_PROG_TYPE_RAW_TRACEPOINT_OVERRIDE program type.
>> When this type of BPF program attaches, it overrides the target tracepoint
>> probe function.
>>
>> And it also extends a new struct type "tracepoint_func_snapshot", which extends
>> the tracepoint structure. It is used to record the original probe function
>> registered by kernel after BPF program being attached and restore from it
>> after detachment. 
> 
> The tracepoint structure exists for every tracepoint in the kernel. By
> adding a pointer to it, you just increased the size of the tracepoint. I'm
> already complaining that each tracepoint causes around 5K of memory
> overhead, and I'd like to make it smaller.
> 
> -- Steve
> 

It is true that adding a pointer to the tracepoint structure increases
memory overhead. However, the memory that the "snapshot" pointer refers to is only
allocated once a BPF program is attached, and it is freed once the program is detached.

I am also considering whether it is possible to reuse existing structures
to reduce memory usage.

I'd be very grateful for any suggestions or guidance you might have.

Thanks,
Fuyu

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Steven Rostedt 1 week, 6 days ago

On Thu, 18 Sep 2025 20:33:22 +0800
Fuyu Zhao <zhaofuyu@vivo.com> wrote:

> At the moment, I don’t have a solid real-world example to provide.
> This work is still in an exploratory stage.

We shouldn't be in the business of "if you build it, they will come".
Unless there is a concrete use case now, I would not be adding anything.

My entire workflow for what I created in the tracing system was "I have a
need, I will implement it". The "need" came first. I then wrote code to
satisfy that need. It should not be the other way around.

-- Steve

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Chuck Wolber 1 week, 3 days ago

On Thu Sep 18, 2025 at 3:29 PM UTC, Steven Rostedt wrote:
>
> My entire workflow for what I created in the tracing system was "I have a
> need, I will implement it". The "need" came first. I then wrote code to
> satisfy that need. It should not be the other way around.

Tagging on to this sentiment - the kernel's design is emergent and will always
remain so.

Speculative features have a very low probability of reflecting the required
design language. On the other hand, if someone needs a thing, the need will
drive the use of conformal design language.

..Ch:W..

Re: [RFC PATCH bpf-next v1 0/3] bpf: Add BPF program type for overriding tracepoint probes

Posted by Fuyu Zhao 1 week, 5 days ago

On 9/18/2025 11:24 PM, Steven Rostedt wrote:
> On Thu, 18 Sep 2025 20:33:22 +0800
> Fuyu Zhao <zhaofuyu@vivo.com> wrote:
> 
>> At the moment, I don’t have a solid real-world example to provide.
>> This work is still in an exploratory stage.
> 
> We shouldn't be in the business of "if you build it, they will come".
> Unless there is a concrete use case now, I would not be adding anything.
> 
> My entire workflow for what I created in the tracing system was "I have a
> need, I will implement it". The "need" came first. I then wrote code to
> satisfy that need. It should not be the other way around.
> 
> -- Steve

Thanks a lot for the feedback and guidance.

I understand your point that new functionality should be driven by real
needs rather than exploratory ideas.

I’ll keep looking into this. If I find a concrete use case that
demonstrates clear value, I’ll bring it back for discussion.

Thanks again.