[PATCH bpf-next RFC 0/2] Pass external callchain entry to get_perf_callchain

Tao Chen posted 2 patches 2 months ago
There is a newer version of this series
[PATCH bpf-next RFC 0/2] Pass external callchain entry to get_perf_callchain
Posted by Tao Chen 2 months ago
Background
==========
Alexei noted that we should use preempt_disable() to protect the call to
get_perf_callchain() in the BPF stackmap code.
https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com

A previous patch attempted to fix this issue, and Andrii suggested
teaching get_perf_callchain to accept a caller-provided buffer directly,
which avoids the unnecessary copy.
https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev

Proposed Solution
=================
Add an external perf_callchain_entry parameter to get_perf_callchain so
that the BPF side can supply its own buffer. The biggest advantage is
that this reduces unnecessary copies.
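
As a rough illustration of the idea, here is a userspace sketch (not the
kernel code). Everything in it is a hypothetical stand-in chosen for the
example: struct callchain_entry models perf_callchain_entry, internal_entry
models the buffer owned by perf, and get_callchain() models a
get_perf_callchain() that accepts an optional external entry. The real
function takes pt_regs and several flags that are omitted here.

```c
#include <stddef.h>

#define MAX_DEPTH 8

/* Simplified stand-in for the kernel's struct perf_callchain_entry. */
struct callchain_entry {
	unsigned long long nr;
	unsigned long long ip[MAX_DEPTH];
};

/* Shared buffer standing in for the entry that perf manages internally. */
struct callchain_entry internal_entry;

/*
 * Hypothetical model of the proposed interface: if external_entry is
 * non-NULL, frames are written straight into the caller's buffer;
 * otherwise the shared internal buffer is used, as before.
 */
struct callchain_entry *
get_callchain(struct callchain_entry *external_entry,
	      const unsigned long long *frames, unsigned int nr_frames)
{
	struct callchain_entry *entry =
		external_entry ? external_entry : &internal_entry;
	unsigned int i;

	entry->nr = 0;
	/* Record at most MAX_DEPTH frames, mirroring a max_stack limit. */
	for (i = 0; i < nr_frames && i < MAX_DEPTH; i++)
		entry->ip[entry->nr++] = frames[i];

	return entry;
}
```

With a NULL external entry, the caller gets a pointer to the shared buffer
and has to copy the result out before the buffer can be reused. With a
caller-owned entry, e.g. get_callchain(&mine, frames, 3), the frames land
in the caller's buffer directly and no copy is needed.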

Todo
====
If the above changes are reasonable, it seems that get_callchain_entry_for_task
could also use an external perf_callchain_entry.

But I'm not sure if this modification is appropriate. After all, the
implementation of get_callchain_entry in the perf subsystem seems much more
complex than directly using an external buffer.

Comments and suggestions are always welcome.

Tao Chen (2):
  perf: Use extern perf_callchain_entry for get_perf_callchain
  bpf: Pass external callchain entry to get_perf_callchain

 include/linux/perf_event.h |  5 +++--
 kernel/bpf/stackmap.c      | 19 +++++++++++--------
 kernel/events/callchain.c  | 18 ++++++++++++------
 kernel/events/core.c       |  2 +-
 4 files changed, 27 insertions(+), 17 deletions(-)

-- 
2.48.1
Re: [PATCH bpf-next RFC 0/2] Pass external callchain entry to get_perf_callchain
Posted by Jiri Olsa 2 months ago
On Tue, Oct 14, 2025 at 01:47:19AM +0800, Tao Chen wrote:
> Background
> ==========
> Alexei noted we should use preempt_disable to protect get_perf_callchain
> in bpf stackmap.
> https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com
> 
> A previous patch was submitted to attempt fixing this issue. And Andrii
> suggested teach get_perf_callchain to let us pass that buffer directly to
> avoid that unnecessary copy.
> https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev
> 
> Proposed Solution
> =================
> Add external perf_callchain_entry parameter for get_perf_callchain to
> allow us to use external buffer from BPF side. The biggest advantage is
> that it can reduce unnecessary copies.
> 
> Todo
> ====
> If the above changes are reasonable, it seems that get_callchain_entry_for_task
> could also use an external perf_callchain_entry.
> 
> But I'm not sure if this modification is appropriate. After all, the
> implementation of get_callchain_entry in the perf subsystem seems much more
> complex than directly using an external buffer.
> 
> Comments and suggestions are always welcome.
> 
> Tao Chen (2):
>   perf: Use extern perf_callchain_entry for get_perf_callchain
>   bpf: Pass external callchain entry to get_perf_callchain

hi,
I can't get this applied on bpf-next/master, what am I missing?

thanks,
jirka


> 
>  include/linux/perf_event.h |  5 +++--
>  kernel/bpf/stackmap.c      | 19 +++++++++++--------
>  kernel/events/callchain.c  | 18 ++++++++++++------
>  kernel/events/core.c       |  2 +-
>  4 files changed, 27 insertions(+), 17 deletions(-)
> 
> -- 
> 2.48.1
>
Re: [PATCH bpf-next RFC 0/2] Pass external callchain entry to get_perf_callchain
Posted by Yonghong Song 2 months ago

On 10/13/25 1:41 PM, Jiri Olsa wrote:
> On Tue, Oct 14, 2025 at 01:47:19AM +0800, Tao Chen wrote:
>> Background
>> ==========
>> Alexei noted we should use preempt_disable to protect get_perf_callchain
>> in bpf stackmap.
>> https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com
>>
>> A previous patch was submitted to attempt fixing this issue. And Andrii
>> suggested teach get_perf_callchain to let us pass that buffer directly to
>> avoid that unnecessary copy.
>> https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev
>>
>> Proposed Solution
>> =================
>> Add external perf_callchain_entry parameter for get_perf_callchain to
>> allow us to use external buffer from BPF side. The biggest advantage is
>> that it can reduce unnecessary copies.
>>
>> Todo
>> ====
>> If the above changes are reasonable, it seems that get_callchain_entry_for_task
>> could also use an external perf_callchain_entry.
>>
>> But I'm not sure if this modification is appropriate. After all, the
>> implementation of get_callchain_entry in the perf subsystem seems much more
>> complex than directly using an external buffer.
>>
>> Comments and suggestions are always welcome.
>>
>> Tao Chen (2):
>>    perf: Use extern perf_callchain_entry for get_perf_callchain
>>    bpf: Pass external callchain entry to get_perf_callchain
> hi,
> I can't get this applied on bpf-next/master, what do I miss?

This patch is not based on top of the latest bpf/bpf-next tree.
The current diff:

  struct perf_callchain_entry *
-get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
-		   u32 max_stack, bool crosstask, bool add_mark)
+get_perf_callchain(struct pt_regs *regs, struct perf_callchain_entry *external_entry,
+		   u32 init_nr, bool kernel, bool user, u32 max_stack, bool crosstask,
+		   bool add_mark)
  {

The actual signature in kernel/events/callchain.c

struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
                    u32 max_stack, bool crosstask, bool add_mark)
{


>
> thanks,
> jirka
>
>
>>   include/linux/perf_event.h |  5 +++--
>>   kernel/bpf/stackmap.c      | 19 +++++++++++--------
>>   kernel/events/callchain.c  | 18 ++++++++++++------
>>   kernel/events/core.c       |  2 +-
>>   4 files changed, 27 insertions(+), 17 deletions(-)
>>
>> -- 
>> 2.48.1
>>
Re: [PATCH bpf-next RFC 0/2] Pass external callchain entry to get_perf_callchain
Posted by Tao Chen 2 months ago
On 2025/10/14 05:37, Yonghong Song wrote:
> 
> 
> On 10/13/25 1:41 PM, Jiri Olsa wrote:
>> On Tue, Oct 14, 2025 at 01:47:19AM +0800, Tao Chen wrote:
>>> Background
>>> ==========
>>> Alexei noted we should use preempt_disable to protect get_perf_callchain
>>> in bpf stackmap.
>>> https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com
>>>
>>> A previous patch was submitted to attempt fixing this issue. And Andrii
>>> suggested teach get_perf_callchain to let us pass that buffer directly to
>>> avoid that unnecessary copy.
>>> https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev
>>>
>>> Proposed Solution
>>> =================
>>> Add external perf_callchain_entry parameter for get_perf_callchain to
>>> allow us to use external buffer from BPF side. The biggest advantage is
>>> that it can reduce unnecessary copies.
>>>
>>> Todo
>>> ====
>>> If the above changes are reasonable, it seems that get_callchain_entry_for_task
>>> could also use an external perf_callchain_entry.
>>>
>>> But I'm not sure if this modification is appropriate. After all, the
>>> implementation of get_callchain_entry in the perf subsystem seems much more
>>> complex than directly using an external buffer.
>>>
>>> Comments and suggestions are always welcome.
>>>
>>> Tao Chen (2):
>>>    perf: Use extern perf_callchain_entry for get_perf_callchain
>>>    bpf: Pass external callchain entry to get_perf_callchain
>> hi,
>> I can't get this applied on bpf-next/master, what do I miss?
> 
> This patch is not based on top of the latest bpf/bpf-next tree.
> The current diff:
> 
>   struct perf_callchain_entry *
> -get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
> -           u32 max_stack, bool crosstask, bool add_mark)
> +get_perf_callchain(struct pt_regs *regs, struct perf_callchain_entry *external_entry,
> +           u32 init_nr, bool kernel, bool user, u32 max_stack, bool crosstask,
> +           bool add_mark)
>   {
> 
> The actual signature in kernel/events/callchain.c
> 
> struct perf_callchain_entry *
> get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
>                     u32 max_stack, bool crosstask, bool add_mark)
> {
> 
> 
>>
>> thanks,
>> jirka
>>
>>
>>>   include/linux/perf_event.h |  5 +++--
>>>   kernel/bpf/stackmap.c      | 19 +++++++++++--------
>>>   kernel/events/callchain.c  | 18 ++++++++++++------
>>>   kernel/events/core.c       |  2 +-
>>>   4 files changed, 27 insertions(+), 17 deletions(-)
>>>
>>> -- 
>>> 2.48.1
>>>
> 

My mistake. I’ll update the code and resend it.

-- 
Best Regards
Tao Chen