Background
==========
Alexei noted we should use preempt_disable to protect get_perf_callchain
in bpf stackmap.
https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com

A previous patch was submitted to attempt fixing this issue, and Andrii
suggested teaching get_perf_callchain to let us pass that buffer directly,
avoiding the unnecessary copy.
https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev

Proposed Solution
=================
Add an external perf_callchain_entry parameter to get_perf_callchain to
allow us to use an external buffer from the BPF side. The biggest advantage
is that it avoids unnecessary copies.

Todo
====
If the above changes are reasonable, it seems that get_callchain_entry_for_task
could also use an external perf_callchain_entry.

But I'm not sure if this modification is appropriate. After all, the
implementation of get_callchain_entry in the perf subsystem seems much more
complex than directly using an external buffer.

Comments and suggestions are always welcome.

Tao Chen (2):
  perf: Use extern perf_callchain_entry for get_perf_callchain
  bpf: Pass external callchain entry to get_perf_callchain

 include/linux/perf_event.h |  5 +++--
 kernel/bpf/stackmap.c      | 19 +++++++++++--------
 kernel/events/callchain.c  | 18 ++++++++++++------
 kernel/events/core.c       |  2 +-
 4 files changed, 27 insertions(+), 17 deletions(-)

--
2.48.1
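The idea in the Proposed Solution section can be illustrated with a small
user-space sketch. This is not the kernel code and not the patch itself:
struct callchain_entry, internal_entry, get_callchain() and callchain_demo()
are hypothetical stand-ins, and the "unwinder" only writes fake addresses.
It just shows the pattern of letting the caller hand in its own buffer so the
result does not have to be copied out of a shared one afterwards.

```c
#include <assert.h>
#include <string.h>

#define MAX_STACK 8

/* Hypothetical, simplified stand-in for struct perf_callchain_entry. */
struct callchain_entry {
	unsigned long long nr;
	unsigned long long ip[MAX_STACK];
};

/* Stand-in for the shared buffer owned by the unwinder (assumption). */
static struct callchain_entry internal_entry;

/*
 * Hypothetical unwinder. If the caller passes external_entry, frames are
 * written straight into it; otherwise the shared internal buffer is used.
 * The "frames" are fake addresses, only there to make the flow visible.
 */
static struct callchain_entry *
get_callchain(struct callchain_entry *external_entry, unsigned int max_stack)
{
	struct callchain_entry *entry =
		external_entry ? external_entry : &internal_entry;
	unsigned int i;

	if (max_stack > MAX_STACK)
		max_stack = MAX_STACK;

	entry->nr = 0;
	for (i = 0; i < max_stack; i++)
		entry->ip[entry->nr++] = 0x1000 + 0x10 * i;

	return entry;
}

/* Usage: the caller owns the buffer, so no copy out of shared storage. */
static unsigned long long callchain_demo(void)
{
	struct callchain_entry mine;
	struct callchain_entry *entry;

	memset(&mine, 0, sizeof(mine));
	entry = get_callchain(&mine, 4);
	assert(entry == &mine);	/* result lives in the caller's buffer */

	return entry->nr;
}
```

With the internal buffer, the caller would have to keep the shared storage
stable (for example by disabling preemption) until it has copied the frames
out; with a caller-owned buffer that copy step disappears in this sketch.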
On Tue, Oct 14, 2025 at 01:47:19AM +0800, Tao Chen wrote:
> Background
> ==========
> Alexei noted we should use preempt_disable to protect get_perf_callchain
> in bpf stackmap.
> https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com
>
> A previous patch was submitted to attempt fixing this issue. And Andrii
> suggested teach get_perf_callchain to let us pass that buffer directly to
> avoid that unnecessary copy.
> https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev
>
> Proposed Solution
> =================
> Add external perf_callchain_entry parameter for get_perf_callchain to
> allow us to use external buffer from BPF side. The biggest advantage is
> that it can reduce unnecessary copies.
>
> Todo
> ====
> If the above changes are reasonable, it seems that get_callchain_entry_for_task
> could also use an external perf_callchain_entry.
>
> But I'm not sure if this modification is appropriate. After all, the
> implementation of get_callchain_entry in the perf subsystem seems much more
> complex than directly using an external buffer.
>
> Comments and suggestions are always welcome.
>
> Tao Chen (2):
>   perf: Use extern perf_callchain_entry for get_perf_callchain
>   bpf: Pass external callchain entry to get_perf_callchain

hi,
I can't get this applied on bpf-next/master, what do I miss?

thanks,
jirka

>
>  include/linux/perf_event.h |  5 +++--
>  kernel/bpf/stackmap.c      | 19 +++++++++++--------
>  kernel/events/callchain.c  | 18 ++++++++++++------
>  kernel/events/core.c       |  2 +-
>  4 files changed, 27 insertions(+), 17 deletions(-)
>
> --
> 2.48.1
>
On 10/13/25 1:41 PM, Jiri Olsa wrote:
> On Tue, Oct 14, 2025 at 01:47:19AM +0800, Tao Chen wrote:
>> Background
>> ==========
>> Alexei noted we should use preempt_disable to protect get_perf_callchain
>> in bpf stackmap.
>> https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com
>>
>> A previous patch was submitted to attempt fixing this issue. And Andrii
>> suggested teach get_perf_callchain to let us pass that buffer directly to
>> avoid that unnecessary copy.
>> https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev
>>
>> Proposed Solution
>> =================
>> Add external perf_callchain_entry parameter for get_perf_callchain to
>> allow us to use external buffer from BPF side. The biggest advantage is
>> that it can reduce unnecessary copies.
>>
>> Todo
>> ====
>> If the above changes are reasonable, it seems that get_callchain_entry_for_task
>> could also use an external perf_callchain_entry.
>>
>> But I'm not sure if this modification is appropriate. After all, the
>> implementation of get_callchain_entry in the perf subsystem seems much more
>> complex than directly using an external buffer.
>>
>> Comments and suggestions are always welcome.
>>
>> Tao Chen (2):
>> perf: Use extern perf_callchain_entry for get_perf_callchain
>> bpf: Pass external callchain entry to get_perf_callchain
> hi,
> I can't get this applied on bpf-next/master, what do I miss?

This patch is not based on top of the latest bpf/bpf-next tree.
The current diff:

 struct perf_callchain_entry *
-get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
-                   u32 max_stack, bool crosstask, bool add_mark)
+get_perf_callchain(struct pt_regs *regs, struct perf_callchain_entry *external_entry,
+                   u32 init_nr, bool kernel, bool user, u32 max_stack, bool crosstask,
+                   bool add_mark)
 {

The actual signature in kernel/events/callchain.c:

struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
                   u32 max_stack, bool crosstask, bool add_mark)
{

>
> thanks,
> jirka
>
>
>> include/linux/perf_event.h | 5 +++--
>> kernel/bpf/stackmap.c | 19 +++++++++++--------
>> kernel/events/callchain.c | 18 ++++++++++++------
>> kernel/events/core.c | 2 +-
>> 4 files changed, 27 insertions(+), 17 deletions(-)
>>
>> --
>> 2.48.1
>>
On 2025/10/14 05:37, Yonghong Song wrote:
>
>
> On 10/13/25 1:41 PM, Jiri Olsa wrote:
>> On Tue, Oct 14, 2025 at 01:47:19AM +0800, Tao Chen wrote:
>>> Background
>>> ==========
>>> Alexei noted we should use preempt_disable to protect get_perf_callchain
>>> in bpf stackmap.
>>> https://lore.kernel.org/bpf/CAADnVQ+s8B7-fvR1TNO-bniSyKv57cH_ihRszmZV7pQDyV=VDQ@mail.gmail.com
>>>
>>> A previous patch was submitted to attempt fixing this issue. And Andrii
>>> suggested teach get_perf_callchain to let us pass that buffer directly to
>>> avoid that unnecessary copy.
>>> https://lore.kernel.org/bpf/20250926153952.1661146-1-chen.dylane@linux.dev
>>>
>>> Proposed Solution
>>> =================
>>> Add external perf_callchain_entry parameter for get_perf_callchain to
>>> allow us to use external buffer from BPF side. The biggest advantage is
>>> that it can reduce unnecessary copies.
>>>
>>> Todo
>>> ====
>>> If the above changes are reasonable, it seems that get_callchain_entry_for_task
>>> could also use an external perf_callchain_entry.
>>>
>>> But I'm not sure if this modification is appropriate. After all, the
>>> implementation of get_callchain_entry in the perf subsystem seems much more
>>> complex than directly using an external buffer.
>>>
>>> Comments and suggestions are always welcome.
>>>
>>> Tao Chen (2):
>>> perf: Use extern perf_callchain_entry for get_perf_callchain
>>> bpf: Pass external callchain entry to get_perf_callchain
>> hi,
>> I can't get this applied on bpf-next/master, what do I miss?
>
> This patch is not based on top of the latest bpf/bpf-next tree.
> The current diff:
>
>  struct perf_callchain_entry *
> -get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
> -                   u32 max_stack, bool crosstask, bool add_mark)
> +get_perf_callchain(struct pt_regs *regs, struct perf_callchain_entry *external_entry,
> +                   u32 init_nr, bool kernel, bool user, u32 max_stack, bool crosstask,
> +                   bool add_mark)
>  {
>
> The actual signature in kernel/events/callchain.c:
>
> struct perf_callchain_entry *
> get_perf_callchain(struct pt_regs *regs, bool kernel, bool user,
>                    u32 max_stack, bool crosstask, bool add_mark)
> {
>
>
>>
>> thanks,
>> jirka
>>
>>
>>> include/linux/perf_event.h | 5 +++--
>>> kernel/bpf/stackmap.c | 19 +++++++++++--------
>>> kernel/events/callchain.c | 18 ++++++++++++------
>>> kernel/events/core.c | 2 +-
>>> 4 files changed, 27 insertions(+), 17 deletions(-)
>>>
>>> --
>>> 2.48.1
>>>
>
My mistake. I’ll update the code and resend it.
--
Best Regards
Tao Chen