When I use bpf_program__attach_kprobe_multi_opts() to attach a BPF program that calls bpf_get_stackid() on the arm64 architecture,
I find that the stack trace cannot be obtained: trace->nr in __bpf_get_stackid() is 0, and the function returns -EFAULT.
For example:
diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi.c b/tools/testing/selftests/bpf/progs/kprobe_multi.c
index 9e1ca8e34913..844fa88cdc4c 100644
--- a/tools/testing/selftests/bpf/progs/kprobe_multi.c
+++ b/tools/testing/selftests/bpf/progs/kprobe_multi.c
@@ -36,6 +36,15 @@ __u64 kretprobe_test6_result = 0;
 __u64 kretprobe_test7_result = 0;
 __u64 kretprobe_test8_result = 0;
 
+typedef __u64 stack_trace_t[2];
+
+struct {
+	__uint(type, BPF_MAP_TYPE_STACK_TRACE);
+	__uint(max_entries, 1024);
+	__type(key, __u32);
+	__type(value, stack_trace_t);
+} stacks SEC(".maps");
+
 static void kprobe_multi_check(void *ctx, bool is_return)
 {
 	if (bpf_get_current_pid_tgid() >> 32 != pid)
@@ -100,7 +109,9 @@ int test_kretprobe(struct pt_regs *ctx)
 SEC("kprobe.multi")
 int test_kprobe_manual(struct pt_regs *ctx)
 {
+	int id = bpf_get_stackid(ctx, &stacks, 0);
 	kprobe_multi_check(ctx, false);
+	bpf_printk("stackid: %d\n", id);
 	return 0;
 }
./test_progs -t kprobe_multi_test/attach_api_pattern
#155/4 kprobe_multi_test/attach_api_pattern:OK
#155 kprobe_multi_test:OK
#156 kprobe_multi_testmod_test:OK
Summary: 2/1 PASSED, 0 SKIPPED, 0 FAILED
cat /sys/kernel/debug/tracing/trace
test_progs-68315 [004] ...1. 13377.097527: bpf_trace_printk: stackid: -14
......
Test Version:
6ff4a0fa3e1 ("bpf, arm64: Call bpf_jit_binary_pack_finalize() in bpf_jit_free()")
Linux localhost.localdomain 6.17.0-rc5+ #2 SMP PREEMPT_DYNAMIC Fri Sep 19 10:29:07 CST 2025 aarch64 aarch64 aarch64 GNU/Linux
clang version 17.0.6 ( 17.0.6-30.p03.ky11)
gcc (GCC) 12.3.1 (kylin 12.3.1-62.p02.ky11)
GNU Make 4.4.1
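
For reference, the same attach path can also be exercised outside the selftest
harness with a minimal libbpf loader along these lines. This is only a sketch:
the object file name, program name and symbol pattern below are illustrative
assumptions, not taken from the report.

// SPDX-License-Identifier: GPL-2.0
#include <errno.h>
#include <stdio.h>
#include <bpf/libbpf.h>

int main(void)
{
	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts);
	struct bpf_object *obj;
	struct bpf_program *prog;
	struct bpf_link *link;

	/* hypothetical object built from the selftest program above */
	obj = bpf_object__open_file("kprobe_multi.bpf.o", NULL);
	if (!obj || bpf_object__load(obj)) {
		fprintf(stderr, "failed to open/load object\n");
		return 1;
	}

	prog = bpf_object__find_program_by_name(obj, "test_kprobe_manual");
	if (!prog) {
		fprintf(stderr, "program not found\n");
		return 1;
	}

	/* attach to a glob of kernel functions; the program then calls
	 * bpf_get_stackid() every time one of them is hit */
	link = bpf_program__attach_kprobe_multi_opts(prog, "bpf_fentry_test*", &opts);
	if (!link) {
		fprintf(stderr, "attach failed: %d\n", -errno);
		return 1;
	}

	getchar();	/* trigger the probed functions, then check trace_pipe */

	bpf_link__destroy(link);
	bpf_object__close(obj);
	return 0;
}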
On Fri, Sep 19, 2025 at 12:19 AM Feng Yang <yangfeng59949@163.com> wrote:
>
> When I use bpf_program__attach_kprobe_multi_opts() to attach a BPF program that calls bpf_get_stackid() on the arm64 architecture,
> I find that the stack trace cannot be obtained: trace->nr in __bpf_get_stackid() is 0, and the function returns -EFAULT.
>
> For example:

SNIP

> SEC("kprobe.multi")
> int test_kprobe_manual(struct pt_regs *ctx)
> {
> +	int id = bpf_get_stackid(ctx, &stacks, 0);

ftrace_partial_regs() is supposed to work on x86 and arm64,
but since multi-kprobe is the only user...

I suspect the arm64 implementation wasn't really tested.
Or maybe there is some other issue.

Masami, Jiri, thoughts?
On Fri, 19 Sep 2025 19:56:20 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Fri, Sep 19, 2025 at 12:19 AM Feng Yang <yangfeng59949@163.com> wrote:
> >
> > When I use bpf_program__attach_kprobe_multi_opts() to attach a BPF program that calls bpf_get_stackid() on the arm64 architecture,
> > I find that the stack trace cannot be obtained: trace->nr in __bpf_get_stackid() is 0, and the function returns -EFAULT.

SNIP

> ftrace_partial_regs() is supposed to work on x86 and arm64,
> but since multi-kprobe is the only user...

It should be able to unwind the stack. It saves sp, pc, lr and fp:

	regs->sp = afregs->sp;
	regs->pc = afregs->pc;
	regs->regs[29] = afregs->fp;
	regs->regs[30] = afregs->lr;

> I suspect the arm64 implementation wasn't really tested.
> Or maybe there is some other issue.

It depends on how bpf_get_stackid() works. Some registers that function
needs may not be saved.

If it returns -EFAULT, it means get_perf_callchain() returned NULL:

struct perf_callchain_entry *
get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
		   u32 max_stack, bool crosstask, bool add_mark)
{
...
	entry = get_callchain_entry(&rctx);
	if (!entry)
		return NULL;

Thus `get_callchain_entry(&rctx)` returned NULL. But if so, this is not
related to ftrace_partial_regs(), because get_callchain_entry() only returns
the per-cpu callchain working buffer for the context; it does not decode the
stack.

struct perf_callchain_entry *get_callchain_entry(int *rctx)
{
	int cpu;
	struct callchain_cpus_entries *entries;

	*rctx = get_recursion_context(this_cpu_ptr(callchain_recursion));
	if (*rctx == -1)
		return NULL;

	entries = rcu_dereference(callchain_cpus_entries);
	if (!entries) {
		put_recursion_context(this_cpu_ptr(callchain_recursion), *rctx);
		return NULL;
	}

	cpu = smp_processor_id();

	return (((void *)entries->cpu_entries[cpu]) +
		(*rctx * perf_callchain_entry__sizeof()));
}

What context does BPF expect, and how does it detect it?

Thank you,

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Sun, 21 Sep 2025 22:30:37 +0900 Masami Hiramatsu wrote: > On Fri, 19 Sep 2025 19:56:20 -0700 > Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > > On Fri, Sep 19, 2025 at 12:19 AM Feng Yang <yangfeng59949@163.com> wrote: > > > > > > When I use bpf_program__attach_kprobe_multi_opts to hook a BPF program that contains the bpf_get_stackid function on the arm64 architecture, > > > I find that the stack trace cannot be obtained. The trace->nr in __bpf_get_stackid is 0, and the function returns -EFAULT. > > > > > > For example: > > > diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi.c b/tools/testing/selftests/bpf/progs/kprobe_multi.c > > > index 9e1ca8e34913..844fa88cdc4c 100644 > > > --- a/tools/testing/selftests/bpf/progs/kprobe_multi.c > > > +++ b/tools/testing/selftests/bpf/progs/kprobe_multi.c > > > @@ -36,6 +36,15 @@ __u64 kretprobe_test6_result = 0; > > > __u64 kretprobe_test7_result = 0; > > > __u64 kretprobe_test8_result = 0; > > > > > > +typedef __u64 stack_trace_t[2]; > > > + > > > +struct { > > > + __uint(type, BPF_MAP_TYPE_STACK_TRACE); > > > + __uint(max_entries, 1024); > > > + __type(key, __u32); > > > + __type(value, stack_trace_t); > > > +} stacks SEC(".maps"); > > > + > > > static void kprobe_multi_check(void *ctx, bool is_return) > > > { > > > if (bpf_get_current_pid_tgid() >> 32 != pid) > > > @@ -100,7 +109,9 @@ int test_kretprobe(struct pt_regs *ctx) > > > SEC("kprobe.multi") > > > int test_kprobe_manual(struct pt_regs *ctx) > > > { > > > + int id = bpf_get_stackid(ctx, &stacks, 0); > > > > ftrace_partial_regs() supposed to work on x86 and arm64, > > but since multi-kprobe is the only user... > > It should be able to unwind stack. It saves sp, pc, lr, fp. > > regs->sp = afregs->sp; > regs->pc = afregs->pc; > regs->regs[29] = afregs->fp; > regs->regs[30] = afregs->lr; > > > I suspect the arm64 implementation wasn't really tested. > > Or maybe there is some other issue. > > It depends on how bpf_get_stackid() works. Some registers for that > function may not be saved. > > If it returns -EFAULT, the get_perf_callchain() returns NULL. > During my test, the reason for returning -EFAULT was that trace->nr was 0. static long __bpf_get_stackid(struct bpf_map *map, struct perf_callchain_entry *trace, u64 flags) { struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); struct stack_map_bucket *bucket, *new_bucket, *old_bucket; u32 skip = flags & BPF_F_SKIP_FIELD_MASK; u32 hash, id, trace_nr, trace_len; bool user = flags & BPF_F_USER_STACK; u64 *ips; bool hash_matches; if (trace->nr <= skip) /* skipping more than usable stack trace */ return -EFAULT; ...... } thanks
On Mon, 22 Sep 2025 10:15:31 +0800
Feng Yang <yangfeng59949@163.com> wrote:

SNIP

> During my test, the reason for returning -EFAULT was that trace->nr was 0:
>
> static long __bpf_get_stackid(struct bpf_map *map,
> 			      struct perf_callchain_entry *trace, u64 flags)
> {
> 	...
> 	if (trace->nr <= skip)
> 		/* skipping more than usable stack trace */
> 		return -EFAULT;
> 	......

Hmm. The "trace" is returned from get_perf_callchain():

get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
		   u32 max_stack, bool crosstask, bool add_mark)
{
...
	if (kernel && !user_mode(regs)) {
		if (add_mark)
			perf_callchain_store_context(&ctx, PERF_CONTEXT_KERNEL);
		perf_callchain_kernel(&ctx, regs);
	}

So this means `perf_callchain_kernel(&ctx, regs);` fails to unwind the stack.

perf_callchain_kernel() -> arch_stack_walk() -> kunwind_stack_walk()

That is `kunwind_init_from_regs()` and `do_kunwind()`:

	if (regs) {
		if (task != current)
			return -EINVAL;
		kunwind_init_from_regs(&state, regs);
	} else if (task == current) {
		kunwind_init_from_caller(&state);
	} else {
		kunwind_init_from_task(&state, task);
	}

	return do_kunwind(&state, consume_state, cookie);

For initialization, it should be OK because it only refers to pc and
fp (regs[29]), which are recovered by ftrace_partial_regs():

static __always_inline void
kunwind_init_from_regs(struct kunwind_state *state,
		       struct pt_regs *regs)
{
	kunwind_init(state, current);

	state->regs = regs;
	state->common.fp = regs->regs[29];
	state->common.pc = regs->pc;
	state->source = KUNWIND_SOURCE_REGS_PC;
}

And do_kunwind() should increase trace->nr before returning, unless
`kunwind_recover_return_address()` fails:

static __always_inline int
do_kunwind(struct kunwind_state *state, kunwind_consume_fn consume_state,
	   void *cookie)
{
	int ret;

	ret = kunwind_recover_return_address(state);
	if (ret)
		return ret;

	while (1) {
		if (!consume_state(state, cookie))  <--- this increases trace->nr (*)
			return -EINVAL;
		ret = kunwind_next(state);
		if (ret == -ENOENT)
			return 0;
		if (ret < 0)
			return ret;
	}
}

(*) consume_state() == arch_kunwind_consume_entry()
    -> data->consume_entry == callchain_trace() -> perf_callchain_store()

Hmm, can you also dump the regs and insert pr_info() to find
which function fails?

Thanks,

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
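
One possible way to do that, as a local debug hack rather than a proposed
patch, is to print what get_perf_callchain() sees right before it decides
whether to run the kernel unwinder (based on the body quoted above; the exact
placement and messages are illustrative):

	/* local debugging only: dump the state the unwinder will start from */
	if (kernel && !user_mode(regs)) {
		pr_info("get_perf_callchain: pc=%lx, user_mode=%d\n",
			instruction_pointer(regs), user_mode(regs));
		if (add_mark)
			perf_callchain_store_context(&ctx, PERF_CONTEXT_KERNEL);
		perf_callchain_kernel(&ctx, regs);
	} else {
		pr_info("get_perf_callchain: kernel unwind skipped (user_mode=%d)\n",
			user_mode(regs));
	}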
On Wed, 24 Sep 2025 00:32:15 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

SNIP

> Hmm, can you also dump the regs and insert pr_info() to find
> which function fails?

After testing, I found that the stack could not be obtained because
user_mode(regs) returned 1.

Referring to the arch_ftrace_fill_perf_regs() function in your email
(https://lore.kernel.org/all/173518997908.391279.15910334347345106424.stgit@devnote2/),
I made the following modification: by setting the value of pstate, the stack
can now be obtained successfully.

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 058a99aa44bd..f2814175e958 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -159,11 +159,13 @@ ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
 {
 	struct __arch_ftrace_regs *afregs = arch_ftrace_regs(fregs);
 
 	memcpy(regs->regs, afregs->regs, sizeof(afregs->regs));
 	regs->sp = afregs->sp;
 	regs->pc = afregs->pc;
 	regs->regs[29] = afregs->fp;
 	regs->regs[30] = afregs->lr;
+	regs->pstate = PSR_MODE_EL1h;
 	return regs;
 }

However, I'm not sure if there will be any other impacts...

By the way, during my testing I also noticed that when bpf_get_stackid() is
executed from kprobes or tracepoints, the command

  bpftrace -e 'kprobe:bpf_get_stackid {printf("bpf_get_stackid\n");}'

produces no output. However, it does produce output when bpf_get_stackid() is
invoked via uprobes. This also happens on the x86 architecture; could this be
a bug as well?

Thanks.
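
For completeness on why a zeroed pstate causes this: arm64's user_mode() only
looks at the mode field of pstate, and an all-zero value decodes as EL0t (user
mode), so get_perf_callchain() skips the kernel unwind entirely. A small
standalone sketch of that check, with the PSR_MODE_* values as I understand
them from arch/arm64/include/asm/ptrace.h (worth double-checking):

#include <stdio.h>
#include <stdint.h>

/* mode-field values as defined (to my understanding) in
 * arch/arm64/include/asm/ptrace.h */
#define PSR_MODE_EL0t	0x00000000
#define PSR_MODE_EL1h	0x00000005
#define PSR_MODE_MASK	0x0000000f

/* mirrors arm64's user_mode(regs): mode field == EL0t means "came from user" */
static int user_mode_pstate(uint64_t pstate)
{
	return (pstate & PSR_MODE_MASK) == PSR_MODE_EL0t;
}

int main(void)
{
	/* pstate left zeroed by ftrace_partial_regs() before the change */
	printf("pstate = 0             -> user_mode() = %d (kernel unwind skipped)\n",
	       user_mode_pstate(0));
	/* pstate forced to EL1h by the proposed change */
	printf("pstate = PSR_MODE_EL1h -> user_mode() = %d (kernel unwind runs)\n",
	       user_mode_pstate(PSR_MODE_EL1h));
	return 0;
}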
On Wed, Sep 24, 2025 at 02:25:36PM +0800, Feng Yang wrote:

SNIP

> > Hmm, can you also dump the regs and insert pr_info() to find
> > which function fails?

> After testing, I found that the stack could not be obtained because
> user_mode(regs) returned 1.
>
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
SNIP
> +	regs->pstate = PSR_MODE_EL1h;
> 	return regs;
> }
>
> However, I'm not sure if there will be any other impacts...

Nice, the test works for me with this change. Could you please send a formal
patch? I can polish and send out the test [1].

Thanks,
jirka

[1] https://github.com/kernel-patches/bpf/pull/9845/commits/11b31cd465a83b8719cb06331c8e81794cca40fa
On Wed, 24 Sep 2025 14:25:36 +0800
Feng Yang <yangfeng59949@163.com> wrote:

> By the way, during my testing I also noticed that when bpf_get_stackid() is
> executed from kprobes or tracepoints, the command
> bpftrace -e 'kprobe:bpf_get_stackid {printf("bpf_get_stackid\n");}'
> produces no output.

I think this is because the kprobe on bpf_get_stackid() is a recursive
event from kprobes: a kprobe handler cannot be reentered.

> However, it does produce output when bpf_get_stackid() is invoked via uprobes.
> This also happens on the x86 architecture; could this be a bug as well?

Maybe if bpf_get_stackid() is kicked from uprobes, it is not a recursive
call from kprobes, so it works.

So this is expected behavior, not a bug. Sorry for the confusion.

Thank you,

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
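
For context, the behaviour described above comes from the kprobes reentrancy
rule: a probe that fires while another kprobe handler is already active on the
same CPU does not get its handlers called; the hit is only counted in nmissed.
A simplified sketch of that guard (not verbatim kernel code; the real arch
handlers do more):

	/* simplified sketch of the arch kprobe breakpoint-handler path */
	if (kprobe_running()) {
		/*
		 * Nested hit: another kprobe handler (here, the one running the
		 * BPF program that calls bpf_get_stackid()) is still active on
		 * this CPU, so the new probe's handlers are skipped and the hit
		 * is only accounted as missed.
		 */
		kprobes_inc_nmissed_count(p);
		/* the probed instruction is still single-stepped/emulated */
	}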
On Wed, 24 Sep 2025 17:04:16 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> > After testing, I found that the stack could not be obtained because
> > user_mode(regs) returned 1.
SNIP
> > +	regs->pstate = PSR_MODE_EL1h;
>
> Good catch!

Should I submit this patch, or will you carry out a more complete fix?

> > By the way, during my testing I also noticed that when bpf_get_stackid() is
> > executed from kprobes or tracepoints, the command
> > bpftrace -e 'kprobe:bpf_get_stackid {printf("bpf_get_stackid\n");}'
> > produces no output.
>
> I think this is because the kprobe on bpf_get_stackid() is a recursive
> event from kprobes: a kprobe handler cannot be reentered.
>
> > However, it does produce output when bpf_get_stackid() is invoked via uprobes.
> > This also happens on the x86 architecture; could this be a bug as well?
>
> Maybe if bpf_get_stackid() is kicked from uprobes, it is not a recursive
> call from kprobes, so it works.
>
> So this is expected behavior, not a bug. Sorry for the confusion.
>
> Thank you,

Thank you very much for your explanation.
On Wed, 24 Sep 2025 14:25:36 +0800
Feng Yang <yangfeng59949@163.com> wrote:

SNIP

> After testing, I found that the stack could not be obtained because
> user_mode(regs) returned 1.
> Referring to the arch_ftrace_fill_perf_regs() function in your email
> (https://lore.kernel.org/all/173518997908.391279.15910334347345106424.stgit@devnote2/),
> I made the following modification: by setting the value of pstate, the stack
> can now be obtained successfully.
>
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index 058a99aa44bd..f2814175e958 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -159,11 +159,13 @@ ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
>  {
>  	struct __arch_ftrace_regs *afregs = arch_ftrace_regs(fregs);
>
>  	memcpy(regs->regs, afregs->regs, sizeof(afregs->regs));
>  	regs->sp = afregs->sp;
>  	regs->pc = afregs->pc;
>  	regs->regs[29] = afregs->fp;
>  	regs->regs[30] = afregs->lr;
> +	regs->pstate = PSR_MODE_EL1h;

Good catch!

>  	return regs;
>  }
>
> However, I'm not sure if there will be any other impacts...
>
> By the way, during my testing I also noticed that when bpf_get_stackid() is
> executed from kprobes or tracepoints, the command
> bpftrace -e 'kprobe:bpf_get_stackid {printf("bpf_get_stackid\n");}'
> produces no output.

That is strange, since normal kprobes pass a full pt_regs.

> However, it does produce output when bpf_get_stackid() is invoked via uprobes.
> This also happens on the x86 architecture; could this be a bug as well?

Yes, it must be a bug.

Thanks!

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>