arch/x86/kernel/alternative.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
kprobe_multi_link_exit_handler -> is_endbr.
It is not protected by the "bpf_prog_active", so it can't be traced by
kprobe-multi, which can cause recurring and panic the kernel. Fix it by
make it notrace.
Fixes: 72e213a7ccf9 ("x86/ibt: Clean up is_endbr()")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
arch/x86/kernel/alternative.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69fb818df2ee..f67a31c77c89 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
#ifdef CONFIG_X86_KERNEL_IBT
-__noendbr bool is_endbr(u32 *val)
+__noendbr notrace bool is_endbr(u32 *val)
{
u32 endbr;
--
2.51.0
On Thu, 18 Sep 2025 20:09:39 +0800 Menglong Dong <menglong8.dong@gmail.com> wrote: > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > kprobe_multi_link_exit_handler -> is_endbr. > > It is not protected by the "bpf_prog_active", so it can't be traced by > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > make it notrace. Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check recursion, but fgraph only detects the recursion in the entry handler. Thus it happens in the exit handler, fprobe does not check the recursion. But since the fprobe provides users to register callback at exit, it should check the recursion in return path too. Thanks, -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write: > On Thu, 18 Sep 2025 20:09:39 +0800 > Menglong Dong <menglong8.dong@gmail.com> wrote: > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > kprobe_multi_link_exit_handler -> is_endbr. > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > make it notrace. > > Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check > recursion, but fgraph only detects the recursion in the entry handler. > Thus it happens in the exit handler, fprobe does not check the recursion. > > But since the fprobe provides users to register callback at exit, it > should check the recursion in return path too. That's a good idea to provide recursion checking for the exit handler, which is able to solve this problem too. If so, we don't need to check the recursion on the kprobe-multi anymore. Do we? Thanks! Menglong Dong > > Thanks, > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org> >
On Fri, 19 Sep 2025 16:58:57 +0800 Menglong Dong <menglong8.dong@gmail.com> wrote: > On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write: > > On Thu, 18 Sep 2025 20:09:39 +0800 > > Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > make it notrace. > > > > Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check > > recursion, but fgraph only detects the recursion in the entry handler. > > Thus it happens in the exit handler, fprobe does not check the recursion. > > > > But since the fprobe provides users to register callback at exit, it > > should check the recursion in return path too. > > That's a good idea to provide recursion checking for the exit handler, > which is able to solve this problem too. > > If so, we don't need to check the recursion on the kprobe-multi anymore. > Do we? Yes, but *if possible*, please avoid calling such functions from fprobe callbacks. This just prevents kernel crash from such recursion, but that means it is not possible to trace such functions. Thank you, > > Thanks! > Menglong Dong > > > > > Thanks, > > > > -- > > Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > kprobe_multi_link_exit_handler -> is_endbr. > > It is not protected by the "bpf_prog_active", so it can't be traced by > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > make it notrace. This is very much a riddle wrapped in an enigma. Notably kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that cryptic next line sufficient to explain why its a problem. I suspect the is_endbr() you did mean is the one in arch_ftrace_get_symaddr(), but who knows. Also, depending on compiler insanity, it is possible the thing out-of-lines things like __is_endbr(), getting you yet another __fentry__ site. Please try again.
On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > kprobe_multi_link_exit_handler -> is_endbr. > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > make it notrace. > > This is very much a riddle wrapped in an enigma. Notably > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > cryptic next line sufficient to explain why its a problem. > > I suspect the is_endbr() you did mean is the one in > arch_ftrace_get_symaddr(), but who knows. Yeah, I mean kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> arch_ftrace_get_symaddr -> is_endbr actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > Also, depending on compiler insanity, it is possible the thing > out-of-lines things like __is_endbr(), getting you yet another > __fentry__ site. The panic happens when I run the bpf bench testing: ./bench kretprobe-multi-all And skip the "is_endbr" fix this problem. __is_endbr should be marked with "notrace" too. I slacked off on it, as it didn't happen in my case :/ > > Please try again.
On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote: > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > make it notrace. > > > > This is very much a riddle wrapped in an enigma. Notably > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > cryptic next line sufficient to explain why its a problem. > > > > I suspect the is_endbr() you did mean is the one in > > arch_ftrace_get_symaddr(), but who knows. > > Yeah, I mean > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > arch_ftrace_get_symaddr -> is_endbr > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > > Also, depending on compiler insanity, it is possible the thing > > out-of-lines things like __is_endbr(), getting you yet another > > __fentry__ site. > > The panic happens when I run the bpf bench testing: > ./bench kretprobe-multi-all > > And skip the "is_endbr" fix this problem. But why does it panic? Supposedly you've done the analysis; but then forgot to write it down? Why is kprobe_multi_link_exit_handler() special; doesn't the issue also exist with kprobe_multi_link_handler() ? If so, removing __fentry__ isn't going to help much, you can just stick an actual kprobe in is_endbr(), right?
On Thu, 18 Sep 2025 18:56:56 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote: > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > make it notrace. > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > cryptic next line sufficient to explain why its a problem. > > > > > > I suspect the is_endbr() you did mean is the one in > > > arch_ftrace_get_symaddr(), but who knows. > > > > Yeah, I mean > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > arch_ftrace_get_symaddr -> is_endbr > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > > > > > Also, depending on compiler insanity, it is possible the thing > > > out-of-lines things like __is_endbr(), getting you yet another > > > __fentry__ site. > > > > The panic happens when I run the bpf bench testing: > > ./bench kretprobe-multi-all > > > > And skip the "is_endbr" fix this problem. > > But why does it panic? Supposedly you've done the analysis; but then > forgot to write it down? Yeah, that is an fprobe's bug. It should not panic. I sent a fix. https://lore.kernel.org/all/175828305637.117978.4183947592750468265.stgit@devnote2/ Thank you, > > Why is kprobe_multi_link_exit_handler() special; doesn't the issue also > exist with kprobe_multi_link_handler() ? If so, removing __fentry__ > isn't going to help much, you can just stick an actual kprobe in > is_endbr(), right? > -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Thu, Sep 18, 2025 at 6:32 AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > make it notrace. > > > > This is very much a riddle wrapped in an enigma. Notably > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > cryptic next line sufficient to explain why its a problem. > > > > I suspect the is_endbr() you did mean is the one in > > arch_ftrace_get_symaddr(), but who knows. > > Yeah, I mean > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > arch_ftrace_get_symaddr -> is_endbr > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. All this makes sense to me. __noendbr bool is_endbr(u32 *val) needs "notrace", since it's in alternative.c and won't get inlined (unless LTO+luck).
On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote: > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > make it notrace. > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > cryptic next line sufficient to explain why its a problem. > > > > > > I suspect the is_endbr() you did mean is the one in > > > arch_ftrace_get_symaddr(), but who knows. > > > > Yeah, I mean > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > arch_ftrace_get_symaddr -> is_endbr > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > All this makes sense to me. As written down, I'm still clueless. > __noendbr bool is_endbr(u32 *val) needs "notrace", > since it's in alternative.c and won't get inlined (unless LTO+luck). notrace don't help with kprobes in general, only with __fentry__ sites. I've still not clue why there is a panic, or why notrace would be sufficient.
On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote: > > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > > make it notrace. > > > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > > cryptic next line sufficient to explain why its a problem. > > > > > > > > I suspect the is_endbr() you did mean is the one in > > > > arch_ftrace_get_symaddr(), but who knows. > > > > > > Yeah, I mean > > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > > arch_ftrace_get_symaddr -> is_endbr > > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > All this makes sense to me. > > As written down, I'm still clueless. > > > __noendbr bool is_endbr(u32 *val) needs "notrace", > > since it's in alternative.c and won't get inlined (unless LTO+luck). > > notrace don't help with kprobes in general, only with __fentry__ sites. Are you sure ? Last time I checked "notrace" prevents kprobing anywhere in that function.
On Thu, Sep 18, 2025 at 10:53:51AM -0700, Alexei Starovoitov wrote: > > notrace don't help with kprobes in general, only with __fentry__ sites. > > Are you sure ? Last time I checked "notrace" prevents kprobing > anywhere in that function. #define notrace __attribute__((__no_instrument_function__)) AFAICT that only inhibits the compiler from generating __fentry__ calls. There is NOKPROBE_SYMBOL() to explicitly exclude kprobes from a symbol. And we have noinstr, which along with a gazillion code-gen attributes puts things in a .noinstr.text section and that section is also excluded from kprobe placement.
On Fri, Sep 19, 2025 at 1:54 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote: > > > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > > > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > > > make it notrace. > > > > > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > > > cryptic next line sufficient to explain why its a problem. > > > > > > > > > > I suspect the is_endbr() you did mean is the one in > > > > > arch_ftrace_get_symaddr(), but who knows. > > > > > > > > Yeah, I mean > > > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > > > arch_ftrace_get_symaddr -> is_endbr > > > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > > > All this makes sense to me. > > > > As written down, I'm still clueless. Ok, let me describe the problem in deetail. First of all, it has nothing to do with kprobe. The bpf program of type kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all about the ftrace, which means __fentry__. Second, let me explain the recur detection of the kprobe-multi. Let's take the is_endbr() for example. When it is hooked by the bpf program of type kretprobe-multi, following calling chain will happen: is_endbr -> __ftrace_return_to_handler -> fprobe_return -> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> arch_ftrace_get_symaddr -> is_endbr Look, is_endbr() is called again during the ftrace handler, so it will trigger the ftrace handler(__ftrace_return_to_handler) again, which causes recurrence. Such recurrence can be detected. In kprobe_multi_link_prog_run(), the percpu various "bpf_prog_active" will be increased by 1 before we run the bpf progs, and decrease by 1 after the bpf progs finish. If the kprobe_multi_link_prog_run() is triggered again during bpf progs run, it will check if bpf_prog_active is zero, and return directly if it is not. Therefore, recurrence can't happen within the "bpf_prog_active" protection. However, the calling to is_endbr() is not within that scope, which makes the recurrence happen. Hope I described it clearly 😉 Thanks! Menglong Dong > > > > > __noendbr bool is_endbr(u32 *val) needs "notrace", > > > since it's in alternative.c and won't get inlined (unless LTO+luck). > > > > notrace don't help with kprobes in general, only with __fentry__ sites. > > Are you sure ? Last time I checked "notrace" prevents kprobing > anywhere in that function.
On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote: > Ok, let me describe the problem in deetail. > > First of all, it has nothing to do with kprobe. The bpf program of type > kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all > about the ftrace, which means __fentry__. Well, that's not confusing at all. Something called kprobe-multi not being related to kprobes :-( > Second, let me explain the recur detection of the kprobe-multi. Let's > take the is_endbr() for example. When it is hooked by the bpf program > of type kretprobe-multi, following calling chain will happen: > > is_endbr -> __ftrace_return_to_handler -> fprobe_return -> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > arch_ftrace_get_symaddr -> is_endbr > > Look, is_endbr() is called again during the ftrace handler, so it will > trigger the ftrace handler(__ftrace_return_to_handler) again, which > causes recurrence. Right. > Such recurrence can be detected. In kprobe_multi_link_prog_run(), > the percpu various "bpf_prog_active" will be increased by 1 before we > run the bpf progs, and decrease by 1 after the bpf progs finish. If the > kprobe_multi_link_prog_run() is triggered again during bpf progs run, > it will check if bpf_prog_active is zero, and return directly if it is not. > Therefore, recurrence can't happen within the "bpf_prog_active" protection. As I think Masami already said, the problem is the layer. You're trying to fix an ftrace problem at the bpf layer. > However, the calling to is_endbr() is not within that scope, which makes > the recurrence happen. Sorta, I'm still sketchy on the whole kprobe-multi thing. Anyway, I don't mind making is_endbr() invisible to tracing, that might just have security benefits too. But I think first the ftrace folks need to figure out how to best kill that recursion, because I don't think is_endbr is particularly special here. It is just one more function that can emit a __fentry__ site. Anyway, something like the below would do: Note that without making __is_endbr() __always_inline, you run the risk of the compiler being retarded (they often are in the face of KASAN/UBSAN like) and deciding to out-of-line that function, resulting in yet another __fentry__ site. An added advantage of noinstr is that it is validated by objtool to never call to !noinstr code. As such, you can be sure there is no instrumentation in it. (the below hasn't been near a compiler) --- arch/x86/include/asm/ibt.h | 2 +- arch/x86/kernel/alternative.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h index 5e45d6424722..54937a527042 100644 --- a/arch/x86/include/asm/ibt.h +++ b/arch/x86/include/asm/ibt.h @@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void) return 0xd6401f0f; /* nopl -42(%rax) */ } -static inline bool __is_endbr(u32 val) +static __always_inline bool __is_endbr(u32 val) { if (val == gen_endbr_poison()) return true; diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 69fb818df2ee..f791e7abd466 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { } #ifdef CONFIG_X86_KERNEL_IBT -__noendbr bool is_endbr(u32 *val) +__noendbr noinstr bool is_endbr(u32 *val) { u32 endbr;
On 2025/9/22 14:52 Peter Zijlstra <peterz@infradead.org> write: > On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote: > > > Ok, let me describe the problem in deetail. > > > > First of all, it has nothing to do with kprobe. The bpf program of type > > kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all > > about the ftrace, which means __fentry__. > > Well, that's not confusing at all. Something called kprobe-multi not > being related to kprobes :-( > > > Second, let me explain the recur detection of the kprobe-multi. Let's > > take the is_endbr() for example. When it is hooked by the bpf program > > of type kretprobe-multi, following calling chain will happen: > > > > is_endbr -> __ftrace_return_to_handler -> fprobe_return -> > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > arch_ftrace_get_symaddr -> is_endbr > > > > Look, is_endbr() is called again during the ftrace handler, so it will > > trigger the ftrace handler(__ftrace_return_to_handler) again, which > > causes recurrence. > > Right. > > > Such recurrence can be detected. In kprobe_multi_link_prog_run(), > > the percpu various "bpf_prog_active" will be increased by 1 before we > > run the bpf progs, and decrease by 1 after the bpf progs finish. If the > > kprobe_multi_link_prog_run() is triggered again during bpf progs run, > > it will check if bpf_prog_active is zero, and return directly if it is not. > > Therefore, recurrence can't happen within the "bpf_prog_active" protection. > > As I think Masami already said, the problem is the layer. You're trying > to fix an ftrace problem at the bpf layer. Yeah, I see. And Masami has already posted a series for this problem in: https://lore.kernel.org/bpf/175852291163.307379.14414635977719513326.stgit@devnote2/ > > > However, the calling to is_endbr() is not within that scope, which makes > > the recurrence happen. > > Sorta, I'm still sketchy on the whole kprobe-multi thing. > > Anyway, I don't mind making is_endbr() invisible to tracing, that might > just have security benefits too. But I think first the ftrace folks need > to figure out how to best kill that recursion, because I don't think > is_endbr is particularly special here. So, does this patch seem useful after all? OK, I'll send a V2 base on your following suggestion. Thanks! Menglong Dong > > It is just one more function that can emit a __fentry__ site. > > Anyway, something like the below would do: > > Note that without making __is_endbr() __always_inline, you run the risk > of the compiler being retarded (they often are in the face of > KASAN/UBSAN like) and deciding to out-of-line that function, resulting > in yet another __fentry__ site. > > An added advantage of noinstr is that it is validated by objtool to > never call to !noinstr code. As such, you can be sure there is no > instrumentation in it. > > (the below hasn't been near a compiler) > > --- > arch/x86/include/asm/ibt.h | 2 +- > arch/x86/kernel/alternative.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h > index 5e45d6424722..54937a527042 100644 > --- a/arch/x86/include/asm/ibt.h > +++ b/arch/x86/include/asm/ibt.h > @@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void) > return 0xd6401f0f; /* nopl -42(%rax) */ > } > > -static inline bool __is_endbr(u32 val) > +static __always_inline bool __is_endbr(u32 val) > { > if (val == gen_endbr_poison()) > return true; > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c > index 69fb818df2ee..f791e7abd466 100644 > --- a/arch/x86/kernel/alternative.c > +++ b/arch/x86/kernel/alternative.c > @@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { } > > #ifdef CONFIG_X86_KERNEL_IBT > > -__noendbr bool is_endbr(u32 *val) > +__noendbr noinstr bool is_endbr(u32 *val) > { > u32 endbr; > > >
On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote: > > Anyway, I don't mind making is_endbr() invisible to tracing, that might > > just have security benefits too. But I think first the ftrace folks need > > to figure out how to best kill that recursion, because I don't think > > is_endbr is particularly special here. > > So, does this patch seem useful after all? The use lies in making it harder to find/manipulate endbr things. > OK, I'll send a V2 base on your following suggestion. Hold off until Masami/Steve have fixed the ftrace recursion issue. After that we can do this.
On 2025/9/22 15:19 Peter Zijlstra <peterz@infradead.org> write: > On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote: > > > > Anyway, I don't mind making is_endbr() invisible to tracing, that might > > > just have security benefits too. But I think first the ftrace folks need > > > to figure out how to best kill that recursion, because I don't think > > > is_endbr is particularly special here. > > > > So, does this patch seem useful after all? > > The use lies in making it harder to find/manipulate endbr things. > > > OK, I'll send a V2 base on your following suggestion. > > Hold off until Masami/Steve have fixed the ftrace recursion issue. After > that we can do this. OK! > >
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
function_graph_enter_regs() prevents itself from recursion by
ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
which is called at the exit, does not prevent such recursion.
Therefore, while it can prevent recursive calls from
fgraph_ops::entryfunc(), it is not able to prevent recursive calls
to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
This can lead an unexpected recursion bug reported by Menglong.
is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
-> kprobe_multi_link_exit_handler -> is_endbr.
To fix this issue, acquire ftrace_test_recursion_trylock() in the
__ftrace_return_to_handler() after unwind the shadow stack to mark
this section must prevent recursive call of fgraph inside user-defined
fgraph_ops::retfunc().
This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
fprobe on function-graph tracer"), because before that fgraph was
only used from the function graph tracer. Fprobe allowed user to run
any callbacks from fgraph after that commit.
Reported-by: Menglong Dong <menglong8.dong@gmail.com>
Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
kernel/trace/fgraph.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 1e3b32b1e82c..08dde420635b 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
unsigned long bitmap;
unsigned long ret;
int offset;
+ int bit;
int i;
ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
@@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
if (fregs)
ftrace_regs_set_instruction_pointer(fregs, ret);
+ bit = ftrace_test_recursion_trylock(trace.func, ret);
+ /*
+ * This must be succeeded because the entry handler returns before
+ * modifying the return address if it is nested. Anyway, we need to
+ * avoid calling user callbacks if it is nested.
+ */
+ if (WARN_ON_ONCE(bit < 0))
+ goto out;
+
#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
trace.retval = ftrace_regs_get_return_value(fregs);
#endif
@@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
}
}
+ ftrace_test_recursion_unlock(bit);
+out:
/*
* The ftrace_graph_return() may still access the current
* ret_stack structure, we need to make sure the update of
On Fri, 19 Sep 2025 20:57:36 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > function_graph_enter_regs() prevents itself from recursion by > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > which is called at the exit, does not prevent such recursion. > Therefore, while it can prevent recursive calls from > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > This can lead an unexpected recursion bug reported by Menglong. > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > -> kprobe_multi_link_exit_handler -> is_endbr. So basically its if the handler for the return part calls something that it is tracing, it can trigger the recursion? > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > __ftrace_return_to_handler() after unwind the shadow stack to mark > this section must prevent recursive call of fgraph inside user-defined > fgraph_ops::retfunc(). > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > fprobe on function-graph tracer"), because before that fgraph was > only used from the function graph tracer. Fprobe allowed user to run > any callbacks from fgraph after that commit. I would actually say it's because before this commit, the return handler callers never called anything that the entry handlers didn't already call. If there was recursion, the entry handler would catch it (and the entry tells fgraph if the exit handler should be called). The difference here is with fprobes, you can have the exit handler calling functions that the entry handler does not, which exposes more cases where recursion could happen. > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > Cc: stable@vger.kernel.org > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > --- > kernel/trace/fgraph.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > index 1e3b32b1e82c..08dde420635b 100644 > --- a/kernel/trace/fgraph.c > +++ b/kernel/trace/fgraph.c > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > unsigned long bitmap; > unsigned long ret; > int offset; > + int bit; > int i; > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > if (fregs) > ftrace_regs_set_instruction_pointer(fregs, ret); > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > + /* > + * This must be succeeded because the entry handler returns before > + * modifying the return address if it is nested. Anyway, we need to > + * avoid calling user callbacks if it is nested. > + */ > + if (WARN_ON_ONCE(bit < 0)) I'm not so sure we need the warn on here. We should probably hook it to the recursion detection infrastructure that the function tracer has. The reason I would say not to have the warn on, is because we don't have a warn on for recursion happening at the entry handler. Because this now is exposed by fprobe allowing different routines to be called at exit than what is used in entry, it can easily be triggered. -- Steve > + goto out; > + > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > trace.retval = ftrace_regs_get_return_value(fregs); > #endif > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > } > } > > + ftrace_test_recursion_unlock(bit); > +out: > /* > * The ftrace_graph_return() may still access the current > * ret_stack structure, we need to make sure the update of
On Fri, 19 Sep 2025 11:27:46 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Fri, 19 Sep 2025 20:57:36 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > function_graph_enter_regs() prevents itself from recursion by > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > which is called at the exit, does not prevent such recursion. > > Therefore, while it can prevent recursive calls from > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > This can lead an unexpected recursion bug reported by Menglong. > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > -> kprobe_multi_link_exit_handler -> is_endbr. > > So basically its if the handler for the return part calls something that it > is tracing, it can trigger the recursion? > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > this section must prevent recursive call of fgraph inside user-defined > > fgraph_ops::retfunc(). > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > fprobe on function-graph tracer"), because before that fgraph was > > only used from the function graph tracer. Fprobe allowed user to run > > any callbacks from fgraph after that commit. > > I would actually say it's because before this commit, the return handler > callers never called anything that the entry handlers didn't already call. > If there was recursion, the entry handler would catch it (and the entry > tells fgraph if the exit handler should be called). > > The difference here is with fprobes, you can have the exit handler calling > functions that the entry handler does not, which exposes more cases where > recursion could happen. > > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > Cc: stable@vger.kernel.org > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > kernel/trace/fgraph.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > index 1e3b32b1e82c..08dde420635b 100644 > > --- a/kernel/trace/fgraph.c > > +++ b/kernel/trace/fgraph.c > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > unsigned long bitmap; > > unsigned long ret; > > int offset; > > + int bit; > > int i; > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > if (fregs) > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > + /* > > + * This must be succeeded because the entry handler returns before > > + * modifying the return address if it is nested. Anyway, we need to > > + * avoid calling user callbacks if it is nested. > > + */ > > + if (WARN_ON_ONCE(bit < 0)) > > I'm not so sure we need the warn on here. We should probably hook it to the > recursion detection infrastructure that the function tracer has. The reason I added WARN_ON here is this should not happen. The recursion should be detected at the entry, not at exit. The __ftrace_return_to_handler() is installed only if the entry does NOT detect the recursion. In that case I think the exit should succeed recursion_trylock(). > > The reason I would say not to have the warn on, is because we don't have a > warn on for recursion happening at the entry handler. Because this now is > exposed by fprobe allowing different routines to be called at exit than > what is used in entry, it can easily be triggered. At the entry, if it detect recursion, it exits soon. But here we have to process stack unwind to get the return address. This recursion_trylock() is to mark this is in the critical section, not detect it. Thank you, > > -- Steve > > > > > + goto out; > > + > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > trace.retval = ftrace_regs_get_return_value(fregs); > > #endif > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > } > > } > > > > + ftrace_test_recursion_unlock(bit); > > +out: > > /* > > * The ftrace_graph_return() may still access the current > > * ret_stack structure, we need to make sure the update of > -- Masami Hiramatsu (Google) <mhiramat@kernel.org>
On Sun, 21 Sep 2025 13:05:19 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > The reason I would say not to have the warn on, is because we don't have a > > warn on for recursion happening at the entry handler. Because this now is > > exposed by fprobe allowing different routines to be called at exit than > > what is used in entry, it can easily be triggered. > > At the entry, if it detect recursion, it exits soon. But here we have to > process stack unwind to get the return address. This recursion_trylock() > is to mark this is in the critical section, not detect it. Ah, because the first instance of the exit callback sets the recursion bit. This will cause recursed entry calls to detect the recursion bit and return without setting the exit handler to be called. That is, by setting the recursion bit in the exit handler, it will cause a recursion in entry to fail before the exit is called again. I'd like to update the comment: + bit = ftrace_test_recursion_trylock(trace.func, ret); + /* + * This must be succeeded because the entry handler returns before + * modifying the return address if it is nested. Anyway, we need to + * avoid calling user callbacks if it is nested. + */ + if (WARN_ON_ONCE(bit < 0)) + goto out; + to: /* * Setting the recursion bit here will cause the graph entry to * detect recursion before the exit handle will. If the ext * handler detects recursion, something went wrong. */ if (WARN_ON_ONCE(bit < 0)) -- Steve
© 2016 - 2025 Red Hat, Inc.