x86/ibt: make is_endbr() notrace

[PATCH] x86/ibt: make is_endbr() notrace

Posted by Menglong Dong 4 months, 3 weeks ago

is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
kprobe_multi_link_exit_handler -> is_endbr.

It is not protected by the "bpf_prog_active", so it can't be traced by
kprobe-multi, which can cause recurring and panic the kernel. Fix it by
make it notrace.

Fixes: 72e213a7ccf9 ("x86/ibt: Clean up is_endbr()")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
 arch/x86/kernel/alternative.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69fb818df2ee..f67a31c77c89 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
 
 #ifdef CONFIG_X86_KERNEL_IBT
 
-__noendbr bool is_endbr(u32 *val)
+__noendbr notrace bool is_endbr(u32 *val)
 {
 	u32 endbr;
 
-- 
2.51.0

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Masami Hiramatsu (Google) 4 months, 3 weeks ago

On Thu, 18 Sep 2025 20:09:39 +0800
Menglong Dong <menglong8.dong@gmail.com> wrote:

> is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> kprobe_multi_link_exit_handler -> is_endbr.
> 
> It is not protected by the "bpf_prog_active", so it can't be traced by
> kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> make it notrace.

Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check
recursion, but fgraph only detects the recursion in the entry handler.
Thus it happens in the exit handler, fprobe does not check the recursion.

But since the fprobe provides users to register callback at exit, it
should check the recursion in return path too.

Thanks,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Menglong Dong 4 months, 3 weeks ago

On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write:
> On Thu, 18 Sep 2025 20:09:39 +0800
> Menglong Dong <menglong8.dong@gmail.com> wrote:
> 
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > kprobe_multi_link_exit_handler -> is_endbr.
> > 
> > It is not protected by the "bpf_prog_active", so it can't be traced by
> > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > make it notrace.
> 
> Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check
> recursion, but fgraph only detects the recursion in the entry handler.
> Thus it happens in the exit handler, fprobe does not check the recursion.
> 
> But since the fprobe provides users to register callback at exit, it
> should check the recursion in return path too.

That's a good idea to provide recursion checking for the exit handler,
which is able to solve this problem too.

If so, we don't need to check the recursion on the kprobe-multi anymore.
Do we?

Thanks!
Menglong Dong

> 
> Thanks,
> 
> -- 
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
>

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Masami Hiramatsu (Google) 4 months, 3 weeks ago

On Fri, 19 Sep 2025 16:58:57 +0800
Menglong Dong <menglong8.dong@gmail.com> wrote:

> On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write:
> > On Thu, 18 Sep 2025 20:09:39 +0800
> > Menglong Dong <menglong8.dong@gmail.com> wrote:
> > 
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > kprobe_multi_link_exit_handler -> is_endbr.
> > > 
> > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > make it notrace.
> > 
> > Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check
> > recursion, but fgraph only detects the recursion in the entry handler.
> > Thus it happens in the exit handler, fprobe does not check the recursion.
> > 
> > But since the fprobe provides users to register callback at exit, it
> > should check the recursion in return path too.
> 
> That's a good idea to provide recursion checking for the exit handler,
> which is able to solve this problem too.
> 
> If so, we don't need to check the recursion on the kprobe-multi anymore.
> Do we?

Yes, but *if possible*, please avoid calling such functions from fprobe
callbacks. This just prevents kernel crash from such recursion, but that
means it is not possible to trace such functions.

Thank you,


> 
> Thanks!
> Menglong Dong
> 
> > 
> > Thanks,
> > 
> > -- 
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> 
> 
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Peter Zijlstra 4 months, 3 weeks ago

On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> kprobe_multi_link_exit_handler -> is_endbr.
> 
> It is not protected by the "bpf_prog_active", so it can't be traced by
> kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> make it notrace.

This is very much a riddle wrapped in an enigma. Notably
kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
cryptic next line sufficient to explain why its a problem.

I suspect the is_endbr() you did mean is the one in
arch_ftrace_get_symaddr(), but who knows.

Also, depending on compiler insanity, it is possible the thing
out-of-lines things like __is_endbr(), getting you yet another
__fentry__ site.

Please try again.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Menglong Dong 4 months, 3 weeks ago

On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > kprobe_multi_link_exit_handler -> is_endbr.
> >
> > It is not protected by the "bpf_prog_active", so it can't be traced by
> > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > make it notrace.
>
> This is very much a riddle wrapped in an enigma. Notably
> kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> cryptic next line sufficient to explain why its a problem.
>
> I suspect the is_endbr() you did mean is the one in
> arch_ftrace_get_symaddr(), but who knows.

Yeah, I mean
kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
arch_ftrace_get_symaddr -> is_endbr
actually. And CONFIG_X86_KERNEL_IBT is enabled of course.

>
> Also, depending on compiler insanity, it is possible the thing
> out-of-lines things like __is_endbr(), getting you yet another
> __fentry__ site.

The panic happens when I run the bpf bench testing:
  ./bench kretprobe-multi-all

And skip the "is_endbr" fix this problem.

__is_endbr should be marked with "notrace" too. I slacked off
on it, as it didn't happen in my case :/

>
> Please try again.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Peter Zijlstra 4 months, 3 weeks ago

On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote:
> On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > kprobe_multi_link_exit_handler -> is_endbr.
> > >
> > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > make it notrace.
> >
> > This is very much a riddle wrapped in an enigma. Notably
> > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > cryptic next line sufficient to explain why its a problem.
> >
> > I suspect the is_endbr() you did mean is the one in
> > arch_ftrace_get_symaddr(), but who knows.
> 
> Yeah, I mean
> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> arch_ftrace_get_symaddr -> is_endbr
> actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> 
> >
> > Also, depending on compiler insanity, it is possible the thing
> > out-of-lines things like __is_endbr(), getting you yet another
> > __fentry__ site.
> 
> The panic happens when I run the bpf bench testing:
>   ./bench kretprobe-multi-all
> 
> And skip the "is_endbr" fix this problem.

But why does it panic? Supposedly you've done the analysis; but then
forgot to write it down?

Why is kprobe_multi_link_exit_handler() special; doesn't the issue also
exist with kprobe_multi_link_handler() ? If so, removing __fentry__
isn't going to help much, you can just stick an actual kprobe in
is_endbr(), right?

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Masami Hiramatsu (Google) 4 months, 3 weeks ago

On Thu, 18 Sep 2025 18:56:56 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote:
> > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > >
> > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > make it notrace.
> > >
> > > This is very much a riddle wrapped in an enigma. Notably
> > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > cryptic next line sufficient to explain why its a problem.
> > >
> > > I suspect the is_endbr() you did mean is the one in
> > > arch_ftrace_get_symaddr(), but who knows.
> > 
> > Yeah, I mean
> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > arch_ftrace_get_symaddr -> is_endbr
> > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> > 
> > >
> > > Also, depending on compiler insanity, it is possible the thing
> > > out-of-lines things like __is_endbr(), getting you yet another
> > > __fentry__ site.
> > 
> > The panic happens when I run the bpf bench testing:
> >   ./bench kretprobe-multi-all
> > 
> > And skip the "is_endbr" fix this problem.
> 
> But why does it panic? Supposedly you've done the analysis; but then
> forgot to write it down?

Yeah, that is an fprobe's bug. It should not panic. I sent a fix.

https://lore.kernel.org/all/175828305637.117978.4183947592750468265.stgit@devnote2/

Thank you,

> 
> Why is kprobe_multi_link_exit_handler() special; doesn't the issue also
> exist with kprobe_multi_link_handler() ? If so, removing __fentry__
> isn't going to help much, you can just stick an actual kprobe in
> is_endbr(), right?
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Alexei Starovoitov 4 months, 3 weeks ago

On Thu, Sep 18, 2025 at 6:32 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > kprobe_multi_link_exit_handler -> is_endbr.
> > >
> > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > make it notrace.
> >
> > This is very much a riddle wrapped in an enigma. Notably
> > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > cryptic next line sufficient to explain why its a problem.
> >
> > I suspect the is_endbr() you did mean is the one in
> > arch_ftrace_get_symaddr(), but who knows.
>
> Yeah, I mean
> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> arch_ftrace_get_symaddr -> is_endbr
> actually. And CONFIG_X86_KERNEL_IBT is enabled of course.

All this makes sense to me.
__noendbr bool is_endbr(u32 *val) needs "notrace",
since it's in alternative.c and won't get inlined (unless LTO+luck).

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Peter Zijlstra 4 months, 3 weeks ago

On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote:
> On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > >
> > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > make it notrace.
> > >
> > > This is very much a riddle wrapped in an enigma. Notably
> > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > cryptic next line sufficient to explain why its a problem.
> > >
> > > I suspect the is_endbr() you did mean is the one in
> > > arch_ftrace_get_symaddr(), but who knows.
> >
> > Yeah, I mean
> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > arch_ftrace_get_symaddr -> is_endbr
> > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> 
> All this makes sense to me.

As written down, I'm still clueless.

> __noendbr bool is_endbr(u32 *val) needs "notrace",
> since it's in alternative.c and won't get inlined (unless LTO+luck).

notrace don't help with kprobes in general, only with __fentry__ sites.

I've still not clue why there is a panic, or why notrace would be
sufficient.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Alexei Starovoitov 4 months, 3 weeks ago

On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote:
> > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > > >
> > > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > > make it notrace.
> > > >
> > > > This is very much a riddle wrapped in an enigma. Notably
> > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > > cryptic next line sufficient to explain why its a problem.
> > > >
> > > > I suspect the is_endbr() you did mean is the one in
> > > > arch_ftrace_get_symaddr(), but who knows.
> > >
> > > Yeah, I mean
> > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > > arch_ftrace_get_symaddr -> is_endbr
> > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> >
> > All this makes sense to me.
>
> As written down, I'm still clueless.
>
> > __noendbr bool is_endbr(u32 *val) needs "notrace",
> > since it's in alternative.c and won't get inlined (unless LTO+luck).
>
> notrace don't help with kprobes in general, only with __fentry__ sites.

Are you sure ? Last time I checked "notrace" prevents kprobing
anywhere in that function.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Peter Zijlstra 4 months, 2 weeks ago

On Thu, Sep 18, 2025 at 10:53:51AM -0700, Alexei Starovoitov wrote:

> > notrace don't help with kprobes in general, only with __fentry__ sites.
> 
> Are you sure ? Last time I checked "notrace" prevents kprobing
> anywhere in that function.

#define notrace                 __attribute__((__no_instrument_function__))

AFAICT that only inhibits the compiler from generating __fentry__ calls.

There is NOKPROBE_SYMBOL() to explicitly exclude kprobes from a symbol.

And we have noinstr, which along with a gazillion code-gen attributes
puts things in a .noinstr.text section and that section is also excluded
from kprobe placement.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Menglong Dong 4 months, 3 weeks ago

On Fri, Sep 19, 2025 at 1:54 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote:
> > > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > > > >
> > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > > > make it notrace.
> > > > >
> > > > > This is very much a riddle wrapped in an enigma. Notably
> > > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > > > cryptic next line sufficient to explain why its a problem.
> > > > >
> > > > > I suspect the is_endbr() you did mean is the one in
> > > > > arch_ftrace_get_symaddr(), but who knows.
> > > >
> > > > Yeah, I mean
> > > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > > > arch_ftrace_get_symaddr -> is_endbr
> > > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> > >
> > > All this makes sense to me.
> >
> > As written down, I'm still clueless.

Ok, let me describe the problem in deetail.

First of all, it has nothing to do with kprobe. The bpf program of type
kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all
about the ftrace, which means __fentry__.

Second, let me explain the recur detection of the kprobe-multi. Let's
take the is_endbr() for example. When it is hooked by the bpf program
of type kretprobe-multi, following calling chain will happen:

  is_endbr -> __ftrace_return_to_handler -> fprobe_return ->
  kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
  arch_ftrace_get_symaddr -> is_endbr

Look, is_endbr() is called again during the ftrace handler, so it will
trigger the ftrace handler(__ftrace_return_to_handler) again, which
causes recurrence.

Such recurrence can be detected. In kprobe_multi_link_prog_run(),
the percpu various "bpf_prog_active" will be increased by 1 before we
run the bpf progs, and decrease by 1 after the bpf progs finish. If the
kprobe_multi_link_prog_run() is triggered again during bpf progs run,
it will check if bpf_prog_active is zero, and return directly if it is not.
Therefore, recurrence can't happen within the "bpf_prog_active" protection.

However, the calling to is_endbr() is not within that scope, which makes
the recurrence happen.

Hope I described it clearly 😉

Thanks!
Menglong Dong

> >
> > > __noendbr bool is_endbr(u32 *val) needs "notrace",
> > > since it's in alternative.c and won't get inlined (unless LTO+luck).
> >
> > notrace don't help with kprobes in general, only with __fentry__ sites.
>
> Are you sure ? Last time I checked "notrace" prevents kprobing
> anywhere in that function.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Peter Zijlstra 4 months, 2 weeks ago

On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote:

> Ok, let me describe the problem in deetail.
> 
> First of all, it has nothing to do with kprobe. The bpf program of type
> kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all
> about the ftrace, which means __fentry__.

Well, that's not confusing at all. Something called kprobe-multi not
being related to kprobes :-(

> Second, let me explain the recur detection of the kprobe-multi. Let's
> take the is_endbr() for example. When it is hooked by the bpf program
> of type kretprobe-multi, following calling chain will happen:
> 
>   is_endbr -> __ftrace_return_to_handler -> fprobe_return ->
>   kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
>   arch_ftrace_get_symaddr -> is_endbr
> 
> Look, is_endbr() is called again during the ftrace handler, so it will
> trigger the ftrace handler(__ftrace_return_to_handler) again, which
> causes recurrence.

Right.

> Such recurrence can be detected. In kprobe_multi_link_prog_run(),
> the percpu various "bpf_prog_active" will be increased by 1 before we
> run the bpf progs, and decrease by 1 after the bpf progs finish. If the
> kprobe_multi_link_prog_run() is triggered again during bpf progs run,
> it will check if bpf_prog_active is zero, and return directly if it is not.
> Therefore, recurrence can't happen within the "bpf_prog_active" protection.

As I think Masami already said, the problem is the layer. You're trying
to fix an ftrace problem at the bpf layer.

> However, the calling to is_endbr() is not within that scope, which makes
> the recurrence happen.

Sorta, I'm still sketchy on the whole kprobe-multi thing.

Anyway, I don't mind making is_endbr() invisible to tracing, that might
just have security benefits too. But I think first the ftrace folks need
to figure out how to best kill that recursion, because I don't think
is_endbr is particularly special here.

It is just one more function that can emit a __fentry__ site.

Anyway, something like the below would do:

Note that without making __is_endbr() __always_inline, you run the risk
of the compiler being retarded (they often are in the face of
KASAN/UBSAN like) and deciding to out-of-line that function, resulting
in yet another __fentry__ site.

An added advantage of noinstr is that it is validated by objtool to
never call to !noinstr code. As such, you can be sure there is no
instrumentation in it.

(the below hasn't been near a compiler)

---
 arch/x86/include/asm/ibt.h    | 2 +-
 arch/x86/kernel/alternative.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h
index 5e45d6424722..54937a527042 100644
--- a/arch/x86/include/asm/ibt.h
+++ b/arch/x86/include/asm/ibt.h
@@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void)
 	return 0xd6401f0f; /* nopl -42(%rax) */
 }
 
-static inline bool __is_endbr(u32 val)
+static __always_inline bool __is_endbr(u32 val)
 {
 	if (val == gen_endbr_poison())
 		return true;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69fb818df2ee..f791e7abd466 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
 
 #ifdef CONFIG_X86_KERNEL_IBT
 
-__noendbr bool is_endbr(u32 *val)
+__noendbr noinstr bool is_endbr(u32 *val)
 {
 	u32 endbr;

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by menglong.dong@linux.dev 4 months, 2 weeks ago

On 2025/9/22 14:52 Peter Zijlstra <peterz@infradead.org> write:
> On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote:
> 
> > Ok, let me describe the problem in deetail.
> > 
> > First of all, it has nothing to do with kprobe. The bpf program of type
> > kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all
> > about the ftrace, which means __fentry__.
> 
> Well, that's not confusing at all. Something called kprobe-multi not
> being related to kprobes :-(
> 
> > Second, let me explain the recur detection of the kprobe-multi. Let's
> > take the is_endbr() for example. When it is hooked by the bpf program
> > of type kretprobe-multi, following calling chain will happen:
> > 
> >   is_endbr -> __ftrace_return_to_handler -> fprobe_return ->
> >   kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> >   arch_ftrace_get_symaddr -> is_endbr
> > 
> > Look, is_endbr() is called again during the ftrace handler, so it will
> > trigger the ftrace handler(__ftrace_return_to_handler) again, which
> > causes recurrence.
> 
> Right.
> 
> > Such recurrence can be detected. In kprobe_multi_link_prog_run(),
> > the percpu various "bpf_prog_active" will be increased by 1 before we
> > run the bpf progs, and decrease by 1 after the bpf progs finish. If the
> > kprobe_multi_link_prog_run() is triggered again during bpf progs run,
> > it will check if bpf_prog_active is zero, and return directly if it is not.
> > Therefore, recurrence can't happen within the "bpf_prog_active" protection.
> 
> As I think Masami already said, the problem is the layer. You're trying
> to fix an ftrace problem at the bpf layer.

Yeah, I see. And Masami has already posted a series for this
problem in:

https://lore.kernel.org/bpf/175852291163.307379.14414635977719513326.stgit@devnote2/

> 
> > However, the calling to is_endbr() is not within that scope, which makes
> > the recurrence happen.
> 
> Sorta, I'm still sketchy on the whole kprobe-multi thing.
> 
> Anyway, I don't mind making is_endbr() invisible to tracing, that might
> just have security benefits too. But I think first the ftrace folks need
> to figure out how to best kill that recursion, because I don't think
> is_endbr is particularly special here.

So, does this patch seem useful after all?

OK, I'll send a V2 base on your following suggestion.

Thanks!
Menglong Dong

> 
> It is just one more function that can emit a __fentry__ site.
> 
> Anyway, something like the below would do:
> 
> Note that without making __is_endbr() __always_inline, you run the risk
> of the compiler being retarded (they often are in the face of
> KASAN/UBSAN like) and deciding to out-of-line that function, resulting
> in yet another __fentry__ site.
> 
> An added advantage of noinstr is that it is validated by objtool to
> never call to !noinstr code. As such, you can be sure there is no
> instrumentation in it.
> 
> (the below hasn't been near a compiler)
> 
> ---
>  arch/x86/include/asm/ibt.h    | 2 +-
>  arch/x86/kernel/alternative.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h
> index 5e45d6424722..54937a527042 100644
> --- a/arch/x86/include/asm/ibt.h
> +++ b/arch/x86/include/asm/ibt.h
> @@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void)
>  	return 0xd6401f0f; /* nopl -42(%rax) */
>  }
>  
> -static inline bool __is_endbr(u32 val)
> +static __always_inline bool __is_endbr(u32 val)
>  {
>  	if (val == gen_endbr_poison())
>  		return true;
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 69fb818df2ee..f791e7abd466 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
>  
>  #ifdef CONFIG_X86_KERNEL_IBT
>  
> -__noendbr bool is_endbr(u32 *val)
> +__noendbr noinstr bool is_endbr(u32 *val)
>  {
>  	u32 endbr;
>  
> 
>

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Peter Zijlstra 4 months, 2 weeks ago

On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote:

> > Anyway, I don't mind making is_endbr() invisible to tracing, that might
> > just have security benefits too. But I think first the ftrace folks need
> > to figure out how to best kill that recursion, because I don't think
> > is_endbr is particularly special here.
> 
> So, does this patch seem useful after all?

The use lies in making it harder to find/manipulate endbr things.

> OK, I'll send a V2 base on your following suggestion.

Hold off until Masami/Steve have fixed the ftrace recursion issue. After
that we can do this.

Re: [PATCH] x86/ibt: make is_endbr() notrace

Posted by Menglong Dong 4 months, 2 weeks ago

On 2025/9/22 15:19 Peter Zijlstra <peterz@infradead.org> write:
> On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote:
> 
> > > Anyway, I don't mind making is_endbr() invisible to tracing, that might
> > > just have security benefits too. But I think first the ftrace folks need
> > > to figure out how to best kill that recursion, because I don't think
> > > is_endbr is particularly special here.
> > 
> > So, does this patch seem useful after all?
> 
> The use lies in making it harder to find/manipulate endbr things.
> 
> > OK, I'll send a V2 base on your following suggestion.
> 
> Hold off until Masami/Steve have fixed the ftrace recursion issue. After
> that we can do this.

OK!

> 
>

[PATCH] tracing: fgraph: Protect return handler from recursion loop

Posted by Masami Hiramatsu (Google) 4 months, 3 weeks ago

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

function_graph_enter_regs() prevents itself from recursion by
ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
which is called at the exit, does not prevent such recursion.
Therefore, while it can prevent recursive calls from
fgraph_ops::entryfunc(), it is not able to prevent recursive calls
to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
This can lead an unexpected recursion bug reported by Menglong.

 is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
  -> kprobe_multi_link_exit_handler -> is_endbr.

To fix this issue, acquire ftrace_test_recursion_trylock() in the
__ftrace_return_to_handler() after unwind the shadow stack to mark
this section must prevent recursive call of fgraph inside user-defined
fgraph_ops::retfunc().

This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
fprobe on function-graph tracer"), because before that fgraph was
only used from the function graph tracer. Fprobe allowed user to run
any callbacks from fgraph after that commit.

Reported-by: Menglong Dong <menglong8.dong@gmail.com>
Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 kernel/trace/fgraph.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 1e3b32b1e82c..08dde420635b 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
 	unsigned long bitmap;
 	unsigned long ret;
 	int offset;
+	int bit;
 	int i;
 
 	ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
@@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
 	if (fregs)
 		ftrace_regs_set_instruction_pointer(fregs, ret);
 
+	bit = ftrace_test_recursion_trylock(trace.func, ret);
+	/*
+	 * This must be succeeded because the entry handler returns before
+	 * modifying the return address if it is nested. Anyway, we need to
+	 * avoid calling user callbacks if it is nested.
+	 */
+	if (WARN_ON_ONCE(bit < 0))
+		goto out;
+
 #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
 	trace.retval = ftrace_regs_get_return_value(fregs);
 #endif
@@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
 		}
 	}
 
+	ftrace_test_recursion_unlock(bit);
+out:
 	/*
 	 * The ftrace_graph_return() may still access the current
 	 * ret_stack structure, we need to make sure the update of

Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop

Posted by Steven Rostedt 4 months, 3 weeks ago

On Fri, 19 Sep 2025 20:57:36 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> function_graph_enter_regs() prevents itself from recursion by
> ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> which is called at the exit, does not prevent such recursion.
> Therefore, while it can prevent recursive calls from
> fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> This can lead an unexpected recursion bug reported by Menglong.
> 
>  is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
>   -> kprobe_multi_link_exit_handler -> is_endbr.  

So basically its if the handler for the return part calls something that it
is tracing, it can trigger the recursion?

> 
> To fix this issue, acquire ftrace_test_recursion_trylock() in the
> __ftrace_return_to_handler() after unwind the shadow stack to mark
> this section must prevent recursive call of fgraph inside user-defined
> fgraph_ops::retfunc().
> 
> This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> fprobe on function-graph tracer"), because before that fgraph was
> only used from the function graph tracer. Fprobe allowed user to run
> any callbacks from fgraph after that commit.

I would actually say it's because before this commit, the return handler
callers never called anything that the entry handlers didn't already call.
If there was recursion, the entry handler would catch it (and the entry
tells fgraph if the exit handler should be called).

The difference here is with fprobes, you can have the exit handler calling
functions that the entry handler does not, which exposes more cases where
recursion could happen.

> 
> Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  kernel/trace/fgraph.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 1e3b32b1e82c..08dde420635b 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
>  	unsigned long bitmap;
>  	unsigned long ret;
>  	int offset;
> +	int bit;
>  	int i;
>  
>  	ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
>  	if (fregs)
>  		ftrace_regs_set_instruction_pointer(fregs, ret);
>  
> +	bit = ftrace_test_recursion_trylock(trace.func, ret);
> +	/*
> +	 * This must be succeeded because the entry handler returns before
> +	 * modifying the return address if it is nested. Anyway, we need to
> +	 * avoid calling user callbacks if it is nested.
> +	 */
> +	if (WARN_ON_ONCE(bit < 0))

I'm not so sure we need the warn on here. We should probably hook it to the
recursion detection infrastructure that the function tracer has.

The reason I would say not to have the warn on, is because we don't have a
warn on for recursion happening at the entry handler. Because this now is
exposed by fprobe allowing different routines to be called at exit than
what is used in entry, it can easily be triggered.

-- Steve



> +		goto out;
> +
>  #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
>  	trace.retval = ftrace_regs_get_return_value(fregs);
>  #endif
> @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
>  		}
>  	}
>  
> +	ftrace_test_recursion_unlock(bit);
> +out:
>  	/*
>  	 * The ftrace_graph_return() may still access the current
>  	 * ret_stack structure, we need to make sure the update of

Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop

Posted by Masami Hiramatsu (Google) 4 months, 2 weeks ago

On Fri, 19 Sep 2025 11:27:46 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 19 Sep 2025 20:57:36 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> 
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > 
> > function_graph_enter_regs() prevents itself from recursion by
> > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > which is called at the exit, does not prevent such recursion.
> > Therefore, while it can prevent recursive calls from
> > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > This can lead an unexpected recursion bug reported by Menglong.
> > 
> >  is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> >   -> kprobe_multi_link_exit_handler -> is_endbr.  
> 
> So basically its if the handler for the return part calls something that it
> is tracing, it can trigger the recursion?
> 
> > 
> > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > this section must prevent recursive call of fgraph inside user-defined
> > fgraph_ops::retfunc().
> > 
> > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > fprobe on function-graph tracer"), because before that fgraph was
> > only used from the function graph tracer. Fprobe allowed user to run
> > any callbacks from fgraph after that commit.
> 
> I would actually say it's because before this commit, the return handler
> callers never called anything that the entry handlers didn't already call.
> If there was recursion, the entry handler would catch it (and the entry
> tells fgraph if the exit handler should be called).
> 
> The difference here is with fprobes, you can have the exit handler calling
> functions that the entry handler does not, which exposes more cases where
> recursion could happen.
> 
> > 
> > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> >  kernel/trace/fgraph.c |   12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> > 
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 1e3b32b1e82c..08dde420635b 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> >  	unsigned long bitmap;
> >  	unsigned long ret;
> >  	int offset;
> > +	int bit;
> >  	int i;
> >  
> >  	ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> >  	if (fregs)
> >  		ftrace_regs_set_instruction_pointer(fregs, ret);
> >  
> > +	bit = ftrace_test_recursion_trylock(trace.func, ret);
> > +	/*
> > +	 * This must be succeeded because the entry handler returns before
> > +	 * modifying the return address if it is nested. Anyway, we need to
> > +	 * avoid calling user callbacks if it is nested.
> > +	 */
> > +	if (WARN_ON_ONCE(bit < 0))
> 
> I'm not so sure we need the warn on here. We should probably hook it to the
> recursion detection infrastructure that the function tracer has.

The reason I added WARN_ON here is this should not happen. The recursion
should be detected at the entry, not at exit. The __ftrace_return_to_handler()
is installed only if the entry does NOT detect the recursion. In that case
I think the exit should succeed recursion_trylock().

> 
> The reason I would say not to have the warn on, is because we don't have a
> warn on for recursion happening at the entry handler. Because this now is
> exposed by fprobe allowing different routines to be called at exit than
> what is used in entry, it can easily be triggered.

At the entry, if it detect recursion, it exits soon. But here we have to
process stack unwind to get the return address. This recursion_trylock()
is to mark this is in the critical section, not detect it.

Thank you,

> 
> -- Steve
> 
> 
> 
> > +		goto out;
> > +
> >  #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> >  	trace.retval = ftrace_regs_get_return_value(fregs);
> >  #endif
> > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> >  		}
> >  	}
> >  
> > +	ftrace_test_recursion_unlock(bit);
> > +out:
> >  	/*
> >  	 * The ftrace_graph_return() may still access the current
> >  	 * ret_stack structure, we need to make sure the update of
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop

Posted by Steven Rostedt 4 months, 2 weeks ago

On Sun, 21 Sep 2025 13:05:19 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

>  
> > The reason I would say not to have the warn on, is because we don't have a
> > warn on for recursion happening at the entry handler. Because this now is
> > exposed by fprobe allowing different routines to be called at exit than
> > what is used in entry, it can easily be triggered.  
> 
> At the entry, if it detect recursion, it exits soon. But here we have to
> process stack unwind to get the return address. This recursion_trylock()
> is to mark this is in the critical section, not detect it.

Ah, because the first instance of the exit callback sets the recursion
bit. This will cause recursed entry calls to detect the recursion bit
and return without setting the exit handler to be called.

That is, by setting the recursion bit in the exit handler, it will cause
a recursion in entry to fail before the exit is called again.

I'd like to update the comment:

+	bit = ftrace_test_recursion_trylock(trace.func, ret);
+	/*
+	 * This must be succeeded because the entry handler returns before
+	 * modifying the return address if it is nested. Anyway, we need to
+	 * avoid calling user callbacks if it is nested.
+	 */
+	if (WARN_ON_ONCE(bit < 0))
+		goto out;
+

to:

	/*
	 * Setting the recursion bit here will cause the graph entry to
	 * detect recursion before the exit handle will. If the ext
	 * handler detects recursion, something went wrong.
	 */
	if (WARN_ON_ONCE(bit < 0))

-- Steve