[PATCH perf/core 11/22] selftests/bpf: Use 5-byte nop for x86 usdt probes

Jiri Olsa posted 22 patches 7 months, 4 weeks ago
There is a newer version of this series
Posted by Jiri Olsa 7 months, 4 weeks ago
Use a 5-byte nop for x86 usdt probes so that uprobes attached
to them can be switched to optimized uprobes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/testing/selftests/bpf/sdt.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/sdt.h b/tools/testing/selftests/bpf/sdt.h
index 1fcfa5160231..1d62c06f5ddc 100644
--- a/tools/testing/selftests/bpf/sdt.h
+++ b/tools/testing/selftests/bpf/sdt.h
@@ -236,6 +236,13 @@ __extension__ extern unsigned long long __sdt_unsp;
 #define _SDT_NOP	nop
 #endif
 
+/* Use 5 byte nop for x86_64 to allow optimizing uprobes. */
+#if defined(__x86_64__)
+# define _SDT_DEF_NOP _SDT_ASM_5(990:	.byte 0x0f, 0x1f, 0x44, 0x00, 0x00)
+#else
+# define _SDT_DEF_NOP _SDT_ASM_1(990:	_SDT_NOP)
+#endif
+
 #define _SDT_NOTE_NAME	"stapsdt"
 #define _SDT_NOTE_TYPE	3
 
@@ -288,7 +295,7 @@ __extension__ extern unsigned long long __sdt_unsp;
 
 #define _SDT_ASM_BODY(provider, name, pack_args, args, ...)		      \
   _SDT_DEF_MACROS							      \
-  _SDT_ASM_1(990:	_SDT_NOP)					      \
+  _SDT_DEF_NOP								      \
   _SDT_ASM_3(		.pushsection .note.stapsdt,_SDT_ASM_AUTOGROUP,"note") \
   _SDT_ASM_1(		.balign 4)					      \
   _SDT_ASM_3(		.4byte 992f-991f, 994f-993f, _SDT_NOTE_TYPE)	      \
-- 
2.49.0
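
For readers unfamiliar with the byte sequence in the patch: 0x0f 0x1f 0x44 0x00 0x00 is the standard x86 5-byte nop (nopl 0x0(%rax,%rax,1)), whereas the plain nop that sdt.h emits otherwise is the single byte 0x90. Presumably the extra room matters because five bytes are enough to hold a call or jmp with a 32-bit displacement, though the patch itself does not spell that out. Below is a minimal sketch of how a tool could check whether a probe site starts with that nop. The helper name starts_with_nop5 is hypothetical and is not part of the kernel or of sdt.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Encoding of the x86 5-byte nop: nopl 0x0(%rax,%rax,1). */
static const unsigned char nop5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper (not a kernel or sdt.h API): return true if the
 * buffer of 'len' bytes at 'insn' begins with the 5-byte nop.
 */
static bool starts_with_nop5(const unsigned char *insn, size_t len)
{
	assert(insn != NULL);

	/* Too short to contain the whole instruction. */
	if (len < sizeof(nop5))
		return false;

	return memcmp(insn, nop5, sizeof(nop5)) == 0;
}
```

A caller would pass the bytes read from the probe address recorded in the .note.stapsdt section; a site that starts with 0x90 would be reported as not matching.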
Re: [PATCH perf/core 11/22] selftests/bpf: Use 5-byte nop for x86 usdt probes
Posted by Andrii Nakryiko 7 months, 3 weeks ago
On Mon, Apr 21, 2025 at 2:46 PM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Using 5-byte nop for x86 usdt probes so we can switch
> to optimized uprobe them.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/testing/selftests/bpf/sdt.h | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>

So sdt.h is an exact copy/paste from systemtap-sdt sources. I'd prefer
to not modify it unnecessarily.

How about we copy/paste usdt.h ([0]) and use *that* for your
benchmarks? I've already anticipated the need to change the nop
instruction, so you won't even need to modify the usdt.h file itself,
just add

#define USDT_NOP .byte 0x0f, 0x1f, 0x44, 0x00, 0x00

before #include "usdt.h"


  [0] https://github.com/libbpf/usdt/blob/main/usdt.h

Re: [PATCH perf/core 11/22] selftests/bpf: Use 5-byte nop for x86 usdt probes
Posted by Jiri Olsa 7 months, 3 weeks ago
On Wed, Apr 23, 2025 at 10:33:18AM -0700, Andrii Nakryiko wrote:
> On Mon, Apr 21, 2025 at 2:46 PM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Using 5-byte nop for x86 usdt probes so we can switch
> > to optimized uprobe them.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/testing/selftests/bpf/sdt.h | 9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> 
> So sdt.h is an exact copy/paste from systemtap-sdt sources. I'd prefer
> to not modify it unnecessarily.
> 
> How about we copy/paste usdt.h ([0]) and use *that* for your
> benchmarks? I've already anticipated the need to change nop
> instruction, so you won't even need to modify the usdt.h file itself,
> just
> 
> #define USDT_NOP .byte 0x0f, 0x1f, 0x44, 0x00, 0x00
> 
> before #include "usdt.h"


sounds good, but it seems we need a few more changes for that,
so far I ended up with:

-       __usdt_asm1(990:        USDT_NOP)                                                       \
+       __usdt_asm5(990:        USDT_NOP)                                                       \

but it still won't compile, I will need to spend more time on that,
unless you have a better solution

thanks,
jirka

Re: [PATCH perf/core 11/22] selftests/bpf: Use 5-byte nop for x86 usdt probes
Posted by Andrii Nakryiko 7 months, 3 weeks ago
On Thu, Apr 24, 2025 at 5:49 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Wed, Apr 23, 2025 at 10:33:18AM -0700, Andrii Nakryiko wrote:
> > On Mon, Apr 21, 2025 at 2:46 PM Jiri Olsa <jolsa@kernel.org> wrote:
> > >
> > > Using 5-byte nop for x86 usdt probes so we can switch
> > > to optimized uprobe them.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  tools/testing/selftests/bpf/sdt.h | 9 ++++++++-
> > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > >
> >
> > So sdt.h is an exact copy/paste from systemtap-sdt sources. I'd prefer
> > to not modify it unnecessarily.
> >
> > How about we copy/paste usdt.h ([0]) and use *that* for your
> > benchmarks? I've already anticipated the need to change nop
> > instruction, so you won't even need to modify the usdt.h file itself,
> > just
> >
> > #define USDT_NOP .byte 0x0f, 0x1f, 0x44, 0x00, 0x00
> >
> > before #include "usdt.h"
>
>
> sounds good, but it seems we need bit more changes for that,
> so far I ended up with:
>
> -       __usdt_asm1(990:        USDT_NOP)                                                       \
> +       __usdt_asm5(990:        USDT_NOP)                                                       \
>
> but it still won't compile, will need to spend more time on that,
> unless you have better solution
>

Use

#define USDT_NOP .ascii "\x0F\x1F\x44\x00\x00"

for now. I'll need to improve the macro magic to handle instructions
with commas in them...
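
The comma problem mentioned above can be reproduced outside of usdt.h. The sketch below is not the usdt.h implementation. STR1, ASM1, STRV, ASMV, NOP_BYTES and NOP_ASCII are made-up names used only to illustrate, under that assumption, why a fixed-arity stringifying macro rejects the .byte form, why the .ascii form avoids the issue, and how a variadic macro would accept both.

```c
#include <assert.h>
#include <stddef.h>

/* Fixed-arity stringification, the way a one-argument asm helper works. */
#define STR1(x)		#x
#define ASM1(x)		STR1(x) "\n"

/* Variadic variant: commas in the expanded argument are preserved. */
#define STRV(...)	#__VA_ARGS__
#define ASMV(...)	STRV(__VA_ARGS__) "\n"

#define NOP_BYTES	.byte 0x0f, 0x1f, 0x44, 0x00, 0x00
#define NOP_ASCII	.ascii "\x0F\x1F\x44\x00\x00"

/*
 * ASM1(990: NOP_BYTES) does not compile: NOP_BYTES is expanded before
 * STR1 is invoked, so STR1 ends up with five arguments instead of one.
 * The .ascii form contains no top-level commas and passes through.
 */
static const char nop_ascii_line[] = ASM1(990: NOP_ASCII);

/* The variadic helper takes the comma-separated .byte form as is. */
static const char nop_bytes_line[] = ASMV(990: NOP_BYTES);

/* Count occurrences of 'c' in the NUL-terminated string 's'. */
static int count_char(const char *s, char c)
{
	int n = 0;

	assert(s != NULL);
	for (; *s; s++)
		if (*s == c)
			n++;
	return n;
}
```

Counting commas with count_char shows the difference: the .ascii line contains none, while the .byte line keeps all four separators between its five bytes.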

Re: [PATCH perf/core 11/22] selftests/bpf: Use 5-byte nop for x86 usdt probes
Posted by Andrii Nakryiko 7 months, 3 weeks ago
On Thu, Apr 24, 2025 at 9:29 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Apr 24, 2025 at 5:49 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Wed, Apr 23, 2025 at 10:33:18AM -0700, Andrii Nakryiko wrote:
> > > On Mon, Apr 21, 2025 at 2:46 PM Jiri Olsa <jolsa@kernel.org> wrote:
> > > >
> > > > Using 5-byte nop for x86 usdt probes so we can switch
> > > > to optimized uprobe them.
> > > >
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  tools/testing/selftests/bpf/sdt.h | 9 ++++++++-
> > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > >
> > >
> > > So sdt.h is an exact copy/paste from systemtap-sdt sources. I'd prefer
> > > to not modify it unnecessarily.
> > >
> > > How about we copy/paste usdt.h ([0]) and use *that* for your
> > > benchmarks? I've already anticipated the need to change nop
> > > instruction, so you won't even need to modify the usdt.h file itself,
> > > just
> > >
> > > #define USDT_NOP .byte 0x0f, 0x1f, 0x44, 0x00, 0x00
> > >
> > > before #include "usdt.h"
> >
> >
> > sounds good, but it seems we need bit more changes for that,
> > so far I ended up with:
> >
> > -       __usdt_asm1(990:        USDT_NOP)                                                       \
> > +       __usdt_asm5(990:        USDT_NOP)                                                       \
> >
> > but it still won't compile, will need to spend more time on that,
> > unless you have better solution
> >
>
> Use
>
> #define USDT_NOP .ascii "\x0F\x1F\x44\x00\x00"
>
> for now, I'll need to improve macro magic to handle instructions with
> commas in them...

Ok, fixed in [0]. If you get the latest version, the .byte approach
will work (I have tests in CI now to validate this).

  [0] https://github.com/libbpf/usdt/pull/12

Re: [PATCH perf/core 11/22] selftests/bpf: Use 5-byte nop for x86 usdt probes
Posted by Jiri Olsa 7 months, 3 weeks ago
On Thu, Apr 24, 2025 at 11:20:11AM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 24, 2025 at 9:29 AM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Apr 24, 2025 at 5:49 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Wed, Apr 23, 2025 at 10:33:18AM -0700, Andrii Nakryiko wrote:
> > > > On Mon, Apr 21, 2025 at 2:46 PM Jiri Olsa <jolsa@kernel.org> wrote:
> > > > >
> > > > > Using 5-byte nop for x86 usdt probes so we can switch
> > > > > to optimized uprobe them.
> > > > >
> > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > ---
> > > > >  tools/testing/selftests/bpf/sdt.h | 9 ++++++++-
> > > > >  1 file changed, 8 insertions(+), 1 deletion(-)
> > > > >
> > > >
> > > > So sdt.h is an exact copy/paste from systemtap-sdt sources. I'd prefer
> > > > to not modify it unnecessarily.
> > > >
> > > > How about we copy/paste usdt.h ([0]) and use *that* for your
> > > > benchmarks? I've already anticipated the need to change nop
> > > > instruction, so you won't even need to modify the usdt.h file itself,
> > > > just
> > > >
> > > > #define USDT_NOP .byte 0x0f, 0x1f, 0x44, 0x00, 0x00
> > > >
> > > > before #include "usdt.h"
> > >
> > >
> > > sounds good, but it seems we need bit more changes for that,
> > > so far I ended up with:
> > >
> > > -       __usdt_asm1(990:        USDT_NOP)                                                       \
> > > +       __usdt_asm5(990:        USDT_NOP)                                                       \
> > >
> > > but it still won't compile, will need to spend more time on that,
> > > unless you have better solution
> > >
> >
> > Use
> >
> > #define USDT_NOP .ascii "\x0F\x1F\x44\x00\x00"
> >
> > for now, I'll need to improve macro magic to handle instructions with
> > commas in them...
> 
> Ok, fixed in [0]. If you get the latest version, the .byte approach
> will work (I have tests in CI now to validate this).
> 
>   [0] https://github.com/libbpf/usdt/pull/12

yep, works nicely, thanks

jirka