[PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12

Howard Chu posted 2 patches 1 month, 2 weeks ago
tools/perf/Makefile.config                    |   4 +-
.../bpf_skel/augmented_raw_syscalls.bpf.c     | 122 ++++++++++--------
2 files changed, 69 insertions(+), 57 deletions(-)
[PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Howard Chu 1 month, 2 weeks ago
Changes in v2:
- Resolved a clang-16 build error pointed out by Namhyung Kim
  <namhyung@kernel.org>

The new augmentation feature in perf trace, along with the protocol
change (from payload to payload->value), breaks the clang 12 build.

perf trace actually builds for any clang version newer than clang 16.
However, as pointed out by Namhyung Kim <namhyung@kernel.org> and Ian
Rogers <irogers@google.com>, clang 16, which was released in 2023, is
still too new for most users. Additionally, as James Clark
<james.clark@linaro.org> noted, some commonly used distributions do not
yet support clang 16. Therefore, breaking BPF features between clang 12
and clang 15 is not a good approach.

This patch series rewrites the BPF program in a way that allows it to
pass the BPF verifier, even when the BPF bytecode is generated by older
versions of clang.

However, I have only tested it till clang 14, as older versions are not
supported by my distribution.

Howard Chu (2):
  perf build: Change the clang check back to 12.0.1
  perf trace: Rewrite BPF code to pass the verifier

 tools/perf/Makefile.config                    |   4 +-
 .../bpf_skel/augmented_raw_syscalls.bpf.c     | 122 ++++++++++--------
 2 files changed, 69 insertions(+), 57 deletions(-)

-- 
2.43.0
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Namhyung Kim 1 month, 1 week ago
On Thu, Oct 10, 2024 at 07:14:00PM -0700, Howard Chu wrote:
> Changes in v2:
> - Resolved a clang-16 build error pointed out by Namhyung Kim
>   <namhyung@kernel.org>
> 
> The new augmentation feature in perf trace, along with the protocol
> change (from payload to payload->value), breaks the clang 12 build.
> 
> perf trace actually builds for any clang version newer than clang 16.
> However, as pointed out by Namhyung Kim <namhyung@kernel.org> and Ian
> Rogers <irogers@google.com>, clang 16, which was released in 2023, is
> still too new for most users. Additionally, as James Clark
> <james.clark@linaro.org> noted, some commonly used distributions do not
> yet support clang 16. Therefore, breaking BPF features between clang 12
> and clang 15 is not a good approach.
> 
> This patch series rewrites the BPF program in a way that allows it to
> pass the BPF verifier, even when the BPF bytecode is generated by older
> versions of clang.
> 
> However, I have only tested it till clang 14, as older versions are not
> supported by my distribution.
> 
> Howard Chu (2):
>   perf build: Change the clang check back to 12.0.1
>   perf trace: Rewrite BPF code to pass the verifier

Tested with clang 16.  And I think it's better to change the order of
the commits so it can fix the problem first and then check the version.

Thanks,
Namhyung

> 
>  tools/perf/Makefile.config                    |   4 +-
>  .../bpf_skel/augmented_raw_syscalls.bpf.c     | 122 ++++++++++--------
>  2 files changed, 69 insertions(+), 57 deletions(-)
> 
> -- 
> 2.43.0
>
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Arnaldo Carvalho de Melo 1 month, 1 week ago
On Tue, Oct 15, 2024 at 11:32:45AM -0700, Namhyung Kim wrote:
> On Thu, Oct 10, 2024 at 07:14:00PM -0700, Howard Chu wrote:
> > Changes in v2:
> > - Resolved a clang-16 build error pointed out by Namhyung Kim
> >   <namhyung@kernel.org>

> > The new augmentation feature in perf trace, along with the protocol
> > change (from payload to payload->value), breaks the clang 12 build.

> > perf trace actually builds for any clang version newer than clang 16.
> > However, as pointed out by Namhyung Kim <namhyung@kernel.org> and Ian
> > Rogers <irogers@google.com>, clang 16, which was released in 2023, is
> > still too new for most users. Additionally, as James Clark
> > <james.clark@linaro.org> noted, some commonly used distributions do not
> > yet support clang 16. Therefore, breaking BPF features between clang 12
> > and clang 15 is not a good approach.

> > This patch series rewrites the BPF program in a way that allows it to
> > pass the BPF verifier, even when the BPF bytecode is generated by older
> > versions of clang.

> > However, I have only tested it till clang 14, as older versions are not
> > supported by my distribution.
 
> > Howard Chu (2):
> >   perf build: Change the clang check back to 12.0.1
> >   perf trace: Rewrite BPF code to pass the verifier
 
> Tested with clang 16.  And I think it's better to change the order of
> the commits so it can fix the problem first and then check the version.

So, I tested it on a RHEL8 system and it gets built with clang 17 but
then fails to load, the verifier complains about lack of bounds checking
for the index of the syscall array, with or without this last patch from
Howard.

I also simplified it to a more minimal version withour renaming
variables, so that we see what exactly fixed the problem, its available
at the perf-tools/tmp.perf-tools branch, I've talked about it with
Howard over chat.

Song Liu reproduced the problem (unsure with what clang and kernel
versions) and couldn't find a way to fix it using the usual tricks to
coax clang to keep the bounds checking for the verifier to get
satisfied.

More generally I'll use virtme-ng[1] to test with a wider range of
kernels, not just clang versions.

- Arnaldo

[1] https://kernel-recipes.org/en/2024/virtme-ng/
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Arnaldo Carvalho de Melo 1 month, 1 week ago
On Tue, Oct 15, 2024 at 04:35:16PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Oct 15, 2024 at 11:32:45AM -0700, Namhyung Kim wrote:
> > On Thu, Oct 10, 2024 at 07:14:00PM -0700, Howard Chu wrote:
> > > Howard Chu (2):
> > >   perf build: Change the clang check back to 12.0.1
> > >   perf trace: Rewrite BPF code to pass the verifier
>  
> > Tested with clang 16.  And I think it's better to change the order of
> > the commits so it can fix the problem first and then check the version.
 
> So, I tested it on a RHEL8 system and it gets built with clang 17 but
> then fails to load, the verifier complains about lack of bounds checking
> for the index of the syscall array, with or without this last patch from
> Howard.
 
> I also simplified it to a more minimal version withour renaming
> variables, so that we see what exactly fixed the problem, its available
> at the perf-tools/tmp.perf-tools branch, I've talked about it with
> Howard over chat.

Just to give more details, without the following patch:

[acme@dell-per740-01 perf-tools]$ git diff
diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
index 3b30aa74a3ae..873fa738530b 100644
--- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
+++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
@@ -485,7 +485,8 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
                        if (!bpf_probe_read_user(((struct augmented_arg *)payload_offset)->value, size, arg))
                                augmented = true;
                } else if (size < 0 && size >= -6) { /* buffer */
-                       index = -(size + 1);
+                       //index = -(size + 1);
+                       index = 0;
                        aug_size = args->args[index];
 
                        if (aug_size > TRACE_AUG_MAX_BUF)
[acme@dell-per740-01 perf-tools]$

We end up with:

; } else if (size < 0 && size >= -6) { /* buffer */
111: (b7) r1 = -6
112: (2d) if r1 > r9 goto pc-22
 R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=inv-6 R2=invP40 R3=inv(id=0) R4=map_value(id=0,off=112,ks=4,vs=24688,imm=0) R6=map_value(id=0,off=20,ks=4,vs=24,imm=0) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9=inv(id=0,umin_value=18446744073709551610,var_off=(0xffffffff00000000; 0xffffffff)) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32=map_value fp-40=invP40 fp-48=ctx fp-56=map_value fp-64=inv1 fp-72=map_value fp-80=map_value
; index = -(size + 1);
113: (a7) r9 ^= -1
114: (67) r9 <<= 32
115: (77) r9 >>= 32
; aug_size = args->args[index];
116: (67) r9 <<= 3
117: (79) r1 = *(u64 *)(r10 -32)
118: (0f) r1 += r9
last_idx 118 first_idx 111
regs=200 stack=0 before 117: (79) r1 = *(u64 *)(r10 -32)
regs=200 stack=0 before 116: (67) r9 <<= 3
regs=200 stack=0 before 115: (77) r9 >>= 32
regs=200 stack=0 before 114: (67) r9 <<= 32
regs=200 stack=0 before 113: (a7) r9 ^= -1
regs=200 stack=0 before 112: (2d) if r1 > r9 goto pc-22
regs=202 stack=0 before 111: (b7) r1 = -6
 R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1=inv1 R2=invP40 R3=inv(id=0) R4=map_value(id=0,off=112,ks=4,vs=24688,imm=0) R6=map_value(id=0,off=20,ks=4,vs=24,imm=0) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9_r=invP(id=0,smin_value=-2147483648,smax_value=0) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32_r=map_value fp-40=invP40 fp-48=ctx fp-56=map_value fp-64=inv1 fp-72=map_value fp-80=map_value
parent didn't have regs=200 stack=0 marks
last_idx 85 first_idx 85
regs=200 stack=0 before 85: (6d) if r1 s> r9 goto pc+25
 R0_w=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_rw=invP1 R2_w=invP40 R3_rw=inv(id=0) R4_w=map_value(id=0,off=112,ks=4,vs=24688,imm=0) R6_rw=map_value(id=0,off=20,ks=4,vs=24,imm=0) R7_rw=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8_rw=invP6 R9_rw=invP(id=0,smin_value=-2147483648,smax_value=4096) R10=fp0 fp-8=mmmmmmmm fp-16_rw=map_value fp-24_rw=map_value fp-32_r=map_value fp-40_rw=invP40 fp-48_r=ctx fp-56_r=map_value fp-64_rw=invP1 fp-72_r=map_value fp-80_r=map_value
parent already had regs=202 stack=0 marks
119: (79) r9 = *(u64 *)(r1 +16)
 R0=map_value(id=0,off=0,ks=4,vs=24688,imm=0) R1_w=map_value(id=0,off=0,ks=4,vs=8272,umax_value=34359738360,var_off=(0x0; 0x7fffffff8),s32_max_value=2147483640,u32_max_value=-8) R2=invP40 R3=inv(id=0) R4=map_value(id=0,off=112,ks=4,vs=24688,imm=0) R6=map_value(id=0,off=20,ks=4,vs=24,imm=0) R7=map_value(id=0,off=56,ks=4,vs=8272,imm=0) R8=invP6 R9_w=invP(id=0,umax_value=34359738360,var_off=(0x0; 0x7fffffff8),s32_max_value=2147483640,u32_max_value=-8) R10=fp0 fp-8=mmmmmmmm fp-16=map_value fp-24=map_value fp-32=map_value fp-40=invP40 fp-48=ctx fp-56=map_value fp-64=inv1 fp-72=map_value fp-80=map_value
R1 unbounded memory access, make sure to bounds check any such access
processed 549 insns (limit 1000000) max_states_per_insn 3 total_states 27 peak_states 27 mark_read 16

With it (which is not a fix, I'm just using it to narrow down what
calculation the compiler is generating code the verifier in that kernel
dislikes):

[root@dell-per740-01 ~]# ~acme/bin/perf trace -e *sleep* sleep 1.234567890
     0.000 (1234.696 ms): sleep/3852601 nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 234567890 })                  = 0
[root@dell-per740-01 ~]# 

So I'm trying adding extra bounds checking, marking the index as
volatile, adding compiler barriers, etc, all the fun with the verifier,
but got distracted with other stuff, coming back to this now.

Ok, the following seems to do the trick:

[acme@dell-per740-01 perf-tools]$ git diff
diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
index 3b30aa74a3ae..ef87a04ff8d0 100644
--- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
+++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
@@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
                                augmented = true;
                } else if (size < 0 && size >= -6) { /* buffer */
                        index = -(size + 1);
+                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
                        aug_size = args->args[index];
 
                        if (aug_size > TRACE_AUG_MAX_BUF)

I'll now test it without Howard's patch to see if it fixes the RHEL8 +
clang 17 case.

- Arnaldo
 
> Song Liu reproduced the problem (unsure with what clang and kernel
> versions) and couldn't find a way to fix it using the usual tricks to
> coax clang to keep the bounds checking for the verifier to get
> satisfied.
> 
> More generally I'll use virtme-ng[1] to test with a wider range of
> kernels, not just clang versions.
> 
> - Arnaldo
> 
> [1] https://kernel-recipes.org/en/2024/virtme-ng/
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Arnaldo Carvalho de Melo 1 month, 1 week ago
On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> So I'm trying adding extra bounds checking, marking the index as
> volatile, adding compiler barriers, etc, all the fun with the verifier,
> but got distracted with other stuff, coming back to this now.
 
> Ok, the following seems to do the trick:
 
> [acme@dell-per740-01 perf-tools]$ git diff
> diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> index 3b30aa74a3ae..ef87a04ff8d0 100644
> --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
>                                 augmented = true;
>                 } else if (size < 0 && size >= -6) { /* buffer */
>                         index = -(size + 1);
> +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
>                         aug_size = args->args[index];
>  
>                         if (aug_size > TRACE_AUG_MAX_BUF)
> 
> I'll now test it without Howard's patch to see if it fixes the RHEL8 +
> clang 17 case.

It works with this one-liner + the simplified patch from Howard and also
on this other system (RHEL9), as well as with Fedora 40, it would be
nice if someone could test with clang 16 and report back the version of
the kernel tested as well as the distro name/release, that way I can try
to get my hands on such as system and test there as well.

Its all at:

https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools

This is the current set of patches that when further tested will go to
Linus for v6.12:

⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
45d1aadac64869a2 perf build: Change the clang check back to 12.0.1
4e21679eb81b5f0d perf trace: The return from 'write' isn't a pid
2d2314d4b09b5ed9 tools headers UAPI: Sync linux/const.h with the kernel headers
⬢[acme@toolbox perf-tools]$

[root@nine ~]# uname -a
Linux nine 5.14.0-427.31.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 9 14:06:03 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[root@nine ~]# perf trace -e *sleep* sleep 1.234567890
     0.000 (1234.742 ms): sleep/80014 clock_nanosleep(rqtp: 0x7ffc55b11240, rmtp: 0x7ffc55b11230)           = 0
[root@nine ~]# clang --version
clang version 17.0.6 (Red Hat, Inc. 17.0.6-5.el9)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
[root@nine ~]#

- Arnaldo
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Namhyung Kim 1 month, 1 week ago
Hi Arnaldo,

On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > So I'm trying adding extra bounds checking, marking the index as
> > volatile, adding compiler barriers, etc, all the fun with the verifier,
> > but got distracted with other stuff, coming back to this now.
>  
> > Ok, the following seems to do the trick:
>  
> > [acme@dell-per740-01 perf-tools]$ git diff
> > diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > index 3b30aa74a3ae..ef87a04ff8d0 100644
> > --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> >                                 augmented = true;
> >                 } else if (size < 0 && size >= -6) { /* buffer */
> >                         index = -(size + 1);
> > +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
> >                         aug_size = args->args[index];
> >  
> >                         if (aug_size > TRACE_AUG_MAX_BUF)
> > 
> > I'll now test it without Howard's patch to see if it fixes the RHEL8 +
> > clang 17 case.
> 
> It works with this one-liner + the simplified patch from Howard and also
> on this other system (RHEL9), as well as with Fedora 40, it would be
> nice if someone could test with clang 16 and report back the version of
> the kernel tested as well as the distro name/release, that way I can try
> to get my hands on such as system and test there as well.
> 
> Its all at:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools
> 
> This is the current set of patches that when further tested will go to
> Linus for v6.12:
> 
> ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1

Wouldn't it be better to have this change after fixing the verifier
issues in the later commits?

> 4e21679eb81b5f0d perf trace: The return from 'write' isn't a pid
> 2d2314d4b09b5ed9 tools headers UAPI: Sync linux/const.h with the kernel headers
> ⬢[acme@toolbox perf-tools]$

I guess you also need the syscalltbl fix from Jiri Slaby.

https://lore.kernel.org/linux-perf-users/3a592835-a14f-40be-8961-c0cee7720a94@kernel.org/

Thanks,
Namhyung

> 
> [root@nine ~]# uname -a
> Linux nine 5.14.0-427.31.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 9 14:06:03 EDT 2024 x86_64 x86_64 x86_64 GNU/Linux
> [root@nine ~]# perf trace -e *sleep* sleep 1.234567890
>      0.000 (1234.742 ms): sleep/80014 clock_nanosleep(rqtp: 0x7ffc55b11240, rmtp: 0x7ffc55b11230)           = 0
> [root@nine ~]# clang --version
> clang version 17.0.6 (Red Hat, Inc. 17.0.6-5.el9)
> Target: x86_64-redhat-linux-gnu
> Thread model: posix
> InstalledDir: /usr/bin
> [root@nine ~]#
> 
> - Arnaldo
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Arnaldo Carvalho de Melo 1 month, 1 week ago
On Tue, Oct 15, 2024 at 07:06:35PM -0700, Namhyung Kim wrote:
> Hi Arnaldo,
> 
> On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > > So I'm trying adding extra bounds checking, marking the index as
> > > volatile, adding compiler barriers, etc, all the fun with the verifier,
> > > but got distracted with other stuff, coming back to this now.
> >  
> > > Ok, the following seems to do the trick:
> >  
> > > [acme@dell-per740-01 perf-tools]$ git diff
> > > diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > index 3b30aa74a3ae..ef87a04ff8d0 100644
> > > --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> > >                                 augmented = true;
> > >                 } else if (size < 0 && size >= -6) { /* buffer */
> > >                         index = -(size + 1);
> > > +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
> > >                         aug_size = args->args[index];
> > >  
> > >                         if (aug_size > TRACE_AUG_MAX_BUF)
> > > 
> > > I'll now test it without Howard's patch to see if it fixes the RHEL8 +
> > > clang 17 case.
> > 
> > It works with this one-liner + the simplified patch from Howard and also
> > on this other system (RHEL9), as well as with Fedora 40, it would be
> > nice if someone could test with clang 16 and report back the version of
> > the kernel tested as well as the distro name/release, that way I can try
> > to get my hands on such as system and test there as well.
> > 
> > Its all at:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools
> > 
> > This is the current set of patches that when further tested will go to
> > Linus for v6.12:
> > 
> > ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1
> 
> Wouldn't it be better to have this change after fixing the verifier
> issues in the later commits?

I'm still testing it, this is a one-liner, so I think that the order in
which the patches are applied isn't important. Also Howard's patch (the
simplified one) doesn't clash with it.
 
> > 4e21679eb81b5f0d perf trace: The return from 'write' isn't a pid
> > 2d2314d4b09b5ed9 tools headers UAPI: Sync linux/const.h with the kernel headers
> > ⬢[acme@toolbox perf-tools]$
> 
> I guess you also need the syscalltbl fix from Jiri Slaby.
> 
> https://lore.kernel.org/linux-perf-users/3a592835-a14f-40be-8961-c0cee7720a94@kernel.org/

Right, he provided a report about the patch I sent solving the case, I
have to check if he replied to my question about perf trace actually
_working_ on a 32-bit arch system.

I also want to test it, I'm trying to get hold of such a system.

- Arnaldo
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Namhyung Kim 1 month ago
Hi Arnaldo,

On Wed, Oct 16, 2024 at 11:22:15AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Oct 15, 2024 at 07:06:35PM -0700, Namhyung Kim wrote:
> > Hi Arnaldo,
> > 
> > On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > So I'm trying adding extra bounds checking, marking the index as
> > > > volatile, adding compiler barriers, etc, all the fun with the verifier,
> > > > but got distracted with other stuff, coming back to this now.
> > >  
> > > > Ok, the following seems to do the trick:
> > >  
> > > > [acme@dell-per740-01 perf-tools]$ git diff
> > > > diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > index 3b30aa74a3ae..ef87a04ff8d0 100644
> > > > --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> > > >                                 augmented = true;
> > > >                 } else if (size < 0 && size >= -6) { /* buffer */
> > > >                         index = -(size + 1);
> > > > +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
> > > >                         aug_size = args->args[index];
> > > >  
> > > >                         if (aug_size > TRACE_AUG_MAX_BUF)
> > > > 
> > > > I'll now test it without Howard's patch to see if it fixes the RHEL8 +
> > > > clang 17 case.
> > > 
> > > It works with this one-liner + the simplified patch from Howard and also
> > > on this other system (RHEL9), as well as with Fedora 40, it would be
> > > nice if someone could test with clang 16 and report back the version of
> > > the kernel tested as well as the distro name/release, that way I can try
> > > to get my hands on such as system and test there as well.
> > > 
> > > Its all at:
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools
> > > 
> > > This is the current set of patches that when further tested will go to
> > > Linus for v6.12:
> > > 
> > > ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > > ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > > 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > > 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1
> > 
> > Wouldn't it be better to have this change after fixing the verifier
> > issues in the later commits?
> 
> I'm still testing it, this is a one-liner, so I think that the order in
> which the patches are applied isn't important. Also Howard's patch (the
> simplified one) doesn't clash with it.

I'm afraid if it'd break git bisect by allowing old clang versions
before the fix.

Thanks,
Namhyung

>  
> > > 4e21679eb81b5f0d perf trace: The return from 'write' isn't a pid
> > > 2d2314d4b09b5ed9 tools headers UAPI: Sync linux/const.h with the kernel headers
> > > ⬢[acme@toolbox perf-tools]$
> > 
> > I guess you also need the syscalltbl fix from Jiri Slaby.
> > 
> > https://lore.kernel.org/linux-perf-users/3a592835-a14f-40be-8961-c0cee7720a94@kernel.org/
> 
> Right, he provided a report about the patch I sent solving the case, I
> have to check if he replied to my question about perf trace actually
> _working_ on a 32-bit arch system.
> 
> I also want to test it, I'm trying to get hold of such a system.
> 
> - Arnaldo
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Arnaldo Carvalho de Melo 1 month ago
On Tue, Oct 22, 2024 at 10:04:52AM -0700, Namhyung Kim wrote:
> Hi Arnaldo,
> 
> On Wed, Oct 16, 2024 at 11:22:15AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Oct 15, 2024 at 07:06:35PM -0700, Namhyung Kim wrote:
> > > Hi Arnaldo,
> > > 
> > > On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > So I'm trying adding extra bounds checking, marking the index as
> > > > > volatile, adding compiler barriers, etc, all the fun with the verifier,
> > > > > but got distracted with other stuff, coming back to this now.
> > > >  
> > > > > Ok, the following seems to do the trick:
> > > >  
> > > > > [acme@dell-per740-01 perf-tools]$ git diff
> > > > > diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > > index 3b30aa74a3ae..ef87a04ff8d0 100644
> > > > > --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > > +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > > @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> > > > >                                 augmented = true;
> > > > >                 } else if (size < 0 && size >= -6) { /* buffer */
> > > > >                         index = -(size + 1);
> > > > > +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
> > > > >                         aug_size = args->args[index];
> > > > >  
> > > > >                         if (aug_size > TRACE_AUG_MAX_BUF)
> > > > > 
> > > > > I'll now test it without Howard's patch to see if it fixes the RHEL8 +
> > > > > clang 17 case.
> > > > 
> > > > It works with this one-liner + the simplified patch from Howard and also
> > > > on this other system (RHEL9), as well as with Fedora 40, it would be
> > > > nice if someone could test with clang 16 and report back the version of
> > > > the kernel tested as well as the distro name/release, that way I can try
> > > > to get my hands on such as system and test there as well.
> > > > 
> > > > Its all at:
> > > > 
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools
> > > > 
> > > > This is the current set of patches that when further tested will go to
> > > > Linus for v6.12:
> > > > 
> > > > ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > > > ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > > > 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > > > 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1
> > > 
> > > Wouldn't it be better to have this change after fixing the verifier
> > > issues in the later commits?
> > 
> > I'm still testing it, this is a one-liner, so I think that the order in
> > which the patches are applied isn't important. Also Howard's patch (the
> > simplified one) doesn't clash with it.
> 
> I'm afraid if it'd break git bisect by allowing old clang versions
> before the fix.

I can reorder the patches if you think it is interesting, but from the
extended set of tests I'm performing on different kernel and clang
version and in x86_64 and arm 64-bit, 32-bit and various distros, I'm
not sure bisection is an option for BPF programs at this stage 8-)

There, did it now it looks like this:

⬢ [acme@toolbox perf-tools]$ git log --oneline torvalds/master..
5d3a1b9ca3b1a059 (HEAD -> perf-tools) perf trace arm32: Fix iteration of syscall ids in syscalltbl->entries
34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
e5c1811c590c4312 perf build: Change the clang check back to 12.0.1
39c6a356201ebbd7 perf trace: The return from 'write' isn't a pid
ab8aaab874c4aa37 tools headers UAPI: Sync linux/const.h with the kernel headers
⬢ [acme@toolbox perf-tools]$

Is that what you meant?

Thanks,

- Arnaldo
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Namhyung Kim 1 month ago
On Tue, Oct 22, 2024 at 03:33:00PM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Oct 22, 2024 at 10:04:52AM -0700, Namhyung Kim wrote:
> > Hi Arnaldo,
> > 
> > On Wed, Oct 16, 2024 at 11:22:15AM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Tue, Oct 15, 2024 at 07:06:35PM -0700, Namhyung Kim wrote:
> > > > Hi Arnaldo,
> > > > 
> > > > On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > So I'm trying adding extra bounds checking, marking the index as
> > > > > > volatile, adding compiler barriers, etc, all the fun with the verifier,
> > > > > > but got distracted with other stuff, coming back to this now.
> > > > >  
> > > > > > Ok, the following seems to do the trick:
> > > > >  
> > > > > > [acme@dell-per740-01 perf-tools]$ git diff
> > > > > > diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > > > index 3b30aa74a3ae..ef87a04ff8d0 100644
> > > > > > --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > > > +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
> > > > > > @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
> > > > > >                                 augmented = true;
> > > > > >                 } else if (size < 0 && size >= -6) { /* buffer */
> > > > > >                         index = -(size + 1);
> > > > > > +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
> > > > > >                         aug_size = args->args[index];
> > > > > >  
> > > > > >                         if (aug_size > TRACE_AUG_MAX_BUF)
> > > > > > 
> > > > > > I'll now test it without Howard's patch to see if it fixes the RHEL8 +
> > > > > > clang 17 case.
> > > > > 
> > > > > It works with this one-liner + the simplified patch from Howard and also
> > > > > on this other system (RHEL9), as well as with Fedora 40, it would be
> > > > > nice if someone could test with clang 16 and report back the version of
> > > > > the kernel tested as well as the distro name/release, that way I can try
> > > > > to get my hands on such as system and test there as well.
> > > > > 
> > > > > Its all at:
> > > > > 
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools
> > > > > 
> > > > > This is the current set of patches that when further tested will go to
> > > > > Linus for v6.12:
> > > > > 
> > > > > ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > > > > ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > > > > 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > > > > 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1
> > > > 
> > > > Wouldn't it be better to have this change after fixing the verifier
> > > > issues in the later commits?
> > > 
> > > I'm still testing it, this is a one-liner, so I think that the order in
> > > which the patches are applied isn't important. Also Howard's patch (the
> > > simplified one) doesn't clash with it.
> > 
> > I'm afraid if it'd break git bisect by allowing old clang versions
> > before the fix.
> 
> I can reorder the patches if you think it is interesting, but from the
> extended set of tests I'm performing on different kernel and clang
> version and in x86_64 and arm 64-bit, 32-bit and various distros, I'm
> not sure bisection is an option for BPF programs at this stage 8-)

Maybe.  But at least we can try our best. :)

> 
> There, did it now it looks like this:
> 
> ⬢ [acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> 5d3a1b9ca3b1a059 (HEAD -> perf-tools) perf trace arm32: Fix iteration of syscall ids in syscalltbl->entries
> 34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> e5c1811c590c4312 perf build: Change the clang check back to 12.0.1
> 39c6a356201ebbd7 perf trace: The return from 'write' isn't a pid
> ab8aaab874c4aa37 tools headers UAPI: Sync linux/const.h with the kernel headers
> ⬢ [acme@toolbox perf-tools]$
> 
> Is that what you meant?

Nope, I'd like to have e5c1811c590c4312 after the verifier fixes:

 34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
 cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers

Thanks,
Namhyung

Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Arnaldo Carvalho de Melo 1 month ago
On Tue, Oct 22, 2024 at 04:52:16PM -0700, Namhyung Kim wrote:
> On Tue, Oct 22, 2024 at 03:33:00PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Tue, Oct 22, 2024 at 10:04:52AM -0700, Namhyung Kim wrote:
> > > On Wed, Oct 16, 2024 at 11:22:15AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Tue, Oct 15, 2024 at 07:06:35PM -0700, Namhyung Kim wrote:
> > > > > On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > Its all at:

> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools

> > > > > > This is the current set of patches that when further tested will go to
> > > > > > Linus for v6.12:

> > > > > > ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > > > > > ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > > > > > 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > > > > > 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1

> > > > > Wouldn't it be better to have this change after fixing the verifier
> > > > > issues in the later commits?

> > > > I'm still testing it, this is a one-liner, so I think that the order in
> > > > which the patches are applied isn't important. Also Howard's patch (the
> > > > simplified one) doesn't clash with it.

> > > I'm afraid if it'd break git bisect by allowing old clang versions
> > > before the fix.

> > I can reorder the patches if you think it is interesting, but from the
> > extended set of tests I'm performing on different kernel and clang
> > version and in x86_64 and arm 64-bit, 32-bit and various distros, I'm
> > not sure bisection is an option for BPF programs at this stage 8-)

> Maybe.  But at least we can try our best. :)

That rhel8 verifier case was with clang 17 even, but whatever, see below
:-)

> > There, did it now it looks like this:
> > 
> > ⬢ [acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > 5d3a1b9ca3b1a059 (HEAD -> perf-tools) perf trace arm32: Fix iteration of syscall ids in syscalltbl->entries
> > 34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > e5c1811c590c4312 perf build: Change the clang check back to 12.0.1
> > 39c6a356201ebbd7 perf trace: The return from 'write' isn't a pid
> > ab8aaab874c4aa37 tools headers UAPI: Sync linux/const.h with the kernel headers
> > ⬢ [acme@toolbox perf-tools]$

> > Is that what you meant?
 
> Nope, I'd like to have e5c1811c590c4312 after the verifier fixes:
 
>  34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
>  cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers

Done:

⬢ [acme@toolbox perf-tools]$ git log --oneline torvalds/master..
d822ca29a4fc5278 (HEAD -> perf-tools) tools headers UAPI: Sync kvm headers with the kernel sources
5d35634ecc2d2c39 perf trace: Fix non-listed archs in the syscalltbl routines
7fbff3c0e085745b perf build: Change the clang check back to 12.0.1
395d38419f1853de perf trace augmented_raw_syscalls: Add more checks to pass the verifier
ecabac70ff919580 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
39c6a356201ebbd7 perf trace: The return from 'write' isn't a pid
ab8aaab874c4aa37 tools headers UAPI: Sync linux/const.h with the kernel headers
⬢ [acme@toolbox perf-tools]$

- Arnaldo
Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Namhyung Kim 1 month ago
On Wed, Oct 23, 2024 at 11:39:24AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Oct 22, 2024 at 04:52:16PM -0700, Namhyung Kim wrote:
> > On Tue, Oct 22, 2024 at 03:33:00PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Tue, Oct 22, 2024 at 10:04:52AM -0700, Namhyung Kim wrote:
> > > > On Wed, Oct 16, 2024 at 11:22:15AM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > On Tue, Oct 15, 2024 at 07:06:35PM -0700, Namhyung Kim wrote:
> > > > > > On Tue, Oct 15, 2024 at 05:37:38PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > > > > Its all at:
> 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git tmp.perf-tools
> 
> > > > > > > This is the current set of patches that when further tested will go to
> > > > > > > Linus for v6.12:
> 
> > > > > > > ⬢[acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > > > > > > ff14baa7a290bf42 (HEAD -> perf-tools, x1/perf-tools, perf-tools/tmp.perf-tools) perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > > > > > > 46180bec048aad85 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > > > > > > 45d1aadac64869a2 perf build: Change the clang check back to 12.0.1
> 
> > > > > > Wouldn't it be better to have this change after fixing the verifier
> > > > > > issues in the later commits?
> 
> > > > > I'm still testing it, this is a one-liner, so I think that the order in
> > > > > which the patches are applied isn't important. Also Howard's patch (the
> > > > > simplified one) doesn't clash with it.
> 
> > > > I'm afraid if it'd break git bisect by allowing old clang versions
> > > > before the fix.
> 
> > > I can reorder the patches if you think it is interesting, but from the
> > > extended set of tests I'm performing on different kernel and clang
> > > version and in x86_64 and arm 64-bit, 32-bit and various distros, I'm
> > > not sure bisection is an option for BPF programs at this stage 8-)
> 
> > Maybe.  But at least we can try our best. :)
> 
> That rhel8 verifier case was with clang 17 even, but whatever, see below
> :-)
> 
> > > There, did it now it looks like this:
> > > 
> > > ⬢ [acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> > > 5d3a1b9ca3b1a059 (HEAD -> perf-tools) perf trace arm32: Fix iteration of syscall ids in syscalltbl->entries
> > > 34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> > > cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> > > e5c1811c590c4312 perf build: Change the clang check back to 12.0.1
> > > 39c6a356201ebbd7 perf trace: The return from 'write' isn't a pid
> > > ab8aaab874c4aa37 tools headers UAPI: Sync linux/const.h with the kernel headers
> > > ⬢ [acme@toolbox perf-tools]$
> 
> > > Is that what you meant?
>  
> > Nope, I'd like to have e5c1811c590c4312 after the verifier fixes:
>  
> >  34d2358a24fb5963 perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> >  cdb84c31bd2813de perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> 
> Done:
> 
> ⬢ [acme@toolbox perf-tools]$ git log --oneline torvalds/master..
> d822ca29a4fc5278 (HEAD -> perf-tools) tools headers UAPI: Sync kvm headers with the kernel sources
> 5d35634ecc2d2c39 perf trace: Fix non-listed archs in the syscalltbl routines
> 7fbff3c0e085745b perf build: Change the clang check back to 12.0.1
> 395d38419f1853de perf trace augmented_raw_syscalls: Add more checks to pass the verifier
> ecabac70ff919580 perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
> 39c6a356201ebbd7 perf trace: The return from 'write' isn't a pid
> ab8aaab874c4aa37 tools headers UAPI: Sync linux/const.h with the kernel headers
> ⬢ [acme@toolbox perf-tools]$

Thanks!
Namhyung

Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by Song Liu 1 month, 1 week ago

> On Oct 15, 2024, at 1:37 PM, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> 
> On Tue, Oct 15, 2024 at 04:58:56PM -0300, Arnaldo Carvalho de Melo wrote:
>> So I'm trying adding extra bounds checking, marking the index as
>> volatile, adding compiler barriers, etc, all the fun with the verifier,
>> but got distracted with other stuff, coming back to this now.
> 
>> Ok, the following seems to do the trick:
> 
>> [acme@dell-per740-01 perf-tools]$ git diff
>> diff --git a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
>> index 3b30aa74a3ae..ef87a04ff8d0 100644
>> --- a/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
>> +++ b/tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
>> @@ -486,6 +486,7 @@ static int augment_sys_enter(void *ctx, struct syscall_enter_args *args)
>>                                augmented = true;
>>                } else if (size < 0 && size >= -6) { /* buffer */
>>                        index = -(size + 1);
>> +                       index &= 7; // To satisfy the bounds checking with the verifier in some kernels
>>                        aug_size = args->args[index];
>> 
>>                        if (aug_size > TRACE_AUG_MAX_BUF)
>> 
>> I'll now test it without Howard's patch to see if it fixes the RHEL8 +
>> clang 17 case.
> 
> It works with this one-liner + the simplified patch from Howard and also
> on this other system (RHEL9), as well as with Fedora 40, it would be
> nice if someone could test with clang 16 and report back the version of
> the kernel tested as well as the distro name/release, that way I can try
> to get my hands on such as system and test there as well.

I am testing this one-liner with clang 18.1.8 and an older kernel (it is 
our version of 5.12 kernel, but the BPF verifier there should be similar
to recent kernels). verifier still fails the program. IIUC, the compiler 
is smart enough to just remove the one-liner completely.


Thanks,
Song

Re: [PATCH v2 0/2] perf trace: Fix support for the new BPF feature in clang 12
Posted by James Clark 1 month, 2 weeks ago

On 11/10/2024 3:14 am, Howard Chu wrote:
> Changes in v2:
> - Resolved a clang-16 build error pointed out by Namhyung Kim
>    <namhyung@kernel.org>
> 
> The new augmentation feature in perf trace, along with the protocol
> change (from payload to payload->value), breaks the clang 12 build.
> 
> perf trace actually builds for any clang version newer than clang 16.
> However, as pointed out by Namhyung Kim <namhyung@kernel.org> and Ian
> Rogers <irogers@google.com>, clang 16, which was released in 2023, is
> still too new for most users. Additionally, as James Clark
> <james.clark@linaro.org> noted, some commonly used distributions do not
> yet support clang 16. Therefore, breaking BPF features between clang 12
> and clang 15 is not a good approach.
> 
> This patch series rewrites the BPF program in a way that allows it to
> pass the BPF verifier, even when the BPF bytecode is generated by older
> versions of clang.
> 
> However, I have only tested it till clang 14, as older versions are not
> supported by my distribution.
> 
> Howard Chu (2):
>    perf build: Change the clang check back to 12.0.1
>    perf trace: Rewrite BPF code to pass the verifier
> 
>   tools/perf/Makefile.config                    |   4 +-
>   .../bpf_skel/augmented_raw_syscalls.bpf.c     | 122 ++++++++++--------
>   2 files changed, 69 insertions(+), 57 deletions(-)
> 


Tested with clang 15:

$ sudo perf trace -e write --max-events=100 -- echo hello

    0.000 ( 0.014 ms): echo/976249 write(fd: 1, buf: hello\10, count: 6)
                                             =

Tested-by: James Clark <james.clark@linaro.org>