The following test case fails on the linux-next repo:
❯ uname -a
Linux s83lp47.lnxne.boe 6.18.0-20251116.rc5.git0.0f2995693867.63.\
fc42.s390x+next #1 SMP Sun Nov 16 20:05:28 CET 2025 s390x GNU/Linux
# perf test -Fv 109
--- start ---
Checking if vmlinux BTF exists
Testing perf trace's string augmentation
Testing perf trace's buffer augmentation
Buffer augmentation test failed, output:
buffer content
echo/23281 write(1, buffer conten, 15, "") = 15
---- end ----
109: perf trace BTF general tests : FAILED!
#
The root cause is a changed output format on linux-next.
There is an additional "" string as the fourth parameter in the write()
line. Here is the detailed output on both the linux-next and linux repos.
Please note that this depends on the kernel and not on the perf tool.
Output on linux next kernel:
# uname -a
Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC ...
# perf config trace.show_arg_names=false trace.show_duration=false \
trace.show_timestamp=false trace.args_alignment=0
# ./perf trace --sort-events -e write --max-events=1 \
-- echo 'buffer content' 1>/dev/null
echo/7676 write(1, buffer content\10, 15, "") = 15
#
Output on linux kernel:
# uname -a
Linux b3560002.lnxne.boe 6.18.0-rc5m-perf #6 ....
# perf config trace.show_arg_names=false trace.show_duration=false \
trace.show_timestamp=false trace.args_alignment=0
# ./perf trace --sort-events -e write --max-events=1 \
-- echo 'buffer content' 1>/dev/null
echo/36932 write(1, buffer content\10, 15) = 15
#
Add the optional fourth parameter to the extended regular expression to
accept both output formats.
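
As a minimal sketch of the intent, assuming the pattern from the diff in this mail and the two write() lines shown above, the following checks that the regular expression accepts both formats. The helper name matches_write_line is hypothetical and not part of the test script.

```shell
#!/bin/sh
# Sketch only: matches_write_line is a hypothetical helper, not part of
# trace_btf_general.sh. The pattern is the one from the diff in this mail.
buffer="buffer content"

matches_write_line() {
	# Inside double quotes \\\\ reaches grep as \\, which matches one
	# literal backslash, so "\\\\10" matches the text \10.
	printf '%s\n' "$1" |
		grep -qE "^echo/[0-9]+ write\([0-9]+, ${buffer}\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
}

# Lines copied from the outputs quoted above.
old='echo/36932 write(1, buffer content\10, 15) = 15'
new='echo/7676 write(1, buffer content\10, 15, "") = 15'

matches_write_line "$old" && echo "three-argument format: match"
matches_write_line "$new" && echo "four-argument format: match"
```

The group `(, ..)?` is optional, so a line with only three arguments still matches.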
Output after:
# ./perf test -Fv 'perf trace BTF general tests'
--- start ---
Checking if vmlinux BTF exists
Testing perf trace's string augmentation
Testing perf trace's buffer augmentation
Testing perf trace's struct augmentation
---- end ----
115: perf trace BTF general tests : Ok
#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
---
tools/perf/tests/shell/trace_btf_general.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
index ef2da806be6b..9cd6180062d8 100755
--- a/tools/perf/tests/shell/trace_btf_general.sh
+++ b/tools/perf/tests/shell/trace_btf_general.sh
@@ -39,7 +39,7 @@ trace_test_buffer() {
echo "Testing perf trace's buffer augmentation"
# echo will insert a newline (\10) at the end of the buffer
output="$(perf trace --sort-events -e write --max-events=1 -- echo "${buffer}" 2>&1)"
- if ! echo "$output" | grep -q -E "^echo/[0-9]+ write\([0-9]+, ${buffer}.*, [0-9]+\) += +[0-9]+$"
+ if ! echo "$output" | grep -qE "^echo/[0-9]+ write\([0-9]+, ${buffer}\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
then
printf "Buffer augmentation test failed, output:\n$output\n"
err=1
--
2.51.1
On Mon, Nov 17, 2025 at 01:43:59PM +0100, Thomas Richter wrote:
> The following test case fails on linux-next repo:
>
---8<--- snip ---8<---
> #
>
> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
> ---
> tools/perf/tests/shell/trace_btf_general.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
> index ef2da806be6b..9cd6180062d8 100755
> --- a/tools/perf/tests/shell/trace_btf_general.sh
> +++ b/tools/perf/tests/shell/trace_btf_general.sh
> @@ -39,7 +39,7 @@ trace_test_buffer() {
> echo "Testing perf trace's buffer augmentation"
> # echo will insert a newline (\10) at the end of the buffer
> output="$(perf trace --sort-events -e write --max-events=1 -- echo "${buffer}" 2>&1)"
> - if ! echo "$output" | grep -q -E "^echo/[0-9]+ write\([0-9]+, ${buffer}.*, [0-9]+\) += +[0-9]+$"
> + if ! echo "$output" | grep -qE "^echo/[0-9]+ write\([0-9]+, ${buffer}\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
> then
> printf "Buffer augmentation test failed, output:\n$output\n"
> err=1
---------------8<---------------
Tested-by: Jan Polensky <japo@linux.ibm.com>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Thank you Thomas
Nit: maybe `(, \"\")?` instead of `(, ..)?` for clarity?
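
To illustrate the nit, here is a small sketch under the assumption that a bogus fourth argument such as `ab` could show up. The `bogus` line is invented for illustration and `accepts` is a hypothetical helper. Since `.` matches any character, `(, ..)?` accepts any two-character argument, while `(, \"\")?` accepts only the empty string literal.

```shell
#!/bin/sh
# Sketch only: "bogus" is an invented line and "accepts" is a hypothetical helper.
loose="^echo/[0-9]+ write\([0-9]+, buffer content\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
strict="^echo/[0-9]+ write\([0-9]+, buffer content\\\\10, [0-9]+(, \"\")?\) += +[0-9]+$"

# accepts PATTERN LINE: succeed if LINE matches the extended regex PATTERN
accepts() {
	printf '%s\n' "$2" | grep -qE "$1"
}

real='echo/7676 write(1, buffer content\10, 15, "") = 15'
bogus='echo/7676 write(1, buffer content\10, 15, ab) = 15'

accepts "$loose" "$bogus" && echo "loose pattern accepts the bogus line"
accepts "$strict" "$bogus" || echo "strict pattern rejects the bogus line"
accepts "$strict" "$real" && echo "strict pattern accepts the real line"
```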
On 11/18/25 19:30, Jan Polensky wrote:
> On Mon, Nov 17, 2025 at 01:43:59PM +0100, Thomas Richter wrote:
>> The following test case fails on linux-next repo:
>>
> ---8<--- snip ---8<---
>> #
>>
>> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
>> ---
>> tools/perf/tests/shell/trace_btf_general.sh | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
>> index ef2da806be6b..9cd6180062d8 100755
>> --- a/tools/perf/tests/shell/trace_btf_general.sh
>> +++ b/tools/perf/tests/shell/trace_btf_general.sh
>> @@ -39,7 +39,7 @@ trace_test_buffer() {
>> echo "Testing perf trace's buffer augmentation"
>> # echo will insert a newline (\10) at the end of the buffer
>> output="$(perf trace --sort-events -e write --max-events=1 -- echo "${buffer}" 2>&1)"
>> - if ! echo "$output" | grep -q -E "^echo/[0-9]+ write\([0-9]+, ${buffer}.*, [0-9]+\) += +[0-9]+$"
>> + if ! echo "$output" | grep -qE "^echo/[0-9]+ write\([0-9]+, ${buffer}\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
>> then
>> printf "Buffer augmentation test failed, output:\n$output\n"
>> err=1
> ---------------8<---------------
> Tested-by: Jan Polensky <japo@linux.ibm.com>
> Reviewed-by: Jan Polensky <japo@linux.ibm.com>
>
> Thank you Thomas
>
> Nit: maybe `(, \"\")?` instead of `(, ..)?` for clarity?
>
Thanks will do it.
--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
--
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt
Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294
Hello,
On Mon, Nov 17, 2025 at 01:43:59PM +0100, Thomas Richter wrote:
> The following test case fails on linux-next repo:
>
> ❯ uname -a
> Linux s83lp47.lnxne.boe 6.18.0-20251116.rc5.git0.0f2995693867.63.\
> fc42.s390x+next #1 SMP Sun Nov 16 20:05:28 CET 2025 s390x GNU/Linux
>
> # perf test -Fv 109
> --- start ---
> Checking if vmlinux BTF exists
> Testing perf trace's string augmentation
> Testing perf trace's buffer augmentation
> Buffer augmentation test failed, output:
> buffer content
> echo/23281 write(1, buffer conten, 15, "") = 15
> ---- end ----
> 109: perf trace BTF general tests : FAILED!
> #
>
> The root case is a changed output format on linux-next.
> There is an addional "" string as forth parameter in the write()
> line. Here is the detailed output on linux-repo.
> Please note that this depends on the kernel and not on the perf tool.
Thanks for the report. Do you know what the 4th arg is? It'd be nice
if you can dump the contents of the event format which is
/sys/kernel/tracing/events/syscalls/sys_enter_write/format.
Thanks,
Namhyung
>
> Output on linux next kernel:
> # uname -a
> Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC ...
> # perf config trace.show_arg_names=false trace.show_duration=false \
> trace.show_timestamp=false trace.args_alignment=0
> # ./perf trace --sort-events -e write --max-events=1 \
> -- echo 'buffer content' 1>/dev/null
> echo/7676 write(1, buffer content\10, 15, "") = 15
> #
>
> Output on linux kernel:
> # uname -a
> Linux b3560002.lnxne.boe 6.18.0-rc5m-perf #6 ....
> # perf config trace.show_arg_names=false trace.show_duration=false \
> trace.show_timestamp=false trace.args_alignment=0
> # ./perf trace --sort-events -e write --max-events=1 \
> -- echo 'buffer content' 1>/dev/null
> echo/36932 write(1, buffer content\10, 15) = 15
> #
>
> Add the optional forth parameter in the extented regular expression to
> accept both output formats.
>
> Output after:
> # ./perf test -Fv 'perf trace BTF general tests'
> --- start ---
> Checking if vmlinux BTF exists
> Testing perf trace's string augmentation
> Testing perf trace's buffer augmentation
> Testing perf trace's struct augmentation
> ---- end ----
> 115: perf trace BTF general tests : Ok
> #
>
> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
> ---
> tools/perf/tests/shell/trace_btf_general.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
> index ef2da806be6b..9cd6180062d8 100755
> --- a/tools/perf/tests/shell/trace_btf_general.sh
> +++ b/tools/perf/tests/shell/trace_btf_general.sh
> @@ -39,7 +39,7 @@ trace_test_buffer() {
> echo "Testing perf trace's buffer augmentation"
> # echo will insert a newline (\10) at the end of the buffer
> output="$(perf trace --sort-events -e write --max-events=1 -- echo "${buffer}" 2>&1)"
> - if ! echo "$output" | grep -q -E "^echo/[0-9]+ write\([0-9]+, ${buffer}.*, [0-9]+\) += +[0-9]+$"
> + if ! echo "$output" | grep -qE "^echo/[0-9]+ write\([0-9]+, ${buffer}\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
> then
> printf "Buffer augmentation test failed, output:\n$output\n"
> err=1
> --
> 2.51.1
>
On 11/18/25 02:58, Namhyung Kim wrote:
> Hello,
>
> On Mon, Nov 17, 2025 at 01:43:59PM +0100, Thomas Richter wrote:
>> The following test case fails on linux-next repo:
>>
>> ❯ uname -a
>> Linux s83lp47.lnxne.boe 6.18.0-20251116.rc5.git0.0f2995693867.63.\
>> fc42.s390x+next #1 SMP Sun Nov 16 20:05:28 CET 2025 s390x GNU/Linux
>>
>> # perf test -Fv 109
>> --- start ---
>> Checking if vmlinux BTF exists
>> Testing perf trace's string augmentation
>> Testing perf trace's buffer augmentation
>> Buffer augmentation test failed, output:
>> buffer content
>> echo/23281 write(1, buffer conten, 15, "") = 15
>> ---- end ----
>> 109: perf trace BTF general tests : FAILED!
>> #
>>
>> The root case is a changed output format on linux-next.
>> There is an addional "" string as forth parameter in the write()
>> line. Here is the detailed output on linux-repo.
>> Please note that this depends on the kernel and not on the perf tool.
>
> Thanks for the report. Do you know what the 4th arg is? It'd be nice
> if you can dump the contents of the event format which is
> /sys/kernel/tracing/events/syscalls/sys_enter_write/format.
>
> Thanks,
> Namhyung
>
Here is the output from my x86 virtual machine with linux-next 20251114 tag.
bash-5.3# uname -a
Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC Mon Nov 17 11:24:02 CET 2025 x86_64 GNU/Linux
bash-5.3# cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format
name: sys_enter_write
ID: 758
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int __syscall_nr; offset:8; size:4; signed:1;
field:unsigned int fd; offset:16; size:8; signed:0;
field:const char * buf; offset:24; size:8; signed:0;
field:size_t count; offset:32; size:8; signed:0;
field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;
print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1), ((unsigned long)(REC->count))
bash-5.3#
Hope this helps.
>>
>> Output on linux next kernel:
>> # uname -a
>> Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC ...
>> # perf config trace.show_arg_names=false trace.show_duration=false \
>> trace.show_timestamp=false trace.args_alignment=0
>> # ./perf trace --sort-events -e write --max-events=1 \
>> -- echo 'buffer content' 1>/dev/null
>> echo/7676 write(1, buffer content\10, 15, "") = 15
>> #
>>
>> Output on linux kernel:
>> # uname -a
>> Linux b3560002.lnxne.boe 6.18.0-rc5m-perf #6 ....
>> # perf config trace.show_arg_names=false trace.show_duration=false \
>> trace.show_timestamp=false trace.args_alignment=0
>> # ./perf trace --sort-events -e write --max-events=1 \
>> -- echo 'buffer content' 1>/dev/null
>> echo/36932 write(1, buffer content\10, 15) = 15
>> #
>>
>> Add the optional forth parameter in the extented regular expression to
>> accept both output formats.
>>
>> Output after:
>> # ./perf test -Fv 'perf trace BTF general tests'
>> --- start ---
>> Checking if vmlinux BTF exists
>> Testing perf trace's string augmentation
>> Testing perf trace's buffer augmentation
>> Testing perf trace's struct augmentation
>> ---- end ----
>> 115: perf trace BTF general tests : Ok
>> #
>>
>> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
>> ---
>> tools/perf/tests/shell/trace_btf_general.sh | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/tests/shell/trace_btf_general.sh b/tools/perf/tests/shell/trace_btf_general.sh
>> index ef2da806be6b..9cd6180062d8 100755
>> --- a/tools/perf/tests/shell/trace_btf_general.sh
>> +++ b/tools/perf/tests/shell/trace_btf_general.sh
>> @@ -39,7 +39,7 @@ trace_test_buffer() {
>> echo "Testing perf trace's buffer augmentation"
>> # echo will insert a newline (\10) at the end of the buffer
>> output="$(perf trace --sort-events -e write --max-events=1 -- echo "${buffer}" 2>&1)"
>> - if ! echo "$output" | grep -q -E "^echo/[0-9]+ write\([0-9]+, ${buffer}.*, [0-9]+\) += +[0-9]+$"
>> + if ! echo "$output" | grep -qE "^echo/[0-9]+ write\([0-9]+, ${buffer}\\\\10, [0-9]+(, ..)?\) += +[0-9]+$"
>> then
>> printf "Buffer augmentation test failed, output:\n$output\n"
>> err=1
>> --
>> 2.51.1
>>
--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
On Tue, Nov 18, 2025 at 07:15:45AM +0100, Thomas Richter wrote:
> On 11/18/25 02:58, Namhyung Kim wrote:
> > Hello,
> >
> > On Mon, Nov 17, 2025 at 01:43:59PM +0100, Thomas Richter wrote:
> >> The following test case fails on linux-next repo:
> >>
> >> ❯ uname -a
> >> Linux s83lp47.lnxne.boe 6.18.0-20251116.rc5.git0.0f2995693867.63.\
> >> fc42.s390x+next #1 SMP Sun Nov 16 20:05:28 CET 2025 s390x GNU/Linux
> >>
> >> # perf test -Fv 109
> >> --- start ---
> >> Checking if vmlinux BTF exists
> >> Testing perf trace's string augmentation
> >> Testing perf trace's buffer augmentation
> >> Buffer augmentation test failed, output:
> >> buffer content
> >> echo/23281 write(1, buffer conten, 15, "") = 15
> >> ---- end ----
> >> 109: perf trace BTF general tests : FAILED!
> >> #
> >>
> >> The root case is a changed output format on linux-next.
> >> There is an addional "" string as forth parameter in the write()
> >> line. Here is the detailed output on linux-repo.
> >> Please note that this depends on the kernel and not on the perf tool.
> >
> > Thanks for the report. Do you know what the 4th arg is? It'd be nice
> > if you can dump the contents of the event format which is
> > /sys/kernel/tracing/events/syscalls/sys_enter_write/format.
> >
> > Thanks,
> > Namhyung
> >
>
> Here is the output from my x86 virtual machine with linux-next 20251114 tag.

Thanks for sharing this!

>
> bash-5.3# uname -a
> Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC Mon Nov 17 11:24:02 CET 2025 x86_64 GNU/Linux
> bash-5.3# cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format
> name: sys_enter_write
> ID: 758
> format:
> field:unsigned short common_type; offset:0; size:2; signed:0;
> field:unsigned char common_flags; offset:2; size:1; signed:0;
> field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> field:int common_pid; offset:4; size:4; signed:1;
>
> field:int __syscall_nr; offset:8; size:4; signed:1;
> field:unsigned int fd; offset:16; size:8; signed:0;
> field:const char * buf; offset:24; size:8; signed:0;
> field:size_t count; offset:32; size:8; signed:0;
> field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;

Indeed, I see this new field __buf_val.

Steve, is this what you added recently for taking user contents?

Hmm.. this makes perf trace confused wrt the syscall parameters.
Is it always __buf_val or has any patterns?

>
> print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1), ((unsigned long)(REC->count))
> bash-5.3#
>
> Hope this helps.

Yes it did, thanks!

Namhyung
On Mon, 17 Nov 2025 22:43:21 -0800
Namhyung Kim <namhyung@kernel.org> wrote:

> > bash-5.3# uname -a
> > Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC Mon Nov 17 11:24:02 CET 2025 x86_64 GNU/Linux
> > bash-5.3# cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format
> > name: sys_enter_write
> > ID: 758
> > format:
> > field:unsigned short common_type; offset:0; size:2; signed:0;
> > field:unsigned char common_flags; offset:2; size:1; signed:0;
> > field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> > field:int common_pid; offset:4; size:4; signed:1;
> >
> > field:int __syscall_nr; offset:8; size:4; signed:1;
> > field:unsigned int fd; offset:16; size:8; signed:0;
> > field:const char * buf; offset:24; size:8; signed:0;
> > field:size_t count; offset:32; size:8; signed:0;
> > field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;
>
> Indeed, I see this new field __buf_val.
>
> Steve, is this what you added recently for taking user contents?

Yes.

> Hmm.. this makes perf trace confused wrt the syscall parameters.
> Is it always __buf_val or has any patterns?

Really? It still uses libtraceevent right? I made sure that this didn't
break trace-cmd and thought that perf would work too.

-- Steve
On Tue, Nov 18, 2025 at 01:24:51PM -0500, Steven Rostedt wrote:
> On Mon, 17 Nov 2025 22:43:21 -0800
> Namhyung Kim <namhyung@kernel.org> wrote:
>
> > > bash-5.3# uname -a
> > > Linux f43 6.18.0-rc5-next-20251114tmr-n #1 SMP PREEMPT_DYNAMIC Mon Nov 17 11:24:02 CET 2025 x86_64 GNU/Linux
> > > bash-5.3# cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format
> > > name: sys_enter_write
> > > ID: 758
> > > format:
> > > field:unsigned short common_type; offset:0; size:2; signed:0;
> > > field:unsigned char common_flags; offset:2; size:1; signed:0;
> > > field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> > > field:int common_pid; offset:4; size:4; signed:1;
> > >
> > > field:int __syscall_nr; offset:8; size:4; signed:1;
> > > field:unsigned int fd; offset:16; size:8; signed:0;
> > > field:const char * buf; offset:24; size:8; signed:0;
> > > field:size_t count; offset:32; size:8; signed:0;
> > > field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;
> >
> > Indeed, I see this new field __buf_val.
> >
> > Steve, is this what you added recently for taking user contents?
>
> Yes.
>
> > Hmm.. this makes perf trace confused wrt the syscall parameters.
> > Is it always __buf_val or has any patterns?
>
> Really? It still uses libtraceevent right? I made sure that this didn't
> break trace-cmd and thought that perf would work too.

It doesn't completely break perf trace but added new parameter for the
write syscall at the end. IIUC perf trace iterates the format fields
after __syscall_nr and take them all as syscall parameters.

Thanks,
Namhyung
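
The behaviour described above, where every format field after __syscall_nr is taken as a syscall parameter, can be illustrated with a small shell sketch. This is not how perf trace is implemented; list_args is a hypothetical helper, and the format text is copied from the dump earlier in this thread with the common_* fields trimmed.

```shell
#!/bin/sh
# Sketch only: mimics "treat every field after __syscall_nr as an argument".
# Format lines copied from the sys_enter_write dump earlier in the thread.
format='field:int __syscall_nr; offset:8; size:4; signed:1;
field:unsigned int fd; offset:16; size:8; signed:0;
field:const char * buf; offset:24; size:8; signed:0;
field:size_t count; offset:32; size:8; signed:0;
field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;'

# list_args (hypothetical): print the name of each field after __syscall_nr.
list_args() {
	awk -F';' '
		/^[ \t]*field:/ {
			# The field name is the last word of the declaration.
			n = split($1, words, " ")
			name = words[n]
			if (seen) print name
			if (name == "__syscall_nr") seen = 1
		}'
}

printf '%s\n' "$format" | list_args
```

With the new field present, the list has four entries instead of three, which is consistent with the extra "" argument in the write() line.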
On Tue, 18 Nov 2025 20:36:46 -0800
Namhyung Kim <namhyung@kernel.org> wrote:

> > Really? It still uses libtraceevent right? I made sure that this didn't
> > break trace-cmd and thought that perf would work too.
>
> It doesn't completely break perf trace but added new parameter for the
> write syscall at the end. IIUC perf trace iterates the format fields
> after __syscall_nr and take them all as syscall parameters.

Is this a regression? Or can perf be fixed?

I just ran it and I have this:

542.337 ( 0.131 ms): sshd-session/1189 write(fd: 7<socket:[9749]>, buf: , count: 268) = 268

I haven't tried it without the patches. Does it usually show what "buf" is?
Now with the reading of user space, it can show the content too!

-- Steve
On Wed, Nov 19, 2025 at 12:59:03PM -0500, Steven Rostedt wrote:
> On Tue, 18 Nov 2025 20:36:46 -0800
> Namhyung Kim <namhyung@kernel.org> wrote:
>
> > > Really? It still uses libtraceevent right? I made sure that this didn't
> > > break trace-cmd and thought that perf would work too.
> >
> > It doesn't completely break perf trace but added new parameter for the
> > write syscall at the end. IIUC perf trace iterates the format fields
> > after __syscall_nr and take them all as syscall parameters.
>
> Is this a regression? Or can perf be fixed?
>
> I just ran it and I have this:
>
> 542.337 ( 0.131 ms): sshd-session/1189 write(fd: 7<socket:[9749]>, buf: , count: 268) = 268
>
> I haven't tried it without the patches. Does it usually show what "buf" is?
> Now with the reading of user space, it can show the content too!
Yep, it reads the content using BPF. This is on my 6.16 kernel.
$ sudo perf trace -e write -- /bin/echo hello
hello
0.000 ( 0.014 ms): echo/61922 write(fd: 1, buf: hello\10, count: 6) = 6
Thanks,
Namhyung
On 11/20/25 01:52, Namhyung Kim wrote:
> On Wed, Nov 19, 2025 at 12:59:03PM -0500, Steven Rostedt wrote:
>> On Tue, 18 Nov 2025 20:36:46 -0800
>> Namhyung Kim <namhyung@kernel.org> wrote:
>>
>>>> Really? It still uses libtraceevent right? I made sure that this didn't
>>>> break trace-cmd and thought that perf would work too.
>>>
>>> It doesn't completely break perf trace but added new parameter for the
>>> write syscall at the end. IIUC perf trace iterates the format fields
>>> after __syscall_nr and take them all as syscall parameters.
>>
>> Is this a regression? Or can perf be fixed?
>>
>> I just ran it and I have this:
>>
>> 542.337 ( 0.131 ms): sshd-session/1189 write(fd: 7<socket:[9749]>, buf: , count: 268) = 268
>>
>> I haven't tried it without the patches. Does it usually show what "buf" is?
>> Now with the reading of user space, it can show the content too!
>
> Yep, it reads the content using BPF. This is on my 6.16 kernel.
>
> $ sudo perf trace -e write -- /bin/echo hello
> hello
> 0.000 ( 0.014 ms): echo/61922 write(fd: 1, buf: hello\10, count: 6) = 6
>
> Thanks,
> Namhyung
>
>

Hello Namhyung, Steven,

friendly ping... any progress here?

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
On Wed, 26 Nov 2025 08:13:00 +0100
Thomas Richter <tmricht@linux.ibm.com> wrote:

> >> I haven't tried it without the patches. Does it usually show what "buf" is?
> >> Now with the reading of user space, it can show the content too!
> >
> > Yep, it reads the content using BPF. This is on my 6.16 kernel.
> >
> > $ sudo perf trace -e write -- /bin/echo hello
> > hello
> > 0.000 ( 0.014 ms): echo/61922 write(fd: 1, buf: hello\10, count: 6) = 6
> >
> > Thanks,
> > Namhyung
> >
> >
>
> Hello Namhyung, Steven,
>
> friendly ping... any progress here?
>

I honestly have no clue how to fix this, as I don't even know where to
look. Is it BPF that is messing up? If so, where's the BPF program that is
doing this.

I thought BPF is supposed to handle updates and should never cause API
breakage?

I'll continue to look at the builtin-trace.c, but it seems that the BPF
program it's attached to is handing it garbage with:

perf trace -e syscalls:sys_enter_write

The new fields are at the end. The BPF program should simply ignore those
values. But again, I don't know where this BPF program lives.

-- Steve
On 11/26/25 16:24, Steven Rostedt wrote:
> On Wed, 26 Nov 2025 08:13:00 +0100
> Thomas Richter <tmricht@linux.ibm.com> wrote:
>
>>>> I haven't tried it without the patches. Does it usually show what "buf" is?
>>>> Now with the reading of user space, it can show the content too!
>>>
>>> Yep, it reads the content using BPF. This is on my 6.16 kernel.
>>>
>>> $ sudo perf trace -e write -- /bin/echo hello
>>> hello
>>> 0.000 ( 0.014 ms): echo/61922 write(fd: 1, buf: hello\10, count: 6) = 6
>>>
>>> Thanks,
>>> Namhyung
>>>
>>>
>>
>> Hello Namhyung, Steven,
>>
>> friendly ping... any progress here?
>>
>
> I honestly have no clue how to fix this, as I don't even know where to
> look. Is it BPF that is messing up? If so, where's the BPF program that is
> doing this.

Yeah, sounds very familiar... happens to me all the time :-)

>
> I thought BPF is supposed to handle updates and should never cause API
> breakage?
>
> I'll continue to look at the builtin-trace.c, but it seems that the BPF
> program it's attached to is handing it garbage with:
>
> perf trace -e syscalls:sys_enter_write
>
> The new fields are at the end. The BPF program should simply ignore those
> values. But again, I don't know where this BPF program lives.
>
> -- Steve

Ok, this sounds that the issue will linger around for some time.
Can we then adopt the test case check for perf test case
'perf trace BTF general tests' to accept that strange 4th parameter?

Right now this test fails on a daily basis in linux-next and will most
likely be in linux repo soon, where it will also fail.
This is sort of 'unfortunate' that our daily continuous integration
testing reports several errors per day...

If you guys object to that fix I sent out a few days ago (which is
absolutely ok), I will disable the test in our CI test suite.

Thanks a lot.

--
Thomas Richter, Dept 3303, IBM s390 Linux Development, Boeblingen, Germany
Arnaldo,

How can I make perf trace not confused by the extra fields in the system
call trace events?

Ftrace can now show the contents of the system call user space buffers, but
it appears that this breaks perf!!!

system: syscalls
name: sys_enter_write
ID: 791
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;

field:int __syscall_nr; offset:8; size:4; signed:1;
field:unsigned int fd; offset:16; size:8; signed:0;
field:const char * buf; offset:24; size:8; signed:0;
field:size_t count; offset:32; size:8; signed:0;
field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;

That new __buf_val appears to confuse perf, but I'm having a hell of a time
trying to figure out where it reads it!

-- Steve

On Wed, 26 Nov 2025 10:24:01 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 26 Nov 2025 08:13:00 +0100
> Thomas Richter <tmricht@linux.ibm.com> wrote:
>
> > >> I haven't tried it without the patches. Does it usually show what "buf" is?
> > >> Now with the reading of user space, it can show the content too!
> > >
> > > Yep, it reads the content using BPF. This is on my 6.16 kernel.
> > >
> > > $ sudo perf trace -e write -- /bin/echo hello
> > > hello
> > > 0.000 ( 0.014 ms): echo/61922 write(fd: 1, buf: hello\10, count: 6) = 6
> > >
> > > Thanks,
> > > Namhyung
> > >
> > >
> >
> > Hello Namhyung, Steven,
> >
> > friendly ping... any progress here?
> >
>
> I honestly have no clue how to fix this, as I don't even know where to
> look. Is it BPF that is messing up? If so, where's the BPF program that is
> doing this.
>
> I thought BPF is supposed to handle updates and should never cause API
> breakage?
>
> I'll continue to look at the builtin-trace.c, but it seems that the BPF
> program it's attached to is handing it garbage with:
>
> perf trace -e syscalls:sys_enter_write
>
> The new fields are at the end. The BPF program should simply ignore those
> values. But again, I don't know where this BPF program lives.
>
> -- Steve
On Wed, Nov 26, 2025 at 12:12:29PM -0500, Steven Rostedt wrote:
>
> Arnaldo,
>
> How can I make perf trace not confused by the extra fields in the system
> call trace events?
>
> Ftrace can now show the contents of the system call user space buffers, but
> it appears that this breaks perf!!!
>
> system: syscalls
> name: sys_enter_write
> ID: 791
> format:
> field:unsigned short common_type; offset:0; size:2; signed:0;
> field:unsigned char common_flags; offset:2; size:1; signed:0;
> field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> field:int common_pid; offset:4; size:4; signed:1;
>
> field:int __syscall_nr; offset:8; size:4; signed:1;
> field:unsigned int fd; offset:16; size:8; signed:0;
> field:const char * buf; offset:24; size:8; signed:0;
> field:size_t count; offset:32; size:8; signed:0;
> field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;
>
> That new __buf_val appears to confuse perf, but I'm having a hell of a time
> trying to figure out where it reads it!

I've discussed with Steven and concluded that we should change perf to
ignore fields with "__data_loc char[]" type in syscalls. Let me take a
look.

Thanks,
Namhyung
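
The proposed direction, ignoring fields of type "__data_loc char[]" when collecting syscall parameters, might look roughly like the following shell sketch. It only illustrates the filtering idea on the format text from this thread; it is not the perf change itself, and list_syscall_args is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: illustrates skipping "__data_loc char[]" fields. This is an
# assumption about the shape of the fix, not the actual perf implementation.
format='field:int __syscall_nr; offset:8; size:4; signed:1;
field:unsigned int fd; offset:16; size:8; signed:0;
field:const char * buf; offset:24; size:8; signed:0;
field:size_t count; offset:32; size:8; signed:0;
field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;'

# list_syscall_args (hypothetical): print field names after __syscall_nr,
# leaving out any field declared as "__data_loc char[]".
list_syscall_args() {
	awk -F';' '
		/^[ \t]*field:/ {
			decl = $1
			sub(/^[ \t]*field:/, "", decl)
			n = split(decl, words, " ")
			name = words[n]
			if (seen && decl !~ /^__data_loc char\[\] /) print name
			if (name == "__syscall_nr") seen = 1
		}'
}

printf '%s\n' "$format" | list_syscall_args
```

With the filter in place only fd, buf and count remain, matching the three-argument write() line the test originally expected.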
Hi guys,

On Wed, Nov 26, 2025 at 10:57 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> On Wed, Nov 26, 2025 at 12:12:29PM -0500, Steven Rostedt wrote:
> >
> > Arnaldo,
> >
> > How can I make perf trace not confused by the extra fields in the system
> > call trace events?
> >
> > Ftrace can now show the contents of the system call user space buffers, but
> > it appears that this breaks perf!!!
> >
> > system: syscalls
> > name: sys_enter_write
> > ID: 791
> > format:
> > field:unsigned short common_type; offset:0; size:2; signed:0;
> > field:unsigned char common_flags; offset:2; size:1; signed:0;
> > field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> > field:int common_pid; offset:4; size:4; signed:1;
> >
> > field:int __syscall_nr; offset:8; size:4; signed:1;
> > field:unsigned int fd; offset:16; size:8; signed:0;
> > field:const char * buf; offset:24; size:8; signed:0;
> > field:size_t count; offset:32; size:8; signed:0;
> > field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;
> >
> > That new __buf_val appears to confuse perf, but I'm having a hell of a time
> > trying to figure out where it reads it!
>
> I've discussed with Steven and concluded that we should change perf to
> ignore fields with "__data_loc char[]" type in syscalls. Let me take a
> look.

Thanks, I'll also give it a look.

Thanks,
Howard

>
> Thanks,
> Namhyung
>
>
On Wed, 26 Nov 2025 12:12:29 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:
> Arnaldo,
>
> How can I make perf trace not confused by the extra fields in the system
> call trace events?
>
> Ftrace can now show the contents of the system call user space buffers, but
> it appears that this breaks perf!!!
>
> system: syscalls
> name: sys_enter_write
> ID: 791
> format:
> field:unsigned short common_type; offset:0; size:2; signed:0;
> field:unsigned char common_flags; offset:2; size:1; signed:0;
> field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
> field:int common_pid; offset:4; size:4; signed:1;
>
> field:int __syscall_nr; offset:8; size:4; signed:1;
> field:unsigned int fd; offset:16; size:8; signed:0;
> field:const char * buf; offset:24; size:8; signed:0;
> field:size_t count; offset:32; size:8; signed:0;
> field:__data_loc char[] __buf_val; offset:40; size:4; signed:0;
>
> That new __buf_val appears to confuse perf, but I'm having a hell of a time
> trying to figure out where it reads it!
>
Hmm, it gets less confused (at least it doesn't crash), when I don't have
perf read the extra values.
Thomas, if you add the below patch, does it fix things for you?
-- Steve
diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
index e96d0063cbcf..add809d226dc 100644
--- a/kernel/trace/trace_syscalls.c
+++ b/kernel/trace/trace_syscalls.c
@@ -1403,7 +1403,6 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
struct hlist_head *head;
unsigned long args[6];
bool valid_prog_array;
- bool mayfault;
char *user_ptr;
int user_sizes[SYSCALL_FAULT_MAX_CNT] = {};
int buf_size = CONFIG_TRACE_SYSCALL_BUF_SIZE_DEFAULT;
@@ -1431,15 +1430,6 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
syscall_get_arguments(current, regs, args);
- /* Check if this syscall event faults in user space memory */
- mayfault = sys_data->user_mask != 0;
-
- if (mayfault) {
- if (syscall_get_data(sys_data, args, &user_ptr,
- &size, user_sizes, &uargs, buf_size) < 0)
- return;
- }
-
head = this_cpu_ptr(sys_data->enter_event->perf_events);
valid_prog_array = bpf_prog_array_valid(sys_data->enter_event);
if (!valid_prog_array && hlist_empty(head))
@@ -1457,9 +1447,6 @@ static void perf_syscall_enter(void *ignore, struct pt_regs *regs, long id)
rec->nr = syscall_nr;
memcpy(&rec->args, args, sizeof(unsigned long) * sys_data->nb_args);
- if (mayfault)
- syscall_put_data(sys_data, rec, user_ptr, size, user_sizes, uargs);
-
if ((valid_prog_array &&
!perf_call_bpf_enter(sys_data->enter_event, fake_regs, sys_data, rec)) ||
hlist_empty(head)) {