[PATCH] perf trace: Fix unaligned access for augmented args

Namhyung Kim posted 1 patch 1 year, 1 month ago
tools/perf/builtin-trace.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
[PATCH] perf trace: Fix unaligned access for augmented args
Posted by Namhyung Kim 1 year, 1 month ago
Some version of compilers reported unaligned accesses in perf trace when
undefined-behavior sanitizer is on.  I found that it uses raw data in the
sample directly and assuming it's properly aligned.

Unlike other sample fields, the raw data is not 8-byte aligned because
there's a size field (u32) before the actual data.  So I added a static
buffer in syscall__augmented_args() and return it instead.  This is not
ideal but should work well as perf trace is single-threaded.

A better approach would be aligning the raw data by adding a 4-byte data
before the augmented args but I'm afraid it'd break the backward
compatibility.

Closes: https://lore.kernel.org/r/Z2STgyD1p456Qqhg@google.com
Cc: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-trace.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index e70e634fbfaf33f5..3f06411514c5b58a 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2582,7 +2582,6 @@ static int trace__fprintf_sample(struct trace *trace, struct evsel *evsel,
 
 static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sample, int *augmented_args_size, int raw_augmented_args_size)
 {
-	void *augmented_args = NULL;
 	/*
 	 * For now with BPF raw_augmented we hook into raw_syscalls:sys_enter
 	 * and there we get all 6 syscall args plus the tracepoint common fields
@@ -2600,10 +2599,24 @@ static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sam
 	int args_size = raw_augmented_args_size ?: sc->args_size;
 
 	*augmented_args_size = sample->raw_size - args_size;
-	if (*augmented_args_size > 0)
-		augmented_args = sample->raw_data + args_size;
+	if (*augmented_args_size > 0) {
+		static uintptr_t argbuf[1024]; /* assuming single-threaded */
+
+		if ((size_t)(*augmented_args_size) > sizeof(argbuf))
+			return NULL;
+
+		/*
+		 * The perf ring-buffer is 8-byte aligned but sample->raw_data
+		 * is not because it's preceded by u32 size.  Later, beautifier
+		 * will use the augmented args with stricter alignments like in
+		 * some struct.  To make sure it's aligned, let's copy the args
+		 * into a static buffer as it's single-threaded for now.
+		 */
+		memcpy(argbuf, sample->raw_data + args_size, *augmented_args_size);
 
-	return augmented_args;
+		return argbuf;
+	}
+	return NULL;
 }
 
 static void syscall__exit(struct syscall *sc)
-- 
2.47.1.613.gc27f4b7a9f-goog
Re: [PATCH] perf trace: Fix unaligned access for augmented args
Posted by Arnaldo Carvalho de Melo 1 year, 1 month ago
On Thu, Jan 02, 2025 at 12:12:47PM -0800, Namhyung Kim wrote:
> Some version of compilers reported unaligned accesses in perf trace when
> undefined-behavior sanitizer is on.  I found that it uses raw data in the
> sample directly and assuming it's properly aligned.
> 
> Unlike other sample fields, the raw data is not 8-byte aligned because
> there's a size field (u32) before the actual data.  So I added a static
> buffer in syscall__augmented_args() and return it instead.  This is not
> ideal but should work well as perf trace is single-threaded.
> 
> A better approach would be aligning the raw data by adding a 4-byte data
> before the augmented args but I'm afraid it'd break the backward
> compatibility.
 
You mean for 'perf trace record' files?

Older tools will not be able to process new files, while old files will
be remain processable by new tools if we insert a u32 with zeroes before
the size field, that way if the first u32 is not zero, we do as you do
below and incur the cost of copying to that intermediary buffer,
otherwise we read the real size in the next u32 and don't incur the cost
of copying.

Your fix below works as it incurs the cost all the time, which is ok for
now, but as a follow up patch we can see if the approach I described
above works.

Applying.

- Arnaldo

> Closes: https://lore.kernel.org/r/Z2STgyD1p456Qqhg@google.com
> Cc: Howard Chu <howardchu95@gmail.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/builtin-trace.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index e70e634fbfaf33f5..3f06411514c5b58a 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -2582,7 +2582,6 @@ static int trace__fprintf_sample(struct trace *trace, struct evsel *evsel,
>  
>  static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sample, int *augmented_args_size, int raw_augmented_args_size)
>  {
> -	void *augmented_args = NULL;
>  	/*
>  	 * For now with BPF raw_augmented we hook into raw_syscalls:sys_enter
>  	 * and there we get all 6 syscall args plus the tracepoint common fields
> @@ -2600,10 +2599,24 @@ static void *syscall__augmented_args(struct syscall *sc, struct perf_sample *sam
>  	int args_size = raw_augmented_args_size ?: sc->args_size;
>  
>  	*augmented_args_size = sample->raw_size - args_size;
> -	if (*augmented_args_size > 0)
> -		augmented_args = sample->raw_data + args_size;
> +	if (*augmented_args_size > 0) {
> +		static uintptr_t argbuf[1024]; /* assuming single-threaded */
> +
> +		if ((size_t)(*augmented_args_size) > sizeof(argbuf))
> +			return NULL;
> +
> +		/*
> +		 * The perf ring-buffer is 8-byte aligned but sample->raw_data
> +		 * is not because it's preceded by u32 size.  Later, beautifier
> +		 * will use the augmented args with stricter alignments like in
> +		 * some struct.  To make sure it's aligned, let's copy the args
> +		 * into a static buffer as it's single-threaded for now.
> +		 */
> +		memcpy(argbuf, sample->raw_data + args_size, *augmented_args_size);
>  
> -	return augmented_args;
> +		return argbuf;
> +	}
> +	return NULL;
>  }
>  
>  static void syscall__exit(struct syscall *sc)
> -- 
> 2.47.1.613.gc27f4b7a9f-goog
Re: [PATCH] perf trace: Fix unaligned access for augmented args
Posted by Arnaldo Carvalho de Melo 1 year, 1 month ago
On Thu, Jan 09, 2025 at 03:46:33PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jan 02, 2025 at 12:12:47PM -0800, Namhyung Kim wrote:
> > Some version of compilers reported unaligned accesses in perf trace when
> > undefined-behavior sanitizer is on.  I found that it uses raw data in the
> > sample directly and assuming it's properly aligned.

> > Unlike other sample fields, the raw data is not 8-byte aligned because
> > there's a size field (u32) before the actual data.  So I added a static
> > buffer in syscall__augmented_args() and return it instead.  This is not
> > ideal but should work well as perf trace is single-threaded.
 
> > A better approach would be aligning the raw data by adding a 4-byte data
> > before the augmented args but I'm afraid it'd break the backward
> > compatibility.
  
> You mean for 'perf trace record' files?
 
> Older tools will not be able to process new files, while old files will
> be remain processable by new tools if we insert a u32 with zeroes before
> the size field, that way if the first u32 is not zero, we do as you do
> below and incur the cost of copying to that intermediary buffer,
> otherwise we read the real size in the next u32 and don't incur the cost
> of copying.
 
> Your fix below works as it incurs the cost all the time, which is ok for
> now, but as a follow up patch we can see if the approach I described
> above works.
 
> Applying.

Applied, and forget about my suggestion above, as we discussed this is
not feasible, I added these notes to your patch:

   Committer testing:
    
    To build with the undefined behaviour sanitizer:
    
     $ make CC=clang EXTRA_CFLAGS=-fsanitize=undefined -C tools/perf
    
    Checking if the resulting binary is instrumented:
    
      root@number:~# nm ~/bin/perf | grep ubsan | wc -l
      113
      root@number:~# nm ~/bin/perf | grep ubsan | tail -5
      000000000043d5b0 t _ZN7__ubsanL19UBsanOnDeadlySignalEiPvS0_
      000000000043ce50 T _ZNK7__ubsan5Value12getSIntValueEv
      000000000043cf40 T _ZNK7__ubsan5Value12getUIntValueEv
      000000000043d140 T _ZNK7__ubsan5Value13getFloatValueEv
      000000000043cfd0 T _ZNK7__ubsan5Value19getPositiveIntValueEv
      root@number:~#
    
    Now running something that will access timespec, as reported in the
    Closes URL:
    
      root@number:~# perf trace --max-events=1 -e *nano* sleep 1.1
      trace/beauty/timespec.c:10:64: runtime error: member access within misaligned address 0x7fc583cfb2a4 for type 'struct augmented_arg', which requires 8 byte alignment
      0x7fc583cfb2a4: note: pointer points here
        99 99 11 00 10 00 00 00  00 00 00 00 01 00 00 00  00 00 00 00 01 e1 f5 05  00 00 00 00 00 00 00 00
                    ^
      SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior trace/beauty/timespec.c:10:64
      <SNIP>
    
    As Namhyung said we need to make the raw_data to be 64-bit aligned,
    probably we need to add a PERF_SAMPLE_ALIGNED_RAW with a 64-bit raw_size
    instead of the current u32 done at kernel/events/core.c,
    perf_output_sample(), that perf_output_put(handle, raw->size) where
    raw->size is an u32 and then the raw_data is always 64-bit unaligned...
    
    After the patch:
    
      root@number:~# perf trace -e *nano* sleep 1.1
           0.000 (1100.064 ms): sleep/1984224 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 100000001 }, rmtp: 0x7fff5b3fe970) = 0
      root@number:~#
Re: [PATCH] perf trace: Fix unaligned access for augmented args
Posted by Howard Chu 1 year, 1 month ago
Hello Namhyung,

Thanks for the fix, and sorry for the delay and for making you do
this. I should've done it myself earlier. This bug is present in the
commit without the whole BTF thing.

Here is the commit before '45a0c928e7aa perf trace: BTF-based enum
pretty printing for syscall args'

$ git log --oneline
bfa54a793ba7 (HEAD) driver core: bus: Fix double free in driver API
bus_register()

perf $ UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -- sleep 1
builtin-trace.c:1715:35: runtime error: index 6 out of bounds for type
'syscall_arg_fmt [6]'
    #0 0x5d0994789a74 in syscall__alloc_arg_fmts
/root/hw/linux-perf/tools/perf/builtin-trace.c:1715
    #1 0x5d099478b72c in trace__read_syscall_info
/root/hw/linux-perf/tools/perf/builtin-trace.c:1868
    #2 0x5d099478e571 in trace__syscall_info
/root/hw/linux-perf/tools/perf/builtin-trace.c:2179
    #3 0x5d099479ac81 in trace__init_syscall_bpf_progs
/root/hw/linux-perf/tools/perf/builtin-trace.c:3333
    #4 0x5d099479c28c in trace__init_syscalls_bpf_prog_array_maps
/root/hw/linux-perf/tools/perf/builtin-trace.c:3466
    #5 0x5d09947a0098 in trace__run
/root/hw/linux-perf/tools/perf/builtin-trace.c:3932
    #6 0x5d09947aa62d in cmd_trace
/root/hw/linux-perf/tools/perf/builtin-trace.c:5073
    #7 0x5d09947b6eed in run_builtin /root/hw/linux-perf/tools/perf/perf.c:350
    #8 0x5d09947b7518 in handle_internal_command
/root/hw/linux-perf/tools/perf/perf.c:403
    #9 0x5d09947b77ef in run_argv /root/hw/linux-perf/tools/perf/perf.c:447
    #10 0x5d09947b7d5e in main /root/hw/linux-perf/tools/perf/perf.c:561
    #11 0x7ec47642a1c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    #12 0x7ec47642a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #13 0x5d0994615d34 in _start
(/root/hw/linux-perf/tools/perf/perf+0x4bdd34) (BuildId:
791904aaae2afa7e7ad7e3aa80a32b71e824abcf)

         ? (         ): sleep/180215  ... [continued]: execve())
                                    = 0
     0.039 ( 0.002 ms): sleep/180215 brk()
                                    = 0x604c45ccb000
     0.075 ( 0.005 ms): sleep/180215 mmap(len: 8192, prot: READ|WRITE,
flags: PRIVATE|ANONYMOUS)           = 0x7ff6de94d000

builtin-trace.c:1531:55: runtime error: member access within
misaligned address 0x7ec47192343c for type 'struct augmented_arg',
which requires 8 byte alignment
0x7ec47192343c: note: pointer points here
  f6 7f 00 00 13 00 00 00  2f 65 74 63 2f 6c 64 2e  73 6f 2e 70 72 65
6c 6f  61 64 00 00 00 00 00 00
              ^
    #0 0x5d0994788527 in syscall_arg__scnprintf_augmented_string
/root/hw/linux-perf/tools/perf/builtin-trace.c:1531
    #1 0x5d09947887ca in syscall_arg__scnprintf_filename
/root/hw/linux-perf/tools/perf/builtin-trace.c:1545
    #2 0x5d099478d436 in syscall_arg_fmt__scnprintf_val
/root/hw/linux-perf/tools/perf/builtin-trace.c:2044
    #3 0x5d099478dd8b in syscall__scnprintf_args
/root/hw/linux-perf/tools/perf/builtin-trace.c:2106
    #4 0x5d0994790c44 in trace__sys_enter
/root/hw/linux-perf/tools/perf/builtin-trace.c:2387
    #5 0x5d0994799ba5 in trace__handle_event
/root/hw/linux-perf/tools/perf/builtin-trace.c:3198
    #6 0x5d099479d3eb in __trace__deliver_event
/root/hw/linux-perf/tools/perf/builtin-trace.c:3635
    #7 0x5d099479d6c9 in trace__deliver_event
/root/hw/linux-perf/tools/perf/builtin-trace.c:3662
    #8 0x5d09947a0cc4 in trace__run
/root/hw/linux-perf/tools/perf/builtin-trace.c:4010
    #9 0x5d09947aa62d in cmd_trace
/root/hw/linux-perf/tools/perf/builtin-trace.c:5073
    #10 0x5d09947b6eed in run_builtin /root/hw/linux-perf/tools/perf/perf.c:350
    #11 0x5d09947b7518 in handle_internal_command
/root/hw/linux-perf/tools/perf/perf.c:403
    #12 0x5d09947b77ef in run_argv /root/hw/linux-perf/tools/perf/perf.c:447
    #13 0x5d09947b7d5e in main /root/hw/linux-perf/tools/perf/perf.c:561
    #14 0x7ec47642a1c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    #15 0x7ec47642a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #16 0x5d0994615d34 in _start
(/root/hw/linux-perf/tools/perf/perf+0x4bdd34) (BuildId:
791904aaae2afa7e7ad7e3aa80a32b71e824abcf)

     <snip>

trace/beauty/timespec.c:12:9: runtime error: member access within
misaligned address 0x7ec4719264b4 for type 'struct timespec', which
requires 8 byte alignment
0x7ec4719264b4: note: pointer points here
  00 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00
00 00  00 00 00 00 00 00 00 00
              ^
    #0 0x5d09947b02e7 in syscall_arg__scnprintf_augmented_timespec
trace/beauty/timespec.c:12
    #1 0x5d09947b0417 in syscall_arg__scnprintf_timespec
trace/beauty/timespec.c:18
    #2 0x5d099478d436 in syscall_arg_fmt__scnprintf_val
/root/hw/linux-perf/tools/perf/builtin-trace.c:2044
    #3 0x5d099478dd8b in syscall__scnprintf_args
/root/hw/linux-perf/tools/perf/builtin-trace.c:2106
    #4 0x5d0994790c44 in trace__sys_enter
/root/hw/linux-perf/tools/perf/builtin-trace.c:2387
    #5 0x5d0994799ba5 in trace__handle_event
/root/hw/linux-perf/tools/perf/builtin-trace.c:3198
    #6 0x5d099479d3eb in __trace__deliver_event
/root/hw/linux-perf/tools/perf/builtin-trace.c:3635
    #7 0x5d099479d6c9 in trace__deliver_event
/root/hw/linux-perf/tools/perf/builtin-trace.c:3662
    #8 0x5d09947a0cc4 in trace__run
/root/hw/linux-perf/tools/perf/builtin-trace.c:4010
    #9 0x5d09947aa62d in cmd_trace
/root/hw/linux-perf/tools/perf/builtin-trace.c:5073
    #10 0x5d09947b6eed in run_builtin /root/hw/linux-perf/tools/perf/perf.c:350
    #11 0x5d09947b7518 in handle_internal_command
/root/hw/linux-perf/tools/perf/perf.c:403
    #12 0x5d09947b77ef in run_argv /root/hw/linux-perf/tools/perf/perf.c:447
    #13 0x5d09947b7d5e in main /root/hw/linux-perf/tools/perf/perf.c:561
    #14 0x7ec47642a1c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    #15 0x7ec47642a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #16 0x5d0994615d34 in _start
(/root/hw/linux-perf/tools/perf/perf+0x4bdd34) (BuildId:
791904aaae2afa7e7ad7e3aa80a32b71e824abcf)

As seen above, I encountered the same runtime error of misalignment as
you did in https://lore.kernel.org/all/Z2STgyD1p456Qqhg@google.com/,
not just in time_spec.c, but also in the access of augmented_arg in
builtin-trace.c.

On Thu, Jan 2, 2025 at 12:12 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Some version of compilers reported unaligned accesses in perf trace when
> undefined-behavior sanitizer is on.  I found that it uses raw data in the
> sample directly and assuming it's properly aligned.
>
> Unlike other sample fields, the raw data is not 8-byte aligned because
> there's a size field (u32) before the actual data.  So I added a static
> buffer in syscall__augmented_args() and return it instead.  This is not
> ideal but should work well as perf trace is single-threaded.
>
> A better approach would be aligning the raw data by adding a 4-byte data
> before the augmented args but I'm afraid it'd break the backward
> compatibility.

Can you explain backward compatibility? Do you mean the 'perf trace
record' and its perf data file?

With your patch attached:

perf $ UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -e
clock_nanosleep -- sleep 1
builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type
'syscall_arg_fmt [6]'
    #0 0x577f3fc3d18c in syscall__alloc_arg_fmts
/root/hw/linux-perf/tools/perf/builtin-trace.c:1966
    #1 0x577f3fc3f0c1 in trace__read_syscall_info
/root/hw/linux-perf/tools/perf/builtin-trace.c:2129
    #2 0x577f3fc422ff in trace__syscall_info
/root/hw/linux-perf/tools/perf/builtin-trace.c:2466
    #3 0x577f3fc51b30 in trace__init_syscalls_bpf_prog_array_maps
/root/hw/linux-perf/tools/perf/builtin-trace.c:3927
    #4 0x577f3fc5591c in trace__run
/root/hw/linux-perf/tools/perf/builtin-trace.c:4365
    #5 0x577f3fc5fd48 in cmd_trace
/root/hw/linux-perf/tools/perf/builtin-trace.c:5532
    #6 0x577f3fc6c697 in run_builtin /root/hw/linux-perf/tools/perf/perf.c:351
    #7 0x577f3fc6ccc2 in handle_internal_command
/root/hw/linux-perf/tools/perf/perf.c:404
    #8 0x577f3fc6cf99 in run_argv /root/hw/linux-perf/tools/perf/perf.c:448
    #9 0x577f3fc6d503 in main /root/hw/linux-perf/tools/perf/perf.c:560
    #10 0x72edf8a2a1c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
    #11 0x72edf8a2a28a in __libc_start_main_impl ../csu/libc-start.c:360
    #12 0x577f3fac12c4 in _start
(/root/hw/linux-perf/tools/perf/perf+0x4e82c4) (BuildId:
bca8e50b69a43c91b4d187140c12c6608770d99e)

     0.000 (1000.225 ms): sleep/330971 clock_nanosleep(rqtp: {
.tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7fff11ba9dc0) = 0

No more misalignment, and I'll fix the index-out-of-bound bug.

Reviewed-by: Howard Chu <howardchu95@gmail.com>

Thanks
Howard