arch/m68k/Kconfig | 6 ++++ arch/m68k/kernel/Makefile | 1 + arch/m68k/kernel/irq.c | 2 ++ arch/m68k/kernel/stacktrace.c | 70 +++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 79 insertions(+)
In order to debug latency issues, I wanted to use ftrace on my M54418 coldfire. Sadly, it is not supported yet. Thanks to Steven [1] it does not sound too difficult. This small series adds basic functions to make it work, and for the few tests I could do, it seems to be working well. Here is a simple output I get: ``` bash-232 [000] d..3. 947.629000: thread_noise: bash:232 start 947.629000000 duration 0 ns timerlat/0-274 [000] ..... 947.629000: #51598 context thread timer_latency 409280 ns bash-232 [000] d.h.. 947.630000: #51599 context irq timer_latency 110720 ns bash-232 [000] dnh1. 947.630000: irq_noise: timer:206 start 947.629000000 duration 1000000 ns bash-232 [000] d..3. 947.630000: thread_noise: bash:232 start 947.630000000 duration 0 ns timerlat/0-274 [000] ..... 947.630000: #51599 context thread timer_latency 407168 ns bash-232 [000] d.h.. 947.631000: #51600 context irq timer_latency 108608 ns bash-232 [000] dnh1. 947.631000: irq_noise: timer:206 start 947.630000000 duration 1000000 ns bash-232 [000] d..3. 947.631000: thread_noise: bash:232 start 947.631000000 duration 0 ns timerlat/0-274 [000] ..... 947.631000: #51600 context thread timer_latency 401472 ns ``` I am very interested by any relevant test to do (switch events ? Anything else ?) to improve the series (and the platform :-)). I am quite sure I missed a *lot* of things, but it seems to do what I need :-). I post it as RFC for now, in particular because I added a new file, and I am not sure if it is the proper way. Thanks for your remarks and improvements ! [1]: https://lore.kernel.org/linux-m68k/20241018124511.70d29198@gandalf.local.home Signed-off-by: Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> --- Jean-Michel Hautbois (2): m68k: Add tracirqs arch: m68k: Add STACKTRACE support arch/m68k/Kconfig | 6 ++++ arch/m68k/kernel/Makefile | 1 + arch/m68k/kernel/irq.c | 2 ++ arch/m68k/kernel/stacktrace.c | 70 +++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 79 insertions(+) --- base-commit: 42f7652d3eb527d03665b09edac47f85fb600924 change-id: 20241021-add-m68k-tracing-support-36c18d2233d8 Best regards, -- Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org>
Hi all, On 21/10/2024 11:44, Jean-Michel Hautbois wrote: > In order to debug latency issues, I wanted to use ftrace on my M54418 > coldfire. Sadly, it is not supported yet. > Thanks to Steven [1] it does not sound too difficult. > > This small series adds basic functions to make it work, and for the few > tests I could do, it seems to be working well. I did a few tests (well, a *lot* to be honest :-)) after adding HR timers support on my m68k coldfire kernel. Now, I think ftrace shows a nice issue, but I think my stack_trace is not correct ? # wakeup_rt latency trace v1.1.5 on 6.12.0-rc4-00116-g2f00ff61fd82-dirty # -------------------------------------------------------------------- # latency: 451 us, #6/6, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0) # ----------------- # | task: irq/104-enet-fe-109 (uid:0 nice:0 policy:1 rt_prio:50) # ----------------- # # _------=> CPU# # / _-----=> irqs-off/BH-disabled # | / _----=> need-resched # || / _---=> hardirq/softirq # ||| / _--=> preempt-depth # |||| / _-=> migrate-disable # ||||| / delay # cmd pid |||||| time | caller # \ / |||||| \ | / telnetd-224 0dnh4. 14us+: 224:120:R + [000] 109: 49:R irq/104-enet-fe telnetd-224 0dnh4. 50us+: <stack trace> telnetd-224 0dnh4. 65us!: 0 telnetd-224 0d..3. 396us+: __traceiter_sched_switch telnetd-224 0d..3. 418us+: 224:120:R ==> [000] 109: 49:R irq/104-enet-fe telnetd-224 0d..3. 440us : <stack trace> Nevertheless it sounds like a really high latency for wake_up(). I have a custom driver which basically gets an IRQ, and calls wake_up on a read() call. This wake_up() on a high cpu usage can be more than 1ms ! Even with a fifo/99 priority for my kernel thread ! I don't know if it rings any bell ? I can obviously do more tests if it can help getting down to the issue :-). Thanks ! > Here is a simple output I get: > > ``` > > bash-232 [000] d..3. 947.629000: thread_noise: bash:232 start 947.629000000 duration 0 ns > timerlat/0-274 [000] ..... 947.629000: #51598 context thread timer_latency 409280 ns > bash-232 [000] d.h.. 947.630000: #51599 context irq timer_latency 110720 ns > bash-232 [000] dnh1. 947.630000: irq_noise: timer:206 start 947.629000000 duration 1000000 ns > bash-232 [000] d..3. 947.630000: thread_noise: bash:232 start 947.630000000 duration 0 ns > timerlat/0-274 [000] ..... 947.630000: #51599 context thread timer_latency 407168 ns > bash-232 [000] d.h.. 947.631000: #51600 context irq timer_latency 108608 ns > bash-232 [000] dnh1. 947.631000: irq_noise: timer:206 start 947.630000000 duration 1000000 ns > bash-232 [000] d..3. 947.631000: thread_noise: bash:232 start 947.631000000 duration 0 ns > timerlat/0-274 [000] ..... 947.631000: #51600 context thread timer_latency 401472 ns > ``` > > I am very interested by any relevant test to do (switch events ? > Anything else ?) to improve the series (and the platform :-)). > > I am quite sure I missed a *lot* of things, but it seems to do what I > need :-). I post it as RFC for now, in particular because I added a new > file, and I am not sure if it is the proper way. > > Thanks for your remarks and improvements ! > > [1]: https://lore.kernel.org/linux-m68k/20241018124511.70d29198@gandalf.local.home > > Signed-off-by: Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> > --- > Jean-Michel Hautbois (2): > m68k: Add tracirqs > arch: m68k: Add STACKTRACE support > > arch/m68k/Kconfig | 6 ++++ > arch/m68k/kernel/Makefile | 1 + > arch/m68k/kernel/irq.c | 2 ++ > arch/m68k/kernel/stacktrace.c | 70 +++++++++++++++++++++++++++++++++++++++++++ > 4 files changed, 79 insertions(+) > --- > base-commit: 42f7652d3eb527d03665b09edac47f85fb600924 > change-id: 20241021-add-m68k-tracing-support-36c18d2233d8 > > Best regards,
On Fri, 15 Nov 2024 09:26:07 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > Nevertheless it sounds like a really high latency for wake_up(). > > I have a custom driver which basically gets an IRQ, and calls wake_up on > a read() call. This wake_up() on a high cpu usage can be more than 1ms ! > Even with a fifo/99 priority for my kernel thread ! > > I don't know if it rings any bell ? > I can obviously do more tests if it can help getting down to the issue :-). Try running timerlat. -- Steve
Hi Steve, On 15/11/2024 16:25, Steven Rostedt wrote: > On Fri, 15 Nov 2024 09:26:07 +0100 > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > >> Nevertheless it sounds like a really high latency for wake_up(). >> >> I have a custom driver which basically gets an IRQ, and calls wake_up on >> a read() call. This wake_up() on a high cpu usage can be more than 1ms ! >> Even with a fifo/99 priority for my kernel thread ! >> >> I don't know if it rings any bell ? >> I can obviously do more tests if it can help getting down to the issue :-). > > Try running timerlat. Thanks ! Here is what I get: # echo timerlat > current_tracer # echo 1 > events/osnoise/enable # echo 25 > osnoise/stop_tracing_total_us # tail -10 trace bash-224 [000] d.h.. 153.268917: #77645 context irq timer_latency 45056 ns bash-224 [000] dnh.. 153.268987: irq_noise: timer:206 start 153.268879083 duration 93957 ns bash-224 [000] d.... 153.269056: thread_noise: bash:224 start 153.268905324 duration 71045 ns timerlat/0-271 [000] ..... 153.269103: #77645 context thread timer_latency 230656 ns bash-224 [000] d.h.. 153.269735: irq_noise: timer:206 start 153.269613847 duration 103558 ns bash-224 [000] d.h.. 153.269911: #77646 context irq timer_latency 40640 ns bash-224 [000] dnh.. 153.269982: irq_noise: timer:206 start 153.269875367 duration 93190 ns bash-224 [000] d.... 153.270053: thread_noise: bash:224 start 153.269900969 duration 72709 ns timerlat/0-271 [000] ..... 153.270100: #77646 context thread timer_latency 227008 ns timerlat/0-271 [000] ..... 153.270155: timerlat_main: stop tracing hit on cpu 0 It looks awful, right ? JM
On Fri, 15 Nov 2024 16:33:06 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > Hi Steve, > > On 15/11/2024 16:25, Steven Rostedt wrote: > > On Fri, 15 Nov 2024 09:26:07 +0100 > > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > > > >> Nevertheless it sounds like a really high latency for wake_up(). > >> > >> I have a custom driver which basically gets an IRQ, and calls wake_up on > >> a read() call. This wake_up() on a high cpu usage can be more than 1ms ! > >> Even with a fifo/99 priority for my kernel thread ! > >> > >> I don't know if it rings any bell ? > >> I can obviously do more tests if it can help getting down to the issue :-). > > > > Try running timerlat. > > Thanks ! > Here is what I get: > # echo timerlat > current_tracer > # echo 1 > events/osnoise/enable > # echo 25 > osnoise/stop_tracing_total_us > # tail -10 trace > bash-224 [000] d.h.. 153.268917: #77645 context irq timer_latency 45056 ns > bash-224 [000] dnh.. 153.268987: irq_noise: timer:206 start 153.268879083 duration 93957 ns > bash-224 [000] d.... 153.269056: thread_noise: bash:224 start 153.268905324 duration 71045 ns > timerlat/0-271 [000] ..... 153.269103: #77645 context thread timer_latency 230656 ns > bash-224 [000] d.h.. 153.269735: irq_noise: timer:206 start 153.269613847 duration 103558 ns > bash-224 [000] d.h.. 153.269911: #77646 context irq timer_latency 40640 ns > bash-224 [000] dnh.. 153.269982: irq_noise: timer:206 start 153.269875367 duration 93190 ns > bash-224 [000] d.... 153.270053: thread_noise: bash:224 start 153.269900969 duration 72709 ns > timerlat/0-271 [000] ..... 153.270100: #77646 context thread timer_latency 227008 ns > timerlat/0-271 [000] ..... 153.270155: timerlat_main: stop tracing hit on cpu 0 > > It looks awful, right ? awful is relative ;-) If that was on x86, I would say it was bad. Also check out rtla (in tools/trace/rtla). # rtla timerlat top Timer Latency 0 00:00:23 | IRQ Timer Latency (us) | Thread Timer Latency (us) | Ret user Timer Latency (us) CPU COUNT | cur min avg max | cur min avg max | cur min avg max 0 #21515 | 580 573 599 827 | 598 585 641 860 | 606 589 659 889 1 #21513 | 461 452 477 506 | 480 315 521 561 | 488 384 539 575 2 #10827 | 962 961 988 1016 | 983 805 1026 1075 | 989 940 1045 1081 3 #21512 | 68 61 86 313 | 87 73 128 190 | 95 79 146 338 4 #21510 | 254 246 271 464 | 273 70 315 473 | 281 183 333 477 5 #21509 | 397 388 414 441 | 416 215 457 649 | 424 338 475 655 6 #21508 | 496 496 522 566 | 509 322 563 605 | 515 424 579 611 7 #21507 | 658 648 675 702 | 676 471 717 757 | 684 627 735 763 ---------------|----------------------------------------|----------------------------------------|--------------------------------------- ALL #161401 e0 | 61 472 1016 | 70 514 1075 | 79 532 1081 Or on a virtual machine running on my x86 server: # rtla timerlat top Timer Latency 0 00:00:19 | IRQ Timer Latency (us) | Thread Timer Latency (us) | Ret user Timer Latency (us) CPU COUNT | cur min avg max | cur min avg max | cur min avg max 0 #4600 | 4 3 3104 3223 | 41 41 3131 3262 | 46 46 3143 3304 1 #4589 | 3678 3 3675 3742 | 3705 15 3703 3814 | 3718 22 3716 3828 2 #4571 | 3118 56 3135 3233 | 3145 79 3162 3260 | 3157 91 3175 3275 3 #4579 | 3509 3 3514 3625 | 3535 7 3540 3665 | 3547 9 3553 3702 4 #4676 | 2 1 2912 4016 | 10 5 2940 4044 | 15 7 2952 4060 5 #4572 | 3446 9 3458 3517 | 3475 32 3486 3562 | 3488 41 3499 3576 6 #4692 | 3948 0 3767 4002 | 3984 5 3794 4079 | 4000 7 3806 4108 7 #4580 | 3141 2 3144 3269 | 3168 20 3171 3296 | 3180 30 3184 3332 ---------------|----------------------------------------|----------------------------------------|--------------------------------------- ALL #36859 e0 | 0 3339 4016 | 5 3366 4079 | 7 3379 4108 That has some really poor numbers. -- Steve
Hi Steve, On 15/11/2024 20:55, Steven Rostedt wrote: > On Fri, 15 Nov 2024 16:33:06 +0100 > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > >> Hi Steve, >> >> On 15/11/2024 16:25, Steven Rostedt wrote: >>> On Fri, 15 Nov 2024 09:26:07 +0100 >>> Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: >>> >>>> Nevertheless it sounds like a really high latency for wake_up(). >>>> >>>> I have a custom driver which basically gets an IRQ, and calls wake_up on >>>> a read() call. This wake_up() on a high cpu usage can be more than 1ms ! >>>> Even with a fifo/99 priority for my kernel thread ! >>>> >>>> I don't know if it rings any bell ? >>>> I can obviously do more tests if it can help getting down to the issue :-). >>> >>> Try running timerlat. >> >> Thanks ! >> Here is what I get: >> # echo timerlat > current_tracer >> # echo 1 > events/osnoise/enable >> # echo 25 > osnoise/stop_tracing_total_us >> # tail -10 trace >> bash-224 [000] d.h.. 153.268917: #77645 context irq timer_latency 45056 ns >> bash-224 [000] dnh.. 153.268987: irq_noise: timer:206 start 153.268879083 duration 93957 ns >> bash-224 [000] d.... 153.269056: thread_noise: bash:224 start 153.268905324 duration 71045 ns >> timerlat/0-271 [000] ..... 153.269103: #77645 context thread timer_latency 230656 ns >> bash-224 [000] d.h.. 153.269735: irq_noise: timer:206 start 153.269613847 duration 103558 ns >> bash-224 [000] d.h.. 153.269911: #77646 context irq timer_latency 40640 ns >> bash-224 [000] dnh.. 153.269982: irq_noise: timer:206 start 153.269875367 duration 93190 ns >> bash-224 [000] d.... 153.270053: thread_noise: bash:224 start 153.269900969 duration 72709 ns >> timerlat/0-271 [000] ..... 153.270100: #77646 context thread timer_latency 227008 ns >> timerlat/0-271 [000] ..... 153.270155: timerlat_main: stop tracing hit on cpu 0 >> >> It looks awful, right ? > > awful is relative ;-) If that was on x86, I would say it was bad. > > Also check out rtla (in tools/trace/rtla). Thanks ! I knew it only by name, so I watched a presentation recorded during OSS summit given by Daniel Bristot de Oliveira who wrote it and it is really impressive ! I had to modify the source code a bit, as it does not compile with my uclibc toolchain: diff --git a/tools/tracing/rtla/Makefile.rtla b/tools/tracing/rtla/Makefile.rtla index cc1d6b615475..b22016a88d09 100644 --- a/tools/tracing/rtla/Makefile.rtla +++ b/tools/tracing/rtla/Makefile.rtla @@ -15,7 +15,7 @@ $(call allow-override,LD_SO_CONF_PATH,/etc/ld.so.conf.d/) $(call allow-override,LDCONFIG,ldconfig) export CC AR STRIP PKG_CONFIG LD_SO_CONF_PATH LDCONFIG -FOPTS := -flto=auto -ffat-lto-objects -fexceptions -fstack-protector-strong \ +FOPTS := -flto=auto -ffat-lto-objects -fexceptions \ -fasynchronous-unwind-tables -fstack-clash-protection WOPTS := -O -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 \ -Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized diff --git a/tools/tracing/rtla/src/timerlat_u.c b/tools/tracing/rtla/src/timerlat_u.c index 01dbf9a6b5a5..92ad2388b123 100644 --- a/tools/tracing/rtla/src/timerlat_u.c +++ b/tools/tracing/rtla/src/timerlat_u.c @@ -15,10 +15,16 @@ #include <pthread.h> #include <sys/wait.h> #include <sys/prctl.h> +#include <sys/syscall.h> #include "utils.h" #include "timerlat_u.h" +static inline pid_t gettid(void) +{ + return syscall(SYS_gettid); +} + /* * This is the user-space main for the tool timerlatu/ threads. * diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c index 9ac71a66840c..b754dc1016a4 100644 --- a/tools/tracing/rtla/src/utils.c +++ b/tools/tracing/rtla/src/utils.c @@ -229,6 +229,9 @@ long parse_ns_duration(char *val) #elif __s390x__ # define __NR_sched_setattr 345 # define __NR_sched_getattr 346 +#elif __m68k__ +# define __NR_sched_setattr 349 +# define __NR_sched_getattr 350 #endif #define SCHED_DEADLINE 6 But it is not enough, as executing rtla fails with a segfault. I can dump a core, but I could not manage to build gdb for my board so I can't debug it (I don't know how to debug a coredump without gdb !). JM
[ Added Tomas as he knows this code better than I do ] On Mon, 18 Nov 2024 11:11:48 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > Hi Steve, > > On 15/11/2024 20:55, Steven Rostedt wrote: > > On Fri, 15 Nov 2024 16:33:06 +0100 > > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > > > >> Hi Steve, > >> > >> On 15/11/2024 16:25, Steven Rostedt wrote: > >>> On Fri, 15 Nov 2024 09:26:07 +0100 > >>> Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > >>> > >>>> Nevertheless it sounds like a really high latency for wake_up(). > >>>> > >>>> I have a custom driver which basically gets an IRQ, and calls wake_up on > >>>> a read() call. This wake_up() on a high cpu usage can be more than 1ms ! > >>>> Even with a fifo/99 priority for my kernel thread ! > >>>> > >>>> I don't know if it rings any bell ? > >>>> I can obviously do more tests if it can help getting down to the issue :-). > >>> > >>> Try running timerlat. > >> > >> Thanks ! > >> Here is what I get: > >> # echo timerlat > current_tracer > >> # echo 1 > events/osnoise/enable > >> # echo 25 > osnoise/stop_tracing_total_us > >> # tail -10 trace > >> bash-224 [000] d.h.. 153.268917: #77645 context irq timer_latency 45056 ns > >> bash-224 [000] dnh.. 153.268987: irq_noise: timer:206 start 153.268879083 duration 93957 ns > >> bash-224 [000] d.... 153.269056: thread_noise: bash:224 start 153.268905324 duration 71045 ns > >> timerlat/0-271 [000] ..... 153.269103: #77645 context thread timer_latency 230656 ns > >> bash-224 [000] d.h.. 153.269735: irq_noise: timer:206 start 153.269613847 duration 103558 ns > >> bash-224 [000] d.h.. 153.269911: #77646 context irq timer_latency 40640 ns > >> bash-224 [000] dnh.. 153.269982: irq_noise: timer:206 start 153.269875367 duration 93190 ns > >> bash-224 [000] d.... 153.270053: thread_noise: bash:224 start 153.269900969 duration 72709 ns > >> timerlat/0-271 [000] ..... 153.270100: #77646 context thread timer_latency 227008 ns > >> timerlat/0-271 [000] ..... 153.270155: timerlat_main: stop tracing hit on cpu 0 > >> > >> It looks awful, right ? > > > > awful is relative ;-) If that was on x86, I would say it was bad. > > > > Also check out rtla (in tools/trace/rtla). > > Thanks ! I knew it only by name, so I watched a presentation recorded > during OSS summit given by Daniel Bristot de Oliveira who wrote it and > it is really impressive ! > > I had to modify the source code a bit, as it does not compile with my > uclibc toolchain: > diff --git a/tools/tracing/rtla/Makefile.rtla > b/tools/tracing/rtla/Makefile.rtla > index cc1d6b615475..b22016a88d09 100644 > --- a/tools/tracing/rtla/Makefile.rtla > +++ b/tools/tracing/rtla/Makefile.rtla > @@ -15,7 +15,7 @@ $(call allow-override,LD_SO_CONF_PATH,/etc/ld.so.conf.d/) > $(call allow-override,LDCONFIG,ldconfig) > export CC AR STRIP PKG_CONFIG LD_SO_CONF_PATH LDCONFIG > > -FOPTS := -flto=auto -ffat-lto-objects -fexceptions > -fstack-protector-strong \ > +FOPTS := -flto=auto -ffat-lto-objects -fexceptions \ > -fasynchronous-unwind-tables -fstack-clash-protection > WOPTS := -O -Wall -Werror=format-security > -Wp,-D_FORTIFY_SOURCE=2 \ > -Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized I'm not sure what the consequence of the above would be. Perhaps Daniel just copied this from other code? > diff --git a/tools/tracing/rtla/src/timerlat_u.c > b/tools/tracing/rtla/src/timerlat_u.c > index 01dbf9a6b5a5..92ad2388b123 100644 > --- a/tools/tracing/rtla/src/timerlat_u.c > +++ b/tools/tracing/rtla/src/timerlat_u.c > @@ -15,10 +15,16 @@ > #include <pthread.h> > #include <sys/wait.h> > #include <sys/prctl.h> > +#include <sys/syscall.h> > > #include "utils.h" > #include "timerlat_u.h" > > +static inline pid_t gettid(void) > +{ > + return syscall(SYS_gettid); > +} > + > /* > * This is the user-space main for the tool timerlatu/ threads. > * > diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c > index 9ac71a66840c..b754dc1016a4 100644 > --- a/tools/tracing/rtla/src/utils.c > +++ b/tools/tracing/rtla/src/utils.c > @@ -229,6 +229,9 @@ long parse_ns_duration(char *val) > #elif __s390x__ > # define __NR_sched_setattr 345 > # define __NR_sched_getattr 346 > +#elif __m68k__ > +# define __NR_sched_setattr 349 > +# define __NR_sched_getattr 350 > #endif > > #define SCHED_DEADLINE 6 > > But it is not enough, as executing rtla fails with a segfault. > I can dump a core, but I could not manage to build gdb for my board so I > can't debug it (I don't know how to debug a coredump without gdb !). printf()! That's how I debug things without gdb ;-) -- Steve
Hi Steve ! On 18/11/2024 21:20, Steven Rostedt wrote: > > [ Added Tomas as he knows this code better than I do ] Thanks ! > > On Mon, 18 Nov 2024 11:11:48 +0100 > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > >> Hi Steve, >> >> On 15/11/2024 20:55, Steven Rostedt wrote: >>> On Fri, 15 Nov 2024 16:33:06 +0100 >>> Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: >>> >>>> Hi Steve, >>>> >>>> On 15/11/2024 16:25, Steven Rostedt wrote: >>>>> On Fri, 15 Nov 2024 09:26:07 +0100 >>>>> Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: >>>>> >>>>>> Nevertheless it sounds like a really high latency for wake_up(). >>>>>> >>>>>> I have a custom driver which basically gets an IRQ, and calls wake_up on >>>>>> a read() call. This wake_up() on a high cpu usage can be more than 1ms ! >>>>>> Even with a fifo/99 priority for my kernel thread ! >>>>>> >>>>>> I don't know if it rings any bell ? >>>>>> I can obviously do more tests if it can help getting down to the issue :-). >>>>> >>>>> Try running timerlat. >>>> >>>> Thanks ! >>>> Here is what I get: >>>> # echo timerlat > current_tracer >>>> # echo 1 > events/osnoise/enable >>>> # echo 25 > osnoise/stop_tracing_total_us >>>> # tail -10 trace >>>> bash-224 [000] d.h.. 153.268917: #77645 context irq timer_latency 45056 ns >>>> bash-224 [000] dnh.. 153.268987: irq_noise: timer:206 start 153.268879083 duration 93957 ns >>>> bash-224 [000] d.... 153.269056: thread_noise: bash:224 start 153.268905324 duration 71045 ns >>>> timerlat/0-271 [000] ..... 153.269103: #77645 context thread timer_latency 230656 ns >>>> bash-224 [000] d.h.. 153.269735: irq_noise: timer:206 start 153.269613847 duration 103558 ns >>>> bash-224 [000] d.h.. 153.269911: #77646 context irq timer_latency 40640 ns >>>> bash-224 [000] dnh.. 153.269982: irq_noise: timer:206 start 153.269875367 duration 93190 ns >>>> bash-224 [000] d.... 153.270053: thread_noise: bash:224 start 153.269900969 duration 72709 ns >>>> timerlat/0-271 [000] ..... 153.270100: #77646 context thread timer_latency 227008 ns >>>> timerlat/0-271 [000] ..... 153.270155: timerlat_main: stop tracing hit on cpu 0 >>>> >>>> It looks awful, right ? >>> >>> awful is relative ;-) If that was on x86, I would say it was bad. >>> >>> Also check out rtla (in tools/trace/rtla). >> >> Thanks ! I knew it only by name, so I watched a presentation recorded >> during OSS summit given by Daniel Bristot de Oliveira who wrote it and >> it is really impressive ! >> >> I had to modify the source code a bit, as it does not compile with my >> uclibc toolchain: >> diff --git a/tools/tracing/rtla/Makefile.rtla >> b/tools/tracing/rtla/Makefile.rtla >> index cc1d6b615475..b22016a88d09 100644 >> --- a/tools/tracing/rtla/Makefile.rtla >> +++ b/tools/tracing/rtla/Makefile.rtla >> @@ -15,7 +15,7 @@ $(call allow-override,LD_SO_CONF_PATH,/etc/ld.so.conf.d/) >> $(call allow-override,LDCONFIG,ldconfig) >> export CC AR STRIP PKG_CONFIG LD_SO_CONF_PATH LDCONFIG >> >> -FOPTS := -flto=auto -ffat-lto-objects -fexceptions >> -fstack-protector-strong \ >> +FOPTS := -flto=auto -ffat-lto-objects -fexceptions \ >> -fasynchronous-unwind-tables -fstack-clash-protection >> WOPTS := -O -Wall -Werror=format-security >> -Wp,-D_FORTIFY_SOURCE=2 \ >> -Wp,-D_GLIBCXX_ASSERTIONS -Wno-maybe-uninitialized > > I'm not sure what the consequence of the above would be. Perhaps Daniel > just copied this from other code? > >> diff --git a/tools/tracing/rtla/src/timerlat_u.c >> b/tools/tracing/rtla/src/timerlat_u.c >> index 01dbf9a6b5a5..92ad2388b123 100644 >> --- a/tools/tracing/rtla/src/timerlat_u.c >> +++ b/tools/tracing/rtla/src/timerlat_u.c >> @@ -15,10 +15,16 @@ >> #include <pthread.h> >> #include <sys/wait.h> >> #include <sys/prctl.h> >> +#include <sys/syscall.h> >> >> #include "utils.h" >> #include "timerlat_u.h" >> >> +static inline pid_t gettid(void) >> +{ >> + return syscall(SYS_gettid); >> +} >> + >> /* >> * This is the user-space main for the tool timerlatu/ threads. >> * >> diff --git a/tools/tracing/rtla/src/utils.c b/tools/tracing/rtla/src/utils.c >> index 9ac71a66840c..b754dc1016a4 100644 >> --- a/tools/tracing/rtla/src/utils.c >> +++ b/tools/tracing/rtla/src/utils.c >> @@ -229,6 +229,9 @@ long parse_ns_duration(char *val) >> #elif __s390x__ >> # define __NR_sched_setattr 345 >> # define __NR_sched_getattr 346 >> +#elif __m68k__ >> +# define __NR_sched_setattr 349 >> +# define __NR_sched_getattr 350 >> #endif >> >> #define SCHED_DEADLINE 6 >> >> But it is not enough, as executing rtla fails with a segfault. >> I can dump a core, but I could not manage to build gdb for my board so I >> can't debug it (I don't know how to debug a coredump without gdb !). > > printf()! That's how I debug things without gdb ;-) Indeed printf gave me clues ! It appears to be a bug in libtracefs (v1.8.1). rtla segfaults when calling tracefs_local_events() in trace_instance_init(). Debugging libtracefs pointed me to the load_events() function, and the segfault happens after tep_parse_event() is called for "/sys/kernel/debug/tracing/events/vmscan/mm_vmscan_write_folio/format". Going through the calls I get to event_read_print_args(). I changed libtraceevent log level to get the warnings, and it says: libtraceevent: Resource temporarily unavailable unknown op '.' Segmentation fault JM
On Tue, 19 Nov 2024 15:24:00 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > Indeed printf gave me clues ! > It appears to be a bug in libtracefs (v1.8.1). rtla segfaults when > calling tracefs_local_events() in trace_instance_init(). > > Debugging libtracefs pointed me to the load_events() function, and the > segfault happens after tep_parse_event() is called for > "/sys/kernel/debug/tracing/events/vmscan/mm_vmscan_write_folio/format". > > Going through the calls I get to event_read_print_args(). > I changed libtraceevent log level to get the warnings, and it says: > libtraceevent: Resource temporarily unavailable > unknown op '.' > Segmentation fault Can you do me a favor and send me privately a tarball of: # cp -r /sys/kernel/tracing/events /tmp/events # cd /tmp # tar -cvjf events.tar.bz2 events You can't call tar on the /sys/kernel/tracing files as those are pseudo files with size of zero, and tar will just record empty files :-p -- Steve
On Tue, 19 Nov 2024 10:26:31 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > Can you do me a favor and send me privately a tarball of: > > # cp -r /sys/kernel/tracing/events /tmp/events > # cd /tmp > # tar -cvjf events.tar.bz2 events > > You can't call tar on the /sys/kernel/tracing files as those are pseudo > files with size of zero, and tar will just record empty files :-p It crashes on parsing this: name: mm_vmscan_write_folio ID: 198 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:unsigned long pfn; offset:8; size:4; signed:0; field:int reclaim_flags; offset:12; size:4; signed:1; print fmt: "page=%p pfn=0x%lx flags=%s", (mem_map + ((REC->pfn) - (m68k_memory[0].addr >> 13))), REC->pfn, (REC->reclaim_flags) ? __print_flags(REC->reclaim_flags, "|", {0x0001u, "RECLAIM_WB_ANON"}, {0x0002u, "RECLAIM_WB_FILE"}, {0x0010u, "RECLAIM_WB_MIXED"}, {0x0004u, "RECLAIM_WB_SYNC"}, {0x0008u, "RECLAIM_WB_ASYNC"} ) : "RECLAIM_WB_NONE" It shouldn't crash, but it also found a bug in your code ;-) You reference two variables that are not part of the event: "mem_map" and "m68k_memory[0].addr" Do these variables ever change? Because the TP_printk() part of the TRACE_EVENT() macro is called a long time after the event is recorded. It could be seconds, minutes, days or even months (and unlikely possibly years) later. The event takes place and runs the TP_fast_assign() to record the event in the ring buffer. Then some time later, when you read the "trace" file, the TP_printk() portion gets run. If you wait months before reading that, it is executed months later. Now you have "mem_map" and "m68k_memory[0].addr" in that output that gets run months after the fact. Are they constant throughout the boot? Now another issue is that user space has no idea what those values are. Now user space can not print the values. Currently the code crashes because you are the first one to reference a global value from a trace event print fmt. That should probably be fixed to simply fail to parse the event and ignore the print format logic (which defaults to just printing the raw fields). -- Steve
Hi Steve, On 19/11/2024 17:28, Steven Rostedt wrote: > On Tue, 19 Nov 2024 10:26:31 -0500 > Steven Rostedt <rostedt@goodmis.org> wrote: > >> Can you do me a favor and send me privately a tarball of: >> >> # cp -r /sys/kernel/tracing/events /tmp/events >> # cd /tmp >> # tar -cvjf events.tar.bz2 events >> >> You can't call tar on the /sys/kernel/tracing files as those are pseudo >> files with size of zero, and tar will just record empty files :-p > > It crashes on parsing this: > > name: mm_vmscan_write_folio > ID: 198 > format: > field:unsigned short common_type; offset:0; size:2; signed:0; > field:unsigned char common_flags; offset:2; size:1; signed:0; > field:unsigned char common_preempt_count; offset:3; size:1; signed:0; > field:int common_pid; offset:4; size:4; signed:1; > > field:unsigned long pfn; offset:8; size:4; signed:0; > field:int reclaim_flags; offset:12; size:4; signed:1; > > print fmt: "page=%p pfn=0x%lx flags=%s", (mem_map + ((REC->pfn) - (m68k_memory[0].addr >> 13))), REC->pfn, (REC->reclaim_flags) ? __print_flags(REC->reclaim_flags, "|", {0x0001u, "RECLAIM_WB_ANON"}, {0x0002u, "RECLAIM_WB_FILE"}, {0x0010u, "RECLAIM_WB_MIXED"}, {0x0004u, "RECLAIM_WB_SYNC"}, {0x0008u, "RECLAIM_WB_ASYNC"} ) : "RECLAIM_WB_NONE" > > > It shouldn't crash, but it also found a bug in your code ;-) In my code is a really big assumption :-). > You reference two variables that are not part of the event: > > "mem_map" and "m68k_memory[0].addr" > > Do these variables ever change? Because the TP_printk() part of the > TRACE_EVENT() macro is called a long time after the event is recorded. It > could be seconds, minutes, days or even months (and unlikely possibly > years) later. I am really not the best placed to answer. AFAIK, it sounds like those are never changing. > > The event takes place and runs the TP_fast_assign() to record the event in > the ring buffer. Then some time later, when you read the "trace" file, the > TP_printk() portion gets run. If you wait months before reading that, it is > executed months later. > > Now you have "mem_map" and "m68k_memory[0].addr" in that output that gets > run months after the fact. Are they constant throughout the boot? I don't know. > Now another issue is that user space has no idea what those values are. Now > user space can not print the values. Currently the code crashes because you > are the first one to reference a global value from a trace event print fmt. > That should probably be fixed to simply fail to parse the event and ignore > the print format logic (which defaults to just printing the raw fields). The patch you sent works... But, it fails a bit later: Dispatching timerlat u procs starting loop User-space timerlat pid 230 on cpu 0 Segmentation fault > > -- Steve >
Hi Steve, On 20/11/24 07:06, Jean-Michel Hautbois wrote: > >> You reference two variables that are not part of the event: >> >> "mem_map" and "m68k_memory[0].addr" >> >> Do these variables ever change? Because the TP_printk() part of the >> TRACE_EVENT() macro is called a long time after the event is >> recorded. It >> could be seconds, minutes, days or even months (and unlikely possibly >> years) later. > > I am really not the best placed to answer. > AFAIK, it sounds like those are never changing. m68k_memory[0].addr never changes (that segment is usually where the kernel is loaded to, and can't be hotplugged). mem_map is equal to NODE_DATA(0)->node_mem_map on m68k (mm/mm_init.c:__init alloc_node_mem_map()) and won't change either. Cheers, Michael > >> >> The event takes place and runs the TP_fast_assign() to record the >> event in >> the ring buffer. Then some time later, when you read the "trace" >> file, the >> TP_printk() portion gets run. If you wait months before reading that, >> it is >> executed months later. >> >> Now you have "mem_map" and "m68k_memory[0].addr" in that output that >> gets >> run months after the fact. Are they constant throughout the boot? > > I don't know. > >> Now another issue is that user space has no idea what those values >> are. Now >> user space can not print the values. Currently the code crashes >> because you >> are the first one to reference a global value from a trace event >> print fmt. >> That should probably be fixed to simply fail to parse the event and >> ignore >> the print format logic (which defaults to just printing the raw fields). > > The patch you sent works... > But, it fails a bit later: > Dispatching timerlat u procs > starting loop > User-space timerlat pid 230 on cpu 0 > Segmentation fault > > >> >> -- Steve >> > >
On Tue, 19 Nov 2024 19:06:45 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > > > > It shouldn't crash, but it also found a bug in your code ;-) > > In my code is a really big assumption :-). Well, not your personally, but I meant "your" as in m68k code. > > > You reference two variables that are not part of the event: > > > > "mem_map" and "m68k_memory[0].addr" > > > > Do these variables ever change? Because the TP_printk() part of the > > TRACE_EVENT() macro is called a long time after the event is recorded. It > > could be seconds, minutes, days or even months (and unlikely possibly > > years) later. > > I am really not the best placed to answer. > AFAIK, it sounds like those are never changing. That would mean they are OK and will not corrupt the trace, but it will be meaningless for tools like perf and trace-cmd. > > > > > The event takes place and runs the TP_fast_assign() to record the event in > > the ring buffer. Then some time later, when you read the "trace" file, the > > TP_printk() portion gets run. If you wait months before reading that, it is > > executed months later. > > > > Now you have "mem_map" and "m68k_memory[0].addr" in that output that gets > > run months after the fact. Are they constant throughout the boot? > > I don't know. > > > Now another issue is that user space has no idea what those values are. Now > > user space can not print the values. Currently the code crashes because you > > are the first one to reference a global value from a trace event print fmt. > > That should probably be fixed to simply fail to parse the event and ignore > > the print format logic (which defaults to just printing the raw fields). > > The patch you sent works... > But, it fails a bit later: > Dispatching timerlat u procs > starting loop > User-space timerlat pid 230 on cpu 0 > Segmentation fault > More printk? ;-) -- Steve
Hi Steve, On 19/11/2024 19:10, Steven Rostedt wrote: > On Tue, 19 Nov 2024 19:06:45 +0100 > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: >>> >>> It shouldn't crash, but it also found a bug in your code ;-) >> >> In my code is a really big assumption :-). > > Well, not your personally, but I meant "your" as in m68k code. > >> >>> You reference two variables that are not part of the event: >>> >>> "mem_map" and "m68k_memory[0].addr" >>> >>> Do these variables ever change? Because the TP_printk() part of the >>> TRACE_EVENT() macro is called a long time after the event is recorded. It >>> could be seconds, minutes, days or even months (and unlikely possibly >>> years) later. >> >> I am really not the best placed to answer. >> AFAIK, it sounds like those are never changing. > > That would mean they are OK and will not corrupt the trace, but it will be > meaningless for tools like perf and trace-cmd. > >> >>> >>> The event takes place and runs the TP_fast_assign() to record the event in >>> the ring buffer. Then some time later, when you read the "trace" file, the >>> TP_printk() portion gets run. If you wait months before reading that, it is >>> executed months later. >>> >>> Now you have "mem_map" and "m68k_memory[0].addr" in that output that gets >>> run months after the fact. Are they constant throughout the boot? >> >> I don't know. >> >>> Now another issue is that user space has no idea what those values are. Now >>> user space can not print the values. Currently the code crashes because you >>> are the first one to reference a global value from a trace event print fmt. >>> That should probably be fixed to simply fail to parse the event and ignore >>> the print format logic (which defaults to just printing the raw fields). >> >> The patch you sent works... >> But, it fails a bit later: >> Dispatching timerlat u procs >> starting loop >> User-space timerlat pid 230 on cpu 0 >> Segmentation fault >> > > More printk? ;-) Indeed, but the result is not straightforward this time :-(. Long story short: it fails at kbuffer_load_subbuffer() call in read_cpu_pages(). I added printf in the kbuffer helpers in libevent, and it finishes at: __read_long_4: call read_4 at 0x600230c2 __read_4_sw: ptr=0x8044e2ac static unsigned int __read_4_sw(void *ptr) { printf("%s: ptr=%p, value: %08x\n", __func__, ptr, *(unsigned int *)ptr); unsigned int data = *(unsigned int *)ptr; printf("%s: data=%08x\n", __func__, data); return swap_4(data); } As soon as ptr is dereferenced, the segfault appears. ptr should be ok though, as the address is valid afaik... I must say that now I am stuck :-(. Thanks, JM
On Wed, 20 Nov 2024 12:47:19 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > Long story short: it fails at kbuffer_load_subbuffer() call in > read_cpu_pages(). > > I added printf in the kbuffer helpers in libevent, and it finishes at: > __read_long_4: call read_4 at 0x600230c2 > __read_4_sw: ptr=0x8044e2ac > > static unsigned int __read_4_sw(void *ptr) > { > printf("%s: ptr=%p, value: %08x\n", __func__, ptr, *(unsigned int *)ptr); > unsigned int data = *(unsigned int *)ptr; > printf("%s: data=%08x\n", __func__, data); > > return swap_4(data); > } > > As soon as ptr is dereferenced, the segfault appears. > ptr should be ok though, as the address is valid afaik... But you don't know what ptr it failed on, right? If dereferencing a pointer will crash, the below line: printf("%s: ptr=%p, value: %08x\n", __func__, ptr, *(unsigned int *)ptr); Will crash before printing, because you are dereferencing ptr. Perhaps you should change this to: printf("%s: ptr=%p\n" value: %08x\n", __func__, ptr); printf(" value: %08x\n", *(unsigned int *)ptr); And that way you will see what 'ptr' is before the crash. Or did you do that already? -- Steve > > I must say that now I am stuck :-(.
Hi Steve, On 20/11/2024 16:31, Steven Rostedt wrote: > On Wed, 20 Nov 2024 12:47:19 +0100 > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > >> Long story short: it fails at kbuffer_load_subbuffer() call in >> read_cpu_pages(). >> >> I added printf in the kbuffer helpers in libevent, and it finishes at: >> __read_long_4: call read_4 at 0x600230c2 >> __read_4_sw: ptr=0x8044e2ac >> >> static unsigned int __read_4_sw(void *ptr) >> { >> printf("%s: ptr=%p, value: %08x\n", __func__, ptr, *(unsigned int *)ptr); >> unsigned int data = *(unsigned int *)ptr; >> printf("%s: data=%08x\n", __func__, data); >> >> return swap_4(data); >> } >> >> As soon as ptr is dereferenced, the segfault appears. >> ptr should be ok though, as the address is valid afaik... > > But you don't know what ptr it failed on, right? > > If dereferencing a pointer will crash, the below line: > > printf("%s: ptr=%p, value: %08x\n", __func__, ptr, *(unsigned int *)ptr); > > Will crash before printing, because you are dereferencing ptr. Perhaps you > should change this to: > > printf("%s: ptr=%p\n" value: %08x\n", __func__, ptr); > printf(" value: %08x\n", *(unsigned int *)ptr); > > And that way you will see what 'ptr' is before the crash. Or did you do > that already? Yes, I did, sorry I thought it was in the previous dump :-(. kbuffer_load_subbuffer: kbuf 0x8001d520, ptr 0x8025e2a0, call read_8 read_8 swap_8 kbuffer_load_subbuffer: kbuf 0x8001d520, ptr 0x8025e2a8, read_long read_long: call read_long at 0x60022ef4 with 0x8025e2a8 __read_long_4: call read_4 at 0x600230de with 0x8025e2a8 __read_4_sw with 0x8025e2a8 __read_4_sw: ptr=0x8025e2a8, value: 00001ff0 __read_4_sw: data=00001ff0 swap_4 for 00001ff0 __read_long_4: --> read_4 at 0x600230de: f01f0000 __read_4_sw with 0x8025e2a8 __read_4_sw: ptr=0x8025e2a8, value: 00001ff0 __read_4_sw: data=00001ff0 swap_4 for 00001ff0 read_long: --> read_long at 0x60022ef4: f01f0000 __read_long_4: call read_4 at 0x600230de with 0x8025e2a8 __read_4_sw with 0x8025e2a8 __read_4_sw: ptr=0x8025e2a8, value: 00001ff0 __read_4_sw: data=00001ff0 swap_4 for 00001ff0 __read_long_4: --> read_4 at 0x600230de: f01f0000 __read_4_sw with 0x8025e2a8 __read_4_sw: ptr=0x8025e2a8, value: 00001ff0 __read_4_sw: data=00001ff0 swap_4 for 00001ff0 kbuffer_load_subbuffer: --> read_long, flags=f01f0000 kbuffer_load_subbuffer: --> size=1f0000 kbuffer_load_subbuffer: kbuf->data=0x8025e2ac, kbuf->size=1f0000 kbuffer_load_subbuffer: kbuf 0x8001d520, ptr 0x8044e2ac, read_long lost_events read_long: call read_long at 0x60022ef4 with 0x8044e2ac __read_long_4: call read_4 at 0x600230de with 0x8044e2ac __read_4_sw with 0x8044e2ac Segmentation fault JM
On Wed, 20 Nov 2024 16:59:55 +0100 Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > > And that way you will see what 'ptr' is before the crash. Or did you do > > that already? > > Yes, I did, sorry I thought it was in the previous dump :-(. Can you see if this makes a difference? Patch libtracefs: diff --git a/src/tracefs-events.c b/src/tracefs-events.c index 77d1ba89b038..19ea3b3f8d36 100644 --- a/src/tracefs-events.c +++ b/src/tracefs-events.c @@ -274,7 +274,7 @@ static int open_cpu_files(struct tracefs_instance *instance, cpu_set_t *cpus, if (snapshot) tcpu = tracefs_cpu_snapshot_open(instance, cpu, true); else - tcpu = tracefs_cpu_open_mapped(instance, cpu, true); + tcpu = tracefs_cpu_open(instance, cpu, true); if (!tcpu) goto error; -- Steve
On 20/11/2024 17:43, Steven Rostedt wrote: > On Wed, 20 Nov 2024 16:59:55 +0100 > Jean-Michel Hautbois <jeanmichel.hautbois@yoseli.org> wrote: > >>> And that way you will see what 'ptr' is before the crash. Or did you do >>> that already? >> >> Yes, I did, sorry I thought it was in the previous dump :-(. > > Can you see if this makes a difference? > > Patch libtracefs: > > diff --git a/src/tracefs-events.c b/src/tracefs-events.c > index 77d1ba89b038..19ea3b3f8d36 100644 > --- a/src/tracefs-events.c > +++ b/src/tracefs-events.c > @@ -274,7 +274,7 @@ static int open_cpu_files(struct tracefs_instance *instance, cpu_set_t *cpus, > if (snapshot) > tcpu = tracefs_cpu_snapshot_open(instance, cpu, true); > else > - tcpu = tracefs_cpu_open_mapped(instance, cpu, true); > + tcpu = tracefs_cpu_open(instance, cpu, true); > if (!tcpu) > goto error; > Nope. Nice try :-) ! If you need me to add a few specific printfs I can run it. JM
On Tue, 19 Nov 2024 11:28:50 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > print fmt: "page=%p pfn=0x%lx flags=%s", (mem_map + ((REC->pfn) - (m68k_memory[0].addr >> 13))), REC->pfn, (REC->reclaim_flags) ? __print_flags(REC->reclaim_flags, "|", {0x0001u, "RECLAIM_WB_ANON"}, {0x0002u, "RECLAIM_WB_FILE"}, {0x0010u, "RECLAIM_WB_MIXED"}, {0x0004u, "RECLAIM_WB_SYNC"}, {0x0008u, "RECLAIM_WB_ASYNC"} ) : "RECLAIM_WB_NONE" Running gdb on this, I found that it crashed on the "m68k_memory[0].addr" because it failed to fail on the '.' that it doesn't know how to parse. This patch should stop the crash (against libtraceevent) -- Steve diff --git a/src/event-parse.c b/src/event-parse.c index 0427061603db..a6da8f04cbf3 100644 --- a/src/event-parse.c +++ b/src/event-parse.c @@ -2498,6 +2498,10 @@ process_op(struct tep_event *event, struct tep_print_arg *arg, char **tok) /* higher prios need to be closer to the root */ prio = get_op_prio(*tok); + if (prio < 0) { + token = *tok; + goto out_free; + } if (prio > arg->op.prio) return process_op(event, arg, tok);
© 2016 - 2024 Red Hat, Inc.