[PATCH v2 0/4] Tracing contention lock owner call stack

Chun-Tse Shao posted 4 patches 11 months, 2 weeks ago
There is a newer version of this series
tools/perf/builtin-lock.c                     |  54 +++--
tools/perf/util/bpf_lock_contention.c         |  58 +++++-
.../perf/util/bpf_skel/lock_contention.bpf.c  | 192 +++++++++++++++++-
tools/perf/util/bpf_skel/lock_data.h          |   6 +
tools/perf/util/lock-contention.h             |   2 +
5 files changed, 292 insertions(+), 20 deletions(-)
[PATCH v2 0/4] Tracing contention lock owner call stack
Posted by Chun-Tse Shao 11 months, 2 weeks ago
For perf lock contention, the current owner tracking (-o option) only
works with per-thread mode (-t option). Enabling call stack mode for
owner can be useful for diagnosing why a system running slow in
lock contention.

Example output:
  $ sudo ~/linux/tools/perf/perf lock con -abvo -Y mutex-spin -E16 perf bench sched pipe
   ...
   contended   total wait     max wait     avg wait         type   caller

         171      1.55 ms     20.26 us      9.06 us        mutex   pipe_read+0x57
                          0xffffffffac6318e7  pipe_read+0x57
                          0xffffffffac623862  vfs_read+0x332
                          0xffffffffac62434b  ksys_read+0xbb
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
          36    193.71 us     15.27 us      5.38 us        mutex   pipe_write+0x50
                          0xffffffffac631ee0  pipe_write+0x50
                          0xffffffffac6241db  vfs_write+0x3bb
                          0xffffffffac6244ab  ksys_write+0xbb
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
           4     51.22 us     16.47 us     12.80 us        mutex   do_epoll_wait+0x24d
                          0xffffffffac691f0d  do_epoll_wait+0x24d
                          0xffffffffac69249b  do_epoll_pwait.part.0+0xb
                          0xffffffffac693ba5  __x64_sys_epoll_pwait+0x95
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
           2     20.88 us     11.95 us     10.44 us        mutex   do_epoll_wait+0x24d
                          0xffffffffac691f0d  do_epoll_wait+0x24d
                          0xffffffffac693943  __x64_sys_epoll_wait+0x73
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
           1      7.33 us      7.33 us      7.33 us        mutex   do_epoll_ctl+0x6c1
                          0xffffffffac692e01  do_epoll_ctl+0x6c1
                          0xffffffffac6937e0  __x64_sys_epoll_ctl+0x70
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
           1      6.64 us      6.64 us      6.64 us        mutex   do_epoll_ctl+0x3d4
                          0xffffffffac692b14  do_epoll_ctl+0x3d4
                          0xffffffffac6937e0  __x64_sys_epoll_ctl+0x70
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76

  === owner stack trace ===

           3     31.24 us     15.27 us     10.41 us        mutex   pipe_read+0x348
                          0xffffffffac631bd8  pipe_read+0x348
                          0xffffffffac623862  vfs_read+0x332
                          0xffffffffac62434b  ksys_read+0xbb
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
  ...

v2: Fix logic deficit in patch 2/4.
Chun-Tse Shao (4):
  perf lock: Add bpf maps for owner stack tracing
  perf lock: Retrieve owner callstack in bpf program
  perf lock: Make rb_tree helper functions generic
  perf lock: Report owner stack in usermode

 tools/perf/builtin-lock.c                     |  54 +++--
 tools/perf/util/bpf_lock_contention.c         |  58 +++++-
 .../perf/util/bpf_skel/lock_contention.bpf.c  | 192 +++++++++++++++++-
 tools/perf/util/bpf_skel/lock_data.h          |   6 +
 tools/perf/util/lock-contention.h             |   2 +
 5 files changed, 292 insertions(+), 20 deletions(-)

--
2.47.1.688.g23fc6f90ad-goog
Re: [PATCH v2 0/4] Tracing contention lock owner call stack
Posted by Chun-Tse Shao 10 months, 4 weeks ago
To Namhyung, thanks for your review! I have my v3 available on:
https://lore.kernel.org/linux-perf-users/20250129001905.619859-1-ctshao@google.com

Few topics I would like to discuss with you:

In [PATCH v2 1/4]:

>         new_id = __sync_fetch_and_add(&id_gen, 1);
It caused compile error and I think it is a bug for bpf:
https://github.com/llvm/llvm-project/issues/91888

In [PATCH v2 2/4]:

> > +struct task_struct *bpf_task_from_pid(s32 pid) __ksym;
> > +void bpf_task_release(struct task_struct *p) __ksym;
>
> To support old (ancient?) kernels, you can declare them as __weak and
> check if one of them is defined and ignore owner stacks on them.  Also
> you can check them in user space and turn off the option before loading.
>
Would like to know more about how to do this.

> > +                     (otdata->count)--;
>
> No need for parenthesis, and it needs to be atomic dec.
`__sync_fetch_and_sub(&otdata->count, 1)` caused compile error as well
and I think it is bpf bug.

In [PATCH v2 4/4]:

> > +     st->name = strdup(name);
>
> Why do you copy and strdup()?  Anyway please check the error.
`st->name` was not allocated and I need `strdup` to do that.




On Sun, Jan 12, 2025 at 9:22 PM Chun-Tse Shao <ctshao@google.com> wrote:
>
> For perf lock contention, the current owner tracking (-o option) only
> works with per-thread mode (-t option). Enabling call stack mode for
> owner can be useful for diagnosing why a system running slow in
> lock contention.
>
> Example output:
>   $ sudo ~/linux/tools/perf/perf lock con -abvo -Y mutex-spin -E16 perf bench sched pipe
>    ...
>    contended   total wait     max wait     avg wait         type   caller
>
>          171      1.55 ms     20.26 us      9.06 us        mutex   pipe_read+0x57
>                           0xffffffffac6318e7  pipe_read+0x57
>                           0xffffffffac623862  vfs_read+0x332
>                           0xffffffffac62434b  ksys_read+0xbb
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>           36    193.71 us     15.27 us      5.38 us        mutex   pipe_write+0x50
>                           0xffffffffac631ee0  pipe_write+0x50
>                           0xffffffffac6241db  vfs_write+0x3bb
>                           0xffffffffac6244ab  ksys_write+0xbb
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>            4     51.22 us     16.47 us     12.80 us        mutex   do_epoll_wait+0x24d
>                           0xffffffffac691f0d  do_epoll_wait+0x24d
>                           0xffffffffac69249b  do_epoll_pwait.part.0+0xb
>                           0xffffffffac693ba5  __x64_sys_epoll_pwait+0x95
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>            2     20.88 us     11.95 us     10.44 us        mutex   do_epoll_wait+0x24d
>                           0xffffffffac691f0d  do_epoll_wait+0x24d
>                           0xffffffffac693943  __x64_sys_epoll_wait+0x73
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>            1      7.33 us      7.33 us      7.33 us        mutex   do_epoll_ctl+0x6c1
>                           0xffffffffac692e01  do_epoll_ctl+0x6c1
>                           0xffffffffac6937e0  __x64_sys_epoll_ctl+0x70
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>            1      6.64 us      6.64 us      6.64 us        mutex   do_epoll_ctl+0x3d4
>                           0xffffffffac692b14  do_epoll_ctl+0x3d4
>                           0xffffffffac6937e0  __x64_sys_epoll_ctl+0x70
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>
>   === owner stack trace ===
>
>            3     31.24 us     15.27 us     10.41 us        mutex   pipe_read+0x348
>                           0xffffffffac631bd8  pipe_read+0x348
>                           0xffffffffac623862  vfs_read+0x332
>                           0xffffffffac62434b  ksys_read+0xbb
>                           0xfffffffface604b2  do_syscall_64+0x82
>                           0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
>   ...
>
> v2: Fix logic deficit in patch 2/4.
> Chun-Tse Shao (4):
>   perf lock: Add bpf maps for owner stack tracing
>   perf lock: Retrieve owner callstack in bpf program
>   perf lock: Make rb_tree helper functions generic
>   perf lock: Report owner stack in usermode
>
>  tools/perf/builtin-lock.c                     |  54 +++--
>  tools/perf/util/bpf_lock_contention.c         |  58 +++++-
>  .../perf/util/bpf_skel/lock_contention.bpf.c  | 192 +++++++++++++++++-
>  tools/perf/util/bpf_skel/lock_data.h          |   6 +
>  tools/perf/util/lock-contention.h             |   2 +
>  5 files changed, 292 insertions(+), 20 deletions(-)
>
> --
> 2.47.1.688.g23fc6f90ad-goog
>