tools/perf/builtin-lock.c | 46 +++++++---- tools/perf/util/bpf_lock_contention.c | 35 ++++++++- .../perf/util/bpf_skel/lock_contention.bpf.c | 77 +++++++++++++++++++ tools/perf/util/bpf_skel/lock_data.h | 14 ++++ 4 files changed, 152 insertions(+), 20 deletions(-)
Hello,
This patchset improves the symbolization of locks for -l/--lock-addr mode.
As of now it only shows global lock symbols present in the kallsyms. But
we can add some more lock symbols by traversing pointers in the BPF program.
For example, mmap_lock can be reached from the mm_struct of the current task
(task_struct->mm->mmap_lock) and we can compare the address of the give lock
with it. Similarly I've added 'siglock' for current->sighand->siglock.
On the other hand, we can traverse some of semi-global locks like per-cpu,
per-device, per-filesystem and so on. I've added 'rqlock' for each cpu's
runqueue lock.
It cannot cover all types of locks in the system but it'd be fairly usefule
if we can add many of often contended locks. I tried to add futex locks
but it failed to find the __futex_data symbol from BTF. I'm not sure why but
I guess it's because the struct doesn't have a tag name.
Those locks are added just because they got caught during my test.
It'd be nice if you suggest which locks to add and how to do that. :)
I'm thinking if there's a way to track file-based locks (like epoll, etc).
Finally I also added a lock type name after the symbols (if any) so that we
can get some idea even though it has no symbol. The example output looks
like below:
$ sudo ./perf lock con -abl -- sleep 1
contended total wait max wait avg wait address symbol
44 6.13 ms 284.49 us 139.28 us ffffffff92e06080 tasklist_lock (rwlock)
159 983.38 us 12.38 us 6.18 us ffff8cc717c90000 siglock (spinlock)
10 679.90 us 153.35 us 67.99 us ffff8cdc2872aaf8 mmap_lock (rwsem)
9 558.11 us 180.67 us 62.01 us ffff8cd647914038 mmap_lock (rwsem)
78 228.56 us 7.82 us 2.93 us ffff8cc700061c00 (spinlock)
5 41.60 us 16.93 us 8.32 us ffffd853acb41468 (spinlock)
10 37.24 us 5.87 us 3.72 us ffff8cd560b5c200 siglock (spinlock)
4 11.17 us 3.97 us 2.79 us ffff8d053ddf0c80 rq_lock (spinlock)
1 7.86 us 7.86 us 7.86 us ffff8cd64791404c (spinlock)
1 4.13 us 4.13 us 4.13 us ffff8d053d930c80 rq_lock (spinlock)
7 3.98 us 1.67 us 568 ns ffff8ccb92479440 (mutex)
2 2.62 us 2.33 us 1.31 us ffff8cc702e6ede0 (rwlock)
The tasklist_lock is global so it's from the kallsyms. But others like
siglock, mmap_lock and rq_lock are from the BPF.
You get get the code at 'perf/lock-symbol-v1' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (4):
perf lock contention: Track and show mmap_lock with address
perf lock contention: Track and show siglock with address
perf lock contention: Show per-cpu rq_lock with address
perf lock contention: Show lock type with address
tools/perf/builtin-lock.c | 46 +++++++----
tools/perf/util/bpf_lock_contention.c | 35 ++++++++-
.../perf/util/bpf_skel/lock_contention.bpf.c | 77 +++++++++++++++++++
tools/perf/util/bpf_skel/lock_data.h | 14 ++++
4 files changed, 152 insertions(+), 20 deletions(-)
base-commit: b8fa3e3833c14151a47ebebbc5427dcfe94bb407
--
2.40.0.rc1.284.g88254d51c5-goog
Em Mon, Mar 13, 2023 at 01:48:21PM -0700, Namhyung Kim escreveu: > Hello, > > This patchset improves the symbolization of locks for -l/--lock-addr mode. > As of now it only shows global lock symbols present in the kallsyms. But > we can add some more lock symbols by traversing pointers in the BPF program. > > For example, mmap_lock can be reached from the mm_struct of the current task > (task_struct->mm->mmap_lock) and we can compare the address of the give lock > with it. Similarly I've added 'siglock' for current->sighand->siglock. > > On the other hand, we can traverse some of semi-global locks like per-cpu, > per-device, per-filesystem and so on. I've added 'rqlock' for each cpu's > runqueue lock. > > It cannot cover all types of locks in the system but it'd be fairly usefule > if we can add many of often contended locks. I tried to add futex locks > but it failed to find the __futex_data symbol from BTF. I'm not sure why but > I guess it's because the struct doesn't have a tag name. > > Those locks are added just because they got caught during my test. > It'd be nice if you suggest which locks to add and how to do that. :) > I'm thinking if there's a way to track file-based locks (like epoll, etc). > > Finally I also added a lock type name after the symbols (if any) so that we > can get some idea even though it has no symbol. The example output looks > like below: > > $ sudo ./perf lock con -abl -- sleep 1 > contended total wait max wait avg wait address symbol > > 44 6.13 ms 284.49 us 139.28 us ffffffff92e06080 tasklist_lock (rwlock) > 159 983.38 us 12.38 us 6.18 us ffff8cc717c90000 siglock (spinlock) > 10 679.90 us 153.35 us 67.99 us ffff8cdc2872aaf8 mmap_lock (rwsem) > 9 558.11 us 180.67 us 62.01 us ffff8cd647914038 mmap_lock (rwsem) > 78 228.56 us 7.82 us 2.93 us ffff8cc700061c00 (spinlock) > 5 41.60 us 16.93 us 8.32 us ffffd853acb41468 (spinlock) > 10 37.24 us 5.87 us 3.72 us ffff8cd560b5c200 siglock (spinlock) > 4 11.17 us 3.97 us 2.79 us ffff8d053ddf0c80 rq_lock (spinlock) > 1 7.86 us 7.86 us 7.86 us ffff8cd64791404c (spinlock) > 1 4.13 us 4.13 us 4.13 us ffff8d053d930c80 rq_lock (spinlock) > 7 3.98 us 1.67 us 568 ns ffff8ccb92479440 (mutex) > 2 2.62 us 2.33 us 1.31 us ffff8cc702e6ede0 (rwlock) > > The tasklist_lock is global so it's from the kallsyms. But others like > siglock, mmap_lock and rq_lock are from the BPF. Beautiful :-) And the csets are _so_ small and demonstrate techniques that should be used in more and more tools. Applied, testing. - Arnaldo > You get get the code at 'perf/lock-symbol-v1' branch in > > git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git > > Thanks, > Namhyung > > Namhyung Kim (4): > perf lock contention: Track and show mmap_lock with address > perf lock contention: Track and show siglock with address > perf lock contention: Show per-cpu rq_lock with address > perf lock contention: Show lock type with address > > tools/perf/builtin-lock.c | 46 +++++++---- > tools/perf/util/bpf_lock_contention.c | 35 ++++++++- > .../perf/util/bpf_skel/lock_contention.bpf.c | 77 +++++++++++++++++++ > tools/perf/util/bpf_skel/lock_data.h | 14 ++++ > 4 files changed, 152 insertions(+), 20 deletions(-) > > > base-commit: b8fa3e3833c14151a47ebebbc5427dcfe94bb407 > -- > 2.40.0.rc1.284.g88254d51c5-goog > -- - Arnaldo
Em Mon, Mar 13, 2023 at 06:45:53PM -0300, Arnaldo Carvalho de Melo escreveu: > Em Mon, Mar 13, 2023 at 01:48:21PM -0700, Namhyung Kim escreveu: > > Hello, > > > > This patchset improves the symbolization of locks for -l/--lock-addr mode. > > As of now it only shows global lock symbols present in the kallsyms. But > > we can add some more lock symbols by traversing pointers in the BPF program. > > > > For example, mmap_lock can be reached from the mm_struct of the current task > > (task_struct->mm->mmap_lock) and we can compare the address of the give lock > > with it. Similarly I've added 'siglock' for current->sighand->siglock. Hey, we can go a bit further by using something like pahole's --expand_types and --expand_pointers and play iterating a type members and looking for locks, like: ⬢[acme@toolbox pahole]$ pahole task_struct | grep spinlock_t spinlock_t alloc_lock; /* 3280 4 */ raw_spinlock_t pi_lock; /* 3284 4 */ seqcount_spinlock_t mems_allowed_seq; /* 3616 4 */ ⬢[acme@toolbox pahole]$ Expand points will find mmap_lock: ⬢[acme@toolbox pahole]$ pahole --expand_pointers -C task_struct | grep -B10 mmap_lock } *pgd; atomic_t membarrier_state; atomic_t mm_users; atomic_t mm_count; /* XXX 4 bytes hole, try to pack */ atomic_long_t pgtables_bytes; int map_count; spinlock_t page_table_lock; struct rw_semaphore mmap_lock; ^C ⬢[acme@toolbox pahole]$ ITs just too much expansion to see task_struct->mm, but it is there, of course: ⬢[acme@toolbox pahole]$ pahole mm_struct | grep mmap_lock struct rw_semaphore mmap_lock; /* 120 40 */ ⬢[acme@toolbox pahole]$ Also: ⬢[acme@toolbox pahole]$ pahole --contains rw_semaphore address_space signal_struct key inode super_block quota_info user_namespace blocking_notifier_head backing_dev_info anon_vma tty_struct cpufreq_policy tcf_block ipc_ids autogroup kvm_arch posix_clock listener_list uprobe kernfs_root configfs_fragment ext4_inode_info ext4_group_info btrfs_fs_info extent_buffer btrfs_dev_replace btrfs_space_info btrfs_inode btrfs_block_group tpm_chip ib_device ib_xrcd blk_crypto_profile controller led_classdev cppc_pcc_data dm_snapshot ⬢[acme@toolbox pahole]$ And: ⬢[acme@toolbox pahole]$ pahole --find_pointers_to mm_struct task_struct: mm task_struct: active_mm vm_area_struct: vm_mm flush_tlb_info: mm signal_struct: oom_mm tlb_state: loaded_mm linux_binprm: mm mmu_gather: mm trace_event_raw_xen_mmu_ptep_modify_prot: mm trace_event_raw_xen_mmu_alloc_ptpage: mm trace_event_raw_xen_mmu_pgd: mm trace_event_raw_xen_mmu_flush_tlb_multi: mm trace_event_raw_hyperv_mmu_flush_tlb_multi: mm mmu_notifier: mm mmu_notifier_range: mm sgx_encl_mm: mm rq: prev_mm kvm: mm cpuset_migrate_mm_work: mm mmap_unlock_irq_work: mm delayed_uprobe: mm map_info: mm trace_event_raw_mmap_lock: mm trace_event_raw_mmap_lock_acquire_returned: mm mm_walk: mm make_exclusive_args: mm mmu_interval_notifier: mm mm_slot: mm rmap_item: mm trace_event_raw_mm_khugepaged_scan_pmd: mm trace_event_raw_mm_collapse_huge_page: mm trace_event_raw_mm_collapse_huge_page_swapin: mm mm_slot: mm move_charge_struct: mm userfaultfd_ctx: mm proc_maps_private: mm remap_pfn: mm intel_svm: mm binder_alloc: vma_vm_mm ⬢[acme@toolbox pahole]$ - Arnaldo > > On the other hand, we can traverse some of semi-global locks like per-cpu, > > per-device, per-filesystem and so on. I've added 'rqlock' for each cpu's > > runqueue lock. > > > > It cannot cover all types of locks in the system but it'd be fairly usefule > > if we can add many of often contended locks. I tried to add futex locks > > but it failed to find the __futex_data symbol from BTF. I'm not sure why but > > I guess it's because the struct doesn't have a tag name. > > > > Those locks are added just because they got caught during my test. > > It'd be nice if you suggest which locks to add and how to do that. :) > > I'm thinking if there's a way to track file-based locks (like epoll, etc). > > > > Finally I also added a lock type name after the symbols (if any) so that we > > can get some idea even though it has no symbol. The example output looks > > like below: > > > > $ sudo ./perf lock con -abl -- sleep 1 > > contended total wait max wait avg wait address symbol > > > > 44 6.13 ms 284.49 us 139.28 us ffffffff92e06080 tasklist_lock (rwlock) > > 159 983.38 us 12.38 us 6.18 us ffff8cc717c90000 siglock (spinlock) > > 10 679.90 us 153.35 us 67.99 us ffff8cdc2872aaf8 mmap_lock (rwsem) > > 9 558.11 us 180.67 us 62.01 us ffff8cd647914038 mmap_lock (rwsem) > > 78 228.56 us 7.82 us 2.93 us ffff8cc700061c00 (spinlock) > > 5 41.60 us 16.93 us 8.32 us ffffd853acb41468 (spinlock) > > 10 37.24 us 5.87 us 3.72 us ffff8cd560b5c200 siglock (spinlock) > > 4 11.17 us 3.97 us 2.79 us ffff8d053ddf0c80 rq_lock (spinlock) > > 1 7.86 us 7.86 us 7.86 us ffff8cd64791404c (spinlock) > > 1 4.13 us 4.13 us 4.13 us ffff8d053d930c80 rq_lock (spinlock) > > 7 3.98 us 1.67 us 568 ns ffff8ccb92479440 (mutex) > > 2 2.62 us 2.33 us 1.31 us ffff8cc702e6ede0 (rwlock) > > > > The tasklist_lock is global so it's from the kallsyms. But others like > > siglock, mmap_lock and rq_lock are from the BPF. > > Beautiful :-) > > And the csets are _so_ small and demonstrate techniques that should be > used in more and more tools. > > Applied, testing. > > - Arnaldo > > > You get get the code at 'perf/lock-symbol-v1' branch in > > > > git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git > > > > Thanks, > > Namhyung > > > > Namhyung Kim (4): > > perf lock contention: Track and show mmap_lock with address > > perf lock contention: Track and show siglock with address > > perf lock contention: Show per-cpu rq_lock with address > > perf lock contention: Show lock type with address > > > > tools/perf/builtin-lock.c | 46 +++++++---- > > tools/perf/util/bpf_lock_contention.c | 35 ++++++++- > > .../perf/util/bpf_skel/lock_contention.bpf.c | 77 +++++++++++++++++++ > > tools/perf/util/bpf_skel/lock_data.h | 14 ++++ > > 4 files changed, 152 insertions(+), 20 deletions(-) > > > > > > base-commit: b8fa3e3833c14151a47ebebbc5427dcfe94bb407 > > -- > > 2.40.0.rc1.284.g88254d51c5-goog > > > > -- > > - Arnaldo -- - Arnaldo
Hi Arnaldo, On Tue, Mar 14, 2023 at 5:23 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Mon, Mar 13, 2023 at 06:45:53PM -0300, Arnaldo Carvalho de Melo escreveu: > > Em Mon, Mar 13, 2023 at 01:48:21PM -0700, Namhyung Kim escreveu: > > > Hello, > > > > > > This patchset improves the symbolization of locks for -l/--lock-addr mode. > > > As of now it only shows global lock symbols present in the kallsyms. But > > > we can add some more lock symbols by traversing pointers in the BPF program. > > > > > > For example, mmap_lock can be reached from the mm_struct of the current task > > > (task_struct->mm->mmap_lock) and we can compare the address of the give lock > > > with it. Similarly I've added 'siglock' for current->sighand->siglock. > > Hey, we can go a bit further by using something like pahole's > --expand_types and --expand_pointers and play iterating a type members > and looking for locks, like: > > ⬢[acme@toolbox pahole]$ pahole task_struct | grep spinlock_t > spinlock_t alloc_lock; /* 3280 4 */ > raw_spinlock_t pi_lock; /* 3284 4 */ > seqcount_spinlock_t mems_allowed_seq; /* 3616 4 */ > ⬢[acme@toolbox pahole]$ > > Expand points will find mmap_lock: > > ⬢[acme@toolbox pahole]$ pahole --expand_pointers -C task_struct | grep -B10 mmap_lock > } *pgd; > atomic_t membarrier_state; > atomic_t mm_users; > atomic_t mm_count; > > /* XXX 4 bytes hole, try to pack */ > > atomic_long_t pgtables_bytes; > int map_count; > spinlock_t page_table_lock; > struct rw_semaphore mmap_lock; > ^C > ⬢[acme@toolbox pahole]$ > > > ITs just too much expansion to see task_struct->mm, but it is there, of > course: > > ⬢[acme@toolbox pahole]$ pahole mm_struct | grep mmap_lock > struct rw_semaphore mmap_lock; /* 120 40 */ > ⬢[acme@toolbox pahole]$ > > Also: > > ⬢[acme@toolbox pahole]$ pahole --contains rw_semaphore > address_space > signal_struct > key > inode > super_block > quota_info > user_namespace > blocking_notifier_head > backing_dev_info > anon_vma > tty_struct > cpufreq_policy > tcf_block > ipc_ids > autogroup > kvm_arch > posix_clock > listener_list > uprobe > kernfs_root > configfs_fragment > ext4_inode_info > ext4_group_info > btrfs_fs_info > extent_buffer > btrfs_dev_replace > btrfs_space_info > btrfs_inode > btrfs_block_group > tpm_chip > ib_device > ib_xrcd > blk_crypto_profile > controller > led_classdev > cppc_pcc_data > dm_snapshot > ⬢[acme@toolbox pahole]$ > > And: > > ⬢[acme@toolbox pahole]$ pahole --find_pointers_to mm_struct > task_struct: mm > task_struct: active_mm > vm_area_struct: vm_mm > flush_tlb_info: mm > signal_struct: oom_mm > tlb_state: loaded_mm > linux_binprm: mm > mmu_gather: mm > trace_event_raw_xen_mmu_ptep_modify_prot: mm > trace_event_raw_xen_mmu_alloc_ptpage: mm > trace_event_raw_xen_mmu_pgd: mm > trace_event_raw_xen_mmu_flush_tlb_multi: mm > trace_event_raw_hyperv_mmu_flush_tlb_multi: mm > mmu_notifier: mm > mmu_notifier_range: mm > sgx_encl_mm: mm > rq: prev_mm > kvm: mm > cpuset_migrate_mm_work: mm > mmap_unlock_irq_work: mm > delayed_uprobe: mm > map_info: mm > trace_event_raw_mmap_lock: mm > trace_event_raw_mmap_lock_acquire_returned: mm > mm_walk: mm > make_exclusive_args: mm > mmu_interval_notifier: mm > mm_slot: mm > rmap_item: mm > trace_event_raw_mm_khugepaged_scan_pmd: mm > trace_event_raw_mm_collapse_huge_page: mm > trace_event_raw_mm_collapse_huge_page_swapin: mm > mm_slot: mm > move_charge_struct: mm > userfaultfd_ctx: mm > proc_maps_private: mm > remap_pfn: mm > intel_svm: mm > binder_alloc: vma_vm_mm > ⬢[acme@toolbox pahole]$ This looks really cool! especially. I'm especially interested in adding super_block and kernfs_root. Let me see how I can add them. Thanks, Namhyung
© 2016 - 2026 Red Hat, Inc.