tools/include/uapi/linux/perf_event.h | 21 +++++- tools/lib/perf/include/perf/event.h | 7 ++ tools/perf/Documentation/perf-script.txt | 5 ++ tools/perf/builtin-script.c | 92 +++++++++++++++++++++++ tools/perf/util/callchain.c | 24 ++++++ tools/perf/util/callchain.h | 3 + tools/perf/util/event.c | 1 + tools/perf/util/evlist.c | 1 + tools/perf/util/evlist.h | 1 + tools/perf/util/evsel.c | 32 +++++++- tools/perf/util/evsel.h | 1 + tools/perf/util/machine.c | 1 + tools/perf/util/perf_event_attr_fprintf.c | 1 + tools/perf/util/sample.h | 3 +- tools/perf/util/session.c | 78 +++++++++++++++++++ tools/perf/util/tool.c | 2 + tools/perf/util/tool.h | 4 +- 17 files changed, 273 insertions(+), 4 deletions(-)
Hello, This is a counterpart for Josh's kernel change v2 [1] to support deferred user callchains. The change is transparent and users should not notice anything with the deferred callchains. $ perf record -g sleep 1 I added --[no-]merge-callchains option to control output of perf script. You can verify it has the deferred callchains like this: $ perf script --no-merge-callchains perf 801 [000] 18.031793: 1 cycles:P: ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) perf 801 [000] 18.031814: DEFERRED CALLCHAIN 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) ... When the callchain is merged (it's the default) it'd look like below: $ perf script perf 801 [000] 18.031793: 1 cycles:P: ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) ... Notice that the last line and it has the __GI___ioctl in the same callchain. It should work with other tools like perf report. The code is available at 'perf/defer-callchain-v2' branch in https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git Thanks, Namhyung [1] https://lore.kernel.org/lkml/cover.1726268190.git.jpoimboe@kernel.org Namhyung Kim (5): perf tools: Sync UAPI perf_event.h header perf tools: Minimal DEFERRED_CALLCHAIN support perf record: Enable defer_callchain for user callchains perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED perf tools: Merge deferred user callchains tools/include/uapi/linux/perf_event.h | 21 +++++- tools/lib/perf/include/perf/event.h | 7 ++ tools/perf/Documentation/perf-script.txt | 5 ++ tools/perf/builtin-script.c | 92 +++++++++++++++++++++++ tools/perf/util/callchain.c | 24 ++++++ tools/perf/util/callchain.h | 3 + tools/perf/util/event.c | 1 + tools/perf/util/evlist.c | 1 + tools/perf/util/evlist.h | 1 + tools/perf/util/evsel.c | 32 +++++++- tools/perf/util/evsel.h | 1 + tools/perf/util/machine.c | 1 + tools/perf/util/perf_event_attr_fprintf.c | 1 + tools/perf/util/sample.h | 3 +- tools/perf/util/session.c | 78 +++++++++++++++++++ tools/perf/util/tool.c | 2 + tools/perf/util/tool.h | 4 +- 17 files changed, 273 insertions(+), 4 deletions(-) -- 2.46.0.792.g87dc391469-goog
On 2024-09-17 6:28 p.m., Namhyung Kim wrote: > Hello, > > This is a counterpart for Josh's kernel change v2 [1] to support deferred > user callchains. The change is transparent and users should not notice > anything with the deferred callchains. > > $ perf record -g sleep 1 > > I added --[no-]merge-callchains option to control output of perf script. > You can verify it has the deferred callchains like this: > > $ perf script --no-merge-callchains > perf 801 [000] 18.031793: 1 cycles:P: > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > perf 801 [000] 18.031814: DEFERRED CALLCHAIN > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > ... > > When the callchain is merged (it's the default) it'd look like below: > > $ perf script > perf 801 [000] 18.031793: 1 cycles:P: > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > ... > > Notice that the last line and it has the __GI___ioctl in the same > callchain. It should work with other tools like perf report. It seems it only works with perf report -D, when I test it on a non-hybrid machine. $perf record -e branches -g -c 3000000 ~/tchain_edit [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.397 MB perf.data ] $ perf report -D | tail -n 17 Aggregated stats: TOTAL events: 8235 MMAP events: 78 ( 0.9%) COMM events: 2 ( 0.0%) EXIT events: 1 ( 0.0%) SAMPLE events: 4060 (49.3%) MMAP2 events: 2 ( 0.0%) KSYMBOL events: 12 ( 0.1%) BPF_EVENT events: 12 ( 0.1%) CALLCHAIN_DEFERRED events: 4060 (49.3%) FINISHED_ROUND events: 3 ( 0.0%) ID_INDEX events: 1 ( 0.0%) THREAD_MAP events: 1 ( 0.0%) CPU_MAP events: 1 ( 0.0%) TIME_CONV events: 1 ( 0.0%) FINISHED_INIT events: 1 ( 0.0%) $ perf report Error: The perf.data data has no samples! # To display the perf.data header info, please use --header/--header-only options. # On a hybrid machine, perf record errors out. $perf record -g true [ perf record: Woken up 1 times to write data ] 0x58a8 [0x38]: failed to process type: 22 [Bad address] [ perf record: Captured and wrote 0.022 MB perf.data ] Thanks, Kan > > The code is available at 'perf/defer-callchain-v2' branch in > https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git > > Thanks, > Namhyung > > [1] https://lore.kernel.org/lkml/cover.1726268190.git.jpoimboe@kernel.org > > > Namhyung Kim (5): > perf tools: Sync UAPI perf_event.h header > perf tools: Minimal DEFERRED_CALLCHAIN support > perf record: Enable defer_callchain for user callchains > perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED > perf tools: Merge deferred user callchains > > tools/include/uapi/linux/perf_event.h | 21 +++++- > tools/lib/perf/include/perf/event.h | 7 ++ > tools/perf/Documentation/perf-script.txt | 5 ++ > tools/perf/builtin-script.c | 92 +++++++++++++++++++++++ > tools/perf/util/callchain.c | 24 ++++++ > tools/perf/util/callchain.h | 3 + > tools/perf/util/event.c | 1 + > tools/perf/util/evlist.c | 1 + > tools/perf/util/evlist.h | 1 + > tools/perf/util/evsel.c | 32 +++++++- > tools/perf/util/evsel.h | 1 + > tools/perf/util/machine.c | 1 + > tools/perf/util/perf_event_attr_fprintf.c | 1 + > tools/perf/util/sample.h | 3 +- > tools/perf/util/session.c | 78 +++++++++++++++++++ > tools/perf/util/tool.c | 2 + > tools/perf/util/tool.h | 4 +- > 17 files changed, 273 insertions(+), 4 deletions(-) >
Hi Kan, On Wed, Sep 18, 2024 at 04:26:56PM -0400, Liang, Kan wrote: > > > On 2024-09-17 6:28 p.m., Namhyung Kim wrote: > > Hello, > > > > This is a counterpart for Josh's kernel change v2 [1] to support deferred > > user callchains. The change is transparent and users should not notice > > anything with the deferred callchains. > > > > $ perf record -g sleep 1 > > > > I added --[no-]merge-callchains option to control output of perf script. > > You can verify it has the deferred callchains like this: > > > > $ perf script --no-merge-callchains > > perf 801 [000] 18.031793: 1 cycles:P: > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > > > perf 801 [000] 18.031814: DEFERRED CALLCHAIN > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > ... > > > > When the callchain is merged (it's the default) it'd look like below: > > > > $ perf script > > perf 801 [000] 18.031793: 1 cycles:P: > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > ... > > > > Notice that the last line and it has the __GI___ioctl in the same > > callchain. It should work with other tools like perf report. > > > It seems it only works with perf report -D, when I test it on a > non-hybrid machine. > $perf record -e branches -g -c 3000000 ~/tchain_edit > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.397 MB perf.data ] > $ perf report -D | tail -n 17 > > Aggregated stats: > TOTAL events: 8235 > MMAP events: 78 ( 0.9%) > COMM events: 2 ( 0.0%) > EXIT events: 1 ( 0.0%) > SAMPLE events: 4060 (49.3%) > MMAP2 events: 2 ( 0.0%) > KSYMBOL events: 12 ( 0.1%) > BPF_EVENT events: 12 ( 0.1%) > CALLCHAIN_DEFERRED events: 4060 (49.3%) > FINISHED_ROUND events: 3 ( 0.0%) > ID_INDEX events: 1 ( 0.0%) > THREAD_MAP events: 1 ( 0.0%) > CPU_MAP events: 1 ( 0.0%) > TIME_CONV events: 1 ( 0.0%) > FINISHED_INIT events: 1 ( 0.0%) > $ perf report > Error: > The perf.data data has no samples! > # To display the perf.data header info, please use > --header/--header-only options. > # > > > On a hybrid machine, perf record errors out. > > $perf record -g true > [ perf record: Woken up 1 times to write data ] > 0x58a8 [0x38]: failed to process type: 22 [Bad address] > [ perf record: Captured and wrote 0.022 MB perf.data ] Thanks for the test, I'll take a look what I missed. Thanks, Namhyung
On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote: > > Hello, > > This is a counterpart for Josh's kernel change v2 [1] to support deferred > user callchains. The change is transparent and users should not notice > anything with the deferred callchains. > > $ perf record -g sleep 1 > > I added --[no-]merge-callchains option to control output of perf script. > You can verify it has the deferred callchains like this: > > $ perf script --no-merge-callchains > perf 801 [000] 18.031793: 1 cycles:P: > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > perf 801 [000] 18.031814: DEFERRED CALLCHAIN > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > ... > > When the callchain is merged (it's the default) it'd look like below: > > $ perf script > perf 801 [000] 18.031793: 1 cycles:P: > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > ... > > Notice that the last line and it has the __GI___ioctl in the same > callchain. It should work with other tools like perf report. Hi Namhyung, I think this is interesting work! The issue feels similar to leader sampling and some of the unpicking of that we've been dealing with. With leader sampling it was added and then the dispatch of events modified so that tools wouldn't see leader samples, instead new events would be synthesized based on the leader sample data. However, the leader sample event wasn't changed and so now we have multiple repeated events and perf inject wouldn't just pass through a perf data file. What I'm expecting based on this description is that a deferred call chain will be merged with a regular one, however, perf inject isn't updated to drop the deferred callchain so now we have the deferred callchain event twice. My feeling is that making the dispatch of events to tools "smart" is a false economy. Tools can add handlers for these events easily enough. What's harder is undoing the smartness when it does things that lead to duplicated events and the like. I'm not a fan of how leader sampling was implemented and I still think it odd that with perf script we see invented events when trying to just dump the contents of a perf.data file. Thanks, Ian > The code is available at 'perf/defer-callchain-v2' branch in > https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git > > Thanks, > Namhyung > > [1] https://lore.kernel.org/lkml/cover.1726268190.git.jpoimboe@kernel.org > > > Namhyung Kim (5): > perf tools: Sync UAPI perf_event.h header > perf tools: Minimal DEFERRED_CALLCHAIN support > perf record: Enable defer_callchain for user callchains > perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED > perf tools: Merge deferred user callchains > > tools/include/uapi/linux/perf_event.h | 21 +++++- > tools/lib/perf/include/perf/event.h | 7 ++ > tools/perf/Documentation/perf-script.txt | 5 ++ > tools/perf/builtin-script.c | 92 +++++++++++++++++++++++ > tools/perf/util/callchain.c | 24 ++++++ > tools/perf/util/callchain.h | 3 + > tools/perf/util/event.c | 1 + > tools/perf/util/evlist.c | 1 + > tools/perf/util/evlist.h | 1 + > tools/perf/util/evsel.c | 32 +++++++- > tools/perf/util/evsel.h | 1 + > tools/perf/util/machine.c | 1 + > tools/perf/util/perf_event_attr_fprintf.c | 1 + > tools/perf/util/sample.h | 3 +- > tools/perf/util/session.c | 78 +++++++++++++++++++ > tools/perf/util/tool.c | 2 + > tools/perf/util/tool.h | 4 +- > 17 files changed, 273 insertions(+), 4 deletions(-) > > -- > 2.46.0.792.g87dc391469-goog >
Hi Ian, On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote: > On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote: > > > > Hello, > > > > This is a counterpart for Josh's kernel change v2 [1] to support deferred > > user callchains. The change is transparent and users should not notice > > anything with the deferred callchains. > > > > $ perf record -g sleep 1 > > > > I added --[no-]merge-callchains option to control output of perf script. > > You can verify it has the deferred callchains like this: > > > > $ perf script --no-merge-callchains > > perf 801 [000] 18.031793: 1 cycles:P: > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > > > perf 801 [000] 18.031814: DEFERRED CALLCHAIN > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > ... > > > > When the callchain is merged (it's the default) it'd look like below: > > > > $ perf script > > perf 801 [000] 18.031793: 1 cycles:P: > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > ... > > > > Notice that the last line and it has the __GI___ioctl in the same > > callchain. It should work with other tools like perf report. > > Hi Namhyung, I think this is interesting work! > > The issue feels similar to leader sampling and some of the unpicking > of that we've been dealing with. With leader sampling it was added and > then the dispatch of events modified so that tools wouldn't see leader > samples, instead new events would be synthesized based on the leader > sample data. However, the leader sample event wasn't changed and so > now we have multiple repeated events and perf inject wouldn't just > pass through a perf data file. > > What I'm expecting based on this description is that a deferred call > chain will be merged with a regular one, however, perf inject isn't > updated to drop the deferred callchain so now we have the deferred > callchain event twice. > > My feeling is that making the dispatch of events to tools "smart" is a > false economy. Tools can add handlers for these events easily enough. > What's harder is undoing the smartness when it does things that lead > to duplicated events and the like. I'm not a fan of how leader > sampling was implemented and I still think it odd that with perf > script we see invented events when trying to just dump the contents of > a perf.data file. That's why I added perf_tool.merge_deferred_callchains flag to control the behavior. I haven't implemented it to perf inject because it covers a couple of different use cases. I believe the default behavior is to not invoke the callback for deferred callchains during perf inject and each sample will get the full callchains. But you can add a new callback and set perf_tool.merge_deferred_callchains to false. Thanks, Namhyung
On Wed, Sep 18, 2024 at 11:38 AM Namhyung Kim <namhyung@kernel.org> wrote: > > Hi Ian, > > On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote: > > On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote: > > > > > > Hello, > > > > > > This is a counterpart for Josh's kernel change v2 [1] to support deferred > > > user callchains. The change is transparent and users should not notice > > > anything with the deferred callchains. > > > > > > $ perf record -g sleep 1 > > > > > > I added --[no-]merge-callchains option to control output of perf script. > > > You can verify it has the deferred callchains like this: > > > > > > $ perf script --no-merge-callchains > > > perf 801 [000] 18.031793: 1 cycles:P: > > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > > > > > perf 801 [000] 18.031814: DEFERRED CALLCHAIN > > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > > > ... > > > > > > When the callchain is merged (it's the default) it'd look like below: > > > > > > $ perf script > > > perf 801 [000] 18.031793: 1 cycles:P: > > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > > > ... > > > > > > Notice that the last line and it has the __GI___ioctl in the same > > > callchain. It should work with other tools like perf report. > > > > Hi Namhyung, I think this is interesting work! > > > > The issue feels similar to leader sampling and some of the unpicking > > of that we've been dealing with. With leader sampling it was added and > > then the dispatch of events modified so that tools wouldn't see leader > > samples, instead new events would be synthesized based on the leader > > sample data. However, the leader sample event wasn't changed and so > > now we have multiple repeated events and perf inject wouldn't just > > pass through a perf data file. > > > > What I'm expecting based on this description is that a deferred call > > chain will be merged with a regular one, however, perf inject isn't > > updated to drop the deferred callchain so now we have the deferred > > callchain event twice. > > > > My feeling is that making the dispatch of events to tools "smart" is a > > false economy. Tools can add handlers for these events easily enough. > > What's harder is undoing the smartness when it does things that lead > > to duplicated events and the like. I'm not a fan of how leader > > sampling was implemented and I still think it odd that with perf > > script we see invented events when trying to just dump the contents of > > a perf.data file. > > That's why I added perf_tool.merge_deferred_callchains flag to control > the behavior. I haven't implemented it to perf inject because it covers > a couple of different use cases. I believe the default behavior is to > not invoke the callback for deferred callchains during perf inject and > each sample will get the full callchains. But you can add a new > callback and set perf_tool.merge_deferred_callchains to false. I wonder if there is a different strategy for handling this. Normally with a visitor pattern you fail when you call an unimplemented visitor, this is then a signal the (in our case) tool needs to handle the new case. This avoids naively doing things like making perf inject duplicate events. The equivalent in the perf code would be to initialize the callbacks in the tool constructor to be to stubs that abort, then explicitly initialize and use things like callchain merging as appropriate. The whole booleans next to the callbacks feels like a kludge and likely to hide bugs. It is also marginally less efficient. Thanks, Ian
Hi Ian, On Wed, Sep 18, 2024 at 03:39:31PM +0200, Ian Rogers wrote: > On Wed, Sep 18, 2024 at 11:38 AM Namhyung Kim <namhyung@kernel.org> wrote: > > > > Hi Ian, > > > > On Wed, Sep 18, 2024 at 08:38:22AM +0200, Ian Rogers wrote: > > > On Wed, Sep 18, 2024 at 12:28 AM Namhyung Kim <namhyung@kernel.org> wrote: > > > > > > > > Hello, > > > > > > > > This is a counterpart for Josh's kernel change v2 [1] to support deferred > > > > user callchains. The change is transparent and users should not notice > > > > anything with the deferred callchains. > > > > > > > > $ perf record -g sleep 1 > > > > > > > > I added --[no-]merge-callchains option to control output of perf script. > > > > You can verify it has the deferred callchains like this: > > > > > > > > $ perf script --no-merge-callchains > > > > perf 801 [000] 18.031793: 1 cycles:P: > > > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > > > > > > > perf 801 [000] 18.031814: DEFERRED CALLCHAIN > > > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > > > > > ... > > > > > > > > When the callchain is merged (it's the default) it'd look like below: > > > > > > > > $ perf script > > > > perf 801 [000] 18.031793: 1 cycles:P: > > > > ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms]) > > > > ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms]) > > > > ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms]) > > > > ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms]) > > > > ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms]) > > > > ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms]) > > > > ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms]) > > > > ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms]) > > > > ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms]) > > > > ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms]) > > > > ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms]) > > > > ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms]) > > > > ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms]) > > > > 7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6) > > > > > > > > ... > > > > > > > > Notice that the last line and it has the __GI___ioctl in the same > > > > callchain. It should work with other tools like perf report. > > > > > > Hi Namhyung, I think this is interesting work! > > > > > > The issue feels similar to leader sampling and some of the unpicking > > > of that we've been dealing with. With leader sampling it was added and > > > then the dispatch of events modified so that tools wouldn't see leader > > > samples, instead new events would be synthesized based on the leader > > > sample data. However, the leader sample event wasn't changed and so > > > now we have multiple repeated events and perf inject wouldn't just > > > pass through a perf data file. > > > > > > What I'm expecting based on this description is that a deferred call > > > chain will be merged with a regular one, however, perf inject isn't > > > updated to drop the deferred callchain so now we have the deferred > > > callchain event twice. > > > > > > My feeling is that making the dispatch of events to tools "smart" is a > > > false economy. Tools can add handlers for these events easily enough. > > > What's harder is undoing the smartness when it does things that lead > > > to duplicated events and the like. I'm not a fan of how leader > > > sampling was implemented and I still think it odd that with perf > > > script we see invented events when trying to just dump the contents of > > > a perf.data file. > > > > That's why I added perf_tool.merge_deferred_callchains flag to control > > the behavior. I haven't implemented it to perf inject because it covers > > a couple of different use cases. I believe the default behavior is to > > not invoke the callback for deferred callchains during perf inject and > > each sample will get the full callchains. But you can add a new > > callback and set perf_tool.merge_deferred_callchains to false. > > I wonder if there is a different strategy for handling this. Normally > with a visitor pattern you fail when you call an unimplemented > visitor, this is then a signal the (in our case) tool needs to handle > the new case. This avoids naively doing things like making perf inject > duplicate events. The equivalent in the perf code would be to > initialize the callbacks in the tool constructor to be to stubs that > abort, then explicitly initialize and use things like callchain > merging as appropriate. The whole booleans next to the callbacks feels > like a kludge and likely to hide bugs. It is also marginally less > efficient. Well.. we might change it that way later, but I just wanted to test the deferred callchains quickly in this series. Thanks, Namhyung
© 2016 - 2024 Red Hat, Inc.