User space tasks can migrate between CPUs, so sideband events need to be
tracked for all CPUs.

The specific scenario is as follows:

         CPU0                                 CPU1
  perf record -C 0 start
                              taskA starts to be created and executed
                                -> PERF_RECORD_COMM and PERF_RECORD_MMAP
                                   events only deliver to CPU1
                              ......
                                |
                          migrate to CPU0
                                |
  Running on CPU0    <----------/
  ...

  perf record -C 0 stop

Now perf samples the PC of taskA. However, perf does not record the
PERF_RECORD_COMM and PERF_RECORD_MMAP events of taskA.
Therefore, the comm and symbols of taskA cannot be parsed.

The sys_perf_event_open calls made are as follows:

  # perf --debug verbose=3 record -e cpu-clock -C 1 true
  <SNIP>
  Opening: cpu-clock
  ------------------------------------------------------------
  perf_event_attr:
    type                             1 (PERF_TYPE_SOFTWARE)
    size                             136
    config                           0 (PERF_COUNT_SW_CPU_CLOCK)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CPU|PERIOD|IDENTIFIER
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 5
  Opening: dummy:u
  ------------------------------------------------------------
  perf_event_attr:
    type                             1 (PERF_TYPE_SOFTWARE)
    size                             136
    config                           0x9 (PERF_COUNT_SW_DUMMY)
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID|LOST
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    mmap                             1
    comm                             1
    task                             1
    sample_id_all                    1
    exclude_guest                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 14
  <SNIP>

Changes since v7:
 - The condition for requiring system_wide sideband is changed to
   "as long as a non-dummy event exists" (patch4).
 - Modify the corresponding test case to record only the dummy event (patch6).
 - Thanks to Ravi for the Tested-by tag; because the solution has changed,
   the tag is not carried over to this version.

Changes since v6:
 - Patch1:
    1. No change.
    2. Keep Acked-by tag from Adrian.
 - Patch2:
    1. Update commit message as suggested by Ian.
    2. Keep Acked-by tag from Adrian because the code is not modified.
 - Patch3:
    1. Update comment as suggested by Ian.
    2. Merge original patch5 ("perf test: Update base-record & system-wide-dummy attr") as suggested by Ian.
    3. The commits are only merged, so keep Acked-by tag from Adrian.
 - Patch4:
    1. No change, because Adrian recommends not changing the function name.
    2. Keep Acked-by tag from Adrian.
 - Patch5:
    1. Add cleanup on trap function as suggested by Ian.
    2. Remove Tested-by tag from Adrian because the script is modified.
 - Patch6:
    1. Add Reviewed-by tag from Ian.

Changes since v5:
 - No code changes.
 - More detailed commit message for patch3.
 - Add Acked-by and Tested-by tags from Adrian Hunter.

Changes since v4:
 - Simplify check code for record__tracking_system_wide().
 - Add perf attr test result to commit message for patch 7.

Changes since v3:
 - Check all_kernel, all_user, and dummy or exclude_user when determining
   whether system wide is required.

Changes since v2:
 - Rename record_tracking.sh to record_sideband.sh in tools/perf/tests/shell.
 - Remove "perf evlist: Skip dummy event sample_type check for evlist_config" patch.
 - Add opts->all_kernel check in record__config_tracking_events().
 - Add perf_event_attr test for record selected CPUs exclude_user.
 - Update base-record & system-wide-dummy sample_type attr expected values for test-record-C0.

Changes since v1:
 - Add perf_evlist__go_system_wide() via internal/evlist.h instead of
   exporting perf_evlist__propagate_maps().
 - Use evlist__add_aux_dummy() instead of evlist__add_dummy() in
   evlist__findnew_tracking_event().
 - Add a parameter in evlist__findnew_tracking_event() to deal with
   system_wide inside.
 - Add comments on the perf record man page about sideband for all CPUs
   when tracing selected CPUs.
 - Use "sideband events" instead of "tracking events".
 - Adjust the patch sequence.
 - Add patch5 to skip dummy event sample_type check for evlist_config.
 - Add patch6 to update system-wide-dummy attr values for perf test.

Yang Jihong (6):
  perf evlist: Add perf_evlist__go_system_wide() helper
  perf evlist: Add evlist__findnew_tracking_event() helper
  perf record: Move setting tracking events before
    record__init_thread_masks()
  perf record: Track sideband events for all CPUs when tracing selected
    CPUs
  perf test: Add test case for record sideband events
  perf test: Add perf_event_attr test for record dummy event

 tools/lib/perf/evlist.c                    |  9 +++
 tools/lib/perf/include/internal/evlist.h   |  2 +
 tools/perf/Documentation/perf-record.txt   |  3 +
 tools/perf/builtin-record.c                | 92 +++++++++++++++-------
 tools/perf/tests/attr/system-wide-dummy    | 14 ++--
 tools/perf/tests/attr/test-record-C0       |  4 +-
 tools/perf/tests/attr/test-record-dummy-C0 | 55 +++++++++++++
 tools/perf/tests/shell/record_sideband.sh  | 58 ++++++++++++++
 tools/perf/util/evlist.c                   | 18 +++++
 tools/perf/util/evlist.h                   |  1 +
 10 files changed, 221 insertions(+), 35 deletions(-)
 create mode 100644 tools/perf/tests/attr/test-record-dummy-C0
 create mode 100755 tools/perf/tests/shell/record_sideband.sh

--
2.30.GIT
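
The migration scenario in the cover letter can be illustrated with a toy model. This is only a sketch, not perf code: the event dictionaries and the record() helper are hypothetical names made up for the illustration, and the model assumes that a side-band event is visible only to a recorder that has an event open on the CPU where it was delivered.

```python
# Toy model of per-CPU side-band delivery (hypothetical, not perf's data model).

def record(events, sample_cpus, sideband_cpus):
    """Resolve the comm of each sample taken on sample_cpus.

    Only COMM events delivered on a CPU in sideband_cpus are seen.
    """
    comm_by_tid = {}
    resolved = []
    for ev in events:  # events are assumed to be in time order
        if ev["type"] == "COMM" and ev["cpu"] in sideband_cpus:
            comm_by_tid[ev["tid"]] = ev["comm"]
        elif ev["type"] == "SAMPLE" and ev["cpu"] in sample_cpus:
            resolved.append((ev["tid"], comm_by_tid.get(ev["tid"], "<unknown>")))
    return resolved


events = [
    {"type": "COMM", "cpu": 1, "tid": 100, "comm": "taskA"},  # taskA created on CPU1
    {"type": "SAMPLE", "cpu": 1, "tid": 100},                  # not sampled with -C 0
    {"type": "SAMPLE", "cpu": 0, "tid": 100},                  # after migrating to CPU0
]

# Side-band tracked only on the selected CPU: the comm is lost.
before = record(events, sample_cpus={0}, sideband_cpus={0})
# Side-band tracked on all CPUs, samples still only on CPU0.
after = record(events, sample_cpus={0}, sideband_cpus={0, 1})

print(before)
print(after)
```

In the first call the sample on CPU0 cannot be attributed to a comm; in the second it can, which matches the open pattern in the debug output where cpu-clock is opened on one CPU and dummy:u on every CPU.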

On 04-Sep-23 8:03 AM, Yang Jihong wrote:
> Changes since v7:
>  - The condition for requiring system_wide sideband is changed to
>    "as long as a non-dummy event exists" (patch4).
>  - Modify the corresponding test case to record only the dummy event (patch6).
>  - Thanks to Ravi for the Tested-by tag; because the solution has changed,
>    the tag is not carried over to this version.

I've re-tested v8 with my simple test.

Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>

On Tue, Sep 12, 2023 at 02:41:56PM +0530, Ravi Bangoria wrote:
> I've re-tested v8 with my simple test.
>
> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>

Thanks, applied to the csets that were still sitting in an unpublished
perf-tools-next local branch, soon public.

- Arnaldo

Hello,

On Tue, Sep 12, 2023 at 1:32 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> Thanks, applied to the csets that were still sitting in an unpublished
> perf-tools-next local branch, soon public.

Now I'm seeing a perf test failure on perf-tools-next.

  $ sudo ./perf test -v 17
   17: Setup struct perf_event_attr                                    :
  --- start ---
  test child forked, pid 1616372
  Using CPUID GenuineIntel-6-8C-1
  running './tests/attr/test-record-branch-filter-k'
  running './tests/attr/test-record-period'
  running './tests/attr/test-record-graph-default'
  test limitation '!aarch64'
  excluded architecture list ['aarch64']
  running './tests/attr/test-record-branch-filter-any'
  running './tests/attr/test-record-data'
  running './tests/attr/test-stat-detailed-1'
  running './tests/attr/test-record-branch-filter-hv'
  running './tests/attr/test-record-graph-fp'
  test limitation '!aarch64'
  excluded architecture list ['aarch64']
  running './tests/attr/test-record-basic'
  running './tests/attr/test-record-group2'
  running './tests/attr/test-stat-detailed-3'
  running './tests/attr/test-record-branch-any'
  running './tests/attr/test-record-branch-filter-ind_call'
  running './tests/attr/test-stat-detailed-2'
  running './tests/attr/test-record-group1'
  running './tests/attr/test-record-count'
  running './tests/attr/test-record-no-samples'
  running './tests/attr/test-record-graph-dwarf'
  running './tests/attr/test-record-spe-period'
  test limitation 'aarch64'
  skipped [x86_64] './tests/attr/test-record-spe-period'
  running './tests/attr/test-record-graph-fp-aarch64'
  test limitation 'aarch64'
  skipped [x86_64] './tests/attr/test-record-graph-fp-aarch64'
  running './tests/attr/test-record-freq'
  running './tests/attr/test-record-pfm-period'
  running './tests/attr/test-record-no-buffering'
  running './tests/attr/test-record-no-inherit'
  running './tests/attr/test-record-branch-filter-any_ret'
  running './tests/attr/test-record-raw'
  running './tests/attr/test-record-dummy-C0'
  expected read_format=4, got 20
  FAILED './tests/attr/test-record-dummy-C0' - match failure
  test child finished with -1
  ---- end ----
  Setup struct perf_event_attr: FAILED!
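
The two numbers in the "expected read_format=4, got 20" line are bit masks, and decoding them shows which flag the expected value is missing. A minimal sketch, assuming the PERF_FORMAT_* bit positions from the perf_event UAPI header; they are copied into a local table here rather than imported, and decode_read_format() is a hypothetical helper, not part of the perf test harness:

```python
# Bit positions assumed to match enum perf_event_read_format in
# include/uapi/linux/perf_event.h (copied by hand for this sketch).
PERF_FORMAT = {
    "TOTAL_TIME_ENABLED": 1 << 0,
    "TOTAL_TIME_RUNNING": 1 << 1,
    "ID": 1 << 2,
    "GROUP": 1 << 3,
    "LOST": 1 << 4,
}


def decode_read_format(value):
    """Return the names of the flags set in value, joined like perf's attr dump."""
    return "|".join(name for name, bit in PERF_FORMAT.items() if value & bit)


expected = decode_read_format(4)   # ID
got = decode_read_format(20)       # ID|LOST

print("expected:", expected)
print("got:     ", got)
```

The decoded "got" value is the same ID|LOST that appears as read_format in the attr dump in the cover letter, so the difference between the two values is exactly the LOST bit.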

Hello,

On 2023/9/16 8:14, Namhyung Kim wrote:
> Now I'm seeing a perf test failure on perf-tools-next.

Uh.. the kernel I was using before didn't support PERF_FORMAT_LOST, so I
forgot to account for PERF_FORMAT_LOST. I've updated the kernel and
retested it.

The link to the fixed patch is as follows:
https://lore.kernel.org/all/20230916091641.776031-1-yangjihong1@huawei.com/

Thanks,
Yang

On Sat, Sep 16, 2023 at 2:24 AM Yang Jihong <yangjihong1@huawei.com> wrote:
> Uh.. the kernel I was using before didn't support PERF_FORMAT_LOST, so I
> forgot to account for PERF_FORMAT_LOST. I've updated the kernel and
> retested it.
>
> The link to the fixed patch is as follows:
> https://lore.kernel.org/all/20230916091641.776031-1-yangjihong1@huawei.com/

Thank you for the quick fix!

Namhyung

On Mon, Sep 04, 2023 at 02:33:34AM +0000, Yang Jihong wrote:
> Changes since v7:
>  - The condition for requiring system_wide sideband is changed to
>    "as long as a non-dummy event exists" (patch4).
>  - Modify the corresponding test case to record only the dummy event (patch6).
>  - Thanks to Ravi for the Tested-by tag; because the solution has changed,
>    the tag is not carried over to this version.
>
> Changes since v6:
>  - Patch1:
>     1. No change.
>     2. Keep Acked-by tag from Adrian.
>  - Patch2:
>     1. Update commit message as suggested by Ian.
>     2. Keep Acked-by tag from Adrian because the code is not modified.
>  - Patch3:
>     1. Update comment as suggested by Ian.
>     2. Merge original patch5 ("perf test: Update base-record & system-wide-dummy attr") as suggested by Ian.
>     3. The commits are only merged, so keep Acked-by tag from Adrian.
>  - Patch4:
>     1. No change, because Adrian recommends not changing the function name.
>     2. Keep Acked-by tag from Adrian.
>  - Patch5:
>     1. Add cleanup on trap function as suggested by Ian.
>     2. Remove Tested-by tag from Adrian because the script is modified.
>  - Patch6:
>     1. Add Reviewed-by tag from Ian.

I'm in doubt about these Acked-by/Reviewed-by tags, do they still stand? They are
not in the latest series, can you please check?

- Arnaldo

Hello,

On 2023/9/7 0:08, Arnaldo Carvalho de Melo wrote:
> I'm in doubt about these Acked-by/Reviewed-by tags, do they still stand? They are
> not in the latest series, can you please check?

Uh, uh. Because several reviewers had different opinions on the solution,
I modified it several times, so the v8 patchset is different from the
previous versions.
I kept the Acked-by/Reviewed-by tags only on patches that were not modified.
For patches whose code was modified, I removed the tags.

Therefore, please refer to the v8 series for Acked-by/Reviewed-by tags.
This version needs to be confirmed by reviewers.

Thanks,
Yang