[PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu

Adrian Hunter posted 23 patches 4 years ago
There is a newer version of this series
Posted by Adrian Hunter 4 years ago
Hi

Here are V2 patches to support capturing Intel PT sideband events such as
mmap, task, context switch, text poke etc, on every CPU even when tracing
selected user_requested_cpus.  That is, when using the perf record -C or
 --cpu option.

This is needed for:
1. text poke: a text poke on any CPU affects all CPUs
2. tracing user space: a user space process can migrate between CPUs so
mmap events that happen on a different CPU can be needed to decode a
user_requested_cpus CPU.

For example:

	Trace on CPU 1:

	perf record --kcore -C 1 -e intel_pt// &

	Start a task on CPU 0:

	taskset 0x1 testprog &

	Migrate it to CPU 1:

	taskset -p 0x2 <testprog pid>

	Stop tracing:

	kill %1

	Prior to these changes there will be errors decoding testprog
	in userspace because the comm and mmap events for testprog will not
	have been captured.
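
A note on the masks in the example: taskset takes a hexadecimal bit mask in
which bit N selects CPU N, so 0x1 is CPU 0 and 0x2 is CPU 1.  A minimal
sketch of a helper that derives the mask from a CPU number follows;
cpu_to_mask is a hypothetical name used only for illustration and is not
part of perf or taskset.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf or taskset): print the hexadecimal
# affinity mask that selects a single CPU, e.g. CPU 1 -> 0x2.
# Assumes the CPU number is small enough for shell integer arithmetic.
cpu_to_mask() {
	printf '0x%x\n' "$((1 << $1))"
}

# Usage mirroring the example above (testprog is the placeholder used there):
#   taskset "$(cpu_to_mask 0)" testprog &
#   taskset -p "$(cpu_to_mask 1)" "$!"
```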

There is quite a bit of preparation:

The first patch is a small Intel PT test for system-wide side band.  The
test fails before the patches are applied and passes afterwards.

      perf intel-pt: Add a test for system-wide side band [new in V1]

The next 5 patches stop auxtrace mixing up mmap idx between evlist and
evsel.  That is going to matter when
evlist->all_cpus != evlist->user_requested_cpus != evsel->cpus:

      libperf evsel: Factor out perf_evsel__ioctl() [now applied]
      libperf evsel: Add perf_evsel__enable_thread()
      perf evlist: Use libperf functions in evlist__enable_event_idx()
      perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
      perf auxtrace: Do not mix up mmap idx

The next 6 patches stop attempts to auxtrace mmap when it is not an
auxtrace event e.g. when mmapping the CPUs on which only sideband is
captured:

      libperf evlist: Remove ->idx() per_cpu parameter
      libperf evlist: Move ->idx() into mmap_per_evsel()
      libperf evlist: Add evsel as a parameter to ->idx()
      perf auxtrace: Record whether an auxtrace mmap is needed
      perf auxtrace: Add mmap_needed to auxtrace_mmap_params
      perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter

The next 5 patches switch to setting up dummy event maps before adding the
evsel so that the evsel is subject to map propagation, primarily to cause
addition of the evsel's CPUs to all_cpus.

      perf evlist: Factor out evlist__dummy_event()
      perf evlist: Add evlist__add_dummy_on_all_cpus()
      perf record: Use evlist__add_dummy_on_all_cpus() in record__config_text_poke()
      perf intel-pt: Use evlist__add_dummy_on_all_cpus() for switch tracking
      perf intel-pt: Track sideband system-wide when needed

The remaining patches make more significant changes.

First change from using user_requested_cpus to using all_cpus where necessary:

      perf tools: Allow all_cpus to be a superset of user_requested_cpus

Secondly, mmap all per-thread and all per-cpu events:

      libperf evlist: Allow mixing per-thread and per-cpu mmaps
      libperf evlist: Check nr_mmaps is correct [new in V1]

Stop using system_wide flag for uncore because it will not work anymore:

      perf stat: Add requires_cpu flag for uncore
      libperf evsel: Add comments for booleans [new in V1]

Finally change map propagation so that system-wide events retain their cpus and
(dummy) threads:

      perf tools: Allow system-wide events to keep their own CPUs
      perf tools: Allow system-wide events to keep their own threads


Changes in V2:

	Added some Acked-by: Ian Rogers <irogers@google.com>

      libperf evsel: Add perf_evsel__enable_thread()
	Use perf_cpu_map__for_each_cpu()

      perf auxtrace: Add mmap_needed to auxtrace_mmap_params
	Add documentation comment for mmap_needed

      perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
	Fix missing auxtrace_mmap_params__set_idx change

      libperf evlist: Check nr_mmaps is correct
	Remove unused code

      libperf evsel: Add comments for booleans
	Amend comments

      perf evlist: Add evlist__add_dummy_on_all_cpus()
	Rename evlist__add_system_wide -> evlist__add_on_all_cpus
	Changed patch subject accordingly

      perf record: Use evlist__add_dummy_on_all_cpus() in record__config_text_poke()
	Rename evlist__add_system_wide -> evlist__add_on_all_cpus
	Changed patch subject accordingly

      perf intel-pt: Use evlist__add_dummy_on_all_cpus() for switch tracking
	Rename evlist__add_system_wide -> evlist__add_on_all_cpus
	Changed patch subject accordingly

Changes in V1:

      perf intel-pt: Add a test for system-wide side band
	New patch

      libperf evsel: Factor out perf_evsel__ioctl()
	Dropped because it has been applied.

      libperf evsel: Add perf_evsel__enable_thread()
	Rename variable i -> idx

      perf auxtrace: Do not mix up mmap idx
	Rename variable cpu to cpu_map_idx

      perf tools: Allow all_cpus to be a superset of user_requested_cpus
	Add Acked-by: Ian Rogers <irogers@google.com>

      libperf evlist: Allow mixing per-thread and per-cpu mmaps
	Fix perf_evlist__nr_mmaps() calculation

      libperf evlist: Check nr_mmaps is correct
	New patch

      libperf evsel: Add comments for booleans
	New patch

      perf tools: Allow system-wide events to keep their own CPUs
      perf tools: Allow system-wide events to keep their own threads


Adrian Hunter (23):
      perf intel-pt: Add a test for system-wide side band
      libperf evsel: Add perf_evsel__enable_thread()
      perf evlist: Use libperf functions in evlist__enable_event_idx()
      perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c
      perf auxtrace: Do not mix up mmap idx
      libperf evlist: Remove ->idx() per_cpu parameter
      libperf evlist: Move ->idx() into mmap_per_evsel()
      libperf evlist: Add evsel as a parameter to ->idx()
      perf auxtrace: Record whether an auxtrace mmap is needed
      perf auxtrace: Add mmap_needed to auxtrace_mmap_params
      perf auxtrace: Remove auxtrace_mmap_params__set_idx() per_cpu parameter
      perf evlist: Factor out evlist__dummy_event()
      perf evlist: Add evlist__add_dummy_on_all_cpus()
      perf record: Use evlist__add_dummy_on_all_cpus() in record__config_text_poke()
      perf intel-pt: Use evlist__add_dummy_on_all_cpus() for switch tracking
      perf intel-pt: Track sideband system-wide when needed
      perf tools: Allow all_cpus to be a superset of user_requested_cpus
      libperf evlist: Allow mixing per-thread and per-cpu mmaps
      libperf evlist: Check nr_mmaps is correct
      perf stat: Add requires_cpu flag for uncore
      libperf evsel: Add comments for booleans
      perf tools: Allow system-wide events to keep their own CPUs
      perf tools: Allow system-wide events to keep their own threads

 tools/lib/perf/evlist.c                  |  80 ++++++++++-------------
 tools/lib/perf/evsel.c                   |  15 +++++
 tools/lib/perf/include/internal/evlist.h |   3 +-
 tools/lib/perf/include/internal/evsel.h  |  10 +++
 tools/lib/perf/include/perf/evsel.h      |   1 +
 tools/perf/arch/arm/util/cs-etm.c        |   1 +
 tools/perf/arch/arm64/util/arm-spe.c     |   1 +
 tools/perf/arch/s390/util/auxtrace.c     |   1 +
 tools/perf/arch/x86/util/intel-bts.c     |   1 +
 tools/perf/arch/x86/util/intel-pt.c      |  32 ++++------
 tools/perf/builtin-record.c              |  39 +++++-------
 tools/perf/builtin-stat.c                |   5 +-
 tools/perf/tests/shell/test_intel_pt.sh  |  71 +++++++++++++++++++++
 tools/perf/util/auxtrace.c               |  31 +++++++--
 tools/perf/util/auxtrace.h               |  10 +--
 tools/perf/util/evlist.c                 | 106 +++++++++++++++----------------
 tools/perf/util/evlist.h                 |   7 +-
 tools/perf/util/evsel.c                  |   1 +
 tools/perf/util/evsel.h                  |   1 +
 tools/perf/util/mmap.c                   |   4 +-
 tools/perf/util/parse-events.c           |   2 +-
 21 files changed, 261 insertions(+), 161 deletions(-)
 create mode 100755 tools/perf/tests/shell/test_intel_pt.sh


Regards
Adrian
Re: [PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu
Posted by Leo Yan 4 years ago
Hi Adrian,

On Fri, May 06, 2022 at 03:25:38PM +0300, Adrian Hunter wrote:
> Hi
> 
> Here are V2 patches to support capturing Intel PT sideband events such as
> mmap, task, context switch, text poke etc, on every CPU even when tracing
> selected user_requested_cpus.  That is, when using the perf record -C or
>  --cpu option.
> 
> This is needed for:
> 1. text poke: a text poke on any CPU affects all CPUs
> 2. tracing user space: a user space process can migrate between CPUs so
> mmap events that happen on a different CPU can be needed to decode a
> user_requested_cpus CPU.
> 
> For example:
> 
> 	Trace on CPU 1:
> 
> 	perf record --kcore -C 1 -e intel_pt// &
> 
> 	Start a task on CPU 0:
> 
> 	taskset 0x1 testprog &
> 
> 	Migrate it to CPU 1:
> 
> 	taskset -p 0x2 <testprog pid>
> 
> 	Stop tracing:
> 
> 	kill %1
> 
> 	Prior to these changes there will be errors decoding testprog
> 	in userspace because the comm and mmap events for testprog will not
> 	have been captured.

Thanks a lot for this patch set, I believe this is a common issue for
AUX trace (not only for Intel-PT), so I verified this patch set for both
Arm CoreSight and SPE; unfortunately both cannot see MMAP events for
migrated task.  I used below commands:

  # perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
  # perf script  --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
  0


  # perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
  # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
  0

I didn't dive into the details of this patch set, so I cannot say whether
the failure is caused by an issue in it.  But we definitely need to look
into this on Arm platforms to find the root cause of why MMAP events are
not recorded properly when tasks migrate.  Looping in James and German for
this reason.

Thanks,
Leo
Re: [PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu
Posted by Adrian Hunter 4 years ago
On 8/05/22 18:08, Leo Yan wrote:
> Hi Adrian,
> 
> On Fri, May 06, 2022 at 03:25:38PM +0300, Adrian Hunter wrote:
>> Hi
>>
>> Here are V2 patches to support capturing Intel PT sideband events such as
>> mmap, task, context switch, text poke etc, on every CPU even when tracing
>> selected user_requested_cpus.  That is, when using the perf record -C or
>>  --cpu option.
>>
>> This is needed for:
>> 1. text poke: a text poke on any CPU affects all CPUs
>> 2. tracing user space: a user space process can migrate between CPUs so
>> mmap events that happen on a different CPU can be needed to decode a
>> user_requested_cpus CPU.
>>
>> For example:
>>
>> 	Trace on CPU 1:
>>
>> 	perf record --kcore -C 1 -e intel_pt// &
>>
>> 	Start a task on CPU 0:
>>
>> 	taskset 0x1 testprog &
>>
>> 	Migrate it to CPU 1:
>>
>> 	taskset -p 0x2 <testprog pid>
>>
>> 	Stop tracing:
>>
>> 	kill %1
>>
>> 	Prior to these changes there will be errors decoding testprog
>> 	in userspace because the comm and mmap events for testprog will not
>> 	have been captured.
> 
> Thanks a lot for this patch set, I believe this is a common issue for
> AUX trace (not only for Intel-PT), so I verified this patch set for both
> Arm CoreSight and SPE; unfortunately both cannot see MMAP events for
> migrated task.  I used below commands:
> 
>   # perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
>   # perf script  --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
>   0
> 
> 
>   # perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
>   # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
>   0
> 
> I didn't dive into the details of this patch set, so I cannot say whether
> the failure is caused by an issue in it.  But we definitely need to look
> into this on Arm platforms to find the root cause of why MMAP events are
> not recorded properly when tasks migrate.  Looping in James and German for
> this reason.

You would need the equivalent of patch "perf intel-pt: Track sideband
system-wide when needed" which makes use of new helper
evlist__add_aux_dummy() to set up the dummy event with the option to
make it "system wide".

cs_etm_recording_options() and arm_spe_recording_options() have similar
code.

You will need to decide if it is worth the extra sideband.  I decided
if it became an issue, it could be made optional in the future.
Re: [PATCH V2 00/23] perf intel-pt: Better support for perf record --cpu
Posted by Leo Yan 4 years ago
On Mon, May 09, 2022 at 08:44:02AM +0300, Adrian Hunter wrote:

[...]

> > Thanks a lot for this patch set, I believe this is a common issue for
> > AUX trace (not only for Intel-PT), so I verified this patch set for both
> > Arm CoreSight and SPE; unfortunately both cannot see MMAP events for
> > migrated task.  I used below commands:
> > 
> >   # perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
> >   # perf script  --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
> >   0
> > 
> > 
> >   # perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
> >   # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
> >   0
> > 
> > I didn't dive into the details of this patch set, so I cannot say whether
> > the failure is caused by an issue in it.  But we definitely need to look
> > into this on Arm platforms to find the root cause of why MMAP events are
> > not recorded properly when tasks migrate.  Looping in James and German for
> > this reason.
> 
> You would need the equivalent of patch "perf intel-pt: Track sideband
> system-wide when needed" which makes use of new helper
> evlist__add_aux_dummy() to set up the dummy event with the option to
> make it "system wide".
> 
> cs_etm_recording_options() and arm_spe_recording_options() have similar
> code.

Thanks a lot for the guidance.

I applied a similar change to cs_etm_recording_options() and
arm_spe_recording_options(); both now pass the tests below:

  # perf record -B -N --no-bpf-event -e cs_etm//u -C 0 -- taskset --cpu-list 1 uname
  # perf script  --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
  4

  # perf record -B -N --no-bpf-event -e arm_spe_0//u -C 0 -- taskset --cpu-list 1 uname
  # perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | grep MMAP | wc -l
  4
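
The "grep MMAP | wc -l" check above could be folded into a small helper so
that a script can act on the count.  This is only a sketch:
count_mmap_events is a hypothetical name, and it simply counts lines on
stdin that contain "MMAP", exactly as the pipeline above does.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that mention MMAP.
count_mmap_events() {
	# grep -c prints 0 but exits non-zero when nothing matches,
	# so its exit status is deliberately ignored here.
	grep -c MMAP || true
}

# Usage with the commands above:
#   n=$(perf script --no-itrace --show-mmap-events -C 1 2>/dev/null | count_mmap_events)
#   [ "$n" -gt 0 ] || echo "no MMAP events recorded for CPU 1"
```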

I also tested a more complex case that migrates a test program, 'sysbench',
in the middle of a perf session; perf still fails to parse any samples for
'sysbench'.  I need to do more homework on this part, but any suggestions
are welcome, thanks!  The testing script is:

---8<---

export PATH=/mnt/export/arm-linux-kernel/tools/perf/:$PATH

perf record --kcore -C 1 -e cs_etm// &
PERF_PID=$!
echo "Perf PID ${PERF_PID}"

sleep 2

taskset 0x1 ./sysbench --test=memory --max-requests=1000000000 run &
TEST_PROG_PID=$!
echo "Test Prog PID ${TEST_PROG_PID}"

sleep 1

taskset -p 0x2 $TEST_PROG_PID

sleep 1

kill $PERF_PID
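
One detail of the script above: it returns right after sending the signal,
so anything run immediately afterwards may start before perf has finished
writing its output.  A sketch of a helper that also waits for the process
to exit is shown here; stop_and_wait is a hypothetical name, and this is
not offered as a diagnosis of the decoding failure.

```shell
#!/bin/sh
# Hypothetical helper: send SIGTERM to a background job started by this
# shell and block until it has exited.
stop_and_wait() {
	kill "$1" 2>/dev/null
	# wait only works for children of the current shell; its exit status
	# reflects the signal, so it is ignored here.
	wait "$1" 2>/dev/null
	return 0
}

# Usage in the script above, replacing the final kill:
#   stop_and_wait "$PERF_PID"
#   stop_and_wait "$TEST_PROG_PID"
```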

> You will need to decide if it is worth the extra sideband.  I decided
> if it became an issue, it could be made optional in the future.

Yeah, the condition checking for system wide tracking in patch 16/23
looks good to me.

Thanks,
Leo