Lazily load PMU data from both sysfs and json files. Reorganize the
json data to be more PMU oriented to facilitate this; for example,
json events are now sorted into one array per PMU.
In refactoring the code some changes were made to get rid of maximum
encoding sizes for events (256 bytes), with input files being directly
passed to the lex generated code. There is also a small event parse
error message improvement.
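
As a minimal sketch of the load-on-first-use idea described above (not the code from the series): struct sketch_pmu, its fields and both functions are hypothetical stand-ins, and the "expensive" step is only simulated rather than reading sysfs or json tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PMU; this is not the real struct perf_pmu. */
struct sketch_pmu {
	const char *name;
	bool formats_loaded;	/* set once the expensive load has run */
	int nr_formats;		/* placeholder for the loaded data */
	int load_count;		/* how often the loader ran, for illustration */
};

/* Placeholder for the expensive step, e.g. opening and parsing files. */
static void sketch_pmu__load_formats(struct sketch_pmu *pmu)
{
	pmu->nr_formats = 3;	/* made-up value */
	pmu->load_count++;
	pmu->formats_loaded = true;
}

/* Accessor: loads on first use and returns the cached data afterwards. */
static int sketch_pmu__nr_formats(struct sketch_pmu *pmu)
{
	assert(pmu != NULL);
	if (!pmu->formats_loaded)
		sketch_pmu__load_formats(pmu);
	return pmu->nr_formats;
}
```

With this shape a PMU that is created but never queried pays nothing for its formats, and repeated queries run the loader only once. That is the kind of saving the reduced open call counts below would be consistent with.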
Some results from an Intel tigerlake laptop running Debian:
Binary size reduction of 5.3% or 552,864 bytes because the PMU
name no longer appears in the string or desc field.
stat -e cpu/cycles/ minor faults reduced from 1733 to 1667, open calls reduced
from 171 to 94.
stat default minor faults reduced from 1805 to 1717, open calls reduced
from 654 to 343.
Average PMU scanning reduced from 4720.641usec to 2927.293usec.
Average core PMU scanning reduced from 1004.658usec to 232.668usec
(4.3x faster).
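
The speedup figures follow from dividing the old time by the new one. The helper below is only an illustrative check of that arithmetic; the function name is an assumption and it is not part of perf.

```c
#include <assert.h>

/* Ratio of the old time to the new time; a value above 1 means faster. */
static double speedup(double before_usec, double after_usec)
{
	assert(after_usec > 0.0);
	return before_usec / after_usec;
}
```

speedup(1004.658, 232.668) is roughly 4.32, matching the quoted 4.3x for core PMU scanning, and speedup(4720.641, 2927.293) is roughly 1.61 for scanning all PMUs.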
v2: Add error path for failing strdup when allocating a format,
suggested by Arnaldo. Rebased on top of tmp.perf-tools-next
removing 8 patches. Added "perf jevents: Don't append Unit to
desc" to save yet more encoding json event space.
Ian Rogers (18):
perf pmu: Make the loading of formats lazy
perf pmu: Abstract alias/event struct
perf pmu-events: Add extra underscore to function names
perf jevents: Group events by PMU
perf parse-events: Improve error message for double setting
perf s390 s390_cpumcfdg_dump: Don't scan all PMUs
perf pmu-events: Reduce processed events by passing PMU
perf pmu-events: Add pmu_events_table__find_event
perf pmu: Parse sysfs events directly from a file
perf pmu: Prefer passing pmu to aliases list
perf pmu: Merge json events with sysfs at load time
perf pmu: Cache json events table
perf pmu: Lazily add json events
perf pmu: Scan type early to fail an invalid PMU quickly
perf pmu: Be lazy about loading event info files from sysfs
perf pmu: Lazily load sysfs aliases
perf jevents: Sort strings in the big C string to reduce faults
perf jevents: Don't append Unit to desc
tools/perf/arch/x86/util/intel-pt.c | 2 +-
tools/perf/bench/pmu-scan.c | 8 +-
tools/perf/builtin-list.c | 13 +-
tools/perf/pmu-events/empty-pmu-events.c | 49 +-
tools/perf/pmu-events/jevents.py | 312 +++++++--
tools/perf/pmu-events/pmu-events.h | 15 +-
tools/perf/tests/parse-events.c | 2 +-
tools/perf/tests/pmu-events.c | 148 +++--
tools/perf/tests/pmu.c | 2 +-
tools/perf/util/metricgroup.c | 10 +-
tools/perf/util/parse-events.c | 87 ++-
tools/perf/util/parse-events.h | 3 +-
tools/perf/util/pmu.c | 806 +++++++++++++++--------
tools/perf/util/pmu.h | 96 ++-
tools/perf/util/pmu.y | 20 +-
tools/perf/util/pmus.c | 230 +++----
tools/perf/util/s390-sample-raw.c | 50 +-
17 files changed, 1141 insertions(+), 712 deletions(-)
--
2.42.0.rc1.204.g551eb34607-goog
Em Wed, Aug 23, 2023 at 09:13:12PM -0700, Ian Rogers escreveu:
> Lazily load PMU data from both sysfs and json files. Reorganize the
> json data to be more PMU oriented to facilitate this; for example,
> json events are now sorted into one array per PMU.
>
> In refactoring the code some changes were made to get rid of maximum
> encoding sizes for events (256 bytes), with input files being directly
> passed to the lex generated code. There is also a small event parse
> error message improvement.
>
> Some results from an Intel tigerlake laptop running Debian:
>
> Binary size reduction of 5.3% or 552,864 bytes because the PMU
> name no longer appears in the string or desc field.
>
> stat -e cpu/cycles/ minor faults reduced from 1733 to 1667, open calls reduced
> from 171 to 94.
>
> stat default minor faults reduced from 1805 to 1717, open calls reduced
> from 654 to 343.
>
> Average PMU scanning reduced from 4720.641usec to 2927.293usec.
> Average core PMU scanning reduced from 1004.658usec to 232.668usec
> (4.3x faster).
>
> v2: Add error path for failing strdup when allocating a format,
> suggested by Arnaldo. Rebased on top of tmp.perf-tools-next
> removing 8 patches. Added "perf jevents: Don't append Unit to
> desc" to save yet more encoding json event space.
So this is failing here:
[acme@quaco ~]$ perf test 10
10: PMU events :
10.1: PMU event table sanity : FAILED!
10.2: PMU event map aliases : FAILED!
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs: Ok
10.5: Parsing of metric thresholds with fake PMUs : Ok
[acme@quaco ~]$
[root@quaco ~]# grep -m1 "model name" /proc/cpuinfo
model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
[root@quaco ~]#
[root@quaco ~]# perf test -vv -F 10 |& head -40
10: PMU events :
10.1: PMU event table sanity :
--- start ---
testing event table bp_l1_btb_correct: pass
testing event table bp_l2_btb_correct: pass
testing event table dispatch_blocked.any: pass
testing event table eist_trans: pass
testing event table l3_cache_rd: pass
testing event table segment_reg_loads.any: pass
testing event e1 uncore_hisi_ddrc.flux_wcmd: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
Strange:
if (!is_same(e1->desc, e2->desc)) {
pr_debug2("testing event e1 %s: mismatched desc, %s vs %s\n",
e1->name, e1->desc, e2->desc);
return -1;
}
Adding "" around those descs:
testing event e1 uncore_hisi_ddrc.flux_wcmd: mismatched desc, "DDRC write commands" vs "DDRC write commands. Unit: hisi_sccl,ddrc"
I see, it's the last patch; with it removed the tests pass. Please take a
look at tmp.perf-tools-next
- Arnaldo
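
To illustrate why the quoted check fails, here is a minimal sketch assuming is_same() behaves as a NULL-tolerant exact string comparison. Both helpers are hypothetical and are not copied from tools/perf/tests/pmu-events.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed behaviour of is_same(): exact match, with NULL equal only to NULL. */
static bool sketch_is_same(const char *reference, const char *test)
{
	if (!reference || !test)
		return reference == test;
	return strcmp(reference, test) == 0;
}

/* True if a description still carries the ". Unit: ..." text. */
static bool sketch_has_unit_text(const char *desc)
{
	return desc != NULL && strstr(desc, ". Unit: ") != NULL;
}
```

Under that assumption, "DDRC write commands" and "DDRC write commands. Unit: hisi_sccl,ddrc" compare as different because only one of them still contains the Unit text, so the comparison fails even though the event itself matches.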
---- end ----
PMU events subtest 1: FAILED!
10.2: PMU event map aliases :
--- start ---
Using CPUID GenuineIntel-6-8E-A
testing aliases core PMU cpu: matched event bp_l1_btb_correct
testing aliases core PMU cpu: matched event bp_l2_btb_correct
testing aliases core PMU cpu: matched event segment_reg_loads.any
testing aliases core PMU cpu: matched event dispatch_blocked.any
testing aliases core PMU cpu: matched event eist_trans
testing aliases core PMU cpu: matched event l3_cache_rd
testing core PMU cpu aliases: pass
testing aliases PMU hisi_sccl1_ddrc2: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
testing aliases uncore PMU hisi_sccl1_ddrc2: could not match alias uncore_hisi_ddrc.flux_wcmd
---- end ----
PMU events subtest 2: FAILED!
10.3: Parsing of PMU event table metrics :
--- start ---
Found metric 'CPI'
metric expr 1 / IPC for CPI
parsing metric: 1 / IPC
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
parsing metric: inst_retired.any / cpu_clk_unhalted.thread
found event inst_retired.any
found event cpu_clk_unhalted.thread
Parsing metric events '{inst_retired.any/metric-id=inst_retired.any/,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
Attempting to add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
After aliases, add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
Attempting to add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
After aliases, add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
[root@quaco ~]#
Trying on an AMD 5950x:
[root@five ~]# perf test -F -vv 10 |& head -40
10: PMU events :
10.1: PMU event table sanity :
--- start ---
testing event table bp_l1_btb_correct: pass
testing event table bp_l2_btb_correct: pass
testing event table dispatch_blocked.any: pass
testing event table eist_trans: pass
testing event table l3_cache_rd: pass
testing event table segment_reg_loads.any: pass
testing event e1 uncore_hisi_ddrc.flux_wcmd: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
---- end ----
PMU events subtest 1: FAILED!
10.2: PMU event map aliases :
--- start ---
Using CPUID AuthenticAMD-25-21-0
testing aliases core PMU cpu: matched event bp_l1_btb_correct
testing aliases core PMU cpu: matched event bp_l2_btb_correct
testing aliases core PMU cpu: matched event segment_reg_loads.any
testing aliases core PMU cpu: matched event dispatch_blocked.any
testing aliases core PMU cpu: matched event eist_trans
testing aliases core PMU cpu: matched event l3_cache_rd
testing core PMU cpu aliases: pass
testing aliases PMU hisi_sccl1_ddrc2: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
testing aliases uncore PMU hisi_sccl1_ddrc2: could not match alias uncore_hisi_ddrc.flux_wcmd
---- end ----
PMU events subtest 2: FAILED!
10.3: Parsing of PMU event table metrics :
--- start ---
Found metric 'CPI'
metric expr 1 / IPC for CPI
parsing metric: 1 / IPC
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
parsing metric: inst_retired.any / cpu_clk_unhalted.thread
found event inst_retired.any
found event cpu_clk_unhalted.thread
Parsing metric events '{inst_retired.any/metric-id=inst_retired.any/,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
Attempting to add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
After aliases, add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
Attempting to add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
After aliases, add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
[root@five ~]#
On Thu, Aug 24, 2023 at 7:52 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Wed, Aug 23, 2023 at 09:13:12PM -0700, Ian Rogers escreveu:
> > Lazily load PMU data from both sysfs and json files. Reorganize the
> > json data to be more PMU oriented to facilitate this; for example,
> > json events are now sorted into one array per PMU.
> >
> > In refactoring the code some changes were made to get rid of maximum
> > encoding sizes for events (256 bytes), with input files being directly
> > passed to the lex generated code. There is also a small event parse
> > error message improvement.
> >
> > Some results from an Intel tigerlake laptop running Debian:
> >
> > Binary size reduction of 5.3% or 552,864 bytes because the PMU
> > name no longer appears in the string or desc field.
> >
> > stat -e cpu/cycles/ minor faults reduced from 1733 to 1667, open calls reduced
> > from 171 to 94.
> >
> > stat default minor faults reduced from 1805 to 1717, open calls reduced
> > from 654 to 343.
> >
> > Average PMU scanning reduced from 4720.641usec to 2927.293usec.
> > Average core PMU scanning reduced from 1004.658usec to 232.668usec
> > (4.3x faster).
> >
> > v2: Add error path for failing strdup when allocating a format,
> > suggested by Arnaldo. Rebased on top of tmp.perf-tools-next
> > removing 8 patches. Added "perf jevents: Don't append Unit to
> > desc" to save yet more encoding json event space.
>
> So this is failing here:
>
> [acme@quaco ~]$ perf test 10
> 10: PMU events :
> 10.1: PMU event table sanity : FAILED!
> 10.2: PMU event map aliases : FAILED!
> 10.3: Parsing of PMU event table metrics : Ok
> 10.4: Parsing of PMU event table metrics with fake PMUs: Ok
> 10.5: Parsing of metric thresholds with fake PMUs : Ok
> [acme@quaco ~]$
>
> [root@quaco ~]# grep -m1 "model name" /proc/cpuinfo
> model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
> [root@quaco ~]#
>
>
> [root@quaco ~]# perf test -vv -F 10 |& head -40
> 10: PMU events :
> 10.1: PMU event table sanity :
> --- start ---
> testing event table bp_l1_btb_correct: pass
> testing event table bp_l2_btb_correct: pass
> testing event table dispatch_blocked.any: pass
> testing event table eist_trans: pass
> testing event table l3_cache_rd: pass
> testing event table segment_reg_loads.any: pass
> testing event e1 uncore_hisi_ddrc.flux_wcmd: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
>
>
> Strange:
>
> if (!is_same(e1->desc, e2->desc)) {
> pr_debug2("testing event e1 %s: mismatched desc, %s vs %s\n",
> e1->name, e1->desc, e2->desc);
> return -1;
> }
>
> Adding "" around those descs:
>
> testing event e1 uncore_hisi_ddrc.flux_wcmd: mismatched desc, "DDRC write commands" vs "DDRC write commands. Unit: hisi_sccl,ddrc"
>
> I see, it's the last patch; with it removed the tests pass. Please take a
> look at tmp.perf-tools-next
>
> - Arnaldo
Thanks, I'll address the issue (hardcoded assumption on jevents.py
output) and resend the patch.
Ian
> ---- end ----
> PMU events subtest 1: FAILED!
> 10.2: PMU event map aliases :
> --- start ---
> Using CPUID GenuineIntel-6-8E-A
> testing aliases core PMU cpu: matched event bp_l1_btb_correct
> testing aliases core PMU cpu: matched event bp_l2_btb_correct
> testing aliases core PMU cpu: matched event segment_reg_loads.any
> testing aliases core PMU cpu: matched event dispatch_blocked.any
> testing aliases core PMU cpu: matched event eist_trans
> testing aliases core PMU cpu: matched event l3_cache_rd
> testing core PMU cpu aliases: pass
> testing aliases PMU hisi_sccl1_ddrc2: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
> testing aliases uncore PMU hisi_sccl1_ddrc2: could not match alias uncore_hisi_ddrc.flux_wcmd
> ---- end ----
> PMU events subtest 2: FAILED!
> 10.3: Parsing of PMU event table metrics :
> --- start ---
> Found metric 'CPI'
> metric expr 1 / IPC for CPI
> parsing metric: 1 / IPC
> metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
> parsing metric: inst_retired.any / cpu_clk_unhalted.thread
> found event inst_retired.any
> found event cpu_clk_unhalted.thread
> Parsing metric events '{inst_retired.any/metric-id=inst_retired.any/,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
> Attempting to add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
> After aliases, add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
> Attempting to add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
> After aliases, add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
> [root@quaco ~]#
>
> Trying on an AMD 5950x:
>
> [root@five ~]# perf test -F -vv 10 |& head -40
> 10: PMU events :
> 10.1: PMU event table sanity :
> --- start ---
> testing event table bp_l1_btb_correct: pass
> testing event table bp_l2_btb_correct: pass
> testing event table dispatch_blocked.any: pass
> testing event table eist_trans: pass
> testing event table l3_cache_rd: pass
> testing event table segment_reg_loads.any: pass
> testing event e1 uncore_hisi_ddrc.flux_wcmd: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
> ---- end ----
> PMU events subtest 1: FAILED!
> 10.2: PMU event map aliases :
> --- start ---
> Using CPUID AuthenticAMD-25-21-0
> testing aliases core PMU cpu: matched event bp_l1_btb_correct
> testing aliases core PMU cpu: matched event bp_l2_btb_correct
> testing aliases core PMU cpu: matched event segment_reg_loads.any
> testing aliases core PMU cpu: matched event dispatch_blocked.any
> testing aliases core PMU cpu: matched event eist_trans
> testing aliases core PMU cpu: matched event l3_cache_rd
> testing core PMU cpu aliases: pass
> testing aliases PMU hisi_sccl1_ddrc2: mismatched desc, DDRC write commands vs DDRC write commands. Unit: hisi_sccl,ddrc
> testing aliases uncore PMU hisi_sccl1_ddrc2: could not match alias uncore_hisi_ddrc.flux_wcmd
> ---- end ----
> PMU events subtest 2: FAILED!
> 10.3: Parsing of PMU event table metrics :
> --- start ---
> Found metric 'CPI'
> metric expr 1 / IPC for CPI
> parsing metric: 1 / IPC
> metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
> parsing metric: inst_retired.any / cpu_clk_unhalted.thread
> found event inst_retired.any
> found event cpu_clk_unhalted.thread
> Parsing metric events '{inst_retired.any/metric-id=inst_retired.any/,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
> Attempting to add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
> After aliases, add event pmu 'inst_retired.any' with '(null),' that may result in non-fatal errors
> Attempting to add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
> After aliases, add event pmu 'cpu_clk_unhalted.thread' with '(null),' that may result in non-fatal errors
> [root@five ~]#