At the RISC-V summit the topic of avoiding event data being in the
RISC-V PMU kernel driver came up. There is a preference for sysfs/JSON
events being the priority when no PMU is provided so that legacy
events may be supported via JSON. Originally Mark Rutland also
expressed at LPC 2023 that doing this would resolve bugs on ARM Apple
M? processors, but James Clark more recently tested this and believes
the driver issues there may not have existed or have been resolved. In
any case, it is inconsistent that event names given with a PMU avoid
legacy encodings, but when wildcarding PMUs (i.e. giving the event name
without a PMU) the legacy encodings have priority.
The patch doing this work was reverted in a v6.10 release candidate
as, even though the patch was posted for weeks and had been on
linux-next for weeks without issue, Linus was in the habit of using
explicit legacy events with unsupported precision options on his
Neoverse-N1. This machine has SLC PMU events for bus and CPU cycles
where ARM decided to call the events bus_cycles and cycles, the latter
being also a legacy event name. ARM haven't renamed the cycles event
to a more consistent cpu_cycles, which would have avoided the problem. With these
changes the problematic event will now be skipped, a large warning
produced, and perf record will continue for the other PMU events. This
solution was proposed by Arnaldo.
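A minimal sketch of the "skip, don't fail" idea described above. This is not the code in builtin-record.c: `struct sketch_event`, its `can_open` flag and `open_or_skip()` are hypothetical names used only for illustration, and the special handling of dummy events discussed in the changelog is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for an event; this is not perf's struct evsel. */
struct sketch_event {
	const char *name;
	bool can_open;	/* pretend outcome of opening the event */
};

/*
 * Try every event. An event that fails to open is warned about and dropped
 * rather than aborting the whole run. The surviving events are compacted to
 * the front of the array.
 *
 * Returns the number of events kept, or -1 if no event opened at all.
 */
static int open_or_skip(struct sketch_event *events, size_t nr)
{
	size_t kept = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!events[i].can_open) {
			fprintf(stderr,
				"WARNING: skipping event '%s' that failed to open\n",
				events[i].name);
			continue;
		}
		events[kept++] = events[i];
	}
	return kept ? (int)kept : -1;
}
```

With one failing and one working event the function returns 1 and recording would carry on with the working event; with only failing events it returns -1, matching the v3 change that makes "no events opened" an error.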
v6: Rebase of v5 (dropping already merged patches):
https://lore.kernel.org/lkml/20250109222109.567031-1-irogers@google.com/
that unusually had an RFC posted for it:
https://lore.kernel.org/lkml/Z7Z5kv75BMML2A1q@google.com/
Note, this patch conflicts/contradicts:
https://lore.kernel.org/lkml/20250312211623.2495798-1-irogers@google.com/
that I posted so that we could either consistently prioritize
sysfs/json (these patches) or legacy events (the other
patches). That lack of consistency in event printing and encoding
is most prominent in the encoding of events like "instructions"
which on hybrid are reported as "cpu_core/instructions/" but
"instructions" before these patches gets a legacy encoding while
"cpu_core/instructions/" gets a sysfs/json encoding. These patches
make "instructions" always get a sysfs/json encoding while the
alternate patches make it always get a legacy encoding.
v5: Follow Namhyung's suggestion and ignore the case where command
line dummy events fail to open alongside other events that all
fail to open. Note, the Tested-by tags are left on the series as
v4 and v5 were changing an error case that doesn't occur in
testing but was manually tested by myself.
v4: Rework the no events opening change from v3 to make it handle
multiple dummy events. Sadly an evlist isn't empty if it just
contains dummy events as the dummy event may be used with "perf
record -e dummy .." as a way to determine whether permission
issues exist. Other software events like cpu-clock would suffice
for this, but the genie of using dummy has left the bottle.
Another problem is that we appear to have an excessive number of
dummy events added, for example, we can likely avoid a dummy event
and add sideband data to the original event. For auxtrace more
dummy events may be opened too. Anyway, this has led to the
approach taken in patch 3 where the number of dummy parsed events
is computed. If the number of removed/failing-to-open non-dummy
events matches the number of non-dummy events then we want to
fail, but only if there are no parsed dummy events or if there was
one then it must have opened. The math here is hard to read, but
passes my manual testing.
v3: Make no events opening for perf record a failure as suggested by
James Clark and Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>. Also,
rebase.
v2: Rebase and add tested-by tags from James Clark, Leo Yan and Atish
Patra who have tested on RISC-V and ARM CPUs, including the
problem case from before.
Ian Rogers (2):
perf record: Skip don't fail for events that don't open
perf parse-events: Reapply "Prefer sysfs/JSON hardware events over
legacy"
tools/perf/builtin-record.c | 47 ++++++++++++++++++---
tools/perf/util/parse-events.c | 26 +++++++++---
tools/perf/util/parse-events.l | 76 +++++++++++++++++-----------------
tools/perf/util/parse-events.y | 60 ++++++++++++++++++---------
4 files changed, 139 insertions(+), 70 deletions(-)
--
2.49.0.395.g12beb8f557-goog
On Mon, Mar 24, 2025 at 9:46 AM Ian Rogers <irogers@google.com> wrote:
>
> At the RISC-V summit the topic of avoiding event data being in the
> RISC-V PMU kernel driver came up. There is a preference for sysfs/JSON
> events being the priority when no PMU is provided so that legacy
> events maybe supported via json. Originally Mark Rutland also
> expressed at LPC 2023 that doing this would resolve bugs on ARM Apple
> M? processors, but James Clark more recently tested this and believes
> the driver issues there may not have existed or have been resolved. In
> any case, it is inconsistent that with a PMU event names avoid legacy
> encodings, but when wildcarding PMUs (ie without a PMU with the event
> name) the legacy encodings have priority.
>
> The patch doing this work was reverted in a v6.10 release candidate
> as, even though the patch was posted for weeks and had been on
> linux-next for weeks without issue, Linus was in the habit of using
> explicit legacy events with unsupported precision options on his
> Neoverse-N1. This machine has SLC PMU events for bus and CPU cycles
> where ARM decided to call the events bus_cycles and cycles, the latter
> being also a legacy event name. ARM haven't renamed the cycles event
> to a more consistent cpu_cycles and avoided the problem. With these
> changes the problematic event will now be skipped, a large warning
> produced, and perf record will continue for the other PMU events. This
> solution was proposed by Arnaldo.
>
> v6: Rebase of v5 (dropping already merged patches):
> https://lore.kernel.org/lkml/20250109222109.567031-1-irogers@google.com/
> that unusually had an RFC posted for it:
> https://lore.kernel.org/lkml/Z7Z5kv75BMML2A1q@google.com/
> Note, this patch conflicts/contradicts:
> https://lore.kernel.org/lkml/20250312211623.2495798-1-irogers@google.com/
> that I posted so that we could either consistently prioritize
> sysfs/json (these patches) or legacy events (the other
> patches). That lack of event printing and encoding inconsistency
> is most prominent in the encoding of events like "instructions"
> which on hybrid are reported as "cpu_core/instructions/" but
> "instructions" before these patches gets a legacy encoding while
> "cpu_core/instructions/" gets a sysfs/json encoding. These patches
> make "instructions" always get a sysfs/json encoding while the
> alternate patches make it always get a legacy encoding.
So another fun finding. Sysfs and json events are case insensitive:
```
$ perf stat -e 'inst_retired.any,INST_RETIRED.ANY' true
Performance counter stats for 'true':
129,134 cpu_atom/inst_retired.any:u/
<not counted> cpu_core/inst_retired.any:u/
(0.00%)
129,134 cpu_atom/INST_RETIRED.ANY:u/
<not counted> cpu_core/INST_RETIRED.ANY:u/
(0.00%)
0.002193191 seconds time elapsed
0.002354000 seconds user
0.000000000 seconds sys
```
But legacy events are matched in lex code that is case sensitive. This means
(on x86) the event 'instructions' is currently legacy, but the event
'INSTRUCTIONS' is a sysfs event. The event CYCLES is a parse error as
there is no sysfs/json version. Given legacy events don't follow the
case insensitivity norm this is more evidence we need to reduce their
priority by merging these patches.
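The lookup order behind this can be sketched roughly as follows. The two name tables are small illustrative subsets chosen to mirror the x86 example above, and `resolve_current()` / `resolve_proposed()` are hypothetical helpers, not functions in perf. The sketch assumes legacy names stay case sensitive after the patches and only lose priority.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>	/* strcasecmp() */

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum sketch_source { SRC_LEGACY, SRC_SYSFS_JSON, SRC_PARSE_ERROR };

/* Illustrative subsets only; the real tables are much larger. */
static const char *const legacy_names[] = { "cycles", "instructions" };
static const char *const sysfs_json_names[] = { "instructions", "inst_retired.any" };

/* Returns non-zero if name is in the list according to cmp(). */
static int in_list(const char *const *names, size_t nr, const char *name,
		   int (*cmp)(const char *, const char *))
{
	for (size_t i = 0; i < nr; i++)
		if (!cmp(names[i], name))
			return 1;
	return 0;
}

/* Current order: case-sensitive legacy match first, then sysfs/JSON. */
static enum sketch_source resolve_current(const char *name)
{
	if (in_list(legacy_names, ARRAY_SIZE(legacy_names), name, strcmp))
		return SRC_LEGACY;
	if (in_list(sysfs_json_names, ARRAY_SIZE(sysfs_json_names), name, strcasecmp))
		return SRC_SYSFS_JSON;
	return SRC_PARSE_ERROR;
}

/* Proposed order: case-insensitive sysfs/JSON first, legacy as fallback. */
static enum sketch_source resolve_proposed(const char *name)
{
	if (in_list(sysfs_json_names, ARRAY_SIZE(sysfs_json_names), name, strcasecmp))
		return SRC_SYSFS_JSON;
	if (in_list(legacy_names, ARRAY_SIZE(legacy_names), name, strcmp))
		return SRC_LEGACY;
	return SRC_PARSE_ERROR;
}
```

Under these assumptions `resolve_current()` sends "instructions" to the legacy table, "INSTRUCTIONS" to sysfs/JSON, and rejects "CYCLES", while `resolve_proposed()` gives both spellings of instructions the same sysfs/JSON result.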
Thanks,
Ian
> v5: Follow Namhyung's suggestion and ignore the case where command
> line dummy events fail to open alongside other events that all
> fail to open. Note, the Tested-by tags are left on the series as
> v4 and v5 were changing an error case that doesn't occur in
> testing but was manually tested by myself.
>
> v4: Rework the no events opening change from v3 to make it handle
> multiple dummy events. Sadly an evlist isn't empty if it just
> contains dummy events as the dummy event may be used with "perf
> record -e dummy .." as a way to determine whether permission
> issues exist. Other software events like cpu-clock would suffice
> for this, but the using dummy genie has left the bottle.
>
> Another problem is that we appear to have an excessive number of
> dummy events added, for example, we can likely avoid a dummy event
> and add sideband data to the original event. For auxtrace more
> dummy events may be opened too. Anyway, this has led to the
> approach taken in patch 3 where the number of dummy parsed events
> is computed. If the number of removed/failing-to-open non-dummy
> events matches the number of non-dummy events then we want to
> fail, but only if there are no parsed dummy events or if there was
> one then it must have opened. The math here is hard to read, but
> passes my manual testing.
>
> v3: Make no events opening for perf record a failure as suggested by
> James Clark and Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>. Also,
> rebase.
>
> v2: Rebase and add tested-by tags from James Clark, Leo Yan and Atish
> Patra who have tested on RISC-V and ARM CPUs, including the
> problem case from before.
>
> Ian Rogers (2):
> perf record: Skip don't fail for events that don't open
> perf parse-events: Reapply "Prefer sysfs/JSON hardware events over
> legacy"
>
> tools/perf/builtin-record.c | 47 ++++++++++++++++++---
> tools/perf/util/parse-events.c | 26 +++++++++---
> tools/perf/util/parse-events.l | 76 +++++++++++++++++-----------------
> tools/perf/util/parse-events.y | 60 ++++++++++++++++++---------
> 4 files changed, 139 insertions(+), 70 deletions(-)
>
> --
> 2.49.0.395.g12beb8f557-goog
>
On Thu, Mar 27, 2025 at 12:13:45PM -0700, Ian Rogers wrote:
> On Mon, Mar 24, 2025 at 9:46 AM Ian Rogers <irogers@google.com> wrote:
> >
> > At the RISC-V summit the topic of avoiding event data being in the
> > RISC-V PMU kernel driver came up. There is a preference for sysfs/JSON
> > events being the priority when no PMU is provided so that legacy
> > events maybe supported via json. Originally Mark Rutland also
> > expressed at LPC 2023 that doing this would resolve bugs on ARM Apple
> > M? processors, but James Clark more recently tested this and believes
> > the driver issues there may not have existed or have been resolved. In
> > any case, it is inconsistent that with a PMU event names avoid legacy
> > encodings, but when wildcarding PMUs (ie without a PMU with the event
> > name) the legacy encodings have priority.
> >
> > The patch doing this work was reverted in a v6.10 release candidate
> > as, even though the patch was posted for weeks and had been on
> > linux-next for weeks without issue, Linus was in the habit of using
> > explicit legacy events with unsupported precision options on his
> > Neoverse-N1. This machine has SLC PMU events for bus and CPU cycles
> > where ARM decided to call the events bus_cycles and cycles, the latter
> > being also a legacy event name. ARM haven't renamed the cycles event
> > to a more consistent cpu_cycles and avoided the problem. With these
> > changes the problematic event will now be skipped, a large warning
> > produced, and perf record will continue for the other PMU events. This
> > solution was proposed by Arnaldo.
> >
> > v6: Rebase of v5 (dropping already merged patches):
> > https://lore.kernel.org/lkml/20250109222109.567031-1-irogers@google.com/
> > that unusually had an RFC posted for it:
> > https://lore.kernel.org/lkml/Z7Z5kv75BMML2A1q@google.com/
> > Note, this patch conflicts/contradicts:
> > https://lore.kernel.org/lkml/20250312211623.2495798-1-irogers@google.com/
> > that I posted so that we could either consistently prioritize
> > sysfs/json (these patches) or legacy events (the other
> > patches). That lack of event printing and encoding inconsistency
> > is most prominent in the encoding of events like "instructions"
> > which on hybrid are reported as "cpu_core/instructions/" but
> > "instructions" before these patches gets a legacy encoding while
> > "cpu_core/instructions/" gets a sysfs/json encoding. These patches
> > make "instructions" always get a sysfs/json encoding while the
> > alternate patches make it always get a legacy encoding.
>
> So another fun finding. Sysfs and json events are case insensitive:
> ```
> $ perf stat -e 'inst_retired.any,INST_RETIRED.ANY' true
>
> Performance counter stats for 'true':
>
> 129,134 cpu_atom/inst_retired.any:u/
> <not counted> cpu_core/inst_retired.any:u/
> (0.00%)
> 129,134 cpu_atom/INST_RETIRED.ANY:u/
> <not counted> cpu_core/INST_RETIRED.ANY:u/
> (0.00%)
>
> 0.002193191 seconds time elapsed
>
> 0.002354000 seconds user
> 0.000000000 seconds sys
> ```
> But legacy events match in lex code that is case sensitive. This means
> (on x86) the event 'instructions' is currently legacy, but the event
> 'INSTRUCTIONS' is a sysfs event. The event CYCLES is a parse error as
> there is no sysfs/json version. Given legacy events don't follow the
> case insensitivity norm this is more evidence we need to reduce their
> priority by merging these patches.
root@number:~# perf trace -e perf_event_open perf stat -C 1 -e INSTRUCTIONS,instructions,cycles sleep 1
0.000 ( 0.025 ms): :620592/620592 perf_event_open(attr_uptr: { type: 4 (cpu), size: 136, config: 0xc0 (instructions), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 3
0.030 ( 0.004 ms): :620592/620592 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 8
0.035 ( 0.003 ms): :620592/620592 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 9
Performance counter stats for 'CPU(s) 1':
1,499,102 INSTRUCTIONS
1,498,883 instructions # 0.81 insn per cycle
1,850,082 cycles
1.001553577 seconds time elapsed
root@number:~#
So the behaviour if "instructions" is specified, since perf started, is
to have this:
0.030 ( 0.004 ms): :620592/620592 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 8
And this is what we continue to obtain.
At some point we started supporting sysfs/JSON and then INSTRUCTIONS
started being accepted and we are getting:
0.000 ( 0.025 ms): :620592/620592 perf_event_open(attr_uptr: { type: 4 (cpu), size: 136, config: 0xc0 (instructions), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 3
Which is what is expected, no change in behaviour over time.
- Arnaldo
On Tue, Apr 29, 2025 at 8:14 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Mar 27, 2025 at 12:13:45PM -0700, Ian Rogers wrote:
> > On Mon, Mar 24, 2025 at 9:46 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > At the RISC-V summit the topic of avoiding event data being in the
> > > RISC-V PMU kernel driver came up. There is a preference for sysfs/JSON
> > > events being the priority when no PMU is provided so that legacy
> > > events maybe supported via json. Originally Mark Rutland also
> > > expressed at LPC 2023 that doing this would resolve bugs on ARM Apple
> > > M? processors, but James Clark more recently tested this and believes
> > > the driver issues there may not have existed or have been resolved. In
> > > any case, it is inconsistent that with a PMU event names avoid legacy
> > > encodings, but when wildcarding PMUs (ie without a PMU with the event
> > > name) the legacy encodings have priority.
> > >
> > > The patch doing this work was reverted in a v6.10 release candidate
> > > as, even though the patch was posted for weeks and had been on
> > > linux-next for weeks without issue, Linus was in the habit of using
> > > explicit legacy events with unsupported precision options on his
> > > Neoverse-N1. This machine has SLC PMU events for bus and CPU cycles
> > > where ARM decided to call the events bus_cycles and cycles, the latter
> > > being also a legacy event name. ARM haven't renamed the cycles event
> > > to a more consistent cpu_cycles and avoided the problem. With these
> > > changes the problematic event will now be skipped, a large warning
> > > produced, and perf record will continue for the other PMU events. This
> > > solution was proposed by Arnaldo.
> > >
> > > v6: Rebase of v5 (dropping already merged patches):
> > > https://lore.kernel.org/lkml/20250109222109.567031-1-irogers@google.com/
> > > that unusually had an RFC posted for it:
> > > https://lore.kernel.org/lkml/Z7Z5kv75BMML2A1q@google.com/
> > > Note, this patch conflicts/contradicts:
> > > https://lore.kernel.org/lkml/20250312211623.2495798-1-irogers@google.com/
> > > that I posted so that we could either consistently prioritize
> > > sysfs/json (these patches) or legacy events (the other
> > > patches). That lack of event printing and encoding inconsistency
> > > is most prominent in the encoding of events like "instructions"
> > > which on hybrid are reported as "cpu_core/instructions/" but
> > > "instructions" before these patches gets a legacy encoding while
> > > "cpu_core/instructions/" gets a sysfs/json encoding. These patches
> > > make "instructions" always get a sysfs/json encoding while the
> > > alternate patches make it always get a legacy encoding.
> >
> > So another fun finding. Sysfs and json events are case insensitive:
> > ```
> > $ perf stat -e 'inst_retired.any,INST_RETIRED.ANY' true
> >
> > Performance counter stats for 'true':
> >
> > 129,134 cpu_atom/inst_retired.any:u/
> > <not counted> cpu_core/inst_retired.any:u/
> > (0.00%)
> > 129,134 cpu_atom/INST_RETIRED.ANY:u/
> > <not counted> cpu_core/INST_RETIRED.ANY:u/
> > (0.00%)
> >
> > 0.002193191 seconds time elapsed
> >
> > 0.002354000 seconds user
> > 0.000000000 seconds sys
> > ```
> > But legacy events match in lex code that is case sensitive. This means
> > (on x86) the event 'instructions' is currently legacy, but the event
> > 'INSTRUCTIONS' is a sysfs event. The event CYCLES is a parse error as
> > there is no sysfs/json version. Given legacy events don't follow the
> > case insensitivity norm this is more evidence we need to reduce their
> > priority by merging these patches.
>
> root@number:~# perf trace -e perf_event_open perf stat -C 1 -e INSTRUCTIONS,instructions,cycles sleep 1
> 0.000 ( 0.025 ms): :620592/620592 perf_event_open(attr_uptr: { type: 4 (cpu), size: 136, config: 0xc0 (instructions), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 3
> 0.030 ( 0.004 ms): :620592/620592 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 8
> 0.035 ( 0.003 ms): :620592/620592 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 9
>
> Performance counter stats for 'CPU(s) 1':
>
> 1,499,102 INSTRUCTIONS
> 1,498,883 instructions # 0.81 insn per cycle
> 1,850,082 cycles
>
> 1.001553577 seconds time elapsed
>
> root@number:~#
>
> So the behaviour if "instructions" is specified, since perf started, is
> to have this:
>
> 0.030 ( 0.004 ms): :620592/620592 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 8
>
> And this is what we continue to obtain.
>
> At some point we started supporting sysfs/JSON and then INSTRUCTIONS
> started being accepted and we are getting:
>
> 0.000 ( 0.025 ms): :620592/620592 perf_event_open(attr_uptr: { type: 4 (cpu), size: 136, config: 0xc0 (instructions), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1 }, pid: -1, cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 3
>
> Which is what is expected, no change in behaviour over time.
I'm not sure what the point of this comment is. With perf we use
strcasecmp to match events:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/pmu.c?h=perf-tools-next#n473
```
static struct perf_pmu_alias *perf_pmu__find_alias(struct perf_pmu *pmu,
                                                   const char *name,
                                                   bool load)
{
	struct perf_pmu_alias *alias;
...
	list_for_each_entry(alias, &pmu->aliases, list) {
		if (!strcasecmp(alias->name, name))
			return alias;
...
```
We even lower case event names during the json parsing:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/pmu-events/jevents.py?h=perf-tools-next#n326
We don't use case insensitive pattern matching when matching legacy events:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/parse-events.l?h=perf-tools-next#n392
```
...
cpu-cycles|cycles                               { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES); }
stalled-cycles-frontend|idle-cycles-frontend    { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND); }
stalled-cycles-backend|idle-cycles-backend      { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_STALLED_CYCLES_BACKEND); }
instructions                                    { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS); }
cache-references                                { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES); }
cache-misses                                    { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES); }
branch-instructions|branches                    { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_INSTRUCTIONS); }
branch-misses                                   { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BRANCH_MISSES); }
bus-cycles                                      { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_BUS_CYCLES); }
ref-cycles                                      { return sym(yyscanner, PERF_TYPE_HARDWARE, PERF_COUNT_HW_REF_CPU_CYCLES); }
cpu-clock                                       { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_CLOCK); }
task-clock                                      { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_TASK_CLOCK); }
page-faults|faults                              { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS); }
minor-faults                                    { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MIN); }
major-faults                                    { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_PAGE_FAULTS_MAJ); }
context-switches|cs                             { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CONTEXT_SWITCHES); }
cpu-migrations|migrations                       { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CPU_MIGRATIONS); }
alignment-faults                                { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_ALIGNMENT_FAULTS); }
emulation-faults                                { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_EMULATION_FAULTS); }
dummy                                           { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
bpf-output                                      { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
cgroup-switches                                 { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CGROUP_SWITCHES); }
...
```
This means the behavior of instructions,INSTRUCTIONS,cpu/instructions/
all vary when parsed. I am working to change the metric parsing so
that rather than matching strings it works off of the config values,
this fixes the problem that metrics cannot handle tracepoints without
the need to reinvent event parsing. As the config values vary for some
if not all of instructions,INSTRUCTIONS,cpu/instructions/ then it will
impact de-duplicating events. There's generally an expectation that
events are case insensitive and when that's not true it looks like a
bug to me.
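To illustrate why differing encodings matter for de-duplication, here is a rough sketch that compares events by their (type, config) pair instead of by name. `struct sketch_attr` and `dedup_by_encoding()` are hypothetical and are not the metric code being worked on. The values in the usage note come from the perf trace output quoted earlier in the thread; treating cpu/instructions/ as type 4, config 0xc0 is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction of perf_event_attr to the two fields compared here. */
struct sketch_attr {
	uint32_t type;
	uint64_t config;
};

static bool same_encoding(const struct sketch_attr *a, const struct sketch_attr *b)
{
	return a->type == b->type && a->config == b->config;
}

/*
 * Keep the first occurrence of each (type, config) pair, compacting the
 * array in place. Returns the number of unique encodings.
 */
static size_t dedup_by_encoding(struct sketch_attr *attrs, size_t nr)
{
	size_t unique = 0;

	for (size_t i = 0; i < nr; i++) {
		bool seen = false;

		for (size_t j = 0; j < unique; j++) {
			if (same_encoding(&attrs[j], &attrs[i])) {
				seen = true;
				break;
			}
		}
		if (!seen)
			attrs[unique++] = attrs[i];
	}
	return unique;
}
```

Given the three entries { 4, 0xc0 } for INSTRUCTIONS, { 0, 0x1 } for instructions and { 4, 0xc0 } for cpu/instructions/, the function returns 2: the legacy spelling is kept as a separate event even though all three names are meant to count the same thing.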
Just to repeat for clarity. Legacy events were mapped to fixed
encodings and had priority. Intel/Arnaldo/Jiri changed that to make it
so that legacy encodings would wildcard match. I cleaned that up in
making metrics work, including with Intel hybrid CPUs, for the support
of topdown metrics and so the Intel hybrid approach was made generic
and not hard coded to Intel PMUs. My change broke Apple-M as they had
been reliant on not having their PMUs spotted as core legacy
supporting PMUs, with events programmed as if they were uncore PMUs.
Mark Rutland argued that sysfs/json should be made the priority and I
made it so, focussing first on the case where a PMU is specified as
that matched the bug. The later changes are just following through on
that priority change. RISC-V supports this change given the variable
nature of how they encode events.
Note, there is a newer version of this patch series:
https://lore.kernel.org/lkml/20250416045117.876775-1-irogers@google.com/
it addresses problems with tracking events and perf_api_probe using
wildcarding. The patches are applied in:
https://github.com/googleprodkernel/linux-perf/
Thanks,
Ian
> - Arnaldo