On hybrid systems some PMUs apply to all core types, particularly for
metrics the msr PMU and the tsc event. The metrics often only want the
values of the counter for their specific core type. These patches
allow the cpu term in an event to give a PMU name to take the cpumask
from. For example:

  $ perf stat -e msr/tsc,cpu=cpu_atom/ ...

will aggregate the msr/tsc/ value but only for atom cores. In doing
this problems were identified in how cpumasks are handled by parsing
and event setup when cpumasks are specified along with a task to
profile. The event parsing, cpumask evlist propagation code and perf
stat code are updated accordingly.

The final result of the patch series is to be able to run:
```
$ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
 10.1: Basic parsing test                 : Ok
 10.2: Parsing without PMU name           : Ok
 10.3: Parsing with PMU name              : Ok

 Performance counter stats for 'perf test -F 10':

        63,704,975      msr/tsc/
        47,060,704      msr/tsc,cpu=cpu_core/                (4.62%)
        16,640,591      msr/tsc,cpu=cpu_atom/                (2.18%)
```

This has (further) identified a kernel bug for task events around the
enabled time being too large leading to invalid scaling (hence the
--no-scale in the command line above).

Ian Rogers (12):
  perf parse-events: Warn if a cpu term is unsupported by a CPU
  perf stat: Avoid buffer overflow to the aggregation map
  perf stat: Don't size aggregation ids from user_requested_cpus
  perf parse-events: Allow the cpu term to be a PMU
  perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask
  libperf evsel: Rename own_cpus to pmu_cpus
  libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete
  perf evsel: Use libperf perf_evsel__exit
  perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu
  perf parse-events: Minor __add_event refactoring
  perf evsel: Add evsel__open_per_cpu_and_thread
  perf parse-events: Support user CPUs mixed with threads/processes

 tools/lib/perf/evlist.c                 | 118 ++++++++++++++++--------
 tools/lib/perf/evsel.c                  |   9 +-
 tools/lib/perf/include/internal/evsel.h |   3 +-
 tools/perf/builtin-stat.c               |   9 +-
 tools/perf/tests/event_update.c         |   4 +-
 tools/perf/util/evlist.c                |  15 +--
 tools/perf/util/evsel.c                 |  55 +++++++++--
 tools/perf/util/evsel.h                 |   5 +
 tools/perf/util/expr.c                  |   2 +-
 tools/perf/util/header.c                |   4 +-
 tools/perf/util/parse-events.c          | 102 ++++++++++++++------
 tools/perf/util/pmus.c                  |  29 +++---
 tools/perf/util/pmus.h                  |   2 +
 tools/perf/util/stat.c                  |   6 +-
 tools/perf/util/synthetic-events.c      |   4 +-
 tools/perf/util/tool_pmu.c              |  56 +++++++++--
 tools/perf/util/tool_pmu.h              |   2 +-
 17 files changed, 297 insertions(+), 128 deletions(-)

--
2.50.0.727.gbf7dc18ff4-goog
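
The effect of a term such as cpu=cpu_atom can be illustrated with a small sketch: take the CPU list belonging to the named PMU and sum the counter only over those CPUs. This is not the code from the series. The helper names, the CPU list strings "0-3" and "4-7", and the per-CPU values are all made up for illustration; a real hybrid system will have a different core layout.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "0-3,8,10-11" into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def aggregate(per_cpu_counts, cpumask):
    """Sum a per-CPU counter over only the CPUs present in cpumask."""
    return sum(per_cpu_counts.get(cpu, 0) for cpu in cpumask)


# Hypothetical hybrid layout: CPUs 0-3 are "core" CPUs, CPUs 4-7 are "atom" CPUs.
core_mask = parse_cpu_list("0-3")
atom_mask = parse_cpu_list("4-7")

# Made-up per-CPU tsc deltas, standing in for what each per-CPU event would read.
per_cpu_tsc = {cpu: 1000 * (cpu + 1) for cpu in range(8)}

total = aggregate(per_cpu_tsc, range(8))        # like msr/tsc/
core_total = aggregate(per_cpu_tsc, core_mask)  # like msr/tsc,cpu=cpu_core/
atom_total = aggregate(per_cpu_tsc, atom_mask)  # like msr/tsc,cpu=cpu_atom/

print(total, core_total, atom_total)
```

In this toy model the two per-core-type sums add up exactly to the total because all values are read from one snapshot. In the real output above the two restricted counts add up to 63,701,295, slightly less than the 63,704,975 reported for msr/tsc/.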

On Fri, Jun 27, 2025 at 12:24 PM Ian Rogers <irogers@google.com> wrote:
> [...]

Ping.

Thanks,
Ian

On Tue, 2025-07-15 at 12:55 -0700, Ian Rogers wrote:
> On Fri, Jun 27, 2025 at 12:24 PM Ian Rogers <irogers@google.com> wrote:
> > [...]
>
> Ping.

Hi Ian,

Looks good to me.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>

Thanks,
Tom

On 27/06/2025 8:24 pm, Ian Rogers wrote:
> [...]

Tested-by: James Clark <james.clark@linaro.org>

On Mon, Jul 21, 2025 at 9:13 AM James Clark <james.clark@linaro.org> wrote:
> On 27/06/2025 8:24 pm, Ian Rogers wrote:
> > [...]
>
> Tested-by: James Clark <james.clark@linaro.org>

Much appreciated, thanks James!
There's a v2 patch set but the Tested-by will be good for the majority
of patches that are unchanged in that:
https://lore.kernel.org/lkml/20250717210233.1143622-1-irogers@google.com/

I'm of course interested in getting RFC feedback on:
https://lore.kernel.org/lkml/20250716223924.825772-1-irogers@google.com/
which introduces an extra state to avoid gathering enabled time on
CPUs an event can't run on.

Thanks,
Ian
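
For readers unfamiliar with why an oversized enabled time matters, here is a rough sketch of count scaling. It assumes the usual extrapolation of the raw count by enabled time over running time, and assumes the percentage perf stat prints beside a count is running time as a share of enabled time. The function names and all numbers are hypothetical, and this is not the code from perf or from the RFC.

```python
def scale_count(raw, time_enabled, time_running):
    """Extrapolate a raw count to the whole enabled window (assumed formula)."""
    if time_running == 0:
        return 0  # the event never ran, so there is nothing to extrapolate from
    return round(raw * time_enabled / time_running)


def running_percent(time_enabled, time_running):
    """Share of the enabled time during which the event was actually counting."""
    if time_enabled == 0:
        return 0.0
    return 100.0 * time_running / time_enabled


# Hypothetical event that counted 1000 while running for 50 of 100 enabled units.
sane = scale_count(1000, 100, 50)

# Same raw count, but with enabled time inflated tenfold, for example because
# it also accrued on CPUs the event could never run on.
inflated = scale_count(1000, 1000, 50)

print(sane, inflated, running_percent(1000, 50))
```

With the inflated enabled time the scaled value is ten times larger even though the raw count is the same, which is the kind of distortion that --no-scale sidesteps by reporting the raw count.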