tools/perf/util/stat.c | 6 ++++++ 1 file changed, 6 insertions(+)
Sockets with event zero counts are hidden in `perf stat --per-socket`
because alias merging aggregates counts but not CPU maps. This causes
the display logic to incorrectly skip sockets not present in the leader
instance's map.
For example, with AMD_UMC:
$ cat /sys/bus/event_source/devices/amd_umc*/cpumask
0
...
128
...
$ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
Performance counter stats for 'system wide':
S0 12 0 amd_umc/config=0xff/
1.000777829 seconds time elapsed
Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated
event correctly reflects all underlying PMU instances.
After fix:
$ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
Performance counter stats for 'system wide':
S0 12 0 amd_umc/config=0xff/
S1 12 0 amd_umc/config=0xff/
1.000666681 seconds time elapsed
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
tools/perf/util/stat.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 14d169e22e8f..d696d217f98d 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias)
aggr_counts_a->run += aggr_counts_b->run;
}
+ /*
+ * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter)
+ * knows this merged event covers all CPUs from both aliases.
+ */
+ perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus);
+
return 0;
}
--
2.54.0.rc1.555.g9c883467ad-goog
On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@google.com> wrote: > > Sockets with event zero counts are hidden in `perf stat --per-socket` > because alias merging aggregates counts but not CPU maps. This causes > the display logic to incorrectly skip sockets not present in the leader > instance's map. > > For example, with AMD_UMC: > $ cat /sys/bus/event_source/devices/amd_umc*/cpumask > 0 > ... > 128 > ... > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1 > > Performance counter stats for 'system wide': > > S0 12 0 amd_umc/config=0xff/ > > 1.000777829 seconds time elapsed > > Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated > event correctly reflects all underlying PMU instances. > > After fix: > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1 > > Performance counter stats for 'system wide': > > S0 12 0 amd_umc/config=0xff/ > S1 12 0 amd_umc/config=0xff/ > > 1.000666681 seconds time elapsed > > Signed-off-by: Chun-Tse Shao <ctshao@google.com> > --- > tools/perf/util/stat.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c > index 14d169e22e8f..d696d217f98d 100644 > --- a/tools/perf/util/stat.c > +++ b/tools/perf/util/stat.c > @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias) > aggr_counts_a->run += aggr_counts_b->run; > } > > + /* > + * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter) > + * knows this merged event covers all CPUs from both aliases. > + */ > + perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus); > + Once we have counts, mutating the cpumaps seems wrong; it could impact other things such as the evlist copies of unioned together cpumaps. I'm confused by the bug, we should have two aggregate ids one for each socket. In should_skip_zero_counter it should only skip the zero count if the counter is not applicable for the counter currently trying to be printed. Does the display code fail to iterate through all the counters? Thanks, Ian > return 0; > } > > -- > 2.54.0.rc1.555.g9c883467ad-goog >
Instead, I submitted another fix to tweak `Merge CPU maps when merging uncore aliases`. Please take a look lore.kernel.org/20260501181129.938256-1-ctshao@google.com Thanks, CT On Thu, Apr 16, 2026 at 2:05 PM Ian Rogers <irogers@google.com> wrote: > > On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@google.com> wrote: > > > > Sockets with event zero counts are hidden in `perf stat --per-socket` > > because alias merging aggregates counts but not CPU maps. This causes > > the display logic to incorrectly skip sockets not present in the leader > > instance's map. > > > > For example, with AMD_UMC: > > $ cat /sys/bus/event_source/devices/amd_umc*/cpumask > > 0 > > ... > > 128 > > ... > > > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1 > > > > Performance counter stats for 'system wide': > > > > S0 12 0 amd_umc/config=0xff/ > > > > 1.000777829 seconds time elapsed > > > > Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated > > event correctly reflects all underlying PMU instances. > > > > After fix: > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1 > > > > Performance counter stats for 'system wide': > > > > S0 12 0 amd_umc/config=0xff/ > > S1 12 0 amd_umc/config=0xff/ > > > > 1.000666681 seconds time elapsed > > > > Signed-off-by: Chun-Tse Shao <ctshao@google.com> > > --- > > tools/perf/util/stat.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c > > index 14d169e22e8f..d696d217f98d 100644 > > --- a/tools/perf/util/stat.c > > +++ b/tools/perf/util/stat.c > > @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias) > > aggr_counts_a->run += aggr_counts_b->run; > > } > > > > + /* > > + * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter) > > + * knows this merged event covers all CPUs from both aliases. > > + */ > > + perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus); > > + > > Once we have counts, mutating the cpumaps seems wrong; it could impact > other things such as the evlist copies of unioned together cpumaps. > I'm confused by the bug, we should have two aggregate ids one for each > socket. In should_skip_zero_counter it should only skip the zero count > if the counter is not applicable for the counter currently trying to > be printed. Does the display code fail to iterate through all the > counters? > > Thanks, > Ian > > > return 0; > > } > > > > -- > > 2.54.0.rc1.555.g9c883467ad-goog > >
My fault, what I mean was "tweak should_skip_zero_counter" On Fri, May 1, 2026 at 11:17 AM Chun-Tse Shao <ctshao@google.com> wrote: > > Instead, I submitted another fix to tweak `Merge CPU maps when merging > uncore aliases`. Please take a look > lore.kernel.org/20260501181129.938256-1-ctshao@google.com > > Thanks, > CT > > > On Thu, Apr 16, 2026 at 2:05 PM Ian Rogers <irogers@google.com> wrote: > > > > On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@google.com> wrote: > > > > > > Sockets with event zero counts are hidden in `perf stat --per-socket` > > > because alias merging aggregates counts but not CPU maps. This causes > > > the display logic to incorrectly skip sockets not present in the leader > > > instance's map. > > > > > > For example, with AMD_UMC: > > > $ cat /sys/bus/event_source/devices/amd_umc*/cpumask > > > 0 > > > ... > > > 128 > > > ... > > > > > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1 > > > > > > Performance counter stats for 'system wide': > > > > > > S0 12 0 amd_umc/config=0xff/ > > > > > > 1.000777829 seconds time elapsed > > > > > > Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated > > > event correctly reflects all underlying PMU instances. > > > > > > After fix: > > > $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1 > > > > > > Performance counter stats for 'system wide': > > > > > > S0 12 0 amd_umc/config=0xff/ > > > S1 12 0 amd_umc/config=0xff/ > > > > > > 1.000666681 seconds time elapsed > > > > > > Signed-off-by: Chun-Tse Shao <ctshao@google.com> > > > --- > > > tools/perf/util/stat.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c > > > index 14d169e22e8f..d696d217f98d 100644 > > > --- a/tools/perf/util/stat.c > > > +++ b/tools/perf/util/stat.c > > > @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias) > > > aggr_counts_a->run += aggr_counts_b->run; > > > } > > > > > > + /* > > > + * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter) > > > + * knows this merged event covers all CPUs from both aliases. > > > + */ > > > + perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus); > > > + > > > > Once we have counts, mutating the cpumaps seems wrong; it could impact > > other things such as the evlist copies of unioned together cpumaps. > > I'm confused by the bug, we should have two aggregate ids one for each > > socket. In should_skip_zero_counter it should only skip the zero count > > if the counter is not applicable for the counter currently trying to > > be printed. Does the display code fail to iterate through all the > > counters? > > > > Thanks, > > Ian > > > > > return 0; > > > } > > > > > > -- > > > 2.54.0.rc1.555.g9c883467ad-goog > > >
© 2016 - 2026 Red Hat, Inc.