[PATCH] perf stat: Merge CPU maps when merging uncore aliases

Chun-Tse Shao posted 1 patch 2 months ago
tools/perf/util/stat.c | 6 ++++++
1 file changed, 6 insertions(+)
[PATCH] perf stat: Merge CPU maps when merging uncore aliases
Posted by Chun-Tse Shao 2 months ago
Sockets with event zero counts are hidden in `perf stat --per-socket`
because alias merging aggregates counts but not CPU maps. This causes
the display logic to incorrectly skip sockets not present in the leader
instance's map.

For example, with AMD_UMC:
  $ cat /sys/bus/event_source/devices/amd_umc*/cpumask
  0
  ...
  128
  ...

  $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1

   Performance counter stats for 'system wide':

  S0       12                  0      amd_umc/config=0xff/

         1.000777829 seconds time elapsed

Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated
event correctly reflects all underlying PMU instances.

After fix:
  $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1

   Performance counter stats for 'system wide':

  S0       12                  0      amd_umc/config=0xff/
  S1       12                  0      amd_umc/config=0xff/

         1.000666681 seconds time elapsed

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
 tools/perf/util/stat.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 14d169e22e8f..d696d217f98d 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias)
 		aggr_counts_a->run += aggr_counts_b->run;
 	}
 
+	/*
+	 * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter)
+	 * knows this merged event covers all CPUs from both aliases.
+	 */
+	perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus);
+
 	return 0;
 }
 
-- 
2.54.0.rc1.555.g9c883467ad-goog
Re: [PATCH] perf stat: Merge CPU maps when merging uncore aliases
Posted by Ian Rogers 2 months ago
On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@google.com> wrote:
>
> Sockets with event zero counts are hidden in `perf stat --per-socket`
> because alias merging aggregates counts but not CPU maps. This causes
> the display logic to incorrectly skip sockets not present in the leader
> instance's map.
>
> For example, with AMD_UMC:
>   $ cat /sys/bus/event_source/devices/amd_umc*/cpumask
>   0
>   ...
>   128
>   ...
>
>   $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
>
>    Performance counter stats for 'system wide':
>
>   S0       12                  0      amd_umc/config=0xff/
>
>          1.000777829 seconds time elapsed
>
> Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated
> event correctly reflects all underlying PMU instances.
>
> After fix:
>   $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
>
>    Performance counter stats for 'system wide':
>
>   S0       12                  0      amd_umc/config=0xff/
>   S1       12                  0      amd_umc/config=0xff/
>
>          1.000666681 seconds time elapsed
>
> Signed-off-by: Chun-Tse Shao <ctshao@google.com>
> ---
>  tools/perf/util/stat.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 14d169e22e8f..d696d217f98d 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias)
>                 aggr_counts_a->run += aggr_counts_b->run;
>         }
>
> +       /*
> +        * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter)
> +        * knows this merged event covers all CPUs from both aliases.
> +        */
> +       perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus);
> +

Once we have counts, mutating the cpumaps seems wrong; it could impact
other things such as the evlist copies of unioned together cpumaps.
I'm confused by the bug, we should have two aggregate ids one for each
socket. In should_skip_zero_counter it should only skip the zero count
if the counter is not applicable for the counter currently trying to
be printed. Does the display code fail to iterate through all the
counters?

Thanks,
Ian

>         return 0;
>  }
>
> --
> 2.54.0.rc1.555.g9c883467ad-goog
>
Re: [PATCH] perf stat: Merge CPU maps when merging uncore aliases
Posted by Chun-Tse Shao 1 month, 2 weeks ago
Instead, I submitted another fix to tweak `Merge CPU maps when merging
uncore aliases`. Please take a look
lore.kernel.org/20260501181129.938256-1-ctshao@google.com

Thanks,
CT


On Thu, Apr 16, 2026 at 2:05 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@google.com> wrote:
> >
> > Sockets with event zero counts are hidden in `perf stat --per-socket`
> > because alias merging aggregates counts but not CPU maps. This causes
> > the display logic to incorrectly skip sockets not present in the leader
> > instance's map.
> >
> > For example, with AMD_UMC:
> >   $ cat /sys/bus/event_source/devices/amd_umc*/cpumask
> >   0
> >   ...
> >   128
> >   ...
> >
> >   $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> >
> >    Performance counter stats for 'system wide':
> >
> >   S0       12                  0      amd_umc/config=0xff/
> >
> >          1.000777829 seconds time elapsed
> >
> > Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated
> > event correctly reflects all underlying PMU instances.
> >
> > After fix:
> >   $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> >
> >    Performance counter stats for 'system wide':
> >
> >   S0       12                  0      amd_umc/config=0xff/
> >   S1       12                  0      amd_umc/config=0xff/
> >
> >          1.000666681 seconds time elapsed
> >
> > Signed-off-by: Chun-Tse Shao <ctshao@google.com>
> > ---
> >  tools/perf/util/stat.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> > index 14d169e22e8f..d696d217f98d 100644
> > --- a/tools/perf/util/stat.c
> > +++ b/tools/perf/util/stat.c
> > @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias)
> >                 aggr_counts_a->run += aggr_counts_b->run;
> >         }
> >
> > +       /*
> > +        * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter)
> > +        * knows this merged event covers all CPUs from both aliases.
> > +        */
> > +       perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus);
> > +
>
> Once we have counts, mutating the cpumaps seems wrong; it could impact
> other things such as the evlist copies of unioned together cpumaps.
> I'm confused by the bug, we should have two aggregate ids one for each
> socket. In should_skip_zero_counter it should only skip the zero count
> if the counter is not applicable for the counter currently trying to
> be printed. Does the display code fail to iterate through all the
> counters?
>
> Thanks,
> Ian
>
> >         return 0;
> >  }
> >
> > --
> > 2.54.0.rc1.555.g9c883467ad-goog
> >
Re: [PATCH] perf stat: Merge CPU maps when merging uncore aliases
Posted by Chun-Tse Shao 1 month, 2 weeks ago
My fault, what I mean was "tweak should_skip_zero_counter"

On Fri, May 1, 2026 at 11:17 AM Chun-Tse Shao <ctshao@google.com> wrote:
>
> Instead, I submitted another fix to tweak `Merge CPU maps when merging
> uncore aliases`. Please take a look
> lore.kernel.org/20260501181129.938256-1-ctshao@google.com
>
> Thanks,
> CT
>
>
> On Thu, Apr 16, 2026 at 2:05 PM Ian Rogers <irogers@google.com> wrote:
> >
> > On Thu, Apr 16, 2026 at 1:48 PM Chun-Tse Shao <ctshao@google.com> wrote:
> > >
> > > Sockets with event zero counts are hidden in `perf stat --per-socket`
> > > because alias merging aggregates counts but not CPU maps. This causes
> > > the display logic to incorrectly skip sockets not present in the leader
> > > instance's map.
> > >
> > > For example, with AMD_UMC:
> > >   $ cat /sys/bus/event_source/devices/amd_umc*/cpumask
> > >   0
> > >   ...
> > >   128
> > >   ...
> > >
> > >   $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> > >
> > >    Performance counter stats for 'system wide':
> > >
> > >   S0       12                  0      amd_umc/config=0xff/
> > >
> > >          1.000777829 seconds time elapsed
> > >
> > > Merge CPU maps in `evsel__merge_aggr_counters` to ensure the aggregated
> > > event correctly reflects all underlying PMU instances.
> > >
> > > After fix:
> > >   $ perf stat -e amd_umc/config=0xff/ --per-socket -a -- sleep 1
> > >
> > >    Performance counter stats for 'system wide':
> > >
> > >   S0       12                  0      amd_umc/config=0xff/
> > >   S1       12                  0      amd_umc/config=0xff/
> > >
> > >          1.000666681 seconds time elapsed
> > >
> > > Signed-off-by: Chun-Tse Shao <ctshao@google.com>
> > > ---
> > >  tools/perf/util/stat.c | 6 ++++++
> > >  1 file changed, 6 insertions(+)
> > >
> > > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> > > index 14d169e22e8f..d696d217f98d 100644
> > > --- a/tools/perf/util/stat.c
> > > +++ b/tools/perf/util/stat.c
> > > @@ -535,6 +535,12 @@ static int evsel__merge_aggr_counters(struct evsel *evsel, struct evsel *alias)
> > >                 aggr_counts_a->run += aggr_counts_b->run;
> > >         }
> > >
> > > +       /*
> > > +        * Merge the CPU maps so that the display logic (e.g. should_skip_zero_counter)
> > > +        * knows this merged event covers all CPUs from both aliases.
> > > +        */
> > > +       perf_cpu_map__merge(&evsel->core.cpus, alias->core.cpus);
> > > +
> >
> > Once we have counts, mutating the cpumaps seems wrong; it could impact
> > other things such as the evlist copies of unioned together cpumaps.
> > I'm confused by the bug, we should have two aggregate ids one for each
> > socket. In should_skip_zero_counter it should only skip the zero count
> > if the counter is not applicable for the counter currently trying to
> > be printed. Does the display code fail to iterate through all the
> > counters?
> >
> > Thanks,
> > Ian
> >
> > >         return 0;
> > >  }
> > >
> > > --
> > > 2.54.0.rc1.555.g9c883467ad-goog
> > >