[PATCH v1 0/2] Fix incorrect counts when count the same uncore event multiple times

Chun-Tse Shao posted 2 patches 8 months, 3 weeks ago
There is a newer version of this series
Posted by Chun-Tse Shao 8 months, 3 weeks ago
Let's take a look at an example: the machine is SKX with 6 IMC devices.

  perf stat -e clockticks,clockticks -I 1000
  #           time             counts unit events
       1.001127430      6,901,503,174      uncore_imc_0/clockticks/
       1.001127430      3,940,896,301      uncore_imc_0/clockticks/
       2.002649722        988,376,876      uncore_imc_0/clockticks/
       2.002649722        988,376,141      uncore_imc_0/clockticks/
       3.004071319      1,000,292,675      uncore_imc_0/clockticks/
       3.004071319      1,000,294,160      uncore_imc_0/clockticks/

1) The event names should not be uniquified.
2) The initial count for the first `clockticks` is doubled.
3) Subsequent counts are only reported for the first IMC device.

The first patch fixes 1) and 3), and the second patch fixes 2).
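
To illustrate problems 1) and 3), here is a minimal Python sketch of the
intended behaviour, not the perf implementation. It assumes each reading
carries a command-line slot, a PMU name, an event name and a count; the
function names and the counts are hypothetical.

```python
from collections import defaultdict


def merge_counts(readings):
    """Sum per-device counts into one row per command-line event slot.

    readings: iterable of (slot, pmu, event, count) tuples, where slot is
    the position of the event in the -e list (hypothetical layout).
    Returns a list of (event, total) ordered by slot, using the bare
    event name rather than a uniquified "pmu/event/" name.
    """
    totals = defaultdict(int)
    names = {}
    for slot, _pmu, event, count in readings:
        totals[slot] += count
        names[slot] = event
    return [(names[slot], totals[slot]) for slot in sorted(totals)]


def first_device_only(readings):
    """Model of problem 3): only the first PMU seen per slot is reported."""
    seen = {}
    for slot, pmu, event, count in readings:
        if slot not in seen:
            seen[slot] = (f"{pmu}/{event}/", count)
    return [seen[slot] for slot in sorted(seen)]


# Made-up counts for "clockticks,clockticks" on 6 IMC devices.
readings = [
    (slot, f"uncore_imc_{i}", "clockticks", 1000 + i)
    for slot in (0, 1)
    for i in range(6)
]
merged = merge_counts(readings)
broken = first_device_only(readings)
```

With this toy data, `merged` has two `clockticks` rows that each cover all
six devices, while `broken` has two `uncore_imc_0/clockticks/` rows that
each cover a single device.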

After these fixes:

  perf stat -e clockticks,clockticks -I 1000
  #           time             counts unit events
       1.001127586      4,126,938,857      clockticks
       1.001127586      4,121,564,277      clockticks
       2.001686014      3,953,806,350      clockticks
       2.001686014      3,953,809,541      clockticks
       3.003121403      4,137,750,252      clockticks
       3.003121403      4,137,749,048      clockticks

I also tested `-A`, `--per-socket`, `--per-die` and `--per-core`; all
look good.

Ian Rogers (2):
  perf evlist: Make uniquifying counter names consistent
  perf parse-events: Use wildcard processing to set an event to merge
    into

 tools/perf/builtin-record.c    |   7 +-
 tools/perf/builtin-top.c       |   7 +-
 tools/perf/util/evlist.c       |  66 +++++++++-----
 tools/perf/util/evlist.h       |   3 +-
 tools/perf/util/evsel.c        | 116 ++++++++++++++++++++++++-
 tools/perf/util/evsel.h        |  11 ++-
 tools/perf/util/parse-events.c |  45 ++++++----
 tools/perf/util/stat-display.c | 151 +--------------------------------
 tools/perf/util/stat.c         |  40 +--------
 9 files changed, 214 insertions(+), 232 deletions(-)

--
2.49.0.472.ge94155a9ec-goog
Re: [PATCH v1 0/2] Fix incorrect counts when count the same uncore event multiple times
Posted by Ian Rogers 8 months, 3 weeks ago
On Wed, Mar 26, 2025 at 4:49 PM Chun-Tse Shao <ctshao@google.com> wrote:
>
> Let's take a look at an example: the machine is SKX with 6 IMC devices.
>
>   perf stat -e clockticks,clockticks -I 1000
>   #           time             counts unit events
>        1.001127430      6,901,503,174      uncore_imc_0/clockticks/
>        1.001127430      3,940,896,301      uncore_imc_0/clockticks/
>        2.002649722        988,376,876      uncore_imc_0/clockticks/
>        2.002649722        988,376,141      uncore_imc_0/clockticks/
>        3.004071319      1,000,292,675      uncore_imc_0/clockticks/
>        3.004071319      1,000,294,160      uncore_imc_0/clockticks/
>
> 1) The event names should not be uniquified.
> 2) The initial count for the first `clockticks` is doubled.
> 3) Subsequent counts are only reported for the first IMC device.
>
> The first patch fixes 1) and 3), and the second patch fixes 2).
>
> After these fixes:
>
>   perf stat -e clockticks,clockticks -I 1000
>   #           time             counts unit events
>        1.001127586      4,126,938,857      clockticks
>        1.001127586      4,121,564,277      clockticks
>        2.001686014      3,953,806,350      clockticks
>        2.001686014      3,953,809,541      clockticks
>        3.003121403      4,137,750,252      clockticks
>        3.003121403      4,137,749,048      clockticks
>
> I also tested `-A`, `--per-socket`, `--per-die` and `--per-core`; all
> look good.

Thanks CT, I tested on hybrid and it looks good. I did notice a
regression with hwmon:

Before:
```
$ perf stat -e data_read,temp1 -a sleep 0.1

Performance counter stats for 'system wide':

           212.12 MiB  data_read
18,446,744,073,709,284.00 'C   hwmon_acpitz/temp1/
            46.00 'C   hwmon_coretemp/temp1/
            32.00 'C   hwmon_iwlwifi_1/temp1/
            40.85 'C   hwmon_nvme/temp1/
            47.00 'C   hwmon_spd5118/temp1/
```

After:
```
$ perf stat -e data_read,temp1 -a sleep 0.1

Performance counter stats for 'system wide':

           213.08 MiB  data_read
18,446,744,073,709,448.00 'C   temp1
```

So we're no longer uniquifying the hwmon events; I'll look into a
fix. I also need to open a bug to use my acpitz device to solve the
world's energy problems as it is currently running with 5% of the
surface temperature of the Sun.
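
As an aside, the acpitz value is what a negative reading would look like
if it were stored as an unsigned 64-bit count and then scaled by 0.001.
That interpretation is an assumption and is not confirmed anywhere in this
thread. The helpers below are hypothetical plain Python, not perf code,
and the recovered value is only approximate because the printed number has
already been rounded.

```python
U64 = 1 << 64


def as_u64(value):
    """Wrap a signed integer into the unsigned 64-bit range."""
    return value % U64


def to_signed64(raw):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    return raw - U64 if raw >= (1 << 63) else raw


# Displayed "Before" value for hwmon_acpitz/temp1/, in degrees C.
displayed = 18_446_744_073_709_284

# Undo the assumed 0.001 scale, then reinterpret as signed.
milli_degrees = to_signed64(displayed * 1000)  # -267616
```

Under that assumption the device would be reporting roughly -267.6 'C,
which suggests a bogus sensor reading rather than a hot one.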

Thanks,
Ian

> Ian Rogers (2):
>   perf evlist: Make uniquifying counter names consistent
>   perf parse-events: Use wildcard processing to set an event to merge
>     into
>
>  tools/perf/builtin-record.c    |   7 +-
>  tools/perf/builtin-top.c       |   7 +-
>  tools/perf/util/evlist.c       |  66 +++++++++-----
>  tools/perf/util/evlist.h       |   3 +-
>  tools/perf/util/evsel.c        | 116 ++++++++++++++++++++++++-
>  tools/perf/util/evsel.h        |  11 ++-
>  tools/perf/util/parse-events.c |  45 ++++++----
>  tools/perf/util/stat-display.c | 151 +--------------------------------
>  tools/perf/util/stat.c         |  40 +--------
>  9 files changed, 214 insertions(+), 232 deletions(-)
>
> --
> 2.49.0.472.ge94155a9ec-goog
>