[PATCH v1 0/3] Metric related performance improvements

Ian Rogers posted 3 patches 10 months ago
There is a newer version of this series
[PATCH v1 0/3] Metric related performance improvements
Posted by Ian Rogers 10 months ago
The "PMU JSON event tests" have been running slowly. These changes
speed them up, with the test running 8 to 10 times faster.

The first patch changes alias lookup from a linear search by name
through a list to a hashmap. With a fast hashmap__find, testing
whether an event exists needn't load from disk if the event is
already present.
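
As a rough illustration of the idea only (a Python sketch, not the perf
C code; `Alias`, `find_in_list`, `AliasTable` and the `load_from_disk`
callback are hypothetical names made up for this example), the list
walk and the keyed lookup might be contrasted like this:

```python
from dataclasses import dataclass


@dataclass
class Alias:
    """Hypothetical stand-in for a PMU event alias."""
    name: str
    config: int


def find_in_list(aliases, name):
    """List approach: walk every alias until the name matches, O(n)."""
    for alias in aliases:
        if alias.name == name:
            return alias
    return None


class AliasTable:
    """Hashmap approach: key aliases by name, O(1) lookup on average."""

    def __init__(self):
        self._by_name = {}

    def add(self, alias):
        self._by_name[alias.name] = alias

    def find(self, name):
        return self._by_name.get(name)

    def have_event(self, name, load_from_disk):
        """Only call the slow loader when the name is not already known."""
        if name in self._by_name:
            return True
        for alias in load_from_disk():
            self.add(alias)
        return name in self._by_name
```

In this sketch a second `have_event` call for an already loaded name
returns without invoking the loader at all, which is the saving
described above.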

The second patch switches the fncache to use a hashmap rather than
its own hash table with a limited number of buckets. When there are
many filename queries, such as with a test, the previous fncache
approach suffers many collisions, which leads to linear searching of
the entries.

The final patch adds a find function for metrics. Normally metrics can
match by name and group; however, only name matching happens when one
metric refers to another. As we test every "id" in a metric to see if
it is a metric, the find function can dominate performance as it
linearly searches all metrics. Add a find function for the metrics
table so that a metric can be found by name with a binary search.
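
The lookup can be sketched in Python as follows. This is an
illustration, not the generated C table code: it assumes the table is
kept sorted by metric name with a plain string comparison, and
`build_metric_table`, `find_metric` and `referenced_metrics` are
hypothetical names.

```python
def build_metric_table(metrics):
    """Sort (name, expression) pairs by name once, when the table is built."""
    return sorted(metrics, key=lambda metric: metric[0])


def find_metric(table, name):
    """Binary search a name-sorted table: O(log n) instead of O(n)."""
    lo, hi = 0, len(table)
    while lo < hi:
        mid = (lo + hi) // 2
        if table[mid][0] < name:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(table) and table[lo][0] == name:
        return table[lo]
    return None


def referenced_metrics(table, ids):
    """Test every id in an expression to see whether it names a metric."""
    return [i for i in ids if find_metric(table, i) is not None]
```

Since every id of every metric goes through `find_metric`, the cost
per id matters far more than the one-off cost of sorting the table.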

Before these changes:
```
$ time perf test -v 10
 10: PMU JSON event tests                                            :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok

real    0m18.499s
user    0m18.150s
sys     0m3.273s
```

After these changes:
```
$ time perf test -v 10
 10: PMU JSON event tests                                            :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok

real    0m2.338s
user    0m1.797s
sys     0m2.186s
```
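
For reference, the speedup implied by the two wall-clock ("real")
times above can be worked out directly. This is plain arithmetic on
the numbers in this message; `speedup` is just a helper name for the
example.

```python
def speedup(before_s, after_s):
    """Ratio of wall-clock times: how many times faster the second run is."""
    return before_s / after_s


# Real times from the two runs above: 18.499s before, 2.338s after.
print(f"{speedup(18.499, 2.338):.1f}x")  # prints "7.9x"
```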

Ian Rogers (3):
  perf pmu: Change aliases from list to hashmap
  perf fncache: Switch to using hashmap
  perf metricgroup: Binary search when resolving referred to metrics

 tools/perf/builtin-stat.c                |   6 +-
 tools/perf/pmu-events/empty-pmu-events.c |  66 ++++++++-
 tools/perf/pmu-events/jevents.py         |  66 ++++++++-
 tools/perf/pmu-events/pmu-events.h       |  23 +++-
 tools/perf/tests/pmu-events.c            | 129 +++++++++--------
 tools/perf/util/fncache.c                |  69 +++++-----
 tools/perf/util/fncache.h                |   1 -
 tools/perf/util/hwmon_pmu.c              |  43 +++---
 tools/perf/util/metricgroup.c            | 102 ++++++--------
 tools/perf/util/metricgroup.h            |   2 +-
 tools/perf/util/pmu.c                    | 167 +++++++++++++++--------
 tools/perf/util/pmu.h                    |   4 +-
 tools/perf/util/srccode.c                |   4 +-
 tools/perf/util/tool_pmu.c               |  17 +--
 14 files changed, 430 insertions(+), 269 deletions(-)

-- 
2.49.0.504.g3bcea36a83-goog
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Namhyung Kim 10 months ago
Hi Ian,

On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> The "PMU JSON event tests" have been running slowly, these changes
> target improving them with an improvement of the test running 8 to 10
> times faster.
> 
> The first patch changes from searching through all aliases by name in
> a list to using a hashmap. Doing a fast hashmap__find means testing
> for having an event needn't load from disk if an event is already
> present.
> 
> The second patch switch the fncache to use a hashmap rather than its
> own hashmap with a limited number of buckets. When there are many
> filename queries, such as with a test, there are many collisions with
> the previous fncache approach leading to linear searching of the
> entries.
> 
> The final patch adds a find function for metrics. Normally metrics can
> match by name and group, however, only name matching happens when one
> metric refers to another. As we test every "id" in a metric to see if
> it is a metric, the find function can dominate performance as it
> linearly searches all metrics. Add a find function for the metrics
> table so that a metric can be found by name with a binary search.
> 
> Before these changes:
> ```
> $ time perf test -v 10
>  10: PMU JSON event tests                                            :
>  10.1: PMU event table sanity                                        : Ok
>  10.2: PMU event map aliases                                         : Ok
>  10.3: Parsing of PMU event table metrics                            : Ok
>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> 
> real    0m18.499s
> user    0m18.150s
> sys     0m3.273s
> ```
> 
> After these changes:
> ```
> $ time perf test -v 10
>  10: PMU JSON event tests                                            :
>  10.1: PMU event table sanity                                        : Ok
>  10.2: PMU event map aliases                                         : Ok
>  10.3: Parsing of PMU event table metrics                            : Ok
>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> 
> real    0m2.338s
> user    0m1.797s
> sys     0m2.186s
> ```

Great, I also see the speedup on my machine from 32s to 3s.

Tested-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung

Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 8 months, 4 weeks ago
On Wed, Apr 09, 2025 at 11:49:16PM -0700, Namhyung Kim wrote:
> Hi Ian,
> 
> On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > The "PMU JSON event tests" have been running slowly, these changes
> > target improving them with an improvement of the test running 8 to 10
> > times faster.
> > 
> > The first patch changes from searching through all aliases by name in
> > a list to using a hashmap. Doing a fast hashmap__find means testing
> > for having an event needn't load from disk if an event is already
> > present.
> > 
> > The second patch switch the fncache to use a hashmap rather than its
> > own hashmap with a limited number of buckets. When there are many
> > filename queries, such as with a test, there are many collisions with
> > the previous fncache approach leading to linear searching of the
> > entries.
> > 
> > The final patch adds a find function for metrics. Normally metrics can
> > match by name and group, however, only name matching happens when one
> > metric refers to another. As we test every "id" in a metric to see if
> > it is a metric, the find function can dominate performance as it
> > linearly searches all metrics. Add a find function for the metrics
> > table so that a metric can be found by name with a binary search.
> > 
> > Before these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > 
> > real    0m18.499s
> > user    0m18.150s
> > sys     0m3.273s
> > ```
> > 
> > After these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > 
> > real    0m2.338s
> > user    0m1.797s
> > sys     0m2.186s
> > ```
> 
> Great, I also see the speedup on my machine from 32s to 3s.
> 
> Tested-by: Namhyung Kim <namhyung@kernel.org>

I'm collecting this for v2 as well, ok? Holler if you disagree.

- Arnaldo
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 8 months, 4 weeks ago
On Tue, May 13, 2025 at 04:34:28PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Apr 09, 2025 at 11:49:16PM -0700, Namhyung Kim wrote:
> > Great, I also see the speedup on my machine from 32s to 3s.

> > Tested-by: Namhyung Kim <namhyung@kernel.org>
 
> I'm collecting this for v2 as well, ok? Holler if you disagree.

BTW, on my workstation:

Before:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real  0m9.286s
  user  0m9.354s
  sys   0m0.062s
  root@number:~#

After:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real  0m0.689s
  user  0m0.766s
  sys   0m0.042s
  root@number:~# time perf test 10
   10: PMU JSON event tests                                            :
   10.1: PMU event table sanity                                        : Ok
   10.2: PMU event map aliases                                         : Ok
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
   10.5: Parsing of metric thresholds with fake PMUs                   : Ok

  real  0m0.696s
  user  0m0.807s
  sys   0m0.064s
  root@number:~#
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Ian Rogers 9 months, 2 weeks ago
On Wed, Apr 9, 2025 at 11:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > The "PMU JSON event tests" have been running slowly, these changes
> > target improving them with an improvement of the test running 8 to 10
> > times faster.
> >
> > The first patch changes from searching through all aliases by name in
> > a list to using a hashmap. Doing a fast hashmap__find means testing
> > for having an event needn't load from disk if an event is already
> > present.
> >
> > The second patch switch the fncache to use a hashmap rather than its
> > own hashmap with a limited number of buckets. When there are many
> > filename queries, such as with a test, there are many collisions with
> > the previous fncache approach leading to linear searching of the
> > entries.
> >
> > The final patch adds a find function for metrics. Normally metrics can
> > match by name and group, however, only name matching happens when one
> > metric refers to another. As we test every "id" in a metric to see if
> > it is a metric, the find function can dominate performance as it
> > linearly searches all metrics. Add a find function for the metrics
> > table so that a metric can be found by name with a binary search.
> >
> > Before these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> >
> > real    0m18.499s
> > user    0m18.150s
> > sys     0m3.273s
> > ```
> >
> > After these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> >
> > real    0m2.338s
> > user    0m1.797s
> > sys     0m2.186s
> > ```
>
> Great, I also see the speedup on my machine from 32s to 3s.
>
> Tested-by: Namhyung Kim <namhyung@kernel.org>

Ping.

Thanks,
Ian

Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 9 months ago
On Wed, Apr 23, 2025 at 01:48:22PM -0700, Ian Rogers wrote:
> On Wed, Apr 9, 2025 at 11:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > > The "PMU JSON event tests" have been running slowly, these changes
> > > target improving them with an improvement of the test running 8 to 10
> > > times faster.
> > >
> > > The first patch changes from searching through all aliases by name in
> > > a list to using a hashmap. Doing a fast hashmap__find means testing
> > > for having an event needn't load from disk if an event is already
> > > present.
> > >
> > > The second patch switch the fncache to use a hashmap rather than its
> > > own hashmap with a limited number of buckets. When there are many
> > > filename queries, such as with a test, there are many collisions with
> > > the previous fncache approach leading to linear searching of the
> > > entries.
> > >
> > > The final patch adds a find function for metrics. Normally metrics can
> > > match by name and group, however, only name matching happens when one
> > > metric refers to another. As we test every "id" in a metric to see if
> > > it is a metric, the find function can dominate performance as it
> > > linearly searches all metrics. Add a find function for the metrics
> > > table so that a metric can be found by name with a binary search.
> > >
> > > Before these changes:
> > > ```
> > > $ time perf test -v 10
> > >  10: PMU JSON event tests                                            :
> > >  10.1: PMU event table sanity                                        : Ok
> > >  10.2: PMU event map aliases                                         : Ok
> > >  10.3: Parsing of PMU event table metrics                            : Ok
> > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > >
> > > real    0m18.499s
> > > user    0m18.150s
> > > sys     0m3.273s
> > > ```
> > >
> > > After these changes:
> > > ```
> > > $ time perf test -v 10
> > >  10: PMU JSON event tests                                            :
> > >  10.1: PMU event table sanity                                        : Ok
> > >  10.2: PMU event map aliases                                         : Ok
> > >  10.3: Parsing of PMU event table metrics                            : Ok
> > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > >
> > > real    0m2.338s
> > > user    0m1.797s
> > > sys     0m2.186s
> > > ```
> >
> > Great, I also see the speedup on my machine from 32s to 3s.
> >
> > Tested-by: Namhyung Kim <namhyung@kernel.org>
> 
> Ping.

I'll try to fix it up later, if you don't beat me to it. I'll continue
with the other patches you listed to get the ones that apply merged:

Total patches: 3
---
Cover: ./20250409_irogers_metric_related_performance_improvements.cover
 Link: https://lore.kernel.org/r/20250410044532.52017-1-irogers@google.com
 Base: not specified
       git am ./20250409_irogers_metric_related_performance_improvements.mbx
⬢ [acme@toolbx perf-tools-next]$        git am ./20250409_irogers_metric_related_performance_improvements.mbx
Applying: perf pmu: Change aliases from list to hashmap
error: patch failed: tools/perf/util/pmu.c:532
error: tools/perf/util/pmu.c: patch does not apply
Patch failed at 0001 perf pmu: Change aliases from list to hashmap
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
⬢ [acme@toolbx perf-tools-next]$ 
⬢ [acme@toolbx perf-tools-next]$ git am --abort
⬢ [acme@toolbx perf-tools-next]$ patch -p1 < ./20250409_irogers_metric_related_performance_improvements.mbx
patching file tools/perf/tests/pmu-events.c
patching file tools/perf/util/hwmon_pmu.c
patching file tools/perf/util/pmu.c
Hunk #3 succeeded at 417 (offset 11 lines).
Hunk #4 succeeded at 451 (offset 11 lines).
Hunk #5 FAILED at 541.
Hunk #6 succeeded at 657 (offset 41 lines).
Hunk #7 succeeded at 1146 (offset 41 lines).
Hunk #8 succeeded at 1238 (offset 41 lines).
Hunk #9 succeeded at 1259 (offset 41 lines).
Hunk #10 succeeded at 2018 (offset 48 lines).
Hunk #11 succeeded at 2033 (offset 48 lines).
Hunk #12 succeeded at 2502 (offset 59 lines).
Hunk #13 succeeded at 2522 (offset 59 lines).
1 out of 13 hunks FAILED -- saving rejects to file tools/perf/util/pmu.c.rej
patching file tools/perf/util/pmu.h
Hunk #3 succeeded at 295 (offset 5 lines).
patching file tools/perf/util/tool_pmu.c
Hunk #1 succeeded at 502 (offset 6 lines).
patching file tools/perf/util/fncache.c
patching file tools/perf/util/fncache.h
patching file tools/perf/util/srccode.c
patching file tools/perf/builtin-stat.c
Hunk #1 succeeded at 1854 (offset -2 lines).
Hunk #2 succeeded at 1888 (offset -2 lines).
Hunk #3 succeeded at 1978 (offset -2 lines).
patching file tools/perf/pmu-events/empty-pmu-events.c
Hunk #1 succeeded at 449 (offset 6 lines).
Hunk #2 succeeded at 495 (offset 6 lines).
Hunk #3 succeeded at 552 (offset 6 lines).
patching file tools/perf/pmu-events/jevents.py
Hunk #1 succeeded at 972 (offset 6 lines).
Hunk #2 succeeded at 1018 (offset 6 lines).
Hunk #3 succeeded at 1075 (offset 6 lines).
patching file tools/perf/pmu-events/pmu-events.h
Hunk #1 succeeded at 74 (offset 3 lines).
Hunk #2 succeeded at 89 (offset 3 lines).
Hunk #3 succeeded at 105 (offset 3 lines).
patching file tools/perf/util/metricgroup.c
patching file tools/perf/util/metricgroup.h
⬢ [acme@toolbx perf-tools-next]$
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Ian Rogers 9 months ago
On Mon, May 12, 2025 at 9:40 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Wed, Apr 23, 2025 at 01:48:22PM -0700, Ian Rogers wrote:
> > On Wed, Apr 9, 2025 at 11:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Ian,
> > >
> > > On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > > > The "PMU JSON event tests" have been running slowly, these changes
> > > > target improving them with an improvement of the test running 8 to 10
> > > > times faster.
> > > >
> > > > The first patch changes from searching through all aliases by name in
> > > > a list to using a hashmap. Doing a fast hashmap__find means testing
> > > > for having an event needn't load from disk if an event is already
> > > > present.
> > > >
> > > > The second patch switch the fncache to use a hashmap rather than its
> > > > own hashmap with a limited number of buckets. When there are many
> > > > filename queries, such as with a test, there are many collisions with
> > > > the previous fncache approach leading to linear searching of the
> > > > entries.
> > > >
> > > > The final patch adds a find function for metrics. Normally metrics can
> > > > match by name and group, however, only name matching happens when one
> > > > metric refers to another. As we test every "id" in a metric to see if
> > > > it is a metric, the find function can dominate performance as it
> > > > linearly searches all metrics. Add a find function for the metrics
> > > > table so that a metric can be found by name with a binary search.
> > > >
> > > > Before these changes:
> > > > ```
> > > > $ time perf test -v 10
> > > >  10: PMU JSON event tests                                            :
> > > >  10.1: PMU event table sanity                                        : Ok
> > > >  10.2: PMU event map aliases                                         : Ok
> > > >  10.3: Parsing of PMU event table metrics                            : Ok
> > > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > > >
> > > > real    0m18.499s
> > > > user    0m18.150s
> > > > sys     0m3.273s
> > > > ```
> > > >
> > > > After these changes:
> > > > ```
> > > > $ time perf test -v 10
> > > >  10: PMU JSON event tests                                            :
> > > >  10.1: PMU event table sanity                                        : Ok
> > > >  10.2: PMU event map aliases                                         : Ok
> > > >  10.3: Parsing of PMU event table metrics                            : Ok
> > > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > > >
> > > > real    0m2.338s
> > > > user    0m1.797s
> > > > sys     0m2.186s
> > > > ```
> > >
> > > Great, I also see the speedup on my machine from 32s to 3s.
> > >
> > > Tested-by: Namhyung Kim <namhyung@kernel.org>
> >
> > Ping.
>
> I'll try to fix up it later, if you don't beat me to it, will continue
> with the other patches you listed to get the ones that applies merged:
>
> Total patches: 3
> ---
> Cover: ./20250409_irogers_metric_related_performance_improvements.cover
>  Link: https://lore.kernel.org/r/20250410044532.52017-1-irogers@google.com
>  Base: not specified
>        git am ./20250409_irogers_metric_related_performance_improvements.mbx
> ⬢ [acme@toolbx perf-tools-next]$        git am ./20250409_irogers_metric_related_performance_improvements.mbx
> Applying: perf pmu: Change aliases from list to hashmap
> error: patch failed: tools/perf/util/pmu.c:532
> error: tools/perf/util/pmu.c: patch does not apply
> Patch failed at 0001 perf pmu: Change aliases from list to hashmap
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> hint: When you have resolved this problem, run "git am --continue".
> hint: If you prefer to skip this patch, run "git am --skip" instead.
> hint: To restore the original branch and stop patching, run "git am --abort".
> hint: Disable this message with "git config set advice.mergeConflict false"
> ⬢ [acme@toolbx perf-tools-next]$
> ⬢ [acme@toolbx perf-tools-next]$ git am --abort
> ⬢ [acme@toolbx perf-tools-next]$ patch -p1 < ./20250409_irogers_metric_related_performance_improvements.mbx
> patching file tools/perf/tests/pmu-events.c
> patching file tools/perf/util/hwmon_pmu.c
> patching file tools/perf/util/pmu.c
> Hunk #3 succeeded at 417 (offset 11 lines).
> Hunk #4 succeeded at 451 (offset 11 lines).
> Hunk #5 FAILED at 541.
> Hunk #6 succeeded at 657 (offset 41 lines).
> Hunk #7 succeeded at 1146 (offset 41 lines).
> Hunk #8 succeeded at 1238 (offset 41 lines).
> Hunk #9 succeeded at 1259 (offset 41 lines).
> Hunk #10 succeeded at 2018 (offset 48 lines).
> Hunk #11 succeeded at 2033 (offset 48 lines).
> Hunk #12 succeeded at 2502 (offset 59 lines).
> Hunk #13 succeeded at 2522 (offset 59 lines).
> 1 out of 13 hunks FAILED -- saving rejects to file tools/perf/util/pmu.c.rej
> patching file tools/perf/util/pmu.h
> Hunk #3 succeeded at 295 (offset 5 lines).
> patching file tools/perf/util/tool_pmu.c
> Hunk #1 succeeded at 502 (offset 6 lines).
> patching file tools/perf/util/fncache.c
> patching file tools/perf/util/fncache.h
> patching file tools/perf/util/srccode.c
> patching file tools/perf/builtin-stat.c
> Hunk #1 succeeded at 1854 (offset -2 lines).
> Hunk #2 succeeded at 1888 (offset -2 lines).
> Hunk #3 succeeded at 1978 (offset -2 lines).
> patching file tools/perf/pmu-events/empty-pmu-events.c
> Hunk #1 succeeded at 449 (offset 6 lines).
> Hunk #2 succeeded at 495 (offset 6 lines).
> Hunk #3 succeeded at 552 (offset 6 lines).
> patching file tools/perf/pmu-events/jevents.py
> Hunk #1 succeeded at 972 (offset 6 lines).
> Hunk #2 succeeded at 1018 (offset 6 lines).
> Hunk #3 succeeded at 1075 (offset 6 lines).
> patching file tools/perf/pmu-events/pmu-events.h
> Hunk #1 succeeded at 74 (offset 3 lines).
> Hunk #2 succeeded at 89 (offset 3 lines).
> Hunk #3 succeeded at 105 (offset 3 lines).
> patching file tools/perf/util/metricgroup.c
> patching file tools/perf/util/metricgroup.h
> ⬢ [acme@toolbx perf-tools-next]$

Thanks Arnaldo! Happy to send a rebase on tmp.perf-tools-next if useful.

Thanks,
Ian
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 9 months ago
On Mon, May 12, 2025 at 09:57:45AM -0700, Ian Rogers wrote:
> On Mon, May 12, 2025 at 9:40 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Hunk #4 succeeded at 451 (offset 11 lines).
> > Hunk #5 FAILED at 541.
> > Hunk #6 succeeded at 657 (offset 41 lines).
> > ⬢ [acme@toolbx perf-tools-next]$
 
> Thanks Arnaldo! Happy to send a rebase on tmp.perf-tools-next if useful.

Sure, I just pushed what I have:

⬢ [acme@toolbx perf-tools-next]$ git log --oneline -10
255f5b6d060be5a4 (HEAD -> perf-tools-next, x1/perf-tools-next, x1/HEAD, perf-tools-next/tmp.perf-tools-next, five/perf-tools-next, five/HEAD) perf parse-events: Add "cpu" term to set the CPU an event is recorded on
168c7b509109fe26 perf parse-events: Set is_pmu_core for legacy hardware events
f60c3f44689ac2bc perf stat: Use counter cpumask to skip zero values
2e7a2f7f3c6e3a99 libperf cpumap: Add ability to create CPU from a single CPU number
365e02ddb65d443f perf tests metrics: Permission related fixes
f0869f31562bde2e perf evsel: Add per-thread warning for EOPNOTSUPP open failues
17e548405a81665f perf scripts python: exported-sql-viewer.py: Fix pattern matching with Python 3
352b088164b5cde1 perf intel-pt: Do not default to recording all switch events
e00eac6b5b6d956f perf intel-pt: Fix PEBS-via-PT data_src
cd17a9b1a779459d (perf-tools-next/perf-tools-next) perf test demangle-ocaml: Switch to using dso__demangle_sym()
⬢ [acme@toolbx perf-tools-next]$

- Arnaldo