[PATCH v1 0/3] Metric related performance improvements

Ian Rogers posted 3 patches 8 months, 1 week ago
There is a newer version of this series
[PATCH v1 0/3] Metric related performance improvements
Posted by Ian Rogers 8 months, 1 week ago
The "PMU JSON event tests" have been running slowly. These changes
speed them up, making the test run 8 to 10 times faster.

The first patch changes from searching through all aliases by name in
a list to using a hashmap. With a fast hashmap__find, testing whether
an event exists needn't load from disk if the event is already
present.
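
As a rough illustration of the idea (not the code from the patch), the
sketch below keeps aliases in a small chained hash table keyed by name and
only calls a slow loader on a miss. Everything here is an assumption made
for the example: struct alias, alias_table_find(), alias_table_add(),
have_event() and the load callback are hypothetical names, and the bucket
count is fixed for brevity, whereas a real hashmap would grow as entries
are added.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a PMU alias; not perf's real alias struct. */
struct alias {
	const char *name;
	struct alias *next;	/* next entry in the same bucket */
};

#define NBUCKETS 64		/* fixed only to keep the sketch short */

struct alias_table {
	struct alias *buckets[NBUCKETS];
};

/* FNV-1a style string hash. */
size_t hash_name(const char *s)
{
	size_t h = 2166136261u;

	for (; *s; s++) {
		h ^= (unsigned char)*s;
		h *= 16777619u;
	}
	return h;
}

/* Return the alias called name, or NULL. Only one bucket is walked. */
struct alias *alias_table_find(struct alias_table *t, const char *name)
{
	struct alias *a;

	assert(t && name);
	for (a = t->buckets[hash_name(name) % NBUCKETS]; a; a = a->next) {
		if (!strcmp(a->name, name))
			return a;
	}
	return NULL;
}

/* Insert a unless the name is already present; return the stored entry. */
struct alias *alias_table_add(struct alias_table *t, struct alias *a)
{
	struct alias *old = alias_table_find(t, a->name);
	size_t b;

	if (old)
		return old;
	b = hash_name(a->name) % NBUCKETS;
	a->next = t->buckets[b];
	t->buckets[b] = a;
	return a;
}

/*
 * "Have event" check: look in memory first and call the slow loader
 * (for example one that reads from disk) only on a miss. The load
 * callback is hypothetical and may be NULL.
 */
int have_event(struct alias_table *t, const char *name,
	       struct alias *(*load)(const char *name))
{
	struct alias *a;

	if (alias_table_find(t, name))
		return 1;
	a = load ? load(name) : NULL;
	if (!a)
		return 0;
	alias_table_add(t, a);
	return 1;
}
```

With a zero-initialized struct alias_table, a second have_event() call for
the same name returns from the in-memory lookup and never reaches the
loader.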

The second patch switches the fncache to use a hashmap rather than its
own hash table with a limited number of buckets. When there are many
filename queries, such as with a test, there are many collisions with
the previous fncache approach leading to linear searching of the
entries.
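
To make the collision argument concrete, here is a minimal sketch (not
taken from fncache.c) that hashes a set of keys into a given number of
buckets and reports the longest chain, which is the worst-case number of
string compares for one lookup. The names hash_str() and longest_chain()
are hypothetical. With n keys and b buckets some chain holds at least
n / b entries, so a fixed b makes lookups degrade linearly as n grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* djb2-style string hash; any reasonable hash shows the same effect. */
static size_t hash_str(const char *s)
{
	size_t h = 5381;

	for (; *s; s++)
		h = h * 33 + (unsigned char)*s;
	return h;
}

/*
 * Hash n keys into nbuckets chains and return the longest chain length.
 * Returns 0 for empty input or if the counter array cannot be allocated.
 */
size_t longest_chain(const char *const *keys, size_t n, size_t nbuckets)
{
	size_t *len, i, max = 0;

	assert(nbuckets > 0);
	len = calloc(nbuckets, sizeof(*len));
	if (!len)
		return 0;
	for (i = 0; i < n; i++) {
		size_t b = hash_str(keys[i]) % nbuckets;

		if (++len[b] > max)
			max = len[b];
	}
	free(len);
	return max;
}
```

Calling longest_chain() with the same keys and a bucket count that scales
with n, as a resizing hashmap effectively does, keeps the chains short for
a well-distributed hash.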

The final patch adds a find function for metrics. Normally metrics can
match by name and group, however, only name matching happens when one
metric refers to another. As we test every "id" in a metric to see if
it is a metric, the find function can dominate performance as it
linearly searches all metrics. Add a find function for the metrics
table so that a metric can be found by name with a binary search.
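
The name-only lookup can be sketched with the C library's bsearch(). This
is an illustration under stated assumptions rather than the generated
pmu-events code: struct metric, example_metrics and find_metric() are
hypothetical, the table is assumed to be sorted by strcmp() order of the
name, and names are compared exactly (the real matching rules may
differ).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical metric entry; perf's generated tables are laid out differently. */
struct metric {
	const char *name;
	const char *expr;
};

/* Example table, sorted by strcmp() order (uppercase sorts before lowercase). */
const struct metric example_metrics[] = {
	{ "CPI", "cycles / instructions" },
	{ "IPC", "instructions / cycles" },
	{ "branch_miss_rate", "branch-misses / branches" },
};

/* bsearch() comparator: key is the name string, elem is a table entry. */
static int cmp_name(const void *key, const void *elem)
{
	const struct metric *m = elem;

	return strcmp(key, m->name);
}

/* Find a metric by exact name in O(log n) compares, or return NULL. */
const struct metric *find_metric(const struct metric *table, size_t n,
				 const char *name)
{
	assert(table && name);
	return bsearch(name, table, n, sizeof(*table), cmp_name);
}
```

For example, find_metric(example_metrics, 3, "IPC") returns a pointer to
the second entry, while an "id" that is not a metric name yields NULL
after about log2(n) string compares instead of a scan of the whole
table.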

Before these changes:
```
$ time perf test -v 10
 10: PMU JSON event tests                                            :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok

real    0m18.499s
user    0m18.150s
sys     0m3.273s
```

After these changes:
```
$ time perf test -v 10
 10: PMU JSON event tests                                            :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 10.5: Parsing of metric thresholds with fake PMUs                   : Ok

real    0m2.338s
user    0m1.797s
sys     0m2.186s
```

Ian Rogers (3):
  perf pmu: Change aliases from list to hashmap
  perf fncache: Switch to using hashmap
  perf metricgroup: Binary search when resolving referred to metrics

 tools/perf/builtin-stat.c                |   6 +-
 tools/perf/pmu-events/empty-pmu-events.c |  66 ++++++++-
 tools/perf/pmu-events/jevents.py         |  66 ++++++++-
 tools/perf/pmu-events/pmu-events.h       |  23 +++-
 tools/perf/tests/pmu-events.c            | 129 +++++++++--------
 tools/perf/util/fncache.c                |  69 +++++-----
 tools/perf/util/fncache.h                |   1 -
 tools/perf/util/hwmon_pmu.c              |  43 +++---
 tools/perf/util/metricgroup.c            | 102 ++++++--------
 tools/perf/util/metricgroup.h            |   2 +-
 tools/perf/util/pmu.c                    | 167 +++++++++++++++--------
 tools/perf/util/pmu.h                    |   4 +-
 tools/perf/util/srccode.c                |   4 +-
 tools/perf/util/tool_pmu.c               |  17 +--
 14 files changed, 430 insertions(+), 269 deletions(-)

-- 
2.49.0.504.g3bcea36a83-goog
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Namhyung Kim 8 months, 1 week ago
Hi Ian,

On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> The "PMU JSON event tests" have been running slowly, these changes
> target improving them with an improvement of the test running 8 to 10
> times faster.
> 
> The first patch changes from searching through all aliases by name in
> a list to using a hashmap. Doing a fast hashmap__find means testing
> for having an event needn't load from disk if an event is already
> present.
> 
> The second patch switches the fncache to use a hashmap rather than its
> own hashmap with a limited number of buckets. When there are many
> filename queries, such as with a test, there are many collisions with
> the previous fncache approach leading to linear searching of the
> entries.
> 
> The final patch adds a find function for metrics. Normally metrics can
> match by name and group, however, only name matching happens when one
> metric refers to another. As we test every "id" in a metric to see if
> it is a metric, the find function can dominate performance as it
> linearly searches all metrics. Add a find function for the metrics
> table so that a metric can be found by name with a binary search.
> 
> Before these changes:
> ```
> $ time perf test -v 10
>  10: PMU JSON event tests                                            :
>  10.1: PMU event table sanity                                        : Ok
>  10.2: PMU event map aliases                                         : Ok
>  10.3: Parsing of PMU event table metrics                            : Ok
>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> 
> real    0m18.499s
> user    0m18.150s
> sys     0m3.273s
> ```
> 
> After these changes:
> ```
> $ time perf test -v 10
>  10: PMU JSON event tests                                            :
>  10.1: PMU event table sanity                                        : Ok
>  10.2: PMU event map aliases                                         : Ok
>  10.3: Parsing of PMU event table metrics                            : Ok
>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> 
> real    0m2.338s
> user    0m1.797s
> sys     0m2.186s
> ```

Great, I also see the speedup on my machine from 32s to 3s.

Tested-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung

> 
> Ian Rogers (3):
>   perf pmu: Change aliases from list to hashmap
>   perf fncache: Switch to using hashmap
>   perf metricgroup: Binary search when resolving referred to metrics
> 
>  tools/perf/builtin-stat.c                |   6 +-
>  tools/perf/pmu-events/empty-pmu-events.c |  66 ++++++++-
>  tools/perf/pmu-events/jevents.py         |  66 ++++++++-
>  tools/perf/pmu-events/pmu-events.h       |  23 +++-
>  tools/perf/tests/pmu-events.c            | 129 +++++++++--------
>  tools/perf/util/fncache.c                |  69 +++++-----
>  tools/perf/util/fncache.h                |   1 -
>  tools/perf/util/hwmon_pmu.c              |  43 +++---
>  tools/perf/util/metricgroup.c            | 102 ++++++--------
>  tools/perf/util/metricgroup.h            |   2 +-
>  tools/perf/util/pmu.c                    | 167 +++++++++++++++--------
>  tools/perf/util/pmu.h                    |   4 +-
>  tools/perf/util/srccode.c                |   4 +-
>  tools/perf/util/tool_pmu.c               |  17 +--
>  14 files changed, 430 insertions(+), 269 deletions(-)
> 
> -- 
> 2.49.0.504.g3bcea36a83-goog
>
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 7 months, 1 week ago
On Wed, Apr 09, 2025 at 11:49:16PM -0700, Namhyung Kim wrote:
> Hi Ian,
> 
> On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > The "PMU JSON event tests" have been running slowly, these changes
> > target improving them with an improvement of the test running 8 to 10
> > times faster.
> > 
> > The first patch changes from searching through all aliases by name in
> > a list to using a hashmap. Doing a fast hashmap__find means testing
> > for having an event needn't load from disk if an event is already
> > present.
> > 
> > The second patch switches the fncache to use a hashmap rather than its
> > own hashmap with a limited number of buckets. When there are many
> > filename queries, such as with a test, there are many collisions with
> > the previous fncache approach leading to linear searching of the
> > entries.
> > 
> > The final patch adds a find function for metrics. Normally metrics can
> > match by name and group, however, only name matching happens when one
> > metric refers to another. As we test every "id" in a metric to see if
> > it is a metric, the find function can dominate performance as it
> > linearly searches all metrics. Add a find function for the metrics
> > table so that a metric can be found by name with a binary search.
> > 
> > Before these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > 
> > real    0m18.499s
> > user    0m18.150s
> > sys     0m3.273s
> > ```
> > 
> > After these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > 
> > real    0m2.338s
> > user    0m1.797s
> > sys     0m2.186s
> > ```
> 
> Great, I also see the speedup on my machine from 32s to 3s.
> 
> Tested-by: Namhyung Kim <namhyung@kernel.org>

I'm collecting this for v2 as well, ok? Holler if you disagree.

- Arnaldo
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 7 months, 1 week ago
On Tue, May 13, 2025 at 04:34:28PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Apr 09, 2025 at 11:49:16PM -0700, Namhyung Kim wrote:
> > Great, I also see the speedup on my machine from 32s to 3s.

> > Tested-by: Namhyung Kim <namhyung@kernel.org>
 
> I'm collecting this for v2 as well, ok? Holler if you disagree.

BTW, in my workstation:

Before:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real  0m9.286s
  user  0m9.354s
  sys   0m0.062s
  root@number:~#

After:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real  0m0.689s
  user  0m0.766s
  sys   0m0.042s
  root@number:~# time perf test 10
   10: PMU JSON event tests                                            :
   10.1: PMU event table sanity                                        : Ok
   10.2: PMU event map aliases                                         : Ok
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
   10.5: Parsing of metric thresholds with fake PMUs                   : Ok

  real  0m0.696s
  user  0m0.807s
  sys   0m0.064s
  root@number:~#
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Ian Rogers 7 months, 4 weeks ago
On Wed, Apr 9, 2025 at 11:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Ian,
>
> On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > The "PMU JSON event tests" have been running slowly, these changes
> > target improving them with an improvement of the test running 8 to 10
> > times faster.
> >
> > The first patch changes from searching through all aliases by name in
> > a list to using a hashmap. Doing a fast hashmap__find means testing
> > for having an event needn't load from disk if an event is already
> > present.
> >
> > The second patch switches the fncache to use a hashmap rather than its
> > own hashmap with a limited number of buckets. When there are many
> > filename queries, such as with a test, there are many collisions with
> > the previous fncache approach leading to linear searching of the
> > entries.
> >
> > The final patch adds a find function for metrics. Normally metrics can
> > match by name and group, however, only name matching happens when one
> > metric refers to another. As we test every "id" in a metric to see if
> > it is a metric, the find function can dominate performance as it
> > linearly searches all metrics. Add a find function for the metrics
> > table so that a metric can be found by name with a binary search.
> >
> > Before these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> >
> > real    0m18.499s
> > user    0m18.150s
> > sys     0m3.273s
> > ```
> >
> > After these changes:
> > ```
> > $ time perf test -v 10
> >  10: PMU JSON event tests                                            :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> >
> > real    0m2.338s
> > user    0m1.797s
> > sys     0m2.186s
> > ```
>
> Great, I also see the speedup on my machine from 32s to 3s.
>
> Tested-by: Namhyung Kim <namhyung@kernel.org>

Ping.

Thanks,
Ian

> Thanks,
> Namhyung
>
> >
> > Ian Rogers (3):
> >   perf pmu: Change aliases from list to hashmap
> >   perf fncache: Switch to using hashmap
> >   perf metricgroup: Binary search when resolving referred to metrics
> >
> >  tools/perf/builtin-stat.c                |   6 +-
> >  tools/perf/pmu-events/empty-pmu-events.c |  66 ++++++++-
> >  tools/perf/pmu-events/jevents.py         |  66 ++++++++-
> >  tools/perf/pmu-events/pmu-events.h       |  23 +++-
> >  tools/perf/tests/pmu-events.c            | 129 +++++++++--------
> >  tools/perf/util/fncache.c                |  69 +++++-----
> >  tools/perf/util/fncache.h                |   1 -
> >  tools/perf/util/hwmon_pmu.c              |  43 +++---
> >  tools/perf/util/metricgroup.c            | 102 ++++++--------
> >  tools/perf/util/metricgroup.h            |   2 +-
> >  tools/perf/util/pmu.c                    | 167 +++++++++++++++--------
> >  tools/perf/util/pmu.h                    |   4 +-
> >  tools/perf/util/srccode.c                |   4 +-
> >  tools/perf/util/tool_pmu.c               |  17 +--
> >  14 files changed, 430 insertions(+), 269 deletions(-)
> >
> > --
> > 2.49.0.504.g3bcea36a83-goog
> >
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 7 months, 1 week ago
On Wed, Apr 23, 2025 at 01:48:22PM -0700, Ian Rogers wrote:
> On Wed, Apr 9, 2025 at 11:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Ian,
> >
> > On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > > The "PMU JSON event tests" have been running slowly, these changes
> > > target improving them with an improvement of the test running 8 to 10
> > > times faster.
> > >
> > > The first patch changes from searching through all aliases by name in
> > > a list to using a hashmap. Doing a fast hashmap__find means testing
> > > for having an event needn't load from disk if an event is already
> > > present.
> > >
> > > The second patch switches the fncache to use a hashmap rather than its
> > > own hashmap with a limited number of buckets. When there are many
> > > filename queries, such as with a test, there are many collisions with
> > > the previous fncache approach leading to linear searching of the
> > > entries.
> > >
> > > The final patch adds a find function for metrics. Normally metrics can
> > > match by name and group, however, only name matching happens when one
> > > metric refers to another. As we test every "id" in a metric to see if
> > > it is a metric, the find function can dominate performance as it
> > > linearly searches all metrics. Add a find function for the metrics
> > > table so that a metric can be found by name with a binary search.
> > >
> > > Before these changes:
> > > ```
> > > $ time perf test -v 10
> > >  10: PMU JSON event tests                                            :
> > >  10.1: PMU event table sanity                                        : Ok
> > >  10.2: PMU event map aliases                                         : Ok
> > >  10.3: Parsing of PMU event table metrics                            : Ok
> > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > >
> > > real    0m18.499s
> > > user    0m18.150s
> > > sys     0m3.273s
> > > ```
> > >
> > > After these changes:
> > > ```
> > > $ time perf test -v 10
> > >  10: PMU JSON event tests                                            :
> > >  10.1: PMU event table sanity                                        : Ok
> > >  10.2: PMU event map aliases                                         : Ok
> > >  10.3: Parsing of PMU event table metrics                            : Ok
> > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > >
> > > real    0m2.338s
> > > user    0m1.797s
> > > sys     0m2.186s
> > > ```
> >
> > Great, I also see the speedup on my machine from 32s to 3s.
> >
> > Tested-by: Namhyung Kim <namhyung@kernel.org>
> 
> Ping.

I'll try to fix it up later, if you don't beat me to it. I'll continue
with the other patches you listed to get the ones that apply merged:

Total patches: 3
---
Cover: ./20250409_irogers_metric_related_performance_improvements.cover
 Link: https://lore.kernel.org/r/20250410044532.52017-1-irogers@google.com
 Base: not specified
       git am ./20250409_irogers_metric_related_performance_improvements.mbx
⬢ [acme@toolbx perf-tools-next]$        git am ./20250409_irogers_metric_related_performance_improvements.mbx
Applying: perf pmu: Change aliases from list to hashmap
error: patch failed: tools/perf/util/pmu.c:532
error: tools/perf/util/pmu.c: patch does not apply
Patch failed at 0001 perf pmu: Change aliases from list to hashmap
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
⬢ [acme@toolbx perf-tools-next]$ 
⬢ [acme@toolbx perf-tools-next]$ git am --abort
⬢ [acme@toolbx perf-tools-next]$ patch -p1 < ./20250409_irogers_metric_related_performance_improvements.mbx
patching file tools/perf/tests/pmu-events.c
patching file tools/perf/util/hwmon_pmu.c
patching file tools/perf/util/pmu.c
Hunk #3 succeeded at 417 (offset 11 lines).
Hunk #4 succeeded at 451 (offset 11 lines).
Hunk #5 FAILED at 541.
Hunk #6 succeeded at 657 (offset 41 lines).
Hunk #7 succeeded at 1146 (offset 41 lines).
Hunk #8 succeeded at 1238 (offset 41 lines).
Hunk #9 succeeded at 1259 (offset 41 lines).
Hunk #10 succeeded at 2018 (offset 48 lines).
Hunk #11 succeeded at 2033 (offset 48 lines).
Hunk #12 succeeded at 2502 (offset 59 lines).
Hunk #13 succeeded at 2522 (offset 59 lines).
1 out of 13 hunks FAILED -- saving rejects to file tools/perf/util/pmu.c.rej
patching file tools/perf/util/pmu.h
Hunk #3 succeeded at 295 (offset 5 lines).
patching file tools/perf/util/tool_pmu.c
Hunk #1 succeeded at 502 (offset 6 lines).
patching file tools/perf/util/fncache.c
patching file tools/perf/util/fncache.h
patching file tools/perf/util/srccode.c
patching file tools/perf/builtin-stat.c
Hunk #1 succeeded at 1854 (offset -2 lines).
Hunk #2 succeeded at 1888 (offset -2 lines).
Hunk #3 succeeded at 1978 (offset -2 lines).
patching file tools/perf/pmu-events/empty-pmu-events.c
Hunk #1 succeeded at 449 (offset 6 lines).
Hunk #2 succeeded at 495 (offset 6 lines).
Hunk #3 succeeded at 552 (offset 6 lines).
patching file tools/perf/pmu-events/jevents.py
Hunk #1 succeeded at 972 (offset 6 lines).
Hunk #2 succeeded at 1018 (offset 6 lines).
Hunk #3 succeeded at 1075 (offset 6 lines).
patching file tools/perf/pmu-events/pmu-events.h
Hunk #1 succeeded at 74 (offset 3 lines).
Hunk #2 succeeded at 89 (offset 3 lines).
Hunk #3 succeeded at 105 (offset 3 lines).
patching file tools/perf/util/metricgroup.c
patching file tools/perf/util/metricgroup.h
⬢ [acme@toolbx perf-tools-next]$
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Ian Rogers 7 months, 1 week ago
On Mon, May 12, 2025 at 9:40 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Wed, Apr 23, 2025 at 01:48:22PM -0700, Ian Rogers wrote:
> > On Wed, Apr 9, 2025 at 11:49 PM Namhyung Kim <namhyung@kernel.org> wrote:
> > >
> > > Hi Ian,
> > >
> > > On Wed, Apr 09, 2025 at 09:45:29PM -0700, Ian Rogers wrote:
> > > > The "PMU JSON event tests" have been running slowly, these changes
> > > > target improving them with an improvement of the test running 8 to 10
> > > > times faster.
> > > >
> > > > The first patch changes from searching through all aliases by name in
> > > > a list to using a hashmap. Doing a fast hashmap__find means testing
> > > > for having an event needn't load from disk if an event is already
> > > > present.
> > > >
> > > > The second patch switches the fncache to use a hashmap rather than its
> > > > own hashmap with a limited number of buckets. When there are many
> > > > filename queries, such as with a test, there are many collisions with
> > > > the previous fncache approach leading to linear searching of the
> > > > entries.
> > > >
> > > > The final patch adds a find function for metrics. Normally metrics can
> > > > match by name and group, however, only name matching happens when one
> > > > metric refers to another. As we test every "id" in a metric to see if
> > > > it is a metric, the find function can dominate performance as it
> > > > linearly searches all metrics. Add a find function for the metrics
> > > > table so that a metric can be found by name with a binary search.
> > > >
> > > > Before these changes:
> > > > ```
> > > > $ time perf test -v 10
> > > >  10: PMU JSON event tests                                            :
> > > >  10.1: PMU event table sanity                                        : Ok
> > > >  10.2: PMU event map aliases                                         : Ok
> > > >  10.3: Parsing of PMU event table metrics                            : Ok
> > > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > > >
> > > > real    0m18.499s
> > > > user    0m18.150s
> > > > sys     0m3.273s
> > > > ```
> > > >
> > > > After these changes:
> > > > ```
> > > > $ time perf test -v 10
> > > >  10: PMU JSON event tests                                            :
> > > >  10.1: PMU event table sanity                                        : Ok
> > > >  10.2: PMU event map aliases                                         : Ok
> > > >  10.3: Parsing of PMU event table metrics                            : Ok
> > > >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >  10.5: Parsing of metric thresholds with fake PMUs                   : Ok
> > > >
> > > > real    0m2.338s
> > > > user    0m1.797s
> > > > sys     0m2.186s
> > > > ```
> > >
> > > Great, I also see the speedup on my machine from 32s to 3s.
> > >
> > > Tested-by: Namhyung Kim <namhyung@kernel.org>
> >
> > Ping.
>
> I'll try to fix it up later, if you don't beat me to it. I'll continue
> with the other patches you listed to get the ones that apply merged:
>
> Total patches: 3
> ---
> Cover: ./20250409_irogers_metric_related_performance_improvements.cover
>  Link: https://lore.kernel.org/r/20250410044532.52017-1-irogers@google.com
>  Base: not specified
>        git am ./20250409_irogers_metric_related_performance_improvements.mbx
> ⬢ [acme@toolbx perf-tools-next]$        git am ./20250409_irogers_metric_related_performance_improvements.mbx
> Applying: perf pmu: Change aliases from list to hashmap
> error: patch failed: tools/perf/util/pmu.c:532
> error: tools/perf/util/pmu.c: patch does not apply
> Patch failed at 0001 perf pmu: Change aliases from list to hashmap
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
> hint: When you have resolved this problem, run "git am --continue".
> hint: If you prefer to skip this patch, run "git am --skip" instead.
> hint: To restore the original branch and stop patching, run "git am --abort".
> hint: Disable this message with "git config set advice.mergeConflict false"
> ⬢ [acme@toolbx perf-tools-next]$
> ⬢ [acme@toolbx perf-tools-next]$ git am --abort
> ⬢ [acme@toolbx perf-tools-next]$ patch -p1 < ./20250409_irogers_metric_related_performance_improvements.mbx
> patching file tools/perf/tests/pmu-events.c
> patching file tools/perf/util/hwmon_pmu.c
> patching file tools/perf/util/pmu.c
> Hunk #3 succeeded at 417 (offset 11 lines).
> Hunk #4 succeeded at 451 (offset 11 lines).
> Hunk #5 FAILED at 541.
> Hunk #6 succeeded at 657 (offset 41 lines).
> Hunk #7 succeeded at 1146 (offset 41 lines).
> Hunk #8 succeeded at 1238 (offset 41 lines).
> Hunk #9 succeeded at 1259 (offset 41 lines).
> Hunk #10 succeeded at 2018 (offset 48 lines).
> Hunk #11 succeeded at 2033 (offset 48 lines).
> Hunk #12 succeeded at 2502 (offset 59 lines).
> Hunk #13 succeeded at 2522 (offset 59 lines).
> 1 out of 13 hunks FAILED -- saving rejects to file tools/perf/util/pmu.c.rej
> patching file tools/perf/util/pmu.h
> Hunk #3 succeeded at 295 (offset 5 lines).
> patching file tools/perf/util/tool_pmu.c
> Hunk #1 succeeded at 502 (offset 6 lines).
> patching file tools/perf/util/fncache.c
> patching file tools/perf/util/fncache.h
> patching file tools/perf/util/srccode.c
> patching file tools/perf/builtin-stat.c
> Hunk #1 succeeded at 1854 (offset -2 lines).
> Hunk #2 succeeded at 1888 (offset -2 lines).
> Hunk #3 succeeded at 1978 (offset -2 lines).
> patching file tools/perf/pmu-events/empty-pmu-events.c
> Hunk #1 succeeded at 449 (offset 6 lines).
> Hunk #2 succeeded at 495 (offset 6 lines).
> Hunk #3 succeeded at 552 (offset 6 lines).
> patching file tools/perf/pmu-events/jevents.py
> Hunk #1 succeeded at 972 (offset 6 lines).
> Hunk #2 succeeded at 1018 (offset 6 lines).
> Hunk #3 succeeded at 1075 (offset 6 lines).
> patching file tools/perf/pmu-events/pmu-events.h
> Hunk #1 succeeded at 74 (offset 3 lines).
> Hunk #2 succeeded at 89 (offset 3 lines).
> Hunk #3 succeeded at 105 (offset 3 lines).
> patching file tools/perf/util/metricgroup.c
> patching file tools/perf/util/metricgroup.h
> ⬢ [acme@toolbx perf-tools-next]$

Thanks Arnaldo! Happy to send a rebase on tmp.perf-tools-next if useful.

Thanks,
Ian
Re: [PATCH v1 0/3] Metric related performance improvements
Posted by Arnaldo Carvalho de Melo 7 months, 1 week ago
On Mon, May 12, 2025 at 09:57:45AM -0700, Ian Rogers wrote:
> On Mon, May 12, 2025 at 9:40 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Hunk #4 succeeded at 451 (offset 11 lines).
> > Hunk #5 FAILED at 541.
> > Hunk #6 succeeded at 657 (offset 41 lines).
> > ⬢ [acme@toolbx perf-tools-next]$
 
> Thanks Arnaldo! Happy to send a rebase on tmp.perf-tools-next if useful.

Sure, I just pushed what I have:

⬢ [acme@toolbx perf-tools-next]$ git log --oneline -10
255f5b6d060be5a4 (HEAD -> perf-tools-next, x1/perf-tools-next, x1/HEAD, perf-tools-next/tmp.perf-tools-next, five/perf-tools-next, five/HEAD) perf parse-events: Add "cpu" term to set the CPU an event is recorded on
168c7b509109fe26 perf parse-events: Set is_pmu_core for legacy hardware events
f60c3f44689ac2bc perf stat: Use counter cpumask to skip zero values
2e7a2f7f3c6e3a99 libperf cpumap: Add ability to create CPU from a single CPU number
365e02ddb65d443f perf tests metrics: Permission related fixes
f0869f31562bde2e perf evsel: Add per-thread warning for EOPNOTSUPP open failues
17e548405a81665f perf scripts python: exported-sql-viewer.py: Fix pattern matching with Python 3
352b088164b5cde1 perf intel-pt: Do not default to recording all switch events
e00eac6b5b6d956f perf intel-pt: Fix PEBS-via-PT data_src
cd17a9b1a779459d (perf-tools-next/perf-tools-next) perf test demangle-ocaml: Switch to using dso__demangle_sym()
⬢ [acme@toolbx perf-tools-next]$

- Arnaldo