[PATCH v4 00/16] Intel TPEBS min/max/mean/last support

Ian Rogers posted 16 patches 1 month ago
There is a newer version of this series
Posted by Ian Rogers 1 month ago
The patches add support for computing the min, max, mean or last
retirement latency and then using that value as the basis for metrics.
When no sampled values are available, the retirement latency recorded
for the event in the perf JSON is used instead.
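
As a rough illustration of the behaviour described above, the following
is a minimal sketch and not the code from the patches. All names here
(latency_stats, latency_mode, latency_stats_update, latency_stats_value,
json_default) are hypothetical. It assumes samples arrive one at a time
and that the JSON value is only a fallback for when no sample has been
seen, which is the case the "t->stats.n == 0" check in the changelog
is about.

```c
#include <stdint.h>

/* Hypothetical names throughout; not the structures used by the patches. */
enum latency_mode { MODE_MEAN, MODE_MIN, MODE_MAX, MODE_LAST };

struct latency_stats {
	uint64_t n;	/* number of samples seen so far */
	double mean;	/* running mean of the samples */
	uint64_t min;	/* smallest sample, valid when n > 0 */
	uint64_t max;	/* largest sample, valid when n > 0 */
	uint64_t last;	/* most recent sample, valid when n > 0 */
};

/* Fold one sampled retirement latency into the running statistics. */
static void latency_stats_update(struct latency_stats *s, uint64_t val)
{
	s->n++;
	/* Incremental mean: avoids keeping a sum that could overflow. */
	s->mean += ((double)val - s->mean) / (double)s->n;
	if (s->n == 1 || val < s->min)
		s->min = val;
	if (s->n == 1 || val > s->max)
		s->max = val;
	s->last = val;
}

/*
 * Pick the value a metric would use. With no samples (n == 0) the
 * assumed JSON default is returned; it never enters min/max/mean,
 * so real samples are not skewed by it later.
 */
static double latency_stats_value(const struct latency_stats *s,
				  enum latency_mode mode,
				  double json_default)
{
	if (s->n == 0)
		return json_default;

	switch (mode) {
	case MODE_MIN:
		return (double)s->min;
	case MODE_MAX:
		return (double)s->max;
	case MODE_LAST:
		return (double)s->last;
	case MODE_MEAN:
	default:
		return s->mean;
	}
}
```

In this sketch, a zero-initialised latency_stats returns json_default
for every mode until the first call to latency_stats_update(), after
which only sampled values are reported.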

Support is added for reading the retirement latency from the forked
perf command more than once. To avoid killing the process, commands
are sent through the control fd. Some name handling is changed to make
it more robust.

Rather than having retirement latency events fail under perf record,
make the retirement latency modifier enable sample weights.

v4: Don't use json min/max in retirement latency stats as they will
    never update afterwards. Warn once if json data is used when TPEBS
    recording was requested.

v3: Two fixes from Kan Liang. Ensure min/max statistics don't vary
    when real samples are encountered.

v2: Addition of header cleanup patch originally posted:
    https://lore.kernel.org/lkml/20241210191823.612631-1-irogers@google.com/
    as there are no arch specific reasons not to build this code.
    Fix bug in "perf pmu-events: Add retirement latency to JSON events
    inside of perf" where "t->stats.n != 0" should have been
    "t->stats.n == 0".
    Add patch so that perf record of a retirement latency event
    doesn't crash but instead enables sample weights for the event.

Ian Rogers (16):
  perf intel-tpebs: Cleanup header
  perf intel-tpebs: Simplify tpebs_cmd
  perf intel-tpebs: Rename tpebs_start to evsel__tpebs_open
  perf intel-tpebs: Separate evsel__tpebs_prepare out of
    evsel__tpebs_open
  perf intel-tpebs: Move cpumap_buf out of evsel__tpebs_open
  perf intel-tpebs: Reduce scope of tpebs_events_size
  perf intel-tpebs: Inline get_perf_record_args
  perf intel-tpebs: Ensure events are opened, factor out finding
  perf intel-tpebs: Refactor tpebs_results list
  perf intel-tpebs: Add support for updating counts in evsel__tpebs_read
  perf intel-tpebs: Add mutex for tpebs_results
  perf intel-tpebs: Don't close record on read
  perf intel-tpebs: Use stats for retirement latency statistics
  perf stat: Add mean, min, max and last --tpebs-mode options
  perf pmu-events: Add retirement latency to JSON events inside of perf
  perf record: Retirement latency cleanup in evsel__config

 tools/perf/Documentation/perf-stat.txt   |   7 +
 tools/perf/builtin-stat.c                |  29 +-
 tools/perf/pmu-events/empty-pmu-events.c | 216 +++----
 tools/perf/pmu-events/jevents.py         |   6 +
 tools/perf/pmu-events/pmu-events.h       |   3 +
 tools/perf/util/Build                    |   2 +-
 tools/perf/util/evlist.c                 |   1 -
 tools/perf/util/evsel.c                  |  22 +-
 tools/perf/util/evsel.h                  |   6 +
 tools/perf/util/intel-tpebs.c            | 682 ++++++++++++++---------
 tools/perf/util/intel-tpebs.h            |  40 +-
 tools/perf/util/parse-events.c           |   4 +
 tools/perf/util/pmu.c                    |  52 +-
 tools/perf/util/pmu.h                    |   3 +
 14 files changed, 666 insertions(+), 407 deletions(-)

-- 
2.49.0.504.g3bcea36a83-goog
Re: [PATCH v4 00/16] Intel TPEBS min/max/mean/last support
Posted by Namhyung Kim 4 weeks ago
On Tue, Apr 08, 2025 at 11:10:27PM -0700, Ian Rogers wrote:
> The patches add support for computing the min, max, mean or last
> retirement latency and then using that value as the basis for metrics.
> When no sampled values are available, the retirement latency recorded
> for the event in the perf JSON is used instead.
> 
> Support is added for reading the retirement latency from the forked
> perf command more than once. To avoid killing the process, commands
> are sent through the control fd. Some name handling is changed to make
> it more robust.
> 
> Rather than having retirement latency events fail under perf record,
> make the retirement latency modifier enable sample weights.
> 
> v4: Don't use json min/max in retirement latency stats as they will
>     never update afterwards. Warn once if json data is used when TPEBS
>     recording was requested.
> 
> v3: Two fixes from Kan Liang. Ensure min/max statistics don't vary
>     when real samples are encountered.
> 
> v2: Addition of header cleanup patch originally posted:
>     https://lore.kernel.org/lkml/20241210191823.612631-1-irogers@google.com/
>     as there are no arch specific reasons not to build this code.
>     Fix bug in "perf pmu-events: Add retirement latency to JSON events
>     inside of perf" where "t->stats.n != 0" should have been
>     "t->stats.n == 0".
>     Add patch so that perf record of a retirement latency event
>     doesn't crash but instead enables sample weights for the event.
> 
> Ian Rogers (16):
>   perf intel-tpebs: Cleanup header
>   perf intel-tpebs: Simplify tpebs_cmd
>   perf intel-tpebs: Rename tpebs_start to evsel__tpebs_open
>   perf intel-tpebs: Separate evsel__tpebs_prepare out of
>     evsel__tpebs_open
>   perf intel-tpebs: Move cpumap_buf out of evsel__tpebs_open
>   perf intel-tpebs: Reduce scope of tpebs_events_size
>   perf intel-tpebs: Inline get_perf_record_args
>   perf intel-tpebs: Ensure events are opened, factor out finding
>   perf intel-tpebs: Refactor tpebs_results list
>   perf intel-tpebs: Add support for updating counts in evsel__tpebs_read
>   perf intel-tpebs: Add mutex for tpebs_results
>   perf intel-tpebs: Don't close record on read
>   perf intel-tpebs: Use stats for retirement latency statistics
>   perf stat: Add mean, min, max and last --tpebs-mode options
>   perf pmu-events: Add retirement latency to JSON events inside of perf
>   perf record: Retirement latency cleanup in evsel__config

I have a nitpick but otherwise looks good to me.

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung
