[v7] perf stat affinity changes

[PATCH v7 0/6] perf stat affinity changes

Posted by Ian Rogers 3 days, 20 hours ago

Change how affinities work with evlist__for_each_cpu. Move the
affinity code into the iterator to simplify setting it up. Detect when
affinities will and won't be profitable, for example a tool event and
a regular perf event (or read group) may face less delay from a single
IPI for the event read than from a call to sched_setaffinity. Add a
 --no-affinity flag to perf stat to allow affinities to be disabled.

v7: Revert "perf tool_pmu: More accurately set the cpus for tool
    events" that caused issues with user specified CPUs (Andres Freund
    <andres@anarazel.de>). Fix a null test is prepare_metric so that
    missing events can't trigger segfaults (Andres Freund). Make the
    CPU map propagation improve the CPU maps for tool events that only
    read on index 0, this allows later setting when
    evlist__create_maps is called with the correct user CPUs. Rebase
    previous non-merged affinity changes that hadn't been picked up
    yet.

v6: Drop merged tool event change. Move TPEBS fix into its own patch
    1st.
https://lore.kernel.org/lkml/20260108212652.768875-1-irogers@google.com/

v5: Drop merged changes. Move tool event reading to first
    patch. Change --no-affinity flag to --affinity/--no-affinity flag.
https://lore.kernel.org/lkml/20251118211326.1840989-1-irogers@google.com/
    On v5 there was discussion with Andi Kleen who points out that
    affinities will work better with real time priorities but using
    this requires privileges.

v4: Rebase. Add patch to reduce scope of walltime_nsec_stats now that
    the legacy metric code is no more. Minor tweak to the ru_stats
    clean up.
https://lore.kernel.org/lkml/20251113180517.44096-1-irogers@google.com/

v3: Add affinity clean ups and read tool events last.
https://lore.kernel.org/lkml/20251106071241.141234-1-irogers@google.com/

v2: Fixed an aggregation index issue:
https://lore.kernel.org/lkml/20251104234148.3103176-2-irogers@google.com/

v1:
https://lore.kernel.org/lkml/20251104053449.1208800-1-irogers@google.com/

Ian Rogers (6):
  Revert "perf tool_pmu: More accurately set the cpus for tool events"
  perf stat-shadow: In prepare_metric fix guard on reading NULL
    perf_stat_evsel
  perf evlist: Special map propagation for tool events that read on 1
    CPU
  perf evlist: Missing TPEBS close in evlist__close
  perf evlist: Reduce affinity use and move into iterator, fix no
    affinity
  perf stat: Add no-affinity flag

 tools/lib/perf/evlist.c                 |  36 +++++-
 tools/lib/perf/include/internal/evsel.h |   2 +
 tools/perf/Documentation/perf-stat.txt  |   4 +
 tools/perf/builtin-stat.c               | 114 ++++++++---------
 tools/perf/util/evlist.c                | 156 +++++++++++++++---------
 tools/perf/util/evlist.h                |  27 ++--
 tools/perf/util/parse-events.c          |  10 +-
 tools/perf/util/pmu.c                   |  23 ++++
 tools/perf/util/pmu.h                   |   3 +
 tools/perf/util/stat-shadow.c           |   7 +-
 tools/perf/util/tool_pmu.c              |  19 ---
 tools/perf/util/tool_pmu.h              |   1 -
 12 files changed, 237 insertions(+), 165 deletions(-)

-- 
2.53.0.rc2.204.g2597b5adb4-goog

Re: [PATCH v7 0/6] perf stat affinity changes

Posted by Arnaldo Carvalho de Melo 21 hours ago

On Tue, Feb 03, 2026 at 02:51:23PM -0800, Ian Rogers wrote:
> Change how affinities work with evlist__for_each_cpu. Move the
> affinity code into the iterator to simplify setting it up. Detect when
> affinities will and won't be profitable, for example a tool event and
> a regular perf event (or read group) may face less delay from a single
> IPI for the event read than from a call to sched_setaffinity. Add a
>  --no-affinity flag to perf stat to allow affinities to be disabled.
> 
> v7: Revert "perf tool_pmu: More accurately set the cpus for tool
>     events" that caused issues with user specified CPUs (Andres Freund
>     <andres@anarazel.de>). Fix a null test is prepare_metric so that
>     missing events can't trigger segfaults (Andres Freund). Make the
>     CPU map propagation improve the CPU maps for tool events that only
>     read on index 0, this allows later setting when
>     evlist__create_maps is called with the correct user CPUs. Rebase
>     previous non-merged affinity changes that hadn't been picked up
>     yet.

Not applying to tmp.perf-tools-next, can you please check?

- Arnaldo
 
> v6: Drop merged tool event change. Move TPEBS fix into its own patch
>     1st.
> https://lore.kernel.org/lkml/20260108212652.768875-1-irogers@google.com/
> 
> v5: Drop merged changes. Move tool event reading to first
>     patch. Change --no-affinity flag to --affinity/--no-affinity flag.
> https://lore.kernel.org/lkml/20251118211326.1840989-1-irogers@google.com/
>     On v5 there was discussion with Andi Kleen who points out that
>     affinities will work better with real time priorities but using
>     this requires privileges.
> 
> v4: Rebase. Add patch to reduce scope of walltime_nsec_stats now that
>     the legacy metric code is no more. Minor tweak to the ru_stats
>     clean up.
> https://lore.kernel.org/lkml/20251113180517.44096-1-irogers@google.com/
> 
> v3: Add affinity clean ups and read tool events last.
> https://lore.kernel.org/lkml/20251106071241.141234-1-irogers@google.com/
> 
> v2: Fixed an aggregation index issue:
> https://lore.kernel.org/lkml/20251104234148.3103176-2-irogers@google.com/
> 
> v1:
> https://lore.kernel.org/lkml/20251104053449.1208800-1-irogers@google.com/
> 
> Ian Rogers (6):
>   Revert "perf tool_pmu: More accurately set the cpus for tool events"
>   perf stat-shadow: In prepare_metric fix guard on reading NULL
>     perf_stat_evsel
>   perf evlist: Special map propagation for tool events that read on 1
>     CPU
>   perf evlist: Missing TPEBS close in evlist__close
>   perf evlist: Reduce affinity use and move into iterator, fix no
>     affinity
>   perf stat: Add no-affinity flag
> 
>  tools/lib/perf/evlist.c                 |  36 +++++-
>  tools/lib/perf/include/internal/evsel.h |   2 +
>  tools/perf/Documentation/perf-stat.txt  |   4 +
>  tools/perf/builtin-stat.c               | 114 ++++++++---------
>  tools/perf/util/evlist.c                | 156 +++++++++++++++---------
>  tools/perf/util/evlist.h                |  27 ++--
>  tools/perf/util/parse-events.c          |  10 +-
>  tools/perf/util/pmu.c                   |  23 ++++
>  tools/perf/util/pmu.h                   |   3 +
>  tools/perf/util/stat-shadow.c           |   7 +-
>  tools/perf/util/tool_pmu.c              |  19 ---
>  tools/perf/util/tool_pmu.h              |   1 -
>  12 files changed, 237 insertions(+), 165 deletions(-)
> 
> -- 
> 2.53.0.rc2.204.g2597b5adb4-goog

Re: [PATCH v7 0/6] perf stat affinity changes