perf stat: Cleanup counter aggregation (v2)

[PATCHSET 00/19] perf stat: Cleanup counter aggregation (v2)

Posted by Namhyung Kim 3 years, 5 months ago

Hello,

Current perf stat code is somewhat hard to follow since it handles
many combinations of PMUs/events for given display and aggregation
options.  This is my attempt to clean it up a little. ;-)

changes in v2)
 * fix a segfault in perf stat report for per-process record  (Jiri)
 * fix metric only display  (Jiri)
 * add evsel__reset_aggr_stat  (ian)
 * add more comments  (Ian)
 * add Acked-by from Ian

My first concern is that aggregation and display routines are intermixed
and processed differently depends on the aggregation mode.  I'd like to
separate them apart and make the logic clearer.

To do that, I added struct perf_stat_aggr to save the aggregated counter
values and other info.  It'll be allocated and processed according to
the aggr_mode and display logic will use it.

I've tested the following combination.

  $ cat test-matrix.sh
  #!/bin/sh

  set -e

  yes > /dev/null &
  TARGET=$!

  ./perf stat true
  ./perf stat -a true
  ./perf stat -C0 true
  ./perf stat -p $TARGET true
  ./perf stat -t $TARGET true

  ./perf stat -a -A true
  ./perf stat -a --per-node true
  ./perf stat -a --per-socket true
  ./perf stat -a --per-die true
  ./perf stat -a --per-core true
  ./perf stat -a --per-thread true

  ./perf stat -a -I 500 sleep 1
  ./perf stat -a -I 500 --summary sleep 1
  ./perf stat -a -I 500 --per-socket sleep 1
  ./perf stat -a -I 500 --summary --per-socket sleep 1

  ./perf stat -a --metric-only true
  ./perf stat -a --metric-only --per-socket true
  ./perf stat -a --metric-only -I 500 sleep 1
  ./perf stat -a --metric-only -I 500 --per-socket sleep 1

  ./perf stat record true && ./perf stat report
  ./perf stat record -p $TARGET true && ./perf stat report
  ./perf stat record -a true && ./perf stat report
  ./perf stat record -a --per-core true && ./perf stat report
  ./perf stat record -a --per-core --metric-only true && ./perf stat report
  ./perf stat record -a -I 500 sleep 1 && ./perf stat report
  ./perf stat record -a -I 500 --per-core sleep 1 && ./perf stat report
  ./perf stat record -a -I 500 --per-core --metric-only sleep 1 && ./perf stat report

  ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ true
  ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ --percore-show-thread true

  kill $TARGET

The code is available at 'perf/stat-aggr-v2' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung

Namhyung Kim (19):
  perf tools: Save evsel->pmu in parse_events()
  perf tools: Use pmu info in evsel__is_hybrid()
  perf stat: Use evsel__is_hybrid() more
  perf stat: Add aggr id for global mode
  perf stat: Add cpu aggr id for no aggregation mode
  perf stat: Add 'needs_sort' argument to cpu_aggr_map__new()
  perf stat: Add struct perf_stat_aggr to perf_stat_evsel
  perf stat: Allocate evsel->stats->aggr properly
  perf stat: Aggregate events using evsel->stats->aggr
  perf stat: Aggregate per-thread stats using evsel->stats->aggr
  perf stat: Allocate aggr counts for recorded data
  perf stat: Reset aggr counts for each interval
  perf stat: Split process_counters()
  perf stat: Add perf_stat_merge_counters()
  perf stat: Add perf_stat_process_percore()
  perf stat: Add perf_stat_process_shadow_stats()
  perf stat: Display event stats using aggr counts
  perf stat: Display percore events properly
  perf stat: Remove unused perf_counts.aggr field

 tools/perf/builtin-script.c                   |   4 +-
 tools/perf/builtin-stat.c                     | 186 +++++--
 tools/perf/tests/parse-metric.c               |   2 +-
 tools/perf/tests/pmu-events.c                 |   2 +-
 tools/perf/util/counts.c                      |   1 -
 tools/perf/util/counts.h                      |   1 -
 tools/perf/util/cpumap.c                      |  16 +-
 tools/perf/util/cpumap.h                      |   8 +-
 tools/perf/util/evsel.c                       |  13 +-
 tools/perf/util/parse-events.c                |   1 +
 tools/perf/util/pmu.c                         |   4 +
 .../scripting-engines/trace-event-python.c    |   6 -
 tools/perf/util/stat-display.c                | 462 +++---------------
 tools/perf/util/stat.c                        | 385 ++++++++++++---
 tools/perf/util/stat.h                        |  40 +-
 15 files changed, 602 insertions(+), 529 deletions(-)


base-commit: d79310700590b8b40d8c867012d6c899ea6fd505
-- 
2.38.0.413.g74048e4d9e-goog

Re: [PATCHSET 00/19] perf stat: Cleanup counter aggregation (v2)

Posted by Jiri Olsa 3 years, 5 months ago

On Thu, Oct 13, 2022 at 11:15:31PM -0700, Namhyung Kim wrote:
> Hello,
> 
> Current perf stat code is somewhat hard to follow since it handles
> many combinations of PMUs/events for given display and aggregation
> options.  This is my attempt to clean it up a little. ;-)
> 
> changes in v2)
>  * fix a segfault in perf stat report for per-process record  (Jiri)
>  * fix metric only display  (Jiri)
>  * add evsel__reset_aggr_stat  (ian)
>  * add more comments  (Ian)
>  * add Acked-by from Ian
> 
> My first concern is that aggregation and display routines are intermixed
> and processed differently depends on the aggregation mode.  I'd like to
> separate them apart and make the logic clearer.
> 
> To do that, I added struct perf_stat_aggr to save the aggregated counter
> values and other info.  It'll be allocated and processed according to
> the aggr_mode and display logic will use it.
> 
> I've tested the following combination.
> 
>   $ cat test-matrix.sh
>   #!/bin/sh
> 
>   set -e
> 
>   yes > /dev/null &
>   TARGET=$!
> 
>   ./perf stat true
>   ./perf stat -a true
>   ./perf stat -C0 true
>   ./perf stat -p $TARGET true
>   ./perf stat -t $TARGET true
> 
>   ./perf stat -a -A true
>   ./perf stat -a --per-node true
>   ./perf stat -a --per-socket true
>   ./perf stat -a --per-die true
>   ./perf stat -a --per-core true
>   ./perf stat -a --per-thread true
> 
>   ./perf stat -a -I 500 sleep 1
>   ./perf stat -a -I 500 --summary sleep 1
>   ./perf stat -a -I 500 --per-socket sleep 1
>   ./perf stat -a -I 500 --summary --per-socket sleep 1
> 
>   ./perf stat -a --metric-only true
>   ./perf stat -a --metric-only --per-socket true
>   ./perf stat -a --metric-only -I 500 sleep 1
>   ./perf stat -a --metric-only -I 500 --per-socket sleep 1
> 
>   ./perf stat record true && ./perf stat report
>   ./perf stat record -p $TARGET true && ./perf stat report
>   ./perf stat record -a true && ./perf stat report
>   ./perf stat record -a --per-core true && ./perf stat report
>   ./perf stat record -a --per-core --metric-only true && ./perf stat report
>   ./perf stat record -a -I 500 sleep 1 && ./perf stat report
>   ./perf stat record -a -I 500 --per-core sleep 1 && ./perf stat report
>   ./perf stat record -a -I 500 --per-core --metric-only sleep 1 && ./perf stat report
> 
>   ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ true
>   ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ --percore-show-thread true
> 
>   kill $TARGET
> 
> The code is available at 'perf/stat-aggr-v2' branch in
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Michael,
ay chance you could run your test suite on top of this change?

thanks,
jirka

> 
> Thanks,
> Namhyung
> 
> Namhyung Kim (19):
>   perf tools: Save evsel->pmu in parse_events()
>   perf tools: Use pmu info in evsel__is_hybrid()
>   perf stat: Use evsel__is_hybrid() more
>   perf stat: Add aggr id for global mode
>   perf stat: Add cpu aggr id for no aggregation mode
>   perf stat: Add 'needs_sort' argument to cpu_aggr_map__new()
>   perf stat: Add struct perf_stat_aggr to perf_stat_evsel
>   perf stat: Allocate evsel->stats->aggr properly
>   perf stat: Aggregate events using evsel->stats->aggr
>   perf stat: Aggregate per-thread stats using evsel->stats->aggr
>   perf stat: Allocate aggr counts for recorded data
>   perf stat: Reset aggr counts for each interval
>   perf stat: Split process_counters()
>   perf stat: Add perf_stat_merge_counters()
>   perf stat: Add perf_stat_process_percore()
>   perf stat: Add perf_stat_process_shadow_stats()
>   perf stat: Display event stats using aggr counts
>   perf stat: Display percore events properly
>   perf stat: Remove unused perf_counts.aggr field
> 
>  tools/perf/builtin-script.c                   |   4 +-
>  tools/perf/builtin-stat.c                     | 186 +++++--
>  tools/perf/tests/parse-metric.c               |   2 +-
>  tools/perf/tests/pmu-events.c                 |   2 +-
>  tools/perf/util/counts.c                      |   1 -
>  tools/perf/util/counts.h                      |   1 -
>  tools/perf/util/cpumap.c                      |  16 +-
>  tools/perf/util/cpumap.h                      |   8 +-
>  tools/perf/util/evsel.c                       |  13 +-
>  tools/perf/util/parse-events.c                |   1 +
>  tools/perf/util/pmu.c                         |   4 +
>  .../scripting-engines/trace-event-python.c    |   6 -
>  tools/perf/util/stat-display.c                | 462 +++---------------
>  tools/perf/util/stat.c                        | 385 ++++++++++++---
>  tools/perf/util/stat.h                        |  40 +-
>  15 files changed, 602 insertions(+), 529 deletions(-)
> 
> 
> base-commit: d79310700590b8b40d8c867012d6c899ea6fd505
> -- 
> 2.38.0.413.g74048e4d9e-goog
>

Re: [PATCHSET 00/19] perf stat: Cleanup counter aggregation (v2)

Posted by Namhyung Kim 3 years, 5 months ago

Hi Jiri and Michael,

On Thu, Oct 13, 2022 at 11:56 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Thu, Oct 13, 2022 at 11:15:31PM -0700, Namhyung Kim wrote:
> > Hello,
> >
> > Current perf stat code is somewhat hard to follow since it handles
> > many combinations of PMUs/events for given display and aggregation
> > options.  This is my attempt to clean it up a little. ;-)
> >
> > changes in v2)
> >  * fix a segfault in perf stat report for per-process record  (Jiri)
> >  * fix metric only display  (Jiri)
> >  * add evsel__reset_aggr_stat  (ian)
> >  * add more comments  (Ian)
> >  * add Acked-by from Ian
> >
> > My first concern is that aggregation and display routines are intermixed
> > and processed differently depends on the aggregation mode.  I'd like to
> > separate them apart and make the logic clearer.
> >
> > To do that, I added struct perf_stat_aggr to save the aggregated counter
> > values and other info.  It'll be allocated and processed according to
> > the aggr_mode and display logic will use it.
> >
> > I've tested the following combination.
> >
> >   $ cat test-matrix.sh
> >   #!/bin/sh
> >
> >   set -e
> >
> >   yes > /dev/null &
> >   TARGET=$!
> >
> >   ./perf stat true
> >   ./perf stat -a true
> >   ./perf stat -C0 true
> >   ./perf stat -p $TARGET true
> >   ./perf stat -t $TARGET true
> >
> >   ./perf stat -a -A true
> >   ./perf stat -a --per-node true
> >   ./perf stat -a --per-socket true
> >   ./perf stat -a --per-die true
> >   ./perf stat -a --per-core true
> >   ./perf stat -a --per-thread true
> >
> >   ./perf stat -a -I 500 sleep 1
> >   ./perf stat -a -I 500 --summary sleep 1
> >   ./perf stat -a -I 500 --per-socket sleep 1
> >   ./perf stat -a -I 500 --summary --per-socket sleep 1
> >
> >   ./perf stat -a --metric-only true
> >   ./perf stat -a --metric-only --per-socket true
> >   ./perf stat -a --metric-only -I 500 sleep 1
> >   ./perf stat -a --metric-only -I 500 --per-socket sleep 1
> >
> >   ./perf stat record true && ./perf stat report
> >   ./perf stat record -p $TARGET true && ./perf stat report
> >   ./perf stat record -a true && ./perf stat report
> >   ./perf stat record -a --per-core true && ./perf stat report
> >   ./perf stat record -a --per-core --metric-only true && ./perf stat report
> >   ./perf stat record -a -I 500 sleep 1 && ./perf stat report
> >   ./perf stat record -a -I 500 --per-core sleep 1 && ./perf stat report
> >   ./perf stat record -a -I 500 --per-core --metric-only sleep 1 && ./perf stat report
> >
> >   ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ true
> >   ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ --percore-show-thread true
> >
> >   kill $TARGET
> >
> > The code is available at 'perf/stat-aggr-v2' branch in
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Michael,
> ay chance you could run your test suite on top of this change?

I've noticed there's an issue with cgroups.  Will send a fix soon.

Thanks,
Namhyung

[PATCH 20/19] perf stat: Factor out evsel__count_has_error()

Posted by Namhyung Kim 3 years, 5 months ago

It's possible to have 0 enabled/running time for some per-task or per-cgroup
events since it's not scheduled on any CPU.  Treating the whole event as
failed would not work in this case.  Thinking again, the code only existed
when any CPU-level aggregation is enabled (like per-socket, per-core, ...).

To make it clearer, factor out the condition check into the new
evsel__count_has_error() function and add some comments.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/stat.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 6ab9c58beca7..9dfa8cac6bc4 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -396,6 +396,25 @@ static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals,
 	return ret;
 }
 
+static bool evsel__count_has_error(struct evsel *evsel,
+				   struct perf_counts_values *count,
+				   struct perf_stat_config *config)
+{
+	/* the evsel was failed already */
+	if (evsel->err || evsel->counts->scaled == -1)
+		return true;
+
+	/* this is meaningful for CPU aggregation modes only */
+	if (config->aggr_mode == AGGR_GLOBAL)
+		return false;
+
+	/* it's considered ok when it actually ran */
+	if (count->ena != 0 && count->run != 0)
+		return false;
+
+	return true;
+}
+
 static int
 process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
 		       int cpu_map_idx, int thread,
@@ -450,11 +469,9 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
 
 			/*
 			 * When any result is bad, make them all to give consistent output
-			 * in interval mode.  But per-task counters can have 0 enabled time
-			 * when some tasks are idle.
+			 * in interval mode.
 			 */
-			if (((count->ena == 0 || count->run == 0) && cpu.cpu != -1) ||
-			    evsel->counts->scaled == -1) {
+			if (evsel__count_has_error(evsel, count, config) && !ps_aggr->failed) {
 				ps_aggr->counts.val = 0;
 				ps_aggr->counts.ena = 0;
 				ps_aggr->counts.run = 0;
-- 
2.38.0.413.g74048e4d9e-goog

Re: [PATCH 20/19] perf stat: Factor out evsel__count_has_error()

Posted by Arnaldo Carvalho de Melo 3 years, 5 months ago

Em Fri, Oct 14, 2022 at 11:16:55AM -0700, Namhyung Kim escreveu:
> It's possible to have 0 enabled/running time for some per-task or per-cgroup
> events since it's not scheduled on any CPU.  Treating the whole event as
> failed would not work in this case.  Thinking again, the code only existed
> when any CPU-level aggregation is enabled (like per-socket, per-core, ...).
> 
> To make it clearer, factor out the condition check into the new
> evsel__count_has_error() function and add some comments.

So I should just add this one to the 19-long patchkit I already applied
locally, ok.

- Arnaldo
 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/stat.c | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
> index 6ab9c58beca7..9dfa8cac6bc4 100644
> --- a/tools/perf/util/stat.c
> +++ b/tools/perf/util/stat.c
> @@ -396,6 +396,25 @@ static int check_per_pkg(struct evsel *counter, struct perf_counts_values *vals,
>  	return ret;
>  }
>  
> +static bool evsel__count_has_error(struct evsel *evsel,
> +				   struct perf_counts_values *count,
> +				   struct perf_stat_config *config)
> +{
> +	/* the evsel was failed already */
> +	if (evsel->err || evsel->counts->scaled == -1)
> +		return true;
> +
> +	/* this is meaningful for CPU aggregation modes only */
> +	if (config->aggr_mode == AGGR_GLOBAL)
> +		return false;
> +
> +	/* it's considered ok when it actually ran */
> +	if (count->ena != 0 && count->run != 0)
> +		return false;
> +
> +	return true;
> +}
> +
>  static int
>  process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
>  		       int cpu_map_idx, int thread,
> @@ -450,11 +469,9 @@ process_counter_values(struct perf_stat_config *config, struct evsel *evsel,
>  
>  			/*
>  			 * When any result is bad, make them all to give consistent output
> -			 * in interval mode.  But per-task counters can have 0 enabled time
> -			 * when some tasks are idle.
> +			 * in interval mode.
>  			 */
> -			if (((count->ena == 0 || count->run == 0) && cpu.cpu != -1) ||
> -			    evsel->counts->scaled == -1) {
> +			if (evsel__count_has_error(evsel, count, config) && !ps_aggr->failed) {
>  				ps_aggr->counts.val = 0;
>  				ps_aggr->counts.ena = 0;
>  				ps_aggr->counts.run = 0;
> -- 
> 2.38.0.413.g74048e4d9e-goog

-- 

- Arnaldo