This series fixes an issue where cgroup metrics were not correctly
associated and were reported as `nan %` when BPF counters are used
(i.e., with `--bpf-counters`).

The root cause is that cgroup BPF counters only open the "leader" events
(for the first cgroup) and leave "follower" events unopened. Unopened
events have their `supported` flag set to `false` by default. During
metric calculation, `prepare_metric` checks `evsel->supported` and
discards the value (setting it to `NAN`) if it is `false`, leading to
`nan %` in the output.
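
As a minimal sketch of why an unopened follower ends up as `nan %`, the
effect of the `supported` check can be modeled as below. The struct and
function names are hypothetical stand-ins for illustration, not perf's
actual `prepare_metric` code:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for perf's struct evsel; only the fields needed here. */
struct sketch_evsel {
	bool supported;	/* stays false for events that were never opened */
	double count;	/* aggregated counter value */
};

/* Value a metric expression would see for this event (assumed behavior). */
static double sketch_metric_input(const struct sketch_evsel *evsel)
{
	if (!evsel->supported)
		return NAN;	/* value discarded, as described above */
	return evsel->count;
}

/* An insn_per_cycle style ratio: any NAN input makes the result NAN. */
static double sketch_insn_per_cycle(const struct sketch_evsel *insn,
				    const struct sketch_evsel *cycles)
{
	return sketch_metric_input(insn) / sketch_metric_input(cycles);
}
```

In this model a follower event with a valid count but `supported == false`
still poisons the whole ratio, which matches the symptom in the report.
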
The first patch fixes this by propagating the `supported` flag from the
leader events to the follower events in `bperf_load_program`. It also
adds a validation check to prevent potential division-by-zero (SIGFPE)
crashes.
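
The propagation and the new validation check can be sketched roughly as
follows. This is an illustration under stated assumptions, not the code
in the patch: it assumes the events are laid out with the first
`num_events` entries as leaders and every later entry as a follower of
the leader at the same position modulo `num_events`. The type, function
name and `-1` error return are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an event; only the flag of interest. */
struct sketch_event {
	bool supported;
};

/*
 * Copy each leader's supported flag to its followers.
 * Returns 0 on success, -1 if num_events is 0 (which would otherwise
 * make the modulo below divide by zero).
 */
static int sketch_propagate_supported(struct sketch_event *events,
				      size_t nr_entries, size_t num_events)
{
	size_t i;

	if (num_events == 0)
		return -1;

	/* Entries [0, num_events) are leaders; the rest are followers. */
	for (i = num_events; i < nr_entries; i++)
		events[i].supported = events[i % num_events].supported;

	return 0;
}
```

With six entries and `num_events == 3` (two cgroups), entries 3, 4 and 5
take their flag from entries 0, 1 and 2 respectively.
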
The second patch adds a new shell test (`stat_metrics_cgrp.sh`) that
covers this scenario both with and without BPF counters, to prevent
regressions.

Reported-by: Svilen Kanev <skanev@google.com>

Changes since v3:
- Add validation check in `bperf_load_program` to prevent
  potential division-by-zero (SIGFPE) when `num_events` is 0 (e.g.,
  due to a trailing comma in cgroups list).
- In the test script, dynamically pair instructions and cycles
  events using the PMU name. This ensures that we correctly verify
  if the associated cycles event was counted before performing the
  metric check, regardless of which event (cycles or instructions)
  is the metric leader.
- Collect Namhyung's Acked-by for patch 1.

Changes since v2:
- Use `--metrics=insn_per_cycle` for the system-wide capability
  check in `check_system_wide` to ensure both cycles and instructions
  are supported.
- Use exact cgroup field matching with surrounding commas
  (`,${cgrp},`) in the `grep` check to prevent false positives when
  matching the root cgroup (`/`).
- Only enforce metric validation if both `cycles` and `instructions`
  counts are numeric and non-zero, preventing false failures on
  idle cgroups.
- Wrap long lines and fix trailing whitespace in the test script to
  comply with style guidelines.

Changes since v1:
- Fix commit message for patch 2 to remove the mention of the
  `insn_per_cycle` check (which was simplified out).
- Quote variables in the shell script to prevent word splitting.
- Use `grep -F` in the shell script for literal matching of cgroup
  names containing dots.

Ian Rogers (2):
  perf stat: Propagate supported flag to follower cgroup BPF events
  perf test: Add stat metrics --for-each-cgroup test

 tools/perf/tests/shell/stat_metrics_cgrp.sh | 200 ++++++++++++++++++++
 tools/perf/util/bpf_counter_cgroup.c        |  20 ++
 2 files changed, 220 insertions(+)
 create mode 100755 tools/perf/tests/shell/stat_metrics_cgrp.sh

--
2.54.0.631.ge1b05301d1-goog