In multi-socket or Sub-NUMA Clustering (SNC) configurations, uncore
metrics (such as lpm_miss_lat) report incorrect values because they
divide by a static per-socket CHA count rather than by the number of
units in the aggregation target.

Fix this by using the aggregation count (`aggr->nr`) of the metric's
leader event. In standard aggregation modes, `aggr->nr` is already
populated with the number of active hardware units contributing to
that stats bucket.

Before the fix:

  $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "counters" : 28, "ns lpm_miss_lat_rem" : "163.4", "ns lpm_miss_lat_loc" : "27.5"}
  {"socket" : "S1", "counters" : 28, "ns lpm_miss_lat_rem" : "170.5", "ns lpm_miss_lat_loc" : "25.5"}

  $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns lpm_miss_lat_rem" : "90.4", "ns lpm_miss_lat_loc" : "12.6"}

With global aggregation, `lpm_miss_lat` is roughly 0.5x the actual
value.

After the fix:

  $ perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "counters" : 28, "ns lpm_miss_lat_rem" : "174.8", "ns lpm_miss_lat_loc" : "34.8"}
  {"socket" : "S1", "counters" : 28, "ns lpm_miss_lat_rem" : "170.4", "ns lpm_miss_lat_loc" : "23.9"}

  $ perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns lpm_miss_lat_rem" : "174.5", "ns lpm_miss_lat_loc" : "26.2"}

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
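A rough sketch of why the wrong divisor halves the global result. This
is not perf code: the function name, the formula, and all numbers are
assumptions for illustration only (the real lpm_miss_lat expression may
differ). It assumes a latency of the form cycles-per-request divided by
a per-unit clock rate, where the clock rate comes from clockticks
summed over every unit in the bucket and divided by a unit count.

```c
/*
 * Hypothetical model of an uncore latency metric (illustration only).
 * summed_clockticks: clockticks summed over all units in the bucket.
 * divisor: unit count used to turn the sum into a per-unit clock rate.
 */
static double latency_ns(double cycles_per_req, double summed_clockticks,
			 int divisor, double seconds)
{
	double hz;

	if (divisor <= 0 || seconds <= 0.0)
		return 0.0;

	/* Per-unit clock rate derived from the summed counter. */
	hz = summed_clockticks / divisor / seconds;
	if (hz <= 0.0)
		return 0.0;

	return cycles_per_req * 1e9 / hz;
}
```

With made-up values of 2 sockets, 28 units per socket, 2e9 ticks per
unit over 1 second, and 350 cycles per request, the global bucket sums
56 units. latency_ns(350.0, 56 * 2e9, 56, 1.0) gives 175 ns, while
using the static per-socket count, latency_ns(350.0, 56 * 2e9, 28, 1.0),
gives 87.5 ns, i.e. half the expected value.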
tools/perf/util/stat-shadow.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index bc2d44df7baf..60fdf26c5bb0 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -103,7 +103,7 @@ static int prepare_metric(struct perf_stat_config *config,
val *= 1e-9;
}
if (!source_count)
- source_count = evsel__source_count(metric_events[i]);
+ source_count = aggr->nr;
}
}
n = strdup(evsel__metric_id(metric_events[i]));
--
2.54.0.746.g67dd491aae-goog