[PATCH] perf stat: Fix uncore metric scaling bug across sockets and nodes

Chun-Tse Shao posted 1 patch 6 days, 4 hours ago
tools/perf/util/stat-shadow.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
In multi-socket or Sub-NUMA Clustering (SNC) configurations, uncore
metrics (such as lpm_miss_lat) are computed incorrectly because they
divide by a static per-socket CHA count rather than by the number of
units in the aggregation target.

Fix this by using the aggregation count (`aggr->nr`) of the metric's
leader event instead. In the standard aggregation modes, `aggr->nr` is
already populated with the number of active hardware units contributing
to that stats bucket.

Before the fix:
  perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "counters" : 28, "ns  lpm_miss_lat_rem" : "163.4", "ns  lpm_miss_lat_loc" : "27.5"}
  {"socket" : "S1", "counters" : 28, "ns  lpm_miss_lat_rem" : "170.5", "ns  lpm_miss_lat_loc" : "25.5"}
  perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns  lpm_miss_lat_rem" : "90.4", "ns  lpm_miss_lat_loc" : "12.6"}
With global aggregation, `lpm_miss_lat` is reported at about 0.5x the
actual value.

After the fix:
  perf stat -M lpm_miss_lat --per-socket --metric-only -a -j -- sleep 1
  {"socket" : "S0", "counters" : 28, "ns  lpm_miss_lat_rem" : "174.8", "ns  lpm_miss_lat_loc" : "34.8"}
  {"socket" : "S1", "counters" : 28, "ns  lpm_miss_lat_rem" : "170.4", "ns  lpm_miss_lat_loc" : "23.9"}
  perf stat -M lpm_miss_lat --metric-only -a -j -- sleep 1
  {"ns  lpm_miss_lat_rem" : "174.5", "ns  lpm_miss_lat_loc" : "26.2"}

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
---
 tools/perf/util/stat-shadow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index bc2d44df7baf..60fdf26c5bb0 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -103,7 +103,7 @@ static int prepare_metric(struct perf_stat_config *config,
 					val *= 1e-9;
 				}
 				if (!source_count)
-					source_count = evsel__source_count(metric_events[i]);
+					source_count = aggr->nr;
 			}
 		}
 		n = strdup(evsel__metric_id(metric_events[i]));
-- 
2.54.0.746.g67dd491aae-goog