Two semi-related perf throttling fixes

[PATCH 1/2] perf/x86: Avoid double accounting of PMU NMI latencies
Posted by Calvin Owens 18 hours ago
Because NMIs always poll all handlers, calling perf_sample_event_took()
unconditionally in perf_ibs_nmi_handler() and perf_event_nmi_handler()
causes two latency numbers to be fed into the exponentially weighted
moving average for each NMI on AMD machines, one of which is much
smaller than the other:

    <...>-70985   [029] d.Z1. 13311.704313: nmi_handler: perf_event_nmi_handler() delta_ns: 6732 handled: 1
    <...>-70985   [029] d.Z1. 13311.704317: nmi_handler: nmi_cpu_backtrace_handler() delta_ns: 1673 handled: 0
    <...>-70985   [029] d.Z1. 13311.704319: nmi_handler: perf_ibs_nmi_handler() delta_ns: 2064 handled: 0

This can bias the average unrealistically low, in this case because the
latency of perf_ibs_handle_irq() doing nothing is averaged with the
latency of amd_pmu_v2_handle_irq() doing real work:

    # bpftrace -e 'kprobe:perf_sample_event_took {\
    printf("%s: cpu=%02d sample_len_ns=%d\n", strftime("%S.%f", nsecs), cpu(), arg0); }'
    Attached 1 probe
    02.836860: cpu=17 sample_len_ns=7775
    02.836871: cpu=17 sample_len_ns=1492  // avg=4634
    03.042803: cpu=20 sample_len_ns=4298
    03.042810: cpu=20 sample_len_ns=1152  // avg=2725
    03.204410: cpu=27 sample_len_ns=6973
    03.204420: cpu=27 sample_len_ns=1302  // avg=4137
    03.622364: cpu=00 sample_len_ns=5270
    03.622371: cpu=00 sample_len_ns=992   // avg=3131

Avoid the problem by only accounting the latency of the handler which
actually handled the NMI.

Fixes: c2872d381f1a ("perf/x86/ibs: Add IBS interrupt to the dynamic throttle")
Signed-off-by: Calvin Owens <calvin@wbinvd.org>
---
 arch/x86/events/amd/ibs.c | 6 +++---
 arch/x86/events/core.c    | 3 ++-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index aca89f23d2e0..036385de2123 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -1402,10 +1402,10 @@ perf_ibs_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 	handled += perf_ibs_handle_irq(&perf_ibs_fetch, regs);
 	handled += perf_ibs_handle_irq(&perf_ibs_op, regs);
 
-	if (handled)
+	if (handled) {
 		inc_irq_stat(apic_perf_irqs);
-
-	perf_sample_event_took(sched_clock() - stamp);
+		perf_sample_event_took(sched_clock() - stamp);
+	}
 
 	return handled;
 }
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 810ab21ffd99..d1c7612e2e5b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1814,7 +1814,8 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 	ret = static_call(x86_pmu_handle_irq)(regs);
 	finish_clock = sched_clock();
 
-	perf_sample_event_took(finish_clock - start_clock);
+	if (ret)
+		perf_sample_event_took(finish_clock - start_clock);
 
 	return ret;
 }
-- 
2.47.3
[PATCH 1/2] perf/x86: Avoid double accounting of PMU NMI latencies
[PATCH 2/2] perf: Don't throttle based on NMI watchdog events