This series fixes correctness issues in Intel uncore PMU setup:
- If every init_box() call on a PMU fails, the PMU sysfs node may still
exist while perf events read zeros, silently reporting wrong data.
- If init_box() fails on only some dies, perf may return partial
non-zero counts, which is harder to diagnose.
- CPU hotplug ref/unref ordering bugs can skip init_box() when the first
CPU in a die comes online, and can call box_exit() prematurely when
the second-to-last CPU goes offline.
- PCI PMU cleanup on setup failure can leak pmu->activeboxes counts and
may dereference a NULL pointer in error paths.
To address these issues, the series introduces a PMU broken state to
track setup failures and switches MSR/MMIO PMUs to lazy registration,
matching the existing PCI behavior.
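
For readers unfamiliar with the hotplug path, the intended ref/unref
ordering and the broken state can be illustrated with a small user-space
model. This is only a sketch: every name below (model_box, model_pmu,
model_box_ref, and so on) is hypothetical and does not match the kernel
structures or the patches, and tying "registered" to the first successful
box init is an assumption about what lazy registration means here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model, not the kernel's uncore structures. */
struct model_box {
	int refcnt;        /* online CPUs in the die referencing this box */
	bool initialized;  /* modelled init succeeded and was not undone */
	int init_calls;    /* bookkeeping for the demo only */
	int exit_calls;
	bool fail_init;    /* knob: make the modelled init_box() fail */
};

struct model_pmu {
	bool broken;       /* set on the first init failure, never cleared */
	bool registered;   /* assumption: set lazily on first successful init */
};

static int model_init_box(struct model_box *box)
{
	box->init_calls++;
	return box->fail_init ? -1 : 0;
}

/* CPU online: take the reference, then act only on the 0 -> 1 transition. */
static int model_box_ref(struct model_pmu *pmu, struct model_box *box)
{
	if (++box->refcnt != 1)
		return 0;
	if (pmu->broken)
		return -1;
	if (model_init_box(box) != 0) {
		pmu->broken = true;
		return -1;
	}
	box->initialized = true;
	pmu->registered = true;
	return 0;
}

/* CPU offline: drop the reference, then act only on the 1 -> 0 transition. */
static void model_box_unref(struct model_box *box)
{
	assert(box->refcnt > 0);
	if (--box->refcnt != 0)
		return;
	if (box->initialized) {
		box->exit_calls++;
		box->initialized = false;
	}
}
```

In this model the first CPU of a die always triggers the init, and the
teardown runs only when the last CPU leaves, not the second-to-last one.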
To avoid merge conflicts, this series should be applied after:
https://lore.kernel.org/lkml/20260527151154.130505-1-zide.chen@intel.com/
(textual conflict, no logical dependency)
Only cosmetic changes in v3.
V3 changes:
- patch 2/8: Instead of removing atomic_inc(&box->refcnt) in PMU
register, add the corresponding atomic_dec_return(&box->refcnt) in
PMU unregister. (Dapeng)
- patch 6/8: Minor changes in code comments.
- patch 7/8: Minor changelog update. (Dapeng)
- Add Reviewed-by tags.
V2 changes:
- Add new patch 1 to fix PCI PMU cleanup issues (Sashiko)
- Keep pmu->activeboxes naming and semantics to avoid potential refcnt
leaks in the uncore_pci_remove() path. To accomplish this, make the
PMU broken flag sticky and decrement pmu->activeboxes for active boxes
only.
- Update commit messages and changelogs accordingly.
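
The accounting rule in the second V2 item can be sketched in plain C as
follows. The names (sketch_pmu, sketch_box, sketch_add_box,
sketch_remove_box) are hypothetical and only model the idea; they are not
taken from the patches, and the real code uses atomic counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the accounting rule, not the kernel code. */
struct sketch_pmu {
	int activeboxes;  /* boxes that were set up successfully */
	bool broken;      /* sticky: once set, it stays set */
};

struct sketch_box {
	bool active;      /* true only if this box was counted */
};

/* setup_ok stands in for the result of the box setup. */
static int sketch_add_box(struct sketch_pmu *pmu, struct sketch_box *box,
			  bool setup_ok)
{
	if (pmu->broken)
		return -1;            /* a broken PMU accepts no more boxes */
	if (!setup_ok) {
		pmu->broken = true;   /* record the failure permanently */
		return -1;
	}
	box->active = true;
	pmu->activeboxes++;
	return 0;
}

static void sketch_remove_box(struct sketch_pmu *pmu, struct sketch_box *box)
{
	if (!box->active)
		return;               /* never counted, so nothing to drop */
	box->active = false;
	assert(pmu->activeboxes > 0);
	pmu->activeboxes--;
	/* pmu->broken is deliberately left untouched */
}
```

Because a failed box is never marked active, removing it later cannot
unbalance the counter, which is the leak this item is meant to avoid.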
V2: https://lore.kernel.org/lkml/20260601170114.173359-1-zide.chen@intel.com/
V1: https://lore.kernel.org/lkml/20260512233048.9577-1-zide.chen@intel.com/
Sashiko's review: https://sashiko.dev/#/patchset/20260512233048.9577-1-zide.chen@intel.com
Zide Chen (8):
perf/x86/intel/uncore: Fix PCI PMU cleanup on setup failure
perf/x86/intel/uncore: Fix refcnt and other cleanups
perf/x86/intel/uncore: Let init_box() callback report failures
perf/x86/intel/uncore: Keep PCI PMUs working when MMIO/MSR setup fails
perf/x86/intel/uncore: Factor out box setup code
perf/x86/intel/uncore: Introduce PMU flags and broken state
perf/x86/intel/uncore: Fix uncore_box ref/unref ordering
perf/x86/intel/uncore: Implement lazy setup for MSR/MMIO PMUs
arch/x86/events/intel/uncore.c | 225 +++++++++++------------
arch/x86/events/intel/uncore.h | 39 +++-
arch/x86/events/intel/uncore_discovery.c | 21 ++-
arch/x86/events/intel/uncore_discovery.h | 6 +-
arch/x86/events/intel/uncore_nhmex.c | 3 +-
arch/x86/events/intel/uncore_snb.c | 82 ++++++---
arch/x86/events/intel/uncore_snbep.c | 77 +++++---
7 files changed, 255 insertions(+), 198 deletions(-)
--
2.54.0