[Patch v4 00/13] arch-PEBS enabling for Intel platforms

Dapeng Mi posted 13 patches 3 months, 2 weeks ago
There is a newer version of this series
This patchset introduces architectural PEBS support for Intel platforms
such as Clearwater Forest (CWF) and Panther Lake (PTL). Details about
arch-PEBS can be found in chapter 11 "architectural PEBS" of "Intel
Architecture Instruction Set Extensions and Future Features".

Compared with the v3 patchset, the most significant change is the removal
of sampling support for the new SIMD regs (OPMASK/YMM/ZMM). Because
supporting SIMD regs sampling is complicated, that support has been
split out into an independent patchset[1], and this patchset focuses only
on enabling arch-PEBS itself. Once basic SIMD regs sampling is supported,
arch-PEBS based SIMD regs (OPMASK/YMM/ZMM) sampling will be added on top
of it.

Changes:
  v3 -> v4:
  * Rebase code to 6.16-rc2
  * Extract the new SIMD regs sampling to an independent patchset
  * Fix the PEBS buffer allocation issue (Peter)
  * Fix the arch-PEBS dynamic constraints issue (Kan)

Tests:
  The tests below were run on Clearwater Forest and Panther Lake; no
  issues were found.

  1. Basic perf counting case.
    perf stat -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1

  2. Basic PMI based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}' sleep 1

  3. Basic PEBS based perf sampling case.
    perf record -e '{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}:p' sleep 1

  4. PEBS sampling case with basic, GPRs, vector-registers and LBR groups
    perf record -e branches:p -Iax,bx,ip,ssp,xmm0 -b -c 10000 sleep 1

  5. User space PEBS sampling case with basic, GPRs and LBR groups
    perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 sleep 1

  6. PEBS sampling case with auxiliary (memory info) group
    perf mem record sleep 1

  7. PEBS sampling case with counter group
    perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1

  8. Perf stat and record test
    perf test 96; perf test 125
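
  The cases above could be driven from one script along the following
  lines. This is only a sketch and not part of the patchset: the
  run_case helper, the DRY_RUN switch and the case names are
  hypothetical, and the perf command lines are copied unchanged from the
  list. With DRY_RUN=1 (the default) each command is only printed;
  DRY_RUN=0 is assumed to be used on an arch-PEBS capable machine with a
  perf tool built from this series.

```shell
#!/bin/sh
# Hypothetical wrapper for the test list above (illustration only).
# DRY_RUN=1 (default): print each command. DRY_RUN=0: execute it.

GROUP='{branches,branches,branches,branches,branches,branches,branches,branches,cycles,instructions,ref-cycles}'
failed=0

# run_case NAME CMD [ARGS...]: run one case and count failures.
run_case() {
    name=$1
    shift
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "DRY  $name: $*"
        return 0
    fi
    if "$@" >/dev/null 2>&1; then
        echo "PASS $name"
    else
        echo "FAIL $name"
        failed=$((failed + 1))
    fi
}

run_case "1-counting"      perf stat -e "$GROUP" sleep 1
run_case "2-pmi-sampling"  perf record -e "$GROUP" sleep 1
run_case "3-pebs-sampling" perf record -e "${GROUP}:p" sleep 1
run_case "4-pebs-groups"   perf record -e branches:p -Iax,bx,ip,ssp,xmm0 -b -c 10000 sleep 1
run_case "5-user-regs"     perf record -e branches:p --user-regs=ax,bx,ip -b -c 10000 sleep 1
run_case "6-mem-info"      perf mem record sleep 1
run_case "7-counter-group" perf record -e '{branches:p,branches,cycles}:S' -c 10000 sleep 1
run_case "8a-perf-test"    perf test 96
run_case "8b-perf-test"    perf test 125

echo "$failed case(s) failed"
```

  Note that a PASS here only means the command exited with status 0; it
  does not check the contents of the recorded samples.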


History:
  v3: https://lore.kernel.org/all/20250415114428.341182-1-dapeng1.mi@linux.intel.com/
  v2: https://lore.kernel.org/all/20250218152818.158614-1-dapeng1.mi@linux.intel.com/
  v1: https://lore.kernel.org/all/20250123140721.2496639-1-dapeng1.mi@linux.intel.com/

Ref:
  [1]: https://lore.kernel.org/all/20250613134943.3186517-1-kan.liang@linux.intel.com/


Dapeng Mi (13):
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Update dyn_constraint based on PEBS event precise level
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/x86: Support to sample SSP register
  perf/x86/intel: Support to sample SSP register for arch-PEBS
  perf tools: x86: Support to show SSP register

 arch/x86/events/core.c                        |  37 +-
 arch/x86/events/intel/core.c                  | 256 +++++++-
 arch/x86/events/intel/ds.c                    | 595 ++++++++++++++----
 arch/x86/events/perf_event.h                  |  46 +-
 arch/x86/include/asm/intel_ds.h               |  10 +-
 arch/x86/include/asm/msr-index.h              |  20 +
 arch/x86/include/asm/perf_event.h             | 117 +++-
 arch/x86/include/uapi/asm/perf_regs.h         |   4 +-
 arch/x86/kernel/perf_regs.c                   |   7 +
 tools/arch/x86/include/uapi/asm/perf_regs.h   |   7 +-
 tools/perf/arch/x86/util/perf_regs.c          |   2 +
 tools/perf/util/intel-pt.c                    |   2 +-
 .../perf/util/perf-regs-arch/perf_regs_x86.c  |   2 +
 13 files changed, 959 insertions(+), 146 deletions(-)


base-commit: e04c78d86a9699d136910cfc0bdcf01087e3267e
-- 
2.43.0