[PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs

Sean Christopherson posted 44 patches 1 month, 4 weeks ago
[PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs
Posted by Sean Christopherson 1 month, 4 weeks ago
This series is based on the fastpath+PMU cleanups series[*] (which is based on
kvm/queue), but the non-KVM changes apply cleanly on v6.16 or Linus' tree.
I.e. if you only care about the perf changes, I would just apply on whatever
branch is convenient and stop when you hit the KVM changes.

My hope/plan is that the perf changes will go through the tip tree with a
stable tag/branch, and the KVM changes will go through the kvm-x86 tree.

Non-x86 KVM folks, y'all are getting Cc'd due to minor changes in "KVM: Add a
simplified wrapper for registering perf callbacks".

The full set is also available at:

  https://github.com/sean-jc/linux.git tags/mediated-vpmu-v5

Add support for mediated vPMUs in KVM x86, where "mediated" aligns with the
standard definition of intercepting control operations (e.g. event selectors),
while allowing the guest to perform data operations (e.g. read PMCs, toggle
counters on/off) without KVM getting involved.
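
As a rough illustration of that split (not code from the series), the sketch
below classifies the three example operations named above.  All identifiers
are hypothetical, and the real intercept decisions also depend on hardware
capabilities and the vCPU's PMU configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical access classes, named only for this sketch. */
enum toy_pmu_access {
	TOY_PMU_WRITE_EVENT_SELECTOR,	/* control: picks what gets counted */
	TOY_PMU_READ_PMC,		/* data: read a counter */
	TOY_PMU_TOGGLE_COUNTERS,	/* data: turn counters on/off */
	TOY_PMU_NR_ACCESS,
};

/* true = KVM intercepts the access, false = guest touches hardware directly. */
static const bool toy_intercepted[TOY_PMU_NR_ACCESS] = {
	[TOY_PMU_WRITE_EVENT_SELECTOR]	= true,
	[TOY_PMU_READ_PMC]		= false,
	[TOY_PMU_TOGGLE_COUNTERS]	= false,
};

static bool toy_mediated_pmu_intercepts(enum toy_pmu_access access)
{
	assert(access < TOY_PMU_NR_ACCESS);
	return toy_intercepted[access];
}
```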

For an in-depth description of the what and why, please see the cover letter
from the original RFC:

  https://lore.kernel.org/all/20240126085444.324918-1-xiong.y.zhang@linux.intel.com

All KVM tests pass (or fail the same before and after), and I've manually
verified MSR/PMC are passed through as expected, but I haven't done much at all
to actually utilize the PMU in a guest.  I'll be amazed if I didn't make at
least one major goof.

Similarly, I tried to address all feedback, but there are many, many changes
relative to v4.  If I missed something, I apologize in advance.

In other words, please thoroughly review and test.

[*] https://lore.kernel.org/all/20250805190526.1453366-1-seanjc@google.com

v5:
 - Add a patch to call security_perf_event_free() from __free_event()
   instead of _free_event() (necessitated by the __cleanup() changes).
 - Add CONFIG_PERF_GUEST_MEDIATED_PMU to guard the new perf functionality.
 - Ensure the PMU is fully disabled in perf_{load,put}_guest_context() when
   switching between guest and host context. [Kan, Namhyung]
 - Route the new system IRQ, PERF_GUEST_MEDIATED_PMI_VECTOR, through perf,
   not KVM, and play nice with FRED.
 - Rename and combine perf_{guest,host}_{enter,exit}() to a single set of
   APIs, perf_{load,put}_guest_context().
 - Rename perf_{get,put}_mediated_pmu() to perf_{create,release}_mediated_pmu()
   to (hopefully) better differentiate them from perf_{load,put}_guest_context().
 - Change the param to the load/put APIs from "u32 guest_lvtpc" to
   "unsigned long data" to decouple arch code as much as possible.  E.g. if
   a non-x86 arch were to ever support a mediated vPMU, @data could be used
   to pass a pointer to a struct.
 - Use pmu->version to detect if a vCPU has a mediated PMU.
 - Use a kvm_x86_ops hook to check for mediated PMU support.
 - Cull "passthrough" from as many places as I could find.
 - Improve the changelog/documentation related to RDPMC interception.
 - Check hardware capabilities, not KVM capabilities, when calculating
   MSR and RDPMC intercepts.
 - Rework intercept (re)calculation to use a request and the existing (well,
   will be existing as of 6.17-rc1) vendor hooks for recalculating intercepts.
 - Always read PERF_GLOBAL_CTRL on VM-Exit if writes weren't intercepted while
   running the vCPU.
 - Call setup_vmcs_config() before kvm_x86_vendor_init() so that the golden
   VMCS configuration is known before kvm_init_pmu_capability() is called.
 - Keep as much refresh/init code in common x86 as possible.
 - Context switch PMCs and event selectors in common x86, not vendor code.
 - Bail from the VM-Exit fastpath if the guest is counting instructions
   retired and the mediated PMU is enabled (because guest state hasn't yet
   been synchronized with hardware).
 - Don't require userspace to opt in via KVM_CAP_PMU_CAPABILITY, and instead
   automatically "create" a mediated PMU on the first KVM_CREATE_VCPU call if
   the VM has an in-kernel local APIC.
 - Add entries in kernel-parameters.txt for the PMU params.
 - Add a patch to elide PMC writes when possible.
 - Many more fixups and tweaks...
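
To make the "unsigned long data" item above a bit more concrete, here is a
minimal userspace sketch of how one opaque parameter could carry either a
plain value (the x86 guest LVTPC case) or a pointer to a struct (the
hypothetical non-x86 case).  The struct, its fields, and both helper names are
made up for illustration; this is not the perf API itself, and the pointer
cast assumes unsigned long is pointer-sized, as it is in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-arch context for a non-x86 architecture. */
struct toy_arch_guest_ctx {
	uint32_t pmi_irq;
	uint32_t flags;
};

/* x86-style consumer: @data carries the guest LVTPC value itself. */
static uint32_t toy_x86_guest_lvtpc(unsigned long data)
{
	return (uint32_t)data;
}

/* Other-arch consumer: @data carries a pointer to a struct. */
static uint32_t toy_other_arch_pmi_irq(unsigned long data)
{
	const struct toy_arch_guest_ctx *ctx =
		(const struct toy_arch_guest_ctx *)data;

	assert(ctx != NULL);
	return ctx->pmi_irq;
}
```

The common code only ever sees an unsigned long, so it never needs to know
which interpretation the arch code chose.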

v4:
 - https://lore.kernel.org/all/20250324173121.1275209-1-mizhang@google.com
 - Rebase whole patchset on 6.14-rc3 base.
 - Address Peter's comments on Perf part.
 - Address Sean's comments on KVM part.
   * Change key word "passthrough" to "mediated" in all patches
   * Change static enabling to user space dynamic enabling via KVM_CAP_PMU_CAPABILITY.
   * Only support GLOBAL_CTRL save/restore with VMCS exec_ctrl, drop the MSR
     save/restore list support for GLOBAL_CTRL, thus the support of mediated
     vPMU is constrained to SapphireRapids and later CPUs on Intel side.
   * Merge some small changes into a single patch.
 - Address Sandipan's comment on invalid pmu pointer.
 - Add back "eventsel_hw" and "fixed_ctr_ctrl_hw" to avoid directly
   manipulating pmc->eventsel and pmu->fixed_ctr_ctrl.
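
The idea behind a separate "_hw" copy can be sketched as below, using the AMD
GuestOnly/HostOnly stuffing from the shortlog as the example.  This is a toy
model under stated assumptions, not the KVM code: the struct and function are
hypothetical, and the two bit positions are taken from AMD's PERF_CTL layout
(GuestOnly = bit 40, HostOnly = bit 41).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions per AMD's PERF_CTL layout; names are local to this sketch. */
#define TOY_EVENTSEL_GUESTONLY	(1ULL << 40)
#define TOY_EVENTSEL_HOSTONLY	(1ULL << 41)

struct toy_pmc {
	uint64_t eventsel;	/* value the guest wrote and will read back */
	uint64_t eventsel_hw;	/* value actually programmed into hardware */
};

/* Keep the guest's view untouched; only the hardware copy is adjusted. */
static void toy_amd_write_eventsel(struct toy_pmc *pmc, uint64_t data)
{
	assert(pmc != NULL);
	pmc->eventsel = data;
	pmc->eventsel_hw = (data & ~TOY_EVENTSEL_HOSTONLY) |
			   TOY_EVENTSEL_GUESTONLY;
}
```

Because the guest-visible value is never modified, a later read by the guest
returns exactly what it wrote, regardless of what was stuffed into hardware.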

v3: https://lore.kernel.org/all/20240801045907.4010984-1-mizhang@google.com
v2: https://lore.kernel.org/all/20240506053020.3911940-1-mizhang@google.com
v1: https://lore.kernel.org/all/20240126085444.324918-1-xiong.y.zhang@linux.intel.com

Dapeng Mi (15):
  KVM: x86/pmu: Start stubbing in mediated PMU support
  KVM: x86/pmu: Implement Intel mediated PMU requirements and
    constraints
  KVM: x86: Rename vmx_vmentry/vmexit_ctrl() helpers
  KVM: x86/pmu: Move PMU_CAP_{FW_WRITES,LBR_FMT} into msr-index.h header
  KVM: VMX: Add helpers to toggle/change a bit in VMCS execution
    controls
  KVM: x86/pmu: Disable RDPMC interception for compatible mediated vPMU
  KVM: x86/pmu: Load/save GLOBAL_CTRL via entry/exit fields for mediated
    PMU
  KVM: x86/pmu: Use BIT_ULL() instead of open coded equivalents
  KVM: x86/pmu: Disable interception of select PMU MSRs for mediated
    vPMUs
  KVM: x86/pmu: Bypass perf checks when emulating mediated PMU counter
    accesses
  KVM: x86/pmu: Reprogram mediated PMU event selectors on event filter
    updates
  KVM: x86/pmu: Load/put mediated PMU context when entering/exiting
    guest
  KVM: x86/pmu: Handle emulated instruction for mediated vPMU
  KVM: nVMX: Add macros to simplify nested MSR interception setting
  KVM: x86/pmu: Expose enable_mediated_pmu parameter to user space

Kan Liang (7):
  perf: Skip pmu_ctx based on event_type
  perf: Add generic exclude_guest support
  perf: Add APIs to create/release mediated guest vPMUs
  perf: Clean up perf ctx time
  perf: Add a EVENT_GUEST flag
  perf: Add APIs to load/put guest mediated PMU context
  perf/x86/intel: Support PERF_PMU_CAP_MEDIATED_VPMU

Mingwei Zhang (3):
  perf/x86/core: Plumb mediated PMU capability from x86_pmu to
    x86_pmu_cap
  KVM: x86/pmu: Introduce eventsel_hw to prepare for pmu event filtering
  KVM: nVMX: Disable PMU MSR interception as appropriate while running
    L2

Sandipan Das (3):
  perf/x86/core: Do not set bit width for unavailable counters
  perf/x86/amd: Support PERF_PMU_CAP_MEDIATED_VPMU for AMD host
  KVM: x86/pmu: Always stuff GuestOnly=1,HostOnly=0 for mediated PMCs on
    AMD

Sean Christopherson (15):
  perf: Move security_perf_event_free() call to __free_event()
  perf: core/x86: Register a new vector for handling mediated guest PMIs
  perf/x86: Switch LVTPC to/from mediated PMI vector on guest load/put
    context
  KVM: VMX: Setup canonical VMCS config prior to kvm_x86_vendor_init()
  KVM: SVM: Check pmu->version, not enable_pmu, when getting PMC MSRs
  KVM: Add a simplified wrapper for registering perf callbacks
  KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities
  KVM: x86/pmu: Implement AMD mediated PMU requirements
  KVM: x86: Rework KVM_REQ_MSR_FILTER_CHANGED into a generic
    RECALC_INTERCEPTS
  KVM: x86: Use KVM_REQ_RECALC_INTERCEPTS to react to CPUID updates
  KVM: x86/pmu: Move initialization of valid PMCs bitmask to common x86
  KVM: x86/pmu: Restrict GLOBAL_{CTRL,STATUS}, fixed PMCs, and PEBS to
    PMU v2+
  KVM: x86/pmu: Disallow emulation in the fastpath if mediated PMCs are
    active
  KVM: nSVM: Disable PMU MSR interception as appropriate while running
    L2
  KVM: x86/pmu: Elide WRMSRs when loading guest PMCs if values already
    match

Xiong Zhang (1):
  KVM: x86/pmu: Register PMI handler for mediated vPMU

 .../admin-guide/kernel-parameters.txt         |  49 ++
 arch/arm64/kvm/arm.c                          |   2 +-
 arch/loongarch/kvm/main.c                     |   2 +-
 arch/riscv/kvm/main.c                         |   2 +-
 arch/x86/entry/entry_fred.c                   |   1 +
 arch/x86/events/amd/core.c                    |   2 +
 arch/x86/events/core.c                        |  32 +-
 arch/x86/events/intel/core.c                  |   5 +
 arch/x86/include/asm/hardirq.h                |   3 +
 arch/x86/include/asm/idtentry.h               |   6 +
 arch/x86/include/asm/irq_vectors.h            |   4 +-
 arch/x86/include/asm/kvm-x86-ops.h            |   2 +-
 arch/x86/include/asm/kvm-x86-pmu-ops.h        |   4 +
 arch/x86/include/asm/kvm_host.h               |   7 +-
 arch/x86/include/asm/msr-index.h              |  17 +-
 arch/x86/include/asm/perf_event.h             |   1 +
 arch/x86/include/asm/vmx.h                    |   1 +
 arch/x86/kernel/idt.c                         |   3 +
 arch/x86/kernel/irq.c                         |  19 +
 arch/x86/kvm/Kconfig                          |   1 +
 arch/x86/kvm/cpuid.c                          |   2 +
 arch/x86/kvm/pmu.c                            | 272 ++++++++-
 arch/x86/kvm/pmu.h                            |  37 +-
 arch/x86/kvm/svm/nested.c                     |  18 +-
 arch/x86/kvm/svm/pmu.c                        |  51 +-
 arch/x86/kvm/svm/svm.c                        |  54 +-
 arch/x86/kvm/vmx/capabilities.h               |  11 +-
 arch/x86/kvm/vmx/main.c                       |  14 +-
 arch/x86/kvm/vmx/nested.c                     |  65 ++-
 arch/x86/kvm/vmx/pmu_intel.c                  | 169 ++++--
 arch/x86/kvm/vmx/pmu_intel.h                  |  15 +
 arch/x86/kvm/vmx/vmx.c                        | 143 +++--
 arch/x86/kvm/vmx/vmx.h                        |  11 +-
 arch/x86/kvm/vmx/x86_ops.h                    |   2 +-
 arch/x86/kvm/x86.c                            |  69 ++-
 arch/x86/kvm/x86.h                            |   1 +
 include/linux/kvm_host.h                      |  11 +-
 include/linux/perf_event.h                    |  38 +-
 init/Kconfig                                  |   4 +
 kernel/events/core.c                          | 521 ++++++++++++++----
 .../beauty/arch/x86/include/asm/irq_vectors.h |   3 +-
 virt/kvm/kvm_main.c                           |   6 +-
 42 files changed, 1385 insertions(+), 295 deletions(-)


base-commit: 53d61a43a7973f812caa08fa922b607574befef4
-- 
2.50.1.565.gc32cd1483b-goog
Re: [PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs
Posted by Sandipan Das 1 month, 3 weeks ago
On 07-08-2025 01:26, Sean Christopherson wrote:
> [...]

No issues seen with KUT and KVM kselftest runs on the following types of
AMD host systems.
- Milan (does not have PerfMonV2, cannot use Mediated PMU)
- Genoa and Turin (have PerfMonV2)

Tested with all combinations of kvm.force_emulation_prefix and
kvm_amd.enable_mediated_pmu. The issue seen previously where RDPMC gets
intercepted on secondary vCPUs has also been addressed.
Re: [PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs
Posted by Hao, Xudong 1 month, 1 week ago

On 8/7/2025 3:56 AM, Sean Christopherson wrote:
> [...]
>   arch/x86/entry/entry_fred.c                   |   1 +
>   arch/x86/events/amd/core.c                    |   2 +
>   arch/x86/events/core.c                        |  32 +-
>   arch/x86/events/intel/core.c                  |   5 +
>   arch/x86/include/asm/hardirq.h                |   3 +
>   arch/x86/include/asm/idtentry.h               |   6 +
>   arch/x86/include/asm/irq_vectors.h            |   4 +-
>   arch/x86/include/asm/kvm-x86-ops.h            |   2 +-
>   arch/x86/include/asm/kvm-x86-pmu-ops.h        |   4 +
>   arch/x86/include/asm/kvm_host.h               |   7 +-
>   arch/x86/include/asm/msr-index.h              |  17 +-
>   arch/x86/include/asm/perf_event.h             |   1 +
>   arch/x86/include/asm/vmx.h                    |   1 +
>   arch/x86/kernel/idt.c                         |   3 +
>   arch/x86/kernel/irq.c                         |  19 +
>   arch/x86/kvm/Kconfig                          |   1 +
>   arch/x86/kvm/cpuid.c                          |   2 +
>   arch/x86/kvm/pmu.c                            | 272 ++++++++-
>   arch/x86/kvm/pmu.h                            |  37 +-
>   arch/x86/kvm/svm/nested.c                     |  18 +-
>   arch/x86/kvm/svm/pmu.c                        |  51 +-
>   arch/x86/kvm/svm/svm.c                        |  54 +-
>   arch/x86/kvm/vmx/capabilities.h               |  11 +-
>   arch/x86/kvm/vmx/main.c                       |  14 +-
>   arch/x86/kvm/vmx/nested.c                     |  65 ++-
>   arch/x86/kvm/vmx/pmu_intel.c                  | 169 ++++--
>   arch/x86/kvm/vmx/pmu_intel.h                  |  15 +
>   arch/x86/kvm/vmx/vmx.c                        | 143 +++--
>   arch/x86/kvm/vmx/vmx.h                        |  11 +-
>   arch/x86/kvm/vmx/x86_ops.h                    |   2 +-
>   arch/x86/kvm/x86.c                            |  69 ++-
>   arch/x86/kvm/x86.h                            |   1 +
>   include/linux/kvm_host.h                      |  11 +-
>   include/linux/perf_event.h                    |  38 +-
>   init/Kconfig                                  |   4 +
>   kernel/events/core.c                          | 521 ++++++++++++++----
>   .../beauty/arch/x86/include/asm/irq_vectors.h |   3 +-
>   virt/kvm/kvm_main.c                           |   6 +-
>   42 files changed, 1385 insertions(+), 295 deletions(-)
> 
> 
> base-commit: 53d61a43a7973f812caa08fa922b607574befef4

Tested KUT, KVM kselftests, the perf fuzzer, and an internal perf test suite
on 4 Intel platforms: Sapphire Rapids (SPR), Granite Rapids (GNR), Sierra
Forest (SRF), and Clearwater Forest (CWF). No issues were found with either
the mediated vPMU or the legacy perf-based vPMU.

All tests can be classified into 3 types: Bare Metal perf test, KVM 
guest perf test, and Host-Guest concurrent perf test.

* Bare Metal perf test
1. The perf fuzzer passed on all 4 Intel platforms (SPR, GNR, SRF, CWF).
2. The internal perf test suite passed on all 4 Intel platforms (SPR, GNR,
SRF, CWF).

* KVM guest perf test
1. KUT and KVM kselftests passed, except for unsupported features, which are
reported as "Skip" [1][2].
                         | SPR      | GNR      | SRF      | CWF      |
legacy perf-based vPMU   |          |          |          |          |
KUT                      |          |          |          |          |
   pmu                   | Pass     | Pass     | Fail[3]  | Fail[3]  |
   pmu_lbr               | Skip[1]  | Skip[1]  | Skip[1]  | Skip[1]  |
   pmu_pebs              | Pass     | Fail[4]  | Fail[4]  | Skip[2]  |
KVM kselftest            |          |          |          |          |
   pmu_event_filter_test | Pass     | Pass     | Pass     | Pass     |
   vmx_pmu_caps_test     | Pass     | Pass     | Pass     | Pass     |
   pmu_counters_test     | Pass     | Pass     | Fail[5]  | Fail[5]  |
                         |          |          |          |          |
Mediated vPMU            |          |          |          |          |
KUT                      |          |          |          |          |
   pmu                   | Pass     | Pass     | Fail[3]  | Pass     |
   pmu_lbr               | Skip[1]  | Skip[1]  | Skip[1]  | Skip[1]  |
   pmu_pebs              | Skip[2]  | Skip[2]  | Skip[2]  | Skip[2]  |
KVM kselftest            |          |          |          |          |
   pmu_event_filter_test | Pass     | Pass     | Pass     | Pass     |
   vmx_pmu_caps_test     | Pass     | Pass     | Pass     | Pass     |
   pmu_counters_test     | Pass     | Pass     | Pass     | Pass     |

All failures above are known issues; the tests pass with the 3 patchsets
[3][4][5] applied.

[1] Mediated vPMU based arch-LBR support is not upstreamed yet.
[2] Mediated vPMU based arch-PEBS support is not upstreamed yet.
[3] kvm-unit-tests: Fix pmu test errors on GNR/SRF/CWF 
https://lore.kernel.org/all/20250718013915.227452-1-dapeng1.mi@linux.intel.com/
[4] perf/x86: Add PERF_CAP_PEBS_TIMING_INFO flag 
https://lore.kernel.org/all/20250811090034.51249-5-dapeng1.mi@linux.intel.com/
[5] Fix PMU kselftests errors on GNR/SRF/CWF 
https://lore.kernel.org/all/20250718001905.196989-1-dapeng1.mi@linux.intel.com/

2. Perf counting/sampling in the guest passed with both the mediated vPMU and
the legacy perf-based vPMU on all 4 Intel platforms (SPR, GNR, SRF, CWF).

3. Nested: perf in L2 passed with the combinations below (1 = mediated vPMU,
0 = legacy perf-based vPMU):
L0   | L1   | L2   |
0    | 0    | Pass |
1    | 0    | Pass |

The mediated vPMU cannot be enabled in an L1 guest because KVM does not yet
support the VMX VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL control for L1, and the
mediated vPMU depends on it.
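
As a rough illustration of the dependency above, the check below sketches how
a hypervisor might decide whether the VMCS controls used to context switch
PERF_GLOBAL_CTRL are available. This is a standalone userspace sketch, not
KVM's code: the function name is hypothetical, the bit values are the Intel
SDM definitions as best understood here (verify against asm/vmx.h), and
requiring all three controls together is an assumption based on the
"Load/save GLOBAL_CTRL via entry/exit fields" patch title.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * VM-entry/VM-exit control bits for IA32_PERF_GLOBAL_CTRL. The values are
 * believed to match the Intel SDM; treat them as assumptions and check them
 * against arch/x86/include/asm/vmx.h before relying on them.
 */
#define VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL   0x00001000u
#define VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL   0x40000000u
#define VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL  0x00002000u

/*
 * Hypothetical helper: returns true only if the guest's GLOBAL_CTRL can be
 * loaded on VM-entry, saved on VM-exit, and the host's value loaded on
 * VM-exit, given the allowed-1 settings of the entry and exit controls.
 */
bool mediated_pmu_ctrls_supported(uint32_t entry_ctrls, uint32_t exit_ctrls)
{
	const uint32_t need_exit = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
				   VM_EXIT_SAVE_IA32_PERF_GLOBAL_CTRL;

	return (entry_ctrls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) &&
	       (exit_ctrls & need_exit) == need_exit;
}
```

Under these assumptions, a hypervisor that is offered both "load" controls
but not the "save" control fails the check, which is consistent with the L1
column never being 1 in the combinations above.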

* Host-Guest concurrent perf test
Running perf stress (record and counting) in the host and the guest in
parallel passed.

Tested-by: Xudong Hao <xudong.hao@intel.com>
Re: [PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs
Posted by Sean Christopherson 2 weeks, 2 days ago
On Wed, 06 Aug 2025 12:56:22 -0700, Sean Christopherson wrote:
> This series is based on the fastpath+PMU cleanups series[*] (which is based on
> kvm/queue), but the non-KVM changes apply cleanly on v6.16 or Linus' tree.
> I.e. if you only care about the perf changes, I would just apply on whatever
> branch is convenient and stop when you hit the KVM changes.
> 
> My hope/plan is that the perf changes will go through the tip tree with a
> stable tag/branch, and the KVM changes will go through the kvm-x86 tree.
> 
> [...]

Applied a subset (roughly 1/4) of the KVM patches to kvm-x86 misc, as the full
thing is far too late for 6.18.  In a nutshell, all of the random prep work and
cleanups that aren't directly related to mediated PMU support.

I want to get "setup VMCS prior to kvm_x86_vendor_init()" in particular landed
to establish the order of setup operations.  The in-progress CET series moves
the _nested_ VMCS setup later, and I had a moment of panic that I was creating
incompatible patches.

[14/44] KVM: VMX: Setup canonical VMCS config prior to kvm_x86_vendor_init()
        https://github.com/kvm-x86/linux/commit/4687a2c4e6a6
[15/44] KVM: SVM: Check pmu->version, not enable_pmu, when getting PMC MSRs
        https://github.com/kvm-x86/linux/commit/e3d1f2826da6

[17/44] KVM: x86/pmu: Snapshot host (i.e. perf's) reported PMU capabilities
        https://github.com/kvm-x86/linux/commit/51f34b1e650f

[22/44] KVM: x86: Rename vmx_vmentry/vmexit_ctrl() helpers
        https://github.com/kvm-x86/linux/commit/1e24bece2681
[23/44] KVM: x86/pmu: Move PMU_CAP_{FW_WRITES,LBR_FMT} into msr-index.h header
        https://github.com/kvm-x86/linux/commit/cdfed9370b96
[24/44] KVM: x86: Rework KVM_REQ_MSR_FILTER_CHANGED into a generic RECALC_INTERCEPTS
        https://github.com/kvm-x86/linux/commit/6057497336bb
[25/44] KVM: x86: Use KVM_REQ_RECALC_INTERCEPTS to react to CPUID updates
        https://github.com/kvm-x86/linux/commit/5a1a726e68ff
[26/44] KVM: VMX: Add helpers to toggle/change a bit in VMCS execution controls
        https://github.com/kvm-x86/linux/commit/2bff2edf69ed

[29/44] KVM: x86/pmu: Use BIT_ULL() instead of open coded equivalents
        https://github.com/kvm-x86/linux/commit/30c0267f1581
[30/44] KVM: x86/pmu: Move initialization of valid PMCs bitmask to common x86
        https://github.com/kvm-x86/linux/commit/9bae7a086394
[31/44] KVM: x86/pmu: Restrict GLOBAL_{CTRL,STATUS}, fixed PMCs, and PEBS to PMU v2+
        https://github.com/kvm-x86/linux/commit/c49aa9837686

--
https://github.com/kvm-x86/linux/tree/next
Re: [PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs
Posted by Mi, Dapeng 1 month, 3 weeks ago
On 8/7/2025 3:56 AM, Sean Christopherson wrote:
> This series is based on the fastpath+PMU cleanups series[*] (which is based on
> kvm/queue), but the non-KVM changes apply cleanly on v6.16 or Linus' tree.
> I.e. if you only care about the perf changes, I would just apply on whatever
> branch is convenient and stop when you hit the KVM changes.
>
> My hope/plan is that the perf changes will go through the tip tree with a
> stable tag/branch, and the KVM changes will go through the kvm-x86 tree.
>
> Non-x86 KVM folks, y'all are getting Cc'd due to minor changes in "KVM: Add a
> simplified wrapper for registering perf callbacks".
>
> The full set is also available at:
>
>   https://github.com/sean-jc/linux.git tags/mediated-vpmu-v5
>
> Add support for mediated vPMUs in KVM x86, where "mediated" aligns with the
> standard definition of intercepting control operations (e.g. event selectors),
> while allowing the guest to perform data operations (e.g. read PMCs, toggle
> counters on/off) without KVM getting involved.
>
> For an in-depth description of the what and why, please see the cover letter
> from the original RFC:
>
>   https://lore.kernel.org/all/20240126085444.324918-1-xiong.y.zhang@linux.intel.com
>
> All KVM tests pass (or fail the same before and after), and I've manually
> verified MSR/PMC are passed through as expected, but I haven't done much at all
> to actually utilize the PMU in a guest.  I'll be amazed if I didn't make at
> least one major goof.
>
> Similarly, I tried to address all feedback, but there are many, many changes
> relative to v4.  If I missed something, I apologize in advance.
>
> In other words, please thoroughly review and test.

I went through the whole patchset and it looks good to me.

I ran all PMU-related kselftests and KUT tests on Intel Sapphire Rapids and
found no issues. We will run broader tests on more Intel platforms. Thanks.
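
For readers skimming the thread, the "mediated" split described in the quoted
cover letter (control operations intercepted, data operations passed through)
can be sketched as a simple MSR classifier. This is only an illustration for
Intel's architectural PMU MSRs: the enum and function names are hypothetical,
treating FIXED_CTR_CTRL as a control MSR is an assumption, and the real
series computes intercepts per vCPU with more conditions than shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural MSR indices; names follow Linux's msr-index.h. */
#define MSR_IA32_PERFCTR0            0x000000c1u
#define MSR_ARCH_PERFMON_EVENTSEL0   0x00000186u
#define MSR_CORE_PERF_FIXED_CTR0     0x00000309u
#define MSR_CORE_PERF_FIXED_CTR_CTRL 0x0000038du
#define MSR_CORE_PERF_GLOBAL_CTRL    0x0000038fu

/* Hypothetical policy values, for illustration only. */
enum pmu_msr_policy {
	PMU_MSR_NOT_PMU,	/* not one of the MSRs modeled here */
	PMU_MSR_INTERCEPT,	/* control operation: hypervisor mediates */
	PMU_MSR_PASSTHROUGH,	/* data operation: guest accesses hardware */
};

/* True if msr lies in [base, base + nr). */
static bool msr_in_range(uint32_t msr, uint32_t base, uint32_t nr)
{
	return msr >= base && msr - base < nr;
}

enum pmu_msr_policy classify_pmu_msr(uint32_t msr, uint32_t nr_gp,
				     uint32_t nr_fixed)
{
	/* Event selectors decide *what* is counted, so they are mediated. */
	if (msr_in_range(msr, MSR_ARCH_PERFMON_EVENTSEL0, nr_gp) ||
	    msr == MSR_CORE_PERF_FIXED_CTR_CTRL)
		return PMU_MSR_INTERCEPT;

	/* Counter values and the global on/off switch are data operations. */
	if (msr_in_range(msr, MSR_IA32_PERFCTR0, nr_gp) ||
	    msr_in_range(msr, MSR_CORE_PERF_FIXED_CTR0, nr_fixed) ||
	    msr == MSR_CORE_PERF_GLOBAL_CTRL)
		return PMU_MSR_PASSTHROUGH;

	return PMU_MSR_NOT_PMU;
}
```

Keeping the selectors intercepted is what lets the host apply event filters,
while the guest reads and toggles counters without a VM-Exit.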


>
> [*] https://lore.kernel.org/all/20250805190526.1453366-1-seanjc@google.com
Re: [PATCH v5 00/44] KVM: x86: Add support for mediated vPMUs
Posted by Mi, Dapeng 1 month, 3 weeks ago
On 8/8/2025 4:28 PM, Mi, Dapeng wrote:
> On 8/7/2025 3:56 AM, Sean Christopherson wrote:
>> This series is based on the fastpath+PMU cleanups series[*] (which is based on
>> kvm/queue), but the non-KVM changes apply cleanly on v6.16 or Linus' tree.
>> I.e. if you only care about the perf changes, I would just apply on whatever
>> branch is convenient and stop when you hit the KVM changes.
>>
>> My hope/plan is that the perf changes will go through the tip tree with a
>> stable tag/branch, and the KVM changes will go through the kvm-x86 tree.
>>
>> Non-x86 KVM folks, y'all are getting Cc'd due to minor changes in "KVM: Add a
>> simplified wrapper for registering perf callbacks".
>>
>> The full set is also available at:
>>
>>   https://github.com/sean-jc/linux.git tags/mediated-vpmu-v5
>>
>> Add support for mediated vPMUs in KVM x86, where "mediated" aligns with the
>> standard definition of intercepting control operations (e.g. event selectors),
>> while allowing the guest to perform data operations (e.g. read PMCs, toggle
>> counters on/off) without KVM getting involved.
>>
>> For an in-depth description of the what and why, please see the cover letter
>> from the original RFC:
>>
>>   https://lore.kernel.org/all/20240126085444.324918-1-xiong.y.zhang@linux.intel.com
>>
>> All KVM tests pass (or fail the same before and after), and I've manually
>> verified MSR/PMC are passed through as expected, but I haven't done much at all
>> to actually utilize the PMU in a guest.  I'll be amazed if I didn't make at
>> least one major goof.
>>
>> Similarly, I tried to address all feedback, but there are many, many changes
>> relative to v4.  If I missed something, I apologize in advance.
>>
>> In other words, please thoroughly review and test.
> I went through the whole patchset and it looks good to me.
>
> I ran all PMU-related kselftests and KUT tests on Intel Sapphire Rapids and
> found no issues. We will run broader tests on more Intel platforms. Thanks.

I forgot to say: all tests were run for both the mediated vPMU and the legacy
perf-based vPMU, and no issues were found. Thanks.


>
>
>> [*] https://lore.kernel.org/all/20250805190526.1453366-1-seanjc@google.com