From nobody Tue Dec 16 14:38:32 2025
Reply-To: Sean Christopherson
Date: Fri, 5 Dec 2025 16:17:03 -0800
In-Reply-To: <20251206001720.468579-1-seanjc@google.com>
Mime-Version: 1.0
References: <20251206001720.468579-1-seanjc@google.com>
Message-ID: <20251206001720.468579-28-seanjc@google.com>
Subject: [PATCH v6 27/44] KVM: x86/pmu: Load/put mediated PMU context when entering/exiting guest
From: Sean Christopherson
To: Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen,
 Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Xin Li,
 "H. Peter Anvin", Andy Lutomirski, Peter Zijlstra, Ingo Molnar,
 Arnaldo Carvalho de Melo, Namhyung Kim, Sean Christopherson,
 Paolo Bonzini
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
 kvm@vger.kernel.org, loongarch@lists.linux.dev,
 kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 Mingwei Zhang, Xudong Hao, Sandipan Das, Dapeng Mi, Xiong Zhang,
 Manali Shukla, Jim Mattson
Content-Type: text/plain; charset="utf-8"

From: Dapeng Mi

Implement the PMU "world switch" between host perf and the guest's mediated
PMU.  When loading guest state, call into perf to switch from host to guest,
then load guest state into hardware; reverse those actions when putting
guest state.

On the KVM side, when loading guest state, zero PERF_GLOBAL_CTRL to ensure
all counters are disabled, then load selectors and counters, and finally
call into vendor code to load control/status information.  While VMX and
SVM use different mechanisms to avoid counting host activity while guest
controls are loaded, both implementations require PERF_GLOBAL_CTRL to be
zeroed when the event selectors are in flux.

When putting guest state, reverse the order: save and zero controls and
status prior to saving and zeroing selectors and counters.  Defer clearing
PERF_GLOBAL_CTRL to vendor code, as only SVM needs to manually clear the
MSR; VMX configures PERF_GLOBAL_CTRL to be atomically cleared by the CPU
on VM-Exit.

Handle the difference in MSR layouts between Intel and AMD by communicating
the bases and stride via kvm_pmu_ops.  Because KVM requires Intel v4 (and
full-width writes) and AMD v2, the MSRs to load/save are constant for a
given vendor, i.e. they do not vary based on the guest PMU, nor on the host
PMU (KVM simply disables mediated PMU support if the necessary MSRs are
unsupported).

Except for retrieving the guest's PERF_GLOBAL_CTRL, which needs to be read
before invoking any fastpath handler (spoiler alert), perform the context
switch around KVM's inner run loop; state only needs to be synchronized
from hardware before KVM accesses its software "caches".  Note, VMX already
grabs the guest's PERF_GLOBAL_CTRL immediately after VM-Exit, as hardware
saves the value into the VMCS.
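Condensed, the call sequence around the inner run loop looks like this (an
illustrative outline of the helpers added by this patch, not standalone
code; the lockdep assertions and mediated-PMU checks are omitted):

	/* Load, with IRQs disabled, before entering the inner run loop. */
	perf_load_guest_context();                 /* switch host perf out */
	wrmsrq(kvm_pmu_ops.PERF_GLOBAL_CTRL, 0);   /* quiesce all counters */
	perf_load_guest_lvtpc(kvm_lapic_get_reg(vcpu->arch.apic, APIC_LVTPC));
	kvm_pmu_load_guest_pmcs(vcpu);             /* selectors + counters */
	kvm_pmu_call(mediated_load)(vcpu);         /* vendor ctrl/status */

	/* Put, in reverse order, after exiting the inner run loop. */
	kvm_pmu_call(mediated_put)(vcpu);          /* save + clear ctrl/status */
	kvm_pmu_put_guest_pmcs(vcpu);              /* save + zero selectors/PMCs */
	perf_put_guest_lvtpc();
	perf_put_guest_context();                  /* switch host perf back in */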
Co-developed-by: Mingwei Zhang
Signed-off-by: Mingwei Zhang
Co-developed-by: Sandipan Das
Signed-off-by: Sandipan Das
Signed-off-by: Dapeng Mi
Tested-by: Xudong Hao
Co-developed-by: Sean Christopherson
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm-x86-pmu-ops.h |   2 +
 arch/x86/include/asm/msr-index.h       |   1 +
 arch/x86/kvm/pmu.c                     | 130 ++++++++++++++++++++++++-
 arch/x86/kvm/pmu.h                     |  10 ++
 arch/x86/kvm/svm/pmu.c                 |  34 +++++++
 arch/x86/kvm/svm/svm.c                 |   3 +
 arch/x86/kvm/vmx/pmu_intel.c           |  44 +++++++++
 arch/x86/kvm/x86.c                     |   4 +
 8 files changed, 225 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index ad2cc82abf79..f0aa6996811f 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -24,6 +24,8 @@ KVM_X86_PMU_OP_OPTIONAL(deliver_pmi)
 KVM_X86_PMU_OP_OPTIONAL(cleanup)
 
 KVM_X86_PMU_OP_OPTIONAL(write_global_ctrl)
+KVM_X86_PMU_OP(mediated_load)
+KVM_X86_PMU_OP(mediated_put)
 
 #undef KVM_X86_PMU_OP
 #undef KVM_X86_PMU_OP_OPTIONAL
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9e1720d73244..0ba08fd4ac3f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1191,6 +1191,7 @@
 #define MSR_CORE_PERF_GLOBAL_STATUS	0x0000038e
 #define MSR_CORE_PERF_GLOBAL_CTRL	0x0000038f
 #define MSR_CORE_PERF_GLOBAL_OVF_CTRL	0x00000390
+#define MSR_CORE_PERF_GLOBAL_STATUS_SET	0x00000391
 
 #define MSR_PERF_METRICS	0x00000329
 
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 39904e6fd227..578bf996bda2 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -882,10 +882,13 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			diff = pmu->global_ctrl ^ data;
 			pmu->global_ctrl = data;
 			reprogram_counters(pmu, diff);
-
-			if (kvm_vcpu_has_mediated_pmu(vcpu))
-				kvm_pmu_call(write_global_ctrl)(data);
 		}
+		/*
+		 * Unconditionally forward writes to vendor code, i.e. to the
+		 * VMC{B,S}, as pmu->global_ctrl is per-VCPU, not per-VMC{B,S}.
+		 */
+		if (kvm_vcpu_has_mediated_pmu(vcpu))
+			kvm_pmu_call(write_global_ctrl)(data);
 		break;
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		/*
@@ -1246,3 +1249,124 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 	kfree(filter);
 	return r;
 }
+
+static __always_inline u32 fixed_counter_msr(u32 idx)
+{
+	return kvm_pmu_ops.FIXED_COUNTER_BASE + idx * kvm_pmu_ops.MSR_STRIDE;
+}
+
+static __always_inline u32 gp_counter_msr(u32 idx)
+{
+	return kvm_pmu_ops.GP_COUNTER_BASE + idx * kvm_pmu_ops.MSR_STRIDE;
+}
+
+static __always_inline u32 gp_eventsel_msr(u32 idx)
+{
+	return kvm_pmu_ops.GP_EVENTSEL_BASE + idx * kvm_pmu_ops.MSR_STRIDE;
+}
+
+static void kvm_pmu_load_guest_pmcs(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc;
+	u32 i;
+
+	/*
+	 * No need to zero out unexposed GP/fixed counters/selectors since RDPMC
+	 * is intercepted if hardware has counters that aren't visible to the
+	 * guest (KVM will inject #GP as appropriate).
+	 */
+	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
+		pmc = &pmu->gp_counters[i];
+
+		wrmsrq(gp_counter_msr(i), pmc->counter);
+		wrmsrq(gp_eventsel_msr(i), pmc->eventsel_hw);
+	}
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
+		pmc = &pmu->fixed_counters[i];
+
+		wrmsrq(fixed_counter_msr(i), pmc->counter);
+	}
+}
+
+void kvm_mediated_pmu_load(struct kvm_vcpu *vcpu)
+{
+	if (!kvm_vcpu_has_mediated_pmu(vcpu) ||
+	    KVM_BUG_ON(!lapic_in_kernel(vcpu), vcpu->kvm))
+		return;
+
+	lockdep_assert_irqs_disabled();
+
+	perf_load_guest_context();
+
+	/*
+	 * Explicitly clear PERF_GLOBAL_CTRL, as "loading" the guest's context
+	 * disables all individual counters (if any were enabled), but doesn't
+	 * globally disable the entire PMU.  Loading event selectors and PMCs
+	 * with guest values while PERF_GLOBAL_CTRL is non-zero will generate
+	 * unexpected events and PMIs.
+	 *
+	 * VMX will enable/disable counters at VM-Enter/VM-Exit by atomically
+	 * loading PERF_GLOBAL_CONTROL.  SVM effectively performs the switch by
+	 * configuring all events to be GUEST_ONLY.  Clear PERF_GLOBAL_CONTROL
+	 * even for SVM to minimize the damage if a perf event is left enabled,
+	 * and to ensure a consistent starting state.
+	 */
+	wrmsrq(kvm_pmu_ops.PERF_GLOBAL_CTRL, 0);
+
+	perf_load_guest_lvtpc(kvm_lapic_get_reg(vcpu->arch.apic, APIC_LVTPC));
+
+	kvm_pmu_load_guest_pmcs(vcpu);
+
+	kvm_pmu_call(mediated_load)(vcpu);
+}
+
+static void kvm_pmu_put_guest_pmcs(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc;
+	u32 i;
+
+	/*
+	 * Clear selectors and counters to ensure hardware doesn't count using
+	 * guest controls when the host (perf) restores its state.
+	 */
+	for (i = 0; i < pmu->nr_arch_gp_counters; i++) {
+		pmc = &pmu->gp_counters[i];
+
+		pmc->counter = rdpmc(i);
+		if (pmc->counter)
+			wrmsrq(gp_counter_msr(i), 0);
+		if (pmc->eventsel_hw)
+			wrmsrq(gp_eventsel_msr(i), 0);
+	}
+
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
+		pmc = &pmu->fixed_counters[i];
+
+		pmc->counter = rdpmc(INTEL_PMC_FIXED_RDPMC_BASE | i);
+		if (pmc->counter)
+			wrmsrq(fixed_counter_msr(i), 0);
+	}
+}
+
+void kvm_mediated_pmu_put(struct kvm_vcpu *vcpu)
+{
+	if (!kvm_vcpu_has_mediated_pmu(vcpu) ||
+	    KVM_BUG_ON(!lapic_in_kernel(vcpu), vcpu->kvm))
+		return;
+
+	lockdep_assert_irqs_disabled();
+
+	/*
+	 * Defer handling of PERF_GLOBAL_CTRL to vendor code.  On Intel, it's
+	 * atomically cleared on VM-Exit, i.e. doesn't need to be cleared here.
+	 */
+	kvm_pmu_call(mediated_put)(vcpu);
+
+	kvm_pmu_put_guest_pmcs(vcpu);
+
+	perf_put_guest_lvtpc();
+
+	perf_put_guest_context();
+}
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 9a199109d672..25b583da9ee2 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -38,11 +38,19 @@ struct kvm_pmu_ops {
 	void (*cleanup)(struct kvm_vcpu *vcpu);
 
 	bool (*is_mediated_pmu_supported)(struct x86_pmu_capability *host_pmu);
+	void (*mediated_load)(struct kvm_vcpu *vcpu);
+	void (*mediated_put)(struct kvm_vcpu *vcpu);
 	void (*write_global_ctrl)(u64 global_ctrl);
 
 	const u64 EVENTSEL_EVENT;
 	const int MAX_NR_GP_COUNTERS;
 	const int MIN_NR_GP_COUNTERS;
+
+	const u32 PERF_GLOBAL_CTRL;
+	const u32 GP_EVENTSEL_BASE;
+	const u32 GP_COUNTER_BASE;
+	const u32 FIXED_COUNTER_BASE;
+	const u32 MSR_STRIDE;
 };
 
 void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
@@ -240,6 +248,8 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);
 void kvm_pmu_instruction_retired(struct kvm_vcpu *vcpu);
 void kvm_pmu_branch_retired(struct kvm_vcpu *vcpu);
+void kvm_mediated_pmu_load(struct kvm_vcpu *vcpu);
+void kvm_mediated_pmu_put(struct kvm_vcpu *vcpu);
 
 bool is_vmware_backdoor_pmc(u32 pmc_idx);
 bool kvm_need_perf_global_ctrl_intercept(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 6d5f791126b1..7aa298eeb072 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -234,6 +234,32 @@ static bool amd_pmu_is_mediated_pmu_supported(struct x86_pmu_capability *host_pm
 	return host_pmu->version >= 2;
 }
 
+static void amd_mediated_pmu_load(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u64 global_status;
+
+	rdmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, global_status);
+	/* Clear host global_status MSR if non-zero. */
+	if (global_status)
+		wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, global_status);
+
+	wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET, pmu->global_status);
+	wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, pmu->global_ctrl);
+}
+
+static void amd_mediated_pmu_put(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);
+	rdmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, pmu->global_status);
+
+	/* Clear global status bits if non-zero */
+	if (pmu->global_status)
+		wrmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, pmu->global_status);
+}
+
 struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
 	.msr_idx_to_pmc = amd_msr_idx_to_pmc,
@@ -245,8 +271,16 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
 	.init = amd_pmu_init,
 
 	.is_mediated_pmu_supported = amd_pmu_is_mediated_pmu_supported,
+	.mediated_load = amd_mediated_pmu_load,
+	.mediated_put = amd_mediated_pmu_put,
 
 	.EVENTSEL_EVENT = AMD64_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_MAX_NR_AMD_GP_COUNTERS,
 	.MIN_NR_GP_COUNTERS = AMD64_NUM_COUNTERS,
+
+	.PERF_GLOBAL_CTRL = MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+	.GP_EVENTSEL_BASE = MSR_F15H_PERF_CTL0,
+	.GP_COUNTER_BASE = MSR_F15H_PERF_CTR0,
+	.FIXED_COUNTER_BASE = 0,
+	.MSR_STRIDE = 2,
 };
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index fa04e58ff524..cbebd3a18918 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4368,6 +4368,9 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu, u64 run_flags)
 
 	vcpu->arch.regs_avail &= ~SVM_REGS_LAZY_LOAD_SET;
 
+	if (!msr_write_intercepted(vcpu, MSR_AMD64_PERF_CNTR_GLOBAL_CTL))
+		rdmsrq(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, vcpu_to_pmu(vcpu)->global_ctrl);
+
 	trace_kvm_exit(vcpu, KVM_ISA_SVM);
 
 	svm_complete_interrupts(vcpu);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 855240678300..55249fa4db95 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -792,6 +792,42 @@ static void intel_pmu_write_global_ctrl(u64 global_ctrl)
 	vmcs_write64(GUEST_IA32_PERF_GLOBAL_CTRL, global_ctrl);
 }
 
+
+static void intel_mediated_pmu_load(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	u64 global_status, toggle;
+
+	rdmsrq(MSR_CORE_PERF_GLOBAL_STATUS, global_status);
+	toggle = pmu->global_status ^ global_status;
+	if (global_status & toggle)
+		wrmsrq(MSR_CORE_PERF_GLOBAL_OVF_CTRL, global_status & toggle);
+	if (pmu->global_status & toggle)
+		wrmsrq(MSR_CORE_PERF_GLOBAL_STATUS_SET, pmu->global_status & toggle);
+
+	wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, pmu->fixed_ctr_ctrl_hw);
+}
+
+static void intel_mediated_pmu_put(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	/* MSR_CORE_PERF_GLOBAL_CTRL is already saved at VM-exit. */
+	rdmsrq(MSR_CORE_PERF_GLOBAL_STATUS, pmu->global_status);
+
+	/* Clear hardware MSR_CORE_PERF_GLOBAL_STATUS MSR, if non-zero. */
+	if (pmu->global_status)
+		wrmsrq(MSR_CORE_PERF_GLOBAL_OVF_CTRL, pmu->global_status);
+
+	/*
+	 * Clear hardware FIXED_CTR_CTRL MSR to avoid information leakage and
+	 * also to avoid accidentally enabling fixed counters (based on guest
+	 * state) while running in the host, e.g. when setting global ctrl.
+	 */
+	if (pmu->fixed_ctr_ctrl_hw)
+		wrmsrq(MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
+}
+
 struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
 	.msr_idx_to_pmc = intel_msr_idx_to_pmc,
@@ -805,9 +841,17 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
 	.cleanup = intel_pmu_cleanup,
 
 	.is_mediated_pmu_supported = intel_pmu_is_mediated_pmu_supported,
+	.mediated_load = intel_mediated_pmu_load,
+	.mediated_put = intel_mediated_pmu_put,
 	.write_global_ctrl = intel_pmu_write_global_ctrl,
 
 	.EVENTSEL_EVENT = ARCH_PERFMON_EVENTSEL_EVENT,
 	.MAX_NR_GP_COUNTERS = KVM_MAX_NR_INTEL_GP_COUNTERS,
 	.MIN_NR_GP_COUNTERS = 1,
+
+	.PERF_GLOBAL_CTRL = MSR_CORE_PERF_GLOBAL_CTRL,
+	.GP_EVENTSEL_BASE = MSR_P6_EVNTSEL0,
+	.GP_COUNTER_BASE = MSR_IA32_PMC0,
+	.FIXED_COUNTER_BASE = MSR_CORE_PERF_FIXED_CTR0,
+	.MSR_STRIDE = 1,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76e86eb358df..589a309259f4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11334,6 +11334,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		run_flags |= KVM_RUN_LOAD_DEBUGCTL;
 	vcpu->arch.host_debugctl = debug_ctl;
 
+	kvm_mediated_pmu_load(vcpu);
+
 	guest_timing_enter_irqoff();
 
 	/*
@@ -11372,6 +11374,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	kvm_load_host_pkru(vcpu);
 
+	kvm_mediated_pmu_put(vcpu);
+
 	/*
 	 * Do this here before restoring debug registers on the host.  And
 	 * since we do this before handling the vmexit, a DR access vmexit
-- 
2.52.0.223.gf5cc29aaa4-goog