[PATCH v2 0/2] KVM: SVM: Fix a bug where TSC_AUX can get clobbered

Sean Christopherson posted 2 patches 1 week, 5 days ago
There is a newer version of this series
arch/x86/include/asm/kvm_host.h |  1 +
arch/x86/kvm/svm/sev.c          | 14 +++++++++++++-
arch/x86/kvm/svm/svm.c          | 26 +++++++-------------------
arch/x86/kvm/svm/svm.h          |  4 +++-
arch/x86/kvm/x86.c              |  6 ++++++
5 files changed, 30 insertions(+), 21 deletions(-)
[PATCH v2 0/2] KVM: SVM: Fix a bug where TSC_AUX can get clobbered
Posted by Sean Christopherson 1 week, 5 days ago
v2 of Hou's series to fix a bug where an SEV-ES vCPU running on the same
pCPU as a non-SEV-ES vCPU could clobber TSC_AUX due to loading the host's
TSC_AUX on #VMEXIT, as opposed to restoring whatever was in hardware at the
time of VMRUN.

I tried to test this by hacking sev_smoke_test, but unfortunately I don't
have a machine that has SEV-ES *and* TSC_AUX virtualization.  *sigh*

diff --git a/tools/testing/selftests/kvm/x86/sev_smoke_test.c b/tools/testing/selftests/kvm/x86/sev_smoke_test.c
index 77256c89bb8d..73530a01a3b5 100644
--- a/tools/testing/selftests/kvm/x86/sev_smoke_test.c
+++ b/tools/testing/selftests/kvm/x86/sev_smoke_test.c
@@ -16,6 +16,12 @@
 
 #define XFEATURE_MASK_X87_AVX (XFEATURE_MASK_FP | XFEATURE_MASK_SSE | XFEATURE_MASK_YMM)
 
+static uint64_t guest_sev_es_get_info(void)
+{
+       wrmsr(MSR_AMD64_SEV_ES_GHCB, GHCB_MSR_TERM_REQ);
+       return rdmsr(MSR_AMD64_SEV_ES_GHCB);
+}
+
 static void guest_snp_code(void)
 {
        uint64_t sev_msr = rdmsr(MSR_AMD64_SEV);
@@ -34,6 +40,10 @@ static void guest_sev_es_code(void)
        GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ENABLED);
        GUEST_ASSERT(rdmsr(MSR_AMD64_SEV) & MSR_AMD64_SEV_ES_ENABLED);
 
+       wrmsr(MSR_TSC_AUX, 0x12345678);
+       guest_sev_es_get_info();
+       GUEST_ASSERT(rdmsr(MSR_TSC_AUX) == 0x12345678);
+
        /*
         * TODO: Add GHCB and ucall support for SEV-ES guests.  For now, simply
         * force "termination" to signal "done" via the GHCB MSR protocol.

v2:
 - Drop "cache" from the user_return API.
 - Handle the SEV-ES case in SEV-ES code.
 - Tag everything for stable@.
 - Massage changelog to avoid talking about the host's value and instead
   focus on failing to restore what KVM thinks is in hardware.

v1: https://lore.kernel.org/all/05a018a6997407080b3b7921ba692aa69a720f07.1758166596.git.houwenlong.hwl@antgroup.com

Hou Wenlong (2):
  KVM: x86: Add helper to retrieve current value of user return MSR
  KVM: SVM: Re-load current, not host, TSC_AUX on #VMEXIT from SEV-ES
    guest

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/svm/sev.c          | 14 +++++++++++++-
 arch/x86/kvm/svm/svm.c          | 26 +++++++-------------------
 arch/x86/kvm/svm/svm.h          |  4 +++-
 arch/x86/kvm/x86.c              |  6 ++++++
 5 files changed, 30 insertions(+), 21 deletions(-)


base-commit: c8fbf7ceb2ae3f64b0c377c8c21f6df577a13eb4
-- 
2.51.0.470.ga7dc726c21-goog
[PATCH v2 1/2] KVM: x86: Add helper to retrieve current value of user return MSR
Posted by Sean Christopherson 1 week, 5 days ago
From: Hou Wenlong <houwenlong.hwl@antgroup.com>

In the user return MSR support, the cached value is always the hardware
value of the specific MSR. Therefore, add a helper to retrieve the
cached value, which can replace the need for RDMSR, for example, to
allow SEV-ES guests to restore the correct host hardware value without
using RDMSR.

Cc: stable@vger.kernel.org
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
[sean: drop "cache" from the name, make it a one-liner, tag for stable]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/x86.c              | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 17772513b9cc..14236006266b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2376,6 +2376,7 @@ int kvm_add_user_return_msr(u32 msr);
 int kvm_find_user_return_msr(u32 msr);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
 void kvm_user_return_msr_update_cache(unsigned int index, u64 val);
+u64 kvm_get_user_return_msr(unsigned int slot);
 
 static inline bool kvm_is_supported_user_return_msr(u32 msr)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e07936efacd4..801bf6172a21 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -675,6 +675,12 @@ void kvm_user_return_msr_update_cache(unsigned int slot, u64 value)
 }
 EXPORT_SYMBOL_GPL(kvm_user_return_msr_update_cache);
 
+u64 kvm_get_user_return_msr(unsigned int slot)
+{
+	return this_cpu_ptr(user_return_msrs)->values[slot].curr;
+}
+EXPORT_SYMBOL_GPL(kvm_get_user_return_msr);
+
 static void drop_user_return_notifiers(void)
 {
 	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
-- 
2.51.0.470.ga7dc726c21-goog
Re: [PATCH v2 1/2] KVM: x86: Add helper to retrieve current value of user return MSR
Posted by Xiaoyao Li 1 week, 2 days ago
On 9/20/2025 5:38 AM, Sean Christopherson wrote:
> From: Hou Wenlong <houwenlong.hwl@antgroup.com>
> 
> In the user return MSR support, the cached value is always the hardware
> value of the specific MSR. Therefore, add a helper to retrieve the
> cached value, which can replace the need for RDMSR, for example, to
> allow SEV-ES guests to restore the correct host hardware value without
> using RDMSR.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> [sean: drop "cache" from the name, make it a one-liner, tag for stable]
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
[PATCH v2 2/2] KVM: SVM: Re-load current, not host, TSC_AUX on #VMEXIT from SEV-ES guest
Posted by Sean Christopherson 1 week, 5 days ago
From: Hou Wenlong <houwenlong.hwl@antgroup.com>

Prior to running an SEV-ES guest, set TSC_AUX in the host save area to the
current value in hardware, as tracked by the user return infrastructure,
instead of always loading the host's desired value for the CPU.  If the
pCPU is also running a non-SEV-ES vCPU, loading the host's value on #VMEXIT
could clobber the other vCPU's value, e.g. if the SEV-ES vCPU preempted
the non-SEV-ES vCPU, in which case KVM expects the other vCPU's TSC_AUX
value to be resident in hardware.

Note, unlike TDX, which blindly _zeroes_ TSC_AUX on TD-Exit, SEV-ES CPUs
can load an arbitrary value.  Stuff the current value in the host save
area instead of refreshing the user return cache so that KVM doesn't need
to track whether or not the vCPU actually enterred the guest and thus
loaded TSC_AUX from the host save area.

Fixes: 916e3e5f26ab ("KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX")
Cc: stable@vger.kernel.org
Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
[sean: handle the SEV-ES case in sev_es_prepare_switch_to_guest()]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/sev.c | 14 +++++++++++++-
 arch/x86/kvm/svm/svm.c | 26 +++++++-------------------
 arch/x86/kvm/svm/svm.h |  4 +++-
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index cce48fff2e6c..95767b9d0d55 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4664,7 +4664,9 @@ int sev_vcpu_create(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
+void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm,
+				    struct sev_es_save_area *hostsa,
+				    int tsc_aux_uret_slot)
 {
 	struct kvm *kvm = svm->vcpu.kvm;
 
@@ -4712,6 +4714,16 @@ void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_are
 		hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
 		hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3);
 	}
+
+	/*
+	 * TSC_AUX is always virtualized for SEV-ES guests when the feature is
+	 * available, i.e. TSC_AUX is loaded on #VMEXIT from the host save area.
+	 * Set the save area to the current hardware value, i.e. the current
+	 * user return value, so that the correct value is restored on #VMEXIT.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_V_TSC_AUX) &&
+	    !WARN_ON_ONCE(tsc_aux_uret_slot < 0))
+		hostsa->tsc_aux = kvm_get_user_return_msr(tsc_aux_uret_slot);
 }
 
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 67f4eed01526..662cf680faf7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -577,18 +577,6 @@ static int svm_enable_virtualization_cpu(void)
 
 	amd_pmu_enable_virt();
 
-	/*
-	 * If TSC_AUX virtualization is supported, TSC_AUX becomes a swap type
-	 * "B" field (see sev_es_prepare_switch_to_guest()) for SEV-ES guests.
-	 * Since Linux does not change the value of TSC_AUX once set, prime the
-	 * TSC_AUX field now to avoid a RDMSR on every vCPU run.
-	 */
-	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX)) {
-		u32 __maybe_unused msr_hi;
-
-		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
-	}
-
 	return 0;
 }
 
@@ -1400,16 +1388,17 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	 */
 	vmsave(sd->save_area_pa);
 	if (sev_es_guest(vcpu->kvm))
-		sev_es_prepare_switch_to_guest(svm, sev_es_host_save_area(sd));
+		sev_es_prepare_switch_to_guest(svm, sev_es_host_save_area(sd),
+					       tsc_aux_uret_slot);
 
 	if (tsc_scaling)
 		__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
 
 	/*
-	 * TSC_AUX is always virtualized for SEV-ES guests when the feature is
-	 * available. The user return MSR support is not required in this case
-	 * because TSC_AUX is restored on #VMEXIT from the host save area
-	 * (which has been initialized in svm_enable_virtualization_cpu()).
+	 * TSC_AUX is always virtualized (context switched by hardware) for
+	 * SEV-ES guests when the feature is available.  For non-SEV-ES guests,
+	 * context switch TSC_AUX via the user_return MSR infrastructure (not
+	 * all CPUs support TSC_AUX virtualization).
 	 */
 	if (likely(tsc_aux_uret_slot >= 0) &&
 	    (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
@@ -3004,8 +2993,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		 * TSC_AUX is always virtualized for SEV-ES guests when the
 		 * feature is available. The user return MSR support is not
 		 * required in this case because TSC_AUX is restored on #VMEXIT
-		 * from the host save area (which has been initialized in
-		 * svm_enable_virtualization_cpu()).
+		 * from the host save area.
 		 */
 		if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) && sev_es_guest(vcpu->kvm))
 			break;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5d39c0b17988..4fda677e8ab3 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -831,7 +831,9 @@ void sev_vcpu_after_set_cpuid(struct vcpu_svm *svm);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_recalc_msr_intercepts(struct kvm_vcpu *vcpu);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
-void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
+void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm,
+				    struct sev_es_save_area *hostsa,
+				    int tsc_aux_uret_slot);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 #ifdef CONFIG_KVM_AMD_SEV
-- 
2.51.0.470.ga7dc726c21-goog
Re: [PATCH v2 2/2] KVM: SVM: Re-load current, not host, TSC_AUX on #VMEXIT from SEV-ES guest
Posted by Xiaoyao Li 1 week, 2 days ago
On 9/20/2025 5:38 AM, Sean Christopherson wrote:
> From: Hou Wenlong <houwenlong.hwl@antgroup.com>
> 
> Prior to running an SEV-ES guest, set TSC_AUX in the host save area to the
> current value in hardware, as tracked by the user return infrastructure,
> instead of always loading the host's desired value for the CPU.  If the
> pCPU is also running a non-SEV-ES vCPU, loading the host's value on #VMEXIT
> could clobber the other vCPU's value, e.g. if the SEV-ES vCPU preempted
> the non-SEV-ES vCPU, in which case KVM expects the other vCPU's TSC_AUX
> value to be resident in hardware.
> 
> Note, unlike TDX, which blindly _zeroes_ TSC_AUX on TD-Exit, SEV-ES CPUs
> can load an arbitrary value.  Stuff the current value in the host save
> area instead of refreshing the user return cache so that KVM doesn't need
> to track whether or not the vCPU actually enterred the guest and thus
> loaded TSC_AUX from the host save area.
> 
> Fixes: 916e3e5f26ab ("KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX")
> Cc: stable@vger.kernel.org
> Suggested-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> [sean: handle the SEV-ES case in sev_es_prepare_switch_to_guest()]
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

one nit below,

> ---
>   arch/x86/kvm/svm/sev.c | 14 +++++++++++++-
>   arch/x86/kvm/svm/svm.c | 26 +++++++-------------------
>   arch/x86/kvm/svm/svm.h |  4 +++-
>   3 files changed, 23 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index cce48fff2e6c..95767b9d0d55 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -4664,7 +4664,9 @@ int sev_vcpu_create(struct kvm_vcpu *vcpu)
>   	return 0;
>   }
>   
> -void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
> +void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm,
> +				    struct sev_es_save_area *hostsa,
> +				    int tsc_aux_uret_slot)

Passing the tsc_aux_uret_slot as paramter looks a bit ugly, how about 
externing it?
Re: [PATCH v2 2/2] KVM: SVM: Re-load current, not host, TSC_AUX on #VMEXIT from SEV-ES guest
Posted by Sean Christopherson 1 week, 1 day ago
On Tue, Sep 23, 2025, Xiaoyao Li wrote:
> On 9/20/2025 5:38 AM, Sean Christopherson wrote:
> > ---
> >   arch/x86/kvm/svm/sev.c | 14 +++++++++++++-
> >   arch/x86/kvm/svm/svm.c | 26 +++++++-------------------
> >   arch/x86/kvm/svm/svm.h |  4 +++-
> >   3 files changed, 23 insertions(+), 21 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> > index cce48fff2e6c..95767b9d0d55 100644
> > --- a/arch/x86/kvm/svm/sev.c
> > +++ b/arch/x86/kvm/svm/sev.c
> > @@ -4664,7 +4664,9 @@ int sev_vcpu_create(struct kvm_vcpu *vcpu)
> >   	return 0;
> >   }
> > -void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
> > +void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm,
> > +				    struct sev_es_save_area *hostsa,
> > +				    int tsc_aux_uret_slot)
> 
> Passing the tsc_aux_uret_slot as paramter looks a bit ugly, how about
> externing it?

Yeah, looking at this again, I agree.  I'll post a v3 (already did; forgot to
hit "send" on this earlier, /facepalm).