[v6] Enable FRED with KVM VMX

[PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Xin Li (Intel) 1 month, 1 week ago

From: Xin Li <xin3.li@intel.com>

On a userspace MSR filter change, set FRED MSR intercepts.

Eight FRED MSRs, i.e., MSR_IA32_FRED_RSP[123], MSR_IA32_FRED_STKLVLS,
MSR_IA32_FRED_SSP[123] and MSR_IA32_FRED_CONFIG, are all safe to pass
through, because each of them has a pair of corresponding host and
guest VMCS fields.

Both MSR_IA32_FRED_RSP0 and MSR_IA32_FRED_SSP0 are dedicated to
userspace event delivery only, IOW they are NOT used in any kernel
event delivery or in the execution of ERETS.  Thus KVM can run safely
with guest values in the two MSRs.  As a result, save and restore of
their guest values are deferred until vCPU context switch, and their
host values are restored when the host returns to userspace.

Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Tested-by: Shan Kang <shan.kang@intel.com>
Tested-by: Xuelian Guo <xuelian.guo@intel.com>
---

Changes in v5:
* Skip execution of vmx_set_intercept_for_fred_msr() if FRED is
  not available or not enabled (Sean).
* Use 'intercept' as the variable name to indicate whether MSR
  interception should be enabled (Sean).
* Add TB from Xuelian Guo.
---
 arch/x86/kvm/vmx/vmx.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42e179f19c23..8e81230be7af 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4128,6 +4128,43 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vmx_set_intercept_for_fred_msr(struct kvm_vcpu *vcpu)
+{
+	bool intercept = !guest_cpu_cap_has(vcpu, X86_FEATURE_FRED);
+
+	if (!kvm_cpu_cap_has(X86_FEATURE_FRED))
+		return;
+
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP1, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP2, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP3, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_STKLVLS, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP1, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP2, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_SSP3, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_CONFIG, MSR_TYPE_RW, intercept);
+
+	/*
+	 * MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP (aka MSR_IA32_FRED_SSP0) are
+	 * designated for event delivery while executing in userspace.  Since
+	 * KVM operates exclusively in kernel mode (the CPL is always 0 after
+	 * any VM exit), KVM can safely retain and operate with the guest-defined
+	 * values for MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP.
+	 *
+	 * Therefore, interception of MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP
+	 * is not required.
+	 *
+	 * Note, save and restore of MSR_IA32_PL0_SSP belong to CET supervisor
+	 * context management.  However the FRED SSP MSRs, including
+	 * MSR_IA32_PL0_SSP, are supported by any processor that enumerates FRED.
+	 * If such a processor does not support CET, FRED transitions will not
+	 * use the MSRs, but the MSRs would still be accessible using MSR-access
+	 * instructions (e.g., RDMSR, WRMSR).
+	 */
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP0, MSR_TYPE_RW, intercept);
+	vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, intercept);
+}
+
 void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 {
 	bool intercept;
@@ -4194,6 +4231,8 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, intercept);
 	}
 
+	vmx_set_intercept_for_fred_msr(vcpu);
+
 	/*
 	 * x2APIC and LBR MSR intercepts are modified on-demand and cannot be
 	 * filtered by userspace.
-- 
2.50.1
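
For readers following the thread, the intercept policy in this patch can be modeled outside the kernel. The sketch below is a userspace illustration only, not KVM code: `fred_msr_id`, `ALL_INTERCEPTED` and `model_fred_intercepts()` are hypothetical names, and the 32-bit bitmap merely stands in for the VMX MSR bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical identifiers for the ten MSRs touched by the patch. */
enum fred_msr_id {
	FRED_RSP0, FRED_RSP1, FRED_RSP2, FRED_RSP3,
	FRED_STKLVLS,
	FRED_SSP0,	/* aliases MSR_IA32_PL0_SSP */
	FRED_SSP1, FRED_SSP2, FRED_SSP3,
	FRED_CONFIG,
	FRED_MSR_COUNT,
};

/* Toy bitmap: bit N set means accesses to MSR N are intercepted. */
#define ALL_INTERCEPTED ((uint32_t)((1u << FRED_MSR_COUNT) - 1))

/*
 * Mirrors the shape of vmx_set_intercept_for_fred_msr(): do nothing if
 * the host lacks FRED, otherwise intercept iff the guest lacks FRED.
 */
static uint32_t model_fred_intercepts(uint32_t bitmap, bool host_has_fred,
				      bool guest_has_fred)
{
	bool intercept = !guest_has_fred;

	if (!host_has_fred)
		return bitmap;

	for (int i = 0; i < FRED_MSR_COUNT; i++) {
		if (intercept)
			bitmap |= 1u << i;
		else
			bitmap &= ~(1u << i);
	}
	return bitmap;
}
```

With a host and guest that both have FRED, every bit is cleared (full passthrough); with a host that lacks FRED, the bitmap is returned unchanged.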

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Xin Li 1 month, 1 week ago

On 8/21/2025 3:36 PM, Xin Li (Intel) wrote:
> +	/*
> +	 * MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP (aka MSR_IA32_FRED_SSP0) are
> +	 * designated for event delivery while executing in userspace.  Since
> +	 * KVM operates exclusively in kernel mode (the CPL is always 0 after
> +	 * any VM exit), KVM can safely retain and operate with the guest-defined
> +	 * values for MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP.
> +	 *
> +	 * Therefore, interception of MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP
> +	 * is not required.
> +	 *
> +	 * Note, save and restore of MSR_IA32_PL0_SSP belong to CET supervisor
> +	 * context management.  However the FRED SSP MSRs, including
> +	 * MSR_IA32_PL0_SSP, are supported by any processor that enumerates FRED.
> +	 * If such a processor does not support CET, FRED transitions will not
> +	 * use the MSRs, but the MSRs would still be accessible using MSR-access
> +	 * instructions (e.g., RDMSR, WRMSR).
> +	 */
> +	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP0, MSR_TYPE_RW, intercept);
> +	vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, intercept);

Hi Sean,

I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.

The FRED spec claims:

The FRED SSP MSRs are supported by any processor that enumerates
CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
support CET, FRED transitions will not use the MSRs (because shadow stacks
are not enabled), but the MSRs would still be accessible using MSR-access
instructions (e.g., RDMSR, WRMSR).

It means KVM needs to handle MSR_IA32_PL0_SSP even when FRED is supported
but CET is not.  And this can be broken down into two subtasks:

1) Allow such a guest to access MSR_IA32_PL0_SSP w/o triggering #GP.  And
this behavior is already implemented in patch 8 of this series.

2) Save and restore MSR_IA32_PL0_SSP in both KVM and Qemu for such a guest.

I have the patches for 2) but they are not included in this series, because

1) how much do we care about the value in MSR_IA32_PL0_SSP in such a guest?

Yes, Chao told me that you are the one saying that MSRs can be used as
clobber registers and KVM should preserve the value.  Does MSR_IA32_PL0_SSP
in such a guest count?

2) Saving/restoring MSR_IA32_PL0_SSP adds complexity, though it's seldom
used.  Is it worth it?

BTW I'm still working on a KVM unit test for it, using an L1 VMM that
enumerates FRED but not CET.

Thanks!
     Xin

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Andrew Cooper 1 month, 1 week ago

On 25/08/2025 3:51 am, Xin Li wrote:
> On 8/21/2025 3:36 PM, Xin Li (Intel) wrote:
>> +    /*
>> +     * MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP (aka MSR_IA32_FRED_SSP0) are
>> +     * designated for event delivery while executing in userspace.  Since
>> +     * KVM operates exclusively in kernel mode (the CPL is always 0 after
>> +     * any VM exit), KVM can safely retain and operate with the guest-defined
>> +     * values for MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP.
>> +     *
>> +     * Therefore, interception of MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP
>> +     * is not required.
>> +     *
>> +     * Note, save and restore of MSR_IA32_PL0_SSP belong to CET supervisor
>> +     * context management.  However the FRED SSP MSRs, including
>> +     * MSR_IA32_PL0_SSP, are supported by any processor that enumerates FRED.
>> +     * If such a processor does not support CET, FRED transitions will not
>> +     * use the MSRs, but the MSRs would still be accessible using MSR-access
>> +     * instructions (e.g., RDMSR, WRMSR).
>> +     */
>> +    vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP0, MSR_TYPE_RW, intercept);
>> +    vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, intercept);
>
> Hi Sean,
>
> I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.
>
> The FRED spec claims:
>
> The FRED SSP MSRs are supported by any processor that enumerates
> CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
> support CET, FRED transitions will not use the MSRs (because shadow stacks
> are not enabled), but the MSRs would still be accessible using MSR-access
> instructions (e.g., RDMSR, WRMSR).

This is silly.  AIUI, all CPUs that have FRED also have CET-SS, so in
practice they all have these MSRs.

But from an architectural point of view, if CET-SS isn't available,
these MSRs shouldn't be either.  A guest which can't use CET-SS has no
reason to touch these MSRs at all.

MSR_PL0_SSP (== MSR_FRED_SSP_SL0) is gated on CET-SS alone (it already
exists in CPUs), while MSR_FRED_SSP_SL{1..3} should be gated on CET-SS
&& FRED, and should be reserved[1] otherwise.

This distinction only matters for guests, and adding the CET-SS
precondition makes things simpler overall for both VMMs and guests.  So
can't this just be fixed up before being integrated into the SDM?

~Andrew

[1] I have a sneaking suspicion there's a SKU reason why the spec is
written that way, and "Reserved" is still the right behaviour to have
for !CET-SS || !FRED.
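
The two enumeration rules being contrasted here can be written down as predicates. This is a sketch of one reading of the thread, not text from the FRED spec or the SDM: the function names are hypothetical, level 0 stands for MSR_IA32_PL0_SSP (FRED SSP0), and levels 1..3 stand for the FRED SSP1..3 MSRs.

```c
#include <stdbool.h>

/*
 * Spec as quoted in the thread: the FRED SSP MSRs exist whenever FRED is
 * enumerated.  PL0_SSP additionally exists on CET-SS parts without FRED.
 */
static bool ssp_msr_exists_as_written(bool fred, bool cet_ss, int level)
{
	if (level == 0)
		return fred || cet_ss;
	return fred;
}

/*
 * Gating proposed above: PL0_SSP depends on CET-SS alone, and the
 * level 1..3 MSRs require both CET-SS and FRED.
 */
static bool ssp_msr_exists_as_proposed(bool fred, bool cet_ss, int level)
{
	if (level == 0)
		return cet_ss;
	return cet_ss && fred;
}
```

Under these assumptions the two rules disagree only for the FRED && !CET-SS configuration, which is the case the rest of the thread is concerned with.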

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Xin Li 1 month, 1 week ago

On 8/26/2025 11:50 AM, Andrew Cooper wrote:
> This distinction only matters for guests, and adding the CET-SS
> precondition makes things simpler overall for both VMMs and guests.  So
> can't this just be fixed up before being integrated into the SDM?

+1 :)

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Andrew Cooper 1 month, 1 week ago

On 26/08/2025 11:03 pm, Xin Li wrote:
> On 8/26/2025 11:50 AM, Andrew Cooper wrote:
>> This distinction only matters for guests, and adding the CET-SS
>> precondition makes things simpler overall for both VMMs and guests.  So
>> can't this just be fixed up before being integrated into the SDM?
>
> +1 :)

I've just realised why these MSRs are tied together in this way.

As written, the VMX Entry/Exit Load/Save FRED controls do not allow for
a logical configuration of FRED && !CET-SS.  Both sets of stack pointers
are treated the same.

This is horrible.  I'm less certain if this can simply be fixed by
changing the SDM.

~Andrew

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Sean Christopherson 1 month, 1 week ago

On Sun, Aug 24, 2025, Xin Li wrote:
> On 8/21/2025 3:36 PM, Xin Li (Intel) wrote:
> > +	/*
> > +	 * MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP (aka MSR_IA32_FRED_SSP0) are
> > +	 * designated for event delivery while executing in userspace.  Since
> > +	 * KVM operates exclusively in kernel mode (the CPL is always 0 after
> > +	 * any VM exit), KVM can safely retain and operate with the guest-defined
> > +	 * values for MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP.
> > +	 *
> > +	 * Therefore, interception of MSR_IA32_FRED_RSP0 and MSR_IA32_PL0_SSP
> > +	 * is not required.
> > +	 *
> > +	 * Note, save and restore of MSR_IA32_PL0_SSP belong to CET supervisor
> > +	 * context management.  However the FRED SSP MSRs, including
> > +	 * MSR_IA32_PL0_SSP, are supported by any processor that enumerates FRED.
> > +	 * If such a processor does not support CET, FRED transitions will not
> > +	 * use the MSRs, but the MSRs would still be accessible using MSR-access
> > +	 * instructions (e.g., RDMSR, WRMSR).
> > +	 */
> > +	vmx_set_intercept_for_msr(vcpu, MSR_IA32_FRED_RSP0, MSR_TYPE_RW, intercept);
> > +	vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, intercept);
> 
> Hi Sean,
> 
> I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.
> 
> The FRED spec claims:
> 
> The FRED SSP MSRs are supported by any processor that enumerates
> CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
> support CET, FRED transitions will not use the MSRs (because shadow stacks
> are not enabled), but the MSRs would still be accessible using MSR-access
> instructions (e.g., RDMSR, WRMSR).
> 
> It means KVM needs to handle MSR_IA32_PL0_SSP even when FRED is supported
> but CET is not.  And this can be broken down into two subtasks:
> 
> 1) Allow such a guest to access MSR_IA32_PL0_SSP w/o triggering #GP.  And
> this behavior is already implemented in patch 8 of this series.
> 
> 2) Save and restore MSR_IA32_PL0_SSP in both KVM and Qemu for such a guest.

What novel work needs to be done in KVM?  For QEMU, I assume it's just adding an
"or FRED" somewhere.  For KVM, I'm missing what additional work would be required
that wouldn't be naturally covered by patch 8 (assuming patch 8 is bug-free).

> I have the patches for 2) but they are not included in this series, because
> 
> 1) how much do we care the value in MSR_IA32_PL0_SSP in such a guest?
> 
> Yes, Chao told me that you are the one saying that MSRs can be used as
> clobber registers and KVM should preserve the value.  Does MSR_IA32_PL0_SSP
> in such a guest count?

If the architecture says that MSR_IA32_PL0_SSP exists and is accessible, then
KVM needs to honor that.

> 2) Saving/restoring MSR_IA32_PL0_SSP adds complexity, though it's seldom
> used.  Is it worth it?

Honoring the architecture is generally not optional.  There are extreme cases
where KVM violates that rule and takes an (often undocumented) erratum, e.g. APIC
base relocation would require an absurd amount of complexity for no real world
benefit.  But I would be very surprised if the complexity in KVM or QEMU to support
this scenario is at all meaningful, let alone enough to justify diverging from
the architectural spec.

> BTW I'm still working on a KVM unit test for it, using a L1 VMM that
> enumerates FRED but not CET.
> 
> Thanks!
>     Xin

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Xin Li 1 month, 1 week ago

>> Hi Sean,
>>
>> I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.
>>
>> The FRED spec claims:
>>
>> The FRED SSP MSRs are supported by any processor that enumerates
>> CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
>> support CET, FRED transitions will not use the MSRs (because shadow stacks
>> are not enabled), but the MSRs would still be accessible using MSR-access
>> instructions (e.g., RDMSR, WRMSR).
>>
>> It means KVM needs to handle MSR_IA32_PL0_SSP even when FRED is supported
>> but CET is not.  And this can be broken down into two subtasks:
>>
>> 1) Allow such a guest to access MSR_IA32_PL0_SSP w/o triggering #GP.  And
>> this behavior is already implemented in patch 8 of this series.
>>
>> 2) Save and restore MSR_IA32_PL0_SSP in both KVM and Qemu for such a guest.
> 
> What novel work needs to be done in KVM?  For QEMU, I assume it's just adding an
> "or FRED" somewhere.  For KVM, I'm missing what additional work would be required
> that wouldn't be naturally covered by patch 8 (assuming patch 8 is bug-free).

Extra patches:

1) A patch to save/restore guest MSR_IA32_PL0_SSP (i.e., FRED SSP0), as
we have done for RSP0.  The following patch goes on top of the patch
saving/restoring RSP0:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 449a5e02c7de..0bf684342a71 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1294,8 +1294,13 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)

  	wrmsrq(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);

-	if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
  		wrmsrns(MSR_IA32_FRED_RSP0, vmx->msr_guest_fred_rsp0);
+
+		/* XSAVES/XRSTORS do not cover SSP MSRs */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+			wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);
+	}
  #else
  	savesegment(fs, fs_sel);
  	savesegment(gs, gs_sel);
@@ -1349,6 +1354,10 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
  		 * CPU exits to userspace (RSP0 is a per-task value).
  		 */
  		fred_sync_rsp0(vmx->msr_guest_fred_rsp0);
+
+		/* XSAVES/XRSTORS do not cover SSP MSRs */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
+			vmx->msr_guest_fred_ssp0 = read_msr(MSR_IA32_FRED_SSP0);
  	}
  #endif
  	load_fixmap_gdt(raw_smp_processor_id());
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 733fa2ef4bea..12c1cf827cb7 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -228,6 +228,7 @@ struct vcpu_vmx {
  #ifdef CONFIG_X86_64
  	u64		      msr_guest_kernel_gs_base;
  	u64		      msr_guest_fred_rsp0;
+	u64		      msr_guest_fred_ssp0;
  #endif

  	u64		      spec_ctrl;

We might also want to zero the host MSR_IA32_PL0_SSP when switching to host.


2) Add vmx_read_guest_fred_ssp0()/vmx_write_guest_fred_ssp0(), and use
them to read/write MSR_IA32_PL0_SSP in patch 8:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 99106750b1e3..cbdc67682d27 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1400,9 +1408,23 @@ static void vmx_write_guest_fred_rsp0(struct vcpu_vmx *vmx, u64 data)
  	vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_RSP0, data,
  				 &vmx->msr_guest_fred_rsp0);
  }
+
+static u64 vmx_read_guest_fred_ssp0(struct vcpu_vmx *vmx)
+{
+	return vmx_read_guest_host_msr(vmx, MSR_IA32_FRED_SSP0,
+				       &vmx->msr_guest_fred_ssp0);
+}
+
+static void vmx_write_guest_fred_ssp0(struct vcpu_vmx *vmx, u64 data)
+{
+	vmx_write_guest_host_msr(vmx, MSR_IA32_FRED_SSP0, data,
+				 &vmx->msr_guest_fred_ssp0);
+}
  #endif

  static void grow_ple_window(struct kvm_vcpu *vcpu)
@@ -2189,6 +2211,18 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  	case MSR_IA32_DEBUGCTLMSR:
  		msr_info->data = vmx_guest_debugctl_read();
  		break;
+	case MSR_IA32_PL0_SSP:
+		/*
+		 * If kvm_cpu_cap_has(X86_FEATURE_SHSTK) but
+		 * !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK), XSAVES/XRSTORS
+		 * cover SSP MSRs.
+		 */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+		    guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
+			msr_info->data = vmx_read_guest_fred_ssp0(vmx);
+			break;
+		}
+		fallthrough;
  	default:
  	find_uret_msr:
  		msr = vmx_find_uret_msr(vmx, msr_info->index);
@@ -2540,7 +2574,18 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
  		}
  		ret = kvm_set_msr_common(vcpu, msr_info);
  		break;
-
+	case MSR_IA32_PL0_SSP:
+		/*
+		 * If kvm_cpu_cap_has(X86_FEATURE_SHSTK) but
+		 * !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK), XSAVES/XRSTORS
+		 * cover SSP MSRs.
+		 */
+		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
+		    guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
+			vmx_write_guest_fred_ssp0(vmx, data);
+			break;
+		}
+		fallthrough;
  	default:
  	find_uret_msr:
  		msr = vmx_find_uret_msr(vmx, msr_index);


3) Another change I was discussing with Chao:
https://lore.kernel.org/lkml/2ed04dff-e778-46c6-bd5f-51295763af06@zytor.com/
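
The deferred save/restore in extra patch 1) can be pictured with a small userspace model. This is a sketch under stated assumptions, not the KVM implementation: `model_cpu`, `model_vcpu` and both functions are hypothetical, `hw_ssp0` stands in for the physical MSR, and the model ignores the `!kvm_cpu_cap_has(X86_FEATURE_SHSTK)` condition that guards the real code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one hardware register, one per-vCPU saved copy. */
struct model_cpu {
	uint64_t hw_ssp0;	/* stands in for the physical MSR */
};

struct model_vcpu {
	uint64_t guest_ssp0;	/* stands in for vmx->msr_guest_fred_ssp0 */
};

/* Load the guest value before running the guest (cf. wrmsrns()). */
static void model_switch_to_guest(struct model_cpu *cpu,
				  const struct model_vcpu *vcpu)
{
	cpu->hw_ssp0 = vcpu->guest_ssp0;
}

/*
 * Save whatever the guest left in the register when leaving vCPU
 * context, then optionally zero the hardware value for the host.
 */
static void model_switch_to_host(struct model_cpu *cpu,
				 struct model_vcpu *vcpu, bool zero_hw)
{
	vcpu->guest_ssp0 = cpu->hw_ssp0;
	if (zero_hw)
		cpu->hw_ssp0 = 0;
}
```

Because the MSR is passed through in this scheme, a guest write lands directly in `hw_ssp0` and only reaches the saved copy at the next switch to host.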

> 
>> I have the patches for 2) but they are not included in this series, because
>>
>> 1) how much do we care the value in MSR_IA32_PL0_SSP in such a guest?
>>
>> Yes, Chao told me that you are the one saying that MSRs can be used as
>> clobber registers and KVM should preserve the value.  Does MSR_IA32_PL0_SSP
>> in such a guest count?
> 
> If the architecture says that MSR_IA32_PL0_SSP exists and is accessible, then
> KVM needs to honor that.
> 
>> 2) Saving/restoring MSR_IA32_PL0_SSP adds complexity, though it's seldom
>> used.  Is it worth it?
> 
> Honoring the architecture is generally not optional.  There are extreme cases
> where KVM violates that rule and takes (often undocumented) erratum, e.g. APIC
> base relocation would require an absurd amount of complexity for no real world
> benefit.  But I would be very surprised if the complexity in KVM or QEMU to support
> this scenario is at all meaningful, let alone enough to justify diverging from
> the architectural spec.

Let me post v7 which includes all the required changes.

Thanks!
     Xin

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Sean Christopherson 1 month, 1 week ago

On Tue, Aug 26, 2025, Xin Li wrote:
> 
> > > Hi Sean,
> > > 
> > > I'd like to bring up an issue concerning MSR_IA32_PL0_SSP.
> > > 
> > > The FRED spec claims:
> > > 
> > > The FRED SSP MSRs are supported by any processor that enumerates
> > > CPUID.(EAX=7,ECX=1):EAX.FRED[bit 17] as 1. If such a processor does not
> > > support CET, FRED transitions will not use the MSRs (because shadow stacks
> > > are not enabled), but the MSRs would still be accessible using MSR-access
> > > instructions (e.g., RDMSR, WRMSR).
> > > 
> > > It means KVM needs to handle MSR_IA32_PL0_SSP even when FRED is supported
> > > but CET is not.  And this can be broken down into two subtasks:
> > > 
> > > 1) Allow such a guest to access MSR_IA32_PL0_SSP w/o triggering #GP.  And
> > > this behavior is already implemented in patch 8 of this series.
> > > 
> > > 2) Save and restore MSR_IA32_PL0_SSP in both KVM and Qemu for such a guest.
> > 
> > What novel work needs to be done in KVM?  For QEMU, I assume it's just adding an
> > "or FRED" somewhere.  For KVM, I'm missing what additional work would be required
> > that wouldn't be naturally covered by patch 8 (assuming patch 8 is bug-free).
> 
> Extra patches:
> 
> 1) A patch to save/restore guest MSR_IA32_PL0_SSP (i.e., FRED SSP0), as
> what we have done for RSP0, following is the patch on top of the patch
> saving/restoring RSP0:
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 449a5e02c7de..0bf684342a71 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1294,8 +1294,13 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> 
>  	wrmsrq(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
> 
> -	if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED))
> +	if (guest_cpu_cap_has(vcpu, X86_FEATURE_FRED)) {
>  		wrmsrns(MSR_IA32_FRED_RSP0, vmx->msr_guest_fred_rsp0);
> +
> +		/* XSAVES/XRSTORS do not cover SSP MSRs */

Eww.  I'm with Andrew, fix the SDM.  This is silly.

> +		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> +			wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);

FWIW, if we can't get an SDM change, don't bother with RDMSR/WRMSRNS, just
configure KVM to intercept accesses.  Then in kvm_set_msr_common(), pivot on
X86_FEATURE_SHSTK, e.g.

	case MSR_IA32_U_CET:
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
			WARN_ON_ONCE(msr != MSR_IA32_FRED_SSP0);
			vcpu->arch.fred_rsp0_fallback = data;
			break;
		}

		kvm_set_xstate_msr(vcpu, msr_info);
		break;

and

	case MSR_IA32_U_CET:
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
			WARN_ON_ONCE(msr_info->index != MSR_IA32_FRED_SSP0);
			msr_info->data = vcpu->arch.fred_rsp0_fallback;
			break;
		}

		kvm_get_xstate_msr(vcpu, msr_info);
		break;
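
A userspace model of this intercept-and-pivot approach, for illustration only. All names here are hypothetical: `host_has_shstk` stands in for kvm_cpu_cap_has(X86_FEATURE_SHSTK), `ssp0_fallback` for the software-only field, and `xstate_ssp0` for the XSAVE-managed state behind kvm_{get,set}_xstate_msr(). The read path returns the stored value to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_vcpu {
	bool host_has_shstk;	/* host supports CET shadow stacks */
	uint64_t ssp0_fallback;	/* purely virtual guest value */
	uint64_t xstate_ssp0;	/* value kept in XSAVE-managed state */
};

/* Write path: without SHSTK, the value never reaches hardware state. */
static void model_set_pl0_ssp(struct model_vcpu *v, uint64_t data)
{
	if (!v->host_has_shstk) {
		v->ssp0_fallback = data;
		return;
	}
	v->xstate_ssp0 = data;
}

/* Read path: hand back whichever copy the write path populated. */
static uint64_t model_get_pl0_ssp(const struct model_vcpu *v)
{
	if (!v->host_has_shstk)
		return v->ssp0_fallback;
	return v->xstate_ssp0;
}
```

In this model a guest write followed by a guest read round-trips through the fallback field on a host without SHSTK, and the XSAVE-managed copy is left untouched.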

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Xin Li 1 month, 1 week ago

On 8/26/2025 3:17 PM, Sean Christopherson wrote:
>> +		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
>> +			wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);
> FWIW, if we can't get an SDM change, don't bother with RDMSR/WRMSRNS, just
> configure KVM to intercept accesses.  Then in kvm_set_msr_common(), pivot on
> X86_FEATURE_SHSTK, e.g.


Intercepting is a solid approach: it ensures the guest value is fully
virtual and does not affect the hardware FRED SSP0 MSR.  Of course the code
is also simplified.


> 
> 	case MSR_IA32_U_CET:
> 	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> 		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
> 			WARN_ON_ONCE(msr != MSR_IA32_FRED_SSP0);
> 			vcpu->arch.fred_rsp0_fallback = data;
> 			break;
> 		}
> 
> 		kvm_set_xstate_msr(vcpu, msr_info);
> 		break;
> 
> and
> 
> 	case MSR_IA32_U_CET:
> 	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> 		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
> 			WARN_ON_ONCE(msr_info->index != MSR_IA32_FRED_SSP0);
> 			msr_info->data = vcpu->arch.fred_rsp0_fallback;
> 			break;
> 		}
> 
> 		kvm_get_xstate_msr(vcpu, msr_info);
> 		break;

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Xin Li 1 month, 1 week ago

On 8/27/2025 3:24 PM, Xin Li wrote:
> On 8/26/2025 3:17 PM, Sean Christopherson wrote:
>>> +        if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
>>> +            wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);
>> FWIW, if we can't get an SDM change, don't bother with RDMSR/WRMSRNS, just
>> configure KVM to intercept accesses.  Then in kvm_set_msr_common(), pivot on
>> X86_FEATURE_SHSTK, e.g.
> 
> 
> Intercepting is a solid approach: it ensures the guest value is fully
> virtual and does not affect the hardware FRED SSP0 MSR.  Of course the code
> is also simplified.
> 
> 
>>
>>     case MSR_IA32_U_CET:
>>     case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
>>         if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
>>             WARN_ON_ONCE(msr != MSR_IA32_FRED_SSP0);
>>             vcpu->arch.fred_rsp0_fallback = data;

Putting fred_rsp0_fallback in struct kvm_vcpu_arch reminds me of one thing:

We know AMD will do FRED and follow the FRED spec for bare metal, but
regarding virtualization of FRED, I have no idea how it will be done on
AMD, so I keep the KVM FRED code in VMX files, e.g., msr_guest_fred_rsp0 is
defined in struct vcpu_vmx, and saved/restored in vmx.c.

It is a future task to make common KVM FRED code for Intel and AMD.

Re: [PATCH v6 06/20] KVM: VMX: Set FRED MSR intercepts

Posted by Sean Christopherson 1 month ago

On Wed, Aug 27, 2025, Xin Li wrote:
> On 8/27/2025 3:24 PM, Xin Li wrote:
> > On 8/26/2025 3:17 PM, Sean Christopherson wrote:
> > > > +        if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> > > > +            wrmsrns(MSR_IA32_FRED_SSP0, vmx->msr_guest_fred_ssp0);
> > > FWIW, if we can't get an SDM change, don't bother with RDMSR/WRMSRNS, just
> > > configure KVM to intercept accesses.  Then in kvm_set_msr_common(), pivot on
> > > X86_FEATURE_SHSTK, e.g.
> > 
> > 
> > Intercepting is a solid approach: it ensures the guest value is fully
> > virtual and does not affect the hardware FRED SSP0 MSR.  Of course the code
> > is also simplified.
> > 
> > 
> > > 
> > >     case MSR_IA32_U_CET:
> > >     case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > >         if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
> > >             WARN_ON_ONCE(msr != MSR_IA32_FRED_SSP0);
> > >             vcpu->arch.fred_rsp0_fallback = data;
> 
> Putting fred_rsp0_fallback in struct kvm_vcpu_arch reminds me one thing:
> 
> We know AMD will do FRED and follow the FRED spec for bare metal, but
> regarding virtualization of FRED, I have no idea how it will be done on
> AMD, so I keep the KVM FRED code in VMX files, e.g., msr_guest_fred_rsp0 is
> defined in struct vcpu_vmx, and saved/restored in vmx.c.

The problem is that if you do that, then the handling of MSR_IA32_PL0_SSP takes
completely different paths depending on vendor, theoretically on hardware, and
on guest CPUID model.  That makes it _really_ difficult to understand how PL0_SSP
is emulated by KVM.

And I actually think that's moot anyways.  KVM _always_ needs to emulate MSR
accesses in software, and the whole goofy PL0_SSP behavior is a bare metal quirk,
not a virtualization quirk.  So unless AMD defines different architecture (which
is certainly possible), AMD will also need arch.fred_rsp0_fallback.

> It is a future task to make common KVM FRED code for Intel and AMD.

No, this is not how I want to approach hardware enabling.  KVM needs to guard
against false advertising, e.g. ensure likely-to-be-common CPUID features are
explicitly cleared in the other vendor.  But deliberately burying code that's
vendor agnostic in whatever vendor support happens to come along first isn't
necessary by any means, and is usually a net negative in the grand scheme, and
often in a big way.

E.g. in this case, if arch.fred_rsp0_fallback ends up being unnecessary for AMD,
we probably don't even need to do anything, KVM will just have a field that's
only used on Intel because the quirky scenario can't be reached on AMD.

But if we bury the code in VMX, then the _best_ case scenario is that KVM carries
a weird split of responsibility in perpetuity (happy path handled in x86.c, rare
sad path handled in vmx.c).  And the worst case scenario is that we carry the
weird split for some time, and then have to undo all of it when AMD support comes
along.  Actually, the worst case scenario is that we forget about the VMX code
and re-implement the same thing in svm.c.