[PATCH v12 17/24] KVM: VMX: Set up interception for CET MSRs

Chao Gao posted 24 patches 1 month, 3 weeks ago
There is a newer version of this series
Posted by Chao Gao 1 month, 3 weeks ago
From: Yang Weijiang <weijiang.yang@intel.com>

Enable/disable CET MSRs interception per associated feature configuration.

Shadow Stack feature requires all CET MSRs passed through to guest to make
it supported in user and supervisor mode while IBT feature only depends on
MSR_IA32_{U,S}_CET to enable user and supervisor IBT.

Note, this MSR design introduces an architectural limitation on SHSTK and
IBT control for the guest, i.e., when SHSTK is exposed, IBT is also
available to the guest from an architectural perspective, since IBT relies
on a subset of the SHSTK-relevant MSRs.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
Tested-by: Mathias Krause <minipli@grsecurity.net>
Tested-by: John Allen <john.allen@amd.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bd572c8c7bc3..130ffbe7dc1a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4088,6 +4088,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
 
 void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 {
+	bool set;
+
 	if (!cpu_has_vmx_msr_bitmap())
 		return;
 
@@ -4133,6 +4135,24 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
 		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
 					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
 
+	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
+		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
+
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, set);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, set);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, set);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, set);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, set);
+	}
+
+	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK) || kvm_cpu_cap_has(X86_FEATURE_IBT)) {
+		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) &&
+		      !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
+
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, set);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, set);
+	}
+
 	/*
 	 * x2APIC and LBR MSR intercepts are modified on-demand and cannot be
 	 * filtered by userspace.
-- 
2.47.1
Re: [PATCH v12 17/24] KVM: VMX: Set up interception for CET MSRs
Posted by Sean Christopherson 1 month, 2 weeks ago
On Mon, Aug 11, 2025, Chao Gao wrote:
> From: Yang Weijiang <weijiang.yang@intel.com>
> 
> Enable/disable CET MSRs interception per associated feature configuration.
> 
> Shadow Stack feature requires all CET MSRs passed through to guest to make
> it supported in user and supervisor mode 

I doubt that SS _requires_ CET MSRs to be passed through.  IIRC, the actual
reason for passing through most of the MSRs is that they are managed via XSAVE,
i.e. _can't_ be intercepted without also intercepting XRSTOR.

> while IBT feature only depends on
> MSR_IA32_{U,S}_CETS_CET to enable user and supervisor IBT.
> 
> Note, this MSR design introduced an architectural limitation of SHSTK and
> IBT control for guest, i.e., when SHSTK is exposed, IBT is also available
> to guest from architectural perspective since IBT relies on subset of SHSTK
> relevant MSRs.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> Tested-by: Mathias Krause <minipli@grsecurity.net>
> Tested-by: John Allen <john.allen@amd.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index bd572c8c7bc3..130ffbe7dc1a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4088,6 +4088,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
>  
>  void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>  {
> +	bool set;

s/set/intercept

> +
>  	if (!cpu_has_vmx_msr_bitmap())
>  		return;
>  
> @@ -4133,6 +4135,24 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>  		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
>  					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
>  
> +	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
> +		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
> +
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, set);
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, set);
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, set);
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, set);
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, set);

MSR_IA32_INT_SSP_TAB isn't managed via XSAVE, so why is it being passed through?

> +	}
> +
> +	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK) || kvm_cpu_cap_has(X86_FEATURE_IBT)) {
> +		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) &&
> +		      !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
> +
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, set);
> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, set);
> +	}
> +
>  	/*
>  	 * x2APIC and LBR MSR intercepts are modified on-demand and cannot be
>  	 * filtered by userspace.
> -- 
> 2.47.1
>
Re: [PATCH v12 17/24] KVM: VMX: Set up interception for CET MSRs
Posted by Xin Li 1 month, 2 weeks ago
On 8/19/2025 9:11 AM, Sean Christopherson wrote:
> On Mon, Aug 11, 2025, Chao Gao wrote:
>> From: Yang Weijiang <weijiang.yang@intel.com>
>>
>> Enable/disable CET MSRs interception per associated feature configuration.
>>
>> Shadow Stack feature requires all CET MSRs passed through to guest to make
>> it supported in user and supervisor mode
> 
> I doubt that SS _requires_ CET MSRs to be passed through.  IIRC, the actual
> reason for passing through most of the MSRs is that they are managed via XSAVE,
> i.e. _can't_ be intercepted without also intercepting XRSTOR.
> 
>> while IBT feature only depends on
>> MSR_IA32_{U,S}_CETS_CET to enable user and supervisor IBT.
>>
>> Note, this MSR design introduced an architectural limitation of SHSTK and
>> IBT control for guest, i.e., when SHSTK is exposed, IBT is also available
>> to guest from architectural perspective since IBT relies on subset of SHSTK
>> relevant MSRs.
>>
>> Suggested-by: Sean Christopherson <seanjc@google.com>
>> Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>> Tested-by: Mathias Krause <minipli@grsecurity.net>
>> Tested-by: John Allen <john.allen@amd.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>>   arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++
>>   1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index bd572c8c7bc3..130ffbe7dc1a 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -4088,6 +4088,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
>>   
>>   void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>>   {
>> +	bool set;
> 
> s/set/intercept
> 

Maybe that's because you asked me to change "flag" to "set" when reviewing
the FRED patches. However, "intercept" does sound better, and I just changed it :)

>> +
>>   	if (!cpu_has_vmx_msr_bitmap())
>>   		return;
>>   
>> @@ -4133,6 +4135,24 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>>   		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
>>   					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
>>   
>> +	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
>> +		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
>> +
>> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, set);
>> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, set);
>> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, set);
>> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, set);
>> +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, set);
> 
> MSR_IA32_INT_SSP_TAB isn't managed via XSAVE, so why is it being passed through?

It's managed in VMCS host and guest areas, i.e. HOST_INTR_SSP_TABLE and
GUEST_INTR_SSP_TABLE, if the "load CET" bits are set in both VM entry
and exit controls.

FRED MSRs are also passed through to guest in such cases.

no?
Re: [PATCH v12 17/24] KVM: VMX: Set up interception for CET MSRs
Posted by Sean Christopherson 1 month, 2 weeks ago
On Tue, Aug 19, 2025, Xin Li wrote:
> On 8/19/2025 9:11 AM, Sean Christopherson wrote:
> > On Mon, Aug 11, 2025, Chao Gao wrote:
> > > From: Yang Weijiang <weijiang.yang@intel.com>
> > > 
> > > Enable/disable CET MSRs interception per associated feature configuration.
> > > 
> > > Shadow Stack feature requires all CET MSRs passed through to guest to make
> > > it supported in user and supervisor mode
> > 
> > I doubt that SS _requires_ CET MSRs to be passed through.  IIRC, the actual
> > reason for passing through most of the MSRs is that they are managed via XSAVE,
> > i.e. _can't_ be intercepted without also intercepting XRSTOR.
> > 
> > > while IBT feature only depends on
> > > MSR_IA32_{U,S}_CETS_CET to enable user and supervisor IBT.
> > > 
> > > Note, this MSR design introduced an architectural limitation of SHSTK and
> > > IBT control for guest, i.e., when SHSTK is exposed, IBT is also available
> > > to guest from architectural perspective since IBT relies on subset of SHSTK
> > > relevant MSRs.
> > > 
> > > Suggested-by: Sean Christopherson <seanjc@google.com>
> > > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
> > > Tested-by: Mathias Krause <minipli@grsecurity.net>
> > > Tested-by: John Allen <john.allen@amd.com>
> > > Signed-off-by: Chao Gao <chao.gao@intel.com>
> > > ---
> > >   arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++
> > >   1 file changed, 20 insertions(+)
> > > 
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index bd572c8c7bc3..130ffbe7dc1a 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -4088,6 +4088,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
> > >   void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
> > >   {
> > > +	bool set;
> > 
> > s/set/intercept
> > 
> 
> Maybe because you asked me to change "flag" to "set" when reviewing FRED
> patches, however "intercept" does sound better, and I just changed it :)

Ah crud.  I had a feeling I was flip-flopping.  I obviously don't have a strong
preference.

> > > +
> > >   	if (!cpu_has_vmx_msr_bitmap())
> > >   		return;
> > > @@ -4133,6 +4135,24 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
> > >   		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
> > >   					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
> > > +	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
> > > +		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
> > > +
> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, set);
> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, set);
> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, set);
> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, set);
> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, set);
> > 
> > MSR_IA32_INT_SSP_TAB isn't managed via XSAVE, so why is it being passed through?
> 
> It's managed in VMCS host and guest areas, i.e. HOST_INTR_SSP_TABLE and
> GUEST_INTR_SSP_TABLE, if the "load CET" bits are set in both VM entry
> and exit controls.

Ah, "because it's essentially free".  Unless there's a true need to pass it through,
I think it makes sense to intercept.  Merging KVM's bitmap with vmcs12's bitmap
isn't completely free (though it's quite cheap).  More importantly, this is technically
wrong due to MSR_IA32_INT_SSP_TAB not existing if the vCPU doesn't have LM.  That's
obviously easy to solve, I just don't see the point.
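
For readers following the bitmap-merging point, here is a simplified, self-contained model of why a pass-through in KVM's bitmap only reaches a nested guest if vmcs12's bitmap also passes the MSR through. It is a sketch, not KVM's nested code: the helper names (msr_bitmap_locate(), msr_bitmap_set_intercept(), msr_bitmap_is_intercepted(), msr_bitmap_merge()) are hypothetical, and the whole-page OR is an assumption made for brevity. The region offsets follow the architectural layout of the 4-KByte VMX MSR bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSR_BITMAP_SIZE		4096

/* Byte offsets of the four 1-KByte regions of a VMX MSR bitmap page. */
#define READ_LOW		0	/* reads,  MSRs 0x00000000-0x00001fff */
#define READ_HIGH		1024	/* reads,  MSRs 0xc0000000-0xc0001fff */
#define WRITE_LOW		2048	/* writes, MSRs 0x00000000-0x00001fff */
#define WRITE_HIGH		3072	/* writes, MSRs 0xc0000000-0xc0001fff */

#define MSR_IA32_PL0_SSP	0x6a4
#define MSR_IA32_INT_SSP_TAB	0x6a8

/* Find the byte and bit mask for an MSR; false if the bitmap can't cover it. */
static bool msr_bitmap_locate(uint32_t msr, bool write, size_t *byte, uint8_t *mask)
{
	size_t base;

	if (msr <= 0x1fff) {
		base = write ? WRITE_LOW : READ_LOW;
	} else if (msr >= 0xc0000000u && msr <= 0xc0001fffu) {
		base = write ? WRITE_HIGH : READ_HIGH;
		msr &= 0x1fff;
	} else {
		return false;
	}

	*byte = base + msr / 8;
	*mask = (uint8_t)(1u << (msr % 8));
	return true;
}

/* A set bit means "intercept" (VM-exit); a clear bit means pass-through. */
static void msr_bitmap_set_intercept(uint8_t *bitmap, uint32_t msr, bool write,
				     bool intercept)
{
	size_t byte;
	uint8_t mask;

	assert(bitmap);
	if (!msr_bitmap_locate(msr, write, &byte, &mask))
		return;

	if (intercept)
		bitmap[byte] |= mask;
	else
		bitmap[byte] &= (uint8_t)~mask;
}

static bool msr_bitmap_is_intercepted(const uint8_t *bitmap, uint32_t msr, bool write)
{
	size_t byte;
	uint8_t mask;

	assert(bitmap);
	/* MSRs outside both ranges always cause a VM-exit. */
	if (!msr_bitmap_locate(msr, write, &byte, &mask))
		return true;

	return bitmap[byte] & mask;
}

/* Intercept in the merged bitmap if either L0 or L1 wants to intercept. */
static void msr_bitmap_merge(uint8_t *merged, const uint8_t *l0, const uint8_t *l1)
{
	size_t i;

	assert(merged && l0 && l1);
	for (i = 0; i < MSR_BITMAP_SIZE; i++)
		merged[i] = l0[i] | l1[i];
}
```

In this model, leaving MSR_IA32_INT_SSP_TAB set in the L0 bitmap keeps it intercepted in the merged result regardless of what L1 asks for, so there is nothing extra to merge for that MSR.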
Re: [PATCH v12 17/24] KVM: VMX: Set up interception for CET MSRs
Posted by Chao Gao 1 month, 2 weeks ago
On Tue, Aug 19, 2025 at 11:45:42AM -0700, Sean Christopherson wrote:
>On Tue, Aug 19, 2025, Xin Li wrote:
>> On 8/19/2025 9:11 AM, Sean Christopherson wrote:
>> > On Mon, Aug 11, 2025, Chao Gao wrote:
>> > > From: Yang Weijiang <weijiang.yang@intel.com>
>> > > 
>> > > Enable/disable CET MSRs interception per associated feature configuration.
>> > > 
>> > > Shadow Stack feature requires all CET MSRs passed through to guest to make
>> > > it supported in user and supervisor mode
>> > 
>> > I doubt that SS _requires_ CET MSRs to be passed through.  IIRC, the actual
>> > reason for passing through most of the MSRs is that they are managed via XSAVE,
>> > i.e. _can't_ be intercepted without also intercepting XRSTOR.

Agreed. Will update the changelog.

>> > 
>> > > while IBT feature only depends on
>> > > MSR_IA32_{U,S}_CETS_CET to enable user and supervisor IBT.
>> > > 
>> > > Note, this MSR design introduced an architectural limitation of SHSTK and
>> > > IBT control for guest, i.e., when SHSTK is exposed, IBT is also available
>> > > to guest from architectural perspective since IBT relies on subset of SHSTK
>> > > relevant MSRs.
>> > > 
>> > > Suggested-by: Sean Christopherson <seanjc@google.com>
>> > > Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
>> > > Tested-by: Mathias Krause <minipli@grsecurity.net>
>> > > Tested-by: John Allen <john.allen@amd.com>
>> > > Signed-off-by: Chao Gao <chao.gao@intel.com>
>> > > ---
>> > >   arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++
>> > >   1 file changed, 20 insertions(+)
>> > > 
>> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> > > index bd572c8c7bc3..130ffbe7dc1a 100644
>> > > --- a/arch/x86/kvm/vmx/vmx.c
>> > > +++ b/arch/x86/kvm/vmx/vmx.c
>> > > @@ -4088,6 +4088,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu)
>> > >   void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>> > >   {
>> > > +	bool set;
>> > 
>> > s/set/intercept
>> > 
>> 
>> Maybe because you asked me to change "flag" to "set" when reviewing FRED
>> patches, however "intercept" does sound better, and I just changed it :)
>
>Ah crud.  I had a feeling I was flip-flopping.  I obviously don't have a strong
>preference.

Anyway, I will use "intercept".

>
>> > > +
>> > >   	if (!cpu_has_vmx_msr_bitmap())
>> > >   		return;
>> > > @@ -4133,6 +4135,24 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu)
>> > >   		vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W,
>> > >   					  !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D));
>> > > +	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
>> > > +		set = !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK);
>> > > +
>> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, set);
>> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, set);
>> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, set);
>> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, set);
>> > > +		vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, set);
>> > 
>> > MSR_IA32_INT_SSP_TAB isn't managed via XSAVE, so why is it being passed through?
>> 
>> It's managed in VMCS host and guest areas, i.e. HOST_INTR_SSP_TABLE and
>> GUEST_INTR_SSP_TABLE, if the "load CET" bits are set in both VM entry
>> and exit controls.
>
>Ah, "because it's essentially free".  Unless there's a true need to pass it through,
>I think it makes sense to intercept.  Merging KVM's bitmap with vmcs12's bitmap
>isn't completely free (though it's quite cheap).  More importantly, this is technically
>wrong due to MSR_IA32_INT_SSP_TAB not existing if the vCPU doesn't have LM.  That's
>obviously easy to solve, I just don't see the point.

Sure. I will leave MSR_IA32_INT_SSP_TAB intercepted and add an LM check to its emulation.
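
To summarize where the thread lands, the interception policy can be modeled as a small truth table. The sketch below is standalone C, not the vmx.c change itself: struct cet_caps, struct cet_intercepts, cet_recalc_intercepts() and int_ssp_tab_exists() are hypothetical names, the "intercept by default when the host lacks the feature" behavior is an assumption about the initial bitmap state, and the exact form of the LM check is a guess at what the emulation path would test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kvm_cpu_cap_has()/guest_cpu_cap_has() results. */
struct cet_caps {
	bool host_shstk;	/* kvm_cpu_cap_has(X86_FEATURE_SHSTK) */
	bool host_ibt;		/* kvm_cpu_cap_has(X86_FEATURE_IBT) */
	bool guest_shstk;	/* guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) */
	bool guest_ibt;		/* guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) */
	bool guest_lm;		/* guest_cpu_cap_has(vcpu, X86_FEATURE_LM) */
};

/* true == intercept (VM-exit), false == pass through to the guest. */
struct cet_intercepts {
	bool pl_ssp;		/* MSR_IA32_PL{0,1,2,3}_SSP */
	bool u_s_cet;		/* MSR_IA32_{U,S}_CET */
	bool int_ssp_tab;	/* MSR_IA32_INT_SSP_TAB */
};

static struct cet_intercepts cet_recalc_intercepts(const struct cet_caps *caps)
{
	/* Assumption: everything starts out intercepted. */
	struct cet_intercepts res = {
		.pl_ssp = true,
		.u_s_cet = true,
		.int_ssp_tab = true,	/* always intercepted, per the review */
	};

	assert(caps);

	/* The SSP MSRs are XSAVE-managed, so pass them through with SHSTK. */
	if (caps->host_shstk)
		res.pl_ssp = !caps->guest_shstk;

	/* U_CET/S_CET are shared by SHSTK and IBT: either one enables them. */
	if (caps->host_shstk || caps->host_ibt)
		res.u_s_cet = !caps->guest_ibt && !caps->guest_shstk;

	return res;
}

/*
 * Assumed emulation-side check: the interrupt SSP table MSR is treated as
 * present only when the guest has both SHSTK and long mode.
 */
static bool int_ssp_tab_exists(const struct cet_caps *caps)
{
	assert(caps);
	return caps->guest_shstk && caps->guest_lm;
}
```

The second condition in cet_recalc_intercepts() is the architectural coupling the changelog mentions: a guest with only SHSTK still gets MSR_IA32_{U,S}_CET passed through, and those are the same MSRs that control IBT.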