[PATCH v16 27/51] KVM: x86: Disable support for IBT and SHSTK if allow_smaller_maxphyaddr is true

Sean Christopherson posted 51 patches 1 week, 5 days ago
Posted by Sean Christopherson 1 week, 5 days ago
Make IBT and SHSTK virtualization mutually exclusive with "officially"
supporting setups with guest.MAXPHYADDR < host.MAXPHYADDR, i.e. if the
allow_smaller_maxphyaddr module param is set.  Running a guest with a
smaller MAXPHYADDR requires intercepting #PF, and can also trigger
emulation of arbitrary instructions.  Intercepting and reacting to #PFs
doesn't play nice with SHSTK, as KVM's MMU hasn't been taught to handle
Shadow Stack accesses, and emulating arbitrary instructions doesn't play
nice with IBT or SHSTK, as KVM's emulator doesn't handle the various side
effects, e.g. doesn't enforce end-branch markers or model Shadow Stack
updates.

Note, hiding IBT and SHSTK based solely on allow_smaller_maxphyaddr is
overkill, as allow_smaller_maxphyaddr is only problematic if the guest is
actually configured to have a smaller MAXPHYADDR.  However, KVM's ABI
doesn't provide a way to express that IBT and SHSTK may break if enabled
in conjunction with guest.MAXPHYADDR < host.MAXPHYADDR.  I.e. the
alternative is to do nothing in KVM and instead update documentation and
hope KVM users are thorough readers.  Go with the conservative-but-correct
approach; worst case scenario, this restriction can be dropped if there's
a strong use case for enabling CET on hosts with allow_smaller_maxphyaddr.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/cpuid.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 499c86bd457e..b5c4cb13630c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -963,6 +963,16 @@ void kvm_set_cpu_caps(void)
 	if (!tdp_enabled)
 		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
 
+	/*
+	 * Disable support for IBT and SHSTK if KVM is configured to emulate
+	 * accesses to reserved GPAs, as KVM's emulator doesn't support IBT or
+	 * SHSTK, nor does KVM handle Shadow Stack #PFs (see above).
+	 */
+	if (allow_smaller_maxphyaddr) {
+		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+		kvm_cpu_cap_clear(X86_FEATURE_IBT);
+	}
+
 	kvm_cpu_cap_init(CPUID_7_EDX,
 		F(AVX512_4VNNIW),
 		F(AVX512_4FMAPS),
-- 
2.51.0.470.ga7dc726c21-goog
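
To make "accesses to reserved GPAs" concrete, here is a minimal standalone sketch (not KVM code) of the address range in question: bits that are reserved from the guest's point of view because they sit at or above guest.MAXPHYADDR, but that hardware still treats as legal because they sit below host.MAXPHYADDR.  The helper names pa_bits_mask() and gpa_is_guest_reserved_only() are made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask of address bits in [lo, hi).  Hypothetical helper for this sketch. */
static uint64_t pa_bits_mask(unsigned int lo, unsigned int hi)
{
	assert(lo <= hi && hi <= 63);
	return ((1ULL << hi) - 1) & ~((1ULL << lo) - 1);
}

/*
 * Return true if @gpa sets a bit that the guest must see as reserved
 * (>= guest MAXPHYADDR) but that hardware considers legal
 * (< host MAXPHYADDR).  Hypothetical helper, not a KVM function.
 */
static bool gpa_is_guest_reserved_only(uint64_t gpa, unsigned int guest_maxpa,
				       unsigned int host_maxpa)
{
	/* No gap exists unless the guest's MAXPHYADDR is the smaller one. */
	if (guest_maxpa >= host_maxpa)
		return false;

	return (gpa & pa_bits_mask(guest_maxpa, host_maxpa)) != 0;
}
```

With guest.MAXPHYADDR = 36 and host.MAXPHYADDR = 46, a GPA with bit 40 set falls into that gap, whereas a GPA with only bit 30 set does not.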
Re: [PATCH v16 27/51] KVM: x86: Disable support for IBT and SHSTK if allow_smaller_maxphyaddr is true
Posted by Xiaoyao Li 1 week, 1 day ago
On 9/20/2025 6:32 AM, Sean Christopherson wrote:
> Make IBT and SHSTK virtualization mutually exclusive with "officially"
> supporting setups with guest.MAXPHYADDR < host.MAXPHYADDR, i.e. if the
> allow_smaller_maxphyaddr module param is set.  Running a guest with a
> smaller MAXPHYADDR requires intercepting #PF, and can also trigger
> emulation of arbitrary instructions.  Intercepting and reacting to #PFs
> doesn't play nice with SHSTK, as KVM's MMU hasn't been taught to handle
> Shadow Stack accesses, and emulating arbitrary instructions doesn't play
> nice with IBT or SHSTK, as KVM's emulator doesn't handle the various side
> effects, e.g. doesn't enforce end-branch markers or model Shadow Stack
> updates.
> 
> Note, hiding IBT and SHSTK based solely on allow_smaller_maxphyaddr is
> overkill, as allow_smaller_maxphyaddr is only problematic if the guest is
> actually configured to have a smaller MAXPHYADDR.  However, KVM's ABI
> doesn't provide a way to express that IBT and SHSTK may break if enabled
> in conjunction with guest.MAXPHYADDR < host.MAXPHYADDR.  I.e. the
> alternative is to do nothing in KVM and instead update documentation and
> hope KVM users are thorough readers.  

KVM_SET_CPUID* can return an error to userspace, so KVM could return -EINVAL
when userspace sets a smaller MAXPHYADDR with SHSTK/IBT enabled.
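
A rough standalone sketch of the kind of check meant here, purely for illustration.  It is not KVM code: struct vcpu_model and check_cet_vs_maxphyaddr() are hypothetical names, and the real KVM_SET_CPUID* path would have to derive these values from the CPUID entries supplied by userspace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the vCPU model requested by userspace. */
struct vcpu_model {
	unsigned int guest_maxpa;	/* guest.MAXPHYADDR */
	bool has_shstk;			/* SHSTK advertised to the guest */
	bool has_ibt;			/* IBT advertised to the guest */
};

/*
 * Reject a vCPU model that combines a smaller guest MAXPHYADDR with
 * either CET feature.  Returns 0 on success, -EINVAL otherwise.
 */
static int check_cet_vs_maxphyaddr(const struct vcpu_model *m,
				   unsigned int host_maxpa)
{
	if (m->guest_maxpa < host_maxpa && (m->has_shstk || m->has_ibt))
		return -EINVAL;

	return 0;
}
```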

> Go with the conservative-but-correct
> approach; worst case scenario, this restriction can be dropped if there's
> a strong use case for enabling CET on hosts with allow_smaller_maxphyaddr.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/cpuid.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 499c86bd457e..b5c4cb13630c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -963,6 +963,16 @@ void kvm_set_cpu_caps(void)
>   	if (!tdp_enabled)
>   		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
>   
> +	/*
> +	 * Disable support for IBT and SHSTK if KVM is configured to emulate
> +	 * accesses to reserved GPAs, as KVM's emulator doesn't support IBT or
> +	 * SHSTK, nor does KVM handle Shadow Stack #PFs (see above).
> +	 */
> +	if (allow_smaller_maxphyaddr) {
> +		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> +		kvm_cpu_cap_clear(X86_FEATURE_IBT);
> +	}
> +
>   	kvm_cpu_cap_init(CPUID_7_EDX,
>   		F(AVX512_4VNNIW),
>   		F(AVX512_4FMAPS),
Re: [PATCH v16 27/51] KVM: x86: Disable support for IBT and SHSTK if allow_smaller_maxphyaddr is true
Posted by Sean Christopherson 1 week, 1 day ago
On Tue, Sep 23, 2025, Xiaoyao Li wrote:
> On 9/20/2025 6:32 AM, Sean Christopherson wrote:
> > Make IBT and SHSTK virtualization mutually exclusive with "officially"
> > supporting setups with guest.MAXPHYADDR < host.MAXPHYADDR, i.e. if the
> > allow_smaller_maxphyaddr module param is set.  Running a guest with a
> > smaller MAXPHYADDR requires intercepting #PF, and can also trigger
> > emulation of arbitrary instructions.  Intercepting and reacting to #PFs
> > doesn't play nice with SHSTK, as KVM's MMU hasn't been taught to handle
> > Shadow Stack accesses, and emulating arbitrary instructions doesn't play
> > nice with IBT or SHSTK, as KVM's emulator doesn't handle the various side
> > effects, e.g. doesn't enforce end-branch markers or model Shadow Stack
> > updates.
> > 
> > Note, hiding IBT and SHSTK based solely on allow_smaller_maxphyaddr is
> > overkill, as allow_smaller_maxphyaddr is only problematic if the guest is
> > actually configured to have a smaller MAXPHYADDR.  However, KVM's ABI
> > doesn't provide a way to express that IBT and SHSTK may break if enabled
> > in conjunction with guest.MAXPHYADDR < host.MAXPHYADDR.  I.e. the
> > alternative is to do nothing in KVM and instead update documentation and
> > hope KVM users are thorough readers.
> 
> KVM_SET_CPUID* can return an error to userspace, so KVM could return -EINVAL
> when userspace sets a smaller MAXPHYADDR with SHSTK/IBT enabled.

Generally speaking, I don't want to police userspace's vCPU model.  For
allow_smaller_maxphyaddr in particular, I want to actively discourage its use.
The entire concept is inherently flawed, e.g. it only works for a relatively
narrow use case.

And IIRC, Sierra Forest and future Atom-based server CPUs will be straight up
incompatible with allow_smaller_maxphyaddr due to them setting accessed/dirty
bits before generating the EPT Violation, which is what killed allow_smaller_maxphyaddr
with NPT.

I.e. allow_smaller_maxphyaddr is doomed, and I want to help it die.  If someone
really, really wants to enable CET on hosts with allow_smaller_maxphyaddr=true,
then they can send patches and we can sort out how to communicate the various
incompatibilities to userspace.
Re: [PATCH v16 27/51] KVM: x86: Disable support for IBT and SHSTK if allow_smaller_maxphyaddr is true
Posted by Sean Christopherson 1 week, 2 days ago
On Fri, Sep 19, 2025, Sean Christopherson wrote:
> Make IBT and SHSTK virtualization mutually exclusive with "officially"
> supporting setups with guest.MAXPHYADDR < host.MAXPHYADDR, i.e. if the
> allow_smaller_maxphyaddr module param is set.  Running a guest with a
> smaller MAXPHYADDR requires intercepting #PF, and can also trigger
> emulation of arbitrary instructions.  Intercepting and reacting to #PFs
> doesn't play nice with SHSTK, as KVM's MMU hasn't been taught to handle
> Shadow Stack accesses, and emulating arbitrary instructions doesn't play
> nice with IBT or SHSTK, as KVM's emulator doesn't handle the various side
> effects, e.g. doesn't enforce end-branch markers or model Shadow Stack
> updates.
> 
> Note, hiding IBT and SHSTK based solely on allow_smaller_maxphyaddr is
> overkill, as allow_smaller_maxphyaddr is only problematic if the guest is
> actually configured to have a smaller MAXPHYADDR.  However, KVM's ABI
> doesn't provide a way to express that IBT and SHSTK may break if enabled
> in conjunction with guest.MAXPHYADDR < host.MAXPHYADDR.  I.e. the
> alternative is to do nothing in KVM and instead update documentation and
> hope KVM users are thorough readers.  Go with the conservative-but-correct
> approach; worst case scenario, this restriction can be dropped if there's
> a strong use case for enabling CET on hosts with allow_smaller_maxphyaddr.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/cpuid.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 499c86bd457e..b5c4cb13630c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -963,6 +963,16 @@ void kvm_set_cpu_caps(void)
>  	if (!tdp_enabled)
>  		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
>  
> +	/*
> +	 * Disable support for IBT and SHSTK if KVM is configured to emulate
> +	 * accesses to reserved GPAs, as KVM's emulator doesn't support IBT or
> +	 * SHSTK, nor does KVM handle Shadow Stack #PFs (see above).
> +	 */
> +	if (allow_smaller_maxphyaddr) {
> +		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> +		kvm_cpu_cap_clear(X86_FEATURE_IBT);
> +	}

Ugh, testing fail.  F(IBT) is initialized in CPUID_7_EDX, so clearing IBT here,
i.e. before the kvm_cpu_cap_init(CPUID_7_EDX, ...) call, has no effect.  Move
the clearing below that call:

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b861a88083e1..d290dbc96831 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -964,16 +964,6 @@ void kvm_set_cpu_caps(void)
        if (!tdp_enabled)
                kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
 
-       /*
-        * Disable support for IBT and SHSTK if KVM is configured to emulate
-        * accesses to reserved GPAs, as KVM's emulator doesn't support IBT or
-        * SHSTK, nor does KVM handle Shadow Stack #PFs (see above).
-        */
-       if (allow_smaller_maxphyaddr) {
-               kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
-               kvm_cpu_cap_clear(X86_FEATURE_IBT);
-       }
-
        kvm_cpu_cap_init(CPUID_7_EDX,
                F(AVX512_4VNNIW),
                F(AVX512_4FMAPS),
@@ -994,6 +984,16 @@ void kvm_set_cpu_caps(void)
                F(IBT),
        );
 
+       /*
+        * Disable support for IBT and SHSTK if KVM is configured to emulate
+        * accesses to reserved GPAs, as KVM's emulator doesn't support IBT or
+        * SHSTK, nor does KVM handle Shadow Stack #PFs (see above).
+        */
+       if (allow_smaller_maxphyaddr) {
+               kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
+               kvm_cpu_cap_clear(X86_FEATURE_IBT);
+       }
+
        if (boot_cpu_has(X86_FEATURE_AMD_IBPB_RET) &&
            boot_cpu_has(X86_FEATURE_AMD_IBPB) &&
            boot_cpu_has(X86_FEATURE_AMD_IBRS))
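
For anyone wondering why the original placement was a nop for IBT, here is a minimal standalone model of the ordering problem.  It is a sketch under a stated assumption, namely that initializing a leaf assigns the whole word and therefore discards any earlier clear.  The cap_init(), cap_clear() and ibt_advertised() helpers and the leaf indices are made up for this example and are not the KVM helpers; the bit positions are the architectural ones (SHSTK is CPUID.7.0:ECX[7], IBT is CPUID.7.0:EDX[20]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { LEAF_7_ECX, LEAF_7_EDX, NR_LEAVES };	/* hypothetical leaf indices */

#define BIT_SHSTK	(1u << 7)	/* CPUID.(EAX=7,ECX=0):ECX[7] */
#define BIT_IBT		(1u << 20)	/* CPUID.(EAX=7,ECX=0):EDX[20] */

static uint32_t caps[NR_LEAVES];

/* Assumption: initializing a leaf overwrites the whole word. */
static void cap_init(int leaf, uint32_t features)
{
	caps[leaf] = features;
}

static void cap_clear(int leaf, uint32_t bit)
{
	caps[leaf] &= ~bit;
}

/* Model both placements of the clear; return whether IBT survives. */
static bool ibt_advertised(bool clear_after_init)
{
	cap_init(LEAF_7_ECX, BIT_SHSTK);

	if (!clear_after_init) {
		/* Original placement: the EDX leaf isn't initialized yet. */
		cap_clear(LEAF_7_ECX, BIT_SHSTK);
		cap_clear(LEAF_7_EDX, BIT_IBT);
	}

	cap_init(LEAF_7_EDX, BIT_IBT);	/* discards the earlier IBT clear */

	if (clear_after_init) {
		/* Fixed placement: both leaves are initialized. */
		cap_clear(LEAF_7_ECX, BIT_SHSTK);
		cap_clear(LEAF_7_EDX, BIT_IBT);
	}

	return (caps[LEAF_7_EDX] & BIT_IBT) != 0;
}
```

In this model SHSTK ends up cleared with either placement because its leaf is initialized first, which matches the fixup only mattering for IBT.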

> +
>  	kvm_cpu_cap_init(CPUID_7_EDX,
>  		F(AVX512_4VNNIW),
>  		F(AVX512_4FMAPS),
> -- 
> 2.51.0.470.ga7dc726c21-goog
>
Re: [PATCH v16 27/51] KVM: x86: Disable support for IBT and SHSTK if allow_smaller_maxphyaddr is true
Posted by Binbin Wu 1 week, 2 days ago

On 9/20/2025 6:32 AM, Sean Christopherson wrote:
> Make IBT and SHSTK virtualization mutually exclusive with "officially"
> supporting setups with guest.MAXPHYADDR < host.MAXPHYADDR, i.e. if the
> allow_smaller_maxphyaddr module param is set.  Running a guest with a
> smaller MAXPHYADDR requires intercepting #PF, and can also trigger
> emulation of arbitrary instructions.  Intercepting and reacting to #PFs
> doesn't play nice with SHSTK, as KVM's MMU hasn't been taught to handle
> Shadow Stack accesses, and emulating arbitrary instructions doesn't play
> nice with IBT or SHSTK, as KVM's emulator doesn't handle the various side
> effects, e.g. doesn't enforce end-branch markers or model Shadow Stack
> updates.
>
> Note, hiding IBT and SHSTK based solely on allow_smaller_maxphyaddr is
> overkill, as allow_smaller_maxphyaddr is only problematic if the guest is
> actually configured to have a smaller MAXPHYADDR.  However, KVM's ABI
> doesn't provide a way to express that IBT and SHSTK may break if enabled
> in conjunction with guest.MAXPHYADDR < host.MAXPHYADDR.  I.e. the
> alternative is to do nothing in KVM and instead update documentation and
> hope KVM users are thorough readers.  Go with the conservative-but-correct
> approach; worst case scenario, this restriction can be dropped if there's
> a strong use case for enabling CET on hosts with allow_smaller_maxphyaddr.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

> ---
>   arch/x86/kvm/cpuid.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 499c86bd457e..b5c4cb13630c 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -963,6 +963,16 @@ void kvm_set_cpu_caps(void)
>   	if (!tdp_enabled)
>   		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
>   
> +	/*
> +	 * Disable support for IBT and SHSTK if KVM is configured to emulate
> +	 * accesses to reserved GPAs, as KVM's emulator doesn't support IBT or
> +	 * SHSTK, nor does KVM handle Shadow Stack #PFs (see above).
> +	 */
> +	if (allow_smaller_maxphyaddr) {
> +		kvm_cpu_cap_clear(X86_FEATURE_SHSTK);
> +		kvm_cpu_cap_clear(X86_FEATURE_IBT);
> +	}
> +
>   	kvm_cpu_cap_init(CPUID_7_EDX,
>   		F(AVX512_4VNNIW),
>   		F(AVX512_4FMAPS),