Disable KVM's "prove #VE" support by default, as it provides no functional
value, and even its sanity checking benefits are relatively limited. I.e.
it should be fully opt-in even on debug kernels, especially since EPT
Violation #VE suppression appears to be buggy on some CPUs.
Opportunistically add a line in the help text to make it abundantly clear
that KVM_INTEL_PROVE_VE should never be enabled in a production
environment.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/Kconfig | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 2a7f69abcac3..3468efc4be55 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -97,15 +97,15 @@ config KVM_INTEL
config KVM_INTEL_PROVE_VE
bool "Check that guests do not receive #VE exceptions"
- default KVM_PROVE_MMU || DEBUG_KERNEL
- depends on KVM_INTEL
+ depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
help
-
Checks that KVM's page table management code will not incorrectly
let guests receive a virtualization exception. Virtualization
exceptions will be trapped by the hypervisor rather than injected
in the guest.
+ This should never be enabled in a production environment.
+
If unsure, say N.
config X86_SGX_KVM
--
2.45.0.215.g3402c0e53f-goog
On Sat, May 18, 2024 at 2:04 AM Sean Christopherson <seanjc@google.com> wrote:
> Disable KVM's "prove #VE" support by default, as it provides no functional
> value, and even its sanity checking benefits are relatively limited. I.e.
> it should be fully opt-in even on debug kernels, especially since EPT
> Violation #VE suppression appears to be buggy on some CPUs.
More #VE trapping than #VE suppression.
I wouldn't go so far as making it *depend* on DEBUG_KERNEL. EXPERT
plus the scary help message is good enough.
What about this:
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index b6831e17ec31..2864608c7016 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -97,14 +97,15 @@ config KVM_INTEL
config KVM_INTEL_PROVE_VE
bool "Check that guests do not receive #VE exceptions"
- depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
+ depends on KVM_INTEL && EXPERT
help
Checks that KVM's page table management code will not incorrectly
let guests receive a virtualization exception. Virtualization
exceptions will be trapped by the hypervisor rather than injected
in the guest.
- This should never be enabled in a production environment.
+ Note that #VE trapping appears to be buggy on some CPUs.
+ This should never be enabled in a production environment!
If unsure, say N.
Paolo
> Opportunistically add a line in the help text to make it abundantly clear
> that KVM_INTEL_PROVE_VE should never be enabled in a production
> environment.
>
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> arch/x86/kvm/Kconfig | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index 2a7f69abcac3..3468efc4be55 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -97,15 +97,15 @@ config KVM_INTEL
>
> config KVM_INTEL_PROVE_VE
> bool "Check that guests do not receive #VE exceptions"
> - default KVM_PROVE_MMU || DEBUG_KERNEL
> - depends on KVM_INTEL
> + depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
> help
> -
> Checks that KVM's page table management code will not incorrectly
> let guests receive a virtualization exception. Virtualization
> exceptions will be trapped by the hypervisor rather than injected
> in the guest.
>
> + This should never be enabled in a production environment.
> +
> If unsure, say N.
>
> config X86_SGX_KVM
> --
> 2.45.0.215.g3402c0e53f-goog
>
On Tue, May 21, 2024, Paolo Bonzini wrote:
> On Sat, May 18, 2024 at 2:04 AM Sean Christopherson <seanjc@google.com> wrote:
> > Disable KVM's "prove #VE" support by default, as it provides no functional
> > value, and even its sanity checking benefits are relatively limited. I.e.
> > it should be fully opt-in even on debug kernels, especially since EPT
> > Violation #VE suppression appears to be buggy on some CPUs.
>
> More #VE trapping than #VE suppression.
>
> I wouldn't go so far as making it *depend* on DEBUG_KERNEL. EXPERT
> plus the scary help message is good enough.
Works for me.
>
> What about this:
>
> diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
> index b6831e17ec31..2864608c7016 100644
> --- a/arch/x86/kvm/Kconfig
> +++ b/arch/x86/kvm/Kconfig
> @@ -97,14 +97,15 @@ config KVM_INTEL
>
> config KVM_INTEL_PROVE_VE
> bool "Check that guests do not receive #VE exceptions"
> - depends on KVM_INTEL && DEBUG_KERNEL && EXPERT
> + depends on KVM_INTEL && EXPERT
> help
> Checks that KVM's page table management code will not incorrectly
> let guests receive a virtualization exception. Virtualization
> exceptions will be trapped by the hypervisor rather than injected
> in the guest.
>
> - This should never be enabled in a production environment.
> + Note that #VE trapping appears to be buggy on some CPUs.
I see where you're coming from, but I don't think "trapping" is much better,
e.g. it suggests there's something broken with the interception of #VEs. Ah,
the entire help text is weird.
This?
config KVM_INTEL_PROVE_VE
bool "Verify guests do not receive unexpected EPT Violation #VEs"
depends on KVM_INTEL && EXPERT
help
Enable EPT Violation #VEs (when supported) for all VMs, to verify
that KVM's EPT management code will not incorrectly result in a #VE
(KVM is supposed to suppress #VEs by default). Unexpected #VEs will
be intercepted by KVM and will trigger a WARN, but are otherwise
transparent to the guest.
Note, EPT Violation #VE support appears to be buggy on some CPUs.
This should never be enabled in a production environment!
If unsure, say N.
On Tue, May 21, 2024 at 8:18 PM Sean Christopherson <seanjc@google.com> wrote:
> > - This should never be enabled in a production environment.
> > + Note that #VE trapping appears to be buggy on some CPUs.
>
> I see where you're coming from, but I don't think "trapping" is much better,
> e.g. it suggests there's something broken with the interception of #VEs. Ah,
> the entire help text is weird.
Yeah, I didn't want to say #VE is broken altogether - interception is
where we saw issues, and #VE is used in production as far as I know
(not just by TDX; at least Xen and maybe Hyper-V use it for
anti-malware purposes?).
Maybe "Note: there appear to be bugs in some CPUs that will trigger
the WARN, in particular with eptad=0 and/or nested virtualization"
covers all bases.
Paolo
>
> This?
>
> config KVM_INTEL_PROVE_VE
> bool "Verify guests do not receive unexpected EPT Violation #VEs"
> depends on KVM_INTEL && EXPERT
> help
> Enable EPT Violation #VEs (when supported) for all VMs, to verify
> that KVM's EPT management code will not incorrectly result in a #VE
> (KVM is supposed to suppress #VEs by default). Unexpected #VEs will
> be intercepted by KVM and will trigger a WARN, but are otherwise
> transparent to the guest.
>
> Note, EPT Violation #VE support appears to be buggy on some CPUs.
>
> This should never be enabled in a production environment!
>
> If unsure, say N.
>
On Tue, May 21, 2024, Paolo Bonzini wrote:
> On Tue, May 21, 2024 at 8:18 PM Sean Christopherson <seanjc@google.com> wrote:
> > > - This should never be enabled in a production environment.
> > > + Note that #VE trapping appears to be buggy on some CPUs.
> >
> > I see where you're coming from, but I don't think "trapping" is much better,
> > e.g. it suggests there's something broken with the interception of #VEs. Ah,
> > the entire help text is weird.
>
> Yeah, I didn't want to say #VE is broken altogether -
Ah, yeah, good call. The #VE isn't broken per se, just spurious/unexpected.
> interception is where we saw issues,
It's not an issue with interception, disabling #VE intercepts results in the #VE
being delivered to the guest.
Test suite: ept_access_test_not_present
PTE[4] @ 109fff8 = 9fed0007
PTE[3] @ 9fed0ff0 = 9fed1007
PTE[2] @ 9fed1000 = 9fed2007
VA PTE @ 9fed2000 = 8000000007
Created EPT @ 9feca008 = 11d2007
Created EPT @ 11d2000 = 11d3007
Created EPT @ 11d3000 = 11d4007
L1 hva = 40000000, hpa = 40000000, L2 gva = ffffffff80000000, gpa = 8000000000
Unhandled exception 8 #DF at ip 0000000000410d39
error_code=0000 rflags=00010097 cs=00000008
rax=ffffffff80000000 rcx=0000000000000000 rdx=0000000000000000 rbx=0000000000000000
rbp=000000009fec6fe0 rsi=0000000000000000 rdi=0000000000000000
r8=0000000000000000 r9=0000000000000000 r10=0000000000000000 r11=0000000000000000
r12=ffffffff80000008 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000
cr0=0000000080010031 cr2=0000000000000000 cr3=000000000109f000 cr4=0000000000002020
cr8=0000000000000000
STACK: @410d39 40144a 4002dd
> and #VE is used in production as far as I know (not just by TDX; at least Xen
> and maybe Hyper-V use it for anti-malware purposes?).
Hmm, maybe a spurious #VE is benign? Or it really is limited to A/D bits being
disabled? Not that us speculating is going to change anything :-)
> Maybe "Note: there appear to be bugs in some CPUs that will trigger
> the WARN, in particular with eptad=0 and/or nested virtualization"
> covers all bases.
Works for me. Maybe tweak it slightly to explain why the WARN is triggered?
Note, some CPUs appear to generate spurious EPT Violations #VEs that trigger
KVM's WARN, in particular with eptad=0 and/or nested virtualization.
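For reference, folding the reworded note into the entry proposed earlier in the thread would give roughly the following. This is only a sketch assembled from the two messages above, not the committed text: the indentation is the usual Kconfig convention (assumed here, since the quoted snippets lost theirs), spelling is normalized, and the final wording may differ.

```
# Sketch only: combined from the proposals in this thread, not the merged entry.
config KVM_INTEL_PROVE_VE
	bool "Verify guests do not receive unexpected EPT Violation #VEs"
	depends on KVM_INTEL && EXPERT
	help
	  Enable EPT Violation #VEs (when supported) for all VMs, to verify
	  that KVM's EPT management code will not incorrectly result in a #VE
	  (KVM is supposed to suppress #VEs by default).  Unexpected #VEs will
	  be intercepted by KVM and will trigger a WARN, but are otherwise
	  transparent to the guest.

	  Note, some CPUs appear to generate spurious EPT Violation #VEs that
	  trigger KVM's WARN, in particular with eptad=0 and/or nested
	  virtualization.

	  This should never be enabled in a production environment!

	  If unsure, say N.
```

With no "default" line the option stays off unless explicitly selected, and the EXPERT dependency hides it from ordinary configurations, which matches the opt-in intent discussed above.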