From: Yan Zhao <yan.y.zhao@intel.com>
Always honor guest PAT in KVM-managed EPTs on TDX enabled platforms by
making self-snoop feature a hard dependency for TDX and making quirk
KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT not a valid quirk once TDX is enabled.
The quirk KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT only affects memory type of
KVM-managed EPTs. For the TDX-module-managed private EPT, memory type is
always forced to WB now.
Honoring guest PAT in KVM-managed EPTs ensures KVM does not invoke
kvm_zap_gfn_range() when attaching/detaching non-coherent DMA devices;
this would cause mirrored EPTs for TDs to be zapped, as well as incorrect
zapping of the private EPT that is managed by the TDX module.
As a new platform, TDX always comes with self-snoop feature supported and has
no worry to break old not-well-written yet unmodifiable guests. So, simply
force-disable the KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT quirk for TDX VMs.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Message-ID: <20250224071039.31511-1-yan.y.zhao@intel.com>
[Use disabled_quirks instead of supported_quirks. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
arch/x86/kvm/vmx/tdx.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index b6f6f6e2f02e..4450fd99cb4c 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -624,6 +624,7 @@ int tdx_vm_init(struct kvm *kvm)
kvm->arch.has_protected_state = true;
kvm->arch.has_private_mem = true;
+ kvm->arch.disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
/*
* Because guest TD is protected, VMM can't parse the instruction in TD.
@@ -3470,6 +3471,11 @@ int __init tdx_bringup(void)
goto success_disable_tdx;
}
+ if (!cpu_feature_enabled(X86_FEATURE_SELFSNOOP)) {
+ pr_err("Self-snoop is required for TDX\n");
+ goto success_disable_tdx;
+ }
+
if (!cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM)) {
pr_err("tdx: no TDX private KeyIDs available\n");
goto success_disable_tdx;
--
2.43.5
On Sat, Mar 01, 2025 at 02:34:28AM -0500, Paolo Bonzini wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
>
> Always honor guest PAT in KVM-managed EPTs on TDX enabled platforms by
> making self-snoop feature a hard dependency for TDX and making quirk
> KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT not a valid quirk once TDX is enabled.
>
> The quirk KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT only affects memory type of
> KVM-managed EPTs. For the TDX-module-managed private EPT, memory type is
> always forced to WB now.
>
> Honoring guest PAT in KVM-managed EPTs ensures KVM does not invoke
> kvm_zap_gfn_range() when attaching/detaching non-coherent DMA devices;
> this would cause mirrored EPTs for TDs to be zapped, as well as incorrect
> zapping of the private EPT that is managed by the TDX module.
>
> As a new platform, TDX always comes with self-snoop feature supported and has
> no worry to break old not-well-written yet unmodifiable guests. So, simply
> force-disable the KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT quirk for TDX VMs.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20250224071039.31511-1-yan.y.zhao@intel.com>
> [Use disabled_quirks instead of supported_quirks. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/kvm/vmx/tdx.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index b6f6f6e2f02e..4450fd99cb4c 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -624,6 +624,7 @@ int tdx_vm_init(struct kvm *kvm)
>
> kvm->arch.has_protected_state = true;
> kvm->arch.has_private_mem = true;
> + kvm->arch.disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
Though the quirk is disabled by default in KVM in tdx_vm_init() for TDs, the
kvm->arch.disabled_quirks can be overwritten by a userspace specified value in
kvm_vm_ioctl_enable_cap().
"kvm->arch.disabled_quirks = cap->args[0] & kvm_caps.supported_quirks;"
So, when an old userspace tries to disable other quirks on this new KVM, it may
accidentally turn KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT into enabled for TDs, which
would cause SEPT being zapped when (de)attaching non-coherent devices.
Could we force KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT to be disabled for TDs?
e.g.
tdx_vm_init
kvm->arch.always_disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
{
WARN_ON_ONCE(kvm->arch.always_disabled_quirk & kvm_caps.force_enabled_quirks);
u64 disabled_quirks = kvm->arch.always_disabled_quirk | kvm->arch.disabled_quirks;
return !(disabled_quirks & quirk) |
(kvm_caps.force_enabled_quirks & quirk);
}
>
> /*
> * Because guest TD is protected, VMM can't parse the instruction in TD.
> @@ -3470,6 +3471,11 @@ int __init tdx_bringup(void)
> goto success_disable_tdx;
> }
>
> + if (!cpu_feature_enabled(X86_FEATURE_SELFSNOOP)) {
> + pr_err("Self-snoop is required for TDX\n");
> + goto success_disable_tdx;
> + }
> +
> if (!cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM)) {
> pr_err("tdx: no TDX private KeyIDs available\n");
> goto success_disable_tdx;
> --
> 2.43.5
>
On 3/3/25 02:30, Yan Zhao wrote:
>> kvm->arch.has_protected_state = true;
>> kvm->arch.has_private_mem = true;
>> + kvm->arch.disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
> Though the quirk is disabled by default in KVM in tdx_vm_init() for TDs, the
> kvm->arch.disabled_quirks can be overwritten by a userspace specified value in
> kvm_vm_ioctl_enable_cap().
> "kvm->arch.disabled_quirks = cap->args[0] & kvm_caps.supported_quirks;"
>
> So, when an old userspace tries to disable other quirks on this new KVM, it may
> accidentally turn KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT into enabled for TDs, which
> would cause SEPT being zapped when (de)attaching non-coherent devices.
Yeah, sorry about that - Xiaoyao also pointed it out and I should have
noticed it---or marked the patches as RFC which they were.
> Could we force KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT to be disabled for TDs?
>
> e.g.
>
> tdx_vm_init
> kvm->arch.always_disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
>
> static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
> {
> WARN_ON_ONCE(kvm->arch.always_disabled_quirk & kvm_caps.force_enabled_quirks);
>
> u64 disabled_quirks = kvm->arch.always_disabled_quirk | kvm->arch.disabled_quirks;
> return !(disabled_quirks & quirk) |
> (kvm_caps.force_enabled_quirks & quirk);
> }
We can change KVM_ENABLE_CAP(KVM_X86_DISABLE_QUIRKS), as well as
QUIRKS2, to use "|=" instead of "=".
While this is technically a change in the API, the current
implementation is just awful and I hope that no one is relying on it!
This way, the "always_disabled_quirks" are not needed.
If the "|=" idea doesn't work out I agree that
kvm->arch.always_disabled_quirk is needed.
Sending v3 shortly...
Paolo
On Mon, Mar 03, 2025 at 05:14:55PM +0100, Paolo Bonzini wrote:
> On 3/3/25 02:30, Yan Zhao wrote:
> > > kvm->arch.has_protected_state = true;
> > > kvm->arch.has_private_mem = true;
> > > + kvm->arch.disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
> > Though the quirk is disabled by default in KVM in tdx_vm_init() for TDs, the
> > kvm->arch.disabled_quirks can be overwritten by a userspace specified value in
> > kvm_vm_ioctl_enable_cap().
> > "kvm->arch.disabled_quirks = cap->args[0] & kvm_caps.supported_quirks;"
> >
> > So, when an old userspace tries to disable other quirks on this new KVM, it may
> > accidentally turn KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT into enabled for TDs, which
> > would cause SEPT being zapped when (de)attaching non-coherent devices.
>
> Yeah, sorry about that - Xiaoyao also pointed it out and I should have
> noticed it---or marked the patches as RFC which they were.
>
> > Could we force KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT to be disabled for TDs?
> >
> > e.g.
> >
> > tdx_vm_init
> > kvm->arch.always_disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
> >
> > static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
> > {
> > WARN_ON_ONCE(kvm->arch.always_disabled_quirk & kvm_caps.force_enabled_quirks);
> >
> > u64 disabled_quirks = kvm->arch.always_disabled_quirk | kvm->arch.disabled_quirks;
> > return !(disabled_quirks & quirk) |
> > (kvm_caps.force_enabled_quirks & quirk);
> > }
>
> We can change KVM_ENABLE_CAP(KVM_X86_DISABLE_QUIRKS), as well as QUIRKS2, to
> use "|=" instead of "=".
>
> While this is technically a change in the API, the current implementation is
> just awful and I hope that no one is relying on it! This way, the
I think QEMU is not relying on it.
I considered making this change while writing the quirk SLOT_ZAP_ALL but gave it
up, thinking it might allow users to re-enable a quirk later on.
I'm glad you also see it as a bug:)
> "always_disabled_quirks" are not needed.
>
> If the "|=" idea doesn't work out I agree that
> kvm->arch.always_disabled_quirk is needed.
>
> Sending v3 shortly...
Thanks!
On 3/1/2025 3:34 PM, Paolo Bonzini wrote:
> From: Yan Zhao <yan.y.zhao@intel.com>
>
> Always honor guest PAT in KVM-managed EPTs on TDX enabled platforms by
> making self-snoop feature a hard dependency for TDX and making quirk
> KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT not a valid quirk once TDX is enabled.
>
> The quirk KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT only affects memory type of
> KVM-managed EPTs. For the TDX-module-managed private EPT, memory type is
> always forced to WB now.
>
> Honoring guest PAT in KVM-managed EPTs ensures KVM does not invoke
> kvm_zap_gfn_range() when attaching/detaching non-coherent DMA devices;
> this would cause mirrored EPTs for TDs to be zapped, as well as incorrect
> zapping of the private EPT that is managed by the TDX module.
>
> As a new platform, TDX always comes with self-snoop feature supported and has
> no worry to break old not-well-written yet unmodifiable guests. So, simply
> force-disable the KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT quirk for TDX VMs.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> Message-ID: <20250224071039.31511-1-yan.y.zhao@intel.com>
> [Use disabled_quirks instead of supported_quirks. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/x86/kvm/vmx/tdx.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index b6f6f6e2f02e..4450fd99cb4c 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -624,6 +624,7 @@ int tdx_vm_init(struct kvm *kvm)
>
> kvm->arch.has_protected_state = true;
> kvm->arch.has_private_mem = true;
> + kvm->arch.disabled_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
This doesn't present userspace from dropping the
KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT bit by updating
kvm->arch.disabled_quirk via KVM_CAP_DISABLE_QUIRKS.
I think we can make inapplicable_quirks per VM in Patch 1 and set
kvm->arch.inapplicable_quirks |= KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT;
for TDX VMs.
>
> /*
> * Because guest TD is protected, VMM can't parse the instruction in TD.
> @@ -3470,6 +3471,11 @@ int __init tdx_bringup(void)
> goto success_disable_tdx;
> }
>
> + if (!cpu_feature_enabled(X86_FEATURE_SELFSNOOP)) {
> + pr_err("Self-snoop is required for TDX\n");
> + goto success_disable_tdx;
> + }
> +
> if (!cpu_feature_enabled(X86_FEATURE_TDX_HOST_PLATFORM)) {
> pr_err("tdx: no TDX private KeyIDs available\n");
> goto success_disable_tdx;
© 2016 - 2026 Red Hat, Inc.