[v13] KVM TDX basic feature support

[PATCH v13 098/113] KVM: TDX: Handle TDX PV map_gpa hypercall

Posted by isaku.yamahata@intel.com 2 years, 11 months ago

From: Isaku Yamahata <isaku.yamahata@intel.com>

Wire up TDX PV map_gpa hypercall to the kvm/mmu backend.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
---
 arch/x86/kvm/vmx/tdx.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 0c6386fe4684..853cfda99fed 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1216,6 +1216,24 @@ static int tdx_emulate_wrmsr(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int tdx_map_gpa(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+	gpa_t gpa = tdvmcall_a0_read(vcpu);
+	gpa_t size = tdvmcall_a1_read(vcpu);
+	gpa_t end = gpa + size;
+
+	if (!IS_ALIGNED(gpa, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE) ||
+	    end < gpa ||
+	    end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) ||
+	    kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) {
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
+		return 1;
+	}
+
+	return tdx_vp_vmcall_to_user(vcpu);
+}
+
 static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 {
 	if (tdvmcall_exit_type(vcpu))
@@ -1241,6 +1259,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 		 * guest TD doesn't make sense.  No argument check is done.
 		 */
 		return tdx_vp_vmcall_to_user(vcpu);
+	case TDG_VP_VMCALL_MAP_GPA:
+		return tdx_map_gpa(vcpu);
 	default:
 		break;
 	}
-- 
2.25.1

Re: [PATCH v13 098/113] KVM: TDX: Handle TDX PV map_gpa hypercall

Posted by Ryan Afranji 2 years, 9 months ago

+static int tdx_map_gpa(struct kvm_vcpu *vcpu)
+{
+	struct kvm *kvm = vcpu->kvm;
+	gpa_t gpa = tdvmcall_a0_read(vcpu);
+	gpa_t size = tdvmcall_a1_read(vcpu);
+	gpa_t end = gpa + size;
+
+	if (!IS_ALIGNED(gpa, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE) ||
+	    end < gpa ||
+	    end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) ||
+	    kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) {
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);

According to table 3-6 of the GHCI spec, TDG.VP.VMCALL_ALIGN_ERROR should be
returned if there is an alignment error for the size or start GPA. Right now,
TDG_VP_VMCALL_INVALID_OPERAND is being returned instead. Can this be updated?

+		return 1;
+	}
+
+	return tdx_vp_vmcall_to_user(vcpu);
+}
+

Re: [PATCH v13 098/113] KVM: TDX: Handle TDX PV map_gpa hypercall

Posted by Vishal Annapurve 2 years, 9 months ago

> +static int tdx_map_gpa(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm *kvm = vcpu->kvm;
> +	gpa_t gpa = tdvmcall_a0_read(vcpu);
> +	gpa_t size = tdvmcall_a1_read(vcpu);
> +	gpa_t end = gpa + size;
> +
> +	if (!IS_ALIGNED(gpa, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE) ||
> +	    end < gpa ||
> +	    end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) ||
> +	    kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) {
> +		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
> +		return 1;
> +	}
> +
> +	return tdx_vp_vmcall_to_user(vcpu);

This will result into exits to userspace for MMIO regions as well. Does it make
sense to only exit to userspace for guest physical memory regions backed by
memslots?

> +}
> +

Re: [PATCH v13 098/113] KVM: TDX: Handle TDX PV map_gpa hypercall

Posted by Zhi Wang 2 years, 9 months ago

On Tue, 18 Apr 2023 19:09:04 +0000
Vishal Annapurve <vannapurve@google.com> wrote:

> > +static int tdx_map_gpa(struct kvm_vcpu *vcpu)
> > +{
> > +	struct kvm *kvm = vcpu->kvm;
> > +	gpa_t gpa = tdvmcall_a0_read(vcpu);
> > +	gpa_t size = tdvmcall_a1_read(vcpu);
> > +	gpa_t end = gpa + size;
> > +
> > +	if (!IS_ALIGNED(gpa, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE) ||
> > +	    end < gpa ||
> > +	    end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) ||
> > +	    kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) {
> > +		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
> > +		return 1;
> > +	}
> > +
> > +	return tdx_vp_vmcall_to_user(vcpu);
> 
> This will result into exits to userspace for MMIO regions as well. Does it make
> sense to only exit to userspace for guest physical memory regions backed by
> memslots?
> 
I think this is necessary as when passing a PCI device to a TD, the guest needs to convert a MMIO region from private to shared, which is not backed by memslots.
> > +}
> > +

Re: [PATCH v13 098/113] KVM: TDX: Handle TDX PV map_gpa hypercall

Posted by Vishal Annapurve 2 years, 9 months ago

On Wed, Apr 19, 2023 at 3:38 AM Zhi Wang <zhi.wang.linux@gmail.com> wrote:
>
> On Tue, 18 Apr 2023 19:09:04 +0000
> Vishal Annapurve <vannapurve@google.com> wrote:
>
> > > +static int tdx_map_gpa(struct kvm_vcpu *vcpu)
> > > +{
> > > +   struct kvm *kvm = vcpu->kvm;
> > > +   gpa_t gpa = tdvmcall_a0_read(vcpu);
> > > +   gpa_t size = tdvmcall_a1_read(vcpu);
> > > +   gpa_t end = gpa + size;
> > > +
> > > +   if (!IS_ALIGNED(gpa, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE) ||
> > > +       end < gpa ||
> > > +       end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) ||
> > > +       kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) {
> > > +           tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
> > > +           return 1;
> > > +   }
> > > +
> > > +   return tdx_vp_vmcall_to_user(vcpu);
> >
> > This will result into exits to userspace for MMIO regions as well. Does it make
> > sense to only exit to userspace for guest physical memory regions backed by
> > memslots?
> >
> I think this is necessary as when passing a PCI device to a TD, the guest needs to convert a MMIO region from private to shared, which is not backed by memslots.

KVM could internally handle conversion of regions not backed by
private memslots instead of exiting to userspace. This could save time
during guest boot process.

What would be the expectations from userspace for handling mapgpa
operations on MMIO regions? Is it to just convert memory attributes?

> > > +}
> > > +
>

Re: [PATCH v13 098/113] KVM: TDX: Handle TDX PV map_gpa hypercall

Posted by Sean Christopherson 2 years, 9 months ago

On Wed, Apr 26, 2023, Vishal Annapurve wrote:
> On Wed, Apr 19, 2023 at 3:38 AM Zhi Wang <zhi.wang.linux@gmail.com> wrote:
> >
> > On Tue, 18 Apr 2023 19:09:04 +0000
> > Vishal Annapurve <vannapurve@google.com> wrote:
> >
> > > > +static int tdx_map_gpa(struct kvm_vcpu *vcpu)
> > > > +{
> > > > +   struct kvm *kvm = vcpu->kvm;
> > > > +   gpa_t gpa = tdvmcall_a0_read(vcpu);
> > > > +   gpa_t size = tdvmcall_a1_read(vcpu);
> > > > +   gpa_t end = gpa + size;
> > > > +
> > > > +   if (!IS_ALIGNED(gpa, PAGE_SIZE) || !IS_ALIGNED(size, PAGE_SIZE) ||
> > > > +       end < gpa ||
> > > > +       end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) ||
> > > > +       kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) {
> > > > +           tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
> > > > +           return 1;
> > > > +   }
> > > > +
> > > > +   return tdx_vp_vmcall_to_user(vcpu);
> > >
> > > This will result into exits to userspace for MMIO regions as well. Does it make
> > > sense to only exit to userspace for guest physical memory regions backed by
> > > memslots?

No, KVM should exit always, e.g. userspace _could_ choose to create a private
memslot in response to the guest's request.

> > I think this is necessary as when passing a PCI device to a TD, the guest
> > needs to convert a MMIO region from private to shared, which is not backed
> > by memslots.

This isn't entirely accurate.  If you're talking about emulated MMIO, then there
is no memslot.  But the "passing a PCI device" makes it sound like you're talking
about device passthrough, in which case there is a memslot that points at an actual
MMIO region in the host platform.

In either case, conversions should be unnecessary as MMIO regions should not be
enumerated to the guest as supporting encryption, i.e. the guest should know from
time zero that those regions are shared.  If we end up with something like Hyper-V's
SVSM-based paravisor, then there might be private emulated MMIO, but such a setup
would also come with its own brand of enlightment in the guest.

> KVM could internally handle conversion of regions not backed by

No, KVM should never internally handle conversions, at least not in the initial
implementation.  And if KVM ever does go down this route, it needs dedicated
support in KVM's uAPI since userspace needs to be kept in the loop, i.e. needs
to opt-in and be notified of any conversions.