[RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management

Rick Edgecombe posted 6 patches 1 year, 2 months ago
There is a newer version of this series
[RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Rick Edgecombe 1 year, 2 months ago
Intel TDX protects guest VMs from malicious host and certain physical
attacks. Pre-TDX Intel hardware has support for a memory encryption
architecture called MK-TME, which repurposes several high bits of
physical address as "KeyID". TDX ends up with reserving a sub-range of
MK-TME KeyIDs as "TDX private KeyIDs".

Like MK-TME, these KeyIDs can be associated with an ephemeral key. For TDX
this association is done by the TDX module. It also has its own tracking
for which KeyIDs are in use. To do this ephemeral key setup and manipulate
the TDX module's internal tracking, KVM will use the following SEAMCALLs:
 TDH.MNG.KEY.CONFIG: Mark the KeyID as in use, and initialize its
                     ephemeral key.
 TDH.MNG.KEY.FREEID: Mark the KeyID as not in use.

These SEAMCALLs both operate on TDR structures, which are setup using the
previously added TDH.MNG.CREATE SEAMCALL. KVM's use of these operations
will go like:
 - tdx_guest_keyid_alloc()
 - Initialize TD and TDR page with TDH.MNG.CREATE (not yet-added), passing
   KeyID
 - TDH.MNG.KEY.CONFIG to initialize the key
 - TD runs, teardown is started
 - TDH.MNG.KEY.FREEID
 - tdx_guest_keyid_free()

Don't try to combine the tdx_guest_keyid_alloc() and TDH.MNG.KEY.CONFIG
operations because TDH.MNG.CREATE and some locking need to be done in the
middle. Don't combine TDH.MNG.KEY.FREEID and tdx_guest_keyid_free() so they
are symmetrical with the creation path.

So implement tdh_mng_key_config() and tdh_mng_key_freeid() as separate
functions than tdx_guest_keyid_alloc() and tdx_guest_keyid_free().

The TDX module provides SEAMCALLs to hand pages to the TDX module for
storing TDX controlled state. SEAMCALLs that operate on this state are
directed to the appropriate TD VM using references to the pages originally
provided for managing the TD's state. So the host kernel needs to track
these pages, both as an ID for specifying which TD to operate on, and to
allow them to be eventually reclaimed. The TD VM associated pages are
called TDR (Trust Domain Root) and TDCS (Trust Domain Control Structure).

Introduce "struct tdx_td" for holding references to pages provided to the
TDX module for this TD VM associated state. Don't plan for any TD
associated state that is controlled by KVM to live in this struct. Only
expect it to hold data for concepts specific to the TDX architecture, for
which there can't already be preexisting storage for in KVM.

Add both the TDR page and an array of TDCS pages, even though the SEAMCALL
wrappers will only need to know about the TDR pages for directing the
SEAMCALLs to the right TD. Adding the TDCS pages to this struct will let
all of the TD VM associated pages handed to the TDX module be tracked in
one location. For a type to specify physical pages, use KVM's hpa_t type.
Do this for KVM's benefit This is the common type used to hold physical
addresses in KVM, so will make interoperability easier.

Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>
Reviewed-by: Yuan Yao <yuan.yao@intel.com>
---
SEAMCALL RFC:
 - Introduce struct tdx_td to use instead of raw TDR u64

uAPI breakout v2:
 - Change to use 'u64' as function parameter to prepare to move
   SEAMCALL wrappers to arch/x86. (Kai)
 - Split to separate patch
 - Move SEAMCALL wrappers from KVM to x86 core;
 - Move TDH_xx macros from KVM to x86 core;
 - Re-write log

uAPI breakout v1:
 - Make argument to C wrapper function struct kvm_tdx * or
   struct vcpu_tdx * .(Sean)
 - Drop unused helpers (Kai)
 - Fix bisectability issues in headers (Kai)
 - Updates from seamcall overhaul (Kai)

v19:
 - Update the commit message to match the patch by Yuan
 - Use seamcall() and seamcall_ret() by paolo

v18:
 - removed stub functions for __seamcall{,_ret}()
 - Added Reviewed-by Binbin
 - Make tdx_seamcall() use struct tdx_module_args instead of taking
   each inputs.
---
 arch/x86/include/asm/tdx.h  |  9 +++++++++
 arch/x86/virt/vmx/tdx/tdx.c | 22 ++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.h | 16 +++++++++-------
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index d33e46d53d59..ebee4260545f 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -34,6 +34,7 @@
 
 #include <uapi/asm/mce.h>
 #include "tdx_global_metadata.h"
+#include <linux/kvm_types.h>
 
 /*
  * Used by the #VE exception handler to gather the #VE exception
@@ -121,6 +122,14 @@ const struct tdx_sys_info *tdx_get_sysinfo(void);
 
 int tdx_guest_keyid_alloc(void);
 void tdx_guest_keyid_free(unsigned int keyid);
+
+struct tdx_td {
+	hpa_t tdr;
+	hpa_t *tdcs;
+};
+
+u64 tdh_mng_key_config(struct tdx_td *td);
+u64 tdh_mng_key_freeid(struct tdx_td *td);
 #else
 static inline void tdx_init(void) { }
 static inline int tdx_cpu_enable(void) { return -ENODEV; }
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index b883c1a4b002..20eb756b41de 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -1562,3 +1562,25 @@ void tdx_guest_keyid_free(unsigned int keyid)
 	ida_free(&tdx_guest_keyid_pool, keyid);
 }
 EXPORT_SYMBOL_GPL(tdx_guest_keyid_free);
+
+u64 tdh_mng_key_config(struct tdx_td *td)
+{
+	struct tdx_module_args args = {
+		.rcx = td->tdr,
+	};
+
+	return seamcall(TDH_MNG_KEY_CONFIG, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_key_config);
+
+
+u64 tdh_mng_key_freeid(struct tdx_td *td)
+{
+	struct tdx_module_args args = {
+		.rcx = td->tdr,
+	};
+
+	return seamcall(TDH_MNG_KEY_FREEID, &args);
+}
+EXPORT_SYMBOL_GPL(tdh_mng_key_freeid);
+
diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 9b708a8fb568..95002e7ff4c5 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -17,13 +17,15 @@
 /*
  * TDX module SEAMCALL leaf functions
  */
-#define TDH_PHYMEM_PAGE_RDMD	24
-#define TDH_SYS_KEY_CONFIG	31
-#define TDH_SYS_INIT		33
-#define TDH_SYS_RD		34
-#define TDH_SYS_LP_INIT		35
-#define TDH_SYS_TDMR_INIT	36
-#define TDH_SYS_CONFIG		45
+#define TDH_MNG_KEY_CONFIG		8
+#define TDH_MNG_KEY_FREEID		20
+#define TDH_PHYMEM_PAGE_RDMD		24
+#define TDH_SYS_KEY_CONFIG		31
+#define TDH_SYS_INIT			33
+#define TDH_SYS_RD			34
+#define TDH_SYS_LP_INIT			35
+#define TDH_SYS_TDMR_INIT		36
+#define TDH_SYS_CONFIG			45
 
 /* TDX page types */
 #define	PT_NDA		0x0
-- 
2.47.0
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Dave Hansen 1 year, 2 months ago
On 11/15/24 12:20, Rick Edgecombe wrote:
> +struct tdx_td {
> +	hpa_t tdr;
> +	hpa_t *tdcs;
> +};

This is a step in the right direction because it gives the wrappers some
more type safety.

But an hpa_t is _barely_ better than a u64.  If the 'tdr' is a page,
then it needs to be _stored_ as a page:

	struct page *tdr_page;

Also, please don't forget to spell these things out:

	/* TD root structure: */
	struct page *tdr_page;

And the tdcs is an array of pages, right?  So it should be:

	struct page **tdcs_pages;

Or heck, I _think_ it can theoretically be defined as a variable-length
array:

	struct page *tdcs_pages[];

and use the helpers that we have for that.

Putting it all together, you would have this:

struct tdx_td {
	/* TD root structure: */
	struct page *tdr_page;

	int tdcs_nr_pages;
	/* TD control structure: */
	struct page *tdcs_pages[];
};

That's *MUCH* harder to misuse.  It's 100% obvious that you have a
single page, plus a variable-length array of pages.  This is all from
just looking at the structure definition.

You know that 'tdr' is not just some random physical address.  It's a
whole physical page.  It's page-aligned.  It was allocated, from the
allocator.  It doesn't point to special memory.

Ditto for "hpa_t *tdcs".  It's not obvious from the data structure that
it's an array or if it's an array how it got allocated or how large it is.
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Sean Christopherson 1 year, 2 months ago
On Fri, Nov 22, 2024, Dave Hansen wrote:
> On 11/15/24 12:20, Rick Edgecombe wrote:
> > +struct tdx_td {
> > +	hpa_t tdr;
> > +	hpa_t *tdcs;
> > +};
> 
> This is a step in the right direction because it gives the wrappers some
> more type safety.
> 
> But an hpa_t is _barely_ better than a u64.  If the 'tdr' is a page,
> then it needs to be _stored_ as a page:
> 
> 	struct page *tdr_page;
> 
> Also, please don't forget to spell these things out:
> 
> 	/* TD root structure: */
> 	struct page *tdr_page;
> 
> And the tdcs is an array of pages, right?  So it should be:
> 
> 	struct page **tdcs_pages;
> 
> Or heck, I _think_ it can theoretically be defined as a variable-length
> array:
> 
> 	struct page *tdcs_pages[];
> 
> and use the helpers that we have for that.
> 
> Putting it all together, you would have this:
> 
> struct tdx_td {
> 	/* TD root structure: */
> 	struct page *tdr_page;
> 
> 	int tdcs_nr_pages;
> 	/* TD control structure: */
> 	struct page *tdcs_pages[];
> };
> 
> That's *MUCH* harder to misuse.  It's 100% obvious that you have a
> single page, plus a variable-length array of pages.  This is all from
> just looking at the structure definition.

I don't know the full context, but working with "struct page" is a pain when every
user just wants the physical address.  KVM SVM had a few cases where pointers were
tracked as "struct page", and it was generally unpleasant to read and work with.

I also don't like conflating the kernel's "struct page" with the architecture's
definition of a 4KiB page.

> You know that 'tdr' is not just some random physical address.  It's a
> whole physical page.  It's page-aligned.  It was allocated, from the
> allocator.  It doesn't point to special memory.

Oh, but it does point to special memory.  If it *didn't* point at special memory
that is completely opaque and untouchable, then KVM could use a struct overlay,
which would give contextual information and some amount of type safety.  E.g.
an equivalent without TDX is "struct vmcs *".

Rather than "struct page", what if we add an address_space (in the Sparse sense),
and a typedef for a TDX pages?  Maybe __firmware?  E.g.

  # define __firmware	__attribute__((noderef, address_space(__firmware)))

  typedef u64 __firmware *tdx_page_t;

That doesn't give as much compile-time safety, but in some ways it provides more
type safety since KVM (or whatever else cares) would need to make an explicit and
ugly cast to misuse the pointer.

> Ditto for "hpa_t *tdcs".  It's not obvious from the data structure that
> it's an array or if it's an array how it got allocated or how large it is.
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Dave Hansen 1 year, 2 months ago
On 11/22/24 15:55, Sean Christopherson wrote:
> On Fri, Nov 22, 2024, Dave Hansen wrote:
> I don't know the full context, but working with "struct page" is a pain when every
> user just wants the physical address.  KVM SVM had a few cases where pointers were
> tracked as "struct page", and it was generally unpleasant to read and work with.

I'm not super convinced. page_to_phys(foo) is all it takes

> I also don't like conflating the kernel's "struct page" with the architecture's
> definition of a 4KiB page.

That's fair, although it's pervasively conflated across our entire
codebase. But 'struct page' is substantially better than a hpa_t,
phys_addr_t or u64 that can store a full 64-bits of address. Those
conflate a physical address with a physical page, which is *FAR* worse.

>> You know that 'tdr' is not just some random physical address.  It's a
>> whole physical page.  It's page-aligned.  It was allocated, from the
>> allocator.  It doesn't point to special memory.
> 
> Oh, but it does point to special memory.  If it *didn't* point at special memory
> that is completely opaque and untouchable, then KVM could use a struct overlay,
> which would give contextual information and some amount of type safety.  E.g.
> an equivalent without TDX is "struct vmcs *".
> 
> Rather than "struct page", what if we add an address_space (in the Sparse sense),
> and a typedef for a TDX pages?  Maybe __firmware?  E.g.
> 
>   # define __firmware	__attribute__((noderef, address_space(__firmware)))
> 
>   typedef u64 __firmware *tdx_page_t;
> 
> That doesn't give as much compile-time safety, but in some ways it provides more
> type safety since KVM (or whatever else cares) would need to make an explicit and
> ugly cast to misuse the pointer.

It's better than nothing. But I still vastly prefer to have a type that
tells you that something is physically-allocated out of the buddy, RAM,
and page-aligned.

I'd be better to have:

struct tdx_page {
	u64 page_phys_addr;
};

than depend on sparse, IMNHO.

Do you run sparse every time you compile the kernel, btw? ;)
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Sean Christopherson 1 year, 2 months ago
On Fri, Nov 22, 2024, Dave Hansen wrote:
> On 11/22/24 15:55, Sean Christopherson wrote:
> > On Fri, Nov 22, 2024, Dave Hansen wrote:
> > I don't know the full context, but working with "struct page" is a pain when every
> > user just wants the physical address.  KVM SVM had a few cases where pointers were
> > tracked as "struct page", and it was generally unpleasant to read and work with.
> 
> I'm not super convinced. page_to_phys(foo) is all it takes

I looked again at the KVM code that bugs me, and my complaints are with code that
needs both the physical address and the virtual address, i.e. that could/should
use a meaningful pointer to describe the underlying data structure since KVM does
directly access the memory.

But for TDX pages, that obviously doesn't apply, so I withdraw my objection about
using struct page.

> > I also don't like conflating the kernel's "struct page" with the architecture's
> > definition of a 4KiB page.
> 
> That's fair, although it's pervasively conflated across our entire
> codebase. But 'struct page' is substantially better than a hpa_t,
> phys_addr_t or u64 that can store a full 64-bits of address. Those
> conflate a physical address with a physical page, which is *FAR* worse.
> 
> >> You know that 'tdr' is not just some random physical address.  It's a
> >> whole physical page.  It's page-aligned.  It was allocated, from the
> >> allocator.  It doesn't point to special memory.
> > 
> > Oh, but it does point to special memory.  If it *didn't* point at special memory
> > that is completely opaque and untouchable, then KVM could use a struct overlay,
> > which would give contextual information and some amount of type safety.  E.g.
> > an equivalent without TDX is "struct vmcs *".
> > 
> > Rather than "struct page", what if we add an address_space (in the Sparse sense),
> > and a typedef for a TDX pages?  Maybe __firmware?  E.g.
> > 
> >   # define __firmware	__attribute__((noderef, address_space(__firmware)))
> > 
> >   typedef u64 __firmware *tdx_page_t;
> > 
> > That doesn't give as much compile-time safety, but in some ways it provides more
> > type safety since KVM (or whatever else cares) would need to make an explicit and
> > ugly cast to misuse the pointer.
> 
> It's better than nothing. But I still vastly prefer to have a type that
> tells you that something is physically-allocated out of the buddy, RAM,
> and page-aligned.
> 
> I'd be better to have:
> 
> struct tdx_page {
> 	u64 page_phys_addr;
> };
> 
> than depend on sparse, IMNHO.
> 
> Do you run sparse every time you compile the kernel, btw? ;)

Nah, but the 0-day both does such a good job of detecting and reporting new warnings
that I'm generally comfortable relying on sparse for something like this.  Though
as above, I'm ok with using "struct page" for the TDX pages.
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Dave Hansen 1 year, 2 months ago
On 11/25/24 07:44, Sean Christopherson wrote:
> Nah, but the 0-day both does such a good job of detecting and
> reporting new warnings that I'm generally comfortable relying on
> sparse for something like this.  Though as above, I'm ok with using
> "struct page" for the TDX pages.
Cool beans. Thanks for double-checking that KVM code you were concerned
about. It's much appreciated!
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Edgecombe, Rick P 1 year, 2 months ago
On Fri, 2024-11-22 at 16:08 -0800, Dave Hansen wrote:
> On 11/22/24 15:55, Sean Christopherson wrote:
> > On Fri, Nov 22, 2024, Dave Hansen wrote:
> > I don't know the full context, but working with "struct page" is a pain when every
> > user just wants the physical address.  KVM SVM had a few cases where pointers were
> > tracked as "struct page", and it was generally unpleasant to read and work with.
> 
> I'm not super convinced. page_to_phys(foo) is all it takes
> 
> > I also don't like conflating the kernel's "struct page" with the architecture's
> > definition of a 4KiB page.
> 
> That's fair, although it's pervasively conflated across our entire
> codebase. But 'struct page' is substantially better than a hpa_t,
> phys_addr_t or u64 that can store a full 64-bits of address. Those
> conflate a physical address with a physical page, which is *FAR* worse.

In the case of tdh_mem_page_aug(), etc the caller only has a kvm_pfn_t passed
from a TDP MMU callback, for the page to be mapped in the guest TD. It is
probably not nice to assume that this kvm_pfn_t will have a struct page. So we
shouldn't always use struct pages for the SEAMCALL wrappers in any case.

What if we just move these members from hpa_t to pfn_t? It keeps us off struct
page, but addresses some of Dave's concerns about hpa_t looking like a specific
address.

> 
> > > You know that 'tdr' is not just some random physical address.  It's a
> > > whole physical page.  It's page-aligned.  It was allocated, from the
> > > allocator.  It doesn't point to special memory.
> > 
> > Oh, but it does point to special memory.  If it *didn't* point at special memory
> > that is completely opaque and untouchable, then KVM could use a struct overlay,
> > which would give contextual information and some amount of type safety.  E.g.
> > an equivalent without TDX is "struct vmcs *".
> > 
> > Rather than "struct page", what if we add an address_space (in the Sparse sense),
> > and a typedef for a TDX pages?  Maybe __firmware?  E.g.
> > 
> >    # define __firmware	__attribute__((noderef, address_space(__firmware)))
> > 
> >    typedef u64 __firmware *tdx_page_t;
> > 
> > That doesn't give as much compile-time safety, but in some ways it provides more
> > type safety since KVM (or whatever else cares) would need to make an explicit and
> > ugly cast to misuse the pointer.
> 
> It's better than nothing. But I still vastly prefer to have a type that
> tells you that something is physically-allocated out of the buddy, RAM,
> and page-aligned.
> 
> I'd be better to have:
> 
> struct tdx_page {
> 	u64 page_phys_addr;
> };
> 
> than depend on sparse, IMNHO.
> 
> Do you run sparse every time you compile the kernel, btw? ;)

Hmm, I'm trying to think of specific scenarios that "tdx page" types could make
big safety difference on.

Sean, do you happen to recall any specific bugs on the SEV side that this would
have helped with?

I hear the intuition, but without specific problems, it doesn't seem worth extra
code to me. Not a strong objection though.

Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Paolo Bonzini 1 year, 2 months ago
On 11/23/24 03:06, Edgecombe, Rick P wrote:
> On Fri, 2024-11-22 at 16:08 -0800, Dave Hansen wrote:
>> On 11/22/24 15:55, Sean Christopherson wrote:
>>> On Fri, Nov 22, 2024, Dave Hansen wrote:
>>> I don't know the full context, but working with "struct page" is a pain when every
>>> user just wants the physical address.  KVM SVM had a few cases where pointers were
>>> tracked as "struct page", and it was generally unpleasant to read and work with.
>>
>> I'm not super convinced. page_to_phys(foo) is all it takes
>>
>>> I also don't like conflating the kernel's "struct page" with the architecture's
>>> definition of a 4KiB page.
>>
>> That's fair, although it's pervasively conflated across our entire
>> codebase. But 'struct page' is substantially better than a hpa_t,
>> phys_addr_t or u64 that can store a full 64-bits of address. Those
>> conflate a physical address with a physical page, which is *FAR* worse.
> 
> In the case of tdh_mem_page_aug(), etc the caller only has a kvm_pfn_t passed
> from a TDP MMU callback, for the page to be mapped in the guest TD. It is
> probably not nice to assume that this kvm_pfn_t will have a struct page. So we
> shouldn't always use struct pages for the SEAMCALL wrappers in any case.
> 
> What if we just move these members from hpa_t to pfn_t? It keeps us off struct
> page, but addresses some of Dave's concerns about hpa_t looking like a specific
> address.

For tdr I agree with Dave that you probably want a struct which stores 
the struct page*. Currently the code is using __get_free_page(), but 
it's a small change to have it use alloc_page() instead, and 
__free_page() instead of free_page().

The only difference on the arch/x86/virt/ side is a bunch of added 
page_to_phys().

Anyhow, whatever you post I'll take care of adjusting in the KVM patches.

Paolo

Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Edgecombe, Rick P 1 year, 2 months ago
On Wed, 2024-11-27 at 19:15 +0100, Paolo Bonzini wrote:
> > What if we just move these members from hpa_t to pfn_t? It keeps us off
> > struct
> > page, but addresses some of Dave's concerns about hpa_t looking like a
> > specific
> > address.
> 
> For tdr I agree with Dave that you probably want a struct which stores 
> the struct page*. Currently the code is using __get_free_page(), but 
> it's a small change to have it use alloc_page() instead, and 
> __free_page() instead of free_page().
> 
> The only difference on the arch/x86/virt/ side is a bunch of added 
> page_to_phys().

Thanks.

> 
> Anyhow, whatever you post I'll take care of adjusting in the KVM patches.

I'm just throwing together a v2. We have the US holidays this week, so it may be
Monday.
Re: [RFC PATCH 1/6] x86/virt/tdx: Add SEAMCALL wrappers for TDX KeyID management
Posted by Dave Hansen 1 year, 2 months ago
On 11/22/24 15:55, Sean Christopherson wrote:
>> You know that 'tdr' is not just some random physical address.  It's a
>> whole physical page.  It's page-aligned.  It was allocated, from the
>> allocator.  It doesn't point to special memory.
> Oh, but it does point to special memory. 

In this specific case I meant "special memory" as not RAM and not
managed by the kernel.