[PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT

Rick Edgecombe posted 16 patches 2 months, 2 weeks ago
[PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Rick Edgecombe 2 months, 2 weeks ago
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

The Physical Address Metadata Table (PAMT) holds TDX metadata for physical
memory and must be allocated by the kernel during TDX module
initialization.

The exact size of the required PAMT memory is determined by the TDX module
and may vary between TDX module versions. Currently it is approximately
0.4% of the system memory. This is a significant commitment, especially if
it is not known upfront whether the machine will run any TDX guests.

For normal PAMT, each memory region that the TDX module might use (TDMR)
needs three separate PAMT allocations. One for each supported page size
(1GB, 2MB, 4KB).

At a high level, Dynamic PAMT still has the 1GB and 2MB levels allocated
on TDX module initialization, but the 4KB level allocated dynamically at
TD runtime. However, in the details, the TDX module still needs some per
4KB page data. The TDX module exposed how many bits per page need to be
allocated (currently it is 1). The bits-per-page value can then be used to
calculate the size to pass in place of the 4KB allocations in the TDMR,
which TDX specs call "PAMT_PAGE_BITMAP".

So in effect, Dynamic PAMT just needs a different (smaller) size
allocation for the 4KB level part of the allocation. Although it is
functionally something different, it is passed in the same way the 4KB page
size PAMT allocation is.

Begin to implement Dynamic PAMT in the kernel by reading the bits-per-page
needed for Dynamic PAMT. Calculate the size needed for the bitmap,
and use it instead of the 4KB size determined for normal PAMT, in the case
of Dynamic PAMT. In doing so, reduce the static allocations to
approximately 0.004%, a 100x improvement.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
[Enhanced log]
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
v4:
 - Use PAGE_ALIGN (Binbin)
 - Add comment about why tdx_supports_dynamic_pamt() is checked in
   metadata reading.

v3:
 - Simplify pamt_4k_size calculation after addition of refactor patch
 - Write log
---
 arch/x86/include/asm/tdx.h                  |  5 +++++
 arch/x86/include/asm/tdx_global_metadata.h  |  1 +
 arch/x86/virt/vmx/tdx/tdx.c                 | 19 ++++++++++++++++++-
 arch/x86/virt/vmx/tdx/tdx_global_metadata.c |  7 +++++++
 4 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 99bbeed8fb28..cf51ccd16194 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -130,6 +130,11 @@ int tdx_enable(void);
 const char *tdx_dump_mce_info(struct mce *m);
 const struct tdx_sys_info *tdx_get_sysinfo(void);
 
+static inline bool tdx_supports_dynamic_pamt(const struct tdx_sys_info *sysinfo)
+{
+	return false; /* To be enabled when kernel is ready */
+}
+
 int tdx_guest_keyid_alloc(void);
 u32 tdx_get_nr_guest_keyids(void);
 void tdx_guest_keyid_free(unsigned int keyid);
diff --git a/arch/x86/include/asm/tdx_global_metadata.h b/arch/x86/include/asm/tdx_global_metadata.h
index 060a2ad744bf..5eb808b23997 100644
--- a/arch/x86/include/asm/tdx_global_metadata.h
+++ b/arch/x86/include/asm/tdx_global_metadata.h
@@ -15,6 +15,7 @@ struct tdx_sys_info_tdmr {
 	u16 pamt_4k_entry_size;
 	u16 pamt_2m_entry_size;
 	u16 pamt_1g_entry_size;
+	u8  pamt_page_bitmap_entry_bits;
 };
 
 struct tdx_sys_info_td_ctrl {
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index f2b16a91ba58..6c522db71265 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -440,6 +440,18 @@ static int fill_out_tdmrs(struct list_head *tmb_list,
 	return 0;
 }
 
+static unsigned long tdmr_get_pamt_bitmap_sz(struct tdmr_info *tdmr)
+{
+	unsigned long pamt_sz, nr_pamt_entries;
+	int bits_per_entry;
+
+	bits_per_entry = tdx_sysinfo.tdmr.pamt_page_bitmap_entry_bits;
+	nr_pamt_entries = tdmr->size >> PAGE_SHIFT;
+	pamt_sz = DIV_ROUND_UP(nr_pamt_entries * bits_per_entry, BITS_PER_BYTE);
+
+	return PAGE_ALIGN(pamt_sz);
+}
+
 /*
  * Calculate PAMT size given a TDMR and a page size.  The returned
  * PAMT size is always aligned up to 4K page boundary.
@@ -507,7 +519,12 @@ static int tdmr_set_up_pamt(struct tdmr_info *tdmr,
 	 * Calculate the PAMT size for each TDX supported page size
 	 * and the total PAMT size.
 	 */
-	tdmr->pamt_4k_size = tdmr_get_pamt_sz(tdmr, TDX_PS_4K);
+	if (tdx_supports_dynamic_pamt(&tdx_sysinfo)) {
+		/* With Dynamic PAMT, PAMT_4K is replaced with a bitmap */
+		tdmr->pamt_4k_size = tdmr_get_pamt_bitmap_sz(tdmr);
+	} else {
+		tdmr->pamt_4k_size = tdmr_get_pamt_sz(tdmr, TDX_PS_4K);
+	}
 	tdmr->pamt_2m_size = tdmr_get_pamt_sz(tdmr, TDX_PS_2M);
 	tdmr->pamt_1g_size = tdmr_get_pamt_sz(tdmr, TDX_PS_1G);
 	tdmr_pamt_size = tdmr->pamt_4k_size + tdmr->pamt_2m_size + tdmr->pamt_1g_size;
diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
index 13ad2663488b..00ab0e550636 100644
--- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
+++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
@@ -33,6 +33,13 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
 		sysinfo_tdmr->pamt_2m_entry_size = val;
 	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
 		sysinfo_tdmr->pamt_1g_entry_size = val;
+	/*
+	 * Don't fail here if tdx_supports_dynamic_pamt() isn't supported. The
+	 * TDX code can fallback to normal PAMT if it's not supported.
+	 */
+	if (!ret && tdx_supports_dynamic_pamt(&tdx_sysinfo) &&
+	    !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
+		sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;
 
 	return ret;
 }
-- 
2.51.2
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Xu Yilun 1 month, 2 weeks ago
> diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> index 13ad2663488b..00ab0e550636 100644
> --- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> +++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> @@ -33,6 +33,13 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
>  		sysinfo_tdmr->pamt_2m_entry_size = val;
>  	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
>  		sysinfo_tdmr->pamt_1g_entry_size = val;
> +	/*
> +	 * Don't fail here if tdx_supports_dynamic_pamt() isn't supported. The
> +	 * TDX code can fallback to normal PAMT if it's not supported.
> +	 */
> +	if (!ret && tdx_supports_dynamic_pamt(&tdx_sysinfo) &&
> +	    !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
> +		sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;

Is it better we seal the awkward pattern inside the if (dpamt supported)  block:

	if (tdx_support_dynamic_pamt(&tdx_sysinfo))
		if (!ret && !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
			sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;

so we don't have to get used to another variant of the awkward pattern :)

Thanks,
Yilun

>  
>  	return ret;
>  }
> -- 
> 2.51.2
> 
>
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Edgecombe, Rick P 1 month ago
On Wed, 2025-12-24 at 17:10 +0800, Xu Yilun wrote:
> Is it better we seal the awkward pattern inside the if (dpamt supported)  block:
> 
> 	if (tdx_support_dynamic_pamt(&tdx_sysinfo))
> 		if (!ret && !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
> 			sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;

The extra indentation might be objectionable.

But I agree that line is too hard to read. It actually already caused some
confusion, which precipitated the comment. I played around with it and was
thinking to go with this instead to make it fit the pattern better. What do you
think?

static int get_tdx_sys_info_dpamt_bits(struct tdx_sys_info_tdmr *sysinfo_tdmr,
u64 *val)
{
	/*
	 * Don't let the metadata reading fail if dynamic PAMT isn't
	 * supported. The TDX code can fallback to normal PAMT in
	 * this case.
	 */
	if (!tdx_supports_dynamic_pamt(&tdx_sysinfo)) {
		*val = 0;
		return 0;
	}

	return read_sys_metadata_field(0x9100000100000013, val);
}

static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
{
	int ret = 0;
	u64 val;

	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000008, &val)))
		sysinfo_tdmr->max_tdmrs = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000009, &val)))
		sysinfo_tdmr->max_reserved_per_tdmr = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000010, &val)))
		sysinfo_tdmr->pamt_4k_entry_size = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000011, &val)))
		sysinfo_tdmr->pamt_2m_entry_size = val;
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
		sysinfo_tdmr->pamt_1g_entry_size = val;
	if (!ret && !(ret = get_tdx_sys_info_dpamt_bits(sysinfo_tdmr, &val)))
		sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;

	return ret;
}


Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Xu Yilun 1 month ago
On Mon, Jan 05, 2026 at 10:06:31PM +0000, Edgecombe, Rick P wrote:
> On Wed, 2025-12-24 at 17:10 +0800, Xu Yilun wrote:
> > Is it better we seal the awkward pattern inside the if (dpamt supported)  block:
> > 
> > 	if (tdx_support_dynamic_pamt(&tdx_sysinfo))
> > 		if (!ret && !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
> > 			sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;
> 
> The extra indentation might be objectionable.

Yes the extra indentation is unconventional, but everything here is, and
we know we will eventually change them all. So I more prefer simple
changes based on:

  if (!ret && !(ret = read_sys_metadata_field(0xABCDEF, &val)))

rather than neat but more LOC (when both are easy to read).

Anyway, this is trivial concern. I have more optional fields to add and
will follow the final decision.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Edgecombe, Rick P 1 month ago
On Tue, 2026-01-06 at 12:01 +0800, Xu Yilun wrote:
> Yes the extra indentation is unconventional, but everything here is,
> and we know we will eventually change them all. So I more prefer
> simple changes based on:
> 
>   if (!ret && !(ret = read_sys_metadata_field(0xABCDEF, &val)))
> 
> rather than neat but more LOC (when both are easy to read).

This whole code style was optimized to be verifiable, not for LOC. Kai
originally had several macro based solution that had a bunch of code
reuse before this style got settled on as part of the code gen. It
would definitely be a good thing to review the previous versions before
working on yet another solution to metadata reading.

> 
> Anyway, this is trivial concern. I have more optional fields to add
> and will follow the final decision.

This is the kind of thing that shouldn't need a clever solution. There
*should* be a way to more simply copy structured data from the TDX
module.

I do think this is a good area for cleanup, but let's not overhaul it
just to get a small incremental benefit. If we need a new interface in
the TDX module, let's explore it and actually get to something simple.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Xu Yilun 1 month ago
On Tue, Jan 06, 2026 at 05:00:48PM +0000, Edgecombe, Rick P wrote:
> On Tue, 2026-01-06 at 12:01 +0800, Xu Yilun wrote:
> > Yes the extra indentation is unconventional, but everything here is,
> > and we know we will eventually change them all. So I more prefer
> > simple changes based on:
> > 
> >   if (!ret && !(ret = read_sys_metadata_field(0xABCDEF, &val)))
> > 
> > rather than neat but more LOC (when both are easy to read).
> 
> This whole code style was optimized to be verifiable, not for LOC. Kai
> originally had several macro based solution that had a bunch of code
> reuse before this style got settled on as part of the code gen. It

Yeah, I know the code style in this block is the result of several
rounds discussion, refined but not intend for human read. So I suggest
we only keep the existing steorotypes:

  if (!ret && !(ret = read_sys_metadata_field(0xABCDEF, &val)))
	sysinfo_xxx->xxxxx = val;
and

  ret = ret ?: get_tdx_sys_info_xxx(&sysinfo->xxx);

isolate them, don't create other varients/hybrids of them such as:

  if (!ret && !(ret = get_tdx_sys_info_dpamt_bits(sysinfo_tdmr, &val)))

when we need other logics, to avoid extensive review effort.


That's why I'm more fond of my version, it embraces the steorotype with
a nature "if" for extra logic:

  if (tdx_support_dynamic_pamt(&tdx_sysinfo))
	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
		sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;

But anyway, if anyone is really uncomfortable with the indentation,
ignore my version.

[...]

> > Anyway, this is trivial concern. I have more optional fields to add
> > and will follow the final decision.
> 
> This is the kind of thing that shouldn't need a clever solution. There
> *should* be a way to more simply copy structured data from the TDX
> module.
> 
> I do think this is a good area for cleanup, but let's not overhaul it
> just to get a small incremental benefit. If we need a new interface in

Agree. I definitely don't want a new TDX module interface for now.

> the TDX module, let's explore it and actually get to something simple.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Edgecombe, Rick P 1 month ago
On Wed, 2026-01-07 at 14:01 +0800, Xu Yilun wrote:
> > I do think this is a good area for cleanup, but let's not overhaul
> > it
> > just to get a small incremental benefit. If we need a new interface
> > in
> 
> Agree. I definitely don't want a new TDX module interface for now.

No, I was suggesting to think about a new TDX module interface if that
is what it takes to really simplify it. For example, something like a
consistent TDH.SYS.RDALL. For TDX module changes to be available in the
future (for example at the time of the other optional metadata), we
actually need to start the process now.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Xu Yilun 4 weeks, 1 day ago
On Wed, Jan 07, 2026 at 02:41:44PM +0000, Edgecombe, Rick P wrote:
> On Wed, 2026-01-07 at 14:01 +0800, Xu Yilun wrote:
> > > I do think this is a good area for cleanup, but let's not overhaul
> > > it
> > > just to get a small incremental benefit. If we need a new interface
> > > in
> > 
> > Agree. I definitely don't want a new TDX module interface for now.
> 
> No, I was suggesting to think about a new TDX module interface if that
> is what it takes to really simplify it. For example, something like a
> consistent TDH.SYS.RDALL. For TDX module changes to be available in the

I actually don't understand why a RDALL seamcall could eliminate
the check "if (some_optional_feature_exists) read_it;". IIUC, The check
exists because kernel doesn't trust TDX Module so kernel wants to verify
the correctness/consistency of the data, otherwise we could accept
whatever TDX Module tells us, do the below for each field:

  static int read_sys_metadata_field(u64 field_id, u64 *data)
  {
	...
	ret = seamcall(TDH_SYS_RD, &args);
	if (ret == TDX_SUCCESS) {
		*data = args.r8;
		return 0;
	}

	/* The field doesn't exist */
	if (ret == TDX_METADATA_FIELD_ID_INCORRECT) {
		*data = 0;
		return 0;
	}

	...

	/* Real reading error */
	return -EFAULT;
  }

The trustness doesn't change no matter how kernel retrieves these data,
by a series of RD or a RDALL.

Thanks,
Yilun

> future (for example at the time of the other optional metadata), we
> actually need to start the process now.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Edgecombe, Rick P 4 weeks, 1 day ago
On Thu, 2026-01-08 at 20:53 +0800, Xu Yilun wrote:
> I actually don't understand why a RDALL seamcall could eliminate
> the check "if (some_optional_feature_exists) read_it;". IIUC, The
> check
> exists because kernel doesn't trust TDX Module so kernel wants to
> verify
> the correctness/consistency of the data, otherwise we could accept
> whatever TDX Module tells us, do the below for each field:
> 
>   static int read_sys_metadata_field(u64 field_id, u64 *data)
>   {
> 	...
> 	ret = seamcall(TDH_SYS_RD, &args);
> 	if (ret == TDX_SUCCESS) {
> 		*data = args.r8;
> 		return 0;
> 	}
> 
> 	/* The field doesn't exist */
> 	if (ret == TDX_METADATA_FIELD_ID_INCORRECT) {
> 		*data = 0;
> 		return 0;
> 	}
> 
> 	...
> 
> 	/* Real reading error */
> 	return -EFAULT;
>   }
> 
> The trustness doesn't change no matter how kernel retrieves these
> data,
> by a series of RD or a RDALL.

Having it be field specific behavior (like the diff I posted) means we
don't need to worry about TDX module bugs where some field read fails
and we don't catch it.

By RDALL, I mean a simpler way to bulk read the metadata. So for future
looking changes, let's think about what we need and not try to find yet
more clever ways to code around the current interface. The amount of
code and discussion on TDX metadata reading is just too high. Please go
back and look at the earlier threads if you haven't yet.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Xu Yilun 4 weeks, 1 day ago
On Thu, Jan 08, 2026 at 04:52:10PM +0000, Edgecombe, Rick P wrote:
> On Thu, 2026-01-08 at 20:53 +0800, Xu Yilun wrote:
> > I actually don't understand why a RDALL seamcall could eliminate
> > the check "if (some_optional_feature_exists) read_it;". IIUC, The
> > check
> > exists because kernel doesn't trust TDX Module so kernel wants to
> > verify
> > the correctness/consistency of the data, otherwise we could accept
> > whatever TDX Module tells us, do the below for each field:
> > 
> >   static int read_sys_metadata_field(u64 field_id, u64 *data)
> >   {
> > 	...
> > 	ret = seamcall(TDH_SYS_RD, &args);
> > 	if (ret == TDX_SUCCESS) {
> > 		*data = args.r8;
> > 		return 0;
> > 	}
> > 
> > 	/* The field doesn't exist */
> > 	if (ret == TDX_METADATA_FIELD_ID_INCORRECT) {
> > 		*data = 0;
> > 		return 0;
> > 	}
> > 
> > 	...
> > 
> > 	/* Real reading error */
> > 	return -EFAULT;
> >   }
> > 
> > The trustness doesn't change no matter how kernel retrieves these
> > data,
> > by a series of RD or a RDALL.
> 
> Having it be field specific behavior (like the diff I posted) means we
> don't need to worry about TDX module bugs where some field read fails
> and we don't catch it.

We still need this specific behavior when we bulk read. Can you accept
a successfully returned blob with TDX_FEATURES0_DYNAMIC_PAMT set but
pamt_page_bitmap_entry_bits field missing? I assume no, so we still
need:

	if (blob->tdx_feature0 & TDX_FEATURES0_DYNAMIC_PAMT)
		check blob->pamt_page_bitmap_entry_bits validity;

BTW: ret == TDX_METADATA_FIELD_ID_INCORRECT is not something bad, TDX
Module is effectively telling us the field doesn't exist.

> 
> By RDALL, I mean a simpler way to bulk read the metadata. So for future

I understand. But as I mentioned that has nothing to do with the
optional feature checking.

> looking changes, let's think about what we need and not try to find yet
> more clever ways to code around the current interface. The amount of

I don't think so, cause I don't see much benifit bulk reading could
bring to us. Bulk reading basically retrieves an _unverified_ blob from
TDX Module. I believe it could also been achieved by a simple iteration
of existing single reads. The major work, data verification, is still
there.

On the other hand, the cost of a newly designed firmware interface for
an already online functionality is not low, especially when you want
backward compatibility to old TDX Module. The worst case is we keep both
sets of the code...

> code and discussion on TDX metadata reading is just too high. Please go
> back and look at the earlier threads if you haven't yet.

I've read most of the threads before my first posting on this topic.
Most of the discussion/effort focus on the effective data verification.
But I haven't found any decisive evidence that bulk reading could
contribute on it.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Edgecombe, Rick P 4 weeks ago
On Fri, 2026-01-09 at 10:18 +0800, Xu Yilun wrote:
> On the other hand, the cost of a newly designed firmware interface
> for an already online functionality is not low, especially when you
> want backward compatibility to old TDX Module. The worst case is we
> keep both sets of the code...

I think TDX module changes are something to consider long term. We
already discussed not overhauling the metadata reading again ahead of
the current work, so I don't think there is anything else to discuss
here.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Xu Yilun 3 weeks, 5 days ago
On Fri, Jan 09, 2026 at 04:05:30PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2026-01-09 at 10:18 +0800, Xu Yilun wrote:
> > On the other hand, the cost of a newly designed firmware interface
> > for an already online functionality is not low, especially when you
> > want backward compatibility to old TDX Module. The worst case is we
> > keep both sets of the code...
> 
> I think TDX module changes are something to consider long term. We
> already discussed not overhauling the metadata reading again ahead of
> the current work, so I don't think there is anything else to discuss
> here.

I agree. We don't have to introduce new interfaces for optional feature
checking. That's another topic.
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Binbin Wu 2 months, 2 weeks ago

On 11/21/2025 8:51 AM, Rick Edgecombe wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
>
> The Physical Address Metadata Table (PAMT) holds TDX metadata for physical
> memory and must be allocated by the kernel during TDX module
> initialization.
>
> The exact size of the required PAMT memory is determined by the TDX module
> and may vary between TDX module versions. Currently it is approximately
> 0.4% of the system memory. This is a significant commitment, especially if
> it is not known upfront whether the machine will run any TDX guests.
>
> For normal PAMT, each memory region that the TDX module might use (TDMR)
> needs three separate PAMT allocations. One for each supported page size
> (1GB, 2MB, 4KB).
>
> At a high level, Dynamic PAMT still has the 1GB and 2MB levels allocated
> on TDX module initialization, but the 4KB level allocated dynamically at
> TD runtime. However, in the details, the TDX module still needs some per
> 4KB page data. The TDX module exposed how many bits per page need to be
> allocated (currently it is 1). The bits-per-page value can then be used to
> calculate the size to pass in place of the 4KB allocations in the TDMR,
> which TDX specs call "PAMT_PAGE_BITMAP".
>
> So in effect, Dynamic PAMT just needs a different (smaller) size
> allocation for the 4KB level part of the allocation. Although it is
> functionally something different, it is passed in the same way the 4KB page
> size PAMT allocation is.
>
> Begin to implement Dynamic PAMT in the kernel by reading the bits-per-page
> needed for Dynamic PAMT. Calculate the size needed for the bitmap,
> and use it instead of the 4KB size determined for normal PAMT, in the case
> of Dynamic PAMT. In doing so, reduce the static allocations to
> approximately 0.004%, a 100x improvement.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> [Enhanced log]
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

One nit below.

[...]
> diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> index 13ad2663488b..00ab0e550636 100644
> --- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> +++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> @@ -33,6 +33,13 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
>   		sysinfo_tdmr->pamt_2m_entry_size = val;
>   	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
>   		sysinfo_tdmr->pamt_1g_entry_size = val;
> +	/*
> +	 * Don't fail here if tdx_supports_dynamic_pamt() isn't supported. The

A bit weird to say "if tdx_supports_dynamic_pamt() isn't supported", how about
using "if dynamic PAMT isn't supported"?

> +	 * TDX code can fallback to normal PAMT if it's not supported.
> +	 */
> +	if (!ret && tdx_supports_dynamic_pamt(&tdx_sysinfo) &&
> +	    !(ret = read_sys_metadata_field(0x9100000100000013, &val)))
> +		sysinfo_tdmr->pamt_page_bitmap_entry_bits = val;
>   
>   	return ret;
>   }
Re: [PATCH v4 04/16] x86/virt/tdx: Allocate page bitmap for Dynamic PAMT
Posted by Edgecombe, Rick P 2 months, 1 week ago
On Tue, 2025-11-25 at 09:50 +0800, Binbin Wu wrote:
> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com>

Thanks.

> 
> One nit below.


> 
> [...]
> > diff --git a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> > index 13ad2663488b..00ab0e550636 100644
> > --- a/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx_global_metadata.c
> > @@ -33,6 +33,13 @@ static int get_tdx_sys_info_tdmr(struct tdx_sys_info_tdmr *sysinfo_tdmr)
> >    		sysinfo_tdmr->pamt_2m_entry_size = val;
> >    	if (!ret && !(ret = read_sys_metadata_field(0x9100000100000012, &val)))
> >    		sysinfo_tdmr->pamt_1g_entry_size = val;
> > +	/*
> > +	 * Don't fail here if tdx_supports_dynamic_pamt() isn't supported. The
> 
> A bit weird to say "if tdx_supports_dynamic_pamt() isn't supported", how about
> using "if dynamic PAMT isn't supported"?

Yes, good point, I'll use your suggestion.