x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

[PATCH] x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

Posted by Sagi Shahar 2 years, 5 months ago

The current TDX module does not handle extended topology leaves
explicitly and will generate a #VE but the current #VE handler
implementation blindly returns 0 for those CPUID leaves.

This currently causes TDX guests to see 0 values when querying the numa
topology leading to incorrect numa configurations.

This patch fixes this behavior by emulating the extended topology leaves
using the CPUID hypercall.

Change-Id: I427807e3ac8d9e3be50a6fac40ebd3f54b445b0c
Signed-off-by: Sagi Shahar <sagis@google.com>
---
 arch/x86/coco/tdx/tdx.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 1d6b863c42b0..db9a4673555a 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -321,13 +321,16 @@ static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
 	};
 
 	/*
-	 * Only allow VMM to control range reserved for hypervisor
-	 * communication.
+	 * Only emulate CPUID in 2 cases:
+	 * - CPUID is in the range reserved for hypervisor communication.
+	 * - CPUID is an extended topology leaf which is not emulated natively
+	 *   by the TDX module.
 	 *
-	 * Return all-zeros for any CPUID outside the range. It matches CPU
-	 * behaviour for non-supported leaf.
+	 * Return all-zeros for any other CPUID. It matches CPU behaviour for
+	 * non-supported leaf.
 	 */
-	if (regs->ax < 0x40000000 || regs->ax > 0x4FFFFFFF) {
+	if ((regs->ax < 0x40000000 || regs->ax > 0x4FFFFFFF) &&
+	    regs->ax != 0x0b && regs->ax != 0x1f) {
 		regs->ax = regs->bx = regs->cx = regs->dx = 0;
 		return ve_instr_len(ve);
 	}
-- 
2.42.0.283.g2d96d420d3-goog

Re: [PATCH] x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

Posted by Kirill A. Shutemov 2 years, 5 months ago

On Fri, Sep 08, 2023 at 10:56:44AM -0700, Sagi Shahar wrote:
> The current TDX module does not handle extended topology leaves
> explicitly and will generate a #VE but the current #VE handler
> implementation blindly returns 0 for those CPUID leaves.
> 
> This currently causes TDX guests to see 0 values when querying the numa
> topology leading to incorrect numa configurations.
> 
> This patch fixes this behavior by emulating the extended topology leaves
> using the CPUID hypercall.
> 
> Change-Id: I427807e3ac8d9e3be50a6fac40ebd3f54b445b0c
> Signed-off-by: Sagi Shahar <sagis@google.com>

NAK. We are working on getting these and other leafs available to the
guest in a safe manner without #VE. It will take time, but it is going to
be there.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

Re: [PATCH] x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

Posted by Dave Hansen 2 years, 5 months ago

On 9/8/23 10:56, Sagi Shahar wrote:
> The current TDX module does not handle extended topology leaves
> explicitly and will generate a #VE but the current #VE handler
> implementation blindly returns 0 for those CPUID leaves.
> 
> This currently causes TDX guests to see 0 values when querying the numa
> topology leading to incorrect numa configurations.
> 
> This patch fixes this behavior by emulating the extended topology leaves
> using the CPUID hypercall.

... and thus acquires the data from the untrusted VMM.  Right?

What are the security implications of consuming this untrusted data?

Re: [PATCH] x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

Posted by Sagi Shahar 2 years, 5 months ago

On Fri, Sep 8, 2023 at 11:00 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 9/8/23 10:56, Sagi Shahar wrote:
> > The current TDX module does not handle extended topology leaves
> > explicitly and will generate a #VE but the current #VE handler
> > implementation blindly returns 0 for those CPUID leaves.
> >
> > This currently causes TDX guests to see 0 values when querying the numa
> > topology leading to incorrect numa configurations.
> >
> > This patch fixes this behavior by emulating the extended topology leaves
> > using the CPUID hypercall.
>
> ... and thus acquires the data from the untrusted VMM.  Right?
>
> What are the security implications of consuming this untrusted data?

The topology information is mostly used for performance optimizations
on the guest side. I don't see any security implications if VMM passes
incorrect values.
Right now, the guest is already using the returned 0 values and gets
an incorrect numa topology leading to odd behavior in the guest. If we
allow guests to read these values from the untrusted VMM and VMM
spoofs the values, the worst that can happen is a different incorrect
numa topology instead of the incorrect one we already have today.

Re: [PATCH] x86/tdx: Allow extended topology CPUID leafs to be emulated by hypercall

Posted by Dave Hansen 2 years, 5 months ago

On 9/8/23 12:25, Sagi Shahar wrote:
> On Fri, Sep 8, 2023 at 11:00 AM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 9/8/23 10:56, Sagi Shahar wrote:
>>> The current TDX module does not handle extended topology leaves
>>> explicitly and will generate a #VE but the current #VE handler
>>> implementation blindly returns 0 for those CPUID leaves.
>>>
>>> This currently causes TDX guests to see 0 values when querying the numa
>>> topology leading to incorrect numa configurations.
>>>
>>> This patch fixes this behavior by emulating the extended topology leaves
>>> using the CPUID hypercall.
>>
>> ... and thus acquires the data from the untrusted VMM.  Right?
>>
>> What are the security implications of consuming this untrusted data?
> 
> The topology information is mostly used for performance optimizations
> on the guest side. I don't see any security implications if VMM passes
> incorrect values.

Oh, so I take it you did an audit and checked that no data structures
are sized or accessed based on this information.

Could you share some of your analysis, please?  I'd love to see it in
the changelog.

> Right now, the guest is already using the returned 0 values and gets
> an incorrect numa topology leading to odd behavior in the guest. If we
> allow guests to read these values from the untrusted VMM and VMM
> spoofs the values, the worst that can happen is a different incorrect
> numa topology instead of the incorrect one we already have today.

I'm going to have to disagree with this logic a wee bit.

The 0's have *KNOWN*, *FIXED* behavior.  It may stink, but it's known
and fixed and thoroughly unexploitable.  If it were somehow exploited,
we would change it to another value that we control.

The VMM values are fundamentally different.  They're dynamic and can be
maliciously crafted.

I'm actually not even sure how you're getting bad "NUMA" data out of
these leaves.  The check_extended_topology_leaf() checks should fail
because ecx will be all 0's and SMT_TYPE==1, so the (LEAFB_SUBTYPE(ecx)
!=  SMT_TYPE) condition will always hit.

I'm also not sure where the "numa topology" comment comes from.  There's
a _lot_ of topology information in these leaves that's at a much finer
granularity than NUMA nodes.  If you're getting errors on the console,
it would be spectacular to share those.

I have all kinds of guesses about what trouble this might be causing
you, but I'm a bad guesser.