[v4] x86: Expose consistent topology to guests

[PATCH v4 06/10] tools/libguest: Make setting MTRR registers unconditional

Posted by Alejandro Vallejo 2 months, 3 weeks ago

This greatly simplifies a later patch that makes use of HVM contexts to upload
LAPIC data. The idea is to reuse MTRR setting procedure to avoid code
duplication. It's currently only used for PVH, but there's no real reason to
overcomplicate the toolstack preventing them being set for HVM too when
hvmloader will override them anyway.

While at it, add a missing "goto out" to what was the error condition in the
loop.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v4:
  * New patch
---
 tools/libs/guest/xg_dom_x86.c | 83 ++++++++++++++++++-----------------
 1 file changed, 43 insertions(+), 40 deletions(-)

diff --git a/tools/libs/guest/xg_dom_x86.c b/tools/libs/guest/xg_dom_x86.c
index cba01384ae75..82ea3e2aab0b 100644
--- a/tools/libs/guest/xg_dom_x86.c
+++ b/tools/libs/guest/xg_dom_x86.c
@@ -989,6 +989,7 @@ const static void *hvm_get_save_record(const void *ctx, unsigned int type,
 
 static int vcpu_hvm(struct xc_dom_image *dom)
 {
+    /* Initialises the BSP */
     struct {
         struct hvm_save_descriptor header_d;
         HVM_SAVE_TYPE(HEADER) header;
@@ -997,6 +998,18 @@ static int vcpu_hvm(struct xc_dom_image *dom)
         struct hvm_save_descriptor end_d;
         HVM_SAVE_TYPE(END) end;
     } bsp_ctx;
+    /* Initialises APICs and MTRRs of every vCPU */
+    struct {
+        struct hvm_save_descriptor header_d;
+        HVM_SAVE_TYPE(HEADER) header;
+        struct hvm_save_descriptor mtrr_d;
+        HVM_SAVE_TYPE(MTRR) mtrr;
+        struct hvm_save_descriptor end_d;
+        HVM_SAVE_TYPE(END) end;
+    } vcpu_ctx;
+    /* Context from full_ctx */
+    const HVM_SAVE_TYPE(MTRR) *mtrr_record;
+    /* Raw context as taken from Xen */
     uint8_t *full_ctx = NULL;
     int rc;
 
@@ -1083,51 +1096,41 @@ static int vcpu_hvm(struct xc_dom_image *dom)
     bsp_ctx.end_d.instance = 0;
     bsp_ctx.end_d.length = HVM_SAVE_LENGTH(END);
 
-    /* TODO: maybe this should be a firmware option instead? */
-    if ( !dom->device_model )
+    /* TODO: maybe setting MTRRs should be a firmware option instead? */
+    mtrr_record = hvm_get_save_record(full_ctx, HVM_SAVE_CODE(MTRR), 0);
+
+    if ( !mtrr_record)
     {
-        struct {
-            struct hvm_save_descriptor header_d;
-            HVM_SAVE_TYPE(HEADER) header;
-            struct hvm_save_descriptor mtrr_d;
-            HVM_SAVE_TYPE(MTRR) mtrr;
-            struct hvm_save_descriptor end_d;
-            HVM_SAVE_TYPE(END) end;
-        } mtrr = {
-            .header_d = bsp_ctx.header_d,
-            .header = bsp_ctx.header,
-            .mtrr_d.typecode = HVM_SAVE_CODE(MTRR),
-            .mtrr_d.length = HVM_SAVE_LENGTH(MTRR),
-            .end_d = bsp_ctx.end_d,
-            .end = bsp_ctx.end,
-        };
-        const HVM_SAVE_TYPE(MTRR) *mtrr_record =
-            hvm_get_save_record(full_ctx, HVM_SAVE_CODE(MTRR), 0);
-        unsigned int i;
-
-        if ( !mtrr_record )
-        {
-            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
-                         "%s: unable to get MTRR save record", __func__);
-            goto out;
-        }
+        xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                     "%s: unable to get MTRR save record", __func__);
+        goto out;
+    }
 
-        memcpy(&mtrr.mtrr, mtrr_record, sizeof(mtrr.mtrr));
+    vcpu_ctx.header_d = bsp_ctx.header_d;
+    vcpu_ctx.header = bsp_ctx.header;
+    vcpu_ctx.mtrr_d.typecode = HVM_SAVE_CODE(MTRR);
+    vcpu_ctx.mtrr_d.length = HVM_SAVE_LENGTH(MTRR);
+    vcpu_ctx.mtrr = *mtrr_record;
+    vcpu_ctx.end_d = bsp_ctx.end_d;
+    vcpu_ctx.end = bsp_ctx.end;
 
-        /*
-         * Enable MTRR, set default type to WB.
-         * TODO: add MMIO areas as UC when passthrough is supported.
-         */
-        mtrr.mtrr.msr_mtrr_def_type = MTRR_TYPE_WRBACK | MTRR_DEF_TYPE_ENABLE;
+    /*
+     * Enable MTRR, set default type to WB.
+     * TODO: add MMIO areas as UC when passthrough is supported in PVH
+     */
+    vcpu_ctx.mtrr.msr_mtrr_def_type = MTRR_TYPE_WRBACK | MTRR_DEF_TYPE_ENABLE;
 
-        for ( i = 0; i < dom->max_vcpus; i++ )
+    for ( unsigned int i = 0; i < dom->max_vcpus; i++ )
+    {
+        vcpu_ctx.mtrr_d.instance = i;
+
+        rc = xc_domain_hvm_setcontext(dom->xch, dom->guest_domid,
+                                          (uint8_t *)&vcpu_ctx, sizeof(vcpu_ctx));
+        if ( rc != 0 )
         {
-            mtrr.mtrr_d.instance = i;
-            rc = xc_domain_hvm_setcontext(dom->xch, dom->guest_domid,
-                                          (uint8_t *)&mtrr, sizeof(mtrr));
-            if ( rc != 0 )
-                xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
-                             "%s: SETHVMCONTEXT failed (rc=%d)", __func__, rc);
+            xc_dom_panic(dom->xch, XC_INTERNAL_ERROR,
+                         "%s: SETHVMCONTEXT failed (rc=%d)", __func__, rc);
+            goto out;
         }
     }
 
-- 
2.34.1

Re: [PATCH v4 06/10] tools/libguest: Make setting MTRR registers unconditional

Posted by Jan Beulich 2 months, 3 weeks ago

On 26.06.2024 18:28, Alejandro Vallejo wrote:
> This greatly simplifies a later patch that makes use of HVM contexts to upload
> LAPIC data. The idea is to reuse MTRR setting procedure to avoid code
> duplication. It's currently only used for PVH, but there's no real reason to
> overcomplicate the toolstack preventing them being set for HVM too when
> hvmloader will override them anyway.

Yet then - why set them when hvmloader will do so again? Is it even guaranteed
to be no change in (guest) behavior to do so?

Plus what about a guest which was configured to have the CPUID bit for MTRRs
clear? I think we ought to document this as not supported for PVH (we may
actually choose to refuse building such a guest), but in principle the MTRR
save/load operations should simply fail for a HVM guest in said configuration.
Making such a change in Xen now would, afaict, be benign to the tool stack.
After this adjustment it would result in a perceived regression, when there
shouldn't be any.

Thinking about it, even for PVH it may make sense to allow CPUID.MTRR=0, as
long as CPUID.PAT=1, thus forcing it into PAT-only mode. I think we did even
discuss this possible configuration before.

Jan

Re: [PATCH v4 06/10] tools/libguest: Make setting MTRR registers unconditional

Posted by Alejandro Vallejo 2 months, 3 weeks ago

On Thu Jun 27, 2024 at 10:42 AM BST, Jan Beulich wrote:
> On 26.06.2024 18:28, Alejandro Vallejo wrote:
> > This greatly simplifies a later patch that makes use of HVM contexts to upload
> > LAPIC data. The idea is to reuse MTRR setting procedure to avoid code
> > duplication. It's currently only used for PVH, but there's no real reason to
> > overcomplicate the toolstack preventing them being set for HVM too when
> > hvmloader will override them anyway.
>
> Yet then - why set them when hvmloader will do so again?

To keep the toolstack complexity tractable, essentially. This way I can send N
hypercalls (for N vCPUs) rather than 2*N and have a single hvmcontext struct
rather than several.

In truth though, I could simply write back the old MTRRs taken from bsp_ctx on
HVM.

> Is it even guaranteed
> to be no change in (guest) behavior to do so?

hvmloader overrides those values, so there is no change by the time BIOS or OVMF
start running. As I mentioned before though, I can actually upload back the old
values in the HVM case.

>
> Plus what about a guest which was configured to have the CPUID bit for MTRRs
> clear?
> I think we ought to document this as not supported for PVH (we may

By "this" do you mean PVH _must_ have MTRR support? I would agree.

> actually choose to refuse building such a guest), but in principle the MTRR
> save/load operations should simply fail for a HVM guest in said configuration.

What use cases does that cover? With the adjustment I mention at the top that
should be sorted. I'm wondering why we allow !mtrr at all.

> Making such a change in Xen now would, afaict, be benign to the tool stack.
> After this adjustment it would result in a perceived regression, when there
> shouldn't be any.

Fair point.

>
> Thinking about it, even for PVH it may make sense to allow CPUID.MTRR=0, as
> long as CPUID.PAT=1, thus forcing it into PAT-only mode. I think we did even
> discuss this possible configuration before.
>
> Jan

Is PAT-only an existing real HW configuration? Can't say I've seen any.

Cheers,
Alejandro

Re: [PATCH v4 06/10] tools/libguest: Make setting MTRR registers unconditional

Posted by Jan Beulich 2 months, 3 weeks ago

On 27.06.2024 14:02, Alejandro Vallejo wrote:
> On Thu Jun 27, 2024 at 10:42 AM BST, Jan Beulich wrote:
>> Plus what about a guest which was configured to have the CPUID bit for MTRRs
>> clear?
>> I think we ought to document this as not supported for PVH (we may
> 
> By "this" do you mean PVH _must_ have MTRR support? I would agree.

That was my first thought, yes. But then further down I adjusted my
considerations.

>> actually choose to refuse building such a guest), but in principle the MTRR
>> save/load operations should simply fail for a HVM guest in said configuration.
> 
> What use cases does that cover? With the adjustment I mention at the top that
> should be sorted. I'm wondering why we allow !mtrr at all.

Not allowing it would open up for a mess in what CPUID bits we allow to
override and for which ones we'd deny overrides.

>> Making such a change in Xen now would, afaict, be benign to the tool stack.
>> After this adjustment it would result in a perceived regression, when there
>> shouldn't be any.
> 
> Fair point.
> 
>>
>> Thinking about it, even for PVH it may make sense to allow CPUID.MTRR=0, as
>> long as CPUID.PAT=1, thus forcing it into PAT-only mode. I think we did even
>> discuss this possible configuration before.
> 
> Is PAT-only an existing real HW configuration? Can't say I've seen any.

I don't think there are any, but the architecture doesn't preclude it, and
that's a simpler model overall for an OS to work with. Hence why it was
discussed (to some degree) before (if my memory doesn't fail me there).

Jan