[PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options

Roger Pau Monne posted 1 patch 1 year, 5 months ago
Failed in applying to current master (apply log)
xen/arch/x86/hvm/viridian/viridian.c |  2 +-
xen/arch/x86/hvm/vmx/vmcs.c          |  8 ++++----
xen/arch/x86/hvm/vmx/vmx.c           | 25 ++++++++++++++++++-------
xen/arch/x86/traps.c                 |  4 +---
4 files changed, 24 insertions(+), 15 deletions(-)
[PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Roger Pau Monne 1 year, 5 months ago
The current reporting of the hardware assisted APIC options is done by
checking "virtualize APIC accesses" which is not very helpful, as that
feature doesn't avoid a vmexit, instead it does provide some help in
order to detect APIC MMIO accesses in vmexit processing.

Repurpose the current reporting of xAPIC assistance to instead report
such feature as present when there's support for "TPR shadow" and
"APIC register virtualization" because in that case some xAPIC MMIO
register accesses are handled directly by the hardware, without
requiring a vmexit.

For symetry also change assisted x2APIC reporting to require
"virtualize x2APIC mode" and "APIC register virtualization", dropping
the option to also be reported when "virtual interrupt delivery" is
available.  Presence of the "virtual interrupt delivery" feature will
be reported using a different option.

Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware virtualized APIC')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
I find the logic in vmx_vlapic_msr_changed() hard to follow, but I
don't want to rewrite the function logic at this point.
---
Changes since v1:
 - Fix Viridian MSR tip conditions.
---
 xen/arch/x86/hvm/viridian/viridian.c |  2 +-
 xen/arch/x86/hvm/vmx/vmcs.c          |  8 ++++----
 xen/arch/x86/hvm/vmx/vmx.c           | 25 ++++++++++++++++++-------
 xen/arch/x86/traps.c                 |  4 +---
 4 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c
index 25dca93e8b..44eb3d0519 100644
--- a/xen/arch/x86/hvm/viridian/viridian.c
+++ b/xen/arch/x86/hvm/viridian/viridian.c
@@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
         res->a = CPUID4A_RELAX_TIMER_INT;
         if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
             res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
-        if ( !cpu_has_vmx_apic_reg_virt )
+        if ( !has_assisted_xapic(d) )
             res->a |= CPUID4A_MSR_BASED_APIC;
         if ( viridian_feature_mask(d) & HVMPV_hcall_ipi )
             res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI;
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index a1aca1ec04..7bb96e1a8e 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1136,7 +1136,7 @@ static int construct_vmcs(struct vcpu *v)
 
     if ( !has_assisted_xapic(d) )
         v->arch.hvm.vmx.secondary_exec_control &=
-            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+            ~SECONDARY_EXEC_APIC_REGISTER_VIRT;
 
     if ( cpu_has_vmx_secondary_exec_control )
         __vmwrite(SECONDARY_VM_EXEC_CONTROL,
@@ -2156,10 +2156,10 @@ int __init vmx_vmcs_init(void)
     if ( !ret )
     {
         /* Check whether hardware supports accelerated xapic and x2apic. */
-        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
+        assisted_xapic_available = cpu_has_vmx_tpr_shadow &&
+                                   cpu_has_vmx_apic_reg_virt;
         assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
-                                    (cpu_has_vmx_apic_reg_virt ||
-                                     cpu_has_vmx_virtual_intr_delivery);
+                                    cpu_has_vmx_apic_reg_virt;
         register_keyhandler('v', vmcs_dump, "dump VT-x VMCSs", 1);
     }
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index e624b415c9..bf0fe3355c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -3405,25 +3405,29 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
 
 void vmx_vlapic_msr_changed(struct vcpu *v)
 {
+    bool virtualize_x2apic_mode = has_assisted_x2apic(v->domain) ||
+                                  (cpu_has_vmx_virtualize_x2apic_mode &&
+                                   cpu_has_vmx_virtual_intr_delivery);
     struct vlapic *vlapic = vcpu_vlapic(v);
     unsigned int msr;
 
-    if ( !has_assisted_xapic(v->domain) &&
-         !has_assisted_x2apic(v->domain) )
+    if ( !cpu_has_vmx_virtualize_apic_accesses &&
+         !virtualize_x2apic_mode )
         return;
 
     vmx_vmcs_enter(v);
     v->arch.hvm.vmx.secondary_exec_control &=
         ~(SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
-          SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE);
+          SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
+          SECONDARY_EXEC_APIC_REGISTER_VIRT);
     if ( !vlapic_hw_disabled(vlapic) &&
          (vlapic_base_address(vlapic) == APIC_DEFAULT_PHYS_BASE) )
     {
-        if ( has_assisted_x2apic(v->domain) && vlapic_x2apic_mode(vlapic) )
+        if ( virtualize_x2apic_mode && vlapic_x2apic_mode(vlapic) )
         {
             v->arch.hvm.vmx.secondary_exec_control |=
                 SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
-            if ( cpu_has_vmx_apic_reg_virt )
+            if ( has_assisted_x2apic(v->domain) )
             {
                 for ( msr = MSR_X2APIC_FIRST;
                       msr <= MSR_X2APIC_LAST; msr++ )
@@ -3432,6 +3436,10 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
                 vmx_set_msr_intercept(v, MSR_X2APIC_PPR, VMX_MSR_R);
                 vmx_set_msr_intercept(v, MSR_X2APIC_TMICT, VMX_MSR_R);
                 vmx_set_msr_intercept(v, MSR_X2APIC_TMCCT, VMX_MSR_R);
+
+                v->arch.hvm.vmx.secondary_exec_control |=
+                    SECONDARY_EXEC_APIC_REGISTER_VIRT;
+
             }
             if ( cpu_has_vmx_virtual_intr_delivery )
             {
@@ -3440,9 +3448,12 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
                 vmx_clear_msr_intercept(v, MSR_X2APIC_SELF, VMX_MSR_W);
             }
         }
-        else if ( has_assisted_xapic(v->domain) )
+        else if ( vlapic_xapic_mode(vlapic) )
             v->arch.hvm.vmx.secondary_exec_control |=
-                SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
+                (cpu_has_vmx_virtualize_apic_accesses ?
+                    SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES : 0) |
+                (has_assisted_xapic(v->domain) ?
+                    SECONDARY_EXEC_APIC_REGISTER_VIRT : 0);
     }
     if ( !(v->arch.hvm.vmx.secondary_exec_control &
            SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE) )
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 7207390118..5c0aabe8a3 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -1124,8 +1124,7 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
         if ( !is_hvm_domain(d) || subleaf != 0 )
             break;
 
-        if ( cpu_has_vmx_apic_reg_virt &&
-             has_assisted_xapic(d) )
+        if ( has_assisted_xapic(d) )
             res->a |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
 
         /*
@@ -1135,7 +1134,6 @@ void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
          * vmx_vlapic_msr_changed()).
          */
         if ( has_assisted_x2apic(d) &&
-             cpu_has_vmx_apic_reg_virt &&
              cpu_has_vmx_virtual_intr_delivery )
             res->a |= XEN_HVM_CPUID_X2APIC_VIRT;
 
-- 
2.37.3


Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Andrew Cooper 1 year, 5 months ago
On 04/11/2022 16:18, Roger Pau Monne wrote:
> The current reporting of the hardware assisted APIC options is done by
> checking "virtualize APIC accesses" which is not very helpful, as that
> feature doesn't avoid a vmexit, instead it does provide some help in
> order to detect APIC MMIO accesses in vmexit processing.
>
> Repurpose the current reporting of xAPIC assistance to instead report
> such feature as present when there's support for "TPR shadow" and
> "APIC register virtualization" because in that case some xAPIC MMIO
> register accesses are handled directly by the hardware, without
> requiring a vmexit.
>
> For symetry also change assisted x2APIC reporting to require
> "virtualize x2APIC mode" and "APIC register virtualization", dropping
> the option to also be reported when "virtual interrupt delivery" is
> available.  Presence of the "virtual interrupt delivery" feature will
> be reported using a different option.
>
> Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware virtualized APIC')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> I find the logic in vmx_vlapic_msr_changed() hard to follow, but I
> don't want to rewrite the function logic at this point.
> ---
> Changes since v1:
>  - Fix Viridian MSR tip conditions.
> ---
>  xen/arch/x86/hvm/viridian/viridian.c |  2 +-
>  xen/arch/x86/hvm/vmx/vmcs.c          |  8 ++++----
>  xen/arch/x86/hvm/vmx/vmx.c           | 25 ++++++++++++++++++-------
>  xen/arch/x86/traps.c                 |  4 +---
>  4 files changed, 24 insertions(+), 15 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c
> index 25dca93e8b..44eb3d0519 100644
> --- a/xen/arch/x86/hvm/viridian/viridian.c
> +++ b/xen/arch/x86/hvm/viridian/viridian.c
> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>          res->a = CPUID4A_RELAX_TIMER_INT;
>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> -        if ( !cpu_has_vmx_apic_reg_virt )
> +        if ( !has_assisted_xapic(d) )
>              res->a |= CPUID4A_MSR_BASED_APIC;

This check is broken before and after.  It needs to be keyed on
virtualised interrupt delivery, not register acceleration.

But doing this correctly needs a per-domain vintr setting, which we
don't have yet.

It is marginally less broken with this change, than without, but that's
not saying much.

>          if ( viridian_feature_mask(d) & HVMPV_hcall_ipi )
>              res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI;
> diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> index a1aca1ec04..7bb96e1a8e 100644
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1136,7 +1136,7 @@ static int construct_vmcs(struct vcpu *v)
>  
>      if ( !has_assisted_xapic(d) )
>          v->arch.hvm.vmx.secondary_exec_control &=
> -            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
> +            ~SECONDARY_EXEC_APIC_REGISTER_VIRT;
>  
>      if ( cpu_has_vmx_secondary_exec_control )
>          __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> @@ -2156,10 +2156,10 @@ int __init vmx_vmcs_init(void)
>      if ( !ret )
>      {
>          /* Check whether hardware supports accelerated xapic and x2apic. */
> -        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
> +        assisted_xapic_available = cpu_has_vmx_tpr_shadow &&
> +                                   cpu_has_vmx_apic_reg_virt;
>          assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
> -                                    (cpu_has_vmx_apic_reg_virt ||
> -                                     cpu_has_vmx_virtual_intr_delivery);
> +                                    cpu_has_vmx_apic_reg_virt;

apic reg virt already depends on tpr shadow, so that part of the
condition is redundant.

virtualise x2apic mode and apic reg virt aren't dependent, but they do
only ever appear together in hardware.

Keeping the conditionals might be ok to combat a bad outer hypervisor,
but ...

>          register_keyhandler('v', vmcs_dump, "dump VT-x VMCSs", 1);
>      }
>  
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index e624b415c9..bf0fe3355c 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -3405,25 +3405,29 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
>  
>  void vmx_vlapic_msr_changed(struct vcpu *v)
>  {
> +    bool virtualize_x2apic_mode = has_assisted_x2apic(v->domain) ||
> +                                  (cpu_has_vmx_virtualize_x2apic_mode &&
> +                                   cpu_has_vmx_virtual_intr_delivery);

... this is still wrong, and ...

>      struct vlapic *vlapic = vcpu_vlapic(v);
>      unsigned int msr;
>  
> -    if ( !has_assisted_xapic(v->domain) &&
> -         !has_assisted_x2apic(v->domain) )
> +    if ( !cpu_has_vmx_virtualize_apic_accesses &&
> +         !virtualize_x2apic_mode )
>          return;

... this surely cannot be right.

While attempting to figure ^ out, I've found yet another regression vs
4.16.  Because virt intr delivery is set in the execution controls
system-wide and not controlled per domain, we'll take a vmentry failure
on SKX/CLX/ICX when trying to build an HVM domain without xAPIC
acceleration.


This, combined with the ABI errors (/misfeatures) that we really don't
want to escape into the world but I haven't finished fixing yet, means
that the only appropriate course of action is to revert.

I'd really hoped to avoid a full revert, but we've run out of time.

~Andrew
Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Roger Pau Monné 1 year, 5 months ago
On Thu, Nov 10, 2022 at 10:47:07PM +0000, Andrew Cooper wrote:
> On 04/11/2022 16:18, Roger Pau Monne wrote:
> > The current reporting of the hardware assisted APIC options is done by
> > checking "virtualize APIC accesses" which is not very helpful, as that
> > feature doesn't avoid a vmexit, instead it does provide some help in
> > order to detect APIC MMIO accesses in vmexit processing.
> >
> > Repurpose the current reporting of xAPIC assistance to instead report
> > such feature as present when there's support for "TPR shadow" and
> > "APIC register virtualization" because in that case some xAPIC MMIO
> > register accesses are handled directly by the hardware, without
> > requiring a vmexit.
> >
> > For symetry also change assisted x2APIC reporting to require
> > "virtualize x2APIC mode" and "APIC register virtualization", dropping
> > the option to also be reported when "virtual interrupt delivery" is
> > available.  Presence of the "virtual interrupt delivery" feature will
> > be reported using a different option.
> >
> > Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware virtualized APIC')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > I find the logic in vmx_vlapic_msr_changed() hard to follow, but I
> > don't want to rewrite the function logic at this point.
> > ---
> > Changes since v1:
> >  - Fix Viridian MSR tip conditions.
> > ---
> >  xen/arch/x86/hvm/viridian/viridian.c |  2 +-
> >  xen/arch/x86/hvm/vmx/vmcs.c          |  8 ++++----
> >  xen/arch/x86/hvm/vmx/vmx.c           | 25 ++++++++++++++++++-------
> >  xen/arch/x86/traps.c                 |  4 +---
> >  4 files changed, 24 insertions(+), 15 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c
> > index 25dca93e8b..44eb3d0519 100644
> > --- a/xen/arch/x86/hvm/viridian/viridian.c
> > +++ b/xen/arch/x86/hvm/viridian/viridian.c
> > @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
> >          res->a = CPUID4A_RELAX_TIMER_INT;
> >          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
> >              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> > -        if ( !cpu_has_vmx_apic_reg_virt )
> > +        if ( !has_assisted_xapic(d) )
> >              res->a |= CPUID4A_MSR_BASED_APIC;
> 
> This check is broken before and after.  It needs to be keyed on
> virtualised interrupt delivery, not register acceleration.
> 
> But doing this correctly needs a per-domain vintr setting, which we
> don't have yet.
> 
> It is marginally less broken with this change, than without, but that's
> not saying much.

I took a look at the HyperV TLFS spec but the table that lists the
CPUID bits don't explicitly name which registers are accessed using
MSRs rather than MMIO when the 'MSR_BASED_APIC' suggestion is set on
CPUID.

It's my understanding that the hint is not useful anymore, as Xen
exposes x2APIC by default, and that's what any new-ish version of
Windows should use, in which case all APIC registers are already
accessed using MSRs and the hint is moot.

> >          if ( viridian_feature_mask(d) & HVMPV_hcall_ipi )
> >              res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI;
> > diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> > index a1aca1ec04..7bb96e1a8e 100644
> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -1136,7 +1136,7 @@ static int construct_vmcs(struct vcpu *v)
> >  
> >      if ( !has_assisted_xapic(d) )
> >          v->arch.hvm.vmx.secondary_exec_control &=
> > -            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
> > +            ~SECONDARY_EXEC_APIC_REGISTER_VIRT;
> >  
> >      if ( cpu_has_vmx_secondary_exec_control )
> >          __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > @@ -2156,10 +2156,10 @@ int __init vmx_vmcs_init(void)
> >      if ( !ret )
> >      {
> >          /* Check whether hardware supports accelerated xapic and x2apic. */
> > -        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
> > +        assisted_xapic_available = cpu_has_vmx_tpr_shadow &&
> > +                                   cpu_has_vmx_apic_reg_virt;
> >          assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
> > -                                    (cpu_has_vmx_apic_reg_virt ||
> > -                                     cpu_has_vmx_virtual_intr_delivery);
> > +                                    cpu_has_vmx_apic_reg_virt;
> 
> apic reg virt already depends on tpr shadow, so that part of the
> condition is redundant.
> 
> virtualise x2apic mode and apic reg virt aren't dependent, but they do
> only ever appear together in hardware.

I would keep those as-is for sanity, it's easier to spot the
dependencies between them.  And then it's also possible that we want
to introduce a control for tpr_shadow, and which point having the
conditional here is a good reminder that virtualize_apic_accesses
depends on that feature being available and enabled.

> Keeping the conditionals might be ok to combat a bad outer hypervisor,
> but ...
> 
> >          register_keyhandler('v', vmcs_dump, "dump VT-x VMCSs", 1);
> >      }
> >  
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index e624b415c9..bf0fe3355c 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -3405,25 +3405,29 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
> >  
> >  void vmx_vlapic_msr_changed(struct vcpu *v)
> >  {
> > +    bool virtualize_x2apic_mode = has_assisted_x2apic(v->domain) ||
> > +                                  (cpu_has_vmx_virtualize_x2apic_mode &&
> > +                                   cpu_has_vmx_virtual_intr_delivery);
> 
> ... this is still wrong, and ...
> 
> >      struct vlapic *vlapic = vcpu_vlapic(v);
> >      unsigned int msr;
> >  
> > -    if ( !has_assisted_xapic(v->domain) &&
> > -         !has_assisted_x2apic(v->domain) )
> > +    if ( !cpu_has_vmx_virtualize_apic_accesses &&
> > +         !virtualize_x2apic_mode )
> >          return;
> 
> ... this surely cannot be right.

It's indeed missing a has_assisted_xapic(v->domain), so it should be:

if ( !cpu_has_vmx_virtualize_apic_accesses &&
     !has_assisted_xapic(v->domain) &&
     !virtualize_x2apic_mode )
    return;

Logic in this function is not the best one IMO, as I've already
mentioned in a post-commit remark.

> 
> While attempting to figure ^ out, I've found yet another regression vs
> 4.16.  Because virt intr delivery is set in the execution controls
> system-wide and not controlled per domain, we'll take a vmentry failure
> on SKX/CLX/ICX when trying to build an HVM domain without xAPIC
> acceleration.

I've tried creating the following guest on one of our Ice Lake boxes
(with and without this patch applied):

type="pvh"
name = "test"
memory = 1024
vcpus = 1

kernel = "/root/vmlinuz-6.1.0-rc4+"
extra = "console=hvc0"

assisted_xapic=0

And it seem s to boot just fine, no vmentry failure.

> This, combined with the ABI errors (/misfeatures) that we really don't
> want to escape into the world but I haven't finished fixing yet, means
> that the only appropriate course of action is to revert.

Hm, I'm confused by this, as it is still not clear to me what's
wrong with the ABI.  Is it the way in which we expose the flags?  Or
is it the naming?

AFAIK we want to have a flag to toggle apic_reg_virt, which is the
proposal here.  Should it be named differently?

> I'd really hoped to avoid a full revert, but we've run out of time.

I've also hoped so.

Thanks, Roger.

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Roger Pau Monné 1 year, 5 months ago
On Thu, Nov 10, 2022 at 10:47:07PM +0000, Andrew Cooper wrote:
> On 04/11/2022 16:18, Roger Pau Monne wrote:
> > The current reporting of the hardware assisted APIC options is done by
> > checking "virtualize APIC accesses" which is not very helpful, as that
> > feature doesn't avoid a vmexit, instead it does provide some help in
> > order to detect APIC MMIO accesses in vmexit processing.
> >
> > Repurpose the current reporting of xAPIC assistance to instead report
> > such feature as present when there's support for "TPR shadow" and
> > "APIC register virtualization" because in that case some xAPIC MMIO
> > register accesses are handled directly by the hardware, without
> > requiring a vmexit.
> >
> > For symetry also change assisted x2APIC reporting to require
> > "virtualize x2APIC mode" and "APIC register virtualization", dropping
> > the option to also be reported when "virtual interrupt delivery" is
> > available.  Presence of the "virtual interrupt delivery" feature will
> > be reported using a different option.
> >
> > Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware virtualized APIC')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > I find the logic in vmx_vlapic_msr_changed() hard to follow, but I
> > don't want to rewrite the function logic at this point.
> > ---
> > Changes since v1:
> >  - Fix Viridian MSR tip conditions.
> > ---
> >  xen/arch/x86/hvm/viridian/viridian.c |  2 +-
> >  xen/arch/x86/hvm/vmx/vmcs.c          |  8 ++++----
> >  xen/arch/x86/hvm/vmx/vmx.c           | 25 ++++++++++++++++++-------
> >  xen/arch/x86/traps.c                 |  4 +---
> >  4 files changed, 24 insertions(+), 15 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/viridian/viridian.c b/xen/arch/x86/hvm/viridian/viridian.c
> > index 25dca93e8b..44eb3d0519 100644
> > --- a/xen/arch/x86/hvm/viridian/viridian.c
> > +++ b/xen/arch/x86/hvm/viridian/viridian.c
> > @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
> >          res->a = CPUID4A_RELAX_TIMER_INT;
> >          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
> >              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> > -        if ( !cpu_has_vmx_apic_reg_virt )
> > +        if ( !has_assisted_xapic(d) )
> >              res->a |= CPUID4A_MSR_BASED_APIC;
> 
> This check is broken before and after.  It needs to be keyed on
> virtualised interrupt delivery, not register acceleration.
> 
> But doing this correctly needs a per-domain vintr setting, which we
> don't have yet.
> 
> It is marginally less broken with this change, than without, but that's
> not saying much.
> 
> >          if ( viridian_feature_mask(d) & HVMPV_hcall_ipi )
> >              res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI;
> > diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> > index a1aca1ec04..7bb96e1a8e 100644
> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -1136,7 +1136,7 @@ static int construct_vmcs(struct vcpu *v)
> >  
> >      if ( !has_assisted_xapic(d) )
> >          v->arch.hvm.vmx.secondary_exec_control &=
> > -            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
> > +            ~SECONDARY_EXEC_APIC_REGISTER_VIRT;
> >  
> >      if ( cpu_has_vmx_secondary_exec_control )
> >          __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > @@ -2156,10 +2156,10 @@ int __init vmx_vmcs_init(void)
> >      if ( !ret )
> >      {
> >          /* Check whether hardware supports accelerated xapic and x2apic. */
> > -        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
> > +        assisted_xapic_available = cpu_has_vmx_tpr_shadow &&
> > +                                   cpu_has_vmx_apic_reg_virt;
> >          assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
> > -                                    (cpu_has_vmx_apic_reg_virt ||
> > -                                     cpu_has_vmx_virtual_intr_delivery);
> > +                                    cpu_has_vmx_apic_reg_virt;
> 
> apic reg virt already depends on tpr shadow, so that part of the
> condition is redundant.
> 
> virtualise x2apic mode and apic reg virt aren't dependent, but they do
> only ever appear together in hardware.
> 
> Keeping the conditionals might be ok to combat a bad outer hypervisor,
> but ...
> 
> >          register_keyhandler('v', vmcs_dump, "dump VT-x VMCSs", 1);
> >      }
> >  
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index e624b415c9..bf0fe3355c 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -3405,25 +3405,29 @@ static void vmx_install_vlapic_mapping(struct vcpu *v)
> >  
> >  void vmx_vlapic_msr_changed(struct vcpu *v)
> >  {
> > +    bool virtualize_x2apic_mode = has_assisted_x2apic(v->domain) ||
> > +                                  (cpu_has_vmx_virtualize_x2apic_mode &&
> > +                                   cpu_has_vmx_virtual_intr_delivery);
> 
> ... this is still wrong, and ...
> 
> >      struct vlapic *vlapic = vcpu_vlapic(v);
> >      unsigned int msr;
> >  
> > -    if ( !has_assisted_xapic(v->domain) &&
> > -         !has_assisted_x2apic(v->domain) )
> > +    if ( !cpu_has_vmx_virtualize_apic_accesses &&
> > +         !virtualize_x2apic_mode )
> >          return;
> 
> ... this surely cannot be right.
> 
> While attempting to figure ^ out, I've found yet another regression vs
> 4.16.  Because virt intr delivery is set in the execution controls
> system-wide and not controlled per domain, we'll take a vmentry failure
> on SKX/CLX/ICX when trying to build an HVM domain without xAPIC
> acceleration.
> 
> 
> This, combined with the ABI errors (/misfeatures) that we really don't
> want to escape into the world but I haven't finished fixing yet, means
> that the only appropriate course of action is to revert.
> 
> I'd really hoped to avoid a full revert, but we've run out of time.

Can we wait for the revert until Monday, it's a public holiday here
today and won't be able to reply to the comments.

Thanks, Roger.

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Jan Beulich 1 year, 5 months ago
On 10.11.2022 23:47, Andrew Cooper wrote:
> On 04/11/2022 16:18, Roger Pau Monne wrote:
>> --- a/xen/arch/x86/hvm/viridian/viridian.c
>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>>          res->a = CPUID4A_RELAX_TIMER_INT;
>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
>> -        if ( !cpu_has_vmx_apic_reg_virt )
>> +        if ( !has_assisted_xapic(d) )
>>              res->a |= CPUID4A_MSR_BASED_APIC;
> 
> This check is broken before and after.  It needs to be keyed on
> virtualised interrupt delivery, not register acceleration.

To me this connection you suggest looks entirely unobvious, so would
you mind expanding as to why you're thinking so? The hint to the guest
here is related to how it would best access certain registers (aiui),
which to me looks orthogonal to how interrupt delivery works.

Jan

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Andrew Cooper 1 year, 5 months ago
On 11/11/2022 07:45, Jan Beulich wrote:
> On 10.11.2022 23:47, Andrew Cooper wrote:
>> On 04/11/2022 16:18, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/viridian/viridian.c
>>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
>>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>>>          res->a = CPUID4A_RELAX_TIMER_INT;
>>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
>>> -        if ( !cpu_has_vmx_apic_reg_virt )
>>> +        if ( !has_assisted_xapic(d) )
>>>              res->a |= CPUID4A_MSR_BASED_APIC;
>> This check is broken before and after.  It needs to be keyed on
>> virtualised interrupt delivery, not register acceleration.
> To me this connection you suggest looks entirely unobvious, so would
> you mind expanding as to why you're thinking so? The hint to the guest
> here is related to how it would best access certain registers (aiui),
> which to me looks orthogonal to how interrupt delivery works.

I refer you again to the diagram.  (For everyone else on xen-devel, I
put this together when fixing XSA-412 because Intel's APIC acceleration
controls are entirely unintuitive.)

It is "virtual interrupt delivery" which controls EOI/ICR acceleration,
and not "apic register virtualisation".  There's a decade worth of
hardware where this logic is an anti-optimsiation, by telling windows to
use a VMExit-ing mechanism even when microcode would have avoided the
VMExit.

It is not by accident that "virtual interrupt delivery", introduced in
IvyBridge, is exactly the missing registers (on top of "use TPR Shadow"
which is even older) to make windows performance less bad.

~Andrew
Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Jan Beulich 1 year, 5 months ago
On 11.11.2022 18:35, Andrew Cooper wrote:
> On 11/11/2022 07:45, Jan Beulich wrote:
>> On 10.11.2022 23:47, Andrew Cooper wrote:
>>> On 04/11/2022 16:18, Roger Pau Monne wrote:
>>>> --- a/xen/arch/x86/hvm/viridian/viridian.c
>>>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
>>>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>>>>          res->a = CPUID4A_RELAX_TIMER_INT;
>>>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>>>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
>>>> -        if ( !cpu_has_vmx_apic_reg_virt )
>>>> +        if ( !has_assisted_xapic(d) )
>>>>              res->a |= CPUID4A_MSR_BASED_APIC;
>>> This check is broken before and after.  It needs to be keyed on
>>> virtualised interrupt delivery, not register acceleration.
>> To me this connection you suggest looks entirely unobvious, so would
>> you mind expanding as to why you're thinking so? The hint to the guest
>> here is related to how it would best access certain registers (aiui),
>> which to me looks orthogonal to how interrupt delivery works.
> 
> I refer you again to the diagram.  (For everyone else on xen-devel, I
> put this together when fixing XSA-412 because Intel's APIC acceleration
> controls are entirely unintuitive.)
> 
> It is "virtual interrupt delivery" which controls EOI/ICR acceleration,
> and not "apic register virtualisation".  There's a decade worth of
> hardware where this logic is an anti-optimsiation, by telling windows to
> use a VMExit-ing mechanism even when microcode would have avoided the
> VMExit.

Okay, I agree that I was mislead by the name of the control. Together
with Paul (re)clarifying what (virtual) MSRs the CPUID hint is about
I agree that it wants to be "virtual interrupt delivery" alone which
is checked here. Which in turn makes this a change orthogonal to the
other APIC assist work.

Jan

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Andrew Cooper 1 year, 5 months ago
On 11/11/2022 17:35, Andrew Cooper wrote:
> On 11/11/2022 07:45, Jan Beulich wrote:
>> On 10.11.2022 23:47, Andrew Cooper wrote:
>>> On 04/11/2022 16:18, Roger Pau Monne wrote:
>>>> --- a/xen/arch/x86/hvm/viridian/viridian.c
>>>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
>>>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>>>>          res->a = CPUID4A_RELAX_TIMER_INT;
>>>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>>>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
>>>> -        if ( !cpu_has_vmx_apic_reg_virt )
>>>> +        if ( !has_assisted_xapic(d) )
>>>>              res->a |= CPUID4A_MSR_BASED_APIC;
>>> This check is broken before and after.  It needs to be keyed on
>>> virtualised interrupt delivery, not register acceleration.
>> To me this connection you suggest looks entirely unobvious, so would
>> you mind expanding as to why you're thinking so? The hint to the guest
>> here is related to how it would best access certain registers (aiui),
>> which to me looks orthogonal to how interrupt delivery works.
> I refer you again to the diagram.  (For everyone else on xen-devel, I
> put this together when fixing XSA-412 because Intel's APIC acceleration
> controls are entirely unintuitive.)
>
> It is "virtual interrupt delivery" which controls EOI/ICR acceleration,
> and not "apic register virtualisation".  There's a decade worth of
> hardware where this logic is an anti-optimsiation, by telling windows to
> use a VMExit-ing mechanism even when microcode would have avoided the
> VMExit.
>
> It is not by accident that "virtual interrupt delivery", introduced in
> IvyBridge, is exactly the missing registers (on top of "use TPR Shadow"
> which is even older) to make windows performance less bad.

Sorry, sent too early.

This also firmly highlights why the API/ABI is broken.  It has
successfully fooled all the other maintainers into doing the wrong
thing, even after recently discussing the complexity and subtly in full
as part of the XSA-412 security fix.

~Andrew
Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Roger Pau Monné 1 year, 5 months ago
On Fri, Nov 11, 2022 at 05:47:02PM +0000, Andrew Cooper wrote:
> On 11/11/2022 17:35, Andrew Cooper wrote:
> > On 11/11/2022 07:45, Jan Beulich wrote:
> >> On 10.11.2022 23:47, Andrew Cooper wrote:
> >>> On 04/11/2022 16:18, Roger Pau Monne wrote:
> >>>> --- a/xen/arch/x86/hvm/viridian/viridian.c
> >>>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
> >>>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
> >>>>          res->a = CPUID4A_RELAX_TIMER_INT;
> >>>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
> >>>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> >>>> -        if ( !cpu_has_vmx_apic_reg_virt )
> >>>> +        if ( !has_assisted_xapic(d) )
> >>>>              res->a |= CPUID4A_MSR_BASED_APIC;
> >>> This check is broken before and after.  It needs to be keyed on
> >>> virtualised interrupt delivery, not register acceleration.
> >> To me this connection you suggest looks entirely unobvious, so would
> >> you mind expanding as to why you're thinking so? The hint to the guest
> >> here is related to how it would best access certain registers (aiui),
> >> which to me looks orthogonal to how interrupt delivery works.
> > I refer you again to the diagram.  (For everyone else on xen-devel, I
> > put this together when fixing XSA-412 because Intel's APIC acceleration
> > controls are entirely unintuitive.)
> >
> > It is "virtual interrupt delivery" which controls EOI/ICR acceleration,
> > and not "apic register virtualisation".  There's a decade worth of
> > hardware where this logic is an anti-optimsiation, by telling windows to
> > use a VMExit-ing mechanism even when microcode would have avoided the
> > VMExit.
> >
> > It is not by accident that "virtual interrupt delivery", introduced in
> > IvyBridge, is exactly the missing registers (on top of "use TPR Shadow"
> > which is even older) to make windows performance less bad.
> 
> Sorry, sent too early.
> 
> This also firmly highlights why the API/ABI is broken. 

I'm not seeing how you are making this connection: the context here is
strictly about a Viridian hint which Xen has been wrongly reporting,
but has nothing to do with the APIC assist API/ABI stuff.  It was
wrong before the introduction of APIC assist, and it's also wrong
after.

Also see my other reply - I'm dubious whether this hint is useful for
any Windows version that supports x2APIC (which seems to be >= Windows
Server 2008), because we expose x2APIC to guests regardless of whether
the underlying platform APIC supports such mode.

Thanks, Roger.

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Andrew Cooper 1 year, 5 months ago
On 14/11/2022 13:15, Roger Pau Monne wrote:
> On Fri, Nov 11, 2022 at 05:47:02PM +0000, Andrew Cooper wrote:
>> On 11/11/2022 17:35, Andrew Cooper wrote:
>>> On 11/11/2022 07:45, Jan Beulich wrote:
>>>> On 10.11.2022 23:47, Andrew Cooper wrote:
>>>>> On 04/11/2022 16:18, Roger Pau Monne wrote:
>>>>>> --- a/xen/arch/x86/hvm/viridian/viridian.c
>>>>>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
>>>>>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>>>>>>          res->a = CPUID4A_RELAX_TIMER_INT;
>>>>>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>>>>>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
>>>>>> -        if ( !cpu_has_vmx_apic_reg_virt )
>>>>>> +        if ( !has_assisted_xapic(d) )
>>>>>>              res->a |= CPUID4A_MSR_BASED_APIC;
>>>>> This check is broken before and after.  It needs to be keyed on
>>>>> virtualised interrupt delivery, not register acceleration.
>>>> To me this connection you suggest looks entirely unobvious, so would
>>>> you mind expanding as to why you're thinking so? The hint to the guest
>>>> here is related to how it would best access certain registers (aiui),
>>>> which to me looks orthogonal to how interrupt delivery works.
>>> I refer you again to the diagram.  (For everyone else on xen-devel, I
>>> put this together when fixing XSA-412 because Intel's APIC acceleration
>>> controls are entirely unintuitive.)
>>>
>>> It is "virtual interrupt delivery" which controls EOI/ICR acceleration,
>>> and not "apic register virtualisation".  There's a decade worth of
>>> hardware where this logic is an anti-optimsiation, by telling windows to
>>> use a VMExit-ing mechanism even when microcode would have avoided the
>>> VMExit.
>>>
>>> It is not by accident that "virtual interrupt delivery", introduced in
>>> IvyBridge, is exactly the missing registers (on top of "use TPR Shadow"
>>> which is even older) to make windows performance less bad.
>> Sorry, sent too early.
>>
>> This also firmly highlights why the API/ABI is broken. 
> I'm not seeing how you are making this connection: the context here is
> strictly about a Viridian hint which Xen has been wrongly reporting,
> but has nothing to do with the APIC assist API/ABI stuff.  It was
> wrong before the introduction of APIC assist, and it's also wrong
> after.

And now it has a layer of obfuscation which makes the bug less obvious.

> Also see my other reply - I'm dubious whether this hint is useful for
> any Windows version that supports x2APIC (which seems to be >= Windows
> Server 2008), because we expose x2APIC to guests regardless of whether
> the underlying platform APIC supports such mode.

It's not about whether a version of Windows supports x2APIC.  Its
whether windows turns x2APIC on.

Short of having an emulated IOMMU, or an absence of an IO-APIC (neither
of which we do), guests wont turn x2APIC on.

I know we have the magic hack for IO-APIC with >8 bit destinations, but
that only got enumerated in the Xen leaves (IIRC?), so only Linux can
pick it up.

The hint is still very relevant for any version of windows running in
xAPIC mode which, at a guess, is all of them under Xen.

~Andrew
Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Roger Pau Monné 1 year, 5 months ago
On Mon, Nov 14, 2022 at 03:31:39PM +0000, Andrew Cooper wrote:
> On 14/11/2022 13:15, Roger Pau Monne wrote:
> > On Fri, Nov 11, 2022 at 05:47:02PM +0000, Andrew Cooper wrote:
> >> On 11/11/2022 17:35, Andrew Cooper wrote:
> >>> On 11/11/2022 07:45, Jan Beulich wrote:
> >>>> On 10.11.2022 23:47, Andrew Cooper wrote:
> >>>>> On 04/11/2022 16:18, Roger Pau Monne wrote:
> >>>>>> --- a/xen/arch/x86/hvm/viridian/viridian.c
> >>>>>> +++ b/xen/arch/x86/hvm/viridian/viridian.c
> >>>>>> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
> >>>>>>          res->a = CPUID4A_RELAX_TIMER_INT;
> >>>>>>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
> >>>>>>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> >>>>>> -        if ( !cpu_has_vmx_apic_reg_virt )
> >>>>>> +        if ( !has_assisted_xapic(d) )
> >>>>>>              res->a |= CPUID4A_MSR_BASED_APIC;
> >>>>> This check is broken before and after.  It needs to be keyed on
> >>>>> virtualised interrupt delivery, not register acceleration.
> >>>> To me this connection you suggest looks entirely unobvious, so would
> >>>> you mind expanding as to why you're thinking so? The hint to the guest
> >>>> here is related to how it would best access certain registers (aiui),
> >>>> which to me looks orthogonal to how interrupt delivery works.
> >>> I refer you again to the diagram.  (For everyone else on xen-devel, I
> >>> put this together when fixing XSA-412 because Intel's APIC acceleration
> >>> controls are entirely unintuitive.)
> >>>
> >>> It is "virtual interrupt delivery" which controls EOI/ICR acceleration,
> >>> and not "apic register virtualisation".  There's a decade worth of
> >>> hardware where this logic is an anti-optimsiation, by telling windows to
> >>> use a VMExit-ing mechanism even when microcode would have avoided the
> >>> VMExit.
> >>>
> >>> It is not by accident that "virtual interrupt delivery", introduced in
> >>> IvyBridge, is exactly the missing registers (on top of "use TPR Shadow"
> >>> which is even older) to make windows performance less bad.
> >> Sorry, sent too early.
> >>
> >> This also firmly highlights why the API/ABI is broken. 
> > I'm not seeing how you are making this connection: the context here is
> > strictly about a Viridian hint which Xen has been wrongly reporting,
> > but has nothing to do with the APIC assist API/ABI stuff.  It was
> > wrong before the introduction of APIC assist, and it's also wrong
> > after.
> 
> And now it has a layer of obfuscation which makes the bug less obvious.

Still, that's not an argument as to why the API/ABI is broken, just a
remark that the current usage here needs fixing (which it does).

> > Also see my other reply - I'm dubious whether this hint is useful for
> > any Windows version that supports x2APIC (which seems to be >= Windows
> > Server 2008), because we expose x2APIC to guests regardless of whether
> > the underlying platform APIC supports such mode.
> 
> It's not about whether a version of Windows supports x2APIC.  Its
> whether windows turns x2APIC on.

From my previous conversation with Paul I got the impression that
Windows would choose x2APIC based solely on the CPUID bit:

https://lore.kernel.org/xen-devel/2c2d8b2b-e607-6d9d-b991-d1c065aac95d@xen.org/

> Short of having an emulated IOMMU, or an absence of an IO-APIC (neither
> of which we do), guests wont turn x2APIC on.
> 
> I know we have the magic hack for IO-APIC with >8 bit destinations, but
> that only got enumerated in the Xen leaves (IIRC?), so only Linux can
> pick it up.

We don't have the hack yet, just the CPUID bit reserved.

Thanks, Roger.

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Jan Beulich 1 year, 5 months ago
On 04.11.2022 17:18, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/viridian/viridian.c
> +++ b/xen/arch/x86/hvm/viridian/viridian.c
> @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
>          res->a = CPUID4A_RELAX_TIMER_INT;
>          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
>              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> -        if ( !cpu_has_vmx_apic_reg_virt )
> +        if ( !has_assisted_xapic(d) )
>              res->a |= CPUID4A_MSR_BASED_APIC;

Isn't this too restrictive when considering x2APIC? IOW is there anything
wrong with leaving this as is?

> @@ -3432,6 +3436,10 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
>                  vmx_set_msr_intercept(v, MSR_X2APIC_PPR, VMX_MSR_R);
>                  vmx_set_msr_intercept(v, MSR_X2APIC_TMICT, VMX_MSR_R);
>                  vmx_set_msr_intercept(v, MSR_X2APIC_TMCCT, VMX_MSR_R);
> +
> +                v->arch.hvm.vmx.secondary_exec_control |=
> +                    SECONDARY_EXEC_APIC_REGISTER_VIRT;
> +
>              }

Nit: stray trailing blank line inside the block.

Everything else looks plausible to me, but from prior discussion I
wonder whether the result isn't still going to be too coarse grained
for Andrew's taste.

Jan
RE: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Henry Wang 1 year, 5 months ago
Hi Andrew,

> Subject: Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the
> APIC assist options
> 
> 
> Everything else looks plausible to me, but from prior discussion I
> wonder whether the result isn't still going to be too coarse grained
> for Andrew's taste.

If you have time, would you mind providing any feedback on this patch so
we can proceed based on your feedback? Thanks very much!

Kind regards,
Henry

> 
> Jan
Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Roger Pau Monné 1 year, 5 months ago
On Mon, Nov 07, 2022 at 05:58:04PM +0100, Jan Beulich wrote:
> On 04.11.2022 17:18, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/viridian/viridian.c
> > +++ b/xen/arch/x86/hvm/viridian/viridian.c
> > @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, uint32_t leaf,
> >          res->a = CPUID4A_RELAX_TIMER_INT;
> >          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
> >              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> > -        if ( !cpu_has_vmx_apic_reg_virt )
> > +        if ( !has_assisted_xapic(d) )
> >              res->a |= CPUID4A_MSR_BASED_APIC;
> 
> Isn't this too restrictive when considering x2APIC? IOW is there anything
> wrong with leaving this as is?

Using cpu_has_vmx_apic_reg_virt won't be correct, as a domain can have
it disabled now after this change.

When using x2APIC accesses will already be done using MSRs, so the
hint is not useful in that mode.

> > @@ -3432,6 +3436,10 @@ void vmx_vlapic_msr_changed(struct vcpu *v)
> >                  vmx_set_msr_intercept(v, MSR_X2APIC_PPR, VMX_MSR_R);
> >                  vmx_set_msr_intercept(v, MSR_X2APIC_TMICT, VMX_MSR_R);
> >                  vmx_set_msr_intercept(v, MSR_X2APIC_TMCCT, VMX_MSR_R);
> > +
> > +                v->arch.hvm.vmx.secondary_exec_control |=
> > +                    SECONDARY_EXEC_APIC_REGISTER_VIRT;
> > +
> >              }
> 
> Nit: stray trailing blank line inside the block.

Oh, thanks.  I will wait for Andrews feedback then, I think the extra
blank can likely be removed at commit if we agree this is OK.

> Everything else looks plausible to me, but from prior discussion I
> wonder whether the result isn't still going to be too coarse grained
> for Andrew's taste.

Ack, thanks, I think this is the best that we can do given the status
of the release, but would likely need to be quick or else it's gonna
be too late.

Roger.
Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Paul Durrant 1 year, 5 months ago
On 04/11/2022 16:18, Roger Pau Monne wrote:
> The current reporting of the hardware assisted APIC options is done by
> checking "virtualize APIC accesses" which is not very helpful, as that
> feature doesn't avoid a vmexit, instead it does provide some help in
> order to detect APIC MMIO accesses in vmexit processing.
> 
> Repurpose the current reporting of xAPIC assistance to instead report
> such feature as present when there's support for "TPR shadow" and
> "APIC register virtualization" because in that case some xAPIC MMIO
> register accesses are handled directly by the hardware, without
> requiring a vmexit.
> 
> For symetry also change assisted x2APIC reporting to require
> "virtualize x2APIC mode" and "APIC register virtualization", dropping
> the option to also be reported when "virtual interrupt delivery" is
> available.  Presence of the "virtual interrupt delivery" feature will
> be reported using a different option.
> 
> Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware virtualized APIC')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> I find the logic in vmx_vlapic_msr_changed() hard to follow, but I
> don't want to rewrite the function logic at this point.
> ---
> Changes since v1:
>   - Fix Viridian MSR tip conditions.

Reviewed-by: Paul Durrant <paul@xen.org>


RE: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options
Posted by Henry Wang 1 year, 5 months ago
Hi Roger,

> Subject: Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the
> APIC assist options
> 
> On 04/11/2022 16:18, Roger Pau Monne wrote:
> > The current reporting of the hardware assisted APIC options is done by
> > checking "virtualize APIC accesses" which is not very helpful, as that
> > feature doesn't avoid a vmexit, instead it does provide some help in
> > order to detect APIC MMIO accesses in vmexit processing.
> >
> > Repurpose the current reporting of xAPIC assistance to instead report
> > such feature as present when there's support for "TPR shadow" and
> > "APIC register virtualization" because in that case some xAPIC MMIO
> > register accesses are handled directly by the hardware, without
> > requiring a vmexit.
> >
> > For symetry also change assisted x2APIC reporting to require
> > "virtualize x2APIC mode" and "APIC register virtualization", dropping
> > the option to also be reported when "virtual interrupt delivery" is
> > available.  Presence of the "virtual interrupt delivery" feature will
> > be reported using a different option.
> >
> > Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware
> virtualized APIC')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > Changes since v1:
> >   - Fix Viridian MSR tip conditions.
> 
> Reviewed-by: Paul Durrant <paul@xen.org>

Release-acked-by: Henry Wang <Henry.Wang@arm.com>

Kind regards,
Henry