[Xen-devel] [PATCH v4] x86: do not enable global pages when virtualized on AMD hardware

Roger Pau Monne posted 1 patch 4 years, 4 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20191209173757.46833-1-roger.pau@citrix.com
docs/misc/xen-command-line.pandoc | 13 +++++++++++++
xen/arch/x86/pv/domain.c          | 15 ++++++++++++++-
2 files changed, 27 insertions(+), 1 deletion(-)
[Xen-devel] [PATCH v4] x86: do not enable global pages when virtualized on AMD hardware
Posted by Roger Pau Monne 4 years, 4 months ago
When using global pages a full tlb flush can only be performed by
toggling the PGE bit in CR4, which is usually quite expensive in terms
of performance when running virtualized. This is specially relevant on
AMD hardware, which doesn't have the ability to do selective CR4
trapping, but can also be relevant on Intel if the underlying
hypervisor also traps accesses to the PGE CR4 bit.

In order to avoid this performance penalty, do not use global pages
when running virtualized on AMD hardware. A command line option
'global-pages' is provided in order to allow the user to select
whether global pages will be enabled for PV guests.

The above figures are from a PV shim running on AMD hardware with
32 vCPUs:

PGE enabled, x2APIC mode:

(XEN) Global lock flush_lock: addr=ffff82d0804b01c0, lockval=1adb1adb, not locked
(XEN)   lock:1841883(1375128998543), block:1658716(10193054890781)

Average lock time:   746588ns
Average block time: 6145147ns

PGE disabled, x2APIC mode:

(XEN) Global lock flush_lock: addr=ffff82d0804af1c0, lockval=a8bfa8bf, not locked
(XEN)   lock:2730175(657505389886), block:2039716(2963768247738)

Average lock time:   240829ns
Average block time: 1453029ns

As seen from the above figures the lock and block time of the flush
lock is reduced to approximately 1/3 of the original value.

Note that XEN_MINIMAL_CR4 and mmu_cr4_features are not modified, and
thus global pages are left enabled for the hypervisor. This is not an
issue because the code to switch the control registers (cr3 and cr4)
already takes into account such situation and performs the necessary
flushes. The same already happens when using XPTI or PCIDE, as the
guest cr4 doesn't have global pages enabled in that case either.

Also note that the suspend and resume code is correct in writing
mmu_cr4_features into cr4 on resume, since that's the cr4 used by the
idle vCPU which is the context used by the suspend and resume routine.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Expand commit message.

Changes since v2:
 - Set the default value at init if not specified by the user.
 - Use int8_t and read_mostly for opt_global_pages.

Changes since v1:
 - Provide command line option to enable/disable PGE.
 - Only disable PGE on AMD hardware when virtualized.
 - Document the global-pages option.
---
 docs/misc/xen-command-line.pandoc | 13 +++++++++++++
 xen/arch/x86/pv/domain.c          | 15 ++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index d9495ef6b9..7be30f2766 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1087,6 +1087,19 @@ value settable via Xen tools.
 
 Dom0 is using this value for sizing its maptrack table.
 
+### global-pages (x86)
+> `= <boolean>`
+
+> Default: `true` unless running virtualized on AMD hardware
+
+Set whether the PGE bit in CR4 will be enabled for PV guests. This controls the
+usage of global pages, and thus the need to perform tlb flushes by writing to
+CR4.
+
+Note it's disabled by default when running virtualized on AMD hardware since
+AMD SVM doesn't support selective trapping of CR4, so global pages are not
+enabled in order to reduce the overhead of tlb flushes.
+
 ### guest_loglvl
 > `= <level>[/<rate-limited level>]` where level is `none | error | warning | info | debug | all`
 
diff --git a/xen/arch/x86/pv/domain.c b/xen/arch/x86/pv/domain.c
index 4b6f48dea2..8ff733f56b 100644
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -118,6 +118,19 @@ unsigned long pv_fixup_guest_cr4(const struct vcpu *v, unsigned long cr4)
             (mmu_cr4_features & PV_CR4_GUEST_VISIBLE_MASK));
 }
 
+static int8_t __read_mostly opt_global_pages = -1;
+boolean_runtime_param("global-pages", opt_global_pages);
+
+static int __init pge_init(void)
+{
+    if ( opt_global_pages == -1 )
+        opt_global_pages = !cpu_has_hypervisor ||
+                           boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
+
+    return 0;
+}
+__initcall(pge_init);
+
 unsigned long pv_make_cr4(const struct vcpu *v)
 {
     const struct domain *d = v->domain;
@@ -130,7 +143,7 @@ unsigned long pv_make_cr4(const struct vcpu *v)
      */
     if ( d->arch.pv.pcid )
         cr4 |= X86_CR4_PCIDE;
-    else if ( !d->arch.pv.xpti )
+    else if ( !d->arch.pv.xpti && opt_global_pages )
         cr4 |= X86_CR4_PGE;
 
     /*
-- 
2.24.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4] x86: do not enable global pages when virtualized on AMD hardware
Posted by Jan Beulich 4 years, 4 months ago
On 09.12.2019 18:37, Roger Pau Monne wrote:
> When using global pages a full tlb flush can only be performed by
> toggling the PGE bit in CR4, which is usually quite expensive in terms
> of performance when running virtualized. This is specially relevant on
> AMD hardware, which doesn't have the ability to do selective CR4
> trapping, but can also be relevant on Intel if the underlying
> hypervisor also traps accesses to the PGE CR4 bit.
> 
> In order to avoid this performance penalty, do not use global pages
> when running virtualized on AMD hardware. A command line option
> 'global-pages' is provided in order to allow the user to select
> whether global pages will be enabled for PV guests.
> 
> The above figures are from a PV shim running on AMD hardware with
> 32 vCPUs:
> 
> PGE enabled, x2APIC mode:
> 
> (XEN) Global lock flush_lock: addr=ffff82d0804b01c0, lockval=1adb1adb, not locked
> (XEN)   lock:1841883(1375128998543), block:1658716(10193054890781)
> 
> Average lock time:   746588ns
> Average block time: 6145147ns
> 
> PGE disabled, x2APIC mode:
> 
> (XEN) Global lock flush_lock: addr=ffff82d0804af1c0, lockval=a8bfa8bf, not locked
> (XEN)   lock:2730175(657505389886), block:2039716(2963768247738)
> 
> Average lock time:   240829ns
> Average block time: 1453029ns
> 
> As seen from the above figures the lock and block time of the flush
> lock is reduced to approximately 1/3 of the original value.
> 
> Note that XEN_MINIMAL_CR4 and mmu_cr4_features are not modified, and
> thus global pages are left enabled for the hypervisor. This is not an
> issue because the code to switch the control registers (cr3 and cr4)
> already takes into account such situation and performs the necessary
> flushes. The same already happens when using XPTI or PCIDE, as the
> guest cr4 doesn't have global pages enabled in that case either.
> 
> Also note that the suspend and resume code is correct in writing
> mmu_cr4_features into cr4 on resume, since that's the cr4 used by the
> idle vCPU which is the context used by the suspend and resume routine.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4] x86: do not enable global pages when virtualized on AMD hardware
Posted by Jan Beulich 4 years, 4 months ago
On 09.12.2019 18:37, Roger Pau Monne wrote:
> --- a/xen/arch/x86/pv/domain.c
> +++ b/xen/arch/x86/pv/domain.c
> @@ -118,6 +118,19 @@ unsigned long pv_fixup_guest_cr4(const struct vcpu *v, unsigned long cr4)
>              (mmu_cr4_features & PV_CR4_GUEST_VISIBLE_MASK));
>  }
>  
> +static int8_t __read_mostly opt_global_pages = -1;
> +boolean_runtime_param("global-pages", opt_global_pages);
> +
> +static int __init pge_init(void)
> +{
> +    if ( opt_global_pages == -1 )
> +        opt_global_pages = !cpu_has_hypervisor ||
> +                           boot_cpu_data.x86_vendor != X86_VENDOR_AMD;

I was about to commit this when I noticed - what about Hygon here?
I'm happy to make the adjustment while committing, but I don't
want to do so without your consent.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4] x86: do not enable global pages when virtualized on AMD hardware
Posted by Roger Pau Monné 4 years, 4 months ago
On Tue, Dec 10, 2019 at 11:11:18AM +0100, Jan Beulich wrote:
> On 09.12.2019 18:37, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/pv/domain.c
> > +++ b/xen/arch/x86/pv/domain.c
> > @@ -118,6 +118,19 @@ unsigned long pv_fixup_guest_cr4(const struct vcpu *v, unsigned long cr4)
> >              (mmu_cr4_features & PV_CR4_GUEST_VISIBLE_MASK));
> >  }
> >  
> > +static int8_t __read_mostly opt_global_pages = -1;
> > +boolean_runtime_param("global-pages", opt_global_pages);
> > +
> > +static int __init pge_init(void)
> > +{
> > +    if ( opt_global_pages == -1 )
> > +        opt_global_pages = !cpu_has_hypervisor ||
> > +                           boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
> 
> I was about to commit this when I noticed - what about Hygon here?

Oh the vendor ID is different albeit it's just a clone. Please feel
free to add it at commit.

I also wonder: it might be good to have some kind of macro that
matches both AMD and Hygon (IS_AMD_COMPAT or some such) in order to
avoid this kind of mistakes in the future.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4] x86: do not enable global pages when virtualized on AMD hardware
Posted by Jan Beulich 4 years, 4 months ago
On 10.12.2019 11:18, Roger Pau Monné wrote:
> On Tue, Dec 10, 2019 at 11:11:18AM +0100, Jan Beulich wrote:
>> On 09.12.2019 18:37, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/pv/domain.c
>>> +++ b/xen/arch/x86/pv/domain.c
>>> @@ -118,6 +118,19 @@ unsigned long pv_fixup_guest_cr4(const struct vcpu *v, unsigned long cr4)
>>>              (mmu_cr4_features & PV_CR4_GUEST_VISIBLE_MASK));
>>>  }
>>>  
>>> +static int8_t __read_mostly opt_global_pages = -1;
>>> +boolean_runtime_param("global-pages", opt_global_pages);
>>> +
>>> +static int __init pge_init(void)
>>> +{
>>> +    if ( opt_global_pages == -1 )
>>> +        opt_global_pages = !cpu_has_hypervisor ||
>>> +                           boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
>>
>> I was about to commit this when I noticed - what about Hygon here?
> 
> Oh the vendor ID is different albeit it's just a clone. Please feel
> free to add it at commit.
> 
> I also wonder: it might be good to have some kind of macro that
> matches both AMD and Hygon (IS_AMD_COMPAT or some such) in order to
> avoid this kind of mistakes in the future.

Because it's a clone, down the road this may be more risky. Here
what we're really interested in is SVM, just that we can't check
the feature flag (because it may not be exposed to us).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel