[PATCH] x86/svm: Use flush-by-asid when available

Andrew Cooper posted 1 patch 3 years, 10 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20200505182539.12247-1-andrew.cooper3@citrix.com
xen/arch/x86/hvm/svm/asid.c       | 9 ++++++---
xen/include/asm-x86/hvm/svm/svm.h | 1 +
2 files changed, 7 insertions(+), 3 deletions(-)
[PATCH] x86/svm: Use flush-by-asid when available
Posted by Andrew Cooper 3 years, 10 months ago
AMD Fam15h processors introduced the flush-by-asid feature, for more fine
grain flushing purposes.

Flushing everything including ASID 0 (i.e. Xen context) is an an unnecesserily
large hammer, and never necessary in the context of guest TLBs needing
invalidating.

When available, use TLB_CTRL_FLUSH_ASID in preference to TLB_CTRL_FLUSH_ALL.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>

The APM currently describes tlb_control encoding 1 as "Flush entire
TLB (Should be used only by legacy hypervisors.)".  AMD have agreed that this
is misleading and should say "legacy hardware" instead.

This change does come with a minor observed perf improvement on Fam17h
hardware, of ~0.6s over ~22s for my XTF pagewalk test.  This test will spot
TLB flushing issues, but isn't optimal for spotting the perf increase from
better flushing.  There were no observed differences for Fam15h, but this
could simply mean that the measured code footprint was larger than the TLB on
this CPU.
---
 xen/arch/x86/hvm/svm/asid.c       | 9 ++++++---
 xen/include/asm-x86/hvm/svm/svm.h | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
index 9be90058c7..ab06dd3f3a 100644
--- a/xen/arch/x86/hvm/svm/asid.c
+++ b/xen/arch/x86/hvm/svm/asid.c
@@ -18,6 +18,7 @@
 #include <asm/amd.h>
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/svm/asid.h>
+#include <asm/hvm/svm/svm.h>
 
 void svm_asid_init(const struct cpuinfo_x86 *c)
 {
@@ -47,15 +48,17 @@ void svm_asid_handle_vmrun(void)
     if ( p_asid->asid == 0 )
     {
         vmcb_set_guest_asid(vmcb, 1);
-        /* TODO: investigate using TLB_CTRL_FLUSH_ASID here instead. */
-        vmcb->tlb_control = TLB_CTRL_FLUSH_ALL;
+        vmcb->tlb_control =
+            cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
         return;
     }
 
     if ( vmcb_get_guest_asid(vmcb) != p_asid->asid )
         vmcb_set_guest_asid(vmcb, p_asid->asid);
 
-    vmcb->tlb_control = need_flush ? TLB_CTRL_FLUSH_ALL : TLB_CTRL_NO_FLUSH;
+    vmcb->tlb_control =
+        !need_flush ? TLB_CTRL_NO_FLUSH :
+        cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
 }
 
 /*
diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
index 16a994ec74..cd71402cbb 100644
--- a/xen/include/asm-x86/hvm/svm/svm.h
+++ b/xen/include/asm-x86/hvm/svm/svm.h
@@ -79,6 +79,7 @@ extern u32 svm_feature_flags;
 #define cpu_has_svm_svml      cpu_has_svm_feature(SVM_FEATURE_SVML)
 #define cpu_has_svm_nrips     cpu_has_svm_feature(SVM_FEATURE_NRIPS)
 #define cpu_has_svm_cleanbits cpu_has_svm_feature(SVM_FEATURE_VMCBCLEAN)
+#define cpu_has_svm_flushbyasid cpu_has_svm_feature(SVM_FEATURE_FLUSHBYASID)
 #define cpu_has_svm_decode    cpu_has_svm_feature(SVM_FEATURE_DECODEASSISTS)
 #define cpu_has_svm_vgif      cpu_has_svm_feature(SVM_FEATURE_VGIF)
 #define cpu_has_pause_filter  cpu_has_svm_feature(SVM_FEATURE_PAUSEFILTER)
-- 
2.11.0


Re: [PATCH] x86/svm: Use flush-by-asid when available
Posted by Jan Beulich 3 years, 10 months ago
On 05.05.2020 20:25, Andrew Cooper wrote:
> AMD Fam15h processors introduced the flush-by-asid feature, for more fine
> grain flushing purposes.
> 
> Flushing everything including ASID 0 (i.e. Xen context) is an an unnecesserily
> large hammer, and never necessary in the context of guest TLBs needing
> invalidating.
> 
> When available, use TLB_CTRL_FLUSH_ASID in preference to TLB_CTRL_FLUSH_ALL.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>


Re: [PATCH] x86/svm: Use flush-by-asid when available
Posted by Roger Pau Monné 3 years, 10 months ago
On Tue, May 05, 2020 at 07:25:39PM +0100, Andrew Cooper wrote:
> AMD Fam15h processors introduced the flush-by-asid feature, for more fine
> grain flushing purposes.
> 
> Flushing everything including ASID 0 (i.e. Xen context) is an an unnecesserily
> large hammer, and never necessary in the context of guest TLBs needing
> invalidating.
> 
> When available, use TLB_CTRL_FLUSH_ASID in preference to TLB_CTRL_FLUSH_ALL.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I should also look into restricting the usage of FLUSH_HVM_ASID_CORE
and instead perform more fine grained per-vCPU flushes when possible,
since FLUSH_HVM_ASID_CORE resets the pCPU ASID generation forcing a
new ASID to be allocated for all vCPUs running on that pCPU.

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Wei Liu <wl@xen.org>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> 
> The APM currently describes tlb_control encoding 1 as "Flush entire
> TLB (Should be used only by legacy hypervisors.)".  AMD have agreed that this
> is misleading and should say "legacy hardware" instead.

AFAICT using TLB_CTRL_FLUSH_ASID on hardware not supporting the
feature has not been found to be safe? Ie: TLB_CTRL_FLUSH_ALL is a
subset of TLB_CTRL_FLUSH_ASID from a bitmap PoV, so if those bits
where ignored on older hardware it should be safe to unconditionally
use it.

> This change does come with a minor observed perf improvement on Fam17h
> hardware, of ~0.6s over ~22s for my XTF pagewalk test.  This test will spot
> TLB flushing issues, but isn't optimal for spotting the perf increase from
> better flushing.  There were no observed differences for Fam15h, but this
> could simply mean that the measured code footprint was larger than the TLB on
> this CPU.
> ---
>  xen/arch/x86/hvm/svm/asid.c       | 9 ++++++---
>  xen/include/asm-x86/hvm/svm/svm.h | 1 +
>  2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
> index 9be90058c7..ab06dd3f3a 100644
> --- a/xen/arch/x86/hvm/svm/asid.c
> +++ b/xen/arch/x86/hvm/svm/asid.c
> @@ -18,6 +18,7 @@
>  #include <asm/amd.h>
>  #include <asm/hvm/nestedhvm.h>
>  #include <asm/hvm/svm/asid.h>
> +#include <asm/hvm/svm/svm.h>
>  
>  void svm_asid_init(const struct cpuinfo_x86 *c)
>  {
> @@ -47,15 +48,17 @@ void svm_asid_handle_vmrun(void)
>      if ( p_asid->asid == 0 )
>      {
>          vmcb_set_guest_asid(vmcb, 1);
> -        /* TODO: investigate using TLB_CTRL_FLUSH_ASID here instead. */
> -        vmcb->tlb_control = TLB_CTRL_FLUSH_ALL;
> +        vmcb->tlb_control =
> +            cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
>          return;
>      }
>  
>      if ( vmcb_get_guest_asid(vmcb) != p_asid->asid )
>          vmcb_set_guest_asid(vmcb, p_asid->asid);
>  
> -    vmcb->tlb_control = need_flush ? TLB_CTRL_FLUSH_ALL : TLB_CTRL_NO_FLUSH;
> +    vmcb->tlb_control =
> +        !need_flush ? TLB_CTRL_NO_FLUSH :
> +        cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;

Since this code structure is used in two places I would consider
locally introducing something like:

#define TLB_CTRL_FLUSH_CMD (cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID \
                                                    : TLB_CTRL_FLUSH_ALL)

To abstract it away.

Thanks, Roger.

Re: [PATCH] x86/svm: Use flush-by-asid when available
Posted by Andrew Cooper 3 years, 10 months ago
On 06/05/2020 08:08, Roger Pau Monné wrote:
> On Tue, May 05, 2020 at 07:25:39PM +0100, Andrew Cooper wrote:
>> AMD Fam15h processors introduced the flush-by-asid feature, for more fine
>> grain flushing purposes.
>>
>> Flushing everything including ASID 0 (i.e. Xen context) is an an unnecesserily
>> large hammer, and never necessary in the context of guest TLBs needing
>> invalidating.
>>
>> When available, use TLB_CTRL_FLUSH_ASID in preference to TLB_CTRL_FLUSH_ALL.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>
> I should also look into restricting the usage of FLUSH_HVM_ASID_CORE
> and instead perform more fine grained per-vCPU flushes when possible,
> since FLUSH_HVM_ASID_CORE resets the pCPU ASID generation forcing a
> new ASID to be allocated for all vCPUs running on that pCPU.
>
>> ---
>> CC: Jan Beulich <JBeulich@suse.com>
>> CC: Wei Liu <wl@xen.org>
>> CC: Roger Pau Monné <roger.pau@citrix.com>
>>
>> The APM currently describes tlb_control encoding 1 as "Flush entire
>> TLB (Should be used only by legacy hypervisors.)".  AMD have agreed that this
>> is misleading and should say "legacy hardware" instead.
> AFAICT using TLB_CTRL_FLUSH_ASID on hardware not supporting the
> feature has not been found to be safe? Ie: TLB_CTRL_FLUSH_ALL is a
> subset of TLB_CTRL_FLUSH_ASID from a bitmap PoV, so if those bits
> where ignored on older hardware it should be safe to unconditionally
> use it.

So as far as I can tell, TLB_CTRL_FLUSH_ASID is safe to use on older
hardware, but I was told in no uncertain terms by an AMD architect that
we can't rely on that.

Hence this patch not being s/TLB_CTRL_FLUSH_ALL/TLB_CTRL_FLUSH_ALL/ in
asid.c

>
>> This change does come with a minor observed perf improvement on Fam17h
>> hardware, of ~0.6s over ~22s for my XTF pagewalk test.  This test will spot
>> TLB flushing issues, but isn't optimal for spotting the perf increase from
>> better flushing.  There were no observed differences for Fam15h, but this
>> could simply mean that the measured code footprint was larger than the TLB on
>> this CPU.
>> ---
>>  xen/arch/x86/hvm/svm/asid.c       | 9 ++++++---
>>  xen/include/asm-x86/hvm/svm/svm.h | 1 +
>>  2 files changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
>> index 9be90058c7..ab06dd3f3a 100644
>> --- a/xen/arch/x86/hvm/svm/asid.c
>> +++ b/xen/arch/x86/hvm/svm/asid.c
>> @@ -18,6 +18,7 @@
>>  #include <asm/amd.h>
>>  #include <asm/hvm/nestedhvm.h>
>>  #include <asm/hvm/svm/asid.h>
>> +#include <asm/hvm/svm/svm.h>
>>  
>>  void svm_asid_init(const struct cpuinfo_x86 *c)
>>  {
>> @@ -47,15 +48,17 @@ void svm_asid_handle_vmrun(void)
>>      if ( p_asid->asid == 0 )
>>      {
>>          vmcb_set_guest_asid(vmcb, 1);
>> -        /* TODO: investigate using TLB_CTRL_FLUSH_ASID here instead. */
>> -        vmcb->tlb_control = TLB_CTRL_FLUSH_ALL;
>> +        vmcb->tlb_control =
>> +            cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
>>          return;
>>      }
>>  
>>      if ( vmcb_get_guest_asid(vmcb) != p_asid->asid )
>>          vmcb_set_guest_asid(vmcb, p_asid->asid);
>>  
>> -    vmcb->tlb_control = need_flush ? TLB_CTRL_FLUSH_ALL : TLB_CTRL_NO_FLUSH;
>> +    vmcb->tlb_control =
>> +        !need_flush ? TLB_CTRL_NO_FLUSH :
>> +        cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;
> Since this code structure is used in two places I would consider
> locally introducing something like:
>
> #define TLB_CTRL_FLUSH_CMD (cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID \
>                                                     : TLB_CTRL_FLUSH_ALL)
>
> To abstract it away.

Right, but TLB_CTRL_FLUSH_CMD is easy to confuse as constant in the same
space as TLB_CTRL_FLUSH_*, and the logic isn't going to survive a
conversion to a finer grain flushing in exactly this form.

~Andrew