[PATCH v2] x86/ept: limit calls to memory_type_changed()

Roger Pau Monne posted 1 patch 1 year, 7 months ago
Failed in applying to current master (apply log)
xen/arch/arm/gic-v2.c            |  2 +-
xen/arch/arm/gic-v3.c            |  2 +-
xen/arch/arm/gic.c               |  2 +-
xen/arch/arm/include/asm/gic.h   |  4 ++--
xen/arch/x86/domctl.c            |  4 ----
xen/arch/x86/include/asm/iocap.h | 33 +++++++++++++++++++++++----
xen/arch/x86/mm/p2m-ept.c        |  4 ++++
xen/common/domctl.c              |  4 ----
xen/include/xen/iocap.h          | 38 ++++++++++++++++++++++++++++----
9 files changed, 72 insertions(+), 21 deletions(-)
[PATCH v2] x86/ept: limit calls to memory_type_changed()
Posted by Roger Pau Monne 1 year, 7 months ago
memory_type_changed() is currently only implemented for Intel EPT, and
results in the invalidation of EMT attributes on all the entries in
the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
when the guest tries to access any gfns for the first time, which
results in the recalculation of the EMT for the accessed page.  The
vmexit and the recalculations are expensive, and as such should be
avoided when possible.

Remove the call to memory_type_changed() from
XEN_DOMCTL_memory_mapping: there are no modifications of the
iomem_caps ranges anymore that could alter the return of
cache_flush_permitted() from that domctl.

Encapsulate calls to memory_type_changed() resulting from changes to
the domain iomem_caps or ioport_caps ranges in the helpers themselves
(io{ports,mem}_{permit,deny}_access()), and add a note in
epte_get_entry_emt() to remind that changes to the logic there likely
need to be propagaed to the IO capabilities helpers.

Note changes to the IO ports or memory ranges are not very common
during guest runtime, but Citrix Hypervisor has an use case for them
related to device passthrough.

Some Arm callers (implementations of the iomem_deny_access function
pointer field in gic_hw_operations struct) pass a const domain pointer
to iomem_deny_access(), which is questionable.  It works because
the rangeset is allocated separately from the domain struct, but
conceptually seems wrong to me, as passing a const pointer would imply
no changes to the domain data, and denying iomem accesses does change
the domain data.  Fix this by removing the const attribute from the
affected functions and call chain.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v1:
 - Place the calls to memory_type_changed() inside the
   io{ports,mem}_{permit,deny}_access() helpers.
---
 xen/arch/arm/gic-v2.c            |  2 +-
 xen/arch/arm/gic-v3.c            |  2 +-
 xen/arch/arm/gic.c               |  2 +-
 xen/arch/arm/include/asm/gic.h   |  4 ++--
 xen/arch/x86/domctl.c            |  4 ----
 xen/arch/x86/include/asm/iocap.h | 33 +++++++++++++++++++++++----
 xen/arch/x86/mm/p2m-ept.c        |  4 ++++
 xen/common/domctl.c              |  4 ----
 xen/include/xen/iocap.h          | 38 ++++++++++++++++++++++++++++----
 9 files changed, 72 insertions(+), 21 deletions(-)

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index bd773bcc67..ae5bd8e95f 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1083,7 +1083,7 @@ static void __init gicv2_dt_init(void)
     gicv2_extension_dt_init(node);
 }
 
-static int gicv2_iomem_deny_access(const struct domain *d)
+static int gicv2_iomem_deny_access(struct domain *d)
 {
     int rc;
     unsigned long mfn, nr;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 64b36cec25..018fa0dfa0 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1424,7 +1424,7 @@ static void __init gicv3_dt_init(void)
                               &vbase, &vsize);
 }
 
-static int gicv3_iomem_deny_access(const struct domain *d)
+static int gicv3_iomem_deny_access(struct domain *d)
 {
     int rc, i;
     unsigned long mfn, nr;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 3b0331b538..9b82325442 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -462,7 +462,7 @@ unsigned long gic_get_hwdom_madt_size(const struct domain *d)
 }
 #endif
 
-int gic_iomem_deny_access(const struct domain *d)
+int gic_iomem_deny_access(struct domain *d)
 {
     return gic_hw_ops->iomem_deny_access(d);
 }
diff --git a/xen/arch/arm/include/asm/gic.h b/xen/arch/arm/include/asm/gic.h
index 3692fae393..76e3fa5dc4 100644
--- a/xen/arch/arm/include/asm/gic.h
+++ b/xen/arch/arm/include/asm/gic.h
@@ -392,7 +392,7 @@ struct gic_hw_operations {
     /* Map extra GIC MMIO, irqs and other hw stuffs to the hardware domain. */
     int (*map_hwdom_extra_mappings)(struct domain *d);
     /* Deny access to GIC regions */
-    int (*iomem_deny_access)(const struct domain *d);
+    int (*iomem_deny_access)(struct domain *d);
     /* Handle LPIs, which require special handling */
     void (*do_LPI)(unsigned int lpi);
 };
@@ -449,7 +449,7 @@ unsigned long gic_get_hwdom_madt_size(const struct domain *d);
 #endif
 
 int gic_map_hwdom_extra_mappings(struct domain *d);
-int gic_iomem_deny_access(const struct domain *d);
+int gic_iomem_deny_access(struct domain *d);
 
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 020df615bd..e9bfbc57a7 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -232,8 +232,6 @@ long arch_do_domctl(
             ret = ioports_permit_access(d, fp, fp + np - 1);
         else
             ret = ioports_deny_access(d, fp, fp + np - 1);
-        if ( !ret )
-            memory_type_changed(d);
         break;
     }
 
@@ -666,8 +664,6 @@ long arch_do_domctl(
                        "ioport_map: error %ld denying dom%d access to [%x,%x]\n",
                        ret, d->domain_id, fmp, fmp + np - 1);
         }
-        if ( !ret )
-            memory_type_changed(d);
         break;
     }
 
diff --git a/xen/arch/x86/include/asm/iocap.h b/xen/arch/x86/include/asm/iocap.h
index eee47228d4..ce83c3d8a4 100644
--- a/xen/arch/x86/include/asm/iocap.h
+++ b/xen/arch/x86/include/asm/iocap.h
@@ -7,10 +7,11 @@
 #ifndef __X86_IOCAP_H__
 #define __X86_IOCAP_H__
 
-#define ioports_permit_access(d, s, e)                  \
-    rangeset_add_range((d)->arch.ioport_caps, s, e)
-#define ioports_deny_access(d, s, e)                    \
-    rangeset_remove_range((d)->arch.ioport_caps, s, e)
+#include <xen/sched.h>
+#include <xen/rangeset.h>
+
+#include <asm/p2m.h>
+
 #define ioports_access_permitted(d, s, e)               \
     rangeset_contains_range((d)->arch.ioport_caps, s, e)
 
@@ -18,4 +19,28 @@
     (!rangeset_is_empty((d)->iomem_caps) ||             \
      !rangeset_is_empty((d)->arch.ioport_caps))
 
+static inline int ioports_permit_access(struct domain *d, unsigned long s,
+                                        unsigned long e)
+{
+    bool flush = cache_flush_permitted(d);
+    int ret = rangeset_add_range(d->arch.ioport_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !flush )
+        /* See comment in iomem_permit_access(). */
+        memory_type_changed(d);
+
+    return ret;
+}
+static inline int ioports_deny_access(struct domain *d, unsigned long s,
+                                      unsigned long e)
+{
+    int ret = rangeset_remove_range(d->arch.ioport_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
+        /* See comment in iomem_deny_access(). */
+        memory_type_changed(d);
+
+    return ret;
+}
+
 #endif /* __X86_IOCAP_H__ */
diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
index b4919bad51..d61d66c20e 100644
--- a/xen/arch/x86/mm/p2m-ept.c
+++ b/xen/arch/x86/mm/p2m-ept.c
@@ -518,6 +518,10 @@ int epte_get_entry_emt(struct domain *d, gfn_t gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
+    /*
+     * Conditional must be kept in sync with the code in
+     * {iomem,ioports}_{permit,deny}_access().
+     */
     if ( type != p2m_mmio_direct && !is_iommu_enabled(d) &&
          !cache_flush_permitted(d) )
     {
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 452266710a..69fb9abd34 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -716,8 +716,6 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
             ret = iomem_permit_access(d, mfn, mfn + nr_mfns - 1);
         else
             ret = iomem_deny_access(d, mfn, mfn + nr_mfns - 1);
-        if ( !ret )
-            memory_type_changed(d);
         break;
     }
 
@@ -778,8 +776,6 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
                        "memory_map: error %ld removing dom%d access to [%lx,%lx]\n",
                        ret, d->domain_id, mfn, mfn_end);
         }
-        /* Do this unconditionally to cover errors on above failure paths. */
-        memory_type_changed(d);
         break;
     }
 
diff --git a/xen/include/xen/iocap.h b/xen/include/xen/iocap.h
index 1ca3858fc0..0ca4c9745f 100644
--- a/xen/include/xen/iocap.h
+++ b/xen/include/xen/iocap.h
@@ -7,13 +7,43 @@
 #ifndef __XEN_IOCAP_H__
 #define __XEN_IOCAP_H__
 
+#include <xen/sched.h>
 #include <xen/rangeset.h>
 #include <asm/iocap.h>
+#include <asm/p2m.h>
+
+static inline int iomem_permit_access(struct domain *d, unsigned long s,
+                                      unsigned long e)
+{
+    bool flush = cache_flush_permitted(d);
+    int ret = rangeset_add_range(d->iomem_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !flush )
+        /*
+         * Only flush if the range(s) are empty before this addition and
+         * IOMMU is not enabled for the domain, otherwise it makes no
+         * difference for effective cache attribute calculation purposes.
+         */
+        memory_type_changed(d);
+
+    return ret;
+}
+static inline int iomem_deny_access(struct domain *d, unsigned long s,
+                                    unsigned long e)
+{
+    int ret = rangeset_remove_range(d->iomem_caps, s, e);
+
+    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
+        /*
+         * Only flush if the range(s) are empty after this removal and
+         * IOMMU is not enabled for the domain, otherwise it makes no
+         * difference for effective cache attribute calculation purposes.
+         */
+        memory_type_changed(d);
+
+    return ret;
+}
 
-#define iomem_permit_access(d, s, e)                    \
-    rangeset_add_range((d)->iomem_caps, s, e)
-#define iomem_deny_access(d, s, e)                      \
-    rangeset_remove_range((d)->iomem_caps, s, e)
 #define iomem_access_permitted(d, s, e)                 \
     rangeset_contains_range((d)->iomem_caps, s, e)
 
-- 
2.37.3


Re: [PATCH v2] x86/ept: limit calls to memory_type_changed()
Posted by Jan Beulich 1 year, 7 months ago
On 27.09.2022 17:39, Roger Pau Monne wrote:
> memory_type_changed() is currently only implemented for Intel EPT, and
> results in the invalidation of EMT attributes on all the entries in
> the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
> when the guest tries to access any gfns for the first time, which
> results in the recalculation of the EMT for the accessed page.  The
> vmexit and the recalculations are expensive, and as such should be
> avoided when possible.
> 
> Remove the call to memory_type_changed() from
> XEN_DOMCTL_memory_mapping: there are no modifications of the
> iomem_caps ranges anymore that could alter the return of
> cache_flush_permitted() from that domctl.
> 
> Encapsulate calls to memory_type_changed() resulting from changes to
> the domain iomem_caps or ioport_caps ranges in the helpers themselves
> (io{ports,mem}_{permit,deny}_access()), and add a note in
> epte_get_entry_emt() to remind that changes to the logic there likely
> need to be propagaed to the IO capabilities helpers.
> 
> Note changes to the IO ports or memory ranges are not very common
> during guest runtime, but Citrix Hypervisor has an use case for them
> related to device passthrough.
> 
> Some Arm callers (implementations of the iomem_deny_access function
> pointer field in gic_hw_operations struct) pass a const domain pointer
> to iomem_deny_access(), which is questionable.  It works because
> the rangeset is allocated separately from the domain struct, but
> conceptually seems wrong to me, as passing a const pointer would imply
> no changes to the domain data, and denying iomem accesses does change
> the domain data.  Fix this by removing the const attribute from the
> affected functions and call chain.

Personally I think this adjustment would better be a separate, prereq
change.

> --- a/xen/include/xen/iocap.h
> +++ b/xen/include/xen/iocap.h
> @@ -7,13 +7,43 @@
>  #ifndef __XEN_IOCAP_H__
>  #define __XEN_IOCAP_H__
>  
> +#include <xen/sched.h>
>  #include <xen/rangeset.h>
>  #include <asm/iocap.h>
> +#include <asm/p2m.h>

That's heavy dependencies you're adding. I wonder if the functions
wouldn't better become out-of-line ones (but see also below).

> +static inline int iomem_permit_access(struct domain *d, unsigned long s,
> +                                      unsigned long e)
> +{
> +    bool flush = cache_flush_permitted(d);
> +    int ret = rangeset_add_range(d->iomem_caps, s, e);
> +
> +    if ( !ret && !is_iommu_enabled(d) && !flush )
> +        /*
> +         * Only flush if the range(s) are empty before this addition and
> +         * IOMMU is not enabled for the domain, otherwise it makes no
> +         * difference for effective cache attribute calculation purposes.
> +         */
> +        memory_type_changed(d);
> +
> +    return ret;
> +}
> +static inline int iomem_deny_access(struct domain *d, unsigned long s,
> +                                    unsigned long e)
> +{
> +    int ret = rangeset_remove_range(d->iomem_caps, s, e);
> +
> +    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
> +        /*
> +         * Only flush if the range(s) are empty after this removal and
> +         * IOMMU is not enabled for the domain, otherwise it makes no
> +         * difference for effective cache attribute calculation purposes.
> +         */
> +        memory_type_changed(d);
> +
> +    return ret;
> +}

I'm surprised Arm's memory_type_changed() is an empty out-of-line function.
This means the compiler can't eliminate this code (except when using LTO).
But then cache_flush_permitted() (resolving to rangeset_is_empty()) can't
be eliminated either, even if memory_type_changed() was. While gcc doc
doesn't explicitly say that it may help (the talk about repeated invocations
only), I wonder whether we shouldn't mark rangeset_is_empty() pure. In a
reduced example that does help (once memory_type_changed() is also an
inline function) with gcc12 - no call to rangeset_is_empty() remains.

Jan
Re: [PATCH v2] x86/ept: limit calls to memory_type_changed()
Posted by Roger Pau Monné 1 year, 7 months ago
On Wed, Sep 28, 2022 at 10:01:26AM +0200, Jan Beulich wrote:
> On 27.09.2022 17:39, Roger Pau Monne wrote:
> > memory_type_changed() is currently only implemented for Intel EPT, and
> > results in the invalidation of EMT attributes on all the entries in
> > the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
> > when the guest tries to access any gfns for the first time, which
> > results in the recalculation of the EMT for the accessed page.  The
> > vmexit and the recalculations are expensive, and as such should be
> > avoided when possible.
> > 
> > Remove the call to memory_type_changed() from
> > XEN_DOMCTL_memory_mapping: there are no modifications of the
> > iomem_caps ranges anymore that could alter the return of
> > cache_flush_permitted() from that domctl.
> > 
> > Encapsulate calls to memory_type_changed() resulting from changes to
> > the domain iomem_caps or ioport_caps ranges in the helpers themselves
> > (io{ports,mem}_{permit,deny}_access()), and add a note in
> > epte_get_entry_emt() to remind that changes to the logic there likely
> > need to be propagaed to the IO capabilities helpers.
> > 
> > Note changes to the IO ports or memory ranges are not very common
> > during guest runtime, but Citrix Hypervisor has an use case for them
> > related to device passthrough.
> > 
> > Some Arm callers (implementations of the iomem_deny_access function
> > pointer field in gic_hw_operations struct) pass a const domain pointer
> > to iomem_deny_access(), which is questionable.  It works because
> > the rangeset is allocated separately from the domain struct, but
> > conceptually seems wrong to me, as passing a const pointer would imply
> > no changes to the domain data, and denying iomem accesses does change
> > the domain data.  Fix this by removing the const attribute from the
> > affected functions and call chain.
> 
> Personally I think this adjustment would better be a separate, prereq
> change.

Right - I was about to split it but didn't want to go through the
hassle if the approach didn't end up being well received.

Do you think placing the calls to memory_type_changed() inside the
{permit,deny}_,access is acceptable?

> > --- a/xen/include/xen/iocap.h
> > +++ b/xen/include/xen/iocap.h
> > @@ -7,13 +7,43 @@
> >  #ifndef __XEN_IOCAP_H__
> >  #define __XEN_IOCAP_H__
> >  
> > +#include <xen/sched.h>
> >  #include <xen/rangeset.h>
> >  #include <asm/iocap.h>
> > +#include <asm/p2m.h>
> 
> That's heavy dependencies you're adding. I wonder if the functions
> wouldn't better become out-of-line ones (but see also below).
> 
> > +static inline int iomem_permit_access(struct domain *d, unsigned long s,
> > +                                      unsigned long e)
> > +{
> > +    bool flush = cache_flush_permitted(d);
> > +    int ret = rangeset_add_range(d->iomem_caps, s, e);
> > +
> > +    if ( !ret && !is_iommu_enabled(d) && !flush )
> > +        /*
> > +         * Only flush if the range(s) are empty before this addition and
> > +         * IOMMU is not enabled for the domain, otherwise it makes no
> > +         * difference for effective cache attribute calculation purposes.
> > +         */
> > +        memory_type_changed(d);
> > +
> > +    return ret;
> > +}
> > +static inline int iomem_deny_access(struct domain *d, unsigned long s,
> > +                                    unsigned long e)
> > +{
> > +    int ret = rangeset_remove_range(d->iomem_caps, s, e);
> > +
> > +    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
> > +        /*
> > +         * Only flush if the range(s) are empty after this removal and
> > +         * IOMMU is not enabled for the domain, otherwise it makes no
> > +         * difference for effective cache attribute calculation purposes.
> > +         */
> > +        memory_type_changed(d);
> > +
> > +    return ret;
> > +}
> 
> I'm surprised Arm's memory_type_changed() is an empty out-of-line function.
> This means the compiler can't eliminate this code (except when using LTO).
> But then cache_flush_permitted() (resolving to rangeset_is_empty()) can't
> be eliminated either, even if memory_type_changed() was. While gcc doc
> doesn't explicitly say that it may help (the talk about repeated invocations
> only), I wonder whether we shouldn't mark rangeset_is_empty() pure. In a
> reduced example that does help (once memory_type_changed() is also an
> inline function) with gcc12 - no call to rangeset_is_empty() remains.

Can look into it, do you want it to be a prereq of this patch?

Thanks, Roger.
Re: [PATCH v2] x86/ept: limit calls to memory_type_changed()
Posted by Jan Beulich 1 year, 7 months ago
On 28.09.2022 12:08, Roger Pau Monné wrote:
> On Wed, Sep 28, 2022 at 10:01:26AM +0200, Jan Beulich wrote:
>> On 27.09.2022 17:39, Roger Pau Monne wrote:
>>> memory_type_changed() is currently only implemented for Intel EPT, and
>>> results in the invalidation of EMT attributes on all the entries in
>>> the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
>>> when the guest tries to access any gfns for the first time, which
>>> results in the recalculation of the EMT for the accessed page.  The
>>> vmexit and the recalculations are expensive, and as such should be
>>> avoided when possible.
>>>
>>> Remove the call to memory_type_changed() from
>>> XEN_DOMCTL_memory_mapping: there are no modifications of the
>>> iomem_caps ranges anymore that could alter the return of
>>> cache_flush_permitted() from that domctl.
>>>
>>> Encapsulate calls to memory_type_changed() resulting from changes to
>>> the domain iomem_caps or ioport_caps ranges in the helpers themselves
>>> (io{ports,mem}_{permit,deny}_access()), and add a note in
>>> epte_get_entry_emt() to remind that changes to the logic there likely
>>> need to be propagaed to the IO capabilities helpers.
>>>
>>> Note changes to the IO ports or memory ranges are not very common
>>> during guest runtime, but Citrix Hypervisor has an use case for them
>>> related to device passthrough.
>>>
>>> Some Arm callers (implementations of the iomem_deny_access function
>>> pointer field in gic_hw_operations struct) pass a const domain pointer
>>> to iomem_deny_access(), which is questionable.  It works because
>>> the rangeset is allocated separately from the domain struct, but
>>> conceptually seems wrong to me, as passing a const pointer would imply
>>> no changes to the domain data, and denying iomem accesses does change
>>> the domain data.  Fix this by removing the const attribute from the
>>> affected functions and call chain.
>>
>> Personally I think this adjustment would better be a separate, prereq
>> change.
> 
> Right - I was about to split it but didn't want to go through the
> hassle if the approach didn't end up being well received.
> 
> Do you think placing the calls to memory_type_changed() inside the
> {permit,deny}_,access is acceptable?

Well, as said before - it's not pretty, but the existence of
memory_type_changed() itself isn't either, nor are the present
placements of calls to it. So yes, I view this as acceptable.

>>> --- a/xen/include/xen/iocap.h
>>> +++ b/xen/include/xen/iocap.h
>>> @@ -7,13 +7,43 @@
>>>  #ifndef __XEN_IOCAP_H__
>>>  #define __XEN_IOCAP_H__
>>>  
>>> +#include <xen/sched.h>
>>>  #include <xen/rangeset.h>
>>>  #include <asm/iocap.h>
>>> +#include <asm/p2m.h>
>>
>> That's heavy dependencies you're adding. I wonder if the functions
>> wouldn't better become out-of-line ones (but see also below).
>>
>>> +static inline int iomem_permit_access(struct domain *d, unsigned long s,
>>> +                                      unsigned long e)
>>> +{
>>> +    bool flush = cache_flush_permitted(d);
>>> +    int ret = rangeset_add_range(d->iomem_caps, s, e);
>>> +
>>> +    if ( !ret && !is_iommu_enabled(d) && !flush )
>>> +        /*
>>> +         * Only flush if the range(s) are empty before this addition and
>>> +         * IOMMU is not enabled for the domain, otherwise it makes no
>>> +         * difference for effective cache attribute calculation purposes.
>>> +         */
>>> +        memory_type_changed(d);
>>> +
>>> +    return ret;
>>> +}
>>> +static inline int iomem_deny_access(struct domain *d, unsigned long s,
>>> +                                    unsigned long e)
>>> +{
>>> +    int ret = rangeset_remove_range(d->iomem_caps, s, e);
>>> +
>>> +    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
>>> +        /*
>>> +         * Only flush if the range(s) are empty after this removal and
>>> +         * IOMMU is not enabled for the domain, otherwise it makes no
>>> +         * difference for effective cache attribute calculation purposes.
>>> +         */
>>> +        memory_type_changed(d);
>>> +
>>> +    return ret;
>>> +}
>>
>> I'm surprised Arm's memory_type_changed() is an empty out-of-line function.
>> This means the compiler can't eliminate this code (except when using LTO).
>> But then cache_flush_permitted() (resolving to rangeset_is_empty()) can't
>> be eliminated either, even if memory_type_changed() was. While gcc doc
>> doesn't explicitly say that it may help (the talk about repeated invocations
>> only), I wonder whether we shouldn't mark rangeset_is_empty() pure. In a
>> reduced example that does help (once memory_type_changed() is also an
>> inline function) with gcc12 - no call to rangeset_is_empty() remains.
> 
> Can look into it, do you want it to be a prereq of this patch?

Well, if done, then it being a prereq would seem desirable. But x86 isn't
affected by this, so I'd leave the "whether" aspect to be judged by Arm folks.

Jan

Re: [PATCH v2] x86/ept: limit calls to memory_type_changed()
Posted by Roger Pau Monné 1 year, 7 months ago
On Wed, Sep 28, 2022 at 12:45:13PM +0200, Jan Beulich wrote:
> On 28.09.2022 12:08, Roger Pau Monné wrote:
> > On Wed, Sep 28, 2022 at 10:01:26AM +0200, Jan Beulich wrote:
> >> On 27.09.2022 17:39, Roger Pau Monne wrote:
> >>> memory_type_changed() is currently only implemented for Intel EPT, and
> >>> results in the invalidation of EMT attributes on all the entries in
> >>> the EPT page tables.  Such invalidation causes EPT_MISCONFIG vmexits
> >>> when the guest tries to access any gfns for the first time, which
> >>> results in the recalculation of the EMT for the accessed page.  The
> >>> vmexit and the recalculations are expensive, and as such should be
> >>> avoided when possible.
> >>>
> >>> Remove the call to memory_type_changed() from
> >>> XEN_DOMCTL_memory_mapping: there are no modifications of the
> >>> iomem_caps ranges anymore that could alter the return of
> >>> cache_flush_permitted() from that domctl.
> >>>
> >>> Encapsulate calls to memory_type_changed() resulting from changes to
> >>> the domain iomem_caps or ioport_caps ranges in the helpers themselves
> >>> (io{ports,mem}_{permit,deny}_access()), and add a note in
> >>> epte_get_entry_emt() to remind that changes to the logic there likely
> >>> need to be propagaed to the IO capabilities helpers.
> >>>
> >>> Note changes to the IO ports or memory ranges are not very common
> >>> during guest runtime, but Citrix Hypervisor has an use case for them
> >>> related to device passthrough.
> >>>
> >>> Some Arm callers (implementations of the iomem_deny_access function
> >>> pointer field in gic_hw_operations struct) pass a const domain pointer
> >>> to iomem_deny_access(), which is questionable.  It works because
> >>> the rangeset is allocated separately from the domain struct, but
> >>> conceptually seems wrong to me, as passing a const pointer would imply
> >>> no changes to the domain data, and denying iomem accesses does change
> >>> the domain data.  Fix this by removing the const attribute from the
> >>> affected functions and call chain.
> >>
> >> Personally I think this adjustment would better be a separate, prereq
> >> change.
> > 
> > Right - I was about to split it but didn't want to go through the
> > hassle if the approach didn't end up being well received.
> > 
> > Do you think placing the calls to memory_type_changed() inside the
> > {permit,deny}_,access is acceptable?
> 
> Well, as said before - it's not pretty, but the existence of
> memory_type_changed() itself isn't either, nor are the present
> placements of calls to it. So yes, I view this as acceptable.
> 
> >>> --- a/xen/include/xen/iocap.h
> >>> +++ b/xen/include/xen/iocap.h
> >>> @@ -7,13 +7,43 @@
> >>>  #ifndef __XEN_IOCAP_H__
> >>>  #define __XEN_IOCAP_H__
> >>>  
> >>> +#include <xen/sched.h>
> >>>  #include <xen/rangeset.h>
> >>>  #include <asm/iocap.h>
> >>> +#include <asm/p2m.h>
> >>
> >> That's heavy dependencies you're adding. I wonder if the functions
> >> wouldn't better become out-of-line ones (but see also below).

I would expect most callers to already have those dependencies TBH,
and in any case definitions there not used would be dropped anyway.

Or are you worried about the newly added dependencies causing a
circular dependency issue in the future?

> >>> +static inline int iomem_permit_access(struct domain *d, unsigned long s,
> >>> +                                      unsigned long e)
> >>> +{
> >>> +    bool flush = cache_flush_permitted(d);
> >>> +    int ret = rangeset_add_range(d->iomem_caps, s, e);
> >>> +
> >>> +    if ( !ret && !is_iommu_enabled(d) && !flush )
> >>> +        /*
> >>> +         * Only flush if the range(s) are empty before this addition and
> >>> +         * IOMMU is not enabled for the domain, otherwise it makes no
> >>> +         * difference for effective cache attribute calculation purposes.
> >>> +         */
> >>> +        memory_type_changed(d);
> >>> +
> >>> +    return ret;
> >>> +}
> >>> +static inline int iomem_deny_access(struct domain *d, unsigned long s,
> >>> +                                    unsigned long e)
> >>> +{
> >>> +    int ret = rangeset_remove_range(d->iomem_caps, s, e);
> >>> +
> >>> +    if ( !ret && !is_iommu_enabled(d) && !cache_flush_permitted(d) )
> >>> +        /*
> >>> +         * Only flush if the range(s) are empty after this removal and
> >>> +         * IOMMU is not enabled for the domain, otherwise it makes no
> >>> +         * difference for effective cache attribute calculation purposes.
> >>> +         */
> >>> +        memory_type_changed(d);
> >>> +
> >>> +    return ret;
> >>> +}
> >>
> >> I'm surprised Arm's memory_type_changed() is an empty out-of-line function.
> >> This means the compiler can't eliminate this code (except when using LTO).
> >> But then cache_flush_permitted() (resolving to rangeset_is_empty()) can't
> >> be eliminated either, even if memory_type_changed() was. While gcc doc
> >> doesn't explicitly say that it may help (the talk about repeated invocations
> >> only), I wonder whether we shouldn't mark rangeset_is_empty() pure. In a
> >> reduced example that does help (once memory_type_changed() is also an
> >> inline function) with gcc12 - no call to rangeset_is_empty() remains.
> > 
> > Can look into it, do you want it to be a prereq of this patch?
> 
> Well, if done, then it being a prereq would seem desirable. But x86 isn't
> affected by this, so I'd leave the "whether" aspect to be judged by Arm folks.

OK, let me split and prepare a new version then.

Thanks, Roger.

Re: [PATCH v2] x86/ept: limit calls to memory_type_changed()
Posted by Jan Beulich 1 year, 7 months ago
On 28.09.2022 15:08, Roger Pau Monné wrote:
> On Wed, Sep 28, 2022 at 12:45:13PM +0200, Jan Beulich wrote:
>> On 28.09.2022 12:08, Roger Pau Monné wrote:
>>> On Wed, Sep 28, 2022 at 10:01:26AM +0200, Jan Beulich wrote:
>>>> On 27.09.2022 17:39, Roger Pau Monne wrote:
>>>>> --- a/xen/include/xen/iocap.h
>>>>> +++ b/xen/include/xen/iocap.h
>>>>> @@ -7,13 +7,43 @@
>>>>>  #ifndef __XEN_IOCAP_H__
>>>>>  #define __XEN_IOCAP_H__
>>>>>  
>>>>> +#include <xen/sched.h>
>>>>>  #include <xen/rangeset.h>
>>>>>  #include <asm/iocap.h>
>>>>> +#include <asm/p2m.h>
>>>>
>>>> That's heavy dependencies you're adding. I wonder if the functions
>>>> wouldn't better become out-of-line ones (but see also below).
> 
> I would expect most callers to already have those dependencies TBH,
> and in any case definitions there not used would be dropped anyway.
> 
> Or are you worried about the newly added dependencies causing a
> circular dependency issue in the future?

Yes, but maybe for no real reason: Now that I look, I see that no other
header file includes */iocap.h (except, of course, xen/iocap.h including
asm/iocap.h).

Jan