[Xen-devel] [PATCH v2 0/3] x86/mm: (remaining) XSA-299 / 309 / 310 follow-up

Jan Beulich posted 3 patches 4 years, 2 months ago
[Xen-devel] [PATCH v2 0/3] x86/mm: (remaining) XSA-299 / 309 / 310 follow-up
Posted by Jan Beulich 4 years, 2 months ago
Addressing a few assorted aspects I've noticed during the
investigations / reviews.

1: mod_l<N>_entry() have no need to use __copy_from_user()
2: rename and tidy create_pae_xen_mappings()
3: re-order a few conditionals

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
[Xen-devel] [PATCH v2 1/3] x86/mm: mod_l<N>_entry() have no need to use __copy_from_user()
Posted by Jan Beulich 4 years, 2 months ago
mod_l1_entry()'s need to do so went away with commit 2d0557c5cb ("x86:
Fold page_info lock into type_info"), and the other three never had such
a need, at least going back as far as 3.2.0. Replace the uses by newly
introduced l<N>e_access_once().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Use ACCESS_ONCE() clones.
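
For readers unfamiliar with the pattern, a minimal standalone sketch of what
the new wrappers do follows. The ACCESS_ONCE() and container_of() definitions
are simplified stand-ins written for this illustration (they are not copies
of Xen's headers), and snapshot_l1e() is a hypothetical helper. The point is
only that the volatile access is applied to the integer member, and the result
is then viewed as the enclosing (volatile) struct again, so the assignment
stays type-correct.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Xen's l1_pgentry_t: a struct wrapping one integer PTE. */
typedef struct { uint64_t l1; } l1_pgentry_t;

/* Simplified stand-ins (assumptions, GNU C): not Xen's actual definitions. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define l1e_get_intpte(x) ((x).l1)

/* Same shape as the wrapper added by the patch. */
#define l1e_access_once(l1e) \
    (*container_of(&ACCESS_ONCE(l1e_get_intpte(l1e)), \
                   volatile l1_pgentry_t, l1))

/* Hypothetical helper: take one snapshot of a possibly changing entry. */
static l1_pgentry_t snapshot_l1e(l1_pgentry_t *pl1e)
{
    l1_pgentry_t ol1e = l1e_access_once(*pl1e);

    return ol1e;
}
```

All later checks then operate on the local copy rather than re-reading *pl1e.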

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2124,13 +2124,10 @@ static int mod_l1_entry(l1_pgentry_t *pl
                         struct vcpu *pt_vcpu, struct domain *pg_dom)
 {
     bool preserve_ad = (cmd == MMU_PT_UPDATE_PRESERVE_AD);
-    l1_pgentry_t ol1e;
+    l1_pgentry_t ol1e = l1e_access_once(*pl1e);
     struct domain *pt_dom = pt_vcpu->domain;
     int rc = 0;
 
-    if ( unlikely(__copy_from_user(&ol1e, pl1e, sizeof(ol1e)) != 0) )
-        return -EFAULT;
-
     ASSERT(!paging_mode_refcounts(pt_dom));
 
     if ( l1e_get_flags(nl1e) & _PAGE_PRESENT )
@@ -2248,8 +2245,7 @@ static int mod_l2_entry(l2_pgentry_t *pl
         return -EPERM;
     }
 
-    if ( unlikely(__copy_from_user(&ol2e, pl2e, sizeof(ol2e)) != 0) )
-        return -EFAULT;
+    ol2e = l2e_access_once(*pl2e);
 
     if ( l2e_get_flags(nl2e) & _PAGE_PRESENT )
     {
@@ -2311,8 +2307,7 @@ static int mod_l3_entry(l3_pgentry_t *pl
     if ( is_pv_32bit_domain(d) && (pgentry_ptr_to_slot(pl3e) >= 3) )
         return -EINVAL;
 
-    if ( unlikely(__copy_from_user(&ol3e, pl3e, sizeof(ol3e)) != 0) )
-        return -EFAULT;
+    ol3e = l3e_access_once(*pl3e);
 
     if ( l3e_get_flags(nl3e) & _PAGE_PRESENT )
     {
@@ -2378,8 +2373,7 @@ static int mod_l4_entry(l4_pgentry_t *pl
         return -EINVAL;
     }
 
-    if ( unlikely(__copy_from_user(&ol4e, pl4e, sizeof(ol4e)) != 0) )
-        return -EFAULT;
+    ol4e = l4e_access_once(*pl4e);
 
     if ( l4e_get_flags(nl4e) & _PAGE_PRESENT )
     {
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -55,6 +55,16 @@
 #define l4e_write(l4ep, l4e) \
     pte_write(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
 
+/* Type-correct ACCESS_ONCE() wrappers for PTE accesses. */
+#define l1e_access_once(l1e) (*container_of(&ACCESS_ONCE(l1e_get_intpte(l1e)), \
+                                            volatile l1_pgentry_t, l1))
+#define l2e_access_once(l2e) (*container_of(&ACCESS_ONCE(l2e_get_intpte(l2e)), \
+                                            volatile l2_pgentry_t, l2))
+#define l3e_access_once(l3e) (*container_of(&ACCESS_ONCE(l3e_get_intpte(l3e)), \
+                                            volatile l3_pgentry_t, l3))
+#define l4e_access_once(l4e) (*container_of(&ACCESS_ONCE(l4e_get_intpte(l4e)), \
+                                            volatile l4_pgentry_t, l4))
+
 /* Get direct integer representation of a pte's contents (intpte_t). */
 #define l1e_get_intpte(x)          ((x).l1)
 #define l2e_get_intpte(x)          ((x).l2)


Re: [Xen-devel] [PATCH v2 1/3] x86/mm: mod_l<N>_entry() have no need to use __copy_from_user()
Posted by Andrew Cooper 4 years, 2 months ago
On 06/01/2020 15:34, Jan Beulich wrote:
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -55,6 +55,16 @@
>  #define l4e_write(l4ep, l4e) \
>      pte_write(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
>  
> +/* Type-correct ACCESS_ONCE() wrappers for PTE accesses. */
> +#define l1e_access_once(l1e) (*container_of(&ACCESS_ONCE(l1e_get_intpte(l1e)), \
> +                                            volatile l1_pgentry_t, l1))
> +#define l2e_access_once(l2e) (*container_of(&ACCESS_ONCE(l2e_get_intpte(l2e)), \
> +                                            volatile l2_pgentry_t, l2))
> +#define l3e_access_once(l3e) (*container_of(&ACCESS_ONCE(l3e_get_intpte(l3e)), \
> +                                            volatile l3_pgentry_t, l3))
> +#define l4e_access_once(l4e) (*container_of(&ACCESS_ONCE(l4e_get_intpte(l4e)), \
> +                                            volatile l4_pgentry_t, l4))

What's wrong with l?e_read_atomic() which already exist, and are already
used elsewhere?

If nothing, then Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
to save another round of posting.

Re: [Xen-devel] [PATCH v2 1/3] x86/mm: mod_l<N>_entry() have no need to use __copy_from_user()
Posted by Jan Beulich 4 years, 2 months ago
On 06.01.2020 17:09, Andrew Cooper wrote:
> On 06/01/2020 15:34, Jan Beulich wrote:
>> --- a/xen/include/asm-x86/page.h
>> +++ b/xen/include/asm-x86/page.h
>> @@ -55,6 +55,16 @@
>>  #define l4e_write(l4ep, l4e) \
>>      pte_write(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
>>  
>> +/* Type-correct ACCESS_ONCE() wrappers for PTE accesses. */
>> +#define l1e_access_once(l1e) (*container_of(&ACCESS_ONCE(l1e_get_intpte(l1e)), \
>> +                                            volatile l1_pgentry_t, l1))
>> +#define l2e_access_once(l2e) (*container_of(&ACCESS_ONCE(l2e_get_intpte(l2e)), \
>> +                                            volatile l2_pgentry_t, l2))
>> +#define l3e_access_once(l3e) (*container_of(&ACCESS_ONCE(l3e_get_intpte(l3e)), \
>> +                                            volatile l3_pgentry_t, l3))
>> +#define l4e_access_once(l4e) (*container_of(&ACCESS_ONCE(l4e_get_intpte(l4e)), \
>> +                                            volatile l4_pgentry_t, l4))
> 
> What's wrong with l?e_read_atomic() which already exist, and are already
> used elsewhere?

I did consider going that route, but predicted you would object to its
use here: Iirc you've previously voiced opinion in this direction
(perhaps not on the page table accessors themselves but the underlying
{read,write}_u<N>_atomic()).

> If nothing, then Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
> to save another round of posting.

Let's get the above clarified - I'll be happy to switch to
l<N>e_read_atomic() if that's fine by you.

Jan

Re: [Xen-devel] [PATCH v2 1/3] x86/mm: mod_l<N>_entry() have no need to use __copy_from_user()
Posted by Andrew Cooper 4 years, 2 months ago
On 06/01/2020 16:18, Jan Beulich wrote:
> On 06.01.2020 17:09, Andrew Cooper wrote:
>> On 06/01/2020 15:34, Jan Beulich wrote:
>>> --- a/xen/include/asm-x86/page.h
>>> +++ b/xen/include/asm-x86/page.h
>>> @@ -55,6 +55,16 @@
>>>  #define l4e_write(l4ep, l4e) \
>>>      pte_write(&l4e_get_intpte(*(l4ep)), l4e_get_intpte(l4e))
>>>  
>>> +/* Type-correct ACCESS_ONCE() wrappers for PTE accesses. */
>>> +#define l1e_access_once(l1e) (*container_of(&ACCESS_ONCE(l1e_get_intpte(l1e)), \
>>> +                                            volatile l1_pgentry_t, l1))
>>> +#define l2e_access_once(l2e) (*container_of(&ACCESS_ONCE(l2e_get_intpte(l2e)), \
>>> +                                            volatile l2_pgentry_t, l2))
>>> +#define l3e_access_once(l3e) (*container_of(&ACCESS_ONCE(l3e_get_intpte(l3e)), \
>>> +                                            volatile l3_pgentry_t, l3))
>>> +#define l4e_access_once(l4e) (*container_of(&ACCESS_ONCE(l4e_get_intpte(l4e)), \
>>> +                                            volatile l4_pgentry_t, l4))
>> What's wrong with l?e_read_atomic() which already exist, and are already
>> used elsewhere?
> I did consider going that route, but predicted you would object to its
> use here: Iirc you've previously voiced opinion in this direction
> (perhaps not on the page table accessors themselves but the underlying
> {read,write}_u<N>_atomic()).
>
>> If nothing, then Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> to save another round of posting.
> Let's get the above clarified - I'll be happy to switch to
> l<N>e_read_atomic() if that's fine by you.

I'd definitely prefer reusing l?e_read_atomic() over introducing a new
set of helpers.

I have two key issues with the general _atomic() infrastructure.

First, the term is overloaded for two very different things.  We use it
both for "don't subdivide this read/write" and "use a locked update",
where the latter is what is expected by the name "atomic".

Second, and specifically with {read,write}_u<N>_atomic(), it is their
construction using macros which makes them impossible to locate in the
source code.  This can trivially be fixed by not using macros.  (If it
were up to me, the use of ## would be disallowed in general, because it
does very little but to obfuscate code.)
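
To make the first point concrete, the two meanings can be told apart in a
small sketch. It uses the GCC/Clang __atomic builtins rather than Xen's own
{read,write}_u<N>_atomic() helpers, and both function names are made up for
this illustration. On x86-64 the first is expected to compile to an ordinary
load, while the second needs a LOCK-prefixed instruction.

```c
#include <stdint.h>

/*
 * Meaning 1: "don't subdivide this read". A single access that the
 * compiler may not split, merge or repeat. No bus lock is involved.
 */
static inline uint64_t read_once_sketch(uint64_t *p)
{
    return __atomic_load_n(p, __ATOMIC_RELAXED);
}

/*
 * Meaning 2: "use a locked update". An indivisible read-modify-write;
 * returns the value seen before the bits were set.
 */
static inline uint64_t set_bits_locked_sketch(uint64_t *p, uint64_t bits)
{
    return __atomic_fetch_or(p, bits, __ATOMIC_SEQ_CST);
}
```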

~Andrew

[Xen-devel] [PATCH v2 2/3] x86/mm: rename and tidy create_pae_xen_mappings()
Posted by Jan Beulich 4 years, 2 months ago
After dad74b0f9e ("i386: fix handling of Xen entries in final L2 page
table") and the removal of 32-bit support the function doesn't modify
state anymore, and hence its name has been misleading. Change its name,
constify parameters and a local variable, and make it return bool.

Also drop the call to it from mod_l3_entry(): The function explicitly
disallows 32-bit domains to modify slot 3. This way we also won't
re-check slot 3 when a slot other than slot 3 changes. Doing so has
needlessly disallowed making some L2 table recursively link back to an
L2 used in some L3's 3rd slot, as we check for the type ref count to be
1. (Note that allowing dynamic changes of L3 entries in the way we do is
bogus anyway, as that's not how L3s behave in the native and EPT cases:
They get re-evaluated only upon CR3 reloads. NPT is different in this
regard.)

As a result of this we no longer need to play games to get at the start
of the L3 table.

Additionally move the single remaining call site, allowing to drop one
is_pv_32bit_domain() invocation and a _PAGE_PRESENT check (in the
function itself) as well as to exit the loop early (remaining entries
have all ben set to empty just ahead of this loop).

Further move a BUG_ON() such that in the common case its condition
wouldn't need evaluating.

Finally, since we're at it, move init_xen_pae_l2_slots() next to the
renamed function, as they really belong together (in fact
init_xen_pae_l2_slots() was [indirectly] broken out of this function).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Refine description. Drop an ASSERT(). Add a comment ahead of the
    function.
---
We could go further here and delete the function altogether: There are
no linear mappings in a PGT_pae_xen_l2 table anymore (this was on 32-bit
only). The corresponding conditional in mod_l3_entry() could then go
away as well (or, more precisely, would need to be replaced by correct
handling of 3rd slot updates). This would mean that a 32-bit guest
functioning on new Xen may fail to work on older (possibly 32-bit) Xen.
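
The BUG_ON() move mentioned in the description can be sketched in isolation
as below. The mask value and the function name are hypothetical, and assert()
stands in for BUG_ON(); only the control flow mirrors the patch. A count of
exactly 1 (the common case) now returns without the "count is non-zero" check
ever being evaluated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PGT_count_mask; the real value differs. */
#define SKETCH_COUNT_MASK 0xffffUL

/* Returns true when the slot-3 L2 has exactly one type reference. */
static bool slot3_l2_unshared_sketch(unsigned long type_info)
{
    if ( (type_info & SKETCH_COUNT_MASK) != 1 )
    {
        /* Only reached on the uncommon path: zero would be a bug. */
        assert(type_info & SKETCH_COUNT_MASK);
        return false;
    }

    return true;
}
```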

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1414,23 +1414,22 @@ static int promote_l1_table(struct page_
     return ret;
 }
 
-static int create_pae_xen_mappings(struct domain *d, l3_pgentry_t *pl3e)
+/*
+ * Note: The checks performed by this function are just to enforce a
+ * legacy restriction necessary on 32-bit hosts. There's not much point in
+ * relaxing (dropping) this though, as 32-bit guests would still need to
+ * conform to the original restrictions in order to be able to run on (old)
+ * 32-bit Xen.
+ */
+static bool pae_xen_mappings_check(const struct domain *d,
+                                   const l3_pgentry_t *pl3e)
 {
-    struct page_info *page;
-    l3_pgentry_t     l3e3;
-
-    if ( !is_pv_32bit_domain(d) )
-        return 1;
-
-    pl3e = (l3_pgentry_t *)((unsigned long)pl3e & PAGE_MASK);
-
-    /* 3rd L3 slot contains L2 with Xen-private mappings. It *must* exist. */
-    l3e3 = pl3e[3];
-    if ( !(l3e_get_flags(l3e3) & _PAGE_PRESENT) )
-    {
-        gdprintk(XENLOG_WARNING, "PAE L3 3rd slot is empty\n");
-        return 0;
-    }
+    /*
+     * 3rd L3 slot contains L2 with Xen-private mappings. It *must* exist,
+     * which our caller has already verified.
+     */
+    l3_pgentry_t l3e3 = pl3e[3];
+    const struct page_info *page = l3e_get_page(l3e3);
 
     /*
      * The Xen-private mappings include linear mappings. The L2 thus cannot
@@ -1441,17 +1440,24 @@ static int create_pae_xen_mappings(struc
      *     a. promote_l3_table() calls this function and this check will fail
      *     b. mod_l3_entry() disallows updates to slot 3 in an existing table
      */
-    page = l3e_get_page(l3e3);
     BUG_ON(page->u.inuse.type_info & PGT_pinned);
-    BUG_ON((page->u.inuse.type_info & PGT_count_mask) == 0);
     BUG_ON(!(page->u.inuse.type_info & PGT_pae_xen_l2));
     if ( (page->u.inuse.type_info & PGT_count_mask) != 1 )
     {
+        BUG_ON(!(page->u.inuse.type_info & PGT_count_mask));
         gdprintk(XENLOG_WARNING, "PAE L3 3rd slot is shared\n");
-        return 0;
+        return false;
     }
 
-    return 1;
+    return true;
+}
+
+void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
+{
+    memcpy(&l2t[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
+           &compat_idle_pg_table_l2[
+               l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
+           COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2t));
 }
 
 static int promote_l2_table(struct page_info *page, unsigned long type)
@@ -1592,6 +1598,16 @@ static int promote_l3_table(struct page_
                     l3e_get_mfn(l3e),
                     PGT_l2_page_table | PGT_pae_xen_l2, d,
                     partial_flags | PTF_preemptible | PTF_retain_ref_on_restart);
+
+            if ( !rc )
+            {
+                if ( pae_xen_mappings_check(d, pl3e) )
+                {
+                    pl3e[i] = adjust_guest_l3e(l3e, d);
+                    break;
+                }
+                rc = -EINVAL;
+            }
         }
         else if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) )
         {
@@ -1621,8 +1637,6 @@ static int promote_l3_table(struct page_
         pl3e[i] = adjust_guest_l3e(l3e, d);
     }
 
-    if ( !rc && !create_pae_xen_mappings(d, pl3e) )
-        rc = -EINVAL;
     if ( rc < 0 && rc != -ERESTART && rc != -EINTR )
     {
         gdprintk(XENLOG_WARNING,
@@ -1663,14 +1677,6 @@ static int promote_l3_table(struct page_
     unmap_domain_page(pl3e);
     return rc;
 }
-
-void init_xen_pae_l2_slots(l2_pgentry_t *l2t, const struct domain *d)
-{
-    memcpy(&l2t[COMPAT_L2_PAGETABLE_FIRST_XEN_SLOT(d)],
-           &compat_idle_pg_table_l2[
-               l2_table_offset(HIRO_COMPAT_MPT_VIRT_START)],
-           COMPAT_L2_PAGETABLE_XEN_SLOTS(d) * sizeof(*l2t));
-}
 #endif /* CONFIG_PV */
 
 /*
@@ -2347,10 +2353,6 @@ static int mod_l3_entry(l3_pgentry_t *pl
         return -EFAULT;
     }
 
-    if ( likely(rc == 0) )
-        if ( !create_pae_xen_mappings(d, pl3e) )
-            BUG();
-
     put_page_from_l3e(ol3e, mfn, PTF_defer);
     return rc;
 }


Re: [Xen-devel] [PATCH v2 2/3] x86/mm: rename and tidy create_pae_xen_mappings()
Posted by Andrew Cooper 4 years, 2 months ago
On 06/01/2020 15:35, Jan Beulich wrote:
> After dad74b0f9e ("i386: fix handling of Xen entries in final L2 page
> table") and the removal of 32-bit support the function doesn't modify
> state anymore, and hence its name has been misleading. Change its name,
> constify parameters and a local variable, and make it return bool.
>
> Also drop the call to it from mod_l3_entry(): The function explicitly
> disallows 32-bit domains to modify slot 3. This way we also won't
> re-check slot 3 when a slot other than slot 3 changes. Doing so has
> needlessly disallowed making some L2 table recursively link back to an
> L2 used in some L3's 3rd slot, as we check for the type ref count to be
> 1. (Note that allowing dynamic changes of L3 entries in the way we do is
> bogus anyway, as that's not how L3s behave in the native and EPT cases:
> They get re-evaluated only upon CR3 reloads. NPT is different in this
> regard.)
>
> As a result of this we no longer need to play games to get at the start
> of the L3 table.
>
> Additionally move the single remaining call site, allowing to drop one
> is_pv_32bit_domain() invocation and a _PAGE_PRESENT check (in the
> function itself) as well as to exit the loop early (remaining entries
> have all ben set to empty just ahead of this loop).

been.

>
> Further move a BUG_ON() such that in the common case its condition
> wouldn't need evaluating.
>
> Finally, since we're at it, move init_xen_pae_l2_slots() next to the
> renamed function, as they really belong together (in fact
> init_xen_pae_l2_slots() was [indirectly] broken out of this function).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

[Xen-devel] [PATCH v2 3/3] x86/mm: re-order a few conditionals
Posted by Jan Beulich 4 years, 2 months ago
is_{hvm,pv}_*() can be expensive now, so where possible evaluate cheaper
conditions first.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.
---
I couldn't really decide whether to drop the two involved unlikely().
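
As an illustration of why the ordering matters, here is a small standalone
sketch. The struct, the counter and all function names are invented for this
example; the counter merely stands in for whatever makes the real
is_{hvm,pv}_*() predicates costly. Because && short-circuits, putting the
cheap comparison first skips the costly call whenever the comparison fails.
The swap is only safe because neither operand has side effects the other
depends on.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct domain. */
struct domain_sketch { bool pv_32bit; };

/* Counts how often the "expensive" predicate actually runs. */
static unsigned int expensive_calls;

static bool is_pv_32bit_sketch(const struct domain_sketch *d)
{
    ++expensive_calls;
    return d->pv_32bit;
}

/* Old order: the expensive predicate runs for every slot. */
static bool old_order(const struct domain_sketch *d, unsigned int i)
{
    return is_pv_32bit_sketch(d) && (i == 3);
}

/* New order: the expensive predicate runs only when i == 3. */
static bool new_order(const struct domain_sketch *d, unsigned int i)
{
    return i == 3 && is_pv_32bit_sketch(d);
}
```

Both functions return the same result for every input; only the number of
calls to the predicate differs.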

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1588,7 +1588,7 @@ static int promote_l3_table(struct page_
 
         if ( i > page->nr_validated_ptes && hypercall_preempt_check() )
             rc = -EINTR;
-        else if ( is_pv_32bit_domain(d) && (i == 3) )
+        else if ( i == 3 && is_pv_32bit_domain(d) )
         {
             if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) ||
                  (l3e_get_flags(l3e) & l3_disallow_mask(d)) )
@@ -2310,7 +2310,7 @@ static int mod_l3_entry(l3_pgentry_t *pl
      * Disallow updates to final L3 slot. It contains Xen mappings, and it
      * would be a pain to ensure they remain continuously valid throughout.
      */
-    if ( is_pv_32bit_domain(d) && (pgentry_ptr_to_slot(pl3e) >= 3) )
+    if ( pgentry_ptr_to_slot(pl3e) >= 3 && is_pv_32bit_domain(d) )
         return -EINVAL;
 
     ol3e = l3e_access_once(*pl3e);
@@ -2470,7 +2470,7 @@ static int cleanup_page_mappings(struct
     {
         struct domain *d = page_get_owner(page);
 
-        if ( d && is_pv_domain(d) && unlikely(need_iommu_pt_sync(d)) )
+        if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
         {
             int rc2 = iommu_legacy_unmap(d, _dfn(mfn), PAGE_ORDER_4K);
 
@@ -2984,7 +2984,7 @@ static int _get_page_type(struct page_in
         /* Special pages should not be accessible from devices. */
         struct domain *d = page_get_owner(page);
 
-        if ( d && is_pv_domain(d) && unlikely(need_iommu_pt_sync(d)) )
+        if ( d && unlikely(need_iommu_pt_sync(d)) && is_pv_domain(d) )
         {
             mfn_t mfn = page_to_mfn(page);
 


Re: [Xen-devel] [PATCH v2 3/3] x86/mm: re-order a few conditionals
Posted by Andrew Cooper 4 years, 2 months ago
On 06/01/2020 15:35, Jan Beulich wrote:
> is_{hvm,pv}_*() can be expensive now, so where possible evaluate cheaper
> conditions first.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

> ---
> v2: New.
> ---
> I couldn't really decide whether to drop the two involved unlikely().

Personally, I don't think we should have any likely/unlikely annotations
at all.  They are difficult for humans to reason about (especially when
you're in a nested clause of annotated condition) - several of them
are(/were) wrong, and plenty are dubious.

People who actually care should be using PGO.  This is yet another
toolchain feature I'm hoping that we will get "for free" from Anthony's
work to switch to using kbuild.

If we were to delete all likely/unlikely annotations, and someone could
then measure a performance improvement from reinserting some of them,
then perhaps it would be ok to keep a few around, but my gut feeling is
that the compiler can do a better job in general than humans can.

~Andrew
