[PATCH 00/10] x86: mm (mainly shadow) adjustments

Posted by Jan Beulich 4 years ago
Large parts of this series are to further isolate pieces which
are needed for HVM only, and hence would better not be built
with HVM=n. But there are also a few other items which I've
noticed along the road.

01: mm: no-one passes a NULL domain to init_xen_l4_slots()
02: shadow: drop a stray forward structure declaration
03: shadow: monitor table is HVM-only
04: shadow: sh_update_linear_entries() is a no-op for PV
05: mm: monitor table is HVM-only
06: shadow: sh_remove_write_access_from_sl1p() can be static
07: shadow: the guess_wrmap() hook is needed for HVM only
08: mm: pagetable_dying() is HVM-only
09: shadow: the trace_emul_write_val() hook is HVM-only
10: shadow: don't open-code shadow_blow_tables_per_domain()

Jan

Re: [PATCH 00/10] x86: mm (mainly shadow) adjustments
Posted by Tim Deegan 4 years ago
At 16:23 +0200 on 17 Apr (1587140581), Jan Beulich wrote:
> Large parts of this series are to further isolate pieces which
> are needed for HVM only, and hence would better not be built
> with HVM=n. But there are also a few other items which I've
> noticed along the road.

Acked-by: Tim Deegan <tim@xen.org>
with two suggestions that I've sent separately.

Cheers,

Tim.

Re: [PATCH 00/10] x86: mm (mainly shadow) adjustments
Posted by Andrew Cooper 4 years ago
On 17/04/2020 15:23, Jan Beulich wrote:
> Large parts of this series are to further isolate pieces which
> are needed for HVM only, and hence would better not be built
> with HVM=n. But there are also a few other items which I've
> noticed along the road.
>
> 01: mm: no-one passes a NULL domain to init_xen_l4_slots()
> 02: shadow: drop a stray forward structure declaration
> 03: shadow: monitor table is HVM-only
> 04: shadow: sh_update_linear_entries() is a no-op for PV
> 05: mm: monitor table is HVM-only
> 06: shadow: sh_remove_write_access_from_sl1p() can be static
> 07: shadow: the guess_wrmap() hook is needed for HVM only
> 08: mm: pagetable_dying() is HVM-only
> 09: shadow: the trace_emul_write_val() hook is HVM-only
> 10: shadow: don't open-code shadow_blow_tables_per_domain()

Patch 1 I think ought to be dropped.  Everything else Acked-by: Andrew
Cooper <andrew.cooper3@citrix.com>, ideally with the suggested tweak in
patch 3.

~Andrew

[PATCH 01/10] x86/mm: no-one passes a NULL domain to init_xen_l4_slots()
Posted by Jan Beulich 4 years ago
Drop the NULL checks - they've been introduced by commit 8d7b633ada
("x86/mm: Consolidate all Xen L4 slot writing into
init_xen_l4_slots()") for no apparent reason.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1696,7 +1696,7 @@ void init_xen_l4_slots(l4_pgentry_t *l4t
      * PV vcpus need a shortened directmap.  HVM and Idle vcpus get the full
      * directmap.
      */
-    bool short_directmap = d && !paging_mode_external(d);
+    bool short_directmap = !paging_mode_external(d);
 
     /* Slot 256: RO M2P (if applicable). */
     l4t[l4_table_offset(RO_MPT_VIRT_START)] =
@@ -1717,10 +1717,9 @@ void init_xen_l4_slots(l4_pgentry_t *l4t
         mfn_eq(sl4mfn, INVALID_MFN) ? l4e_empty() :
         l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR_RW);
 
-    /* Slot 260: Per-domain mappings (if applicable). */
+    /* Slot 260: Per-domain mappings. */
     l4t[l4_table_offset(PERDOMAIN_VIRT_START)] =
-        d ? l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW)
-          : l4e_empty();
+        l4e_from_page(d->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
 
     /* Slot 261-: text/data/bss, RW M2P, vmap, frametable, directmap. */
 #ifndef NDEBUG


Re: [PATCH 01/10] x86/mm: no-one passes a NULL domain to init_xen_l4_slots()
Posted by Andrew Cooper 4 years ago
On 17/04/2020 15:25, Jan Beulich wrote:
> Drop the NULL checks - they've been introduced by commit 8d7b633ada
> ("x86/mm: Consolidate all Xen L4 slot writing into
> init_xen_l4_slots()") for no apparent reason.

:) I'll take this as confirmation that all my sudden pagetable work in
Xen ended up being rather more subtle than Linux's equivalent work for
KPTI.

https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00281.html

Specifically, this was part of trying to arrange for fully per-pcpu
private mappings, and was used to construct the pagetables for the idle
vcpu which specifically don't have a perdomain mapping.

Seeing as this is still an outstanding task in the secret-free-Xen
plans, any dropping of it now will have to be undone at some point in
the future.  Is there a specific reason for the cleanup?

> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -1696,7 +1696,7 @@ void init_xen_l4_slots(l4_pgentry_t *l4t

If we continue with this patch, this comment, just out of context, needs
adjusting.

~Andrew

>       * PV vcpus need a shortened directmap.  HVM and Idle vcpus get the full
>       * directmap.
>       */
> -    bool short_directmap = d && !paging_mode_external(d);
> +    bool short_directmap = !paging_mode_external(d);
>  
>      /* Slot 256: RO M2P (if applicable). */
>      l4t[l4_table_offset(RO_MPT_VIRT_START)] =
>


Re: [PATCH 01/10] x86/mm: no-one passes a NULL domain to init_xen_l4_slots()
Posted by Jan Beulich 4 years ago
On 17.04.2020 21:46, Andrew Cooper wrote:
> On 17/04/2020 15:25, Jan Beulich wrote:
>> Drop the NULL checks - they've been introduced by commit 8d7b633ada
>> ("x86/mm: Consolidate all Xen L4 slot writing into
>> init_xen_l4_slots()") for no apparent reason.
> 
> :) I'll take this as confirmation that all my sudden pagetable work in
> Xen ended up being rather more subtle than Linux's equivalent work for
> KPTI.
> 
> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00281.html
> 
> Specifically, this was part of trying to arrange for fully per-pcpu
> private mappings, and was used to construct the pagetables for the idle
> vcpu which specifically don't have a perdomain mapping.
> 
> Seeing as this is still an outstanding task in the secret-free-Xen
> plans, any dropping of it now will have to be undone at some point in
> the future.

s/will/may/ I suppose - we don't know for sure how this will look
at this point.

>  Is there a specific reason for the cleanup?

Elimination of effectively dead code. We avoid arbitrary NULL checks
elsewhere as well; iirc I've seen you (amongst others) comment on
people trying to introduce such in their patches.

>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -1696,7 +1696,7 @@ void init_xen_l4_slots(l4_pgentry_t *l4t
> 
> If we continue with this patch, this comment, just out of context, needs
> adjusting.

Will do.

Jan

Re: [PATCH 01/10] x86/mm: no-one passes a NULL domain to init_xen_l4_slots()
Posted by Andrew Cooper 4 years ago
On 20/04/2020 06:48, Jan Beulich wrote:
> On 17.04.2020 21:46, Andrew Cooper wrote:
>> On 17/04/2020 15:25, Jan Beulich wrote:
>>> Drop the NULL checks - they've been introduced by commit 8d7b633ada
>>> ("x86/mm: Consolidate all Xen L4 slot writing into
>>> init_xen_l4_slots()") for no apparent reason.
>> :) I'll take this as confirmation that all my sudden pagetable work in
>> Xen ended up being rather more subtle than Linux's equivalent work for
>> KPTI.
>>
>> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg00281.html
>>
>> Specifically, this was part of trying to arrange for fully per-pcpu
>> private mappings, and was used to construct the pagetables for the idle
>> vcpu which specifically don't have a perdomain mapping.
>>
>> Seeing as this is still an outstanding task in the secret-free-Xen
>> plans, any dropping of it now will have to be undone at some point in
>> the future.
> s/will/may/ I suppose - we don't know for sure how this will look
> at this point.

Will.

The only reason we don't need it right now is because idle_pg_table[]
gets constructed at boot time.  As soon as we're creating one (or more)
per pcpu, we need a way of writing an L4 without a perdomain mapping.

~Andrew
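
(For illustration, the kind of caller those NULL checks were anticipating
might look like the sketch below. This is purely hypothetical: every name
except init_xen_l4_slots() is invented, and that function's later
arguments are assumed.)

static DEFINE_PER_CPU(l4_pgentry_t *, percpu_idle_l4);

/* Hypothetical per-pCPU idle L4 construction, relying on the d == NULL
 * tolerance removed by this patch (NULL meaning: full directmap, empty
 * perdomain slot). */
static int setup_percpu_idle_l4(unsigned int cpu)
{
    l4_pgentry_t *l4t = alloc_xenheap_page();

    if ( !l4t )
        return -ENOMEM;

    clear_page(l4t);

    /* Argument order assumed; d == NULL selects the idle-style layout. */
    init_xen_l4_slots(l4t, _mfn(virt_to_mfn(l4t)), NULL, INVALID_MFN, false);

    per_cpu(percpu_idle_l4, cpu) = l4t;

    return 0;
}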

[PATCH 02/10] x86/shadow: drop a stray forward structure declaration
Posted by Jan Beulich 4 years ago
struct sh_emulate_ctxt is private to shadow code, and hence a
declaration for it is not needed here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -92,7 +92,6 @@
  * These shouldn't be used directly by callers; rather use the functions
  * below which will indirect through this table as appropriate. */
 
-struct sh_emulate_ctxt;
 struct shadow_paging_mode {
 #ifdef CONFIG_SHADOW_PAGING
     void          (*detach_old_tables     )(struct vcpu *v);


[PATCH 03/10] x86/shadow: monitor table is HVM-only
Posted by Jan Beulich 4 years ago
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2376,7 +2376,6 @@ void sh_reset_l3_up_pointers(struct vcpu
 static void sh_update_paging_modes(struct vcpu *v)
 {
     struct domain *d = v->domain;
-    const struct paging_mode *old_mode = v->arch.paging.mode;
 
     ASSERT(paging_locked_by_me(d));
 
@@ -2421,11 +2420,14 @@ static void sh_update_paging_modes(struc
     if ( v->arch.paging.mode )
         v->arch.paging.mode->shadow.detach_old_tables(v);
 
+#ifdef CONFIG_HVM
     if ( !is_pv_domain(d) )
     {
         ///
         /// HVM guest
         ///
+        const struct paging_mode *old_mode = v->arch.paging.mode;
+
         ASSERT(shadow_mode_translate(d));
         ASSERT(shadow_mode_external(d));
 
@@ -2523,6 +2525,7 @@ static void sh_update_paging_modes(struc
         //        different values for CR4.PSE and CR4.PGE at the same time.
         //        This *does* happen, at least for CR4.PGE...
     }
+#endif /* CONFIG_HVM */
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
     /* We need to check that all the vcpus have paging enabled to
@@ -2703,7 +2706,6 @@ void shadow_teardown(struct domain *d, b
  * Should only be called for dying domains. */
 {
     struct vcpu *v;
-    mfn_t mfn;
     struct page_info *unpaged_pagetable = NULL;
 
     ASSERT(d->is_dying);
@@ -2719,13 +2721,16 @@ void shadow_teardown(struct domain *d, b
             if ( v->arch.paging.mode )
             {
                 v->arch.paging.mode->shadow.detach_old_tables(v);
+#ifdef CONFIG_HVM
                 if ( shadow_mode_external(d) )
                 {
-                    mfn = pagetable_get_mfn(v->arch.monitor_table);
+                    mfn_t mfn = pagetable_get_mfn(v->arch.monitor_table);
+
                     if ( mfn_valid(mfn) && (mfn_x(mfn) != 0) )
                         v->arch.paging.mode->shadow.destroy_monitor_table(v, mfn);
                     v->arch.monitor_table = pagetable_null();
                 }
+#endif /* CONFIG_HVM */
             }
         }
     }
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1515,7 +1515,7 @@ make_fl1_shadow(struct domain *d, gfn_t
 }
 
 
-#if SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS
+#if SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS && defined(CONFIG_HVM)
 mfn_t
 sh_make_monitor_table(struct vcpu *v)
 {
@@ -1965,7 +1965,7 @@ void sh_destroy_l1_shadow(struct domain
     shadow_free(d, smfn);
 }
 
-#if SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS
+#if SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS && defined(CONFIG_HVM)
 void sh_destroy_monitor_table(struct vcpu *v, mfn_t mmfn)
 {
     struct domain *d = v->domain;
@@ -4881,8 +4881,10 @@ const struct paging_mode sh_paging_mode
     .shadow.write_guest_entry      = sh_write_guest_entry,
     .shadow.cmpxchg_guest_entry    = sh_cmpxchg_guest_entry,
 #endif
+#ifdef CONFIG_HVM
     .shadow.make_monitor_table     = sh_make_monitor_table,
     .shadow.destroy_monitor_table  = sh_destroy_monitor_table,
+#endif
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
     .shadow.guess_wrmap            = sh_guess_wrmap,
 #endif
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -102,8 +102,10 @@ struct shadow_paging_mode {
                                             intpte_t *old, intpte_t new,
                                             mfn_t gmfn);
 #endif
+#ifdef CONFIG_HVM
     mfn_t         (*make_monitor_table    )(struct vcpu *v);
     void          (*destroy_monitor_table )(struct vcpu *v, mfn_t mmfn);
+#endif
     int           (*guess_wrmap           )(struct vcpu *v, 
                                             unsigned long vaddr, mfn_t gmfn);
     void          (*pagetable_dying       )(paddr_t gpa);


Re: [PATCH 03/10] x86/shadow: monitor table is HVM-only
Posted by Andrew Cooper 4 years ago
On 17/04/2020 15:26, Jan Beulich wrote:
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/mm/shadow/common.c
> +++ b/xen/arch/x86/mm/shadow/common.c
> @@ -2376,7 +2376,6 @@ void sh_reset_l3_up_pointers(struct vcpu
>  static void sh_update_paging_modes(struct vcpu *v)
>  {
>      struct domain *d = v->domain;
> -    const struct paging_mode *old_mode = v->arch.paging.mode;
>  
>      ASSERT(paging_locked_by_me(d));
>  
> @@ -2421,11 +2420,14 @@ static void sh_update_paging_modes(struc
>      if ( v->arch.paging.mode )
>          v->arch.paging.mode->shadow.detach_old_tables(v);
>  
> +#ifdef CONFIG_HVM
>      if ( !is_pv_domain(d) )
>      {
>          ///
>          /// HVM guest
>          ///

Can we drop this comment while we're here?  The ifdef and
!is_pv_domain() are crystal clear.

~Andrew

[PATCH 04/10] x86/shadow: sh_update_linear_entries() is a no-op for PV
Posted by Jan Beulich 4 years ago
Consolidate the shadow_mode_external() checks in here: check just once
at the start of the function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -3707,34 +3707,30 @@ sh_update_linear_entries(struct vcpu *v)
      */
 
     /* Don't try to update the monitor table if it doesn't exist */
-    if ( shadow_mode_external(d)
-         && pagetable_get_pfn(v->arch.monitor_table) == 0 )
+    if ( !shadow_mode_external(d) ||
+         pagetable_get_pfn(v->arch.monitor_table) == 0 )
         return;
 
 #if SHADOW_PAGING_LEVELS == 4
 
-    /* For PV, one l4e points at the guest l4, one points at the shadow
-     * l4.  No maintenance required.
-     * For HVM, just need to update the l4e that points to the shadow l4. */
+    /* For HVM, just need to update the l4e that points to the shadow l4. */
 
-    if ( shadow_mode_external(d) )
+    /* Use the linear map if we can; otherwise make a new mapping */
+    if ( v == current )
     {
-        /* Use the linear map if we can; otherwise make a new mapping */
-        if ( v == current )
-        {
-            __linear_l4_table[l4_linear_offset(SH_LINEAR_PT_VIRT_START)] =
-                l4e_from_pfn(pagetable_get_pfn(v->arch.shadow_table[0]),
-                             __PAGE_HYPERVISOR_RW);
-        }
-        else
-        {
-            l4_pgentry_t *ml4e;
-            ml4e = map_domain_page(pagetable_get_mfn(v->arch.monitor_table));
-            ml4e[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-                l4e_from_pfn(pagetable_get_pfn(v->arch.shadow_table[0]),
-                             __PAGE_HYPERVISOR_RW);
-            unmap_domain_page(ml4e);
-        }
+        __linear_l4_table[l4_linear_offset(SH_LINEAR_PT_VIRT_START)] =
+            l4e_from_pfn(pagetable_get_pfn(v->arch.shadow_table[0]),
+                         __PAGE_HYPERVISOR_RW);
+    }
+    else
+    {
+        l4_pgentry_t *ml4e;
+
+        ml4e = map_domain_page(pagetable_get_mfn(v->arch.monitor_table));
+        ml4e[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
+            l4e_from_pfn(pagetable_get_pfn(v->arch.shadow_table[0]),
+                         __PAGE_HYPERVISOR_RW);
+        unmap_domain_page(ml4e);
     }
 
 #elif SHADOW_PAGING_LEVELS == 3
@@ -3748,7 +3744,6 @@ sh_update_linear_entries(struct vcpu *v)
      * the shadows.
      */
 
-    ASSERT(shadow_mode_external(d));
     {
         /* Install copies of the shadow l3es into the monitor l2 table
          * that maps SH_LINEAR_PT_VIRT_START. */
@@ -3799,20 +3794,16 @@ sh_update_linear_entries(struct vcpu *v)
 #error this should not happen
 #endif
 
-    if ( shadow_mode_external(d) )
-    {
-        /*
-         * Having modified the linear pagetable mapping, flush local host TLBs.
-         * This was not needed when vmenter/vmexit always had the side effect
-         * of flushing host TLBs but, with ASIDs, it is possible to finish
-         * this CR3 update, vmenter the guest, vmexit due to a page fault,
-         * without an intervening host TLB flush. Then the page fault code
-         * could use the linear pagetable to read a top-level shadow page
-         * table entry. But, without this change, it would fetch the wrong
-         * value due to a stale TLB.
-         */
-        flush_tlb_local();
-    }
+    /*
+     * Having modified the linear pagetable mapping, flush local host TLBs.
+     * This was not needed when vmenter/vmexit always had the side effect of
+     * flushing host TLBs but, with ASIDs, it is possible to finish this CR3
+     * update, vmenter the guest, vmexit due to a page fault, without an
+     * intervening host TLB flush. Then the page fault code could use the
+     * linear pagetable to read a top-level shadow page table entry. But,
+     * without this change, it would fetch the wrong value due to a stale TLB.
+     */
+    flush_tlb_local();
 }
 
 


Re: [PATCH 04/10] x86/shadow: sh_update_linear_entries() is a no-op for PV
Posted by Tim Deegan 4 years ago
At 16:26 +0200 on 17 Apr (1587140817), Jan Beulich wrote:
> Consolidate the shadow_mode_external() checks in here: check just once
> at the start of the function.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -3707,34 +3707,30 @@ sh_update_linear_entries(struct vcpu *v)
>       */

This looks good.  Can you please also delete the out-of-date comment
just above that talks about PAE linear pagetables in PV guests,
to make it clear that PV guests don't need any maintenance here?

Cheers,

Tim.

[PATCH 05/10] x86/mm: monitor table is HVM-only
Posted by Jan Beulich 4 years ago
Move the per-vCPU field to the HVM sub-structure.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -545,7 +545,7 @@ void write_ptbase(struct vcpu *v)
  * Should be called after CR3 is updated.
  *
  * Uses values found in vcpu->arch.(guest_table and guest_table_user), and
- * for HVM guests, arch.monitor_table and hvm's guest CR3.
+ * for HVM guests, arch.hvm.monitor_table and hvm's guest CR3.
  *
  * Update ref counts to shadow tables appropriately.
  */
--- a/xen/arch/x86/mm/hap/hap.c
+++ b/xen/arch/x86/mm/hap/hap.c
@@ -393,7 +393,7 @@ static mfn_t hap_make_monitor_table(stru
     l4_pgentry_t *l4e;
     mfn_t m4mfn;
 
-    ASSERT(pagetable_get_pfn(v->arch.monitor_table) == 0);
+    ASSERT(pagetable_get_pfn(v->arch.hvm.monitor_table) == 0);
 
     if ( (pg = hap_alloc(d)) == NULL )
         goto oom;
@@ -579,10 +579,10 @@ void hap_teardown(struct domain *d, bool
         {
             if ( paging_get_hostmode(v) && paging_mode_external(d) )
             {
-                mfn = pagetable_get_mfn(v->arch.monitor_table);
+                mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
                 if ( mfn_valid(mfn) && (mfn_x(mfn) != 0) )
                     hap_destroy_monitor_table(v, mfn);
-                v->arch.monitor_table = pagetable_null();
+                v->arch.hvm.monitor_table = pagetable_null();
             }
         }
     }
@@ -758,10 +758,10 @@ static void hap_update_paging_modes(stru
 
     v->arch.paging.mode = hap_paging_get_mode(v);
 
-    if ( pagetable_is_null(v->arch.monitor_table) )
+    if ( pagetable_is_null(v->arch.hvm.monitor_table) )
     {
         mfn_t mmfn = hap_make_monitor_table(v);
-        v->arch.monitor_table = pagetable_from_mfn(mmfn);
+        v->arch.hvm.monitor_table = pagetable_from_mfn(mmfn);
         make_cr3(v, mmfn);
         hvm_update_host_cr3(v);
     }
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2465,10 +2465,10 @@ static void sh_update_paging_modes(struc
                 &SHADOW_INTERNAL_NAME(sh_paging_mode, 2);
         }
 
-        if ( pagetable_is_null(v->arch.monitor_table) )
+        if ( pagetable_is_null(v->arch.hvm.monitor_table) )
         {
             mfn_t mmfn = v->arch.paging.mode->shadow.make_monitor_table(v);
-            v->arch.monitor_table = pagetable_from_mfn(mmfn);
+            v->arch.hvm.monitor_table = pagetable_from_mfn(mmfn);
             make_cr3(v, mmfn);
             hvm_update_host_cr3(v);
         }
@@ -2502,10 +2502,10 @@ static void sh_update_paging_modes(struc
                     return;
                 }
 
-                old_mfn = pagetable_get_mfn(v->arch.monitor_table);
-                v->arch.monitor_table = pagetable_null();
+                old_mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
+                v->arch.hvm.monitor_table = pagetable_null();
                 new_mfn = v->arch.paging.mode->shadow.make_monitor_table(v);
-                v->arch.monitor_table = pagetable_from_mfn(new_mfn);
+                v->arch.hvm.monitor_table = pagetable_from_mfn(new_mfn);
                 SHADOW_PRINTK("new monitor table %"PRI_mfn "\n",
                                mfn_x(new_mfn));
 
@@ -2724,11 +2724,11 @@ void shadow_teardown(struct domain *d, b
 #ifdef CONFIG_HVM
                 if ( shadow_mode_external(d) )
                 {
-                    mfn_t mfn = pagetable_get_mfn(v->arch.monitor_table);
+                    mfn_t mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
 
                     if ( mfn_valid(mfn) && (mfn_x(mfn) != 0) )
                         v->arch.paging.mode->shadow.destroy_monitor_table(v, mfn);
-                    v->arch.monitor_table = pagetable_null();
+                    v->arch.hvm.monitor_table = pagetable_null();
                 }
 #endif /* CONFIG_HVM */
             }
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1521,7 +1521,7 @@ sh_make_monitor_table(struct vcpu *v)
 {
     struct domain *d = v->domain;
 
-    ASSERT(pagetable_get_pfn(v->arch.monitor_table) == 0);
+    ASSERT(pagetable_get_pfn(v->arch.hvm.monitor_table) == 0);
 
     /* Guarantee we can get the memory we need */
     shadow_prealloc(d, SH_type_monitor_table, CONFIG_PAGING_LEVELS);
@@ -3708,7 +3708,7 @@ sh_update_linear_entries(struct vcpu *v)
 
     /* Don't try to update the monitor table if it doesn't exist */
     if ( !shadow_mode_external(d) ||
-         pagetable_get_pfn(v->arch.monitor_table) == 0 )
+         pagetable_get_pfn(v->arch.hvm.monitor_table) == 0 )
         return;
 
 #if SHADOW_PAGING_LEVELS == 4
@@ -3726,7 +3726,7 @@ sh_update_linear_entries(struct vcpu *v)
     {
         l4_pgentry_t *ml4e;
 
-        ml4e = map_domain_page(pagetable_get_mfn(v->arch.monitor_table));
+        ml4e = map_domain_page(pagetable_get_mfn(v->arch.hvm.monitor_table));
         ml4e[l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
             l4e_from_pfn(pagetable_get_pfn(v->arch.shadow_table[0]),
                          __PAGE_HYPERVISOR_RW);
@@ -3761,7 +3761,7 @@ sh_update_linear_entries(struct vcpu *v)
             l4_pgentry_t *ml4e;
             l3_pgentry_t *ml3e;
             int linear_slot = shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START);
-            ml4e = map_domain_page(pagetable_get_mfn(v->arch.monitor_table));
+            ml4e = map_domain_page(pagetable_get_mfn(v->arch.hvm.monitor_table));
 
             ASSERT(l4e_get_flags(ml4e[linear_slot]) & _PAGE_PRESENT);
             l3mfn = l4e_get_mfn(ml4e[linear_slot]);
@@ -4096,7 +4096,7 @@ sh_update_cr3(struct vcpu *v, int do_loc
     ///
     if ( shadow_mode_external(d) )
     {
-        make_cr3(v, pagetable_get_mfn(v->arch.monitor_table));
+        make_cr3(v, pagetable_get_mfn(v->arch.hvm.monitor_table));
     }
 #if SHADOW_PAGING_LEVELS == 4
     else // not shadow_mode_external...
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -581,7 +581,6 @@ struct arch_vcpu
     /* guest_table holds a ref to the page, and also a type-count unless
      * shadow refcounts are in use */
     pagetable_t shadow_table[4];        /* (MFN) shadow(s) of guest */
-    pagetable_t monitor_table;          /* (MFN) hypervisor PT (for HVM) */
     unsigned long cr3;                  /* (MA) value to install in HW CR3 */
 
     /*
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -176,6 +176,9 @@ struct hvm_vcpu {
         uint16_t p2midx;
     } fast_single_step;
 
+    /* (MFN) hypervisor page table */
+    pagetable_t         monitor_table;
+
     struct hvm_vcpu_asid n1asid;
 
     u64                 msr_tsc_adjust;


[PATCH 06/10] x86/shadow: sh_remove_write_access_from_sl1p() can be static
Posted by Jan Beulich 4 years ago
It's only used by common.c.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -38,6 +38,9 @@
 #include <xen/numa.h>
 #include "private.h"
 
+static int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
+                                            mfn_t smfn, unsigned long offset);
+
 DEFINE_PER_CPU(uint32_t,trace_shadow_path_flags);
 
 static int sh_enable_log_dirty(struct domain *, bool log_global);
@@ -1999,8 +2002,8 @@ int sh_remove_write_access(struct domain
 }
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
-int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
-                                     mfn_t smfn, unsigned long off)
+static int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
+                                            mfn_t smfn, unsigned long off)
 {
     struct page_info *sp = mfn_to_page(smfn);
 
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -396,9 +396,6 @@ void sh_resync(struct domain *d, mfn_t g
 
 void oos_fixup_add(struct domain *d, mfn_t gmfn, mfn_t smfn, unsigned long off);
 
-int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
-                                     mfn_t smfn, unsigned long offset);
-
 /* Pull all out-of-sync shadows back into sync.  If skip != 0, we try
  * to avoid resyncing where we think we can get away with it. */
 


[PATCH 07/10] x86/shadow: the guess_wrmap() hook is needed for HVM only
Posted by Jan Beulich 4 years ago
sh_remove_write_access() bails early for !external guests, and hence
building it (and thus the need for the hook) can be suppressed
altogether in !HVM configs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1769,6 +1769,7 @@ static inline void trace_shadow_wrmap_bf
     }
 }
 
+#ifdef CONFIG_HVM
 /**************************************************************************/
 /* Remove all writeable mappings of a guest frame from the shadow tables
  * Returns non-zero if we need to flush TLBs.
@@ -2000,6 +2001,7 @@ int sh_remove_write_access(struct domain
     /* We killed at least one writeable mapping, so must flush TLBs. */
     return 1;
 }
+#endif /* CONFIG_HVM */
 
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 static int sh_remove_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -4203,7 +4203,7 @@ int sh_rm_write_access_from_sl1p(struct
 }
 #endif /* OOS */
 
-#if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
+#if defined(CONFIG_HVM) && (SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC)
 static int sh_guess_wrmap(struct vcpu *v, unsigned long vaddr, mfn_t gmfn)
 /* Look up this vaddr in the current shadow and see if it's a writeable
  * mapping of this gmfn.  If so, remove it.  Returns 1 if it worked. */
@@ -4875,10 +4875,10 @@ const struct paging_mode sh_paging_mode
 #ifdef CONFIG_HVM
     .shadow.make_monitor_table     = sh_make_monitor_table,
     .shadow.destroy_monitor_table  = sh_destroy_monitor_table,
-#endif
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
     .shadow.guess_wrmap            = sh_guess_wrmap,
 #endif
+#endif /* CONFIG_HVM */
     .shadow.pagetable_dying        = sh_pagetable_dying,
     .shadow.trace_emul_write_val   = trace_emulate_write_val,
     .shadow.shadow_levels          = SHADOW_PAGING_LEVELS,
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -359,6 +359,7 @@ void sh_install_xen_entries_in_l4(struct
 /* Update the shadows in response to a pagetable write from Xen */
 int sh_validate_guest_entry(struct vcpu *v, mfn_t gmfn, void *entry, u32 size);
 
+#ifdef CONFIG_HVM
 /* Remove all writeable mappings of a guest frame from the shadows.
  * Returns non-zero if we need to flush TLBs.
  * level and fault_addr desribe how we found this to be a pagetable;
@@ -366,6 +367,14 @@ int sh_validate_guest_entry(struct vcpu
 extern int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
                                   unsigned int level,
                                   unsigned long fault_addr);
+#else
+static inline int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
+                                         unsigned int level,
+                                         unsigned long fault_addr)
+{
+    return 0;
+}
+#endif
 
 /* Functions that atomically write PT/P2M entries and update state */
 int shadow_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -105,9 +105,9 @@ struct shadow_paging_mode {
 #ifdef CONFIG_HVM
     mfn_t         (*make_monitor_table    )(struct vcpu *v);
     void          (*destroy_monitor_table )(struct vcpu *v, mfn_t mmfn);
-#endif
     int           (*guess_wrmap           )(struct vcpu *v, 
                                             unsigned long vaddr, mfn_t gmfn);
+#endif
     void          (*pagetable_dying       )(paddr_t gpa);
     void          (*trace_emul_write_val  )(const void *ptr, unsigned long vaddr,
                                             const void *src, unsigned int bytes);


Re: [PATCH 07/10] x86/shadow: the guess_wrmap() hook is needed for HVM only
Posted by Tim Deegan 4 years ago
At 16:28 +0200 on 17 Apr (1587140897), Jan Beulich wrote:
> sh_remove_write_access() bails early for !external guests, and hence
> building it (and thus the need for the hook) can be suppressed
> altogether in !HVM configs.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

> @@ -366,6 +367,14 @@ int sh_validate_guest_entry(struct vcpu
>  extern int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
>                                    unsigned int level,
>                                    unsigned long fault_addr);
> +#else
> +static inline int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
> +                                         unsigned int level,
> +                                         unsigned long fault_addr)
> +{

Can we have an ASSERT(!shadow_mode_refcounts(d)) here, please,
matching the check that would have made it a noop before?

Cheers,

Tim.


Re: [PATCH 07/10] x86/shadow: the guess_wrmap() hook is needed for HVM only
Posted by Jan Beulich 4 years ago
On 18.04.2020 11:03, Tim Deegan wrote:
> At 16:28 +0200 on 17 Apr (1587140897), Jan Beulich wrote:
>> sh_remove_write_access() bails early for !external guests, and hence
>> building it (and thus the need for the hook) can be suppressed
>> altogether in !HVM configs.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
>> @@ -366,6 +367,14 @@ int sh_validate_guest_entry(struct vcpu
>>  extern int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
>>                                    unsigned int level,
>>                                    unsigned long fault_addr);
>> +#else
>> +static inline int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
>> +                                         unsigned int level,
>> +                                         unsigned long fault_addr)
>> +{
> 
> Can we have an ASSERT(!shadow_mode_refcounts(d)) here, please,
> matching the check that would have made it a noop before?

I've added one, but I find this quite odd in a !HVM build, where

#define PG_refcounts   0

and

#define paging_mode_refcounts(_d) (!!((_d)->arch.paging.mode & PG_refcounts))

Perhaps you're wanting this mainly for documentation purposes?

Jan

Re: [PATCH 07/10] x86/shadow: the guess_wrmap() hook is needed for HVM only
Posted by Tim Deegan 4 years ago
At 15:06 +0200 on 20 Apr (1587395210), Jan Beulich wrote:
> On 18.04.2020 11:03, Tim Deegan wrote:
> > At 16:28 +0200 on 17 Apr (1587140897), Jan Beulich wrote:
> >> sh_remove_write_access() bails early for !external guests, and hence
> >> building it (and thus the need for the hook) can be suppressed
> >> altogether in !HVM configs.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> > 
> >> @@ -366,6 +367,14 @@ int sh_validate_guest_entry(struct vcpu
> >>  extern int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
> >>                                    unsigned int level,
> >>                                    unsigned long fault_addr);
> >> +#else
> >> +static inline int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
> >> +                                         unsigned int level,
> >> +                                         unsigned long fault_addr)
> >> +{
> > 
> > Can we have an ASSERT(!shadow_mode_refcounts(d)) here, please,
> > matching the check that would have made it a noop before?
> 
> I've added one, but I find this quite odd in a !HVM build, where
> 
> #define PG_refcounts   0
> 
> and
> 
> #define paging_mode_refcounts(_d) (!!((_d)->arch.paging.mode & PG_refcounts))
> 
> Perhaps you're wanting this mainly for documentation purposes?


Yeah, that and future-proofing.  If !HVM builds ever start using
paging_mode_refcounts then this assertion will forcibly remind us that
we need changes here.  I'm glad that it compiles away to nothing in
the meantime.

Cheers,

Tim.
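
(Putting the hunk above together with this exchange, the !HVM stub would
presumably end up as the following sketch; the exact committed form isn't
shown in this thread.)

static inline int sh_remove_write_access(struct domain *d, mfn_t readonly_mfn,
                                         unsigned int level,
                                         unsigned long fault_addr)
{
    /* Compiles away in release builds; documents the expectation that
     * no !HVM caller can reach this with refcounted shadows. */
    ASSERT(!shadow_mode_refcounts(d));
    return 0;
}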

[PATCH 08/10] x86/mm: pagetable_dying() is HVM-only
Posted by Jan Beulich 4 years ago
Its only caller lives in HVM-only code.

This involves wider changes, in order to limit #ifdef-ary: Shadow's
SHOPT_FAST_EMULATION and the fields used by it get constrained to HVM
builds as well. Additionally the shadow_{init,continue}_emulation()
stubs for the !HVM case aren't needed anymore and hence get dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -851,6 +851,7 @@ int paging_enable(struct domain *d, u32
         return shadow_enable(d, mode);
 }
 
+#ifdef CONFIG_HVM
 /* Called from the guest to indicate that a process is being torn down
  * and therefore its pagetables will soon be discarded */
 void pagetable_dying(paddr_t gpa)
@@ -865,6 +866,7 @@ void pagetable_dying(paddr_t gpa)
     BUG();
 #endif
 }
+#endif /* CONFIG_HVM */
 
 /* Print paging-assistance info to the console */
 void paging_dump_domain_info(struct domain *d)
--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -66,7 +66,9 @@ int shadow_domain_init(struct domain *d)
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
     d->arch.paging.shadow.oos_active = 0;
 #endif
+#ifdef CONFIG_HVM
     d->arch.paging.shadow.pagetable_dying_op = 0;
+#endif
 
     return 0;
 }
@@ -690,8 +692,10 @@ void shadow_promote(struct domain *d, mf
     if ( !test_and_set_bit(_PGC_page_table, &page->count_info) )
     {
         page->shadow_flags = 0;
+#ifdef CONFIG_HVM
         if ( is_hvm_domain(d) )
             page->pagetable_dying = false;
+#endif
     }
 
     ASSERT(!(page->shadow_flags & (1u << type)));
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -2764,8 +2764,10 @@ static int sh_page_fault(struct vcpu *v,
     mfn_t gmfn, sl1mfn = _mfn(0);
     shadow_l1e_t sl1e, *ptr_sl1e;
     paddr_t gpa;
+#ifdef CONFIG_HVM
     struct sh_emulate_ctxt emul_ctxt;
     const struct x86_emulate_ops *emul_ops;
+#endif
     int r;
     p2m_type_t p2mt;
     uint32_t rc, error_code;
@@ -3253,7 +3255,13 @@ static int sh_page_fault(struct vcpu *v,
      * caught by user-mode page-table check above.
      */
  emulate_readonly:
+    if ( !is_hvm_domain(d) )
+    {
+        ASSERT_UNREACHABLE();
+        goto not_a_shadow_fault;
+    }
 
+#ifdef CONFIG_HVM
     /* Unshadow if we are writing to a toplevel pagetable that is
      * flagged as a dying process, and that is not currently used. */
     if ( sh_mfn_is_a_page_table(gmfn) && is_hvm_domain(d) &&
@@ -3302,31 +3310,28 @@ static int sh_page_fault(struct vcpu *v,
 #if SHADOW_OPTIMIZATIONS & SHOPT_FAST_EMULATION
  early_emulation:
 #endif
-    if ( is_hvm_domain(d) )
+    /*
+     * If we are in the middle of injecting an exception or interrupt then
+     * we should not emulate: the fault is a side effect of the processor
+     * trying to deliver the exception (e.g. IDT/GDT accesses, pushing the
+     * exception frame onto the stack).  Furthermore it is almost
+     * certainly the case the handler stack is currently considered to be
+     * a page table, so we should unshadow the faulting page before
+     * exiting.
+     */
+    if ( unlikely(hvm_event_pending(v)) )
     {
-        /*
-         * If we are in the middle of injecting an exception or interrupt then
-         * we should not emulate: the fault is a side effect of the processor
-         * trying to deliver the exception (e.g. IDT/GDT accesses, pushing the
-         * exception frame onto the stack).  Furthermore it is almost
-         * certainly the case the handler stack is currently considered to be
-         * a page table, so we should unshadow the faulting page before
-         * exiting.
-         */
-        if ( unlikely(hvm_event_pending(v)) )
-        {
 #if SHADOW_OPTIMIZATIONS & SHOPT_FAST_EMULATION
-            if ( fast_emul )
-            {
-                perfc_incr(shadow_fault_fast_emulate_fail);
-                v->arch.paging.last_write_emul_ok = 0;
-            }
-#endif
-            sh_remove_shadows(d, gmfn, 0 /* thorough */, 1 /* must succeed */);
-            trace_shadow_emulate_other(TRC_SHADOW_EMULATE_UNSHADOW_EVTINJ,
-                                       va, gfn);
-            return EXCRET_fault_fixed;
+        if ( fast_emul )
+        {
+            perfc_incr(shadow_fault_fast_emulate_fail);
+            v->arch.paging.last_write_emul_ok = 0;
         }
+#endif
+        sh_remove_shadows(d, gmfn, 0 /* thorough */, 1 /* must succeed */);
+        trace_shadow_emulate_other(TRC_SHADOW_EMULATE_UNSHADOW_EVTINJ,
+                                   va, gfn);
+        return EXCRET_fault_fixed;
     }
 
     SHADOW_PRINTK("emulate: eip=%#lx esp=%#lx\n", regs->rip, regs->rsp);
@@ -3334,11 +3339,8 @@ static int sh_page_fault(struct vcpu *v,
     emul_ops = shadow_init_emulation(&emul_ctxt, regs, GUEST_PTE_SIZE);
 
     r = x86_emulate(&emul_ctxt.ctxt, emul_ops);
-
-#ifdef CONFIG_HVM
     if ( r == X86EMUL_EXCEPTION )
     {
-        ASSERT(is_hvm_domain(d));
         /*
          * This emulation covers writes to shadow pagetables.  We tolerate #PF
          * (from accesses spanning pages, concurrent paging updated from
@@ -3360,7 +3362,6 @@ static int sh_page_fault(struct vcpu *v,
             r = X86EMUL_UNHANDLEABLE;
         }
     }
-#endif
 
     /*
      * NB. We do not unshadow on X86EMUL_EXCEPTION. It's not clear that it
@@ -3466,6 +3467,7 @@ static int sh_page_fault(struct vcpu *v,
  emulate_done:
     SHADOW_PRINTK("emulated\n");
     return EXCRET_fault_fixed;
+#endif /* CONFIG_HVM */
 
  mmio:
     if ( !guest_mode(regs) )
@@ -4157,7 +4159,9 @@ sh_update_cr3(struct vcpu *v, int do_loc
 int sh_rm_write_access_from_sl1p(struct domain *d, mfn_t gmfn,
                                  mfn_t smfn, unsigned long off)
 {
+#ifdef CONFIG_HVM
     struct vcpu *curr = current;
+#endif
     int r;
     shadow_l1e_t *sl1p, sl1e;
     struct page_info *sp;
@@ -4165,10 +4169,12 @@ int sh_rm_write_access_from_sl1p(struct
     ASSERT(mfn_valid(gmfn));
     ASSERT(mfn_valid(smfn));
 
+#ifdef CONFIG_HVM
     /* Remember if we've been told that this process is being torn down */
     if ( curr->domain == d && is_hvm_domain(d) )
         curr->arch.paging.shadow.pagetable_dying
             = mfn_to_page(gmfn)->pagetable_dying;
+#endif
 
     sp = mfn_to_page(smfn);
 
@@ -4424,6 +4430,7 @@ int sh_remove_l3_shadow(struct domain *d
 }
 #endif /* 64bit guest */
 
+#ifdef CONFIG_HVM
 /**************************************************************************/
 /* Function for the guest to inform us that a process is being torn
  * down.  We remember that as a hint to unshadow its pagetables soon,
@@ -4545,6 +4552,7 @@ static void sh_pagetable_dying(paddr_t g
     put_gfn(d, gpa >> PAGE_SHIFT);
 }
 #endif
+#endif /* CONFIG_HVM */
 
 /**************************************************************************/
 /* Audit tools */
@@ -4878,8 +4886,8 @@ const struct paging_mode sh_paging_mode
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
     .shadow.guess_wrmap            = sh_guess_wrmap,
 #endif
-#endif /* CONFIG_HVM */
     .shadow.pagetable_dying        = sh_pagetable_dying,
+#endif /* CONFIG_HVM */
     .shadow.trace_emul_write_val   = trace_emulate_write_val,
     .shadow.shadow_levels          = SHADOW_PAGING_LEVELS,
 };
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -66,7 +66,11 @@ extern int shadow_audit_enable;
 #define SHOPT_FAST_EMULATION      0x80  /* Fast write emulation */
 #define SHOPT_OUT_OF_SYNC        0x100  /* Allow guest writes to L1 PTs */
 
+#ifdef CONFIG_HVM
 #define SHADOW_OPTIMIZATIONS     0x1ff
+#else
+#define SHADOW_OPTIMIZATIONS     (0x1ff & ~SHOPT_FAST_EMULATION)
+#endif
 
 
 /******************************************************************************
@@ -715,26 +719,11 @@ struct sh_emulate_ctxt {
 #endif
 };
 
-#ifdef CONFIG_HVM
 const struct x86_emulate_ops *shadow_init_emulation(
     struct sh_emulate_ctxt *sh_ctxt, struct cpu_user_regs *regs,
     unsigned int pte_size);
 void shadow_continue_emulation(
     struct sh_emulate_ctxt *sh_ctxt, struct cpu_user_regs *regs);
-#else
-static inline const struct x86_emulate_ops *shadow_init_emulation(
-    struct sh_emulate_ctxt *sh_ctxt, struct cpu_user_regs *regs,
-    unsigned int pte_size)
-{
-    BUG();
-    return NULL;
-}
-static inline void shadow_continue_emulation(
-    struct sh_emulate_ctxt *sh_ctxt, struct cpu_user_regs *regs)
-{
-    BUG();
-}
-#endif
 
 /* Stop counting towards early unshadows, as we've seen a real page fault */
 static inline void sh_reset_early_unshadow(struct vcpu *v)
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -117,8 +117,10 @@ struct shadow_domain {
     /* OOS */
     bool_t oos_active;
 
+#ifdef CONFIG_HVM
     /* Has this domain ever used HVMOP_pagetable_dying? */
     bool_t pagetable_dying_op;
+#endif
 
 #ifdef CONFIG_PV
     /* PV L1 Terminal Fault mitigation. */
@@ -137,10 +139,12 @@ struct shadow_vcpu {
     unsigned long last_emulated_mfn_for_unshadow;
     /* MFN of the last shadow that we shot a writeable mapping in */
     unsigned long last_writeable_pte_smfn;
+#ifdef CONFIG_HVM
     /* Last frame number that we emulated a write to. */
     unsigned long last_emulated_frame;
     /* Last MFN that we emulated a write successfully */
     unsigned long last_emulated_mfn;
+#endif
 
     /* Shadow out-of-sync: pages that this vcpu has let go out of sync */
     mfn_t oos[SHADOW_OOS_PAGES];
@@ -151,8 +155,10 @@ struct shadow_vcpu {
         unsigned long off[SHADOW_OOS_FIXUPS];
     } oos_fixup[SHADOW_OOS_PAGES];
 
+#ifdef CONFIG_HVM
     bool_t pagetable_dying;
 #endif
+#endif
 };
 
 /************************************************/
@@ -225,10 +231,12 @@ struct paging_vcpu {
     const struct paging_mode *mode;
     /* Nested Virtualization: paging mode of nested guest */
     const struct paging_mode *nestedmode;
+#ifdef CONFIG_HVM
     /* HVM guest: last emulate was to a pagetable */
     unsigned int last_write_was_pt:1;
     /* HVM guest: last write emulation succeeds */
     unsigned int last_write_emul_ok:1;
+#endif
     /* Translated guest: virtual TLB */
     struct shadow_vtlb *vtlb;
     spinlock_t          vtlb_lock;
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -287,7 +287,9 @@ struct page_info
          */
         struct {
             uint16_t shadow_flags;
+#ifdef CONFIG_HVM
             bool pagetable_dying;
+#endif
         };
 
         /* When in use as a shadow, next shadow in this hash chain. */
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -107,8 +107,8 @@ struct shadow_paging_mode {
     void          (*destroy_monitor_table )(struct vcpu *v, mfn_t mmfn);
     int           (*guess_wrmap           )(struct vcpu *v, 
                                             unsigned long vaddr, mfn_t gmfn);
-#endif
     void          (*pagetable_dying       )(paddr_t gpa);
+#endif
     void          (*trace_emul_write_val  )(const void *ptr, unsigned long vaddr,
                                             const void *src, unsigned int bytes);
 #endif


[PATCH 09/10] x86/shadow: the trace_emul_write_val() hook is HVM-only
Posted by Jan Beulich 4 years ago
Its only caller lives in HVM-only code, and the only caller of
trace_shadow_emulate() also already sits in an HVM-only code section.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -2694,6 +2694,7 @@ static inline void trace_shadow_emulate_
     }
 }
 
+#ifdef CONFIG_HVM
 #if GUEST_PAGING_LEVELS == 3
 static DEFINE_PER_CPU(guest_va_t,trace_emulate_initial_va);
 static DEFINE_PER_CPU(int,trace_extra_emulation_count);
@@ -2745,6 +2746,7 @@ static inline void trace_shadow_emulate(
         __trace_var(event, 0/*!tsc*/, sizeof(d), &d);
     }
 }
+#endif /* CONFIG_HVM */
 
 /**************************************************************************/
 /* Entry points into the shadow code */
@@ -4887,8 +4889,8 @@ const struct paging_mode sh_paging_mode
     .shadow.guess_wrmap            = sh_guess_wrmap,
 #endif
     .shadow.pagetable_dying        = sh_pagetable_dying,
-#endif /* CONFIG_HVM */
     .shadow.trace_emul_write_val   = trace_emulate_write_val,
+#endif /* CONFIG_HVM */
     .shadow.shadow_levels          = SHADOW_PAGING_LEVELS,
 };
 
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -108,10 +108,10 @@ struct shadow_paging_mode {
     int           (*guess_wrmap           )(struct vcpu *v, 
                                             unsigned long vaddr, mfn_t gmfn);
     void          (*pagetable_dying       )(paddr_t gpa);
-#endif
     void          (*trace_emul_write_val  )(const void *ptr, unsigned long vaddr,
                                             const void *src, unsigned int bytes);
 #endif
+#endif
     /* For outsiders to tell what mode we're in */
     unsigned int shadow_levels;
 };


[PATCH 10/10] x86/shadow: don't open-code shadow_blow_tables_per_domain()
Posted by Jan Beulich 4 years ago
Make shadow_blow_all_tables() call the designated function, and on this
occasion make the function itself use domain_vcpu().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -1005,7 +1005,8 @@ static void shadow_blow_tables(struct do
 
 void shadow_blow_tables_per_domain(struct domain *d)
 {
-    if ( shadow_mode_enabled(d) && d->vcpu != NULL && d->vcpu[0] != NULL ) {
+    if ( shadow_mode_enabled(d) && domain_vcpu(d, 0) )
+    {
         paging_lock(d);
         shadow_blow_tables(d);
         paging_unlock(d);
@@ -1022,14 +1023,7 @@ static void shadow_blow_all_tables(unsig
     printk("'%c' pressed -> blowing all shadow tables\n", c);
     rcu_read_lock(&domlist_read_lock);
     for_each_domain(d)
-    {
-        if ( shadow_mode_enabled(d) && d->vcpu != NULL && d->vcpu[0] != NULL )
-        {
-            paging_lock(d);
-            shadow_blow_tables(d);
-            paging_unlock(d);
-        }
-    }
+        shadow_blow_tables_per_domain(d);
     rcu_read_unlock(&domlist_read_lock);
 }
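
(For context, domain_vcpu() - from xen/include/xen/sched.h - bundles
exactly the two open-coded NULL checks being replaced. A simplified
sketch of its shape, with the exact speculative-hardening details
assumed:)

static inline struct vcpu *domain_vcpu(const struct domain *d,
                                       unsigned int vcpu_id)
{
    /* Clamp the array index under speculation before using it. */
    unsigned int idx = array_index_nospec(vcpu_id, d->max_vcpus);

    return vcpu_id >= d->max_vcpus ? NULL : d->vcpu[idx];
}

Provided d->max_vcpus is non-zero only once d->vcpu has been allocated,
domain_vcpu(d, 0) returns NULL in exactly the cases the old
d->vcpu != NULL && d->vcpu[0] != NULL condition guarded against.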