The change addressing the shadow related build issue noticed by Andrew
went in already. The build breakage goes beyond this specific
combination though - PV_SHIM_EXCLUSIVE plus HVM is similarly an issue.
This is what the 1st patch tries to take care of, in a shape already
noticed on irc to be controversial. I'm submitting the change
nevertheless because for the moment there looks to be a majority in
favor of going this route. One argument not voiced there yet: what good
does it do to allow a user to enable HVM when, on the resulting
hypervisor, they still can't run HVM guests (the hypervisor still being
a dedicated PV shim one)? On top of this, the alternative approach is
likely going to get ugly.

The shadow related adjustments are here merely because the desire to
make them was noticed in the context of the patch which has already
gone in.

1: don't permit HVM and PV_SHIM_EXCLUSIVE at the same time
2: refactor shadow_vram_{get,put}_l1e()
3: sh_{make,destroy}_monitor_table() are "even more" HVM-only

Jan
This combination doesn't really make sense (and there likely are more);
in particular, even if the code built with both options set, HVM guests
wouldn't work (and I think one wouldn't be able to create one in the
first place). The alternative here would be some presumably intrusive
#ifdef-ary to get this combination to actually build (but still not
work) again.

Fixes: 8b5b49ceb3d9 ("x86: don't include domctl and alike in shim-exclusive builds")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
---
v2: Restore lost default setting.

--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -23,7 +23,7 @@ config X86
 	select HAS_PDX
 	select HAS_SCHED_GRANULARITY
 	select HAS_UBSAN
-	select HAS_VPCI if !PV_SHIM_EXCLUSIVE && HVM
+	select HAS_VPCI if HVM
 	select NEEDS_LIBELF
 	select NUMA
@@ -90,8 +90,9 @@ config PV_LINEAR_PT
 	  If unsure, say Y.
 
 config HVM
-	def_bool !PV_SHIM_EXCLUSIVE
-	prompt "HVM support"
+	bool "HVM support"
+	depends on !PV_SHIM_EXCLUSIVE
+	default y
 	---help---
 	  Interfaces to support HVM domains.  HVM domains require hardware
 	  virtualisation extensions (e.g. Intel VT-x, AMD SVM), but can boot
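As a minimal sketch of the breakage class the commit message describes
(the function names below are made up for illustration; this is not the
actual Xen code): with both options enabled, HVM-only translation units
can still reference helpers whose definitions shim-exclusive builds
compile out, so the build fails at link time unless #ifdef-ary spreads
into the HVM side.

/* domctl.c - definitions compiled out in shim-exclusive builds */
#ifndef CONFIG_PV_SHIM_EXCLUSIVE
int do_domctl_helper(void)   /* hypothetical name */
{
    return 0;
}
#endif

/* hvm.c - built whenever CONFIG_HVM=y */
#ifdef CONFIG_HVM
int hvm_create_helper(void)  /* hypothetical name */
{
    /*
     * With CONFIG_HVM=y and CONFIG_PV_SHIM_EXCLUSIVE=y this call has
     * no definition to link against - hence resolving the conflict at
     * the Kconfig level ("depends on !PV_SHIM_EXCLUSIVE") rather than
     * with per-call-site #ifdef-ary.
     */
    return do_domctl_helper();
}
#endif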
By passing the functions an MFN and flags, only a single instance of
each is needed; they were pretty large for being inline functions
anyway.

While moving the code, also adjust coding style and add const where
sensible / possible.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -903,6 +903,104 @@ int shadow_track_dirty_vram(struct domai
     return rc;
 }
 
+void shadow_vram_get_mfn(mfn_t mfn, unsigned int l1f,
+                         mfn_t sl1mfn, const void *sl1e,
+                         const struct domain *d)
+{
+    unsigned long gfn;
+    struct sh_dirty_vram *dirty_vram = d->arch.hvm.dirty_vram;
+
+    ASSERT(is_hvm_domain(d));
+
+    if ( !dirty_vram /* tracking disabled? */ ||
+         !(l1f & _PAGE_RW) /* read-only mapping? */ ||
+         !mfn_valid(mfn) /* mfn can be invalid in mmio_direct */ )
+        return;
+
+    gfn = gfn_x(mfn_to_gfn(d, mfn));
+    /* Page sharing not supported on shadow PTs */
+    BUG_ON(SHARED_M2P(gfn));
+
+    if ( (gfn >= dirty_vram->begin_pfn) && (gfn < dirty_vram->end_pfn) )
+    {
+        unsigned long i = gfn - dirty_vram->begin_pfn;
+        const struct page_info *page = mfn_to_page(mfn);
+
+        if ( (page->u.inuse.type_info & PGT_count_mask) == 1 )
+            /* Initial guest reference, record it */
+            dirty_vram->sl1ma[i] = mfn_to_maddr(sl1mfn) |
+                                   PAGE_OFFSET(sl1e);
+    }
+}
+
+void shadow_vram_put_mfn(mfn_t mfn, unsigned int l1f,
+                         mfn_t sl1mfn, const void *sl1e,
+                         const struct domain *d)
+{
+    unsigned long gfn;
+    struct sh_dirty_vram *dirty_vram = d->arch.hvm.dirty_vram;
+
+    ASSERT(is_hvm_domain(d));
+
+    if ( !dirty_vram /* tracking disabled? */ ||
+         !(l1f & _PAGE_RW) /* read-only mapping? */ ||
+         !mfn_valid(mfn) /* mfn can be invalid in mmio_direct */ )
+        return;
+
+    gfn = gfn_x(mfn_to_gfn(d, mfn));
+    /* Page sharing not supported on shadow PTs */
+    BUG_ON(SHARED_M2P(gfn));
+
+    if ( (gfn >= dirty_vram->begin_pfn) && (gfn < dirty_vram->end_pfn) )
+    {
+        unsigned long i = gfn - dirty_vram->begin_pfn;
+        const struct page_info *page = mfn_to_page(mfn);
+        bool dirty = false;
+        paddr_t sl1ma = mfn_to_maddr(sl1mfn) | PAGE_OFFSET(sl1e);
+
+        if ( (page->u.inuse.type_info & PGT_count_mask) == 1 )
+        {
+            /* Last reference */
+            if ( dirty_vram->sl1ma[i] == INVALID_PADDR )
+            {
+                /* We didn't know it was that one, let's say it is dirty */
+                dirty = true;
+            }
+            else
+            {
+                ASSERT(dirty_vram->sl1ma[i] == sl1ma);
+                dirty_vram->sl1ma[i] = INVALID_PADDR;
+                if ( l1f & _PAGE_DIRTY )
+                    dirty = true;
+            }
+        }
+        else
+        {
+            /* We had more than one reference, just consider the page dirty. */
+            dirty = true;
+            /* Check that it's not the one we recorded. */
+            if ( dirty_vram->sl1ma[i] == sl1ma )
+            {
+                /* Too bad, we remembered the wrong one... */
+                dirty_vram->sl1ma[i] = INVALID_PADDR;
+            }
+            else
+            {
+                /*
+                 * Ok, our recorded sl1e is still pointing to this page, let's
+                 * just hope it will remain.
+                 */
+            }
+        }
+
+        if ( dirty )
+        {
+            dirty_vram->dirty_bitmap[i / 8] |= 1 << (i % 8);
+            dirty_vram->last_dirty = NOW();
+        }
+    }
+}
+
 /*
  * Local variables:
  * mode: C
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1047,107 +1047,6 @@ static int shadow_set_l2e(struct domain
     return flags;
 }
 
-static inline void shadow_vram_get_l1e(shadow_l1e_t new_sl1e,
-                                       shadow_l1e_t *sl1e,
-                                       mfn_t sl1mfn,
-                                       struct domain *d)
-{
-#ifdef CONFIG_HVM
-    mfn_t mfn = shadow_l1e_get_mfn(new_sl1e);
-    int flags = shadow_l1e_get_flags(new_sl1e);
-    unsigned long gfn;
-    struct sh_dirty_vram *dirty_vram = d->arch.hvm.dirty_vram;
-
-    if ( !is_hvm_domain(d) || !dirty_vram /* tracking disabled? */
-         || !(flags & _PAGE_RW) /* read-only mapping? */
-         || !mfn_valid(mfn) ) /* mfn can be invalid in mmio_direct */
-        return;
-
-    gfn = gfn_x(mfn_to_gfn(d, mfn));
-    /* Page sharing not supported on shadow PTs */
-    BUG_ON(SHARED_M2P(gfn));
-
-    if ( (gfn >= dirty_vram->begin_pfn) && (gfn < dirty_vram->end_pfn) )
-    {
-        unsigned long i = gfn - dirty_vram->begin_pfn;
-        struct page_info *page = mfn_to_page(mfn);
-
-        if ( (page->u.inuse.type_info & PGT_count_mask) == 1 )
-            /* Initial guest reference, record it */
-            dirty_vram->sl1ma[i] = mfn_to_maddr(sl1mfn)
-                | ((unsigned long)sl1e & ~PAGE_MASK);
-    }
-#endif
-}
-
-static inline void shadow_vram_put_l1e(shadow_l1e_t old_sl1e,
-                                       shadow_l1e_t *sl1e,
-                                       mfn_t sl1mfn,
-                                       struct domain *d)
-{
-#ifdef CONFIG_HVM
-    mfn_t mfn = shadow_l1e_get_mfn(old_sl1e);
-    int flags = shadow_l1e_get_flags(old_sl1e);
-    unsigned long gfn;
-    struct sh_dirty_vram *dirty_vram = d->arch.hvm.dirty_vram;
-
-    if ( !is_hvm_domain(d) || !dirty_vram /* tracking disabled? */
-         || !(flags & _PAGE_RW) /* read-only mapping? */
-         || !mfn_valid(mfn) ) /* mfn can be invalid in mmio_direct */
-        return;
-
-    gfn = gfn_x(mfn_to_gfn(d, mfn));
-    /* Page sharing not supported on shadow PTs */
-    BUG_ON(SHARED_M2P(gfn));
-
-    if ( (gfn >= dirty_vram->begin_pfn) && (gfn < dirty_vram->end_pfn) )
-    {
-        unsigned long i = gfn - dirty_vram->begin_pfn;
-        struct page_info *page = mfn_to_page(mfn);
-        int dirty = 0;
-        paddr_t sl1ma = mfn_to_maddr(sl1mfn)
-            | ((unsigned long)sl1e & ~PAGE_MASK);
-
-        if ( (page->u.inuse.type_info & PGT_count_mask) == 1 )
-        {
-            /* Last reference */
-            if ( dirty_vram->sl1ma[i] == INVALID_PADDR ) {
-                /* We didn't know it was that one, let's say it is dirty */
-                dirty = 1;
-            }
-            else
-            {
-                ASSERT(dirty_vram->sl1ma[i] == sl1ma);
-                dirty_vram->sl1ma[i] = INVALID_PADDR;
-                if ( flags & _PAGE_DIRTY )
-                    dirty = 1;
-            }
-        }
-        else
-        {
-            /* We had more than one reference, just consider the page dirty. */
-            dirty = 1;
-            /* Check that it's not the one we recorded. */
-            if ( dirty_vram->sl1ma[i] == sl1ma )
-            {
-                /* Too bad, we remembered the wrong one... */
-                dirty_vram->sl1ma[i] = INVALID_PADDR;
-            }
-            else
-            {
-                /* Ok, our recorded sl1e is still pointing to this page, let's
-                 * just hope it will remain. */
-            }
-        }
-        if ( dirty )
-        {
-            dirty_vram->dirty_bitmap[i / 8] |= 1 << (i % 8);
-            dirty_vram->last_dirty = NOW();
-        }
-    }
-#endif
-}
-
 static int shadow_set_l1e(struct domain *d,
                           shadow_l1e_t *sl1e,
                           shadow_l1e_t new_sl1e,
@@ -1156,6 +1055,7 @@ static int shadow_set_l1e(struct domain
 {
     int flags = 0;
     shadow_l1e_t old_sl1e;
+    unsigned int old_sl1f;
 #if SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC
     mfn_t new_gmfn = shadow_l1e_get_mfn(new_sl1e);
 #endif
@@ -1194,7 +1094,9 @@ static int shadow_set_l1e(struct domain
             new_sl1e = shadow_l1e_flip_flags(new_sl1e, rc);
             /* fall through */
         case 0:
-            shadow_vram_get_l1e(new_sl1e, sl1e, sl1mfn, d);
+            shadow_vram_get_mfn(shadow_l1e_get_mfn(new_sl1e),
+                                shadow_l1e_get_flags(new_sl1e),
+                                sl1mfn, sl1e, d);
             break;
         }
 #undef PAGE_FLIPPABLE
@@ -1205,20 +1107,19 @@ static int shadow_set_l1e(struct domain
     shadow_write_entries(sl1e, &new_sl1e, 1, sl1mfn);
     flags |= SHADOW_SET_CHANGED;
 
-    if ( (shadow_l1e_get_flags(old_sl1e) & _PAGE_PRESENT)
-         && !sh_l1e_is_magic(old_sl1e) )
+    old_sl1f = shadow_l1e_get_flags(old_sl1e);
+    if ( (old_sl1f & _PAGE_PRESENT) && !sh_l1e_is_magic(old_sl1e) &&
+         shadow_mode_refcounts(d) )
     {
        /* We lost a reference to an old mfn. */
        /* N.B. Unlike higher-level sets, never need an extra flush
         * when writing an l1e.  Because it points to the same guest frame
         * as the guest l1e did, it's the guest's responsibility to
         * trigger a flush later. */
-        if ( shadow_mode_refcounts(d) )
-        {
-            shadow_vram_put_l1e(old_sl1e, sl1e, sl1mfn, d);
-            shadow_put_page_from_l1e(old_sl1e, d);
-            TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_SHADOW_L1_PUT_REF);
-        }
+        shadow_vram_put_mfn(shadow_l1e_get_mfn(old_sl1e), old_sl1f,
+                            sl1mfn, sl1e, d);
+        shadow_put_page_from_l1e(old_sl1e, d);
+        TRACE_SHADOW_PATH_FLAG(TRCE_SFLAG_SHADOW_L1_PUT_REF);
     }
     return flags;
 }
@@ -1944,9 +1845,12 @@ void sh_destroy_l1_shadow(struct domain
         /* Decrement refcounts of all the old entries */
         mfn_t sl1mfn = smfn;
         SHADOW_FOREACH_L1E(sl1mfn, sl1e, 0, 0, {
-            if ( (shadow_l1e_get_flags(*sl1e) & _PAGE_PRESENT)
-                 && !sh_l1e_is_magic(*sl1e) ) {
-                shadow_vram_put_l1e(*sl1e, sl1e, sl1mfn, d);
+            unsigned int sl1f = shadow_l1e_get_flags(*sl1e);
+
+            if ( (sl1f & _PAGE_PRESENT) && !sh_l1e_is_magic(*sl1e) )
+            {
+                shadow_vram_put_mfn(shadow_l1e_get_mfn(*sl1e), sl1f,
+                                    sl1mfn, sl1e, d);
                 shadow_put_page_from_l1e(*sl1e, d);
             }
         });
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -410,6 +410,14 @@ void shadow_update_paging_modes(struct v
  * With user_only == 1, unhooks only the user-mode mappings. */
 void shadow_unhook_mappings(struct domain *d, mfn_t smfn, int user_only);
 
+/* VRAM dirty tracking helpers. */
+void shadow_vram_get_mfn(mfn_t mfn, unsigned int l1f,
+                         mfn_t sl1mfn, const void *sl1e,
+                         const struct domain *d);
+void shadow_vram_put_mfn(mfn_t mfn, unsigned int l1f,
+                         mfn_t sl1mfn, const void *sl1e,
+                         const struct domain *d);
+
 #if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC)
 /* Allow a shadowed page to go out of sync */
 int sh_unsync(struct vcpu *v, mfn_t gmfn);
On Mon, Oct 19, 2020 at 10:44:31AM +0200, Jan Beulich wrote:
> By passing the functions an MFN and flags, only a single instance of
> each is needed; they were pretty large for being inline functions
> anyway.
>
> While moving the code, also adjust coding style and add const where
> sensible / possible.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.
At 10:44 +0200 on 19 Oct (1603104271), Jan Beulich wrote:
> By passing the functions an MFN and flags, only a single instance of
> each is needed; they were pretty large for being inline functions
> anyway.
>
> While moving the code, also adjust coding style and add const where
> sensible / possible.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Tim Deegan <tim@xen.org>
With them depending on just the number of shadow levels, there's no need
for more than one instance of them, and hence no need for any hook (IOW
452219e24648 ["x86/shadow: monitor table is HVM-only"] didn't go quite
far enough). Move the functions to hvm.c while dropping the dead
is_pv_32bit_domain() code paths.

While moving the code, replace a stale comment reference to
sh_install_xen_entries_in_l4(). Doing so made me notice the function
also didn't have its prototype dropped in 8d7b633adab7 ("x86/mm:
Consolidate all Xen L4 slot writing into init_xen_l4_slots()"), which
gets done here as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: New.
---
TBD: In principle both functions could have their first parameter
     constified. In fact, "destroy" doesn't depend on the vCPU at all
     and hence could be passed a struct domain *. Not sure whether such
     an asymmetry would be acceptable.
     In principle "make" would also not need passing of the number of
     shadow levels (can be derived from v), which would result in yet
     another asymmetry. If these asymmetries were acceptable, "make"
     could then also update v->arch.hvm.monitor_table, instead of doing
     so at both call sites.
TBD: Collides with Andrew's "xen/x86: Fix memory leak in vcpu_create()
     error path".

--- a/xen/arch/x86/mm/shadow/common.c
+++ b/xen/arch/x86/mm/shadow/common.c
@@ -2467,7 +2467,9 @@ static void sh_update_paging_modes(struc
 
         if ( pagetable_is_null(v->arch.hvm.monitor_table) )
         {
-            mfn_t mmfn = v->arch.paging.mode->shadow.make_monitor_table(v);
+            mfn_t mmfn = sh_make_monitor_table(
+                             v, v->arch.paging.mode->shadow.shadow_levels);
+
             v->arch.hvm.monitor_table = pagetable_from_mfn(mmfn);
             make_cr3(v, mmfn);
             hvm_update_host_cr3(v);
@@ -2504,7 +2506,8 @@ static void sh_update_paging_modes(struc
 
             old_mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
             v->arch.hvm.monitor_table = pagetable_null();
-            new_mfn = v->arch.paging.mode->shadow.make_monitor_table(v);
+            new_mfn = sh_make_monitor_table(
+                          v, v->arch.paging.mode->shadow.shadow_levels);
             v->arch.hvm.monitor_table = pagetable_from_mfn(new_mfn);
             SHADOW_PRINTK("new monitor table %"PRI_mfn "\n",
                           mfn_x(new_mfn));
@@ -2516,7 +2519,8 @@ static void sh_update_paging_modes(struc
             if ( v == current )
                 write_ptbase(v);
             hvm_update_host_cr3(v);
-            old_mode->shadow.destroy_monitor_table(v, old_mfn);
+            sh_destroy_monitor_table(v, old_mfn,
+                                     old_mode->shadow.shadow_levels);
         }
     }
 
@@ -2801,7 +2805,9 @@ void shadow_teardown(struct domain *d, b
                 mfn_t mfn = pagetable_get_mfn(v->arch.hvm.monitor_table);
 
                 if ( mfn_valid(mfn) && (mfn_x(mfn) != 0) )
-                    v->arch.paging.mode->shadow.destroy_monitor_table(v, mfn);
+                    sh_destroy_monitor_table(
+                        v, mfn,
+                        v->arch.paging.mode->shadow.shadow_levels);
                 v->arch.hvm.monitor_table = pagetable_null();
             }
 #endif /* CONFIG_HVM */
--- a/xen/arch/x86/mm/shadow/hvm.c
+++ b/xen/arch/x86/mm/shadow/hvm.c
@@ -691,6 +691,88 @@ static void sh_emulate_unmap_dest(struct
     atomic_inc(&v->domain->arch.paging.shadow.gtable_dirty_version);
 }
 
+mfn_t sh_make_monitor_table(struct vcpu *v, unsigned int shadow_levels)
+{
+    struct domain *d = v->domain;
+    mfn_t m4mfn;
+    l4_pgentry_t *l4e;
+
+    ASSERT(!pagetable_get_pfn(v->arch.hvm.monitor_table));
+
+    /* Guarantee we can get the memory we need */
+    shadow_prealloc(d, SH_type_monitor_table, CONFIG_PAGING_LEVELS);
+    m4mfn = shadow_alloc(d, SH_type_monitor_table, 0);
+    mfn_to_page(m4mfn)->shadow_flags = 4;
+
+    l4e = map_domain_page(m4mfn);
+
+    /*
+     * Create a self-linear mapping, but no shadow-linear mapping.  A
+     * shadow-linear mapping will either be inserted below when creating
+     * lower level monitor tables, or later in sh_update_cr3().
+     */
+    init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false);
+
+    if ( shadow_levels < 4 )
+    {
+        mfn_t m3mfn, m2mfn;
+        l3_pgentry_t *l3e;
+
+        /*
+         * Install an l3 table and an l2 table that will hold the shadow
+         * linear map entries. This overrides the empty entry that was
+         * installed by init_xen_l4_slots().
+         */
+        m3mfn = shadow_alloc(d, SH_type_monitor_table, 0);
+        mfn_to_page(m3mfn)->shadow_flags = 3;
+        l4e[l4_table_offset(SH_LINEAR_PT_VIRT_START)]
+            = l4e_from_mfn(m3mfn, __PAGE_HYPERVISOR_RW);
+
+        m2mfn = shadow_alloc(d, SH_type_monitor_table, 0);
+        mfn_to_page(m2mfn)->shadow_flags = 2;
+        l3e = map_domain_page(m3mfn);
+        l3e[0] = l3e_from_mfn(m2mfn, __PAGE_HYPERVISOR_RW);
+        unmap_domain_page(l3e);
+    }
+
+    unmap_domain_page(l4e);
+
+    return m4mfn;
+}
+
+void sh_destroy_monitor_table(struct vcpu *v, mfn_t mmfn,
+                              unsigned int shadow_levels)
+{
+    struct domain *d = v->domain;
+
+    ASSERT(mfn_to_page(mmfn)->u.sh.type == SH_type_monitor_table);
+
+    if ( shadow_levels < 4 )
+    {
+        mfn_t m3mfn;
+        l4_pgentry_t *l4e = map_domain_page(mmfn);
+        l3_pgentry_t *l3e;
+        unsigned int linear_slot = l4_table_offset(SH_LINEAR_PT_VIRT_START);
+
+        /*
+         * Need to destroy the l3 and l2 monitor pages used
+         * for the linear map.
+         */
+        ASSERT(l4e_get_flags(l4e[linear_slot]) & _PAGE_PRESENT);
+        m3mfn = l4e_get_mfn(l4e[linear_slot]);
+        l3e = map_domain_page(m3mfn);
+        ASSERT(l3e_get_flags(l3e[0]) & _PAGE_PRESENT);
+        shadow_free(d, l3e_get_mfn(l3e[0]));
+        unmap_domain_page(l3e);
+        shadow_free(d, m3mfn);
+
+        unmap_domain_page(l4e);
+    }
+
+    /* Put the memory back in the pool */
+    shadow_free(d, mmfn);
+}
+
 /**************************************************************************/
 /* VRAM dirty tracking support */
 int shadow_track_dirty_vram(struct domain *d,
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -1405,84 +1405,6 @@ make_fl1_shadow(struct domain *d, gfn_t
 }
 
-#if SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS && defined(CONFIG_HVM)
-mfn_t
-sh_make_monitor_table(struct vcpu *v)
-{
-    struct domain *d = v->domain;
-
-    ASSERT(pagetable_get_pfn(v->arch.hvm.monitor_table) == 0);
-
-    /* Guarantee we can get the memory we need */
-    shadow_prealloc(d, SH_type_monitor_table, CONFIG_PAGING_LEVELS);
-
-    {
-        mfn_t m4mfn;
-        l4_pgentry_t *l4e;
-
-        m4mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-        mfn_to_page(m4mfn)->shadow_flags = 4;
-
-        l4e = map_domain_page(m4mfn);
-
-        /*
-         * Create a self-linear mapping, but no shadow-linear mapping.  A
-         * shadow-linear mapping will either be inserted below when creating
-         * lower level monitor tables, or later in sh_update_cr3().
-         */
-        init_xen_l4_slots(l4e, m4mfn, d, INVALID_MFN, false);
-
-#if SHADOW_PAGING_LEVELS < 4
-        {
-            mfn_t m3mfn, m2mfn;
-            l3_pgentry_t *l3e;
-            /* Install an l3 table and an l2 table that will hold the shadow
-             * linear map entries. This overrides the linear map entry that
-             * was installed by sh_install_xen_entries_in_l4. */
-
-            m3mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-            mfn_to_page(m3mfn)->shadow_flags = 3;
-            l4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)]
-                = l4e_from_mfn(m3mfn, __PAGE_HYPERVISOR_RW);
-
-            m2mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-            mfn_to_page(m2mfn)->shadow_flags = 2;
-            l3e = map_domain_page(m3mfn);
-            l3e[0] = l3e_from_mfn(m2mfn, __PAGE_HYPERVISOR_RW);
-            unmap_domain_page(l3e);
-
-            if ( is_pv_32bit_domain(d) )
-            {
-                l2_pgentry_t *l2t;
-
-                /* For 32-bit PV guests, we need to map the 32-bit Xen
-                 * area into its usual VAs in the monitor tables */
-                m3mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-                mfn_to_page(m3mfn)->shadow_flags = 3;
-                l4e[0] = l4e_from_mfn(m3mfn, __PAGE_HYPERVISOR_RW);
-
-                m2mfn = shadow_alloc(d, SH_type_monitor_table, 0);
-                mfn_to_page(m2mfn)->shadow_flags = 2;
-                l3e = map_domain_page(m3mfn);
-                l3e[3] = l3e_from_mfn(m2mfn, _PAGE_PRESENT);
-
-                l2t = map_domain_page(m2mfn);
-                init_xen_pae_l2_slots(l2t, d);
-                unmap_domain_page(l2t);
-
-                unmap_domain_page(l3e);
-            }
-
-        }
-#endif /* SHADOW_PAGING_LEVELS < 4 */
-
-        unmap_domain_page(l4e);
-
-        return m4mfn;
-    }
-}
-#endif /* SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS */
-
 /**************************************************************************/
 /* These functions also take a virtual address and return the level-N
  * shadow table mfn and entry, but they create the shadow pagetables if
@@ -1860,50 +1782,6 @@ void sh_destroy_l1_shadow(struct domain
     shadow_free(d, smfn);
 }
 
-#if SHADOW_PAGING_LEVELS == GUEST_PAGING_LEVELS && defined(CONFIG_HVM)
-void sh_destroy_monitor_table(struct vcpu *v, mfn_t mmfn)
-{
-    struct domain *d = v->domain;
-    ASSERT(mfn_to_page(mmfn)->u.sh.type == SH_type_monitor_table);
-
-#if SHADOW_PAGING_LEVELS != 4
-    {
-        mfn_t m3mfn;
-        l4_pgentry_t *l4e = map_domain_page(mmfn);
-        l3_pgentry_t *l3e;
-        int linear_slot = shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START);
-
-        /* Need to destroy the l3 and l2 monitor pages used
-         * for the linear map */
-        ASSERT(l4e_get_flags(l4e[linear_slot]) & _PAGE_PRESENT);
-        m3mfn = l4e_get_mfn(l4e[linear_slot]);
-        l3e = map_domain_page(m3mfn);
-        ASSERT(l3e_get_flags(l3e[0]) & _PAGE_PRESENT);
-        shadow_free(d, l3e_get_mfn(l3e[0]));
-        unmap_domain_page(l3e);
-        shadow_free(d, m3mfn);
-
-        if ( is_pv_32bit_domain(d) )
-        {
-            /* Need to destroy the l3 and l2 monitor pages that map the
-             * Xen VAs at 3GB-4GB */
-            ASSERT(l4e_get_flags(l4e[0]) & _PAGE_PRESENT);
-            m3mfn = l4e_get_mfn(l4e[0]);
-            l3e = map_domain_page(m3mfn);
-            ASSERT(l3e_get_flags(l3e[3]) & _PAGE_PRESENT);
-            shadow_free(d, l3e_get_mfn(l3e[3]));
-            unmap_domain_page(l3e);
-            shadow_free(d, m3mfn);
-        }
-        unmap_domain_page(l4e);
-    }
-#endif
-
-    /* Put the memory back in the pool */
-    shadow_free(d, mmfn);
-}
-#endif
-
 /**************************************************************************/
 /* Functions to destroy non-Xen mappings in a pagetable hierarchy.
  * These are called from common code when we are running out of shadow
@@ -4705,8 +4583,6 @@ const struct paging_mode sh_paging_mode
     .shadow.cmpxchg_guest_entry    = sh_cmpxchg_guest_entry,
 #endif
 #ifdef CONFIG_HVM
-    .shadow.make_monitor_table     = sh_make_monitor_table,
-    .shadow.destroy_monitor_table  = sh_destroy_monitor_table,
 #if SHADOW_OPTIMIZATIONS & SHOPT_WRITABLE_HEURISTIC
     .shadow.guess_wrmap            = sh_guess_wrmap,
 #endif
--- a/xen/arch/x86/mm/shadow/private.h
+++ b/xen/arch/x86/mm/shadow/private.h
@@ -366,9 +366,6 @@ void sh_set_toplevel_shadow(struct vcpu
                                            mfn_t gmfn,
                                            uint32_t shadow_type));
 
-/* Install the xen mappings in various flavours of shadow */
-void sh_install_xen_entries_in_l4(struct domain *, mfn_t gl4mfn, mfn_t sl4mfn);
-
 /* Update the shadows in response to a pagetable write from Xen */
 int sh_validate_guest_entry(struct vcpu *v, mfn_t gmfn, void *entry, u32 size);
 
@@ -410,6 +407,14 @@ void shadow_update_paging_modes(struct v
  * With user_only == 1, unhooks only the user-mode mappings. */
 void shadow_unhook_mappings(struct domain *d, mfn_t smfn, int user_only);
 
+/*
+ * sh_{make,destroy}_monitor_table() depend only on the number of shadow
+ * levels.
+ */
+mfn_t sh_make_monitor_table(struct vcpu *v, unsigned int shadow_levels);
+void sh_destroy_monitor_table(struct vcpu *v, mfn_t mmfn,
+                              unsigned int shadow_levels);
+
 /* VRAM dirty tracking helpers. */
 void shadow_vram_get_mfn(mfn_t mfn, unsigned int l1f,
                          mfn_t sl1mfn, const void *sl1e,
--- a/xen/arch/x86/mm/shadow/types.h
+++ b/xen/arch/x86/mm/shadow/types.h
@@ -262,15 +262,6 @@ static inline shadow_l4e_t shadow_l4e_fr
 #define sh_rm_write_access_from_sl1p INTERNAL_NAME(sh_rm_write_access_from_sl1p)
 #endif
 
-/* sh_make_monitor_table depends only on the number of shadow levels */
-#define sh_make_monitor_table \
-        SHADOW_SH_NAME(sh_make_monitor_table, SHADOW_PAGING_LEVELS)
-#define sh_destroy_monitor_table \
-        SHADOW_SH_NAME(sh_destroy_monitor_table, SHADOW_PAGING_LEVELS)
-
-mfn_t sh_make_monitor_table(struct vcpu *v);
-void sh_destroy_monitor_table(struct vcpu *v, mfn_t mmfn);
-
 #if SHADOW_PAGING_LEVELS == 3
 #define MFN_FITS_IN_HVM_CR3(_MFN) !(mfn_x(_MFN) >> 20)
 #endif
--- a/xen/include/asm-x86/paging.h
+++ b/xen/include/asm-x86/paging.h
@@ -107,8 +107,6 @@ struct shadow_paging_mode {
                                             mfn_t gmfn);
 #endif
 #ifdef CONFIG_HVM
-    mfn_t         (*make_monitor_table    )(struct vcpu *v);
-    void          (*destroy_monitor_table )(struct vcpu *v, mfn_t mmfn);
     int           (*guess_wrmap           )(struct vcpu *v,
                                             unsigned long vaddr, mfn_t gmfn);
     void          (*pagetable_dying       )(paddr_t gpa);
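The resulting caller pattern, condensed into a standalone sketch: the
stub types below stand in for Xen's struct vcpu and mfn_t, allocation is
faked, and the names are made up; it only shows how the level count now
flows from the paging mode into the single make/destroy pair rather than
through per-level hooks.

typedef unsigned long mfn_t;   /* stub for Xen's mfn_t */

struct shadow_mode_stub { unsigned int shadow_levels; };
struct vcpu_stub {
    const struct shadow_mode_stub *mode;
    mfn_t monitor_table;
};

static mfn_t sh_make_monitor_table_stub(struct vcpu_stub *v,
                                        unsigned int shadow_levels)
{
    (void)v;
    /* The real function allocates an l4, plus an l3/l2 pair when
     * shadow_levels < 4; here we just return a fake MFN. */
    return 0x1000 + shadow_levels;
}

static void sh_destroy_monitor_table_stub(struct vcpu_stub *v, mfn_t mmfn,
                                          unsigned int shadow_levels)
{
    (void)v; (void)mmfn; (void)shadow_levels;   /* frees in the real code */
}

/* Mirrors the mode-switch path in sh_update_paging_modes() above. */
static void repoint_monitor_table(struct vcpu_stub *v,
                                  const struct shadow_mode_stub *old_mode)
{
    mfn_t old_mfn = v->monitor_table;

    /* The level count comes from the (new) paging mode... */
    v->monitor_table = sh_make_monitor_table_stub(v, v->mode->shadow_levels);

    /* ...while teardown uses the old mode's count. */
    sh_destroy_monitor_table_stub(v, old_mfn, old_mode->shadow_levels);
}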
On Mon, Oct 19, 2020 at 10:45:00AM +0200, Jan Beulich wrote:
> With them depending on just the number of shadow levels, there's no need
> for more than one instance of them, and hence no need for any hook (IOW
> 452219e24648 ["x86/shadow: monitor table is HVM-only"] didn't go quite
> far enough). Move the functions to hvm.c while dropping the dead
> is_pv_32bit_domain() code paths.
>
> While moving the code, replace a stale comment reference to
> sh_install_xen_entries_in_l4(). Doing so made me notice the function
> also didn't have its prototype dropped in 8d7b633adab7 ("x86/mm:
> Consolidate all Xen L4 slot writing into init_xen_l4_slots()"), which
> gets done here as well.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> v3: New.
> ---
> TBD: In principle both functions could have their first parameter
>      constified. In fact, "destroy" doesn't depend on the vCPU at all
>      and hence could be passed a struct domain *. Not sure whether such
>      an asymmetry would be acceptable.
>      In principle "make" would also not need passing of the number of
>      shadow levels (can be derived from v), which would result in yet
>      another asymmetry.

I'm not especially fussed either way - having const v would be good IMO.

Thanks, Roger.
At 10:45 +0200 on 19 Oct (1603104300), Jan Beulich wrote:
> With them depending on just the number of shadow levels, there's no need
> for more than one instance of them, and hence no need for any hook (IOW
> 452219e24648 ["x86/shadow: monitor table is HVM-only"] didn't go quite
> far enough). Move the functions to hvm.c while dropping the dead
> is_pv_32bit_domain() code paths.
>
> While moving the code, replace a stale comment reference to
> sh_install_xen_entries_in_l4(). Doing so made me notice the function
> also didn't have its prototype dropped in 8d7b633adab7 ("x86/mm:
> Consolidate all Xen L4 slot writing into init_xen_l4_slots()"), which
> gets done here as well.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Tim Deegan <tim@xen.org>

> TBD: In principle both functions could have their first parameter
>      constified. In fact, "destroy" doesn't depend on the vCPU at all
>      and hence could be passed a struct domain *. Not sure whether such
>      an asymmetry would be acceptable.
>      In principle "make" would also not need passing of the number of
>      shadow levels (can be derived from v), which would result in yet
>      another asymmetry.
>      If these asymmetries were acceptable, "make" could then also update
>      v->arch.hvm.monitor_table, instead of doing so at both call sites.

Feel free to add consts, but please don't change the function parameters
any more than that. I would rather keep them as a matched pair, and leave
the hvm.monitor_table updates in the caller, where it's easier to see why
they're not symmetrical.

Cheers,

Tim.
Tim,

On 19.10.2020 10:42, Jan Beulich wrote:
> The change addressing the shadow related build issue noticed by
> Andrew went in already. The build breakage goes beyond this
> specific combination though - PV_SHIM_EXCLUSIVE plus HVM is
> similarly an issue. This is what the 1st patch tries to take care
> of, in a shape already noticed on irc to be controversial. I'm
> submitting the change nevertheless because for the moment there
> looks to be a majority in favor of going this route. One argument
> not voiced there yet: what good does it do to allow a user to
> enable HVM when, on the resulting hypervisor, they still can't
> run HVM guests (the hypervisor still being a dedicated PV shim
> one)? On top of this, the alternative approach is likely going to
> get ugly.
>
> The shadow related adjustments are here merely because the desire
> to make them was noticed in the context of the patch which has
> already gone in.
>
> 1: don't permit HVM and PV_SHIM_EXCLUSIVE at the same time
> 2: refactor shadow_vram_{get,put}_l1e()
> 3: sh_{make,destroy}_monitor_table() are "even more" HVM-only

unless you tell me otherwise I'm intending to commit the latter two
with Roger's acks some time next week. Of course it would still be
nice to know your view on the first of the TBDs in patch 3 (which
may result in further changes, in a follow-up).

Jan
At 14:40 +0100 on 29 Oct (1603982415), Jan Beulich wrote:
> Tim,
>
> unless you tell me otherwise I'm intending to commit the latter
> two with Roger's acks some time next week. Of course it would
> still be nice to know your view on the first of the TBDs in
> patch 3 (which may result in further changes, in a follow-up).

Ack, sorry for the dropped patches, and thank you for the ping. I've
now replied to everything that I think is waiting for my review.

Cheers,

Tim.