Hello,
The aim of this series is to introduce the functionality required to
create linear mappings visible to a single pCPU.
Doing so requires having a per-CPU root page-table (L4), and hence
requires changes to the HVM monitor tables and shadowing the guest
selected L4 on PV guests. As a follow-up (and partially to ensure the
per-CPU mappings work fine) the CPU stacks are switched to use per-CPU
mappings, so that remote stack contents are not by default mapped on all
page-tables (note: for this to be true the directmap entries for the
stack pages would also need to be removed).
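As a rough illustration of the idea above, here is a toy user-space model
(not Xen code) of per-CPU root page-tables: every pCPU gets its own L4,
populated from a shared template, plus one slot that only that pCPU fills
in. All names here (shared_l4, percpu_l4, PERCPU_SLOT, MODEL_CPUS and the
helpers) are hypothetical and chosen for the sketch; the series may
organise this quite differently.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define L4_ENTRIES   512   /* entries in an x86-64 root page-table */
#define MODEL_CPUS   4     /* hypothetical CPU count for this model */
#define PERCPU_SLOT  511   /* hypothetical index of the CPU-private slot */

typedef uint64_t l4e_t;    /* in this model, 0 means "not present" */

/* Template holding the entries that are common to all CPUs. */
static l4e_t shared_l4[L4_ENTRIES];
/* One root page-table per pCPU. */
static l4e_t percpu_l4[MODEL_CPUS][L4_ENTRIES];

/* Build a CPU's root table: copy shared entries, leave the private slot empty. */
static void init_percpu_l4(unsigned int cpu)
{
    memcpy(percpu_l4[cpu], shared_l4, sizeof(shared_l4));
    percpu_l4[cpu][PERCPU_SLOT] = 0;
}

/* Install an entry that is reachable only through the given CPU's root table. */
static void map_percpu(unsigned int cpu, l4e_t entry)
{
    percpu_l4[cpu][PERCPU_SLOT] = entry;
}

/* Would a walk starting at this CPU's root table find the slot populated? */
static bool slot_present(unsigned int cpu, unsigned int slot)
{
    return percpu_l4[cpu][slot] != 0;
}
```

In this model, calling map_percpu(0, ...) makes slot_present(0, PERCPU_SLOT)
true while slot_present(1, PERCPU_SLOT) stays false, whereas any entry
placed in shared_l4 before init_percpu_l4() is seen by every CPU.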
Patches before patch 12 are either small fixes or preparatory
non-functional changes in order to accommodate the rest of the series.
Patch 12 introduces a new 'asi' spec-ctrl option that's used to enable
Address Space Isolation.
Patches 13-15 and 20 introduce logic to use per-CPU L4 on HVM and PV
guests.
Patches 16-18 add support for creating per-CPU mappings to the existing
page-table management functions, map_pages_to_xen() and related
functions. Patch 19 introduces helpers for creating per-CPU mappings
using a fixmap interface.
Finally, patches 21-22 add support for mapping the CPU stack in a per-CPU
fixmap region, and zeroing the stacks on guest context switch.
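To make the per-CPU fixmap and stack handling more concrete, the following
is a toy user-space model (not Xen code). It assumes, purely for
illustration, that the stack slot resolves to the same linear address on
every CPU and that each CPU's page-tables back it with a different frame.
PERCPU_FIX_BASE, the enum values and all helper names are hypothetical and
are not taken from the patches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE   4096u
#define STACK_MODEL_CPUS  2                      /* hypothetical CPU count */
#define PERCPU_FIX_BASE   0xffff820000000000ULL  /* hypothetical linear base */

enum percpu_fixmap_idx { PCPU_FIX_STACK, PCPU_FIX_COUNT }; /* hypothetical slots */

/* Models the physical frames backing each CPU's stack. */
static uint8_t stack_frames[STACK_MODEL_CPUS][MODEL_PAGE_SIZE];
/* Models the per-CPU page-table entries covering the fixmap region. */
static uint8_t *percpu_fixmap[STACK_MODEL_CPUS][PCPU_FIX_COUNT];

/* The linear address of a slot is identical on every CPU. */
static uint64_t percpu_fix_to_virt(enum percpu_fixmap_idx idx)
{
    return PERCPU_FIX_BASE + (uint64_t)idx * MODEL_PAGE_SIZE;
}

/* Point a slot at a frame in one CPU's page-tables only. */
static void percpu_set_fixmap(unsigned int cpu, enum percpu_fixmap_idx idx,
                              uint8_t *frame)
{
    percpu_fixmap[cpu][idx] = frame;
}

/* Translate a linear address using the given CPU's page-tables. */
static uint8_t *percpu_translate(unsigned int cpu, uint64_t va)
{
    uint64_t idx;

    if ( va < PERCPU_FIX_BASE )
        return NULL;
    idx = (va - PERCPU_FIX_BASE) / MODEL_PAGE_SIZE;
    return idx < PCPU_FIX_COUNT ? percpu_fixmap[cpu][idx] : NULL;
}

/* Zero the stack through this CPU's own mapping, e.g. on a context switch. */
static void zero_stack(unsigned int cpu)
{
    uint8_t *stack = percpu_translate(cpu, percpu_fix_to_virt(PCPU_FIX_STACK));

    if ( stack )
        memset(stack, 0, MODEL_PAGE_SIZE);
}
```

The point of the model is that a single address from percpu_fix_to_virt()
translates to stack_frames[0] on CPU 0 and stack_frames[1] on CPU 1, so
neither CPU has the other's stack mapped through the fixmap region.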
I've been testing the patches quite a lot using XenRT, and so far they
seem to not cause regressions (either with spec-ctrl=asi or without it),
but XenRT no longer tests shadow paging or 32bit PV guests.
This proposal is also missing an interface similar to map_domain_page()
in order to create per-CPU mappings that don't use a fixmap entry. I
thought however that the current content was fair enough for a first
posting, and that I would like to get feedback on this before building
further functionality on top of it.
Note that none of the logic introduced in the series removes entries from
the directmap, so even when creating the per-CPU mappings the underlying
physical addresses are fully accessible when using their linear direct
map entries.
I also haven't done any benchmarking. It doesn't seem to cripple
performance up to the point that XenRT jobs would time out before
finishing; that's the only objective reference I can provide at the
moment.
The series is likely to still have some rough edges, so handle with care.
Thanks, Roger.
Roger Pau Monne (22):
x86/mm: drop l{1,2,3,4}e_write_atomic()
x86/mm: rename l{1,2,3,4}e_read_atomic()
x86/dom0: only disable SMAP for the PV dom0 build
x86/mm: ensure L4 idle_pg_table is not modified past boot
x86/mm: make virt_to_xen_l1e() static
x86/mm: introduce a local domain variable to write_ptbase()
x86/spec-ctrl: initialize per-domain XPTI in spec_ctrl_init_domain()
x86/mm: avoid passing a domain parameter to L4 init function
x86/pv: untie issuing FLUSH_ROOT_PGTBL from XPTI
x86/mm: move FLUSH_ROOT_PGTBL handling before TLB flush
x86/mm: split setup of the per-domain slot on context switch
x86/spec-ctrl: introduce Address Space Isolation command line option
x86/hvm: use a per-pCPU monitor table in HAP mode
x86/hvm: use a per-pCPU monitor table in shadow mode
x86/idle: allow using a per-pCPU L4
x86/mm: introduce a per-CPU L3 table for the per-domain slot
x86/mm: introduce support to populate a per-CPU page-table region
x86/mm: allow modifying per-CPU entries of remote page-tables
x86/mm: introduce a per-CPU fixmap area
x86/pv: allow using a unique per-pCPU root page table (L4)
x86/mm: switch to a per-CPU mapped stack when using ASI
x86/mm: zero stack on stack switch or reset
docs/misc/xen-command-line.pandoc | 15 +-
xen/arch/x86/boot/x86_64.S | 11 +
xen/arch/x86/domain.c | 75 +++-
xen/arch/x86/domain_page.c | 2 +-
xen/arch/x86/flushtlb.c | 18 +-
xen/arch/x86/hvm/hvm.c | 67 ++++
xen/arch/x86/hvm/svm/svm.c | 5 +
xen/arch/x86/hvm/vmx/vmcs.c | 1 +
xen/arch/x86/hvm/vmx/vmx.c | 4 +
xen/arch/x86/include/asm/config.h | 4 +
xen/arch/x86/include/asm/current.h | 38 +-
xen/arch/x86/include/asm/domain.h | 7 +
xen/arch/x86/include/asm/fixmap.h | 50 +++
xen/arch/x86/include/asm/flushtlb.h | 3 +-
xen/arch/x86/include/asm/hap.h | 1 -
xen/arch/x86/include/asm/hvm/hvm.h | 8 +
xen/arch/x86/include/asm/hvm/vcpu.h | 6 +-
xen/arch/x86/include/asm/mm.h | 34 +-
xen/arch/x86/include/asm/page.h | 37 +-
xen/arch/x86/include/asm/paging.h | 18 +
xen/arch/x86/include/asm/pv/mm.h | 8 +
xen/arch/x86/include/asm/setup.h | 1 +
xen/arch/x86/include/asm/smp.h | 12 +
xen/arch/x86/include/asm/spec_ctrl.h | 2 +
xen/arch/x86/include/asm/x86_64/page.h | 4 -
xen/arch/x86/mm.c | 484 ++++++++++++++++++++-----
xen/arch/x86/mm/hap/hap.c | 74 ----
xen/arch/x86/mm/paging.c | 4 +-
xen/arch/x86/mm/shadow/common.c | 42 +--
xen/arch/x86/mm/shadow/hvm.c | 64 ++--
xen/arch/x86/mm/shadow/multi.c | 73 ++--
xen/arch/x86/mm/shadow/private.h | 4 +-
xen/arch/x86/pv/dom0_build.c | 16 +-
xen/arch/x86/pv/domain.c | 28 +-
xen/arch/x86/pv/mm.c | 52 +++
xen/arch/x86/setup.c | 55 +--
xen/arch/x86/smp.c | 29 ++
xen/arch/x86/smpboot.c | 78 +++-
xen/arch/x86/spec_ctrl.c | 78 +++-
xen/arch/x86/traps.c | 14 +-
xen/common/efi/runtime.c | 12 +
xen/common/smp.c | 10 +
xen/include/xen/smp.h | 5 +
43 files changed, 1198 insertions(+), 355 deletions(-)
--
2.45.2