[PATCH 0/7] KVM: x86/xen: Fix Xen / GPC / PREEMPT_RT issues with rwlock_t

David Woodhouse posted 7 patches 1 month ago
arch/x86/include/asm/kvm_host.h |   2 +-
arch/x86/kvm/irq.c              |   4 --
arch/x86/kvm/x86.c              | 138 +++++++++++++++++++++-------------------
arch/x86/kvm/xen.c              | 117 +++++++++++++++++-----------------
arch/x86/kvm/xen.h              |  21 +-----
kernel/locking/rwbase_rt.c      |   5 +-
virt/kvm/pfncache.c             |  28 ++++----
7 files changed, 152 insertions(+), 163 deletions(-)
This series fixes sleeping-in-hardirq bugs in KVM's Xen emulation on
PREEMPT_RT, and then cleans up the now-unnecessary IRQ disabling in GPC
lock usage throughout KVM.
  
The core issue is that kvm_xen_set_evtchn_fast() and the Xen timer
callback are called from hardirq/atomic context, but on PREEMPT_RT the
GPC rwlock_t is a sleeping lock.
  
Patch 1 fixes a related RT locking bug in the rwlock core where
__rwbase_read_unlock() unconditionally re-enables IRQs regardless of
the caller's saved state.
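
To illustrate the class of bug, here is a small userspace simulation. It is
not the rwbase_rt.c code: the sim_* helpers and the global flag are
hypothetical stand-ins for the raw_spin_lock_irq()/raw_spin_lock_irqsave()
families and the CPU interrupt-enable state. It only shows why an unlock
helper that uses the unconditional pair clobbers a caller that had already
disabled interrupts.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt-enable flag; a stand-in for real CPU state. */
static bool sim_irqs_enabled = true;

/* Hypothetical model of the *_lock_irq()/*_unlock_irq() pair. */
static void sim_lock_irq(void)   { sim_irqs_enabled = false; }
static void sim_unlock_irq(void) { sim_irqs_enabled = true; } /* always enables */

/* Hypothetical model of the *_lock_irqsave()/*_unlock_irqrestore() pair. */
static unsigned long sim_lock_irqsave(void)
{
	unsigned long flags = sim_irqs_enabled ? 1UL : 0UL;

	sim_irqs_enabled = false;
	return flags;
}

static void sim_unlock_irqrestore(unsigned long flags)
{
	sim_irqs_enabled = (flags != 0);
}

/* Helper using the unconditional pair: loses the caller's IRQ state. */
static void helper_unconditional(void)
{
	sim_lock_irq();
	assert(!sim_irqs_enabled);	/* critical section runs with IRQs off */
	sim_unlock_irq();
}

/* Helper using save/restore: leaves the caller's IRQ state untouched. */
static void helper_saved(void)
{
	unsigned long flags = sim_lock_irqsave();

	assert(!sim_irqs_enabled);
	sim_unlock_irqrestore(flags);
}

/* Call a helper from a context that already has IRQs disabled. */
static bool irqs_enabled_after(void (*helper)(void))
{
	sim_irqs_enabled = false;	/* caller disabled interrupts first */
	helper();
	return sim_irqs_enabled;
}
```

In this model irqs_enabled_after(helper_unconditional) returns true, meaning
interrupts were switched back on behind the caller's back, while
irqs_enabled_after(helper_saved) returns false.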
  
Patch 2 converts record_steal_time() to use gfn_to_pfn_cache, replacing
the kvm_map_gfn()/kvm_unmap_gfn() interface.
  
Patch 3 is the main fix: it switches the hardirq/atomic GPC users to
read_trylock(), returning -EWOULDBLOCK if the lock cannot be taken. These
callers already have a slow path for the case where the GPC is invalid
and needs to be refreshed.
  
Patches 4-6 remove the now-unnecessary irqsave/irqrestore from all
remaining GPC lock users, since no hardirq path holds the lock any more.
This simplifies the locking throughout xen.c, x86.c, and pfncache.c.
  
Patch 7 subsumes Xen timer injection into kvm_xen_inject_pending_events()
and calls it from vcpu_enter_guest(), reducing deferred timer delivery
latency from ~10ms (scheduler tick dependent) to sub-microsecond.
  
Tested on bare metal (c7i.metal-48xl) with both non-RT and PREEMPT_RT
kernels, including the xen_shinfo_test selftest and QEMU with Xen
emulation (xen-version=0x40010,kernel-irqchip=split).

Carsten Stollmaier (1):
      KVM: x86: Use gfn_to_pfn_cache for record_steal_time

David Woodhouse (6):
      locking/rt: Use raw_spin_lock_irqsave() in __rwbase_read_unlock()
      KVM: x86/xen: Use read_trylock() for GPC locks in hardirq/atomic paths
      KVM: x86/xen: Remove unnecessary irqsave from GPC lock usage in xen.c
      KVM: x86: Remove unnecessary irqsave from kvm_setup_guest_pvclock()
      KVM: Remove unnecessary IRQ disabling from GPC lock in pfncache.c
      KVM: x86/xen: Handle pending Xen timer events in vcpu_enter_guest()