[PATCH v2 0/7] x86/irq: fixes for CPU hot{,un}plug

Roger Pau Monne posted 7 patches 5 months, 2 weeks ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20240610142043.11924-1-roger.pau@citrix.com
There is a newer version of this series
Hello,

The following series aims to fix interrupt handling when doing CPU
plug/unplug operations.  Without this series, running:

cpus=`xl info max_cpu_id`
while [ 1 ]; do
    for i in `seq 1 $cpus`; do
        xen-hptool cpu-offline $i;
        xen-hptool cpu-online $i;
    done
done

Quite quickly results in interrupts getting lost and "No irq handler for
vector" messages on the Xen console.  Drivers in dom0 also start getting
interrupt timeouts and the system becomes unusable.

After applying the series, running the loop overnight still results in a
fully usable system: no "No irq handler for vector" messages at all and no
interrupt losses reported by dom0.  Tested with x2apic-mode={mixed,cluster}.
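
As a rough sketch (not part of the series), the console check could be
scripted along these lines.  The helper name count_irq_errors is
hypothetical, and the sketch assumes the Xen console ring buffer has not
wrapped since the test started, so it may undercount on long runs:

```shell
# Hypothetical helper: count the lines on stdin that contain the
# "No irq handler for vector" message.  grep -c exits non-zero when
# nothing matches, so "|| true" keeps the helper usable under "set -e".
count_irq_errors() {
    grep -c "No irq handler for vector" || true
}

# Intended use from dom0, after stopping the offline/online loop:
#   xl dmesg | count_irq_errors
```

A result of 0 only says the message is absent from the part of the log
that was captured; interrupt timeouts reported by dom0 drivers still need
to be checked separately in the dom0 kernel log.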

I've attempted to document all code as well as I could; interrupt
handling has some unexpected corner cases that are hard to diagnose and
reason about.

I'm currently also doing some extra testing with XenRT in case I've
missed something.

Thanks, Roger.

Roger Pau Monne (7):
  x86/smp: do not use shorthand IPI destinations in CPU hot{,un}plug
    contexts
  x86/irq: describe how the interrupt CPU movement works
  x86/irq: limit interrupt movement done by fixup_irqs()
  x86/irq: restrict CPU movement in set_desc_affinity()
  x86/irq: deal with old_cpu_mask for interrupts in movement in
    fixup_irqs()
  x86/irq: handle moving interrupts in _assign_irq_vector()
  x86/irq: forward pending interrupts to new destination in fixup_irqs()

 xen/arch/x86/include/asm/apic.h |   5 +
 xen/arch/x86/include/asm/irq.h  |  40 ++++++-
 xen/arch/x86/irq.c              | 197 ++++++++++++++++++++++++--------
 xen/arch/x86/smp.c              |   2 +-
 xen/common/cpu.c                |   5 +
 xen/include/xen/cpu.h           |  10 ++
 xen/include/xen/rwlock.h        |   2 +
 7 files changed, 214 insertions(+), 47 deletions(-)

-- 
2.44.0