This series implements support for CPU hotplug/unplug on Arm. To achieve this,
several things need to be done:

1. XEN_SYSCTL_CPU_HOTPLUG_* calls implemented on Arm64.
2. Enabled building of xen-hptool.
3. Migration of irqs from dying CPUs implemented.

Tested on QEMU.

v4->v5:
* drop merged patches
* combine "smp: Move cpu_up/down helpers to common code" with
  "arm/sysctl: Implement cpu hotplug ops"
* see individual patches

v3->v4:
* add irq migration patches
* see individual patches

v2->v3:
* add docs

v1->v2:
* see individual patches

Mykyta Poturai (5):
  arm/irq: Keep track of irq affinities
  arm/irq: Migrate IRQs during CPU up/down operations
  arm/sysctl: Implement cpu hotplug ops
  tools: Allow building xen-hptool without CONFIG_MIGRATE
  docs: Document CPU hotplug

 SUPPORT.md                       |  1 +
 docs/misc/cpu-hotplug.txt        | 50 +++++++++++++++++++++++++
 tools/libs/guest/Makefile.common |  2 +-
 tools/misc/Makefile              |  2 +-
 xen/arch/arm/Kconfig             |  1 +
 xen/arch/arm/gic-vgic.c          |  2 +
 xen/arch/arm/include/asm/irq.h   |  2 +
 xen/arch/arm/irq.c               | 63 +++++++++++++++++++++++++++++++-
 xen/arch/arm/smp.c               |  9 +++++
 xen/arch/arm/smpboot.c           |  6 +++
 xen/arch/arm/vgic.c              | 14 ++++++-
 xen/arch/ppc/stubs.c             |  4 ++
 xen/arch/riscv/stubs.c           |  5 +++
 xen/arch/x86/Kconfig             |  1 +
 xen/arch/x86/include/asm/smp.h   |  3 --
 xen/arch/x86/smp.c               | 33 ++---------------
 xen/arch/x86/sysctl.c            | 12 ++----
 xen/common/Kconfig               |  3 ++
 xen/common/smp.c                 | 34 +++++++++++++++++
 xen/common/sysctl.c              | 45 +++++++++++++++++++++++
 xen/include/xen/smp.h            |  4 ++
 21 files changed, 248 insertions(+), 48 deletions(-)
 create mode 100644 docs/misc/cpu-hotplug.txt

--
2.51.2
Currently on Arm the desc->affinity mask of an irq is never updated, which makes it hard to know the actual affinity of an interrupt. Fix this by updating the field in irq_set_affinity. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v4->v5: * add locking v3->v4: * patch introduced --- xen/arch/arm/gic-vgic.c | 2 ++ xen/arch/arm/irq.c | 9 +++++++-- xen/arch/arm/vgic.c | 14 ++++++++++++-- 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/gic-vgic.c +++ b/xen/arch/arm/gic-vgic.c @@ -XXX,XX +XXX,XX @@ static void gic_update_one_lr(struct vcpu *v, int i) if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) ) { struct vcpu *v_target = vgic_get_target_vcpu(v, irq); + spin_lock(&p->desc->lock); irq_set_affinity(p->desc, cpumask_of(v_target->processor)); + spin_unlock(&p->desc->lock); clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status); } } diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/irq.c +++ b/xen/arch/arm/irq.c @@ -XXX,XX +XXX,XX @@ static inline struct domain *irq_get_domain(struct irq_desc *desc) return irq_get_guest_info(desc)->d; } +/* Must be called with desc->lock held */ void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask) { - if ( desc != NULL ) - desc->handler->set_affinity(desc, mask); + if ( desc == NULL ) + return; + + ASSERT(spin_is_locked(&desc->lock)); + cpumask_copy(desc->affinity, mask); + desc->handler->set_affinity(desc, mask); } int request_irq(unsigned int irq, unsigned int irqflags, diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -XXX,XX +XXX,XX @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq) if ( list_empty(&p->inflight) ) { + spin_lock(&p->desc->lock); irq_set_affinity(p->desc, cpumask_of(new->processor)); + spin_unlock(&p->desc->lock); 
spin_unlock_irqrestore(&old->arch.vgic.lock, flags); return true; } @@ -XXX,XX +XXX,XX @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq) if ( !list_empty(&p->lr_queue) ) { vgic_remove_irq_from_queues(old, p); + spin_lock(&p->desc->lock); irq_set_affinity(p->desc, cpumask_of(new->processor)); + spin_unlock(&p->desc->lock); spin_unlock_irqrestore(&old->arch.vgic.lock, flags); vgic_inject_irq(new->domain, new, irq, true); return true; @@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v) struct domain *d = v->domain; struct pending_irq *p; struct vcpu *v_target; + unsigned long flags; int i; /* @@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v) p = irq_to_pending(v_target, virq); if ( v_target == v && !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) ) + { + if ( !p->desc ) + continue; + spin_lock_irqsave(&p->desc->lock, flags); irq_set_affinity(p->desc, cpu_mask); + spin_unlock_irqrestore(&p->desc->lock, flags); + } } } @@ -XXX,XX +XXX,XX @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, unsigned int n) spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags); if ( p->desc != NULL ) { - irq_set_affinity(p->desc, cpumask_of(v_target->processor)); spin_lock_irqsave(&p->desc->lock, flags); + irq_set_affinity(p->desc, cpumask_of(v_target->processor)); /* * The irq cannot be a PPI, we only support delivery of SPIs * to guests. @@ -XXX,XX +XXX,XX @@ void vgic_check_inflight_irqs_pending(struct vcpu *v, unsigned int rank, uint32_ * indent-tabs-mode: nil * End: */ - -- 2.51.2
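The rule this patch introduces (record the requested mask in the descriptor, and only while the descriptor lock is held) can be illustrated with a small standalone sketch. Everything below is a toy model, not Xen code: `struct toy_desc`, `toy_lock()`, `toy_unlock()` and `toy_set_affinity()` are hypothetical names, the lock is a plain flag, and the "driver picks the lowest CPU in the mask" behaviour is an assumption made only so the example has something to route to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an IRQ descriptor (hypothetical, not struct irq_desc). */
struct toy_desc {
    bool locked;        /* models desc->lock as a simple flag */
    uint32_t affinity;  /* recorded mask, one bit per CPU */
    uint32_t hw_route;  /* the single CPU the "hardware" was programmed with */
};

void toy_lock(struct toy_desc *d)
{
    assert(!d->locked);
    d->locked = true;
}

void toy_unlock(struct toy_desc *d)
{
    assert(d->locked);
    d->locked = false;
}

/* Must be called with the descriptor locked, mirroring the new comment. */
void toy_set_affinity(struct toy_desc *d, uint32_t mask)
{
    if ( d == NULL )
        return;

    assert(d->locked);

    /* Bookkeeping first, so the recorded mask matches the last request. */
    d->affinity = mask;

    /* Assumption: the driver routes to the lowest CPU set in the mask. */
    d->hw_route = mask & (~mask + 1u);
}
```

A caller would wrap the update as `toy_lock(&d); toy_set_affinity(&d, mask); toy_unlock(&d);`, which is the same shape as the `spin_lock()`/`spin_unlock()` pairs added around `irq_set_affinity()` in the hunks above.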
Move IRQs from dying CPU to the online ones when a CPU is getting
offlined. When onlining, rebalance all IRQs in a round-robin fashion.
Guest-bound IRQs are already handled by scheduler in the process of
moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
itself.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
v4->v5:
* handle CPU onlining as well
* more comments
* fix crash when ESPI is disabled
* don't assume CPU 0 is a boot CPU
* use unsigned int for irq number
* remove assumption that all irqs are bound to CPU 0 by default from the
  commit message

v3->v4:
* patch introduced
---
 xen/arch/arm/include/asm/irq.h |  2 ++
 xen/arch/arm/irq.c             | 54 ++++++++++++++++++++++++++++++++++
 xen/arch/arm/smpboot.c         |  6 ++++
 3 files changed, 62 insertions(+)

diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/include/asm/irq.h
+++ b/xen/arch/arm/include/asm/irq.h
@@ -XXX,XX +XXX,XX @@ bool irq_type_set_by_domain(const struct domain *d);
 void irq_end_none(struct irq_desc *irq);
 #define irq_end_none irq_end_none
 
+void rebalance_irqs(unsigned int from, bool up);
+
 #endif /* _ASM_HW_IRQ_H */
 /*
  * Local variables:
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -XXX,XX +XXX,XX @@ static int init_local_irq_data(unsigned int cpu)
     return 0;
 }
 
+static int cpu_next;
+
+static void balance_irq(int irq, unsigned int from, bool up)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    unsigned long flags;
+
+    ASSERT(!cpumask_empty(&cpu_online_map));
+
+    spin_lock_irqsave(&desc->lock, flags);
+    if ( likely(!desc->action) )
+        goto out;
+
+    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
+                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
+        goto out;
+
+    /*
+     * Setting affinity to a mask of multiple CPUs causes the GIC drivers to
+     * select one CPU from that mask. If the dying CPU was included in the IRQ's
+     * affinity mask, we cannot determine exactly which CPU the interrupt is
+     * currently routed to, as GIC drivers lack a concrete get_affinity API. So
+     * to be safe we must reroute it to a new, definitely online, CPU. In the
+     * case of CPU going down, we move only the interrupt that could reside on
+     * it. Otherwise, we rearrange all interrupts in a round-robin fashion.
+     */
+    if ( !up && !cpumask_test_cpu(from, desc->affinity) )
+        goto out;
+
+    cpu_next = cpumask_cycle(cpu_next, &cpu_online_map);
+    irq_set_affinity(desc, cpumask_of(cpu_next));
+
+out:
+    spin_unlock_irqrestore(&desc->lock, flags);
+}
+
+void rebalance_irqs(unsigned int from, bool up)
+{
+    int irq;
+
+    if ( cpumask_empty(&cpu_online_map) )
+        return;
+
+    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
+        balance_irq(irq, from, up);
+
+#ifdef CONFIG_GICV3_ESPI
+    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
+        balance_irq(irq, from, up);
+#endif
+}
+
 static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                         void *hcpu)
 {
@@ -XXX,XX +XXX,XX @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
             printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
                    cpu);
         break;
+    case CPU_ONLINE:
+        rebalance_irqs(cpu, true);
     }
 
     return notifier_from_errno(rc);
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -XXX,XX +XXX,XX @@ void __cpu_disable(void)
 
     smp_mb();
 
+    /*
+     * Now that the interrupts are cleared and the CPU marked as offline,
+     * move interrupts out of it
+     */
+    rebalance_irqs(cpu, false);
+
     /* Return to caller; eventually the IPI mechanism will unwind and the
      * scheduler will drop to the idle loop, which will call stop_cpu(). */
 }
--
2.51.2
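The selection logic in balance_irq() can be tried outside the hypervisor with a small standalone model. This is only a sketch under stated assumptions: CPU masks are plain 32-bit bitmaps, `toy_mask_cycle()` is a hypothetical stand-in that returns the next set bit after a given CPU with wrap-around (which is how `cpumask_cycle()` is used in the patch), and `struct toy_irq`, `toy_balance_irq()` and `rr_cursor` are illustrative names. Locking and the guest/move-pending checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for cpumask_cycle(): next set bit after n, wrapping. */
int toy_mask_cycle(int n, uint32_t mask)
{
    for ( int i = 1; i <= TOY_NR_CPUS; i++ )
    {
        int cpu = (n + i) % TOY_NR_CPUS;

        if ( mask & (1u << cpu) )
            return cpu;
    }

    return -1; /* empty mask */
}

/* Toy interrupt: only the fields the balancing decision looks at. */
struct toy_irq {
    uint32_t affinity; /* recorded affinity, one bit per CPU */
    bool has_action;   /* false models desc->action == NULL */
};

/* Round-robin cursor, playing the role of cpu_next in the patch. */
int rr_cursor;

void toy_balance_irq(struct toy_irq *irq, int from, bool up, uint32_t online)
{
    assert(online != 0);

    if ( !irq->has_action )
        return;

    /* On CPU down, leave alone interrupts that cannot be on the dying CPU. */
    if ( !up && !(irq->affinity & (1u << from)) )
        return;

    rr_cursor = toy_mask_cycle(rr_cursor, online);
    irq->affinity = 1u << rr_cursor;
}
```

With CPUs 0, 1 and 3 online (mask `0x0b`), the cursor starting at 0, and CPU 2 going down, two in-use interrupts recorded on CPU 2 end up on CPUs 1 and 3 in turn, while an interrupt recorded on CPU 0 and one without an action are left untouched.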
Move XEN_SYSCTL_CPU_HOTPLUG_{ONLINE,OFFLINE} handlers to common code to allow for enabling/disabling CPU cores at runtime on Arm64. SMT-disable enforcement check is moved into a separate architecture-specific function. For now these operations only support Arm64. For proper Arm32 support, there needs to be a mechanism to free per-cpu page tables, allocated in init_domheap_mappings. Also, hotplug is not supported if ITS, FFA, or TEE is enabled, as they use non-static IRQ actions. Create a Kconfig option CPU_HOTPLUG that reflects these constraints. On X86 the option is enabled unconditionally. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v4->v5: * move handling to common code * rename config to CPU_HOTPLUG * merge with "smp: Move cpu_up/down helpers to common code" v3->v4: * don't reimplement cpu_up/down helpers * add Kconfig option * fixup formatting v2->v3: * no changes v1->v2: * remove SMT ops * remove cpu == 0 checks * add XSM hooks * only implement for 64bit Arm --- xen/arch/arm/Kconfig | 1 + xen/arch/arm/smp.c | 9 +++++++ xen/arch/ppc/stubs.c | 4 +++ xen/arch/riscv/stubs.c | 5 ++++ xen/arch/x86/Kconfig | 1 + xen/arch/x86/include/asm/smp.h | 3 --- xen/arch/x86/smp.c | 33 +++---------------------- xen/arch/x86/sysctl.c | 12 +++------ xen/common/Kconfig | 3 +++ xen/common/smp.c | 34 +++++++++++++++++++++++++ xen/common/sysctl.c | 45 ++++++++++++++++++++++++++++++++++ xen/include/xen/smp.h | 4 +++ 12 files changed, 112 insertions(+), 42 deletions(-) diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/Kconfig +++ b/xen/arch/arm/Kconfig @@ -XXX,XX +XXX,XX @@ config ARM_64 def_bool y depends on !ARM_32 select 64BIT + select CPU_HOTPLUG if !TEE && !FFA && !HAS_ITS select HAS_FAST_MULTIPLY select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH diff --git a/xen/arch/arm/smp.c b/xen/arch/arm/smp.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/smp.c +++ b/xen/arch/arm/smp.c @@ -XXX,XX +XXX,XX @@ void 
smp_send_call_function_mask(const cpumask_t *mask) } } +/* + * We currently don't support SMT on ARM so we don't need any special logic for + * CPU disabling + */ +bool arch_smt_cpu_disable(unsigned int cpu) +{ + return false; +} + /* * Local variables: * mode: C diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/ppc/stubs.c +++ b/xen/arch/ppc/stubs.c @@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask) BUG_ON("unimplemented"); } +bool arch_smt_cpu_disable(unsigned int cpu) +{ + BUG_ON("unimplemented"); +} /* irq.c */ void irq_ack_none(struct irq_desc *desc) diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/stubs.c +++ b/xen/arch/riscv/stubs.c @@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask) BUG_ON("unimplemented"); } +bool arch_smt_cpu_disable(unsigned int cpu) +{ + BUG_ON("unimplemented"); +} + /* irq.c */ void irq_ack_none(struct irq_desc *desc) diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/Kconfig +++ b/xen/arch/x86/Kconfig @@ -XXX,XX +XXX,XX @@ config X86 select ARCH_PAGING_MEMPOOL select ARCH_SUPPORTS_INT128 imply CORE_PARKING + select CPU_HOTPLUG select FUNCTION_ALIGNMENT_16B select GENERIC_BUG_FRAME select HAS_ALTERNATIVE diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -XXX,XX +XXX,XX @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm); void __stop_this_cpu(void); -long cf_check cpu_up_helper(void *data); -long cf_check cpu_down_helper(void *data); - long cf_check core_parking_helper(void *data); bool core_parking_remove(unsigned int cpu); uint32_t get_cur_idle_nums(void); diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/smp.c +++ 
b/xen/arch/x86/smp.c @@ -XXX,XX +XXX,XX @@ void cf_check call_function_interrupt(void) smp_call_function_interrupt(); } -long cf_check cpu_up_helper(void *data) +bool arch_smt_cpu_disable(unsigned int cpu) { - unsigned int cpu = (unsigned long)data; - int ret = cpu_up(cpu); - - /* Have one more go on EBUSY. */ - if ( ret == -EBUSY ) - ret = cpu_up(cpu); - - if ( !ret && !opt_smt && - cpu_data[cpu].compute_unit_id == INVALID_CUID && - cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1 ) - { - ret = cpu_down_helper(data); - if ( ret ) - printk("Could not re-offline CPU%u (%d)\n", cpu, ret); - else - ret = -EPERM; - } - - return ret; -} - -long cf_check cpu_down_helper(void *data) -{ - int cpu = (unsigned long)data; - int ret = cpu_down(cpu); - /* Have one more go on EBUSY. */ - if ( ret == -EBUSY ) - ret = cpu_down(cpu); - return ret; + return !opt_smt && cpu_data[cpu].compute_unit_id == INVALID_CUID && + cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1; } diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/sysctl.c +++ b/xen/arch/x86/sysctl.c @@ -XXX,XX +XXX,XX @@ long arch_do_sysctl( case XEN_SYSCTL_cpu_hotplug: { - unsigned int cpu = sysctl->u.cpu_hotplug.cpu; unsigned int op = sysctl->u.cpu_hotplug.op; bool plug; long (*fn)(void *data); @@ -XXX,XX +XXX,XX @@ long arch_do_sysctl( switch ( op ) { case XEN_SYSCTL_CPU_HOTPLUG_ONLINE: - plug = true; - fn = cpu_up_helper; - hcpu = _p(cpu); - break; - case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE: - plug = false; - fn = cpu_down_helper; - hcpu = _p(cpu); + /* Handled by common code */ + ASSERT_UNREACHABLE(); + ret = -EOPNOTSUPP; break; case XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE: diff --git a/xen/common/Kconfig b/xen/common/Kconfig index XXXXXXX..XXXXXXX 100644 --- a/xen/common/Kconfig +++ b/xen/common/Kconfig @@ -XXX,XX +XXX,XX @@ config LIBFDT config MEM_ACCESS_ALWAYS_ON bool +config CPU_HOTPLUG + bool + config VM_EVENT def_bool MEM_ACCESS_ALWAYS_ON prompt "Memory Access and 
VM events" if !MEM_ACCESS_ALWAYS_ON diff --git a/xen/common/smp.c b/xen/common/smp.c index XXXXXXX..XXXXXXX 100644 --- a/xen/common/smp.c +++ b/xen/common/smp.c @@ -XXX,XX +XXX,XX @@ * GNU General Public License for more details. */ +#include <xen/cpu.h> #include <asm/hardirq.h> #include <asm/processor.h> #include <xen/spinlock.h> @@ -XXX,XX +XXX,XX @@ void smp_call_function_interrupt(void) irq_exit(); } +#ifdef CONFIG_CPU_HOTPLUG +long cf_check cpu_up_helper(void *data) +{ + unsigned int cpu = (unsigned long)data; + int ret = cpu_up(cpu); + + /* Have one more go on EBUSY. */ + if ( ret == -EBUSY ) + ret = cpu_up(cpu); + + if ( !ret && arch_smt_cpu_disable(cpu) ) + { + ret = cpu_down_helper(data); + if ( ret ) + printk("Could not re-offline CPU%u (%d)\n", cpu, ret); + else + ret = -EPERM; + } + + return ret; +} + +long cf_check cpu_down_helper(void *data) +{ + int cpu = (unsigned long)data; + int ret = cpu_down(cpu); + /* Have one more go on EBUSY. */ + if ( ret == -EBUSY ) + ret = cpu_down(cpu); + return ret; +} +#endif /* CONFIG_CPU_HOTPLUG */ + /* * Local variables: * mode: C diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -XXX,XX +XXX,XX @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) copyback = 1; break; +#ifdef CONFIG_CPU_HOTPLUG + case XEN_SYSCTL_cpu_hotplug: + { + unsigned int cpu = op->u.cpu_hotplug.cpu; + unsigned int hp_op = op->u.cpu_hotplug.op; + bool plug; + long (*fn)(void *data); + void *hcpu; + + switch ( hp_op ) + { + case XEN_SYSCTL_CPU_HOTPLUG_ONLINE: + plug = true; + fn = cpu_up_helper; + hcpu = _p(cpu); + break; + + case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE: + plug = false; + fn = cpu_down_helper; + hcpu = _p(cpu); + break; + + case XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE: + case XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE: + /* Use arch specific handlers as SMT is very arch-dependent */ + ret = arch_do_sysctl(op, u_sysctl); + copyback = 0; + goto out; + + 
default: + ret = -EOPNOTSUPP; + break; + } + + if ( !ret ) + ret = plug ? xsm_resource_plug_core(XSM_HOOK) + : xsm_resource_unplug_core(XSM_HOOK); + + if ( !ret ) + ret = continue_hypercall_on_cpu(0, fn, hcpu); + break; + } +#endif + default: ret = arch_do_sysctl(op, u_sysctl); copyback = 0; diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h index XXXXXXX..XXXXXXX 100644 --- a/xen/include/xen/smp.h +++ b/xen/include/xen/smp.h @@ -XXX,XX +XXX,XX @@ extern void *stack_base[NR_CPUS]; void initialize_cpu_data(unsigned int cpu); int setup_cpu_root_pgt(unsigned int cpu); +bool arch_smt_cpu_disable(unsigned int cpu); +long cf_check cpu_up_helper(void *data); +long cf_check cpu_down_helper(void *data); + #endif /* __XEN_SMP_H__ */ -- 2.51.2
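The control flow that this patch moves into xen/common/smp.c (retry once on -EBUSY, then take the CPU back down if the architecture hook says it must not stay online) can be sketched in isolation. The code below is a toy model under stated assumptions: `struct toy_ops` and its three callbacks are hypothetical stand-ins for `cpu_up()`, `cpu_down()` and `arch_smt_cpu_disable()`, the stub callbacks exist only to exercise the paths, and the printk on a failed re-offline is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hooks standing in for cpu_up(), cpu_down() and the arch check. */
struct toy_ops {
    int (*up)(unsigned int cpu);
    int (*down)(unsigned int cpu);
    bool (*must_stay_offline)(unsigned int cpu);
};

/* Run an operation and have one more go if it reports -EBUSY. */
int toy_retry_once(int (*op)(unsigned int), unsigned int cpu)
{
    int ret = op(cpu);

    if ( ret == -EBUSY )
        ret = op(cpu);

    return ret;
}

int toy_cpu_down_helper(const struct toy_ops *ops, unsigned int cpu)
{
    return toy_retry_once(ops->down, cpu);
}

int toy_cpu_up_helper(const struct toy_ops *ops, unsigned int cpu)
{
    int ret = toy_retry_once(ops->up, cpu);

    /* The CPU came up, but policy forbids it: undo and report -EPERM. */
    if ( !ret && ops->must_stay_offline(cpu) )
    {
        ret = toy_cpu_down_helper(ops, cpu);
        if ( !ret )
            ret = -EPERM;
    }

    return ret;
}

/* Demo stubs (illustrative only). */
int toy_calls;

int toy_busy_once(unsigned int cpu)
{
    (void)cpu;
    return toy_calls++ == 0 ? -EBUSY : 0;
}

int toy_ok(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

bool toy_never(unsigned int cpu)
{
    (void)cpu;
    return false;
}

bool toy_always(unsigned int cpu)
{
    (void)cpu;
    return true;
}
```

Keeping the policy check behind a single boolean hook is what lets the common helper stay architecture-neutral: in this series the Arm implementation simply returns false, while x86 keeps its SMT test.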
With CPU hotplug sysctls implemented on Arm it becomes useful to have a
tool for calling them.

According to the commit history it seems that putting hptool under
config MIGRATE was a measure to fix IA64 build. As IA64 is no longer
supported it can now be brought back. So build it unconditionally.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
v4->v5:
* make hptool always build

v3->v4:
* no changes

v2->v3:
* no changes

v1->v2:
* switch to configure from legacy config
---
 tools/libs/guest/Makefile.common | 2 +-
 tools/misc/Makefile              | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/libs/guest/Makefile.common b/tools/libs/guest/Makefile.common
index XXXXXXX..XXXXXXX 100644
--- a/tools/libs/guest/Makefile.common
+++ b/tools/libs/guest/Makefile.common
@@ -XXX,XX +XXX,XX @@ OBJS-y += xg_private.o
 OBJS-y += xg_domain.o
 OBJS-y += xg_suspend.o
 OBJS-y += xg_resume.o
+OBJS-y += xg_offline_page.o
 ifeq ($(CONFIG_MIGRATE),y)
 OBJS-y += xg_sr_common.o
 OBJS-$(CONFIG_X86) += xg_sr_common_x86.o
@@ -XXX,XX +XXX,XX @@ OBJS-$(CONFIG_X86) += xg_sr_save_x86_pv.o
 OBJS-$(CONFIG_X86) += xg_sr_save_x86_hvm.o
 OBJS-y += xg_sr_restore.o
 OBJS-y += xg_sr_save.o
-OBJS-y += xg_offline_page.o
 else
 OBJS-y += xg_nomigrate.o
 endif
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index XXXXXXX..XXXXXXX 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -XXX,XX +XXX,XX @@ INSTALL_BIN += xencov_split
 INSTALL_BIN += $(INSTALL_BIN-y)
 
 # Everything to be installed in regular sbin/
-INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmcrash
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx
 INSTALL_SBIN-$(CONFIG_X86) += xen-lowmemd
@@ -XXX,XX +XXX,XX @@ INSTALL_SBIN += xenwatchdogd
 INSTALL_SBIN += xen-access
 INSTALL_SBIN += xen-livepatch
 INSTALL_SBIN += xen-diag
+INSTALL_SBIN += xen-hptool
 INSTALL_SBIN += $(INSTALL_SBIN-y)
 
 # Everything to be installed
--
2.51.2
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
v4->v5:
* s/supported/implemented/
* update SUPPORT.md

v3->v4:
* update configuration section

v2->v3:
* patch introduced
---
 SUPPORT.md                |  1 +
 docs/misc/cpu-hotplug.txt | 50 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)
 create mode 100644 docs/misc/cpu-hotplug.txt

diff --git a/SUPPORT.md b/SUPPORT.md
index XXXXXXX..XXXXXXX 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -XXX,XX +XXX,XX @@ For the Cortex A77 r0p0 - r1p0, see Errata 1508412.
 ### ACPI CPU Hotplug
 
     Status, x86: Experimental
+    Status, Arm64: Experimental
 
 ### Physical Memory
 
diff --git a/docs/misc/cpu-hotplug.txt b/docs/misc/cpu-hotplug.txt
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/docs/misc/cpu-hotplug.txt
@@ -XXX,XX +XXX,XX @@
+CPU Hotplug
+===========
+
+CPU hotplug is a feature that allows pCPU cores to be added to or removed from a
+running system without requiring a reboot. It is implemented on x86 and Arm64
+architectures.
+
+Implementation Details
+----------------------
+
+CPU hotplug is implemented through the `XEN_SYSCTL_CPU_HOTPLUG_*` sysctl calls.
+The specific calls are:
+
+- `XEN_SYSCTL_CPU_HOTPLUG_ONLINE`: Brings a pCPU online
+- `XEN_SYSCTL_CPU_HOTPLUG_OFFLINE`: Takes a pCPU offline
+- `XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE`: Enables SMT threads (x86 only)
+- `XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE`: Disables SMT threads (x86 only)
+
+All cores can be disabled, assuming hardware support, except for the boot core.
+Sysctl calls are routed to the boot core before doing any actual up/down
+operations on other cores.
+
+Configuration
+-------------
+
+The presence of the feature is controlled by CONFIG_CPU_HOTPLUG option. It is
+enabled unconditionally on x86 architecture. On Arm64, the option is enabled by
+default when ITS, FFA, and TEE configs are disabled.
+xen-hptool userspace tool is built unconditionally.
+
+Usage
+-----
+
+Disable core:
+
+$ xen-hptool cpu-offline 2
+Prepare to offline CPU 2
+(XEN) Removing cpu 2 from runqueue 0
+CPU 2 offlined successfully
+
+Enable core:
+
+$ xen-hptool cpu-online 2
+Prepare to online CPU 2
+(XEN) Bringing up CPU2
+(XEN) GICv3: CPU2: Found redistributor in region 0 @00000a004005c000
+(XEN) CPU2: Guest atomics will try 1 times before pausing the domain
+(XEN) CPU 2 booted.
+(XEN) Adding cpu 2 to runqueue 0
+CPU 2 onlined successfully
--
2.51.2
This series implements support for CPU hotplug/unplug on Arm. To achieve this,
several things need to be done:

1. XEN_SYSCTL_CPU_HOTPLUG_* calls implemented on Arm64.
2. Enabled building of xen-hptool.
3. Migration of irqs from dying CPUs implemented.

Tested on QEMU.

v5->v6:
* see individual patches

v4->v5:
* drop merged patches
* combine "smp: Move cpu_up/down helpers to common code" with
  "arm/sysctl: Implement cpu hotplug ops"
* see individual patches

v3->v4:
* add irq migration patches
* see individual patches

v2->v3:
* add docs

v1->v2:
* see individual patches

Mykyta Poturai (5):
  arm/irq: Keep track of irq affinities
  arm/irq: Migrate IRQs during CPU up/down operations
  arm/sysctl: Implement cpu hotplug ops
  tools: Allow building xen-hptool without CONFIG_MIGRATE
  docs: Document CPU hotplug

 SUPPORT.md                        |  1 +
 docs/misc/cpu-hotplug.txt         | 50 ++++++++++++++++++++++
 tools/libs/guest/Makefile.common  |  2 +-
 tools/misc/Makefile               |  2 +-
 xen/arch/arm/gic-vgic.c           |  2 +
 xen/arch/arm/include/asm/irq.h    |  4 ++
 xen/arch/arm/irq.c                | 69 ++++++++++++++++++++++++++++++-
 xen/arch/arm/smp.c                |  9 ++++
 xen/arch/arm/smpboot.c            |  8 ++++
 xen/arch/arm/vgic.c               | 14 ++++++-
 xen/arch/arm/vgic/vgic-mmio-v2.c  | 11 +++--
 xen/arch/arm/vgic/vgic.c          | 15 +++----
 xen/arch/ppc/stubs.c              |  4 ++
 xen/arch/riscv/stubs.c            |  5 +++
 xen/arch/x86/include/asm/smp.h    |  3 --
 xen/arch/x86/platform_hypercall.c | 12 ++++++
 xen/arch/x86/smp.c                | 33 ++-------------
 xen/arch/x86/sysctl.c             | 21 ++++++----
 xen/common/Kconfig                |  6 +++
 xen/common/smp.c                  | 35 ++++++++++++++++
 xen/common/sysctl.c               | 46 +++++++++++++++++++++
 xen/include/xen/smp.h             |  4 ++
 xen/xsm/flask/hooks.c             |  2 +-
 23 files changed, 296 insertions(+), 62 deletions(-)
 create mode 100644 docs/misc/cpu-hotplug.txt

--
2.51.2
Currently on Arm the desc->affinity mask of an irq is never updated, which makes it hard to know the actual affinity of an interrupt. Fix this by updating the field in irq_set_affinity. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v5->v6: * add missing locking around irq_set_affinity calls v4->v5: * add locking v3->v4: * patch introduced --- xen/arch/arm/gic-vgic.c | 2 ++ xen/arch/arm/irq.c | 9 +++++++-- xen/arch/arm/vgic.c | 14 ++++++++++++-- xen/arch/arm/vgic/vgic-mmio-v2.c | 11 +++++------ xen/arch/arm/vgic/vgic.c | 15 ++++++++------- 5 files changed, 34 insertions(+), 17 deletions(-) diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/gic-vgic.c +++ b/xen/arch/arm/gic-vgic.c @@ -XXX,XX +XXX,XX @@ static void gic_update_one_lr(struct vcpu *v, int i) if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) ) { struct vcpu *v_target = vgic_get_target_vcpu(v, irq); + spin_lock(&p->desc->lock); irq_set_affinity(p->desc, cpumask_of(v_target->processor)); + spin_unlock(&p->desc->lock); clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status); } } diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/irq.c +++ b/xen/arch/arm/irq.c @@ -XXX,XX +XXX,XX @@ static inline struct domain *irq_get_domain(struct irq_desc *desc) return irq_get_guest_info(desc)->d; } +/* Must be called with desc->lock held */ void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask) { - if ( desc != NULL ) - desc->handler->set_affinity(desc, mask); + if ( desc == NULL ) + return; + + ASSERT(spin_is_locked(&desc->lock)); + cpumask_copy(desc->affinity, mask); + desc->handler->set_affinity(desc, mask); } int request_irq(unsigned int irq, unsigned int irqflags, diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/vgic.c +++ b/xen/arch/arm/vgic.c @@ -XXX,XX +XXX,XX @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int 
irq) if ( list_empty(&p->inflight) ) { + spin_lock(&p->desc->lock); irq_set_affinity(p->desc, cpumask_of(new->processor)); + spin_unlock(&p->desc->lock); spin_unlock_irqrestore(&old->arch.vgic.lock, flags); return true; } @@ -XXX,XX +XXX,XX @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq) if ( !list_empty(&p->lr_queue) ) { vgic_remove_irq_from_queues(old, p); + spin_lock(&p->desc->lock); irq_set_affinity(p->desc, cpumask_of(new->processor)); + spin_unlock(&p->desc->lock); spin_unlock_irqrestore(&old->arch.vgic.lock, flags); vgic_inject_irq(new->domain, new, irq, true); return true; @@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v) struct domain *d = v->domain; struct pending_irq *p; struct vcpu *v_target; + unsigned long flags; int i; /* @@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v) p = irq_to_pending(v_target, virq); if ( v_target == v && !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) ) + { + if ( !p->desc ) + continue; + spin_lock_irqsave(&p->desc->lock, flags); irq_set_affinity(p->desc, cpu_mask); + spin_unlock_irqrestore(&p->desc->lock, flags); + } } } @@ -XXX,XX +XXX,XX @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, unsigned int n) spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags); if ( p->desc != NULL ) { - irq_set_affinity(p->desc, cpumask_of(v_target->processor)); spin_lock_irqsave(&p->desc->lock, flags); + irq_set_affinity(p->desc, cpumask_of(v_target->processor)); /* * The irq cannot be a PPI, we only support delivery of SPIs * to guests. 
@@ -XXX,XX +XXX,XX @@ void vgic_check_inflight_irqs_pending(struct vcpu *v, unsigned int rank, uint32_ * indent-tabs-mode: nil * End: */ - diff --git a/xen/arch/arm/vgic/vgic-mmio-v2.c b/xen/arch/arm/vgic/vgic-mmio-v2.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/vgic/vgic-mmio-v2.c +++ b/xen/arch/arm/vgic/vgic-mmio-v2.c @@ -XXX,XX +XXX,XX @@ static void vgic_mmio_write_target(struct vcpu *vcpu, for ( i = 0; i < len; i++ ) { struct vgic_irq *irq = vgic_get_irq(vcpu->domain, NULL, intid + i); + struct irq_desc *desc = irq_to_desc(irq->hwintid); - spin_lock_irqsave(&irq->irq_lock, flags); + spin_lock_irqsave(&desc->lock, flags); + spin_lock(&irq->irq_lock); irq->targets = (val >> (i * 8)) & cpu_mask; if ( irq->targets ) { irq->target_vcpu = vcpu->domain->vcpu[ffs(irq->targets) - 1]; if ( irq->hw ) - { - struct irq_desc *desc = irq_to_desc(irq->hwintid); - irq_set_affinity(desc, cpumask_of(irq->target_vcpu->processor)); - } } else irq->target_vcpu = NULL; - spin_unlock_irqrestore(&irq->irq_lock, flags); + spin_unlock(&irq->irq_lock); + spin_unlock_irqrestore(&desc->lock, flags); vgic_put_irq(vcpu->domain, irq); } } diff --git a/xen/arch/arm/vgic/vgic.c b/xen/arch/arm/vgic/vgic.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/vgic/vgic.c +++ b/xen/arch/arm/vgic/vgic.c @@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v) { struct vgic_irq *irq = vgic_get_irq(d, NULL, i + VGIC_NR_PRIVATE_IRQS); unsigned long flags; + irq_desc_t *desc; if ( !irq ) continue; - spin_lock_irqsave(&irq->irq_lock, flags); + desc = irq_to_desc(irq->hwintid); - /* Only hardware mapped vIRQs that are targeting this vCPU. */ - if ( irq->hw && irq->target_vcpu == v) - { - irq_desc_t *desc = irq_to_desc(irq->hwintid); + spin_lock_irqsave(&desc->lock, flags); + spin_lock(&irq->irq_lock); + /* Only hardware mapped vIRQs that are targeting this vCPU. 
*/ + if ( irq->hw && irq->target_vcpu == v ) irq_set_affinity(desc, cpumask_of(v->processor)); - } - spin_unlock_irqrestore(&irq->irq_lock, flags); + spin_unlock(&irq->irq_lock); + spin_unlock_irqrestore(&desc->lock, flags); vgic_put_irq(d, irq); } } -- 2.51.2
Move IRQs from dying CPU to the online ones when a CPU is getting offlined. When onlining, rebalance all IRQs in a round-robin fashion. Guest-bound IRQs are already handled by scheduler in the process of moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen itself. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v5->v6: * don't do any balancing on boot * only do balancing when cpu hotplug is enabled v4->v5: * handle CPU onlining as well * more comments * fix crash when ESPI is disabled * don't assume CPU 0 is a boot CPU * use insigned int for irq number * remove assumption that all irqs a bound to CPU 0 by default from the commit message v3->v4: * patch introduced --- xen/arch/arm/include/asm/irq.h | 4 +++ xen/arch/arm/irq.c | 60 ++++++++++++++++++++++++++++++++++ xen/arch/arm/smpboot.c | 8 +++++ 3 files changed, 72 insertions(+) diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/include/asm/irq.h +++ b/xen/arch/arm/include/asm/irq.h @@ -XXX,XX +XXX,XX @@ bool irq_type_set_by_domain(const struct domain *d); void irq_end_none(struct irq_desc *irq); #define irq_end_none irq_end_none +#ifdef CONFIG_CPU_HOTPLUG +void rebalance_irqs(unsigned int from, bool up); +#endif + #endif /* _ASM_HW_IRQ_H */ /* * Local variables: diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/irq.c +++ b/xen/arch/arm/irq.c @@ -XXX,XX +XXX,XX @@ static int init_local_irq_data(unsigned int cpu) return 0; } +#ifdef CONFIG_CPU_HOTPLUG +static int cpu_next; + +static void balance_irq(int irq, unsigned int from, bool up) +{ + struct irq_desc *desc = irq_to_desc(irq); + unsigned long flags; + + ASSERT(!cpumask_empty(&cpu_online_map)); + + spin_lock_irqsave(&desc->lock, flags); + if ( likely(!desc->action) ) + goto out; + + if ( likely(test_bit(_IRQ_GUEST, &desc->status) || + test_bit(_IRQ_MOVE_PENDING, &desc->status)) ) + goto out; + + /* + * 
Setting affinity to a mask of multiple CPUs causes the GIC drivers to + * select one CPU from that mask. If the dying CPU was included in the IRQ's + * affinity mask, we cannot determine exactly which CPU the interrupt is + * currently routed to, as GIC drivers lack a concrete get_affinity API. So + * to be safe we must reroute it to a new, definitely online, CPU. In the + * case of CPU going down, we move only the interrupt that could reside on + * it. Otherwise, we rearrange all interrupts in a round-robin fashion. + */ + if ( !up && !cpumask_test_cpu(from, desc->affinity) ) + goto out; + + cpu_next = cpumask_cycle(cpu_next, &cpu_online_map); + irq_set_affinity(desc, cpumask_of(cpu_next)); + +out: + spin_unlock_irqrestore(&desc->lock, flags); +} + +void rebalance_irqs(unsigned int from, bool up) +{ + int irq; + + if ( cpumask_empty(&cpu_online_map) ) + return; + + for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ ) + balance_irq(irq, from, up); + +#ifdef CONFIG_GICV3_ESPI + for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ ) + balance_irq(irq, from, up); +#endif +} +#endif /* CONFIG_CPU_HOTPLUG */ + static int cpu_callback(struct notifier_block *nfb, unsigned long action, void *hcpu) { @@ -XXX,XX +XXX,XX @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action, printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n", cpu); break; + case CPU_ONLINE: +#ifdef CONFIG_CPU_HOTPLUG + if ( system_state >= SYS_STATE_active ) + rebalance_irqs(cpu, true); +#endif + break; } return notifier_from_errno(rc); diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/arm/smpboot.c +++ b/xen/arch/arm/smpboot.c @@ -XXX,XX +XXX,XX @@ void __cpu_disable(void) smp_mb(); + /* + * Now that the interrupts are cleared and the CPU marked as offline, + * move interrupts out of it + */ +#ifdef CONFIG_CPU_HOTPLUG + rebalance_irqs(cpu, false); +#endif + /* Return to caller; eventually the IPI mechanism will unwind 
and the * scheduler will drop to the idle loop, which will call stop_cpu(). */ } -- 2.51.2
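The round-robin target selection done by balance_irq() can be illustrated outside of Xen with a minimal, self-contained sketch. This is not the hypervisor code: the online map is modelled as a plain 32-bit mask, and mask_cycle(), struct fake_irq and balance_one() are hypothetical stand-ins for cpumask_cycle(), struct irq_desc and balance_irq(). Locking, _IRQ_MOVE_PENDING handling and the GIC set_affinity callback are left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_FAKE_CPUS 32

/*
 * Hypothetical stand-in for cpumask_cycle(): return the next set bit
 * strictly after n, wrapping around; -1 if the mask is empty.
 */
static int mask_cycle(int n, uint32_t mask)
{
    for ( int i = 1; i <= NR_FAKE_CPUS; i++ )
    {
        int cpu = (n + i) % NR_FAKE_CPUS;

        if ( mask & (UINT32_C(1) << cpu) )
            return cpu;
    }

    return -1;
}

/* Simplified model of the irq_desc fields the patch looks at. */
struct fake_irq {
    bool in_use;       /* models desc->action != NULL */
    bool guest;        /* models the _IRQ_GUEST status bit */
    uint32_t affinity; /* models desc->affinity */
};

/* Last CPU handed out, shared by all calls (as cpu_next in the patch). */
static int cpu_next;

static void balance_one(struct fake_irq *irq, unsigned int from, bool up,
                        uint32_t online)
{
    assert(online != 0);

    /* Unused and guest-bound IRQs are skipped. */
    if ( !irq->in_use || irq->guest )
        return;

    /* On CPU down, only IRQs that may reside on the dying CPU are moved. */
    if ( !up && !(irq->affinity & (UINT32_C(1) << from)) )
        return;

    cpu_next = mask_cycle(cpu_next, online);
    irq->affinity = UINT32_C(1) << cpu_next;
}
```

In this model, taking CPU 2 offline with CPUs 0, 1 and 3 still online moves only the Xen-owned IRQs whose affinity included CPU 2, while an online event redistributes every Xen-owned IRQ across the online mask.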
Move XEN_SYSCTL_CPU_HOTPLUG_{ONLINE,OFFLINE} handlers to common code to allow for enabling/disabling CPU cores at runtime on Arm64. The SMT-disable enforcement check is moved into a separate architecture-specific function. For now these operations only support Arm64. For proper Arm32 support, there needs to be a mechanism to free per-cpu page tables, allocated in init_domheap_mappings. Also, hotplug is not supported if ITS, FFA, or TEE is enabled, as they use non-static IRQ actions. Create a Kconfig option CPU_HOTPLUG that reflects these constraints. On x86 the option is enabled unconditionally. As cpu hotplug now has its own config option, switch flask to allow XEN_SYSCTL_cpu_hotplug depending on CONFIG_CPU_HOTPLUG, so it can work not only on x86. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v5->v6: * fix style issues * rename arch_smt_cpu_disable -> arch_cpu_can_stay_online and invert the logic * use IS_ENABLED instead of ifdef * remove explicit list of arch-specific SYSCTL_CPU_HOTPLUG_* options from the common handler * fix flask issue v4->v5: * move handling to common code * rename config to CPU_HOTPLUG * merge with "smp: Move cpu_up/down helpers to common code" v3->v4: * don't reimplement cpu_up/down helpers * add Kconfig option * fixup formatting v2->v3: * no changes v1->v2: * remove SMT ops * remove cpu == 0 checks * add XSM hooks * only implement for 64bit Arm --- xen/arch/arm/smp.c | 9 ++++++ xen/arch/ppc/stubs.c | 4 +++ xen/arch/riscv/stubs.c | 5 ++++ xen/arch/x86/include/asm/smp.h | 3 -- xen/arch/x86/platform_hypercall.c | 12 ++++++++ xen/arch/x86/smp.c | 33 ++-------------------- xen/arch/x86/sysctl.c | 21 ++++++++------ xen/common/Kconfig | 6 ++++ xen/common/smp.c | 35 +++++++++++++++++++++++ xen/common/sysctl.c | 46 +++++++++++++++++++++++++++++++ xen/include/xen/smp.h | 4 +++ xen/xsm/flask/hooks.c | 2 +- 12 files changed, 137 insertions(+), 43 deletions(-) diff --git a/xen/arch/arm/smp.c b/xen/arch/arm/smp.c index XXXXXXX..XXXXXXX 100644 --- 
a/xen/arch/arm/smp.c +++ b/xen/arch/arm/smp.c @@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask) } } +/* + * We currently don't support SMT on ARM so we don't need any special logic for + * CPU disabling + */ +bool arch_cpu_can_stay_online(unsigned int cpu) +{ + return true; +} + /* * Local variables: * mode: C diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/ppc/stubs.c +++ b/xen/arch/ppc/stubs.c @@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask) BUG_ON("unimplemented"); } +bool arch_cpu_can_stay_online(unsigned int cpu) +{ + BUG_ON("unimplemented"); +} /* irq.c */ void irq_ack_none(struct irq_desc *desc) diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/riscv/stubs.c +++ b/xen/arch/riscv/stubs.c @@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask) BUG_ON("unimplemented"); } +bool arch_cpu_can_stay_online(unsigned int cpu) +{ + BUG_ON("unimplemented"); +} + /* irq.c */ void irq_ack_none(struct irq_desc *desc) diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/include/asm/smp.h +++ b/xen/arch/x86/include/asm/smp.h @@ -XXX,XX +XXX,XX @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm); void __stop_this_cpu(void); -long cf_check cpu_up_helper(void *data); -long cf_check cpu_down_helper(void *data); - long cf_check core_parking_helper(void *data); bool core_parking_remove(unsigned int cpu); uint32_t get_cur_idle_nums(void); diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/platform_hypercall.c +++ b/xen/arch/x86/platform_hypercall.c @@ -XXX,XX +XXX,XX @@ ret_t do_platform_op( { int cpu = op->u.cpu_ol.cpuid; + if ( !IS_ENABLED(CONFIG_CPU_HOTPLUG) ) + { + ret = -EOPNOTSUPP; + break; + } + ret = 
xsm_resource_plug_core(XSM_HOOK); if ( ret ) break; @@ -XXX,XX +XXX,XX @@ ret_t do_platform_op( { int cpu = op->u.cpu_ol.cpuid; + if ( !IS_ENABLED(CONFIG_CPU_HOTPLUG) ) + { + ret = -EOPNOTSUPP; + break; + } + ret = xsm_resource_unplug_core(XSM_HOOK); if ( ret ) break; diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/smp.c +++ b/xen/arch/x86/smp.c @@ -XXX,XX +XXX,XX @@ void cf_check call_function_interrupt(void) smp_call_function_interrupt(); } -long cf_check cpu_up_helper(void *data) +bool arch_cpu_can_stay_online(unsigned int cpu) { - unsigned int cpu = (unsigned long)data; - int ret = cpu_up(cpu); - - /* Have one more go on EBUSY. */ - if ( ret == -EBUSY ) - ret = cpu_up(cpu); - - if ( !ret && !opt_smt && - cpu_data[cpu].compute_unit_id == INVALID_CUID && - cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1 ) - { - ret = cpu_down_helper(data); - if ( ret ) - printk("Could not re-offline CPU%u (%d)\n", cpu, ret); - else - ret = -EPERM; - } - - return ret; -} - -long cf_check cpu_down_helper(void *data) -{ - int cpu = (unsigned long)data; - int ret = cpu_down(cpu); - /* Have one more go on EBUSY. */ - if ( ret == -EBUSY ) - ret = cpu_down(cpu); - return ret; + return opt_smt || cpu_data[cpu].compute_unit_id != INVALID_CUID || + cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) <= 1; } diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/arch/x86/sysctl.c +++ b/xen/arch/x86/sysctl.c @@ -XXX,XX +XXX,XX @@ static void cf_check l3_cache_get(void *arg) static long cf_check smt_up_down_helper(void *data) { + #ifdef CONFIG_CPU_HOTPLUG bool up = (bool)data; unsigned int cpu, sibling_mask = boot_cpu_data.x86_num_siblings - 1; int ret = 0; @@ -XXX,XX +XXX,XX @@ static long cf_check smt_up_down_helper(void *data) up ? 
"enabled" : "disabled", CPUMASK_PR(&cpu_online_map)); return ret; + #endif /* CONFIG_CPU_HOTPLUG */ + return 0; } void arch_do_physinfo(struct xen_sysctl_physinfo *pi) @@ -XXX,XX +XXX,XX @@ long arch_do_sysctl( case XEN_SYSCTL_cpu_hotplug: { - unsigned int cpu = sysctl->u.cpu_hotplug.cpu; unsigned int op = sysctl->u.cpu_hotplug.op; bool plug; long (*fn)(void *data); void *hcpu; - switch ( op ) + if ( !IS_ENABLED(CONFIG_CPU_HOTPLUG) ) { - case XEN_SYSCTL_CPU_HOTPLUG_ONLINE: - plug = true; - fn = cpu_up_helper; - hcpu = _p(cpu); + ret = -EOPNOTSUPP; break; + } + switch ( op ) + { + case XEN_SYSCTL_CPU_HOTPLUG_ONLINE: case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE: - plug = false; - fn = cpu_down_helper; - hcpu = _p(cpu); + /* Handled by common code */ + ASSERT_UNREACHABLE(); + ret = -EOPNOTSUPP; break; case XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE: diff --git a/xen/common/Kconfig b/xen/common/Kconfig index XXXXXXX..XXXXXXX 100644 --- a/xen/common/Kconfig +++ b/xen/common/Kconfig @@ -XXX,XX +XXX,XX @@ config SYSTEM_SUSPEND If unsure, say N. +config CPU_HOTPLUG + bool "Enable CPU hotplug" + depends on (X86 || ARM_64) && !FFA && !TEE && !HAS_ITS + default y + + menu "Supported hypercall interfaces" visible if EXPERT diff --git a/xen/common/smp.c b/xen/common/smp.c index XXXXXXX..XXXXXXX 100644 --- a/xen/common/smp.c +++ b/xen/common/smp.c @@ -XXX,XX +XXX,XX @@ * GNU General Public License for more details. */ +#include <xen/cpu.h> #include <asm/hardirq.h> #include <asm/processor.h> #include <xen/spinlock.h> @@ -XXX,XX +XXX,XX @@ void smp_call_function_interrupt(void) irq_exit(); } +#ifdef CONFIG_CPU_HOTPLUG +long cf_check cpu_up_helper(void *data) +{ + unsigned int cpu = (unsigned long)data; + int ret = cpu_up(cpu); + + /* Have one more go on EBUSY. 
*/ + if ( ret == -EBUSY ) + ret = cpu_up(cpu); + + if ( !ret && !arch_cpu_can_stay_online(cpu) ) + { + ret = cpu_down_helper(data); + if ( ret ) + printk("Could not re-offline CPU%u (%d)\n", cpu, ret); + else + ret = -EPERM; + } + + return ret; +} + +long cf_check cpu_down_helper(void *data) +{ + unsigned int cpu = (unsigned long)data; + int ret = cpu_down(cpu); + + /* Have one more go on EBUSY. */ + if ( ret == -EBUSY ) + ret = cpu_down(cpu); + return ret; +} +#endif /* CONFIG_CPU_HOTPLUG */ + /* * Local variables: * mode: C diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -XXX,XX +XXX,XX @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) copyback = 1; break; + case XEN_SYSCTL_cpu_hotplug: + { + unsigned int cpu = op->u.cpu_hotplug.cpu; + unsigned int hp_op = op->u.cpu_hotplug.op; + bool plug; + long (*fn)(void *data); + void *hcpu; + + ret = -EOPNOTSUPP; + if ( !IS_ENABLED(CONFIG_CPU_HOTPLUG) ) + break; + + switch ( hp_op ) + { + case XEN_SYSCTL_CPU_HOTPLUG_ONLINE: + plug = true; + fn = cpu_up_helper; + hcpu = _p(cpu); + break; + + case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE: + plug = false; + fn = cpu_down_helper; + hcpu = _p(cpu); + break; + + default: + fn = NULL; + break; + } + + if ( fn ) + { + ret = plug ? 
xsm_resource_plug_core(XSM_HOOK) + : xsm_resource_unplug_core(XSM_HOOK); + + if ( !ret ) + ret = continue_hypercall_on_cpu(0, fn, hcpu); + + break; + } + + /* Use the arch handler for cases not handled here */ + fallthrough; + } + default: ret = arch_do_sysctl(op, u_sysctl); copyback = 0; diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h index XXXXXXX..XXXXXXX 100644 --- a/xen/include/xen/smp.h +++ b/xen/include/xen/smp.h @@ -XXX,XX +XXX,XX @@ extern void *stack_base[NR_CPUS]; void initialize_cpu_data(unsigned int cpu); int setup_cpu_root_pgt(unsigned int cpu); +bool arch_cpu_can_stay_online(unsigned int cpu); +long cf_check cpu_up_helper(void *data); +long cf_check cpu_down_helper(void *data); + #endif /* __XEN_SMP_H__ */ diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c index XXXXXXX..XXXXXXX 100644 --- a/xen/xsm/flask/hooks.c +++ b/xen/xsm/flask/hooks.c @@ -XXX,XX +XXX,XX @@ static int cf_check flask_sysctl(int cmd) case XEN_SYSCTL_getdomaininfolist: case XEN_SYSCTL_page_offline_op: case XEN_SYSCTL_scheduler_op: -#ifdef CONFIG_X86 +#ifdef CONFIG_CPU_HOTPLUG case XEN_SYSCTL_cpu_hotplug: #endif return 0; -- 2.51.2
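The control flow of the common cpu_up_helper()/cpu_down_helper() pair (one retry on -EBUSY, then re-offlining a CPU that the architecture does not allow to stay online) can be exercised in isolation with the sketch below. It is only an illustration under stated assumptions: struct hotplug_ops, up_with_retry(), down_with_retry() and the fake_* backend are hypothetical names that stand in for cpu_up(), cpu_down() and arch_cpu_can_stay_online(), and nothing here runs inside Xen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hooks standing in for cpu_up(), cpu_down() and
 * arch_cpu_can_stay_online(). */
struct hotplug_ops {
    int (*up)(unsigned int cpu);
    int (*down)(unsigned int cpu);
    bool (*can_stay_online)(unsigned int cpu);
};

static int down_with_retry(const struct hotplug_ops *ops, unsigned int cpu)
{
    int ret = ops->down(cpu);

    /* Have one more go on EBUSY. */
    if ( ret == -EBUSY )
        ret = ops->down(cpu);

    return ret;
}

static int up_with_retry(const struct hotplug_ops *ops, unsigned int cpu)
{
    int ret;

    assert(ops && ops->up && ops->down && ops->can_stay_online);

    ret = ops->up(cpu);

    /* Have one more go on EBUSY. */
    if ( ret == -EBUSY )
        ret = ops->up(cpu);

    /* The CPU came up, but the arch policy forbids keeping it online. */
    if ( !ret && !ops->can_stay_online(cpu) )
    {
        ret = down_with_retry(ops, cpu);
        if ( !ret )
            ret = -EPERM;
    }

    return ret;
}

/* Fake backend for demonstration only. */
static int fake_busy_left;          /* number of -EBUSY results still to return */
static bool fake_online;            /* current state of the single fake CPU */
static bool fake_stay_allowed = true;

static int fake_up(unsigned int cpu)
{
    (void)cpu;
    if ( fake_busy_left > 0 )
    {
        fake_busy_left--;
        return -EBUSY;
    }
    fake_online = true;
    return 0;
}

static int fake_down(unsigned int cpu)
{
    (void)cpu;
    fake_online = false;
    return 0;
}

static bool fake_can_stay(unsigned int cpu)
{
    (void)cpu;
    return fake_stay_allowed;
}

static const struct hotplug_ops fake_ops = { fake_up, fake_down, fake_can_stay };
```

With this backend, a single transient -EBUSY is absorbed by the retry, two in a row are reported to the caller, and a CPU that may not stay online ends up offline again with -EPERM returned, which matches the behaviour the x86 code had before the move to common code.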
With CPU hotplug sysctls implemented on Arm, it becomes useful to have a tool for calling them. According to the commit history, it seems that xen-hptool was put under CONFIG_MIGRATE as a measure to fix the IA64 build. As IA64 is no longer supported, the tool can now be brought back, so build it unconditionally. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v5->v6: * don't change order in Makefile v4->v5: * make hptool always build v3->v4: * no changes v2->v3: * no changes v1->v2: * switch to configure from legacy config --- tools/libs/guest/Makefile.common | 2 +- tools/misc/Makefile | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/libs/guest/Makefile.common b/tools/libs/guest/Makefile.common index XXXXXXX..XXXXXXX 100644 --- a/tools/libs/guest/Makefile.common +++ b/tools/libs/guest/Makefile.common @@ -XXX,XX +XXX,XX @@ OBJS-y += xg_private.o OBJS-y += xg_domain.o OBJS-y += xg_suspend.o OBJS-y += xg_resume.o +OBJS-y += xg_offline_page.o ifeq ($(CONFIG_MIGRATE),y) OBJS-y += xg_sr_common.o OBJS-$(CONFIG_X86) += xg_sr_common_x86.o @@ -XXX,XX +XXX,XX @@ OBJS-$(CONFIG_X86) += xg_sr_save_x86_pv.o OBJS-$(CONFIG_X86) += xg_sr_save_x86_hvm.o OBJS-y += xg_sr_restore.o OBJS-y += xg_sr_save.o -OBJS-y += xg_offline_page.o else OBJS-y += xg_nomigrate.o endif diff --git a/tools/misc/Makefile b/tools/misc/Makefile index XXXXXXX..XXXXXXX 100644 --- a/tools/misc/Makefile +++ b/tools/misc/Makefile @@ -XXX,XX +XXX,XX @@ INSTALL_BIN += xencov_split INSTALL_BIN += $(INSTALL_BIN-y) # Everything to be installed in regular sbin/ -INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool +INSTALL_SBIN += xen-hptool INSTALL_SBIN-$(CONFIG_X86) += xen-hvmcrash INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx INSTALL_SBIN-$(CONFIG_X86) += xen-lowmemd -- 2.51.2
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v5->v6: * no changes v4->v5: * s/supported/implemented/ * update SUPPORT.md v3->v4: * update configuration section v2->v3: * patch introduced --- SUPPORT.md | 1 + docs/misc/cpu-hotplug.txt | 50 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 51 insertions(+) create mode 100644 docs/misc/cpu-hotplug.txt diff --git a/SUPPORT.md b/SUPPORT.md index XXXXXXX..XXXXXXX 100644 --- a/SUPPORT.md +++ b/SUPPORT.md @@ -XXX,XX +XXX,XX @@ For the Cortex A77 r0p0 - r1p0, see Errata 1508412. ### ACPI CPU Hotplug Status, x86: Experimental + Status, Arm64: Experimental ### Physical Memory diff --git a/docs/misc/cpu-hotplug.txt b/docs/misc/cpu-hotplug.txt new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/docs/misc/cpu-hotplug.txt @@ -XXX,XX +XXX,XX @@ +CPU Hotplug +=========== + +CPU hotplug is a feature that allows pCPU cores to be added to or removed from a +running system without requiring a reboot. It is implemented on x86 and Arm64 +architectures. + +Implementation Details +---------------------- + +CPU hotplug is implemented through the `XEN_SYSCTL_CPU_HOTPLUG_*` sysctl calls. +The specific calls are: + +- `XEN_SYSCTL_CPU_HOTPLUG_ONLINE`: Brings a pCPU online +- `XEN_SYSCTL_CPU_HOTPLUG_OFFLINE`: Takes a pCPU offline +- `XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE`: Enables SMT threads (x86 only) +- `XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE`: Disables SMT threads (x86 only) + +All cores can be disabled, assuming hardware support, except for the boot core. +Sysctl calls are routed to the boot core before doing any actual up/down +operations on other cores. + +Configuration +------------- + +The presence of the feature is controlled by CONFIG_CPU_HOTPLUG option. It is +enabled by default on x86 architecture. On Arm64, the option is enabled by +default when ITS, FFA, and TEE configs are disabled. +xen-hptool userspace tool is built unconditionally. 
+ +Usage +----- + +Disable core: + +$ xen-hptool cpu-offline 2 +Prepare to offline CPU 2 +(XEN) Removing cpu 2 from runqueue 0 +CPU 2 offlined successfully + +Enable core: + +$ xen-hptool cpu-online 2 +Prepare to online CPU 2 +(XEN) Bringing up CPU2 +(XEN) GICv3: CPU2: Found redistributor in region 0 @00000a004005c000 +(XEN) CPU2: Guest atomics will try 1 times before pausing the domain +(XEN) CPU 2 booted. +(XEN) Adding cpu 2 to runqueue 0 +CPU 2 onlined successfully -- 2.51.2