This series implements support for CPU hotplug/unplug on Arm. To achieve this,
several things need to be done:

1. XEN_SYSCTL_CPU_HOTPLUG_* calls implemented.
2. timer and GIC maintenance interrupts switched to static irqactions to remove
   the need for freeing them during release_irq.
3. Enabled the build of xen-hptool on Arm.
4. Migration of irqs from dying CPUs implemented.

Tested on QEMU.

Note: As there are currently no Xen-used IRQs on non-zero CPUs, I used a hack
that changed the default affinity of IRQs in setup_irq to properly test IRQ
migration. The hack consisted of changing the smp_processor_id call to a
hard-coded non-zero number.

v3->v4:
* add irq migration patches
* see individual patches

v2->v3:
* add docs

v1->v2:
* see individual patches

Mykyta Poturai (5):
  arm/time: Use static irqaction
  arm/gic: Use static irqaction
  arm/irq: Keep track of irq affinities
  arm/irq: Migrate IRQs from dyings CPUs
  smp: Move cpu_up/down helpers to common code
  arm/sysctl: Implement cpu hotplug ops
  tools: Allow building xen-hptool without CONFIG_MIGRATE
  docs: Document CPU hotplug

 config/Tools.mk.in               |  1 +
 docs/misc/cpu-hotplug.txt        | 51 ++++++++++++++++++++++++++++++++
 tools/configure                  | 30 +++++++++++++++++++
 tools/configure.ac               |  1 +
 tools/libs/guest/Makefile.common |  4 +++
 tools/misc/Makefile              |  2 +-
 xen/arch/arm/Kconfig             |  4 +++
 xen/arch/arm/gic.c               | 11 +++++--
 xen/arch/arm/include/asm/irq.h   |  2 ++
 xen/arch/arm/irq.c               | 42 ++++++++++++++++++++++++++
 xen/arch/arm/smp.c               |  6 ++++
 xen/arch/arm/smpboot.c           |  2 ++
 xen/arch/arm/sysctl.c            | 32 ++++++++++++++++++++
 xen/arch/arm/time.c              | 21 ++++++++++---
 xen/arch/ppc/stubs.c             |  4 +++
 xen/arch/riscv/stubs.c           |  5 ++++
 xen/arch/x86/include/asm/smp.h   |  3 --
 xen/arch/x86/smp.c               | 33 ++-------------------
 xen/common/smp.c                 | 32 ++++++++++++++++++++
 xen/include/xen/smp.h            |  4 +++
 20 files changed, 250 insertions(+), 40 deletions(-)
 create mode 100644 docs/misc/cpu-hotplug.txt
 mode change 100755 => 100644 tools/configure

--
2.51.2

When stopping a core deinit_timer_interrupt is called in non-alloc context,
which causes xfree in release_irq to fail an assert. To fix this, switch to a
statically allocated irqaction that does not need to be freed in release_irq.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Reviewed-by: Mykola Kvach <mykola_kvach@epam.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>

v3->v4:
* make irqactions static
* collect RBs

v2->v3:
* no changes

v1->v2:
* use percpu actions
---
 xen/arch/arm/time.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/time.c
+++ b/xen/arch/arm/time.c
@@ -XXX,XX +XXX,XX @@ static void check_timer_irq_cfg(unsigned int irq, const char *which)
            "WARNING: %s-timer IRQ%u is not level triggered.\n", which, irq);
 }
 
+static DEFINE_PER_CPU_READ_MOSTLY(struct irqaction, irq_hyp);
+static DEFINE_PER_CPU_READ_MOSTLY(struct irqaction, irq_virt);
+
 /* Set up the timer interrupt on this CPU */
 void init_timer_interrupt(void)
 {
+    struct irqaction *hyp_action = &this_cpu(irq_hyp);
+    struct irqaction *virt_action = &this_cpu(irq_virt);
+
     /* Sensible defaults */
     WRITE_SYSREG64(0, CNTVOFF_EL2);     /* No VM-specific offset */
     /* Do not let the VMs program the physical timer, only read the physical counter */
@@ -XXX,XX +XXX,XX @@ void init_timer_interrupt(void)
     WRITE_SYSREG(0, CNTHP_CTL_EL2);     /* Hypervisor's timer disabled */
     isb();
 
-    request_irq(timer_irq[TIMER_HYP_PPI], 0, htimer_interrupt,
-                "hyptimer", NULL);
-    request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt,
-                "virtimer", NULL);
+    hyp_action->name = "hyptimer";
+    hyp_action->handler = htimer_interrupt;
+    hyp_action->dev_id = NULL;
+    hyp_action->free_on_release = 0;
+    setup_irq(timer_irq[TIMER_HYP_PPI], 0, hyp_action);
+
+    virt_action->name = "virtimer";
+    virt_action->handler = vtimer_interrupt;
+    virt_action->dev_id = NULL;
+    virt_action->free_on_release = 0;
+    setup_irq(timer_irq[TIMER_VIRT_PPI], 0, virt_action);
 
     check_timer_irq_cfg(timer_irq[TIMER_HYP_PPI], "hypervisor");
     check_timer_irq_cfg(timer_irq[TIMER_VIRT_PPI], "virtual");
--
2.51.2

When stopping a core cpu_gic_callback is called in non-alloc context,
which causes xfree in release_irq to fail an assert. To fix this, switch to a
statically allocated irqaction that does not need to be freed in release_irq.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Reviewed-by: Mykola Kvach <mykola_kvach@epam.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>

v3->v4:
* make irqactions static
* collect RBs

v2->v3:
* no changes

v1->v2:
* use percpu actions
---
 xen/arch/arm/gic.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -XXX,XX +XXX,XX @@ void gic_dump_info(struct vcpu *v)
     gic_hw_ops->dump_state(v);
 }
 
+static DEFINE_PER_CPU_READ_MOSTLY(struct irqaction, irq_maintenance);
+
 void init_maintenance_interrupt(void)
 {
-    request_irq(gic_hw_ops->info->maintenance_irq, 0, maintenance_interrupt,
-                "irq-maintenance", NULL);
+    struct irqaction *maintenance = &this_cpu(irq_maintenance);
+
+    maintenance->name = "irq-maintenance";
+    maintenance->handler = maintenance_interrupt;
+    maintenance->dev_id = NULL;
+    maintenance->free_on_release = 0;
+    setup_irq(gic_hw_ops->info->maintenance_irq, 0, maintenance);
 }
 
 int gic_make_hwdom_dt_node(const struct domain *d,
--
2.51.2

Currently on Arm the desc->affinity mask of an irq is never updated, which
makes it hard to know the actual affinity of an interrupt.

Fix this by updating the field in irq_set_affinity.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>

v3->v4:
* patch introduced
---
 xen/arch/arm/irq.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -XXX,XX +XXX,XX @@ static inline struct domain *irq_get_domain(struct irq_desc *desc)
 void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask)
 {
     if ( desc != NULL )
+    {
+        cpumask_copy(desc->affinity, mask);
         desc->handler->set_affinity(desc, mask);
+    }
 }
 
 int request_irq(unsigned int irq, unsigned int irqflags,
--
2.51.2

Move IRQs from a dying CPU to the online ones.
Guest-bound IRQs are already handled by the scheduler in the process of
moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
itself.

If an IRQ is to be migrated, its affinity is set to a mask of all online
CPUs. With the current GIC implementation, this means it is routed to a
random online CPU. This may cause extra moves if multiple cores are disabled
in sequence, but should prevent all interrupts from piling up on CPU0 in
case of repeated up-down cycles on different cores.

IRQs from CPU 0 are never migrated, as dying CPU 0 means we are either
shutting down completely or entering system suspend.

Considering that all Xen-used IRQs are currently allocated during init on
CPU 0, and setup_irq uses smp_processor_id for the initial affinity, this
change is not strictly required for correct operation for now, but it should
future-proof CPU hotplug and system suspend support in case some kind of IRQ
balancing is implemented later.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>

v3->v4:
* patch introduced
---
 xen/arch/arm/include/asm/irq.h |  2 ++
 xen/arch/arm/irq.c             | 39 ++++++++++++++++++++++++++++++++++
 xen/arch/arm/smpboot.c         |  2 ++
 3 files changed, 43 insertions(+)

diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/include/asm/irq.h
+++ b/xen/arch/arm/include/asm/irq.h
@@ -XXX,XX +XXX,XX @@ bool irq_type_set_by_domain(const struct domain *d);
 void irq_end_none(struct irq_desc *irq);
 #define irq_end_none irq_end_none
 
+void evacuate_irqs(unsigned int from);
+
 #endif /* _ASM_HW_IRQ_H */
 /*
  * Local variables:
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -XXX,XX +XXX,XX @@ static int init_local_irq_data(unsigned int cpu)
     return 0;
 }
 
+static void evacuate_irq(int irq, unsigned int from)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    unsigned long flags;
+
+    /* Don't move irqs from CPU 0 as it is always last to be disabled */
+    if ( from == 0 )
+        return;
+
+    ASSERT(!cpumask_empty(&cpu_online_map));
+    ASSERT(!cpumask_test_cpu(from, &cpu_online_map));
+
+    spin_lock_irqsave(&desc->lock, flags);
+    if ( likely(!desc->action) )
+        goto out;
+
+    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
+                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
+        goto out;
+
+    if ( cpumask_test_cpu(from, desc->affinity) )
+        irq_set_affinity(desc, &cpu_online_map);
+
+out:
+    spin_unlock_irqrestore(&desc->lock, flags);
+    return;
+}
+
+void evacuate_irqs(unsigned int from)
+{
+    int irq;
+
+    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
+        evacuate_irq(irq, from);
+
+    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
+        evacuate_irq(irq, from);
+}
+
 static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                         void *hcpu)
 {
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -XXX,XX +XXX,XX @@ void __cpu_disable(void)
 
     smp_mb();
 
+    evacuate_irqs(cpu);
+
     /* Return to caller; eventually the IPI mechanism will unwind and the
      * scheduler will drop to the idle loop, which will call stop_cpu(). */
 }
--
2.51.2

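The evacuation rule in the patch above can be modelled outside Xen with a
small user-space sketch. This is only an illustration under stated
assumptions: model_irq, model_evacuate_irqs and the 32-bit mask standing in
for cpumask_t are hypothetical names, not Xen APIs, and the locking and the
_IRQ_MOVE_PENDING check are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of an IRQ descriptor; not a Xen type. */
struct model_irq {
    bool has_action;    /* models desc->action != NULL */
    bool guest;         /* models the _IRQ_GUEST status bit */
    uint32_t affinity;  /* bit n set => CPU n is an allowed target */
};

/*
 * Re-target every Xen-owned IRQ whose affinity includes the dying CPU
 * 'from' to the set of online CPUs. Returns how many IRQs were changed.
 * Assumes from < 32 because the mask is a plain uint32_t.
 */
static unsigned int model_evacuate_irqs(struct model_irq *irqs, unsigned int n,
                                        unsigned int from, uint32_t online_mask)
{
    unsigned int moved = 0;

    /* As in the patch, CPU 0 is treated as the last CPU to go down. */
    if ( from == 0 )
        return 0;

    /* The dying CPU must already have left the online mask. */
    assert(online_mask != 0);
    assert(!(online_mask & (1u << from)));

    for ( unsigned int i = 0; i < n; i++ )
    {
        /* Skip unused IRQs and guest IRQs (the scheduler handles those). */
        if ( !irqs[i].has_action || irqs[i].guest )
            continue;

        if ( irqs[i].affinity & (1u << from) )
        {
            irqs[i].affinity = online_mask;
            moved++;
        }
    }

    return moved;
}
```

In this model, with CPUs 0 and 1 online and CPU 2 going down, a Xen-owned IRQ
bound to CPU 2 ends up with mask 0x3, while guest IRQs and IRQs bound to other
CPUs are left alone.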
This will reduce code duplication for the upcoming patch adding CPU hotplug
support on Arm64.

The SMT-disable enforcement check is moved into a separate
architecture-specific function.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>

v3->v4:
* patch introduced
---
 xen/arch/arm/smp.c             |  6 ++++++
 xen/arch/ppc/stubs.c           |  4 ++++
 xen/arch/riscv/stubs.c         |  5 +++++
 xen/arch/x86/include/asm/smp.h |  3 ---
 xen/arch/x86/smp.c             | 33 +++------------------------------
 xen/common/smp.c               | 32 ++++++++++++++++++++++++++++++++
 xen/include/xen/smp.h          |  4 ++++
 7 files changed, 54 insertions(+), 33 deletions(-)

diff --git a/xen/arch/arm/smp.c b/xen/arch/arm/smp.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/smp.c
+++ b/xen/arch/arm/smp.c
@@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask)
     }
 }
 
+/* ARM don't have SMT so we don't need any special logic for CPU disabling */
+bool arch_smt_cpu_disable(unsigned int cpu)
+{
+    return false;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/ppc/stubs.c
+++ b/xen/arch/ppc/stubs.c
@@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask)
     BUG_ON("unimplemented");
 }
 
+bool arch_smt_cpu_disable(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
 /* irq.c */
 
 void irq_ack_none(struct irq_desc *desc)
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask)
     BUG_ON("unimplemented");
 }
 
+bool arch_smt_cpu_disable(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
 /* irq.c */
 
 void irq_ack_none(struct irq_desc *desc)
diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/include/asm/smp.h
+++ b/xen/arch/x86/include/asm/smp.h
@@ -XXX,XX +XXX,XX @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm);
 
 void __stop_this_cpu(void);
 
-long cf_check cpu_up_helper(void *data);
-long cf_check cpu_down_helper(void *data);
-
 long cf_check core_parking_helper(void *data);
 bool core_parking_remove(unsigned int cpu);
 uint32_t get_cur_idle_nums(void);
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -XXX,XX +XXX,XX @@ void cf_check call_function_interrupt(void)
     smp_call_function_interrupt();
 }
 
-long cf_check cpu_up_helper(void *data)
+bool arch_smt_cpu_disable(unsigned int cpu)
 {
-    unsigned int cpu = (unsigned long)data;
-    int ret = cpu_up(cpu);
-
-    /* Have one more go on EBUSY. */
-    if ( ret == -EBUSY )
-        ret = cpu_up(cpu);
-
-    if ( !ret && !opt_smt &&
-         cpu_data[cpu].compute_unit_id == INVALID_CUID &&
-         cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1 )
-    {
-        ret = cpu_down_helper(data);
-        if ( ret )
-            printk("Could not re-offline CPU%u (%d)\n", cpu, ret);
-        else
-            ret = -EPERM;
-    }
-
-    return ret;
-}
-
-long cf_check cpu_down_helper(void *data)
-{
-    int cpu = (unsigned long)data;
-    int ret = cpu_down(cpu);
-    /* Have one more go on EBUSY. */
-    if ( ret == -EBUSY )
-        ret = cpu_down(cpu);
-    return ret;
+    return !opt_smt && cpu_data[cpu].compute_unit_id == INVALID_CUID &&
+           cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1;
 }
diff --git a/xen/common/smp.c b/xen/common/smp.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/common/smp.c
+++ b/xen/common/smp.c
@@ -XXX,XX +XXX,XX @@
  * GNU General Public License for more details.
  */
 
+#include <xen/cpu.h>
 #include <asm/hardirq.h>
 #include <asm/processor.h>
 #include <xen/spinlock.h>
@@ -XXX,XX +XXX,XX @@ void smp_call_function_interrupt(void)
     irq_exit();
 }
 
+long cf_check cpu_up_helper(void *data)
+{
+    unsigned int cpu = (unsigned long)data;
+    int ret = cpu_up(cpu);
+
+    /* Have one more go on EBUSY. */
+    if ( ret == -EBUSY )
+        ret = cpu_up(cpu);
+
+    if ( !ret && arch_smt_cpu_disable(cpu) )
+    {
+        ret = cpu_down_helper(data);
+        if ( ret )
+            printk("Could not re-offline CPU%u (%d)\n", cpu, ret);
+        else
+            ret = -EPERM;
+    }
+
+    return ret;
+}
+
+long cf_check cpu_down_helper(void *data)
+{
+    int cpu = (unsigned long)data;
+    int ret = cpu_down(cpu);
+    /* Have one more go on EBUSY. */
+    if ( ret == -EBUSY )
+        ret = cpu_down(cpu);
+    return ret;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/include/xen/smp.h
+++ b/xen/include/xen/smp.h
@@ -XXX,XX +XXX,XX @@ extern void *stack_base[NR_CPUS];
 void initialize_cpu_data(unsigned int cpu);
 int setup_cpu_root_pgt(unsigned int cpu);
 
+bool arch_smt_cpu_disable(unsigned int cpu);
+long cf_check cpu_up_helper(void *data);
+long cf_check cpu_down_helper(void *data);
+
 #endif /* __XEN_SMP_H__ */
--
2.51.2

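The control flow of the common helpers in the patch above (retry once on
EBUSY, then re-offline the CPU if the architecture policy forbids it) can be
tried out in isolation with the following user-space sketch. All model_*
names are hypothetical stand-ins rather than Xen functions; in particular
the function pointers replace cpu_up(), cpu_down() and
arch_smt_cpu_disable(), and the printk on a failed re-offline is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for cpu_up()/cpu_down(); not Xen functions. */
typedef int (*model_cpu_op)(unsigned int cpu);
/* Hypothetical stand-in for arch_smt_cpu_disable(). */
typedef bool (*model_smt_check)(unsigned int cpu);

/* Run 'op' and give it exactly one more go if it reports -EBUSY. */
static int model_retry_once(model_cpu_op op, unsigned int cpu)
{
    int ret = op(cpu);

    if ( ret == -EBUSY )
        ret = op(cpu);

    return ret;
}

/*
 * Bring a CPU up; if that worked but the architecture says the CPU must
 * stay offline, take it down again and report -EPERM to the caller.
 */
static int model_cpu_up_helper(model_cpu_op up, model_cpu_op down,
                               model_smt_check smt_disable, unsigned int cpu)
{
    int ret = model_retry_once(up, cpu);

    if ( !ret && smt_disable(cpu) )
    {
        ret = model_retry_once(down, cpu);
        if ( !ret )
            ret = -EPERM;
    }

    return ret;
}

/* Stubs used only to exercise the model. */
static unsigned int model_up_calls;

static int model_busy_once(unsigned int cpu)
{
    (void)cpu;
    /* The first call fails with -EBUSY, later calls succeed. */
    return ++model_up_calls == 1 ? -EBUSY : 0;
}

static int model_ok(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

static bool model_smt_allowed(unsigned int cpu)
{
    (void)cpu;
    return false;   /* mirrors the Arm stub, which always returns false */
}

static bool model_smt_forbidden(unsigned int cpu)
{
    (void)cpu;
    return true;
}
```

Passing model_busy_once as the "up" operation shows the single retry, and
passing model_smt_forbidden shows the -EPERM path that only x86 can take in
the patch, since the Arm implementation of the check always returns false.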
Implement XEN_SYSCTL_CPU_HOTPLUG_{ONLINE,OFFLINE} calls to allow for
enabling/disabling CPU cores at runtime.

For now, these operations are only supported on Arm64. For proper Arm32
support, there needs to be a mechanism to free per-cpu page tables, allocated
in init_domheap_mappings. Also, hotplug is not supported if ITS, FFA, or TEE
is enabled, as they use non-static IRQ actions.

Create a Kconfig option RUNTIME_CPU_CONTROL that reflects these constraints.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>

v3->v4:
* don't reimplement cpu_up/down helpers
* add Kconfig option
* fixup formatting

v2->v3:
* no changes

v1->v2:
* remove SMT ops
* remove cpu == 0 checks
* add XSM hooks
* only implement for 64bit Arm
---
 xen/arch/arm/Kconfig  |  4 ++++
 xen/arch/arm/sysctl.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -XXX,XX +XXX,XX @@ config PCI_PASSTHROUGH
 	help
 	  This option enables PCI device passthrough
 
+config RUNTIME_CPU_CONTROL
+	def_bool y
+	depends on ARM_64 && !TEE && !FFA && !HAS_ITS
+
 endmenu
 
 menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/sysctl.c b/xen/arch/arm/sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/sysctl.c
+++ b/xen/arch/arm/sysctl.c
@@ -XXX,XX +XXX,XX @@
 #include <xen/dt-overlay.h>
 #include <xen/errno.h>
 #include <xen/hypercall.h>
+#include <xsm/xsm.h>
 #include <asm/arm64/sve.h>
 #include <public/sysctl.h>
 
@@ -XXX,XX +XXX,XX @@ void arch_do_physinfo(struct xen_sysctl_physinfo *pi)
                                        XEN_SYSCTL_PHYSCAP_ARM_SVE_MASK);
 }
 
+static long cpu_hotplug_sysctl(struct xen_sysctl_cpu_hotplug *hotplug)
+{
+#ifdef CONFIG_RUNTIME_CPU_CONTROL
+    int ret;
+
+    switch ( hotplug->op )
+    {
+    case XEN_SYSCTL_CPU_HOTPLUG_ONLINE:
+        ret = xsm_resource_plug_core(XSM_HOOK);
+        if ( ret )
+            return ret;
+        return continue_hypercall_on_cpu(0, cpu_up_helper, _p(hotplug->cpu));
+
+    case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE:
+        ret = xsm_resource_unplug_core(XSM_HOOK);
+        if ( ret )
+            return ret;
+        return continue_hypercall_on_cpu(0, cpu_down_helper, _p(hotplug->cpu));
+
+    default:
+        return -EOPNOTSUPP;
+    }
+#else
+    return -EOPNOTSUPP;
+#endif
+}
+
 long arch_do_sysctl(struct xen_sysctl *sysctl,
                     XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 {
@@ -XXX,XX +XXX,XX @@ long arch_do_sysctl(struct xen_sysctl *sysctl,
         ret = dt_overlay_sysctl(&sysctl->u.dt_overlay);
         break;
 
+    case XEN_SYSCTL_cpu_hotplug:
+        ret = cpu_hotplug_sysctl(&sysctl->u.cpu_hotplug);
+        break;
+
     default:
         ret = -ENOSYS;
         break;
--
2.51.2

With CPU hotplug sysctls implemented on Arm it becomes useful to have a tool
for calling them.

Introduce a new configure option "hptool" to allow building hptool separately
from other migration tools, and enable it by default.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>

v3->v4:
* no changes

v2->v3:
* no changes

v1->v2:
* switch to configure from legacy config
---
 config/Tools.mk.in               |  1 +
 tools/configure                  | 30 ++++++++++++++++++++++++++++++
 tools/configure.ac               |  1 +
 tools/libs/guest/Makefile.common |  4 ++++
 tools/misc/Makefile              |  2 +-
 5 files changed, 37 insertions(+), 1 deletion(-)
 mode change 100755 => 100644 tools/configure

diff --git a/config/Tools.mk.in b/config/Tools.mk.in
index XXXXXXX..XXXXXXX 100644
--- a/config/Tools.mk.in
+++ b/config/Tools.mk.in
@@ -XXX,XX +XXX,XX @@ CONFIG_LIBNL := @libnl@
 CONFIG_GOLANG := @golang@
 CONFIG_PYGRUB := @pygrub@
 CONFIG_LIBFSIMAGE := @libfsimage@
+CONFIG_HPTOOL := @hptool@
 
 CONFIG_SYSTEMD := @systemd@
 XEN_SYSTEMD_DIR := @SYSTEMD_DIR@
diff --git a/tools/configure b/tools/configure
old mode 100755
new mode 100644
index XXXXXXX..XXXXXXX
--- a/tools/configure
+++ b/tools/configure
@@ -XXX,XX +XXX,XX @@ LD86
 AS86
 ipxe
 LINUX_BACKEND_MODULES
+hptool
 pygrub
 golang
 seabios
@@ -XXX,XX +XXX,XX @@ enable_ovmf
 enable_seabios
 enable_golang
 enable_pygrub
+enable_hptool
 with_linux_backend_modules
 enable_ipxe
 with_system_ipxe
@@ -XXX,XX +XXX,XX @@ Optional Features:
   --disable-seabios       Disable SeaBIOS (default is ENABLED)
   --disable-golang        Disable Go tools (default is ENABLED)
   --disable-pygrub        Disable pygrub (default is ENABLED)
+  --disable-hptool        Disable hptool (default is ENABLED)
   --enable-ipxe           Enable in-tree IPXE, (DEFAULT is off, see also
                           --with-system-ipxe)
   --enable-rombios        Enable ROMBIOS, (DEFAULT is on if ipxe is enabled,
@@ -XXX,XX +XXX,XX @@ pygrub=$ax_cv_pygrub
 
 
 
+# Check whether --enable-hptool was given.
+if test ${enable_hptool+y}
+then :
+  enableval=$enable_hptool;
+fi
+
+
+if test "x$enable_hptool" = "xno"
+then :
+
+    ax_cv_hptool="n"
+
+elif test "x$enable_hptool" = "xyes"
+then :
+
+    ax_cv_hptool="y"
+
+elif test -z $ax_cv_hptool
+then :
+
+    ax_cv_hptool="y"
+
+fi
+hptool=$ax_cv_hptool
+
+
+
 # Check whether --with-linux-backend-modules was given.
 if test ${with_linux_backend_modules+y}
diff --git a/tools/configure.ac b/tools/configure.ac
index XXXXXXX..XXXXXXX 100644
--- a/tools/configure.ac
+++ b/tools/configure.ac
@@ -XXX,XX +XXX,XX @@ AX_ARG_DEFAULT_DISABLE([ovmf], [Enable OVMF])
 AX_ARG_DEFAULT_ENABLE([seabios], [Disable SeaBIOS])
 AX_ARG_DEFAULT_ENABLE([golang], [Disable Go tools])
 AX_ARG_DEFAULT_ENABLE([pygrub], [Disable pygrub])
+AX_ARG_DEFAULT_ENABLE([hptool], [Disable hptool])
 
 AC_ARG_WITH([linux-backend-modules],
     AS_HELP_STRING([--with-linux-backend-modules="mod1 mod2"],
diff --git a/tools/libs/guest/Makefile.common b/tools/libs/guest/Makefile.common
index XXXXXXX..XXXXXXX 100644
--- a/tools/libs/guest/Makefile.common
+++ b/tools/libs/guest/Makefile.common
@@ -XXX,XX +XXX,XX @@ OBJS-y += xg_core.o
 OBJS-$(CONFIG_X86) += xg_core_x86.o
 OBJS-$(CONFIG_ARM) += xg_core_arm.o
 
+ifneq (,$(filter y,$(CONFIG_MIGRATE)$(CONFIG_HPTOOL)))
+OBJS-y += xg_offline_page.o
+endif
+
 vpath %.c ../../../xen/common/libelf
 
 LIBELF_OBJS += libelf-tools.o libelf-loader.o
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index XXXXXXX..XXXXXXX 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -XXX,XX +XXX,XX @@ INSTALL_BIN += xencov_split
 INSTALL_BIN += $(INSTALL_BIN-y)
 
 # Everything to be installed in regular sbin/
-INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool
+INSTALL_SBIN-$(CONFIG_HPTOOL) += xen-hptool
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmcrash
 INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx
 INSTALL_SBIN-$(CONFIG_X86) += xen-lowmemd
--
2.51.2

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>

v3->v4:
* update configuration section

v2->v3:
* patch introduced
---
 docs/misc/cpu-hotplug.txt | 51 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 docs/misc/cpu-hotplug.txt

diff --git a/docs/misc/cpu-hotplug.txt b/docs/misc/cpu-hotplug.txt
new file mode 100644
index XXXXXXX..XXXXXXX
--- /dev/null
+++ b/docs/misc/cpu-hotplug.txt
@@ -XXX,XX +XXX,XX @@
+CPU Hotplug
+===========
+
+CPU hotplug is a feature that allows pCPU cores to be added to or removed from a
+running system without requiring a reboot. It is supported on x86 and Arm64
+architectures.
+
+Implementation Details
+----------------------
+
+CPU hotplug is implemented through the `XEN_SYSCTL_CPU_HOTPLUG_*` sysctl calls.
+The specific calls are:
+
+- `XEN_SYSCTL_CPU_HOTPLUG_ONLINE`: Brings a pCPU online
+- `XEN_SYSCTL_CPU_HOTPLUG_OFFLINE`: Takes a pCPU offline
+- `XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE`: Enables SMT threads (x86 only)
+- `XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE`: Disables SMT threads (x86 only)
+
+All cores can be disabled, assuming hardware support, except for core 0. Sysctl
+calls are routed to core 0 before doing any actual up/down operations on other
+cores.
+
+Configuration
+-------------
+
+Sysctl handlers are enabled unconditionally on x86 architecture. On Arm64,
+handlers are enabled by default when ITS, FFA, and TEE configs are disabled.
+Building of the userspace tool "hptool" is controlled by the "hptool" flag in
+the configure script. It is enabled by default and can be disabled with
+--disable-hptool command line option.
+
+Usage
+-----
+
+Disable core:
+
+$ xen-hptool cpu-offline 2
+Prepare to offline CPU 2
+(XEN) Removing cpu 2 from runqueue 0
+CPU 2 offlined successfully
+
+Enable core:
+
+$ xen-hptool cpu-online 2
+Prepare to online CPU 2
+(XEN) Bringing up CPU2
+(XEN) GICv3: CPU2: Found redistributor in region 0 @00000a004005c000
+(XEN) CPU2: Guest atomics will try 1 times before pausing the domain
+(XEN) CPU 2 booted.
+(XEN) Adding cpu 2 to runqueue 0
+CPU 2 onlined successfully
--
2.51.2

This series implements support for CPU hotplug/unplug on Arm. To achieve this,
several things need to be done:

1. XEN_SYSCTL_CPU_HOTPLUG_* calls implemented on Arm64.
2. Enabled building of xen-hptool.
3. Migration of irqs from dying CPUs implemented.

Tested on QEMU.

v4->v5:
* drop merged patches
* combine "smp: Move cpu_up/down helpers to common code" with
  "arm/sysctl: Implement cpu hotplug ops"
* see individual patches

v3->v4:
* add irq migration patches
* see individual patches

v2->v3:
* add docs

v1->v2:
* see individual patches

Mykyta Poturai (5):
  arm/irq: Keep track of irq affinities
  arm/irq: Migrate IRQs during CPU up/down operations
  arm/sysctl: Implement cpu hotplug ops
  tools: Allow building xen-hptool without CONFIG_MIGRATE
  docs: Document CPU hotplug

 SUPPORT.md                       |  1 +
 docs/misc/cpu-hotplug.txt        | 50 +++++++++++++++++++++++++
 tools/libs/guest/Makefile.common |  2 +-
 tools/misc/Makefile              |  2 +-
 xen/arch/arm/Kconfig             |  1 +
 xen/arch/arm/gic-vgic.c          |  2 +
 xen/arch/arm/include/asm/irq.h   |  2 +
 xen/arch/arm/irq.c               | 63 +++++++++++++++++++++++++++++++-
 xen/arch/arm/smp.c               |  9 +++++
 xen/arch/arm/smpboot.c           |  6 +++
 xen/arch/arm/vgic.c              | 14 ++++++-
 xen/arch/ppc/stubs.c             |  4 ++
 xen/arch/riscv/stubs.c           |  5 +++
 xen/arch/x86/Kconfig             |  1 +
 xen/arch/x86/include/asm/smp.h   |  3 --
 xen/arch/x86/smp.c               | 33 ++---------------
 xen/arch/x86/sysctl.c            | 12 ++----
 xen/common/Kconfig               |  3 ++
 xen/common/smp.c                 | 34 +++++++++++++++++
 xen/common/sysctl.c              | 45 +++++++++++++++++++++++
 xen/include/xen/smp.h            |  4 ++
 21 files changed, 248 insertions(+), 48 deletions(-)
 create mode 100644 docs/misc/cpu-hotplug.txt

--
2.51.2

Currently on Arm the desc->affinity mask of an irq is never updated, which
makes it hard to know the actual affinity of an interrupt.

Fix this by updating the field in irq_set_affinity.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
v4->v5:
* add locking

v3->v4:
* patch introduced
---
 xen/arch/arm/gic-vgic.c |  2 ++
 xen/arch/arm/irq.c      |  9 +++++++--
 xen/arch/arm/vgic.c     | 14 ++++++++++++--
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -XXX,XX +XXX,XX @@ static void gic_update_one_lr(struct vcpu *v, int i)
             if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
             {
                 struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
+                spin_lock(&p->desc->lock);
                 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
+                spin_unlock(&p->desc->lock);
                 clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
             }
         }
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -XXX,XX +XXX,XX @@ static inline struct domain *irq_get_domain(struct irq_desc *desc)
     return irq_get_guest_info(desc)->d;
 }
 
+/* Must be called with desc->lock held */
 void irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask)
 {
-    if ( desc != NULL )
-        desc->handler->set_affinity(desc, mask);
+    if ( desc == NULL )
+        return;
+
+    ASSERT(spin_is_locked(&desc->lock));
+    cpumask_copy(desc->affinity, mask);
+    desc->handler->set_affinity(desc, mask);
 }
 
 int request_irq(unsigned int irq, unsigned int irqflags,
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -XXX,XX +XXX,XX @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
 
     if ( list_empty(&p->inflight) )
     {
+        spin_lock(&p->desc->lock);
         irq_set_affinity(p->desc, cpumask_of(new->processor));
+        spin_unlock(&p->desc->lock);
         spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
         return true;
     }
@@ -XXX,XX +XXX,XX @@ bool vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
     if ( !list_empty(&p->lr_queue) )
     {
         vgic_remove_irq_from_queues(old, p);
+        spin_lock(&p->desc->lock);
         irq_set_affinity(p->desc, cpumask_of(new->processor));
+        spin_unlock(&p->desc->lock);
         spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
         vgic_inject_irq(new->domain, new, irq, true);
         return true;
@@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v)
     struct domain *d = v->domain;
     struct pending_irq *p;
     struct vcpu *v_target;
+    unsigned long flags;
     int i;
 
     /*
@@ -XXX,XX +XXX,XX @@ void arch_move_irqs(struct vcpu *v)
         p = irq_to_pending(v_target, virq);
 
         if ( v_target == v && !test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
+        {
+            if ( !p->desc )
+                continue;
+            spin_lock_irqsave(&p->desc->lock, flags);
             irq_set_affinity(p->desc, cpu_mask);
+            spin_unlock_irqrestore(&p->desc->lock, flags);
+        }
     }
 }
 
@@ -XXX,XX +XXX,XX @@ void vgic_enable_irqs(struct vcpu *v, uint32_t r, unsigned int n)
         spin_unlock_irqrestore(&v_target->arch.vgic.lock, flags);
         if ( p->desc != NULL )
         {
-            irq_set_affinity(p->desc, cpumask_of(v_target->processor));
             spin_lock_irqsave(&p->desc->lock, flags);
+            irq_set_affinity(p->desc, cpumask_of(v_target->processor));
             /*
              * The irq cannot be a PPI, we only support delivery of SPIs
              * to guests.
@@ -XXX,XX +XXX,XX @@ void vgic_check_inflight_irqs_pending(struct vcpu *v, unsigned int rank, uint32_
  * indent-tabs-mode: nil
  * End:
  */
-
--
2.51.2

Move IRQs from a dying CPU to the online ones when a CPU is getting
offlined. When onlining, rebalance all IRQs in a round-robin fashion.
Guest-bound IRQs are already handled by the scheduler in the process of
moving vCPUs to active pCPUs, so we only need to handle IRQs used by Xen
itself.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
v4->v5:
* handle CPU onlining as well
* more comments
* fix crash when ESPI is disabled
* don't assume CPU 0 is a boot CPU
* use unsigned int for irq number
* remove assumption that all irqs are bound to CPU 0 by default from the
  commit message

v3->v4:
* patch introduced
---
 xen/arch/arm/include/asm/irq.h |  2 ++
 xen/arch/arm/irq.c             | 54 ++++++++++++++++++++++++++++++++++
 xen/arch/arm/smpboot.c         |  6 ++++
 3 files changed, 62 insertions(+)

diff --git a/xen/arch/arm/include/asm/irq.h b/xen/arch/arm/include/asm/irq.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/include/asm/irq.h
+++ b/xen/arch/arm/include/asm/irq.h
@@ -XXX,XX +XXX,XX @@ bool irq_type_set_by_domain(const struct domain *d);
 void irq_end_none(struct irq_desc *irq);
 #define irq_end_none irq_end_none
 
+void rebalance_irqs(unsigned int from, bool up);
+
 #endif /* _ASM_HW_IRQ_H */
 /*
  * Local variables:
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -XXX,XX +XXX,XX @@ static int init_local_irq_data(unsigned int cpu)
     return 0;
 }
 
+static int cpu_next;
+
+static void balance_irq(int irq, unsigned int from, bool up)
+{
+    struct irq_desc *desc = irq_to_desc(irq);
+    unsigned long flags;
+
+    ASSERT(!cpumask_empty(&cpu_online_map));
+
+    spin_lock_irqsave(&desc->lock, flags);
+    if ( likely(!desc->action) )
+        goto out;
+
+    if ( likely(test_bit(_IRQ_GUEST, &desc->status) ||
+                test_bit(_IRQ_MOVE_PENDING, &desc->status)) )
+        goto out;
+
+    /*
+     * Setting affinity to a mask of multiple CPUs causes the GIC drivers to
+     * select one CPU from that mask. If the dying CPU was included in the IRQ's
+     * affinity mask, we cannot determine exactly which CPU the interrupt is
+     * currently routed to, as GIC drivers lack a concrete get_affinity API. So
+     * to be safe we must reroute it to a new, definitely online, CPU. In the
+     * case of CPU going down, we move only the interrupt that could reside on
+     * it. Otherwise, we rearrange all interrupts in a round-robin fashion.
+     */
+    if ( !up && !cpumask_test_cpu(from, desc->affinity) )
+        goto out;
+
+    cpu_next = cpumask_cycle(cpu_next, &cpu_online_map);
+    irq_set_affinity(desc, cpumask_of(cpu_next));
+
+out:
+    spin_unlock_irqrestore(&desc->lock, flags);
+}
+
+void rebalance_irqs(unsigned int from, bool up)
+{
+    int irq;
+
+    if ( cpumask_empty(&cpu_online_map) )
+        return;
+
+    for ( irq = NR_LOCAL_IRQS; irq < NR_IRQS; irq++ )
+        balance_irq(irq, from, up);
+
+#ifdef CONFIG_GICV3_ESPI
+    for ( irq = ESPI_BASE_INTID; irq < ESPI_MAX_INTID; irq++ )
+        balance_irq(irq, from, up);
+#endif
+}
+
 static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                         void *hcpu)
 {
@@ -XXX,XX +XXX,XX @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
             printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
                    cpu);
         break;
+    case CPU_ONLINE:
+        rebalance_irqs(cpu, true);
     }
 
     return notifier_from_errno(rc);
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -XXX,XX +XXX,XX @@ void __cpu_disable(void)
 
     smp_mb();
 
+    /*
+     * Now that the interrupts are cleared and the CPU marked as offline,
+     * move interrupts out of it
+     */
+    rebalance_irqs(cpu, false);
+
     /* Return to caller; eventually the IPI mechanism will unwind and the
      * scheduler will drop to the idle loop, which will call stop_cpu(). */
 }
--
2.51.2

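The round-robin selection used by balance_irq() above can be illustrated with
a small user-space sketch. It is written under assumptions: model_mask_cycle
is a hypothetical stand-in that mimics what cpumask_cycle() is understood to
do (pick the next set bit after the current one, wrapping around), a uint32_t
replaces cpumask_t, and model_spread_irqs is an invented name for the loop
that hands out targets.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the next CPU set in 'mask' strictly after 'cur', wrapping around.
 * Hypothetical model of cpumask_cycle() for a 32-CPU system.
 */
static unsigned int model_mask_cycle(unsigned int cur, uint32_t mask)
{
    assert(mask != 0);

    for ( unsigned int step = 1; step <= 32; step++ )
    {
        unsigned int cpu = (cur + step) % 32;

        if ( mask & (1u << cpu) )
            return cpu;
    }

    return cur; /* not reached while mask != 0 */
}

/*
 * Assign each of 'n' IRQs to the next online CPU in turn. '*cursor' plays
 * the role of the static cpu_next variable in the patch and is carried
 * over between calls.
 */
static void model_spread_irqs(unsigned int *target, unsigned int n,
                              unsigned int *cursor, uint32_t online_mask)
{
    for ( unsigned int i = 0; i < n; i++ )
    {
        *cursor = model_mask_cycle(*cursor, online_mask);
        target[i] = *cursor;
    }
}
```

With CPUs 0, 1 and 3 online and the cursor starting at 0, five IRQs are
assigned to CPUs 1, 3, 0, 1, 3 in that order, so an offline CPU (2 here) is
never chosen and the load is spread across the online ones.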
Move XEN_SYSCTL_CPU_HOTPLUG_{ONLINE,OFFLINE} handlers to common code to
allow for enabling/disabling CPU cores at runtime on Arm64.

The SMT-disable enforcement check is moved into a separate
architecture-specific function.

For now, these operations are only supported on Arm64. For proper Arm32
support, there needs to be a mechanism to free per-cpu page tables, allocated
in init_domheap_mappings. Also, hotplug is not supported if ITS, FFA, or TEE
is enabled, as they use non-static IRQ actions.

Create a Kconfig option CPU_HOTPLUG that reflects these constraints. On X86
the option is enabled unconditionally.

Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
---
v4->v5:
* move handling to common code
* rename config to CPU_HOTPLUG
* merge with "smp: Move cpu_up/down helpers to common code"

v3->v4:
* don't reimplement cpu_up/down helpers
* add Kconfig option
* fixup formatting

v2->v3:
* no changes

v1->v2:
* remove SMT ops
* remove cpu == 0 checks
* add XSM hooks
* only implement for 64bit Arm
---
 xen/arch/arm/Kconfig           |  1 +
 xen/arch/arm/smp.c             |  9 +++++++
 xen/arch/ppc/stubs.c           |  4 +++
 xen/arch/riscv/stubs.c         |  5 ++++
 xen/arch/x86/Kconfig           |  1 +
 xen/arch/x86/include/asm/smp.h |  3 ---
 xen/arch/x86/smp.c             | 33 +++----------------------
 xen/arch/x86/sysctl.c          | 12 +++------
 xen/common/Kconfig             |  3 +++
 xen/common/smp.c               | 34 +++++++++++++++++++++++++
 xen/common/sysctl.c            | 45 ++++++++++++++++++++++++++++++++++
 xen/include/xen/smp.h          |  4 +++
 12 files changed, 112 insertions(+), 42 deletions(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -XXX,XX +XXX,XX @@ config ARM_64
 	def_bool y
 	depends on !ARM_32
 	select 64BIT
+	select CPU_HOTPLUG if !TEE && !FFA && !HAS_ITS
 	select HAS_FAST_MULTIPLY
 	select HAS_VPCI_GUEST_SUPPORT if PCI_PASSTHROUGH
 
diff --git a/xen/arch/arm/smp.c b/xen/arch/arm/smp.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/arm/smp.c
+++ b/xen/arch/arm/smp.c
@@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask)
     }
 }
 
+/*
+ * We currently don't support SMT on ARM so we don't need any special logic for
+ * CPU disabling
+ */
+bool arch_smt_cpu_disable(unsigned int cpu)
+{
+    return false;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/ppc/stubs.c
+++ b/xen/arch/ppc/stubs.c
@@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask)
     BUG_ON("unimplemented");
 }
 
+bool arch_smt_cpu_disable(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
 /* irq.c */
 
 void irq_ack_none(struct irq_desc *desc)
diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/riscv/stubs.c
+++ b/xen/arch/riscv/stubs.c
@@ -XXX,XX +XXX,XX @@ void smp_send_call_function_mask(const cpumask_t *mask)
     BUG_ON("unimplemented");
 }
 
+bool arch_smt_cpu_disable(unsigned int cpu)
+{
+    BUG_ON("unimplemented");
+}
+
 /* irq.c */
 
 void irq_ack_none(struct irq_desc *desc)
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -XXX,XX +XXX,XX @@ config X86
 	select ARCH_PAGING_MEMPOOL
 	select ARCH_SUPPORTS_INT128
 	imply CORE_PARKING
+	select CPU_HOTPLUG
 	select FUNCTION_ALIGNMENT_16B
 	select GENERIC_BUG_FRAME
 	select HAS_ALTERNATIVE
diff --git a/xen/arch/x86/include/asm/smp.h b/xen/arch/x86/include/asm/smp.h
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/include/asm/smp.h
+++ b/xen/arch/x86/include/asm/smp.h
@@ -XXX,XX +XXX,XX @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm);
 
 void __stop_this_cpu(void);
 
-long cf_check cpu_up_helper(void *data);
-long cf_check cpu_down_helper(void *data);
-
 long cf_check core_parking_helper(void *data);
 bool core_parking_remove(unsigned int cpu);
 uint32_t get_cur_idle_nums(void);
diff --git a/xen/arch/x86/smp.c b/xen/arch/x86/smp.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -XXX,XX +XXX,XX @@ void cf_check call_function_interrupt(void)
     smp_call_function_interrupt();
 }
 
-long cf_check cpu_up_helper(void *data)
+bool arch_smt_cpu_disable(unsigned int cpu)
 {
-    unsigned int cpu = (unsigned long)data;
-    int ret = cpu_up(cpu);
-
-    /* Have one more go on EBUSY. */
-    if ( ret == -EBUSY )
-        ret = cpu_up(cpu);
-
-    if ( !ret && !opt_smt &&
-         cpu_data[cpu].compute_unit_id == INVALID_CUID &&
-         cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1 )
-    {
-        ret = cpu_down_helper(data);
-        if ( ret )
-            printk("Could not re-offline CPU%u (%d)\n", cpu, ret);
-        else
-            ret = -EPERM;
-    }
-
-    return ret;
-}
-
-long cf_check cpu_down_helper(void *data)
-{
-    int cpu = (unsigned long)data;
-    int ret = cpu_down(cpu);
-    /* Have one more go on EBUSY. */
-    if ( ret == -EBUSY )
-        ret = cpu_down(cpu);
-    return ret;
+    return !opt_smt && cpu_data[cpu].compute_unit_id == INVALID_CUID &&
+           cpumask_weight(per_cpu(cpu_sibling_mask, cpu)) > 1;
 }
diff --git a/xen/arch/x86/sysctl.c b/xen/arch/x86/sysctl.c
index XXXXXXX..XXXXXXX 100644
--- a/xen/arch/x86/sysctl.c
+++ b/xen/arch/x86/sysctl.c
@@ -XXX,XX +XXX,XX @@ long arch_do_sysctl(
 
     case XEN_SYSCTL_cpu_hotplug:
     {
-        unsigned int cpu = sysctl->u.cpu_hotplug.cpu;
         unsigned int op = sysctl->u.cpu_hotplug.op;
         bool plug;
         long (*fn)(void *data);
@@ -XXX,XX +XXX,XX @@ long arch_do_sysctl(
         switch ( op )
         {
         case XEN_SYSCTL_CPU_HOTPLUG_ONLINE:
-            plug = true;
-            fn = cpu_up_helper;
-            hcpu = _p(cpu);
-            break;
-
         case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE:
-            plug = false;
-            fn = cpu_down_helper;
-            hcpu = _p(cpu);
+            /* Handled by common code */
+            ASSERT_UNREACHABLE();
+            ret = -EOPNOTSUPP;
             break;
 
         case XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE:
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index XXXXXXX..XXXXXXX 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -XXX,XX +XXX,XX @@ config LIBFDT
 config MEM_ACCESS_ALWAYS_ON
 	bool
 
+config CPU_HOTPLUG
+	bool
+
 config VM_EVENT
 	def_bool MEM_ACCESS_ALWAYS_ON
 	prompt "Memory Access and
VM events" if !MEM_ACCESS_ALWAYS_ON diff --git a/xen/common/smp.c b/xen/common/smp.c index XXXXXXX..XXXXXXX 100644 --- a/xen/common/smp.c +++ b/xen/common/smp.c @@ -XXX,XX +XXX,XX @@ * GNU General Public License for more details. */ +#include <xen/cpu.h> #include <asm/hardirq.h> #include <asm/processor.h> #include <xen/spinlock.h> @@ -XXX,XX +XXX,XX @@ void smp_call_function_interrupt(void) irq_exit(); } +#ifdef CONFIG_CPU_HOTPLUG +long cf_check cpu_up_helper(void *data) +{ + unsigned int cpu = (unsigned long)data; + int ret = cpu_up(cpu); + + /* Have one more go on EBUSY. */ + if ( ret == -EBUSY ) + ret = cpu_up(cpu); + + if ( !ret && arch_smt_cpu_disable(cpu) ) + { + ret = cpu_down_helper(data); + if ( ret ) + printk("Could not re-offline CPU%u (%d)\n", cpu, ret); + else + ret = -EPERM; + } + + return ret; +} + +long cf_check cpu_down_helper(void *data) +{ + int cpu = (unsigned long)data; + int ret = cpu_down(cpu); + /* Have one more go on EBUSY. */ + if ( ret == -EBUSY ) + ret = cpu_down(cpu); + return ret; +} +#endif /* CONFIG_CPU_HOTPLUG */ + /* * Local variables: * mode: C diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c index XXXXXXX..XXXXXXX 100644 --- a/xen/common/sysctl.c +++ b/xen/common/sysctl.c @@ -XXX,XX +XXX,XX @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl) copyback = 1; break; +#ifdef CONFIG_CPU_HOTPLUG + case XEN_SYSCTL_cpu_hotplug: + { + unsigned int cpu = op->u.cpu_hotplug.cpu; + unsigned int hp_op = op->u.cpu_hotplug.op; + bool plug; + long (*fn)(void *data); + void *hcpu; + + switch ( hp_op ) + { + case XEN_SYSCTL_CPU_HOTPLUG_ONLINE: + plug = true; + fn = cpu_up_helper; + hcpu = _p(cpu); + break; + + case XEN_SYSCTL_CPU_HOTPLUG_OFFLINE: + plug = false; + fn = cpu_down_helper; + hcpu = _p(cpu); + break; + + case XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE: + case XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE: + /* Use arch specific handlers as SMT is very arch-dependent */ + ret = arch_do_sysctl(op, u_sysctl); + copyback = 0; + goto out; + + 
default: + ret = -EOPNOTSUPP; + break; + } + + if ( !ret ) + ret = plug ? xsm_resource_plug_core(XSM_HOOK) + : xsm_resource_unplug_core(XSM_HOOK); + + if ( !ret ) + ret = continue_hypercall_on_cpu(0, fn, hcpu); + break; + } +#endif + default: ret = arch_do_sysctl(op, u_sysctl); copyback = 0; diff --git a/xen/include/xen/smp.h b/xen/include/xen/smp.h index XXXXXXX..XXXXXXX 100644 --- a/xen/include/xen/smp.h +++ b/xen/include/xen/smp.h @@ -XXX,XX +XXX,XX @@ extern void *stack_base[NR_CPUS]; void initialize_cpu_data(unsigned int cpu); int setup_cpu_root_pgt(unsigned int cpu); +bool arch_smt_cpu_disable(unsigned int cpu); +long cf_check cpu_up_helper(void *data); +long cf_check cpu_down_helper(void *data); + #endif /* __XEN_SMP_H__ */ -- 2.51.2
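The retry and re-offline behaviour of the common helpers in this patch can be illustrated with a small standalone model. This is a sketch under assumptions, not the code from the patch: `hotplug_ops` and the `model_*` and `stub_*` names are hypothetical, and the function pointers stand in for `cpu_up()`, `cpu_down()` and `arch_smt_cpu_disable()` so the control flow can be exercised outside the hypervisor.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical hooks standing in for cpu_up(), cpu_down() and
 * arch_smt_cpu_disable(); not part of Xen. */
struct hotplug_ops {
    int (*up)(unsigned int cpu);
    int (*down)(unsigned int cpu);
    bool (*smt_disable)(unsigned int cpu);
};

/* Take a CPU down, retrying once if the first attempt reports -EBUSY. */
static int model_cpu_down_helper(const struct hotplug_ops *ops,
                                 unsigned int cpu)
{
    int ret = ops->down(cpu);

    if ( ret == -EBUSY )
        ret = ops->down(cpu);

    return ret;
}

/*
 * Bring a CPU up, retrying once on -EBUSY. If the architecture hook says
 * the CPU must stay offline (e.g. SMT is disabled), take it down again and
 * report -EPERM to the caller.
 */
static int model_cpu_up_helper(const struct hotplug_ops *ops,
                               unsigned int cpu)
{
    int ret;

    assert(ops && ops->up && ops->down && ops->smt_disable);

    ret = ops->up(cpu);
    if ( ret == -EBUSY )
        ret = ops->up(cpu);

    if ( !ret && ops->smt_disable(cpu) )
    {
        ret = model_cpu_down_helper(ops, cpu);
        if ( !ret )
            ret = -EPERM;
    }

    return ret;
}

/* Illustrative stubs: the first up attempt is busy, later ones succeed. */
static unsigned int stub_up_calls;

static int stub_up_busy_once(unsigned int cpu)
{
    (void)cpu;
    return stub_up_calls++ == 0 ? -EBUSY : 0;
}

static int stub_down_ok(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

/* Mirrors the Arm hook in this patch, which always returns false. */
static bool stub_smt_never(unsigned int cpu)
{
    (void)cpu;
    return false;
}

/* Models an x86 sibling thread while SMT is disabled. */
static bool stub_smt_always(unsigned int cpu)
{
    (void)cpu;
    return true;
}
```

Pairing `stub_up_busy_once` with `stub_smt_never` models the Arm case, where the second attempt succeeds and the CPU stays online. Swapping in `stub_smt_always` models the x86 case in which the CPU is brought up, immediately taken down again, and the caller sees `-EPERM`.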
With CPU hotplug sysctls implemented on Arm, it becomes useful to have a tool for calling them. According to the commit history, it seems that xen-hptool was put under CONFIG_MIGRATE as a measure to fix the IA64 build. As IA64 is no longer supported, that restriction is no longer needed, so build the tool unconditionally. Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v4->v5: * make hptool always build v3->v4: * no changes v2->v3: * no changes v1->v2: * switch to configure from legacy config --- tools/libs/guest/Makefile.common | 2 +- tools/misc/Makefile | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/libs/guest/Makefile.common b/tools/libs/guest/Makefile.common index XXXXXXX..XXXXXXX 100644 --- a/tools/libs/guest/Makefile.common +++ b/tools/libs/guest/Makefile.common @@ -XXX,XX +XXX,XX @@ OBJS-y += xg_private.o OBJS-y += xg_domain.o OBJS-y += xg_suspend.o OBJS-y += xg_resume.o +OBJS-y += xg_offline_page.o ifeq ($(CONFIG_MIGRATE),y) OBJS-y += xg_sr_common.o OBJS-$(CONFIG_X86) += xg_sr_common_x86.o @@ -XXX,XX +XXX,XX @@ OBJS-$(CONFIG_X86) += xg_sr_save_x86_pv.o OBJS-$(CONFIG_X86) += xg_sr_save_x86_hvm.o OBJS-y += xg_sr_restore.o OBJS-y += xg_sr_save.o -OBJS-y += xg_offline_page.o else OBJS-y += xg_nomigrate.o endif diff --git a/tools/misc/Makefile b/tools/misc/Makefile index XXXXXXX..XXXXXXX 100644 --- a/tools/misc/Makefile +++ b/tools/misc/Makefile @@ -XXX,XX +XXX,XX @@ INSTALL_BIN += xencov_split INSTALL_BIN += $(INSTALL_BIN-y) # Everything to be installed in regular sbin/ -INSTALL_SBIN-$(CONFIG_MIGRATE) += xen-hptool INSTALL_SBIN-$(CONFIG_X86) += xen-hvmcrash INSTALL_SBIN-$(CONFIG_X86) += xen-hvmctx INSTALL_SBIN-$(CONFIG_X86) += xen-lowmemd @@ -XXX,XX +XXX,XX @@ INSTALL_SBIN += xenwatchdogd INSTALL_SBIN += xen-access INSTALL_SBIN += xen-livepatch INSTALL_SBIN += xen-diag +INSTALL_SBIN += xen-hptool INSTALL_SBIN += $(INSTALL_SBIN-y) # Everything to be installed -- 2.51.2
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com> --- v4->v5: * s/supported/implemented/ * update SUPPORT.md v3->v4: * update configuration section v2->v3: * patch introduced --- SUPPORT.md | 1 + docs/misc/cpu-hotplug.txt | 50 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 51 insertions(+) create mode 100644 docs/misc/cpu-hotplug.txt diff --git a/SUPPORT.md b/SUPPORT.md index XXXXXXX..XXXXXXX 100644 --- a/SUPPORT.md +++ b/SUPPORT.md @@ -XXX,XX +XXX,XX @@ For the Cortex A77 r0p0 - r1p0, see Errata 1508412. ### ACPI CPU Hotplug Status, x86: Experimental + Status, Arm64: Experimental ### Physical Memory diff --git a/docs/misc/cpu-hotplug.txt b/docs/misc/cpu-hotplug.txt new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/docs/misc/cpu-hotplug.txt @@ -XXX,XX +XXX,XX @@ +CPU Hotplug +=========== + +CPU hotplug is a feature that allows pCPU cores to be added to or removed from a +running system without requiring a reboot. It is implemented on x86 and Arm64 +architectures. + +Implementation Details +---------------------- + +CPU hotplug is implemented through the `XEN_SYSCTL_CPU_HOTPLUG_*` sysctl calls. +The specific calls are: + +- `XEN_SYSCTL_CPU_HOTPLUG_ONLINE`: Brings a pCPU online +- `XEN_SYSCTL_CPU_HOTPLUG_OFFLINE`: Takes a pCPU offline +- `XEN_SYSCTL_CPU_HOTPLUG_SMT_ENABLE`: Enables SMT threads (x86 only) +- `XEN_SYSCTL_CPU_HOTPLUG_SMT_DISABLE`: Disables SMT threads (x86 only) + +All cores can be disabled, assuming hardware support, except for the boot core. +Sysctl calls are routed to the boot core before doing any actual up/down +operations on other cores. + +Configuration +------------- + +The presence of the feature is controlled by the CONFIG_CPU_HOTPLUG option. It +is enabled unconditionally on the x86 architecture. On Arm64, the option is +enabled by default when the ITS, FFA, and TEE configs are disabled. +The xen-hptool userspace tool is built unconditionally. 
+ +Usage +----- + +Disable core: + +$ xen-hptool cpu-offline 2 +Prepare to offline CPU 2 +(XEN) Removing cpu 2 from runqueue 0 +CPU 2 offlined successfully + +Enable core: + +$ xen-hptool cpu-online 2 +Prepare to online CPU 2 +(XEN) Bringing up CPU2 +(XEN) GICv3: CPU2: Found redistributor in region 0 @00000a004005c000 +(XEN) CPU2: Guest atomics will try 1 times before pausing the domain +(XEN) CPU 2 booted. +(XEN) Adding cpu 2 to runqueue 0 +CPU 2 onlined successfully -- 2.51.2