From: Salil Mehta <salil.mehta@huawei.com>
On most architectures, during vCPU hot-plug and hot-unplug actions, the
firmware or VMM/QEMU can update the OS on vCPU status by toggling the
ACPI method `_STA.Present` bit. However, certain CPU architectures
prohibit [1] modifications to a CPU’s `presence` status after the kernel
has booted.
This limitation [2][3] exists because many per-CPU components, such as
interrupt controllers and various per-CPU features tightly integrated
with CPUs, may not support reconfiguration once the kernel is
initialized. Often, these components cannot be powered down, as they may
belong to an `always-on` power domain. As a result, some architectures
require all CPUs to remain `_STA.Present` after system initialization.
Therefore, it is essential to mirror the exact QOM vCPU status through
ACPI for the Guest kernel. For this, we should determine—via
architecture-specific code[4]—whether vCPUs must always remain present
and whether the associated `AcpiCpuStatus::cpu` object should remain
valid, even following a vCPU hot-unplug operation.
References:
[1] Check comment 5 in the bugzilla entry
Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[2] KVMForum 2023 Presentation: Challenges Revisited in Supporting Virt CPU Hotplug on
architectures that don’t Support CPU Hotplug (like ARM64)
a. Kernel Link: https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
b. Qemu Link: https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
[3] KVMForum 2020 Presentation: Challenges in Supporting Virtual CPU Hotplug on
SoC Based Systems (like ARM64)
Link: https://kvmforum2020.sched.com/event/eE4m
[4] Example implementation of architecture-specific CPU persistence hook
Link: https://github.com/salil-mehta/qemu/commit/c0b416b11e5af6505e558866f0eb6c9f3709173e
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <20241103102419.202225-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
include/hw/core/cpu.h | 1 +
hw/acpi/cpu.c | 15 ++++++++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index c3ca0babcb..e7de77dc6d 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -158,6 +158,7 @@ struct CPUClass {
void (*dump_state)(CPUState *cpu, FILE *, int flags);
void (*query_cpu_fast)(CPUState *cpu, CpuInfoFast *value);
int64_t (*get_arch_id)(CPUState *cpu);
+ bool (*cpu_persistent_status)(CPUState *cpu);
void (*set_pc)(CPUState *cpu, vaddr value);
vaddr (*get_pc)(CPUState *cpu);
int (*gdb_read_register)(CPUState *cpu, GByteArray *buf, int reg);
diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index 5cb60ca8bc..9b03b4292e 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -233,6 +233,17 @@ void cpu_hotplug_hw_init(MemoryRegion *as, Object *owner,
memory_region_add_subregion(as, base_addr, &state->ctrl_reg);
}
+static bool should_remain_acpi_present(DeviceState *dev)
+{
+ CPUClass *k = CPU_GET_CLASS(dev);
+ /*
+ * A system may contain CPUs that are always present on one die, NUMA node,
+ * or socket, yet may be non-present on another simultaneously. Check from
+ * architecture specific code.
+ */
+ return k->cpu_persistent_status && k->cpu_persistent_status(CPU(dev));
+}
+
static AcpiCpuStatus *get_cpu_status(CPUHotplugState *cpu_st, DeviceState *dev)
{
CPUClass *k = CPU_GET_CLASS(dev);
@@ -289,7 +300,9 @@ void acpi_cpu_unplug_cb(CPUHotplugState *cpu_st,
return;
}
- cdev->cpu = NULL;
+ if (!should_remain_acpi_present(dev)) {
+ cdev->cpu = NULL;
+ }
}
static const VMStateDescription vmstate_cpuhp_sts = {
--
MST