During `machvirt_init()`, QOM ARMCPU objects are pre-created for all possible
vCPUs, along with the corresponding KVM vCPUs in the host. This is necessary
because of an architectural constraint: KVM does not allow vCPUs to be
created, or the VGIC to be initialized and sized, after VM initialization.
Hence, the VGIC is pre-sized with all possible vCPUs.
After the initialization of the machine is complete, the KVM vCPUs of the
disabled possible vCPUs are parked in the per-virt-machine list
"kvm_parked_vcpus", and we release the QOM ARMCPU objects for the disabled
vCPUs. These will be re-created when the vCPU is hotplugged again, at which
point the new QOM ARMCPU object is re-attached to its corresponding parked
KVM vCPU.
Alternatively, we could have chosen not to release the QOM CPU objects and
kept reusing them. That approach might require some modifications to the
`qdev_device_add()` interface so that, for a hotplug request, the old ARMCPU
object is retrieved instead of a new one being created.
Each of these approaches has its own pros and cons. This prototype uses the
first approach (suggestions are welcome!).
Co-developed-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
---
hw/arm/virt.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9d33f30a6a..a72cd3b20d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2050,6 +2050,7 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
{
CPUArchIdList *possible_cpus = vms->parent.possible_cpus;
int max_cpus = MACHINE(vms)->smp.max_cpus;
+ MachineState *ms = MACHINE(vms);
bool aarch64, steal_time;
CPUState *cpu;
int n;
@@ -2111,6 +2112,37 @@ static void virt_cpu_post_init(VirtMachineState *vms, MemoryRegion *sysmem)
}
}
}
+
+    if (kvm_enabled() || tcg_enabled()) {
+        for (n = 0; n < possible_cpus->len; n++) {
+            cpu = qemu_get_possible_cpu(n);
+
+            /*
+             * Now that the GIC has been sized with the possible vCPUs, we no
+             * longer require the disabled vCPU objects to be represented in
+             * the QOM. Release the disabled ARMCPU objects that were used
+             * during init for pre-sizing.
+             *
+             * Through ACPI, we expose these non-existent vCPUs to the guest
+             * as present (_STA.PRES=1) but disabled (_STA.ENA=0) so that
+             * they cannot be used. These vCPUs can later be added to the
+             * guest through hotplug exchanges, when their ARMCPU objects are
+             * re-created using the 'device_add' QMP command.
+             */
+            /*
+             * RFC: Question: Another approach could be to keep these objects
+             * forever and release them only when QEMU exits as part of
+             * finalize, or when a new vCPU is hotplugged, in which case the
+             * old object could be released in favour of the new one?
+             */
+            if (!qemu_enabled_cpu(cpu)) {
+                CPUArchId *cpu_slot;
+                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
+                cpu_slot->cpu = NULL;
+                object_unref(OBJECT(cpu));
+            }
+        }
+    }
}
static void virt_cpu_set_properties(Object *cpuobj, const CPUArchId *cpu_slot,
--
2.34.1
On 6/14/24 9:36 AM, Salil Mehta wrote:
> During `machvirt_init()`, QOM ARMCPU objects are pre-created for all
> possible vCPUs, along with the corresponding KVM vCPUs in the host.
> [...]
> +            if (!qemu_enabled_cpu(cpu)) {
> +                CPUArchId *cpu_slot;
> +                cpu_slot = virt_find_cpu_slot(ms, cpu->cpu_index);
> +                cpu_slot->cpu = NULL;
> +                object_unref(OBJECT(cpu));
> +            }

It's probably hard to keep those ARMCPU objects forever. First of all, one
vCPU can be hot-added first and then hot-removed afterwards. With those ARMCPU
objects kept forever, the syntax of 'device_add' and 'device_del' becomes
broken, at least.

The ideal mechanism would be to avoid instantiating those ARMCPU objects
rather than destroying them soon afterwards. I don't know if
ms->possible_cpus->cpus[] can fit this and how much effort would be needed.

Thanks,
Gavin
Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Tuesday, August 13, 2024 2:17 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; mst@redhat.com
>
> [...]
>
> It's probably hard to keep those ARMCPU objects forever. First of all, one
> vCPU can be hot-added first and then hot-removed afterwards. With those
> ARMCPU objects kept forever, the syntax of 'device_add' and 'device_del'
> becomes broken, at least.

I prototyped both approaches 4 years back. Yes, the interface problem with
device_add was solved by a trick: keep the old vCPU object, and on device_add,
instead of creating a new vCPU object, reuse the old one and call
qdev_realize() on it (see the sketch below).

But the bigger problem with this approach is migration. Only realized objects
have state to be migrated. So it might look cleaner in one respect, but it had
its own issues.

I think I did share a prototype of this with Igor, who was not in agreement
with it and wanted the vCPU objects to be destroyed as on x86. Hence, we stuck
with the current approach.

> The ideal mechanism would be to avoid instantiating those ARMCPU objects
> rather than destroying them soon afterwards. I don't know if
> ms->possible_cpus->cpus[] can fit this and how much effort would be needed.

This is what we are doing now in the current approach. Please read the
KVMForum 2023 slides and the cover letter of RFC V3 for more details.

Thanks
Salil.

> Thanks,
> Gavin
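For concreteness, the 'reuse the preserved object' trick might have looked
roughly like this (hypothetical sketch; virt_find_cpu_slot() is from this
patch, the rest is illustrative and not from any posted series):

    /* Hypothetical device_add path for the object-reuse approach: instead
     * of instantiating a new ARMCPU, pick the preserved object out of its
     * slot and realize it again. */
    CPUArchId *cpu_slot = virt_find_cpu_slot(ms, cpu_index);
    if (cpu_slot->cpu) {
        qdev_realize(DEVICE(cpu_slot->cpu), NULL, &error_fatal);
    }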
Hi Salil,

On 8/19/24 10:21 PM, Salil Mehta wrote:
> [...]
>
> I prototyped both approaches 4 years back. Yes, the interface problem with
> device_add was solved by a trick: keep the old vCPU object, and on
> device_add, reuse the old one and call qdev_realize() on it.
>
> But the bigger problem with this approach is migration. Only realized
> objects have state to be migrated. So it might look cleaner in one respect,
> but it had its own issues.
>
> I think I did share a prototype of this with Igor, who was not in agreement
> with it and wanted the vCPU objects to be destroyed as on x86. Hence, we
> stuck with the current approach.

Migration needn't necessarily be the blocker. The states of those vCPUs which
have been instantiated but not yet realized can be moved around so that they
are managed under another migratable object (e.g. MachineState). In this way,
those vCPU states can be migrated together with the MachineState. However, it
sounds strange to me to have vCPU objects instantiated but unrealized until
hot-add is triggered; it's not what QOM is supposed to support. OK, I didn't
realize x86 also follows this model: instantiate hotpluggable vCPUs and
destroy them on bootup.

>> The ideal mechanism would be to avoid instantiating those ARMCPU objects
>> rather than destroying them soon afterwards. I don't know if
>> ms->possible_cpus->cpus[] can fit this and how much effort would be needed.
>
> This is what we are doing now in the current approach. Please read the
> KVMForum 2023 slides and the cover letter of RFC V3 for more details.

My question has been parsed in a wrong way. My question was whether it's
doable to avoid instantiating vCPU#1 on bootup, when we have the command line
'-cpu host -smp cpus=1,maxcpus=2', rather than creating it and releasing it in
machvirt_init(). Instead, ms->possible_cpus->cpus[] would be reused to create
the GICv3 and the vCPU file descriptors. It looks like a clean way, at least.
There may be significant effort needed to reuse ms->possible_cpus->cpus[], but
the vCPU#1 object has an ephemeral life cycle. It looks unnecessary to create
the ephemeral vCPU#1 object.

Thanks,
Gavin
Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Tuesday, August 20, 2024 1:06 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; mst@redhat.com
>
> [...]
>
> Migration needn't necessarily be the blocker. The states of those vCPUs
> which have been instantiated but not yet realized can be moved around so
> that they are managed under another migratable object (e.g. MachineState).
> In this way, those vCPU states can be migrated together with the
> MachineState.

Migration was a blocker. I'll have to muddle my head again to regain the full
context of that, which at this juncture seems unnecessary.

> However, it sounds strange to me to have vCPU objects instantiated but
> unrealized until hot-add is triggered; it's not what QOM is supposed to
> support.

Yes, but for the user it does not matter, as it is an internal implementation
detail, and we don't have to expose it to the external user either. It is just
a representation.

> OK, I didn't realize x86 also follows this model: instantiate hotpluggable
> vCPUs and destroy them on bootup.

Correct, so we have now kept it consistent with x86. But of course we have
fake states for the ARM arch when vCPUs don't exist, and this state is not
migrated. The ACPI persistent state gets initialized at the start by the
architecture-specific code. Hence, what vCPU state we expose to the kernel
during QEMU init and hot(un)plug operations varies.

> My question has been parsed in a wrong way.

Oh. 😊

> My question was whether it's doable to avoid instantiating vCPU#1 on
> bootup, when we have the command line '-cpu host -smp cpus=1,maxcpus=2',
> and release it in machvirt_init(). Instead, ms->possible_cpus->cpus[] would
> be reused to create the GICv3 and the vCPU file descriptors.

We create QOM vCPU objects to initialize the KVM vCPU objects in the host
kernel, and only permanently populate the 'possible_cpus' list for the
realized vCPUs. We release the vCPU objects for the other vCPUs, which will be
hot-plugged in future, and only make them part of the 'possible_cpus' list
after they get plugged and realized.

> It looks like a clean way at least. There may be significant effort needed
> to reuse ms->possible_cpus->cpus[], but the vCPU#1 object has an ephemeral
> life cycle. It looks unnecessary to create the ephemeral vCPU#1 object.

I don't understand this clearly. Are you suggesting reusing only a single vCPU
object to initialize all the KVM vCPUs not yet plugged? If yes, then I'm not
sure what we gain here by adding this complexity. It does not consume time or
resources, because we are not realizing any of these vCPU objects in any case.

Thanks
Salil.

> Thanks,
> Gavin
Hi Salil,

On 8/21/24 2:40 AM, Salil Mehta wrote:
> I don't understand this clearly. Are you suggesting reusing only a single
> vCPU object to initialize all the KVM vCPUs not yet plugged? If yes, then
> I'm not sure what we gain here by adding this complexity. It does not
> consume time or resources, because we are not realizing any of these vCPU
> objects in any case.

First of all, it seems we have different names and terms for those cold-booted
vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1 are cold-booted
vCPUs, while vCPU-2 and vCPU-3 are hotpluggable vCPUs, when we have
'-smp maxcpus=4,cpus=2'. Let's stick to these conventions and terms for easier
discussion.

The idea is to avoid instantiating hotpluggable vCPUs in machvirt_init(), only
to release them in the same function. As I understand it, those hotpluggable
vCPU instances serve two purposes: (1) satisfy the constraint that all vCPUs'
(KVM) file descriptors have to be created and populated; (2) help to
instantiate and realize the GICv3 object.

For (1), I don't think we have to instantiate those hotpluggable vCPUs at all.
In the above example, where we have command line '-smp maxcpus=4,cpus=2', it's
unnecessary to instantiate vCPU-2 and vCPU-3 just to create and populate their
KVM file descriptors. A vCPU's KVM file descriptor is created and populated by
the following ioctls and function calls. When the first vCPU (vCPU-0) is
realized, the property corresponding to "&init" is fixed for all vCPUs. It
means all vCPUs have the same properties except the "vcpu_index".

    ioctl(vm-fd, KVM_CREATE_VCPU, vcpu_index);
    ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init);
    kvm_park_vcpu(cs);

A vCPU's properties are determined by two sources, and both are global. It
means all vCPUs should have the same properties: (a) feature registers
returned from the host; the function kvm_arm_get_host_cpu_features() is called
only once, meaning this source is the same for all vCPUs; (b) the parameters
provided by the user through '-cpu host,sve=off' are translated to global
properties and applied to all vCPUs when they're instantiated.

    (a)                                       (b)
    aarch64_host_initfn                       qemu_init
      kvm_arm_set_cpu_features_from_host        parse_cpu_option
        kvm_arm_get_host_cpu_features             cpu_common_parse_features
          qdev_prop_register_global                 :
                                                  device_post_init
                                                    qdev_prop_set_globals

For (2), I'm still looking into the GICv3 code for a better understanding.
Until now, I don't see that we need the instantiated hotpluggable vCPUs
either. For example, the redistributor regions can be exposed based on
'maxcpus' instead of 'cpus'. The IRQ connection and teardown can be done
dynamically by connecting the board with the GICv3 through callbacks in
ARMGICv3CommonClass. The connection between GICv3CPUState and CPUARMState can
also be made dynamically.

Thanks,
Gavin
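A very rough illustration of (1), assuming the "&init" parameter captured when
vCPU-0 is realized can simply be replayed for the other vCPUs (hypothetical
sketch, including the kvm_park_vcpu_by_id() helper; not code from the series):

    /* Create and park KVM fds for the hotpluggable vCPUs without
     * instantiating their QOM ARMCPU objects. 'saved_init' is assumed to
     * be the struct kvm_vcpu_init recorded when vCPU-0 was realized. */
    for (n = smp_cpus; n < max_cpus; n++) {
        int vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, n);
        ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &saved_init);
        /* Park the fd so a later hotplug can pick it up by vCPU id. */
        kvm_park_vcpu_by_id(kvm_state, n, vcpu_fd);
    }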
Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Wednesday, August 21, 2024 7:25 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; mst@redhat.com
>
> [...]
>
> As I understand it, those hotpluggable vCPU instances serve two purposes:
> (1) satisfy the constraint that all vCPUs' (KVM) file descriptors have to
> be created and populated;

We are devising *workarounds* in QEMU for the ARM CPU architectural
constraints in KVM and in the guest kernel, *not relaxing* them. We are not
allowed to meddle with the constraints. That is the whole point. Not
respecting those constraints led to the rejection of the earlier attempts to
upstream virtual CPU hotplug for ARM.

> (2) help to instantiate and realize the GICv3 object.
>
> For (1), I don't think we have to instantiate those hotpluggable vCPUs at
> all. In the above example, where we have command line '-smp
> maxcpus=4,cpus=2', it's unnecessary to instantiate vCPU-2 and vCPU-3 just
> to create and populate their KVM file descriptors.

We cannot defer creating vCPUs in KVM until after the GIC has been initialized
in KVM. It needs to know every vCPU that will ever exist right at the time it
is getting initialized. This is an ARM CPU architectural constraint.

> A vCPU's KVM file descriptor is created and populated by the following
> ioctls and function calls. [...] A vCPU's properties are determined by two
> sources, and both are global. It means all vCPUs should have the same
> properties: (a) feature registers returned from the host; the function
> kvm_arm_get_host_cpu_features() is called only once, meaning this source is
> the same for all vCPUs;

Sure, but what are you trying to save here?

> (b) the parameters provided by the user through '-cpu host,sve=off' are
> translated to global properties and applied to all vCPUs when they're
> instantiated.

Sure. The same is the case for the PMU and other per-vCPU parameters. We do
not support heterogeneous computing, and therefore we do not have per-vCPU
control of these features as of now.

Sure, I understand the code flow, but what are you trying to suggest here?

> For (2), I'm still looking into the GICv3 code for a better understanding.

Oh, I thought you said you had finished your reviews 😊

Please take your time. For your reference, you might want to check:

KVMForum 2023:
https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf

KVMForum 2020:
https://kvm-forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%20ARM64_Salil%20Mehta.pdf

> Until now, I don't see that we need the instantiated hotpluggable vCPUs
> either.

I think I've already answered this above: it is because of an ARM
architectural constraint.

> For example, the redistributor regions can be exposed based on 'maxcpus'
> instead of 'cpus'.

You mean during the review of the code you found that we are not doing it?

> The IRQ connection and teardown can be done dynamically by connecting the
> board with the GICv3 through callbacks in ARMGICv3CommonClass. The
> connection between GICv3CPUState and CPUARMState can also be made
> dynamically.

Are you suggesting this after reviewing the code, or do you have to review it
yet? 😉

Thanks
Salil.

> Thanks,
> Gavin
Hi Salil,

On 8/21/24 8:23 PM, Salil Mehta wrote:
> [...]
>
> We are devising *workarounds* in QEMU for the ARM CPU architectural
> constraints in KVM and in the guest kernel, *not relaxing* them. We are not
> allowed to meddle with the constraints. That is the whole point.
>
> Not respecting those constraints led to the rejection of the earlier
> attempts to upstream virtual CPU hotplug for ARM.

I meant to 'overcome' the constraints by 'relax'. My apologies for any
confusion caused. Previously, you had attempted to create all vCPU objects and
reuse them when a vCPU is hot-added. In the current implementation, the
hotpluggable vCPUs are instantiated and released pretty soon. I was bringing
up the third possibility, to avoid instantiating those hotpluggable vCPU
objects, for discussion. In this series, the life cycle of those hotpluggable
vCPU objects is really short. Again, I didn't say we must avoid instantiating
those vCPU objects; I brought up the topic ONLY for discussion.

> We cannot defer creating vCPUs in KVM until after the GIC has been
> initialized in KVM. It needs to know every vCPU that will ever exist right
> at the time it is getting initialized. This is an ARM CPU architectural
> constraint.

It would be appreciated if more details other than 'an ARM CPU architectural
constraint' could be provided. I don't understand this constraint very well,
at least.

> Sure, but what are you trying to save here?

As mentioned above, the life cycle of those hotpluggable vCPU objects is
really short. They still consume time and memory to instantiate. If I'm
correct, one of the primary goals of the vCPU hotplug feature is to save
system boot-up time, correct?

> Sure, I understand the code flow, but what are you trying to suggest here?

I tried to explain that a vCPU object isn't needed to create and populate a
vCPU's file descriptor, as highlighted in (1). The information used to create
the cold-booted vCPU-0 can be reused, because all vCPUs have the same
properties and feature set.

> Oh, I thought you said you had finished your reviews 😊

Hmm, 'finished my review' has been misread, frankly. By that, I meant I had
finished my tests and provided all the comments I had. Some of them are
questions and discussions, which I still need to follow up.

> You mean during the review of the code you found that we are not doing it?

It's all about the discussion of the possibility of avoiding instantiating the
hotpluggable vCPU objects.

> Are you suggesting this after reviewing the code, or do you have to review
> it yet? 😉

I was actually trying to ask for your input and feedback. I was hoping your
input would clear up my puzzles: why must vCPU objects be in place to create
the GICv3 object? Is it possible to create the GICv3 object without those vCPU
objects? What kind of effort would we need to avoid instantiating those
hotpluggable vCPU objects? The best way perhaps is to find the answer from the
code by myself ;-)

Thanks,
Gavin
Hi Gavin,

> From: qemu-arm-bounces+salil.mehta=huawei.com@nongnu.org On Behalf Of
> Gavin Shan
> Sent: Wednesday, August 21, 2024 2:33 PM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; mst@redhat.com
>
> [...]
>
> I meant to 'overcome' the constraints by 'relax'. My apologies for any
> confusion caused.

Ok. No issues. It was important for me to clarify, though.

> Previously, you had attempted to create all vCPU objects and reuse them
> when a vCPU is hot-added.

Yes, at the QOM level. But that approach did not realize the
unplugged/yet-to-be-plugged vCPUs. We were just using the QOM vCPU objects as
placeholders.

> In the current implementation, the hotpluggable vCPUs are instantiated and
> released pretty soon. I was bringing up the third possibility, to avoid
> instantiating those hotpluggable vCPU objects, for discussion.

Are you suggesting not calling the KVM_ARM_VCPU_INIT IOCTL at all for the
vCPUs which are part of the possible list but not yet plugged? If yes, we
cannot do that, as KVM vCPUs must be fully initialized even before the VGIC is
initialized inside KVM. This is a constraint. I've explained it in detail in
the cover letter of this patch-set and in the slides I have shared earlier.

> In this series, the life cycle of those hotpluggable vCPU objects is really
> short. Again, I didn't say we must avoid instantiating those vCPU objects;
> I brought up the topic ONLY for discussion.

Sure, I appreciate that. For the details of the reasons, please see:

1. The cover letter of this patch-set (the constraints are explained there)
2. The KVMForum slides of 2020 and 2023

> It would be appreciated if more details other than 'an ARM CPU
> architectural constraint' could be provided. I don't understand this
> constraint very well, at least.

We cannot do that, as we MUST present the KVM vCPUs to the VGIC fully
initialized, even before it starts its own initialization. Initialization of a
vCPU also initializes the system registers of the corresponding KVM vCPU. For
example, MPIDR_EL1 must be initialized at VCPU INIT time. We cannot avoid
this. The MPIDR value is used by the VGIC during its initialization, so it
MUST be present for all of the possible KVM vCPUs right from the start, during
vgic_init(). Please check the cover letter of this patch-set, where I explain
this, and the KVMForum slides; let me know what is not clear from the text.

> As mentioned above, the life cycle of those hotpluggable vCPU objects is
> really short. They still consume time and memory to instantiate. If I'm
> correct, one of the primary goals of the vCPU hotplug feature is to save
> system boot-up time, correct?

Correct. We targeted vCPU hotplug at Kata containers for on-demand resource
allocation and saving resources. Kata containers can work with different types
of VMM, like QEMU, and with microVMs, like Firecracker. AFAIK, the use case of
a microVM is slightly different from normal containers: they are short-lived
(say, around 15 minutes) and require ultra-fast boot-up times (say, less than
125 ms) - these figures are from Amazon, who invented the concept of the
microVM in the earlier decade.

With the current patches, we have only partially achieved what we set out to
do, i.e. Kata/QEMU, but we also want to target Kata/microVM. In our case, we
want that microVM to be QEMU-based for ARM. I think x86 has already removed a
lot of the legacy stuff and created a microVM in QEMU; I'm not sure how it
compares against a true microVM like Firecracker. It would be a good target to
reduce the memory footprint of the ARM QEMU virt machine, or to think of
creating a new one, just like x86. Using the vCPU hotplug feature, we were
able to drastically reduce the boot-up times of QEMU. Please check the
calibrated performance figures in the KVMForum 2023 slides (page 26) [1].

[1] https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf

Last year, I prototyped a microVM for ARM. Michael Tsirkin suggested that if
the performance numbers of the ARM virt machine can match the x86 microVM,
then we might not require explicit microVM code for ARM. Hence, my current
efforts are to reduce the memory footprint of the existing virt machine. But I
do have a rough prototype of a microVM as well. We can debate that later in a
different discussion.

> I tried to explain that a vCPU object isn't needed to create and populate a
> vCPU's file descriptor, as highlighted in (1). The information used to
> create the cold-booted vCPU-0 can be reused, because all vCPUs have the
> same properties and feature set.

It does not matter. We use those QOM vCPU object states to initialize the QEMU
GICv3 model with the maximum possible vCPUs, and then release the QOM vCPU
objects which are yet to be plugged.

> hmm, 'finished my review' has been misread, frankly. By that, I meant I had
> finished my tests and provided all the comments I had.

Sure. No worries. Even if you miss something, you will have another chance to
comment on the upcoming RFC V4 😊

> It's all about the discussion of the possibility of avoiding instantiating
> the hotpluggable vCPU objects.

As mentioned above, with the current KVM code you cannot. But if we really
want to, then perhaps we would need to change KVM. I might be wrong, but
AFAICS I don't see a reason why we cannot have something like *batch* KVM vCPU
create-and-initialize instead of the current sequential KVM operations. This
would avoid multiple calls into the KVM host and could improve QEMU init time
further. But this will require a separate discussion on LKML, including all
the KVM folks. It has the potential to delay acceptance of the vCPU hotplug
feature, and I'm really not in favour of that. We have already stretched it a
lot because of the standards-change acceptance earlier.

> I was hoping your input would clear up my puzzles: why must vCPU objects be
> in place to create the GICv3 object? Is it possible to create the GICv3
> object without those vCPU objects?

No. The VGIC initializes IRQs to target KVM vCPUs; it expects the same KVM
vCPU MPIDR (MP-affinity) that was configured when the KVM vCPUs were
initialized in the first place, otherwise the VGIC initialization will not
happen correctly. Hence the sequence (see the sketch of the required ordering
below). The sequence of these initializations is generally strictly controlled
by the specification, which is closely tied to the hardware, including
power-up initialization. You need to honour the expectations of the KVM VGIC
init, which in turn are compliant with the ARM CPU architecture specification.
It is not just loosely written code.

> What kind of effort would we need to avoid instantiating those hotpluggable
> vCPU objects?

I mentioned one of the ways above: introduce *batch* KVM vCPU create &
initialize. But it would have to undergo greater scrutiny, because we would be
touching a common part which might affect many stakeholders. This is a
discussion we can have later, as part of microVM for ARM.

Thanks
Salil.

> The best way perhaps is to find the answer from the code by myself ;-)
>
> Thanks,
> Gavin
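To make the ordering constraint concrete, here is a simplified sketch of the
sequence KVM imposes, in terms of raw KVM ioctls (error handling and the
distributor/redistributor address programming omitted; this is not the literal
QEMU code):

    /* 1. Create and init ALL possible vCPUs first. KVM_ARM_VCPU_INIT fixes
     *    per-vCPU state such as MPIDR_EL1, which vgic_init() relies on. */
    for (i = 0; i < max_cpus; i++) {
        vcpu_fd[i] = ioctl(vm_fd, KVM_CREATE_VCPU, i);
        ioctl(vcpu_fd[i], KVM_ARM_VCPU_INIT, &init);
    }

    /* 2. Only then create the VGIC device and trigger its initialization. */
    struct kvm_create_device cd = { .type = KVM_DEV_TYPE_ARM_VGIC_V3 };
    ioctl(vm_fd, KVM_CREATE_DEVICE, &cd);
    /* ... program distributor/redistributor base addresses ... */
    struct kvm_device_attr attr = {
        .group = KVM_DEV_ARM_VGIC_GRP_CTRL,
        .attr  = KVM_DEV_ARM_VGIC_CTRL_INIT,
    };
    ioctl(cd.fd, KVM_SET_DEVICE_ATTR, &attr);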
Hi Sail, On 8/22/24 8:58 PM, Salil Mehta wrote: >> On 8/21/24 8:23 PM, Salil Mehta wrote: >> >> >> >> On 8/21/24 2:40 AM, Salil Mehta wrote: >> >> > >> >> > I don’t understand this clearly. Are you suggesting to reuse only >> >> > single vCPU object to initialize all KVM vCPUs not yet plugged? If >> >> > yes, then I'm not sure what do we gain here by adding this complexity? >> >> > It does not consume time or resources because we are not realizing any >> >> > of these vCPU object in any case. >> >> > >> >> >> >> First of all, it seems we have different names and terms for those cold- >> >> booted vCPUs and hotpluggable vCPUs. For example, vCPU-0 and vCPU-1 >> >> are cold-booted vCPUs while >> >> vCPU-2 and vCPU-3 are hotpluggable vCPUs when we have '-smp >> >> maxcpus=4,cpus=2'. Lets stick to convention and terms for easier discussion. >> >> >> >> The idea is avoid instantiating hotpluggable vCPUs in virtmach_init() and >> >> released in the same function for those hotpluggable vCPUs. As I can >> >> understand, those hotpluggable vCPU instances are serving for two >> >> purposes: (1) Relax the constraint that all vCPU's (kvm) file descriptor have >> >> to be created and populated; >> > >> > >> > We are devising *workarounds* in Qemu for the ARM CPU architectural >> > constraints in KVM and in Guest Kernel, *not relaxing* them. We are >> > not allowed to meddle with the constraints. That is the whole point. >> > >> > Not having to respect those constraints led to rejection of the >> > earlier attempts to upstream Virtual CPU Hotplug for ARM. >> > >> >> I meant to 'overcome' the constraints by 'relax'. My apologies if there're any >> caused confusions. > > > Ok. No issues. It was important for me to clarify though. > > > Previously, you had attempt to create all vCPU objects >> and reuse them when vCPU hot added. > > Yes, at QOM level. But that approach did not realize the unplugged/yet-to-be-plugged > vCPUs. We were just using QOM vCPU objects as the place holders > Right, my point was actually vCPU objects are too heavy as the place holders. It was reason why I had the concern: why those hotpluggable vCPU objects can't be avoided during the bootup time. > In current implementation, the >> hotpluggable vCPUs are instantiated and released pretty soon. I was >> bringing the third possibility, to avoid instantiating those hotpluggable vCPU >> objects, for discussion. > > > Are you suggesting not calling KVM_ARM_VCPU_INIT IOCTL as all for the vCPUs > which are part of possible list but not yet plugged? > > If yes, we cannot do that as KVM vCPUs should be fully initialized even before > VGIC is initialized inside the KVM. This is a constraint. I've explained this in > detail in the cover letter of this patch-set and in the slides I have shared > earlier. > No, it's not what I was suggesting. What I suggested is to avoid creating those hotpluggable vCPU objects (place holders) during the bootup time. However, all vCPU file descriptors (KVM objects) still need to be created and initialized before GICv3 is initialized. It's one of the constrains. So we need to create and populate all vCPU file descriptors through ioctl(vm_fd, CREATE_VCPU) and ioctl(vcpu_fd, INIT_VCPU) before GICv3 object is created and realized. As I explained in the previous reply, the hotpluggable vCPU objects (place holders) haven't to be created in order to initialize and populate the vCPU file descriptors for those hotpluggable vCPUs. 
I think the parameters used to create and initialize vCPU-0's file descriptor can be reused by all other vCPUs, because we don't support heterogeneous vCPUs. What I suggested is something like below: the point is to avoid instantiating those hotpluggable vCPUs, but their vCPU file descriptor (KVM object) are still created and initialized. static void machvirt_init(MachineState *machine) { /* * Instantiate and realize vCPU-0, record the parameter passed to * ioctl(vcpu-fd, VCPU_INIT, &init), or a better place to remember the parameter. * The point is the parameter can be shared by all vCPUs. */ /* * Create vCPU descriptors for all other vCPUs (including hotpluggable vCPUs). * The remembered parameter is reused and passed to ioctl(vcpu-fd, VCPU_INIT, &init). */ /* Instanaite and realize other cold-booted vCPUs */ /* Instantiate and realize GICv3 */ } > > In this series, the life cycle of those hotpluggable >> vCPU objects are really short. Again, I didn't say we must avoid instantiating >> those vCPU objects, I brought the topic ONLY for discussion. > > Sure, I appreciate that. For the details of the reasons please follow below: > > 1. Cover Letter of this patch-set (Constraints are explained there) > 2. KVMForum Slides of 2020 and 2023 > > >> > (2) Help to instantiate and realize >> >> GICv3 object. >> >> >> >> For (1), I don't think we have to instantiate those hotpluggable vCPUs at all. >> >> In the above example where we have command line '-smp >> >> maxcpus=4,cpus=2', it's unnecessary to instantiate >> >> vCPU-3 and vCPU-4 to create and populate their KVM file descriptors. >> > >> > >> > We cannot defer create vCPU in KVM after GIC has been initialized in KVM. >> > It needs to know every vCPU that will ever exists right at the time it >> > is getting Initialized. This is an ARM CPU Architectural constraint. >> > >> >> It will be appreciated if more details other than 'an ARM CPU architectural constraint' >> can be provided. I don't understand this constrain very well at least. > > > We cannot do that as we MUST present KVM vCPUs to the VGIC fully initialized, > even before it starts its initialization. Initialization of the vCPUs also initializes > the system registers for the corresponding KVM vCPU. > > For example, MPIDR_EL1 must be initialized at VCPU INIT time. We cannot > avoid this. MPIDR value is used by VGIC during its initialization. This MUST be > present for all of the possible KVM vCPUs right from start during vgic_init() > > Please check the cover letter of this patch-set, I explained these there and the > KVMForum slides. Please review and comment there and let me know what is > not clear from the text. > It seems my suggestion wasn't fully understood. I was suggesting to avoid instantiating those hotpluggable vCPU objects (place holders) during QEMU startup time. All vCPU file descriptors (the vCPU's corresponding objects) still need to be in place before GICv3 object is initiated and realized. > >> > A >> >> vCPU's KVM file descriptor is create and populated by the following ioctls >> >> and function calls. When the first vCPU (vCPU-0) is realized, the property >> >> corresponding to "&init" is fixed for all vCPUs. It means all vCPUs have same >> >> properties except the "vcpu_index". >> >> >> >> ioctl(vm-fd, KVM_CREATE_VCPU, vcpu_index); >> >> ioctl(vcpu-fd, KVM_ARM_VCPU_INIT, &init); >> >> kvm_park_vcpu(cs); >> >> >> >> A vCPU's properties are determined by two sources and both are global. 
>> >> It means all vCPUs should have the same properties: (a) Feature
>> >> registers returned from the host. The function
>> >> kvm_arm_get_host_cpu_features() is called only once, meaning this
>> >> source is the same for all vCPUs;
>> >
>> >
>> > Sure, but what are you trying to save here?
>> >
>>
>> As mentioned above, the life cycle of those hotpluggable vCPU objects
>> is really short. They still consume time and memory to instantiate. If
>> I'm correct, one of the primary goals of the vCPU hotplug feature is to
>> save system boot-up time, correct?
>
>
> Correct. We targeted vCPU hotplug at Kata-containers for on-demand
> resource allocation and saving resources. Kata-containers can work with
> different types of VMMs, like QEMU, and microVMs, like Firecracker.
> AFAIK, the use case of a microVM is slightly different from that of
> normal containers. They are short-lived (say, around 15 minutes) and
> require ultra-fast boot-up times (say, less than 125 ms) - these figures
> are from Amazon, who invented the concept of the microVM in the last
> decade.
>
> With the current patches, we have only partially achieved what we had
> started, i.e. Kata/QEMU, but we also want to target Kata/microVM. In our
> case, we want that microVM to be QEMU-based for ARM. I think x86 has
> already removed a lot of legacy stuff and created a microVM in QEMU. I'm
> not sure how it compares against a true microVM like Firecracker. It
> would be a good target to reduce the memory footprint of the ARM QEMU
> virt machine, or to think of creating a new one just like x86. Using the
> vCPU hotplug feature we were able to drastically reduce the boot-up
> times of QEMU. Please check the calibrated performance figures on page
> 26 of the KVMForum 2023 slides [1].
>
> [1] https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>
> Last year, I prototyped a microVM for ARM; Michael Tsirkin suggested
> that if the performance numbers of the ARM virt machine can match the
> x86 microVM, then we might not require explicit microVM code for ARM.
> Hence, my current effort is to reduce the memory footprint of the
> existing virt machine. But I do have a rough prototype of a microVM as
> well. We can debate that later in a different discussion.
>

Thanks for the link to the slides. Yeah, it's nice to reduce the boot-up
time and memory footprint if possible. The vCPU hotplug feature may help
to improve the performance, but all the other paths might also impact the
performance. In summary, it's a comprehensive goal to reduce the memory
footprint and boot-up time, and other components (paths) need optimization
to achieve this goal.

>> >> (b) The parameters provided by the user through '-cpu host,sve=off'
>> >> are translated to global properties and applied to all vCPUs when
>> >> they're instantiated.
>> >
>> >
>> > Sure. The same is the case with the PMU and other per-vCPU
>> > parameters. We do not support heterogeneous computing and therefore
>> > we do not have per-vCPU control of these features as of now.
>> >
>> >
>> >>
>> >>     (a)
>> >>     aarch64_host_initfn
>> >>       kvm_arm_set_cpu_features_from_host
>> >>         kvm_arm_get_host_cpu_features
>> >>
>> >>     (b)
>> >>     qemu_init
>> >>       parse_cpu_option
>> >>         cpu_common_parse_features
>> >>           qdev_prop_register_global
>> >>     :
>> >>     device_post_init
>> >>       qdev_prop_set_globals
>> >
>> >
>> > Sure, I understand the code flow, but what are you trying to suggest
>> > here?
>> >
>>
>> I tried to explain that a vCPU object isn't needed to create and
>> populate a vCPU's file descriptors, as highlighted in (1).
>> The information used to create the cold-booted vCPU-0 can be reused
>> because all vCPUs have the same properties and feature set.
>
>
> It does not matter. We use those QOM vCPU object states to initialize
> the QEMU GICv3 model with the maximum possible vCPUs and then release
> the QOM vCPU objects which are yet to be plugged.
>

It's what has been implemented in this series. My concern remains: why
can't those hotpluggable vCPU objects be avoided? Again, their
corresponding vCPU file descriptors (KVM vCPU objects) still have to be in
place before the GICv3 is instantiated and realized.

>> >> For (2), I'm still looking into the GICv3 code for better
>> >> understanding.
>> >
>> >
>> > Oh, I thought you said you've finished your reviews 😊
>> >
>> > Please take your time. For your reference, you might want to check:
>> >
>> > KVMForum 2023:
>> > https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
>> > https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
>> >
>> > KVMForum 2020:
>> > https://kvm-forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%20ARM64_Salil%20Mehta.pdf
>> >
>>
>> hmm, 'finished my review' has frankly been misread. By that, I meant I
>> finished my tests and provided all the comments I had. Some of them are
>> questions and discussion points, which I still need to follow up on.
>
>
> Sure. No worries. Even if you miss something, you will have more chances
> to comment on the upcoming RFC V4 😊
>

Ok :-)

>> >> Until now, I don't see that we need the instantiated hotpluggable
>> >> vCPUs either.
>> >
>> >
>> > I think I've already answered this above; it is because of the ARM
>> > architectural constraint.
>> >
>> >
>> >> For example, the redistributor regions can be exposed based on
>> >> 'maxcpus' instead of 'cpus'.
>> >
>> > You mean that during the review of the code you found that we are not
>> > doing it?
>> >
>>
>> It's all about discussing the possibility of avoiding the instantiation
>> of hotpluggable vCPU objects.
>
>
> As mentioned above, with the current KVM code you cannot. But if we
> really want to, then perhaps we would need to change KVM.
>
> I might be wrong, but AFAICS I don't see a reason why we cannot have
> something like *batch* KVM vCPU create and initialize instead of the
> current sequential KVM operations. This would avoid multiple calls to
> the KVM host and could improve QEMU init time further. But it will
> require a separate discussion on LKML including all the KVM folks.
>
> This has the potential to delay acceptance of the vCPU hotplug feature,
> and I'm really not in favor of that. We have already stretched it a lot
> because of the standards-change acceptance earlier.
>

Again, my suggestion wasn't completely understood. I was suggesting to
avoid instantiating those hotpluggable objects, but their vCPU file
descriptors still need to be in place before GICv3's instantiation and
realization.

Yes, I was also concerned that too many code changes would be needed if my
suggestion were accepted. It would definitely delay the feature's
upstreaming process. That is why I said the topic (avoiding the
hotpluggable objects) is just for discussion now. We can do it (as a
separate optimization) after your current implementation is merged.

>> >> The IRQ connection and teardown can be dynamically done by
>> >> connecting the board with the GICv3 through callbacks in
>> >> ARMGICv3CommonClass.
>> >> The connection between GICv3CPUState and CPUARMState can also be
>> >> done dynamically.
>> >
>> > Are you suggesting this after reviewing the code, or do you have to
>> > review it yet? 😉
>> >
>>
>> I was actually trying to ask for your input and feedback. I was hoping
>> your input would clear up my puzzles: why must vCPU objects be in place
>> to create the GICv3 object? Is it possible to create the GICv3 object
>> without those vCPU objects?
>
>
> No. The VGIC initializes IRQs to target KVM vCPUs; it expects the same
> KVM vCPU MPIDR or MP-affinity that was configured when the KVM vCPUs
> were initialized in the first place, otherwise the VGIC initialization
> will not happen correctly. Hence the sequence.
>
> The sequence of these initializations is generally strictly controlled
> by the specification, which is closely tied to the hardware, including
> the power-up initializations. You will need to honor the expectations of
> the KVM VGIC init, which in turn is compliant with the ARM CPU
> architecture specification. It is not just loosely written code.
>

umm, as I explained from the beginning, all KVM vCPU file descriptors are
still in place before the GICv3 is instantiated and realized. With those
KVM vCPU file descriptors, we shouldn't have a problem, except that more
code changes are needed; or am I missing something? :)

>> What kind of effort would we need to avoid instantiating those
>> hotpluggable vCPU objects?
>
>
> I mentioned one of the ways above: introduce *batch* KVM vCPU create &
> initialize. But it will have to undergo greater scrutiny because we
> would be touching a common part which might affect many stakeholders.
> This is a discussion we can have later as part of the microVM for ARM
> work.
>

Remember, you just had the architecture-agnostic series merged. A vCPU
file descriptor can be parked and picked up at a later time based on the
vCPU index. Creating KVM vCPUs in this batch mode can be well supported.
What you need to do is record the parameters passed to ioctl(vm_fd,
KVM_CREATE_VCPU, index) and ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init) for
vCPU-0 in hw/arm/virt.c, and create all other vCPU file descriptors based
on the recorded parameters.

Thanks,
Gavin
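To make that suggestion concrete, a minimal sketch could look like the
following. This is an illustration only, not code from the patch-set:
vcpu0_init and park_vcpu_fd() are hypothetical names (QEMU's real parking
helpers in accel/kvm operate on a CPUState), while KVM_CREATE_VCPU and
KVM_ARM_VCPU_INIT are the real KVM ioctls.

    #include <errno.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Recorded when vCPU-0 is realized; shared by all vCPUs since the
     * virt machine does not support heterogeneous vCPUs. */
    extern struct kvm_vcpu_init vcpu0_init;

    /* Hypothetical helper: stash the fd so that a later device_add can
     * look it up by vCPU index. */
    extern void park_vcpu_fd(unsigned long vcpu_id, int fd);

    static int create_hotpluggable_vcpu_fds(int vm_fd, int cpus,
                                            int max_cpus)
    {
        for (int n = cpus; n < max_cpus; n++) {
            /* KVM_CREATE_VCPU returns the new vCPU's file descriptor */
            int vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, n);
            if (vcpu_fd < 0) {
                return -errno;
            }

            /* Reuse vCPU-0's recorded target and feature set */
            if (ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &vcpu0_init) < 0) {
                return -errno;
            }

            /* Park the fd; no QOM ARMCPU object is ever instantiated */
            park_vcpu_fd(n, vcpu_fd);
        }

        return 0;
    }

Whether having only the fds in place (with no QOM state) by the time the
VGIC initializes would be sufficient is exactly the point of contention in
the reply below.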
Hi Gavin,

> From: Gavin Shan <gshan@redhat.com>
> Sent: Friday, August 23, 2024 11:52 AM
> To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
> qemu-arm@nongnu.org; mst@redhat.com
>
> Hi Salil,
>
> On 8/22/24 8:58 PM, Salil Mehta wrote:
> >> On 8/21/24 8:23 PM, Salil Mehta wrote:
> >> >>
> >> >> On 8/21/24 2:40 AM, Salil Mehta wrote:
> >> >> >
> >> >> > I don't understand this clearly. Are you suggesting to reuse
> >> >> > only a single vCPU object to initialize all KVM vCPUs not yet
> >> >> > plugged? If yes, then I'm not sure what we gain by adding this
> >> >> > complexity. It does not consume time or resources because we are
> >> >> > not realizing any of these vCPU objects in any case.
> >> >> >
> >> >>
> >> >> First of all, it seems we have different names and terms for the
> >> >> cold-booted vCPUs and the hotpluggable vCPUs. For example, vCPU-0
> >> >> and vCPU-1 are cold-booted vCPUs while vCPU-2 and vCPU-3 are
> >> >> hotpluggable vCPUs when we have '-smp maxcpus=4,cpus=2'. Let's
> >> >> stick to this convention and these terms for easier discussion.
> >> >>
> >> >> The idea is to avoid instantiating hotpluggable vCPUs in
> >> >> machvirt_init(), where they are currently instantiated and then
> >> >> released in the same function. As I understand it, those
> >> >> hotpluggable vCPU instances serve two purposes: (1) relax the
> >> >> constraint that all vCPUs' (KVM) file descriptors have to be
> >> >> created and populated;
> >> >
> >> >
> >> > We are devising *workarounds* in QEMU for the ARM CPU architectural
> >> > constraints in KVM and in the guest kernel, *not relaxing* them. We
> >> > are not allowed to meddle with the constraints. That is the whole
> >> > point.
> >> >
> >> > Not respecting those constraints led to the rejection of the
> >> > earlier attempts to upstream virtual CPU hotplug for ARM.
> >> >
> >>
> >> By 'relax' I meant 'overcome' the constraints. My apologies for any
> >> confusion caused.
> >
> >
> > Ok. No issues. It was important for me to clarify, though.
> >
> >
> >> Previously, you had attempted to create all vCPU objects and reuse
> >> them when a vCPU is hot added.
> >
> > Yes, at the QOM level. But that approach did not realize the
> > unplugged/yet-to-be-plugged vCPUs. We were just using the QOM vCPU
> > objects as placeholders.
> >
>
> Right, my point was that vCPU objects are too heavy as placeholders.
> That was the reason for my concern: why can't those hotpluggable vCPU
> objects be avoided during boot-up time?

Sure, to list it down again, for the reasons I've already explained:

1. KVM MUST have the details of all the vCPUs and their features
   finalized before the VGIC initialization. This is an ARM CPU
   architectural constraint. It is an immutable requirement!
2. The QOM vCPUs carry representational state of the KVM vCPU, like the
   vcpu-id, the feature list, etc. These MUST be finalized even before
   QOM begins its own GICv3 initialization, which ends up initializing
   the KVM VGIC. The QOM vCPUs MUST be handed over to the QOM GICv3 in a
   fully initialized state. The same reason applies here as well.

Up to this point we are architecture compliant. The only place where we
are not ARM architecture compliant is where, in QOM, we dissociate the
vCPU state from the GICv3 CPU state during the unplug action or for the
yet-to-be-plugged vCPUs. The latter are released as part of
virt_cpu_post_init() in our current design.

> >> In the current implementation, the hotpluggable vCPUs are
> >> instantiated and released pretty soon afterwards.
> >> I was bringing up a third possibility, to avoid instantiating those
> >> hotpluggable vCPU objects, for discussion.
> >
> >
> > Are you suggesting not calling the KVM_ARM_VCPU_INIT ioctl at all for
> > the vCPUs which are part of the possible list but not yet plugged?
> >
> > If yes, we cannot do that, as KVM vCPUs must be fully initialized even
> > before the VGIC is initialized inside KVM. This is a constraint. I've
> > explained this in detail in the cover letter of this patch-set and in
> > the slides I shared earlier.
> >
>
> No, that is not what I was suggesting. What I suggested is to avoid
> creating those hotpluggable vCPU objects (placeholders) during boot-up
> time. However, all vCPU file descriptors (KVM objects) still need to be
> created and initialized before the GICv3 is initialized; that is one of
> the constraints. So we need to create and populate all vCPU file
> descriptors through ioctl(vm_fd, KVM_CREATE_VCPU) and ioctl(vcpu_fd,
> KVM_ARM_VCPU_INIT) before the GICv3 object is created and realized. As
> I explained in the previous reply, the hotpluggable vCPU objects
> (placeholders) do not have to be created in order to initialize and
> populate the vCPU file descriptors for those hotpluggable vCPUs. I
> think the parameters used to create and initialize vCPU-0's file
> descriptor can be reused by all other vCPUs, because we don't support
> heterogeneous vCPUs.
>
> What I suggested is something like below; the point is to avoid
> instantiating those hotpluggable vCPUs, while their vCPU file
> descriptors (KVM objects) are still created and initialized.
>
> static void machvirt_init(MachineState *machine)
> {
>     /*
>      * Instantiate and realize vCPU-0; record the parameter passed to
>      * ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init), or find a better place
>      * to remember the parameter. The point is that the parameter can
>      * be shared by all vCPUs.
>      */
>
>     /*
>      * Create vCPU file descriptors for all other vCPUs (including the
>      * hotpluggable vCPUs). The remembered parameter is reused and
>      * passed to ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init).
>      */
>
>     /* Instantiate and realize the other cold-booted vCPUs */
>
>     /* Instantiate and realize GICv3 */
> }

No. For the reasons I've mentioned above, we MUST provide fully
initialized QOM vCPU objects before the initialization of the QOM GICv3
kicks in. This ensures that nothing breaks during the initialization
process of the QOM GICv3. Therefore, the optimization steps mentioned
above are unnecessary and could cause problems in the future.

Additionally, the evolution of the QOM GICv3 can be independent of the ARM
virt machine, as it can be used with other machines as well, so we MUST
treat it as a black box which needs QOM vCPU objects as inputs during its
initialization.

> >> In this series, the life cycle of those hotpluggable vCPU objects is
> >> really short. Again, I didn't say we must avoid instantiating those
> >> vCPU objects; I brought the topic up ONLY for discussion.
> >
> > Sure, I appreciate that. For the details of the reasons please follow
> > the below:
> >
> > 1. Cover letter of this patch-set (the constraints are explained
> >    there)
> > 2. KVMForum slides of 2020 and 2023
> >
> >
> >> >> (2) Help to instantiate and realize the GICv3 object.
> >> >>
> >> >> For (1), I don't think we have to instantiate those hotpluggable
> >> >> vCPUs at all. In the above example, where we have the command line
> >> >> '-smp maxcpus=4,cpus=2', it's unnecessary to instantiate vCPU-2
> >> >> and vCPU-3 to create and populate their KVM file descriptors.
> >> >
> >> >
> >> > We cannot defer creating vCPUs in KVM after the GIC has been
> >> > initialized in KVM. It needs to know every vCPU that will ever
> >> > exist right at the time it is getting initialized. This is an ARM
> >> > CPU architectural constraint.
> >> >
> >>
> >> It would be appreciated if more details other than 'an ARM CPU
> >> architectural constraint' could be provided. I don't understand this
> >> constraint very well, at least.
> >
> >
> > We cannot do that, as we MUST present the KVM vCPUs to the VGIC fully
> > initialized, even before it starts its own initialization.
> > Initialization of the vCPUs also initializes the system registers of
> > the corresponding KVM vCPU.
> >
> > For example, MPIDR_EL1 must be initialized at VCPU INIT time. We
> > cannot avoid this. The MPIDR value is used by the VGIC during its
> > initialization, so it MUST be present for all of the possible KVM
> > vCPUs right from the start, during vgic_init().
> >
> > Please check the cover letter of this patch-set, where I explained all
> > of this, and the KVMForum slides. Please review and comment there and
> > let me know what is not clear from the text.
> >
>
> It seems my suggestion wasn't fully understood. I was suggesting to
> avoid instantiating those hotpluggable vCPU objects (placeholders)
> during QEMU start-up time. All vCPU file descriptors (the vCPUs'
> corresponding KVM objects) still need to be in place before the GICv3
> object is instantiated and realized.

We need both the QOM and the KVM vCPU objects fully initialized even
before the QOM GICv3, and hence the KVM VGIC, starts to initialize itself,
for the reasons I've explained above. Yes, there is duplication,
especially since we do not support heterogeneous computing, but let us not
add the weight of that optimization to the current patch-set.

> >> >> A vCPU's KVM file descriptor is created and populated by the
> >> >> following ioctls and function calls. When the first vCPU (vCPU-0)
> >> >> is realized, the property corresponding to "&init" is fixed for
> >> >> all vCPUs. It means all vCPUs have the same properties except the
> >> >> "vcpu_index":
> >> >>
> >> >>     ioctl(vm_fd, KVM_CREATE_VCPU, vcpu_index);
> >> >>     ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
> >> >>     kvm_park_vcpu(cs);
> >> >>
> >> >> A vCPU's properties are determined by two sources, and both are
> >> >> global. It means all vCPUs should have the same properties: (a)
> >> >> Feature registers returned from the host. The function
> >> >> kvm_arm_get_host_cpu_features() is called only once, meaning this
> >> >> source is the same for all vCPUs;
> >> >
> >> >
> >> > Sure, but what are you trying to save here?
> >> >
> >>
> >> As mentioned above, the life cycle of those hotpluggable vCPU objects
> >> is really short. They still consume time and memory to instantiate.
> >> If I'm correct, one of the primary goals of the vCPU hotplug feature
> >> is to save system boot-up time, correct?
> >
> >
> > Correct. We targeted vCPU hotplug at Kata-containers for on-demand
> > resource allocation and saving resources. Kata-containers can work
> > with different types of VMMs, like QEMU, and microVMs, like
> > Firecracker. AFAIK, the use case of a microVM is slightly different
> > from that of normal containers. They are short-lived (say, around 15
> > minutes) and require ultra-fast boot-up times (say, less than 125 ms)
> > - these figures are from Amazon, who invented the concept of the
> > microVM in the last decade.
> >
> > With the current patches, we have only partially achieved what we had
> > started, i.e. Kata/QEMU, but we also want to target Kata/microVM.
In our > > case, we want that microVM to be Qemu based fro ARM. I think x86 > > already has reduced lots of legacy stuff and created a microVM in > > Qemu. I'm not sure how it compares against the true microVM like > > Firecracker. It will be a good target to reduce memory foot print of > > ARM Qemu Virt Machine. or think or creating a new one just like x86. > > Using the vCPU Hotplug feature we were drastically able to reduce the > > boot up times of Qemu. Please check the calibrated performance figures > > in KVmForum 2023 slide 19G (Page 26) [1] > > > > [1] > > https://kvm- > forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Vir > > t_CPU_Hotplug_-__ii0iNb3.pdf > > > > Last year, I had prototyped a microVM for ARM, Michael Tsirkin > > suggested that if the performance number of the ARM Virt machine can > > match the x86 microVM then we might not require an explicit microVM > > code for ARM. Hence, my current efforts are to reduce the memory foot > > print of existing Virt machine. But I do have a rough prototype of > > microVM as well. We can debate about that later in a different discussion. > > > > Thanks for the linker to the slides. Yeah, it's nice to reduce the bootup time > and memory footprint if possible. vCPU hotplug feature may help to > improve the performance, but all other paths might also impact the > performance. In summary, it's a comprehensive goal to reduce the memory > footprint and bootup time, and other components (paths) need > optimization to achieve this goal. Agreed, there is a lot of scope for that. But let's prioritize and do first things first. 1. We want this patch-set to get accepted after all fixes. This is the main goal. 2. Then think of optimizations like bootup and memory footprint reduction. > >> >> (b) The parameters provided by user through '-cpu host,sve=off' are > >> >> translated to global properties and applied to all vCPUs when they're > >> >> instantiated. > >> > > >> > > >> > Sure. Same is the case with PMU and other per-vCPU parameters. > >> > We do not support heterogenous computing and therefore we do not have > >> > per-vCPU control of these features as of now. > >> > > >> > > >> >> > >> >> (a) (b) > >> >> > >> >> aarch64_host_initfn qemu_init > >> >> kvm_arm_set_cpu_features_from_host parse_cpu_option > >> >> kvm_arm_get_host_cpu_features > >> cpu_common_parse_features > >> >> qdev_prop_register_global > >> >> : > >> >> device_post_init > >> >> > >> >> qdev_prop_set_globals > >> > > >> > > >> > Sure, I understand the code flow but what are you trying to suggest here? > >> > > >> > >> I tried to explain that vCPU object isn't needed to create and populate > >> vCPU's file descriptors, as highlight in (1). The information used to create the > >> cold-booted > >> vCPU-0 can be reused because all vCPUs have same properties and feature > >> set. > > > > > > It does not matter. We use those QOM vCPU object states to initializes > > Qemu > > GICv3 model with max possible vCPUs and then release the QOM vCPU objects. > > which are yet-to-be-plugged. > > > > It's what has been implemented in this series. My concern remains: why > those vCPU hotpluggable objects can't be avoided? Again, their > corresponding vCPU file descritpors (KVM vCPU objects) still have to be in > place before GICv3 is instantiated and realized. Answered above. > >> >> For (2), I'm still looking into the GICv3 code for better understanding. > >> > > >> > > >> > Oh, I thought you said you've finished your reviews 😊 > >> > > >> > Please take your time. 
> >> > For your reference, you might want to check:
> >> >
> >> > KVMForum 2023:
> >> > https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
> >> > https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
> >> >
> >> > KVMForum 2020:
> >> > https://kvm-forum.qemu.org/2020/Oct%2029_Challenges%20in%20Supporting%20Virtual%20CPU%20Hotplug%20in%20SoC%20Based%20Systems%20like%20ARM64_Salil%20Mehta.pdf
> >> >
> >>
> >> hmm, 'finished my review' has frankly been misread. By that, I meant
> >> I finished my tests and provided all the comments I had. Some of them
> >> are questions and discussion points, which I still need to follow up
> >> on.
> >
> >
> > Sure. No worries. Even if you miss something, you will have more
> > chances to comment on the upcoming RFC V4 😊
> >
>
> Ok :-)
>
> >> >> Until now, I don't see that we need the instantiated hotpluggable
> >> >> vCPUs either.
> >> >
> >> >
> >> > I think I've already answered this above; it is because of the ARM
> >> > architectural constraint.
> >> >
> >> >
> >> >> For example, the redistributor regions can be exposed based on
> >> >> 'maxcpus' instead of 'cpus'.
> >> >
> >> > You mean that during the review of the code you found that we are
> >> > not doing it?
> >> >
> >>
> >> It's all about discussing the possibility of avoiding the
> >> instantiation of hotpluggable vCPU objects.
> >
> >
> > As mentioned above, with the current KVM code you cannot. But if we
> > really want to, then perhaps we would need to change KVM.
> >
> > I might be wrong, but AFAICS I don't see a reason why we cannot have
> > something like *batch* KVM vCPU create and initialize instead of the
> > current sequential KVM operations. This would avoid multiple calls to
> > the KVM host and could improve QEMU init time further. But it will
> > require a separate discussion on LKML including all the KVM folks.
> >
> > This has the potential to delay acceptance of the vCPU hotplug
> > feature, and I'm really not in favor of that. We have already
> > stretched it a lot because of the standards-change acceptance earlier.
> >
>
> Again, my suggestion wasn't completely understood. I was suggesting to
> avoid instantiating those hotpluggable objects, but their vCPU file
> descriptors still need to be in place before GICv3's instantiation and
> realization.
>
> Yes, I was also concerned that too many code changes would be needed if
> my suggestion were accepted. It would definitely delay the feature's
> upstreaming process. That is why I said the topic (avoiding the
> hotpluggable objects) is just for discussion now. We can do it (as a
> separate optimization) after your current implementation is merged.

Answered above, why we cannot and should not do that.

> >> >> The IRQ connection and teardown can be dynamically done by
> >> >> connecting the board with the GICv3 through callbacks in
> >> >> ARMGICv3CommonClass. The connection between GICv3CPUState and
> >> >> CPUARMState can also be done dynamically.
> >> >
> >> > Are you suggesting this after reviewing the code, or do you have to
> >> > review it yet? 😉
> >> >
> >>
> >> I was actually trying to ask for your input and feedback. I was
> >> hoping your input would clear up my puzzles: why must vCPU objects be
> >> in place to create the GICv3 object? Is it possible to create the
> >> GICv3 object without those vCPU objects?
> >
> >
> > No.
> > The VGIC initializes IRQs to target KVM vCPUs; it expects the same KVM
> > vCPU MPIDR or MP-affinity that was configured when the KVM vCPUs were
> > initialized in the first place, otherwise the VGIC initialization will
> > not happen correctly. Hence the sequence.
> >
> > The sequence of these initializations is generally strictly controlled
> > by the specification, which is closely tied to the hardware, including
> > the power-up initializations. You will need to honor the expectations
> > of the KVM VGIC init, which in turn is compliant with the ARM CPU
> > architecture specification. It is not just loosely written code.
> >
>
> umm, as I explained from the beginning, all KVM vCPU file descriptors
> are still in place before the GICv3 is instantiated and realized. With
> those KVM vCPU file descriptors, we shouldn't have a problem, except
> that more code changes are needed; or am I missing something? :)

Same as explained above.

> >> What kind of effort would we need to avoid instantiating those
> >> hotpluggable vCPU objects?
> >
> >
> > I mentioned one of the ways above: introduce *batch* KVM vCPU create &
> > initialize. But it will have to undergo greater scrutiny because we
> > would be touching a common part which might affect many stakeholders.
> > This is a discussion we can have later as part of the microVM for ARM
> > work.
> >
>
> Remember, you just had the architecture-agnostic series merged. A vCPU
> file descriptor can be parked and picked up at a later time based on the
> vCPU index. Creating KVM vCPUs in this batch mode can be well supported.
> What you need to do is record the parameters passed to ioctl(vm_fd,
> KVM_CREATE_VCPU, index) and ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init) for
> vCPU-0 in hw/arm/virt.c, and create all other vCPU file descriptors
> based on the recorded parameters.

Yes, but we would need to consider many other things, and we cannot decide
that here. Let's cross that bridge when we come to it. I'd suggest we park
this part of the discussion for now.

Thanks,
Salil

> Thanks,
> Gavin
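For reference, the fd parking that both sides keep referring to can be
pictured with a short sketch. This is a simplified illustration of the
idea behind the merged architecture-agnostic series, not the exact
upstream code: the helper names and fields are approximations, and it
assumes QEMU's QLIST macros (qemu/queue.h), GLib's g_new0()/g_free(), and
a kvm_parked_vcpus list hanging off KVMState.

    struct KVMParkedVcpu {
        unsigned long vcpu_id;          /* architecture-specific vCPU id */
        int kvm_fd;                     /* fd returned by KVM_CREATE_VCPU */
        QLIST_ENTRY(KVMParkedVcpu) node;
    };

    /* On unplug (or after boot-time pre-sizing): detach the fd from the
     * QOM object and stash it on the parked list. */
    static void park_vcpu_fd(KVMState *s, unsigned long vcpu_id, int fd)
    {
        struct KVMParkedVcpu *vcpu = g_new0(struct KVMParkedVcpu, 1);

        vcpu->vcpu_id = vcpu_id;
        vcpu->kvm_fd = fd;
        QLIST_INSERT_HEAD(&s->kvm_parked_vcpus, vcpu, node);
    }

    /* On hotplug: look up the parked fd by vCPU id instead of calling
     * KVM_CREATE_VCPU again, which ARM KVM refuses once the VGIC has
     * been initialized. */
    static int unpark_vcpu_fd(KVMState *s, unsigned long vcpu_id)
    {
        struct KVMParkedVcpu *vcpu;

        QLIST_FOREACH(vcpu, &s->kvm_parked_vcpus, node) {
            if (vcpu->vcpu_id == vcpu_id) {
                int fd = vcpu->kvm_fd;

                QLIST_REMOVE(vcpu, node);
                g_free(vcpu);
                return fd;
            }
        }

        return -ENOENT;   /* not parked */
    }

In this scheme the QOM ARMCPU object newly created by device_add simply
adopts the unparked fd, which is what decouples the QOM object's life
cycle from that of the KVM vCPU.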
Hi Salil,

On 8/23/24 11:17 PM, Salil Mehta wrote:
>> On 8/22/24 8:58 PM, Salil Mehta wrote:
>> >> On 8/21/24 8:23 PM, Salil Mehta wrote:
>> >> >>
>> >> >> On 8/21/24 2:40 AM, Salil Mehta wrote:
>> >> >> >
>> >> >> > I don't understand this clearly. Are you suggesting to reuse
>> >> >> > only a single vCPU object to initialize all KVM vCPUs not yet
>> >> >> > plugged? If yes, then I'm not sure what we gain by adding this
>> >> >> > complexity. It does not consume time or resources because we
>> >> >> > are not realizing any of these vCPU objects in any case.
>> >> >> >
>> >> >>
>> >> >> First of all, it seems we have different names and terms for the
>> >> >> cold-booted vCPUs and the hotpluggable vCPUs. For example, vCPU-0
>> >> >> and vCPU-1 are cold-booted vCPUs while vCPU-2 and vCPU-3 are
>> >> >> hotpluggable vCPUs when we have '-smp maxcpus=4,cpus=2'. Let's
>> >> >> stick to this convention and these terms for easier discussion.
>> >> >>
>> >> >> The idea is to avoid instantiating hotpluggable vCPUs in
>> >> >> machvirt_init(), where they are currently instantiated and then
>> >> >> released in the same function. As I understand it, those
>> >> >> hotpluggable vCPU instances serve two purposes: (1) relax the
>> >> >> constraint that all vCPUs' (KVM) file descriptors have to be
>> >> >> created and populated;
>> >> >
>> >> >
>> >> > We are devising *workarounds* in QEMU for the ARM CPU
>> >> > architectural constraints in KVM and in the guest kernel, *not
>> >> > relaxing* them. We are not allowed to meddle with the constraints.
>> >> > That is the whole point.
>> >> >
>> >> > Not respecting those constraints led to the rejection of the
>> >> > earlier attempts to upstream virtual CPU hotplug for ARM.
>> >> >
>> >>
>> >> By 'relax' I meant 'overcome' the constraints. My apologies for any
>> >> confusion caused.
>> >
>> >
>> > Ok. No issues. It was important for me to clarify, though.
>> >
>> >
>> >> Previously, you had attempted to create all vCPU objects and reuse
>> >> them when a vCPU is hot added.
>> >
>> > Yes, at the QOM level. But that approach did not realize the
>> > unplugged/yet-to-be-plugged vCPUs. We were just using the QOM vCPU
>> > objects as placeholders.
>> >
>>
>> Right, my point was that vCPU objects are too heavy as placeholders.
>> That was the reason for my concern: why can't those hotpluggable vCPU
>> objects be avoided during boot-up time?
>
>
> Sure, to list it down again, for the reasons I've already explained:
>
> 1. KVM MUST have the details of all the vCPUs and their features
>    finalized before the VGIC initialization. This is an ARM CPU
>    architectural constraint. It is an immutable requirement!
> 2. The QOM vCPUs carry representational state of the KVM vCPU, like the
>    vcpu-id, the feature list, etc. These MUST be finalized even before
>    QOM begins its own GICv3 initialization, which ends up initializing
>    the KVM VGIC. The QOM vCPUs MUST be handed over to the QOM GICv3 in
>    a fully initialized state. The same reason applies here as well.
>
> Up to this point we are architecture compliant. The only place where we
> are not ARM architecture compliant is where, in QOM, we dissociate the
> vCPU state from the GICv3 CPU state during the unplug action or for the
> yet-to-be-plugged vCPUs. The latter are released as part of
> virt_cpu_post_init() in our current design.
>

Thanks for taking the time to evaluate and reply. Sorry, but I'm not
convinced. You're explaining what we already have in the current design
and implementation. It doesn't mean the current design and implementation
are 100% perfect and flawless.
In the current design and implementation, all (QOM) vCPU objects have to
be instantiated, even though the hotpluggable (QOM) vCPU objects aren't
realized at boot-up time. After that, those hotpluggable (QOM) vCPU
objects are finalized and destroyed so that they can be hot added
afterwards. This sounds like we create (QOM) vCPU objects and then remove
them at boot time just so that they can be hot-added at run time. The
duplicate work is obvious there. This scheme and design looks
just-for-working, not optimized, to me. I'm giving up the effort to
convince you. Let's see whether other reviewers have the same concern or
not.

Another possibility would be to have the current implementation (with all
the fixes) merged upstream first, and then seek optimizations after that.
At that time, the suggestion can be re-evaluated.

>> >> In the current implementation, the hotpluggable vCPUs are
>> >> instantiated and released pretty soon afterwards. I was bringing up
>> >> a third possibility, to avoid instantiating those hotpluggable vCPU
>> >> objects, for discussion.
>> >
>> >
>> > Are you suggesting not calling the KVM_ARM_VCPU_INIT ioctl at all for
>> > the vCPUs which are part of the possible list but not yet plugged?
>> >
>> > If yes, we cannot do that, as KVM vCPUs must be fully initialized
>> > even before the VGIC is initialized inside KVM. This is a constraint.
>> > I've explained this in detail in the cover letter of this patch-set
>> > and in the slides I shared earlier.
>> >
>>
>> No, that is not what I was suggesting. What I suggested is to avoid
>> creating those hotpluggable vCPU objects (placeholders) during boot-up
>> time. However, all vCPU file descriptors (KVM objects) still need to be
>> created and initialized before the GICv3 is initialized; that is one of
>> the constraints. So we need to create and populate all vCPU file
>> descriptors through ioctl(vm_fd, KVM_CREATE_VCPU) and ioctl(vcpu_fd,
>> KVM_ARM_VCPU_INIT) before the GICv3 object is created and realized. As
>> I explained in the previous reply, the hotpluggable vCPU objects
>> (placeholders) do not have to be created in order to initialize and
>> populate the vCPU file descriptors for those hotpluggable vCPUs. I
>> think the parameters used to create and initialize vCPU-0's file
>> descriptor can be reused by all other vCPUs, because we don't support
>> heterogeneous vCPUs.
>>
>> What I suggested is something like below; the point is to avoid
>> instantiating those hotpluggable vCPUs, while their vCPU file
>> descriptors (KVM objects) are still created and initialized.
>>
>> static void machvirt_init(MachineState *machine)
>> {
>>     /*
>>      * Instantiate and realize vCPU-0; record the parameter passed to
>>      * ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init), or find a better
>>      * place to remember the parameter. The point is that the
>>      * parameter can be shared by all vCPUs.
>>      */
>>
>>     /*
>>      * Create vCPU file descriptors for all other vCPUs (including the
>>      * hotpluggable vCPUs). The remembered parameter is reused and
>>      * passed to ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init).
>>      */
>>
>>     /* Instantiate and realize the other cold-booted vCPUs */
>>
>>     /* Instantiate and realize GICv3 */
>> }
>
>
> No. For the reasons I've mentioned above, we MUST provide fully
> initialized QOM vCPU objects before the initialization of the QOM GICv3
> kicks in. This ensures that nothing breaks during the initialization
> process of the QOM GICv3. Therefore, the optimization steps mentioned
> above are unnecessary and could cause problems in the future.
>
> Additionally, the evolution of the QOM GICv3 can be independent of the
> ARM virt machine, as it can be used with other machines as well, so we
> MUST treat it as a black box which needs QOM vCPU objects as inputs
> during its initialization.
>

Your explanation is not completely correct, to me. It's what we have in
the current design; it doesn't mean the design has to be like this. The
(KVM) vCPU file descriptors must be in place before the QOM GICv3 object
is instantiated and realized, but the (QOM) vCPU objects don't have to
exist before that. However, we may need a lot of code changes to tame
GICv3Common and GICv3KVM so that they become vCPU hot-add/remove friendly,
and then to avoid the pre-created (QOM) vCPU objects. It can be something
to re-evaluate in the future, as I said above.

Frankly, it's pointless to spend our time on discussions without reaching
any conclusions. From my side, I've tried my best to provide my comments
and explain my thoughts. I appreciate your patience, your time spent on
evaluation, and your replies :)

Thanks,
Gavin
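As a closing illustration of the flow the thread keeps coming back to, the
user-visible hotplug sequence looks roughly like the following. The
command lines are illustrative only; the exact device type name
(host-arm-cpu here) and properties are defined by the patch-set and may
change before merge:

    # Boot with 2 of 4 possible vCPUs; vCPU-2 and vCPU-3 are hotpluggable
    qemu-system-aarch64 -machine virt,gic-version=3 -enable-kvm \
        -cpu host -smp cpus=2,maxcpus=4 ...

    # Hot-add vCPU-2: a QOM ARMCPU object is created via device_add and
    # re-attached to the KVM vCPU fd that was parked at boot
    (qemu) device_add host-arm-cpu,id=cpu2,core-id=2

    # Hot-remove it again: the QOM object is destroyed, the fd re-parked
    (qemu) device_del cpu2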