[v3] RE: [PATCH RFC V3 00/29] Support of Virtual CPU Hotplug for ARMv8 Arch

Posted by Salil Mehta via 3 months, 2 weeks ago
Hi Gavin,

>  From: Gavin Shan <gshan@redhat.com>
>  Sent: Thursday, August 8, 2024 1:29 AM
>  To: Salil Mehta <salil.mehta@huawei.com>; qemu-devel@nongnu.org;
>  qemu-arm@nongnu.org; mst@redhat.com
>  
>  Hi Salil,
>  
>  On 8/8/24 9:48 AM, Salil Mehta wrote:
>  >>   On 8/7/24 11:27 PM, Salil Mehta wrote:
>  >>   >
>  >>   > Let me figure out this. Have you also included the below patch along
>  >>   > with the architecture agnostic patch-set accepted in this Qemu cycle?
>  >>   >
>  >>   > https://lore.kernel.org/all/20240801142322.3948866-3-peter.maydell@linaro.org/
>  >>   >
>  >>
>  >>   There are no vCPU fds to be parked and unparked when the core dump
>  >>   happens. I tried it, but it didn't help. I added more debugging
>  >>   messages, and the core dump is triggered in the following path. It seems
>  >>   'cpu->sve_vq.map' isn't correct since it's populated in the CPU
>  >>   realization path, and those non-cold-booted CPUs aren't realized in the
>  >>   booting stage.
>  >
>  >
>  > Ah, I have to fix the SVE support. I'm already working on it, and the fix
>  > will be part of the RFC V4.
>  >
>  > Have you tried booting the VM with SVE support disabled?
>  >
>  
>  I'm able to boot the guest after SVE is disabled by clearing the
>  corresponding bits in ID_AA64PFR0, as below.
>  
>  static bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
>  {
>       :
>  
>       /*
>        * SVE is explicitly disabled. Otherwise, the non-cold-booted
>        * CPUs can't be initialized in the vCPU hotplug scenario.
>        */
>       err = read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64pfr0,
>                            ARM64_SYS_REG(3, 0, 0, 4, 0));
>       ahcf->isar.id_aa64pfr0 &= ~R_ID_AA64PFR0_SVE_MASK;
>  }
>  
>  However, I'm unable to hot-add a vCPU and haven't had a chance to look at
>  it closely.
>  
>  (qemu) device_add host-arm-cpu,id=cpu,socket-id=1
>  (qemu) [  258.901027] Unable to handle kernel write to read-only memory at virtual address ffff800080fa7190
>  [  258.901686] Mem abort info:
>  [  258.901889]   ESR = 0x000000009600004e
>  [  258.902160]   EC = 0x25: DABT (current EL), IL = 32 bits
>  [  258.902543]   SET = 0, FnV = 0
>  [  258.902763]   EA = 0, S1PTW = 0
>  [  258.902991]   FSC = 0x0e: level 2 permission fault
>  [  258.903338] Data abort info:
>  [  258.903547]   ISV = 0, ISS = 0x0000004e, ISS2 = 0x00000000
>  [  258.903943]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
>  [  258.904304]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>  [  258.904687] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000b8e24000
>  [  258.905258] [ffff800080fa7190] pgd=10000000b95b0003, p4d=10000000b95b0003, pud=10000000b95b1003, pmd=00600000b8c00781
>  [  258.906026] Internal error: Oops: 000000009600004e [#1] PREEMPT SMP
>  [  258.906474] Modules linked in:
>  [  258.906705] CPU: 0 UID: 0 PID: 29 Comm: kworker/u8:1 Not tainted 6.11.0-rc2-gavin-gb446a2dae984 #7
>  [  258.907338] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-stable202402-prebuilt.qemu.org 02/14/2024
>  [  258.908009] Workqueue: kacpi_hotplug acpi_hotplug_work_fn
>  [  258.908401] pstate: 63400005 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
>  [  258.908899] pc : register_cpu+0x140/0x290
>  [  258.909195] lr : register_cpu+0x128/0x290
>  [  258.909487] sp : ffff8000817fba10
>  [  258.909727] x29: ffff8000817fba10 x28: 0000000000000000 x27: ffff0000011f9098
>  [  258.910246] x26: ffff80008167b1b0 x25: 0000000000000001 x24: ffff80008153dad0
>  [  258.910762] x23: 0000000000000001 x22: ffff0000ff7de210 x21: ffff8000811b9a00
>  [  258.911279] x20: 0000000000000000 x19: ffff800080fa7190 x18: ffffffffffffffff
>  [  258.911798] x17: 0000000000000000 x16: 0000000000000000 x15: ffff000005a46a1c
>  [  258.912326] x14: ffffffffffffffff x13: ffff000005a4632b x12: 0000000000000000
>  [  258.912854] x11: 0000000000000040 x10: 0000000000000000 x9 : ffff8000808a6cd4
>  [  258.913382] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefefefefefeff
>  [  258.913906] x5 : ffff0000053fab40 x4 : ffff0000053fa920 x3 : ffff0000053fabb0
>  [  258.914439] x2 : ffff000000de1100 x1 : ffff800080fa7190 x0 : 0000000000000002
>  [  258.914968] Call trace:
>  [  258.915154]  register_cpu+0x140/0x290
>  [  258.915429]  arch_register_cpu+0x84/0xd8
>  [  258.915726]  acpi_processor_add+0x480/0x5b0
>  [  258.916042]  acpi_bus_attach+0x1c4/0x300
>  [  258.916334]  acpi_dev_for_one_check+0x3c/0x50
>  [  258.916689]  device_for_each_child+0x68/0xc8
>  [  258.917012]  acpi_dev_for_each_child+0x48/0x80
>  [  258.917344]  acpi_bus_attach+0x84/0x300
>  [  258.917629]  acpi_bus_scan+0x74/0x220
>  [  258.917902]  acpi_scan_rescan_bus+0x54/0x88
>  [  258.918211]  acpi_device_hotplug+0x208/0x478
>  [  258.918529]  acpi_hotplug_work_fn+0x2c/0x50
>  [  258.918839]  process_one_work+0x15c/0x3c0
>  [  258.919139]  worker_thread+0x2ec/0x400
>  [  258.919417]  kthread+0x120/0x130
>  [  258.919658]  ret_from_fork+0x10/0x20
>  [  258.919924] Code: 91064021 9ad72000 8b130c33 d503201f (f820327f)
>  [  258.920373] ---[ end trace 0000000000000000 ]---


Yes, this crash. Thanks for confirming!
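
For anyone cross-checking the quoted oops, the ESR value can be decoded by hand. The sketch below is only an illustration and assumes the usual ARMv8 ESR_ELx layout for data aborts (EC in bits [31:26], IL in bit 25, WnR in bit 6, DFSC in bits [5:0]). The helper names are hypothetical and are not taken from QEMU or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only (not kernel or QEMU APIs).
 * Field positions assume the ARMv8 ESR_ELx layout for data aborts.
 */
static unsigned esr_ec(uint64_t esr)   { return (unsigned)((esr >> 26) & 0x3f); } /* exception class */
static unsigned esr_il(uint64_t esr)   { return (unsigned)((esr >> 25) & 0x1); }  /* 1 = 32-bit instruction */
static unsigned esr_wnr(uint64_t esr)  { return (unsigned)((esr >> 6) & 0x1); }   /* 1 = write, 0 = read */
static unsigned esr_dfsc(uint64_t esr) { return (unsigned)(esr & 0x3f); }         /* data fault status code */

/* Check the fields against the values printed in the quoted log. */
static void check_quoted_esr(void)
{
    const uint64_t esr = 0x9600004eULL;   /* "ESR = 0x000000009600004e" */

    assert(esr_ec(esr) == 0x25);          /* DABT (current EL) */
    assert(esr_il(esr) == 1);             /* IL = 32 bits */
    assert(esr_wnr(esr) == 1);            /* WnR = 1: the access was a write */
    assert(esr_dfsc(esr) == 0x0e);        /* level 2 permission fault */
}
```

The decoded fields agree with the log: a write (WnR = 1) at the current EL that hit a level 2 permission fault, which matches the "write to read-only memory" message for the address held in x19.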


>  
>  Thanks,
>  Gavin
>  
>