This is noticed while working on RHBZ 1449031, and fixes the reported
crash which happens when plugging back a virtio-scsi device after
unplugging it.
The root cause of the crash is not obvious here, but the change
regardlessly makes sense so it's proposed here: the listener was
registered in .realize(), so do the cleanup in the matching .unrealize()
rather than the .finalize() callback.
The difference this makes is that, due to some other references to the
memory region that is owned here, .finalize() is not called when unplug.
(Note that memory_region_ref() does object_ref() on the owner instead of
the MemoryRegion itself.) This is something fishy, and is being
investigated independently.
Signed-off-by: Fam Zheng <famz@redhat.com>
---
hw/virtio/virtio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 03592c5..12604d6 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2515,6 +2515,7 @@ static void virtio_device_unrealize(DeviceState *dev, Error **errp)
}
}
+ memory_listener_unregister(&vdev->listener);
g_free(vdev->bus_name);
vdev->bus_name = NULL;
}
@@ -2539,7 +2540,6 @@ static void virtio_device_instance_finalize(Object *obj)
{
VirtIODevice *vdev = VIRTIO_DEVICE(obj);
- memory_listener_unregister(&vdev->listener);
virtio_device_free_virtqueues(vdev);
g_free(vdev->config);
--
2.9.3
On Tue, 05/16 15:24, Fam Zheng wrote: > The root cause of the crash is not obvious here, but the change > regardlessly makes sense so it's proposed here: the listener was > registered in .realize(), so do the cleanup in the matching .unrealize() > rather than the .finalize() callback. Actually it seem calling memory_listener_unregister in .instance_finalize is not safe because it can be in the RCU thread. This race is what caused the corruption of the listener lists. So I think this patch is doing the right thing. Fam
On 16/05/2017 10:07, Fam Zheng wrote: > On Tue, 05/16 15:24, Fam Zheng wrote: >> The root cause of the crash is not obvious here, but the change >> regardlessly makes sense so it's proposed here: the listener was >> registered in .realize(), so do the cleanup in the matching .unrealize() >> rather than the .finalize() callback. This is not entirely true. Unrealize is the point where the device doesn't get any more requests. Instance finalize is the point where there are no references anymore. If a pending request has a reference to X, instance finalize is the right place to free X. However, in this case using .unrealize() should be fine. > Actually it seem calling memory_listener_unregister in .instance_finalize is not > safe because it can be in the RCU thread. This race is what caused the > corruption of the listener lists. RCU callbacks are called with BQL held, so that shouldn't be it. But the patch should be okay anyway. Paolo
On Tue, 05/16 11:23, Paolo Bonzini wrote: > > > On 16/05/2017 10:07, Fam Zheng wrote: > > On Tue, 05/16 15:24, Fam Zheng wrote: > >> The root cause of the crash is not obvious here, but the change > >> regardlessly makes sense so it's proposed here: the listener was > >> registered in .realize(), so do the cleanup in the matching .unrealize() > >> rather than the .finalize() callback. > > This is not entirely true. > > Unrealize is the point where the device doesn't get any more requests. > Instance finalize is the point where there are no references anymore. > If a pending request has a reference to X, instance finalize is the > right place to free X. > > However, in this case using .unrealize() should be fine. > > > Actually it seem calling memory_listener_unregister in .instance_finalize is not > > safe because it can be in the RCU thread. This race is what caused the > > corruption of the listener lists. > > RCU callbacks are called with BQL held, so that shouldn't be it. But > the patch should be okay anyway. You are right. Having had another look, I think it's because of this: VirtIODevice is an embeded member of VirtIOSCSIPCI therefore it is never "finalized" through QOM reference directly. Am I right? Fam
On 16/05/2017 14:25, Fam Zheng wrote: > You are right. Having had another look, I think it's because of this: > VirtIODevice is an embeded member of VirtIOSCSIPCI therefore it is never > "finalized" through QOM reference directly. Am I right? What I would expect is: virtio_instance_init_common: - create the VirtIODevice with refcount 1 - create a child property for the VirtIODevice (refcount is now 2) - unref the VirtIODevice (refcount is again 1) virtio_pci_realize: - virtio_pci_bus_new creates the virtio bus - the virtio bus is added as a child property virtio_scsi_pci_realize: - qdev_set_parent_bus links the device to the bus (bus and VirtIODevice refcounts are now 3) - the VirtIODevice is realized ... at hot-unplug time: - the device is unrealized - the bus is unparented (calling bus_unparent) - the device is unparented (calling device_unparent) - bus_remove_child is called (bus and VirtIODevice refcounts are now 1) - the VirtIODevice child property is deleted by object_unparent and the VirtIODevice is finalized - the bus child property is deleted by object_unparent and the VirtIODevice is finalized Thanks, Paolo
On Tue, 05/16 14:44, Paolo Bonzini wrote: > > > On 16/05/2017 14:25, Fam Zheng wrote: > > You are right. Having had another look, I think it's because of this: > > VirtIODevice is an embeded member of VirtIOSCSIPCI therefore it is never > > "finalized" through QOM reference directly. Am I right? > > What I would expect is: > > virtio_instance_init_common: > - create the VirtIODevice with refcount 1 > - create a child property for the VirtIODevice (refcount is now 2) > - unref the VirtIODevice (refcount is again 1) > > virtio_pci_realize: > - virtio_pci_bus_new creates the virtio bus > - the virtio bus is added as a child property > > virtio_scsi_pci_realize: > - qdev_set_parent_bus links the device to the bus (bus and VirtIODevice > refcounts are now 3) > - the VirtIODevice is realized > > ... > at hot-unplug time: > - the device is unrealized > - the bus is unparented (calling bus_unparent) > - the device is unparented (calling device_unparent) > - bus_remove_child is called (bus and VirtIODevice refcounts are now 1) > - the VirtIODevice child property is deleted by object_unparent and > the VirtIODevice is finalized > - the bus child property is deleted by object_unparent and the > VirtIODevice is finalized Sorry I don't understand. From my debugging, VirtIODevice is not finalized, because it is embeded as VirtIOSCSIPCI.vdev.parent_obj.parent_obj, in non-zero offset. As I understand it, it's the VirtIOSCSIPCI instance that is being finalized, but not VirtIODevice. Because the object_unparent is actually called on the containing object: Thread 4 "qemu-system-x86" hit Breakpoint 1, device_unparent (obj=0x559449824f40) at /stor/work/qemu/hw/core/qdev.c:1078 1078 DeviceState *dev = DEVICE(obj); (gdb) bt #0 0x00005594455f11df in device_unparent (obj=0x559449824f40) at /stor/work/qemu/hw/core/qdev.c:1078 #1 0x00005594457be582 in object_finalize_child_property (obj=0x559447f0d320, name=0x5594498306e0 "scsi1", opaque=0x559449824f40) at /stor/work/qemu/qom/object.c:1369 #2 0x00005594457bc34a in object_property_del_child (obj=0x559447f0d320, child=0x559449824f40, errp=0x0) at /stor/work/qemu/qom/object.c:428 #3 0x00005594457bc42a in object_unparent (obj=0x559449824f40) at /stor/work/qemu/qom/object.c:447 #4 0x00005594455acf19 in acpi_pcihp_eject_slot (s=0x55944967c440, bsel=0, slots=16) at /stor/work/qemu/hw/acpi/pcihp.c:138 #5 0x00005594455ad542 in pci_write (opaque=0x55944967c440, addr=8, data=16, size=4) at /stor/work/qemu/hw/acpi/pcihp.c:272 #6 0x0000559445437667 in memory_region_write_accessor (mr=0x55944967d050, addr=8, value=0x7fdfa7569838, size=4, shift=0, mask=4294967295, attrs=...) at /stor/work/qemu/memory.c:526 #7 0x000055944543787f in access_with_adjusted_size (addr=8, value=0x7fdfa7569838, size=4, access_size_min=1, access_size_max=4, access= 0x55944543757d <memory_region_write_accessor>, mr=0x55944967d050, attrs=...) at /stor/work/qemu/memory.c:592 #8 0x0000559445439fdb in memory_region_dispatch_write (mr=0x55944967d050, addr=8, data=16, size=4, attrs=...) at /stor/work/qemu/memory.c:1319 #9 0x00005594453dfa77 in address_space_write_continue (as=0x559445f1fce0 <address_space_io>, addr=44552, attrs=..., buf=0x7fdfb77d2000 "\020", len=4, addr1=8, l=4, mr=0x55944967d050) at /stor/work/qemu/exec.c:2822 #10 0x00005594453dfc31 in address_space_write (as=0x559445f1fce0 <address_space_io>, addr=44552, attrs=..., buf=0x7fdfb77d2000 "\020", len=4) at /stor/work/qemu/exec.c:2879 #11 0x00005594453dffbd in address_space_rw (as=0x559445f1fce0 <address_space_io>, addr=44552, attrs=..., buf=0x7fdfb77d2000 "\020", len=4, is_write=true) at /stor/work/qemu/exec.c:2981 #12 0x0000559445433797 in kvm_handle_io (port=44552, attrs=..., data=0x7fdfb77d2000, direction=1, size=4, count=1) at /stor/work/qemu/kvm-all.c:1803 #13 0x0000559445433e87 in kvm_cpu_exec (cpu=0x559447f0d560) at /stor/work/qemu/kvm-all.c:2032 #14 0x0000559445419cc6 in qemu_kvm_cpu_thread_fn (arg=0x559447f0d560) at /stor/work/qemu/cpus.c:1118 #15 0x00007fdfb56ba6ca in start_thread () at /lib64/libpthread.so.0 #16 0x00007fdfb1382f7f in clone () at /lib64/libc.so.6 (gdb) p *obj.class.type $1 = {name = 0x559447e57a20 "virtio-scsi-pci", class_size = 280, instance_size = 34688, class_init = 0x55944573573e <virtio_scsi_pci_class_init>, class_base_init = 0x0, class_finalize = 0x0, class_data = 0x0, instance_init = 0x559445735837 <virtio_scsi_pci_instance_init>, instance_post_init = 0x0, instance_finalize = 0x0, abstract = false, parent = 0x559447e57a40 "virtio-pci", parent_type = 0x559447e57520, class = 0x559447e7add0, num_interfaces = 0, interfaces = {{ typename = 0x0} <repeats 32 times>}} Am I missing something? Fam
> On Tue, 05/16 14:44, Paolo Bonzini wrote: > > On 16/05/2017 14:25, Fam Zheng wrote: > > > You are right. Having had another look, I think it's because of this: > > > VirtIODevice is an embeded member of VirtIOSCSIPCI therefore it is never > > > "finalized" through QOM reference directly. Am I right? > > > > What I would expect is: > > > > virtio_instance_init_common: > > - create the VirtIODevice with refcount 1 > > - create a child property for the VirtIODevice (refcount is now 2) > > - unref the VirtIODevice (refcount is again 1) > > > > virtio_pci_realize: > > - virtio_pci_bus_new creates the virtio bus > > - the virtio bus is added as a child property > > > > virtio_scsi_pci_realize: > > - qdev_set_parent_bus links the device to the bus (bus and VirtIODevice > > refcounts are now 3) > > - the VirtIODevice is realized > > > > ... > > at hot-unplug time: > > - the device is unrealized > > - the bus is unparented (calling bus_unparent) > > - the device is unparented (calling device_unparent) > > - bus_remove_child is called (bus and VirtIODevice refcounts are now 1) > > - the VirtIODevice child property is deleted by object_unparent and > > the VirtIODevice is finalized > > - the bus child property is deleted by object_unparent and the > > VirtIODevice is finalized > > Sorry I don't understand. From my debugging, VirtIODevice is not finalized, > because it is embeded as VirtIOSCSIPCI.vdev.parent_obj.parent_obj, in > non-zero offset. It is correct that it's embedded. But it is added as a child property, and when the child property is deleted the refcount should go to zero and the VirtIODevice should be deleted too. The child property is deleted when bus_unparent calls object_unparent: while ((kid = QTAILQ_FIRST(&bus->children)) != NULL) { DeviceState *dev = kid->child; object_unparent(OBJECT(dev)); } and in turn bus_unparent is called by the VirtIOSCSIPCI's unparent callback (device_unparent): while (dev->num_child_bus) { bus = QLIST_FIRST(&dev->child_bus); object_unparent(OBJECT(bus)); } Thanks, Paolo > > As I understand it, it's the VirtIOSCSIPCI instance that is being finalized, > but > not VirtIODevice. Because the object_unparent is actually called on the > containing object: > > Thread 4 "qemu-system-x86" hit Breakpoint 1, device_unparent > (obj=0x559449824f40) at /stor/work/qemu/hw/core/qdev.c:1078 > 1078 DeviceState *dev = DEVICE(obj); > (gdb) bt > #0 0x00005594455f11df in device_unparent (obj=0x559449824f40) at > /stor/work/qemu/hw/core/qdev.c:1078 > #1 0x00005594457be582 in object_finalize_child_property (obj=0x559447f0d320, > name=0x5594498306e0 "scsi1", opaque=0x559449824f40) at > /stor/work/qemu/qom/object.c:1369 > #2 0x00005594457bc34a in object_property_del_child (obj=0x559447f0d320, > child=0x559449824f40, errp=0x0) at /stor/work/qemu/qom/object.c:428 > #3 0x00005594457bc42a in object_unparent (obj=0x559449824f40) at > /stor/work/qemu/qom/object.c:447 > #4 0x00005594455acf19 in acpi_pcihp_eject_slot (s=0x55944967c440, bsel=0, > slots=16) at /stor/work/qemu/hw/acpi/pcihp.c:138 > #5 0x00005594455ad542 in pci_write (opaque=0x55944967c440, addr=8, data=16, > size=4) at /stor/work/qemu/hw/acpi/pcihp.c:272 > #6 0x0000559445437667 in memory_region_write_accessor (mr=0x55944967d050, > addr=8, value=0x7fdfa7569838, size=4, shift=0, mask=4294967295, attrs=...) > at /stor/work/qemu/memory.c:526 > #7 0x000055944543787f in access_with_adjusted_size (addr=8, > value=0x7fdfa7569838, size=4, access_size_min=1, access_size_max=4, access= > 0x55944543757d <memory_region_write_accessor>, mr=0x55944967d050, > attrs=...) at /stor/work/qemu/memory.c:592 > #8 0x0000559445439fdb in memory_region_dispatch_write (mr=0x55944967d050, > addr=8, data=16, size=4, attrs=...) at /stor/work/qemu/memory.c:1319 > #9 0x00005594453dfa77 in address_space_write_continue (as=0x559445f1fce0 > <address_space_io>, addr=44552, attrs=..., buf=0x7fdfb77d2000 "\020", len=4, > addr1=8, l=4, mr=0x55944967d050) at /stor/work/qemu/exec.c:2822 > #10 0x00005594453dfc31 in address_space_write (as=0x559445f1fce0 > <address_space_io>, addr=44552, attrs=..., buf=0x7fdfb77d2000 "\020", len=4) > at /stor/work/qemu/exec.c:2879 > #11 0x00005594453dffbd in address_space_rw (as=0x559445f1fce0 > <address_space_io>, addr=44552, attrs=..., buf=0x7fdfb77d2000 "\020", len=4, > is_write=true) > at /stor/work/qemu/exec.c:2981 > #12 0x0000559445433797 in kvm_handle_io (port=44552, attrs=..., > data=0x7fdfb77d2000, direction=1, size=4, count=1) at > /stor/work/qemu/kvm-all.c:1803 > #13 0x0000559445433e87 in kvm_cpu_exec (cpu=0x559447f0d560) at > /stor/work/qemu/kvm-all.c:2032 > #14 0x0000559445419cc6 in qemu_kvm_cpu_thread_fn (arg=0x559447f0d560) at > /stor/work/qemu/cpus.c:1118 > #15 0x00007fdfb56ba6ca in start_thread () at /lib64/libpthread.so.0 > #16 0x00007fdfb1382f7f in clone () at /lib64/libc.so.6 > (gdb) p *obj.class.type > $1 = {name = 0x559447e57a20 "virtio-scsi-pci", class_size = 280, > instance_size = 34688, class_init = 0x55944573573e > <virtio_scsi_pci_class_init>, class_base_init = 0x0, > class_finalize = 0x0, class_data = 0x0, instance_init = 0x559445735837 > <virtio_scsi_pci_instance_init>, instance_post_init = 0x0, > instance_finalize = 0x0, > abstract = false, parent = 0x559447e57a40 "virtio-pci", parent_type = > 0x559447e57520, class = 0x559447e7add0, num_interfaces = 0, interfaces = > {{ > typename = 0x0} <repeats 32 times>}} > > Am I missing something? > > Fam >
On Wed, 05/17 02:58, Paolo Bonzini wrote: > The child property is deleted when bus_unparent calls object_unparent: > > while ((kid = QTAILQ_FIRST(&bus->children)) != NULL) { > DeviceState *dev = kid->child; > object_unparent(OBJECT(dev)); > } > > and in turn bus_unparent is called by the VirtIOSCSIPCI's unparent > callback (device_unparent): > > while (dev->num_child_bus) { > bus = QLIST_FIRST(&dev->child_bus); > object_unparent(OBJECT(bus)); > } OK, sorry for being dumb, these are way over my head. Let me try again: I count three references before unplug: a.1) object_property_add_child in virtio_instance_init_common as virtio-backend a.2) qdev_set_parent_bus in virtio_scsi_pci_realize by virtio-pci-bus a.3) qbus_set_hotplug_handler in virtio_scsi_device_realize for Only two object_unref()'s happen in unplug, respectively: b.1) object_finalize_child_property, matches a.1) b.2) bus_remove_child, matches a.2) Do we need cleanup for a.3) ? The patch below does fix the crash for me. --- diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 46a3e3f..fde1b1fe 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -918,6 +918,8 @@ void virtio_scsi_common_unrealize(DeviceState *dev, Error **errp) static void virtio_scsi_device_unrealize(DeviceState *dev, Error **errp) { + VirtIOSCSI *s = VIRTIO_SCSI(dev); + qbus_set_hotplug_handler(BUS(&s->bus), NULL, &error_abort); virtio_scsi_common_unrealize(dev, errp); }
On 17/05/2017 14:00, Fam Zheng wrote: > On Wed, 05/17 02:58, Paolo Bonzini wrote: >> The child property is deleted when bus_unparent calls object_unparent: >> >> while ((kid = QTAILQ_FIRST(&bus->children)) != NULL) { >> DeviceState *dev = kid->child; >> object_unparent(OBJECT(dev)); >> } >> >> and in turn bus_unparent is called by the VirtIOSCSIPCI's unparent >> callback (device_unparent): >> >> while (dev->num_child_bus) { >> bus = QLIST_FIRST(&dev->child_bus); >> object_unparent(OBJECT(bus)); >> } > > OK, sorry for being dumb, these are way over my head. Let me try again: > > I count three references before unplug: > > a.1) object_property_add_child in virtio_instance_init_common as virtio-backend > a.2) qdev_set_parent_bus in virtio_scsi_pci_realize by virtio-pci-bus > a.3) qbus_set_hotplug_handler in virtio_scsi_device_realize for > > Only two object_unref()'s happen in unplug, respectively: > > b.1) object_finalize_child_property, matches a.1) > b.2) bus_remove_child, matches a.2) > > Do we need cleanup for a.3) ? The patch below does fix the crash for me. Yes, good catch. Either there, or in bus_unparent. Paolo
© 2016 - 2024 Red Hat, Inc.