[Qemu-devel] [PATCH] hw: scsi: dc390: add device unrealize function

Li Qiang posted 1 patch 5 years, 4 months ago
Test asan passed
Test checkpatch passed
Test docker-quick@centos7 passed
Test docker-mingw@fedora passed
Test docker-clang@ubuntu passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20181129152552.14363-1-liq3ea@163.com
hw/scsi/esp-pci.c | 9 +++++++++
1 file changed, 9 insertions(+)
[Qemu-devel] [PATCH] hw: scsi: dc390: add device unrealize function
Posted by Li Qiang 5 years, 4 months ago
Currently the dc390 device has no unrealize function. This
can cause memory leak when hotplug/unplug device. Also more
serious, it will trigger an assert when rehotplug. 
The backtrack is following:

qemu-system-x86_64: migration/savevm.c:734: vmstate_register_with_alias_id: Assertion `!se->compat || se->instance_id == 0' failed.

Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff7fce280 (LWP 5721)]
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
1  0x00007ffff5a10801 in __GI_abort () at abort.c:79
2  0x00007ffff5a0039a in __assert_fail_base (fmt=0x7ffff5b877d8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x555555f67f68 "!se->compat || se->instance_id == 0",
    file=file@entry=0x555555f67d07 "migration/savevm.c", line=line@entry=734,
    function=function@entry=0x555555f68e50 <__PRETTY_FUNCTION__.32111> "vmstate_register_with_alias_id")
    at assert.c:92
3  0x00007ffff5a00412 in __GI___assert_fail (assertion=0x555555f67f68 "!se->compat || se->instance_id == 0",
    file=0x555555f67d07 "migration/savevm.c", line=734,
    function=0x555555f68e50 <__PRETTY_FUNCTION__.32111> "vmstate_register_with_alias_id") at assert.c:101
4  0x0000555555bfc1d2 in vmstate_register_with_alias_id (dev=0x5555577fff60, instance_id=-1,
    vmsd=0x555556533840 <vmstate_eeprom>, opaque=0x555556e82ad0, alias_id=-1, required_for_version=0, errp=0x0)
    at migration/savevm.c:734
5  0x0000555555b3ab79 in vmstate_register (dev=0x5555577fff60, instance_id=0, vmsd=0x555556533840 <vmstate_eeprom>,
    opaque=0x555556e82ad0) at /home/test/qemu/include/migration/vmstate.h:1067
6  0x0000555555b3b106 in eeprom93xx_new (dev=0x5555577fff60, nwords=64) at hw/nvram/eeprom93xx.c:323
7  0x0000555555b8b587 in dc390_scsi_realize (dev=0x5555577fff60, errp=0x7fffffffd9f8) at hw/scsi/esp-pci.c:482
8  0x0000555555b4b62e in pci_qdev_realize (qdev=0x5555577fff60, errp=0x7fffffffda70) at hw/pci/pci.c:2038
9  0x0000555555a89cb2 in device_set_realized (obj=0x5555577fff60, value=true, errp=0x7fffffffdc40)
    at hw/core/qdev.c:826
10 0x0000555555c7fbe8 in property_set_bool (obj=0x5555577fff60, v=0x5555567d9c50, name=0x555555edda12 "realized",
    opaque=0x5555575a6170, errp=0x7fffffffdc40) at qom/object.c:1991
11 0x0000555555c7de64 in object_property_set (obj=0x5555577fff60, v=0x5555567d9c50, name=0x555555edda12 "realized",
    errp=0x7fffffffdc40) at qom/object.c:1183
12 0x0000555555c80f2a in object_property_set_qobject (obj=0x5555577fff60, value=0x555556e84be0,
    name=0x555555edda12 "realized", errp=0x7fffffffdc40) at qom/qom-qobject.c:27
13 0x0000555555c7e149 in object_property_set_bool (obj=0x5555577fff60, value=true, name=0x555555edda12 "realized",
    errp=0x7fffffffdc40) at qom/object.c:1249
14 0x00005555559d3207 in qdev_device_add (opts=0x5555573aab00, errp=0x7fffffffdcb0) at qdev-monitor.c:642
15 0x00005555559d39f0 in qmp_device_add (qdict=0x555556a15000, ret_data=0x0, errp=0x7fffffffdcf0)
---Type <return> to continue, or q <return> to quit---
    at qdev-monitor.c:822
16 0x0000555555a01e2a in hmp_device_add (mon=0x5555568b8000, qdict=0x555556a15000) at hmp.c:2067

Signed-off-by: Li Qiang <liq3ea@163.com>
---
 hw/scsi/esp-pci.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/hw/scsi/esp-pci.c b/hw/scsi/esp-pci.c
index 419fc668ac..09d1331395 100644
--- a/hw/scsi/esp-pci.c
+++ b/hw/scsi/esp-pci.c
@@ -463,6 +463,14 @@ static void dc390_write_config(PCIDevice *dev,
     }
 }
 
+static void dc390_scsi_uninit(PCIDevice *dev)
+{
+    DC390State *pci = DC390(dev);
+
+    eeprom93xx_free(&dev->qdev, pci->eeprom);
+    esp_pci_scsi_uninit(dev);
+}
+
 static void dc390_scsi_realize(PCIDevice *dev, Error **errp)
 {
     DC390State *pci = DC390(dev);
@@ -510,6 +518,7 @@ static void dc390_class_init(ObjectClass *klass, void *data)
     PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
     k->realize = dc390_scsi_realize;
+    k->exit = dc390_scsi_uninit;
     k->config_read = dc390_read_config;
     k->config_write = dc390_write_config;
     set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
-- 
2.17.1



[Qemu-devel] Hot-pluggable device without ->unrealize() is highly suspect (was: [PATCH] hw: scsi: dc390: add device unrealize function)
Posted by Markus Armbruster 5 years, 4 months ago
Li Qiang <liq3ea@163.com> writes:

> Currently the dc390 device has no unrealize function. This
> can cause memory leak when hotplug/unplug device. Also more
> serious, it will trigger an assert when rehotplug. 
[...]

When a hot-pluggable device doesn't have an ->unrealize() method, unplug
is probably broken.  I think we should track down such devices for
review.  Any takers?

Re: [Qemu-devel] Hot-pluggable device without ->unrealize() is highly suspect (was: [PATCH] hw: scsi: dc390: add device unrealize function)
Posted by Peter Maydell 5 years, 4 months ago
On Thu, 29 Nov 2018 at 19:03, Markus Armbruster <armbru@redhat.com> wrote:
>
> Li Qiang <liq3ea@163.com> writes:
>
> > Currently the dc390 device has no unrealize function. This
> > can cause memory leak when hotplug/unplug device. Also more
> > serious, it will trigger an assert when rehotplug.
> [...]
>
> When a hot-pluggable device doesn't have an ->unrealize() method, unplug
> is probably broken.  I think we should track down such devices for
> review.  Any takers?

Add an assert somewhere and catch it with the usual
"instantiate everything" qtest?

More generally, what is causing dc390 to be hotpluggable?
I can't see anything obvious in the class init. If we
have APIs where the default is "you get this weird thing
you weren't thinking about but it's broken because
you weren't thinking about it" then we will have a whole
class of bugs that we then need to squash device by device[*].
I think it would be better for devices to have to explicitly
opt in to implementing things like hotplug -- that way the
failure mode is just "this isn't hotpluggable", we can
report that helpfully to the user, and if anybody cares
they can add the support.

[*] Also currently true for migration support. We should
require devices to either provide a VMState or explicitly
say they have no state needing migration or explicitly
say they don't support migration, and then assert that
they do one or the other of those, rather than having the
default be "we'll allow migration but not migrate any
of the device's state".

thanks
-- PMM

Re: [Qemu-devel] Hot-pluggable device without ->unrealize() is highly suspect
Posted by Markus Armbruster 5 years, 4 months ago
Juan, David, there's a migration bit for you towards the end.

Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 29 Nov 2018 at 19:03, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Li Qiang <liq3ea@163.com> writes:
>>
>> > Currently the dc390 device has no unrealize function. This
>> > can cause memory leak when hotplug/unplug device. Also more
>> > serious, it will trigger an assert when rehotplug.
>> [...]
>>
>> When a hot-pluggable device doesn't have an ->unrealize() method, unplug
>> is probably broken.  I think we should track down such devices for
>> review.  Any takers?
>
> Add an assert somewhere and catch it with the usual
> "instantiate everything" qtest?

Not as easy as it sounds.

For unplug to work, the device and all its parents in the QOM type
hierarchy up to "device" have to undo their ->realize() in their
->unrealize().

Example: "dc390" -> "am53c974" -> "pci-device" -> "device"

* device: does not implement DeviceClass::realize(), nothing to do.

* pci_device: implements DeviceClass::realize(), ::unrealize() as
  pci_qdev_realize(), pci_qdev_unrealize().  They call
  PCIDeviceClass::realize() and PCIDeviceClass::exit(), respectively.

* am53c974: implements PCIDeviceClass::realize(), ::unrealize() as
  esp_pci_scsi_realize() and esp_pci_scsi_uninit().

* dc390: implements PCIDeviceClass::realize() as dc390_scsi_realize(),
  overriding esp_pci_scsi_realize().  dc390_scsi_realize() calls
  esp_pci_scsi_realize().  Bug: it fails to overide
  esp_pci_scsi_uninit().

Three patterns:

(1) A class implements whatever realize interface its parent class
    demands.  Example: am53c974.

(2) A (generally abstract) class implements ::realize() and
    ::unrealize(), and its implementation calls methods its children may
    implement.  Example: pci_device.

(3) A class overrides its parent's implementation, and calls it.
    Example: dc390.

How can we assert any realize-like thing has a matching unrealize-like
thing?

Any class that calls realize-/unrealize-like methods its children may
implement needs to assert the child implements both or none.

The troublemaker is (3), where we may end up with an overridden
realize-like method and an non-matching, un-overridden unrealize-like
method.  Got any smart ideas?

> More generally, what is causing dc390 to be hotpluggable?
> I can't see anything obvious in the class init.

It's a PCI device.  These are all hot-pluggable by default.

>                                                 If we
> have APIs where the default is "you get this weird thing
> you weren't thinking about but it's broken because
> you weren't thinking about it" then we will have a whole
> class of bugs that we then need to squash device by device[*].
> I think it would be better for devices to have to explicitly
> opt in to implementing things like hotplug -- that way the
> failure mode is just "this isn't hotpluggable", we can
> report that helpfully to the user, and if anybody cares
> they can add the support.

I tend to agree, although for PCI devices, not being hot-pluggable is
kind of weird, except for the ones that only occur soldered onto
mainboards.

> [*] Also currently true for migration support. We should
> require devices to either provide a VMState or explicitly
> say they have no state needing migration or explicitly
> say they don't support migration, and then assert that
> they do one or the other of those, rather than having the
> default be "we'll allow migration but not migrate any
> of the device's state".

True.  Cc: Juan & Dave.

Re: [Qemu-devel] Hot-pluggable device without ->unrealize() is highly suspect
Posted by Peter Maydell 5 years, 4 months ago
On Fri, 30 Nov 2018 at 07:40, Markus Armbruster <armbru@redhat.com> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> > Add an assert somewhere and catch it with the usual
> > "instantiate everything" qtest?

> The troublemaker is (3), where we may end up with an overridden
> realize-like method and an non-matching, un-overridden unrealize-like
> method.  Got any smart ideas?

Mmm, I see the difficulty. I suppose in theory we could have
a custom linter like clang-tidy and check that anything that
sets a realize pointer also sets an unrealize, but that's
pie-in-the-sky territory at the moment.

Asserting that there is an unrealize would catch at least
some cases of "forgot to handle hotplug", though, right?

> > More generally, what is causing dc390 to be hotpluggable?
> > I can't see anything obvious in the class init.
>
> It's a PCI device.  These are all hot-pluggable by default.

Maybe we should change that? Most people writing a
PCI device are probably not thinking about hotplug.
Though in the dc390 case it would probably not help to have
to set a dc->hotpluggable flag, because it would inherit that
from am53c974 too.

I wonder if the dc390 could be structured in a way that
it doesn't subclass a complete non-abstract pci device like
that...

> >                                                 If we
> > have APIs where the default is "you get this weird thing
> > you weren't thinking about but it's broken because
> > you weren't thinking about it" then we will have a whole
> > class of bugs that we then need to squash device by device[*].
> > I think it would be better for devices to have to explicitly
> > opt in to implementing things like hotplug -- that way the
> > failure mode is just "this isn't hotpluggable", we can
> > report that helpfully to the user, and if anybody cares
> > they can add the support.
>
> I tend to agree, although for PCI devices, not being hot-pluggable is
> kind of weird, except for the ones that only occur soldered onto
> mainboards.

Real hardware PCI devices are not hot-pluggable by default
as far as I'm aware; PCIe may be hot-pluggable but I think
need to be designed to support it.
This is a random non-hotplug-safe PCIe card:
https://i.stack.imgur.com/tXug5.jpg
PRSNT2# (pin 17 side B) is connected to Ground on pin 18
(these are the rightmost two connections visible); for
hotplug it must be connected to PRSNT1# (which is pin 1 side A),
AIUI.

But the hardware situation is kind of a tangent from how
we try to design our APIs.

thanks
-- PMM

Re: [Qemu-devel] Hot-pluggable device without ->unrealize() is highly suspect
Posted by Markus Armbruster 5 years, 4 months ago
Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 30 Nov 2018 at 07:40, Markus Armbruster <armbru@redhat.com> wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>> > Add an assert somewhere and catch it with the usual
>> > "instantiate everything" qtest?
>
>> The troublemaker is (3), where we may end up with an overridden
>> realize-like method and an non-matching, un-overridden unrealize-like
>> method.  Got any smart ideas?
>
> Mmm, I see the difficulty. I suppose in theory we could have
> a custom linter like clang-tidy and check that anything that
> sets a realize pointer also sets an unrealize, but that's
> pie-in-the-sky territory at the moment.
>
> Asserting that there is an unrealize would catch at least
> some cases of "forgot to handle hotplug", though, right?

Yes.  I figure it's worth doing.

Another idea: track down all the instances of "parent class implements
::realize(), ::unrealize(), which in turn call methods its children may
implement", and rename the children's methods to be called realize(),
unrealize(), so that we can find offenders with a git-grep -E
'->(un)?realize *='.

>> > More generally, what is causing dc390 to be hotpluggable?
>> > I can't see anything obvious in the class init.
>>
>> It's a PCI device.  These are all hot-pluggable by default.
>
> Maybe we should change that? Most people writing a
> PCI device are probably not thinking about hotplug.

More on that below.

> Though in the dc390 case it would probably not help to have
> to set a dc->hotpluggable flag, because it would inherit that
> from am53c974 too.
>
> I wonder if the dc390 could be structured in a way that
> it doesn't subclass a complete non-abstract pci device like
> that...

Yes, a common abstract base class would be cleaner.

>> >                                                 If we
>> > have APIs where the default is "you get this weird thing
>> > you weren't thinking about but it's broken because
>> > you weren't thinking about it" then we will have a whole
>> > class of bugs that we then need to squash device by device[*].
>> > I think it would be better for devices to have to explicitly
>> > opt in to implementing things like hotplug -- that way the
>> > failure mode is just "this isn't hotpluggable", we can
>> > report that helpfully to the user, and if anybody cares
>> > they can add the support.
>>
>> I tend to agree, although for PCI devices, not being hot-pluggable is
>> kind of weird, except for the ones that only occur soldered onto
>> mainboards.
>
> Real hardware PCI devices are not hot-pluggable by default
> as far as I'm aware; PCIe may be hot-pluggable but I think
> need to be designed to support it.
> This is a random non-hotplug-safe PCIe card:
> https://i.stack.imgur.com/tXug5.jpg
> PRSNT2# (pin 17 side B) is connected to Ground on pin 18
> (these are the rightmost two connections visible); for
> hotplug it must be connected to PRSNT1# (which is pin 1 side A),
> AIUI.
>
> But the hardware situation is kind of a tangent from how
> we try to design our APIs.

It's best not to deviate from real hardware without a good reason.
Making it harder to create the same stupid bugs over and over again is a
good reason.