[Qemu-devel] [PATCH RFC] qdev: add init order filter

Michael S. Tsirkin posted 1 patch 7 years ago
Failed in applying to current master (apply log)
include/hw/qdev-core.h | 10 ++++++++++
include/monitor/qdev.h |  2 +-
qdev-monitor.c         |  9 +++++++--
vl.c                   | 15 ++++++++++-----
4 files changed, 28 insertions(+), 8 deletions(-)
[Qemu-devel] [PATCH RFC] qdev: add init order filter
Posted by Michael S. Tsirkin 7 years ago
Allow forcing a specific order of initialization on
devices created with -device.
Helpful e.g. for built-in devices such as IOMMUs which must
exist before all other devices.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Looks like we have a ton of problems because devices
are initialized in a random order, while we
really want e.g. iommu to be initialized
earlier than devices.

This will be helpful for other things, e.g.
real hardware often is initialized in a specific order,
creating built-in devices for the board often
has to happen in a specific order, etc.


We could then stop trying to do things at machine
done time and simply set initialization order.

The following patch achieves this for devices
created with -device but unfortunately not others
(e.g. not -net nic, etc).

Thoughts on the best way to complete this would be appreciated.

As you can see this patch is small enough for 2.9 and it might be a good
idea considering the pile of hacks we have pending as a replacement.


 include/hw/qdev-core.h | 10 ++++++++++
 include/monitor/qdev.h |  2 +-
 qdev-monitor.c         |  9 +++++++--
 vl.c                   | 15 ++++++++++-----
 4 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index b44b476..5704b67 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -40,6 +40,13 @@ typedef void (*BusUnrealize)(BusState *bus, Error **errp);
 
 struct VMStateDescription;
 
+typedef enum DeviceInitOrder {
+    DEVICE_INIT_ORDER_MIN = -1,
+    DEVICE_INIT_ORDER_EARLY = -1,
+    DEVICE_INIT_ORDER_DEFAULT = 0,
+    DEVICE_INIT_ORDER_MAX = 0,
+} DeviceInitOrder;
+
 /**
  * DeviceClass:
  * @props: Properties accessing state fields.
@@ -128,6 +135,9 @@ typedef struct DeviceClass {
 
     bool hotpluggable;
 
+    /* init order. Only affects devices created with -device at this point */
+    DeviceInitOrder init_order;
+
     /* callbacks */
     void (*reset)(DeviceState *dev);
     DeviceRealize realize;
diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
index 0ff3331..af2dc67 100644
--- a/include/monitor/qdev.h
+++ b/include/monitor/qdev.h
@@ -11,7 +11,7 @@ void hmp_info_qom_tree(Monitor *mon, const QDict *dict);
 void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp);
 
 int qdev_device_help(QemuOpts *opts);
-DeviceState *qdev_device_add(QemuOpts *opts, Error **errp);
+DeviceState *qdev_device_add(QemuOpts *opts, int init_order, Error **errp);
 void qdev_set_id(DeviceState *dev, const char *id);
 
 #endif
diff --git a/qdev-monitor.c b/qdev-monitor.c
index 549f45f..657f89d 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -559,7 +559,7 @@ void qdev_set_id(DeviceState *dev, const char *id)
     }
 }
 
-DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
+DeviceState *qdev_device_add(QemuOpts *opts, int init_order, Error **errp)
 {
     DeviceClass *dc;
     const char *driver, *path;
@@ -579,6 +579,11 @@ DeviceState *qdev_device_add(QemuOpts *opts, Error **errp)
         return NULL;
     }
 
+    if (dc->init_order != init_order) {
+        /* Not an error - will be initialized with correct order */
+        return NULL;
+    }
+
     if (only_migratable) {
         if (dc->vmsd->unmigratable) {
             error_setg(errp, "Device %s is not migratable, but "
@@ -807,7 +812,7 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
         qemu_opts_del(opts);
         return;
     }
-    dev = qdev_device_add(opts, &local_err);
+    dev = qdev_device_add(opts, DEVICE_INIT_ORDER_DEFAULT,  &local_err);
     if (!dev) {
         error_propagate(errp, local_err);
         qemu_opts_del(opts);
diff --git a/vl.c b/vl.c
index e10a27b..d77fbc9 100644
--- a/vl.c
+++ b/vl.c
@@ -2300,11 +2300,14 @@ static int device_init_func(void *opaque, QemuOpts *opts, Error **errp)
 {
     Error *err = NULL;
     DeviceState *dev;
+    int init_order = (intptr_t)opaque;
 
-    dev = qdev_device_add(opts, &err);
-    if (!dev) {
+    dev = qdev_device_add(opts, init_order, &err);
+    if (!dev && err) {
         error_report_err(err);
         return -1;
+    } else if (!dev) {
+        return 0;
     }
     object_unref(OBJECT(dev));
     return 0;
@@ -4546,9 +4549,11 @@ int main(int argc, char **argv, char **envp)
 
     /* init generic devices */
     rom_set_order_override(FW_CFG_ORDER_OVERRIDE_DEVICE);
-    if (qemu_opts_foreach(qemu_find_opts("device"),
-                          device_init_func, NULL, NULL)) {
-        exit(1);
+    for (i = DEVICE_INIT_ORDER_MIN; i <= DEVICE_INIT_ORDER_MAX; ++i) {
+        if (qemu_opts_foreach(qemu_find_opts("device"),
+                              device_init_func, (void *)(intptr_t)i, NULL)) {
+            exit(1);
+        }
     }
 
     cpu_synchronize_all_post_init();
-- 
MST

Re: [Qemu-devel] [PATCH RFC] qdev: add init order filter
Posted by Paolo Bonzini 7 years ago

On 09/03/2017 00:59, Michael S. Tsirkin wrote:
> Allow forcing a specific order of initialization on
> devices created with -device.
> Helpful e.g. for built-in devices such as IOMMUs which must
> exist before all other devices.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
> 
> Looks like we have a ton of problems because devices
> are initialized in a random order, while we
> really want e.g. iommu to be initialized
> earlier than devices.
> 
> This will be helpful for other things, e.g.
> real hardware often is initialized in a specific order,
> creating built-in devices for the board often
> has to happen in a specific order, etc.

In the specific case of PCI bus_master_as there is a simple workaround
by placing a dummy container (which lets us build the AddressSpace) and
then adding the alias region at machine_done time.

This is exactly how MemoryListener is supposed to work, so I would start
from there.

Paolo

Re: [Qemu-devel] [PATCH RFC] qdev: add init order filter
Posted by Marcel Apfelbaum 7 years ago
On 03/09/2017 01:13 PM, Paolo Bonzini wrote:
>
>
> On 09/03/2017 00:59, Michael S. Tsirkin wrote:
>> Allow forcing a specific order of initialization on
>> devices created with -device.
>> Helpful e.g. for built-in devices such as IOMMUs which must
>> exist before all other devices.
>>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> ---
>>
>> Looks like we have a ton of problems because devices
>> are initialized in a random order, while we
>> really want e.g. iommu to be initialized
>> earlier than devices.
>>
>> This will be helpful for other things, e.g.
>> real hardware often is initialized in a specific order,
>> creating built-in devices for the board often
>> has to happen in a specific order, etc.
>
> In the specific case of PCI bus_master_as there is a simple workaround
> by placing a dummy container (which lets us build the AddressSpace) and
> then adding the alias region at machine_done time.
>
> This is exactly how MemoryListener is supposed to work, so I would start
> from there.
>

Hi Paolo,

This will certainly solve virtio-pci ordering issue, but I am not sure
it solves the vfio-pci problem. Here is a link to Alex's explanation:

http://www.mail-archive.com/qemu-devel@nongnu.org/msg432365.html

Thanks,
Marcel

> Paolo
>


Re: [Qemu-devel] [PATCH RFC] qdev: add init order filter
Posted by Paolo Bonzini 7 years ago

On 09/03/2017 13:11, Marcel Apfelbaum wrote:
> On 03/09/2017 01:13 PM, Paolo Bonzini wrote:
>>
>>
>> On 09/03/2017 00:59, Michael S. Tsirkin wrote:
>>> Allow forcing a specific order of initialization on
>>> devices created with -device.
>>> Helpful e.g. for built-in devices such as IOMMUs which must
>>> exist before all other devices.
>>>
>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>> ---
>>>
>>> Looks like we have a ton of problems because devices
>>> are initialized in a random order, while we
>>> really want e.g. iommu to be initialized
>>> earlier than devices.
>>>
>>> This will be helpful for other things, e.g.
>>> real hardware often is initialized in a specific order,
>>> creating built-in devices for the board often
>>> has to happen in a specific order, etc.
>>
>> In the specific case of PCI bus_master_as there is a simple workaround
>> by placing a dummy container (which lets us build the AddressSpace) and
>> then adding the alias region at machine_done time.
>>
>> This is exactly how MemoryListener is supposed to work, so I would start
>> from there.
>>
> 
> Hi Paolo,
> 
> This will certainly solve virtio-pci ordering issue, but I am not sure
> it solves the vfio-pci problem. Here is a link to Alex's explanation:
> 
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg432365.html

Right, VFIO is more complicated (hence "in the specific case of PCI
bus_master_as").  I answered in that thread.

Paolo

Re: [Qemu-devel] [PATCH RFC] qdev: add init order filter
Posted by Peter Xu 7 years ago
On Thu, Mar 09, 2017 at 01:28:09PM +0100, Paolo Bonzini wrote:
> 
> 
> On 09/03/2017 13:11, Marcel Apfelbaum wrote:
> > On 03/09/2017 01:13 PM, Paolo Bonzini wrote:
> >>
> >>
> >> On 09/03/2017 00:59, Michael S. Tsirkin wrote:
> >>> Allow forcing a specific order of initialization on
> >>> devices created with -device.
> >>> Helpful e.g. for built-in devices such as IOMMUs which must
> >>> exist before all other devices.
> >>>
> >>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>> ---
> >>>
> >>> Looks like we have a ton of problems because devices
> >>> are initialized in a random order, while we
> >>> really want e.g. iommu to be initialized
> >>> earlier than devices.
> >>>
> >>> This will be helpful for other things, e.g.
> >>> real hardware often is initialized in a specific order,
> >>> creating built-in devices for the board often
> >>> has to happen in a specific order, etc.
> >>
> >> In the specific case of PCI bus_master_as there is a simple workaround
> >> by placing a dummy container (which lets us build the AddressSpace) and
> >> then adding the alias region at machine_done time.
> >>
> >> This is exactly how MemoryListener is supposed to work, so I would start
> >> from there.
> >>
> > 
> > Hi Paolo,
> > 
> > This will certainly solve virtio-pci ordering issue, but I am not sure
> > it solves the vfio-pci problem. Here is a link to Alex's explanation:
> > 
> > http://www.mail-archive.com/qemu-devel@nongnu.org/msg432365.html
> 
> Right, VFIO is more complicated (hence "in the specific case of PCI
> bus_master_as").  I answered in that thread.

If we keep allowing vfio-pci to use pci_device_iommu_address_space()
(IMHO that's exactly the right thing to do there, rather than using
bus_master_as), this patch should work for vfio-pci as well? Since
vfio_realize() will get a correct iommu_fn, and that looks to be
enough.

Thanks,

-- peterx