[PATCH v6 04/10] device_core: use drain_call_rcu in in qmp_device_add

Paolo Bonzini posted 10 patches 5 years, 4 months ago
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Jason Wang <jasowang@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Thomas Huth <thuth@redhat.com>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>
There is a newer version of this series
[PATCH v6 04/10] device_core: use drain_call_rcu in in qmp_device_add
Posted by Paolo Bonzini 5 years, 4 months ago
From: Maxim Levitsky <mlevitsk@redhat.com>

Soon, a device removal might only happen on RCU callback execution.
This is okay for device-del which provides a DEVICE_DELETED event,
but not for the failure case of device-add.  To avoid changing
monitor semantics, just drain all pending RCU callbacks on error.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Suggested-by: Stefan Hajnoczi <stefanha@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20200913160259.32145-4-mlevitsk@redhat.com>
[Don't use it in qmp_device_del. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 qdev-monitor.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/qdev-monitor.c b/qdev-monitor.c
index e9b7228480..bcfb90a08f 100644
--- a/qdev-monitor.c
+++ b/qdev-monitor.c
@@ -803,6 +803,18 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
         return;
     }
     dev = qdev_device_add(opts, errp);
+
+    /*
+     * Drain all pending RCU callbacks. This is done because
+     * some bus related operations can delay a device removal
+     * (in this case this can happen if device is added and then
+     * removed due to a configuration error)
+     * to a RCU callback, but user might expect that this interface
+     * will finish its job completely once qmp command returns result
+     * to the user
+     */
+    drain_call_rcu();
+
     if (!dev) {
         qemu_opts_del(opts);
         return;
-- 
2.26.2



Re: [PATCH v6 04/10] device_core: use drain_call_rcu in in qmp_device_add
Posted by Paolo Bonzini 5 years, 4 months ago
On 02/10/20 19:35, Paolo Bonzini wrote:
> From: Maxim Levitsky <mlevitsk@redhat.com>
> 
> Soon, a device removal might only happen on RCU callback execution.
> This is okay for device-del which provides a DEVICE_DELETED event,
> but not for the failure case of device-add.  To avoid changing
> monitor semantics, just drain all pending RCU callbacks on error.

Nope, this is not enough...  qemu-iotests test 067 still breaks; if I
leave the drain_call_rcu() in qmp_device_del, on the other hand, the
hotplug test in qtest-ppc64:qos-test hangs because DEVICE_DELETED comes
before the {'return'}.

One way to handle this is:

1) handle events properly in libqtest.  For example, rename
qtest_qmp_receive to qtest_qmp_receive_dict (most callers can keep on
calling qtest_qmp_receive, though some have to switch if they look for
events); a new qtest_qmp_receive that calls qtest_qmp_receive_dict until
it doesn't get an event, and in the meanwhile stashes the events in a
GList.  Then qtest_qmp_eventwait_ref can look for events in the GList
prior to calling qtest_qmp_receive_any if it doesn't find any.

2) with (1) in place we can probably keep the drain_call_rcu(), but
perhaps we can drop it if we remove test 067 and add equivalent tests to
tests/qtest/drive_del.c.

The reason I'm not fond of the drain_call_rcu() in device_del is that
there is already a DEVICE_DELETED callback, and device-del is (at least
for some buses) an advisory operation that the guest needs to attend to.
 So there's no guarantee that you get the DEVICE_DELETED event before
drain_call_rcu().

I can look at this next week.

Paolo