On 02/10/20 19:35, Paolo Bonzini wrote:
> From: Maxim Levitsky <mlevitsk@redhat.com>
>
> Soon, a device removal might only happen on RCU callback execution.
> This is okay for device-del which provides a DEVICE_DELETED event,
> but not for the failure case of device-add. To avoid changing
> monitor semantics, just drain all pending RCU callbacks on error.
Nope, this is not enough... qemu-iotests test 067 still breaks; if I
leave the drain_call_rcu() in qmp_device_del, on the other hand, the
hotplug test in qtest-ppc64:qos-test hangs because DEVICE_DELETED comes
before the {'return'}.
One way to handle this is:
1) handle events properly in libqtest. For example, rename
qtest_qmp_receive to qtest_qmp_receive_dict (most callers can keep on
calling qtest_qmp_receive, though some have to switch if they look for
events); a new qtest_qmp_receive that calls qtest_qmp_receive_dict until
it doesn't get an event, and in the meanwhile stashes the events in a
GList. Then qtest_qmp_eventwait_ref can look for events in the GList
prior to calling qtest_qmp_receive_any if it doesn't find any.
2) with (1) in place we can probably keep the drain_call_rcu(), but
perhaps we can drop it if we remove test 067 and add equivalent tests to
tests/qtest/drive_del.c.
The reason I'm not fond of the drain_call_rcu() in device_del is that
there is already a DEVICE_DELETED callback, and device-del is (at least
for some buses) an advisory operation that the guest needs to attend to.
So there's no guarantee that you get the DEVICE_DELETED event before
drain_call_rcu().
I can look at this next week.
Paolo