[Qemu-devel] [PATCH] block: quiesce AioContext when detaching from it
Posted by Paolo Bonzini 7 years, 1 month ago
While it is true that bdrv_set_aio_context only works on a single
BlockDriverState subtree (see commit message for 53ec73e, "block: Use
bdrv_drain to replace uncessary bdrv_drain_all", 2015-07-07), it works
at the AioContext level rather than the BlockDriverState level.

Therefore, it is necessary to trigger pending bottom halves too, even
if no requests are pending.

For NBD this ensures that the aio_co_schedule of a previous call to
nbd_attach_aio_context is completed before detaching from the old
AioContext; it fixes qemu-iotest 094.  Another similar bug happens
when the VM is stopped and the virtio-blk dataplane irqfd is torn down.
In this case it's possible that guest I/O gets stuck if notify_guest_bh
was scheduled but doesn't run.

Calling aio_poll from another AioContext is safe if non-blocking; races
such as the one mentioned in the commit message for c9d1a56 ("block:
only call aio_poll on the current thread's AioContext", 2016-10-28)
are only a concern for blocking calls.
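
As a minimal sketch of why a non-blocking loop is enough (the helper name
below is illustrative only, not something this patch adds): aio_poll with
blocking=false returns true as long as it made progress, i.e. it ran
scheduled bottom halves or dispatched ready handlers, and false once
nothing is left to do, so the loop terminates exactly when the context
is quiescent.

#include "qemu/osdep.h"
#include "block/aio.h"

/* Illustrative helper, not part of this patch: run everything that is
 * already pending in @ctx without ever blocking.
 */
static void drain_pending_bottom_halves(AioContext *ctx)
{
    while (aio_poll(ctx, false)) {
        /* each iteration runs ready bottom halves and fd handlers */
    }
}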

I considered other options, including:

- moving the bs->wakeup mechanism to AioContext, and letting the caller
check.  This might work for virtio, which has a clear place to wake up
(notify_place_bh) and check the condition (virtio_blk_data_plane_stop).
For aio_co_schedule I couldn't find a clear place to check the condition.

- adding a dummy oneshot bottom half and waiting for it to trigger.
This has the complication that the bottom half list is LIFO for historical
reasons.  There were performance issues caused by bottom half ordering
in the past, so I decided against it for 2.9.
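
For reference, a hypothetical sketch of that rejected approach (the names
below are illustrative only, nothing here is added by this patch).  With a
LIFO list the dummy bottom half, being scheduled last, is walked before
bottom halves that were scheduled earlier, so it is not a natural
"everything before me has run" marker:

#include "qemu/osdep.h"
#include "block/aio.h"

/* Illustrative only: signal completion of the dummy bottom half. */
static void dummy_bh_cb(void *opaque)
{
    bool *done = opaque;
    *done = true;
}

/* Illustrative only: schedule a one-shot dummy bottom half in @ctx and
 * spin until it has executed.
 */
static void wait_for_dummy_bh(AioContext *ctx)
{
    bool done = false;

    aio_bh_schedule_oneshot(ctx, dummy_bh_cb, &done);
    while (!done) {
        aio_poll(ctx, true); /* block until the dummy bottom half runs */
    }
}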

Fixes: 99723548561978da8ef44cf804fb7912698f5d88
Reported-by: Max Reitz <mreitz@redhat.com>
Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Tested-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/block.c b/block.c
index f293ccb..e159251 100644
--- a/block.c
+++ b/block.c
@@ -4272,8 +4272,15 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
+    AioContext *ctx;
+
     bdrv_drain(bs); /* ensure there are no in-flight requests */
 
+    ctx = bdrv_get_aio_context(bs);
+    while (aio_poll(ctx, false)) {
+        /* wait for all bottom halves to execute */
+    }
+
     bdrv_detach_aio_context(bs);
 
     /* This function executes in the old AioContext so acquire the new one in
-- 
2.9.3


Re: [Qemu-devel] [PATCH] block: quiesce AioContext when detaching from it
Posted by Eric Blake 7 years, 1 month ago
On 03/14/2017 06:11 AM, Paolo Bonzini wrote:
> While it is true that bdrv_set_aio_context only works on a single
> BlockDriverState subtree (see commit message for 53ec73e, "block: Use
> bdrv_drain to replace uncessary bdrv_drain_all", 2015-07-07), it works

I was about to correct the typo, but see it was in the original commit :)

> at the AioContext level rather than the BlockDriverState level.
> 
> Therefore, it is necessary to trigger pending bottom halves too, even
> if no requests are pending.
> 
> For NBD this ensures that the aio_co_schedule of a previous call to
> nbd_attach_aio_context is completed before detaching from the old
> AioContext; it fixes qemu-iotest 094.  Another similar bug happens
> when the VM is stopped and the virtio-blk dataplane irqfd is torn down.
> In this case it's possible that guest I/O gets stuck if notify_guest_bh
> was scheduled but doesn't run.
> 
> Calling aio_poll from another AioContext is safe if non-blocking; races
> such as the one mentioned in the commit message for c9d1a56 ("block:
> only call aio_poll on the current thread's AioContext", 2016-10-28)
> are only a concern for blocking calls.
> 
> I considered other options, including:
> 
> - moving the bs->wakeup mechanism to AioContext, and letting the caller
> check.  This might work for virtio, which has a clear place to wake up
> (notify_place_bh) and check the condition (virtio_blk_data_plane_stop).
> For aio_co_schedule I couldn't find a clear place to check the condition.
> 
> - adding a dummy oneshot bottom half and waiting for it to trigger.
> This has the complication that the bottom half list is LIFO for historical
> reasons.  There were performance issues caused by bottom half ordering
> in the past, so I decided against it for 2.9.
> 
> Fixes: 99723548561978da8ef44cf804fb7912698f5d88
> Reported-by: Max Reitz <mreitz@redhat.com>
> Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Tested-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] [PATCH] block: quiesce AioContext when detaching from it
Posted by Max Reitz 7 years, 1 month ago
On 14.03.2017 12:11, Paolo Bonzini wrote:
> While it is true that bdrv_set_aio_context only works on a single
> BlockDriverState subtree (see commit message for 53ec73e, "block: Use
> bdrv_drain to replace uncessary bdrv_drain_all", 2015-07-07), it works
> at the AioContext level rather than the BlockDriverState level.
> 
> Therefore, it is necessary to trigger pending bottom halves too, even
> if no requests are pending.
> 
> For NBD this ensures that the aio_co_schedule of a previous call to
> nbd_attach_aio_context is completed before detaching from the old
> AioContext; it fixes qemu-iotest 094.  Another similar bug happens
> when the VM is stopped and the virtio-blk dataplane irqfd is torn down.
> In this case it's possible that guest I/O gets stuck if notify_guest_bh
> was scheduled but doesn't run.
> 
> Calling aio_poll from another AioContext is safe if non-blocking; races
> such as the one mentioned in the commit message for c9d1a56 ("block:
> only call aio_poll on the current thread's AioContext", 2016-10-28)
> are only a concern for blocking calls.
> 
> I considered other options, including:
> 
> - moving the bs->wakeup mechanism to AioContext, and letting the caller
> check.  This might work for virtio, which has a clear place to wake up
> (notify_place_bh) and check the condition (virtio_blk_data_plane_stop).
> For aio_co_schedule I couldn't find a clear place to check the condition.
> 
> - adding a dummy oneshot bottom half and waiting for it to trigger.
> This has the complication that the bottom half list is LIFO for historical
> reasons.  There were performance issues caused by bottom half ordering
> in the past, so I decided against it for 2.9.
> 
> Fixes: 99723548561978da8ef44cf804fb7912698f5d88
> Reported-by: Max Reitz <mreitz@redhat.com>
> Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Tested-by: Halil Pasic <pasic@linux.vnet.ibm.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block.c | 7 +++++++
>  1 file changed, 7 insertions(+)

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max