Calling aio_poll() directly may have been fine previously, but this is
the future, man! The difference between an aio_poll() loop and
BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
around aio_poll().
This allows the IOThread to run fd handlers or BHs to complete the
request. Failure to release the AioContext causes deadlocks.
Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
iothread.
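
As an illustration, here is a minimal sketch of the release/acquire pattern
described above. This is hypothetical and is not the real BDRV_POLL_WHILE()
definition (the actual macro in QEMU also handles the case where the caller
already runs in the BDS's AioContext, among other details); it only shows why
dropping the AioContext lock around aio_poll() lets the IOThread make
progress:

/* Hypothetical sketch only, assuming QEMU's block/AioContext headers;
 * not the actual QEMU macro. */
#define SKETCH_POLL_WHILE(bs, cond) do {                             \
        AioContext *ctx_ = bdrv_get_aio_context(bs);                 \
        /* drop the lock so the IOThread can run fd handlers/BHs */  \
        aio_context_release(ctx_);                                   \
        while ((cond)) {                                             \
            aio_poll(qemu_get_aio_context(), true);                  \
        }                                                            \
        aio_context_acquire(ctx_);                                   \
    } while (0)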
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/io.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/block/io.c b/block/io.c
index cc56e90..f0041cd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
bdrv_coroutine_enter(bs, co);
- while (data.ret == -EINPROGRESS) {
- aio_poll(bdrv_get_aio_context(bs), true);
- }
+ BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
return data.ret;
}
}
--
2.9.3
On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> Calling aio_poll() directly may have been fine previously, but this is
> the future, man!
lol
> The difference between an aio_poll() loop and
> BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
> around aio_poll().
>
> This allows the IOThread to run fd handlers or BHs to complete the
> request. Failure to release the AioContext causes deadlocks.
>
> Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
> iothread.
I'm surprised at how many separate hangs we actually had!
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> block/io.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
Reviewed-by: Eric Blake <eblake@redhat.com>
>
> diff --git a/block/io.c b/block/io.c
> index cc56e90..f0041cd 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
> Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
>
> bdrv_coroutine_enter(bs, co);
> - while (data.ret == -EINPROGRESS) {
> - aio_poll(bdrv_get_aio_context(bs), true);
> - }
> + BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
> return data.ret;
> }
Do we have other culprits (not necessarily for vmsave, but for other
situations), where we should be using BDRV_POLL_WHILE in separate
patches? For example, a quick grep shows that at least hw/9pfs/9p.c makes
a direct call to aio_poll(,true) in a while loop.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
> I'm surprised at how many separate hangs we actually had!
Note that I have seen quite a few before, though I am not sure about
the details and the reproducibility. The release/acquire was hidden
behind RFifoLock contention callbacks instead of BDRV_POLL_WHILE.
Paolo
On 17.05.2017 at 22:16, Eric Blake wrote:
> On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
> > Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
> > iothread.
>
> I'm surprised at how many separate hangs we actually had!

How hard would it be to write some test cases for this? Dataplane has
a serious lack of automated testing.

Kevin
On Thu, May 18, 2017 at 8:57 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> Am 17.05.2017 um 22:16 hat Eric Blake geschrieben:
>> I'm surprised at how many separate hangs we actually had!
>
> How hard would it be to write some test cases for this? Dataplane has
> a serious lack of automated testing.

And this hang doesn't even require in-flight guest I/O, so it would be
easy to reproduce in qemu-iotests if we add an IOThread mode.

Stefan
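
A rough sketch of the kind of reproduction being discussed (a hypothetical
invocation, not an actual qemu-iotests case; assumes a pre-existing
test.qcow2 image): attach the drive to an IOThread and issue 'savevm' from
the HMP monitor, which can hang without this fix.

qemu-system-x86_64 -nodefaults \
    -object iothread,id=iothread0 \
    -drive file=test.qcow2,format=qcow2,if=none,id=drive0 \
    -device virtio-blk-pci,drive=drive0,iothread=iothread0 \
    -monitor stdio
(qemu) savevm snap0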
On 18.05.2017 at 10:06, Stefan Hajnoczi wrote:
> On Thu, May 18, 2017 at 8:57 AM, Kevin Wolf <kwolf@redhat.com> wrote:
> > How hard would it be to write some test cases for this? Dataplane has
> > a serious lack of automated testing.
>
> And this hang doesn't even require in-flight guest I/O, so it would be
> easy to reproduce in qemu-iotests if we add an IOThread mode.

I don't think I would make it a separate mode (because people would run
only one or the other - when did you last run qcow2 v2 tests?), but just
add some test cases that make use of it during the normal -qcow2 or
-raw run.

Kevin
On Wed, May 17, 2017 at 9:16 PM, Eric Blake <eblake@redhat.com> wrote:
> On 05/17/2017 12:09 PM, Stefan Hajnoczi wrote:
>> diff --git a/block/io.c b/block/io.c
>> index cc56e90..f0041cd 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -2031,9 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
>> Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
>>
>> bdrv_coroutine_enter(bs, co);
>> - while (data.ret == -EINPROGRESS) {
>> - aio_poll(bdrv_get_aio_context(bs), true);
>> - }
>> + BDRV_POLL_WHILE(bs, data.ret == -EINPROGRESS);
>> return data.ret;
>> }
>
> Do we have other culprits (not necessarily for vmsave, but for other
> situations), where we should be using BDRV_POLL_WHILE in separate
> patches? For example, a quick grep shows that at least hw/9pfs/9p.c makes
> a direct call to aio_poll(,true) in a while loop.
virtio-9p doesn't use IOThread (dataplane), so it is not affected. It
also doesn't use BlockDriverState, so it cannot use BDRV_POLL_WHILE().
I did grep for other aio_poll() callers and didn't spot any obvious
cases that need to be fixed. They tend to run in situations where
this deadlock cannot occur.
Stefan
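
To illustrate the distinction Stefan is drawing (a hypothetical sketch, not
the actual 9p code): a blocking aio_poll() loop is harmless when the caller
polls the AioContext it already runs in and no other thread has to acquire
that context for the work to complete.

/* Hypothetical example; assumes QEMU's main-loop/aio headers. */
static void wait_for_done(bool *done)
{
    /* Safe: we poll our own (main loop) context, so the fd handlers and
     * BHs that eventually set *done run right here in this loop. */
    while (!*done) {
        aio_poll(qemu_get_aio_context(), true);
    }
}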