On 13/09/2018 19:21, Kevin Wolf wrote:
> Am 13.09.2018 um 17:11 hat Paolo Bonzini geschrieben:
>> On 13/09/2018 14:52, Kevin Wolf wrote:
>>> Even if AIO_WAIT_WHILE() is called in the home context of the
>>> AioContext, we still want to allow the condition to change depending on
>>> other threads as long as they kick the AioWait. Specifically, block jobs
>>> can be running in an I/O thread and should then be able to kick a drain
>>> in the main loop context.
>>
>> I don't understand the scenario very well. Why hasn't the main loop's
>> drain incremented num_waiters?
>
> We are in this path (which didn't increase num_waiters before this
> patch) because drain, and therefore AIO_WAIT_WHILE(), was called from
> the main thread. But draining depends on a job in a different thread,
> so we need to be able to be kicked when that job is finally quiesced.

Ah, because AIO_WAIT_WHILE() is invoked with ctx ==
qemu_get_aio_context(), but the condition is triggered *outside* the
main context? Tricky, but it seems correct. In any case,
AIO_WAIT_WHILE() is not a fast path, so the extra overhead is
acceptable.
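
To spell out the mechanism: the kick only works because aio_wait_kick()
checks num_waiters before scheduling the wake-up BH. Roughly this shape
(a simplified sketch from memory, not the literal code in
util/aio-wait.c):

    static void dummy_bh_cb(void *opaque)
    {
        /* Nothing to do; merely scheduling the BH wakes aio_poll() */
    }

    void aio_wait_kick(AioWait *wait)
    {
        /* Callers provide the memory barrier that pairs this read
         * with the waiter's atomic_inc().  */
        if (atomic_read(&wait->num_waiters)) {
            aio_bh_schedule_oneshot(qemu_get_aio_context(),
                                    dummy_bh_cb, NULL);
        }
    }

Without the atomic_inc() in the home-context branch, this sees zero
waiters when the job in the I/O thread finally quiesces, no BH gets
scheduled, and the main loop's aio_poll() blocks forever.
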
Paolo
> If I revert this, the test /bdrv-drain/blockjob/iothread/drain hangs.
> This is a block job that works on two nodes in two different contexts.
>
> (I think I saw this with mirror, which doesn't take additional locks
> when it issues a request, so maybe there's a bit more wrong there... We
> clearly need more testing with iothreads; this series probably only
> scratches the surface.)
>
>> Note I'm not against the patch---though I would hoist the
>> atomic_inc/atomic_dec outside the if, since it's done in both branches.
>
> Ok.
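
For the record, the hoisted shape I have in mind is roughly this (an
abbreviated sketch, not the literal macro from
include/block/aio-wait.h):

    #define AIO_WAIT_WHILE(wait, ctx, cond) ({                    \
        bool waited_ = false;                                     \
        AioWait *wait_ = (wait);                                  \
        AioContext *ctx_ = (ctx);                                 \
        /* Register as a waiter up front, in both branches, so    \
         * that aio_wait_kick() from any thread can wake us.  */  \
        atomic_inc(&wait_->num_waiters);                          \
        if (in_aio_context_home_thread(ctx_)) {                   \
            while ((cond)) {                                      \
                aio_poll(ctx_, true);                             \
                waited_ = true;                                   \
            }                                                     \
        } else {                                                  \
            assert(qemu_get_current_aio_context() ==              \
                   qemu_get_aio_context());                       \
            while ((cond)) {                                      \
                aio_context_release(ctx_);                        \
                aio_poll(qemu_get_aio_context(), true);           \
                aio_context_acquire(ctx_);                        \
                waited_ = true;                                   \
            }                                                     \
        }                                                         \
        atomic_dec(&wait_->num_waiters);                          \
        waited_; })

That is a single atomic_inc()/atomic_dec() pair instead of duplicating
it in each branch, with the same semantics.
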
>
> Kevin
>