When new requests arrive at a BlockBackend that is currently drained,
these requests are queued until the drain section ends.
There is a race window between blk_root_drained_end() waking up a queued
request in an iothread from the main thread and blk_wait_while_drained()
actually being woken up in the iothread and calling blk_inc_in_flight(). If
the BlockBackend is drained again during this window, drain won't wait
for this request and it will sneak in when the BlockBackend is already
supposed to be quiesced. This causes assertion failures in
bdrv_drain_all_begin() and can have other unintended consequences.
Fix this by increasing the in_flight counter immediately when scheduling
the request to be resumed so that the next drain will wait for it to
complete.
Cc: qemu-stable@nongnu.org
Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/block-backend.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/block/block-backend.c b/block/block-backend.c
index f8d6ba65c1..d6df369188 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1318,9 +1318,9 @@ static void coroutine_fn blk_wait_while_drained(BlockBackend *blk)
          * section.
          */
         qemu_mutex_lock(&blk->queued_requests_lock);
+        /* blk_root_drained_end() has the corresponding blk_inc_in_flight() */
         blk_dec_in_flight(blk);
         qemu_co_queue_wait(&blk->queued_requests, &blk->queued_requests_lock);
-        blk_inc_in_flight(blk);
         qemu_mutex_unlock(&blk->queued_requests_lock);
     }
 }
@@ -2767,9 +2767,11 @@ static void blk_root_drained_end(BdrvChild *child)
             blk->dev_ops->drained_end(blk->dev_opaque);
         }
         qemu_mutex_lock(&blk->queued_requests_lock);
-        while (qemu_co_enter_next(&blk->queued_requests,
-                                  &blk->queued_requests_lock)) {
+        while (!qemu_co_queue_empty(&blk->queued_requests)) {
             /* Resume all queued requests */
+            blk_inc_in_flight(blk);
+            qemu_co_enter_next(&blk->queued_requests,
+                               &blk->queued_requests_lock);
         }
         qemu_mutex_unlock(&blk->queued_requests_lock);
     }
--
2.51.1
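
The counter handoff described in the commit message can be illustrated with a
small single-threaded toy model. This is only a sketch, not QEMU code: every
toy_* name is hypothetical, the "scheduled" field stands in for a coroutine
that has been woken from the main thread but has not yet run in the iothread,
and "drain sees the backend as quiesced" is simplified to in_flight == 0.

```c
#include <stdbool.h>

/* Toy model of a backend; all names are hypothetical, not QEMU's. */
struct toy_backend {
    int in_flight;  /* requests that a drain has to wait for */
    int queued;     /* requests parked while the backend is drained */
    int scheduled;  /* requests woken up but not yet running again */
};

/* A request arriving during a drained section gives up its in_flight
 * reference and parks itself in the queue. */
void toy_park(struct toy_backend *b)
{
    b->in_flight--;
    b->queued++;
}

/* End of the drained section: schedule every queued request.
 * inc_on_wake == true models the patched behaviour, where the waker
 * takes the in_flight reference on behalf of the request. */
void toy_drained_end(struct toy_backend *b, bool inc_on_wake)
{
    while (b->queued > 0) {
        if (inc_on_wake) {
            b->in_flight++;
        }
        b->queued--;
        b->scheduled++;
    }
}

/* The woken request finally runs. In the unpatched model it only now
 * takes its in_flight reference back. */
void toy_resume(struct toy_backend *b, bool inc_on_wake)
{
    b->scheduled--;
    if (!inc_on_wake) {
        b->in_flight++;
    }
}

/* Returns true if a second drain starting between wake-up and resume
 * would consider the backend quiesced although a request is pending. */
bool toy_race_window_missed(bool inc_on_wake)
{
    struct toy_backend b = { .in_flight = 1, .queued = 0, .scheduled = 0 };
    bool missed;

    toy_park(&b);
    toy_drained_end(&b, inc_on_wake);
    /* Race window: the request is scheduled but has not resumed yet. */
    missed = (b.in_flight == 0) && (b.scheduled > 0);
    toy_resume(&b, inc_on_wake);
    return missed;
}
```

In this model toy_race_window_missed(false) returns true (the second drain
misses the woken request), while toy_race_window_missed(true) returns false,
because the reference is already held when the request is scheduled.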

On 19.11.25 at 6:27 PM, Kevin Wolf wrote:
> When new requests arrive at a BlockBackend that is currently drained,
> these requests are queued until the drain section ends.
>
> There is a race window between blk_root_drained_end() waking up a queued
> request in an iothread from the main thread and blk_wait_while_drained()
> actually being woken up in the iothread and calling blk_in_flight(). If

Small typo here: blk_inc_in_flight()

> the BlockBackend is drained again during this window, drain won't wait
> for this request and it will sneak in when the BlockBackend is already
> supposed to be quiesced. This causes assertion failures in
> bdrv_drain_all_begin() and can have other unintended consequences.
>
> Fix this by increasing the in_flight counter immediately when scheduling
> the request to be resumed so that the next drain will wait for it to
> complete.
>
> Cc: qemu-stable@nongnu.org
> Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>

On 11/19/25 7:27 PM, Kevin Wolf wrote:
> When new requests arrive at a BlockBackend that is currently drained,
> these requests are queued until the drain section ends.
>
> There is a race window between blk_root_drained_end() waking up a queued
> request in an iothread from the main thread and blk_wait_while_drained()
> actually being woken up in the iothread and calling blk_in_flight(). If
> the BlockBackend is drained again during this window, drain won't wait
> for this request and it will sneak in when the BlockBackend is already
> supposed to be quiesced. This causes assertion failures in
> bdrv_drain_all_begin() and can have other unintended consequences.
>
> Fix this by increasing the in_flight counter immediately when scheduling
> the request to be resumed so that the next drain will wait for it to
> complete.
>
> Cc: qemu-stable@nongnu.org
> Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/block-backend.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)

I can confirm that the crash is no longer reproducible with this fix
applied. Thanks for looking into this!

Tested-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>

On 19.11.25 18:27, Kevin Wolf wrote:
> When new requests arrive at a BlockBackend that is currently drained,
> these requests are queued until the drain section ends.
>
> There is a race window between blk_root_drained_end() waking up a queued
> request in an iothread from the main thread and blk_wait_while_drained()
> actually being woken up in the iothread and calling blk_in_flight(). If
> the BlockBackend is drained again during this window, drain won't wait
> for this request and it will sneak in when the BlockBackend is already
> supposed to be quiesced. This causes assertion failures in
> bdrv_drain_all_begin() and can have other unintended consequences.
>
> Fix this by increasing the in_flight counter immediately when scheduling
> the request to be resumed so that the next drain will wait for it to
> complete.
>
> Cc: qemu-stable@nongnu.org
> Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
> block/block-backend.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)

Reviewed-by: Hanna Czenczek <hreitz@redhat.com>