[Qemu-devel] [PATCH v4] throttle-groups: drain before detaching ThrottleState

Stefan Hajnoczi posted 1 patch 6 years, 4 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20171110151934.16883-1-stefanha@redhat.com
[Qemu-devel] [PATCH v4] throttle-groups: drain before detaching ThrottleState
Posted by Stefan Hajnoczi 6 years, 4 months ago
I/O requests hang after stop/cont commands at least since QEMU 2.10.0
with -drive iops=100:

  (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
  (qemu) stop
  (qemu) cont
  ...I/O is stuck...

This happens because blk_set_aio_context() detaches the ThrottleState
while requests may still be in flight:

  if (tgm->throttle_state) {
      throttle_group_detach_aio_context(tgm);
      throttle_group_attach_aio_context(tgm, new_context);
  }

This patch encloses the detach/attach calls in a drained region so no
I/O request is left hanging.  Also add assertions so we don't make the
same mistake again in the future.

Reported-by: Yongxue Hong <yhong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v4:
 * Simplified patch in response to Berto's review
---
 block/block-backend.c   | 2 ++
 block/throttle-groups.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 45d9101be3..da2f6c0f8a 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1748,8 +1748,10 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
 
     if (bs) {
         if (tgm->throttle_state) {
+            bdrv_drained_begin(bs);
             throttle_group_detach_aio_context(tgm);
             throttle_group_attach_aio_context(tgm, new_context);
+            bdrv_drained_end(bs);
         }
         bdrv_set_aio_context(bs, new_context);
     }
diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index b291a88481..2587f19ca3 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -594,6 +594,12 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
 void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
 {
     ThrottleTimers *tt = &tgm->throttle_timers;
+
+    /* Requests must have been drained */
+    assert(tgm->pending_reqs[0] == 0 && tgm->pending_reqs[1] == 0);
+    assert(qemu_co_queue_empty(&tgm->throttled_reqs[0]));
+    assert(qemu_co_queue_empty(&tgm->throttled_reqs[1]));
+
     throttle_timers_detach_aio_context(tt);
     tgm->aio_context = NULL;
 }
-- 
2.13.6
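
For readers following the thread, the ordering the patch enforces can be
sketched outside of QEMU. The toy model below is only an illustration under
stated assumptions: ToyMember and every toy_* function are hypothetical
names, not QEMU APIs, and "draining" is reduced to zeroing two counters.
It shows why detaching is only safe once no request is queued, and how
wrapping detach/attach in a drained section guarantees that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a throttle group member; NOT QEMU's ThrottleGroupMember. */
typedef struct {
    unsigned pending_reqs[2];   /* queued requests: [0] reads, [1] writes */
    int drained_depth;          /* nesting depth of drained sections */
    const void *aio_context;    /* event loop the timers belong to, or NULL */
} ToyMember;

/* Enter a drained section: every queued request completes before we return. */
static void toy_drained_begin(ToyMember *m)
{
    m->drained_depth++;
    m->pending_reqs[0] = 0;
    m->pending_reqs[1] = 0;
}

/* Leave a drained section; new requests may be queued again afterwards. */
static void toy_drained_end(ToyMember *m)
{
    assert(m->drained_depth > 0);
    m->drained_depth--;
}

/*
 * Detach from the current context. Returns false (and changes nothing) in
 * the situation where the patched QEMU code would trip its new assertions.
 */
static bool toy_detach(ToyMember *m)
{
    if (m->pending_reqs[0] != 0 || m->pending_reqs[1] != 0) {
        return false;           /* requests still queued: unsafe to detach */
    }
    m->aio_context = NULL;
    return true;
}

/* Attach to a new context. */
static void toy_attach(ToyMember *m, const void *new_context)
{
    m->aio_context = new_context;
}

/* Same call order as the patched hunk: drain, detach, attach, end drain. */
static bool toy_set_aio_context(ToyMember *m, const void *new_context)
{
    bool ok;

    toy_drained_begin(m);
    ok = toy_detach(m);
    if (ok) {
        toy_attach(m, new_context);
    }
    toy_drained_end(m);
    return ok;
}
```

Calling toy_detach() directly on a member with queued requests fails,
while toy_set_aio_context() succeeds because the drained section empties
the queues first. In the toy, failure is a return value so it can be
observed; the patch itself uses assert() so the mistake is caught early.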


Re: [Qemu-devel] [PATCH v4] throttle-groups: drain before detaching ThrottleState
Posted by Alberto Garcia 6 years, 4 months ago
On Fri 10 Nov 2017 04:19:34 PM CET, Stefan Hajnoczi wrote:
> I/O requests hang after stop/cont commands at least since QEMU 2.10.0
> with -drive iops=100:
>
>   (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
>   (qemu) stop
>   (qemu) cont
>   ...I/O is stuck...
>
> This happens because blk_set_aio_context() detaches the ThrottleState
> while requests may still be in flight:
>
>   if (tgm->throttle_state) {
>       throttle_group_detach_aio_context(tgm);
>       throttle_group_attach_aio_context(tgm, new_context);
>   }
>
> This patch encloses the detach/attach calls in a drained region so no
> I/O request is left hanging.  Also add assertions so we don't make the
> same mistake again in the future.
>
> Reported-by: Yongxue Hong <yhong@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Alberto Garcia <berto@igalia.com>

Berto

Re: [Qemu-devel] [PATCH v4] throttle-groups: drain before detaching ThrottleState
Posted by Stefan Hajnoczi 6 years, 4 months ago
On Fri, Nov 10, 2017 at 03:19:34PM +0000, Stefan Hajnoczi wrote:
> I/O requests hang after stop/cont commands at least since QEMU 2.10.0
> with -drive iops=100:
> 
>   (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
>   (qemu) stop
>   (qemu) cont
>   ...I/O is stuck...
> 
> This happens because blk_set_aio_context() detaches the ThrottleState
> while requests may still be in flight:
> 
>   if (tgm->throttle_state) {
>       throttle_group_detach_aio_context(tgm);
>       throttle_group_attach_aio_context(tgm, new_context);
>   }
> 
> This patch encloses the detach/attach calls in a drained region so no
> I/O request is left hanging.  Also add assertions so we don't make the
> same mistake again in the future.
> 
> Reported-by: Yongxue Hong <yhong@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v4:
>  * Simplified patch in response to Berto's review
> ---
>  block/block-backend.c   | 2 ++
>  block/throttle-groups.c | 6 ++++++
>  2 files changed, 8 insertions(+)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

Re: [Qemu-devel] [PATCH v4] throttle-groups: drain before detaching ThrottleState
Posted by Alberto Garcia 6 years, 4 months ago
On Fri 10 Nov 2017 04:19:34 PM CET, Stefan Hajnoczi wrote:
> I/O requests hang after stop/cont commands at least since QEMU 2.10.0
> with -drive iops=100:
>
>   (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
>   (qemu) stop
>   (qemu) cont
>   ...I/O is stuck...
>
> This happens because blk_set_aio_context() detaches the ThrottleState
> while requests may still be in flight:
>
>   if (tgm->throttle_state) {
>       throttle_group_detach_aio_context(tgm);
>       throttle_group_attach_aio_context(tgm, new_context);
>   }
>
> This patch encloses the detach/attach calls in a drained region so no
> I/O request is left hanging.  Also add assertions so we don't make the
> same mistake again in the future.

I'm wondering about the implications of this change... is it possible
now to bypass the I/O limits simply by stopping and quickly resuming the
VM? And is that a problem?

Berto

Re: [Qemu-devel] [Qemu-block] [PATCH v4] throttle-groups: drain before detaching ThrottleState
Posted by Stefan Hajnoczi 6 years, 4 months ago
On Mon, Nov 13, 2017 at 2:29 PM, Alberto Garcia <berto@igalia.com> wrote:
> On Fri 10 Nov 2017 04:19:34 PM CET, Stefan Hajnoczi wrote:
>> I/O requests hang after stop/cont commands at least since QEMU 2.10.0
>> with -drive iops=100:
>>
>>   (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
>>   (qemu) stop
>>   (qemu) cont
>>   ...I/O is stuck...
>>
>> This happens because blk_set_aio_context() detaches the ThrottleState
>> while requests may still be in flight:
>>
>>   if (tgm->throttle_state) {
>>       throttle_group_detach_aio_context(tgm);
>>       throttle_group_attach_aio_context(tgm, new_context);
>>   }
>>
>> This patch encloses the detach/attach calls in a drained region so no
>> I/O request is left hanging.  Also add assertions so we don't make the
>> same mistake again in the future.
>
> I'm wondering about the implications of this change... is it possible
> now to bypass the I/O limits simply by stopping and quickly resuming the
> VM? And is that a problem?

bdrv_set_aio_context() already drains so this patch doesn't change
existing behavior with respect to bypassing throttling.

It's not ideal that certain VM lifecycle operations temporarily
disable throttling, but it's a trade-off: a synchronous drain usually
sits on a performance-sensitive path and should not take a long time.
Perhaps there are ways to improve the situation, but I haven't studied
it in detail.

Stefan
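
To put rough numbers on the bypass concern discussed above, here is a
minimal sketch. It assumes a plain rate limit with no burst allowance, and
ToyThrottle and toy_seconds_to_dispatch() are hypothetical names used only
for illustration, not QEMU code. It estimates how long a backlog of queued
requests takes to dispatch with limits active versus inside a drained
section, where the limits are temporarily disabled.

```c
#include <assert.h>

/* Toy throttle state; NOT QEMU's ThrottleState or ThrottleGroupMember. */
typedef struct {
    unsigned iops_limit;          /* requests allowed per second, 0 = unlimited */
    unsigned io_limits_disabled;  /* nonzero while inside a drained section */
} ToyThrottle;

/*
 * Whole seconds needed to dispatch nreqs queued requests.
 * While limits are disabled the backlog is dispatched immediately.
 */
static unsigned toy_seconds_to_dispatch(const ToyThrottle *t, unsigned nreqs)
{
    if (t->io_limits_disabled > 0 || t->iops_limit == 0) {
        return 0;
    }
    /* Round up: a partially filled second still has to elapse. */
    return (nreqs + t->iops_limit - 1) / t->iops_limit;
}
```

With iops_limit = 100, a backlog of 1000 requests needs 10 seconds under
this model, but 0 seconds while io_limits_disabled is nonzero. Only the
requests already queued when the drain starts benefit, which is why the
effect is bounded by how much a guest can queue before stop/cont.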