[Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach

Stefan Hajnoczi posted 1 patch 6 years, 7 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20170920101740.12490-1-stefanha@redhat.com
Test checkpatch passed
Test docker passed
Test s390x passed
block/throttle-groups.c | 11 +++++++++++
1 file changed, 11 insertions(+)
[Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
Posted by Stefan Hajnoczi 6 years, 7 months ago
Clear tg->any_timer_armed[] when throttling timers are destroyed during
AioContext attach/detach.  Failure to do so causes throttling to hang
because we believe the timer is already scheduled!
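
Roughly, the scheduling path consults this flag before arming a new timer.
The sketch below is illustrative only (invented helper name, not the literal
code in block/throttle-groups.c):

static void throttle_request_sketch(ThrottleGroup *tg, bool is_write)
{
    qemu_mutex_lock(&tg->lock);
    if (tg->any_timer_armed[is_write]) {
        /* Some member's timer is believed to be armed: just queue the
         * request and rely on that timer's callback to wake it up later.
         * If the timer was in fact destroyed on detach while this flag
         * stayed true, nothing ever fires and the request hangs.
         */
    } else {
        /* No timer armed yet: arm this member's timer (elided in this
         * sketch) and remember that the group now has one pending.
         */
        tg->any_timer_armed[is_write] = true;
    }
    qemu_mutex_unlock(&tg->lock);
}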

The following was broken at least since QEMU 2.10.0 with -drive
iops=100:

  $ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
  (qemu) stop
  (qemu) cont
  ...I/O is stuck...

Reported-by: Yongxue Hong <yhong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
v2:
 * Use timer_pending(tt->timers[i]) instead of token [Berto]
 * Fix s/destroy/destroyed/ typo [Eric]

 block/throttle-groups.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/throttle-groups.c b/block/throttle-groups.c
index 6ba992c8d7..0a49f3c409 100644
--- a/block/throttle-groups.c
+++ b/block/throttle-groups.c
@@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
 void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
 {
     ThrottleTimers *tt = &tgm->throttle_timers;
+    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
+
+    qemu_mutex_lock(&tg->lock);
+    if (timer_pending(tt->timers[0])) {
+        tg->any_timer_armed[0] = false;
+    }
+    if (timer_pending(tt->timers[1])) {
+        tg->any_timer_armed[1] = false;
+    }
+    qemu_mutex_unlock(&tg->lock);
+
     throttle_timers_detach_aio_context(tt);
     tgm->aio_context = NULL;
 }
-- 
2.13.5


Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
Posted by Manos Pitsidianakis 6 years, 7 months ago
On Wed, Sep 20, 2017 at 11:17:40AM +0100, Stefan Hajnoczi wrote:
>Clear tg->any_timer_armed[] when throttling timers are destroyed during
>AioContext attach/detach.  Failure to do so causes throttling to hang
>because we believe the timer is already scheduled!
>
>The following was broken at least since QEMU 2.10.0 with -drive
>iops=100:
>
>  $ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
>  (qemu) stop
>  (qemu) cont
>  ...I/O is stuck...
>
>Reported-by: Yongxue Hong <yhong@redhat.com>
>Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>


>v2:
> * Use timer_pending(tt->timers[i]) instead of token [Berto]
> * Fix s/destroy/destroyed/ typo [Eric]
>
> block/throttle-groups.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
>diff --git a/block/throttle-groups.c b/block/throttle-groups.c
>index 6ba992c8d7..0a49f3c409 100644
>--- a/block/throttle-groups.c
>+++ b/block/throttle-groups.c
>@@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
> void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
> {
>     ThrottleTimers *tt = &tgm->throttle_timers;
>+    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>+
>+    qemu_mutex_lock(&tg->lock);
>+    if (timer_pending(tt->timers[0])) {
>+        tg->any_timer_armed[0] = false;
>+    }
>+    if (timer_pending(tt->timers[1])) {
>+        tg->any_timer_armed[1] = false;
>+    }
>+    qemu_mutex_unlock(&tg->lock);
>+
>     throttle_timers_detach_aio_context(tt);
>     tgm->aio_context = NULL;
> }
>-- 
>2.13.5
>
>
Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
Posted by Alberto Garcia 6 years, 7 months ago
On Wed 20 Sep 2017 12:17:40 PM CEST, Stefan Hajnoczi wrote:
> @@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>  {
>      ThrottleTimers *tt = &tgm->throttle_timers;
> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
> +
> +    qemu_mutex_lock(&tg->lock);
> +    if (timer_pending(tt->timers[0])) {
> +        tg->any_timer_armed[0] = false;
> +    }
> +    if (timer_pending(tt->timers[1])) {
> +        tg->any_timer_armed[1] = false;
> +    }
> +    qemu_mutex_unlock(&tg->lock);
> +
>      throttle_timers_detach_aio_context(tt);
>      tgm->aio_context = NULL;
>  }

I'm sorry that I didn't notice this in my previous e-mail, but after
this call you might have destroyed the timer that was set for that
throttling group, so if there are pending requests waiting it can happen
that no one wakes them up.

I think that the queue needs to be restarted after this, probably after
having reattached the context (or actually after detaching it already,
but then what happens if you try to restart the queue while aio_context
is NULL?).
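
(For reference, "restarting the queue" boils down to entering a coroutine in
the member's AioContext.  Rough sketch with invented names, not the literal
code:)

static void coroutine_fn restart_queue_entry_sketch(void *opaque)
{
    /* wake up the throttled requests of this ThrottleGroupMember (elided) */
}

static void restart_queue_sketch(ThrottleGroupMember *tgm)
{
    Coroutine *co = qemu_coroutine_create(restart_queue_entry_sketch, tgm);

    /* tgm->aio_context must still be valid here; right after
     * throttle_group_detach_aio_context() it is NULL.
     */
    aio_co_enter(tgm->aio_context, co);
}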

Berto

Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
Posted by Manos Pitsidianakis 6 years, 7 months ago
On Wed, Sep 20, 2017 at 01:08:52PM +0200, Alberto Garcia wrote:
>On Wed 20 Sep 2017 12:17:40 PM CEST, Stefan Hajnoczi wrote:
>> @@ -592,6 +592,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
>>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>>  {
>>      ThrottleTimers *tt = &tgm->throttle_timers;
>> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>> +
>> +    qemu_mutex_lock(&tg->lock);
>> +    if (timer_pending(tt->timers[0])) {
>> +        tg->any_timer_armed[0] = false;
>> +    }
>> +    if (timer_pending(tt->timers[1])) {
>> +        tg->any_timer_armed[1] = false;
>> +    }
>> +    qemu_mutex_unlock(&tg->lock);
>> +
>>      throttle_timers_detach_aio_context(tt);
>>      tgm->aio_context = NULL;
>>  }
>
>I'm sorry that I didn't notice this in my previous e-mail, but after
>this call you might have destroyed the timer that was set for that
>throttling group, so if there are pending requests waiting it can happen
>that no one wakes them up.
>
>I think that the queue needs to be restarted after this, probably after
>having reattached the context (or actually after detaching it already,
>but then what happens if you try to restart the queue while aio_context
>is NULL?).

aio_co_enter in the restart queue function requires that aio_context is 
non-NULL. Perhaps calling throttle_group_restart_tgm in 
throttle_group_attach_aio_context will suffice.
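
A minimal sketch of that idea (illustrative only; it assumes the attach helper
keeps its current shape and that throttle_group_restart_tgm() is safe to call
on the member once tgm->aio_context is set again):

void throttle_group_attach_aio_context(ThrottleGroupMember *tgm,
                                       AioContext *new_context)
{
    ThrottleTimers *tt = &tgm->throttle_timers;

    throttle_timers_attach_aio_context(tt, new_context);
    tgm->aio_context = new_context;

    /* The timers and aio_context are valid again, so kick any requests that
     * were left queued while they were torn down on detach.
     */
    throttle_group_restart_tgm(tgm);
}

Restarting only here, after attach, avoids entering a coroutine while
tgm->aio_context is still NULL.
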
Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
Posted by Alberto Garcia 6 years, 7 months ago
On Wed 20 Sep 2017 01:39:02 PM CEST, Manos Pitsidianakis wrote:
>>>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
>>>  {
>>>      ThrottleTimers *tt = &tgm->throttle_timers;
>>> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
>>> +
>>> +    qemu_mutex_lock(&tg->lock);
>>> +    if (timer_pending(tt->timers[0])) {
>>> +        tg->any_timer_armed[0] = false;
>>> +    }
>>> +    if (timer_pending(tt->timers[1])) {
>>> +        tg->any_timer_armed[1] = false;
>>> +    }
>>> +    qemu_mutex_unlock(&tg->lock);
>>> +
>>>      throttle_timers_detach_aio_context(tt);
>>>      tgm->aio_context = NULL;
>>>  }
>>
>>I'm sorry that I didn't notice this in my previous e-mail, but after
>>this call you might have destroyed the timer that was set for that
>>throttling group, so if there are pending requests waiting it can
>>happen that no one wakes them up.
>>
>>I think that the queue needs to be restarted after this, probably
>>after having reattached the context (or actually after detaching it
>>already, but then what happens if you try to restart the queue while
>>aio_context is NULL?).
>
> aio_co_enter in the restart queue function requires that aio_context
> is non-NULL. Perhaps calling throttle_group_restart_tgm in
> throttle_group_attach_aio_context will suffice.

But can we guarantee that everything is safe between the _detach() and
_attach() calls?

Berto

Re: [Qemu-devel] [PATCH v2] throttle-groups: update tg->any_timer_armed[] on detach
Posted by Stefan Hajnoczi 6 years, 7 months ago
On Wed, Sep 20, 2017 at 02:17:51PM +0200, Alberto Garcia wrote:
> On Wed 20 Sep 2017 01:39:02 PM CEST, Manos Pitsidianakis wrote:
> >>>  void throttle_group_detach_aio_context(ThrottleGroupMember *tgm)
> >>>  {
> >>>      ThrottleTimers *tt = &tgm->throttle_timers;
> >>> +    ThrottleGroup *tg = container_of(tgm->throttle_state, ThrottleGroup, ts);
> >>> +
> >>> +    qemu_mutex_lock(&tg->lock);
> >>> +    if (timer_pending(tt->timers[0])) {
> >>> +        tg->any_timer_armed[0] = false;
> >>> +    }
> >>> +    if (timer_pending(tt->timers[1])) {
> >>> +        tg->any_timer_armed[1] = false;
> >>> +    }
> >>> +    qemu_mutex_unlock(&tg->lock);
> >>> +
> >>>      throttle_timers_detach_aio_context(tt);
> >>>      tgm->aio_context = NULL;
> >>>  }
> >>
> >>I'm sorry that I didn't notice this in my previous e-mail, but after
> >>this call you might have destroyed the timer that was set for that
> >>throttling group, so if there are pending requests waiting it can
> >>happen that no one wakes them up.
> >>
> >>I think that the queue needs to be restarted after this, probably
> >>after having reattached the context (or actually after detaching it
> >>already, but then what happens if you try to restart the queue while
> >>aio_context is NULL?).
> >
> > aio_co_enter in the restart queue function requires that aio_context
> > is non-NULL. Perhaps calling throttle_group_restart_tgm in
> > throttle_group_attach_aio_context will suffice.
> 
> But can we guarantee that everything is safe between the _detach() and
> _attach() calls?

Here is a second patch that I didn't send because I was unsure whether it's
needed.

The idea is to move the throttle_group_detach/attach_aio_context() into
bdrv_set_aio_context() so it becomes part of the bdrv_drained_begin/end()
region.

The guarantees we have inside the drained region:

1. Throttling has been temporarily disabled.
2. throttle_group_restart_tgm() has been called to kick throttled reqs.
3. All requests are completed.

This is a nice, controlled environment to do the detach/attach.  That said,
Berto's point still stands: what about other ThrottleGroupMembers who have
throttled requests queued?

The main reason I didn't publish this patch is that Manos' "block: remove
legacy I/O throttling" series ought to remove this code soon anyway.

diff --git a/block/block-backend.c b/block/block-backend.c
index 45d9101be3..624eb3dc15 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1747,10 +1747,6 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
     ThrottleGroupMember *tgm = &blk->public.throttle_group_member;
 
     if (bs) {
-        if (tgm->throttle_state) {
-            throttle_group_detach_aio_context(tgm);
-            throttle_group_attach_aio_context(tgm, new_context);
-        }
         bdrv_set_aio_context(bs, new_context);
     }
 }
@@ -2029,6 +2025,13 @@ static void blk_root_drained_end(BdrvChild *child)
     BlockBackend *blk = child->opaque;
     assert(blk->quiesce_counter);
 
+    if (tgm->throttle_state) {
+        AioContext *new_context = bdrv_get_aio_context(blk_bs(blk));
+
+        throttle_group_detach_aio_context(tgm);
+        throttle_group_attach_aio_context(tgm, new_context);
+    }
+
     assert(blk->public.throttle_group_member.io_limits_disabled);
     atomic_dec(&blk->public.throttle_group_member.io_limits_disabled);