From nobody Mon May 6 05:53:59 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1506347978623222.00413526956106; Mon, 25 Sep 2017 06:59:38 -0700 (PDT) Received: from localhost ([::1]:42576 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dwTv7-00079U-A9 for importer@patchew.org; Mon, 25 Sep 2017 09:59:37 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39342) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dwTtS-00068i-BT for qemu-devel@nongnu.org; Mon, 25 Sep 2017 09:57:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dwTtO-0006y6-9l for qemu-devel@nongnu.org; Mon, 25 Sep 2017 09:57:54 -0400 Received: from mx1.redhat.com ([209.132.183.28]:52526) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dwTtE-0006uY-8r; Mon, 25 Sep 2017 09:57:40 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 01F587E42A; Mon, 25 Sep 2017 13:57:39 +0000 (UTC) Received: from localhost (ovpn-117-160.ams2.redhat.com [10.36.117.160]) by smtp.corp.redhat.com (Postfix) with ESMTP id 62D056375E; Mon, 25 Sep 2017 13:57:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 01F587E42A Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=stefanha@redhat.com From: Stefan Hajnoczi To: Date: Mon, 25 Sep 2017 14:57:35 +0100 Message-Id: <20170925135735.25076-1-stefanha@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Mon, 25 Sep 2017 13:57:39 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v3] throttle-groups: cancel timers on restart X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Alberto Garcia , qemu-block@nongnu.org, Manos Pitsidianakis , Max Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Throttling group members are restarted on config change and by bdrv_drained_begin(). Pending timers should be cancelled before restarting the queue, otherwise requests see that tg->any_timer_armed[] is already set and do not schedule a timer. For example, a hang occurs at least since QEMU 2.10.0 with -drive iops=3D100 because no further timers will be scheduled: (guest)$ dd if=3D/dev/zero of=3D/dev/vdb oflag=3Ddirect count=3D1000 (qemu) stop (qemu) cont ...I/O is stuck... This can be fixed by calling throttle_group_detach_aio_context() from a bdrv_drained_begin/end() region. This way timers are quiesced properly, requests are drained, and other throttle group members are scheduled, if necessary. Reported-by: Yongxue Hong Signed-off-by: Stefan Hajnoczi Reviewed-by: Eric Blake --- v3: * Different approach this time, based on Berto's insight that another TGM may need to be scheduled now that our TGM's timers have been detached. I realized the problem isn't the detach operation, it's the restart operation that runs before detach. Timers shouldn't be left active across restart. After fixing restart no change is necessary in detach, but I decided to add assertions to make it clear that this function must be called in a quiesced environment. block/block-backend.c | 2 ++ block/throttle-groups.c | 28 ++++++++++++++++++++++++++++ 2 files changed, 30 insertions(+) diff --git a/block/block-backend.c b/block/block-backend.c index 45d9101be3..da2f6c0f8a 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1748,8 +1748,10 @@ void blk_set_aio_context(BlockBackend *blk, AioConte= xt *new_context) =20 if (bs) { if (tgm->throttle_state) { + bdrv_drained_begin(bs); throttle_group_detach_aio_context(tgm); throttle_group_attach_aio_context(tgm, new_context); + bdrv_drained_end(bs); } bdrv_set_aio_context(bs, new_context); } diff --git a/block/throttle-groups.c b/block/throttle-groups.c index 6ba992c8d7..c13530695a 100644 --- a/block/throttle-groups.c +++ b/block/throttle-groups.c @@ -420,6 +420,23 @@ static void throttle_group_restart_queue(ThrottleGroup= Member *tgm, bool is_write void throttle_group_restart_tgm(ThrottleGroupMember *tgm) { if (tgm->throttle_state) { + ThrottleState *ts =3D tgm->throttle_state; + ThrottleGroup *tg =3D container_of(ts, ThrottleGroup, ts); + ThrottleTimers *tt =3D &tgm->throttle_timers; + + /* Cancel pending timers */ + qemu_mutex_lock(&tg->lock); + if (timer_pending(tt->timers[0])) { + timer_del(tt->timers[0]); + tg->any_timer_armed[0] =3D false; + } + if (timer_pending(tt->timers[1])) { + timer_del(tt->timers[1]); + tg->any_timer_armed[1] =3D false; + } + qemu_mutex_unlock(&tg->lock); + + /* This also schedules the next request in other TGMs, if necessar= y */ throttle_group_restart_queue(tgm, 0); throttle_group_restart_queue(tgm, 1); } @@ -592,6 +609,17 @@ void throttle_group_attach_aio_context(ThrottleGroupMe= mber *tgm, void throttle_group_detach_aio_context(ThrottleGroupMember *tgm) { ThrottleTimers *tt =3D &tgm->throttle_timers; + + /* Requests must have been drained */ + assert(tgm->pending_reqs[0] =3D=3D 0); + assert(tgm->pending_reqs[1] =3D=3D 0); + assert(qemu_co_queue_empty(&tgm->throttled_reqs[0])); + assert(qemu_co_queue_empty(&tgm->throttled_reqs[1])); + + /* Timers must be disabled */ + assert(!timer_pending(tt->timers[0])); + assert(!timer_pending(tt->timers[1])); + throttle_timers_detach_aio_context(tt); tgm->aio_context =3D NULL; } --=20 2.13.5