From nobody Sat May 30 19:21:50 2026
From: "Denis V. Lunev" via qemu development
Reply-to: "Denis V. Lunev"
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, qemu-stable@nongnu.org, "Denis V. Lunev",
 Kevin Wolf, Hanna Reitz, Stefan Hajnoczi, Fiona Ebner
Subject: [PATCH 1/2] block/graph-lock: fix missed wakeup in bdrv_graph_co_rdunlock()
Date: Fri, 24 Apr 2026 12:39:16 +0200
Message-ID: <20260424103917.248668-2-den@openvz.org>
In-Reply-To: <20260424103917.248668-1-den@openvz.org>
References: <20260424103917.248668-1-den@openvz.org>

tests/qemu-iotests/tests/iothreads-create reproduces the hang on master
under `stress-ng --cpu $(nproc) --timeout 0`. The iotest's vm.run_job()
times out and qemu stays permanently stuck in ppoll(timeout=-1) inside
bdrv_graph_wrlock_drained -> blk_remove_bs during qemu_cleanup().
The timing window is narrow on modern bare-metal hardware and much wider
in a VM guest; downstream trees that still use plain bdrv_graph_wrlock()
in blk_remove_bs() hit it on the first iteration under the same stress.

bdrv_graph_wrlock() zeroes has_writer around its AIO_WAIT_WHILE loop so
that callbacks dispatched by aio_poll() can still take the read lock on
the fast path. The rdunlock side, however, only kicks a waiting writer
when has_writer is observed set; a reader that drops its lock inside the
polling window silently returns and nothing ever wakes the writer:

    main thread                           iothread0 coroutine
    -----------                           -------------------
    bdrv_graph_wrlock:                    rdlock held, reader_count=1
      bdrv_drain_all_begin_nopoll
      has_writer = 0
      AIO_WAIT_WHILE_UNLOCKED(
          NULL, reader_count >= 1):
        num_waiters++
        smp_mb
        aio_poll(main_ctx, true)    -->   bdrv_graph_co_rdunlock:
          (ppoll, blocked)                  reader_count-- -> 0
                                            smp_mb
                                            read has_writer = 0
                                            skip aio_wait_kick()
                                            return

reader_count is now 0 and num_waiters is still 1, but no BH, fd or timer
on the main AioContext will fire -- the only entity that could kick just
decided it did not have to. Main stays in ppoll() holding BQL, so RCU,
VCPUs and any iothread path that needs BQL stall behind it. The hang is
final; no timeout, no forward progress, no recovery as there is no other
source of wake up inside qemu_cleanup().

bdrv_drain_all_begin() does not close the race on its own: it quiesces
in-flight I/O, but graph readers also include non-I/O coroutines
(block-job cleanup, virtio-scsi polling) that drain does not evict. The
bdrv_graph_wrlock_drained() wrapper narrows the window but does not
eliminate it; every plain bdrv_graph_wrlock() site is exposed on the
same basis.

Drop the has_writer check in bdrv_graph_co_rdunlock() and call
aio_wait_kick() unconditionally.
The helper itself loads num_waiters atomically and only schedules a
dummy BH when a waiter exists, so the change is a no-op on the no-writer
path and closes the missed-wakeup on the writer path.

Signed-off-by: Denis V. Lunev
Cc: Kevin Wolf
Cc: Hanna Reitz
Cc: Stefan Hajnoczi
Cc: Fiona Ebner
Reviewed-by: Kevin Wolf
---
 block/graph-lock.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/block/graph-lock.c b/block/graph-lock.c
index b7319473a1..f2501d75fb 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -278,14 +278,12 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
     smp_mb();
 
     /*
-     * has_writer == 0: this means reader will read reader_count decreased
-     * has_writer == 1: we don't know if writer read reader_count old or
-     *                  new. Therefore, kick again so on next iteration
-     *                  writer will for sure read the updated value.
+     * Always kick: bdrv_graph_wrlock() zeroes has_writer while polling (to
+     * let callbacks take the reader lock via the fast path), so we cannot
+     * rely on has_writer to detect a waiting writer. aio_wait_kick() is a
+     * no-op when no one is waiting, so it is cheap in the common case.
      */
-    if (qatomic_read(&has_writer)) {
-        aio_wait_kick();
-    }
+    aio_wait_kick();
 }
 
 void bdrv_graph_rdlock_main_loop(void)
-- 
2.51.0

From nobody Sat May 30 19:21:50 2026
From: "Denis V. Lunev" via qemu development
Reply-to: "Denis V. Lunev"
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, qemu-stable@nongnu.org, "Denis V. Lunev",
 Hanna Czenczek, Kevin Wolf
Subject: [PATCH 2/2] block/qcow2: fix hangup in cache_clean_timer cancellation
Date: Fri, 24 Apr 2026 12:39:17 +0200
Message-ID: <20260424103917.248668-3-den@openvz.org>
In-Reply-To: <20260424103917.248668-1-den@openvz.org>
References: <20260424103917.248668-1-den@openvz.org>

cache_clean_timer_del_and_wait() cancels the cache-cleaner coroutine by
setting s->cache_clean_interval = 0 and calling qemu_co_sleep_wake() to
cut short its qemu_co_sleep_ns_wakeable(). qemu_co_sleep_wake() is
fire-and-forget: it reads w->to_wake and silently returns when it is
NULL.
A sleeper that is between two iterations -- has just released s->lock
but has not yet set w->to_wake inside qemu_co_sleep() -- loses the wake:

    iothread0 timer coroutine        main thread (qcow2 close)
    -------------------------        -------------------------
    while-body (holding s->lock):
      read interval = 600
      wait_ns = 600 * NS
      release s->lock
                                     take s->lock
                                     interval = 0
                                     qemu_co_sleep_wake(w):
                                       w->to_wake == NULL -> skip
                                       return
                                     qemu_co_queue_wait(exit, s->lock):
                                       release s->lock
                                       yield
    qemu_co_sleep_ns_wakeable:
      aio_timer_init(+600 s)
      qemu_co_sleep:
        cas scheduled NULL -> "qsns"
        w->to_wake = co
        yield    [sleeps 600 s]

cache_clean_timer_del_and_wait() is now stuck waiting for
cache_clean_timer_exit; the timer will not signal it until its original
600 s expiry fires. qcow2_close() is on the main thread holding BQL, so
RCU, VCPUs and every iothread path that needs BQL stall behind it.

qemu_co_sleep_wake() has always been a hint: it has no way to rendezvous
with a sleeper still arming. Rather than mutate it (which would change
semantics for every other user -- mirror, stream, backup), fix the
caller.

Split the sleep in cache_clean_timer() into steps of at most one second
and move the s->cache_clean_interval check to the top of the loop so it
is re-evaluated under s->lock between steps. The loop/wait structure
itself is unchanged.

The stop decision is now made under the same lock that the teardown
caller holds to set cache_clean_interval = 0, so it cannot be missed.
qemu_co_sleep_wake() is still called opportunistically to cut short the
current step; if it misses, the next 1 s tick catches the change.
Worst-case cancellation latency is bounded at 1 s, independent of
cache_clean_interval.

Fixes: f86dde9a15 ("qcow2: Fix cache_clean_timer")
Signed-off-by: Denis V. Lunev
Cc: Hanna Czenczek
Cc: Kevin Wolf
---
 block/qcow2.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index f6461743d2..3e249970d6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -838,24 +838,30 @@ static const char *overlap_bool_option_names[QCOW2_OL_MAX_BITNR] = {
 static void coroutine_fn cache_clean_timer(void *opaque)
 {
     BDRVQcow2State *s = opaque;
-    uint64_t wait_ns;
+    uint64_t remaining_ns = 0;
 
-    WITH_QEMU_LOCK_GUARD(&s->lock) {
-        wait_ns = s->cache_clean_interval * NANOSECONDS_PER_SECOND;
-    }
-
-    while (wait_ns > 0) {
-        qemu_co_sleep_ns_wakeable(&s->cache_clean_timer_wake,
-                                  QEMU_CLOCK_REALTIME, wait_ns);
+    for (;;) {
+        bool stop = false;
+        uint64_t step;
 
         WITH_QEMU_LOCK_GUARD(&s->lock) {
-            if (s->cache_clean_interval > 0) {
+            if (s->cache_clean_interval == 0) {
+                stop = true;
+            } else if (remaining_ns == 0) {
                 qcow2_cache_clean_unused(s->l2_table_cache);
                 qcow2_cache_clean_unused(s->refcount_block_cache);
+                remaining_ns = s->cache_clean_interval
+                               * (uint64_t)NANOSECONDS_PER_SECOND;
             }
-
-            wait_ns = s->cache_clean_interval * NANOSECONDS_PER_SECOND;
         }
+        if (stop) {
+            break;
+        }
+
+        step = MIN(remaining_ns, (uint64_t)NANOSECONDS_PER_SECOND);
+        qemu_co_sleep_ns_wakeable(&s->cache_clean_timer_wake,
+                                  QEMU_CLOCK_REALTIME, step);
+        remaining_ns -= step;
     }
 
     WITH_QEMU_LOCK_GUARD(&s->lock) {
-- 
2.51.0