From nobody Wed Oct 29 09:08:32 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 152578780279638.38698692458695; Tue, 8 May 2018 06:56:42 -0700 (PDT) Received: from localhost ([::1]:51376 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fG36f-0004s9-IR for importer@patchew.org; Tue, 08 May 2018 09:56:41 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33073) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fG35H-0004BA-V3 for qemu-devel@nongnu.org; Tue, 08 May 2018 09:55:17 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fG35B-0001y7-WE for qemu-devel@nongnu.org; Tue, 08 May 2018 09:55:15 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:46712 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fG354-0001ux-Np; Tue, 08 May 2018 09:55:02 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1A9CEF95CD; Tue, 8 May 2018 13:54:53 +0000 (UTC) Received: from localhost (ovpn-117-57.ams2.redhat.com [10.36.117.57]) by smtp.corp.redhat.com (Postfix) with ESMTP id BE23A215CDA7; Tue, 8 May 2018 13:54:52 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Date: Tue, 8 May 2018 14:54:35 +0100 Message-Id: <20180508135436.30140-2-stefanha@redhat.com> In-Reply-To: <20180508135436.30140-1-stefanha@redhat.com> References: <20180508135436.30140-1-stefanha@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Tue, 08 May 2018 13:54:53 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Tue, 08 May 2018 13:54:53 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'stefanha@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 1/2] qemu-iotests: reduce chance of races in 185 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Vladimir Sementsov-Ogievskiy , qemu-block@nongnu.org, Jeff Cody , Max Reitz , Stefan Hajnoczi Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Commit 8565c3ab537e78f3e69977ec2c609dc9417a806e ("qemu-iotests: fix 185") identified a race condition in a sub-test. Similar issues also affect the other sub-tests. If disk I/O completes quickly, it races with the QMP 'quit' command. This causes spurious test failures because QMP events are emitted in an unpredictable order. This test relies on QEMU internals and there is no QMP API for getting deterministic behavior needed to make this test 100% reliable. At the same time, the test is useful and it would be a shame to remove it. Add sleep 0.5 to reduce the chance of races. This is not a real fix but appears to reduce spurious failures in practice. Cc: Vladimir Sementsov-Ogievskiy Signed-off-by: Stefan Hajnoczi Reviewed-by: Eric Blake --- tests/qemu-iotests/185 | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/tests/qemu-iotests/185 b/tests/qemu-iotests/185 index 298d88d04e..975404c99d 100755 --- a/tests/qemu-iotests/185 +++ b/tests/qemu-iotests/185 @@ -118,6 +118,9 @@ _send_qemu_cmd $h \ 'speed': 65536 } }" \ "return" =20 +# If we don't sleep here 'quit' command races with disk I/O +sleep 0.5 + _send_qemu_cmd $h "{ 'execute': 'quit' }" "return" wait=3D1 _cleanup_qemu =20 @@ -137,6 +140,9 @@ _send_qemu_cmd $h \ 'speed': 65536 } }" \ "return" =20 +# If we don't sleep here 'quit' command races with disk I/O +sleep 0.5 + _send_qemu_cmd $h "{ 'execute': 'quit' }" "return" wait=3D1 _cleanup_qemu =20 @@ -183,6 +189,9 @@ _send_qemu_cmd $h \ 'speed': 65536 } }" \ "return" =20 +# If we don't sleep here 'quit' command races with disk I/O +sleep 0.5 + _send_qemu_cmd $h "{ 'execute': 'quit' }" "return" wait=3D1 _cleanup_qemu =20 @@ -201,6 +210,9 @@ _send_qemu_cmd $h \ 'speed': 65536 } }" \ "return" =20 +# If we don't sleep here 'quit' command races with disk I/O +sleep 0.5 + _send_qemu_cmd $h "{ 'execute': 'quit' }" "return" wait=3D1 _cleanup_qemu =20 --=20 2.14.3 From nobody Wed Oct 29 09:08:32 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525787900480514.243765732924; Tue, 8 May 2018 06:58:20 -0700 (PDT) Received: from localhost ([::1]:51382 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fG38F-0006Dt-NJ for importer@patchew.org; Tue, 08 May 2018 09:58:19 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33062) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fG35C-000437-Fr for qemu-devel@nongnu.org; Tue, 08 May 2018 09:55:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fG35B-0001xd-6P for qemu-devel@nongnu.org; Tue, 08 May 2018 09:55:10 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:42920 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fG354-0001uw-Of; Tue, 08 May 2018 09:55:02 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BCC93406C740; Tue, 8 May 2018 13:54:54 +0000 (UTC) Received: from localhost (ovpn-117-57.ams2.redhat.com [10.36.117.57]) by smtp.corp.redhat.com (Postfix) with ESMTP id 656C82023227; Tue, 8 May 2018 13:54:54 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Date: Tue, 8 May 2018 14:54:36 +0100 Message-Id: <20180508135436.30140-3-stefanha@redhat.com> In-Reply-To: <20180508135436.30140-1-stefanha@redhat.com> References: <20180508135436.30140-1-stefanha@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Tue, 08 May 2018 13:54:54 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Tue, 08 May 2018 13:54:54 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'stefanha@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 2/2] blockjob: do not cancel timer in resume X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , qemu-block@nongnu.org, Jeff Cody , Max Reitz , QingFeng Hao , Stefan Hajnoczi Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Currently the timer is cancelled and the block job is entered by block_job_resume(). This behavior causes drain to run extra blockjob iterations when the job was sleeping due to the ratelimit. This patch leaves the job asleep when block_job_resume() is called. Jobs can still be forcibly woken up using block_job_enter(), which is used to cancel jobs. After this patch drain no longer runs extra blockjob iterations. This is the expected behavior that qemu-iotests 185 used to rely on. We temporarily changed the 185 test output to make it pass for the QEMU 2.12 release but now it's time to address this issue. Cc: QingFeng Hao Signed-off-by: Stefan Hajnoczi Reviewed-by: Eric Blake Reviewed-by: QingFeng Hao --- blockjob.c | 22 +++++++++++++++------- tests/qemu-iotests/185 | 5 +---- tests/qemu-iotests/185.out | 12 +++++------- 3 files changed, 21 insertions(+), 18 deletions(-) diff --git a/blockjob.c b/blockjob.c index 27f957e571..70643890be 100644 --- a/blockjob.c +++ b/blockjob.c @@ -209,6 +209,18 @@ static void block_job_txn_del_job(BlockJob *job) } } =20 +/* Assumes the block_job_mutex is held */ +static bool block_job_timer_pending(BlockJob *job) +{ + return timer_pending(&job->sleep_timer); +} + +/* Assumes the block_job_mutex is held */ +static bool block_job_timer_not_pending(BlockJob *job) +{ + return !block_job_timer_pending(job); +} + static void block_job_pause(BlockJob *job) { job->pause_count++; @@ -221,7 +233,9 @@ static void block_job_resume(BlockJob *job) if (job->pause_count) { return; } - block_job_enter(job); + + /* kick only if no timer is pending */ + block_job_enter_cond(job, block_job_timer_not_pending); } =20 void block_job_ref(BlockJob *job) @@ -651,12 +665,6 @@ static void block_job_completed_txn_success(BlockJob *= job) } } =20 -/* Assumes the block_job_mutex is held */ -static bool block_job_timer_pending(BlockJob *job) -{ - return timer_pending(&job->sleep_timer); -} - void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp) { Error *local_err =3D NULL; diff --git a/tests/qemu-iotests/185 b/tests/qemu-iotests/185 index 975404c99d..9a2d317414 100755 --- a/tests/qemu-iotests/185 +++ b/tests/qemu-iotests/185 @@ -101,14 +101,11 @@ echo # command to be received (after receiving the command, the rest runs # synchronously, so jobs can arbitrarily continue or complete). # -# Jobs present while QEMU is terminating iterate once more due to -# bdrv_drain_all(). -# # The buffer size for commit and streaming is 512k (waiting for 8 seconds = after # the first request), for active commit and mirror it's large enough to co= ver # the full 4M, and for backup it's the qcow2 cluster size, which we know is # 64k. As all of these are at least as large as the speed, we are sure tha= t the -# offset advances exactly twice before qemu exits. +# offset advances exactly once before qemu exits. =20 _send_qemu_cmd $h \ "{ 'execute': 'block-commit', diff --git a/tests/qemu-iotests/185.out b/tests/qemu-iotests/185.out index 992162f418..57eaf8d699 100644 --- a/tests/qemu-iotests/185.out +++ b/tests/qemu-iotests/185.out @@ -20,7 +20,7 @@ Formatting 'TEST_DIR/t.qcow2', fmt=3Dqcow2 size=3D6710886= 4 backing_file=3DTEST_DIR/t.q {"return": {}} {"return": {}} {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "SHUTDOWN", "data": {"guest": false}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offse= t": 1048576, "speed": 65536, "type": "commit"}} +{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offse= t": 524288, "speed": 65536, "type": "commit"}} =20 =3D=3D=3D Start active commit job and exit qemu =3D=3D=3D =20 @@ -28,8 +28,7 @@ Formatting 'TEST_DIR/t.qcow2', fmt=3Dqcow2 size=3D6710886= 4 backing_file=3DTEST_DIR/t.q {"return": {}} {"return": {}} {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "SHUTDOWN", "data": {"guest": false}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304, "offset": 4= 194304, "speed": 65536, "type": "commit"}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_COMPLETED", "data": {"device": "disk", "len": 4194304, "offset= ": 4194304, "speed": 65536, "type": "commit"}} +{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset= ": 4194304, "speed": 65536, "type": "commit"}} =20 =3D=3D=3D Start mirror job and exit qemu =3D=3D=3D =20 @@ -38,8 +37,7 @@ Formatting 'TEST_DIR/t.qcow2.copy', fmt=3Dqcow2 size=3D67= 108864 cluster_size=3D65536 l {"return": {}} {"return": {}} {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "SHUTDOWN", "data": {"guest": false}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304, "offset": 4= 194304, "speed": 65536, "type": "mirror"}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_COMPLETED", "data": {"device": "disk", "len": 4194304, "offset= ": 4194304, "speed": 65536, "type": "mirror"}} +{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset= ": 4194304, "speed": 65536, "type": "mirror"}} =20 =3D=3D=3D Start backup job and exit qemu =3D=3D=3D =20 @@ -48,7 +46,7 @@ Formatting 'TEST_DIR/t.qcow2.copy', fmt=3Dqcow2 size=3D67= 108864 cluster_size=3D65536 l {"return": {}} {"return": {}} {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "SHUTDOWN", "data": {"guest": false}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offse= t": 131072, "speed": 65536, "type": "backup"}} +{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offse= t": 65536, "speed": 65536, "type": "backup"}} =20 =3D=3D=3D Start streaming job and exit qemu =3D=3D=3D =20 @@ -56,6 +54,6 @@ Formatting 'TEST_DIR/t.qcow2.copy', fmt=3Dqcow2 size=3D67= 108864 cluster_size=3D65536 l {"return": {}} {"return": {}} {"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "SHUTDOWN", "data": {"guest": false}} -{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offse= t": 1048576, "speed": 65536, "type": "stream"}} +{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event"= : "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 67108864, "offse= t": 524288, "speed": 65536, "type": "stream"}} No errors were found on the image. *** done --=20 2.14.3