From: Markus Armbruster
To: qemu-devel@nongnu.org
Cc: kwolf@redhat.com, Paolo Bonzini, qemu-block@nongnu.org, mreitz@redhat.com
Date: Mon, 23 Apr 2018 10:45:18 +0200
Message-Id: <20180423084518.2426-1-armbru@redhat.com>
Subject: [Qemu-devel] [PATCH] cpus: Fix event order on resume of stopped guest

When resume of a stopped guest immediately runs into block device
errors, the BLOCK_IO_ERROR event is sent before the RESUME event.

Reproducer:

1. Create a scratch image

   $ dd if=/dev/zero of=scratch.img bs=1M count=100

   Size doesn't actually matter.

2. Prepare blkdebug configuration:

   $ cat >blkdebug.conf < '

   Issue QMP command 'qmp_capabilities':

   QMP> { "execute": "qmp_capabilities" }

5. Boot the guest.

6. In the guest, write to the scratch disk, e.g. like this:

   # dd if=/dev/zero of=/dev/vdb count=1

   Do double-check the device specified with of= is actually the
   scratch device!

7. Issue QMP command 'cont':

   QMP> { "execute": "cont" }

After step 6, I get a BLOCK_IO_ERROR event followed by a STOP event.
Good.

After step 7, I get BLOCK_IO_ERROR, then RESUME, then STOP. Not so
good; I'd expect RESUME, then BLOCK_IO_ERROR, then STOP.

The funny event order confuses libvirt: virsh -r domstate DOMAIN
--reason reports "paused (unknown)" rather than "paused (I/O error)".

The culprit is vm_prepare_start().

    /* Ensure that a STOP/RESUME pair of events is emitted if a
     * vmstop request was pending. The BLOCK_IO_ERROR event, for
     * example, according to documentation is always followed by
     * the STOP event.
     */
    if (runstate_is_running()) {
        qapi_event_send_stop(&error_abort);
        res = -1;
    } else {
        replay_enable_events();
        cpu_enable_ticks();
        runstate_set(RUN_STATE_RUNNING);
        vm_state_notify(1, RUN_STATE_RUNNING);
    }

    /* We are sending this now, but the CPUs will be resumed shortly later */
    qapi_event_send_resume(&error_abort);
    return res;

When resuming a stopped guest, we take the else branch before we get
to sending RESUME. vm_state_notify() runs virtio_vmstate_change(),
among other things. This restarts I/O, triggering the BLOCK_IO_ERROR
event.

Reshuffle vm_prepare_start() to send the RESUME event earlier.

Fixes RHBZ 1566153.

Cc: Paolo Bonzini
Signed-off-by: Markus Armbruster
---
 cpus.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/cpus.c b/cpus.c
index 38eba8bff3..398392bc3a 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2043,7 +2043,6 @@ int vm_stop(RunState state)
 int vm_prepare_start(void)
 {
     RunState requested;
-    int res = 0;
 
     qemu_vmstop_requested(&requested);
     if (runstate_is_running() && requested == RUN_STATE__MAX) {
@@ -2057,17 +2056,18 @@ int vm_prepare_start(void)
      */
     if (runstate_is_running()) {
         qapi_event_send_stop(&error_abort);
-        res = -1;
-    } else {
-        replay_enable_events();
-        cpu_enable_ticks();
-        runstate_set(RUN_STATE_RUNNING);
-        vm_state_notify(1, RUN_STATE_RUNNING);
+        qapi_event_send_resume(&error_abort);
+        return -1;
     }
 
     /* We are sending this now, but the CPUs will be resumed shortly later */
     qapi_event_send_resume(&error_abort);
-    return res;
+
+    replay_enable_events();
+    cpu_enable_ticks();
+    runstate_set(RUN_STATE_RUNNING);
+    vm_state_notify(1, RUN_STATE_RUNNING);
+    return 0;
 }
 
 void vm_start(void)
-- 
2.13.6
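
For illustration only, here is a toy model of the reordering in vm_prepare_start(). It is a sketch under stated assumptions, not QEMU code: every toy_* name and the ToyVm type are made up for this example, the deferred stop is modelled as a plain flag, and toy_pause_reason() is a hypothetical client that forgets a recorded error on RESUME. Whether libvirt tracks the pause reason this way is an assumption, but it would explain the "paused (unknown)" symptom.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model: every name below is hypothetical, none is a QEMU or libvirt API. */
#define MAX_EVENTS 8

typedef struct {
    const char *events[MAX_EVENTS];
    int count;
    bool stop_pending;      /* assumption: the stop request is handled later */
} ToyVm;

void toy_emit(ToyVm *vm, const char *event)
{
    assert(vm->count < MAX_EVENTS);
    vm->events[vm->count++] = event;
}

/* Stands in for vm_state_notify(): restarted I/O fails at once. */
void toy_state_notify(ToyVm *vm)
{
    toy_emit(vm, "BLOCK_IO_ERROR");
    vm->stop_pending = true;
}

/* Stands in for the later point where the pending stop is carried out. */
void toy_handle_stop(ToyVm *vm)
{
    if (vm->stop_pending) {
        toy_emit(vm, "STOP");
        vm->stop_pending = false;
    }
}

/* Order before the patch: notifiers first, RESUME afterwards. */
void toy_resume_old(ToyVm *vm)
{
    toy_state_notify(vm);
    toy_emit(vm, "RESUME");
    toy_handle_stop(vm);
}

/* Order after the patch: RESUME first, then the notifiers. */
void toy_resume_new(ToyVm *vm)
{
    toy_emit(vm, "RESUME");
    toy_state_notify(vm);
    toy_handle_stop(vm);
}

/* Hypothetical client: remembers the latest error until the next RESUME
 * and reports it as the pause reason when STOP arrives. */
const char *toy_pause_reason(const ToyVm *vm)
{
    const char *pending = "unknown";
    const char *reason = "running";

    for (int i = 0; i < vm->count; i++) {
        if (strcmp(vm->events[i], "BLOCK_IO_ERROR") == 0) {
            pending = "I/O error";
        } else if (strcmp(vm->events[i], "RESUME") == 0) {
            pending = "unknown";
            reason = "running";
        } else if (strcmp(vm->events[i], "STOP") == 0) {
            reason = pending;
        }
    }
    return reason;
}

/* Joins the recorded events into buf as "A, B, C"; size must be nonzero. */
void toy_format(const ToyVm *vm, char *buf, size_t size)
{
    size_t used = 0;

    buf[0] = '\0';
    for (int i = 0; i < vm->count && used < size; i++) {
        int n = snprintf(buf + used, size - used, "%s%s",
                         i ? ", " : "", vm->events[i]);
        if (n < 0) {
            break;
        }
        used += (size_t)n;
    }
}
```

Calling toy_resume_old() on a zero-initialized ToyVm records BLOCK_IO_ERROR, RESUME, STOP, and the hypothetical client ends up with "unknown". toy_resume_new() records RESUME, BLOCK_IO_ERROR, STOP, and the client reports "I/O error".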