From nobody Tue Nov 4 22:05:02 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1531214633756957.3212375912254; Tue, 10 Jul 2018 02:23:53 -0700 (PDT) Received: from localhost ([::1]:46683 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fcosC-0006Rp-Lo for importer@patchew.org; Tue, 10 Jul 2018 05:23:52 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:54336) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fconp-00033w-Ac for qemu-devel@nongnu.org; Tue, 10 Jul 2018 05:19:22 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fconm-0002wb-6i for qemu-devel@nongnu.org; Tue, 10 Jul 2018 05:19:21 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51600 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fconm-0002wP-1h for qemu-devel@nongnu.org; Tue, 10 Jul 2018 05:19:18 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9B5EA402346C; Tue, 10 Jul 2018 09:19:17 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-170.pek2.redhat.com [10.72.12.170]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1FB912156889; Tue, 10 Jul 2018 09:19:12 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Tue, 10 Jul 2018 17:18:54 +0800 Message-Id: <20180710091902.28780-3-peterx@redhat.com> In-Reply-To: <20180710091902.28780-1-peterx@redhat.com> References: <20180710091902.28780-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Tue, 10 Jul 2018 09:19:17 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Tue, 10 Jul 2018 09:19:17 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH for-3.0 v2 02/10] migration: loosen recovery check when load vm X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Balamuruhan S , "Dr . David Alan Gilbert" , peterx@redhat.com, Juan Quintela Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" We were checking against -EIO, assuming that it will cover all IO failures. But actually it is not. One example is that in qemu_loadvm_section_start_full() we can have tons of places that will return -EINVAL even if the error is caused by IO failures on the network. Let's loosen the recovery check logic here to cover all the error cases happened by removing the explicit check against -EIO. After all we won't lose anything here if any other failure happened. Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Juan Quintela Signed-off-by: Peter Xu --- migration/savevm.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/migration/savevm.c b/migration/savevm.c index 851d74e8b6..efcc795071 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2276,18 +2276,14 @@ out: qemu_file_set_error(f, ret); =20 /* - * Detect whether it is: - * - * 1. postcopy running (after receiving all device data, which - * must be in POSTCOPY_INCOMING_RUNNING state. Note that - * POSTCOPY_INCOMING_LISTENING is still not enough, it's - * still receiving device states). - * 2. network failure (-EIO) - * - * If so, we try to wait for a recovery. + * If we are during an active postcopy, then we pause instead + * of bail out to at least keep the VM's dirty data. Note + * that POSTCOPY_INCOMING_LISTENING stage is still not enough, + * during which we're still receiving device states and we + * still haven't yet started the VM on destination. */ if (postcopy_state_get() =3D=3D POSTCOPY_INCOMING_RUNNING && - ret =3D=3D -EIO && postcopy_pause_incoming(mis)) { + postcopy_pause_incoming(mis)) { /* Reset f to point to the newly created channel */ f =3D mis->from_src_file; goto retry; --=20 2.17.1