From nobody Sat May 4 15:45:29 2024
From: Juan Quintela
To: qemu-devel@nongnu.org
Date: Tue, 21 Nov 2017 19:47:37 +0100
Message-Id:
 <20171121184738.8502-2-quintela@redhat.com>
In-Reply-To: <20171121184738.8502-1-quintela@redhat.com>
References: <20171121184738.8502-1-quintela@redhat.com>
Subject: [Qemu-devel] [PULL 1/2] migration, xen: Fix block image lock issue on live migration
Cc: lvivier@redhat.com, Anthony PERARD, dgilbert@redhat.com, peterx@redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Anthony PERARD

When doing a live migration of a Xen guest with libxl, the images for
block devices are locked by the original QEMU process. This prevents the
QEMU at the destination from taking the lock, so the migration fails.

Seen from QEMU, once the RAM of a domain has been migrated, libxl issues
two QMP commands, "stop" then "xen-save-devices-state", at which point a
new QEMU is spawned at the destination.

Release the locks in "xen-save-devices-state" so that the destination can
take them, if it's a live migration.

This patch adds the "live" parameter to "xen-save-devices-state". It
defaults to true so that older versions of libxenlight can work with
newer versions of QEMU.

Signed-off-by: Anthony PERARD
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/savevm.c  | 23 ++++++++++++++++++++++-
 qapi/migration.json |  6 +++++-
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 192f2d82cd..b7908f62be 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2242,13 +2242,20 @@ int save_snapshot(const char *name, Error **errp)
     return ret;
 }
 
-void qmp_xen_save_devices_state(const char *filename, Error **errp)
+void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
+                                Error **errp)
 {
     QEMUFile *f;
     QIOChannelFile *ioc;
     int saved_vm_running;
     int ret;
 
+    if (!has_live) {
+        /* live default to true so old version of Xen tool stack can have a
+         * successful live migration */
+        live = true;
+    }
+
     saved_vm_running = runstate_is_running();
     vm_stop(RUN_STATE_SAVE_VM);
     global_state_store_running();
@@ -2263,6 +2270,20 @@ void qmp_xen_save_devices_state(const char *filename, Error **errp)
     qemu_fclose(f);
     if (ret < 0) {
         error_setg(errp, QERR_IO_ERROR);
+    } else {
+        /* libxl calls the QMP command "stop" before calling
+         * "xen-save-devices-state" and in case of migration failure, libxl
+         * would call "cont".
+         * So call bdrv_inactivate_all (release locks) here to let the other
+         * side of the migration take control of the images.
+         */
+        if (live && !saved_vm_running) {
+            ret = bdrv_inactivate_all();
+            if (ret) {
+                error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
+                           __func__, ret);
+            }
+        }
     }
 
 the_end:
diff --git a/qapi/migration.json b/qapi/migration.json
index bbc4671ded..03f57c9616 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1075,6 +1075,9 @@
 # data. See xen-save-devices-state.txt for a description of the binary
 # format.
 #
+# @live: Optional argument to ask QEMU to treat this command as part of a live
+# migration. Default to true. (since 2.11)
+#
 # Returns: Nothing on success
 #
 # Since: 1.1
@@ -1086,7 +1089,8 @@
 # <- { "return": {} }
 #
 ##
-{ 'command': 'xen-save-devices-state', 'data': {'filename': 'str'} }
+{ 'command': 'xen-save-devices-state',
+  'data': {'filename': 'str', '*live':'bool' } }
 
 ##
 # @xen-set-replication:
-- 
2.13.6

From nobody Sat May 4 15:45:29 2024
From: Juan Quintela
To: qemu-devel@nongnu.org
Date: Tue, 21 Nov 2017 19:47:38 +0100
Message-Id: <20171121184738.8502-3-quintela@redhat.com>
In-Reply-To: <20171121184738.8502-1-quintela@redhat.com>
References: <20171121184738.8502-1-quintela@redhat.com>
Subject: [Qemu-devel] [PULL 2/2] migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END
Cc: lvivier@redhat.com, Daniel Henrique Barboza, dgilbert@redhat.com, peterx@redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Daniel Henrique Barboza

When migrating a VM with 'migrate_set_capability postcopy-ram on', a
postcopy_state is set during the process, ending up in the state
POSTCOPY_INCOMING_END when the migration is over. This postcopy_state is
taken into account inside ram_load to decide how it loads the memory
pages. The same ram_load is also called by the loadvm command.

Inside ram_load, the check for the postcopy_running state is:

postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING

postcopy_state_get() returns this enum type:

typedef enum {
    POSTCOPY_INCOMING_NONE = 0,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_DISCARD,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
    POSTCOPY_INCOMING_END
} PostcopyState;

If ram_load is executed while postcopy_state is POSTCOPY_INCOMING_END,
postcopy_running is set to 'true' and ram_load behaves as if a postcopy
were in progress. This scenario cannot occur during a migration, but it
is reproducible by executing savevm/loadvm after migrating with
'postcopy-ram on', which causes loadvm to fail with Error -22:

Source:

(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate tcp:127.0.0.1:4444

Dest:

(qemu) migrate_set_capability postcopy-ram on
(qemu)
ubuntu1704-intel login:
Ubuntu 17.04 ubuntu1704-intel ttyS0

ubuntu1704-intel login: (qemu)
(qemu) savevm test1
(qemu) loadvm test1
Unknown combination of migration flags: 0x4 (postcopy mode)
error while loading state for instance 0x0 of device 'ram'
Error -22 while loading VM state
(qemu)

This patch fixes the problem by slightly changing the semantics of
postcopy_running inside ram_load: it first checks whether the state is
POSTCOPY_INCOMING_END and, if so, leaves postcopy_running set to 'false'.

Signed-off-by: Daniel Henrique Barboza
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/ram.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8620aa400a..43ed719668 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2803,13 +2803,21 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     int flags = 0, ret = 0, invalid_flags = 0;
     static uint64_t seq_iter;
     int len = 0;
-    /*
-     * If system is running in postcopy mode, page inserts to host memory must
-     * be atomic
-     */
-    bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
-    /* ADVISE is earlier, it shows the source has the postcopy capability on */
-    bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
+    bool postcopy_advised = false, postcopy_running = false;
+    uint8_t postcopy_state = postcopy_state_get();
+
+    if (postcopy_state != POSTCOPY_INCOMING_END) {
+        /*
+         * If system is running in postcopy mode, page inserts to host memory
+         * must be atomic
+         */
+        postcopy_running = postcopy_state >= POSTCOPY_INCOMING_LISTENING;
+
+        /* ADVISE is earlier, it shows the source has the postcopy
+         * capability on
+         */
+        postcopy_advised = postcopy_state >= POSTCOPY_INCOMING_ADVISE;
+    }
 
     seq_iter++;
 
-- 
2.13.6
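
The ram_load change in the patch above can be tried outside QEMU with a
minimal standalone sketch of the flag computation before and after the
fix. The PostcopyState enum is copied from the commit message. The
PostcopyFlags struct and the two helper functions are hypothetical names
introduced only for this illustration; they are assumptions and do not
exist in QEMU.

```c
#include <stdbool.h>

/* Same ordering as the PostcopyState enum quoted in the commit message. */
typedef enum {
    POSTCOPY_INCOMING_NONE = 0,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_DISCARD,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
    POSTCOPY_INCOMING_END
} PostcopyState;

/* Hypothetical container for the two flags that ram_load derives. */
typedef struct {
    bool advised;
    bool running;
} PostcopyFlags;

/* Hypothetical helper: plain ordered comparisons, as before the patch.
 * POSTCOPY_INCOMING_END is the largest value, so both flags become true. */
PostcopyFlags flags_before_patch(PostcopyState state)
{
    PostcopyFlags flags = {
        .advised = state >= POSTCOPY_INCOMING_ADVISE,
        .running = state >= POSTCOPY_INCOMING_LISTENING,
    };
    return flags;
}

/* Hypothetical helper: as after the patch, both flags stay false when the
 * state is POSTCOPY_INCOMING_END; other states are compared as before. */
PostcopyFlags flags_after_patch(PostcopyState state)
{
    PostcopyFlags flags = { .advised = false, .running = false };

    if (state != POSTCOPY_INCOMING_END) {
        flags.advised = state >= POSTCOPY_INCOMING_ADVISE;
        flags.running = state >= POSTCOPY_INCOMING_LISTENING;
    }
    return flags;
}
```

Calling both helpers for every state shows that they differ only for
POSTCOPY_INCOMING_END: the old comparison reports postcopy as advised and
running, while the new one reports neither.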