From nobody Sat May 4 12:05:07 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 151134037894884.8212789727811; Wed, 22 Nov 2017 00:46:18 -0800 (PST)
Received: from localhost ([::1]:38395 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eHQff-0008OB-8S for importer@patchew.org; Wed, 22 Nov 2017 03:46:15 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36007) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eHQep-0007y2-BQ for qemu-devel@nongnu.org; Wed, 22 Nov 2017 03:45:24 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eHQeo-0004Bw-5p for qemu-devel@nongnu.org; Wed, 22 Nov 2017 03:45:23 -0500
Received: from mx1.redhat.com ([209.132.183.28]:49510) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1eHQen-0004Bm-MW for qemu-devel@nongnu.org; Wed, 22 Nov 2017 03:45:22 -0500
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BEB1C49032; Wed, 22 Nov 2017 08:45:20 +0000 (UTC)
Received: from secure.mitica (ovpn-116-72.ams2.redhat.com [10.36.116.72]) by smtp.corp.redhat.com (Postfix) with ESMTP id B402960BEC; Wed, 22 Nov 2017 08:45:16 +0000 (UTC)
From: Juan Quintela
To: qemu-devel@nongnu.org
Date: Wed, 22 Nov 2017 09:45:03 +0100
Message-Id:
 <20171122084504.11984-2-quintela@redhat.com>
In-Reply-To: <20171122084504.11984-1-quintela@redhat.com>
References: <20171122084504.11984-1-quintela@redhat.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Wed, 22 Nov 2017 08:45:20 +0000 (UTC)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 209.132.183.28
Subject: [Qemu-devel] [PULL 1/2] migration, xen: Fix block image lock issue on live migration
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: lvivier@redhat.com, Anthony PERARD, dgilbert@redhat.com, peterx@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Anthony PERARD

When doing a live migration of a Xen guest with libxl, the images for
block devices are locked by the original QEMU process. This prevents the
QEMU at the destination from taking the lock, and the migration fails.

From QEMU's point of view, once the RAM of a domain is migrated, there are
two QMP commands, "stop" then "xen-save-devices-state", at which point a
new QEMU is spawned at the destination.

Release the locks in "xen-save-devices-state" so the destination can take
them, if it's a live migration.

This patch adds the "live" parameter to "xen-save-devices-state", which
defaults to true so that older versions of libxenlight can work with newer
versions of QEMU.

Signed-off-by: Anthony PERARD
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/savevm.c  | 23 ++++++++++++++++++++++-
 qapi/migration.json |  6 +++++-
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 192f2d82cd..b7908f62be 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2242,13 +2242,20 @@ int save_snapshot(const char *name, Error **errp)
     return ret;
 }
 
-void qmp_xen_save_devices_state(const char *filename, Error **errp)
+void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
+                                Error **errp)
 {
     QEMUFile *f;
     QIOChannelFile *ioc;
     int saved_vm_running;
     int ret;
 
+    if (!has_live) {
+        /* live default to true so old version of Xen tool stack can have a
+         * successfull live migration */
+        live = true;
+    }
+
     saved_vm_running = runstate_is_running();
     vm_stop(RUN_STATE_SAVE_VM);
     global_state_store_running();
@@ -2263,6 +2270,20 @@ void qmp_xen_save_devices_state(const char *filename, Error **errp)
     qemu_fclose(f);
     if (ret < 0) {
         error_setg(errp, QERR_IO_ERROR);
+    } else {
+        /* libxl calls the QMP command "stop" before calling
+         * "xen-save-devices-state" and in case of migration failure, libxl
+         * would call "cont".
+         * So call bdrv_inactivate_all (release locks) here to let the other
+         * side of the migration take controle of the images.
+         */
+        if (live && !saved_vm_running) {
+            ret = bdrv_inactivate_all();
+            if (ret) {
+                error_setg(errp, "%s: bdrv_inactivate_all() failed (%d)",
+                           __func__, ret);
+            }
+        }
     }
 
  the_end:
diff --git a/qapi/migration.json b/qapi/migration.json
index bbc4671ded..03f57c9616 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1075,6 +1075,9 @@
 #        data. See xen-save-devices-state.txt for a description of the binary
 #        format.
 #
+# @live: Optional argument to ask QEMU to treat this command as part of a live
+# migration. Default to true. (since 2.11)
+#
 # Returns: Nothing on success
 #
 # Since: 1.1
@@ -1086,7 +1089,8 @@
 # <- { "return": {} }
 #
 ##
-{ 'command': 'xen-save-devices-state', 'data': {'filename': 'str'} }
+{ 'command': 'xen-save-devices-state',
+  'data': {'filename': 'str', '*live':'bool' } }
 
 ##
 # @xen-set-replication:
-- 
2.13.6

From nobody Sat May 4 12:05:07 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1511340518687955.3858764565468; Wed, 22 Nov 2017 00:48:38 -0800 (PST)
Received: from localhost ([::1]:38401 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eHQhv-0001Lg-PY for importer@patchew.org; Wed, 22 Nov 2017 03:48:35 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36036) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eHQev-00083h-LC for qemu-devel@nongnu.org; Wed, 22 Nov 2017 03:45:30 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eHQes-0004D6-8p for qemu-devel@nongnu.org; Wed, 22 Nov 2017 03:45:29 -0500
Received: from mx1.redhat.com ([209.132.183.28]:40318) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1eHQer-0004Cq-Ve for qemu-devel@nongnu.org; Wed, 22 Nov 2017 03:45:26 -0500
Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 1ED0537E6E; Wed, 22 Nov
 2017 08:45:25 +0000 (UTC)
Received: from secure.mitica (ovpn-116-72.ams2.redhat.com [10.36.116.72]) by smtp.corp.redhat.com (Postfix) with ESMTP id 20BB760BEC; Wed, 22 Nov 2017 08:45:20 +0000 (UTC)
From: Juan Quintela
To: qemu-devel@nongnu.org
Date: Wed, 22 Nov 2017 09:45:04 +0100
Message-Id: <20171122084504.11984-3-quintela@redhat.com>
In-Reply-To: <20171122084504.11984-1-quintela@redhat.com>
References: <20171122084504.11984-1-quintela@redhat.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Wed, 22 Nov 2017 08:45:25 +0000 (UTC)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 209.132.183.28
Subject: [Qemu-devel] [PULL 2/2] migration/ram.c: do not set 'postcopy_running' in POSTCOPY_INCOMING_END
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: lvivier@redhat.com, Daniel Henrique Barboza, dgilbert@redhat.com, peterx@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Daniel Henrique Barboza

When migrating a VM with 'migrate_set_capability postcopy-ram on', a
postcopy_state is set during the process, ending up with the state
POSTCOPY_INCOMING_END when the migration is over. This postcopy_state is
taken into account inside ram_load to decide how to load the memory
pages. The same ram_load is also called by the loadvm command.

Inside ram_load, the logic to see if we're at postcopy_running state is:

postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING

postcopy_state_get() returns this enum type:

typedef enum {
    POSTCOPY_INCOMING_NONE = 0,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_DISCARD,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
    POSTCOPY_INCOMING_END
} PostcopyState;

In the case where ram_load is executed and postcopy_state is
POSTCOPY_INCOMING_END, postcopy_running will be set to 'true' and ram_load
will behave as if a postcopy were in progress. This scenario isn't
achievable in a migration, but it is reproducible when executing
savevm/loadvm after migrating with 'postcopy-ram on', causing loadvm to
fail with Error -22:

Source:

(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate tcp:127.0.0.1:4444

Dest:

(qemu) migrate_set_capability postcopy-ram on
(qemu)
ubuntu1704-intel login:
Ubuntu 17.04 ubuntu1704-intel ttyS0

ubuntu1704-intel login: (qemu)
(qemu) savevm test1
(qemu) loadvm test1
Unknown combination of migration flags: 0x4 (postcopy mode)
error while loading state for instance 0x0 of device 'ram'
Error -22 while loading VM state
(qemu)

This patch fixes this problem by changing the existing logic for
postcopy_advised and postcopy_running in ram_load, making them 'false' if
we're at POSTCOPY_INCOMING_END state.

Signed-off-by: Daniel Henrique Barboza
CC: Juan Quintela
CC: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
Reported-by: Balamuruhan S
Signed-off-by: Juan Quintela
---
 migration/ram.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8620aa400a..021d583b9b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2798,6 +2798,18 @@ static int ram_load_postcopy(QEMUFile *f)
     return ret;
 }
 
+static bool postcopy_is_advised(void)
+{
+    PostcopyState ps = postcopy_state_get();
+    return ps >= POSTCOPY_INCOMING_ADVISE && ps < POSTCOPY_INCOMING_END;
+}
+
+static bool postcopy_is_running(void)
+{
+    PostcopyState ps = postcopy_state_get();
+    return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
+}
+
 static int ram_load(QEMUFile *f, void *opaque, int version_id)
 {
     int flags = 0, ret = 0, invalid_flags = 0;
@@ -2807,9 +2819,9 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
      * If system is running in postcopy mode, page inserts to host memory must
      * be atomic
      */
-    bool postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING;
+    bool postcopy_running = postcopy_is_running();
     /* ADVISE is earlier, it shows the source has the postcopy capability on */
-    bool postcopy_advised = postcopy_state_get() >= POSTCOPY_INCOMING_ADVISE;
+    bool postcopy_advised = postcopy_is_advised();
 
     seq_iter++;
 
-- 
2.13.6
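
Editorial note on patch 2/2: the change amounts to turning an open-ended
comparison into a half-open range check. The standalone sketch below is
not QEMU code. It copies the PostcopyState enum from the commit message,
and the helper names (old_postcopy_running, new_postcopy_running,
new_postcopy_advised) are hypothetical. They take the state as a parameter
instead of calling postcopy_state_get(), so the comparison can be tried in
isolation.

```c
#include <stdbool.h>

/* Copy of the enum quoted in the commit message of patch 2/2. */
typedef enum {
    POSTCOPY_INCOMING_NONE = 0,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_DISCARD,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
    POSTCOPY_INCOMING_END
} PostcopyState;

/*
 * Hypothetical stand-in for the pre-patch expression in ram_load:
 * every state at or after LISTENING counts as running, END included.
 */
static bool old_postcopy_running(PostcopyState ps)
{
    return ps >= POSTCOPY_INCOMING_LISTENING;
}

/*
 * Hypothetical stand-in for postcopy_is_running() from the patch:
 * the range is half-open, [LISTENING, END), so END is excluded.
 */
static bool new_postcopy_running(PostcopyState ps)
{
    return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END;
}

/*
 * Hypothetical stand-in for postcopy_is_advised() from the patch:
 * same idea with the lower bound at ADVISE, [ADVISE, END).
 */
static bool new_postcopy_advised(PostcopyState ps)
{
    return ps >= POSTCOPY_INCOMING_ADVISE && ps < POSTCOPY_INCOMING_END;
}
```

The two running checks disagree only for POSTCOPY_INCOMING_END, which is
the state a destination is left in after a postcopy migration and the one
the loadvm reproducer in the commit message hits.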