From nobody Sat Apr 27 17:02:04 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1499947118126692.2329011690763; Thu, 13 Jul 2017 04:58:38 -0700 (PDT) Received: from localhost ([::1]:59043 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVclQ-0005kp-JB for importer@patchew.org; Thu, 13 Jul 2017 07:58:36 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53800) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcjq-0004j6-4v for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:56:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dVcjo-0000qh-AN for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:56:58 -0400 Received: from mx1.redhat.com ([209.132.183.28]:43570) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dVcjo-0000pB-44 for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:56:56 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0D0657AE93 for ; Thu, 13 Jul 2017 11:56:55 +0000 (UTC) Received: from dgilbert-t530.redhat.com (ovpn-117-102.ams2.redhat.com [10.36.117.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id D30236060C; Thu, 13 Jul 2017 11:56:53 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 0D0657AE93 Authentication-Results: ext-mx01.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx01.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=dgilbert@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 0D0657AE93 From: "Dr. David Alan Gilbert (git)" To: qemu-devel@nongnu.org Date: Thu, 13 Jul 2017 12:56:45 +0100 Message-Id: <20170713115649.11853-2-dgilbert@redhat.com> In-Reply-To: <20170713115649.11853-1-dgilbert@redhat.com> References: <20170713115649.11853-1-dgilbert@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Thu, 13 Jul 2017 11:56:55 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 1/5] migration/rdma: Fix race on source X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: "Dr. David Alan Gilbert" Fix a race where the destination might try and send the source a WRID_READY before the source has done a post-recv for it. rdma_post_recv has to happen after the qp exists, and we're OK since we've already called qemu_rdma_source_init that calls qemu_alloc_qp. This corresponds to: https://bugzilla.redhat.com/show_bug.cgi?id=3D1285044 The race can be triggered by adding a few ms wait before this post_recv_control (which was originally due to me turning on loads of debug). Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- migration/rdma.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index c6bc607a03..6111e10c70 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2365,6 +2365,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Erro= r **errp) =20 caps_to_network(&cap); =20 + ret =3D qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY); + if (ret) { + ERROR(errp, "posting second control recv"); + goto err_rdma_source_connect; + } + ret =3D rdma_connect(rdma->cm_id, &conn_param); if (ret) { perror("rdma_connect"); @@ -2405,12 +2411,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Erro= r **errp) =20 rdma_ack_cm_event(cm_event); =20 - ret =3D qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY); - if (ret) { - ERROR(errp, "posting second control recv!"); - goto err_rdma_source_connect; - } - rdma->control_ready_expected =3D 1; rdma->nb_sent =3D 0; return 0; --=20 2.13.0 From nobody Sat Apr 27 17:02:04 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1499947240891768.7463440263772; Thu, 13 Jul 2017 05:00:40 -0700 (PDT) Received: from localhost ([::1]:59086 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcnN-0000R0-Je for importer@patchew.org; Thu, 13 Jul 2017 08:00:37 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53821) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcjr-0004jL-ME for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:00 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dVcjq-0000sN-Ny for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:56:59 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49516) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dVcjq-0000rn-Hj for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:56:58 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 732AC7D0CB for ; Thu, 13 Jul 2017 11:56:57 +0000 (UTC) Received: from dgilbert-t530.redhat.com (ovpn-117-102.ams2.redhat.com [10.36.117.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6A50D6060C; Thu, 13 Jul 2017 11:56:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 732AC7D0CB Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=dgilbert@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 732AC7D0CB From: "Dr. David Alan Gilbert (git)" To: qemu-devel@nongnu.org Date: Thu, 13 Jul 2017 12:56:46 +0100 Message-Id: <20170713115649.11853-3-dgilbert@redhat.com> In-Reply-To: <20170713115649.11853-1-dgilbert@redhat.com> References: <20170713115649.11853-1-dgilbert@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Thu, 13 Jul 2017 11:56:57 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 2/5] migration: Close file on failed migration load X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: "Dr. David Alan Gilbert" Closing the file before exit on a failure allows the source to cleanup better, especially with RDMA. Partial fix for https://bugs.launchpad.net/qemu/+bug/1545052 Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- migration/migration.c | 1 + 1 file changed, 1 insertion(+) diff --git a/migration/migration.c b/migration/migration.c index a0db40d364..8552f54ab4 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -348,6 +348,7 @@ static void process_incoming_migration_co(void *opaque) migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_FAILED); error_report("load of migration failed: %s", strerror(-ret)); + qemu_fclose(mis->from_src_file); exit(EXIT_FAILURE); } mis->bh =3D qemu_bh_new(process_incoming_migration_bh, mis); --=20 2.13.0 From nobody Sat Apr 27 17:02:04 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1499947240217123.21795942344056; Thu, 13 Jul 2017 05:00:40 -0700 (PDT) Received: from localhost ([::1]:59089 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcnP-0000Rx-2p for importer@patchew.org; Thu, 13 Jul 2017 08:00:39 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53838) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcjt-0004kK-73 for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dVcjs-0000t1-9c for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:01 -0400 Received: from mx1.redhat.com ([209.132.183.28]:55758) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dVcjr-0000sX-WC for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:00 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D5BC3C0587CC for ; Thu, 13 Jul 2017 11:56:58 +0000 (UTC) Received: from dgilbert-t530.redhat.com (ovpn-117-102.ams2.redhat.com [10.36.117.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id B91416060C; Thu, 13 Jul 2017 11:56:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com D5BC3C0587CC Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx08.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=dgilbert@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com D5BC3C0587CC From: "Dr. David Alan Gilbert (git)" To: qemu-devel@nongnu.org Date: Thu, 13 Jul 2017 12:56:47 +0100 Message-Id: <20170713115649.11853-4-dgilbert@redhat.com> In-Reply-To: <20170713115649.11853-1-dgilbert@redhat.com> References: <20170713115649.11853-1-dgilbert@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.32]); Thu, 13 Jul 2017 11:56:59 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 3/5] migration/rdma: Allow cancelling while waiting for wrid X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: "Dr. David Alan Gilbert" When waiting for a WRID, if the other side dies we end up waiting for ever with no way to cancel the migration. Cure this by poll()ing the fd first with a timeout and checking error flags and migration state. Signed-off-by: Dr. David Alan Gilbert --- migration/rdma.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 46 insertions(+), 6 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index 6111e10c70..30f5542b49 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -1466,6 +1466,50 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, ui= nt64_t *wr_id_out, return 0; } =20 +/* Wait for activity on the completion channel. + * Returns 0 on success, none-0 on error. + */ +static int qemu_rdma_wait_comp_channel(RDMAContext *rdma) +{ + /* + * Coroutine doesn't start until migration_fd_process_incoming() + * so don't yield unless we know we're running inside of a coroutine. + */ + if (rdma->migration_started_on_destination) { + yield_until_fd_readable(rdma->comp_channel->fd); + } else { + /* This is the source side, we're in a separate thread + * or destination prior to migration_fd_process_incoming() + * we can't yield; so we have to poll the fd. + * But we need to be able to handle 'cancel' or an error + * without hanging forever. + */ + while (!rdma->error_state && !rdma->received_error) { + GPollFD pfds[1]; + pfds[0].fd =3D rdma->comp_channel->fd; + pfds[0].events =3D G_IO_IN | G_IO_HUP | G_IO_ERR; + /* 0.1s timeout, should be fine for a 'cancel' */ + switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) { + case 1: /* fd active */ + return 0; + + case 0: /* Timeout, go around again */ + break; + + default: /* Error of some type */ + return -1; + } + + if (migrate_get_current()->state =3D=3D MIGRATION_STATUS_CANCE= LLING) { + /* Bail out and let the cancellation happen */ + return -EPIPE; + } + } + } + + return rdma->error_state || rdma->received_error; +} + /* * Block until the next work request has completed. * @@ -1513,12 +1557,8 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdm= a, int wrid_requested, } =20 while (1) { - /* - * Coroutine doesn't start until migration_fd_process_incoming() - * so don't yield unless we know we're running inside of a corouti= ne. - */ - if (rdma->migration_started_on_destination) { - yield_until_fd_readable(rdma->comp_channel->fd); + if (qemu_rdma_wait_comp_channel(rdma)) { + goto err_block_for_wrid; } =20 if (ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx)) { --=20 2.13.0 From nobody Sat Apr 27 17:02:04 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1499947136239981.1275039185213; Thu, 13 Jul 2017 04:58:56 -0700 (PDT) Received: from localhost ([::1]:59045 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcli-00061a-Nn for importer@patchew.org; Thu, 13 Jul 2017 07:58:54 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53873) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcjz-0004pO-Kf for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dVcjw-0000wy-HV for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:58362) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dVcjw-0000vg-7t for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:04 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0FE7D80491 for ; Thu, 13 Jul 2017 11:57:03 +0000 (UTC) Received: from dgilbert-t530.redhat.com (ovpn-117-102.ams2.redhat.com [10.36.117.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id 15ABB6060C; Thu, 13 Jul 2017 11:56:58 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 0FE7D80491 Authentication-Results: ext-mx04.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx04.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=dgilbert@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 0FE7D80491 From: "Dr. David Alan Gilbert (git)" To: qemu-devel@nongnu.org Date: Thu, 13 Jul 2017 12:56:48 +0100 Message-Id: <20170713115649.11853-5-dgilbert@redhat.com> In-Reply-To: <20170713115649.11853-1-dgilbert@redhat.com> References: <20170713115649.11853-1-dgilbert@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Thu, 13 Jul 2017 11:57:03 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 4/5] migration/rdma: Safely convert control types X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: "Dr. David Alan Gilbert" control_desc[] is an array of strings that correspond to a series of message types; they're used only for error messages, but if the message type is seriously broken then we could go off the end of the array. Convert the array to a function control_desc() that bound checks. Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- migration/rdma.c | 54 ++++++++++++++++++++++++++++++++--------------------= -- 1 file changed, 32 insertions(+), 22 deletions(-) diff --git a/migration/rdma.c b/migration/rdma.c index 30f5542b49..89684fdec6 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -165,20 +165,6 @@ enum { RDMA_CONTROL_UNREGISTER_FINISHED, /* unpinning finished */ }; =20 -static const char *control_desc[] =3D { - [RDMA_CONTROL_NONE] =3D "NONE", - [RDMA_CONTROL_ERROR] =3D "ERROR", - [RDMA_CONTROL_READY] =3D "READY", - [RDMA_CONTROL_QEMU_FILE] =3D "QEMU FILE", - [RDMA_CONTROL_RAM_BLOCKS_REQUEST] =3D "RAM BLOCKS REQUEST", - [RDMA_CONTROL_RAM_BLOCKS_RESULT] =3D "RAM BLOCKS RESULT", - [RDMA_CONTROL_COMPRESS] =3D "COMPRESS", - [RDMA_CONTROL_REGISTER_REQUEST] =3D "REGISTER REQUEST", - [RDMA_CONTROL_REGISTER_RESULT] =3D "REGISTER RESULT", - [RDMA_CONTROL_REGISTER_FINISHED] =3D "REGISTER FINISHED", - [RDMA_CONTROL_UNREGISTER_REQUEST] =3D "UNREGISTER REQUEST", - [RDMA_CONTROL_UNREGISTER_FINISHED] =3D "UNREGISTER FINISHED", -}; =20 /* * Memory and MR structures used to represent an IB Send/Recv work request. @@ -251,6 +237,30 @@ typedef struct QEMU_PACKED RDMADestBlock { uint32_t padding; } RDMADestBlock; =20 +static const char *control_desc(unsigned int rdma_control) +{ + static const char *strs[] =3D { + [RDMA_CONTROL_NONE] =3D "NONE", + [RDMA_CONTROL_ERROR] =3D "ERROR", + [RDMA_CONTROL_READY] =3D "READY", + [RDMA_CONTROL_QEMU_FILE] =3D "QEMU FILE", + [RDMA_CONTROL_RAM_BLOCKS_REQUEST] =3D "RAM BLOCKS REQUEST", + [RDMA_CONTROL_RAM_BLOCKS_RESULT] =3D "RAM BLOCKS RESULT", + [RDMA_CONTROL_COMPRESS] =3D "COMPRESS", + [RDMA_CONTROL_REGISTER_REQUEST] =3D "REGISTER REQUEST", + [RDMA_CONTROL_REGISTER_RESULT] =3D "REGISTER RESULT", + [RDMA_CONTROL_REGISTER_FINISHED] =3D "REGISTER FINISHED", + [RDMA_CONTROL_UNREGISTER_REQUEST] =3D "UNREGISTER REQUEST", + [RDMA_CONTROL_UNREGISTER_FINISHED] =3D "UNREGISTER FINISHED", + }; + + if (rdma_control > RDMA_CONTROL_UNREGISTER_FINISHED) { + return "??BAD CONTROL VALUE??"; + } + + return strs[rdma_control]; +} + static uint64_t htonll(uint64_t v) { union { uint32_t lv[2]; uint64_t llv; } u; @@ -1630,7 +1640,7 @@ static int qemu_rdma_post_send_control(RDMAContext *r= dma, uint8_t *buf, .num_sge =3D 1, }; =20 - trace_qemu_rdma_post_send_control(control_desc[head->type]); + trace_qemu_rdma_post_send_control(control_desc(head->type)); =20 /* * We don't actually need to do a memcpy() in here if we used @@ -1709,16 +1719,16 @@ static int qemu_rdma_exchange_get_response(RDMACont= ext *rdma, network_to_control((void *) rdma->wr_data[idx].control); memcpy(head, rdma->wr_data[idx].control, sizeof(RDMAControlHeader)); =20 - trace_qemu_rdma_exchange_get_response_start(control_desc[expecting]); + trace_qemu_rdma_exchange_get_response_start(control_desc(expecting)); =20 if (expecting =3D=3D RDMA_CONTROL_NONE) { - trace_qemu_rdma_exchange_get_response_none(control_desc[head->type= ], + trace_qemu_rdma_exchange_get_response_none(control_desc(head->type= ), head->type); } else if (head->type !=3D expecting || head->type =3D=3D RDMA_CONTROL= _ERROR) { error_report("Was expecting a %s (%d) control message" ", but got: %s (%d), length: %d", - control_desc[expecting], expecting, - control_desc[head->type], head->type, head->len); + control_desc(expecting), expecting, + control_desc(head->type), head->type, head->len); if (head->type =3D=3D RDMA_CONTROL_ERROR) { rdma->received_error =3D true; } @@ -1828,7 +1838,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma,= RDMAControlHeader *head, } } =20 - trace_qemu_rdma_exchange_send_waiting(control_desc[resp->type]); + trace_qemu_rdma_exchange_send_waiting(control_desc(resp->type)); ret =3D qemu_rdma_exchange_get_response(rdma, resp, resp->type, RDMA_WRID_DATA); =20 @@ -1840,7 +1850,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma,= RDMAControlHeader *head, if (resp_idx) { *resp_idx =3D RDMA_WRID_DATA; } - trace_qemu_rdma_exchange_send_received(control_desc[resp->type]); + trace_qemu_rdma_exchange_send_received(control_desc(resp->type)); } =20 rdma->control_ready_expected =3D 1; @@ -3390,7 +3400,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f,= void *opaque) ret =3D -EIO; goto out; default: - error_report("Unknown control message %s", control_desc[head.t= ype]); + error_report("Unknown control message %s", control_desc(head.t= ype)); ret =3D -EIO; goto out; } --=20 2.13.0 From nobody Sat Apr 27 17:02:04 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1499947134366955.8727549184518; Thu, 13 Jul 2017 04:58:54 -0700 (PDT) Received: from localhost ([::1]:59044 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVclh-000616-5s for importer@patchew.org; Thu, 13 Jul 2017 07:58:53 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53871) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1dVcjz-0004pM-KT for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:08 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1dVcjx-0000xp-OV for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:07 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49600) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1dVcjx-0000xU-I2 for qemu-devel@nongnu.org; Thu, 13 Jul 2017 07:57:05 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 76EC783F3E for ; Thu, 13 Jul 2017 11:57:04 +0000 (UTC) Received: from dgilbert-t530.redhat.com (ovpn-117-102.ams2.redhat.com [10.36.117.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4FA156060C; Thu, 13 Jul 2017 11:57:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 76EC783F3E Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=dgilbert@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 76EC783F3E From: "Dr. David Alan Gilbert (git)" To: qemu-devel@nongnu.org Date: Thu, 13 Jul 2017 12:56:49 +0100 Message-Id: <20170713115649.11853-6-dgilbert@redhat.com> In-Reply-To: <20170713115649.11853-1-dgilbert@redhat.com> References: <20170713115649.11853-1-dgilbert@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Thu, 13 Jul 2017 11:57:04 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2 5/5] migration/rdma: Send error during cancelling X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: "Dr. David Alan Gilbert" When we issue a cancel and clean up the RDMA channel send a CONTROL_ERROR to get the destination to quit. The rdma_cleanup code waits for the event to come back from the rdma_disconnect; but that wont happen until the destination quits and there's currently nothing to force it. Note this makes the case of a cancel work while the destination is alive, and it already works if the destination is truly dead. Note it doesn't fix the case where the destination is hung (we get stuck waiting for the rdma_disconnect event). Signed-off-by: Dr. David Alan Gilbert Reviewed-by: Peter Xu --- migration/rdma.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/migration/rdma.c b/migration/rdma.c index 89684fdec6..bb9aa48d8c 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -2258,7 +2258,9 @@ static void qemu_rdma_cleanup(RDMAContext *rdma) int ret, idx; =20 if (rdma->cm_id && rdma->connected) { - if (rdma->error_state && !rdma->received_error) { + if ((rdma->error_state || + migrate_get_current()->state =3D=3D MIGRATION_STATUS_CANCELLI= NG) && + !rdma->received_error) { RDMAControlHeader head =3D { .len =3D 0, .type =3D RDMA_CONTROL_ERROR, .repeat =3D 1, --=20 2.13.0