From nobody Tue Apr 30 17:02:23 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Date: Fri, 14 Jul 2017 12:57:32 +0100
Message-Id: <20170714115736.13693-2-dgilbert@redhat.com>
In-Reply-To: <20170714115736.13693-1-dgilbert@redhat.com>
References: <20170714115736.13693-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v3 1/5] migration/rdma: Fix race on source
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com

From: "Dr. David Alan Gilbert"

Fix a race where the destination might try to send the source a
WRID_READY before the source has done a post-recv for it.

rdma_post_recv has to happen after the qp exists, and we're
OK since we've already called qemu_rdma_source_init that calls
qemu_alloc_qp.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1285044

The race can be triggered by adding a few ms wait before this
post_recv_control (which was originally due to me turning on loads of
debug).

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index c6bc607a03..6111e10c70 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2365,6 +2365,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     caps_to_network(&cap);
 
+    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
+    if (ret) {
+        ERROR(errp, "posting second control recv");
+        goto err_rdma_source_connect;
+    }
+
     ret = rdma_connect(rdma->cm_id, &conn_param);
     if (ret) {
         perror("rdma_connect");
@@ -2405,12 +2411,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     rdma_ack_cm_event(cm_event);
 
-    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
-    if (ret) {
-        ERROR(errp, "posting second control recv!");
-        goto err_rdma_source_connect;
-    }
-
     rdma->control_ready_expected = 1;
     rdma->nb_sent = 0;
     return 0;
-- 
2.13.0

From nobody Tue Apr 30 17:02:23 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Date: Fri, 14 Jul 2017 12:57:33 +0100
Message-Id: <20170714115736.13693-3-dgilbert@redhat.com>
In-Reply-To: <20170714115736.13693-1-dgilbert@redhat.com>
References: <20170714115736.13693-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v3 2/5] migration: Close file on failed migration load
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com

From: "Dr. David Alan Gilbert"

Closing the file before exit on a failure allows
the source to clean up better, especially with RDMA.

Partial fix for https://bugs.launchpad.net/qemu/+bug/1545052

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
---
 migration/migration.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/migration.c b/migration/migration.c
index a0db40d364..8552f54ab4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -348,6 +348,7 @@ static void process_incoming_migration_co(void *opaque)
         migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_FAILED);
         error_report("load of migration failed: %s", strerror(-ret));
+        qemu_fclose(mis->from_src_file);
         exit(EXIT_FAILURE);
     }
     mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
-- 
2.13.0

From nobody Tue Apr 30 17:02:23 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Date: Fri, 14 Jul 2017 12:57:34 +0100
Message-Id: <20170714115736.13693-4-dgilbert@redhat.com>
In-Reply-To: <20170714115736.13693-1-dgilbert@redhat.com>
References: <20170714115736.13693-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v3 3/5] migration/rdma: Allow cancelling while waiting for wrid
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com

From: "Dr. David Alan Gilbert"

When waiting for a WRID, if the other side dies we end up waiting
forever with no way to cancel the migration.
Cure this by poll()ing the fd first with a timeout and checking
error flags and migration state.

Signed-off-by: Dr. David Alan Gilbert
---
 migration/rdma.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 53 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 6111e10c70..53076646ef 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1466,6 +1466,57 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
     return 0;
 }
 
+/* Wait for activity on the completion channel.
+ * Returns 0 on success, non-zero on error.
+ */
+static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
+{
+    /*
+     * Coroutine doesn't start until migration_fd_process_incoming()
+     * so don't yield unless we know we're running inside of a coroutine.
+     */
+    if (rdma->migration_started_on_destination) {
+        yield_until_fd_readable(rdma->comp_channel->fd);
+    } else {
+        /* This is the source side, we're in a separate thread
+         * or destination prior to migration_fd_process_incoming()
+         * we can't yield; so we have to poll the fd.
+         * But we need to be able to handle 'cancel' or an error
+         * without hanging forever.
+         */
+        while (!rdma->error_state && !rdma->received_error) {
+            GPollFD pfds[1];
+            pfds[0].fd = rdma->comp_channel->fd;
+            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            /* 0.1s timeout, should be fine for a 'cancel' */
+            switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
+            case 1: /* fd active */
+                return 0;
+
+            case 0: /* Timeout, go around again */
+                break;
+
+            default: /* Error of some type -
+                      * I don't trust errno from qemu_poll_ns
+                      */
+                error_report("%s: poll failed", __func__);
+                rdma->error_state = -EPIPE;
+                return -1;
+            }
+
+            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
+                /* Bail out and let the cancellation happen */
+                return -EPIPE;
+            }
+        }
+    }
+
+    if (rdma->received_error) {
+        return -EPIPE;
+    }
+    return rdma->error_state;
+}
+
 /*
  * Block until the next work request has completed.
  *
@@ -1513,12 +1564,8 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
     }
 
     while (1) {
-        /*
-         * Coroutine doesn't start until migration_fd_process_incoming()
-         * so don't yield unless we know we're running inside of a coroutine.
-         */
-        if (rdma->migration_started_on_destination) {
-            yield_until_fd_readable(rdma->comp_channel->fd);
+        if (qemu_rdma_wait_comp_channel(rdma)) {
+            goto err_block_for_wrid;
         }
 
         if (ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx)) {
-- 
2.13.0

From nobody Tue Apr 30 17:02:23 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Date: Fri, 14 Jul 2017 12:57:35 +0100
Message-Id: <20170714115736.13693-5-dgilbert@redhat.com>
In-Reply-To: <20170714115736.13693-1-dgilbert@redhat.com>
References: <20170714115736.13693-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v3 4/5] migration/rdma: Safely convert control types
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com

From: "Dr. David Alan Gilbert"

control_desc[] is an array of strings that correspond to a
series of message types; they're used only for error messages, but if
the message type is seriously broken then we could go off the end of
the array.

Convert the array to a function control_desc() that bounds-checks.

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 53076646ef..13d6dd7709 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -165,20 +165,6 @@ enum {
     RDMA_CONTROL_UNREGISTER_FINISHED, /* unpinning finished */
 };
 
-static const char *control_desc[] = {
-    [RDMA_CONTROL_NONE] = "NONE",
-    [RDMA_CONTROL_ERROR] = "ERROR",
-    [RDMA_CONTROL_READY] = "READY",
-    [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
-    [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
-    [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
-    [RDMA_CONTROL_COMPRESS] = "COMPRESS",
-    [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
-    [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
-    [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
-    [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
-    [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
-};
 
 /*
  * Memory and MR structures used to represent an IB Send/Recv work request.
@@ -251,6 +237,30 @@ typedef struct QEMU_PACKED RDMADestBlock {
     uint32_t padding;
 } RDMADestBlock;
 
+static const char *control_desc(unsigned int rdma_control)
+{
+    static const char *strs[] = {
+        [RDMA_CONTROL_NONE] = "NONE",
+        [RDMA_CONTROL_ERROR] = "ERROR",
+        [RDMA_CONTROL_READY] = "READY",
+        [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
+        [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
+        [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
+        [RDMA_CONTROL_COMPRESS] = "COMPRESS",
+        [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
+        [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
+        [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
+        [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
+        [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
+    };
+
+    if (rdma_control > RDMA_CONTROL_UNREGISTER_FINISHED) {
+        return "??BAD CONTROL VALUE??";
+    }
+
+    return strs[rdma_control];
+}
+
 static uint64_t htonll(uint64_t v)
 {
     union { uint32_t lv[2]; uint64_t llv; } u;
@@ -1637,7 +1647,7 @@ static int qemu_rdma_post_send_control(RDMAContext *rdma, uint8_t *buf,
         .num_sge = 1,
     };
 
-    trace_qemu_rdma_post_send_control(control_desc[head->type]);
+    trace_qemu_rdma_post_send_control(control_desc(head->type));
 
     /*
      * We don't actually need to do a memcpy() in here if we used
@@ -1716,16 +1726,16 @@ static int qemu_rdma_exchange_get_response(RDMAContext *rdma,
     network_to_control((void *) rdma->wr_data[idx].control);
     memcpy(head, rdma->wr_data[idx].control, sizeof(RDMAControlHeader));
 
-    trace_qemu_rdma_exchange_get_response_start(control_desc[expecting]);
+    trace_qemu_rdma_exchange_get_response_start(control_desc(expecting));
 
     if (expecting == RDMA_CONTROL_NONE) {
-        trace_qemu_rdma_exchange_get_response_none(control_desc[head->type],
+        trace_qemu_rdma_exchange_get_response_none(control_desc(head->type),
                                              head->type);
     } else if (head->type != expecting || head->type == RDMA_CONTROL_ERROR) {
         error_report("Was expecting a %s (%d) control message"
                 ", but got: %s (%d), length: %d",
-                control_desc[expecting], expecting,
-                control_desc[head->type], head->type, head->len);
+                control_desc(expecting), expecting,
+                control_desc(head->type), head->type, head->len);
         if (head->type == RDMA_CONTROL_ERROR) {
             rdma->received_error = true;
         }
@@ -1835,7 +1845,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
             }
         }
 
-        trace_qemu_rdma_exchange_send_waiting(control_desc[resp->type]);
+        trace_qemu_rdma_exchange_send_waiting(control_desc(resp->type));
         ret = qemu_rdma_exchange_get_response(rdma, resp,
                                               resp->type, RDMA_WRID_DATA);
 
@@ -1847,7 +1857,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
         if (resp_idx) {
             *resp_idx = RDMA_WRID_DATA;
         }
-        trace_qemu_rdma_exchange_send_received(control_desc[resp->type]);
+        trace_qemu_rdma_exchange_send_received(control_desc(resp->type));
     }
 
     rdma->control_ready_expected = 1;
@@ -3397,7 +3407,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
             ret = -EIO;
             goto out;
         default:
-            error_report("Unknown control message %s", control_desc[head.type]);
+            error_report("Unknown control message %s", control_desc(head.type));
             ret = -EIO;
             goto out;
         }
-- 
2.13.0

From nobody Tue Apr 30 17:02:23 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Date: Fri, 14 Jul 2017 12:57:36 +0100
Message-Id: <20170714115736.13693-6-dgilbert@redhat.com>
In-Reply-To: <20170714115736.13693-1-dgilbert@redhat.com>
References: <20170714115736.13693-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v3 5/5] migration/rdma: Send error during cancelling
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com

From: "Dr. David Alan Gilbert"

When we issue a cancel and clean up the RDMA channel,
send a CONTROL_ERROR to get the destination to quit.

The rdma_cleanup code waits for the event to come back
from the rdma_disconnect; but that won't happen until the
destination quits and there's currently nothing to force it.

Note this makes the case of a cancel work while the destination
is alive, and it already works if the destination is
truly dead. Note it doesn't fix the case where the destination
is hung (we get stuck waiting for the rdma_disconnect event).

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 13d6dd7709..0dc9fe115c 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2265,7 +2265,9 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
     int ret, idx;
 
     if (rdma->cm_id && rdma->connected) {
-        if (rdma->error_state && !rdma->received_error) {
+        if ((rdma->error_state ||
+             migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) &&
+            !rdma->received_error) {
             RDMAControlHeader head = { .len = 0,
                                        .type = RDMA_CONTROL_ERROR,
                                        .repeat = 1,
-- 
2.13.0
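
For anyone wanting to try the bounds-checked lookup from patch 4/5
outside QEMU, here is a minimal standalone sketch. It is not the code
from the series: the reduced enum and the name demo_control_desc() are
hypothetical, and the range check uses the table size (plus a NULL
check for gaps in the designated initialisers) rather than comparing
against the last enum value as the patch does.

```c
#include <stddef.h>

/* Hypothetical, reduced message-type list; QEMU's real enum is longer. */
enum demo_control {
    DEMO_CONTROL_NONE = 0,
    DEMO_CONTROL_ERROR,
    DEMO_CONTROL_READY,
};

/*
 * Map a message type, possibly read straight off the wire, to a
 * printable name without ever indexing past the end of the table.
 */
static const char *demo_control_desc(unsigned int type)
{
    static const char *const strs[] = {
        [DEMO_CONTROL_NONE]  = "NONE",
        [DEMO_CONTROL_ERROR] = "ERROR",
        [DEMO_CONTROL_READY] = "READY",
    };

    /* Reject out-of-range values and unnamed slots in the table. */
    if (type >= sizeof(strs) / sizeof(strs[0]) || strs[type] == NULL) {
        return "??BAD CONTROL VALUE??";
    }
    return strs[type];
}
```

Taking the argument as unsigned means a negative value passed by a
caller wraps to a large number and is rejected by the same range check.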