From nobody Tue Apr 30 15:32:40 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, michael@hinespot.com, quintela@redhat.com, peterx@redhat.com, lvivier@redhat.com, berrange@redhat.com
Date: Tue, 4 Jul 2017 19:49:11 +0100
Message-Id: <20170704184915.31586-2-dgilbert@redhat.com>
In-Reply-To: <20170704184915.31586-1-dgilbert@redhat.com>
References: <20170704184915.31586-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 1/5] migration/rdma: Fix race on source

From: "Dr. David Alan Gilbert"

Fix a race where the destination might try and send the source a
WRID_READY before the source has done a post-recv for it.

rdma_post_recv has to happen after the qp exists, and we're
OK since we've already called qemu_rdma_source_init that calls
qemu_alloc_qp.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1285044

The race can be triggered by adding a few ms wait before this
post_recv_control (which was originally due to me turning on loads of
debug).

Signed-off-by: Dr.
David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index c6bc607a03..6111e10c70 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2365,6 +2365,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     caps_to_network(&cap);
 
+    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
+    if (ret) {
+        ERROR(errp, "posting second control recv");
+        goto err_rdma_source_connect;
+    }
+
     ret = rdma_connect(rdma->cm_id, &conn_param);
     if (ret) {
         perror("rdma_connect");
@@ -2405,12 +2411,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     rdma_ack_cm_event(cm_event);
 
-    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
-    if (ret) {
-        ERROR(errp, "posting second control recv!");
-        goto err_rdma_source_connect;
-    }
-
     rdma->control_ready_expected = 1;
     rdma->nb_sent = 0;
     return 0;
-- 
2.13.0

From nobody Tue Apr 30 15:32:40 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, michael@hinespot.com, quintela@redhat.com, peterx@redhat.com, lvivier@redhat.com, berrange@redhat.com
Date: Tue, 4 Jul 2017 19:49:12 +0100
Message-Id: <20170704184915.31586-3-dgilbert@redhat.com>
In-Reply-To: <20170704184915.31586-1-dgilbert@redhat.com>
References: <20170704184915.31586-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 2/5] migration: Close file on failed migration load

From: "Dr. David Alan Gilbert"

Closing the file before exit on a failure allows
the source to clean up better, especially with RDMA.

Partial fix for https://bugs.launchpad.net/qemu/+bug/1545052

Signed-off-by: Dr.
David Alan Gilbert
---
 migration/migration.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/migration.c b/migration/migration.c
index 51ccd1a4c5..21d6902a29 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -355,6 +355,7 @@ static void process_incoming_migration_co(void *opaque)
                           MIGRATION_STATUS_FAILED);
         error_report("load of migration failed: %s", strerror(-ret));
         migrate_decompress_threads_join();
+        qemu_fclose(mis->from_src_file);
         exit(EXIT_FAILURE);
     }
 
-- 
2.13.0

From nobody Tue Apr 30 15:32:40 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, michael@hinespot.com, quintela@redhat.com, peterx@redhat.com, lvivier@redhat.com, berrange@redhat.com
Date: Tue, 4 Jul 2017 19:49:13 +0100
Message-Id: <20170704184915.31586-4-dgilbert@redhat.com>
In-Reply-To: <20170704184915.31586-1-dgilbert@redhat.com>
References: <20170704184915.31586-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 3/5] migration/rdma: Allow cancelling while waiting for wrid

From: "Dr.
David Alan Gilbert"

When waiting for a WRID, if the other side dies we end up waiting
forever with no way to cancel the migration.
Cure this by poll()ing the fd first with a timeout and checking
error flags and migration state.

Signed-off-by: Dr. David Alan Gilbert
---
 migration/rdma.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 48 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 6111e10c70..7273ae9929 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1466,6 +1466,52 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
     return 0;
 }
 
+/* Wait for activity on the completion channel.
+ * Returns 0 on success, non-zero on error.
+ */
+static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
+{
+    /*
+     * Coroutine doesn't start until migration_fd_process_incoming()
+     * so don't yield unless we know we're running inside of a coroutine.
+     */
+    if (rdma->migration_started_on_destination) {
+        yield_until_fd_readable(rdma->comp_channel->fd);
+    } else {
+        /* This is the source side, we're in a separate thread
+         * or destination prior to migration_fd_process_incoming()
+         * we can't yield; so we have to poll the fd.
+         * But we need to be able to handle 'cancel' or an error
+         * without hanging forever.
+         */
+        while (!rdma->error_state && !rdma->error_reported &&
+               !rdma->received_error) {
+            GPollFD pfds[1];
+            pfds[0].fd = rdma->comp_channel->fd;
+            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            /* 0.5s timeout, should be fine for a 'cancel' */
+            switch (qemu_poll_ns(pfds, 1, 500 * 1000 * 1000)) {
+            case 1: /* fd active */
+                return 0;
+
+            case 0: /* Timeout, go around again */
+                break;
+
+            default: /* Error of some type */
+                return -1;
+            }
+
+            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
+                /* Bail out and let the cancellation happen */
+                return -EPIPE;
+            }
+        }
+    }
+
+    return rdma->error_state || rdma->error_reported ||
+           rdma->received_error;
+}
+
 /*
  * Block until the next work request has completed.
  *
@@ -1513,12 +1559,8 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
     }
 
     while (1) {
-        /*
-         * Coroutine doesn't start until migration_fd_process_incoming()
-         * so don't yield unless we know we're running inside of a coroutine.
-         */
-        if (rdma->migration_started_on_destination) {
-            yield_until_fd_readable(rdma->comp_channel->fd);
+        if (qemu_rdma_wait_comp_channel(rdma)) {
+            goto err_block_for_wrid;
         }
 
         if (ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx)) {
-- 
2.13.0

From nobody Tue Apr 30 15:32:40 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, michael@hinespot.com, quintela@redhat.com, peterx@redhat.com, lvivier@redhat.com, berrange@redhat.com
Date: Tue, 4 Jul 2017 19:49:14 +0100
Message-Id: <20170704184915.31586-5-dgilbert@redhat.com>
In-Reply-To: <20170704184915.31586-1-dgilbert@redhat.com>
References: <20170704184915.31586-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 4/5] migration/rdma: Safely convert control types

From: "Dr. David Alan Gilbert"

control_desc[] is an array of strings that correspond to a
series of message types; they're used only for error messages, but if
the message type is seriously broken then we could go off the end of
the array.

Convert the array to a function control_desc() that bounds-checks
its argument.

Signed-off-by: Dr.
David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 7273ae9929..bfb0a43740 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -165,20 +165,6 @@ enum {
     RDMA_CONTROL_UNREGISTER_FINISHED, /* unpinning finished */
 };
 
-static const char *control_desc[] = {
-    [RDMA_CONTROL_NONE] = "NONE",
-    [RDMA_CONTROL_ERROR] = "ERROR",
-    [RDMA_CONTROL_READY] = "READY",
-    [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
-    [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
-    [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
-    [RDMA_CONTROL_COMPRESS] = "COMPRESS",
-    [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
-    [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
-    [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
-    [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
-    [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
-};
 
 /*
  * Memory and MR structures used to represent an IB Send/Recv work request.
@@ -251,6 +237,30 @@ typedef struct QEMU_PACKED RDMADestBlock {
     uint32_t padding;
 } RDMADestBlock;
 
+static const char *control_desc(unsigned int rdma_control)
+{
+    static const char *strs[] = {
+        [RDMA_CONTROL_NONE] = "NONE",
+        [RDMA_CONTROL_ERROR] = "ERROR",
+        [RDMA_CONTROL_READY] = "READY",
+        [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
+        [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
+        [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
+        [RDMA_CONTROL_COMPRESS] = "COMPRESS",
+        [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
+        [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
+        [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
+        [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
+        [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
+    };
+
+    if (rdma_control > RDMA_CONTROL_UNREGISTER_FINISHED) {
+        return "??BAD CONTROL VALUE??";
+    }
+
+    return strs[rdma_control];
+}
+
 static uint64_t htonll(uint64_t v)
 {
     union { uint32_t lv[2]; uint64_t llv; } u;
@@ -1632,7 +1642,7 @@ static int qemu_rdma_post_send_control(RDMAContext *rdma, uint8_t *buf,
         .num_sge = 1,
     };
 
-    trace_qemu_rdma_post_send_control(control_desc[head->type]);
+    trace_qemu_rdma_post_send_control(control_desc(head->type));
 
     /*
      * We don't actually need to do a memcpy() in here if we used
@@ -1711,16 +1721,16 @@ static int qemu_rdma_exchange_get_response(RDMAContext *rdma,
     network_to_control((void *) rdma->wr_data[idx].control);
     memcpy(head, rdma->wr_data[idx].control, sizeof(RDMAControlHeader));
 
-    trace_qemu_rdma_exchange_get_response_start(control_desc[expecting]);
+    trace_qemu_rdma_exchange_get_response_start(control_desc(expecting));
 
     if (expecting == RDMA_CONTROL_NONE) {
-        trace_qemu_rdma_exchange_get_response_none(control_desc[head->type],
+        trace_qemu_rdma_exchange_get_response_none(control_desc(head->type),
                                              head->type);
     } else if (head->type != expecting || head->type == RDMA_CONTROL_ERROR) {
         error_report("Was expecting a %s (%d) control message"
                 ", but got: %s (%d), length: %d",
-                control_desc[expecting], expecting,
-                control_desc[head->type], head->type, head->len);
+                control_desc(expecting), expecting,
+                control_desc(head->type), head->type, head->len);
         if (head->type == RDMA_CONTROL_ERROR) {
             rdma->received_error = true;
         }
@@ -1830,7 +1840,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
             }
         }
 
-        trace_qemu_rdma_exchange_send_waiting(control_desc[resp->type]);
+        trace_qemu_rdma_exchange_send_waiting(control_desc(resp->type));
         ret = qemu_rdma_exchange_get_response(rdma, resp,
                                               resp->type, RDMA_WRID_DATA);
 
@@ -1842,7 +1852,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
         if (resp_idx) {
             *resp_idx = RDMA_WRID_DATA;
         }
-        trace_qemu_rdma_exchange_send_received(control_desc[resp->type]);
+        trace_qemu_rdma_exchange_send_received(control_desc(resp->type));
     }
 
     rdma->control_ready_expected = 1;
@@ -3392,7 +3402,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
             ret = -EIO;
             goto out;
         default:
-            error_report("Unknown control message %s", control_desc[head.type]);
+            error_report("Unknown control message %s", control_desc(head.type));
             ret = -EIO;
             goto out;
         }
-- 
2.13.0

From nobody Tue Apr 30 15:32:40 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org, michael@hinespot.com, quintela@redhat.com, peterx@redhat.com, lvivier@redhat.com, berrange@redhat.com
Date: Tue, 4 Jul 2017 19:49:15 +0100
Message-Id: <20170704184915.31586-6-dgilbert@redhat.com>
In-Reply-To: <20170704184915.31586-1-dgilbert@redhat.com>
References: <20170704184915.31586-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 5/5] migration/rdma: Send error during cancelling

From: "Dr. David Alan Gilbert"

When we issue a cancel and clean up the RDMA channel,
send a CONTROL_ERROR to get the destination to quit.

The rdma_cleanup code waits for the event to come back
from the rdma_disconnect; but that won't happen until the
destination quits and there's currently nothing to force it.

Note this makes the case of a cancel work while the destination
is alive, and it already works if the destination is
truly dead. Note it doesn't fix the case where the destination
is hung (we get stuck waiting for the rdma_disconnect event).

Signed-off-by: Dr.
David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index bfb0a43740..3d17db3a23 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2260,7 +2260,9 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
     int ret, idx;
 
     if (rdma->cm_id && rdma->connected) {
-        if (rdma->error_state && !rdma->received_error) {
+        if ((rdma->error_state ||
+             migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) &&
+            !rdma->received_error) {
             RDMAControlHeader head = { .len = 0,
                                        .type = RDMA_CONTROL_ERROR,
                                        .repeat = 1,
-- 
2.13.0
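
For anyone reading the series outside the QEMU tree, the bounds-checked
lookup that patch 4/5 introduces can be sketched standalone as below.
This is only a minimal illustration: the demo_* names are hypothetical
stand-ins for the RDMA_CONTROL_* enum and control_desc(), and, unlike
the patch, the sketch derives its limit from the array size and also
rejects unnamed slots. Those two choices are assumptions of the sketch,
not something the series does.

```c
#include <stddef.h>

/* Hypothetical message types standing in for the RDMA_CONTROL_* enum. */
enum demo_control {
    DEMO_CONTROL_NONE = 0,
    DEMO_CONTROL_ERROR,
    DEMO_CONTROL_READY,
};

/*
 * Map a message type to a printable name.
 * The argument is unsigned so a corrupted (huge or "negative") value
 * read off the wire cannot index before or past the table.
 */
static const char *demo_control_desc(unsigned int type)
{
    static const char *const strs[] = {
        [DEMO_CONTROL_NONE]  = "NONE",
        [DEMO_CONTROL_ERROR] = "ERROR",
        [DEMO_CONTROL_READY] = "READY",
    };

    /* Reject anything beyond the table, and any slot left unnamed. */
    if (type >= sizeof(strs) / sizeof(strs[0]) || strs[type] == NULL) {
        return "??BAD CONTROL VALUE??";
    }

    return strs[type];
}
```

A caller can then pass an untrusted value straight to the function,
e.g. printf("%s\n", demo_control_desc(value)), without validating it
first.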