From nobody Tue May 7 18:07:29 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com
Date: Mon, 17 Jul 2017 12:09:31 +0100
Message-Id: <20170717110936.23314-2-dgilbert@redhat.com>
In-Reply-To: <20170717110936.23314-1-dgilbert@redhat.com>
References: <20170717110936.23314-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v4 1/6] migration/rdma: Fix race on source

From: "Dr. David Alan Gilbert"

Fix a race where the destination might try and send the source a
WRID_READY before the source has done a post-recv for it.

rdma_post_recv has to happen after the qp exists, and we're
OK since we've already called qemu_rdma_source_init that calls
qemu_alloc_qp.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1285044

The race can be triggered by adding a few ms wait before this
post_recv_control (which was originally due to me turning on loads of
debug).

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index c6bc607a03..6111e10c70 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2365,6 +2365,12 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     caps_to_network(&cap);
 
+    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
+    if (ret) {
+        ERROR(errp, "posting second control recv");
+        goto err_rdma_source_connect;
+    }
+
     ret = rdma_connect(rdma->cm_id, &conn_param);
     if (ret) {
         perror("rdma_connect");
@@ -2405,12 +2411,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error **errp)
 
     rdma_ack_cm_event(cm_event);
 
-    ret = qemu_rdma_post_recv_control(rdma, RDMA_WRID_READY);
-    if (ret) {
-        ERROR(errp, "posting second control recv!");
-        goto err_rdma_source_connect;
-    }
-
     rdma->control_ready_expected = 1;
     rdma->nb_sent = 0;
     return 0;
-- 
2.13.0

From nobody Tue May 7 18:07:29 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com
Date: Mon, 17 Jul 2017 12:09:32 +0100
Message-Id: <20170717110936.23314-3-dgilbert@redhat.com>
In-Reply-To: <20170717110936.23314-1-dgilbert@redhat.com>
References: <20170717110936.23314-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v4 2/6] migration: Close file on failed migration load

From: "Dr. David Alan Gilbert"

Closing the file before exit on a failure allows
the source to cleanup better, especially with RDMA.

Partial fix for https://bugs.launchpad.net/qemu/+bug/1545052

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
---
 migration/migration.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/migration.c b/migration/migration.c
index a0db40d364..8552f54ab4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -348,6 +348,7 @@ static void process_incoming_migration_co(void *opaque)
         migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                           MIGRATION_STATUS_FAILED);
         error_report("load of migration failed: %s", strerror(-ret));
+        qemu_fclose(mis->from_src_file);
         exit(EXIT_FAILURE);
     }
     mis->bh = qemu_bh_new(process_incoming_migration_bh, mis);
-- 
2.13.0

From nobody Tue May 7 18:07:29 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com
Date: Mon, 17 Jul 2017 12:09:33 +0100
Message-Id: <20170717110936.23314-4-dgilbert@redhat.com>
In-Reply-To: <20170717110936.23314-1-dgilbert@redhat.com>
References: <20170717110936.23314-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v4 3/6] migration/rdma: fix qemu_rdma_block_for_wrid error paths

From: "Dr. David Alan Gilbert"

The two places that 'goto err_block_for_wrid' weren't setting
ret and so would end up returning 0 even though we've failed.

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 6111e10c70..59810aec2e 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1521,14 +1521,16 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
             yield_until_fd_readable(rdma->comp_channel->fd);
         }
 
-        if (ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx)) {
+        ret = ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx);
+        if (ret) {
             perror("ibv_get_cq_event");
             goto err_block_for_wrid;
         }
 
         num_cq_events++;
 
-        if (ibv_req_notify_cq(cq, 0)) {
+        ret = -ibv_req_notify_cq(cq, 0);
+        if (ret) {
             goto err_block_for_wrid;
         }
 
@@ -1564,6 +1566,8 @@ err_block_for_wrid:
     if (num_cq_events) {
         ibv_ack_cq_events(cq, num_cq_events);
     }
+
+    rdma->error_state = ret;
     return ret;
 }
 
-- 
2.13.0

From nobody Tue May 7 18:07:29 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com
Date: Mon, 17 Jul 2017 12:09:34 +0100
Message-Id: <20170717110936.23314-5-dgilbert@redhat.com>
In-Reply-To: <20170717110936.23314-1-dgilbert@redhat.com>
References: <20170717110936.23314-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v4 4/6] migration/rdma: Allow cancelling while waiting for wrid

From: "Dr. David Alan Gilbert"

When waiting for a WRID, if the other side dies we end up waiting
for ever with no way to cancel the migration.
Cure this by poll()ing the fd first with a timeout and checking
error flags and migration state.

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 53 insertions(+), 6 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 59810aec2e..0cf55a6d5b 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1466,6 +1466,56 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
     return 0;
 }
 
+/* Wait for activity on the completion channel.
+ * Returns 0 on success, none-0 on error.
+ */
+static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
+{
+    /*
+     * Coroutine doesn't start until migration_fd_process_incoming()
+     * so don't yield unless we know we're running inside of a coroutine.
+     */
+    if (rdma->migration_started_on_destination) {
+        yield_until_fd_readable(rdma->comp_channel->fd);
+    } else {
+        /* This is the source side, we're in a separate thread
+         * or destination prior to migration_fd_process_incoming()
+         * we can't yield; so we have to poll the fd.
+         * But we need to be able to handle 'cancel' or an error
+         * without hanging forever.
+         */
+        while (!rdma->error_state && !rdma->received_error) {
+            GPollFD pfds[1];
+            pfds[0].fd = rdma->comp_channel->fd;
+            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            /* 0.1s timeout, should be fine for a 'cancel' */
+            switch (qemu_poll_ns(pfds, 1, 100 * 1000 * 1000)) {
+            case 1: /* fd active */
+                return 0;
+
+            case 0: /* Timeout, go around again */
+                break;
+
+            default: /* Error of some type -
+                      * I don't trust errno from qemu_poll_ns
+                      */
+                error_report("%s: poll failed", __func__);
+                return -EPIPE;
+            }
+
+            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
+                /* Bail out and let the cancellation happen */
+                return -EPIPE;
+            }
+        }
+    }
+
+    if (rdma->received_error) {
+        return -EPIPE;
+    }
+    return rdma->error_state;
+}
+
 /*
  * Block until the next work request has completed.
  *
@@ -1513,12 +1563,9 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
     }
 
     while (1) {
-        /*
-         * Coroutine doesn't start until migration_fd_process_incoming()
-         * so don't yield unless we know we're running inside of a coroutine.
-         */
-        if (rdma->migration_started_on_destination) {
-            yield_until_fd_readable(rdma->comp_channel->fd);
+        ret = qemu_rdma_wait_comp_channel(rdma);
+        if (ret) {
+            goto err_block_for_wrid;
         }
 
         ret = ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx);
-- 
2.13.0

From nobody Tue May 7 18:07:29 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com
Date: Mon, 17 Jul 2017 12:09:35 +0100
Message-Id: <20170717110936.23314-6-dgilbert@redhat.com>
In-Reply-To: <20170717110936.23314-1-dgilbert@redhat.com>
References: <20170717110936.23314-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v4 5/6] migration/rdma: Safely convert control types

From: "Dr. David Alan Gilbert"

control_desc[] is an array of strings that correspond to a
series of message types; they're used only for error messages, but if
the message type is seriously broken then we could go off the end of
the array.

Convert the array to a function control_desc() that bound checks.

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
---
 migration/rdma.c | 54 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 0cf55a6d5b..972167d899 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -165,20 +165,6 @@ enum {
     RDMA_CONTROL_UNREGISTER_FINISHED, /* unpinning finished */
 };
 
-static const char *control_desc[] = {
-    [RDMA_CONTROL_NONE] = "NONE",
-    [RDMA_CONTROL_ERROR] = "ERROR",
-    [RDMA_CONTROL_READY] = "READY",
-    [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
-    [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
-    [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
-    [RDMA_CONTROL_COMPRESS] = "COMPRESS",
-    [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
-    [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
-    [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
-    [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
-    [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
-};
 
 /*
  * Memory and MR structures used to represent an IB Send/Recv work request.
@@ -251,6 +237,30 @@ typedef struct QEMU_PACKED RDMADestBlock {
     uint32_t padding;
 } RDMADestBlock;
 
+static const char *control_desc(unsigned int rdma_control)
+{
+    static const char *strs[] = {
+        [RDMA_CONTROL_NONE] = "NONE",
+        [RDMA_CONTROL_ERROR] = "ERROR",
+        [RDMA_CONTROL_READY] = "READY",
+        [RDMA_CONTROL_QEMU_FILE] = "QEMU FILE",
+        [RDMA_CONTROL_RAM_BLOCKS_REQUEST] = "RAM BLOCKS REQUEST",
+        [RDMA_CONTROL_RAM_BLOCKS_RESULT] = "RAM BLOCKS RESULT",
+        [RDMA_CONTROL_COMPRESS] = "COMPRESS",
+        [RDMA_CONTROL_REGISTER_REQUEST] = "REGISTER REQUEST",
+        [RDMA_CONTROL_REGISTER_RESULT] = "REGISTER RESULT",
+        [RDMA_CONTROL_REGISTER_FINISHED] = "REGISTER FINISHED",
+        [RDMA_CONTROL_UNREGISTER_REQUEST] = "UNREGISTER REQUEST",
+        [RDMA_CONTROL_UNREGISTER_FINISHED] = "UNREGISTER FINISHED",
+    };
+
+    if (rdma_control > RDMA_CONTROL_UNREGISTER_FINISHED) {
+        return "??BAD CONTROL VALUE??";
+    }
+
+    return strs[rdma_control];
+}
+
 static uint64_t htonll(uint64_t v)
 {
     union { uint32_t lv[2]; uint64_t llv; } u;
@@ -1641,7 +1651,7 @@ static int qemu_rdma_post_send_control(RDMAContext *rdma, uint8_t *buf,
         .num_sge = 1,
     };
 
-    trace_qemu_rdma_post_send_control(control_desc[head->type]);
+    trace_qemu_rdma_post_send_control(control_desc(head->type));
 
     /*
      * We don't actually need to do a memcpy() in here if we used
@@ -1720,16 +1730,16 @@ static int qemu_rdma_exchange_get_response(RDMAContext *rdma,
     network_to_control((void *) rdma->wr_data[idx].control);
     memcpy(head, rdma->wr_data[idx].control, sizeof(RDMAControlHeader));
 
-    trace_qemu_rdma_exchange_get_response_start(control_desc[expecting]);
+    trace_qemu_rdma_exchange_get_response_start(control_desc(expecting));
 
     if (expecting == RDMA_CONTROL_NONE) {
-        trace_qemu_rdma_exchange_get_response_none(control_desc[head->type],
+        trace_qemu_rdma_exchange_get_response_none(control_desc(head->type),
                                              head->type);
     } else if (head->type != expecting || head->type == RDMA_CONTROL_ERROR) {
         error_report("Was expecting a %s (%d) control message"
                 ", but got: %s (%d), length: %d",
-                control_desc[expecting], expecting,
-                control_desc[head->type], head->type, head->len);
+                control_desc(expecting), expecting,
+                control_desc(head->type), head->type, head->len);
         if (head->type == RDMA_CONTROL_ERROR) {
             rdma->received_error = true;
         }
@@ -1839,7 +1849,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
             }
         }
 
-        trace_qemu_rdma_exchange_send_waiting(control_desc[resp->type]);
+        trace_qemu_rdma_exchange_send_waiting(control_desc(resp->type));
         ret = qemu_rdma_exchange_get_response(rdma, resp,
                                               resp->type, RDMA_WRID_DATA);
 
@@ -1851,7 +1861,7 @@ static int qemu_rdma_exchange_send(RDMAContext *rdma, RDMAControlHeader *head,
         if (resp_idx) {
             *resp_idx = RDMA_WRID_DATA;
         }
-        trace_qemu_rdma_exchange_send_received(control_desc[resp->type]);
+        trace_qemu_rdma_exchange_send_received(control_desc(resp->type));
     }
 
     rdma->control_ready_expected = 1;
@@ -3401,7 +3411,7 @@ static int qemu_rdma_registration_handle(QEMUFile *f, void *opaque)
             ret = -EIO;
             goto out;
         default:
-            error_report("Unknown control message %s", control_desc[head.type]);
+            error_report("Unknown control message %s", control_desc(head.type));
             ret = -EIO;
             goto out;
         }
-- 
2.13.0

From nobody Tue May 7 18:07:29 2024
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: lvivier@redhat.com, peterx@redhat.com, quintela@redhat.com
Date: Mon, 17 Jul 2017 12:09:36 +0100
Message-Id: <20170717110936.23314-7-dgilbert@redhat.com>
In-Reply-To: <20170717110936.23314-1-dgilbert@redhat.com>
References: <20170717110936.23314-1-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH v4 6/6] migration/rdma: Send error during cancelling

From: "Dr. David Alan Gilbert"

When we issue a cancel and clean up the RDMA channel,
send a CONTROL_ERROR to get the destination to quit.

The rdma_cleanup code waits for the event to come back
from the rdma_disconnect; but that won't happen until the
destination quits and there's currently nothing to force
it.

Note this makes the case of a cancel work while the destination
is alive, and it already works if the destination is
truly dead. Note it doesn't fix the case where the destination
is hung (we get stuck waiting for the rdma_disconnect event).

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Peter Xu
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 972167d899..ca56594328 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -2269,7 +2269,9 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
     int ret, idx;
 
     if (rdma->cm_id && rdma->connected) {
-        if (rdma->error_state && !rdma->received_error) {
+        if ((rdma->error_state ||
+             migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) &&
+            !rdma->received_error) {
             RDMAControlHeader head = { .len = 0,
                                        .type = RDMA_CONTROL_ERROR,
                                        .repeat = 1,
-- 
2.13.0
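
A note on the condition that patch 6/6 adds to qemu_rdma_cleanup(): the
following is a minimal standalone sketch, not QEMU code. The names
sketch_rdma, sketch_state and sketch_should_send_error are hypothetical
stand-ins for RDMAContext, the migration state and the inline check;
only the boolean logic is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the migration state; not QEMU's enum. */
enum sketch_state {
    SKETCH_STATE_ACTIVE,
    SKETCH_STATE_CANCELLING,
};

/* Hypothetical stand-in for the two RDMAContext fields the check reads. */
struct sketch_rdma {
    int error_state;      /* non-zero once a local error has been recorded */
    bool received_error;  /* true if the peer already sent us an error */
};

/*
 * Mirror of the condition in the patch: tell the peer to quit when we
 * either hit a local error or are cancelling, unless the peer has
 * already reported an error itself.
 */
static bool sketch_should_send_error(const struct sketch_rdma *rdma,
                                     enum sketch_state state)
{
    assert(rdma != NULL);
    return (rdma->error_state || state == SKETCH_STATE_CANCELLING) &&
           !rdma->received_error;
}
```

With the pre-patch condition, a cancel with no local error
(error_state == 0) evaluated to false, so nothing prompted the
destination to quit; the added state check covers that case.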