From: Eugenio Pérez <eperezma@redhat.com>
To: qemu-devel@nongnu.org
Cc: Stefan Hajnoczi, Gautam Dawar, Liuxiangdong, Paolo Bonzini,
    Markus Armbruster, Jason Wang, Cornelia Huck, Parav Pandit,
    Eric Blake, "Michael S. Tsirkin", Laurent Vivier, Zhu Lingshan,
    Eli Cohen, Cindy Lu, "Gonglei (Arei)", Stefano Garzarella,
    Harpreet Singh Anand
Subject: [PATCH v5 17/20] vdpa: Buffer CVQ support on shadow virtqueue
Date: Tue, 19 Jul 2022 11:56:26 +0200
Message-Id: <20220719095629.3031338-18-eperezma@redhat.com>
In-Reply-To: <20220719095629.3031338-1-eperezma@redhat.com>
References: <20220719095629.3031338-1-eperezma@redhat.com>

Introduce control virtqueue support for the vDPA shadow virtqueue. This
is needed for advanced networking features like rx filtering.

The virtio-net control VQ handling copies the descriptors into qemu's
VA, so we avoid TOCTOU races against the guest's or the device's memory
every time there is a device model change. Otherwise, the guest could
change the memory content in the window between qemu reading it and the
device reading it.

To demonstrate command handling, VIRTIO_NET_F_CTRL_MAC_ADDR is
implemented. If the virtio-net driver changes the MAC, the virtio-net
device model is updated with the new one, and an rx filtering change
event is raised.

More CVQ commands could be added here straightforwardly, but they have
not been tested.
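[Editorial illustration, not part of the patch.] The command copied into the
shadow buffers follows the virtio-net control VQ wire format: a 2-byte
control header, a command-specific payload in the out direction, and a
1-byte ack written back by the device in the in direction. The following
standalone sketch shows how a VIRTIO_NET_CTRL_MAC_ADDR_SET command is laid
out; the struct layout and constants mirror the virtio-net spec
(linux/virtio_net.h), and the "out" array simply stands in for the copy the
patch places in cvq_cmd_out_buffer.

/* Standalone sketch: virtio-net CVQ MAC ADDR_SET command layout. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct virtio_net_ctrl_hdr {
    uint8_t class;                    /* command class, e.g. VIRTIO_NET_CTRL_MAC */
    uint8_t cmd;                      /* command within the class */
} __attribute__((packed));

typedef uint8_t virtio_net_ctrl_ack;  /* single ack byte in the in direction */

#define VIRTIO_NET_CTRL_MAC          1
#define VIRTIO_NET_CTRL_MAC_ADDR_SET 1
#define VIRTIO_NET_ERR               1
#define ETH_ALEN                     6

int main(void)
{
    /* Out direction: the header is immediately followed by the new MAC. */
    uint8_t out[sizeof(struct virtio_net_ctrl_hdr) + ETH_ALEN];
    const struct virtio_net_ctrl_hdr hdr = {
        .class = VIRTIO_NET_CTRL_MAC,
        .cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET,
    };
    const uint8_t mac[ETH_ALEN] = { 0x52, 0x54, 0x00, 0x12, 0x34, 0x56 };
    /* In direction: the device writes one ack byte; pessimistic default. */
    virtio_net_ctrl_ack status = VIRTIO_NET_ERR;

    memcpy(out, &hdr, sizeof(hdr));
    memcpy(out + sizeof(hdr), mac, ETH_ALEN);

    printf("out buffer: %zu bytes, in buffer: %zu byte(s), ack defaults to %u\n",
           sizeof(out), sizeof(status), status);
    return 0;
}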
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 net/vhost-vdpa.c | 213 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 205 insertions(+), 8 deletions(-)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index 2e3b6b10d8..502f6f9d3e 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -33,6 +33,9 @@ typedef struct VhostVDPAState {
     NetClientState nc;
     struct vhost_vdpa vhost_vdpa;
     VHostNetState *vhost_net;
+
+    /* Control commands shadow buffers */
+    void *cvq_cmd_out_buffer, *cvq_cmd_in_buffer;
     bool started;
 } VhostVDPAState;

@@ -131,6 +134,8 @@ static void vhost_vdpa_cleanup(NetClientState *nc)
 {
     VhostVDPAState *s = DO_UPCAST(VhostVDPAState, nc, nc);

+    qemu_vfree(s->cvq_cmd_out_buffer);
+    qemu_vfree(s->cvq_cmd_in_buffer);
     if (s->vhost_net) {
         vhost_net_cleanup(s->vhost_net);
         g_free(s->vhost_net);
@@ -190,24 +195,191 @@ static NetClientInfo net_vhost_vdpa_info = {
         .check_peer_type = vhost_vdpa_check_peer_type,
 };

+static void vhost_vdpa_cvq_unmap_buf(struct vhost_vdpa *v, void *addr)
+{
+    VhostIOVATree *tree = v->iova_tree;
+    DMAMap needle = {
+        /*
+         * No need to specify size or to look for more translations since
+         * this contiguous chunk was allocated by us.
+         */
+        .translated_addr = (hwaddr)(uintptr_t)addr,
+    };
+    const DMAMap *map = vhost_iova_tree_find_iova(tree, &needle);
+    int r;
+
+    if (unlikely(!map)) {
+        error_report("Cannot locate expected map");
+        return;
+    }
+
+    r = vhost_vdpa_dma_unmap(v, map->iova, map->size + 1);
+    if (unlikely(r != 0)) {
+        error_report("Device cannot unmap: %s(%d)", g_strerror(r), r);
+    }
+
+    vhost_iova_tree_remove(tree, map);
+}
+
+static size_t vhost_vdpa_net_cvq_cmd_len(void)
+{
+    /*
+     * MAC_TABLE_SET is the ctrl command that produces the longest out buffer.
+     * The in buffer is always 1 byte, so it fits here as well.
+     */
+    return sizeof(struct virtio_net_ctrl_hdr) +
+           2 * sizeof(struct virtio_net_ctrl_mac) +
+           MAC_TABLE_ENTRIES * ETH_ALEN;
+}
+
+static size_t vhost_vdpa_net_cvq_cmd_page_len(void)
+{
+    return ROUND_UP(vhost_vdpa_net_cvq_cmd_len(), qemu_real_host_page_size());
+}
+
+/** Copy and map a guest buffer. */
+static bool vhost_vdpa_cvq_map_buf(struct vhost_vdpa *v,
+                                   const struct iovec *out_data,
+                                   size_t out_num, size_t data_len, void *buf,
+                                   size_t *written, bool write)
+{
+    DMAMap map = {};
+    int r;
+
+    if (unlikely(!data_len)) {
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid length of %s buffer\n",
+                      __func__, write ? "in" : "out");
+        return false;
+    }
+
+    *written = iov_to_buf(out_data, out_num, 0, buf, data_len);
+    map.translated_addr = (hwaddr)(uintptr_t)buf;
+    map.size = vhost_vdpa_net_cvq_cmd_page_len() - 1;
+    map.perm = write ? IOMMU_RW : IOMMU_RO;
+    r = vhost_iova_tree_map_alloc(v->iova_tree, &map);
+    if (unlikely(r != IOVA_OK)) {
+        error_report("Cannot map injected element");
+        return false;
+    }
+
+    r = vhost_vdpa_dma_map(v, map.iova, vhost_vdpa_net_cvq_cmd_page_len(), buf,
+                           !write);
+    if (unlikely(r < 0)) {
+        goto dma_map_err;
+    }
+
+    return true;
+
+dma_map_err:
+    vhost_iova_tree_remove(v->iova_tree, &map);
+    return false;
+}
+
 /**
- * Forward buffer for the moment.
+ * Copy the guest element into a dedicated buffer suitable to be sent to the NIC
+ *
+ * @iov: [0] is the out buffer, [1] is the in one
+ */
+static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
+                                        VirtQueueElement *elem,
+                                        struct iovec *iov)
+{
+    size_t in_copied;
+    bool ok;
+
+    iov[0].iov_base = s->cvq_cmd_out_buffer;
+    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, elem->out_sg, elem->out_num,
+                                vhost_vdpa_net_cvq_cmd_len(), iov[0].iov_base,
+                                &iov[0].iov_len, false);
+    if (unlikely(!ok)) {
+        return false;
+    }
+
+    iov[1].iov_base = s->cvq_cmd_in_buffer;
+    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
+                                sizeof(virtio_net_ctrl_ack), iov[1].iov_base,
+                                &in_copied, true);
+    if (unlikely(!ok)) {
+        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
+        return false;
+    }
+
+    iov[1].iov_len = sizeof(virtio_net_ctrl_ack);
+    return true;
+}
+
+/**
+ * Do not forward commands not supported by SVQ. Otherwise, the device could
+ * accept them and qemu would not know how to update the device model.
+ */
+static bool vhost_vdpa_net_cvq_validate_cmd(const struct iovec *out,
+                                            size_t out_num)
+{
+    struct virtio_net_ctrl_hdr ctrl;
+    size_t n;
+
+    n = iov_to_buf(out, out_num, 0, &ctrl, sizeof(ctrl));
+    if (unlikely(n < sizeof(ctrl))) {
+        qemu_log_mask(LOG_GUEST_ERROR,
                      "%s: invalid length of out buffer %zu\n", __func__, n);
+        return false;
+    }
+
+    switch (ctrl.class) {
+    case VIRTIO_NET_CTRL_MAC:
+        switch (ctrl.cmd) {
+        case VIRTIO_NET_CTRL_MAC_ADDR_SET:
+            return true;
+        default:
+            qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid mac cmd %u\n",
+                          __func__, ctrl.cmd);
+        }
+        break;
+    default:
+        qemu_log_mask(LOG_GUEST_ERROR, "%s: invalid control class %u\n",
+                      __func__, ctrl.class);
+    }
+
+    return false;
+}
+
+/**
+ * Validate and copy control virtqueue commands.
+ *
+ * Following QEMU guidelines, we offer a copy of the buffers to the device to
+ * prevent TOCTOU bugs.
  */
 static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
                                             VirtQueueElement *elem,
                                             void *opaque)
 {
-    unsigned int n = elem->out_num + elem->in_num;
-    g_autofree struct iovec *dev_buffers = g_new(struct iovec, n);
+    VhostVDPAState *s = opaque;
     size_t in_len, dev_written;
     virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
-    int r;
+    /* out and in buffers sent to the device */
+    struct iovec dev_buffers[2] = {
+        { .iov_base = s->cvq_cmd_out_buffer },
+        { .iov_base = s->cvq_cmd_in_buffer },
+    };
+    /* in buffer used for device model */
+    const struct iovec in = {
+        .iov_base = &status,
+        .iov_len = sizeof(status),
+    };
+    int r = -EINVAL;
+    bool ok;
+
+    ok = vhost_vdpa_net_cvq_map_elem(s, elem, dev_buffers);
+    if (unlikely(!ok)) {
+        goto out;
+    }

-    memcpy(dev_buffers, elem->out_sg, elem->out_num);
-    memcpy(dev_buffers + elem->out_num, elem->in_sg, elem->in_num);
+    ok = vhost_vdpa_net_cvq_validate_cmd(&dev_buffers[0], 1);
+    if (unlikely(!ok)) {
+        goto out;
+    }

-    r = vhost_svq_add(svq, &dev_buffers[0], elem->out_num, &dev_buffers[1],
-                      elem->in_num, elem);
+    r = vhost_svq_add(svq, &dev_buffers[0], 1, &dev_buffers[1], 1, elem);
     if (unlikely(r != 0)) {
         if (unlikely(r == -ENOSPC)) {
             qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
@@ -224,6 +396,18 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
     dev_written = vhost_svq_poll(svq);
     if (unlikely(dev_written < sizeof(status))) {
         error_report("Insufficient written data (%zu)", dev_written);
+        goto out;
+    }
+
+    memcpy(&status, dev_buffers[1].iov_base, sizeof(status));
+    if (status != VIRTIO_NET_OK) {
+        goto out;
+    }
+
+    status = VIRTIO_NET_ERR;
+    virtio_net_handle_ctrl_iov(svq->vdev, &in, 1, dev_buffers, 1);
+    if (status != VIRTIO_NET_OK) {
+        error_report("Bad CVQ processing in model");
     }

 out:
@@ -234,6 +418,12 @@ out:
     }
     vhost_svq_push_elem(svq, elem, MIN(in_len, sizeof(status)));
     g_free(elem);
+    if (dev_buffers[0].iov_base) {
+        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, dev_buffers[0].iov_base);
+    }
+    if (dev_buffers[1].iov_base) {
+        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, dev_buffers[1].iov_base);
+    }
     return r;
 }

@@ -266,6 +456,13 @@ static NetClientState *net_vhost_vdpa_init(NetClientState *peer,
     s->vhost_vdpa.device_fd = vdpa_device_fd;
     s->vhost_vdpa.index = queue_pair_index;
     if (!is_datapath) {
+        s->cvq_cmd_out_buffer = qemu_memalign(qemu_real_host_page_size(),
+                                            vhost_vdpa_net_cvq_cmd_page_len());
+        memset(s->cvq_cmd_out_buffer, 0, vhost_vdpa_net_cvq_cmd_page_len());
+        s->cvq_cmd_in_buffer = qemu_memalign(qemu_real_host_page_size(),
+                                            vhost_vdpa_net_cvq_cmd_page_len());
+        memset(s->cvq_cmd_in_buffer, 0, vhost_vdpa_net_cvq_cmd_page_len());
+
         s->vhost_vdpa.shadow_vq_ops = &vhost_vdpa_net_svq_ops;
         s->vhost_vdpa.shadow_vq_ops_opaque = s;
     }
-- 
2.31.1
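[Editorial note, not part of the patch.] The buffer sizing in
vhost_vdpa_net_cvq_cmd_len() / vhost_vdpa_net_cvq_cmd_page_len() can be
checked with the standalone sketch below. It reproduces the same arithmetic
with plain constants: the header is 2 bytes, MAC_TABLE_SET carries the
4-byte entries field of two MAC tables, and room is left for
MAC_TABLE_ENTRIES addresses. MAC_TABLE_ENTRIES = 64 and ETH_ALEN = 6 match
QEMU's virtio-net model; the 4 KiB host page size is an assumption about
qemu_real_host_page_size().

/* Standalone sketch of the CVQ command buffer size computation. */
#include <stdio.h>

#define CTRL_HDR_LEN       2u      /* sizeof(struct virtio_net_ctrl_hdr) */
#define CTRL_MAC_LEN       4u      /* sizeof(struct virtio_net_ctrl_mac): 32-bit entries count */
#define MAC_TABLE_ENTRIES  64u     /* QEMU virtio-net model */
#define ETH_ALEN           6u
#define HOST_PAGE_SIZE     4096u   /* assumed host page size */

#define ROUND_UP(n, d)     (((n) + (d) - 1) / (d) * (d))

int main(void)
{
    /* Worst case out buffer: MAC_TABLE_SET = header + two table headers + MACs. */
    unsigned cmd_len = CTRL_HDR_LEN + 2 * CTRL_MAC_LEN +
                       MAC_TABLE_ENTRIES * ETH_ALEN;        /* 2 + 8 + 384 = 394 */
    unsigned page_len = ROUND_UP(cmd_len, HOST_PAGE_SIZE);  /* 4096 */

    printf("cmd len %u bytes, mapped length %u bytes\n", cmd_len, page_len);
    return 0;
}

So each shadow buffer is a single zeroed, page-aligned page, which is why
the map/unmap helpers can map a whole page and look translations up by
address alone.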