From: Eugenio Pérez <eperezma@redhat.com>
To: qemu-devel@nongnu.org
Cc: Parav Pandit, "Michael S. Tsirkin", Jason Wang, Juan Quintela,
    Markus Armbruster, virtualization@lists.linux-foundation.org,
    Harpreet Singh Anand, Xiao W Wang, Stefan Hajnoczi, Eli Cohen,
    Michael Lilja, Stefano Garzarella
Subject: [RFC v3 23/29] vhost: Use a tree to store memory mappings
Date: Wed, 19 May 2021 18:28:57 +0200
Message-Id: <20210519162903.1172366-24-eperezma@redhat.com>
In-Reply-To: <20210519162903.1172366-1-eperezma@redhat.com>
References: <20210519162903.1172366-1-eperezma@redhat.com>

At the moment, the tree is only used to store 1:1 maps of the qemu virtual
addresses of the shadow virtqueue vring and the guest's addresses. In other
words, the tree only serves to check whether the address the guest exposed
is valid at the moment qemu receives the miss; it does not yet work if the
device has restrictions on its iova range.

Updates to the tree are protected by the BQL: each one always runs from the
main event loop context. vhost_device_iotlb_miss reads the tree from that
same context.
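To illustrate the intent (a minimal sketch only, not the VhostIOVATree API
added by this series: dma_map, map_insert and map_find are made-up names,
and a flat array stands in for the tree), the lookup done on an IOTLB miss
with shadow virtqueues enabled boils down to finding the stored entry that
contains the faulting iova and handing back its qemu VA and length:

    /* Hypothetical stand-in for the 1:1 IOVA -> qemu VA lookup. */
    #include <assert.h>
    #include <inttypes.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t iova;            /* first byte of the range (IOVA space) */
        uint64_t size;            /* last byte offset, inclusive          */
        void *translated_addr;    /* qemu VA backing the range            */
    } dma_map;

    static dma_map maps[16];
    static size_t n_maps;

    static void map_insert(uint64_t iova, uint64_t size, void *vaddr)
    {
        assert(n_maps < sizeof(maps) / sizeof(maps[0]));
        maps[n_maps++] = (dma_map) {
            .iova = iova, .size = size, .translated_addr = vaddr,
        };
    }

    /* Return the entry containing 'iova', or NULL on a real miss. */
    static const dma_map *map_find(uint64_t iova)
    {
        for (size_t i = 0; i < n_maps; i++) {
            if (iova >= maps[i].iova && iova <= maps[i].iova + maps[i].size) {
                return &maps[i];
            }
        }
        return NULL;
    }

    int main(void)
    {
        static char guest_mem[0x2000];

        /* 1:1 map: the iova is simply the qemu VA of the region. */
        map_insert((uintptr_t)guest_mem, sizeof(guest_mem) - 1, guest_mem);

        const dma_map *m = map_find((uintptr_t)guest_mem + 0x10);
        if (m) {
            /* vhost would be updated with iova = m->iova, len = m->size + 1 */
            printf("hit: len=%" PRIu64 "\n", m->size + 1);
        } else {
            printf("miss: not a valid guest/SVQ address\n");
        }
        return 0;
    }

A hit means the address the guest (or the shadow virtqueue) exposed is
backed by qemu memory and can be pushed to the vhost device iotlb; a miss
means the address is invalid and the update is skipped, which is what the
"goto out" path in the patch below does.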
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
 include/hw/virtio/vhost.h |   3 +
 hw/virtio/vhost.c         | 121 ++++++++++++++++++++++++++++++--------
 2 files changed, 99 insertions(+), 25 deletions(-)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index c97a4c0017..773f882145 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -2,6 +2,7 @@
 #define VHOST_H
 
 #include "hw/virtio/vhost-backend.h"
+#include "hw/virtio/vhost-iova-tree.h"
 #include "hw/virtio/virtio.h"
 #include "exec/memory.h"
 
@@ -88,6 +89,8 @@ struct vhost_dev {
     bool log_enabled;
     bool shadow_vqs_enabled;
    uint64_t log_size;
+    /* IOVA mapping used by Shadow Virtqueue */
+    VhostIOVATree iova_map;
     struct {
         hwaddr first;
         hwaddr last;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index c8fa9df9b3..925d2146a4 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1013,31 +1013,45 @@ static int vhost_memory_region_lookup(struct vhost_dev *hdev,
 
 int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write)
 {
-    IOMMUTLBEntry iotlb;
+    IOMMUAccessFlags perm;
     uint64_t uaddr, len;
     int ret = -EFAULT;
 
-    RCU_READ_LOCK_GUARD();
-
     trace_vhost_iotlb_miss(dev, 1);
 
     if (dev->shadow_vqs_enabled) {
-        uaddr = iova;
-        len = 4096;
-        ret = vhost_backend_update_device_iotlb(dev, iova, uaddr, len,
-                                                IOMMU_RW);
-        if (ret) {
-            trace_vhost_iotlb_miss(dev, 2);
-            error_report("Fail to update device iotlb");
+        /* Shadow virtqueue translations in its Virtual Address Space */
+        const VhostDMAMap *result;
+        const VhostDMAMap needle = {
+            .iova = iova,
+        };
+
+        result = vhost_iova_tree_find_taddr(&dev->iova_map, &needle);
+
+        if (unlikely(!result)) {
+            goto out;
         }
 
-        return ret;
-    }
+        iova = result->iova;
+        uaddr = (uint64_t)result->translated_addr;
+        /*
+         * In IOVATree, result.iova + result.size is the last element of iova.
+         * For vhost, it is one past that last element.
+         */
+        len = result->size + 1;
+        perm = result->perm;
+    } else {
+        IOMMUTLBEntry iotlb;
+
+        RCU_READ_LOCK_GUARD();
+        iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as,
+                                              iova, write,
+                                              MEMTXATTRS_UNSPECIFIED);
+
+        if (iotlb.target_as == NULL) {
+            goto out;
+        }
 
-    iotlb = address_space_get_iotlb_entry(dev->vdev->dma_as,
-                                          iova, write,
-                                          MEMTXATTRS_UNSPECIFIED);
-    if (iotlb.target_as != NULL) {
         ret = vhost_memory_region_lookup(dev, iotlb.translated_addr,
                                          &uaddr, &len);
         if (ret) {
@@ -1049,14 +1063,14 @@ int vhost_device_iotlb_miss(struct vhost_dev *dev, uint64_t iova, int write)
 
         len = MIN(iotlb.addr_mask + 1, len);
         iova = iova & ~iotlb.addr_mask;
+        perm = iotlb.perm;
+    }
 
-        ret = vhost_backend_update_device_iotlb(dev, iova, uaddr,
-                                                len, iotlb.perm);
-        if (ret) {
-            trace_vhost_iotlb_miss(dev, 4);
-            error_report("Fail to update device iotlb");
-            goto out;
-        }
+    ret = vhost_backend_update_device_iotlb(dev, iova, uaddr, len, perm);
+    if (ret) {
+        trace_vhost_iotlb_miss(dev, 4);
+        error_report("Fail to update device iotlb");
+        goto out;
     }
 
     trace_vhost_iotlb_miss(dev, 2);
@@ -1249,7 +1263,7 @@ static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
     if (r) {
         error_report("Fail to invalidate device iotlb");
     }
-
+    vhost_iova_tree_destroy(&dev->iova_map);
     for (idx = 0; idx < dev->nvqs; ++idx) {
         struct vhost_virtqueue *vq = dev->vqs + idx;
         if (vhost_dev_has_iommu(dev) &&
@@ -1279,6 +1293,26 @@ static int vhost_sw_live_migration_stop(struct vhost_dev *dev)
     return 0;
 }
 
+static bool vhost_shadow_vq_start_store_sections(struct vhost_dev *dev)
+{
+    int idx;
+
+    for (idx = 0; idx < dev->n_mem_sections; ++idx) {
+        size_t region_size = dev->mem->regions[idx].memory_size;
+        VhostDMAMap region = {
+            .iova = dev->mem->regions[idx].userspace_addr,
+            .translated_addr = (void *)dev->mem->regions[idx].userspace_addr,
+            .size = region_size - 1,
+            .perm = VHOST_ACCESS_RW,
+        };
+
+        VhostDMAMapNewRC r = vhost_iova_tree_insert(&dev->iova_map, &region);
+        assert(r == VHOST_DMA_MAP_OK);
+    }
+
+    return true;
+}
+
 /*
  * Start shadow virtqueue in a given queue.
  * In failure case, this function leaves queue working as regular vhost mode.
@@ -1292,9 +1326,37 @@ static bool vhost_sw_live_migration_start_vq(struct vhost_dev *dev,
     struct vhost_vring_state s = {
         .index = idx,
     };
+    VhostDMAMap driver_region, device_region;
+    int r;
     bool ok;
 
+    assert(dev->shadow_vqs[idx] != NULL);
+    vhost_shadow_vq_get_vring_addr(dev->shadow_vqs[idx], &addr);
+    driver_region = (VhostDMAMap) {
+        .iova = addr.desc_user_addr,
+        .translated_addr = (void *)addr.desc_user_addr,
+
+        /*
+         * DMAMap.size includes the last byte included in the range, while
+         * sizeof marks one past it. Subtract one byte to make them match.
+         */
+        .size = vhost_shadow_vq_driver_area_size(dev->shadow_vqs[idx]) - 1,
+        .perm = VHOST_ACCESS_RO,
+    };
+    device_region = (VhostDMAMap) {
+        .iova = addr.used_user_addr,
+        .translated_addr = (void *)addr.used_user_addr,
+        .size = vhost_shadow_vq_device_area_size(dev->shadow_vqs[idx]) - 1,
+        .perm = VHOST_ACCESS_RW,
+    };
+
+    r = vhost_iova_tree_insert(&dev->iova_map, &driver_region);
+    assert(r == VHOST_DMA_MAP_OK);
+
+    r = vhost_iova_tree_insert(&dev->iova_map, &device_region);
+    assert(r == VHOST_DMA_MAP_OK);
+
     vhost_virtqueue_stop(dev, dev->vdev, &dev->vqs[idx], dev->vq_index + idx);
     ok = vhost_shadow_vq_start(dev, idx, dev->shadow_vqs[idx]);
     if (unlikely(!ok)) {
@@ -1302,7 +1364,6 @@ static bool vhost_sw_live_migration_start_vq(struct vhost_dev *dev,
     }
 
     /* From this point, vhost_virtqueue_start can reset these changes */
-    vhost_shadow_vq_get_vring_addr(dev->shadow_vqs[idx], &addr);
     r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
     if (unlikely(r != 0)) {
         VHOST_OPS_DEBUG("vhost_set_vring_addr for shadow vq failed");
@@ -1315,6 +1376,7 @@ static bool vhost_sw_live_migration_start_vq(struct vhost_dev *dev,
         goto err;
     }
 
+
     if (vhost_dev_has_iommu(dev) && dev->vhost_ops->vhost_set_iotlb_callback) {
         /*
          * Update used ring information for IOTLB to work correctly,
@@ -1357,6 +1419,15 @@ static int vhost_sw_live_migration_start(struct vhost_dev *dev)
         error_report("Fail to invalidate device iotlb");
     }
 
+    /*
+     * Create new iova mappings. SVQ always exposes qemu's VA.
+     * TODO: Fine tune the exported mapping. Default vhost does not expose
+     * everything.
+     */
+
+    vhost_iova_tree_new(&dev->iova_map);
+    vhost_shadow_vq_start_store_sections(dev);
+
     /* Can be read by vhost_virtqueue_mask, from vm exit */
     dev->shadow_vqs_enabled = true;
     for (idx = 0; idx < dev->nvqs; ++idx) {
-- 
2.27.0