From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403445806835.9215353929864; Thu, 3 May 2018 20:10:45 -0700 (PDT) Received: from localhost ([::1]:60168 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER7N-0002iE-1L for importer@patchew.org; Thu, 03 May 2018 23:10:45 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33160) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5E-0001VX-Ln for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:33 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5D-0005In-Kg for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:32 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:48244 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5D-0005IY-F7 for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:31 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 25C6140858CD; Fri, 4 May 2018 03:08:31 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1E1B9111762B; Fri, 4 May 2018 03:08:27 +0000 (UTC) From: Peter Xu To: 
qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:02 +0800 Message-Id: <20180504030811.28111-2-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 04 May 2018 03:08:31 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 04 May 2018 03:08:31 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 01/10] intel-iommu: send PSI always even if across PDEs X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" During IOVA page table walking, there is a special case when the PSI covers one whole PDE (Page Directory Entry, which contains 512 Page Table Entries) or more. Previously, we skipped that entry and did not notify the IOMMU notifiers. This is not correct. We should send an UNMAP notification to the registered UNMAP notifiers in this case. For UNMAP-only notifiers, skipping it might leave IOTLB entries cached in the devices even though they are already invalid. For MAP/UNMAP notifiers like vfio-pci, it will cause stale page mappings.
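As a rough illustration of the special case described above (a sketch only, not code from this patch): a PSI with address mask `am` invalidates 2^am 4 KiB pages, and one PDE spans 512 PTEs, i.e. 2 MiB. The helper names `psi_size` and `psi_covers_whole_pde` are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_4K 12                              /* 4 KiB base pages */
#define PTES_PER_PDE  512ULL                          /* PTEs under one PDE */
#define PDE_SPAN      (PTES_PER_PDE << PAGE_SHIFT_4K) /* 2 MiB per PDE */

/* Number of bytes invalidated by a PSI with address mask 'am'. */
uint64_t psi_size(uint8_t am)
{
    return (1ULL << am) << PAGE_SHIFT_4K;
}

/*
 * True if [addr, addr + psi_size(am)) contains at least one whole,
 * naturally aligned 2 MiB region, i.e. the page walk can meet a PDE
 * that lies entirely inside the invalidated range.
 */
bool psi_covers_whole_pde(uint64_t addr, uint8_t am)
{
    uint64_t end = addr + psi_size(am);
    /* Round the start up to the next PDE boundary. */
    uint64_t first_pde = (addr + PDE_SPAN - 1) & ~(PDE_SPAN - 1);

    return first_pde + PDE_SPAN <= end;
}
```

With these assumptions, a single-page PSI (`am` = 0) never covers a whole PDE, while an aligned PSI with `am` >= 9 always does, which is the case the walker used to skip silently.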
This special case doesn't trigger often, but it is very easy to be triggered by nested device assignments, since in that case we'll possibly map the whole L2 guest RAM region into the device's IOVA address space (several GBs at least), which is far bigger than normal kernel driver usages of the device (tens of MBs normally). Without this patch applied to L1 QEMU, nested device assignment to L2 guests will dump some errors like: qemu-system-x86_64: VFIO_MAP_DMA: -17 qemu-system-x86_64: vfio_dma_map(0x557305420c30, 0xad000, 0x1000, 0x7f89a920d000) =3D -17 (File exists) Acked-by: Jason Wang [peterx: rewrite the commit message] Signed-off-by: Peter Xu Reviewed-by: Eric Auger --- hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 12 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index fb31de9416..b359efd6f9 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -722,6 +722,15 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint= 64_t iova, bool is_write, =20 typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private); =20 +static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, + vtd_page_walk_hook hook_fn, void *private) +{ + assert(hook_fn); + trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, + entry->addr_mask, entry->perm); + return hook_fn(entry, private); +} + /** * vtd_page_walk_level - walk over specific level for IOVA range * @@ -781,28 +790,37 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, */ entry_valid =3D read_cur | write_cur; =20 + entry.target_as =3D &address_space_memory; + entry.iova =3D iova & subpage_mask; + entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); + entry.addr_mask =3D ~subpage_mask; + if (vtd_is_last_slpte(slpte, level)) { - entry.target_as =3D &address_space_memory; - entry.iova =3D iova & subpage_mask; /* NOTE: this is only meaningful if entry_valid =3D=3D true */ entry.translated_addr 
=3D vtd_get_slpte_addr(slpte, aw); - entry.addr_mask =3D ~subpage_mask; - entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); if (!entry_valid && !notify_unmap) { trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; } - trace_vtd_page_walk_one(level, entry.iova, entry.translated_ad= dr, - entry.addr_mask, entry.perm); - if (hook_fn) { - ret =3D hook_fn(&entry, private); - if (ret < 0) { - return ret; - } + ret =3D vtd_page_walk_one(&entry, level, hook_fn, private); + if (ret < 0) { + return ret; } } else { if (!entry_valid) { - trace_vtd_page_walk_skip_perm(iova, iova_next); + if (notify_unmap) { + /* + * The whole entry is invalid; unmap it all. + * Translated address is meaningless, zero it. + */ + entry.translated_addr =3D 0x0; + ret =3D vtd_page_walk_one(&entry, level, hook_fn, priv= ate); + if (ret < 0) { + return ret; + } + } else { + trace_vtd_page_walk_skip_perm(iova, iova_next); + } goto next; } ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iov= a, --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403591362370.5976881242286; Thu, 3 May 2018 20:13:11 -0700 (PDT) Received: from localhost ([::1]:60184 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER9i-0004mJ-Gw for importer@patchew.org; Thu, 03 May 2018 23:13:10 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33188) by lists.gnu.org with esmtp (Exim 4.71) 
(envelope-from ) id 1fER5I-0001Yj-RD for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5H-0005Ka-ER for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:36 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:54370 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5H-0005KG-8u for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:35 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id EB1098E692; Fri, 4 May 2018 03:08:34 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id BE7D5111762B; Fri, 4 May 2018 03:08:31 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:03 +0800 Message-Id: <20180504030811.28111-3-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Fri, 04 May 2018 03:08:34 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Fri, 04 May 2018 03:08:34 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 02/10] intel-iommu: remove IntelIOMMUNotifierNode X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" IntelIOMMUNotifierNode is not really necessary. Remove that node struct and put the list entry directly into VTDAddressSpace. This simplifies the code considerably. Signed-off-by: Peter Xu --- include/hw/i386/intel_iommu.h | 9 ++------ hw/i386/intel_iommu.c | 41 ++++++++++------------------------- 2 files changed, 14 insertions(+), 36 deletions(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 45ec8919b6..220697253f 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -67,7 +67,6 @@ typedef union VTD_IR_TableEntry VTD_IR_TableEntry; typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress; typedef struct VTDIrq VTDIrq; typedef struct VTD_MSIMessage VTD_MSIMessage; -typedef struct IntelIOMMUNotifierNode IntelIOMMUNotifierNode; =20 /* Context-Entry */ struct VTDContextEntry { @@ -93,6 +92,7 @@ struct VTDAddressSpace { MemoryRegion iommu_ir; /* Interrupt region: 0xfeeXXXXX */ IntelIOMMUState *iommu_state; VTDContextCacheEntry context_cache_entry; + QLIST_ENTRY(VTDAddressSpace) next; }; =20 struct VTDBus { @@ -253,11 +253,6 @@ struct VTD_MSIMessage { /* When IR is enabled, all MSI/MSI-X data bits should be zero */ #define VTD_IR_MSI_DATA (0) =20 -struct IntelIOMMUNotifierNode { - VTDAddressSpace *vtd_as; - QLIST_ENTRY(IntelIOMMUNotifierNode) next; -}; - /* The iommu (DMAR) device state struct */ struct IntelIOMMUState { X86IOMMUState x86_iommu; @@ -295,7 +290,7 @@ struct IntelIOMMUState { GHashTable *vtd_as_by_busptr; /* VTDBus objects indexed by PCIBus* r= eference */ VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed = by bus number */ /* list of registered notifiers */ - 
QLIST_HEAD(, IntelIOMMUNotifierNode) notifiers_list; + QLIST_HEAD(, VTDAddressSpace) notifiers_list; =20 /* interrupt remapping */ bool intr_enabled; /* Whether guest enabled IR */ diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index b359efd6f9..5987b48d43 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1248,10 +1248,10 @@ static void vtd_interrupt_remap_table_setup(IntelIO= MMUState *s) =20 static void vtd_iommu_replay_all(IntelIOMMUState *s) { - IntelIOMMUNotifierNode *node; + VTDAddressSpace *vtd_as; =20 - QLIST_FOREACH(node, &s->notifiers_list, next) { - memory_region_iommu_replay_all(&node->vtd_as->iommu); + QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { + memory_region_iommu_replay_all(&vtd_as->iommu); } } =20 @@ -1372,7 +1372,6 @@ static void vtd_iotlb_global_invalidate(IntelIOMMUSta= te *s) =20 static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domai= n_id) { - IntelIOMMUNotifierNode *node; VTDContextEntry ce; VTDAddressSpace *vtd_as; =20 @@ -1381,8 +1380,7 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUSta= te *s, uint16_t domain_id) g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_domain, &domain_id); =20 - QLIST_FOREACH(node, &s->notifiers_list, next) { - vtd_as =3D node->vtd_as; + QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce) && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { @@ -1402,12 +1400,11 @@ static void vtd_iotlb_page_invalidate_notify(IntelI= OMMUState *s, uint16_t domain_id, hwaddr addr, uint8_t am) { - IntelIOMMUNotifierNode *node; + VTDAddressSpace *vtd_as; VTDContextEntry ce; int ret; =20 - QLIST_FOREACH(node, &(s->notifiers_list), next) { - VTDAddressSpace *vtd_as =3D node->vtd_as; + QLIST_FOREACH(vtd_as, &(s->notifiers_list), next) { ret =3D vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce); if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { @@ -2344,8 
+2341,6 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemory= Region *iommu, { VTDAddressSpace *vtd_as =3D container_of(iommu, VTDAddressSpace, iommu= ); IntelIOMMUState *s =3D vtd_as->iommu_state; - IntelIOMMUNotifierNode *node =3D NULL; - IntelIOMMUNotifierNode *next_node =3D NULL; =20 if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) { error_report("We need to set caching-mode=3D1 for intel-iommu to e= nable " @@ -2354,21 +2349,11 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemo= ryRegion *iommu, } =20 if (old =3D=3D IOMMU_NOTIFIER_NONE) { - node =3D g_malloc0(sizeof(*node)); - node->vtd_as =3D vtd_as; - QLIST_INSERT_HEAD(&s->notifiers_list, node, next); - return; - } - - /* update notifier node with new flags */ - QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) { - if (node->vtd_as =3D=3D vtd_as) { - if (new =3D=3D IOMMU_NOTIFIER_NONE) { - QLIST_REMOVE(node, next); - g_free(node); - } - return; - } + /* Insert new ones */ + QLIST_INSERT_HEAD(&s->notifiers_list, vtd_as, next); + } else if (new =3D=3D IOMMU_NOTIFIER_NONE) { + /* Remove old ones */ + QLIST_REMOVE(vtd_as, next); } } =20 @@ -2838,12 +2823,10 @@ static void vtd_address_space_unmap(VTDAddressSpace= *as, IOMMUNotifier *n) =20 static void vtd_address_space_unmap_all(IntelIOMMUState *s) { - IntelIOMMUNotifierNode *node; VTDAddressSpace *vtd_as; IOMMUNotifier *n; =20 - QLIST_FOREACH(node, &s->notifiers_list, next) { - vtd_as =3D node->vtd_as; + QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) { vtd_address_space_unmap(vtd_as, n); } --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted 
sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403624785639.4467956465393; Thu, 3 May 2018 20:13:44 -0700 (PDT) Received: from localhost ([::1]:60185 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fERAF-0005Ah-VO for importer@patchew.org; Thu, 03 May 2018 23:13:44 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33256) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5V-0001h8-3G for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5R-0005UI-0F for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:49 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:55760 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5Q-0005Tt-R7 for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:44 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7ACF64182D59; Fri, 4 May 2018 03:08:44 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8F515111762B; Fri, 4 May 2018 03:08:35 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:04 +0800 Message-Id: <20180504030811.28111-4-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Fri, 04 May 2018 03:08:44 +0000 (UTC) 
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Fri, 04 May 2018 03:08:44 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 03/10] intel-iommu: add iommu lock X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Add a per-IOMMU big lock to protect IOMMU state. Currently the only things that need protection are the IOTLB and the context cache, since they can be accessed even without the BQL, e.g., in the I/O dataplane. Note that we don't need to protect device page tables since those are fully controlled by the guest kernel. However, there is still a possibility that malicious drivers will program the device to not obey the rule. In that case QEMU can't really do anything useful; instead, the guest itself will be responsible for all uncertainties.
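The locking convention used by this patch (a `_locked` helper for callers that already hold the lock, plus a thin wrapper that takes it) can be sketched in isolation as follows. This is a minimal stand-alone illustration using POSIX mutexes rather than QemuMutex; `DemoState` and all `demo_*` names are hypothetical and are not part of QEMU.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CACHE_MAX 4   /* tiny limit so the reset path is easy to hit */

/* Hypothetical stand-in for the IOMMU state: a lock plus a cache size. */
typedef struct {
    pthread_mutex_t lock;
    size_t cache_entries;
} DemoState;

void demo_init(DemoState *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->cache_entries = 0;
}

/* Caller must hold s->lock. */
void demo_reset_cache_locked(DemoState *s)
{
    s->cache_entries = 0;
}

/* Takes the lock itself; for callers that do not hold it. */
void demo_reset_cache(DemoState *s)
{
    pthread_mutex_lock(&s->lock);
    demo_reset_cache_locked(s);
    pthread_mutex_unlock(&s->lock);
}

/* Caller must hold s->lock, so it has to use the _locked reset. */
void demo_update_cache_locked(DemoState *s)
{
    if (s->cache_entries >= CACHE_MAX) {
        demo_reset_cache_locked(s);
    }
    s->cache_entries++;
}

/* Translation-style entry point: lock once, do the work, unlock. */
size_t demo_translate(DemoState *s)
{
    size_t n;

    pthread_mutex_lock(&s->lock);
    demo_update_cache_locked(s);
    n = s->cache_entries;
    pthread_mutex_unlock(&s->lock);
    return n;
}
```

Calling the locking wrapper from a path that already holds the lock would deadlock on a non-recursive mutex, which is presumably why the update path in the diff switches to the `_locked` variant.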
Reported-by: Fam Zheng Signed-off-by: Peter Xu --- include/hw/i386/intel_iommu.h | 6 +++++ hw/i386/intel_iommu.c | 43 +++++++++++++++++++++++++++++++---- 2 files changed, 44 insertions(+), 5 deletions(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 220697253f..ee517704e7 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -300,6 +300,12 @@ struct IntelIOMMUState { OnOffAuto intr_eim; /* Toggle for EIM cabability */ bool buggy_eim; /* Force buggy EIM unless eim=3Doff */ uint8_t aw_bits; /* Host/IOVA address width (in bits) */ + + /* + * Protects IOMMU states in general. Currently it protects the + * per-IOMMU IOTLB cache, and context entry cache in VTDAddressSpace. + */ + QemuMutex iommu_lock; }; =20 /* Find the VTD Address space associated with the given bus pointer, diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 5987b48d43..112971638d 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -128,6 +128,16 @@ static uint64_t vtd_set_clear_mask_quad(IntelIOMMUStat= e *s, hwaddr addr, return new_val; } =20 +static inline void vtd_iommu_lock(IntelIOMMUState *s) +{ + qemu_mutex_lock(&s->iommu_lock); +} + +static inline void vtd_iommu_unlock(IntelIOMMUState *s) +{ + qemu_mutex_unlock(&s->iommu_lock); +} + /* GHashTable functions */ static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2) { @@ -172,7 +182,7 @@ static gboolean vtd_hash_remove_by_page(gpointer key, g= pointer value, } =20 /* Reset all the gen of VTDAddressSpace to zero and set the gen of - * IntelIOMMUState to 1. + * IntelIOMMUState to 1. Must be with IOMMU lock held. 
*/ static void vtd_reset_context_cache(IntelIOMMUState *s) { @@ -197,12 +207,19 @@ static void vtd_reset_context_cache(IntelIOMMUState *= s) s->context_cache_gen =3D 1; } =20 -static void vtd_reset_iotlb(IntelIOMMUState *s) +static void vtd_reset_iotlb_locked(IntelIOMMUState *s) { assert(s->iotlb); g_hash_table_remove_all(s->iotlb); } =20 +static void vtd_reset_iotlb(IntelIOMMUState *s) +{ + vtd_iommu_lock(s); + vtd_reset_iotlb_locked(s); + vtd_iommu_unlock(s); +} + static uint64_t vtd_get_iotlb_key(uint64_t gfn, uint16_t source_id, uint32_t level) { @@ -215,6 +232,7 @@ static uint64_t vtd_get_iotlb_gfn(hwaddr addr, uint32_t= level) return (addr & vtd_slpt_level_page_mask(level)) >> VTD_PAGE_SHIFT_4K; } =20 +/* Must be with IOMMU lock held */ static VTDIOTLBEntry *vtd_lookup_iotlb(IntelIOMMUState *s, uint16_t source= _id, hwaddr addr) { @@ -235,6 +253,7 @@ out: return entry; } =20 +/* Must be with IOMMU lock held */ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id, uint16_t domain_id, hwaddr addr, uint64_t slp= te, uint8_t access_flags, uint32_t level) @@ -246,7 +265,7 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16= _t source_id, trace_vtd_iotlb_page_update(source_id, addr, slpte, domain_id); if (g_hash_table_size(s->iotlb) >=3D VTD_IOTLB_MAX_SIZE) { trace_vtd_iotlb_reset("iotlb exceeds size limit"); - vtd_reset_iotlb(s); + vtd_reset_iotlb_locked(s); } =20 entry->gfn =3D gfn; @@ -1106,7 +1125,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *v= td_as, PCIBus *bus, IntelIOMMUState *s =3D vtd_as->iommu_state; VTDContextEntry ce; uint8_t bus_num =3D pci_bus_num(bus); - VTDContextCacheEntry *cc_entry =3D &vtd_as->context_cache_entry; + VTDContextCacheEntry *cc_entry; uint64_t slpte, page_mask; uint32_t level; uint16_t source_id =3D vtd_make_source_id(bus_num, devfn); @@ -1123,6 +1142,10 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *= vtd_as, PCIBus *bus, */ assert(!vtd_is_interrupt_addr(addr)); =20 + vtd_iommu_lock(s); + + 
cc_entry =3D &vtd_as->context_cache_entry; + /* Try to fetch slpte form IOTLB */ iotlb_entry =3D vtd_lookup_iotlb(s, source_id, addr); if (iotlb_entry) { @@ -1182,7 +1205,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *v= td_as, PCIBus *bus, * IOMMU region can be swapped back. */ vtd_pt_enable_fast_path(s, source_id); - + vtd_iommu_unlock(s); return true; } =20 @@ -1203,6 +1226,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *v= td_as, PCIBus *bus, vtd_update_iotlb(s, source_id, VTD_CONTEXT_ENTRY_DID(ce.hi), addr, slp= te, access_flags, level); out: + vtd_iommu_unlock(s); entry->iova =3D addr & page_mask; entry->translated_addr =3D vtd_get_slpte_addr(slpte, s->aw_bits) & pag= e_mask; entry->addr_mask =3D ~page_mask; @@ -1210,6 +1234,7 @@ out: return true; =20 error: + vtd_iommu_unlock(s); entry->iova =3D 0; entry->translated_addr =3D 0; entry->addr_mask =3D 0; @@ -1258,10 +1283,13 @@ static void vtd_iommu_replay_all(IntelIOMMUState *s) static void vtd_context_global_invalidate(IntelIOMMUState *s) { trace_vtd_inv_desc_cc_global(); + /* Protects context cache */ + vtd_iommu_lock(s); s->context_cache_gen++; if (s->context_cache_gen =3D=3D VTD_CONTEXT_CACHE_GEN_MAX) { vtd_reset_context_cache(s); } + vtd_iommu_unlock(s); vtd_switch_address_space_all(s); /* * From VT-d spec 6.5.2.1, a global context entry invalidation @@ -1377,8 +1405,10 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUSt= ate *s, uint16_t domain_id) =20 trace_vtd_inv_desc_iotlb_domain(domain_id); =20 + vtd_iommu_lock(s); g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_domain, &domain_id); + vtd_iommu_unlock(s); =20 QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), @@ -1426,7 +1456,9 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState= *s, uint16_t domain_id, info.domain_id =3D domain_id; info.addr =3D addr; info.mask =3D ~((1 << am) - 1); + vtd_iommu_lock(s); g_hash_table_foreach_remove(s->iotlb, 
vtd_hash_remove_by_page, &info); + vtd_iommu_unlock(s); vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am); } =20 @@ -3072,6 +3104,7 @@ static void vtd_realize(DeviceState *dev, Error **err= p) } =20 QLIST_INIT(&s->notifiers_list); + qemu_mutex_init(&s->iommu_lock); memset(s->vtd_as_by_bus_num, 0, sizeof(s->vtd_as_by_bus_num)); memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s, "intel_iommu", DMAR_REG_SIZE); --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403463707711.6037839318723; Thu, 3 May 2018 20:11:03 -0700 (PDT) Received: from localhost ([::1]:60173 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER7e-00034I-Uq for importer@patchew.org; Thu, 03 May 2018 23:11:03 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33271) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5W-0001iD-9c for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:51 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5V-0005YP-3o for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:50 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:48264 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5U-0005Xj-UX for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:49 -0400 Received: from smtp.corp.redhat.com 
(int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9DFB240858C9; Fri, 4 May 2018 03:08:48 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 49CCD11166E0; Fri, 4 May 2018 03:08:44 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:05 +0800 Message-Id: <20180504030811.28111-5-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 04 May 2018 03:08:48 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 04 May 2018 03:08:48 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 04/10] intel-iommu: only do page walk for MAP notifiers X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" For UNMAP-only IOMMU notifiers, we don't really need to walk the page tables. Speed up that path by skipping the page table walk. That should boost performance for UNMAP-only notifiers like vhost.
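The decision described above can be sketched on its own: check whether any registered notifier wants MAP events, and if not, build a plain invalidation event straight from the PSI address and mask instead of walking the page table. This is an illustrative sketch under stated assumptions; the `DEMO_*` flag values, `DemoUnmapEvent`, and the `demo_*` functions are hypothetical and only mirror the shape of the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical notifier flag values, for illustration only. */
enum {
    DEMO_NOTIFIER_NONE  = 0,
    DEMO_NOTIFIER_UNMAP = 1 << 0,
    DEMO_NOTIFIER_MAP   = 1 << 1,
};

#define DEMO_PAGE_SIZE 4096ULL

/* A simplified invalidation event: a range, with no translation. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} DemoUnmapEvent;

/* A page walk is only needed when some notifier wants MAP events. */
bool demo_needs_page_walk(int notifier_flags)
{
    return (notifier_flags & DEMO_NOTIFIER_MAP) != 0;
}

/* Build the event an UNMAP-only notifier would receive for a PSI. */
DemoUnmapEvent demo_unmap_event(uint64_t addr, uint8_t am)
{
    /* 1ULL keeps the shift in 64 bits regardless of the value of am. */
    uint64_t size = (1ULL << am) * DEMO_PAGE_SIZE;
    DemoUnmapEvent ev = { .iova = addr, .addr_mask = size - 1 };

    return ev;
}
```

The UNMAP-only path costs a constant amount of work per invalidation, whereas the walk scales with the number of page table entries in the range.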
Signed-off-by: Peter Xu --- include/hw/i386/intel_iommu.h | 2 ++ hw/i386/intel_iommu.c | 43 +++++++++++++++++++++++++++++++---- 2 files changed, 40 insertions(+), 5 deletions(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index ee517704e7..9e0a6c1c6a 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -93,6 +93,8 @@ struct VTDAddressSpace { IntelIOMMUState *iommu_state; VTDContextCacheEntry context_cache_entry; QLIST_ENTRY(VTDAddressSpace) next; + /* Superset of notifier flags that this address space has */ + IOMMUNotifierFlag notifier_flags; }; =20 struct VTDBus { diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 112971638d..9a418abfb6 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -138,6 +138,12 @@ static inline void vtd_iommu_unlock(IntelIOMMUState *s) qemu_mutex_unlock(&s->iommu_lock); } =20 +/* Whether the address space needs to notify new mappings */ +static inline gboolean vtd_as_notify_mappings(VTDAddressSpace *as) +{ + return as->notifier_flags & IOMMU_NOTIFIER_MAP; +} + /* GHashTable functions */ static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2) { @@ -1433,14 +1439,35 @@ static void vtd_iotlb_page_invalidate_notify(IntelI= OMMUState *s, VTDAddressSpace *vtd_as; VTDContextEntry ce; int ret; + hwaddr size =3D (1 << am) * VTD_PAGE_SIZE; =20 QLIST_FOREACH(vtd_as, &(s->notifiers_list), next) { ret =3D vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce); if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { - vtd_page_walk(&ce, addr, addr + (1 << am) * VTD_PAGE_SIZE, - vtd_page_invalidate_notify_hook, - (void *)&vtd_as->iommu, true, s->aw_bits); + if (vtd_as_notify_mappings(vtd_as)) { + /* + * For MAP-inclusive notifiers, we need to walk the + * page table to sync the shadow page table. 
+ */ + vtd_page_walk(&ce, addr, addr + size, + vtd_page_invalidate_notify_hook, + (void *)&vtd_as->iommu, true, s->aw_bits); + } else { + /* + * For UNMAP-only notifiers, we don't need to walk the + * page tables. We just deliver the PSI down to + * invalidate caches. + */ + IOMMUTLBEntry entry =3D { + .target_as =3D &address_space_memory, + .iova =3D addr, + .translated_addr =3D 0, + .addr_mask =3D size - 1, + .perm =3D IOMMU_NONE, + }; + memory_region_notify_iommu(&vtd_as->iommu, entry); + } } } } @@ -2380,6 +2407,9 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemory= Region *iommu, exit(1); } =20 + /* Update per-address-space notifier flags */ + vtd_as->notifier_flags =3D new; + if (old =3D=3D IOMMU_NOTIFIER_NONE) { /* Insert new ones */ QLIST_INSERT_HEAD(&s->notifiers_list, vtd_as, next); @@ -2890,8 +2920,11 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iomm= u_mr, IOMMUNotifier *n) PCI_FUNC(vtd_as->devfn), VTD_CONTEXT_ENTRY_DID(ce.hi), ce.hi, ce.lo); - vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, - s->aw_bits); + if (vtd_as_notify_mappings(vtd_as)) { + /* This is required only for MAP typed notifiers */ + vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, + s->aw_bits); + } } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), PCI_FUNC(vtd_as->devfn)); --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 
1525403795491153.69503980302602; Thu, 3 May 2018 20:16:35 -0700 (PDT) Received: from localhost ([::1]:60203 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fERCw-0007dq-MJ for importer@patchew.org; Thu, 03 May 2018 23:16:30 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33291) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5a-0001lX-08 for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:56 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5Y-0005c9-Mw for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:53 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45882 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5Y-0005bk-Gu for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:52 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 326FBF641F; Fri, 4 May 2018 03:08:52 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4778211166E0; Fri, 4 May 2018 03:08:48 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:06 +0800 Message-Id: <20180504030811.28111-6-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:08:52 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:08:52 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' 
FROM:'peterx@redhat.com' RCPT:''
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 66.187.233.73
Subject: [Qemu-devel] [PATCH v2 05/10] intel-iommu: introduce vtd_page_walk_info
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

During the recursive page walk of the IOVA page tables, some stack
variables are constants that never change over the whole walk.
Isolate them into a struct so that we don't need to pass those
constants down the stack at every level of the recursion.

Signed-off-by: Peter Xu
Reviewed-by: Eric Auger
---
 hw/i386/intel_iommu.c | 56 ++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 19 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9a418abfb6..b2b2a0a441 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -747,9 +747,27 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
 
 typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
 
+/**
+ * Constant information used during page walking
+ *
+ * @hook_fn: hook func to be called when detected page
+ * @private: private data to be passed into hook func
+ * @notify_unmap: whether we should notify invalid entries
+ * @aw: maximum address width
+ */
+typedef struct {
+    vtd_page_walk_hook hook_fn;
+    void *private;
+    bool notify_unmap;
+    uint8_t aw;
+} vtd_page_walk_info;
+
 static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level,
-                             vtd_page_walk_hook hook_fn, void *private)
+
vtd_page_walk_info *info) { + vtd_page_walk_hook hook_fn =3D info->hook_fn; + void *private =3D info->private; + assert(hook_fn); trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, entry->addr_mask, entry->perm); @@ -762,17 +780,13 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, in= t level, * @addr: base GPA addr to start the walk * @start: IOVA range start address * @end: IOVA range end address (start <=3D addr < end) - * @hook_fn: hook func to be called when detected page - * @private: private data to be passed into hook func * @read: whether parent level has read permission * @write: whether parent level has write permission - * @notify_unmap: whether we should notify invalid entries - * @aw: maximum address width + * @info: constant information for the page walk */ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start, - uint64_t end, vtd_page_walk_hook hook_fn, - void *private, uint32_t level, bool read, - bool write, bool notify_unmap, uint8_t aw) + uint64_t end, uint32_t level, bool read, + bool write, vtd_page_walk_info *info) { bool read_cur, write_cur, entry_valid; uint32_t offset; @@ -822,24 +836,24 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, =20 if (vtd_is_last_slpte(slpte, level)) { /* NOTE: this is only meaningful if entry_valid =3D=3D true */ - entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); - if (!entry_valid && !notify_unmap) { + entry.translated_addr =3D vtd_get_slpte_addr(slpte, info->aw); + if (!entry_valid && !info->notify_unmap) { trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; } - ret =3D vtd_page_walk_one(&entry, level, hook_fn, private); + ret =3D vtd_page_walk_one(&entry, level, info); if (ret < 0) { return ret; } } else { if (!entry_valid) { - if (notify_unmap) { + if (info->notify_unmap) { /* * The whole entry is invalid; unmap it all. * Translated address is meaningless, zero it. 
*/ entry.translated_addr =3D 0x0; - ret =3D vtd_page_walk_one(&entry, level, hook_fn, priv= ate); + ret =3D vtd_page_walk_one(&entry, level, info); if (ret < 0) { return ret; } @@ -848,10 +862,9 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64= _t start, } goto next; } - ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iov= a, - MIN(iova_next, end), hook_fn, privat= e, - level - 1, read_cur, write_cur, - notify_unmap, aw); + ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, info->aw= ), + iova, MIN(iova_next, end), level - 1, + read_cur, write_cur, info); if (ret < 0) { return ret; } @@ -880,6 +893,12 @@ static int vtd_page_walk(VTDContextEntry *ce, uint64_t= start, uint64_t end, { dma_addr_t addr =3D vtd_ce_get_slpt_base(ce); uint32_t level =3D vtd_ce_get_level(ce); + vtd_page_walk_info info =3D { + .hook_fn =3D hook_fn, + .private =3D private, + .notify_unmap =3D notify_unmap, + .aw =3D aw, + }; =20 if (!vtd_iova_range_check(start, ce, aw)) { return -VTD_FR_ADDR_BEYOND_MGAW; @@ -890,8 +909,7 @@ static int vtd_page_walk(VTDContextEntry *ce, uint64_t = start, uint64_t end, end =3D vtd_iova_limit(ce, aw); } =20 - return vtd_page_walk_level(addr, start, end, hook_fn, private, - level, true, true, notify_unmap, aw); + return vtd_page_walk_level(addr, start, end, level, true, true, &info); } =20 /* Map a device to its corresponding domain (context-entry) */ --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org 
[208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403646858877.2868898895987; Thu, 3 May 2018 20:14:06 -0700 (PDT) Received: from localhost ([::1]:60187 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fERAc-0005TV-3u for importer@patchew.org; Thu, 03 May 2018 23:14:06 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33340) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5g-0001qW-Mw for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5c-0005gQ-NR for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:00 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45908 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5c-0005fv-Hr for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:56 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 34172EB706; Fri, 4 May 2018 03:08:56 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id CE7A711166E0; Fri, 4 May 2018 03:08:52 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:07 +0800 Message-Id: <20180504030811.28111-7-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:08:56 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:08:56 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' 
HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 06/10] intel-iommu: pass in address space when page walk X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" We pass in the VTDAddressSpace to replace the aw bits when doing page walk. The VTDAddressSpace contains the aw bits information, meanwhile we'll need to do something more in the follow up patches regarding to the address spaces. Signed-off-by: Peter Xu Reviewed-by: Eric Auger --- hw/i386/intel_iommu.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index b2b2a0a441..83769f2b8c 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -753,13 +753,13 @@ typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entr= y, void *private); * @hook_fn: hook func to be called when detected page * @private: private data to be passed into hook func * @notify_unmap: whether we should notify invalid entries - * @aw: maximum address width + * @as: VT-d address space of the device */ typedef struct { + VTDAddressSpace *as; vtd_page_walk_hook hook_fn; void *private; bool notify_unmap; - uint8_t aw; } vtd_page_walk_info; =20 static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, @@ -796,6 +796,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_= t start, uint64_t iova =3D start; uint64_t iova_next; int ret =3D 0; + uint8_t aw =3D 
info->as->iommu_state->aw_bits; =20 trace_vtd_page_walk_level(addr, level, start, end); =20 @@ -836,7 +837,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_= t start, =20 if (vtd_is_last_slpte(slpte, level)) { /* NOTE: this is only meaningful if entry_valid =3D=3D true */ - entry.translated_addr =3D vtd_get_slpte_addr(slpte, info->aw); + entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); if (!entry_valid && !info->notify_unmap) { trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; @@ -862,7 +863,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_= t start, } goto next; } - ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, info->aw= ), + ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iova, MIN(iova_next, end), level - 1, read_cur, write_cur, info); if (ret < 0) { @@ -885,19 +886,20 @@ next: * @end: IOVA range end address (start <=3D addr < end) * @hook_fn: the hook that to be called for each detected area * @private: private data for the hook function - * @aw: maximum address width + * @as: the VT-d address space of the device */ static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end, vtd_page_walk_hook hook_fn, void *private, - bool notify_unmap, uint8_t aw) + bool notify_unmap, VTDAddressSpace *as) { dma_addr_t addr =3D vtd_ce_get_slpt_base(ce); uint32_t level =3D vtd_ce_get_level(ce); + uint8_t aw =3D as->iommu_state->aw_bits; vtd_page_walk_info info =3D { .hook_fn =3D hook_fn, .private =3D private, .notify_unmap =3D notify_unmap, - .aw =3D aw, + .as =3D as, }; =20 if (!vtd_iova_range_check(start, ce, aw)) { @@ -1470,7 +1472,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOM= MUState *s, */ vtd_page_walk(&ce, addr, addr + size, vtd_page_invalidate_notify_hook, - (void *)&vtd_as->iommu, true, s->aw_bits); + (void *)&vtd_as->iommu, true, vtd_as); } else { /* * For UNMAP-only notifiers, we don't need to walk the @@ -2941,7 +2943,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu= 
_mr, IOMMUNotifier *n) if (vtd_as_notify_mappings(vtd_as)) { /* This is required only for MAP typed notifiers */ vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, - s->aw_bits); + vtd_as); } } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403628278878.8449152063092; Thu, 3 May 2018 20:13:48 -0700 (PDT) Received: from localhost ([::1]:60186 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fERAJ-0005Cb-Ca for importer@patchew.org; Thu, 03 May 2018 23:13:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33394) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5r-0001xV-Jk for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5n-0005nw-CT for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:11 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45926 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5n-0005nf-6A for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:07 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com 
(Postfix) with ESMTPS id D2567F641F; Fri, 4 May 2018 03:09:06 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id C9DFC11166E0; Fri, 4 May 2018 03:08:56 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:08 +0800 Message-Id: <20180504030811.28111-8-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:09:06 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:09:06 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 07/10] util: implement simple interval tree logic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Introduce a simplest interval tree implementation based on GTree. Current implementation is mostly tailored to maintain and trace device mapped IOVA ranges, but still it might be useful to other modules in the future. It is naive in that it even does not allow user to pass in private structs along with the ranges. 
However, the upside is that the tree can merge adjacent ranges when
necessary, which is possible precisely because no such per-range
information is attached.

Signed-off-by: Peter Xu
---
 include/qemu/interval-tree.h | 130 ++++++++++++++++++++++
 util/interval-tree.c         | 208 +++++++++++++++++++++++++++++++++++
 util/Makefile.objs           |   1 +
 3 files changed, 339 insertions(+)
 create mode 100644 include/qemu/interval-tree.h
 create mode 100644 util/interval-tree.c

diff --git a/include/qemu/interval-tree.h b/include/qemu/interval-tree.h
new file mode 100644
index 0000000000..14d364455b
--- /dev/null
+++ b/include/qemu/interval-tree.h
@@ -0,0 +1,130 @@
+/*
+ * A very simplified interval tree implementation based on GTree.
+ *
+ * Copyright 2018 Red Hat, Inc.
+ *
+ * Authors:
+ *  Peter Xu
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ */
+#ifndef INTERVAL_TREE_H
+#define INTERVAL_TREE_H
+
+/*
+ * Currently the interval tree only keeps range information; no extra
+ * user data is allowed for each element.  A benefit is that we can
+ * merge adjacent ranges internally within the tree, which can save a
+ * lot of memory when the ranges are split up but mostly
+ * contiguous.
+ *
+ * Note that the current implementation does not provide any thread
+ * protection.  Callers of the interval tree are responsible
+ * for thread safety.
+ */
+
+#include <glib.h>
+
+#define IT_OK           (0)
+#define IT_ERR_OVERLAP  (-1)
+
+typedef unsigned long long ITValue;
+typedef struct ITTree ITTree;
+typedef gboolean (*it_tree_iterator)(ITValue start, ITValue end);
+
+struct ITRange {
+    ITValue start;
+    ITValue end;
+};
+typedef struct ITRange ITRange;
+
+/**
+ * it_tree_new:
+ *
+ * Create a new interval tree.
+ *
+ * Returns: the tree pointer when succeeded, or NULL if error.
+ */
+ITTree *it_tree_new(void);
+
+/**
+ * it_tree_insert:
+ *
+ * @tree: the interval tree to insert
+ * @start: the start of range, inclusive
+ * @end: the end of range, inclusive
+ *
+ * Insert an interval range to the tree.
If it overlaps an existing
+ * range, IT_ERR_OVERLAP is returned.
+ *
+ * Return: 0 if succeeded, or <0 if error.
+ */
+int it_tree_insert(ITTree *tree, ITValue start, ITValue end);
+
+/**
+ * it_tree_remove:
+ *
+ * @tree: the interval tree to remove range from
+ * @start: the start of range, inclusive
+ * @end: the end of range, inclusive
+ *
+ * Remove a range from the tree.  The range does not need to match
+ * exactly what was inserted.  Every part of a stored range that falls
+ * within the provided range is removed from the tree.
+ *
+ * Return: 0 if succeeded, or <0 if error.
+ */
+int it_tree_remove(ITTree *tree, ITValue start, ITValue end);
+
+/**
+ * it_tree_find:
+ *
+ * @tree: the interval tree to search from
+ * @start: the start of range, inclusive
+ * @end: the end of range, inclusive
+ *
+ * Search for a range in the interval tree that overlaps with the
+ * range specified.  Only the first found range will be returned.
+ *
+ * Return: ITRange if found, or NULL if not found.  Note: the returned
+ * ITRange pointer is maintained internally.  User should only read
+ * the content but never modify or free the content.
+ */
+ITRange *it_tree_find(ITTree *tree, ITValue start, ITValue end);
+
+/**
+ * it_tree_find_value:
+ *
+ * @tree: the interval tree to search from
+ * @value: the value to find
+ *
+ * Similar to it_tree_find(), but it tries to find range (value, value).
+ *
+ * Return: same as it_tree_find().
+ */
+ITRange *it_tree_find_value(ITTree *tree, ITValue value);
+
+/**
+ * it_tree_foreach:
+ *
+ * @tree: the interval tree to iterate on
+ * @iterator: the iterator for the ranges, return true to stop
+ *
+ * Call @iterator for each range in the tree, in sorted order.
+ *
+ * Return: None.
+ */
+void it_tree_foreach(ITTree *tree, it_tree_iterator iterator);
+
+/**
+ * it_tree_destroy:
+ *
+ * @tree: the interval tree to destroy
+ *
+ * Destroy an existing interval tree.
+ *
+ * Return: None.
+ */
+void it_tree_destroy(ITTree *tree);
+
+#endif
diff --git a/util/interval-tree.c b/util/interval-tree.c
new file mode 100644
index 0000000000..3d2bb951aa
--- /dev/null
+++ b/util/interval-tree.c
@@ -0,0 +1,208 @@
+/*
+ * A very simplified interval tree implementation based on GTree.
+ *
+ * Copyright 2018 Red Hat, Inc.
+ *
+ * Authors:
+ *  Peter Xu
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ */
+
+#include <glib.h>
+#include "qemu/interval-tree.h"
+
+/*
+ * Each element of the internal tree is an ITRange.  It is shared
+ * between the key and value of the element, or we can see it as a tree
+ * with keys only but no values.
+ */
+
+struct ITTree {
+    GTree *tree;
+};
+
+static int it_tree_compare(gconstpointer a, gconstpointer b, gpointer data)
+{
+    const ITRange *r1 = a, *r2 = b;
+
+    if (r1->start > r2->end) {
+        return 1;
+    }
+
+    if (r1->end < r2->start) {
+        return -1;
+    }
+
+    /* Overlapped */
+    return 0;
+}
+
+/* Find out intersection of range A and B, put into OUT */
+static inline void it_range_and(ITRange *out, ITRange *a, ITRange *b)
+{
+    out->start = MAX(a->start, b->start);
+    out->end = MIN(a->end, b->end);
+}
+
+static inline gboolean it_range_equal(ITRange *a, ITRange *b)
+{
+    return a->start == b->start && a->end == b->end;
+}
+
+/* Whether ITRange A is superset of B?
*/ +static inline gboolean it_range_cover(ITRange *a, ITRange *b) +{ + return a->start <=3D b->start && a->end >=3D b->end; +} + +ITTree *it_tree_new(void) +{ + ITTree *ittree =3D g_new0(ITTree, 1); + + /* We don't have values actually, no need to free */ + ittree->tree =3D g_tree_new_full(it_tree_compare, NULL, g_free, NULL); + + return ittree; +} + +ITRange *it_tree_find(ITTree *tree, ITValue start, ITValue end) +{ + ITRange range; + + g_assert(tree); + + range.start =3D start; + range.end =3D end; + + return g_tree_lookup(tree->tree, &range); +} + +ITRange *it_tree_find_value(ITTree *tree, ITValue value) +{ + return it_tree_find(tree, value, value); +} + +static inline void it_tree_insert_internal(GTree *gtree, ITRange *range) +{ + /* Key and value are sharing the same range data */ + g_tree_insert(gtree, range, range); +} + +int it_tree_insert(ITTree *tree, ITValue start, ITValue end) +{ + ITRange range, *new, *overlap; + GTree *gtree; + + g_assert(tree); + g_assert(start <=3D end); + + gtree =3D tree->tree; + + range.start =3D start; + range.end =3D end; + + /* We don't allow to insert range that overlaps with existings */ + if (g_tree_lookup(gtree, &range)) { + return IT_ERR_OVERLAP; + } + + /* Merge left adjacent range */ + overlap =3D it_tree_find_value(tree, start - 1); + if (overlap) { + range.start =3D overlap->start; + g_tree_remove(gtree, overlap); + } + + /* Merge right adjacent range */ + overlap =3D it_tree_find_value(tree, end + 1); + if (overlap) { + range.end =3D overlap->end; + g_tree_remove(gtree, overlap); + } + + new =3D g_new0(ITRange, 1); + new->start =3D range.start; + new->end =3D range.end; + it_tree_insert_internal(gtree, new); + + return IT_OK; +} + +static gboolean it_tree_traverse(gpointer key, gpointer value, + gpointer data) +{ + it_tree_iterator iterator =3D data; + ITRange *range =3D key; + + g_assert(key =3D=3D value); + + return iterator(range->start, range->end); +} + +void it_tree_foreach(ITTree *tree, it_tree_iterator 
iterator) +{ + g_assert(tree && iterator); + g_tree_foreach(tree->tree, it_tree_traverse, iterator); +} + +/* Remove subset `range', which is part of `overlap'. */ +static void it_tree_remove_subset(GTree *gtree, const ITRange *overlap, + const ITRange *range) +{ + ITRange *range1, *range2; + + if (overlap->start < range->start) { + range1 =3D g_new0(ITRange, 1); + range1->start =3D overlap->start; + range1->end =3D range->start - 1; + } else { + range1 =3D NULL; + } + if (range->end < overlap->end) { + range2 =3D g_new0(ITRange, 1); + range2->start =3D range->end + 1; + range2->end =3D overlap->end; + } else { + range2 =3D NULL; + } + + g_tree_remove(gtree, overlap); + + if (range1) { + it_tree_insert_internal(gtree, range1); + } + if (range2) { + it_tree_insert_internal(gtree, range2); + } +} + +int it_tree_remove(ITTree *tree, ITValue start, ITValue end) +{ + ITRange range =3D { .start =3D start, .end =3D end }, *overlap, and; + GTree *gtree; + + g_assert(tree); + + gtree =3D tree->tree; + while ((overlap =3D g_tree_lookup(gtree, &range))) { + if (it_range_cover(overlap, &range)) { + /* Split existing range into two if needed; done */ + it_tree_remove_subset(gtree, overlap, &range); + break; + } else { + /* Remove intersection and continue */ + it_range_and(&and, overlap, &range); + g_assert(and.start <=3D and.end); + it_tree_remove_subset(gtree, overlap, &and); + } + } + + return IT_OK; +} + +void it_tree_destroy(ITTree *tree) +{ + g_tree_destroy(tree->tree); + g_free(tree); +} diff --git a/util/Makefile.objs b/util/Makefile.objs index 728c3541db..4ac33910ed 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -47,4 +47,5 @@ util-obj-y +=3D qht.o util-obj-y +=3D range.o util-obj-y +=3D stats64.o util-obj-y +=3D systemd.o +util-obj-y +=3D interval-tree.o util-obj-$(CONFIG_LINUX) +=3D vfio-helpers.o --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 
as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403764438825.0895098259464; Thu, 3 May 2018 20:16:04 -0700 (PDT) Received: from localhost ([::1]:60202 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fERCM-0007Iz-7c for importer@patchew.org; Thu, 03 May 2018 23:15:54 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33413) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5u-00020H-0Q for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:15 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5s-0005qO-EF for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:13 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45950 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5s-0005q7-8y for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:12 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E9721FA466; Fri, 4 May 2018 03:09:11 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7848111166E0; Fri, 4 May 2018 03:09:07 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:09 +0800 Message-Id: <20180504030811.28111-9-peterx@redhat.com> In-Reply-To: 
<20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:09:11 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:09:11 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 08/10] intel-iommu: maintain per-device iova ranges X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" For each VTDAddressSpace, now we maintain what IOVA ranges we have mapped and what we have not. With that information, now we only send MAP or UNMAP when necessary. Say, we don't send MAP notifies if we know we have already mapped the range, meanwhile we don't send UNMAP notifies if we know we never mapped the range at all. 
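
To make the bookkeeping concrete, here is a minimal standalone sketch of the per-entry decision described above. It is an illustration under stated assumptions, not the code in this patch: all Sketch*/sketch_* names are hypothetical, a small fixed-size array stands in for the interval tree, adjacent ranges are not merged, and none of the it_tree_* functions from this series are used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device IOVA tree: a small array of
 * disjoint, inclusive [start, end] ranges. */
#define SKETCH_MAX_RANGES 16

typedef struct {
    uint64_t start;
    uint64_t end;
} SketchRange;

typedef struct {
    SketchRange r[SKETCH_MAX_RANGES];
    size_t n;
} SketchMapped;

/* Return true if [start, end] overlaps any recorded range. */
static bool sketch_overlaps(const SketchMapped *m, uint64_t start,
                            uint64_t end)
{
    for (size_t i = 0; i < m->n; i++) {
        if (m->r[i].start <= end && start <= m->r[i].end) {
            return true;
        }
    }
    return false;
}

/* Drop [start, end] from the record, trimming or splitting ranges. */
static void sketch_remove(SketchMapped *m, uint64_t start, uint64_t end)
{
    size_t i = 0;

    while (i < m->n) {
        SketchRange cur = m->r[i];

        if (cur.end < start || cur.start > end) {
            i++;                        /* no overlap, keep as is */
        } else if (cur.start < start) {
            m->r[i].end = start - 1;    /* keep the left remainder */
            if (cur.end > end) {        /* and the right one, if any */
                assert(m->n < SKETCH_MAX_RANGES);
                m->r[m->n].start = end + 1;
                m->r[m->n].end = cur.end;
                m->n++;
            }
            i++;
        } else if (cur.end > end) {
            m->r[i].start = end + 1;    /* keep the right remainder */
            i++;
        } else {
            m->r[i] = m->r[m->n - 1];   /* fully covered: delete */
            m->n--;                     /* re-check the swapped-in slot */
        }
    }
}

/* Return true if the notifier should be called for this entry. */
static bool sketch_walk_one(SketchMapped *m, uint64_t iova,
                            uint64_t addr_mask, bool has_perm)
{
    uint64_t last = iova + addr_mask;
    bool mapped = sketch_overlaps(m, iova, last);

    if (has_perm) {
        if (mapped) {
            return false;               /* already mapped: skip MAP */
        }
        assert(m->n < SKETCH_MAX_RANGES);
        m->r[m->n].start = iova;
        m->r[m->n].end = last;
        m->n++;
        return true;
    }
    if (!mapped) {
        return false;                   /* never mapped: skip UNMAP */
    }
    sketch_remove(m, iova, last);
    return true;
}
```

In this sketch a second MAP of the same page and an UNMAP of a page that was never mapped both return false, which corresponds to the two notifications this patch suppresses.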
Signed-off-by: Peter Xu --- include/hw/i386/intel_iommu.h | 2 + hw/i386/intel_iommu.c | 73 +++++++++++++++++++++++++++++++++++ hw/i386/trace-events | 2 + 3 files changed, 77 insertions(+) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 9e0a6c1c6a..5e54f4fae2 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -27,6 +27,7 @@ #include "hw/i386/ioapic.h" #include "hw/pci/msi.h" #include "hw/sysbus.h" +#include "qemu/interval-tree.h" =20 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu" #define INTEL_IOMMU_DEVICE(obj) \ @@ -95,6 +96,7 @@ struct VTDAddressSpace { QLIST_ENTRY(VTDAddressSpace) next; /* Superset of notifier flags that this address space has */ IOMMUNotifierFlag notifier_flags; + ITTree *iova_tree; /* Traces mapped IOVA ranges */ }; =20 struct VTDBus { diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 83769f2b8c..16b3514afb 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -765,12 +765,82 @@ typedef struct { static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, vtd_page_walk_info *info) { + VTDAddressSpace *as =3D info->as; vtd_page_walk_hook hook_fn =3D info->hook_fn; void *private =3D info->private; + ITRange *mapped =3D it_tree_find(as->iova_tree, entry->iova, + entry->iova + entry->addr_mask); =20 assert(hook_fn); + + /* Update local IOVA mapped ranges */ + if (entry->perm) { + if (mapped) { + /* + * Skip since we have already mapped this range. + * + * NOTE: here we didn't solve the modify-PTE problem. For + * example: + * + * (1) map iova1 to 4K page P1 + * (2) send PSI on (iova1, iova1 + 4k) + * (3) modify iova1 to 4K page P2 + * (4) send PSI on (iova1, iova1 + 4k) + * + * Logically QEMU as emulator should atomically modify the + * shadow page table PTE to follow what has been changed. + * However here actually we will never have a way to + * emulate this "atomic PTE modification" procedure. 
+ * Because even if we know that this PTE is changed we'll + * still need to unmap it first then remap it (please + * refer to VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA + * VFIO APIs as an example). And there is no such API to + * atomically update an IOMMU PTE on the host. Then we'll + * have a small window that we still have invalid page + * mapping (while ideally we shouldn't). + * + * With our current code, on step (4) we just ignore the + * PTE update and we'll skip the map update, which will + * leave stale mappings. + * + * Modifying PTEs won't happen on general OSs (e.g., + * Linux) - it should be prohibited since the device may + * be using the page table during the modification then + * the behavior of modifying PTEs could be undefined. + * However it's still possible for other guests to do so. + * + * Let's mark this as TODO. After all it should never + * happen on general OSs like Linux/Windows/... Meanwhile + * even if the guest is running a private and problematic + * OS, the stale page (P1) will definitely still be a + * valid page for current L1 QEMU, so the worst case is + * that L1 guest will be at risk on its own on that single + * device only. It'll never harm the rest of L1 guest, + * the host memory or other guests on the host. + * + * Let's solve this once-and-for-all when we really + * need to, and when we are capable of it (for now, we + * can't). Though I would suspect that won't happen soon. 
+ */ + trace_vtd_page_walk_one_skip_map(entry->iova, entry->addr_mask, + mapped->start, mapped->end); + return 0; + } + it_tree_insert(as->iova_tree, entry->iova, + entry->iova + entry->addr_mask); + } else { + if (!mapped) { + /* Skip since we didn't map this range at all */ + trace_vtd_page_walk_one_skip_unmap(entry->iova, entry->addr_ma= sk); + return 0; + } + it_tree_remove(as->iova_tree, entry->iova, + entry->iova + entry->addr_mask); + } + trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, entry->addr_mask, entry->perm); + return hook_fn(entry, private); } =20 @@ -2804,6 +2874,7 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, = PCIBus *bus, int devfn) vtd_dev_as->devfn =3D (uint8_t)devfn; vtd_dev_as->iommu_state =3D s; vtd_dev_as->context_cache_entry.context_cache_gen =3D 0; + vtd_dev_as->iova_tree =3D it_tree_new(); =20 /* * Memory region relationships looks like (Address range shows @@ -2900,6 +2971,8 @@ static void vtd_address_space_unmap(VTDAddressSpace *= as, IOMMUNotifier *n) VTD_PCI_FUNC(as->devfn), entry.iova, size); =20 + it_tree_remove(as->iova_tree, entry.iova, entry.iova + entry.addr_mask= ); + memory_region_notify_one(n, &entry); } =20 diff --git a/hw/i386/trace-events b/hw/i386/trace-events index 22d44648af..677f83420d 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -40,6 +40,8 @@ vtd_replay_ce_valid(uint8_t bus, uint8_t dev, uint8_t fn,= uint16_t domain, uint6 vtd_replay_ce_invalid(uint8_t bus, uint8_t dev, uint8_t fn) "replay invali= d context device %02"PRIx8":%02"PRIx8".%02"PRIx8 vtd_page_walk_level(uint64_t addr, uint32_t level, uint64_t start, uint64_= t end) "walk (base=3D0x%"PRIx64", level=3D%"PRIu32") iova range 0x%"PRIx64"= - 0x%"PRIx64 vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t gpa, uint64_t ma= sk, int perm) "detected page level 0x%"PRIx32" iova 0x%"PRIx64" -> gpa 0x%"= PRIx64" mask 0x%"PRIx64" perm %d" +vtd_page_walk_one_skip_map(uint64_t iova, uint64_t mask, uint64_t start, u= 
int64_t end) "iova 0x%"PRIx64" mask 0x%"PRIx64" start 0x%"PRIx64" end 0x%"P= RIx64 +vtd_page_walk_one_skip_unmap(uint64_t iova, uint64_t mask) "iova 0x%"PRIx6= 4" mask 0x%"PRIx64 vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403794421364.9010824629112; Thu, 3 May 2018 20:16:34 -0700 (PDT) Received: from localhost ([::1]:60205 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fERCz-0007h3-Kx for importer@patchew.org; Thu, 03 May 2018 23:16:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33441) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5x-000244-Rf for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5w-0005uH-Q6 for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:17 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:45960 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 
1fER5w-0005t5-ME for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:16 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6B784EFDFA; Fri, 4 May 2018 03:09:16 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8957211166E0; Fri, 4 May 2018 03:09:12 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:10 +0800 Message-Id: <20180504030811.28111-10-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:09:16 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.1]); Fri, 04 May 2018 03:09:16 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 09/10] intel-iommu: don't unmap all for shadow page table X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . 
Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" IOMMU replay has been carried out in many use cases, e.g., context cache invalidations and domain flushes. We used this mechanism to sync the shadow page table by (1) unmapping the whole address space, then (2) walking the page table to remap what's in the table. This is very dangerous. The problem is that there is a very small window (in my measurement, it can be about 3ms) between steps (1) and (2) above during which the device sees no (or an incomplete) device page table. However, the device never knows that. This can cause DMA errors on devices, which assume the page table is always there. So the point is that MAP typed notifiers (vfio-pci, for example) need the mapped page entries to always be there. We can never unmap existing page entries the way we did in (1) above. The only solution is to remove step (1). We couldn't do that before since we didn't know which device pages were mapped and which were not, so we unmapped them all. Now with the new IOVA tree QEMU knows what has been mapped and what has not. We don't need step (1) any more. Remove it. Note that after removing that global unmap flush, we now need to send unmap notifications during page walks. This should fix the DMA error problem that Jintack Lim reported with nested device assignment. The problem does not always happen, e.g., I cannot reproduce the error. However, the collected logs show that this is the possible cause of Jintack's problem.
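The difference between the old and the new replay can be sketched with a toy model. This is only an illustration under simplifying assumptions: pages are slots in a small boolean array, and all names (SketchStats, sketch_replay_unmap_all, sketch_replay_sync) are hypothetical rather than taken from intel_iommu.c. The touched_live counter records how often a page that the guest still maps was torn down, which is the window described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGES 8

typedef struct {
    int maps;          /* MAP notifications sent */
    int unmaps;        /* UNMAP notifications sent */
    int touched_live;  /* UNMAPs that hit a page the guest still maps */
} SketchStats;

/* Old scheme: (1) unmap everything, (2) remap what the guest table has. */
static SketchStats sketch_replay_unmap_all(bool shadow[SKETCH_PAGES],
                                           const bool guest[SKETCH_PAGES])
{
    SketchStats s = { 0, 0, 0 };

    for (size_t i = 0; i < SKETCH_PAGES; i++) {
        if (shadow[i]) {
            s.unmaps++;
            if (guest[i]) {
                s.touched_live++;   /* device briefly loses a valid page */
            }
            shadow[i] = false;
        }
    }
    for (size_t i = 0; i < SKETCH_PAGES; i++) {
        if (guest[i]) {
            shadow[i] = true;
            s.maps++;
        }
    }
    return s;
}

/* New scheme: one walk, notify only where shadow and guest differ. */
static SketchStats sketch_replay_sync(bool shadow[SKETCH_PAGES],
                                      const bool guest[SKETCH_PAGES])
{
    SketchStats s = { 0, 0, 0 };

    for (size_t i = 0; i < SKETCH_PAGES; i++) {
        if (guest[i] && !shadow[i]) {
            shadow[i] = true;
            s.maps++;
        } else if (!guest[i] && shadow[i]) {
            shadow[i] = false;
            s.unmaps++;
        }
        /* Pages present on both sides are left alone. */
    }
    return s;
}
```

Both schemes end with the same shadow state; only the second one gets there without ever unmapping a page that stays valid.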
Reported-by: Jintack Lim Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 16b3514afb..9439103cac 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -3003,10 +3003,8 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iomm= u_mr, IOMMUNotifier *n) =20 /* * The replay can be triggered by either a invalidation or a newly - * created entry. No matter what, we release existing mappings - * (it means flushing caches for UNMAP-only registers). + * created entry. */ - vtd_address_space_unmap(vtd_as, n); =20 if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) =3D=3D 0) { trace_vtd_replay_ce_valid(bus_n, PCI_SLOT(vtd_as->devfn), @@ -3015,8 +3013,10 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iomm= u_mr, IOMMUNotifier *n) ce.hi, ce.lo); if (vtd_as_notify_mappings(vtd_as)) { /* This is required only for MAP typed notifiers */ - vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, + vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, true, vtd_as); + } else { + vtd_address_space_unmap(vtd_as, n); } } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403888935597.3054129425237; Thu, 3 May 2018 20:18:08 -0700 (PDT) Received: from localhost ([::1]:60211 
helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fEREW-0000N9-77 for importer@patchew.org; Thu, 03 May 2018 23:18:08 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33542) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER6C-0002ID-SO for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:34 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER68-00061f-LV for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:32 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:39294 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER68-00061R-FH for qemu-devel@nongnu.org; Thu, 03 May 2018 23:09:28 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 201FB818592A; Fri, 4 May 2018 03:09:28 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1179F11166E0; Fri, 4 May 2018 03:09:17 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:11 +0800 Message-Id: <20180504030811.28111-11-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Fri, 04 May 2018 03:09:28 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Fri, 04 May 2018 03:09:28 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] 
X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 10/10] intel-iommu: remove notify_unmap for page walk X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Now after previous changes, we will always pass that parameter as true now. Remove it. After removing that variable, now we can greatly simplify the page walking logic. No functional change at all. Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 70 +++++++++++++++++-------------------------- hw/i386/trace-events | 1 - 2 files changed, 28 insertions(+), 43 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 9439103cac..c156235135 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -759,7 +759,6 @@ typedef struct { VTDAddressSpace *as; vtd_page_walk_hook hook_fn; void *private; - bool notify_unmap; } vtd_page_walk_info; =20 static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, @@ -900,45 +899,34 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, */ entry_valid =3D read_cur | write_cur; =20 - entry.target_as =3D &address_space_memory; - entry.iova =3D iova & subpage_mask; - entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); - entry.addr_mask =3D ~subpage_mask; - - if (vtd_is_last_slpte(slpte, level)) { - /* NOTE: this is only meaningful if entry_valid =3D=3D true */ - entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); - if (!entry_valid && !info->notify_unmap) { - trace_vtd_page_walk_skip_perm(iova, iova_next); - goto next; - } - ret =3D vtd_page_walk_one(&entry, level, info); - if (ret < 0) { - 
return ret; - } - } else { - if (!entry_valid) { - if (info->notify_unmap) { - /* - * The whole entry is invalid; unmap it all. - * Translated address is meaningless, zero it. - */ - entry.translated_addr =3D 0x0; - ret =3D vtd_page_walk_one(&entry, level, info); - if (ret < 0) { - return ret; - } - } else { - trace_vtd_page_walk_skip_perm(iova, iova_next); - } - goto next; - } + if (!vtd_is_last_slpte(slpte, level) && entry_valid) { + /* + * This is a valid PDE (or even bigger than PDE). We need + * to walk one further level. + */ ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iova, MIN(iova_next, end), level - 1, read_cur, write_cur, info); - if (ret < 0) { - return ret; - } + } else { + /* + * This means we are either: + * + * (1) the real page entry (either 4K page, or huge page) + * (2) the whole range is invalid + * + * In either case, we send an IOTLB notification down. + */ + entry.target_as =3D &address_space_memory; + entry.iova =3D iova & subpage_mask; + entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); + entry.addr_mask =3D ~subpage_mask; + /* NOTE: this is only meaningful if entry_valid =3D=3D true */ + entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); + ret =3D vtd_page_walk_one(&entry, level, info); + } + + if (ret < 0) { + return ret; } =20 next: @@ -960,7 +948,7 @@ next: */ static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end, vtd_page_walk_hook hook_fn, void *private, - bool notify_unmap, VTDAddressSpace *as) + VTDAddressSpace *as) { dma_addr_t addr =3D vtd_ce_get_slpt_base(ce); uint32_t level =3D vtd_ce_get_level(ce); @@ -968,7 +956,6 @@ static int vtd_page_walk(VTDContextEntry *ce, uint64_t = start, uint64_t end, vtd_page_walk_info info =3D { .hook_fn =3D hook_fn, .private =3D private, - .notify_unmap =3D notify_unmap, .as =3D as, }; =20 @@ -1542,7 +1529,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOM= MUState *s, */ vtd_page_walk(&ce, addr, addr + size, vtd_page_invalidate_notify_hook, - 
(void *)&vtd_as->iommu, true, vtd_as); + (void *)&vtd_as->iommu, vtd_as); } else { /* * For UNMAP-only notifiers, we don't need to walk the @@ -3013,8 +3000,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu= _mr, IOMMUNotifier *n) ce.hi, ce.lo); if (vtd_as_notify_mappings(vtd_as)) { /* This is required only for MAP typed notifiers */ - vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, true, - vtd_as); + vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, vtd_a= s); } else { vtd_address_space_unmap(vtd_as, n); } diff --git a/hw/i386/trace-events b/hw/i386/trace-events index 677f83420d..eb23ace5fb 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -43,7 +43,6 @@ vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t= gpa, uint64_t mask, in vtd_page_walk_one_skip_map(uint64_t iova, uint64_t mask, uint64_t start, u= int64_t end) "iova 0x%"PRIx64" mask 0x%"PRIx64" start 0x%"PRIx64" end 0x%"P= RIx64 vtd_page_walk_one_skip_unmap(uint64_t iova, uint64_t mask) "iova 0x%"PRIx6= 4" mask 0x%"PRIx64 vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" -vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "= Device %02x:%02x.%x switching address space (iommu enabled=3D%d)" vtd_as_unmap_whole(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, u= int64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64 --=20 2.17.0 From nobody Wed Oct 29 17:29:48 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; 
Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525764622488308.9473865861928; Tue, 8 May 2018 00:30:22 -0700 (PDT) Received: from localhost ([::1]:49735 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fFx4i-0003k5-60 for importer@patchew.org; Tue, 08 May 2018 03:30:16 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:34272) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fFx3t-0003RF-8a for qemu-devel@nongnu.org; Tue, 08 May 2018 03:29:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fFx3q-000799-3b for qemu-devel@nongnu.org; Tue, 08 May 2018 03:29:25 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:49242 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fFx3p-00078d-R1 for qemu-devel@nongnu.org; Tue, 08 May 2018 03:29:22 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.rdu2.redhat.com [10.11.54.4]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 60E65818F6E5; Tue, 8 May 2018 07:29:15 +0000 (UTC) Received: from xz-mi.nay.redhat.com (dhcp-14-151.nay.redhat.com [10.66.14.151]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7AEB52022EE0; Tue, 8 May 2018 07:29:08 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Tue, 8 May 2018 15:29:07 +0800 Message-Id: <20180508072907.18959-1-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.4 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 
(mx1.redhat.com [10.11.55.8]); Tue, 08 May 2018 07:29:15 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Tue, 08 May 2018 07:29:15 +0000 (UTC) for IP:'10.11.54.4' DOMAIN:'int-mx04.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 11/10] tests: add interval tree unit test X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , Peter Xu , Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Peter Xu --- tests/test-interval-tree.c | 190 +++++++++++++++++++++++++++++++++++++ tests/Makefile.include | 2 + 2 files changed, 192 insertions(+) create mode 100644 tests/test-interval-tree.c diff --git a/tests/test-interval-tree.c b/tests/test-interval-tree.c new file mode 100644 index 0000000000..a0e3decca0 --- /dev/null +++ b/tests/test-interval-tree.c @@ -0,0 +1,190 @@ +/* + * Interval tree tests + * + * Copyright Red Hat, Inc. 2018 + * + * Authors: + * Peter Xu , + * + * This work is licensed under the terms of the GNU LGPL, version 2 or lat= er. + * See the COPYING.LIB file in the top-level directory. 
+ */ + +#include "qemu/osdep.h" +#include "qemu/interval-tree.h" + +static ITRange ranges[2]; +static int range_i; + +static void ranges_reset(void) +{ + memset(&ranges, 0, sizeof(ranges)); + range_i =3D 0; +} + +static gboolean ranges_iterate(ITValue start, ITValue end) +{ + g_assert(range_i < ARRAY_SIZE(ranges)); + ranges[range_i].start =3D start; + ranges[range_i].end =3D end; + range_i++; + return FALSE; +} + +static void ranges_check(void) +{ + g_assert(range_i =3D=3D 2); + g_assert(ranges[0].start =3D=3D 10 && ranges[0].end =3D=3D 19); + g_assert(ranges[1].start =3D=3D 30 && ranges[1].end =3D=3D 39); +} + +static void test_interval_tree_common(void) +{ + int ret; + ITTree *tree =3D it_tree_new(); + ITRange *range; + + g_assert(tree); + + /* Test insertion */ + ret =3D it_tree_insert(tree, 10, 19); + g_assert(ret =3D=3D 0); + ret =3D it_tree_insert(tree, 30, 39); + g_assert(ret =3D=3D 0); + ret =3D it_tree_insert(tree, 15, 19); + g_assert(ret =3D=3D IT_ERR_OVERLAP); + ret =3D it_tree_insert(tree, 0, 99); + g_assert(ret =3D=3D IT_ERR_OVERLAP); + + /* Test searching */ + range =3D it_tree_find(tree, 0, 9); + g_assert(range =3D=3D NULL); + range =3D it_tree_find(tree, 10, 19); + g_assert(range->start =3D=3D 10 && range->end =3D=3D 19); + range =3D it_tree_find_value(tree, 15); + g_assert(range->start =3D=3D 10 && range->end =3D=3D 19); + range =3D it_tree_find(tree, 15, 99); + g_assert(range->start =3D=3D 10 && range->end =3D=3D 19); + range =3D it_tree_find_value(tree, 35); + g_assert(range->start =3D=3D 30 && range->end =3D=3D 39); + + /* Test iterations */ + ranges_reset(); + it_tree_foreach(tree, ranges_iterate); + ranges_check(); + + /* Remove one of them */ + ret =3D it_tree_remove(tree, 10, 19); + g_assert(ret =3D=3D 0); + g_assert(!it_tree_find(tree, 10, 19)); + g_assert(it_tree_find(tree, 30, 39)); + + it_tree_destroy(tree); +} + +static void test_interval_tree_merging(void) +{ + int ret; + ITTree *tree =3D it_tree_new(); + ITRange *range; + + 
g_assert(tree); + + ret =3D it_tree_insert(tree, 10, 19); + g_assert(ret =3D=3D 0); + ret =3D it_tree_insert(tree, 30, 39); + g_assert(ret =3D=3D 0); + + /* Test left side merging */ + ret =3D it_tree_insert(tree, 40, 59); + g_assert(ret =3D=3D 0); + range =3D it_tree_find(tree, 30, 39); + g_assert(range->start =3D=3D 30 && range->end =3D=3D 59); + + /* Test right side merging */ + ret =3D it_tree_insert(tree, 0, 9); + g_assert(ret =3D=3D 0); + range =3D it_tree_find(tree, 10, 19); + g_assert(range->start =3D=3D 0 && range->end =3D=3D 19); + + /* Test bidirectional merging */ + ret =3D it_tree_insert(tree, 20, 29); + g_assert(ret =3D=3D 0); + range =3D it_tree_find(tree, 20, 29); + g_assert(range->start =3D=3D 0 && range->end =3D=3D 59); + range =3D it_tree_find(tree, 0, 29); + g_assert(range->start =3D=3D 0 && range->end =3D=3D 59); + range =3D it_tree_find(tree, 40, 45); + g_assert(range->start =3D=3D 0 && range->end =3D=3D 59); + + it_tree_destroy(tree); +} + +static void test_interval_tree_removal(void) +{ + int ret; + ITTree *tree =3D it_tree_new(); + ITRange *range; + + g_assert(tree); + + ret =3D it_tree_insert(tree, 10, 19); + g_assert(ret =3D=3D 0); + ret =3D it_tree_insert(tree, 30, 39); + g_assert(ret =3D=3D 0); + + /* + * Remove some useless areas, which should not remove any existing + * ranges in the tree + */ + ret =3D it_tree_remove(tree, 0, 9); + g_assert(ret =3D=3D 0); + ret =3D it_tree_remove(tree, 50, 99); + g_assert(ret =3D=3D 0); + ret =3D it_tree_remove(tree, 20, 29); + g_assert(ret =3D=3D 0); + /* Make sure the elements are not removed */ + g_assert(it_tree_find(tree, 10, 19)); + g_assert(it_tree_find(tree, 30, 39)); + + /* Remove left subset of a range */ + ret =3D it_tree_remove(tree, 0, 14); + g_assert(ret =3D=3D 0); + range =3D it_tree_find(tree, 10, 19); + g_assert(range->start =3D=3D 15 && range->end =3D=3D 19); + it_tree_insert(tree, 10, 15); + + /* Remove right subset of a range */ + ret =3D it_tree_remove(tree, 35, 45); + 
g_assert(ret =3D=3D 0); + range =3D it_tree_find(tree, 30, 39); + g_assert(range->start =3D=3D 30 && range->end =3D=3D 34); + it_tree_insert(tree, 35, 39); + + /* Remove covers more than one range */ + ret =3D it_tree_remove(tree, 0, 40); + g_assert(ret =3D=3D 0); + g_assert(!it_tree_find(tree, 10, 19)); + g_assert(!it_tree_find(tree, 30, 39)); + it_tree_insert(tree, 10, 19); + it_tree_insert(tree, 30, 39); + + /* Remove in the middle */ + ret =3D it_tree_remove(tree, 12, 16); + g_assert(ret =3D=3D 0); + range =3D it_tree_find_value(tree, 10); + g_assert(range->start =3D=3D 10 && range->end =3D=3D 11); + range =3D it_tree_find_value(tree, 17); + g_assert(range->start =3D=3D 17 && range->end =3D=3D 19); + + it_tree_destroy(tree); +} + +int main(int argc, char *argv[]) +{ + g_test_init(&argc, &argv, NULL); + g_test_add_func("/interval-tree/common", test_interval_tree_common); + g_test_add_func("/interval-tree/merging", test_interval_tree_merging); + g_test_add_func("/interval-tree/removal", test_interval_tree_removal); + return g_test_run(); +} diff --git a/tests/Makefile.include b/tests/Makefile.include index 3b9a5e31a2..db1e6c93db 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -169,6 +169,7 @@ check-unit-y +=3D tests/ptimer-test$(EXESUF) gcov-files-ptimer-test-y =3D hw/core/ptimer.c check-unit-y +=3D tests/test-qapi-util$(EXESUF) gcov-files-test-qapi-util-y =3D qapi/qapi-util.c +check-unit-y +=3D tests/test-interval-tree$(EXESUF) =20 check-block-$(CONFIG_POSIX) +=3D tests/qemu-iotests-quick.sh =20 @@ -642,6 +643,7 @@ tests/test-qht-par$(EXESUF): tests/test-qht-par.o tests= /qht-bench$(EXESUF) $(tes tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y) tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-ob= j-y) tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-= y) +tests/test-interval-tree$(EXESUF): tests/test-interval-tree.o $(test-util-= obj-y) =20 tests/test-qdev-global-props$(EXESUF): 
tests/test-qdev-global-props.o \ hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\ --=20 2.17.0
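For readers of the unit test above, the range arithmetic it expects can be sketched on a single inclusive range. This is a hedged illustration only, not the interval tree implementation from this series: SketchSpan, sketch_span_remove and sketch_spans_adjacent are hypothetical names, and the behaviour is inferred from the assertions in the test (removal may leave zero, one or two pieces; adjacent ranges merge on insert).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t start;
    uint64_t end;   /* inclusive */
} SketchSpan;

/*
 * Remove [rs, re] from span r.  Leftover pieces are written to out[]
 * in ascending order; the return value is how many pieces remain.
 */
static int sketch_span_remove(SketchSpan r, uint64_t rs, uint64_t re,
                              SketchSpan out[2])
{
    int n = 0;

    if (re < r.start || rs > r.end) {
        out[n++] = r;               /* no overlap: span is untouched */
        return n;
    }
    if (rs > r.start) {             /* a left piece survives */
        out[n].start = r.start;
        out[n].end = rs - 1;
        n++;
    }
    if (re < r.end) {               /* a right piece survives */
        out[n].start = re + 1;
        out[n].end = r.end;
        n++;
    }
    return n;
}

/* True if b starts right after a ends, so the two could be merged. */
static bool sketch_spans_adjacent(SketchSpan a, SketchSpan b)
{
    return a.end != UINT64_MAX && a.end + 1 == b.start;
}
```

Removing [12, 16] from [10, 19] leaves [10, 11] and [17, 19], matching the "Remove in the middle" case, and [30, 39] followed by [40, 59] is adjacent, matching the left side merging case.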