From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526547732269999.6306775643145; Thu, 17 May 2018 02:02:12 -0700 (PDT) Received: from localhost ([::1]:43793 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEnT-0001rB-AL for importer@patchew.org; Thu, 17 May 2018 05:02:03 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59132) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEl7-0000hz-97 for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:38 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJEl6-0006cG-CT for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:37 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:55674 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJEl6-0006c7-77 for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:36 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A2B5D808255C; Thu, 17 May 2018 08:59:35 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id E871C215CDA7; Thu, 17 May 2018 08:59:32 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:16 +0800 Message-Id: <20180517085927.24925-2-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 17 May 2018 08:59:35 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 17 May 2018 08:59:35 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 01/12] intel-iommu: send PSI always even if across PDEs X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" During IOVA page table walking, there is a special case when the PSI covers one whole PDE (Page Directory Entry, which contains 512 Page Table Entries) or more. In the past, we skip that entry and we don't notify the IOMMU notifiers. This is not correct. We should send UNMAP notification to registered UNMAP notifiers in this case. For UNMAP only notifiers, this might cause IOTLBs cached in the devices even if they were already invalid. For MAP/UNMAP notifiers like vfio-pci, this will cause stale page mappings. This special case doesn't trigger often, but it is very easy to be triggered by nested device assignments, since in that case we'll possibly map the whole L2 guest RAM region into the device's IOVA address space (several GBs at least), which is far bigger than normal kernel driver usages of the device (tens of MBs normally). Without this patch applied to L1 QEMU, nested device assignment to L2 guests will dump some errors like: qemu-system-x86_64: VFIO_MAP_DMA: -17 qemu-system-x86_64: vfio_dma_map(0x557305420c30, 0xad000, 0x1000, 0x7f89a920d000) =3D -17 (File exists) Acked-by: Jason Wang [peterx: rewrite the commit message] Signed-off-by: Peter Xu Reviewed-by: Eric Auger Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 12 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index fb31de9416..b359efd6f9 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -722,6 +722,15 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint= 64_t iova, bool is_write, =20 typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private); =20 +static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, + vtd_page_walk_hook hook_fn, void *private) +{ + assert(hook_fn); + trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, + entry->addr_mask, entry->perm); + return hook_fn(entry, private); +} + /** * vtd_page_walk_level - walk over specific level for IOVA range * @@ -781,28 +790,37 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, */ entry_valid =3D read_cur | write_cur; =20 + entry.target_as =3D &address_space_memory; + entry.iova =3D iova & subpage_mask; + entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); + entry.addr_mask =3D ~subpage_mask; + if (vtd_is_last_slpte(slpte, level)) { - entry.target_as =3D &address_space_memory; - entry.iova =3D iova & subpage_mask; /* NOTE: this is only meaningful if entry_valid =3D=3D true */ entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); - entry.addr_mask =3D ~subpage_mask; - entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); if (!entry_valid && !notify_unmap) { trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; } - trace_vtd_page_walk_one(level, entry.iova, entry.translated_ad= dr, - entry.addr_mask, entry.perm); - if (hook_fn) { - ret =3D hook_fn(&entry, private); - if (ret < 0) { - return ret; - } + ret =3D vtd_page_walk_one(&entry, level, hook_fn, private); + if (ret < 0) { + return ret; } } else { if (!entry_valid) { - trace_vtd_page_walk_skip_perm(iova, iova_next); + if (notify_unmap) { + /* + * The whole entry is invalid; unmap it all. + * Translated address is meaningless, zero it. + */ + entry.translated_addr =3D 0x0; + ret =3D vtd_page_walk_one(&entry, level, hook_fn, priv= ate); + if (ret < 0) { + return ret; + } + } else { + trace_vtd_page_walk_skip_perm(iova, iova_next); + } goto next; } ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iov= a, --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526547930522448.8759743941604; Thu, 17 May 2018 02:05:30 -0700 (PDT) Received: from localhost ([::1]:43811 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEqn-0004hu-JR for importer@patchew.org; Thu, 17 May 2018 05:05:29 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59143) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElA-0000k7-Vc for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:42 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJEl9-0006dg-T8 for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:41 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:34688 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJEl9-0006d4-Kj for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:39 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E2578401EF12; Thu, 17 May 2018 08:59:38 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3781C215CDA7; Thu, 17 May 2018 08:59:35 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:17 +0800 Message-Id: <20180517085927.24925-3-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Thu, 17 May 2018 08:59:38 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Thu, 17 May 2018 08:59:38 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 02/12] intel-iommu: remove IntelIOMMUNotifierNode X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" That is not really necessary. Removing that node struct and put the list entry directly into VTDAddressSpace. It simplfies the code a lot. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- include/hw/i386/intel_iommu.h | 9 ++------ hw/i386/intel_iommu.c | 41 ++++++++++------------------------- 2 files changed, 14 insertions(+), 36 deletions(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 45ec8919b6..220697253f 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -67,7 +67,6 @@ typedef union VTD_IR_TableEntry VTD_IR_TableEntry; typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress; typedef struct VTDIrq VTDIrq; typedef struct VTD_MSIMessage VTD_MSIMessage; -typedef struct IntelIOMMUNotifierNode IntelIOMMUNotifierNode; =20 /* Context-Entry */ struct VTDContextEntry { @@ -93,6 +92,7 @@ struct VTDAddressSpace { MemoryRegion iommu_ir; /* Interrupt region: 0xfeeXXXXX */ IntelIOMMUState *iommu_state; VTDContextCacheEntry context_cache_entry; + QLIST_ENTRY(VTDAddressSpace) next; }; =20 struct VTDBus { @@ -253,11 +253,6 @@ struct VTD_MSIMessage { /* When IR is enabled, all MSI/MSI-X data bits should be zero */ #define VTD_IR_MSI_DATA (0) =20 -struct IntelIOMMUNotifierNode { - VTDAddressSpace *vtd_as; - QLIST_ENTRY(IntelIOMMUNotifierNode) next; -}; - /* The iommu (DMAR) device state struct */ struct IntelIOMMUState { X86IOMMUState x86_iommu; @@ -295,7 +290,7 @@ struct IntelIOMMUState { GHashTable *vtd_as_by_busptr; /* VTDBus objects indexed by PCIBus* r= eference */ VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed = by bus number */ /* list of registered notifiers */ - QLIST_HEAD(, IntelIOMMUNotifierNode) notifiers_list; + QLIST_HEAD(, VTDAddressSpace) notifiers_list; =20 /* interrupt remapping */ bool intr_enabled; /* Whether guest enabled IR */ diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index b359efd6f9..5987b48d43 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1248,10 +1248,10 @@ static void vtd_interrupt_remap_table_setup(IntelIO= MMUState *s) =20 static void vtd_iommu_replay_all(IntelIOMMUState *s) { - IntelIOMMUNotifierNode *node; + VTDAddressSpace *vtd_as; =20 - QLIST_FOREACH(node, &s->notifiers_list, next) { - memory_region_iommu_replay_all(&node->vtd_as->iommu); + QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { + memory_region_iommu_replay_all(&vtd_as->iommu); } } =20 @@ -1372,7 +1372,6 @@ static void vtd_iotlb_global_invalidate(IntelIOMMUSta= te *s) =20 static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domai= n_id) { - IntelIOMMUNotifierNode *node; VTDContextEntry ce; VTDAddressSpace *vtd_as; =20 @@ -1381,8 +1380,7 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUSta= te *s, uint16_t domain_id) g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_domain, &domain_id); =20 - QLIST_FOREACH(node, &s->notifiers_list, next) { - vtd_as =3D node->vtd_as; + QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce) && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { @@ -1402,12 +1400,11 @@ static void vtd_iotlb_page_invalidate_notify(IntelI= OMMUState *s, uint16_t domain_id, hwaddr addr, uint8_t am) { - IntelIOMMUNotifierNode *node; + VTDAddressSpace *vtd_as; VTDContextEntry ce; int ret; =20 - QLIST_FOREACH(node, &(s->notifiers_list), next) { - VTDAddressSpace *vtd_as =3D node->vtd_as; + QLIST_FOREACH(vtd_as, &(s->notifiers_list), next) { ret =3D vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce); if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { @@ -2344,8 +2341,6 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemory= Region *iommu, { VTDAddressSpace *vtd_as =3D container_of(iommu, VTDAddressSpace, iommu= ); IntelIOMMUState *s =3D vtd_as->iommu_state; - IntelIOMMUNotifierNode *node =3D NULL; - IntelIOMMUNotifierNode *next_node =3D NULL; =20 if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) { error_report("We need to set caching-mode=3D1 for intel-iommu to e= nable " @@ -2354,21 +2349,11 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemo= ryRegion *iommu, } =20 if (old =3D=3D IOMMU_NOTIFIER_NONE) { - node =3D g_malloc0(sizeof(*node)); - node->vtd_as =3D vtd_as; - QLIST_INSERT_HEAD(&s->notifiers_list, node, next); - return; - } - - /* update notifier node with new flags */ - QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) { - if (node->vtd_as =3D=3D vtd_as) { - if (new =3D=3D IOMMU_NOTIFIER_NONE) { - QLIST_REMOVE(node, next); - g_free(node); - } - return; - } + /* Insert new ones */ + QLIST_INSERT_HEAD(&s->notifiers_list, vtd_as, next); + } else if (new =3D=3D IOMMU_NOTIFIER_NONE) { + /* Remove old ones */ + QLIST_REMOVE(vtd_as, next); } } =20 @@ -2838,12 +2823,10 @@ static void vtd_address_space_unmap(VTDAddressSpace= *as, IOMMUNotifier *n) =20 static void vtd_address_space_unmap_all(IntelIOMMUState *s) { - IntelIOMMUNotifierNode *node; VTDAddressSpace *vtd_as; IOMMUNotifier *n; =20 - QLIST_FOREACH(node, &s->notifiers_list, next) { - vtd_as =3D node->vtd_as; + QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) { vtd_address_space_unmap(vtd_as, n); } --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 152654793044861.46571411773243; Thu, 17 May 2018 02:05:30 -0700 (PDT) Received: from localhost ([::1]:43812 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEqn-0004is-KT for importer@patchew.org; Thu, 17 May 2018 05:05:29 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59163) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElF-0000o5-Vh for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElC-0006hO-Sk for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:46 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51802 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElC-0006gR-NI for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:42 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 358C3407572D; Thu, 17 May 2018 08:59:42 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7A7BE2166BAD; Thu, 17 May 2018 08:59:39 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:18 +0800 Message-Id: <20180517085927.24925-4-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 08:59:42 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 08:59:42 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 03/12] intel-iommu: add iommu lock X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Add a per-iommu big lock to protect IOMMU status. Currently the only thing to be protected is the IOTLB/context cache, since that can be accessed even without BQL, e.g., in IO dataplane. Note that we don't need to protect device page tables since that's fully controlled by the guest kernel. However there is still possibility that malicious drivers will program the device to not obey the rule. In that case QEMU can't really do anything useful, instead the guest itself will be responsible for all uncertainties. Reported-by: Fam Zheng Signed-off-by: Peter Xu Tested-by: Jintack Lim --- include/hw/i386/intel_iommu.h | 6 +++++ hw/i386/intel_iommu.c | 43 +++++++++++++++++++++++++++++++---- 2 files changed, 44 insertions(+), 5 deletions(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 220697253f..ee517704e7 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -300,6 +300,12 @@ struct IntelIOMMUState { OnOffAuto intr_eim; /* Toggle for EIM cabability */ bool buggy_eim; /* Force buggy EIM unless eim=3Doff */ uint8_t aw_bits; /* Host/IOVA address width (in bits) */ + + /* + * Protects IOMMU states in general. Currently it protects the + * per-IOMMU IOTLB cache, and context entry cache in VTDAddressSpace. + */ + QemuMutex iommu_lock; }; =20 /* Find the VTD Address space associated with the given bus pointer, diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 5987b48d43..112971638d 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -128,6 +128,16 @@ static uint64_t vtd_set_clear_mask_quad(IntelIOMMUStat= e *s, hwaddr addr, return new_val; } =20 +static inline void vtd_iommu_lock(IntelIOMMUState *s) +{ + qemu_mutex_lock(&s->iommu_lock); +} + +static inline void vtd_iommu_unlock(IntelIOMMUState *s) +{ + qemu_mutex_unlock(&s->iommu_lock); +} + /* GHashTable functions */ static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2) { @@ -172,7 +182,7 @@ static gboolean vtd_hash_remove_by_page(gpointer key, g= pointer value, } =20 /* Reset all the gen of VTDAddressSpace to zero and set the gen of - * IntelIOMMUState to 1. + * IntelIOMMUState to 1. Must be with IOMMU lock held. */ static void vtd_reset_context_cache(IntelIOMMUState *s) { @@ -197,12 +207,19 @@ static void vtd_reset_context_cache(IntelIOMMUState *= s) s->context_cache_gen =3D 1; } =20 -static void vtd_reset_iotlb(IntelIOMMUState *s) +static void vtd_reset_iotlb_locked(IntelIOMMUState *s) { assert(s->iotlb); g_hash_table_remove_all(s->iotlb); } =20 +static void vtd_reset_iotlb(IntelIOMMUState *s) +{ + vtd_iommu_lock(s); + vtd_reset_iotlb_locked(s); + vtd_iommu_unlock(s); +} + static uint64_t vtd_get_iotlb_key(uint64_t gfn, uint16_t source_id, uint32_t level) { @@ -215,6 +232,7 @@ static uint64_t vtd_get_iotlb_gfn(hwaddr addr, uint32_t= level) return (addr & vtd_slpt_level_page_mask(level)) >> VTD_PAGE_SHIFT_4K; } =20 +/* Must be with IOMMU lock held */ static VTDIOTLBEntry *vtd_lookup_iotlb(IntelIOMMUState *s, uint16_t source= _id, hwaddr addr) { @@ -235,6 +253,7 @@ out: return entry; } =20 +/* Must be with IOMMU lock held */ static void vtd_update_iotlb(IntelIOMMUState *s, uint16_t source_id, uint16_t domain_id, hwaddr addr, uint64_t slp= te, uint8_t access_flags, uint32_t level) @@ -246,7 +265,7 @@ static void vtd_update_iotlb(IntelIOMMUState *s, uint16= _t source_id, trace_vtd_iotlb_page_update(source_id, addr, slpte, domain_id); if (g_hash_table_size(s->iotlb) >=3D VTD_IOTLB_MAX_SIZE) { trace_vtd_iotlb_reset("iotlb exceeds size limit"); - vtd_reset_iotlb(s); + vtd_reset_iotlb_locked(s); } =20 entry->gfn =3D gfn; @@ -1106,7 +1125,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *v= td_as, PCIBus *bus, IntelIOMMUState *s =3D vtd_as->iommu_state; VTDContextEntry ce; uint8_t bus_num =3D pci_bus_num(bus); - VTDContextCacheEntry *cc_entry =3D &vtd_as->context_cache_entry; + VTDContextCacheEntry *cc_entry; uint64_t slpte, page_mask; uint32_t level; uint16_t source_id =3D vtd_make_source_id(bus_num, devfn); @@ -1123,6 +1142,10 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *= vtd_as, PCIBus *bus, */ assert(!vtd_is_interrupt_addr(addr)); =20 + vtd_iommu_lock(s); + + cc_entry =3D &vtd_as->context_cache_entry; + /* Try to fetch slpte form IOTLB */ iotlb_entry =3D vtd_lookup_iotlb(s, source_id, addr); if (iotlb_entry) { @@ -1182,7 +1205,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *v= td_as, PCIBus *bus, * IOMMU region can be swapped back. */ vtd_pt_enable_fast_path(s, source_id); - + vtd_iommu_unlock(s); return true; } =20 @@ -1203,6 +1226,7 @@ static bool vtd_do_iommu_translate(VTDAddressSpace *v= td_as, PCIBus *bus, vtd_update_iotlb(s, source_id, VTD_CONTEXT_ENTRY_DID(ce.hi), addr, slp= te, access_flags, level); out: + vtd_iommu_unlock(s); entry->iova =3D addr & page_mask; entry->translated_addr =3D vtd_get_slpte_addr(slpte, s->aw_bits) & pag= e_mask; entry->addr_mask =3D ~page_mask; @@ -1210,6 +1234,7 @@ out: return true; =20 error: + vtd_iommu_unlock(s); entry->iova =3D 0; entry->translated_addr =3D 0; entry->addr_mask =3D 0; @@ -1258,10 +1283,13 @@ static void vtd_iommu_replay_all(IntelIOMMUState *s) static void vtd_context_global_invalidate(IntelIOMMUState *s) { trace_vtd_inv_desc_cc_global(); + /* Protects context cache */ + vtd_iommu_lock(s); s->context_cache_gen++; if (s->context_cache_gen =3D=3D VTD_CONTEXT_CACHE_GEN_MAX) { vtd_reset_context_cache(s); } + vtd_iommu_unlock(s); vtd_switch_address_space_all(s); /* * From VT-d spec 6.5.2.1, a global context entry invalidation @@ -1377,8 +1405,10 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUSt= ate *s, uint16_t domain_id) =20 trace_vtd_inv_desc_iotlb_domain(domain_id); =20 + vtd_iommu_lock(s); g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_domain, &domain_id); + vtd_iommu_unlock(s); =20 QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), @@ -1426,7 +1456,9 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState= *s, uint16_t domain_id, info.domain_id =3D domain_id; info.addr =3D addr; info.mask =3D ~((1 << am) - 1); + vtd_iommu_lock(s); g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, &info); + vtd_iommu_unlock(s); vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am); } =20 @@ -3072,6 +3104,7 @@ static void vtd_realize(DeviceState *dev, Error **err= p) } =20 QLIST_INIT(&s->notifiers_list); + qemu_mutex_init(&s->iommu_lock); memset(s->vtd_as_by_bus_num, 0, sizeof(s->vtd_as_by_bus_num)); memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s, "intel_iommu", DMAR_REG_SIZE); --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526548111182436.0119622281778; Thu, 17 May 2018 02:08:31 -0700 (PDT) Received: from localhost ([::1]:43991 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEtZ-0007Eh-9a for importer@patchew.org; Thu, 17 May 2018 05:08:21 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59176) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElH-0000ou-23 for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElG-0006n9-4Q for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:47 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:55680 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElF-0006lr-Vz for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:46 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 77150808255B; Thu, 17 May 2018 08:59:45 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id C25CF215CDA7; Thu, 17 May 2018 08:59:42 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:19 +0800 Message-Id: <20180517085927.24925-5-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 17 May 2018 08:59:45 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 17 May 2018 08:59:45 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 04/12] intel-iommu: only do page walk for MAP notifiers X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" For UNMAP-only IOMMU notifiers, we don't really need to walk the page tables. Fasten that procedure by skipping the page table walk. That should boost performance for UNMAP-only notifiers like vhost. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- include/hw/i386/intel_iommu.h | 2 ++ hw/i386/intel_iommu.c | 43 +++++++++++++++++++++++++++++++---- 2 files changed, 40 insertions(+), 5 deletions(-) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index ee517704e7..9e0a6c1c6a 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -93,6 +93,8 @@ struct VTDAddressSpace { IntelIOMMUState *iommu_state; VTDContextCacheEntry context_cache_entry; QLIST_ENTRY(VTDAddressSpace) next; + /* Superset of notifier flags that this address space has */ + IOMMUNotifierFlag notifier_flags; }; =20 struct VTDBus { diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 112971638d..9a418abfb6 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -138,6 +138,12 @@ static inline void vtd_iommu_unlock(IntelIOMMUState *s) qemu_mutex_unlock(&s->iommu_lock); } =20 +/* Whether the address space needs to notify new mappings */ +static inline gboolean vtd_as_notify_mappings(VTDAddressSpace *as) +{ + return as->notifier_flags & IOMMU_NOTIFIER_MAP; +} + /* GHashTable functions */ static gboolean vtd_uint64_equal(gconstpointer v1, gconstpointer v2) { @@ -1433,14 +1439,35 @@ static void vtd_iotlb_page_invalidate_notify(IntelI= OMMUState *s, VTDAddressSpace *vtd_as; VTDContextEntry ce; int ret; + hwaddr size =3D (1 << am) * VTD_PAGE_SIZE; =20 QLIST_FOREACH(vtd_as, &(s->notifiers_list), next) { ret =3D vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce); if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { - vtd_page_walk(&ce, addr, addr + (1 << am) * VTD_PAGE_SIZE, - vtd_page_invalidate_notify_hook, - (void *)&vtd_as->iommu, true, s->aw_bits); + if (vtd_as_notify_mappings(vtd_as)) { + /* + * For MAP-inclusive notifiers, we need to walk the + * page table to sync the shadow page table. + */ + vtd_page_walk(&ce, addr, addr + size, + vtd_page_invalidate_notify_hook, + (void *)&vtd_as->iommu, true, s->aw_bits); + } else { + /* + * For UNMAP-only notifiers, we don't need to walk the + * page tables. We just deliver the PSI down to + * invalidate caches. + */ + IOMMUTLBEntry entry =3D { + .target_as =3D &address_space_memory, + .iova =3D addr, + .translated_addr =3D 0, + .addr_mask =3D size - 1, + .perm =3D IOMMU_NONE, + }; + memory_region_notify_iommu(&vtd_as->iommu, entry); + } } } } @@ -2380,6 +2407,9 @@ static void vtd_iommu_notify_flag_changed(IOMMUMemory= Region *iommu, exit(1); } =20 + /* Update per-address-space notifier flags */ + vtd_as->notifier_flags =3D new; + if (old =3D=3D IOMMU_NOTIFIER_NONE) { /* Insert new ones */ QLIST_INSERT_HEAD(&s->notifiers_list, vtd_as, next); @@ -2890,8 +2920,11 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iomm= u_mr, IOMMUNotifier *n) PCI_FUNC(vtd_as->devfn), VTD_CONTEXT_ENTRY_DID(ce.hi), ce.hi, ce.lo); - vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, - s->aw_bits); + if (vtd_as_notify_mappings(vtd_as)) { + /* This is required only for MAP typed notifiers */ + vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, + s->aw_bits); + } } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), PCI_FUNC(vtd_as->devfn)); --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526547750842786.1522939503658; Thu, 17 May 2018 02:02:30 -0700 (PDT) Received: from localhost ([::1]:43796 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEnt-0002Hc-V5 for importer@patchew.org; Thu, 17 May 2018 05:02:30 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59211) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElK-0000rt-Oe for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:53 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElJ-0006q8-EN for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:50 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51826 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElJ-0006q4-8s for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:49 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id BCC3240200A3; Thu, 17 May 2018 08:59:48 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0F1DD215CDA7; Thu, 17 May 2018 08:59:45 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:20 +0800 Message-Id: <20180517085927.24925-6-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 17 May 2018 08:59:48 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 17 May 2018 08:59:48 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 05/12] intel-iommu: introduce vtd_page_walk_info X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" During the recursive page walking of IOVA page tables, some stack variables are constant variables and never changed during the whole page walking procedure. Isolate them into a struct so that we don't need to pass those contants down the stack every time and multiple times. Signed-off-by: Peter Xu Reviewed-by: Eric Auger Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 84 ++++++++++++++++++++++++++----------------- 1 file changed, 52 insertions(+), 32 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 9a418abfb6..4953d02ed0 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -747,9 +747,27 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint= 64_t iova, bool is_write, =20 typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private); =20 +/** + * Constant information used during page walking + * + * @hook_fn: hook func to be called when detected page + * @private: private data to be passed into hook func + * @notify_unmap: whether we should notify invalid entries + * @aw: maximum address width + */ +typedef struct { + vtd_page_walk_hook hook_fn; + void *private; + bool notify_unmap; + uint8_t aw; +} vtd_page_walk_info; + static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, - vtd_page_walk_hook hook_fn, void *private) + vtd_page_walk_info *info) { + vtd_page_walk_hook hook_fn =3D info->hook_fn; + void *private =3D info->private; + assert(hook_fn); trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, entry->addr_mask, entry->perm); @@ -762,17 +780,13 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, in= t level, * @addr: base GPA addr to start the walk * @start: IOVA range start address * @end: IOVA range end address (start <=3D addr < end) - * @hook_fn: hook func to be called when detected page - * @private: private data to be passed into hook func * @read: whether parent level has read permission * @write: whether parent level has write permission - * @notify_unmap: whether we should notify invalid entries - * @aw: maximum address width + * @info: constant information for the page walk */ static int vtd_page_walk_level(dma_addr_t addr, uint64_t start, - uint64_t end, vtd_page_walk_hook hook_fn, - void *private, uint32_t level, bool read, - bool write, bool notify_unmap, uint8_t aw) + uint64_t end, uint32_t level, bool read, + bool write, vtd_page_walk_info *info) { bool read_cur, write_cur, entry_valid; uint32_t offset; @@ -822,24 +836,24 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, =20 if (vtd_is_last_slpte(slpte, level)) { /* NOTE: this is only meaningful if entry_valid =3D=3D true */ - entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); - if (!entry_valid && !notify_unmap) { + entry.translated_addr =3D vtd_get_slpte_addr(slpte, info->aw); + if (!entry_valid && !info->notify_unmap) { trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; } - ret =3D vtd_page_walk_one(&entry, level, hook_fn, private); + ret =3D vtd_page_walk_one(&entry, level, info); if (ret < 0) { return ret; } } else { if (!entry_valid) { - if (notify_unmap) { + if (info->notify_unmap) { /* * The whole entry is invalid; unmap it all. * Translated address is meaningless, zero it. */ entry.translated_addr =3D 0x0; - ret =3D vtd_page_walk_one(&entry, level, hook_fn, priv= ate); + ret =3D vtd_page_walk_one(&entry, level, info); if (ret < 0) { return ret; } @@ -848,10 +862,9 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64= _t start, } goto next; } - ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iov= a, - MIN(iova_next, end), hook_fn, privat= e, - level - 1, read_cur, write_cur, - notify_unmap, aw); + ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, info->aw= ), + iova, MIN(iova_next, end), level - 1, + read_cur, write_cur, info); if (ret < 0) { return ret; } @@ -870,28 +883,24 @@ next: * @ce: context entry to walk upon * @start: IOVA address to start the walk * @end: IOVA range end address (start <=3D addr < end) - * @hook_fn: the hook that to be called for each detected area - * @private: private data for the hook function - * @aw: maximum address width + * @info: page walking information struct */ static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end, - vtd_page_walk_hook hook_fn, void *private, - bool notify_unmap, uint8_t aw) + vtd_page_walk_info *info) { dma_addr_t addr =3D vtd_ce_get_slpt_base(ce); uint32_t level =3D vtd_ce_get_level(ce); =20 - if (!vtd_iova_range_check(start, ce, aw)) { + if (!vtd_iova_range_check(start, ce, info->aw)) { return -VTD_FR_ADDR_BEYOND_MGAW; } =20 - if (!vtd_iova_range_check(end, ce, aw)) { + if (!vtd_iova_range_check(end, ce, info->aw)) { /* Fix end so that it reaches the maximum */ - end =3D vtd_iova_limit(ce, aw); + end =3D vtd_iova_limit(ce, info->aw); } =20 - return vtd_page_walk_level(addr, start, end, hook_fn, private, - level, true, true, notify_unmap, aw); + return vtd_page_walk_level(addr, start, end, level, true, true, info); } =20 /* Map a device to its corresponding domain (context-entry) */ @@ -1446,13 +1455,18 @@ static void vtd_iotlb_page_invalidate_notify(IntelI= OMMUState *s, vtd_as->devfn, &ce); if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { if (vtd_as_notify_mappings(vtd_as)) { + vtd_page_walk_info info =3D { + .hook_fn =3D vtd_page_invalidate_notify_hook, + .private =3D (void *)&vtd_as->iommu, + .notify_unmap =3D true, + .aw =3D s->aw_bits, + }; + /* * For MAP-inclusive notifiers, we need to walk the * page table to sync the shadow page table. */ - vtd_page_walk(&ce, addr, addr + size, - vtd_page_invalidate_notify_hook, - (void *)&vtd_as->iommu, true, s->aw_bits); + vtd_page_walk(&ce, addr, addr + size, &info); } else { /* * For UNMAP-only notifiers, we don't need to walk the @@ -2922,8 +2936,14 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iomm= u_mr, IOMMUNotifier *n) ce.hi, ce.lo); if (vtd_as_notify_mappings(vtd_as)) { /* This is required only for MAP typed notifiers */ - vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false, - s->aw_bits); + vtd_page_walk_info info =3D { + .hook_fn =3D vtd_replay_hook, + .private =3D (void *)n, + .notify_unmap =3D false, + .aw =3D s->aw_bits, + }; + + vtd_page_walk(&ce, 0, ~0ULL, &info); } } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526547956023927.1757228940842; Thu, 17 May 2018 02:05:56 -0700 (PDT) Received: from localhost ([::1]:43814 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJErD-00052j-5x for importer@patchew.org; Thu, 17 May 2018 05:05:55 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59238) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElP-0000ul-Nn for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:56 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElM-0006ra-Lo for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:55 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51822 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElM-0006rM-H3 for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:52 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 0E98C407572D; Thu, 17 May 2018 08:59:52 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 54D012166BAD; Thu, 17 May 2018 08:59:49 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:21 +0800 Message-Id: <20180517085927.24925-7-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 08:59:52 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 08:59:52 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 06/12] intel-iommu: pass in address space when page walk X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" We pass in the VTDAddressSpace too. It'll be used in the follow up patches. Signed-off-by: Peter Xu Reviewed-by: Eric Auger Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 4953d02ed0..fe5ee77d46 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -753,9 +753,11 @@ typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry= , void *private); * @hook_fn: hook func to be called when detected page * @private: private data to be passed into hook func * @notify_unmap: whether we should notify invalid entries + * @as: VT-d address space of the device * @aw: maximum address width */ typedef struct { + VTDAddressSpace *as; vtd_page_walk_hook hook_fn; void *private; bool notify_unmap; @@ -1460,6 +1462,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOM= MUState *s, .private =3D (void *)&vtd_as->iommu, .notify_unmap =3D true, .aw =3D s->aw_bits, + .as =3D vtd_as, }; =20 /* @@ -2941,6 +2944,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu= _mr, IOMMUNotifier *n) .private =3D (void *)n, .notify_unmap =3D false, .aw =3D s->aw_bits, + .as =3D vtd_as, }; =20 vtd_page_walk(&ce, 0, ~0ULL, &info); --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526548128903252.52637461496352; Thu, 17 May 2018 02:08:48 -0700 (PDT) Received: from localhost ([::1]:43999 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEu0-0007Zm-68 for importer@patchew.org; Thu, 17 May 2018 05:08:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59249) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElR-0000w7-4z for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:58 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElP-0006sG-Ug for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:57 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51830 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElP-0006s0-PX for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:55 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4FC2740200A3; Thu, 17 May 2018 08:59:55 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9A148215CDA7; Thu, 17 May 2018 08:59:52 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:22 +0800 Message-Id: <20180517085927.24925-8-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 17 May 2018 08:59:55 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 17 May 2018 08:59:55 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 07/12] intel-iommu: trace domain id during page walk X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" This patch only modifies the trace points. Previously we were tracing page walk levels. They are redundant since we have page mask (size) already. Now we trace something much more useful which is the domain ID of the page walking. That can be very useful when we trace more than one devices on the same system, so that we can know which map is for which domain. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 16 ++++++++++------ hw/i386/trace-events | 2 +- 2 files changed, 11 insertions(+), 7 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index fe5ee77d46..29fcf2b3a8 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -755,6 +755,7 @@ typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry,= void *private); * @notify_unmap: whether we should notify invalid entries * @as: VT-d address space of the device * @aw: maximum address width + * @domain: domain ID of the page walk */ typedef struct { VTDAddressSpace *as; @@ -762,17 +763,18 @@ typedef struct { void *private; bool notify_unmap; uint8_t aw; + uint16_t domain_id; } vtd_page_walk_info; =20 -static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, - vtd_page_walk_info *info) +static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *inf= o) { vtd_page_walk_hook hook_fn =3D info->hook_fn; void *private =3D info->private; =20 assert(hook_fn); - trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, - entry->addr_mask, entry->perm); + trace_vtd_page_walk_one(info->domain_id, entry->iova, + entry->translated_addr, entry->addr_mask, + entry->perm); return hook_fn(entry, private); } =20 @@ -843,7 +845,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_= t start, trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; } - ret =3D vtd_page_walk_one(&entry, level, info); + ret =3D vtd_page_walk_one(&entry, info); if (ret < 0) { return ret; } @@ -855,7 +857,7 @@ static int vtd_page_walk_level(dma_addr_t addr, uint64_= t start, * Translated address is meaningless, zero it. */ entry.translated_addr =3D 0x0; - ret =3D vtd_page_walk_one(&entry, level, info); + ret =3D vtd_page_walk_one(&entry, info); if (ret < 0) { return ret; } @@ -1463,6 +1465,7 @@ static void vtd_iotlb_page_invalidate_notify(IntelIOM= MUState *s, .notify_unmap =3D true, .aw =3D s->aw_bits, .as =3D vtd_as, + .domain_id =3D domain_id, }; =20 /* @@ -2945,6 +2948,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu= _mr, IOMMUNotifier *n) .notify_unmap =3D false, .aw =3D s->aw_bits, .as =3D vtd_as, + .domain_id =3D VTD_CONTEXT_ENTRY_DID(ce.hi), }; =20 vtd_page_walk(&ce, 0, ~0ULL, &info); diff --git a/hw/i386/trace-events b/hw/i386/trace-events index 22d44648af..ca23ba9fad 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -39,7 +39,7 @@ vtd_fault_disabled(void) "Fault processing disabled for c= ontext entry" vtd_replay_ce_valid(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t domain,= uint64_t hi, uint64_t lo) "replay valid context device %02"PRIx8":%02"PRIx= 8".%02"PRIx8" domain 0x%"PRIx16" hi 0x%"PRIx64" lo 0x%"PRIx64 vtd_replay_ce_invalid(uint8_t bus, uint8_t dev, uint8_t fn) "replay invali= d context device %02"PRIx8":%02"PRIx8".%02"PRIx8 vtd_page_walk_level(uint64_t addr, uint32_t level, uint64_t start, uint64_= t end) "walk (base=3D0x%"PRIx64", level=3D%"PRIu32") iova range 0x%"PRIx64"= - 0x%"PRIx64 -vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t gpa, uint64_t ma= sk, int perm) "detected page level 0x%"PRIx32" iova 0x%"PRIx64" -> gpa 0x%"= PRIx64" mask 0x%"PRIx64" perm %d" +vtd_page_walk_one(uint16_t domain, uint64_t iova, uint64_t gpa, uint64_t m= ask, int perm) "domain 0x%"PRIu16" iova 0x%"PRIx64" -> gpa 0x%"PRIx64" mask= 0x%"PRIx64" perm %d" vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526547767883790.5879910580618; Thu, 17 May 2018 02:02:47 -0700 (PDT) Received: from localhost ([::1]:43797 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEoB-0002W6-1L for importer@patchew.org; Thu, 17 May 2018 05:02:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59261) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElU-0000yH-L9 for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:05 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElT-0006uq-8P for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:00 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:55686 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElT-0006ua-30 for qemu-devel@nongnu.org; Thu, 17 May 2018 04:59:59 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 919AB808255B; Thu, 17 May 2018 08:59:58 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id DBD12215CDA7; Thu, 17 May 2018 08:59:55 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:23 +0800 Message-Id: <20180517085927.24925-9-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 17 May 2018 08:59:58 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Thu, 17 May 2018 08:59:58 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 08/12] util: implement simple iova tree X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Introduce a simplest iova tree implementation based on GTree. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- include/qemu/iova-tree.h | 134 +++++++++++++++++++++++++++++++++++++++ util/iova-tree.c | 114 +++++++++++++++++++++++++++++++++ MAINTAINERS | 6 ++ util/Makefile.objs | 1 + 4 files changed, 255 insertions(+) create mode 100644 include/qemu/iova-tree.h create mode 100644 util/iova-tree.c diff --git a/include/qemu/iova-tree.h b/include/qemu/iova-tree.h new file mode 100644 index 0000000000..b061932097 --- /dev/null +++ b/include/qemu/iova-tree.h @@ -0,0 +1,134 @@ +/* + * An very simplified iova tree implementation based on GTree. + * + * Copyright 2018 Red Hat, Inc. + * + * Authors: + * Peter Xu + * + * This work is licensed under the terms of the GNU GPL, version 2 or late= r. + */ +#ifndef IOVA_TREE_H +#define IOVA_TREE_H + +/* + * Currently the iova tree will only allow to keep ranges + * information, and no extra user data is allowed for each element. A + * benefit is that we can merge adjacent ranges internally within the + * tree. It can save a lot of memory when the ranges are splitted but + * mostly continuous. + * + * Note that current implementation does not provide any thread + * protections. Callers of the iova tree should be responsible + * for the thread safety issue. + */ + +#include "qemu/osdep.h" +#include "exec/memory.h" +#include "exec/hwaddr.h" + +#define IOVA_OK (0) +#define IOVA_ERR_INVALID (-1) /* Invalid parameters */ +#define IOVA_ERR_OVERLAP (-2) /* IOVA range overlapped */ + +typedef struct IOVATree IOVATree; +typedef struct DMAMap { + hwaddr iova; + hwaddr translated_addr; + hwaddr size; /* Inclusive */ + IOMMUAccessFlags perm; +} QEMU_PACKED DMAMap; +typedef gboolean (*iova_tree_iterator)(DMAMap *map); + +/** + * iova_tree_new: + * + * Create a new iova tree. + * + * Returns: the tree pointer when succeeded, or NULL if error. + */ +IOVATree *iova_tree_new(void); + +/** + * iova_tree_insert: + * + * @tree: the iova tree to insert + * @map: the mapping to insert + * + * Insert an iova range to the tree. If there is overlapped + * ranges, IOVA_ERR_OVERLAP will be returned. + * + * Return: 0 if succeeded, or <0 if error. + */ +int iova_tree_insert(IOVATree *tree, DMAMap *map); + +/** + * iova_tree_remove: + * + * @tree: the iova tree to remove range from + * @map: the map range to remove + * + * Remove mappings from the tree that are covered by the map range + * provided. The range does not need to be exactly what has inserted, + * all the mappings that are included in the provided range will be + * removed from the tree. Here map->translated_addr is meaningless. + * + * Return: 0 if succeeded, or <0 if error. + */ +int iova_tree_remove(IOVATree *tree, DMAMap *map); + +/** + * iova_tree_find: + * + * @tree: the iova tree to search from + * @map: the mapping to search + * + * Search for a mapping in the iova tree that overlaps with the + * mapping range specified. Only the first found mapping will be + * returned. + * + * Return: DMAMap pointer if found, or NULL if not found. Note that + * the returned DMAMap pointer is maintained internally. User should + * only read the content but never modify or free the content. Also, + * user is responsible to make sure the pointer is valid (say, no + * concurrent deletion in progress). + */ +DMAMap *iova_tree_find(IOVATree *tree, DMAMap *map); + +/** + * iova_tree_find_address: + * + * @tree: the iova tree to search from + * @iova: the iova address to find + * + * Similar to iova_tree_find(), but it tries to find mapping with + * range iova=3Diova & size=3D0. + * + * Return: same as iova_tree_find(). + */ +DMAMap *iova_tree_find_address(IOVATree *tree, hwaddr iova); + +/** + * iova_tree_foreach: + * + * @tree: the iova tree to iterate on + * @iterator: the interator for the mappings, return true to stop + * + * Iterate over the iova tree. + * + * Return: 1 if found any overlap, 0 if not, <0 if error. + */ +void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator); + +/** + * iova_tree_destroy: + * + * @tree: the iova tree to destroy + * + * Destroy an existing iova tree. + * + * Return: None. + */ +void iova_tree_destroy(IOVATree *tree); + +#endif diff --git a/util/iova-tree.c b/util/iova-tree.c new file mode 100644 index 0000000000..2d9cebfc89 --- /dev/null +++ b/util/iova-tree.c @@ -0,0 +1,114 @@ +/* + * IOVA tree implementation based on GTree. + * + * Copyright 2018 Red Hat, Inc. + * + * Authors: + * Peter Xu + * + * This work is licensed under the terms of the GNU GPL, version 2 or late= r. + */ + +#include +#include "qemu/iova-tree.h" + +struct IOVATree { + GTree *tree; +}; + +static int iova_tree_compare(gconstpointer a, gconstpointer b, gpointer da= ta) +{ + const DMAMap *m1 =3D a, *m2 =3D b; + + if (m1->iova > m2->iova + m2->size) { + return 1; + } + + if (m1->iova + m1->size < m2->iova) { + return -1; + } + + /* Overlapped */ + return 0; +} + +IOVATree *iova_tree_new(void) +{ + IOVATree *iova_tree =3D g_new0(IOVATree, 1); + + /* We don't have values actually, no need to free */ + iova_tree->tree =3D g_tree_new_full(iova_tree_compare, NULL, g_free, N= ULL); + + return iova_tree; +} + +DMAMap *iova_tree_find(IOVATree *tree, DMAMap *map) +{ + return g_tree_lookup(tree->tree, map); +} + +DMAMap *iova_tree_find_address(IOVATree *tree, hwaddr iova) +{ + DMAMap map =3D { .iova =3D iova, .size =3D 0 }; + + return iova_tree_find(tree, &map); +} + +static inline void iova_tree_insert_internal(GTree *gtree, DMAMap *range) +{ + /* Key and value are sharing the same range data */ + g_tree_insert(gtree, range, range); +} + +int iova_tree_insert(IOVATree *tree, DMAMap *map) +{ + DMAMap *new; + + if (map->iova + map->size < map->iova || map->perm =3D=3D IOMMU_NONE) { + return IOVA_ERR_INVALID; + } + + /* We don't allow to insert range that overlaps with existings */ + if (iova_tree_find(tree, map)) { + return IOVA_ERR_OVERLAP; + } + + new =3D g_new0(DMAMap, 1); + memcpy(new, map, sizeof(*new)); + iova_tree_insert_internal(tree->tree, new); + + return IOVA_OK; +} + +static gboolean iova_tree_traverse(gpointer key, gpointer value, + gpointer data) +{ + iova_tree_iterator iterator =3D data; + DMAMap *map =3D key; + + g_assert(key =3D=3D value); + + return iterator(map); +} + +void iova_tree_foreach(IOVATree *tree, iova_tree_iterator iterator) +{ + g_tree_foreach(tree->tree, iova_tree_traverse, iterator); +} + +int iova_tree_remove(IOVATree *tree, DMAMap *map) +{ + DMAMap *overlap; + + while ((overlap =3D iova_tree_find(tree, map))) { + g_tree_remove(tree->tree, overlap); + } + + return IOVA_OK; +} + +void iova_tree_destroy(IOVATree *tree) +{ + g_tree_destroy(tree->tree); + g_free(tree); +} diff --git a/MAINTAINERS b/MAINTAINERS index e187b1f18f..f07fcee72e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1783,6 +1783,12 @@ F: include/sysemu/replay.h F: docs/replay.txt F: stubs/replay.c =20 +IOVA Tree +M: Peter Xu +S: Maintained +F: include/qemu/iova-tree.h +F: util/iova-tree.c + Usermode Emulation ------------------ Overall diff --git a/util/Makefile.objs b/util/Makefile.objs index 728c3541db..e1c3fed4dc 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -47,4 +47,5 @@ util-obj-y +=3D qht.o util-obj-y +=3D range.o util-obj-y +=3D stats64.o util-obj-y +=3D systemd.o +util-obj-y +=3D iova-tree.o util-obj-$(CONFIG_LINUX) +=3D vfio-helpers.o --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526547974197719.8961303702608; Thu, 17 May 2018 02:06:14 -0700 (PDT) Received: from localhost ([::1]:43820 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJErV-0005JQ-Ci for importer@patchew.org; Thu, 17 May 2018 05:06:13 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59284) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEla-00012d-7K for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElX-0006w6-4t for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:06 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:34700 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElW-0006w1-TE for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:03 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 38663401EF12; Thu, 17 May 2018 09:00:02 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 299DA215CDA7; Thu, 17 May 2018 08:59:58 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:24 +0800 Message-Id: <20180517085927.24925-10-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Thu, 17 May 2018 09:00:02 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Thu, 17 May 2018 09:00:02 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 09/12] intel-iommu: maintain per-device iova ranges X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" For each VTDAddressSpace, now we maintain what IOVA ranges we have mapped and what we have not. With that information, now we only send MAP or UNMAP when necessary. Say, we don't send MAP notifies if we know we have already mapped the range, meanwhile we don't send UNMAP notifies if we know we never mapped the range at all. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- include/hw/i386/intel_iommu.h | 2 ++ hw/i386/intel_iommu.c | 67 +++++++++++++++++++++++++++++++++++ hw/i386/trace-events | 2 ++ 3 files changed, 71 insertions(+) diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 9e0a6c1c6a..90190bfaa1 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -27,6 +27,7 @@ #include "hw/i386/ioapic.h" #include "hw/pci/msi.h" #include "hw/sysbus.h" +#include "qemu/iova-tree.h" =20 #define TYPE_INTEL_IOMMU_DEVICE "intel-iommu" #define INTEL_IOMMU_DEVICE(obj) \ @@ -95,6 +96,7 @@ struct VTDAddressSpace { QLIST_ENTRY(VTDAddressSpace) next; /* Superset of notifier flags that this address space has */ IOMMUNotifierFlag notifier_flags; + IOVATree *iova_tree; /* Traces mapped IOVA ranges */ }; =20 struct VTDBus { diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 29fcf2b3a8..5a5175a4ed 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -768,10 +768,71 @@ typedef struct { =20 static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd_page_walk_info *inf= o) { + VTDAddressSpace *as =3D info->as; vtd_page_walk_hook hook_fn =3D info->hook_fn; void *private =3D info->private; + DMAMap target =3D { + .iova =3D entry->iova, + .size =3D entry->addr_mask, + .translated_addr =3D entry->translated_addr, + .perm =3D entry->perm, + }; + DMAMap *mapped =3D iova_tree_find(as->iova_tree, &target); =20 assert(hook_fn); + + /* Update local IOVA mapped ranges */ + if (entry->perm) { + if (mapped) { + /* If it's exactly the same translation, skip */ + if (!memcmp(mapped, &target, sizeof(target))) { + trace_vtd_page_walk_one_skip_map(entry->iova, entry->addr_= mask, + entry->translated_addr); + return 0; + } else { + /* + * Translation changed. This should not happen with + * "intel_iommu=3Don,strict", but it can happen when + * delayed flushing is used in guest IOMMU driver + * (when without "strict") when page A is reused + * before its previous unmap (the unmap can still be + * queued in the delayed flushing queue). Now we do + * our best to remap. Note that there will be a small + * window that we don't have map at all. But that's + * the best effort we can do, and logically + * well-behaved guests should not really using this + * DMA region yet so we should be very safe. + */ + IOMMUAccessFlags cache_perm =3D entry->perm; + int ret; + + /* Emulate an UNMAP */ + entry->perm =3D IOMMU_NONE; + trace_vtd_page_walk_one(info->domain_id, + entry->iova, + entry->translated_addr, + entry->addr_mask, + entry->perm); + ret =3D hook_fn(entry, private); + if (ret) { + return ret; + } + /* Drop any existing mapping */ + iova_tree_remove(as->iova_tree, &target); + /* Recover the correct permission */ + entry->perm =3D cache_perm; + } + } + iova_tree_insert(as->iova_tree, &target); + } else { + if (!mapped) { + /* Skip since we didn't map this range at all */ + trace_vtd_page_walk_one_skip_unmap(entry->iova, entry->addr_ma= sk); + return 0; + } + iova_tree_remove(as->iova_tree, &target); + } + trace_vtd_page_walk_one(info->domain_id, entry->iova, entry->translated_addr, entry->addr_mask, entry->perm); @@ -2804,6 +2865,7 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, = PCIBus *bus, int devfn) vtd_dev_as->devfn =3D (uint8_t)devfn; vtd_dev_as->iommu_state =3D s; vtd_dev_as->context_cache_entry.context_cache_gen =3D 0; + vtd_dev_as->iova_tree =3D iova_tree_new(); =20 /* * Memory region relationships looks like (Address range shows @@ -2856,6 +2918,7 @@ static void vtd_address_space_unmap(VTDAddressSpace *= as, IOMMUNotifier *n) hwaddr start =3D n->start; hwaddr end =3D n->end; IntelIOMMUState *s =3D as->iommu_state; + DMAMap map; =20 /* * Note: all the codes in this function has a assumption that IOVA @@ -2900,6 +2963,10 @@ static void vtd_address_space_unmap(VTDAddressSpace = *as, IOMMUNotifier *n) VTD_PCI_FUNC(as->devfn), entry.iova, size); =20 + map.iova =3D entry.iova; + map.size =3D entry.addr_mask; + iova_tree_remove(as->iova_tree, &map); + memory_region_notify_one(n, &entry); } =20 diff --git a/hw/i386/trace-events b/hw/i386/trace-events index ca23ba9fad..d8194b80e3 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -40,6 +40,8 @@ vtd_replay_ce_valid(uint8_t bus, uint8_t dev, uint8_t fn,= uint16_t domain, uint6 vtd_replay_ce_invalid(uint8_t bus, uint8_t dev, uint8_t fn) "replay invali= d context device %02"PRIx8":%02"PRIx8".%02"PRIx8 vtd_page_walk_level(uint64_t addr, uint32_t level, uint64_t start, uint64_= t end) "walk (base=3D0x%"PRIx64", level=3D%"PRIu32") iova range 0x%"PRIx64"= - 0x%"PRIx64 vtd_page_walk_one(uint16_t domain, uint64_t iova, uint64_t gpa, uint64_t m= ask, int perm) "domain 0x%"PRIu16" iova 0x%"PRIx64" -> gpa 0x%"PRIx64" mask= 0x%"PRIx64" perm %d" +vtd_page_walk_one_skip_map(uint64_t iova, uint64_t mask, uint64_t translat= ed) "iova 0x%"PRIx64" mask 0x%"PRIx64" translated 0x%"PRIx64 +vtd_page_walk_one_skip_unmap(uint64_t iova, uint64_t mask) "iova 0x%"PRIx6= 4" mask 0x%"PRIx64 vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526548234081480.6847608426076; Thu, 17 May 2018 02:10:34 -0700 (PDT) Received: from localhost ([::1]:44050 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEvh-0000aA-AE for importer@patchew.org; Thu, 17 May 2018 05:10:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59320) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElh-0001AQ-6q for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:16 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElb-0006yD-3v for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:13 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51838 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJEla-0006xt-PK for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:07 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id D394640200A3; Thu, 17 May 2018 09:00:05 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id EEB552166BC7; Thu, 17 May 2018 09:00:02 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:25 +0800 Message-Id: <20180517085927.24925-11-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 17 May 2018 09:00:05 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]); Thu, 17 May 2018 09:00:05 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 10/12] intel-iommu: simplify page walk logic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Let's move the notify_unmap check into the new vtd_page_walk_one() function so that we can greatly simplify the vtd_page_walk_level() logic. No functional change at all. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 66 ++++++++++++++++++++----------------------- hw/i386/trace-events | 1 - 2 files changed, 30 insertions(+), 37 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 5a5175a4ed..272e49ff66 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -779,6 +779,11 @@ static int vtd_page_walk_one(IOMMUTLBEntry *entry, vtd= _page_walk_info *info) }; DMAMap *mapped =3D iova_tree_find(as->iova_tree, &target); =20 + if (entry->perm =3D=3D IOMMU_NONE && !info->notify_unmap) { + trace_vtd_page_walk_one_skip_unmap(entry->iova, entry->addr_mask); + return 0; + } + assert(hook_fn); =20 /* Update local IOVA mapped ranges */ @@ -894,45 +899,34 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, */ entry_valid =3D read_cur | write_cur; =20 - entry.target_as =3D &address_space_memory; - entry.iova =3D iova & subpage_mask; - entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); - entry.addr_mask =3D ~subpage_mask; - - if (vtd_is_last_slpte(slpte, level)) { - /* NOTE: this is only meaningful if entry_valid =3D=3D true */ - entry.translated_addr =3D vtd_get_slpte_addr(slpte, info->aw); - if (!entry_valid && !info->notify_unmap) { - trace_vtd_page_walk_skip_perm(iova, iova_next); - goto next; - } - ret =3D vtd_page_walk_one(&entry, info); - if (ret < 0) { - return ret; - } - } else { - if (!entry_valid) { - if (info->notify_unmap) { - /* - * The whole entry is invalid; unmap it all. - * Translated address is meaningless, zero it. - */ - entry.translated_addr =3D 0x0; - ret =3D vtd_page_walk_one(&entry, info); - if (ret < 0) { - return ret; - } - } else { - trace_vtd_page_walk_skip_perm(iova, iova_next); - } - goto next; - } + if (!vtd_is_last_slpte(slpte, level) && entry_valid) { + /* + * This is a valid PDE (or even bigger than PDE). We need + * to walk one further level. + */ ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, info->aw= ), iova, MIN(iova_next, end), level - 1, read_cur, write_cur, info); - if (ret < 0) { - return ret; - } + } else { + /* + * This means we are either: + * + * (1) the real page entry (either 4K page, or huge page) + * (2) the whole range is invalid + * + * In either case, we send an IOTLB notification down. + */ + entry.target_as =3D &address_space_memory; + entry.iova =3D iova & subpage_mask; + entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); + entry.addr_mask =3D ~subpage_mask; + /* NOTE: this is only meaningful if entry_valid =3D=3D true */ + entry.translated_addr =3D vtd_get_slpte_addr(slpte, info->aw); + ret =3D vtd_page_walk_one(&entry, info); + } + + if (ret < 0) { + return ret; } =20 next: diff --git a/hw/i386/trace-events b/hw/i386/trace-events index d8194b80e3..e14d06ec83 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -43,7 +43,6 @@ vtd_page_walk_one(uint16_t domain, uint64_t iova, uint64_= t gpa, uint64_t mask, i vtd_page_walk_one_skip_map(uint64_t iova, uint64_t mask, uint64_t translat= ed) "iova 0x%"PRIx64" mask 0x%"PRIx64" translated 0x%"PRIx64 vtd_page_walk_one_skip_unmap(uint64_t iova, uint64_t mask) "iova 0x%"PRIx6= 4" mask 0x%"PRIx64 vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" -vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "= Device %02x:%02x.%x switching address space (iommu enabled=3D%d)" vtd_as_unmap_whole(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, u= int64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64 --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526548347385361.4828830402714; Thu, 17 May 2018 02:12:27 -0700 (PDT) Received: from localhost ([::1]:44191 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJExW-00024d-LA for importer@patchew.org; Thu, 17 May 2018 05:12:26 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59331) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJElj-0001Cz-8q for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:20 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJEle-00072D-GI for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:15 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51834 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJEle-00070R-5z for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:10 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 3A278407572D; Thu, 17 May 2018 09:00:09 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id 75302215CDA7; Thu, 17 May 2018 09:00:06 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:26 +0800 Message-Id: <20180517085927.24925-12-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 09:00:09 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 09:00:09 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 11/12] intel-iommu: new vtd_sync_shadow_page_table_range X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" I pick part of the page walk implementation out into this single function. It was only used now for PSIs, but actually it can be used not only for invalidations but for any place that we want to synchronize the shadow page table. No functional change at all. Here I passed in context entry explicit to avoid fetching it again. However I enhanced the helper to even be able to fetch it on its own, since in follow up patches we might call without context entries. Signed-off-by: Peter Xu Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 65 ++++++++++++++++++++++++++++++++----------- 1 file changed, 48 insertions(+), 17 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 272e49ff66..a1a2a009c1 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1018,6 +1018,53 @@ static int vtd_dev_to_context_entry(IntelIOMMUState = *s, uint8_t bus_num, return 0; } =20 +static int vtd_sync_shadow_page_hook(IOMMUTLBEntry *entry, + void *private) +{ + memory_region_notify_iommu((IOMMUMemoryRegion *)private, *entry); + return 0; +} + +/* If context entry is NULL, we'll try to fetch it on our own. */ +static int vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as, + VTDContextEntry *ce, + hwaddr addr, hwaddr size) +{ + IntelIOMMUState *s =3D vtd_as->iommu_state; + vtd_page_walk_info info =3D { + .hook_fn =3D vtd_sync_shadow_page_hook, + .private =3D (void *)&vtd_as->iommu, + .notify_unmap =3D true, + .aw =3D s->aw_bits, + .as =3D vtd_as, + }; + VTDContextEntry ce_cache; + int ret; + + if (ce) { + /* If the caller provided context entry, use it */ + ce_cache =3D *ce; + } else { + /* If the caller didn't provide ce, try to fetch */ + ret =3D vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), + vtd_as->devfn, &ce_cache); + if (ret) { + /* + * This should not really happen, but in case it happens, + * we just skip the sync for this time. After all we even + * don't have the root table pointer! + */ + trace_vtd_err("Detected invalid context entry when " + "trying to sync shadow page table"); + return 0; + } + } + + info.domain_id =3D VTD_CONTEXT_ENTRY_DID(ce_cache.hi); + + return vtd_page_walk(&ce_cache, addr, addr + size, &info); +} + /* * Fetch translation type for specific device. Returns <0 if error * happens, otherwise return the shifted type to check against @@ -1493,13 +1540,6 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUSt= ate *s, uint16_t domain_id) } } =20 -static int vtd_page_invalidate_notify_hook(IOMMUTLBEntry *entry, - void *private) -{ - memory_region_notify_iommu((IOMMUMemoryRegion *)private, *entry); - return 0; -} - static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s, uint16_t domain_id, hwaddr addr, uint8_t am) @@ -1514,20 +1554,11 @@ static void vtd_iotlb_page_invalidate_notify(IntelI= OMMUState *s, vtd_as->devfn, &ce); if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { if (vtd_as_notify_mappings(vtd_as)) { - vtd_page_walk_info info =3D { - .hook_fn =3D vtd_page_invalidate_notify_hook, - .private =3D (void *)&vtd_as->iommu, - .notify_unmap =3D true, - .aw =3D s->aw_bits, - .as =3D vtd_as, - .domain_id =3D domain_id, - }; - /* * For MAP-inclusive notifiers, we need to walk the * page table to sync the shadow page table. */ - vtd_page_walk(&ce, addr, addr + size, &info); + vtd_sync_shadow_page_table_range(vtd_as, &ce, addr, size); } else { /* * For UNMAP-only notifiers, we don't need to walk the --=20 2.17.0 From nobody Wed Oct 29 20:45:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1526548255073618.7509350630065; Thu, 17 May 2018 02:10:55 -0700 (PDT) Received: from localhost ([::1]:44059 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEw2-0000tr-AQ for importer@patchew.org; Thu, 17 May 2018 05:10:54 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59350) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fJEln-0001Ix-NB for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:25 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fJElh-00079Z-Tq for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:19 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:51842 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fJElh-00077T-L5 for qemu-devel@nongnu.org; Thu, 17 May 2018 05:00:13 -0400 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 932F340704AE; Thu, 17 May 2018 09:00:12 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-109.pek2.redhat.com [10.72.12.109]) by smtp.corp.redhat.com (Postfix) with ESMTP id C8BA6215CDA7; Thu, 17 May 2018 09:00:09 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Thu, 17 May 2018 16:59:27 +0800 Message-Id: <20180517085927.24925-13-peterx@redhat.com> In-Reply-To: <20180517085927.24925-1-peterx@redhat.com> References: <20180517085927.24925-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.6 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 09:00:12 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.7]); Thu, 17 May 2018 09:00:12 +0000 (UTC) for IP:'10.11.54.6' DOMAIN:'int-mx06.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v3 12/12] intel-iommu: new sync_shadow_page_table X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Firstly, introduce the sync_shadow_page_table() helper to resync the whole shadow page table of an IOMMU address space. Meanwhile, when we receive domain invalidation or similar requests (for example, context entry invalidations, global invalidations, ...), we should not really run the replay logic, instead we can now use the new sync shadow page table API to resync the whole shadow page table. There will be two major differences: 1. We don't unmap-all before walking the page table, we just sync. The global unmap-all can create a very small window that the page table is invalid or incomplete 2. We only walk the page table once now (while replay can be triggered multiple times depending on how many notifiers there are) Signed-off-by: Peter Xu Tested-by: Jintack Lim --- hw/i386/intel_iommu.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index a1a2a009c1..fbb2f763f0 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1065,6 +1065,11 @@ static int vtd_sync_shadow_page_table_range(VTDAddre= ssSpace *vtd_as, return vtd_page_walk(&ce_cache, addr, addr + size, &info); } =20 +static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as) +{ + return vtd_sync_shadow_page_table_range(vtd_as, NULL, 0, UINT64_MAX); +} + /* * Fetch translation type for specific device. Returns <0 if error * happens, otherwise return the shifted type to check against @@ -1397,7 +1402,7 @@ static void vtd_iommu_replay_all(IntelIOMMUState *s) VTDAddressSpace *vtd_as; =20 QLIST_FOREACH(vtd_as, &s->notifiers_list, next) { - memory_region_iommu_replay_all(&vtd_as->iommu); + vtd_sync_shadow_page_table(vtd_as); } } =20 @@ -1470,14 +1475,13 @@ static void vtd_context_device_invalidate(IntelIOMM= UState *s, vtd_switch_address_space(vtd_as); /* * So a device is moving out of (or moving into) a - * domain, a replay() suites here to notify all the - * IOMMU_NOTIFIER_MAP registers about this change. + * domain, resync the shadow page table. * This won't bring bad even if we have no such * notifier registered - the IOMMU notification * framework will skip MAP notifications if that * happened. */ - memory_region_iommu_replay_all(&vtd_as->iommu); + vtd_sync_shadow_page_table(vtd_as); } } } @@ -1535,7 +1539,7 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUSta= te *s, uint16_t domain_id) if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), vtd_as->devfn, &ce) && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { - memory_region_iommu_replay_all(&vtd_as->iommu); + vtd_sync_shadow_page_table(vtd_as); } } } --=20 2.17.0