From nobody Thu Oct 30 05:21:17 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1525403445806835.9215353929864; Thu, 3 May 2018 20:10:45 -0700 (PDT) Received: from localhost ([::1]:60168 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER7N-0002iE-1L for importer@patchew.org; Thu, 03 May 2018 23:10:45 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33160) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fER5E-0001VX-Ln for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:33 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fER5D-0005In-Kg for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:32 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:48244 helo=mx1.redhat.com) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1fER5D-0005IY-F7 for qemu-devel@nongnu.org; Thu, 03 May 2018 23:08:31 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 25C6140858CD; Fri, 4 May 2018 03:08:31 +0000 (UTC) Received: from xz-mi.redhat.com (ovpn-12-46.pek2.redhat.com [10.72.12.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1E1B9111762B; Fri, 4 May 2018 03:08:27 +0000 (UTC) From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 4 May 2018 11:08:02 +0800 Message-Id: <20180504030811.28111-2-peterx@redhat.com> In-Reply-To: <20180504030811.28111-1-peterx@redhat.com> References: <20180504030811.28111-1-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 04 May 2018 03:08:31 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 04 May 2018 03:08:31 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'peterx@redhat.com' RCPT:'' X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.187.233.73 Subject: [Qemu-devel] [PATCH v2 01/10] intel-iommu: send PSI always even if across PDEs X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Tian Kevin , "Michael S . Tsirkin" , Jason Wang , peterx@redhat.com, Alex Williamson , Jintack Lim Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" During IOVA page table walking, there is a special case when the PSI covers one whole PDE (Page Directory Entry, which contains 512 Page Table Entries) or more. In the past, we skip that entry and we don't notify the IOMMU notifiers. This is not correct. We should send UNMAP notification to registered UNMAP notifiers in this case. For UNMAP only notifiers, this might cause IOTLBs cached in the devices even if they were already invalid. For MAP/UNMAP notifiers like vfio-pci, this will cause stale page mappings. This special case doesn't trigger often, but it is very easy to be triggered by nested device assignments, since in that case we'll possibly map the whole L2 guest RAM region into the device's IOVA address space (several GBs at least), which is far bigger than normal kernel driver usages of the device (tens of MBs normally). Without this patch applied to L1 QEMU, nested device assignment to L2 guests will dump some errors like: qemu-system-x86_64: VFIO_MAP_DMA: -17 qemu-system-x86_64: vfio_dma_map(0x557305420c30, 0xad000, 0x1000, 0x7f89a920d000) =3D -17 (File exists) Acked-by: Jason Wang [peterx: rewrite the commit message] Signed-off-by: Peter Xu Reviewed-by: Eric Auger --- hw/i386/intel_iommu.c | 42 ++++++++++++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 12 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index fb31de9416..b359efd6f9 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -722,6 +722,15 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint= 64_t iova, bool is_write, =20 typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private); =20 +static int vtd_page_walk_one(IOMMUTLBEntry *entry, int level, + vtd_page_walk_hook hook_fn, void *private) +{ + assert(hook_fn); + trace_vtd_page_walk_one(level, entry->iova, entry->translated_addr, + entry->addr_mask, entry->perm); + return hook_fn(entry, private); +} + /** * vtd_page_walk_level - walk over specific level for IOVA range * @@ -781,28 +790,37 @@ static int vtd_page_walk_level(dma_addr_t addr, uint6= 4_t start, */ entry_valid =3D read_cur | write_cur; =20 + entry.target_as =3D &address_space_memory; + entry.iova =3D iova & subpage_mask; + entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); + entry.addr_mask =3D ~subpage_mask; + if (vtd_is_last_slpte(slpte, level)) { - entry.target_as =3D &address_space_memory; - entry.iova =3D iova & subpage_mask; /* NOTE: this is only meaningful if entry_valid =3D=3D true */ entry.translated_addr =3D vtd_get_slpte_addr(slpte, aw); - entry.addr_mask =3D ~subpage_mask; - entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); if (!entry_valid && !notify_unmap) { trace_vtd_page_walk_skip_perm(iova, iova_next); goto next; } - trace_vtd_page_walk_one(level, entry.iova, entry.translated_ad= dr, - entry.addr_mask, entry.perm); - if (hook_fn) { - ret =3D hook_fn(&entry, private); - if (ret < 0) { - return ret; - } + ret =3D vtd_page_walk_one(&entry, level, hook_fn, private); + if (ret < 0) { + return ret; } } else { if (!entry_valid) { - trace_vtd_page_walk_skip_perm(iova, iova_next); + if (notify_unmap) { + /* + * The whole entry is invalid; unmap it all. + * Translated address is meaningless, zero it. + */ + entry.translated_addr =3D 0x0; + ret =3D vtd_page_walk_one(&entry, level, hook_fn, priv= ate); + if (ret < 0) { + return ret; + } + } else { + trace_vtd_page_walk_skip_perm(iova, iova_next); + } goto next; } ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte, aw), iov= a, --=20 2.17.0