From nobody Sun Apr 28 09:42:58 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 149156304822039.1951575709719; Fri, 7 Apr 2017 04:04:08 -0700 (PDT) Received: from localhost ([::1]:50106 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cwRgV-0002us-3p for importer@patchew.org; Fri, 07 Apr 2017 07:04:07 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56917) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cwRd2-0000LF-NS for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:00:33 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cwRcz-0005QU-Lc for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:00:32 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39344) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cwRcz-0005QF-DJ for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:00:29 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 4494D7D0F8; Fri, 7 Apr 2017 11:00:28 +0000 (UTC) Received: from pxdev.xzpeter.org.com (ovpn-8-43.pek2.redhat.com [10.72.8.43]) by smtp.corp.redhat.com (Postfix) with ESMTP id BC37F8E74C; Fri, 7 Apr 2017 10:59:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 4494D7D0F8 Authentication-Results: 
ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com
Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=peterx@redhat.com
DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 4494D7D0F8
From: Peter Xu
To: qemu-devel@nongnu.org
Date: Fri, 7 Apr 2017 18:59:07 +0800
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com>
References: <1491562755-23867-1-git-send-email-peterx@redhat.com>
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 07 Apr 2017 11:00:28 +0000 (UTC)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 209.132.183.28
Subject: [Qemu-devel] [PATCH v9 1/9] memory: add section range info for IOMMU notifier
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum, bd.aviv@gmail.com, David Gibson
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail: RSF_0 Z_629925259 SPT_0
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When a notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of the specific IOMMU
notifier, and skip those notifiers whose listened range is not touched.

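The range check described above can be sketched in isolation. This is a minimal standalone illustration, not the QEMU code: `Notifier`, `Entry`, `notifier_init()` and `notifier_overlaps()` are hypothetical stand-ins for IOMMUNotifier, IOMMUTLBEntry and the filtering logic. It assumes an entry covers the inclusive range [iova, iova + addr_mask] and a notifier listens on the inclusive range [start, end]. Note that the patch as posted compares start against iova + addr_mask + 1, whereas the sketch compares against the last covered address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for IOMMUNotifier: listens on [start, end], inclusive. */
typedef struct {
    uint64_t start;
    uint64_t end;
} Notifier;

/* Hypothetical stand-in for IOMMUTLBEntry: covers [iova, iova + addr_mask]. */
typedef struct {
    uint64_t iova;
    uint64_t addr_mask;
} Entry;

/* Record the listened range; an empty or inverted range is a caller bug. */
static void notifier_init(Notifier *n, uint64_t start, uint64_t end)
{
    assert(start <= end);
    n->start = start;
    n->end = end;
}

/*
 * Two inclusive ranges overlap when each one starts no later than the
 * other one ends. Assumes iova + addr_mask does not wrap around.
 */
static bool notifier_overlaps(const Notifier *n, const Entry *e)
{
    uint64_t last = e->iova + e->addr_mask;

    return n->start <= last && e->iova <= n->end;
}
```

With two notifiers registered on either side of the (0xfee00000, 0xfeefffff) hole, a 4K entry at iova 0x1000 matches only the lower one, so the upper one is skipped instead of receiving a duplicate notification.
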
When removing a region, we need to make sure we remove the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch solves the problem that vfio-pci devices receive duplicated
UNMAP notifications on the x86 platform when a vIOMMU is present. The
issue is that the x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
split by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK this (split
IOMMU region) only happens on x86.

This patch also lets vhost leverage the new interface, so that vhost
won't get duplicated cache flushes. In that sense, it's a slight
performance improvement.

Suggested-by: David Gibson
Reviewed-by: Eric Auger
Reviewed-by: Michael S. Tsirkin
Acked-by: Alex Williamson
Signed-off-by: Peter Xu
---
 hw/vfio/common.c      | 12 +++++++++---
 hw/virtio/vhost.c     | 10 ++++++++--
 include/exec/memory.h | 19 ++++++++++++++++++-
 memory.c              |  9 +++++++++
 4 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f3ba9b9..6b33b9f 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -478,8 +478,13 @@ static void vfio_listener_region_add(MemoryListener *listener,
         giommu->iommu_offset = section->offset_within_address_space -
                                section->offset_within_region;
         giommu->container = container;
-        giommu->n.notify = vfio_iommu_map_notify;
-        giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
+        llend = int128_add(int128_make64(section->offset_within_region),
+                           section->size);
+        llend = int128_sub(llend, int128_one());
+        iommu_notifier_init(&giommu->n, vfio_iommu_map_notify,
+                            IOMMU_NOTIFIER_ALL,
+                            section->offset_within_region,
+                            int128_get64(llend));
         QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
 
         memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
@@ -550,7 +555,8 @@ static void vfio_listener_region_del(MemoryListener *listener,
         VFIOGuestIOMMU *giommu;
 
         QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
-            if (giommu->iommu == section->mr) {
+            if (giommu->iommu == section->mr &&
+                giommu->n.start == section->offset_within_region) {
                 memory_region_unregister_iommu_notifier(giommu->iommu,
                                                         &giommu->n);
                 QLIST_REMOVE(giommu, giommu_next);
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 613494d..185b95b 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -736,14 +736,20 @@ static void vhost_iommu_region_add(MemoryListener *listener,
     struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                          iommu_listener);
     struct vhost_iommu *iommu;
+    Int128 end;
 
     if (!memory_region_is_iommu(section->mr)) {
         return;
     }
 
     iommu = g_malloc0(sizeof(*iommu));
-    iommu->n.notify = vhost_iommu_unmap_notify;
-    iommu->n.notifier_flags = IOMMU_NOTIFIER_UNMAP;
+    end = int128_add(int128_make64(section->offset_within_region),
+                     section->size);
+    end = int128_sub(end, int128_one());
+    iommu_notifier_init(&iommu->n, vhost_iommu_unmap_notify,
+                        IOMMU_NOTIFIER_UNMAP,
+                        section->offset_within_region,
+                        int128_get64(end));
     iommu->mr = section->mr;
     iommu->iommu_offset = section->offset_within_address_space -
                           section->offset_within_region;
diff --git a/include/exec/memory.h b/include/exec/memory.h
index f20b191..0840c89 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -77,13 +77,30 @@ typedef enum {
 
 #define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
 
+struct IOMMUNotifier;
+typedef void (*IOMMUNotify)(struct IOMMUNotifier *notifier,
+                            IOMMUTLBEntry *data);
+
 struct IOMMUNotifier {
-    void (*notify)(struct IOMMUNotifier *notifier, IOMMUTLBEntry *data);
+    IOMMUNotify notify;
     IOMMUNotifierFlag notifier_flags;
+    /* Notify for address space range start <= addr <= end */
+    hwaddr start;
+    hwaddr end;
     QLIST_ENTRY(IOMMUNotifier) node;
 };
 typedef struct IOMMUNotifier IOMMUNotifier;
 
+static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
+                                       IOMMUNotifierFlag flags,
+                                       hwaddr start, hwaddr end)
+{
+    n->notify = fn;
+    n->notifier_flags = flags;
+    n->start = start;
+    n->end = end;
+}
+
 /* New-style MMIO accessors can indicate that the transaction failed.
  * A zero (MEMTX_OK) response means success; anything else is a failure
  * of some kind.  The memory subsystem will bitwise-OR together results
diff --git a/memory.c b/memory.c
index 4c95aaf..75ac595 100644
--- a/memory.c
+++ b/memory.c
@@ -1606,6 +1606,7 @@ void memory_region_register_iommu_notifier(MemoryRegion *mr,
 
     /* We need to register for at least one bitfield */
     assert(n->notifier_flags != IOMMU_NOTIFIER_NONE);
+    assert(n->start <= n->end);
     QLIST_INSERT_HEAD(&mr->iommu_notify, n, node);
     memory_region_update_iommu_notify_flags(mr);
 }
@@ -1667,6 +1668,14 @@ void memory_region_notify_iommu(MemoryRegion *mr,
     }
 
     QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
+        /*
+         * Skip the notification if the notification does not overlap
+         * with registered range.
+         */
+        if (iommu_notifier->start > entry.iova + entry.addr_mask + 1 ||
+            iommu_notifier->end < entry.iova) {
+            continue;
+        }
         if (iommu_notifier->notifier_flags & request_flags) {
             iommu_notifier->notify(iommu_notifier, &entry);
         }
-- 
2.7.4

From: Peter Xu
To: qemu-devel@nongnu.org
Date: Fri, 7 Apr 2017 18:59:08 +0800
Message-Id: <1491562755-23867-3-git-send-email-peterx@redhat.com>
In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com>
References: <1491562755-23867-1-git-send-email-peterx@redhat.com>
Subject: [Qemu-devel] [PATCH v9 2/9] memory: provide IOMMU_NOTIFIER_FOREACH macro
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum, bd.aviv@gmail.com, David Gibson

A new macro is provided to iterate over all the IOMMU notifiers hooked
under a specific IOMMU memory region.

Reviewed-by: David Gibson
Reviewed-by: Eric Auger
Reviewed-by: "Michael S. Tsirkin"
Signed-off-by: Peter Xu
---
 include/exec/memory.h | 3 +++
 memory.c              | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 0840c89..07e43da 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -239,6 +239,9 @@ struct MemoryRegion {
     IOMMUNotifierFlag iommu_notify_flags;
 };
 
+#define IOMMU_NOTIFIER_FOREACH(n, mr) \
+    QLIST_FOREACH((n), &(mr)->iommu_notify, node)
+
 /**
  * MemoryListener: callbacks structure for updates to the physical memory map
  *
diff --git a/memory.c b/memory.c
index 75ac595..7496b3d 100644
--- a/memory.c
+++ b/memory.c
@@ -1583,7 +1583,7 @@ static void memory_region_update_iommu_notify_flags(MemoryRegion *mr)
     IOMMUNotifierFlag flags = IOMMU_NOTIFIER_NONE;
     IOMMUNotifier *iommu_notifier;
 
-    QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
+    IOMMU_NOTIFIER_FOREACH(iommu_notifier, mr) {
         flags |= iommu_notifier->notifier_flags;
     }
 
@@ -1667,7 +1667,7 @@ void memory_region_notify_iommu(MemoryRegion *mr,
         request_flags = IOMMU_NOTIFIER_UNMAP;
     }
 
-    QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
+    IOMMU_NOTIFIER_FOREACH(iommu_notifier, mr) {
         /*
          * Skip the notification if the notification does not overlap
          * with registered range.
-- 
2.7.4

From: Peter Xu
To: qemu-devel@nongnu.org
Date: Fri, 7 Apr 2017 18:59:09 +0800
Message-Id: <1491562755-23867-4-git-send-email-peterx@redhat.com>
In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com>
References: <1491562755-23867-1-git-send-email-peterx@redhat.com>
Subject: [Qemu-devel] [PATCH v9 3/9] memory: provide iommu_replay_all()
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum, bd.aviv@gmail.com, David Gibson

This is a "global" version of the existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.

Reviewed-by: David Gibson
Reviewed-by: "Michael S. Tsirkin"
Signed-off-by: Peter Xu
---
 include/exec/memory.h | 8 ++++++++
 memory.c              | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 07e43da..fb7dff3 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -713,6 +713,14 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
                                 bool is_write);
 
 /**
+ * memory_region_iommu_replay_all: replay existing IOMMU translations
+ * to all the notifiers registered.
+ *
+ * @mr: the memory region to observe
+ */
+void memory_region_iommu_replay_all(MemoryRegion *mr);
+
+/**
  * memory_region_unregister_iommu_notifier: unregister a notifier for
  * changes to IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index 7496b3d..b4ed67b 100644
--- a/memory.c
+++ b/memory.c
@@ -1642,6 +1642,15 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
     }
 }
 
+void memory_region_iommu_replay_all(MemoryRegion *mr)
+{
+    IOMMUNotifier *notifier;
+
+    IOMMU_NOTIFIER_FOREACH(notifier, mr) {
+        memory_region_iommu_replay(mr, notifier, false);
+    }
+}
+
 void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
                                              IOMMUNotifier *n)
 {
-- 
2.7.4

From: Peter Xu
To: qemu-devel@nongnu.org
Date: Fri, 7 Apr 2017 18:59:10 +0800
Message-Id: <1491562755-23867-5-git-send-email-peterx@redhat.com>
In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com>
References: <1491562755-23867-1-git-send-email-peterx@redhat.com>
Subject: [Qemu-devel] [PATCH v9 4/9] memory: introduce memory_region_notify_one()
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum, bd.aviv@gmail.com, David Gibson

Generalize the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.

Reviewed-by: David Gibson
Reviewed-by: Eric Auger
Reviewed-by: "Michael S. Tsirkin"
Signed-off-by: Peter Xu
---
 include/exec/memory.h | 15 +++++++++++++++
 memory.c              | 40 ++++++++++++++++++++++++----------------
 2 files changed, 39 insertions(+), 16 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index fb7dff3..055b3a8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -688,6 +688,21 @@ void memory_region_notify_iommu(MemoryRegion *mr,
                                 IOMMUTLBEntry entry);
 
 /**
+ * memory_region_notify_one: notify a change in an IOMMU translation
+ *                           entry to a single notifier
+ *
+ * This works just like memory_region_notify_iommu(), but it only
+ * notifies a specific notifier, not all of them.
+ *
+ * @notifier: the notifier to be notified
+ * @entry: the new entry in the IOMMU translation table.  The entry
+ *         replaces all old entries for the same virtual I/O address range.
+ *         Deleted entries have .@perm == 0.
+ */
+void memory_region_notify_one(IOMMUNotifier *notifier,
+                              IOMMUTLBEntry *entry);
+
+/**
  * memory_region_register_iommu_notifier: register a notifier for changes to
  * IOMMU translation entries.
  *
diff --git a/memory.c b/memory.c
index b4ed67b..ded4bf1 100644
--- a/memory.c
+++ b/memory.c
@@ -1662,32 +1662,40 @@ void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
     memory_region_update_iommu_notify_flags(mr);
 }
 
-void memory_region_notify_iommu(MemoryRegion *mr,
-                                IOMMUTLBEntry entry)
+void memory_region_notify_one(IOMMUNotifier *notifier,
+                              IOMMUTLBEntry *entry)
 {
-    IOMMUNotifier *iommu_notifier;
     IOMMUNotifierFlag request_flags;
 
-    assert(memory_region_is_iommu(mr));
+    /*
+     * Skip the notification if the notification does not overlap
+     * with registered range.
+     */
+    if (notifier->start > entry->iova + entry->addr_mask + 1 ||
+        notifier->end < entry->iova) {
+        return;
+    }
 
-    if (entry.perm & IOMMU_RW) {
+    if (entry->perm & IOMMU_RW) {
         request_flags = IOMMU_NOTIFIER_MAP;
     } else {
         request_flags = IOMMU_NOTIFIER_UNMAP;
     }
 
+    if (notifier->notifier_flags & request_flags) {
+        notifier->notify(notifier, entry);
+    }
+}
+
+void memory_region_notify_iommu(MemoryRegion *mr,
+                                IOMMUTLBEntry entry)
+{
+    IOMMUNotifier *iommu_notifier;
+
+    assert(memory_region_is_iommu(mr));
+
     IOMMU_NOTIFIER_FOREACH(iommu_notifier, mr) {
-        /*
-         * Skip the notification if the notification does not overlap
-         * with registered range.
-         */
-        if (iommu_notifier->start > entry.iova + entry.addr_mask + 1 ||
-            iommu_notifier->end < entry.iova) {
-            continue;
-        }
-        if (iommu_notifier->notifier_flags & request_flags) {
-            iommu_notifier->notify(iommu_notifier, &entry);
-        }
+        memory_region_notify_one(iommu_notifier, &entry);
     }
 }
 
-- 
2.7.4

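The split into a per-notifier helper plus a loop over all notifiers can be illustrated outside QEMU. The following is a minimal sketch under stated assumptions, not the QEMU implementation: `Notifier`, `Entry`, `notify_one()`, `notify_all()` and `count_hit()` are hypothetical names, a plain array replaces the QLIST, and the range test uses the inclusive last address iova + addr_mask (the patch as posted compares against iova + addr_mask + 1).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and permission values, mirroring MAP/UNMAP and IOMMU_RW. */
enum { NOTIFY_MAP = 1, NOTIFY_UNMAP = 2 };
enum { PERM_NONE = 0, PERM_RW = 3 };

typedef struct Entry {
    uint64_t iova;
    uint64_t addr_mask;
    int perm;
} Entry;

typedef struct Notifier Notifier;
typedef void (*NotifyFn)(Notifier *n, const Entry *e);

struct Notifier {
    NotifyFn notify;
    int flags;           /* which event types this notifier wants */
    uint64_t start, end; /* inclusive listened range */
    int hits;            /* illustration only: number of callbacks run */
};

/* Example callback that just counts invocations. */
static void count_hit(Notifier *n, const Entry *e)
{
    (void)e;
    n->hits++;
}

/* Deliver one entry to one notifier, applying range and type filters. */
static void notify_one(Notifier *n, const Entry *e)
{
    uint64_t last = e->iova + e->addr_mask;
    int request;

    assert(n->notify != NULL);
    if (n->start > last || n->end < e->iova) {
        return; /* entry does not touch the listened range */
    }
    request = (e->perm & PERM_RW) ? NOTIFY_MAP : NOTIFY_UNMAP;
    if (n->flags & request) {
        n->notify(n, e);
    }
}

/* The "notify everyone" path is then a plain loop over the helper. */
static void notify_all(Notifier *list, size_t count, const Entry *e)
{
    for (size_t i = 0; i < count; i++) {
        notify_one(&list[i], e);
    }
}
```

Keeping the filters in the helper means a custom replay routine can call it directly for a single notifier and still get the same range and type checks as the broadcast path.
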
From: Peter Xu
To: qemu-devel@nongnu.org
Date: Fri, 7 Apr 2017 18:59:11 +0800
Message-Id: <1491562755-23867-6-git-send-email-peterx@redhat.com>
In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com>
References: <1491562755-23867-1-git-send-email-peterx@redhat.com>
Subject: [Qemu-devel] [PATCH v9 5/9] memory: add MemoryRegionIOMMUOps.replay() callback
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum, bd.aviv@gmail.com, David Gibson

Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platforms like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook to IOMMUOps for
the callback, and it'll override the default if set.

Reviewed-by: David Gibson
Reviewed-by: Eric Auger
Reviewed-by: "Michael S. Tsirkin"
Signed-off-by: Peter Xu
---
 include/exec/memory.h | 2 ++
 memory.c              | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 055b3a8..c0280b7 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -191,6 +191,8 @@ struct MemoryRegionIOMMUOps {
     void (*notify_flag_changed)(MemoryRegion *iommu,
                                 IOMMUNotifierFlag old_flags,
                                 IOMMUNotifierFlag new_flags);
+    /* Set this up to provide customized IOMMU replay function */
+    void (*replay)(MemoryRegion *iommu, IOMMUNotifier *notifier);
 };
 
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
diff --git a/memory.c b/memory.c
index ded4bf1..b782d5b 100644
--- a/memory.c
+++ b/memory.c
@@ -1626,6 +1626,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
     hwaddr addr, granularity;
     IOMMUTLBEntry iotlb;
 
+    /* If the IOMMU has its own replay callback, override */
+    if (mr->iommu_ops->replay) {
+        mr->iommu_ops->replay(mr, n);
+        return;
+    }
+
     granularity = memory_region_iommu_get_min_page_size(mr);
 
     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
-- 
2.7.4

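The "optional hook overrides the default" pattern used here can be shown in a few lines. This is a minimal standalone sketch, not QEMU code: `Region`, `RegionOps`, `region_replay()` and `custom_replay()` are hypothetical names, and the default walk only counts steps where real code would translate each page and notify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Region Region;

/* Hypothetical ops table: replay is optional and may be NULL. */
typedef struct {
    void (*replay)(Region *r);
} RegionOps;

struct Region {
    const RegionOps *ops;
    uint64_t size;
    uint64_t granularity;   /* step of the default walk, must be non-zero */
    unsigned default_steps; /* illustration only */
    unsigned custom_calls;  /* illustration only */
};

/* Use the region's own replay if it provides one, else walk page by page. */
static void region_replay(Region *r)
{
    assert(r->ops != NULL && r->granularity != 0);

    if (r->ops->replay) {
        r->ops->replay(r);
        return;
    }

    for (uint64_t addr = 0; addr < r->size; addr += r->granularity) {
        r->default_steps++; /* real code would translate and notify here */
    }
}

/* Example override that records it was called. */
static void custom_replay(Region *r)
{
    r->custom_calls++;
}
```

A region whose ops table leaves `replay` as NULL keeps the generic walk, so existing IOMMU models need no change.
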
From: Peter Xu
To: qemu-devel@nongnu.org
Date: Fri, 7 Apr 2017 18:59:12 +0800
Message-Id: <1491562755-23867-7-git-send-email-peterx@redhat.com>
In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com>
References: <1491562755-23867-1-git-send-email-peterx@redhat.com>
Subject: [Qemu-devel] [PATCH v9 6/9] intel_iommu: use the correct memory region for device IOTLB notification
Cc: tianyu.lan@intel.com, kevin.tian@intel.com, Eduardo Habkost, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum, Paolo Bonzini, Richard Henderson, bd.aviv@gmail.com, David Gibson

From: Jason Wang

We have a specific memory region for DMAR now, so it's wrong to
trigger the notifier with the root region.

Cc: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: Richard Henderson
Cc: Eduardo Habkost
Signed-off-by: Jason Wang
Reviewed-by: Peter Xu
Reviewed-by: "Michael S. Tsirkin"
Signed-off-by: Peter Xu
---
 hw/i386/intel_iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 22d8226..2412df4 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1457,7 +1457,7 @@ static bool vtd_process_device_iotlb_desc(IntelIOMMUState *s,
     entry.iova = addr;
     entry.perm = IOMMU_NONE;
     entry.translated_addr = 0;
-    memory_region_notify_iommu(entry.target_as->root, entry);
+    memory_region_notify_iommu(&vtd_dev_as->iommu, entry);
 
 done:
     return true;
-- 
2.7.4
-0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E6B67804F4; Fri, 7 Apr 2017 11:01:48 +0000 (UTC) Received: from pxdev.xzpeter.org.com (ovpn-8-43.pek2.redhat.com [10.72.8.43]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8704C9061D; Fri, 7 Apr 2017 11:01:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com E6B67804F4 Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=peterx@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com E6B67804F4 From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 7 Apr 2017 18:59:13 +0800 Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com> In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com> References: <1491562755-23867-1-git-send-email-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 07 Apr 2017 11:01:49 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v9 7/9] intel_iommu: provide its own replay() callback X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum , bd.aviv@gmail.com, David Gibson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: 
quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The default replay() doesn't work for VT-d, since VT-d has a huge default memory region covering the address range 0-(2^64-1). Walking it normally consumes so much time that it looks like a dead loop. The solution is simple - we don't walk over all the regions. Instead, we jump over a region when we find that its page directories are empty. This greatly reduces the time needed to walk the whole region. To achieve this, we provide a page walk helper that invokes the corresponding hook function when it finds a page we are interested in. vtd_page_walk_level() is the core logic for the page walking. Its interface is designed to suit further use cases, e.g., invalidating a range of addresses. Reviewed-by: Jason Wang Reviewed-by: David Gibson Reviewed-by: \"Michael S. Tsirkin\" Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 182 ++++++++++++++++++++++++++++++++++++++++++++++= ++-- hw/i386/trace-events | 7 ++ include/exec/memory.h | 2 + 3 files changed, 186 insertions(+), 5 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 2412df4..7af4e22 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -595,6 +595,22 @@ static inline uint32_t vtd_get_agaw_from_context_entry= (VTDContextEntry *ce) return 30 + (ce->hi & VTD_CONTEXT_ENTRY_AW) * 9; } =20 +static inline uint64_t vtd_iova_limit(VTDContextEntry *ce) +{ + uint32_t ce_agaw =3D vtd_get_agaw_from_context_entry(ce); + return 1ULL << MIN(ce_agaw, VTD_MGAW); +} + +/* Return true if IOVA passes range check, otherwise false. */ +static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce) +{ + /* + * Check if @iova is above 2^X-1, where X is the minimum of MGAW + * in CAP_REG and AW in context-entry.
+ */ + return !(iova & ~(vtd_iova_limit(ce) - 1)); +} + static const uint64_t vtd_paging_entry_rsvd_field[] =3D { [0] =3D ~0ULL, /* For not large page */ @@ -630,13 +646,9 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint= 64_t iova, bool is_write, uint32_t level =3D vtd_get_level_from_context_entry(ce); uint32_t offset; uint64_t slpte; - uint32_t ce_agaw =3D vtd_get_agaw_from_context_entry(ce); uint64_t access_right_check; =20 - /* Check if @iova is above 2^X-1, where X is the minimum of MGAW - * in CAP_REG and AW in context-entry. - */ - if (iova & ~((1ULL << MIN(ce_agaw, VTD_MGAW)) - 1)) { + if (!vtd_iova_range_check(iova, ce)) { VTD_DPRINTF(GENERAL, "error: iova 0x%"PRIx64 " exceeds limits", io= va); return -VTD_FR_ADDR_BEYOND_MGAW; } @@ -684,6 +696,134 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uin= t64_t iova, bool is_write, } } =20 +typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private); + +/** + * vtd_page_walk_level - walk over specific level for IOVA range + * + * @addr: base GPA addr to start the walk + * @start: IOVA range start address + * @end: IOVA range end address (start <=3D addr < end) + * @hook_fn: hook func to be called when detected page + * @private: private data to be passed into hook func + * @read: whether parent level has read permission + * @write: whether parent level has write permission + * @notify_unmap: whether we should notify invalid entries + */ +static int vtd_page_walk_level(dma_addr_t addr, uint64_t start, + uint64_t end, vtd_page_walk_hook hook_fn, + void *private, uint32_t level, + bool read, bool write, bool notify_unmap) +{ + bool read_cur, write_cur, entry_valid; + uint32_t offset; + uint64_t slpte; + uint64_t subpage_size, subpage_mask; + IOMMUTLBEntry entry; + uint64_t iova =3D start; + uint64_t iova_next; + int ret =3D 0; + + trace_vtd_page_walk_level(addr, level, start, end); + + subpage_size =3D 1ULL << vtd_slpt_level_shift(level); + subpage_mask =3D vtd_slpt_level_page_mask(level); 
+ + while (iova < end) { + iova_next =3D (iova & subpage_mask) + subpage_size; + + offset =3D vtd_iova_level_offset(iova, level); + slpte =3D vtd_get_slpte(addr, offset); + + if (slpte =3D=3D (uint64_t)-1) { + trace_vtd_page_walk_skip_read(iova, iova_next); + goto next; + } + + if (vtd_slpte_nonzero_rsvd(slpte, level)) { + trace_vtd_page_walk_skip_reserve(iova, iova_next); + goto next; + } + + /* Permissions are stacked with parents' */ + read_cur =3D read && (slpte & VTD_SL_R); + write_cur =3D write && (slpte & VTD_SL_W); + + /* + * As long as we have either read/write permission, this is a + * valid entry. The rule works for both page entries and page + * table entries. + */ + entry_valid =3D read_cur | write_cur; + + if (vtd_is_last_slpte(slpte, level)) { + entry.target_as =3D &address_space_memory; + entry.iova =3D iova & subpage_mask; + /* NOTE: this is only meaningful if entry_valid =3D=3D true */ + entry.translated_addr =3D vtd_get_slpte_addr(slpte); + entry.addr_mask =3D ~subpage_mask; + entry.perm =3D IOMMU_ACCESS_FLAG(read_cur, write_cur); + if (!entry_valid && !notify_unmap) { + trace_vtd_page_walk_skip_perm(iova, iova_next); + goto next; + } + trace_vtd_page_walk_one(level, entry.iova, entry.translated_ad= dr, + entry.addr_mask, entry.perm); + if (hook_fn) { + ret =3D hook_fn(&entry, private); + if (ret < 0) { + return ret; + } + } + } else { + if (!entry_valid) { + trace_vtd_page_walk_skip_perm(iova, iova_next); + goto next; + } + ret =3D vtd_page_walk_level(vtd_get_slpte_addr(slpte), iova, + MIN(iova_next, end), hook_fn, privat= e, + level - 1, read_cur, write_cur, + notify_unmap); + if (ret < 0) { + return ret; + } + } + +next: + iova =3D iova_next; + } + + return 0; +} + +/** + * vtd_page_walk - walk specific IOVA range, and call the hook + * + * @ce: context entry to walk upon + * @start: IOVA address to start the walk + * @end: IOVA range end address (start <=3D addr < end) + * @hook_fn: the hook that to be called for each detected area + * 
@private: private data for the hook function + */ +static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end, + vtd_page_walk_hook hook_fn, void *private) +{ + dma_addr_t addr =3D vtd_get_slpt_base_from_context(ce); + uint32_t level =3D vtd_get_level_from_context_entry(ce); + + if (!vtd_iova_range_check(start, ce)) { + return -VTD_FR_ADDR_BEYOND_MGAW; + } + + if (!vtd_iova_range_check(end, ce)) { + /* Fix end so that it reaches the maximum */ + end =3D vtd_iova_limit(ce); + } + + return vtd_page_walk_level(addr, start, end, hook_fn, private, + level, true, true, false); +} + /* Map a device to its corresponding domain (context-entry) */ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num, uint8_t devfn, VTDContextEntry *ce) @@ -2402,6 +2542,37 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s,= PCIBus *bus, int devfn) return vtd_dev_as; } =20 +static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private) +{ + memory_region_notify_one((IOMMUNotifier *)private, entry); + return 0; +} + +static void vtd_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n) +{ + VTDAddressSpace *vtd_as =3D container_of(mr, VTDAddressSpace, iommu); + IntelIOMMUState *s =3D vtd_as->iommu_state; + uint8_t bus_n =3D pci_bus_num(vtd_as->bus); + VTDContextEntry ce; + + if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) =3D=3D 0) { + /* + * Scanned a valid context entry, walk over the pages and + * notify when needed. + */ + trace_vtd_replay_ce_valid(bus_n, PCI_SLOT(vtd_as->devfn), + PCI_FUNC(vtd_as->devfn), + VTD_CONTEXT_ENTRY_DID(ce.hi), + ce.hi, ce.lo); + vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n); + } else { + trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), + PCI_FUNC(vtd_as->devfn)); + } + + return; +} + /* Do the initialization. It will also be called when reset, so pay * attention when adding new initialization stuff. 
*/ @@ -2416,6 +2587,7 @@ static void vtd_init(IntelIOMMUState *s) =20 s->iommu_ops.translate =3D vtd_iommu_translate; s->iommu_ops.notify_flag_changed =3D vtd_iommu_notify_flag_changed; + s->iommu_ops.replay =3D vtd_iommu_replay; s->root =3D 0; s->root_extended =3D false; s->dmar_enabled =3D false; diff --git a/hw/i386/trace-events b/hw/i386/trace-events index baed874..f725bca 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -30,6 +30,13 @@ vtd_iotlb_cc_hit(uint8_t bus, uint8_t devfn, uint64_t hi= gh, uint64_t low, uint32 vtd_iotlb_cc_update(uint8_t bus, uint8_t devfn, uint64_t high, uint64_t lo= w, uint32_t gen1, uint32_t gen2) "IOTLB context update bus 0x%"PRIx8" devfn= 0x%"PRIx8" high 0x%"PRIx64" low 0x%"PRIx64" gen %"PRIu32" -> gen %"PRIu32 vtd_iotlb_reset(const char *reason) "IOTLB reset (reason: %s)" vtd_fault_disabled(void) "Fault processing disabled for context entry" +vtd_replay_ce_valid(uint8_t bus, uint8_t dev, uint8_t fn, uint16_t domain,= uint64_t hi, uint64_t lo) "replay valid context device %02"PRIx8":%02"PRIx= 8".%02"PRIx8" domain 0x%"PRIx16" hi 0x%"PRIx64" lo 0x%"PRIx64 +vtd_replay_ce_invalid(uint8_t bus, uint8_t dev, uint8_t fn) "replay invali= d context device %02"PRIx8":%02"PRIx8".%02"PRIx8 +vtd_page_walk_level(uint64_t addr, uint32_t level, uint64_t start, uint64_= t end) "walk (base=3D0x%"PRIx64", level=3D%"PRIu32") iova range 0x%"PRIx64"= - 0x%"PRIx64 +vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t gpa, uint64_t ma= sk, int perm) "detected page level 0x%"PRIx32" iova 0x%"PRIx64" -> gpa 0x%"= PRIx64" mask 0x%"PRIx64" perm %d" +vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" +vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" +vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" =20 # hw/i386/amd_iommu.c 
amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at = addr 0x%"PRIx64" + offset 0x%"PRIx32 diff --git a/include/exec/memory.h b/include/exec/memory.h index c0280b7..c4fc94d 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -55,6 +55,8 @@ typedef enum { IOMMU_RW =3D 3, } IOMMUAccessFlags; =20 +#define IOMMU_ACCESS_FLAG(r, w) (((r) ? IOMMU_RO : 0) | ((w) ? IOMMU_WO : = 0)) + struct IOMMUTLBEntry { AddressSpace *target_as; hwaddr iova; --=20 2.7.4 From nobody Sun Apr 28 09:42:58 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1491563343945936.884258575086; Fri, 7 Apr 2017 04:09:03 -0700 (PDT) Received: from localhost ([::1]:50126 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cwRlG-0006AV-Kr for importer@patchew.org; Fri, 07 Apr 2017 07:09:02 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57397) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cwReS-0001Rt-R6 for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:02:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cwReR-0005wQ-BA for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:02:00 -0400 Received: from mx1.redhat.com ([209.132.183.28]:34466) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cwReR-0005vf-2j for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:01:59 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com 
[10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 06A167709; Fri, 7 Apr 2017 11:01:58 +0000 (UTC) Received: from pxdev.xzpeter.org.com (ovpn-8-43.pek2.redhat.com [10.72.8.43]) by smtp.corp.redhat.com (Postfix) with ESMTP id E881D8E74C; Fri, 7 Apr 2017 11:01:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 06A167709 Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx09.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=peterx@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 06A167709 From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 7 Apr 2017 18:59:14 +0800 Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com> In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com> References: <1491562755-23867-1-git-send-email-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Fri, 07 Apr 2017 11:01:58 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v9 8/9] intel_iommu: allow dynamic switch of IOMMU region X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum , bd.aviv@gmail.com, David Gibson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" This is 
preparation work to finally enable dynamic switching ON/OFF of VT-d protection. The old VT-d code uses a static IOMMU address space, and that won't satisfy vfio-pci device listeners. Let me explain. vfio-pci devices depend on the memory region listener and the IOMMU replay mechanism to make sure the device mapping is coherent with the guest even if there are domain switches. There are two kinds of domain switches: (1) switch from domain A -> B (2) switch from domain A -> no domain (e.g., turn DMAR off) Case (1) is handled during context entry invalidation by the VT-d replay logic. What the replay function should do here is replay the existing page mappings in domain B. However, for case (2), we don't want to replay any domain mappings - we just need the default GPA->HPA mappings (the address_space_memory mapping). This patch handles case (2) by building up the mapping automatically, leveraging the vfio-pci memory listeners. Another important thing this patch does is separate IR (Interrupt Remapping) from DMAR (DMA Remapping). The IR region should not depend on the DMAR region (as it did before this patch). It should be a standalone region that can be activated without DMAR (which is common behavior for the Linux kernel - by default it enables IR while leaving DMAR disabled). Reviewed-by: Jason Wang Reviewed-by: David Gibson Reviewed-by: \"Michael S.
Tsirkin\" Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 81 +++++++++++++++++++++++++++++++++++++++= +--- hw/i386/trace-events | 2 +- include/hw/i386/intel_iommu.h | 2 ++ 3 files changed, 79 insertions(+), 6 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 7af4e22..f7dec82 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -1291,9 +1291,49 @@ static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s) vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS); } =20 +static void vtd_switch_address_space(VTDAddressSpace *as) +{ + assert(as); + + trace_vtd_switch_address_space(pci_bus_num(as->bus), + VTD_PCI_SLOT(as->devfn), + VTD_PCI_FUNC(as->devfn), + as->iommu_state->dmar_enabled); + + /* Turn off first then on the other */ + if (as->iommu_state->dmar_enabled) { + memory_region_set_enabled(&as->sys_alias, false); + memory_region_set_enabled(&as->iommu, true); + } else { + memory_region_set_enabled(&as->iommu, false); + memory_region_set_enabled(&as->sys_alias, true); + } +} + +static void vtd_switch_address_space_all(IntelIOMMUState *s) +{ + GHashTableIter iter; + VTDBus *vtd_bus; + int i; + + g_hash_table_iter_init(&iter, s->vtd_as_by_busptr); + while (g_hash_table_iter_next(&iter, NULL, (void **)&vtd_bus)) { + for (i =3D 0; i < X86_IOMMU_PCI_DEVFN_MAX; i++) { + if (!vtd_bus->dev_as[i]) { + continue; + } + vtd_switch_address_space(vtd_bus->dev_as[i]); + } + } +} + /* Handle Translation Enable/Disable */ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en) { + if (s->dmar_enabled =3D=3D en) { + return; + } + VTD_DPRINTF(CSR, "Translation Enable %s", (en ? 
"on" : "off")); =20 if (en) { @@ -1308,6 +1348,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bo= ol en) /* Ok - report back to driver */ vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0); } + + vtd_switch_address_space_all(s); } =20 /* Handle Interrupt Remap Enable/Disable */ @@ -2529,15 +2571,44 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s= , PCIBus *bus, int devfn) vtd_dev_as->devfn =3D (uint8_t)devfn; vtd_dev_as->iommu_state =3D s; vtd_dev_as->context_cache_entry.context_cache_gen =3D 0; + + /* + * Memory region relationships looks like (Address range shows + * only lower 32 bits to make it short in length...): + * + * |-----------------+-------------------+----------| + * | Name | Address range | Priority | + * |-----------------+-------------------+----------+ + * | vtd_root | 00000000-ffffffff | 0 | + * | intel_iommu | 00000000-ffffffff | 1 | + * | vtd_sys_alias | 00000000-ffffffff | 1 | + * | intel_iommu_ir | fee00000-feefffff | 64 | + * |-----------------+-------------------+----------| + * + * We enable/disable DMAR by switching enablement for + * vtd_sys_alias and intel_iommu regions. IR region is always + * enabled. 
+ */ memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s), - &s->iommu_ops, "intel_iommu", UINT64_MAX); + &s->iommu_ops, "intel_iommu_dmar", + UINT64_MAX); + memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s), + "vtd_sys_alias", get_system_memory(), + 0, memory_region_size(get_system_memory()= )); memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s), &vtd_mem_ir_ops, s, "intel_iommu_ir", VTD_INTERRUPT_ADDR_SIZE); - memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR= _FIRST, - &vtd_dev_as->iommu_ir); - address_space_init(&vtd_dev_as->as, - &vtd_dev_as->iommu, name); + memory_region_init(&vtd_dev_as->root, OBJECT(s), + "vtd_root", UINT64_MAX); + memory_region_add_subregion_overlap(&vtd_dev_as->root, + VTD_INTERRUPT_ADDR_FIRST, + &vtd_dev_as->iommu_ir, 64); + address_space_init(&vtd_dev_as->as, &vtd_dev_as->root, name); + memory_region_add_subregion_overlap(&vtd_dev_as->root, 0, + &vtd_dev_as->sys_alias, 1); + memory_region_add_subregion_overlap(&vtd_dev_as->root, 0, + &vtd_dev_as->iommu, 1); + vtd_switch_address_space(vtd_dev_as); } return vtd_dev_as; } diff --git a/hw/i386/trace-events b/hw/i386/trace-events index f725bca..3c3a167 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -4,7 +4,6 @@ x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify I= EC invalidation: global=3D%d index=3D%" PRIu32 " mask=3D%" PRIu32 =20 # hw/i386/intel_iommu.c -vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "= Device %02x:%02x.%x switching address space (iommu enabled=3D%d)" vtd_inv_desc(const char *type, uint64_t hi, uint64_t lo) "invalidate desc = type %s high 0x%"PRIx64" low 0x%"PRIx64 vtd_inv_desc_invalid(uint64_t hi, uint64_t lo) "invalid inv desc hi 0x%"PR= Ix64" lo 0x%"PRIx64 vtd_inv_desc_cc_domain(uint16_t domain) "context invalidate domain 0x%"PRI= x16 @@ -37,6 +36,7 @@ vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t= gpa, uint64_t mask, in vtd_page_walk_skip_read(uint64_t iova, 
uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to unable to read" vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" +vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "= Device %02x:%02x.%x switching address space (iommu enabled=3D%d)" =20 # hw/i386/amd_iommu.c amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at = addr 0x%"PRIx64" + offset 0x%"PRIx32 diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index fe645aa..8f212a1 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -83,6 +83,8 @@ struct VTDAddressSpace { uint8_t devfn; AddressSpace as; MemoryRegion iommu; + MemoryRegion root; + MemoryRegion sys_alias; MemoryRegion iommu_ir; /* Interrupt region: 0xfeeXXXXX */ IntelIOMMUState *iommu_state; VTDContextCacheEntry context_cache_entry; --=20 2.7.4 From nobody Sun Apr 28 09:42:58 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 14915631452061006.9889587161607; Fri, 7 Apr 2017 04:05:45 -0700 (PDT) Received: from localhost ([::1]:50118 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cwRi4-0004Cn-2G for importer@patchew.org; Fri, 07 Apr 2017 07:05:44 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:57459) by lists.gnu.org with esmtp 
(Exim 4.71) (envelope-from ) id 1cwRen-0001qn-Lp for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:02:23 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cwRej-00061f-Lg for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:02:21 -0400 Received: from mx1.redhat.com ([209.132.183.28]:41486) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cwRej-00061O-C7 for qemu-devel@nongnu.org; Fri, 07 Apr 2017 07:02:17 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 291CF7A49B; Fri, 7 Apr 2017 11:02:16 +0000 (UTC) Received: from pxdev.xzpeter.org.com (ovpn-8-43.pek2.redhat.com [10.72.8.43]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1325190615; Fri, 7 Apr 2017 11:01:59 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 291CF7A49B Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=peterx@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com 291CF7A49B From: Peter Xu To: qemu-devel@nongnu.org Date: Fri, 7 Apr 2017 18:59:15 +0800 Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com> In-Reply-To: <1491562755-23867-1-git-send-email-peterx@redhat.com> References: <1491562755-23867-1-git-send-email-peterx@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Fri, 07 Apr 2017 11:02:16 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v9 9/9] intel_iommu: enable remote IOTLB X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com, jasowang@redhat.com, peterx@redhat.com, alex.williamson@redhat.com, Marcel Apfelbaum , Aviv Ben-David , bd.aviv@gmail.com, David Gibson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" This patch is based on Aviv Ben-David ()'s patch upstream: "IOMMU: enable intel_iommu map and unmap notifiers" https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html However, I removed/fixed some content and added my own code. Instead of calling translate() on every page for IOTLB invalidations (which is slower), we walk the pages when needed and notify in a hook function. This patch enables vfio devices for VT-d emulation. And, since we already have vhost DMAR support via device-iotlb, a natural benefit of this patch is that VT-d enabled vhost can now work even without ATS capability, though more tests are needed. Signed-off-by: Aviv Ben-David Reviewed-by: Jason Wang Reviewed-by: David Gibson Reviewed-by: \"Michael S.
Tsirkin\" Signed-off-by: Peter Xu --- hw/i386/intel_iommu.c | 191 +++++++++++++++++++++++++++++++++++++= +--- hw/i386/intel_iommu_internal.h | 1 + hw/i386/trace-events | 1 + include/hw/i386/intel_iommu.h | 8 ++ 4 files changed, 188 insertions(+), 13 deletions(-) diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index f7dec82..02f047c 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -806,7 +806,8 @@ next: * @private: private data for the hook function */ static int vtd_page_walk(VTDContextEntry *ce, uint64_t start, uint64_t end, - vtd_page_walk_hook hook_fn, void *private) + vtd_page_walk_hook hook_fn, void *private, + bool notify_unmap) { dma_addr_t addr =3D vtd_get_slpt_base_from_context(ce); uint32_t level =3D vtd_get_level_from_context_entry(ce); @@ -821,7 +822,7 @@ static int vtd_page_walk(VTDContextEntry *ce, uint64_t = start, uint64_t end, } =20 return vtd_page_walk_level(addr, start, end, hook_fn, private, - level, true, true, false); + level, true, true, notify_unmap); } =20 /* Map a device to its corresponding domain (context-entry) */ @@ -1038,6 +1039,15 @@ static void vtd_interrupt_remap_table_setup(IntelIOM= MUState *s) s->intr_root, s->intr_size); } =20 +static void vtd_iommu_replay_all(IntelIOMMUState *s) +{ + IntelIOMMUNotifierNode *node; + + QLIST_FOREACH(node, &s->notifiers_list, next) { + memory_region_iommu_replay_all(&node->vtd_as->iommu); + } +} + static void vtd_context_global_invalidate(IntelIOMMUState *s) { trace_vtd_inv_desc_cc_global(); @@ -1045,6 +1055,14 @@ static void vtd_context_global_invalidate(IntelIOMMU= State *s) if (s->context_cache_gen =3D=3D VTD_CONTEXT_CACHE_GEN_MAX) { vtd_reset_context_cache(s); } + /* + * From VT-d spec 6.5.2.1, a global context entry invalidation + * should be followed by a IOTLB global invalidation, so we should + * be safe even without this. Hoewever, let's replay the region as + * well to be safer, and go back here when we need finer tunes for + * VT-d emulation codes. 
+ */ + vtd_iommu_replay_all(s); } =20 =20 @@ -1111,6 +1129,16 @@ static void vtd_context_device_invalidate(IntelIOMMU= State *s, trace_vtd_inv_desc_cc_device(bus_n, VTD_PCI_SLOT(devfn_it), VTD_PCI_FUNC(devfn_it)); vtd_as->context_cache_entry.context_cache_gen =3D 0; + /* + * So a device is moving out of (or moving into) a + * domain, a replay() suits here to notify all the + * IOMMU_NOTIFIER_MAP registrants about this change. + * This does no harm even if we have no such + * notifier registered - the IOMMU notification + * framework will skip MAP notifications in that + * case. + */ + memory_region_iommu_replay_all(&vtd_as->iommu); } } } @@ -1152,12 +1180,53 @@ static void vtd_iotlb_global_invalidate(IntelIOMMUS= tate *s) { trace_vtd_iotlb_reset("global invalidation recved"); vtd_reset_iotlb(s); + vtd_iommu_replay_all(s); } =20 static void vtd_iotlb_domain_invalidate(IntelIOMMUState *s, uint16_t domai= n_id) { + IntelIOMMUNotifierNode *node; + VTDContextEntry ce; + VTDAddressSpace *vtd_as; + g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_domain, &domain_id); + + QLIST_FOREACH(node, &s->notifiers_list, next) { + vtd_as =3D node->vtd_as; + if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), + vtd_as->devfn, &ce) && + domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { + memory_region_iommu_replay_all(&vtd_as->iommu); + } + } +} + +static int vtd_page_invalidate_notify_hook(IOMMUTLBEntry *entry, + void *private) +{ + memory_region_notify_iommu((MemoryRegion *)private, *entry); + return 0; +} + +static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s, + uint16_t domain_id, hwaddr addr, + uint8_t am) +{ + IntelIOMMUNotifierNode *node; + VTDContextEntry ce; + int ret; + + QLIST_FOREACH(node, &(s->notifiers_list), next) { + VTDAddressSpace *vtd_as =3D node->vtd_as; + ret =3D vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus), + vtd_as->devfn, &ce); + if (!ret && domain_id =3D=3D VTD_CONTEXT_ENTRY_DID(ce.hi)) { + vtd_page_walk(&ce, addr, addr 
+ (1 << am) * VTD_PAGE_SIZE, + vtd_page_invalidate_notify_hook, + (void *)&vtd_as->iommu, true); + } + } } =20 static void vtd_iotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_= id, @@ -1170,6 +1239,7 @@ static void vtd_iotlb_page_invalidate(IntelIOMMUState= *s, uint16_t domain_id, info.addr =3D addr; info.mask =3D ~((1 << am) - 1); g_hash_table_foreach_remove(s->iotlb, vtd_hash_remove_by_page, &info); + vtd_iotlb_page_invalidate_notify(s, domain_id, addr, am); } =20 /* Flush IOTLB @@ -2187,15 +2257,33 @@ static void vtd_iommu_notify_flag_changed(MemoryReg= ion *iommu, IOMMUNotifierFlag new) { VTDAddressSpace *vtd_as =3D container_of(iommu, VTDAddressSpace, iommu= ); + IntelIOMMUState *s =3D vtd_as->iommu_state; + IntelIOMMUNotifierNode *node =3D NULL; + IntelIOMMUNotifierNode *next_node =3D NULL; =20 - if (new & IOMMU_NOTIFIER_MAP) { - error_report("Device at bus %s addr %02x.%d requires iommu " - "notifier which is currently not supported by " - "intel-iommu emulation", - vtd_as->bus->qbus.name, PCI_SLOT(vtd_as->devfn), - PCI_FUNC(vtd_as->devfn)); + if (!s->caching_mode && new & IOMMU_NOTIFIER_MAP) { + error_report("We need to set caching-mode=3D1 for intel-iommu to ena= ble " + "device assignment with IOMMU protection."); exit(1); } + + if (old =3D=3D IOMMU_NOTIFIER_NONE) { + node =3D g_malloc0(sizeof(*node)); + node->vtd_as =3D vtd_as; + QLIST_INSERT_HEAD(&s->notifiers_list, node, next); + return; + } + + /* update notifier node with new flags */ + QLIST_FOREACH_SAFE(node, &s->notifiers_list, next, next_node) { + if (node->vtd_as =3D=3D vtd_as) { + if (new =3D=3D IOMMU_NOTIFIER_NONE) { + QLIST_REMOVE(node, next); + g_free(node); + } + return; + } + } } =20 static const VMStateDescription vtd_vmstate =3D { @@ -2613,6 +2701,74 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s,= PCIBus *bus, int devfn) return vtd_dev_as; } =20 +/* Unmap the whole range in the notifier's scope. 
*/ +static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n) +{ + IOMMUTLBEntry entry; + hwaddr size; + hwaddr start =3D n->start; + hwaddr end =3D n->end; + + /* + * Note: all the code in this function assumes that IOVA + * bits are no more than VTD_MGAW bits (which is restricted by + * VT-d spec), otherwise we need to consider overflow of 64 bits. + */ + + if (end > VTD_ADDRESS_SIZE) { + /* + * No need to unmap regions beyond the whole + * VT-d supported address space size + */ + end =3D VTD_ADDRESS_SIZE; + } + + assert(start <=3D end); + size =3D end - start; + + if (ctpop64(size) !=3D 1) { + /* + * This size cannot form a valid mask. Let's enlarge it to + * fit the minimum available mask. + */ + int n =3D 64 - clz64(size); + if (n > VTD_MGAW) { + /* should not happen, but in case it happens, limit it */ + n =3D VTD_MGAW; + } + size =3D 1ULL << n; + } + + entry.target_as =3D &address_space_memory; + /* Adjust iova for the size */ + entry.iova =3D n->start & ~(size - 1); + /* This field is meaningless for unmap */ + entry.translated_addr =3D 0; + entry.perm =3D IOMMU_NONE; + entry.addr_mask =3D size - 1; + + trace_vtd_as_unmap_whole(pci_bus_num(as->bus), + VTD_PCI_SLOT(as->devfn), + VTD_PCI_FUNC(as->devfn), + entry.iova, size); + + memory_region_notify_one(n, &entry); +} + +static void vtd_address_space_unmap_all(IntelIOMMUState *s) +{ + IntelIOMMUNotifierNode *node; + VTDAddressSpace *vtd_as; + IOMMUNotifier *n; + + QLIST_FOREACH(node, &s->notifiers_list, next) { + vtd_as =3D node->vtd_as; + IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) { + vtd_address_space_unmap(vtd_as, n); + } + } +} + static int vtd_replay_hook(IOMMUTLBEntry *entry, void *private) { memory_region_notify_one((IOMMUNotifier *)private, entry); @@ -2626,16 +2782,19 @@ static void vtd_iommu_replay(MemoryRegion *mr, IOMM= UNotifier *n) uint8_t bus_n =3D pci_bus_num(vtd_as->bus); VTDContextEntry ce; =20 + /* + * The replay can be triggered by either an 
invalidation or a newly + * created entry. No matter what, we release existing mappings + * (this means flushing caches for UNMAP-only registrants). + */ + vtd_address_space_unmap(vtd_as, n); + if (vtd_dev_to_context_entry(s, bus_n, vtd_as->devfn, &ce) =3D=3D 0) { - /* - * Scanned a valid context entry, walk over the pages and - * notify when needed. - */ trace_vtd_replay_ce_valid(bus_n, PCI_SLOT(vtd_as->devfn), PCI_FUNC(vtd_as->devfn), VTD_CONTEXT_ENTRY_DID(ce.hi), ce.hi, ce.lo); - vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n); + vtd_page_walk(&ce, 0, ~0ULL, vtd_replay_hook, (void *)n, false); } else { trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn), PCI_FUNC(vtd_as->devfn)); @@ -2754,6 +2913,11 @@ static void vtd_reset(DeviceState *dev) =20 VTD_DPRINTF(GENERAL, ""); vtd_init(s); + + /* + * On device reset, throw away all mappings and external caches + */ + vtd_address_space_unmap_all(s); } =20 static AddressSpace *vtd_host_dma_iommu(PCIBus *bus, void *opaque, int dev= fn) @@ -2817,6 +2981,7 @@ static void vtd_realize(DeviceState *dev, Error **err= p) return; } =20 + QLIST_INIT(&s->notifiers_list); memset(s->vtd_as_by_bus_num, 0, sizeof(s->vtd_as_by_bus_num)); memory_region_init_io(&s->csrmem, OBJECT(s), &vtd_mem_ops, s, "intel_iommu", DMAR_REG_SIZE); diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h index 4104121..29d6707 100644 --- a/hw/i386/intel_iommu_internal.h +++ b/hw/i386/intel_iommu_internal.h @@ -197,6 +197,7 @@ #define VTD_DOMAIN_ID_MASK ((1UL << VTD_DOMAIN_ID_SHIFT) - 1) #define VTD_CAP_ND (((VTD_DOMAIN_ID_SHIFT - 4) / 2) & 7UL= L) #define VTD_MGAW 39 /* Maximum Guest Address Width */ +#define VTD_ADDRESS_SIZE (1ULL << VTD_MGAW) #define VTD_CAP_MGAW (((VTD_MGAW - 1) & 0x3fULL) << 16) #define VTD_MAMV 18ULL #define VTD_CAP_MAMV (VTD_MAMV << 48) diff --git a/hw/i386/trace-events b/hw/i386/trace-events index 3c3a167..04a6980 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -37,6 +37,7 @@ 
vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Pa= ge walk skip iova 0x%"P vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova= 0x%"PRIx64" - 0x%"PRIx64" due to perm empty" vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip i= ova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set" vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "= Device %02x:%02x.%x switching address space (iommu enabled=3D%d)" +vtd_as_unmap_whole(uint8_t bus, uint8_t slot, uint8_t fn, uint64_t iova, u= int64_t size) "Device %02x:%02x.%x start 0x%"PRIx64" size 0x%"PRIx64 =20 # hw/i386/amd_iommu.c amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at = addr 0x%"PRIx64" + offset 0x%"PRIx32 diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 8f212a1..3e51876 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -63,6 +63,7 @@ typedef union VTD_IR_TableEntry VTD_IR_TableEntry; typedef union VTD_IR_MSIAddress VTD_IR_MSIAddress; typedef struct VTDIrq VTDIrq; typedef struct VTD_MSIMessage VTD_MSIMessage; +typedef struct IntelIOMMUNotifierNode IntelIOMMUNotifierNode; =20 /* Context-Entry */ struct VTDContextEntry { @@ -249,6 +250,11 @@ struct VTD_MSIMessage { /* When IR is enabled, all MSI/MSI-X data bits should be zero */ #define VTD_IR_MSI_DATA (0) =20 +struct IntelIOMMUNotifierNode { + VTDAddressSpace *vtd_as; + QLIST_ENTRY(IntelIOMMUNotifierNode) next; +}; + /* The iommu (DMAR) device state struct */ struct IntelIOMMUState { X86IOMMUState x86_iommu; @@ -286,6 +292,8 @@ struct IntelIOMMUState { MemoryRegionIOMMUOps iommu_ops; GHashTable *vtd_as_by_busptr; /* VTDBus objects indexed by PCIBus* r= eference */ VTDBus *vtd_as_by_bus_num[VTD_PCI_BUS_MAX]; /* VTDBus objects indexed = by bus number */ + /* list of registered notifiers */ + QLIST_HEAD(, IntelIOMMUNotifierNode) notifiers_list; =20 /* interrupt remapping */ bool intr_enabled; /* Whether guest enabled 
IR */ --=20 2.7.4
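
Note for readers: the range rounding done in vtd_address_space_unmap() in the patch above can be tried in isolation. The following is a minimal standalone sketch, not the QEMU code itself. It assumes the GCC/Clang builtins __builtin_popcountll() and __builtin_clzll() in place of QEMU's ctpop64() and clz64(), and the names MGAW, ADDRESS_SIZE, struct unmap_range and round_unmap_range() are hypothetical, introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MGAW 39                       /* mirrors VTD_MGAW in the patch */
#define ADDRESS_SIZE (1ULL << MGAW)   /* mirrors VTD_ADDRESS_SIZE */

/* Hypothetical result type: the iova and addr_mask of the UNMAP entry. */
struct unmap_range {
    uint64_t iova;
    uint64_t addr_mask;
};

/* Hypothetical helper that reproduces only the size/mask arithmetic. */
static struct unmap_range round_unmap_range(uint64_t start, uint64_t end)
{
    struct unmap_range r;
    uint64_t size;

    /* Clamp the end to the supported address space size. */
    if (end > ADDRESS_SIZE) {
        end = ADDRESS_SIZE;
    }
    assert(start <= end);
    size = end - start;

    /* A mask needs a power-of-two size; round up when it is not one. */
    if (__builtin_popcountll(size) != 1) {
        /*
         * __builtin_clzll(0) is undefined, so size == 0 is handled
         * explicitly; QEMU's clz64(0) returns 64, which gives 0 bits.
         */
        int bits = size ? 64 - __builtin_clzll(size) : 0;
        if (bits > MGAW) {
            bits = MGAW;
        }
        size = 1ULL << bits;
    }

    /* Align the start address down to the (possibly enlarged) size. */
    r.iova = start & ~(size - 1);
    r.addr_mask = size - 1;
    return r;
}
```

With these assumptions, round_unmap_range(0x5000, 0x8000) rounds the 0x3000-byte size up to 0x4000 and returns iova 0x4000 with addr_mask 0x3fff, while round_unmap_range(0, UINT64_MAX) is clamped to the 39-bit address space and returns iova 0 with addr_mask (1ULL << 39) - 1.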