From: Zhenzhong Duan
To: qemu-devel@nongnu.org
Cc: alex@shazbot.org, clg@redhat.com, eric.auger@redhat.com, mst@redhat.com,
 jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com, jgg@nvidia.com,
 nicolinc@nvidia.com, skolothumtho@nvidia.com, joao.m.martins@oracle.com,
 clement.mathieu--drif@eviden.com, kevin.tian@intel.com, yi.l.liu@intel.com,
 chao.p.peng@intel.com, Zhenzhong Duan
Subject: [PATCH v8 21/23] Workaround for ERRATA_772415_SPR17
Date: Mon, 17 Nov 2025 04:37:24 -0500
Message-ID: <20251117093729.1121324-22-zhenzhong.duan@intel.com>
In-Reply-To: <20251117093729.1121324-1-zhenzhong.duan@intel.com>
References: <20251117093729.1121324-1-zhenzhong.duan@intel.com>

On a system affected by ERRATA_772415, IOMMU_DEVICE_GET_HW_INFO reports
IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17. Due to this erratum, even a
read-only range mapped in the second-stage page table can still be
written.

Reference: 4th Gen Intel Xeon Processor Scalable Family Specification
Update, Errata Details, SPR17:
https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/eagle-stream/sapphire-rapids-specification-update/

The SPR17 details, copied from the link above:

"Problem: When remapping hardware is configured by system software in
scalable mode as Nested (PGTT=011b) and with PWSNP field Set in the
PASID-table-entry, it may Set Accessed bit and Dirty bit (and Extended
Access bit if enabled) in first-stage page-table entries even when
second-stage mappings indicate that corresponding first-stage page-table
is Read-Only.

Implication: Due to this erratum, pages mapped as Read-only in second-stage
page-tables may be modified by remapping hardware Access/Dirty bit updates.

Workaround: None identified. System software enabling nested translations
for a VM should ensure that there are no read-only pages in the
corresponding second-stage mappings."

Introduce a helper, vfio_device_get_host_iommu_quirk_bypass_ro(), to check
whether read-only mappings should be bypassed, and pass the result to
vfio_listener_valid_section() via bcontainer->bypass_ro.

Signed-off-by: Zhenzhong Duan
---
 include/hw/vfio/vfio-container.h |  1 +
 include/hw/vfio/vfio-device.h    |  3 +++
 hw/vfio/device.c                 | 14 ++++++++++++++
 hw/vfio/iommufd.c                |  9 ++++++++-
 hw/vfio/listener.c               |  6 ++++--
 5 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/include/hw/vfio/vfio-container.h b/include/hw/vfio/vfio-container.h
index 9f6e8cedfc..a7d5c5ed67 100644
--- a/include/hw/vfio/vfio-container.h
+++ b/include/hw/vfio/vfio-container.h
@@ -52,6 +52,7 @@ struct VFIOContainer {
     QLIST_HEAD(, VFIODevice) device_list;
     GList *iova_ranges;
     NotifierWithReturn cpr_reboot_notifier;
+    bool bypass_ro;
 };
 
 #define TYPE_VFIO_IOMMU "vfio-iommu"
diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
index 48d00c7bc4..f6f3d0e378 100644
--- a/include/hw/vfio/vfio-device.h
+++ b/include/hw/vfio/vfio-device.h
@@ -268,6 +268,9 @@ void vfio_device_prepare(VFIODevice *vbasedev, VFIOContainer *bcontainer,
 void vfio_device_unprepare(VFIODevice *vbasedev);
 
 bool vfio_device_get_viommu_flags_want_nesting(VFIODevice *vbasedev);
+bool vfio_device_get_host_iommu_quirk_bypass_ro(VFIODevice *vbasedev,
+                                                uint32_t type, void *caps,
+                                                uint32_t size);
 
 int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
                                 struct vfio_region_info **info);
diff --git a/hw/vfio/device.c b/hw/vfio/device.c
index 71eb069eb6..290011e154 100644
--- a/hw/vfio/device.c
+++ b/hw/vfio/device.c
@@ -533,6 +533,20 @@ bool vfio_device_get_viommu_flags_want_nesting(VFIODevice *vbasedev)
     return false;
 }
 
+bool vfio_device_get_host_iommu_quirk_bypass_ro(VFIODevice *vbasedev,
+                                                uint32_t type, void *caps,
+                                                uint32_t size)
+{
+    VFIOPCIDevice *vdev = vfio_pci_from_vfio_device(vbasedev);
+
+    if (vdev) {
+        return !!(pci_device_get_host_iommu_quirks(PCI_DEVICE(vdev), type,
+                                                   caps, size) &
+                  HOST_IOMMU_QUIRK_NESTING_PARENT_BYPASS_RO);
+    }
+    return false;
+}
+
 /*
  * Traditional ioctl() based io
  */
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 63f8442865..2a7b0d0c07 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -351,6 +351,7 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
     VFIOContainer *bcontainer = VFIO_IOMMU(container);
     uint32_t type, flags = 0;
     uint64_t hw_caps;
+    VendorCaps caps;
     VFIOIOASHwpt *hwpt;
     uint32_t hwpt_id;
     int ret;
@@ -396,7 +397,8 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
      * instead.
      */
     if (!iommufd_backend_get_device_info(vbasedev->iommufd, vbasedev->devid,
-                                         &type, NULL, 0, &hw_caps, errp)) {
+                                         &type, &caps, sizeof(caps), &hw_caps,
+                                         errp)) {
         return false;
     }
 
@@ -411,6 +413,11 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
      */
     if (vfio_device_get_viommu_flags_want_nesting(vbasedev)) {
         flags |= IOMMU_HWPT_ALLOC_NEST_PARENT;
+
+        if (vfio_device_get_host_iommu_quirk_bypass_ro(vbasedev, type,
+                                                       &caps, sizeof(caps))) {
+            bcontainer->bypass_ro = true;
+        }
     }
 
     if (cpr_is_incoming()) {
diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index ca2377d860..090f935d30 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -502,7 +502,8 @@ void vfio_container_region_add(VFIOContainer *bcontainer,
     int ret;
     Error *err = NULL;
 
-    if (!vfio_listener_valid_section(section, false, "region_add")) {
+    if (!vfio_listener_valid_section(section, bcontainer->bypass_ro,
+                                     "region_add")) {
         return;
     }
 
@@ -668,7 +669,8 @@ static void vfio_listener_region_del(MemoryListener *listener,
     int ret;
     bool try_unmap = true;
 
-    if (!vfio_listener_valid_section(section, false, "region_del")) {
+    if (!vfio_listener_valid_section(section, bcontainer->bypass_ro,
+                                     "region_del")) {
         return;
     }
 
-- 
2.47.1