From nobody Tue Oct 7 05:42:06 2025
From: Lu Baolu
To: Joerg Roedel
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 01/11] iommu/vt-d: Remove the CONFIG_X86 wrapping from iommu init hook
Date: Mon, 14 Jul 2025 12:50:18 +0800
Message-ID: <20250714045028.958850-2-baolu.lu@linux.intel.com>
In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com>
References: <20250714045028.958850-1-baolu.lu@linux.intel.com>

From: "Vineeth Pillai (Google)"

The iommu init hook is wrapped in CONFIG_X86, a remnant of the time when
dmar.c was common code in "drivers/pci/dmar.c". The wrapping was added in
commit 9d5ce73a64be2 ("x86: intel-iommu: Convert detect_intel_iommu to use
iommu_init hook").

The file is now built only for x86, so the config wrapping can be removed.

Signed-off-by: Vineeth Pillai (Google)
Reviewed-by: Jason Gunthorpe
Link: https://lore.kernel.org/r/20250616131740.3499289-1-vineeth@bitbyteword.org
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/dmar.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index b61d9ea27aa9..ec975c73cfe6 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -935,14 +935,11 @@ void __init detect_intel_iommu(void)
 		pci_request_acs();
 	}
 
-#ifdef CONFIG_X86
 	if (!ret) {
 		x86_init.iommu.iommu_init = intel_iommu_init;
 		x86_platform.iommu_shutdown = intel_iommu_shutdown;
 	}
 
-#endif
-
 	if (dmar_tbl) {
 		acpi_put_table(dmar_tbl);
 		dmar_tbl = NULL;
-- 
2.43.0

From nobody Tue Oct 7 05:42:06 2025
From: Lu Baolu
To: Joerg Roedel
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 02/11] iommu/vt-d: Optimize iotlb_sync_map for non-caching/non-RWBF modes
Date: Mon, 14 Jul 2025 12:50:19 +0800
Message-ID: <20250714045028.958850-3-baolu.lu@linux.intel.com>
In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com>
References: <20250714045028.958850-1-baolu.lu@linux.intel.com>

The iotlb_sync_map iommu op allows drivers to perform necessary cache
flushes when new mappings are established. For the Intel iommu driver,
this callback serves two purposes:

- To flush caches when a second-stage page table is attached to a device
  whose iommu is operating in caching mode (CAP_REG.CM==1).
- To explicitly flush internal write buffers to ensure updates to
  memory-resident remapping structures are visible to hardware
  (CAP_REG.RWBF==1).

However, in scenarios where neither caching mode nor the RWBF flag is
active, the cache_tag_flush_range_np() helper, which is called in the
iotlb_sync_map path, effectively becomes a no-op.

Despite being a no-op, cache_tag_flush_range_np() iterates through all
cache tags of the IOMMUs attached to the domain while holding a spinlock.
This unnecessary execution path introduces overhead, leading to a
measurable I/O performance regression. On systems with NVMes under the
same bridge, performance was observed to drop from approximately
6150 MiB/s to 4985 MiB/s.

Introduce a flag in the dmar_domain structure. The flag is set only when
iotlb_sync_map is required (i.e., when CM or RWBF is set), and
cache_tag_flush_range_np() is called only for domains where the flag is
set. Once set, the flag is immutable, given that there won't be mixed
configurations in real-world scenarios where some IOMMUs in a system
operate in caching mode while others do not. Theoretically, the
immutability of this flag does not impact functionality.

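For illustration only (not part of the patch): the gating described above
can be modelled in user space. This is a minimal sketch under stated
assumptions. The names model_iommu, model_domain, model_need_sync_map(),
model_attach() and model_sync_map() are hypothetical stand-ins and do not
exist in the kernel; the counter merely stands in for the cost of the
flush path.

```c
#include <stdbool.h>

/* Hypothetical model of the capability bits of one IOMMU. */
struct model_iommu {
	bool caching_mode;	/* stands in for CAP_REG.CM */
	bool rwbf;		/* stands in for CAP_REG.RWBF or the RWBF quirk */
};

/* Hypothetical model of a DMA domain. */
struct model_domain {
	bool use_first_level;	/* first-stage page table in use */
	bool iotlb_sync_map;	/* sticky: set at attach time, never cleared */
	unsigned int flushes;	/* how many times the flush path ran */
};

/* Mirrors the two conditions listed in the commit message. */
static bool model_need_sync_map(const struct model_domain *d,
				const struct model_iommu *i)
{
	if (i->caching_mode && !d->use_first_level)
		return true;
	return i->rwbf;
}

/* Attaching a device can only turn the flag on, never off. */
static void model_attach(struct model_domain *d, const struct model_iommu *i)
{
	d->iotlb_sync_map |= model_need_sync_map(d, i);
}

/* Map-time hook: the modelled flush runs only when the flag is set. */
static void model_sync_map(struct model_domain *d)
{
	if (d->iotlb_sync_map)
		d->flushes++;
}
```

In this model a domain attached only to IOMMUs without CM and RWBF never
enters the flush path, which is the case the commit message identifies as
the source of the regression. Once an IOMMU that needs the flush has been
attached, every later map call flushes, regardless of later attachments.
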
Reported-by: Ioanna Alifieraki
Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2115738
Link: https://lore.kernel.org/r/20250701171154.52435-1-ioanna-maria.alifieraki@canonical.com
Fixes: 129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map")
Cc: stable@vger.kernel.org
Signed-off-by: Lu Baolu
Reviewed-by: Kevin Tian
Link: https://lore.kernel.org/r/20250703031545.3378602-1-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 19 ++++++++++++++++++-
 drivers/iommu/intel/iommu.h |  3 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 148b944143b8..b23efb70b52c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1796,6 +1796,18 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
 					  (pgd_t *)pgd, flags, old);
 }
 
+static bool domain_need_iotlb_sync_map(struct dmar_domain *domain,
+				       struct intel_iommu *iommu)
+{
+	if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
+		return true;
+
+	if (rwbf_quirk || cap_rwbf(iommu->cap))
+		return true;
+
+	return false;
+}
+
 static int dmar_domain_attach_device(struct dmar_domain *domain,
 				     struct device *dev)
 {
@@ -1833,6 +1845,8 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
 	if (ret)
 		goto out_block_translation;
 
+	domain->iotlb_sync_map |= domain_need_iotlb_sync_map(domain, iommu);
+
 	return 0;
 
 out_block_translation:
@@ -3954,7 +3968,10 @@ static bool risky_device(struct pci_dev *pdev)
 static int intel_iommu_iotlb_sync_map(struct iommu_domain *domain,
 				      unsigned long iova, size_t size)
 {
-	cache_tag_flush_range_np(to_dmar_domain(domain), iova, iova + size - 1);
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+	if (dmar_domain->iotlb_sync_map)
+		cache_tag_flush_range_np(dmar_domain, iova, iova + size - 1);
 
 	return 0;
 }
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 2d1afab5eedc..61f42802fe9e 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -614,6 +614,9 @@ struct dmar_domain {
 	u8 has_mappings:1;		/* Has mappings configured through
 					 * iommu_map() interface.
 					 */
+	u8 iotlb_sync_map:1;		/* Need to flush IOTLB cache or write
+					 * buffer when creating mappings.
+					 */
 
 	spinlock_t lock;		/* Protect device tracking lists */
 	struct list_head devices;	/* all devices' list */
-- 
2.43.0

From nobody Tue Oct 7 05:42:06 2025
From: Lu Baolu
To: Joerg Roedel
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 03/11] iommu/vt-d: Lift the __pa to domain_setup_first_level/intel_svm_set_dev_pasid()
Date: Mon, 14 Jul 2025 12:50:20 +0800
Message-ID: <20250714045028.958850-4-baolu.lu@linux.intel.com>
In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com>
References: <20250714045028.958850-1-baolu.lu@linux.intel.com>

From: Jason Gunthorpe

Pass the phys_addr_t down through the call chain from the top instead of
passing a pgd_t * KVA. This moves the __pa() into
domain_setup_first_level(), which is the first function to obtain the pgd
from the IOMMU page table in this call chain. The SVA flow is also
adjusted to get the pa of the mm->pgd.

iommupt will move the __pa() into the iommupt code; it never shares the
KVA of the page table with the driver.

Reviewed-by: Kevin Tian
Signed-off-by: Jason Gunthorpe
Link: https://lore.kernel.org/r/1-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidia.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 15 +++++++--------
 drivers/iommu/intel/iommu.h |  7 +++----
 drivers/iommu/intel/pasid.c | 17 +++++++++--------
 drivers/iommu/intel/pasid.h | 11 +++++------
 drivers/iommu/intel/svm.c   |  2 +-
 5 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index b23efb70b52c..7c2e5e682a41 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1736,15 +1736,14 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
 	intel_context_flush_no_pasid(info, context, did);
 }
 
-int __domain_setup_first_level(struct intel_iommu *iommu,
-			       struct device *dev, ioasid_t pasid,
-			       u16 did, pgd_t *pgd, int flags,
-			       struct iommu_domain *old)
+int __domain_setup_first_level(struct intel_iommu *iommu, struct device *dev,
+			       ioasid_t pasid, u16 did, phys_addr_t fsptptr,
+			       int flags, struct iommu_domain *old)
 {
 	if (!old)
-		return intel_pasid_setup_first_level(iommu, dev, pgd,
-						     pasid, did, flags);
-	return intel_pasid_replace_first_level(iommu, dev, pgd, pasid, did,
+		return intel_pasid_setup_first_level(iommu, dev, fsptptr, pasid,
+						     did, flags);
+	return intel_pasid_replace_first_level(iommu, dev, fsptptr, pasid, did,
 					       iommu_domain_did(old, iommu),
 					       flags);
 }
@@ -1793,7 +1792,7 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
 
 	return __domain_setup_first_level(iommu, dev, pasid,
 					  domain_id_iommu(domain, iommu),
-					  (pgd_t *)pgd, flags, old);
+					  __pa(pgd), flags, old);
 }
 
 static bool domain_need_iotlb_sync_map(struct dmar_domain *domain,
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 61f42802fe9e..50d69cc88a1f 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1255,10 +1255,9 @@ domain_add_dev_pasid(struct iommu_domain *domain,
 void domain_remove_dev_pasid(struct iommu_domain *domain,
 			     struct device *dev, ioasid_t pasid);
 
-int __domain_setup_first_level(struct intel_iommu *iommu,
-			       struct device *dev, ioasid_t pasid,
-			       u16 did, pgd_t *pgd, int flags,
-			       struct iommu_domain *old);
+int __domain_setup_first_level(struct intel_iommu *iommu, struct device *dev,
+			       ioasid_t pasid, u16 did, phys_addr_t fsptptr,
+			       int flags, struct iommu_domain *old);
 
 int dmar_ir_support(void);
 
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index ac67a056b6c8..52f678975da7 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -348,14 +348,15 @@ static void intel_pasid_flush_present(struct intel_iommu *iommu,
  */
 static void pasid_pte_config_first_level(struct intel_iommu *iommu,
 					 struct pasid_entry *pte,
-					 pgd_t *pgd, u16 did, int flags)
+					 phys_addr_t fsptptr, u16 did,
+					 int flags)
 {
 	lockdep_assert_held(&iommu->lock);
 
 	pasid_clear_entry(pte);
 
 	/* Setup the first level page table pointer: */
-	pasid_set_flptr(pte, (u64)__pa(pgd));
+	pasid_set_flptr(pte, fsptptr);
 
 	if (flags & PASID_FLAG_FL5LP)
 		pasid_set_flpm(pte, 1);
@@ -372,9 +373,9 @@ static void pasid_pte_config_first_level(struct intel_iommu *iommu,
 	pasid_set_present(pte);
 }
 
-int intel_pasid_setup_first_level(struct intel_iommu *iommu,
-				  struct device *dev, pgd_t *pgd,
-				  u32 pasid, u16 did, int flags)
+int intel_pasid_setup_first_level(struct intel_iommu *iommu, struct device *dev,
+				  phys_addr_t fsptptr, u32 pasid, u16 did,
+				  int flags)
 {
 	struct pasid_entry *pte;
 
@@ -402,7 +403,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 		return -EBUSY;
 	}
 
-	pasid_pte_config_first_level(iommu, pte, pgd, did, flags);
+	pasid_pte_config_first_level(iommu, pte, fsptptr, did, flags);
 
 	spin_unlock(&iommu->lock);
 
@@ -412,7 +413,7 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
 }
 
 int intel_pasid_replace_first_level(struct intel_iommu *iommu,
-				    struct device *dev, pgd_t *pgd,
+				    struct device *dev, phys_addr_t fsptptr,
 				    u32 pasid, u16 did, u16 old_did,
 				    int flags)
 {
@@ -430,7 +431,7 @@ int intel_pasid_replace_first_level(struct intel_iommu *iommu,
 		return -EINVAL;
 	}
 
-	pasid_pte_config_first_level(iommu, &new_pte, pgd, did, flags);
+	pasid_pte_config_first_level(iommu, &new_pte, fsptptr, did, flags);
 
 	spin_lock(&iommu->lock);
 	pte = intel_pasid_get_entry(dev, pasid);
diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index fd0fd1a0df84..a771a77d4239 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -288,9 +288,9 @@ extern unsigned int intel_pasid_max_id;
 int intel_pasid_alloc_table(struct device *dev);
 void intel_pasid_free_table(struct device *dev);
 struct pasid_table *intel_pasid_get_table(struct device *dev);
-int intel_pasid_setup_first_level(struct intel_iommu *iommu,
-				  struct device *dev, pgd_t *pgd,
-				  u32 pasid, u16 did, int flags);
+int intel_pasid_setup_first_level(struct intel_iommu *iommu, struct device *dev,
+				  phys_addr_t fsptptr, u32 pasid, u16 did,
+				  int flags);
 int intel_pasid_setup_second_level(struct intel_iommu *iommu,
 				   struct dmar_domain *domain,
 				   struct device *dev, u32 pasid);
@@ -302,9 +302,8 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
 int intel_pasid_setup_nested(struct intel_iommu *iommu, struct device *dev,
 			     u32 pasid, struct dmar_domain *domain);
 int intel_pasid_replace_first_level(struct intel_iommu *iommu,
-				    struct device *dev, pgd_t *pgd,
-				    u32 pasid, u16 did, u16 old_did,
-				    int flags);
+				    struct device *dev, phys_addr_t fsptptr,
+				    u32 pasid, u16 did, u16 old_did, int flags);
 int intel_pasid_replace_second_level(struct intel_iommu *iommu,
 				     struct dmar_domain *domain,
 				     struct device *dev, u16 old_did,
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index f3da596410b5..8c0bed36c587 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -171,7 +171,7 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
 	/* Setup the pasid table: */
 	sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
 	ret = __domain_setup_first_level(iommu, dev, pasid,
-					 FLPT_DEFAULT_DID, mm->pgd,
+					 FLPT_DEFAULT_DID, __pa(mm->pgd),
 					 sflags, old);
 	if (ret)
 		goto out_unwind_iopf;
-- 
2.43.0

From nobody Tue Oct 7 05:42:06 2025
From: Lu Baolu
To: Joerg Roedel
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 04/11] iommu/vt-d: Fold domain_exit() into intel_iommu_domain_free()
Date: Mon, 14 Jul 2025 12:50:21 +0800
Message-ID: <20250714045028.958850-5-baolu.lu@linux.intel.com>
In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com>
References: <20250714045028.958850-1-baolu.lu@linux.intel.com>

From: Jason Gunthorpe

domain_exit() has only one caller, so there is no need for two functions.

Correct the WARN_ON() error handling to leak the entire page table if the
HW is still referencing it, so we don't hit a use-after-free (UAF) during
WARN_ON() recovery.

Reviewed-by: Kevin Tian
Signed-off-by: Jason Gunthorpe
Link: https://lore.kernel.org/r/2-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidia.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 38 ++++++++++++++++++-------------------
 1 file changed, 18 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7c2e5e682a41..5c798f30dbc4 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1396,23 +1396,6 @@ void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu)
 	}
 }
 
-static void domain_exit(struct dmar_domain *domain)
-{
-	if (domain->pgd) {
-		struct iommu_pages_list freelist =
-			IOMMU_PAGES_LIST_INIT(freelist);
-
-		domain_unmap(domain, 0, DOMAIN_MAX_PFN(domain->gaw), &freelist);
-		iommu_put_pages_list(&freelist);
-	}
-
-	if (WARN_ON(!list_empty(&domain->devices)))
-		return;
-
-	kfree(domain->qi_batch);
-	kfree(domain);
-}
-
 /*
  * For kdump cases, old valid entries may be cached due to the
  * in-flight DMA and copied pgtable, but there is no unmapping
@@ -3406,9 +3389,24 @@ static void intel_iommu_domain_free(struct iommu_domain *domain)
 {
 	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
 
-	WARN_ON(dmar_domain->nested_parent &&
-		!list_empty(&dmar_domain->s1_domains));
-	domain_exit(dmar_domain);
+	if (WARN_ON(dmar_domain->nested_parent &&
+		    !list_empty(&dmar_domain->s1_domains)))
+		return;
+
+	if (WARN_ON(!list_empty(&dmar_domain->devices)))
+		return;
+
+	if (dmar_domain->pgd) {
+		struct iommu_pages_list freelist =
+			IOMMU_PAGES_LIST_INIT(freelist);
+
+		domain_unmap(dmar_domain, 0, DOMAIN_MAX_PFN(dmar_domain->gaw),
+			     &freelist);
+		iommu_put_pages_list(&freelist);
+	}
+
+	kfree(dmar_domain->qi_batch);
+	kfree(dmar_domain);
 }
 
 int paging_domain_compatible(struct iommu_domain *domain, struct device *dev)
-- 
2.43.0

From nobody Tue Oct 7 05:42:06 2025
dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=f2osqrn/; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="f2osqrn/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468774; x=1784004774; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=uAR7aJXKMAUUzYnjgFC0oh/4Ts6cvw1XnZBBjFHogEU=; b=f2osqrn/vUo4Q4Py6cmKo0HEsLPVONKtflNG0+jLCNGOyXLFUd/zn3tg I49Lsv0//KqRHS6Me7ijXAbQYXxcnIyxqPt44qV4zFDkpYt4++8eZuLCX C/qBgFJ1FMva6dQlIbqfsVOe40RHJLZEdrLQL/ljQCp2U4VIXbpaEgf/o XsJ36EPQ/4CQyLr7CwjYnC2VxxH3sHiL5JeOCIpEnrI2vmGwnwcogCvvN 0KqBcdI4cm83MUFfOTU8736vH/AP10ibimNjTJp+FjzdR4pnbbwEUOWKh 286Nz4oRAS6LA6VbypZT6qeHRGinNyZB66Gsq8YHGuvmGGjCwKRzo9JfY A==; X-CSE-ConnectionGUID: Gq6Wz+Y4SkCIgK4Rmt+C9w== X-CSE-MsgGUID: UxU5eyRMTuypoHJn64Vr3w== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765052" X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765052" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:52:53 -0700 X-CSE-ConnectionGUID: C5MQhJVvRwK2Uia7V1fJ1g== X-CSE-MsgGUID: xTBFJSpyQbaI0L8WKSK7uA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166166" Received: from allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:52:53 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 05/11] iommu/vt-d: Do not wipe out the 
page table NID when devices detach Date: Mon, 14 Jul 2025 12:50:22 +0800 Message-ID: <20250714045028.958850-6-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Jason Gunthorpe The NID is used to control which NUMA node memory for the page table is allocated it from. It should be a permanent property of the page table when it was allocated and not change during attach/detach of devices. Reviewed-by: Wei Wang Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/3-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidi= a.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 5c798f30dbc4..55e0ba4d20ae 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1391,7 +1391,6 @@ void domain_detach_iommu(struct dmar_domain *domain, = struct intel_iommu *iommu) if (--info->refcnt =3D=3D 0) { ida_free(&iommu->domain_ida, info->did); xa_erase(&domain->iommu_array, iommu->seq_id); - domain->nid =3D NUMA_NO_NODE; kfree(info); } } --=20 2.43.0 From nobody Tue Oct 7 05:42:06 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BE1A320DD72 for ; Mon, 14 Jul 2025 04:52:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468777; cv=none; 
b=LOhZvXBD898gng9ai9caO4S2+R9k1+rOMX0P2Qkt4ZxZ6vNOAkez1/rwYkjeE+OeNEI4/DImJ9qGFd4dRqOaAmtxxpW9hWk4IQXTOmCwJ3XhIn59FxzAdhWnJZrO4gzbmtlDAuZw8CrCtLSzJst/hRrAU1tDW4jXDPul9SJfCwk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468777; c=relaxed/simple; bh=Vax7/G9E5YAfEbBHn6+S9xTegQB26ThUYjD6uXo6bz8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VbQQwVRT3smpJk1q4ei/lCx8AuI1MILHFHEduvPqnMD/miO7bztvryjPDpHAcHD/n7pH5mgoTMNQd+VfLKXFnN7IgFQL9xjxZQKyAo97fHZe6V2ltiGg1VBe2kQ2Td0J9wQtm+kxKVdt2DvjLMJgUvlo6PFglMuH3BMOkRnMBdA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=moFfp7Ly; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="moFfp7Ly" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468775; x=1784004775; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Vax7/G9E5YAfEbBHn6+S9xTegQB26ThUYjD6uXo6bz8=; b=moFfp7LyaQsl9sB+JtNRw7LEW1s/hGz2TtrRrDswnf9EOWdk+ZDVY9UX lH209SaDdkYogelQf75Yyf6N/7psTEPKDpN3d4wuwmLhQ6ZmRyJNlM3Mz ZFtjS5xZKMp7+0bRgl8U3p9vP2vusjPDZk/SuoSxT+7c7AP9KWX1JuRaa 1+MSmK1FzPRJwg9iAiGbXx7/gc0El6X51iPaJgp6DO2Q21qR3E8xmAYST aT/OyJ7gTNgFmLRkolkslbF/mDblBb/6SKVVe/X8KMsBPcRBSGzeKaMnF Nn0d5cIOrcA1NkW22uCe/z/fKbqLVw4leOFwgOZlRsMBZVa0eiBsHQCj1 w==; X-CSE-ConnectionGUID: U8EE1wEYTCyuXfwzI6/FCg== X-CSE-MsgGUID: Q9Xt1CLCSoeKPRUgKH99Iw== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765056" X-IronPort-AV: 
E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765056" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:52:55 -0700 X-CSE-ConnectionGUID: yC9+eIHTTLubNaLHsxwlXg== X-CSE-MsgGUID: Z/cQkoVMQqSCy8GLSXb65A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166175" Received: from allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:52:55 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 06/11] iommu/vt-d: Split intel_iommu_domain_alloc_paging_flags() Date: Mon, 14 Jul 2025 12:50:23 +0800 Message-ID: <20250714045028.958850-7-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Jason Gunthorpe Create stage-specific functions that check the stage-specific conditions under which each stage can be supported. Have intel_iommu_domain_alloc_paging_flags() try the first stage and fall back to the second stage only when the first returns EOPNOTSUPP, so that the first stage is preferred whenever it is available and suitable for the requested flags. Move second-stage-only operations like nested_parent and dirty_tracking into the second stage function for clarity. Move initialization of the iommu_domain members into paging_domain_alloc(). Drop initialization of domain->owner as the callers all do it.
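The fallback order described above can be modelled outside the kernel. The following is only a user-space sketch, not the driver code: struct hw_caps, the F_* flag bits, enum stage and pick_stage() are all hypothetical names that stand in for the IOMMU capability checks and the IOMMU_HWPT_ALLOC_* flags used in the diff below.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits standing in for the IOMMU_HWPT_ALLOC_* flags. */
#define F_NEST_PARENT    (1u << 0)
#define F_DIRTY_TRACKING (1u << 1)
#define F_PASID          (1u << 2)

/* Hypothetical capability summary for one IOMMU instance. */
struct hw_caps {
	bool scalable_mode;   /* stands in for sm_supported() */
	bool first_stage;     /* stands in for ecap_flts() */
	bool second_stage;    /* stands in for ecap_slts() */
	bool nesting;         /* stands in for nested_supported() */
	bool dirty_tracking;  /* stands in for ssads_supported() */
};

enum stage { STAGE_NONE, STAGE_FIRST, STAGE_SECOND };

/* First stage accepts only the PASID flag and needs scalable mode. */
static int try_first_stage(const struct hw_caps *c, unsigned int flags)
{
	if (flags & ~F_PASID)
		return -EOPNOTSUPP;
	if (!c->scalable_mode || !c->first_stage)
		return -EOPNOTSUPP;
	return 0;
}

/* Second stage owns the nesting and dirty tracking flags. */
static int try_second_stage(const struct hw_caps *c, unsigned int flags)
{
	if (flags & ~(F_NEST_PARENT | F_DIRTY_TRACKING | F_PASID))
		return -EOPNOTSUPP;
	if ((flags & F_NEST_PARENT) && !c->nesting)
		return -EOPNOTSUPP;
	if ((flags & F_DIRTY_TRACKING) && !c->dirty_tracking)
		return -EOPNOTSUPP;
	/* Legacy (non-scalable) mode always has second stage. */
	if (c->scalable_mode && !c->second_stage)
		return -EOPNOTSUPP;
	return 0;
}

/*
 * Prefer first stage; fall back to second stage only when first stage
 * reports -EOPNOTSUPP. Any other result from first stage is final.
 */
int pick_stage(const struct hw_caps *c, unsigned int flags, enum stage *out)
{
	int ret = try_first_stage(c, flags);

	if (ret != -EOPNOTSUPP) {
		if (!ret)
			*out = STAGE_FIRST;
		return ret;
	}
	ret = try_second_stage(c, flags);
	if (!ret)
		*out = STAGE_SECOND;
	return ret;
}
```

In this model, a request with no flags on fully capable hardware selects the first stage, while a request for dirty tracking is rejected by the first stage and therefore lands on the second stage.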
Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/4-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidi= a.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 98 +++++++++++++++++++++---------------- 1 file changed, 57 insertions(+), 41 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 55e0ba4d20ae..0ac3c3a6d9e7 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3281,10 +3281,15 @@ static struct dmar_domain *paging_domain_alloc(stru= ct device *dev, bool first_st spin_lock_init(&domain->lock); spin_lock_init(&domain->cache_lock); xa_init(&domain->iommu_array); + INIT_LIST_HEAD(&domain->s1_domains); + spin_lock_init(&domain->s1_lock); =20 domain->nid =3D dev_to_node(dev); domain->use_first_level =3D first_stage; =20 + domain->domain.type =3D IOMMU_DOMAIN_UNMANAGED; + domain->domain.ops =3D intel_iommu_ops.default_domain_ops; + /* calculate the address width */ addr_width =3D agaw_to_width(iommu->agaw); if (addr_width > cap_mgaw(iommu->cap)) @@ -3326,62 +3331,73 @@ static struct dmar_domain *paging_domain_alloc(stru= ct device *dev, bool first_st } =20 static struct iommu_domain * -intel_iommu_domain_alloc_paging_flags(struct device *dev, u32 flags, - const struct iommu_user_data *user_data) +intel_iommu_domain_alloc_first_stage(struct device *dev, + struct intel_iommu *iommu, u32 flags) +{ + struct dmar_domain *dmar_domain; + + if (flags & ~IOMMU_HWPT_ALLOC_PASID) + return ERR_PTR(-EOPNOTSUPP); + + /* Only SL is available in legacy mode */ + if (!sm_supported(iommu) || !ecap_flts(iommu->ecap)) + return ERR_PTR(-EOPNOTSUPP); + + dmar_domain =3D paging_domain_alloc(dev, true); + if (IS_ERR(dmar_domain)) + return ERR_CAST(dmar_domain); + return &dmar_domain->domain; +} + +static struct iommu_domain * +intel_iommu_domain_alloc_second_stage(struct device *dev, + struct intel_iommu *iommu, u32 flags) { - struct device_domain_info *info =3D dev_iommu_priv_get(dev); - 
bool dirty_tracking =3D flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING; - bool nested_parent =3D flags & IOMMU_HWPT_ALLOC_NEST_PARENT; - struct intel_iommu *iommu =3D info->iommu; struct dmar_domain *dmar_domain; - struct iommu_domain *domain; - bool first_stage; =20 if (flags & (~(IOMMU_HWPT_ALLOC_NEST_PARENT | IOMMU_HWPT_ALLOC_DIRTY_TRACKING | IOMMU_HWPT_ALLOC_PASID))) return ERR_PTR(-EOPNOTSUPP); - if (nested_parent && !nested_supported(iommu)) - return ERR_PTR(-EOPNOTSUPP); - if (user_data || (dirty_tracking && !ssads_supported(iommu))) + + if (((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && + !nested_supported(iommu)) || + ((flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) && + !ssads_supported(iommu))) return ERR_PTR(-EOPNOTSUPP); =20 - /* - * Always allocate the guest compatible page table unless - * IOMMU_HWPT_ALLOC_NEST_PARENT or IOMMU_HWPT_ALLOC_DIRTY_TRACKING - * is specified. - */ - if (nested_parent || dirty_tracking) { - if (!sm_supported(iommu) || !ecap_slts(iommu->ecap)) - return ERR_PTR(-EOPNOTSUPP); - first_stage =3D false; - } else { - first_stage =3D first_level_by_default(iommu); - } + /* Legacy mode always supports second stage */ + if (sm_supported(iommu) && !ecap_slts(iommu->ecap)) + return ERR_PTR(-EOPNOTSUPP); =20 - dmar_domain =3D paging_domain_alloc(dev, first_stage); + dmar_domain =3D paging_domain_alloc(dev, false); if (IS_ERR(dmar_domain)) return ERR_CAST(dmar_domain); - domain =3D &dmar_domain->domain; - domain->type =3D IOMMU_DOMAIN_UNMANAGED; - domain->owner =3D &intel_iommu_ops; - domain->ops =3D intel_iommu_ops.default_domain_ops; =20 - if (nested_parent) { - dmar_domain->nested_parent =3D true; - INIT_LIST_HEAD(&dmar_domain->s1_domains); - spin_lock_init(&dmar_domain->s1_lock); - } + dmar_domain->nested_parent =3D flags & IOMMU_HWPT_ALLOC_NEST_PARENT; =20 - if (dirty_tracking) { - if (dmar_domain->use_first_level) { - iommu_domain_free(domain); - return ERR_PTR(-EOPNOTSUPP); - } - domain->dirty_ops =3D &intel_dirty_ops; - } + if (flags & 
IOMMU_HWPT_ALLOC_DIRTY_TRACKING) + dmar_domain->domain.dirty_ops =3D &intel_dirty_ops; =20 - return domain; + return &dmar_domain->domain; +} + +static struct iommu_domain * +intel_iommu_domain_alloc_paging_flags(struct device *dev, u32 flags, + const struct iommu_user_data *user_data) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + struct iommu_domain *domain; + + if (user_data) + return ERR_PTR(-EOPNOTSUPP); + + /* Prefer first stage if possible by default. */ + domain =3D intel_iommu_domain_alloc_first_stage(dev, iommu, flags); + if (domain !=3D ERR_PTR(-EOPNOTSUPP)) + return domain; + return intel_iommu_domain_alloc_second_stage(dev, iommu, flags); } =20 static void intel_iommu_domain_free(struct iommu_domain *domain) --=20 2.43.0 From nobody Tue Oct 7 05:42:06 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5812C1F4624 for ; Mon, 14 Jul 2025 04:52:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468779; cv=none; b=d1TaNojV4haRgR3DKm386h9+3KHP093XWi1sqlIDDaBiwOAc1JmO8iDgrc9JNQXJwYbZ/TUgIvARiWTag0ogFs05Rh+T7VpOsiuVaHQOCjOzufjVPvcu8KJGPxa80Knw1cfFKf4RnCWyJxa4zViRnCC4lQeovLFP29Uvd99hAZ0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468779; c=relaxed/simple; bh=/K48GTJABaYkID4O43SBS3bLezuf6em0COXTSVHitjU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KGGKvN3sKQrSyeQuuC3ZpV/njjDIIowq/ry+6NOG6PtoYnJ7O5ATITJJYgzmG6Wp+LjtdudYfR7s2Ep5+5vLl5sbX0ceb3boPwvxB4+itttJWuPWp1r5E/MUmI508ndtELOIHrLgQi6s9+sWcYrehFA3svQYnLYdlm/DL0WgMA8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=UJZyXTNN; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="UJZyXTNN" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468777; x=1784004777; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/K48GTJABaYkID4O43SBS3bLezuf6em0COXTSVHitjU=; b=UJZyXTNNb5nozwdUFiXUmfqwJn7M/S5NkPnT3XbX15CMObn7dTVlMzpv FQrG6J6AX0eeyMAGlK21WCZLSdeTuQtAIsiF/vOa2ujiRslCeAgn2xrDK GD3Yrtij6xap0kmuLn+tAYSDXVxQEZ8ycnl2CNgd4oymWCQg/RVCKtwXi T/LHNIFtDMfWKlE75dr4uyq4Z5tzBTXLoE/vDSVk3QH/m2zzs5jIuzVZR jHN/kqpYmJl65WxNaHDwWc9tUUag2eN6Qz0+2tqMc7o7LZZ6kauM/DFvO ySiO+DwVsRYqQZ4ksHny0UesWNewvKB5nPT+3dQEwTk8/uCS1w/qu8Ysp A==; X-CSE-ConnectionGUID: LoIGbrDuQ46s7jj5M1oV0A== X-CSE-MsgGUID: 6TCWPJqvThSkK1wD+WejiQ== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765060" X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765060" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:52:57 -0700 X-CSE-ConnectionGUID: 6A6KHF5USeOmFRjmrW5cdQ== X-CSE-MsgGUID: YfYsnvz5QnO2n/upohUSbQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166186" Received: from allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:52:56 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 07/11] iommu/vt-d: Create unique domain ops for each stage Date: 
Mon, 14 Jul 2025 12:50:24 +0800 Message-ID: <20250714045028.958850-8-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Jason Gunthorpe Use the domain ops pointer to tell what kind of domain it is instead of the internal use_first_level indication. This also protects against wrongly using a SVA/nested/IDENTITY/BLOCKED domain type in places they should not be. The only remaining uses of use_first_level outside the paging domain are in paging_domain_compatible() and intel_iommu_enforce_cache_coherency(). Thus, remove the useless sets of use_first_level in intel_svm_domain_alloc() and intel_iommu_domain_alloc_nested(). None of the unique ops for these domain types ever reference it on their call chains. Add a WARN_ON() check in domain_context_mapping_one() as it only works with second stage. This is preparation for iommupt which will have different ops for each of the stages. 
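As a rough illustration of identifying a domain by its ops pointer, here is a user-space sketch under simplified assumptions. The struct layouts and every name in it (struct domain, fs_paging_ops, setup_translation(), and so on) are hypothetical stand-ins, not the kernel definitions. It relies only on the C guarantee that two distinct objects have distinct addresses, even when their contents are identical.

```c
#include <errno.h>
#include <stdbool.h>

struct domain;

/* Hypothetical, cut-down ops table. */
struct domain_ops {
	int (*attach)(struct domain *d);
};

struct domain {
	const struct domain_ops *ops;
};

static int attach_paging(struct domain *d)
{
	(void)d;
	return 0;
}

/* Identical contents, but distinct objects and so distinct addresses. */
const struct domain_ops fs_paging_ops = { .attach = attach_paging };
const struct domain_ops ss_paging_ops = { .attach = attach_paging };
/* Stands in for an SVA, nested, identity or blocked domain. */
const struct domain_ops other_ops = { .attach = attach_paging };

bool domain_is_fs_paging(const struct domain *d)
{
	return d->ops == &fs_paging_ops;
}

bool domain_is_ss_paging(const struct domain *d)
{
	return d->ops == &ss_paging_ops;
}

/*
 * Dispatch on the domain kind: 1 for first stage, 2 for second stage.
 * Any other domain type is rejected rather than silently being treated
 * as one of the paging stages.
 */
int setup_translation(const struct domain *d)
{
	if (domain_is_fs_paging(d))
		return 1;
	if (domain_is_ss_paging(d))
		return 2;
	return -EINVAL;
}
```

The point of the explicit third branch is that a domain that is neither kind of paging domain produces an error, where a plain boolean test would have fallen through to the second stage path.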
Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/5-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidi= a.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/cache.c | 5 +-- drivers/iommu/intel/iommu.c | 60 +++++++++++++++++++++++++----------- drivers/iommu/intel/iommu.h | 12 ++++++++ drivers/iommu/intel/nested.c | 4 +-- drivers/iommu/intel/svm.c | 1 - 5 files changed, 58 insertions(+), 24 deletions(-) diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index 47692cbfaabd..876630e10849 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -370,7 +370,7 @@ static void cache_tag_flush_iotlb(struct dmar_domain *d= omain, struct cache_tag * struct intel_iommu *iommu =3D tag->iommu; u64 type =3D DMA_TLB_PSI_FLUSH; =20 - if (domain->use_first_level) { + if (intel_domain_is_fs_paging(domain)) { qi_batch_add_piotlb(iommu, tag->domain_id, tag->pasid, addr, pages, ih, domain->qi_batch); return; @@ -545,7 +545,8 @@ void cache_tag_flush_range_np(struct dmar_domain *domai= n, unsigned long start, qi_batch_flush_descs(iommu, domain->qi_batch); iommu =3D tag->iommu; =20 - if (!cap_caching_mode(iommu->cap) || domain->use_first_level) { + if (!cap_caching_mode(iommu->cap) || + intel_domain_is_fs_paging(domain)) { iommu_flush_write_buffer(iommu); continue; } diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 0ac3c3a6d9e7..b7b1a3d2cbfc 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1462,6 +1462,9 @@ static int domain_context_mapping_one(struct dmar_dom= ain *domain, struct context_entry *context; int ret; =20 + if (WARN_ON(!intel_domain_is_ss_paging(domain))) + return -EINVAL; + pr_debug("Set context mapping for %02x:%02x.%d\n", bus, PCI_SLOT(devfn), PCI_FUNC(devfn)); =20 @@ -1780,7 +1783,7 @@ static int domain_setup_first_level(struct intel_iomm= u *iommu, static bool domain_need_iotlb_sync_map(struct dmar_domain *domain, struct intel_iommu *iommu) { - if 
(cap_caching_mode(iommu->cap) && !domain->use_first_level) + if (cap_caching_mode(iommu->cap) && intel_domain_is_ss_paging(domain)) return true; =20 if (rwbf_quirk || cap_rwbf(iommu->cap)) @@ -1812,12 +1815,14 @@ static int dmar_domain_attach_device(struct dmar_do= main *domain, =20 if (!sm_supported(iommu)) ret =3D domain_context_mapping(domain, dev); - else if (domain->use_first_level) + else if (intel_domain_is_fs_paging(domain)) ret =3D domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID, NULL); - else + else if (intel_domain_is_ss_paging(domain)) ret =3D domain_setup_second_level(iommu, domain, dev, IOMMU_NO_PASID, NULL); + else if (WARN_ON(true)) + ret =3D -EINVAL; =20 if (ret) goto out_block_translation; @@ -3288,7 +3293,6 @@ static struct dmar_domain *paging_domain_alloc(struct= device *dev, bool first_st domain->use_first_level =3D first_stage; =20 domain->domain.type =3D IOMMU_DOMAIN_UNMANAGED; - domain->domain.ops =3D intel_iommu_ops.default_domain_ops; =20 /* calculate the address width */ addr_width =3D agaw_to_width(iommu->agaw); @@ -3346,6 +3350,8 @@ intel_iommu_domain_alloc_first_stage(struct device *d= ev, dmar_domain =3D paging_domain_alloc(dev, true); if (IS_ERR(dmar_domain)) return ERR_CAST(dmar_domain); + + dmar_domain->domain.ops =3D &intel_fs_paging_domain_ops; return &dmar_domain->domain; } =20 @@ -3374,6 +3380,7 @@ intel_iommu_domain_alloc_second_stage(struct device *= dev, if (IS_ERR(dmar_domain)) return ERR_CAST(dmar_domain); =20 + dmar_domain->domain.ops =3D &intel_ss_paging_domain_ops; dmar_domain->nested_parent =3D flags & IOMMU_HWPT_ALLOC_NEST_PARENT; =20 if (flags & IOMMU_HWPT_ALLOC_DIRTY_TRACKING) @@ -4107,12 +4114,15 @@ static int intel_iommu_set_dev_pasid(struct iommu_d= omain *domain, if (ret) goto out_remove_dev_pasid; =20 - if (dmar_domain->use_first_level) + if (intel_domain_is_fs_paging(dmar_domain)) ret =3D domain_setup_first_level(iommu, dmar_domain, dev, pasid, old); - else + else if 
(intel_domain_is_ss_paging(dmar_domain)) ret =3D domain_setup_second_level(iommu, dmar_domain, dev, pasid, old); + else if (WARN_ON(true)) + ret =3D -EINVAL; + if (ret) goto out_unwind_iopf; =20 @@ -4387,6 +4397,32 @@ static struct iommu_domain identity_domain =3D { }, }; =20 +const struct iommu_domain_ops intel_fs_paging_domain_ops =3D { + .attach_dev =3D intel_iommu_attach_device, + .set_dev_pasid =3D intel_iommu_set_dev_pasid, + .map_pages =3D intel_iommu_map_pages, + .unmap_pages =3D intel_iommu_unmap_pages, + .iotlb_sync_map =3D intel_iommu_iotlb_sync_map, + .flush_iotlb_all =3D intel_flush_iotlb_all, + .iotlb_sync =3D intel_iommu_tlb_sync, + .iova_to_phys =3D intel_iommu_iova_to_phys, + .free =3D intel_iommu_domain_free, + .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency, +}; + +const struct iommu_domain_ops intel_ss_paging_domain_ops =3D { + .attach_dev =3D intel_iommu_attach_device, + .set_dev_pasid =3D intel_iommu_set_dev_pasid, + .map_pages =3D intel_iommu_map_pages, + .unmap_pages =3D intel_iommu_unmap_pages, + .iotlb_sync_map =3D intel_iommu_iotlb_sync_map, + .flush_iotlb_all =3D intel_flush_iotlb_all, + .iotlb_sync =3D intel_iommu_tlb_sync, + .iova_to_phys =3D intel_iommu_iova_to_phys, + .free =3D intel_iommu_domain_free, + .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency, +}; + const struct iommu_ops intel_iommu_ops =3D { .blocked_domain =3D &blocking_domain, .release_domain =3D &blocking_domain, @@ -4405,18 +4441,6 @@ const struct iommu_ops intel_iommu_ops =3D { .def_domain_type =3D device_def_domain_type, .pgsize_bitmap =3D SZ_4K, .page_response =3D intel_iommu_page_response, - .default_domain_ops =3D &(const struct iommu_domain_ops) { - .attach_dev =3D intel_iommu_attach_device, - .set_dev_pasid =3D intel_iommu_set_dev_pasid, - .map_pages =3D intel_iommu_map_pages, - .unmap_pages =3D intel_iommu_unmap_pages, - .iotlb_sync_map =3D intel_iommu_iotlb_sync_map, - .flush_iotlb_all =3D intel_flush_iotlb_all, - .iotlb_sync 
=3D intel_iommu_tlb_sync, - .iova_to_phys =3D intel_iommu_iova_to_phys, - .free =3D intel_iommu_domain_free, - .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency, - } }; =20 static void quirk_iommu_igfx(struct pci_dev *dev) diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index 50d69cc88a1f..d09b92871659 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -1380,6 +1380,18 @@ struct context_entry *iommu_context_addr(struct inte= l_iommu *iommu, u8 bus, u8 devfn, int alloc); =20 extern const struct iommu_ops intel_iommu_ops; +extern const struct iommu_domain_ops intel_fs_paging_domain_ops; +extern const struct iommu_domain_ops intel_ss_paging_domain_ops; + +static inline bool intel_domain_is_fs_paging(struct dmar_domain *domain) +{ + return domain->domain.ops =3D=3D &intel_fs_paging_domain_ops; +} + +static inline bool intel_domain_is_ss_paging(struct dmar_domain *domain) +{ + return domain->domain.ops =3D=3D &intel_ss_paging_domain_ops; +} =20 #ifdef CONFIG_INTEL_IOMMU extern int intel_iommu_sm; diff --git a/drivers/iommu/intel/nested.c b/drivers/iommu/intel/nested.c index fc312f649f9e..1b6ad9c900a5 100644 --- a/drivers/iommu/intel/nested.c +++ b/drivers/iommu/intel/nested.c @@ -216,8 +216,7 @@ intel_iommu_domain_alloc_nested(struct device *dev, str= uct iommu_domain *parent, /* Must be nested domain */ if (user_data->type !=3D IOMMU_HWPT_DATA_VTD_S1) return ERR_PTR(-EOPNOTSUPP); - if (parent->ops !=3D intel_iommu_ops.default_domain_ops || - !s2_domain->nested_parent) + if (!intel_domain_is_ss_paging(s2_domain) || !s2_domain->nested_parent) return ERR_PTR(-EINVAL); =20 ret =3D iommu_copy_struct_from_user(&vtd, user_data, @@ -229,7 +228,6 @@ intel_iommu_domain_alloc_nested(struct device *dev, str= uct iommu_domain *parent, if (!domain) return ERR_PTR(-ENOMEM); =20 - domain->use_first_level =3D true; domain->s2_domain =3D s2_domain; domain->s1_cfg =3D vtd; domain->domain.ops =3D &intel_nested_domain_ops; 
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c index 8c0bed36c587..e147f71f91b7 100644 --- a/drivers/iommu/intel/svm.c +++ b/drivers/iommu/intel/svm.c @@ -214,7 +214,6 @@ struct iommu_domain *intel_svm_domain_alloc(struct devi= ce *dev, return ERR_PTR(-ENOMEM); =20 domain->domain.ops =3D &intel_svm_domain_ops; - domain->use_first_level =3D true; INIT_LIST_HEAD(&domain->dev_pasids); INIT_LIST_HEAD(&domain->cache_tags); spin_lock_init(&domain->cache_lock); --=20 2.43.0 From nobody Tue Oct 7 05:42:06 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE15E20A5F5 for ; Mon, 14 Jul 2025 04:52:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468780; cv=none; b=lymdxvqJOZjNC6U8mouS05flDe4OxXgUeQ9kxyFotWU1FMG1q6sy/HIsZjlQKwn9xJNNgbcaHXWOU3Xyhmu42YZFCgblTxYuQB8q0Y8abel+u16yNpGdnUBUtf/u0NvDFdZ1R5pCnHxmiahgBmHBbTmbr+HMcAp5phOdsurjeas= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468780; c=relaxed/simple; bh=GTHbngVd52A1C620pmn/9G2aoaRoLssEGXLHdKtA9EQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bf+Q2fsBhOZevILONVzaeFnNCxaHjnRgDeESipJLCFcdZH5odGpiKzjcXcRhY59vvD8t7T7pduWEdQkmhrlZUKlPp1hB7/soS/J7MSEn/GOIHq8iaIJyP4yoQLjKE4X5FFP/IRFebj1FV6mQ37c13UagUWVFitRpSrAlyCAdCt4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=S8LTbZw/; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: 
smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="S8LTbZw/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468778; x=1784004778; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GTHbngVd52A1C620pmn/9G2aoaRoLssEGXLHdKtA9EQ=; b=S8LTbZw/4h9ZxpYUo5kMfSQPA72kNR51BCgKYtSXU/xZud+GTY1IKKqi sWcVW7fSEzlMS/pwCrlezJXh0w3xN4Apt0kIqqpe+AZo56WvnirXPN01z rb1T4ToINh3nrMHv9KUN7bVI4i1QkDe40sk1HypMMD+5uhuVReL43VUco YI0kQadBi9mQD/npb4SI2ahBNEi7t0G4k9+UB1Dahc+9sY6EGRkyEOlvE XmyDCHahLG2kEQKNLwPewSv6VwxngJ8a5+OD9z23Lw2y48ShLtTMf1mpu 4XELcY7iW1K2s6TORWJqyZomWlsRyfmLScw7Vtqp50ml/LOINwYMJGbNa A==; X-CSE-ConnectionGUID: CNjo504WRoqxrpNB6qIJkg== X-CSE-MsgGUID: /TdEyxieR/WFG7occXIwmQ== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765065" X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765065" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:52:58 -0700 X-CSE-ConnectionGUID: 0EofOn55SR2Euldm+wgHUQ== X-CSE-MsgGUID: MCXsC1bqTwCh8GgxK0j1bQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166201" Received: from allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:52:58 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 08/11] iommu/vt-d: Split intel_iommu_enforce_cache_coherency() Date: Mon, 14 Jul 2025 12:50:25 +0800 Message-ID: <20250714045028.958850-9-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org 
List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Jason Gunthorpe First Stage and Second Stage have very different ways to deny no-snoop. The first stage uses the PGSNP bit which is global per-PASID so enabling requires loading new PASID entries for all the attached devices. Second stage uses a bit per PTE, so enabling just requires telling future maps to set the bit. Since we now have two domain ops we can have two functions that can directly code their required actions instead of a bunch of logic dancing around use_first_level. Combine domain_set_force_snooping() into the new functions since they are the only caller. Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/6-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidi= a.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 47 +++++++++++++++++-------------------- 1 file changed, 22 insertions(+), 25 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index b7b1a3d2cbfc..95619640b027 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3643,44 +3643,41 @@ static bool domain_support_force_snooping(struct dm= ar_domain *domain) return support; } =20 -static void domain_set_force_snooping(struct dmar_domain *domain) +static bool intel_iommu_enforce_cache_coherency_fs(struct iommu_domain *do= main) { + struct dmar_domain *dmar_domain =3D to_dmar_domain(domain); struct device_domain_info *info; =20 - assert_spin_locked(&domain->lock); - /* - * Second level page table supports per-PTE snoop control. The - * iommu_map() interface will handle this by setting SNP bit. 
- */ - if (!domain->use_first_level) { - domain->set_pte_snp =3D true; - return; - } + guard(spinlock_irqsave)(&dmar_domain->lock); =20 - list_for_each_entry(info, &domain->devices, link) + if (dmar_domain->force_snooping) + return true; + + if (!domain_support_force_snooping(dmar_domain)) + return false; + + dmar_domain->force_snooping =3D true; + list_for_each_entry(info, &dmar_domain->devices, link) intel_pasid_setup_page_snoop_control(info->iommu, info->dev, IOMMU_NO_PASID); + return true; } =20 -static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domai= n) +static bool intel_iommu_enforce_cache_coherency_ss(struct iommu_domain *do= main) { struct dmar_domain *dmar_domain =3D to_dmar_domain(domain); - unsigned long flags; =20 - if (dmar_domain->force_snooping) - return true; - - spin_lock_irqsave(&dmar_domain->lock, flags); + guard(spinlock_irqsave)(&dmar_domain->lock); if (!domain_support_force_snooping(dmar_domain) || - (!dmar_domain->use_first_level && dmar_domain->has_mappings)) { - spin_unlock_irqrestore(&dmar_domain->lock, flags); + dmar_domain->has_mappings) return false; - } =20 - domain_set_force_snooping(dmar_domain); + /* + * Second level page table supports per-PTE snoop control. The + * iommu_map() interface will handle this by setting SNP bit. 
+ */ + dmar_domain->set_pte_snp =3D true; dmar_domain->force_snooping =3D true; - spin_unlock_irqrestore(&dmar_domain->lock, flags); - return true; } =20 @@ -4407,7 +4404,7 @@ const struct iommu_domain_ops intel_fs_paging_domain_= ops =3D { .iotlb_sync =3D intel_iommu_tlb_sync, .iova_to_phys =3D intel_iommu_iova_to_phys, .free =3D intel_iommu_domain_free, - .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency, + .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency_fs, }; =20 const struct iommu_domain_ops intel_ss_paging_domain_ops =3D { @@ -4420,7 +4417,7 @@ const struct iommu_domain_ops intel_ss_paging_domain_= ops =3D { .iotlb_sync =3D intel_iommu_tlb_sync, .iova_to_phys =3D intel_iommu_iova_to_phys, .free =3D intel_iommu_domain_free, - .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency, + .enforce_cache_coherency =3D intel_iommu_enforce_cache_coherency_ss, }; =20 const struct iommu_ops intel_iommu_ops =3D { --=20 2.43.0 From nobody Tue Oct 7 05:42:06 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0A2622259E for ; Mon, 14 Jul 2025 04:53:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468784; cv=none; b=Qo4n0bfDfnvkXDN2jJfaBrJ7tFGVbZnGwIk5zXpZKqQuG1sm79NYG1JASRXg7UdnNrp9aMMalmNRM2WMrMobkXCZR7/ZiiWPmUMCDA+ib8/II9k2A65PZPeJQWzc586/dP75Taj1yqn6wAbg2JKgJ0VURqIuVSwP6sF8R78r3NE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468784; c=relaxed/simple; bh=21iTm4okT/mj4EyVviTYK024U2GXvJIK9ISDdy7bclo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=TKLAlA4az7mD8bjOBj+9OrYuX+6hLcPDwvrK7jSOxmnWfFQbqnZfyORtQhzP9xU9GiUo+0wdS9RfIWYdbL+UpP4SZdMN9b4bBhkc2dU3gKFpy1CiIerqvp7tvOZcwz26a7KuA0E7Nq/0zNZpTOvBD3lIBYw3Jj80D6/X3h3amio= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=nO/Jbhz0; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="nO/Jbhz0" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468782; x=1784004782; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=21iTm4okT/mj4EyVviTYK024U2GXvJIK9ISDdy7bclo=; b=nO/Jbhz0evL0enljpaR+w9ko+I02yKbYgieuS0Gs+y3C/eH5XL85rxPx I2xiMXeUmIsXjVZlFBgu+O/r+UUxa4RHxNTmEiFATFmtwwR8D9ZkH1oLa 0dj+DAw32OZlWYRq1PO9Jg5yDXb59fYsYqntS7zc/uWTbtIbPxRjB87Jk MdZPNVZicQ/KFBEzT0DqQ/p/NF4/84wBai0Aqodf0WbfTrepS1vZU1gDt p37azgZx7R4ADQYinovFT1mRUbioYzsas2KBcBk319hrpKQiqvKw8f5dC SubSzexrsPubLvOMMVFG9pTwSWOx1bdElLBeuy2otaTIfaTSldE4eItM4 g==; X-CSE-ConnectionGUID: aZXau4JYSk2uYspIkLSqNQ== X-CSE-MsgGUID: RExxSPy5QwqtyxvdOq7PTQ== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765081" X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765081" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:53:02 -0700 X-CSE-ConnectionGUID: Oote/yaxTva9hxmCI4etyw== X-CSE-MsgGUID: 1LqInSeyTmeEFhdBMZNLpg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166210" Received: from 
allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:52:59 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 09/11] iommu/vt-d: Split paging_domain_compatible() Date: Mon, 14 Jul 2025 12:50:26 +0800 Message-ID: <20250714045028.958850-10-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Jason Gunthorpe Make First/Second stage specific functions that follow the same pattern as intel_iommu_domain_alloc_first/second_stage() for computing EOPNOTSUPP. This makes the code easier to understand: if we could not create a domain with these parameters for this IOMMU instance, then the domain is certainly not compatible with it. Check superpage support directly against the per-stage cap bits and the pgsize_bitmap. Add a note that force_snooping is read without locking. The locking needs to cover both the compatibility check and the addition of the device to the list.
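The superpage part of the check can be sketched in user space as follows. This is an illustration only: the SZ_* constants are redefined locally, the function names are hypothetical, and the meaning of the two sslps bits (bit 0 for 2M, bit 1 for 1G) is taken from how the diff below tests them.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel's size constants. */
#define SZ_4K (UINT64_C(1) << 12)
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/*
 * Second stage: the domain may use a superpage size only if the IOMMU
 * advertises it. sslps bit 0 is assumed to mean 2M, bit 1 to mean 1G.
 */
int ss_pgsize_compatible(uint64_t pgsize_bitmap, unsigned int sslps)
{
	if (!(sslps & 1u) && (pgsize_bitmap & SZ_2M))
		return -EINVAL;
	if (!(sslps & 2u) && (pgsize_bitmap & SZ_1G))
		return -EINVAL;
	return 0;
}

/*
 * First stage: only 1G support is checked here, mirroring the
 * cap_fl1gp_support() test in the patch.
 */
int fs_pgsize_compatible(uint64_t pgsize_bitmap, bool fl1gp)
{
	if (!fl1gp && (pgsize_bitmap & SZ_1G))
		return -EINVAL;
	return 0;
}
```

A domain that only ever maps 4K pages passes on any hardware in this model; a domain whose bitmap includes a superpage size is rejected by an IOMMU that lacks that size.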
Reviewed-by: Kevin Tian Signed-off-by: Jason Gunthorpe Link: https://lore.kernel.org/r/7-v3-dbbe6f7e7ae3+124ffe-vtd_prep_jgg@nvidi= a.com Signed-off-by: Lu Baolu --- drivers/iommu/intel/iommu.c | 66 ++++++++++++++++++++++++++++++------- 1 file changed, 54 insertions(+), 12 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 95619640b027..ddb981ad0e19 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3431,33 +3431,75 @@ static void intel_iommu_domain_free(struct iommu_do= main *domain) kfree(dmar_domain); } =20 +static int paging_domain_compatible_first_stage(struct dmar_domain *dmar_d= omain, + struct intel_iommu *iommu) +{ + if (WARN_ON(dmar_domain->domain.dirty_ops || + dmar_domain->nested_parent)) + return -EINVAL; + + /* Only SL is available in legacy mode */ + if (!sm_supported(iommu) || !ecap_flts(iommu->ecap)) + return -EINVAL; + + /* Same page size support */ + if (!cap_fl1gp_support(iommu->cap) && + (dmar_domain->domain.pgsize_bitmap & SZ_1G)) + return -EINVAL; + return 0; +} + +static int +paging_domain_compatible_second_stage(struct dmar_domain *dmar_domain, + struct intel_iommu *iommu) +{ + unsigned int sslps =3D cap_super_page_val(iommu->cap); + + if (dmar_domain->domain.dirty_ops && !ssads_supported(iommu)) + return -EINVAL; + if (dmar_domain->nested_parent && !nested_supported(iommu)) + return -EINVAL; + + /* Legacy mode always supports second stage */ + if (sm_supported(iommu) && !ecap_slts(iommu->ecap)) + return -EINVAL; + + /* Same page size support */ + if (!(sslps & BIT(0)) && (dmar_domain->domain.pgsize_bitmap & SZ_2M)) + return -EINVAL; + if (!(sslps & BIT(1)) && (dmar_domain->domain.pgsize_bitmap & SZ_1G)) + return -EINVAL; + return 0; +} + int paging_domain_compatible(struct iommu_domain *domain, struct device *d= ev) { struct device_domain_info *info =3D dev_iommu_priv_get(dev); struct dmar_domain *dmar_domain =3D to_dmar_domain(domain); struct intel_iommu *iommu =3D 
info->iommu; + int ret =3D -EINVAL; int addr_width; =20 - if (WARN_ON_ONCE(!(domain->type & __IOMMU_DOMAIN_PAGING))) - return -EPERM; + if (intel_domain_is_fs_paging(dmar_domain)) + ret =3D paging_domain_compatible_first_stage(dmar_domain, iommu); + else if (intel_domain_is_ss_paging(dmar_domain)) + ret =3D paging_domain_compatible_second_stage(dmar_domain, iommu); + else if (WARN_ON(true)) + ret =3D -EINVAL; + if (ret) + return ret; =20 + /* + * FIXME this is locked wrong, it needs to be under the + * dmar_domain->lock + */ if (dmar_domain->force_snooping && !ecap_sc_support(iommu->ecap)) return -EINVAL; =20 - if (domain->dirty_ops && !ssads_supported(iommu)) - return -EINVAL; - if (dmar_domain->iommu_coherency !=3D iommu_paging_structure_coherency(iommu)) return -EINVAL; =20 - if (dmar_domain->iommu_superpage !=3D - iommu_superpage_capability(iommu, dmar_domain->use_first_level)) - return -EINVAL; - - if (dmar_domain->use_first_level && - (!sm_supported(iommu) || !ecap_flts(iommu->ecap))) - return -EINVAL; =20 /* check if this iommu agaw is sufficient for max mapped address */ addr_width =3D agaw_to_width(iommu->agaw); --=20 2.43.0 From nobody Tue Oct 7 05:42:06 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 002E11F8BD6 for ; Mon, 14 Jul 2025 04:53:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468785; cv=none; b=D6trnSWpR+BcowVE2wwdYeZtMDm/tXQBbx2eYz7Pp68v2bG/+fgAjypphx8xvt412ZMotxOp5MOXzJdqxokgwUAnDLmA9YclrENiq6Wu3sFL7Spp/NN6HHPDTN0uP+B0mQZZbh+LV1Jvx0l7JO/+gpJfivWlJ55xMvmvdgkKjw8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468785; c=relaxed/simple; bh=DzEkyFZQ28u7x7SePqyWjjDO5a/QUmvg26E6ZupBXSw=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RvZytcpqT3O515Z6uFnm2E7sUYHq3/gpIgm2l4rAjp6iGywcCs/7B+NRQxBNq2n3dAmnkB1kgq5SOLtnFIS/gvM01WJjyPcbMZ0hQQ1zOOZdFQsdxyzmV4wIO6yV4WkFUx+WEy7BwXfhgkPven133JywcIaYrdVWjrPScnLP/Ko= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=mHJd7sXQ; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="mHJd7sXQ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468784; x=1784004784; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DzEkyFZQ28u7x7SePqyWjjDO5a/QUmvg26E6ZupBXSw=; b=mHJd7sXQl3Am/BMcrvkXyJq0sp2HLZQMl3uKTGsyPxliToh74KOVTcCK 5fkBf6AIVO4bYE+eygBnURstXhJPs6v4U+DtySt5eiGwe68g2J1mku/8Q EsUEY/iC/FajCZApU7txLhLjc/hpQ7O+WH81yQhF1NLbFnSe3P8UGTTvl RW1kp/nQre++pn7meJ3ivK0XhbJPMQxEITrIWhnerrK2Z6vvFYcXzBvE7 kCW69XwLO2It6Jd4HF89D+GB+wAo/IPkmz9NCMo0v9ZD+cCnXl6TArSNL m9ZCnIib5mdp+VoVzlYbLi/PTGC4s2/+fgqnjJOmIIM8GD8iffXYodOTe g==; X-CSE-ConnectionGUID: exOHJg+yShqma7fxNG/3aQ== X-CSE-MsgGUID: YKROnb5JQL+T1fYSfCSJFQ== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765085" X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765085" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:53:03 -0700 X-CSE-ConnectionGUID: Lt7pOygmSAur0rtlYmbjuQ== X-CSE-MsgGUID: 1wMDH8jnTxe1+aIhgi0kaQ== X-ExtLoop1: 1 X-IronPort-AV: 
E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166219" Received: from allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:53:02 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 10/11] iommu/vt-d: Fix missing PASID in dev TLB flush with cache_tag_flush_all Date: Mon, 14 Jul 2025 12:50:27 +0800 Message-ID: <20250714045028.958850-11-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ethan Milon The function cache_tag_flush_all() was originally implemented with incorrect device TLB invalidation logic that does not handle PASID, in commit c4d27ffaa8eb ("iommu/vt-d: Add cache tag invalidation helpers") This causes regressions where full address space TLB invalidations occur with a PASID attached, such as during transparent hugepage unmapping in SVA configurations or when calling iommu_flush_iotlb_all(). In these cases, the device receives a TLB invalidation that lacks PASID. This incorrect logic was later extracted into cache_tag_flush_devtlb_all(), in commit 3297d047cd7f ("iommu/vt-d: Refactor IOTLB and Dev-IOTLB flush for batching") The fix replaces the call to cache_tag_flush_devtlb_all() with cache_tag_flush_devtlb_psi(), which properly handles PASID. 
Fixes: 4f609dbff51b ("iommu/vt-d: Use cache helpers in arch_invalidate_seco= ndary_tlbs") Fixes: 4e589a53685c ("iommu/vt-d: Use cache_tag_flush_all() in flush_iotlb_= all") Signed-off-by: Ethan Milon Link: https://lore.kernel.org/r/20250708214821.30967-1-ethan.milon@eviden.c= om Signed-off-by: Lu Baolu --- drivers/iommu/intel/cache.c | 18 +----------------- 1 file changed, 1 insertion(+), 17 deletions(-) diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index 876630e10849..071f78e67fcb 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -422,22 +422,6 @@ static void cache_tag_flush_devtlb_psi(struct dmar_dom= ain *domain, struct cache_ domain->qi_batch); } =20 -static void cache_tag_flush_devtlb_all(struct dmar_domain *domain, struct = cache_tag *tag) -{ - struct intel_iommu *iommu =3D tag->iommu; - struct device_domain_info *info; - u16 sid; - - info =3D dev_iommu_priv_get(tag->dev); - sid =3D PCI_DEVID(info->bus, info->devfn); - - qi_batch_add_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, - MAX_AGAW_PFN_WIDTH, domain->qi_batch); - if (info->dtlb_extra_inval) - qi_batch_add_dev_iotlb(iommu, sid, info->pfsid, info->ats_qdep, 0, - MAX_AGAW_PFN_WIDTH, domain->qi_batch); -} - /* * Invalidates a range of IOVA from @start (inclusive) to @end (inclusive) * when the memory mappings in the target domain have been modified. 
@@ -508,7 +492,7 @@ void cache_tag_flush_all(struct dmar_domain *domain) break; case CACHE_TAG_DEVTLB: case CACHE_TAG_NESTING_DEVTLB: - cache_tag_flush_devtlb_all(domain, tag); + cache_tag_flush_devtlb_psi(domain, tag, 0, MAX_AGAW_PFN_WIDTH); break; } =20 --=20 2.43.0 From nobody Tue Oct 7 05:42:06 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A39B022259F for ; Mon, 14 Jul 2025 04:53:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468786; cv=none; b=si5S+e0xNLcXq6EF8nSbw57EvZ68O6XIFjc2+gZiyukrYVxxmjEl/jR5619Yldf5str4c892vcVB9h6LR6I9KEXXFeJEgYQ/ZJGFUWg9ge0iVrH/JASiXRYLjziSr873GJVRV2gE44PP3lUN5fGNiE5PDQ7C7sAjdykqFFDhpHU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752468786; c=relaxed/simple; bh=64tzP1tfDuZ1RLCBuSwPHPtDvngWJGj0WwdSjW827T8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=mxjzikgMnawak53UlljmSpBcT42RF4jNb4lg91BUWION5GszViBasw4kDs0sP1kVfoLxRnkrSN7Z0x9DPm//D8RVfTJhZKW6bOFBtl9GPaBwI+QuocJl9asaQRA8L/ps11K+VtUvy0WXEi4SxpfjCYKWYWQbC8pnucG3w2wyitQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=RTDljtOI; arc=none smtp.client-ip=192.198.163.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="RTDljtOI" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1752468784; x=1784004784; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=64tzP1tfDuZ1RLCBuSwPHPtDvngWJGj0WwdSjW827T8=; b=RTDljtOIlYvpyivuSCMlJMMUZNLgHeVN3I0x/Xk2Wk4mUw7hPXUOxcJQ 5bU0NjiUhXok5L8UbtUhC6KFU1Qm3Bz3NgKJQ4ZPJVmCXaeQit5lkm5wO JqivrNy6rFZXsldYoLTAb6DgHV030REFVdiquZ3Dmy56tuu9f7QqI/EWr 9pMtVojnVFTwS0fLvihTb5wwIIYujqdH/1N2MVYX7xm6QKSJCHSEyNu0k rlyF91hLrnUNaJ/LMfl05jAQgxXb7vwr7vYWoT8jXkuY34I2BImicyL7t 1RTMmnk+VVCEvO+4KthN2RM/sx/unyxhVXBIrLaQHplu58hCAHkLxu295 w==; X-CSE-ConnectionGUID: ZrVAKu+FT3q33cK43M7stQ== X-CSE-MsgGUID: q+i6oj/pTomy/qedw4TTPg== X-IronPort-AV: E=McAfee;i="6800,10657,11491"; a="53765088" X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="53765088" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa113.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jul 2025 21:53:04 -0700 X-CSE-ConnectionGUID: sI5papjZSYy32hU6Zm8KQQ== X-CSE-MsgGUID: S/s6EJZzTHeJOQvwJCtiYw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.16,310,1744095600"; d="scan'208";a="161166227" Received: from allen-box.sh.intel.com ([10.239.159.52]) by orviesa003.jf.intel.com with ESMTP; 13 Jul 2025 21:53:03 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 11/11] iommu/vt-d: Deduplicate cache_tag_flush_all by reusing flush_range Date: Mon, 14 Jul 2025 12:50:28 +0800 Message-ID: <20250714045028.958850-12-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250714045028.958850-1-baolu.lu@linux.intel.com> References: <20250714045028.958850-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Ethan Milon The logic in cache_tag_flush_all() to iterate over cache tags 
and issue TLB invalidations is largely duplicated in cache_tag_flush_range(), with the only difference being the range parameters. Extend cache_tag_flush_range() to handle a full address space flush when called with start =3D 0 and end =3D ULONG_MAX. This allows cache_tag_flush_all() to simply delegate to cache_tag_flush_range(). Signed-off-by: Ethan Milon Link: https://lore.kernel.org/r/20250708214821.30967-2-ethan.milon@eviden.c= om Signed-off-by: Lu Baolu --- drivers/iommu/intel/cache.c | 34 ++++++++-------------------------- drivers/iommu/intel/trace.h | 5 ----- 2 files changed, 8 insertions(+), 31 deletions(-) diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c index 071f78e67fcb..265e7290256b 100644 --- a/drivers/iommu/intel/cache.c +++ b/drivers/iommu/intel/cache.c @@ -434,7 +434,13 @@ void cache_tag_flush_range(struct dmar_domain *domain,= unsigned long start, struct cache_tag *tag; unsigned long flags; =20 - addr =3D calculate_psi_aligned_address(start, end, &pages, &mask); + if (start =3D=3D 0 && end =3D=3D ULONG_MAX) { + addr =3D 0; + pages =3D -1; + mask =3D MAX_AGAW_PFN_WIDTH; + } else { + addr =3D calculate_psi_aligned_address(start, end, &pages, &mask); + } =20 spin_lock_irqsave(&domain->cache_lock, flags); list_for_each_entry(tag, &domain->cache_tags, node) { @@ -475,31 +481,7 @@ void cache_tag_flush_range(struct dmar_domain *domain,= unsigned long start, */ void cache_tag_flush_all(struct dmar_domain *domain) { - struct intel_iommu *iommu =3D NULL; - struct cache_tag *tag; - unsigned long flags; - - spin_lock_irqsave(&domain->cache_lock, flags); - list_for_each_entry(tag, &domain->cache_tags, node) { - if (iommu && iommu !=3D tag->iommu) - qi_batch_flush_descs(iommu, domain->qi_batch); - iommu =3D tag->iommu; - - switch (tag->type) { - case CACHE_TAG_IOTLB: - case CACHE_TAG_NESTING_IOTLB: - cache_tag_flush_iotlb(domain, tag, 0, -1, 0, 0); - break; - case CACHE_TAG_DEVTLB: - case CACHE_TAG_NESTING_DEVTLB: -
cache_tag_flush_devtlb_psi(domain, tag, 0, MAX_AGAW_PFN_WIDTH); - break; - } - - trace_cache_tag_flush_all(tag); - } - qi_batch_flush_descs(iommu, domain->qi_batch); - spin_unlock_irqrestore(&domain->cache_lock, flags); + cache_tag_flush_range(domain, 0, ULONG_MAX, 0); } =20 /* diff --git a/drivers/iommu/intel/trace.h b/drivers/iommu/intel/trace.h index 9defdae6ebae..6311ba3f1691 100644 --- a/drivers/iommu/intel/trace.h +++ b/drivers/iommu/intel/trace.h @@ -130,11 +130,6 @@ DEFINE_EVENT(cache_tag_log, cache_tag_unassign, TP_ARGS(tag) ); =20 -DEFINE_EVENT(cache_tag_log, cache_tag_flush_all, - TP_PROTO(struct cache_tag *tag), - TP_ARGS(tag) -); - DECLARE_EVENT_CLASS(cache_tag_flush, TP_PROTO(struct cache_tag *tag, unsigned long start, unsigned long end, unsigned long addr, unsigned long pages, unsigned long mask), --=20 2.43.0