From nobody Sat Feb 7 21:24:07 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 49EB242A93 for ; Mon, 10 Mar 2025 02:47:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574865; cv=none; b=PMVmMk+VfB0S5qZolToyU0X4EE7O4ZNKky/9GlrbNFz8FGJJ06VZ+QWbjyMP+ujwabSlVXLBHQ+fwp4901vpmUILun7CvacL+MTrwF2OWtdEUpRFpEQKPXi9T+1ggrl9Alq9msHYUOFD8IpNwnCK79PIToqlUTtju0mby7+eskg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574865; c=relaxed/simple; bh=LbmK2ae0pXBLOUD1NM6z/Pl0jrpB+bHqVqnVzchGmTw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=PjRrO2o9tFkmUyGiFKdq2fnfcmfqq5PvNXnI1pB2xirmmc1etmc5LQ/22yC/okRHX6oLOREr1t4Jue3EArSaDT1rbxIDDJc4Lf7+To7kLCIgtLUwDgp1xOI1PG1o2X2rYvfSiEgXZYLgUGQO6R+1cPNzUM0z43OXYXX9ACsBN/M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=j2XlNm0Z; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="j2XlNm0Z" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741574863; x=1773110863; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LbmK2ae0pXBLOUD1NM6z/Pl0jrpB+bHqVqnVzchGmTw=; 
b=j2XlNm0Zgfz5/7I2b0xr8yRE5HejhP1WIstlS1qomi+uQ6esyk075axM 7x6dbaJ6VktPbDA+/knPBsSVKJp0idQAw1edpO31jjP2W+fb8ZV8CoInC rkfJL0VzFPLrkDD0iykoo9Lajdcl8H7ULzfRCIwWPnit/7zRV3Tbkopny jHxc57dX6Xz3SHv9i3PA1wPlnhir0w3MwmBHUeI9qwdmPUE+6pE6yKhrD TC7vzGPAJ8dijWBoMkQP3E7N1yr9ciVqHRLtxOWk5WIMTX1sNyvlgfypO 5O6XKQGWcaD21HrIZF1D3+iMQO+rbUkRFFMy3Z4YBPURDEBRuIjHMS9qA w==; X-CSE-ConnectionGUID: ILFKi8U1SemrqiUrIFsiQg== X-CSE-MsgGUID: yGmnLQd3QEWjwlDeI4q0Ow== X-IronPort-AV: E=McAfee;i="6700,10204,11368"; a="42401588" X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="42401588" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2025 19:47:38 -0700 X-CSE-ConnectionGUID: jBWc2Th7RqKSLvyDrCiVhA== X-CSE-MsgGUID: JSnsBpMyQIGpalNi+xP6bA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="143079227" Received: from allen-box.sh.intel.com ([10.239.159.52]) by fmviesa002.fm.intel.com with ESMTP; 09 Mar 2025 19:47:34 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 1/6] iommu/vt-d: Fix system hang on reboot -f Date: Mon, 10 Mar 2025 10:47:44 +0800 Message-ID: <20250310024749.3702681-2-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250310024749.3702681-1-baolu.lu@linux.intel.com> References: <20250310024749.3702681-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yunhui Cui We found that executing the command ./a.out &;reboot -f (where a.out is a program that only executes a while(1) infinite loop) can probabilistically cause the system to hang in the intel_iommu_shutdown() function, rendering it unresponsive. 
Through analysis, we identified the following contributing factors:

1. The reboot -f command does not make the kernel notify the application
   layer to perform cleanup actions, so the application keeps running.
2. When the kernel reaches intel_iommu_shutdown(), only the BSP
   (Bootstrap Processor) is operational in the system.
3. During the execution of intel_iommu_shutdown(),
   down_write(&dmar_global_lock) can cause the process to sleep and be
   scheduled out.
4. At this point the processor's interrupt flag is not cleared, so
   interrupts can still be accepted. However, only legacy device
   interrupts and NMIs (Non-Maskable Interrupts) can arrive, because all
   other interrupt routing has already been disabled. If no legacy
   interrupt or NMI occurs at this stage, the scheduler will not be able
   to run.
5. If the application that gets scheduled at this time is executing a
   while(1)-type loop, it cannot be preempted, so it loops forever and
   the system becomes unresponsive.

To resolve this issue, intel_iommu_shutdown() should not call
down_write(), which can cause the process to be scheduled out.
Furthermore, since only the BSP is running during the later stages of
the reboot, there is no need for protection against parallel access to
the DMAR (DMA Remapping) units. Therefore, the following lines can be
removed:

    down_write(&dmar_global_lock);
    up_write(&dmar_global_lock);

After testing, the issue has been resolved.

Fixes: 6c3a44ed3c55 ("iommu/vt-d: Turn off translations at shutdown")
Co-developed-by: Ethan Zhao
Signed-off-by: Ethan Zhao
Signed-off-by: Yunhui Cui
Link: https://lore.kernel.org/r/20250303062421.17929-1-cuiyunhui@bytedance.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index bf1f0c814348..25d31f8c129a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2871,16 +2871,19 @@ void intel_iommu_shutdown(void)
 	if (no_iommu || dmar_disabled)
 		return;
 
-	down_write(&dmar_global_lock);
+	/*
+	 * All other CPUs were brought down, hotplug interrupts were disabled,
+	 * no lock and RCU checking needed anymore
+	 */
+	list_for_each_entry(drhd, &dmar_drhd_units, list) {
+		iommu = drhd->iommu;
 
-	/* Disable PMRs explicitly here. */
-	for_each_iommu(iommu, drhd)
+		/* Disable PMRs explicitly here. */
 		iommu_disable_protect_mem_regions(iommu);
 
-	/* Make sure the IOMMUs are switched off */
-	intel_disable_iommus();
-
-	up_write(&dmar_global_lock);
+		/* Make sure the IOMMUs are switched off */
+		iommu_disable_translation(iommu);
+	}
 }
 
 static struct intel_iommu *dev_to_intel_iommu(struct device *dev)
-- 
2.43.0

From nobody Sat Feb 7 21:24:07 2026
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E9691AF0BC for ; Mon, 10 Mar 2025 02:47:43 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574866; cv=none; b=cTnayR6Rgou/8JeY6/AO4EwcBqQN1nIkeBq+LnZqw663LGiDRT2WlvGgE0tD834V0SLVkPCuLdLDYA6GCgH0vqEeGcJWLU9QCgvC9qZYzOPy7ViG/W62jmzaVuQL92zrJOaxy0oP5dfVyWDNuP56jUPof/g+JfRxjKXJsaHBh1I=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574866; c=relaxed/simple; bh=tUTDTesMrpiXmjW8gl3QjpxoqRcSuCz7vluHospBe0Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eWEOVVM7gspFsczI+siDaf5/W9hOkbMbO5cAc0GLb9ktJ1X5tE2aIlcsqHmapOnRVN1iO6iOcPC0/rfFMR4leAl53M1AuCc3vdX4xXgbsUNLP3tK/nCt1Vj3HNCeeL9DamUNMoBnjqwk4x+AcnswVg3QYPekh6UJ1jMxzEpJ9ZI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=LsKQHGEr; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="LsKQHGEr" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741574863; x=1773110863; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tUTDTesMrpiXmjW8gl3QjpxoqRcSuCz7vluHospBe0Q=; b=LsKQHGErc/ptomXCGguAVGtVOiMYab9VmZtNoK3j/a81r7PFamEUsnq3 xfC7LO6eHt3A5uF4XYPbnlJBcUGrGT5mery1aMkX3i2k8Csxy0kZw+1Z6 mF7Q2VBrgk8lbo+bXlldsTRRUgdJs6uFW60DPBJlohilXQU6gzIMDDKCU DdSSQV8kaKfgIuOH9JFEHrUNVW52sZLFD1v7wcTiRMxnaYRBdka3h7pd+ cW+54IpxLV5NJj3ftwQvH9UVCMxTfcKkrF2kBAivSfEFAB6/0exQwcag4 DUN1ieiQp68MvJThWvr1bR8MStPziHp+S8dEPjZJGxlEXwvtIPvx5RAg0 w==; X-CSE-ConnectionGUID: wjNPvrYdRUqAqm5cIdCq1Q== X-CSE-MsgGUID: e4+Y/VdmSy+zejll9KwddQ== X-IronPort-AV: E=McAfee;i="6700,10204,11368"; a="42401596" X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="42401596" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa111.jf.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2025 19:47:39 -0700
X-CSE-ConnectionGUID: rUxLmraHSRqi6FBMQ1zLfw==
X-CSE-MsgGUID: w9yPtq5JQCOrUhK5KWWE5g==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="143079230"
Received: from allen-box.sh.intel.com ([10.239.159.52]) by fmviesa002.fm.intel.com with ESMTP; 09 Mar 2025 19:47:35 -0700
From: Lu Baolu
To: Joerg Roedel
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 2/6] iommu/vt-d: Use virt_to_phys()
Date: Mon, 10 Mar 2025 10:47:45 +0800
Message-ID: <20250310024749.3702681-3-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250310024749.3702681-1-baolu.lu@linux.intel.com>
References: <20250310024749.3702681-1-baolu.lu@linux.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Jason Gunthorpe

If all the inlines are unwound virt_to_dma_pfn() is simply:

   return page_to_pfn(virt_to_page(p)) << (PAGE_SHIFT - VTD_PAGE_SHIFT);

Which can be re-arranged to:

   (page_to_pfn(virt_to_page(p)) << PAGE_SHIFT) >> VTD_PAGE_SHIFT

The only caller is:

   ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT)

re-arranged to:

   ((page_to_pfn(virt_to_page(tmp_page)) << PAGE_SHIFT) >> VTD_PAGE_SHIFT) << VTD_PAGE_SHIFT

Which simplifies to:

   page_to_pfn(virt_to_page(tmp_page)) << PAGE_SHIFT

That is the same as virt_to_phys(tmp_page), so just remove all of this.

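The shift algebra above can be checked with a small user-space sketch. This is an illustration under stated assumptions, not kernel code: the helper names (old_pte_addr, new_pte_addr, forms_agree) are hypothetical, the MM page shift is passed as a parameter instead of using the compile-time PAGE_SHIFT, and new_pte_addr() only models what virt_to_phys() returns for a page-aligned pointer such as tmp_page.

```c
#include <assert.h>
#include <stdint.h>

#define VTD_PAGE_SHIFT 12 /* VT-d pages are 4 KiB */

/* Old form: convert the MM pfn to a VT-d pfn, then shift it back up. */
static uint64_t old_pte_addr(uint64_t mm_pfn, unsigned int page_shift)
{
	uint64_t dma_pfn = mm_pfn << (page_shift - VTD_PAGE_SHIFT);

	return dma_pfn << VTD_PAGE_SHIFT;
}

/* New form: the physical address of the page (hypothetical model). */
static uint64_t new_pte_addr(uint64_t mm_pfn, unsigned int page_shift)
{
	return mm_pfn << page_shift;
}

/* Returns 1 when both forms agree for every pfn in [0, limit). */
static int forms_agree(uint64_t limit, unsigned int page_shift)
{
	/* VT-d pages must never be larger than MM pages. */
	assert(page_shift >= VTD_PAGE_SHIFT);

	for (uint64_t pfn = 0; pfn < limit; pfn++)
		if (old_pte_addr(pfn, page_shift) != new_pte_addr(pfn, page_shift))
			return 0;
	return 1;
}
```

Calling forms_agree() with page shifts of 12 and 16 exercises both the case where MM and VT-d pages have the same size and the case where MM pages are larger; the two forms only differ if the intermediate shift overflows, which the old code was equally exposed to.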
Reviewed-by: Lu Baolu
Signed-off-by: Jason Gunthorpe
Link: https://lore.kernel.org/r/8-v3-e797f4dc6918+93057-iommu_pages_jgg@nvidia.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c |  3 ++-
 drivers/iommu/intel/iommu.h | 19 -------------------
 2 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 25d31f8c129a..e7152ea77393 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -737,7 +737,8 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
 				return NULL;
 
 			domain_flush_cache(domain, tmp_page, VTD_PAGE_SIZE);
-			pteval = ((uint64_t)virt_to_dma_pfn(tmp_page) << VTD_PAGE_SHIFT) | DMA_PTE_READ | DMA_PTE_WRITE;
+			pteval = virt_to_phys(tmp_page) | DMA_PTE_READ |
+				 DMA_PTE_WRITE;
 			if (domain->use_first_level)
 				pteval |= DMA_FL_PTE_US | DMA_FL_PTE_ACCESS;
 
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 6ea7bbe26b19..dd980808998d 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -953,25 +953,6 @@ static inline unsigned long lvl_to_nr_pages(unsigned int lvl)
 	return 1UL << min_t(int, (lvl - 1) * LEVEL_STRIDE, MAX_AGAW_PFN_WIDTH);
 }
 
-/* VT-d pages must always be _smaller_ than MM pages. Otherwise things
-   are never going to work. */
-static inline unsigned long mm_to_dma_pfn_start(unsigned long mm_pfn)
-{
-	return mm_pfn << (PAGE_SHIFT - VTD_PAGE_SHIFT);
-}
-static inline unsigned long mm_to_dma_pfn_end(unsigned long mm_pfn)
-{
-	return ((mm_pfn + 1) << (PAGE_SHIFT - VTD_PAGE_SHIFT)) - 1;
-}
-static inline unsigned long page_to_dma_pfn(struct page *pg)
-{
-	return mm_to_dma_pfn_start(page_to_pfn(pg));
-}
-static inline unsigned long virt_to_dma_pfn(void *p)
-{
-	return page_to_dma_pfn(virt_to_page(p));
-}
-
 static inline void context_set_present(struct context_entry *context)
 {
 	context->lo |= 1;
-- 
2.43.0

From nobody Sat Feb 7 21:24:07 2026
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E413B1B0F33 for ; Mon, 10 Mar 2025 02:47:43 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574865; cv=none; b=V1nUE23uYVrxBmcyZVu6j2CQNqCf2hsMY5aG8wl2EMVe7O+I7OXeiXh6kCUSskL+ZbdU2zmyJCKIcsIZb5/1YKB0O9KGoAqDebWOA0m2cmKIRgjdI9SnxRualxDAnQq9lw+cohA9a+4oSW4tbrM30aCv+b1hkxRMYI/bTQ7QjqQ=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574865; c=relaxed/simple; bh=eRLZCAOXumm8xciIXwNBy9yoUgyMWxGZG1wMruoGihQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=g23mmqVONg9nVms0JVsbkOSVx9Wpo0uSyeIXFuUrz50WrojcrEAs6E6lvVilNOG6k+WmhVqvT5b7JMdR99uV8pZRIEqfh0wfDbYzLAtDTocUWkCzvzGT6sSD8McT4xFkETX/wTDa5jCm+QNeevG7qsHoid1Hpu35EcSb90u6/CI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ZCGlxPxv; arc=none smtp.client-ip=198.175.65.19
Authentication-Results:
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ZCGlxPxv" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741574864; x=1773110864; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eRLZCAOXumm8xciIXwNBy9yoUgyMWxGZG1wMruoGihQ=; b=ZCGlxPxvEivXQ48GrOSRcj7vak4sqhpErWC37xMKMe5fsbl/tac9m71m 7AxkYcZvvTtuu0yXS4Lq/EjBKdowUy2dcX5Cz4pWDinJh5+EPCrxzG1zk 6lVwZzs5hM7ULCeFBdI+2qXXiUBOFv9smcNVb5Dc9dolH+NWryS7DXdGu oF2u7B79GHj4XFS981SbRJ4noFQWGkOilKU90yult03vJLgIAV3WRm8lL 9SHbft+nvTH5ZHFcLaKSg7wmNphN6jDW9d+20dkVw4j2EXq+CCr3Wbxx7 X7Qs+us6YlhNWP4z/7A1YoEbtcrdL6K0ka0iQodfta7zYn0RC07JkGdTV Q==; X-CSE-ConnectionGUID: DJbcWFNYSACGNkJSg0tBYg== X-CSE-MsgGUID: E5q9QR40TqqVFDyRNT1bdw== X-IronPort-AV: E=McAfee;i="6700,10204,11368"; a="42401602" X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="42401602" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2025 19:47:39 -0700 X-CSE-ConnectionGUID: A/PkadB7QAW0CFl7ygzhuw== X-CSE-MsgGUID: pnw+4NfiQiyn+0r25QAzTw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="143079231" Received: from allen-box.sh.intel.com ([10.239.159.52]) by fmviesa002.fm.intel.com with ESMTP; 09 Mar 2025 19:47:37 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 3/6] iommu/vt-d: Check if SVA is supported when attaching the SVA domain Date: Mon, 10 Mar 2025 10:47:46 +0800 Message-ID: <20250310024749.3702681-4-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250310024749.3702681-1-baolu.lu@linux.intel.com> 
References: <20250310024749.3702681-1-baolu.lu@linux.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Jason Gunthorpe

Attaching an SVA domain should fail if SVA is not supported. Move the
check for SVA support out of IOMMU_DEV_FEAT_SVA and into attach.

Also check when allocating an SVA domain, to match other drivers.

Signed-off-by: Jason Gunthorpe
Signed-off-by: Lu Baolu
Reviewed-by: Kevin Tian
Reviewed-by: Yi Liu
Tested-by: Zhangfei Gao
Link: https://lore.kernel.org/r/20250228092631.3425464-3-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 37 +------------------------------
 drivers/iommu/intel/svm.c   | 43 +++++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e7152ea77393..58bff6fe3a93 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3862,41 +3862,6 @@ static struct iommu_group *intel_iommu_device_group(struct device *dev)
 	return generic_device_group(dev);
 }
 
-static int intel_iommu_enable_sva(struct device *dev)
-{
-	struct device_domain_info *info = dev_iommu_priv_get(dev);
-	struct intel_iommu *iommu;
-
-	if (!info || dmar_disabled)
-		return -EINVAL;
-
-	iommu = info->iommu;
-	if (!iommu)
-		return -EINVAL;
-
-	if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
-		return -ENODEV;
-
-	if (!info->pasid_enabled || !info->ats_enabled)
-		return -EINVAL;
-
-	/*
-	 * Devices having device-specific I/O fault handling should not
-	 * support PCI/PRI. The IOMMU side has no means to check the
-	 * capability of device-specific IOPF. Therefore, IOMMU can only
-	 * default that if the device driver enables SVA on a non-PRI
-	 * device, it will handle IOPF in its own way.
-	 */
-	if (!info->pri_supported)
-		return 0;
-
-	/* Devices supporting PRI should have it enabled. */
-	if (!info->pri_enabled)
-		return -EINVAL;
-
-	return 0;
-}
-
 static int context_flip_pri(struct device_domain_info *info, bool enable)
 {
 	struct intel_iommu *iommu = info->iommu;
@@ -4017,7 +3982,7 @@ intel_iommu_dev_enable_feat(struct device *dev, enum iommu_dev_features feat)
 		return intel_iommu_enable_iopf(dev);
 
 	case IOMMU_DEV_FEAT_SVA:
-		return intel_iommu_enable_sva(dev);
+		return 0;
 
 	default:
 		return -ENODEV;
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index f5569347591f..ba93123cb4eb 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -110,6 +110,41 @@ static const struct mmu_notifier_ops intel_mmuops = {
 	.free_notifier = intel_mm_free_notifier,
 };
 
+static int intel_iommu_sva_supported(struct device *dev)
+{
+	struct device_domain_info *info = dev_iommu_priv_get(dev);
+	struct intel_iommu *iommu;
+
+	if (!info || dmar_disabled)
+		return -EINVAL;
+
+	iommu = info->iommu;
+	if (!iommu)
+		return -EINVAL;
+
+	if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
+		return -ENODEV;
+
+	if (!info->pasid_enabled || !info->ats_enabled)
+		return -EINVAL;
+
+	/*
+	 * Devices having device-specific I/O fault handling should not
+	 * support PCI/PRI. The IOMMU side has no means to check the
+	 * capability of device-specific IOPF. Therefore, IOMMU can only
+	 * default that if the device driver enables SVA on a non-PRI
+	 * device, it will handle IOPF in its own way.
+	 */
+	if (!info->pri_supported)
+		return 0;
+
+	/* Devices supporting PRI should have it enabled. */
+	if (!info->pri_enabled)
+		return -EINVAL;
+
+	return 0;
+}
+
 static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
 				   struct device *dev, ioasid_t pasid,
 				   struct iommu_domain *old)
@@ -121,6 +156,10 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
 	unsigned long sflags;
 	int ret = 0;
 
+	ret = intel_iommu_sva_supported(dev);
+	if (ret)
+		return ret;
+
 	dev_pasid = domain_add_dev_pasid(domain, dev, pasid);
 	if (IS_ERR(dev_pasid))
 		return PTR_ERR(dev_pasid);
@@ -161,6 +200,10 @@ struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
 	struct dmar_domain *domain;
 	int ret;
 
+	ret = intel_iommu_sva_supported(dev);
+	if (ret)
+		return ERR_PTR(ret);
+
 	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
 	if (!domain)
 		return ERR_PTR(-ENOMEM);
-- 
2.43.0

From nobody Sat Feb 7 21:24:07 2026
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB3451B0F20 for ; Mon, 10 Mar 2025 02:47:45 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574867; cv=none; b=iNDszRI7MPJTwdpKtrc8exRn1jm0xwQ2X1POPvoIkBMuyZgs482gLk8rIIOQ8ASTVc1RSTJgkG01bVs4PCPrh9ZYhh4T7bKJQ7yjhzo6eMgIIqbGg4fFdWt5bBEmSgsQxF/C8w25dXas6/sLO5LJk42+LIGCnFLlEpK+ewJUb2A=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574867; c=relaxed/simple; bh=e8N98miGjhZxcY9cjAV3CsYuZfyz+ZbDyvNfXdF6jbU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Kb99xhRJJlySrR6Pn835nUmELEbl0+gcGgkJKVirneES8znxGuAvvlKz5bp137bJJw16VMbt01E+mvkD3BvD4kKCkiPVhzwg6SKF9rgJxbztdWBKjU1ltldhQtVpYpekqhZOnGeBd+PfsUD8p5Opb0nkQlH43H2fkqpjyQ4x5+U=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none)
header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Bmb7fxw/; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Bmb7fxw/" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741574865; x=1773110865; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e8N98miGjhZxcY9cjAV3CsYuZfyz+ZbDyvNfXdF6jbU=; b=Bmb7fxw/dg7bQjAhGKqvm0Sb/YmKNR24MTJr96mwhDrG/ANxMpThIpPF LLcQjQqY4i9ORK0HT0jN8l3mL9+rDGE8qS3wfssH68+1TOaxcIs+Z7aln xfrAQ+7ksn2ZwHTeWNnpRS18Y9I/Mr1f+23qeITjv0X2xBlqvfFTxhv9G mfa9oBRh9h7c00eFBeN+wxJOmtx+lmmONL//3l9Ga0IwURpoe07BoOCxZ IPj7sanEsjHg3702tWJhir3nVQdRXkXHCGTNG075ncamiBXtoF9e1jcmD EE2TV+V1MuptyRF8kkxFIstN2MgB1XaGPrtos31jnzhVnWrf5duCSz5Zy A==; X-CSE-ConnectionGUID: uHb4SrBaRCmZICJRAPjlGA== X-CSE-MsgGUID: Gte5EnLlRqGRBop95KcO5g== X-IronPort-AV: E=McAfee;i="6700,10204,11368"; a="42401608" X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="42401608" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2025 19:47:40 -0700 X-CSE-ConnectionGUID: vrTO/rQeQsyjEJ9fd13EOg== X-CSE-MsgGUID: Fk5YOvyMSqqlKPGkn9k/5w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="143079235" Received: from allen-box.sh.intel.com ([10.239.159.52]) by fmviesa002.fm.intel.com with ESMTP; 09 Mar 2025 19:47:38 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 4/6] iommu/vt-d: Move scalable mode ATS enablement to probe path 
Date: Mon, 10 Mar 2025 10:47:47 +0800 Message-ID: <20250310024749.3702681-5-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250310024749.3702681-1-baolu.lu@linux.intel.com> References: <20250310024749.3702681-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Device ATS is currently enabled when a domain is attached to the device and disabled when the domain is detached. This creates a limitation: when the IOMMU is operating in scalable mode and IOPF is enabled, the device's domain cannot be changed. The previous code enables ATS when a domain is set to a device's RID and disables it during RID domain switch. So, if a PASID is set with a domain requiring PRI, ATS should remain enabled until the domain is removed. During the PASID domain's lifecycle, if the RID's domain changes, PRI will be disrupted because it depends on ATS, which is disabled when the blocking domain is set for the device's RID. Remove this limitation by moving ATS enablement to the device probe path. 
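The lifecycle problem described above can be modelled with a toy user-space sketch. Everything in it is an assumption made for illustration: struct toy_dev and the old_*/new_* helpers are hypothetical stand-ins, not the driver's functions, and the model covers only the scalable-mode case where ATS follows probe and release.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the ATS bits of struct device_domain_info. */
struct toy_dev {
	bool ats_supported;
	bool ats_enabled;
};

static void toy_enable_ats(struct toy_dev *d)
{
	if (d->ats_supported)
		d->ats_enabled = true;
}

static void toy_disable_ats(struct toy_dev *d)
{
	d->ats_enabled = false;
}

/* Old scheme: ATS follows the RID domain attach and blocking paths. */
static void old_attach_rid(struct toy_dev *d) { toy_enable_ats(d); }
static void old_block_rid(struct toy_dev *d)  { toy_disable_ats(d); }

/* New scheme: ATS follows probe and release; RID changes leave it alone. */
static void new_probe(struct toy_dev *d)      { toy_enable_ats(d); }
static void new_release(struct toy_dev *d)    { toy_disable_ats(d); }
static void new_attach_rid(struct toy_dev *d) { (void)d; }
static void new_block_rid(struct toy_dev *d)  { (void)d; }

/* PRI depends on ATS, so a PASID domain needing PRI works only with ATS on. */
static bool pri_usable(const struct toy_dev *d)
{
	return d->ats_enabled;
}
```

In the old scheme, switching the RID to the blocking domain turns ATS off underneath a PASID domain that still needs PRI. In the new scheme the same switch leaves ATS untouched until the device is released.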
Signed-off-by: Lu Baolu
Reviewed-by: Kevin Tian
Tested-by: Zhangfei Gao
Link: https://lore.kernel.org/r/20250228092631.3425464-5-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 51 ++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 24 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 58bff6fe3a93..1c8724cd2ddc 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1173,32 +1173,28 @@ static bool dev_needs_extra_dtlb_flush(struct pci_dev *pdev)
 	return true;
 }
 
-static void iommu_enable_pci_caps(struct device_domain_info *info)
+static void iommu_enable_pci_ats(struct device_domain_info *info)
 {
 	struct pci_dev *pdev;
 
-	if (!dev_is_pci(info->dev))
+	if (!info->ats_supported)
 		return;
 
 	pdev = to_pci_dev(info->dev);
-	if (info->ats_supported && pci_ats_page_aligned(pdev) &&
-	    !pci_enable_ats(pdev, VTD_PAGE_SHIFT))
+	if (!pci_ats_page_aligned(pdev))
+		return;
+
+	if (!pci_enable_ats(pdev, VTD_PAGE_SHIFT))
 		info->ats_enabled = 1;
 }
 
-static void iommu_disable_pci_caps(struct device_domain_info *info)
+static void iommu_disable_pci_ats(struct device_domain_info *info)
 {
-	struct pci_dev *pdev;
-
-	if (!dev_is_pci(info->dev))
+	if (!info->ats_enabled)
 		return;
 
-	pdev = to_pci_dev(info->dev);
-
-	if (info->ats_enabled) {
-		pci_disable_ats(pdev);
-		info->ats_enabled = 0;
-	}
+	pci_disable_ats(to_pci_dev(info->dev));
+	info->ats_enabled = 0;
 }
 
 static void intel_flush_iotlb_all(struct iommu_domain *domain)
@@ -1557,12 +1553,19 @@ domain_context_mapping(struct dmar_domain *domain, struct device *dev)
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_iommu *iommu = info->iommu;
 	u8 bus = info->bus, devfn = info->devfn;
+	int ret;
 
 	if (!dev_is_pci(dev))
 		return domain_context_mapping_one(domain, iommu, bus, devfn);
 
-	return pci_for_each_dma_alias(to_pci_dev(dev),
-				      domain_context_mapping_cb, domain);
+	ret = pci_for_each_dma_alias(to_pci_dev(dev),
+				     domain_context_mapping_cb, domain);
+	if (ret)
+		return ret;
+
+	iommu_enable_pci_ats(info);
+
+	return 0;
 }
 
 /* Return largest possible superpage level for a given mapping */
@@ -1844,8 +1847,6 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
 	if (ret)
 		goto out_block_translation;
 
-	iommu_enable_pci_caps(info);
-
 	ret = cache_tag_assign_domain(domain, dev, IOMMU_NO_PASID);
 	if (ret)
 		goto out_block_translation;
@@ -3209,6 +3210,7 @@ static void domain_context_clear(struct device_domain_info *info)
 
 	pci_for_each_dma_alias(to_pci_dev(info->dev),
 			       &domain_context_clear_one_cb, info);
+	iommu_disable_pci_ats(info);
 }
 
 /*
@@ -3225,7 +3227,6 @@ void device_block_translation(struct device *dev)
 	if (info->domain)
 		cache_tag_unassign_domain(info->domain, dev, IOMMU_NO_PASID);
 
-	iommu_disable_pci_caps(info);
 	if (!dev_is_real_dma_subdevice(dev)) {
 		if (sm_supported(iommu))
 			intel_pasid_tear_down_entry(iommu, dev,
@@ -3760,6 +3761,9 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 	    !pci_enable_pasid(pdev, info->pasid_supported & ~1))
 		info->pasid_enabled = 1;
 
+	if (sm_supported(iommu))
+		iommu_enable_pci_ats(info);
+
 	return &iommu->iommu;
 free_table:
 	intel_pasid_free_table(dev);
@@ -3776,6 +3780,8 @@ static void intel_iommu_release_device(struct device *dev)
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_iommu *iommu = info->iommu;
 
+	iommu_disable_pci_ats(info);
+
 	if (info->pasid_enabled) {
 		pci_disable_pasid(to_pci_dev(dev));
 		info->pasid_enabled = 0;
@@ -4379,13 +4385,10 @@ static int identity_domain_attach_dev(struct iommu_domain *domain, struct device
 	if (dev_is_real_dma_subdevice(dev))
 		return 0;
 
-	if (sm_supported(iommu)) {
+	if (sm_supported(iommu))
 		ret = intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID);
-		if (!ret)
-			iommu_enable_pci_caps(info);
-	} else {
+	else
 		ret = device_setup_pass_through(dev);
-	}
 
 	return ret;
 }
-- 
2.43.0

From nobody Sat Feb 7 21:24:07 2026
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 51FD81B85C5 for ; Mon, 10 Mar 2025 02:47:46 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574868; cv=none; b=jDh9WpUmZi1m/6YIKiZPZwg/jUX/HDrbucVFINkd9hx7z/LWKliNebLPmKOy1rL/ZJ8gdnpLLrN9ydoukmJAoSYUHGAvqnDHt6eCPb86bA7m+YKCB7hFgpyXc2ARE8QQpVRQ4vzeJXjJOchSeD+XlQkVZbA4KU5sGuc7J0DR4Kg=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741574868; c=relaxed/simple; bh=dgeoentwBe0HLcO4MQEbm5laWuF8AElUddpbU2Pg3DM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lui8eIuohvJpyyzGm6zci6JuFdFO9JLUwoQ+McohOOhnKVvfXcTr2mIlbmnEoH/0QMK9/Puziy+6Qokyr4X0YMgpLdhUtSnL8DA/SnCfqMHUNTvd/yP0452IvW1/0kd3gl3AuLc8KutlXttGp3gpfWTelWYx+Kgqwg4kLTNM15I=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YXrpF1rO; arc=none smtp.client-ip=198.175.65.19
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com
Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YXrpF1rO"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1741574866; x=1773110866; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dgeoentwBe0HLcO4MQEbm5laWuF8AElUddpbU2Pg3DM=;
b=YXrpF1rOBnOEvwmdKYkGB+oJAzEpzr8aAKDFOroQ4SBqMLwyWOMGaoB2 aIfUdexLFvZs/jN7kY4T8HEO8kEVsUZekDKfKuXxHLzfgsYu+lJkptqj4 0YJhxSF0rPFyt2wmIEWjdsMbkIGkQ60jC0o/Z50sH1dF3/ZBE60+bznf2 61+fPEPfg10GRV+d2BhnrMKGxh2htX1qFnMayne5DLLQyTPlMM65prOMp 1mHrgRwZXIuwPrNEmlX0dl5t+j/yI7SNR+xwxlctlfLV3RHy+SVE1UzxW xrkc58kUEGpDnW8DOiPAcqiQMbDJv8qsvzKfPjKrFif9nkxmWG8Cxzx9y g==; X-CSE-ConnectionGUID: om1+tTczT0WPDm5WqanpPg== X-CSE-MsgGUID: 8oe8PQ0rQyOddEYYWJgXAQ== X-IronPort-AV: E=McAfee;i="6700,10204,11368"; a="42401612" X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="42401612" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 09 Mar 2025 19:47:43 -0700 X-CSE-ConnectionGUID: khlVlSr+SNq+qLmIbOFc0g== X-CSE-MsgGUID: UKioD/1eSMO7oZtfAdWWVQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.14,235,1736841600"; d="scan'208";a="143079237" Received: from allen-box.sh.intel.com ([10.239.159.52]) by fmviesa002.fm.intel.com with ESMTP; 09 Mar 2025 19:47:39 -0700 From: Lu Baolu To: Joerg Roedel Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 5/6] iommu/vt-d: Move PRI enablement in probe path Date: Mon, 10 Mar 2025 10:47:48 +0800 Message-ID: <20250310024749.3702681-6-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250310024749.3702681-1-baolu.lu@linux.intel.com> References: <20250310024749.3702681-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Update PRI enablement to use the new method, similar to the amd iommu driver. Enable PRI in the device probe path and disable it when the device is released. PRI is enabled throughout the device's iommu lifecycle. 
The infrastructure for the iommu subsystem to handle iopf requests is
created during iopf enablement and released during iopf disablement. All
invalid page requests from the device are automatically handled by the
iommu subsystem if iopf is not enabled. Add iopf_refcount to track the iopf
enablement.

Convert the return type of intel_iommu_disable_iopf() to void, as there is
no way to handle a failure when disabling this feature. Make
intel_iommu_enable/disable_iopf() helpers global, as they will be used
beyond the current file in the subsequent patch.

The iopf_refcount is not protected by any lock. This is acceptable, as
there is no concurrent access to it in the current code. The following
patch will address this by moving it to the domain attach/detach paths,
which are protected by the iommu group mutex.

Signed-off-by: Lu Baolu
Reviewed-by: Kevin Tian
Tested-by: Zhangfei Gao
Link: https://lore.kernel.org/r/20250228092631.3425464-6-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c | 137 +++++++++++++-----------------------
 drivers/iommu/intel/iommu.h |   4 ++
 drivers/iommu/intel/pasid.c |   2 +
 drivers/iommu/intel/prq.c   |   2 +-
 4 files changed, 55 insertions(+), 90 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1c8724cd2ddc..6a6c271ef7e6 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1197,6 +1197,37 @@ static void iommu_disable_pci_ats(struct device_domain_info *info)
 	info->ats_enabled = 0;
 }
 
+static void iommu_enable_pci_pri(struct device_domain_info *info)
+{
+	struct pci_dev *pdev;
+
+	if (!info->ats_enabled || !info->pri_supported)
+		return;
+
+	pdev = to_pci_dev(info->dev);
+	/* PASID is required in PRG Response Message. */
+	if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev))
+		return;
+
+	if (pci_reset_pri(pdev))
+		return;
+
+	if (!pci_enable_pri(pdev, PRQ_DEPTH))
+		info->pri_enabled = 1;
+}
+
+static void iommu_disable_pci_pri(struct device_domain_info *info)
+{
+	if (!info->pri_enabled)
+		return;
+
+	if (WARN_ON(info->iopf_refcount))
+		iopf_queue_remove_device(info->iommu->iopf_queue, info->dev);
+
+	pci_disable_pri(to_pci_dev(info->dev));
+	info->pri_enabled = 0;
+}
+
 static void intel_flush_iotlb_all(struct iommu_domain *domain)
 {
 	cache_tag_flush_all(to_dmar_domain(domain));
@@ -3763,6 +3794,7 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 
 	if (sm_supported(iommu))
 		iommu_enable_pci_ats(info);
+	iommu_enable_pci_pri(info);
 
 	return &iommu->iommu;
 free_table:
@@ -3780,6 +3812,7 @@ static void intel_iommu_release_device(struct device *dev)
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_iommu *iommu = info->iommu;
 
+	iommu_disable_pci_pri(info);
 	iommu_disable_pci_ats(info);
 
 	if (info->pasid_enabled) {
@@ -3868,116 +3901,41 @@ static struct iommu_group *intel_iommu_device_group(struct device *dev)
 	return generic_device_group(dev);
 }
 
-static int context_flip_pri(struct device_domain_info *info, bool enable)
+int intel_iommu_enable_iopf(struct device *dev)
 {
-	struct intel_iommu *iommu = info->iommu;
-	u8 bus = info->bus, devfn = info->devfn;
-	struct context_entry *context;
-	u16 did;
-
-	spin_lock(&iommu->lock);
-	if (context_copied(iommu, bus, devfn)) {
-		spin_unlock(&iommu->lock);
-		return -EINVAL;
-	}
-
-	context = iommu_context_addr(iommu, bus, devfn, false);
-	if (!context || !context_present(context)) {
-		spin_unlock(&iommu->lock);
-		return -ENODEV;
-	}
-	did = context_domain_id(context);
-
-	if (enable)
-		context_set_sm_pre(context);
-	else
-		context_clear_sm_pre(context);
-
-	if (!ecap_coherent(iommu->ecap))
-		clflush_cache_range(context, sizeof(*context));
-	intel_context_flush_present(info, context, did, true);
-	spin_unlock(&iommu->lock);
-
-	return 0;
-}
-
-static int intel_iommu_enable_iopf(struct device *dev)
-{
-	struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
-	struct intel_iommu *iommu;
+	struct intel_iommu *iommu = info->iommu;
 	int ret;
 
-	if (!pdev || !info || !info->ats_enabled || !info->pri_supported)
+	if (!info->pri_enabled)
 		return -ENODEV;
 
-	if (info->pri_enabled)
-		return -EBUSY;
-
-	iommu = info->iommu;
-	if (!iommu)
-		return -EINVAL;
-
-	/* PASID is required in PRG Response Message. */
-	if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev))
-		return -EINVAL;
-
-	ret = pci_reset_pri(pdev);
-	if (ret)
-		return ret;
+	if (info->iopf_refcount) {
+		info->iopf_refcount++;
+		return 0;
+	}
 
 	ret = iopf_queue_add_device(iommu->iopf_queue, dev);
 	if (ret)
 		return ret;
 
-	ret = context_flip_pri(info, true);
-	if (ret)
-		goto err_remove_device;
-
-	ret = pci_enable_pri(pdev, PRQ_DEPTH);
-	if (ret)
-		goto err_clear_pri;
-
-	info->pri_enabled = 1;
+	info->iopf_refcount = 1;
 
 	return 0;
-err_clear_pri:
-	context_flip_pri(info, false);
-err_remove_device:
-	iopf_queue_remove_device(iommu->iopf_queue, dev);
-
-	return ret;
 }
 
-static int intel_iommu_disable_iopf(struct device *dev)
+void intel_iommu_disable_iopf(struct device *dev)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_iommu *iommu = info->iommu;
 
-	if (!info->pri_enabled)
-		return -EINVAL;
+	if (WARN_ON(!info->pri_enabled || !info->iopf_refcount))
+		return;
 
-	/* Disable new PRI reception: */
-	context_flip_pri(info, false);
+	if (--info->iopf_refcount)
+		return;
 
-	/*
-	 * Remove device from fault queue and acknowledge all outstanding
-	 * PRQs to the device:
-	 */
 	iopf_queue_remove_device(iommu->iopf_queue, dev);
-
-	/*
-	 * PCIe spec states that by clearing PRI enable bit, the Page
-	 * Request Interface will not issue new page requests, but has
-	 * outstanding page requests that have been transmitted or are
-	 * queued for transmission. This is supposed to be called after
-	 * the device driver has stopped DMA, all PASIDs have been
-	 * unbound and the outstanding PRQs have been drained.
-	 */
-	pci_disable_pri(to_pci_dev(dev));
-	info->pri_enabled = 0;
-
-	return 0;
 }
 
 static int
@@ -4000,7 +3958,8 @@ intel_iommu_dev_disable_feat(struct device *dev, enum iommu_dev_features feat)
 {
 	switch (feat) {
 	case IOMMU_DEV_FEAT_IOPF:
-		return intel_iommu_disable_iopf(dev);
+		intel_iommu_disable_iopf(dev);
+		return 0;
 
 	case IOMMU_DEV_FEAT_SVA:
 		return 0;
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index dd980808998d..42b4e500989b 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -774,6 +774,7 @@ struct device_domain_info {
 	u8 ats_enabled:1;
 	u8 dtlb_extra_inval:1;	/* Quirk for devices need extra flush */
 	u8 ats_qdep;
+	unsigned int iopf_refcount;
 	struct device *dev; /* it's NULL for PCIe-to-PCI bridge */
 	struct intel_iommu *iommu; /* IOMMU used by this device */
 	struct dmar_domain *domain; /* pointer to domain */
@@ -1295,6 +1296,9 @@ void intel_iommu_page_response(struct device *dev, struct iopf_fault *evt,
 				struct iommu_page_response *msg);
 void intel_iommu_drain_pasid_prq(struct device *dev, u32 pasid);
 
+int intel_iommu_enable_iopf(struct device *dev);
+void intel_iommu_disable_iopf(struct device *dev);
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 void intel_svm_check(struct intel_iommu *iommu);
 struct iommu_domain *intel_svm_domain_alloc(struct device *dev,
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index fb59a7d35958..c2742e256552 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -992,6 +992,8 @@ static int context_entry_set_pasid_table(struct context_entry *context,
 		context_set_sm_dte(context);
 	if (info->pasid_supported)
 		context_set_pasid(context);
+	if (info->pri_supported)
+		context_set_sm_pre(context);
 
 	context_set_fault_enable(context);
 	context_set_present(context);
diff --git a/drivers/iommu/intel/prq.c b/drivers/iommu/intel/prq.c
index 064194399b38..5b6a64d96850 100644
--- a/drivers/iommu/intel/prq.c
+++ b/drivers/iommu/intel/prq.c
@@ -67,7 +67,7 @@ void intel_iommu_drain_pasid_prq(struct device *dev, u32 pasid)
 	u16 sid, did;
 
 	info = dev_iommu_priv_get(dev);
-	if (!info->pri_enabled)
+	if (!info->iopf_refcount)
 		return;
 
 	iommu = info->iommu;
-- 
2.43.0

From nobody Sat Feb 7 21:24:07 2026
From: Lu Baolu
To: Joerg Roedel
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 6/6] iommu/vt-d: Cleanup intel_context_flush_present()
Date: Mon, 10 Mar 2025 10:47:49 +0800
Message-ID: <20250310024749.3702681-7-baolu.lu@linux.intel.com>
In-Reply-To: <20250310024749.3702681-1-baolu.lu@linux.intel.com>
References: <20250310024749.3702681-1-baolu.lu@linux.intel.com>

The intel_context_flush_present() is called in places where either the
scalable mode is disabled, or scalable mode is enabled but all PASID
entries are known to be non-present. In these cases, the flush_domains
path within intel_context_flush_present() will never execute. This dead
code is therefore removed.

Signed-off-by: Lu Baolu
Reviewed-by: Kevin Tian
Tested-by: Zhangfei Gao
Link: https://lore.kernel.org/r/20250228092631.3425464-7-baolu.lu@linux.intel.com
---
 drivers/iommu/intel/iommu.c |  2 +-
 drivers/iommu/intel/iommu.h |  5 ++---
 drivers/iommu/intel/pasid.c | 41 +++++++------------------------------
 3 files changed, 10 insertions(+), 38 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6a6c271ef7e6..85aa66ef4d61 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1783,7 +1783,7 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
 	context_clear_entry(context);
 	__iommu_flush_cache(iommu, context, sizeof(*context));
 	spin_unlock(&iommu->lock);
-	intel_context_flush_present(info, context, did, true);
+	intel_context_flush_no_pasid(info, context, did);
 }
 
 int __domain_setup_first_level(struct intel_iommu *iommu,
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 42b4e500989b..c4916886da5a 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1286,9 +1286,8 @@ void cache_tag_flush_all(struct dmar_domain *domain);
 void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long start,
 			      unsigned long end);
 
-void intel_context_flush_present(struct device_domain_info *info,
-				 struct context_entry *context,
-				 u16 did, bool affect_domains);
+void intel_context_flush_no_pasid(struct device_domain_info *info,
+				  struct context_entry *context, u16 did);
 
 int intel_iommu_enable_prq(struct intel_iommu *iommu);
 int intel_iommu_finish_prq(struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index c2742e256552..7ee18bb48bd4 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -932,7 +932,7 @@ static void device_pasid_table_teardown(struct device *dev, u8 bus, u8 devfn)
 	context_clear_entry(context);
 	__iommu_flush_cache(iommu, context, sizeof(*context));
 	spin_unlock(&iommu->lock);
-	intel_context_flush_present(info, context, did, false);
+	intel_context_flush_no_pasid(info, context, did);
 }
 
 static int pci_pasid_table_teardown(struct pci_dev *pdev, u16 alias, void *data)
@@ -1119,17 +1119,15 @@ static void __context_flush_dev_iotlb(struct device_domain_info *info)
 
 /*
  * Cache invalidations after change in a context table entry that was present
- * according to the Spec 6.5.3.3 (Guidance to Software for Invalidations). If
- * IOMMU is in scalable mode and all PASID table entries of the device were
- * non-present, set flush_domains to false. Otherwise, true.
+ * according to the Spec 6.5.3.3 (Guidance to Software for Invalidations).
+ * This helper can only be used when IOMMU is working in the legacy mode or
+ * IOMMU is in scalable mode but all PASID table entries of the device are
+ * non-present.
  */
-void intel_context_flush_present(struct device_domain_info *info,
-				 struct context_entry *context,
-				 u16 did, bool flush_domains)
+void intel_context_flush_no_pasid(struct device_domain_info *info,
+				  struct context_entry *context, u16 did)
 {
 	struct intel_iommu *iommu = info->iommu;
-	struct pasid_entry *pte;
-	int i;
 
 	/*
 	 * Device-selective context-cache invalidation. The Domain-ID field
@@ -1152,30 +1150,5 @@ void intel_context_flush_present(struct device_domain_info *info,
 		return;
 	}
 
-	/*
-	 * For scalable mode:
-	 * - Domain-selective PASID-cache invalidation to affected domains
-	 * - Domain-selective IOTLB invalidation to affected domains
-	 * - Global Device-TLB invalidation to affected functions
-	 */
-	if (flush_domains) {
-		/*
-		 * If the IOMMU is running in scalable mode and there might
-		 * be potential PASID translations, the caller should hold
-		 * the lock to ensure that context changes and cache flushes
-		 * are atomic.
-		 */
-		assert_spin_locked(&iommu->lock);
-		for (i = 0; i < info->pasid_table->max_pasid; i++) {
-			pte = intel_pasid_get_entry(info->dev, i);
-			if (!pte || !pasid_pte_is_present(pte))
-				continue;
-
-			did = pasid_get_domain_id(pte);
-			qi_flush_pasid_cache(iommu, did, QI_PC_ALL_PASIDS, 0);
-			iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
-		}
-	}
-
 	__context_flush_dev_iotlb(info);
 }
-- 
2.43.0
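
Note on the iopf_refcount scheme from patch 5/6: the following is only a
userspace sketch of the counting behaviour described in that commit message,
not driver code. The names struct fake_dev, queue_add(), queue_remove() and
model_*() are hypothetical stand-ins (for struct device_domain_info,
iopf_queue_add_device(), iopf_queue_remove_device() and the
intel_iommu_enable/disable_iopf() helpers); locking is left out, in line with
the commit message's statement that there is no concurrent access yet.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant fields of struct device_domain_info. */
struct fake_dev {
	bool pri_enabled;		/* set at probe time, cleared at release */
	unsigned int iopf_refcount;	/* number of active iopf users */
	bool in_queue;			/* models membership in the iopf queue */
};

/* Hypothetical stand-ins for iopf_queue_add_device()/iopf_queue_remove_device(). */
static int queue_add(struct fake_dev *d)
{
	d->in_queue = true;
	return 0;
}

static void queue_remove(struct fake_dev *d)
{
	d->in_queue = false;
}

/* Mirrors the control flow of the reworked enable helper. */
static int model_enable_iopf(struct fake_dev *d)
{
	int ret;

	if (!d->pri_enabled)
		return -ENODEV;

	/* Later users only take another reference. */
	if (d->iopf_refcount) {
		d->iopf_refcount++;
		return 0;
	}

	/* The first user sets up the fault-handling infrastructure. */
	ret = queue_add(d);
	if (ret)
		return ret;

	d->iopf_refcount = 1;
	return 0;
}

/* Mirrors the control flow of the reworked disable helper. */
static void model_disable_iopf(struct fake_dev *d)
{
	/* The driver wraps this check in WARN_ON(). */
	if (!d->pri_enabled || !d->iopf_refcount)
		return;

	/* Only the last user tears the infrastructure down. */
	if (--d->iopf_refcount)
		return;

	queue_remove(d);
}
```

In this model PRI stays enabled for the whole device lifetime, and only the
queue membership follows the reference count: two enables followed by one
disable leave the device in the queue, and the second disable removes it.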