From nobody Sat Feb 7 18:23:03 2026
From: Lu Baolu
To: Joerg Roedel
Cc: Ethan Zhao, Eric Badger, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 1/8] PCI: Make pci_dev_is_disconnected() helper public for other drivers
Date: Tue, 5 Mar 2024 20:21:14 +0800
Message-Id: <20240305122121.211482-2-baolu.lu@linux.intel.com>
In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com>
References: <20240305122121.211482-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ethan Zhao

Make pci_dev_is_disconnected() public so that it can be called from the
Intel VT-d driver to work around the surprise removal hang seen with
ATS-capable devices on hotplug-capable PCIe switch downstream ports.

Unlike pci_device_is_present(), this helper performs no config space
access, so it is light enough to use in the normal surprise removal and
safe removal flows.

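To illustrate why the helper is cheaper, here is a minimal user-space
sketch. It is not kernel code: every model_* name, the struct layout and
the 0x8086 vendor ID are assumptions made for illustration. It only models
the idea that the disconnected check is a field compare, while a presence
check may have to read config space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum model_channel_state {
	MODEL_IO_NORMAL = 1,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

struct model_pci_dev {
	enum model_channel_state error_state;
	bool link_up;            /* whether config reads reach the device */
	unsigned int cfg_reads;  /* counts simulated config space accesses */
};

/* Models the helper made public here: a plain field compare. */
bool model_dev_is_disconnected(const struct model_pci_dev *dev)
{
	assert(dev);
	return dev->error_state == MODEL_IO_PERM_FAILURE;
}

/* Simulated vendor ID read: all ones when the device does not respond. */
uint32_t model_read_vendor_id(struct model_pci_dev *dev)
{
	dev->cfg_reads++;
	return dev->link_up ? 0x8086u : 0xffffffffu;
}

/* Rough model of a presence check that may touch config space. */
bool model_device_is_present(struct model_pci_dev *dev)
{
	if (model_dev_is_disconnected(dev))
		return false;
	return model_read_vendor_id(dev) != 0xffffffffu;
}
```

In this model the disconnected check never increments cfg_reads, whereas
the presence check does so whenever the device is not already marked as
disconnected.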
Acked-by: Bjorn Helgaas
Reviewed-by: Dan Carpenter
Tested-by: Haorong Ye
Signed-off-by: Ethan Zhao
Link: https://lore.kernel.org/r/20240301080727.3529832-2-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu
---
 include/linux/pci.h | 5 +++++
 drivers/pci/pci.h   | 5 -----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7ab0d13672da..213109d3c601 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2517,6 +2517,11 @@ static inline struct pci_dev *pcie_find_root_port(struct pci_dev *dev)
 	return NULL;
 }
 
+static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
+{
+	return dev->error_state == pci_channel_io_perm_failure;
+}
+
 void pci_request_acs(void);
 bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags);
 bool pci_acs_path_enabled(struct pci_dev *start,
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index e9750b1b19ba..bfc56f7bee1c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -368,11 +368,6 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 	return 0;
 }
 
-static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
-{
-	return dev->error_state == pci_channel_io_perm_failure;
-}
-
 /* pci_dev priv_flags */
 #define PCI_DEV_ADDED 0
 #define PCI_DPC_RECOVERED 1
-- 
2.34.1

From nobody Sat Feb 7 18:23:03 2026
From: Lu Baolu
To: Joerg Roedel
Cc: Ethan Zhao, Eric Badger, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 2/8] iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
Date: Tue, 5 Mar 2024 20:21:15 +0800
Message-Id: <20240305122121.211482-3-baolu.lu@linux.intel.com>
In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com>
References: <20240305122121.211482-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ethan Zhao

For endpoint devices connected to the system via hotplug-capable ports,
users can request a hot reset of the device by flapping its link through
the slot's link control register. In response to the resulting DLLSC
interrupt, pciehp_ist() unloads the device driver and then powers the
device off. This causes an IOMMU device-TLB invalidation request (Intel
VT-d spec; ATS Invalidation in PCIe spec r6.1) to be sent to a target
device that no longer exists, and the request is then retried in an
endless loop after the ITE fault is triggered in interrupt context.

This leads to the following hard lockup warning and system hang:

[ 4211.433662] pcieport 0000:17:01.0: pciehp: Slot(108): Link Down
[ 4211.433664] pcieport 0000:17:01.0: pciehp: Slot(108): Card not present
[ 4223.822591] NMI watchdog: Watchdog detected hard LOCKUP on cpu 144
[ 4223.822622] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S OE kernel version xxxx
[ 4223.822623] Hardware name: vendorname xxxx 666-106, BIOS 01.01.02.03.01 05/15/2023
[ 4223.822623] RIP: 0010:qi_submit_sync+0x2c0/0x490
[ 4223.822624] Code: 48 be 00 00 00 00 00 08 00 00 49 85 74 24 20 0f 95 c1 48 8b 57 10 83 c1 04 83 3c 1a 03 0f 84 a2 01 00 00 49 8b 04 24 8b 70 34 <40> f6 c6 10 74 17 49 8b 04 24 8b 80 80 00 00 00 89 c2 d3 fa 41 39
[ 4223.822624] RSP: 0018:ffffc4f074f0bbb8 EFLAGS: 00000093
[ 4223.822625] RAX: ffffc4f040059000 RBX: 0000000000000014 RCX: 0000000000000005
[ 4223.822625] RDX: ffff9f3841315800 RSI: 0000000000000000 RDI: ffff9f38401a8340
[ 4223.822625] RBP: ffff9f38401a8340 R08: ffffc4f074f0bc00 R09: 0000000000000000
[ 4223.822626] R10: 0000000000000010 R11: 0000000000000018 R12: ffff9f384005e200
[ 4223.822626] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000000004
[ 4223.822626] FS: 0000000000000000(0000) GS:ffffa237ae400000(0000) knlGS:0000000000000000
[ 4223.822627] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4223.822627] CR2: 00007ffe86515d80 CR3: 000002fd3000a001 CR4: 0000000000770ee0
[ 4223.822627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 4223.822628] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 4223.822628] PKRU: 55555554
[ 4223.822628] Call Trace:
[ 4223.822628] qi_flush_dev_iotlb+0xb1/0xd0
[ 4223.822628] __dmar_remove_one_dev_info+0x224/0x250
[ 4223.822629] dmar_remove_one_dev_info+0x3e/0x50
[ 4223.822629] intel_iommu_release_device+0x1f/0x30
[ 4223.822629] iommu_release_device+0x33/0x60
[ 4223.822629] iommu_bus_notifier+0x7f/0x90
[ 4223.822630] blocking_notifier_call_chain+0x60/0x90
[ 4223.822630] device_del+0x2e5/0x420
[ 4223.822630] pci_remove_bus_device+0x70/0x110
[ 4223.822630] pciehp_unconfigure_device+0x7c/0x130
[ 4223.822631] pciehp_disable_slot+0x6b/0x100
[ 4223.822631] pciehp_handle_presence_or_link_change+0xd8/0x320
[ 4223.822631] pciehp_ist+0x176/0x180
[ 4223.822631] ? irq_finalize_oneshot.part.50+0x110/0x110
[ 4223.822632] irq_thread_fn+0x19/0x50
[ 4223.822632] irq_thread+0x104/0x190
[ 4223.822632] ? irq_forced_thread_fn+0x90/0x90
[ 4223.822632] ? irq_thread_check_affinity+0xe0/0xe0
[ 4223.822633] kthread+0x114/0x130
[ 4223.822633] ? __kthread_cancel_work+0x40/0x40
[ 4223.822633] ret_from_fork+0x1f/0x30
[ 4223.822633] Kernel panic - not syncing: Hard LOCKUP
[ 4223.822634] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S OE kernel version xxxx
[ 4223.822634] Hardware name: vendorname xxxx 666-106, BIOS 01.01.02.03.01 05/15/2023
[ 4223.822634] Call Trace:
[ 4223.822634]
[ 4223.822635] dump_stack+0x6d/0x88
[ 4223.822635] panic+0x101/0x2d0
[ 4223.822635] ? ret_from_fork+0x11/0x30
[ 4223.822635] nmi_panic.cold.14+0xc/0xc
[ 4223.822636] watchdog_overflow_callback.cold.8+0x6d/0x81
[ 4223.822636] __perf_event_overflow+0x4f/0xf0
[ 4223.822636] handle_pmi_common+0x1ef/0x290
[ 4223.822636] ? __set_pte_vaddr+0x28/0x40
[ 4223.822637] ? flush_tlb_one_kernel+0xa/0x20
[ 4223.822637] ? __native_set_fixmap+0x24/0x30
[ 4223.822637] ? ghes_copy_tofrom_phys+0x70/0x100
[ 4223.822637] ? __ghes_peek_estatus.isra.16+0x49/0xa0
[ 4223.822637] intel_pmu_handle_irq+0xba/0x2b0
[ 4223.822638] perf_event_nmi_handler+0x24/0x40
[ 4223.822638] nmi_handle+0x4d/0xf0
[ 4223.822638] default_do_nmi+0x49/0x100
[ 4223.822638] exc_nmi+0x134/0x180
[ 4223.822639] end_repeat_nmi+0x16/0x67
[ 4223.822639] RIP: 0010:qi_submit_sync+0x2c0/0x490
[ 4223.822639] Code: 48 be 00 00 00 00 00 08 00 00 49 85 74 24 20 0f 95 c1 48 8b 57 10 83 c1 04 83 3c 1a 03 0f 84 a2 01 00 00 49 8b 04 24 8b 70 34 <40> f6 c6 10 74 17 49 8b 04 24 8b 80 80 00 00 00 89 c2 d3 fa 41 39
[ 4223.822640] RSP: 0018:ffffc4f074f0bbb8 EFLAGS: 00000093
[ 4223.822640] RAX: ffffc4f040059000 RBX: 0000000000000014 RCX: 0000000000000005
[ 4223.822640] RDX: ffff9f3841315800 RSI: 0000000000000000 RDI: ffff9f38401a8340
[ 4223.822641] RBP: ffff9f38401a8340 R08: ffffc4f074f0bc00 R09: 0000000000000000
[ 4223.822641] R10: 0000000000000010 R11: 0000000000000018 R12: ffff9f384005e200
[ 4223.822641] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000000004
[ 4223.822641] ? qi_submit_sync+0x2c0/0x490
[ 4223.822642] ? qi_submit_sync+0x2c0/0x490
[ 4223.822642]
[ 4223.822642] qi_flush_dev_iotlb+0xb1/0xd0
[ 4223.822642] __dmar_remove_one_dev_info+0x224/0x250
[ 4223.822643] dmar_remove_one_dev_info+0x3e/0x50
[ 4223.822643] intel_iommu_release_device+0x1f/0x30
[ 4223.822643] iommu_release_device+0x33/0x60
[ 4223.822643] iommu_bus_notifier+0x7f/0x90
[ 4223.822644] blocking_notifier_call_chain+0x60/0x90
[ 4223.822644] device_del+0x2e5/0x420
[ 4223.822644] pci_remove_bus_device+0x70/0x110
[ 4223.822644] pciehp_unconfigure_device+0x7c/0x130
[ 4223.822644] pciehp_disable_slot+0x6b/0x100
[ 4223.822645] pciehp_handle_presence_or_link_change+0xd8/0x320
[ 4223.822645] pciehp_ist+0x176/0x180
[ 4223.822645] ? irq_finalize_oneshot.part.50+0x110/0x110
[ 4223.822645] irq_thread_fn+0x19/0x50
[ 4223.822646] irq_thread+0x104/0x190
[ 4223.822646] ? irq_forced_thread_fn+0x90/0x90
[ 4223.822646] ? irq_thread_check_affinity+0xe0/0xe0
[ 4223.822646] kthread+0x114/0x130
[ 4223.822647] ? __kthread_cancel_work+0x40/0x40
[ 4223.822647] ret_from_fork+0x1f/0x30
[ 4223.822647] Kernel Offset: 0x6400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

This issue can be triggered by any ordinary surprise removal hotplug
operation, for example:

1. Pulling the EP (endpoint device) out directly.
2. Turning off the EP's power.
3. Bringing the link down.

This patch covers both regular safe removal and surprise removal. The
hot-unplug handling can avoid the ATS Invalidation hang by calling
pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() to check
the target device state, so that no meaningless ATS Invalidation request
is sent to the IOMMU once the device is gone (see the IMPLEMENTATION
NOTE in PCIe spec r6.1, section 10.3.1).

For safe removal, the device is not removed until the whole software
handling process is done, so the hard lockup caused by an overly long
ATS Invalidation timeout wait is not triggered. In the safe removal
path, pciehp_unconfigure_device() checks the 'presence' parameter and
does not set the device state to pci_channel_io_perm_failure, so
pci_dev_is_disconnected() returns false in
devtlb_invalidation_with_pasid() and the function proceeds as before.

For surprise removal, pciehp_unconfigure_device() sets the device state
to pci_channel_io_perm_failure, which means the device is already gone
(disconnected). pci_dev_is_disconnected() then returns true in
devtlb_invalidation_with_pasid(), and the function returns early instead
of blindly sending an ATS Invalidation request to the disconnected
device. This avoids triggering further ITE faults. An ITE fault blocks
all subsequent invalidation requests from being handled, and retrying
the timed-out request can trigger a hard lockup.

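The two removal paths described above can be sketched as a small
user-space model. This is only an illustration under stated assumptions:
the model_* names and the struct are hypothetical and do not exist in the
kernel, and the counter stands in for requests that would otherwise be
queued to the IOMMU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_dev {
	bool disconnected;       /* models error_state == pci_channel_io_perm_failure */
	bool ats_enabled;
	unsigned int inval_sent; /* ATS invalidation requests actually issued */
};

/* Models pciehp_unconfigure_device(): mark the device gone only when absent. */
void model_unconfigure_device(struct model_dev *dev, bool presence)
{
	assert(dev);
	if (!presence)
		dev->disconnected = true;
}

/* Models the early returns in devtlb_invalidation_with_pasid(). */
void model_devtlb_invalidate(struct model_dev *dev)
{
	assert(dev);
	if (!dev->ats_enabled)
		return;
	if (dev->disconnected)
		return;	/* the check added by this patch */
	dev->inval_sent++;
}
```

Calling model_unconfigure_device() with presence set to true leaves the
invalidation path unchanged, while presence set to false makes
model_devtlb_invalidate() return before anything is sent.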
safe removal (present) & surprise removal (not present)

pciehp_ist()
  pciehp_handle_presence_or_link_change()
    pciehp_disable_slot()
      remove_board()
        pciehp_unconfigure_device(presence) {
          if (!presence)
            pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
        }

This patch works for regular safe removal and surprise removal of
ATS-capable endpoints on PCIe switch downstream ports.

Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface")
Reviewed-by: Dan Carpenter
Tested-by: Haorong Ye
Signed-off-by: Ethan Zhao
Link: https://lore.kernel.org/r/20240301080727.3529832-3-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/pasid.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 108158e2b907..746c7abe2237 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -214,6 +214,9 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
 	if (!info || !info->ats_enabled)
 		return;
 
+	if (pci_dev_is_disconnected(to_pci_dev(dev)))
+		return;
+
 	sid = info->bus << 8 | info->devfn;
 	qdep = info->ats_qdep;
 	pfsid = info->pfsid;
-- 
2.34.1

From nobody Sat Feb 7 18:23:03 2026
From: Lu Baolu
To: Joerg Roedel
Cc: Ethan Zhao, Eric Badger, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH 3/8] iommu/vt-d: Improve ITE fault handling if target device isn't present
Date: Tue, 5 Mar 2024 20:21:16 +0800
Message-Id: <20240305122121.211482-4-baolu.lu@linux.intel.com>
In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com>
References: <20240305122121.211482-1-baolu.lu@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ethan Zhao

Surprise removal can happen at any time. For example, a user could
request safe removal of an EP (endpoint device) via sysfs and
concurrently bring its link down, causing a surprise removal. In such
aggressive cases an ATS invalidation request is issued to a target
device that no longer exists, and the request is then retried in an
endless loop after the ITE fault is triggered in interrupt context.

Improve the ITE handling by checking the presence state of the target
device so that the timed-out request is not retried blindly, thus
avoiding a hard lockup or system hang.

The device TLB should only be invalidated when the device is in the
iommu->device_rbtree (probed, not released) and present.

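The retry decision described above can be sketched in user space as
follows. This is a hedged model, not the driver code: the model_* names
are hypothetical, a linear table lookup stands in for the device rbtree,
and a boolean stands in for the aborted wait descriptor status.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a probed device, keyed by source ID (bus << 8 | devfn). */
struct model_dev {
	uint16_t sid;
	bool is_pci;
	bool present;
};

/* Linear lookup standing in for the rbtree search by source ID. */
const struct model_dev *model_find(const struct model_dev *tbl, size_t n,
				   uint16_t sid)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].sid == sid)
			return &tbl[i];
	return NULL;
}

/*
 * Models the decision after an ITE: give up with -ETIMEDOUT when the
 * reported device is unknown or absent, otherwise ask for a retry with
 * -EAGAIN if the wait descriptor was aborted. An ite_sid of 0 models
 * hardware that does not report a source ID, so the check is skipped.
 */
int model_ite_decision(const struct model_dev *tbl, size_t n,
		       uint16_t ite_sid, bool wait_desc_aborted)
{
	assert(tbl || n == 0);
	if (ite_sid) {
		const struct model_dev *dev = model_find(tbl, n, ite_sid);

		if (!dev || !dev->is_pci || !dev->present)
			return -ETIMEDOUT;
	}
	if (wait_desc_aborted)
		return -EAGAIN;
	return 0;
}
```

The point of the ordering is that the presence check runs before the
retry path, so a request aimed at a vanished device ends with a timeout
error rather than another retry.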
Fixes: 6ba6c3a4cacf ("VT-d: add device IOTLB invalidation support")
Reviewed-by: Dan Carpenter
Signed-off-by: Ethan Zhao
Link: https://lore.kernel.org/r/20240301080727.3529832-4-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/dmar.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index d14797aabb7a..36d7427b1202 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1273,6 +1273,8 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
 {
 	u32 fault;
 	int head, tail;
+	struct device *dev;
+	u64 iqe_err, ite_sid;
 	struct q_inval *qi = iommu->qi;
 	int shift = qi_shift(iommu);
 
@@ -1317,6 +1319,13 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
 		tail = readl(iommu->reg + DMAR_IQT_REG);
 		tail = ((tail >> shift) - 1 + QI_LENGTH) % QI_LENGTH;
 
+		/*
+		 * SID field is valid only when the ITE field is Set in FSTS_REG
+		 * see Intel VT-d spec r4.1, section 11.4.9.9
+		 */
+		iqe_err = dmar_readq(iommu->reg + DMAR_IQER_REG);
+		ite_sid = DMAR_IQER_REG_ITESID(iqe_err);
+
 		writel(DMA_FSTS_ITE, iommu->reg + DMAR_FSTS_REG);
 		pr_info("Invalidation Time-out Error (ITE) cleared\n");
 
@@ -1326,6 +1335,19 @@ static int qi_check_fault(struct intel_iommu *iommu, int index, int wait_index)
 			head = (head - 2 + QI_LENGTH) % QI_LENGTH;
 		} while (head != tail);
 
+		/*
+		 * If device was released or isn't present, no need to retry
+		 * the ATS invalidate request anymore.
+		 *
+		 * 0 value of ite_sid means old VT-d device, no ite_sid value.
+		 * see Intel VT-d spec r4.1, section 11.4.9.9
+		 */
+		if (ite_sid) {
+			dev = device_rbtree_find(iommu, ite_sid);
+			if (!dev || !dev_is_pci(dev) ||
+			    !pci_device_is_present(to_pci_dev(dev)))
+				return -ETIMEDOUT;
+		}
 		if (qi->desc_status[wait_index] == QI_ABORT)
 			return -EAGAIN;
 	}
-- 
2.34.1

From nobody Sat Feb 7 18:23:03 2026
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709641654; x=1741177654; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=maKNjl3/k7YtNAhNUBytDuUXRmMI86juobu03Z7Ayi0=; b=iKlym7RD50XrfGRApYpZBGTUV3HfJacix0vQ7aB8yTRhDpc2rw2BUXGy znYnFtz9OtQVRj8ZYMxa0ZACvGC8tuMVxni5CfH8SPcWYEdbH9HIRvpVh tYkpFeWuVrfCVP5Fi4w2nqrmuqLA/Pz13XyrW70gvSrL/7YFaaWMstPL1 1Cq/A8G4zexvmt+RisrrCyO9Xqea9NiLW7BgPjVHwARwTJWFqm1qRSBtq xDJMzDQJ6jrazzw/cASTiFwG5vR9UDPMJzaqHF/7QACf8WkZ60fCbHzlH Wd+iU5Kxw5dWfcyUoQ9ClXFvUrNSsnbwmReow48a9X51M9j5QPaiz73qa g==; X-IronPort-AV: E=McAfee;i="6600,9927,11003"; a="21648471" X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="21648471" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 04:27:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="9330112" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa010.jf.intel.com with ESMTP; 05 Mar 2024 04:27:32 -0800 From: Lu Baolu To: Joerg Roedel Cc: Ethan Zhao , Eric Badger , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 4/8] iommu: Add static iommu_ops->release_domain Date: Tue, 5 Mar 2024 20:21:17 +0800 Message-Id: <20240305122121.211482-5-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com> References: <20240305122121.211482-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The current device_release callback for individual iommu drivers does the following: 1) Silent IOMMU DMA translation: It detaches any existing domain from the device and puts it into a blocking state (some drivers might 
use the identity state). 2) Resource release: It releases resources allocated during the device_probe callback and restores the device to its pre-probe state. Step 1 is challenging for individual iommu drivers because each must check if a domain is already attached to the device. Additionally, if a deferred attach never occurred, the device_release should avoid modifying hardware configuration regardless of the reason for its call. To simplify this process, introduce a static release_domain within the iommu_ops structure. It can be either a blocking or identity domain depending on the iommu hardware. The iommu core will decide whether to attach this domain before the device_release callback, eliminating the need for repetitive code in various drivers. Consequently, the device_release callback can focus solely on the opposite operations of device_probe, including releasing all resources allocated during that callback. Co-developed-by: Jason Gunthorpe Signed-off-by: Jason Gunthorpe Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240305013305.204605-2-baolu.lu@linux.inte= l.com --- include/linux/iommu.h | 1 + drivers/iommu/iommu.c | 19 +++++++++++++++---- 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/include/linux/iommu.h b/include/linux/iommu.h index af6c367ed673..2e925b5eba53 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -585,6 +585,7 @@ struct iommu_ops { struct module *owner; struct iommu_domain *identity_domain; struct iommu_domain *blocked_domain; + struct iommu_domain *release_domain; struct iommu_domain *default_domain; }; =20 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index eb50543bf956..098869007c69 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -462,13 +462,24 @@ static void iommu_deinit_device(struct device *dev) =20 /* * release_device() must stop using any attached domain on the device. 
- * If there are still other devices in the group they are not effected + * If there are still other devices in the group, they are not affected * by this callback. * - * The IOMMU driver must set the device to either an identity or - * blocking translation and stop using any domain pointer, as it is - * going to be freed. + * If the iommu driver provides release_domain, the core code ensures + * that domain is attached prior to calling release_device. Drivers can + * use this to enforce a translation on the idle iommu. Typically, the + * global static blocked_domain is a good choice. + * + * Otherwise, the iommu driver must set the device to either an identity + * or a blocking translation in release_device() and stop using any + * domain pointer, as it is going to be freed. + * + * Regardless, if a delayed attach never occurred, then the release + * should still avoid touching any hardware configuration either. */ + if (!dev->iommu->attach_deferred && ops->release_domain) + ops->release_domain->ops->attach_dev(ops->release_domain, dev); + if (ops->release_device) ops->release_device(dev); =20 --=20 2.34.1 From nobody Sat Feb 7 18:23:03 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F02CD5BAF0 for ; Tue, 5 Mar 2024 12:27:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641657; cv=none; b=TCkXm6mC1ILXiMfBkhRklt15sUPSSzmSo0z1uiARUIjJMindbll2StpC+sV3SpZ9RE8O+cD3lhU7K2HLfDhZbBhLUfc8piHS+xUUi4uGSFJrfqS6F1ECcHRLDLLFITyWO1tpQmFu/+xa2Kuvs5DoquOutqmSil3496+cOl4Qz8g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641657; c=relaxed/simple; bh=8Rd/lNxfbF55r0swiBWQPYT42RoKXJM4lUE23H+2YPI=; 
h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=C8PyyJc+CujxdjGOGGpnR7QGMCY8YnQtoqnO23NzqGDPNzmkCUPis9/1Z3dU21eBz6K5YxzPTxzf/yk1QSD0WYHGPe2IfMcdj5YueCf+LoPB+hOPdD5pZHSrSzvFyGY5d1qRieIcHcJj6qpo/vQEoLEy/Xxjz0btxrtZO/j5Ki0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=oE42kTPs; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="oE42kTPs" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709641656; x=1741177656; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8Rd/lNxfbF55r0swiBWQPYT42RoKXJM4lUE23H+2YPI=; b=oE42kTPs5G5PKfRNQrbTwcavcPi01vpSdth+7ndkb5HFtWamiPS9Inux pPdHaf9wbXndJSbWW3X3EAER5bQaz0s7Owe15O4cScPcZnITY5fPieASs dEnL6KdvosU0YonXFr5DWCkOHHPSYJByKQnn9HXJFA9NHj3B3VRqsiOLX 1ElenXBW535RJT4dAVcaK16rjNfNxo3/IdE0i1MtrdBRCDveEepFnyXpP 9sVGe1JQLbWp7jvwivMA9VHOZc9llMHehxiLophbr0r7u098TsW7pwbWu ppkQvpL0q29sE8p9wQ+VWdFKDcoxS7Gll7p4nTrkDi0sISSOeVr/IIp51 g==; X-IronPort-AV: E=McAfee;i="6600,9927,11003"; a="21648482" X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="21648482" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 04:27:35 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="9330122" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa010.jf.intel.com with ESMTP; 05 Mar 2024 04:27:34 -0800 From: Lu 
Baolu To: Joerg Roedel Cc: Ethan Zhao , Eric Badger , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 5/8] iommu/vt-d: Fix NULL domain on device release Date: Tue, 5 Mar 2024 20:21:18 +0800 Message-Id: <20240305122121.211482-6-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com> References: <20240305122121.211482-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

In the kdump kernel, the IOMMU operates in deferred_attach mode. In this
mode, info->domain may not yet be assigned by the time the release_device
function is called. Dereferencing the NULL info->domain then crashes the
kdump kernel:

    BUG: kernel NULL pointer dereference, address: 000000000000003c
    ...
    RIP: 0010:do_raw_spin_lock+0xa/0xa0
    ...
    _raw_spin_lock_irqsave+0x1b/0x30
    intel_iommu_release_device+0x96/0x170
    iommu_deinit_device+0x39/0xf0
    __iommu_group_remove_device+0xa0/0xd0
    iommu_bus_notifier+0x55/0xb0
    notifier_call_chain+0x5a/0xd0
    blocking_notifier_call_chain+0x41/0x60
    bus_notify+0x34/0x50
    device_del+0x269/0x3d0
    pci_remove_bus_device+0x77/0x100
    p2sb_bar+0xae/0x1d0
    ...
    i801_probe+0x423/0x740

Use the release_domain mechanism to fix it. The scalable mode context
entry, which is not part of the release domain, should be cleared in
release_device().
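For illustration only (not part of the patch): the ordering that the release_domain mechanism relies on can be modelled in user space roughly as below. All toy_* names are hypothetical stand-ins invented for this sketch; only the condition in toy_release() mirrors the check the core code performs before calling release_device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct toy_domain {
	const char *name;
};

struct toy_device {
	bool attach_deferred;             /* no domain was ever attached */
	const struct toy_domain *domain;  /* NULL until the first attach */
	int hw_writes;                    /* counts simulated hardware updates */
};

struct toy_ops {
	const struct toy_domain *release_domain;  /* may be NULL */
};

static const struct toy_domain toy_blocking_domain = { "blocking" };

/* Attach a domain; the only place in this sketch that "touches hardware". */
static void toy_attach(struct toy_device *dev, const struct toy_domain *domain)
{
	assert(dev != NULL && domain != NULL);
	dev->domain = domain;
	dev->hw_writes++;
}

/*
 * Release path: park the device in release_domain first, unless the
 * attach was deferred and never happened. Whatever driver cleanup
 * follows must not assume that dev->domain is non-NULL.
 */
static void toy_release(struct toy_device *dev, const struct toy_ops *ops)
{
	if (!dev->attach_deferred && ops->release_domain)
		toy_attach(dev, ops->release_domain);

	/* Driver-specific cleanup would go here (omitted in this sketch). */
}
```

In this model a device whose attach was deferred leaves the release path with a NULL domain and no hardware writes, which is the case the old release code did not expect.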
Fixes: 586081d3f6b1 ("iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO") Reported-by: Eric Badger Closes: https://lore.kernel.org/r/20240113181713.1817855-1-ebadger@purestorage.com Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240305013305.204605-3-baolu.lu@linux.intel.com --- drivers/iommu/intel/pasid.h | 1 + drivers/iommu/intel/iommu.c | 31 ++++-------------- drivers/iommu/intel/pasid.c | 64 +++++++++++++++++++++++++++++++++++++ 3 files changed, 71 insertions(+), 25 deletions(-) diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h index 487ede039bdd..42fda97fd851 100644 --- a/drivers/iommu/intel/pasid.h +++ b/drivers/iommu/intel/pasid.h @@ -318,4 +318,5 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu, bool fault_ignore); void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu, struct device *dev, u32 pasid); +void intel_pasid_teardown_sm_context(struct device *dev); #endif /* __INTEL_PASID_H */ diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index cc3994efd362..f74d42d3258f 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3869,30 +3869,6 @@ static void domain_context_clear(struct device_domain_info *info) &domain_context_clear_one_cb, info); } =20 -static void dmar_remove_one_dev_info(struct device *dev) -{ - struct device_domain_info *info =3D dev_iommu_priv_get(dev); - struct dmar_domain *domain =3D info->domain; - struct intel_iommu *iommu =3D info->iommu; - unsigned long flags; - - if (!dev_is_real_dma_subdevice(info->dev)) { - if (dev_is_pci(info->dev) && sm_supported(iommu)) - intel_pasid_tear_down_entry(iommu, info->dev, - IOMMU_NO_PASID, false); - - iommu_disable_pci_caps(info); - domain_context_clear(info); - } - - spin_lock_irqsave(&domain->lock, flags); - list_del(&info->link); - spin_unlock_irqrestore(&domain->lock, flags); - - domain_detach_iommu(domain, iommu); - info->domain =3D NULL; -} - /* * Clear the page table
pointer in context or pasid table entries so that * all DMA requests without PASID from the device are blocked. If the page @@ -4431,7 +4407,11 @@ static void intel_iommu_release_device(struct device= *dev) mutex_lock(&iommu->iopf_lock); device_rbtree_remove(info); mutex_unlock(&iommu->iopf_lock); - dmar_remove_one_dev_info(dev); + + if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev) && + !context_copied(iommu, info->bus, info->devfn)) + intel_pasid_teardown_sm_context(dev); + intel_pasid_free_table(dev); intel_iommu_debugfs_remove_dev(info); kfree(info); @@ -4922,6 +4902,7 @@ static const struct iommu_dirty_ops intel_dirty_ops = =3D { =20 const struct iommu_ops intel_iommu_ops =3D { .blocked_domain =3D &blocking_domain, + .release_domain =3D &blocking_domain, .capable =3D intel_iommu_capable, .hw_info =3D intel_iommu_hw_info, .domain_alloc =3D intel_iommu_domain_alloc, diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index 746c7abe2237..a51e895d9a17 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -670,3 +670,67 @@ int intel_pasid_setup_nested(struct intel_iommu *iommu= , struct device *dev, =20 return 0; } + +/* + * Interfaces to setup or teardown a pasid table to the scalable-mode + * context table entry: + */ + +static void device_pasid_table_teardown(struct device *dev, u8 bus, u8 dev= fn) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + struct context_entry *context; + + spin_lock(&iommu->lock); + context =3D iommu_context_addr(iommu, bus, devfn, false); + if (!context) { + spin_unlock(&iommu->lock); + return; + } + + context_clear_entry(context); + __iommu_flush_cache(iommu, context, sizeof(*context)); + spin_unlock(&iommu->lock); + + /* + * Cache invalidation for changes to a scalable-mode context table + * entry. 
+ * + * Section 6.5.3.3 of the VT-d spec: + * - Device-selective context-cache invalidation; + * - Domain-selective PASID-cache invalidation to affected domains + * (can be skipped if all PASID entries were not-present); + * - Domain-selective IOTLB invalidation to affected domains; + * - Global Device-TLB invalidation to affected functions. + * + * The iommu has been parked in the blocking state. All domains have + * been detached from the device or PASID. The PASID and IOTLB caches + * have been invalidated during the domain detach path. + */ + iommu->flush.flush_context(iommu, 0, PCI_DEVID(bus, devfn), + DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL); + devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID); +} + +static int pci_pasid_table_teardown(struct pci_dev *pdev, u16 alias, void = *data) +{ + struct device *dev =3D data; + + if (dev =3D=3D &pdev->dev) + device_pasid_table_teardown(dev, PCI_BUS_NUM(alias), alias & 0xff); + + return 0; +} + +void intel_pasid_teardown_sm_context(struct device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + + if (!dev_is_pci(dev)) { + device_pasid_table_teardown(dev, info->bus, info->devfn); + return; + } + + pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_teardown, dev); +} --=20 2.34.1 From nobody Sat Feb 7 18:23:03 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B9C905C605 for ; Tue, 5 Mar 2024 12:27:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641659; cv=none; b=hhQ1VJZule1k5BreVZ7gw0jUZc/ku1WOIj4F+yUeWjQZ7uxDFtrQbIjisNmN8IbOF8nLKSQthqi6PXaOnxvcetdZ6+wV9MLrOKEUsnBqkbI1YXQS4hZl82qRx9rxTnjU+iVcBaqQJaV/rN0qM4arD7+Z6j1fCabeQQIChtafzZU= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1709641659; c=relaxed/simple; bh=VouZ/6ufeu12LgA8puasFbjIcELZDxJkCgL7ecRZX+U=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mE5uRF+71tU3b3C3r1k5uwT9kCChQ8UEk6nhwQXLBcYQu4cDDCGN1PCUN1Dvi6gNheYwHdu8Ed5305irLmv206RO4NTdGaDH68Ei6/l4dmJq40fcHfK+63e2JQ4oABlapiUNhOFRDTLLuZk5GYh992/ZIXP3wVXhf+TbrceLs1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=VfAiCd7I; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="VfAiCd7I" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709641657; x=1741177657; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VouZ/6ufeu12LgA8puasFbjIcELZDxJkCgL7ecRZX+U=; b=VfAiCd7IIcJg5idN/ks5Z9mRqUSRiuyH9diKZVfMUjN7I233rdwNSKV/ svIqvK6iXH1rnUaWzOJnP0C+FSPzIDsKZRn2zKFDwC2gIbYp0tVctYXGq op/SjK9cSeq4RP0o5h/QLI5KXbNyLD72hUIGNVjGdtSz2Gn8AEq9dz0bQ jvBo/LZHmW/kFHBXhAu1j8xaTIcRAipmzvbXkMlrnvY7BpQJQRX5EGVl8 ideiqdyOZYWKhkkRpJ9LlmdRhBaJFj6RFG85rXYNa+oL8mn/Xfn8uUbG9 8ZMKyGYrH8iCm5VtfCoLAltK/9a0lexInrFZ5yKCBLg1dgZ+dQ0o+MQ6D A==; X-IronPort-AV: E=McAfee;i="6600,9927,11003"; a="21648495" X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="21648495" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 04:27:37 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="9330135" Received: 
from allen-box.sh.intel.com ([10.239.159.127]) by orviesa010.jf.intel.com with ESMTP; 05 Mar 2024 04:27:36 -0800 From: Lu Baolu To: Joerg Roedel Cc: Ethan Zhao , Eric Badger , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 6/8] iommu/vt-d: Setup scalable mode context entry in probe path Date: Tue, 5 Mar 2024 20:21:19 +0800 Message-Id: <20240305122121.211482-7-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com> References: <20240305122121.211482-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

In contrast to legacy mode, scalable mode configures the DMA translation
table in the PASID table entry instead of the context entry. For this
reason, it is more appropriate to set up the scalable mode context entry
in the probe_device callback and point it to the appropriate PASID table.

The iommu domain attach/detach operations only affect the PASID table
entry. Therefore, there is no need to modify the context entry when
configuring the translation type and page table.

The only exception is the kdump case, where context entry setup is
postponed until the device driver invokes the first DMA interface.
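For illustration only (not part of the patch): besides the PASID directory pointer, the scalable mode context entry carries a directory size field (PDTS), where a value of X describes a PASID directory with 2^(X + 7) entries. The user-space sketch below mirrors the arithmetic of context_get_sm_pds() from this patch. The toy_* names are hypothetical, and the two constants are assumed values for this sketch; the kernel headers remain authoritative.

```c
#include <assert.h>

/* Assumed values for this sketch; the kernel headers are authoritative. */
#define TOY_PASID_PDE_SHIFT	6	/* one directory entry covers 2^6 PASIDs */
#define TOY_MAX_NR_PASID_BITS	20

/* Index of the lowest set bit in word, or nbits if no bit is set. */
static unsigned int toy_find_first_bit(unsigned long word, unsigned int nbits)
{
	unsigned int i;

	for (i = 0; i < nbits; i++)
		if (word & (1UL << i))
			return i;
	return nbits;
}

/* PDTS value X for a table holding max_pasid PASIDs. */
static unsigned long toy_context_get_sm_pds(unsigned long max_pasid)
{
	unsigned long max_pde = max_pasid >> TOY_PASID_PDE_SHIFT;
	unsigned long pds = toy_find_first_bit(max_pde, TOY_MAX_NR_PASID_BITS);

	if (pds < 7)
		return 0;

	return pds - 7;
}

/* Number of PASID directory entries described by a PDTS value. */
static unsigned long toy_pdts_to_entries(unsigned long pdts)
{
	assert(pdts <= 13);	/* bound chosen for this sketch only */
	return 1UL << (pdts + 7);
}
```

With these assumed constants, a table sized for 2^20 PASIDs needs 2^14 directory entries, so the PDTS value is 7; anything that fits in 128 directory entries uses the minimum value 0.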
Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240305013305.204605-4-baolu.lu@linux.intel.com --- drivers/iommu/intel/pasid.h | 1 + drivers/iommu/intel/iommu.c | 12 ++++ drivers/iommu/intel/pasid.c | 138 ++++++++++++++++++++++++++++++++++++ 3 files changed, 151 insertions(+) diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h index 42fda97fd851..da9978fef7ac 100644 --- a/drivers/iommu/intel/pasid.h +++ b/drivers/iommu/intel/pasid.h @@ -318,5 +318,6 @@ void intel_pasid_tear_down_entry(struct intel_iommu *iommu, bool fault_ignore); void intel_pasid_setup_page_snoop_control(struct intel_iommu *iommu, struct device *dev, u32 pasid); +int intel_pasid_setup_sm_context(struct device *dev); void intel_pasid_teardown_sm_context(struct device *dev); #endif /* __INTEL_PASID_H */ diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index f74d42d3258f..9b96d36b9d2a 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4073,6 +4073,10 @@ int prepare_domain_attach_device(struct iommu_domain *domain, dmar_domain->agaw--; } =20 + if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev) && + context_copied(iommu, info->bus, info->devfn)) + return intel_pasid_setup_sm_context(dev); + return 0; } =20 @@ -4386,11 +4390,19 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev) dev_err(dev, "PASID table allocation failed\n"); goto clear_rbtree; } + + if (!context_copied(iommu, info->bus, info->devfn)) { + ret =3D intel_pasid_setup_sm_context(dev); + if (ret) + goto free_table; + } } =20 intel_iommu_debugfs_create_dev(info); =20 return &iommu->iommu; +free_table: + intel_pasid_free_table(dev); clear_rbtree: device_rbtree_remove(info); free: diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c index a51e895d9a17..11f0b856d74c 100644 --- a/drivers/iommu/intel/pasid.c +++ b/drivers/iommu/intel/pasid.c @@ -734,3 +734,141 @@ void
intel_pasid_teardown_sm_context(struct device *d= ev) =20 pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_teardown, dev); } + +/* + * Get the PASID directory size for scalable mode context entry. + * Value of X in the PDTS field of a scalable mode context entry + * indicates PASID directory with 2^(X + 7) entries. + */ +static unsigned long context_get_sm_pds(struct pasid_table *table) +{ + unsigned long pds, max_pde; + + max_pde =3D table->max_pasid >> PASID_PDE_SHIFT; + pds =3D find_first_bit(&max_pde, MAX_NR_PASID_BITS); + if (pds < 7) + return 0; + + return pds - 7; +} + +static int context_entry_set_pasid_table(struct context_entry *context, + struct device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct pasid_table *table =3D info->pasid_table; + struct intel_iommu *iommu =3D info->iommu; + unsigned long pds; + + context_clear_entry(context); + + pds =3D context_get_sm_pds(table); + context->lo =3D (u64)virt_to_phys(table->table) | context_pdts(pds); + context_set_sm_rid2pasid(context, IOMMU_NO_PASID); + + if (info->ats_supported) + context_set_sm_dte(context); + if (info->pri_supported) + context_set_sm_pre(context); + if (info->pasid_supported) + context_set_pasid(context); + + context_set_fault_enable(context); + context_set_present(context); + __iommu_flush_cache(iommu, context, sizeof(*context)); + + return 0; +} + +static int device_pasid_table_setup(struct device *dev, u8 bus, u8 devfn) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + struct intel_iommu *iommu =3D info->iommu; + struct context_entry *context; + + spin_lock(&iommu->lock); + context =3D iommu_context_addr(iommu, bus, devfn, true); + if (!context) { + spin_unlock(&iommu->lock); + return -ENOMEM; + } + + if (context_present(context) && !context_copied(iommu, bus, devfn)) { + spin_unlock(&iommu->lock); + return 0; + } + + if (context_copied(iommu, bus, devfn)) { + context_clear_entry(context); + __iommu_flush_cache(iommu, context, 
sizeof(*context)); + + /* + * For kdump cases, old valid entries may be cached due to + * the in-flight DMA and copied pgtable, but there is no + * unmapping behaviour for them, thus we need explicit cache + * flushes for all affected domain IDs and PASIDs used in + * the copied PASID table. Given that we have no idea about + * which domain IDs and PASIDs were used in the copied tables, + * upgrade them to global PASID and IOTLB cache invalidation. + */ + iommu->flush.flush_context(iommu, 0, + PCI_DEVID(bus, devfn), + DMA_CCMD_MASK_NOBIT, + DMA_CCMD_DEVICE_INVL); + qi_flush_pasid_cache(iommu, 0, QI_PC_GLOBAL, 0); + iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH); + devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID); + + /* + * At this point, the device is supposed to finish reset at + * its driver probe stage, so no in-flight DMA will exist, + * and we don't need to worry anymore hereafter. + */ + clear_context_copied(iommu, bus, devfn); + } + + context_entry_set_pasid_table(context, dev); + spin_unlock(&iommu->lock); + + /* + * It's a non-present to present mapping. If hardware doesn't cache + * non-present entry we don't need to flush the caches. If it does + * cache non-present entries, then it does so in the special + * domain #0, which we have to flush: + */ + if (cap_caching_mode(iommu->cap)) { + iommu->flush.flush_context(iommu, 0, + PCI_DEVID(bus, devfn), + DMA_CCMD_MASK_NOBIT, + DMA_CCMD_DEVICE_INVL); + iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_DSI_FLUSH); + } + + return 0; +} + +static int pci_pasid_table_setup(struct pci_dev *pdev, u16 alias, void *da= ta) +{ + struct device *dev =3D data; + + if (dev !=3D &pdev->dev) + return 0; + + return device_pasid_table_setup(dev, PCI_BUS_NUM(alias), alias & 0xff); +} + +/* + * Set the device's PASID table to its context table entry. + * + * The PASID table is set to the context entries of both device itself + * and its alias requester ID for DMA. 
+ */ +int intel_pasid_setup_sm_context(struct device *dev) +{ + struct device_domain_info *info =3D dev_iommu_priv_get(dev); + + if (!dev_is_pci(dev)) + return device_pasid_table_setup(dev, info->bus, info->devfn); + + return pci_for_each_dma_alias(to_pci_dev(dev), pci_pasid_table_setup, dev= ); +} --=20 2.34.1 From nobody Sat Feb 7 18:23:03 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD7E55C8E5 for ; Tue, 5 Mar 2024 12:27:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641661; cv=none; b=Wt0HSw/HR2wC0+dL77SUDzs7jeyFqrvw6ej4k08OZYbJvzkvzOABrAItJWS+GZ5O54kuEtshAtD9T4gWH3vpYi9p8nq96OmNfj/bI2MAeNFy3SuhYgdnUMDTWXBRxsrfWYmhFA0eRxlKBDLiXOi96yIisDZ0xsIN69+UEdzR61k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641661; c=relaxed/simple; bh=WvckCYx3o4P91nUGVJiGEmiOvgJ5nL7dKzsLf3U9+Qw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=NX08V2tVFcQTYAFqUlQjKvI0Iak7mfhcexb169T67Btkr4D/FmmYM1zbljFwGRNGhy5Va20t8q9fkNOVreiEAoEvlpBt3tJHpSPAkZLRv/zuIg3Y4SVquPXmDzzFDzDk/n1+7bT7p8R1JpJO6LDBD+7xaTzhCdR5kcBWjj0RA0o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ns2kMi84; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com 
header.b="Ns2kMi84" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709641659; x=1741177659; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WvckCYx3o4P91nUGVJiGEmiOvgJ5nL7dKzsLf3U9+Qw=; b=Ns2kMi8461y/bEfVFMcrCk4XHiBW+fwG4PcUkWijLDPsJiTE/QRj5l1o kSTRH8rJ9Navo1qz9P8AQDr5sJC6m9ouFVYX86OOzmgvXze5f2Bt3bcsZ kr8Ehdn0aeAF36q7d/Rpb1594xackWenWFQcdG+2/5ooN78AKQt1QQOdr Kov2r3ZOesEtOvPzfmUrby/GYt80uDYYLG6CGETCLv18eGF2ki6Sw3d3Q CPISHbhKbLl/JrcPnx5QyxIOdKM7YxlKsOx3bDJd04M+K9Ke0RbT0l7+K 34myT/9Z7P22Lnm/dLTMdl6Mz6QgOYnkaBgzC95GBt5nTTDouAwmbZlN9 g==; X-IronPort-AV: E=McAfee;i="6600,9927,11003"; a="21648515" X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="21648515" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 04:27:39 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="9330144" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa010.jf.intel.com with ESMTP; 05 Mar 2024 04:27:38 -0800 From: Lu Baolu To: Joerg Roedel Cc: Ethan Zhao , Eric Badger , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 7/8] iommu/vt-d: Remove scalable mode context entry setup from attach_dev Date: Tue, 5 Mar 2024 20:21:20 +0800 Message-Id: <20240305122121.211482-8-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com> References: <20240305122121.211482-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The scalable mode context entry is now set up in the probe_device path, eliminating the need to configure it in the attach_dev path.
Remove the redundant code from the attach_dev path to avoid dead code. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240305013305.204605-5-baolu.lu@linux.intel.com --- drivers/iommu/intel/iommu.c | 156 ++++++++++-------------------------- 1 file changed, 44 insertions(+), 112 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 9b96d36b9d2a..d682eb6ad4d2 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1850,34 +1850,17 @@ static void domain_exit(struct dmar_domain *domain) kfree(domain); } =20 -/* - * Get the PASID directory size for scalable mode context entry. - * Value of X in the PDTS field of a scalable mode context entry - * indicates PASID directory with 2^(X + 7) entries. - */ -static unsigned long context_get_sm_pds(struct pasid_table *table) -{ - unsigned long pds, max_pde; - - max_pde =3D table->max_pasid >> PASID_PDE_SHIFT; - pds =3D find_first_bit(&max_pde, MAX_NR_PASID_BITS); - if (pds < 7) - return 0; - - return pds - 7; -} - static int domain_context_mapping_one(struct dmar_domain *domain, struct intel_iommu *iommu, - struct pasid_table *table, u8 bus, u8 devfn) { struct device_domain_info *info =3D domain_lookup_dev_info(domain, iommu, bus, devfn); u16 did =3D domain_id_iommu(domain, iommu); int translation =3D CONTEXT_TT_MULTI_LEVEL; + struct dma_pte *pgd =3D domain->pgd; struct context_entry *context; - int ret; + int agaw, ret; =20 if (hw_pass_through && domain_type_is_si(domain)) translation =3D CONTEXT_TT_PASS_THROUGH; @@ -1920,65 +1903,37 @@ static int domain_context_mapping_one(struct dmar_domain *domain, } =20 context_clear_entry(context); + context_set_domain_id(context, did); =20 - if (sm_supported(iommu)) { - unsigned long pds; - - /* Setup the PASID DIR pointer: */ - pds =3D context_get_sm_pds(table); - context->lo =3D (u64)virt_to_phys(table->table) | - context_pdts(pds); - - /* Setup the RID_PASID field: */ - 
context_set_sm_rid2pasid(context, IOMMU_NO_PASID); - + if (translation !=3D CONTEXT_TT_PASS_THROUGH) { /* - * Setup the Device-TLB enable bit and Page request - * Enable bit: + * Skip top levels of page tables for iommu which has + * less agaw than default. Unnecessary for PT mode. */ + for (agaw =3D domain->agaw; agaw > iommu->agaw; agaw--) { + ret =3D -ENOMEM; + pgd =3D phys_to_virt(dma_pte_addr(pgd)); + if (!dma_pte_present(pgd)) + goto out_unlock; + } + if (info && info->ats_supported) - context_set_sm_dte(context); - if (info && info->pri_supported) - context_set_sm_pre(context); - if (info && info->pasid_supported) - context_set_pasid(context); + translation =3D CONTEXT_TT_DEV_IOTLB; + else + translation =3D CONTEXT_TT_MULTI_LEVEL; + + context_set_address_root(context, virt_to_phys(pgd)); + context_set_address_width(context, agaw); } else { - struct dma_pte *pgd =3D domain->pgd; - int agaw; - - context_set_domain_id(context, did); - - if (translation !=3D CONTEXT_TT_PASS_THROUGH) { - /* - * Skip top levels of page tables for iommu which has - * less agaw than default. Unnecessary for PT mode. - */ - for (agaw =3D domain->agaw; agaw > iommu->agaw; agaw--) { - ret =3D -ENOMEM; - pgd =3D phys_to_virt(dma_pte_addr(pgd)); - if (!dma_pte_present(pgd)) - goto out_unlock; - } - - if (info && info->ats_supported) - translation =3D CONTEXT_TT_DEV_IOTLB; - else - translation =3D CONTEXT_TT_MULTI_LEVEL; - - context_set_address_root(context, virt_to_phys(pgd)); - context_set_address_width(context, agaw); - } else { - /* - * In pass through mode, AW must be programmed to - * indicate the largest AGAW value supported by - * hardware. And ASR is ignored by hardware. - */ - context_set_address_width(context, iommu->msagaw); - } - - context_set_translation_type(context, translation); + /* + * In pass through mode, AW must be programmed to + * indicate the largest AGAW value supported by + * hardware. And ASR is ignored by hardware. 
+ */ + context_set_address_width(context, iommu->msagaw); } =20 + context_set_translation_type(context, translation); context_set_fault_enable(context); context_set_present(context); if (!ecap_coherent(iommu->ecap)) @@ -2008,43 +1963,29 @@ static int domain_context_mapping_one(struct dmar_d= omain *domain, return ret; } =20 -struct domain_context_mapping_data { - struct dmar_domain *domain; - struct intel_iommu *iommu; - struct pasid_table *table; -}; - static int domain_context_mapping_cb(struct pci_dev *pdev, u16 alias, void *opaque) { - struct domain_context_mapping_data *data =3D opaque; + struct device_domain_info *info =3D dev_iommu_priv_get(&pdev->dev); + struct intel_iommu *iommu =3D info->iommu; + struct dmar_domain *domain =3D opaque; =20 - return domain_context_mapping_one(data->domain, data->iommu, - data->table, PCI_BUS_NUM(alias), - alias & 0xff); + return domain_context_mapping_one(domain, iommu, + PCI_BUS_NUM(alias), alias & 0xff); } =20 static int domain_context_mapping(struct dmar_domain *domain, struct device *dev) { struct device_domain_info *info =3D dev_iommu_priv_get(dev); - struct domain_context_mapping_data data; struct intel_iommu *iommu =3D info->iommu; u8 bus =3D info->bus, devfn =3D info->devfn; - struct pasid_table *table; - - table =3D intel_pasid_get_table(dev); =20 if (!dev_is_pci(dev)) - return domain_context_mapping_one(domain, iommu, table, - bus, devfn); - - data.domain =3D domain; - data.iommu =3D iommu; - data.table =3D table; + return domain_context_mapping_one(domain, iommu, bus, devfn); =20 return pci_for_each_dma_alias(to_pci_dev(dev), - &domain_context_mapping_cb, &data); + domain_context_mapping_cb, domain); } =20 /* Returns a number of VTD pages, but aligned to MM page size */ @@ -2404,28 +2345,19 @@ static int dmar_domain_attach_device(struct dmar_do= main *domain, list_add(&info->link, &domain->devices); spin_unlock_irqrestore(&domain->lock, flags); =20 - /* PASID table is mandatory for a PCI device in scalable mode. 
*/ - if (sm_supported(iommu) && !dev_is_real_dma_subdevice(dev)) { - /* Setup the PASID entry for requests without PASID: */ - if (hw_pass_through && domain_type_is_si(domain)) - ret =3D intel_pasid_setup_pass_through(iommu, - dev, IOMMU_NO_PASID); - else if (domain->use_first_level) - ret =3D domain_setup_first_level(iommu, domain, dev, - IOMMU_NO_PASID); - else - ret =3D intel_pasid_setup_second_level(iommu, domain, - dev, IOMMU_NO_PASID); - if (ret) { - dev_err(dev, "Setup RID2PASID failed\n"); - device_block_translation(dev); - return ret; - } - } + if (dev_is_real_dma_subdevice(dev)) + return 0; + + if (!sm_supported(iommu)) + ret =3D domain_context_mapping(domain, dev); + else if (hw_pass_through && domain_type_is_si(domain)) + ret =3D intel_pasid_setup_pass_through(iommu, dev, IOMMU_NO_PASID); + else if (domain->use_first_level) + ret =3D domain_setup_first_level(iommu, domain, dev, IOMMU_NO_PASID); + else + ret =3D intel_pasid_setup_second_level(iommu, domain, dev, IOMMU_NO_PASI= D); =20 - ret =3D domain_context_mapping(domain, dev); if (ret) { - dev_err(dev, "Domain context map failed\n"); device_block_translation(dev); return ret; } --=20 2.34.1 From nobody Sat Feb 7 18:23:03 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 895005C913 for ; Tue, 5 Mar 2024 12:27:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641663; cv=none; b=nYVpEFgOX48dpDQIAVnbsnUqfDuZn0yUSxDYtzbH7nYZ7bmJifo8rYqYBH7VOXDiuQYBd/DjMEeZPl/dWia1OrPC1jZTwISjCbKOCQ4UlEw0cJcUh4qipCo40HKWLGDxKuvfuWxzLt8ZB2/J5eLBjuzhncneaaKbewd1I0Y3ykg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709641663; c=relaxed/simple; 
bh=ZowgDUVcvv+doVk6YVMSxDhTJIb10vanJKzULJxdQqE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=h71dIIQWoHfVXWuFuywXEkNVhnaVig5NglQUhEg3AbhPIBilnmF/NhDyWCcPgKaVnuoYp41JP7/XRjUgO4kjFFMZ/2ndtVxVjpnRodf8vHo3JTPvjztus8KcuQO1wHzufoJyoRDA/T3bNGQe9jiBDBqffvgDhwO462MStWbjwlM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=h38YFYa+; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="h38YFYa+" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1709641661; x=1741177661; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZowgDUVcvv+doVk6YVMSxDhTJIb10vanJKzULJxdQqE=; b=h38YFYa+v0wsrOdkE2zZLgycP6CPGYbBwhgQxcFnQqFNFGTM/5JeOoqK MsFJAsafi11X9ilVWLv+DIzHgwjs/cwQir69+sasIYEY6Gw87WYVJ45sB YIhgG+P1oc2kz2MGuN2mOLv9uN4u6mKx+G+nrm8CxBXN4mXoX62nCTwT5 IDLtNVgil41UYm3X1wG2tmpwUCjMg8sjZHexg6ezrPw4d5WNdjajwDs4a ItmCTszJPlKC+N5DnRUETJjTB3qm1Ldg69R/oivV5YdKuivk/3ooNJzuO S+pgNfm8Tmk722UAPYzVyAgXZ/vFH5H2zcl7iYjSt+7KY8wAY6FqkJ4G7 A==; X-IronPort-AV: E=McAfee;i="6600,9927,11003"; a="21648524" X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="21648524" Received: from orviesa010.jf.intel.com ([10.64.159.150]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Mar 2024 04:27:41 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,205,1705392000"; d="scan'208";a="9330155" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orviesa010.jf.intel.com 
with ESMTP; 05 Mar 2024 04:27:40 -0800 From: Lu Baolu To: Joerg Roedel Cc: Ethan Zhao , Eric Badger , iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [PATCH 8/8] iommu/vt-d: Remove scalable mode in domain_context_clear_one() Date: Tue, 5 Mar 2024 20:21:21 +0800 Message-Id: <20240305122121.211482-9-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240305122121.211482-1-baolu.lu@linux.intel.com> References: <20240305122121.211482-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" domain_context_clear_one() only handles the context entry teardown in legacy mode. Remove its scalable mode checks, which are dead code, and drop the unnecessary NULL check on iommu as well. Signed-off-by: Lu Baolu Reviewed-by: Kevin Tian Link: https://lore.kernel.org/r/20240305013305.204605-6-baolu.lu@linux.intel.com --- drivers/iommu/intel/iommu.c | 15 +-------------- 1 file changed, 1 insertion(+), 14 deletions(-) diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index d682eb6ad4d2..50eb9aed47cc 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -2175,9 +2175,6 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8 struct context_entry *context; u16 did_old; =20 - if (!iommu) - return; - spin_lock(&iommu->lock); context =3D iommu_context_addr(iommu, bus, devfn, 0); if (!context) { @@ -2185,14 +2182,7 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8 return; } =20 - if (sm_supported(iommu)) { - if (hw_pass_through && domain_type_is_si(info->domain)) - did_old =3D FLPT_DEFAULT_DID; - else - did_old =3D domain_id_iommu(info->domain, iommu); - } else { - did_old =3D context_domain_id(context); - } + did_old =3D context_domain_id(context); =20 
context_clear_entry(context); __iommu_flush_cache(iommu, context, sizeof(*context)); @@ -2203,9 +2193,6 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8 DMA_CCMD_MASK_NOBIT, DMA_CCMD_DEVICE_INVL); =20 - if (sm_supported(iommu)) - qi_flush_pasid_cache(iommu, did_old, QI_PC_ALL_PASIDS, 0); - iommu->flush.flush_iotlb(iommu, did_old, 0, --=20 2.34.1