[PATCH rc v8 0/8] iommu: Fix pci_dev_reset_iommu_prepare/done()

Nicolin Chen posted 8 patches 1 month, 3 weeks ago
drivers/iommu/iommu.c | 223 ++++++++++++++++++++++++++++++++++--------
1 file changed, 181 insertions(+), 42 deletions(-)
Posted by Nicolin Chen 1 month, 3 weeks ago
Shuai and Kevin found a few bugs in the pci_dev_reset_iommu_prepare/done()
helpers when used to handle some corner cases:
 - Nested callbacks
 - Multi-device groups
 - WARN_ON/UAF due to concurrent detach

Fixing these requires substantial rework: device reset state must be tracked
on a per-gdev basis. This series includes a few patches addressing them. Most
of the patches were previously reviewed as a single patch in v6. As we found
more bugs during the reviews, I split that v6 into smaller patches so that
each of them is cleaner.
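
For readers new to the problem, the per-gdev tracking with nested re-entry can
be pictured with a small user-space model. This is only a sketch, not the code
in drivers/iommu/iommu.c: the struct and the model_* function names are
hypothetical, and the field names merely echo the 'reset_depth' and 'blocked'
terms from the changelog.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; NOT the kernel's group device structure. */
struct model_gdev {
	unsigned int reset_depth;	/* nesting level of prepare() calls */
	bool blocked;			/* true while this device is blocked */
};

/* The first prepare() blocks the device; nested calls only bump the depth. */
static int model_reset_prepare(struct model_gdev *gdev)
{
	if (gdev->reset_depth++ == 0)
		gdev->blocked = true;
	return 0;
}

/* Only the outermost done() unblocks. Returns -1 on an unbalanced done(). */
static int model_reset_done(struct model_gdev *gdev)
{
	if (gdev->reset_depth == 0)
		return -1;
	if (--gdev->reset_depth == 0)
		gdev->blocked = false;
	return 0;
}

/* Two devices in one group: a nested reset of 'a' never touches 'b'. */
static void model_nested_reset_example(void)
{
	struct model_gdev a = { 0 }, b = { 0 };

	model_reset_prepare(&a);
	model_reset_prepare(&a);		/* nested re-entry */
	assert(a.blocked && !b.blocked);

	model_reset_done(&a);
	assert(a.blocked);			/* inner done() keeps it blocked */

	model_reset_done(&a);
	assert(!a.blocked);			/* outermost done() unblocks */
}
```

Keeping the state per device rather than per group is what lets one device in
a multi-device group reset without affecting its siblings in this model.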

This is on GitHub:
https://github.com/nicolinc/iommufd/commits/fix_iommu_reset-v8

Note that a concurrent reset of two DMA alias siblings (sharing the same RID)
might prematurely unblock when one device is done while the other is still
resetting, and supporting this case is a bit convoluted. Given that it is
unclear whether real ATS devices might share a RID, for now just add a
warning in done(). Future work can fix it properly if someone hits it.
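
The alias limitation can be illustrated with a toy user-space model. It is a
sketch under assumptions, not the kernel logic: the model_* types and
functions are hypothetical, and the model simply assumes that the blocked
state is shared per RID, so the first done() lifts it for both siblings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: blocked state shared by every device using the same RID. */
struct model_rid {
	bool blocked;
};

/* Hypothetical device that points at its (possibly shared) RID state. */
struct model_alias_dev {
	struct model_rid *rid;
	bool resetting;
};

static void model_alias_prepare(struct model_alias_dev *dev)
{
	dev->resetting = true;
	dev->rid->blocked = true;
}

/*
 * Unblock the RID for 'dev'. Returns true when a sibling sharing the RID is
 * still resetting, i.e. when the unblock is premature and worth a warning.
 */
static bool model_alias_done(struct model_alias_dev *dev,
			     struct model_alias_dev *const *devs, size_t n)
{
	bool premature = false;
	size_t i;

	dev->resetting = false;
	for (i = 0; i < n; i++) {
		if (devs[i] != dev && devs[i]->rid == dev->rid &&
		    devs[i]->resetting)
			premature = true;
	}
	dev->rid->blocked = false;
	return premature;
}
```

In this model, calling model_alias_done() for the first sibling while the
second is still resetting returns true, which corresponds to the situation
the new warning is meant to flag.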

Changelog
v8:
 * Add Reviewed-by tags
 * Fix NULL group->domain in done()
 * Tidy goto cleanup when using guard()
 * Update patch subject and commit message
 * Add warning on premature unblocking in DMA alias cases
 * Drop unreachable skip in __iommu_group_set_domain_internal() error path
v7:
 https://lore.kernel.org/all/cover.1776551790.git.nicolinc@nvidia.com/
 * Add Reviewed-by tags
 * Split v6 into smaller patches
 * Add one patch to fix UAF during detach()
 * Add one patch to fix unnecessary ATS invalidation
v6:
 https://lore.kernel.org/all/20260407194644.171304-1-nicolinc@nvidia.com/
 * Update inline comments and commit message
 * Add "max_pasids > 0" condition in both helpers
v5:
 https://lore.kernel.org/all/20260404050243.141366-1-nicolinc@nvidia.com/
 * Add 'blocked' to fix iommu_driver_get_domain_for_dev() return
v4:
 https://lore.kernel.org/all/20260324014056.36103-1-nicolinc@nvidia.com/
 * Rename 'reset_cnt' to 'recovery_cnt'
v3:
 https://lore.kernel.org/all/20260321223930.10836-1-nicolinc@nvidia.com/
 * Make prepare()/done() per-gdev
 * Use reset_depth to track nested re-entries
 * Replace group->resetting_domain with a reset_cnt
v2:
 https://lore.kernel.org/all/20260319043135.1153534-1-nicolinc@nvidia.com/
 * Fix the helpers by allowing re-entry
v1:
 https://lore.kernel.org/all/20260318220028.1146905-1-nicolinc@nvidia.com/

Nicolin Chen (8):
  iommu: Fix NULL group->domain dereference in
    pci_dev_reset_iommu_done()
  iommu: Fix kdocs of pci_dev_reset_iommu_done()
  iommu: Replace per-group resetting_domain with per-gdev blocked flag
  iommu: Fix pasid attach in pci_dev_reset_iommu_prepare/done()
  iommu: Fix nested pci_dev_reset_iommu_prepare/done()
  iommu: Fix ATS invalidation timeouts during
    __iommu_remove_group_pasid()
  iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset
  iommu: Warn on premature unblock during DMA aliased sibling reset

 drivers/iommu/iommu.c | 223 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 181 insertions(+), 42 deletions(-)

-- 
2.43.0
Re: [PATCH rc v8 0/8] iommu: Fix pci_dev_reset_iommu_prepare/done()
Posted by Jörg Rödel 1 month ago
On Fri, Apr 24, 2026 at 06:15:19PM -0700, Nicolin Chen wrote:
> Nicolin Chen (8):
>   iommu: Fix NULL group->domain dereference in
>     pci_dev_reset_iommu_done()
>   iommu: Fix kdocs of pci_dev_reset_iommu_done()
>   iommu: Replace per-group resetting_domain with per-gdev blocked flag
>   iommu: Fix pasid attach in pci_dev_reset_iommu_prepare/done()
>   iommu: Fix nested pci_dev_reset_iommu_prepare/done()
>   iommu: Fix ATS invalidation timeouts during
>     __iommu_remove_group_pasid()
>   iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset
>   iommu: Warn on premature unblock during DMA aliased sibling reset
> 
>  drivers/iommu/iommu.c | 223 ++++++++++++++++++++++++++++++++++--------
>  1 file changed, 181 insertions(+), 42 deletions(-)

Applied for -rc, thanks.