[PATCH v2 00/14] intel_iommu: Enable PASID support for passthrough device

Zhenzhong Duan posted 14 patches 1 week ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20260326091130.321483-1-zhenzhong.duan@intel.com
Maintainers: Yi Liu <yi.l.liu@intel.com>, Eric Auger <eric.auger@redhat.com>, Zhenzhong Duan <zhenzhong.duan@intel.com>, Peter Maydell <peter.maydell@linaro.org>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, "Clément Mathieu--Drif" <clement.mathieu--drif@bull.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Alex Williamson <alex@shazbot.org>, "Cédric Le Goater" <clg@redhat.com>
Posted by Zhenzhong Duan 1 week ago
Hi,

We already support first-stage translation for passthrough devices,
backed by nested translation in the host, but only for PASID_0.

Structure VTDAddressSpace includes elements suited to emulated devices
and to passthrough devices without PASID, e.g., the address space and
various memory regions, and it is protected by the vtd iommu lock. For a
passthrough device with PASID, all of these are useless and become a
burden.

When one device uses many PASIDs, the AS and MRs are all registered with
the memory core, which hurts the performance of the whole system.

So instead of using VTDAddressSpace to cache the pasid entry for each
pasid of a passthrough device, we define a lightweight structure,
VTDAccelPASIDCacheEntry, with only the elements each pasid needs. We
will pass this struct as a parameter when binding to or unbinding from a
nested hwpt, use it to record the currently bound nested hwpt, and
extend it later for PRQ support. It is also designed to support PASID_0.

The full definition of VTDAccelPASIDCacheEntry may eventually look like:

  typedef struct VTDAccelPASIDCacheEntry {
      VTDHostIOMMUDevice *vtd_hiod;
      VTDPASIDEntry pasid_entry;
      uint32_t pasid;
      uint32_t fs_hwpt_id;
      uint32_t fault_id;
      int fault_fd;
      QLIST_HEAD(, VTDPRQEntry) vtd_prq_list;
      IOMMUPRINotifier pri_notifier_entry;
      IOMMUPRINotifier *pri_notifier;
      QLIST_ENTRY(VTDAccelPASIDCacheEntry) next;
  } VTDAccelPASIDCacheEntry;
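
To illustrate why a per-pasid entry like this is cheap, here is a rough
standalone sketch of a per-device pasid cache kept as a plain linked
list. It is not code from the series: PasidCacheEntry, PasidCache and
the pasid_cache_*() helpers are hypothetical names, the entry keeps only
two of the fields above, and a hand-rolled list stands in for QLIST so
the example builds outside QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for VTDAccelPASIDCacheEntry. */
typedef struct PasidCacheEntry {
    uint32_t pasid;               /* PASID this entry caches */
    uint32_t fs_hwpt_id;          /* id of the nested hwpt bound to it */
    struct PasidCacheEntry *next; /* next entry of the same device */
} PasidCacheEntry;

/* Hypothetical per-device list head. */
typedef struct PasidCache {
    PasidCacheEntry *head;
} PasidCache;

/* Return the entry for @pasid, or NULL if none is cached. */
static PasidCacheEntry *pasid_cache_find(PasidCache *pc, uint32_t pasid)
{
    assert(pc != NULL);
    for (PasidCacheEntry *e = pc->head; e; e = e->next) {
        if (e->pasid == pasid) {
            return e;
        }
    }
    return NULL;
}

/* Return the existing entry for @pasid or insert a zeroed new one. */
static PasidCacheEntry *pasid_cache_add(PasidCache *pc, uint32_t pasid)
{
    PasidCacheEntry *e = pasid_cache_find(pc, pasid);

    if (e) {
        return e;
    }
    e = calloc(1, sizeof(*e));
    if (!e) {
        return NULL;
    }
    e->pasid = pasid;
    e->next = pc->head;
    pc->head = e;
    return e;
}

/* Unlink and free the entry for @pasid; return 1 if found, else 0. */
static int pasid_cache_remove(PasidCache *pc, uint32_t pasid)
{
    assert(pc != NULL);
    for (PasidCacheEntry **pp = &pc->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->pasid == pasid) {
            PasidCacheEntry *victim = *pp;

            *pp = victim->next;
            free(victim);
            return 1;
        }
    }
    return 0;
}
```

Each entry costs one small allocation and a list link; nothing is
registered with the memory core, which is the point of not reusing
VTDAddressSpace here.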

GIT branch: https://github.com/yiliu1765/qemu/tree/zhenzhong/iommufd_pasid

PATCH01-06: Preparatory work
PATCH07-10: Handle PASID entry addition and removal
PATCH11-12: Support pasid binding and unbinding
PATCH13-14: Add PASID-related checks and enable PASID for passthrough devices

This patchset depends on a kernel feature enhancement[1] to work.

Tests:
Tested with a DSA device, whose driver uses 2 PASIDs by default.

Thanks
Zhenzhong

[1] https://lore.kernel.org/all/20260205023405.41583-1-zhenzhong.duan@intel.com/

Changelog:
v2:
- move the check "s->pasid > PCI_EXT_CAP_PASID_MAX_WIDTH" to patch5 (Clement)
- move #include "hw/core/iommu.h" before #include "hw/core/qdev.h" (liuyi)
- polish the comment about @Pasid parameter (Liuyi)
- s/pe/pasid_entry, s/as_it/hiod_it, s/vtd_find_add_pc/vtd_accel_fill_pc (Liuyi)
- s/VTDACCELPASIDCacheEntry/VTDAccelPASIDCacheEntry (Liuyi)
- add explanation in code about PASID removal before addition (Liuyi)
- polish the comment about scope of VTDAccelPASIDCacheEntry vs VTDPASIDCacheEntry (Liuyi)
- add an optimization to bypass PASID entry addition for PASID selective pc_inv_dsc (Liuyi)

v1:
- use naming pattern "XXX_SET_THENAME" same as smmu (Clement)
- fix s->pasid check (Clement)

RFCv2:
- extend attach/detach_hwpt() instead of introducing new callbacks (Shameer)
- Define IOMMU_NO_PASID for device attachment without pasid (Nicolin)
- update vtd_destroy_old_fs_hwpt()'s parameter for naming consistency (Clement)
- check pasid bits size to be no more than 20 bits (Clement)
- initialize local variable max_pasid_log2 to 0 (Cédric)
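
The pasid width checks mentioned in the changelog above could take
roughly the following shape. This is only a sketch under assumptions:
SKETCH_PASID_MAX_WIDTH is a local stand-in for
PCI_EXT_CAP_PASID_MAX_WIDTH (taken to be 20, per the "no more than 20
bits" item), the helper names are hypothetical, and comparing the guest
width against the host's max_pasid_log2 is an assumed policy rather than
a description of what the patches do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCIe limits the PASID width to 20 bits; stand-in for the QEMU macro. */
#define SKETCH_PASID_MAX_WIDTH 20

/* True if the configured vIOMMU pasid width is architecturally valid. */
static bool pasid_width_valid(uint8_t pasid_bits)
{
    return pasid_bits <= SKETCH_PASID_MAX_WIDTH;
}

/*
 * Assumed policy: the guest-visible width must also fit within what the
 * host IOMMU reports for the device (max_pasid_log2).
 */
static bool pasid_width_fits_host(uint8_t guest_bits,
                                  uint8_t host_max_pasid_log2)
{
    return pasid_width_valid(guest_bits) &&
           guest_bits <= host_max_pasid_log2;
}

/* Number of PASIDs addressable with a given valid width. */
static uint32_t pasid_count(uint8_t pasid_bits)
{
    assert(pasid_width_valid(pasid_bits));
    return UINT32_C(1) << pasid_bits;
}
```

With the property stored as a uint8 width instead of a bool, a check of
this kind can reject a width the host cannot back before any pasid is
bound.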


Zhenzhong Duan (14):
  vfio/iommufd: Extend attach/detach_hwpt callback implementations with
    pasid
  iommufd: Extend attach/detach_hwpt callbacks to support pasid
  vfio/iommufd: Create nesting parent hwpt with IOMMU_HWPT_ALLOC_PASID
    flag
  intel_iommu: Create the nested hwpt with IOMMU_HWPT_ALLOC_PASID flag
  intel_iommu: Change pasid property from bool to uint8
  intel_iommu: Export some functions
  intel_iommu_accel: Handle PASID entry addition for pc_inv_dsc request
  intel_iommu_accel: Handle PASID entry removal for pc_inv_dsc request
  intel_iommu_accel: Bypass PASID entry addition for just deleted entry
  intel_iommu_accel: Handle PASID entry removal for system reset
  intel_iommu_accel: Support pasid binding/unbinding and PIOTLB flushing
  intel_iommu_accel: drop _lock suffix in
    vtd_flush_host_piotlb_all_locked()
  intel_iommu_accel: Add pasid bits size check
  intel_iommu: Expose flag VIOMMU_FLAG_PASID_SUPPORTED when configured

 hw/i386/intel_iommu_accel.h    |  34 ++-
 hw/i386/intel_iommu_internal.h |  43 +++-
 include/hw/core/iommu.h        |   2 +
 include/hw/i386/intel_iommu.h  |   4 +-
 include/hw/vfio/vfio-device.h  |   1 +
 include/system/iommufd.h       |  16 +-
 backends/iommufd.c             |   9 +-
 hw/arm/smmuv3-accel.c          |  12 +-
 hw/i386/intel_iommu.c          |  83 +++----
 hw/i386/intel_iommu_accel.c    | 420 +++++++++++++++++++++++++++------
 hw/vfio/device.c               |  11 +
 hw/vfio/iommufd.c              |  56 +++--
 hw/vfio/trace-events           |   4 +-
 13 files changed, 524 insertions(+), 171 deletions(-)

-- 
2.47.3


RE: [PATCH v2 00/14] intel_iommu: Enable PASID support for passthrough device
Posted by Hao, Xudong 6 days, 16 hours ago

> -----Original Message-----
> From: Duan, Zhenzhong <zhenzhong.duan@intel.com>
> Sent: Thursday, March 26, 2026 5:11 PM
> To: qemu-devel@nongnu.org
> Cc: alex@shazbot.org; clg@redhat.com; eric.auger@redhat.com;
> mst@redhat.com; jasowang@redhat.com; jgg@nvidia.com;
> nicolinc@nvidia.com; skolothumtho@nvidia.com; joao.m.martins@oracle.com;
> clement.mathieu--drif@bull.com; Tian, Kevin <kevin.tian@intel.com>; Liu, Yi L
> <yi.l.liu@intel.com>; Hao, Xudong <xudong.hao@intel.com>; Duan, Zhenzhong
> <zhenzhong.duan@intel.com>
> Subject: [PATCH v2 00/14] intel_iommu: Enable PASID support for passthrough
> device

Tested-by: Xudong Hao <xudong.hao@intel.com>

Assigned a DSA PF to a VM with the vIOMMU in scalable mode; dmatest passed in the VM.