[PATCH 00/11] iommufd: Enable noiommu mode for cdev

Jacob Pan posted 11 patches 1 month, 2 weeks ago
There is a newer version of this series
[PATCH 00/11] iommufd: Enable noiommu mode for cdev
Posted by Jacob Pan 1 month, 2 weeks ago
VFIO's unsafe_noiommu_mode has long provided a way for userspace drivers
to operate on platforms lacking a hardware IOMMU. Today, IOMMUFD also
supports No-IOMMU mode for group-based devices under vfio_compat mode.
However, IOMMUFD's native character device (cdev) does not yet support
No-IOMMU mode, which is the purpose of this patch.

In summary, we have:

|-------------------------+------+---------------|
| Device access mode      | VFIO | IOMMUFD       |
|-------------------------+------+---------------|
| group /dev/vfio/$GROUP  | Yes  | Yes           |
|-------------------------+------+---------------|
| cdev /dev/vfio/devices/ | No   | This patch    |
|-------------------------+------+---------------|

Beyond enabling cdev for IOMMUFD, this patch also addresses, as suggested
by Jason [1], the following deficiencies in the current No-IOMMU mode:
- Devices operating under No-IOMMU mode are limited to device-level UAPI
  access, without container or IOAS-level capabilities. Consequently,
  user-space drivers lack structured mechanisms for page pinning and often
  resort to mlock(), which is less robust than pin_user_pages() used for
  devices backed by a physical IOMMU. For example, mlock() does not prevent
  page migration.
- There is no architectural mechanism for obtaining physical addresses for
  DMA. As a workaround, user-space drivers frequently rely on /proc/pagemap
  tricks or hardcoded values.
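
For context, the /proc/pagemap workaround mentioned above usually amounts to
reading one 64-bit entry per virtual page and decoding the page frame number
from it. The sketch below shows only that decoding step. The helper names are
hypothetical, the bit layout (bit 63 = present, bits 0-54 = PFN) follows the
kernel's pagemap documentation, and a PFN of zero is treated as "hidden"
because readers without CAP_SYS_ADMIN get the PFN field zeroed.

```c
#include <stdint.h>

/* One /proc/<pid>/pagemap entry: bit 63 = page present, bits 0-54 = PFN. */
#define PAGEMAP_PRESENT  (1ULL << 63)
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)

/* Byte offset within the pagemap file of the entry that describes vaddr. */
static uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
	return (vaddr / page_size) * sizeof(uint64_t);
}

/*
 * Decode one pagemap entry into a physical address.
 * Returns 0 on success, -1 if the page is not present or the PFN is
 * zero (assumed here to mean the kernel hid it from an unprivileged reader).
 */
static int pagemap_entry_to_phys(uint64_t entry, uint64_t vaddr,
				 uint64_t page_size, uint64_t *phys)
{
	uint64_t pfn = entry & PAGEMAP_PFN_MASK;

	if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
		return -1;

	/* Physical address = frame base plus the offset inside the page. */
	*phys = pfn * page_size + vaddr % page_size;
	return 0;
}
```

A caller would pread() eight bytes at pagemap_offset() from
/proc/self/pagemap and pass them to pagemap_entry_to_phys(). Nothing in this
path pins the page, so the address can go stale if the page is migrated,
which is the robustness problem described above.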

By allowing noiommu device access to IOMMUFD IOAS and HWPT objects, this
patch brings No-IOMMU mode closer to full citizenship within the IOMMU
subsystem. In addition to addressing the two deficiencies mentioned above,
the expectation is that it will also enable No-IOMMU devices to seamlessly
participate in live update sessions via KHO [2].

Furthermore, these devices will use the IOMMUFD-based ownership checking model for
VFIO_DEVICE_PCI_HOT_RESET, eliminating the need for an iommufd_access object
as required in a previous attempt [3].

ChangeLog:

Since RFC[4]:
- Abandoned the dummy iommu driver approach, as patches 1-3 absorb the
  changes into iommufd.

[1] https://lore.kernel.org/linux-iommu/20250603175403.GA407344@nvidia.com/
[2] https://lore.kernel.org/linux-pci/20251027134430.00007e46@linux.microsoft.com/
[3] https://lore.kernel.org/kvm/20230522115751.326947-1-yi.l.liu@intel.com/
[4] https://lore.kernel.org/linux-iommu/20251201173012.18371-1-jacob.pan@linux.microsoft.com/

Thanks,

Jacob

Jacob Pan (8):
  iommufd: Add an ioctl IOMMU_IOAS_GET_PA to query PA from IOVA
  vfio: Allow null group for noiommu without containers
  vfio: Introduce and set noiommu flag on vfio_device
  vfio: Update noiommu device detection logic for cdev
  vfio: Enable cdev noiommu mode under iommufd
  vfio:selftest: Handle VFIO noiommu cdev
  selftests/vfio: Add iommufd noiommu mode selftest for cdev
  Doc: Update VFIO NOIOMMU mode

Jason Gunthorpe (3):
  iommufd: Support a HWPT without an iommu driver for noiommu
  iommufd: Move igroup allocation to a function
  iommufd: Allow binding to a noiommu device

 Documentation/driver-api/vfio.rst             |  44 +-
 drivers/iommu/iommufd/Makefile                |   1 +
 drivers/iommu/iommufd/device.c                | 161 ++++--
 drivers/iommu/iommufd/hw_pagetable.c          |  11 +-
 drivers/iommu/iommufd/hwpt_noiommu.c          |  91 +++
 drivers/iommu/iommufd/io_pagetable.c          |  39 ++
 drivers/iommu/iommufd/ioas.c                  |  22 +
 drivers/iommu/iommufd/iommufd_private.h       |   5 +
 drivers/iommu/iommufd/main.c                  |   3 +
 drivers/vfio/Kconfig                          |   6 +-
 drivers/vfio/group.c                          |  35 +-
 drivers/vfio/iommufd.c                        |   7 -
 drivers/vfio/vfio.h                           |  34 +-
 drivers/vfio/vfio_main.c                      |  22 +-
 include/linux/vfio.h                          |  10 +
 include/uapi/linux/iommufd.h                  |  25 +
 tools/testing/selftests/vfio/Makefile         |   1 +
 .../selftests/vfio/lib/vfio_pci_device.c      |  25 +-
 .../vfio/vfio_iommufd_noiommu_test.c          | 540 ++++++++++++++++++
 19 files changed, 990 insertions(+), 92 deletions(-)
 create mode 100644 drivers/iommu/iommufd/hwpt_noiommu.c
 create mode 100644 tools/testing/selftests/vfio/vfio_iommufd_noiommu_test.c

-- 
2.34.1
Re: [PATCH 00/11] iommufd: Enable noiommu mode for cdev
Posted by Jason Gunthorpe 1 month, 2 weeks ago
On Fri, Feb 27, 2026 at 09:52:36AM -0800, Jacob Pan wrote:
> VFIO's unsafe_noiommu_mode has long provided a way for userspace drivers
> to operate on platforms lacking a hardware IOMMU. Today, IOMMUFD also
> supports No-IOMMU mode for group-based devices under vfio_compat mode.
> However, IOMMUFD's native character device (cdev) does not yet support
> No-IOMMU mode, which is the purpose of this patch.

I browsed through this quickly and it looks OK to me, though I might
suggest correcting that FIXME so that the get pa scans the domain for
contiguous physical address. You can copy the loop from vfio probably.

Also the kbuild error needs fixing, I gave a suggestion for that in
the thread.

Thanks,
Jason
Re: [PATCH 00/11] iommufd: Enable noiommu mode for cdev
Posted by Jacob Pan 1 month, 1 week ago
Hi Jason,

On Mon, 2 Mar 2026 20:35:32 -0400
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Fri, Feb 27, 2026 at 09:52:36AM -0800, Jacob Pan wrote:
> > VFIO's unsafe_noiommu_mode has long provided a way for userspace
> > drivers to operate on platforms lacking a hardware IOMMU. Today,
> > IOMMUFD also supports No-IOMMU mode for group-based devices under
> > vfio_compat mode. However, IOMMUFD's native character device (cdev)
> > does not yet support No-IOMMU mode, which is the purpose of this
> > patch.  
> 
> I browsed through this quickly and it looks OK to me, though I might
> suggest correcting that FIXME so that the get pa scans the domain for
> contiguous physical address. You can copy the loop from vfio probably.
> 
Thanks for the suggestion. I will add the following to v2 and update the
selftest to use hugepages to cover this.

--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -877,10 +877,19 @@ int iopt_get_phys(struct io_pagetable *iopt, unsigned long iova, u64 *paddr,
                goto unlock_exit;
        }
        /*
-        * TBD: we can return contiguous IOVA length so that userspace can
-        * keep searching for next physical address.
+        * Scan the domain for the contiguous physical address length so that
+        * userspace search can be optimized for fewer ioctls.
         */
-       *length = PAGE_SIZE;
+       while (iova < iopt_area_last_iova(area)) {
+               u64 next_paddr = iommu_iova_to_phys(area->storage_domain,
+                                                   iova + PAGE_SIZE);
+               if (!next_paddr || next_paddr != *paddr + PAGE_SIZE) {
+                       *length += PAGE_SIZE;
+                       break;
+               }
+               iova += PAGE_SIZE;
+               *paddr += PAGE_SIZE;
+       }
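
As an illustration of the contiguous-run scan discussed here, the following
is a minimal userspace model, not the v2 kernel code. The lookup callback is
a hypothetical stand-in for iommu_iova_to_phys() (returning 0 for an unmapped
page), PAGE_SZ is assumed to be 4 KiB, iova is assumed page-aligned, and
last_iova is the inclusive last byte of the area. toy_map and toy_lookup
exist only to exercise the loop.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096ULL

/* Hypothetical stand-in for iommu_iova_to_phys(); returns 0 if unmapped. */
typedef uint64_t (*iova_to_phys_fn)(void *ctx, uint64_t iova);

/*
 * Store the physical address backing iova in *paddr and return how many
 * bytes starting there are physically contiguous, without scanning past
 * last_iova. Returns 0 if iova itself is unmapped.
 */
static uint64_t contig_phys_len(iova_to_phys_fn lookup, void *ctx,
				uint64_t iova, uint64_t last_iova,
				uint64_t *paddr)
{
	uint64_t start = lookup(ctx, iova);
	uint64_t len = PAGE_SZ;

	if (!start)
		return 0;
	*paddr = start;

	/* Extend while the next page is inside the area and adjacent. */
	while (last_iova - iova >= PAGE_SZ) {
		uint64_t next = lookup(ctx, iova + PAGE_SZ);

		if (!next || next != start + len)
			break;
		iova += PAGE_SZ;
		len += PAGE_SZ;
	}
	return len;
}

/* Toy table-backed mapping: one physical address per page, 0 = unmapped. */
struct toy_map {
	uint64_t base_iova;
	size_t npages;
	const uint64_t *phys;
};

static uint64_t toy_lookup(void *ctx, uint64_t iova)
{
	const struct toy_map *m = ctx;
	uint64_t idx;

	if (iova < m->base_iova)
		return 0;
	idx = (iova - m->base_iova) / PAGE_SZ;
	return idx < m->npages ? m->phys[idx] : 0;
}
```

The sketch keeps the starting physical address fixed and only grows the
length, so the caller gets back the base of the run together with its size
and can jump straight to the next run with one more call.
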
> Also the kbuild error needs fixing, I gave a suggestion for that in
> the thread.
Will fix in v2.

Thanks again,

Jacob