[PATCH v4 00/27] hw/arm/virt: Add support for user-creatable accelerated SMMUv3

Shameer Kolothum posted 27 patches 1 month, 2 weeks ago
There is a newer version of this series
[PATCH v4 00/27] hw/arm/virt: Add support for user-creatable accelerated SMMUv3
Posted by Shameer Kolothum 1 month, 2 weeks ago
Hi,

Changes from RFCv3:

 -Removed the RFC tag, as the user-creatable SMMUv3 series has now been
  applied [0].
 -Addressed feedback from RFCv3. Thanks to all! (I believe I have addressed
  all comments; apologies if I missed any.)
 -Removed dependency on “at least one cold-plugged vfio-pci device.” The
  accelerated SMMUv3 features are now initialized based on QEMU SMMUv3
  defaults, and each time a device is attached, the host SMMUv3 info is
  retrieved and features are cross-checked.
 -Includes IORT RMR support to enable MSI doorbell address translation.
  Thanks to Eric, this is based on his earlier attempt on DSM #5 and
  IORT RMR support.
 -Added optional properties (like ATS, RIL, etc.) for the user to override
  the default QEMU SMMUv3 features.
 -Deferred batched invalidation of commands for now. This series supports
  basic single in-order command issuing to the host. Batched support will
  be added as a follow-up series.
 -Includes synthesizing PASID capability for the assigned vfio-pci device.
  Thanks to Yi’s effort, this is based on his out-of-tree patches.
 -Added a migration blocker for now. The plan is to enable migration support
  later.
 -Has a dependency (patches 4/5/8) on Zhenzhong's pass-through support
  series [1].
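
The attach-time cross-check mentioned in the change list above can be pictured
roughly as follows. This is only a minimal sketch under stated assumptions: the
names SmmuFeatures and smmu_features_compatible() are hypothetical and do not
come from the series, and the policy shown (every feature the guest-visible
SMMUv3 advertises must also exist on the host) is one plausible reading of
"cross-checked", not necessarily what the patches implement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature summary; field names are illustrative only. */
typedef struct SmmuFeatures {
    bool ats;     /* Address Translation Services */
    bool ril;     /* range invalidation */
    bool pasid;   /* PASID (substream) support */
    uint8_t oas;  /* output address size, in bits */
} SmmuFeatures;

/*
 * Return true if everything the guest-visible SMMUv3 advertises is also
 * available on the host SMMUv3. On failure, *reason names the first
 * mismatching feature; on success it is set to NULL.
 */
static bool smmu_features_compatible(const SmmuFeatures *guest,
                                     const SmmuFeatures *host,
                                     const char **reason)
{
    assert(guest != NULL && host != NULL && reason != NULL);

    *reason = NULL;
    if (guest->ats && !host->ats) {
        *reason = "ATS";
    } else if (guest->ril && !host->ril) {
        *reason = "RIL";
    } else if (guest->pasid && !host->pasid) {
        *reason = "PASID";
    } else if (guest->oas > host->oas) {
        *reason = "OAS";
    }
    return *reason == NULL;
}
```

In this model, the check would run once per device attach, with the guest side
filled from the QEMU SMMUv3 defaults (or the user-supplied properties) and the
host side filled from the retrieved host SMMUv3 info.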

PATCH organization:
 1–20: Enables accelerated SMMUv3 with features based on default QEMU SMMUv3,
       including IORT RMR based MSI support.
 21–23: Adds options for specifying RIL, ATS, and OAS features.
 24–27: Adds PASID support, including VFIO changes.

Tests:
Performed basic sanity tests on an NVIDIA GRACE platform with GPU device
assignments. A CUDA test application was used to verify the SVA use case.
Further tests are always welcome.

E.g., QEMU command line:

qemu-system-aarch64 -machine virt,gic-version=3,highmem-mmio-size=2T \
-cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
-bios QEMU_EFI.fd -object iommufd,id=iommufd0 -enable-kvm \
-object memory-backend-ram,size=8G,id=m0 \
-object memory-backend-ram,size=8G,id=m1 \
-numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
-numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
-device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
-device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.0,accel=on,ats=on,ril=off,pasid=on,oas=48 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,pref64-reserve=512G \
-device vfio-pci,host=0019:06:00.0,rombar=0,id=dev0,iommufd=iommufd0,bus=pcie.port1 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
...
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
-device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
-device arm-smmuv3,primary-bus=pcie.2,id=smmuv3.1,accel=on,ats=on,ril=off,pasid=on \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2,pref64-reserve=512G \
-device vfio-pci,host=0018:06:00.0,rombar=0,id=dev1,iommufd=iommufd0,bus=pcie.port2 \
-device virtio-blk-device,drive=fs \
-drive file=image.qcow2,index=0,media=disk,format=qcow2,if=none,id=fs \
-net none \
-nographic

A complete branch can be found here:
https://github.com/shamiali2008/qemu-master smmuv3-accel-v4

Please take a look and let me know your feedback.

Thanks,
Shameer

[0] https://lore.kernel.org/qemu-devel/20250829082543.7680-1-skolothumtho@nvidia.com/
[1] https://lore.kernel.org/qemu-devel/20250918085803.796942-1-zhenzhong.duan@intel.com/

Details from RFCv3 cover letter:
--------------------------------
This patch series introduces initial support for a user-creatable,
accelerated SMMUv3 device (-device arm-smmuv3,accel=on) in QEMU.

This is based on the user-creatable SMMUv3 device series [0].

Why this is needed:

On ARM, to enable vfio-pci pass-through devices in a VM, the host SMMUv3
must be set up in nested translation mode (Stage 1 + Stage 2), with
Stage 1 (S1) controlled by the guest and Stage 2 (S2) managed by the host.

This series introduces an optional accel property for the SMMUv3 device,
indicating that the guest will try to leverage host SMMUv3 features for
acceleration. By default, enabling accel configures the host SMMUv3 in
nested mode to support vfio-pci pass-through.
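
As a rough mental model of the nested mode described above, the two stages
compose as IOVA -> IPA (stage 1, guest-owned) and IPA -> PA (stage 2,
host-owned). The sketch below is a toy, not QEMU or kernel code: the PageMap
and Stage types and both functions are hypothetical, each stage is a flat
lookup list rather than a real page-table walk, and it ignores the fact that
stage-1 table walks are themselves subject to stage 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12
#define TOY_OFFSET_MASK ((UINT64_C(1) << TOY_PAGE_SHIFT) - 1)

/* One page-granular mapping; a stand-in for a page-table entry. */
typedef struct PageMap {
    uint64_t in_pfn;   /* input page frame number */
    uint64_t out_pfn;  /* output page frame number */
} PageMap;

/* One translation stage, modelled as a flat array of mappings. */
typedef struct Stage {
    const PageMap *maps;
    size_t n;
} Stage;

/* Translate through a single stage; false models a translation fault. */
static bool stage_translate(const Stage *s, uint64_t in, uint64_t *out)
{
    for (size_t i = 0; i < s->n; i++) {
        if (s->maps[i].in_pfn == (in >> TOY_PAGE_SHIFT)) {
            *out = (s->maps[i].out_pfn << TOY_PAGE_SHIFT) |
                   (in & TOY_OFFSET_MASK);
            return true;
        }
    }
    return false;
}

/* Nested translation: the guest's S1 output is the input to the host's S2. */
static bool nested_translate(const Stage *s1, const Stage *s2,
                             uint64_t iova, uint64_t *pa)
{
    uint64_t ipa;

    return stage_translate(s1, iova, &ipa) && stage_translate(s2, ipa, pa);
}
```

A fault in either stage fails the whole translation, which is why the guest
and the host can each manage their own stage independently.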

This new accelerated, user-creatable SMMUv3 device lets you:

 -Set up a VM with multiple SMMUv3s, each tied to a different physical SMMUv3
  on the host. Typically, you’d have multiple PCIe PXB root complexes in the
  VM (one per virtual NUMA node), and each of them can have its own SMMUv3.
  This setup mirrors the host's layout, where each NUMA node has its own
  SMMUv3, and helps build VMs that are more aligned with the host's NUMA
  topology.

 -Reduce invalidation broadcasts and lookups for devices behind different
  physical SMMUv3s, thanks to the host–guest SMMUv3 association.

 -Simplify handling of host SMMUv3s with differing feature sets.

 -Lay the groundwork for additional capabilities like vCMDQ support.
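
To illustrate why a one-to-one association between a virtual and a physical
SMMUv3 cuts down on broadcasts, here is a small sketch. All names (VSmmu,
AssignedDev, invalidate_for_sid) are hypothetical and the model is
deliberately simplistic: each assigned device points at exactly one vSMMU, so
an invalidation for its stream ID is forwarded to that instance only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one virtual SMMUv3 bound to one host SMMUv3. */
typedef struct VSmmu {
    int host_smmu_id;        /* which physical SMMUv3 backs this instance */
    unsigned inv_forwarded;  /* invalidations forwarded to that host SMMUv3 */
} VSmmu;

/* Hypothetical assigned device, identified by its stream ID. */
typedef struct AssignedDev {
    uint32_t sid;
    VSmmu *smmu;             /* the single vSMMU this device sits behind */
} AssignedDev;

/* Forward an invalidation only to the SMMUv3 that owns the device. */
static bool invalidate_for_sid(const AssignedDev *devs, size_t n, uint32_t sid)
{
    assert(devs != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (devs[i].sid == sid) {
            devs[i].smmu->inv_forwarded++;
            return true;
        }
    }
    return false;  /* unknown stream ID: nothing to forward */
}
```

With a single shared vSMMU in front of several physical SMMUv3s, the same
invalidation would instead have to be considered for every host instance.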

Eric Auger (3):
  acpi/gpex: Fix PCI Express Slot Information function 0 returned value
  hw/pci-host/gpex: Allow to generate preserve boot config DSM #5
  hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI nested
    binding

Nicolin Chen (5):
  backends/iommufd: Introduce iommufd_backend_alloc_viommu
  backends/iommufd: Introduce iommufd_vdev_alloc
  hw/arm/smmuv3-accel: Add set/unset_iommu_device callback
  hw/arm/smmuv3-accel: Support nested STE install/uninstall support
  hw/arm/smmuv3-accel: Allocate a vDEVICE object for device

Shameer Kolothum (18):
  hw/arm/smmu-common: Factor out common helper functions and export
  hw/arm/smmu-common:Make iommu ops part of SMMUState
  hw/arm/smmuv3-accel: Introduce smmuv3 accel device
  hw/arm/smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints
    with iommufd
  hw/arm/smmuv3: Implement get_viommu_cap() callback
  hw/pci/pci: Introduce optional get_msi_address_space() callback
  hw/arm/smmuv3-accel: Make use of get_msi_address_space() callback
  hw/arm/smmuv3-accel: Add support to issue invalidation cmd to host
  hw/arm/smmuv3-accel: Get host SMMUv3 hw info and validate
  hw/arm/virt: Set PCI preserve_config for accel SMMUv3
  hw/arm/smmuv3-accel: Install S1 bypass hwpt on reset
  hw/arm/smmuv3: Add accel property for SMMUv3 device
  hw/arm/smmuv3-accel: Add a property to specify RIL support
  hw/arm/smmuv3-accel: Add support for ATS
  hw/arm/smmuv3-accel: Add property to specify OAS bits
  backends/iommufd: Retrieve PASID width from
    iommufd_backend_get_device_info()
  backends/iommufd: Add a callback helper to retrieve PASID support
  hw.arm/smmuv3: Add support for PASID enable

Yi Liu (1):
  vfio: Synthesize vPASID capability to VM

 backends/iommufd.c                  |  68 ++-
 backends/trace-events               |   2 +
 hw/arm/Kconfig                      |   5 +
 hw/arm/meson.build                  |   3 +-
 hw/arm/smmu-common.c                |  51 +-
 hw/arm/smmuv3-accel.c               | 726 ++++++++++++++++++++++++++++
 hw/arm/smmuv3-accel.h               |  83 ++++
 hw/arm/smmuv3-internal.h            |   7 +-
 hw/arm/smmuv3.c                     | 145 +++++-
 hw/arm/trace-events                 |   5 +
 hw/arm/virt-acpi-build.c            |  93 +++-
 hw/arm/virt.c                       |  36 +-
 hw/pci-bridge/pci_expander_bridge.c |   1 -
 hw/pci-host/gpex-acpi.c             |  29 +-
 hw/pci/pci.c                        |  19 +
 hw/vfio/iommufd.c                   |   7 +-
 hw/vfio/pci.c                       |  31 ++
 include/hw/arm/smmu-common.h        |   7 +
 include/hw/arm/smmuv3.h             |   9 +
 include/hw/arm/virt.h               |   1 +
 include/hw/iommu.h                  |   1 +
 include/hw/pci-host/gpex.h          |   1 +
 include/hw/pci/pci.h                |  16 +
 include/hw/pci/pci_bridge.h         |   1 +
 include/system/host_iommu_device.h  |  14 +
 include/system/iommufd.h            |  29 +-
 target/arm/kvm.c                    |   2 +-
 27 files changed, 1335 insertions(+), 57 deletions(-)
 create mode 100644 hw/arm/smmuv3-accel.c
 create mode 100644 hw/arm/smmuv3-accel.h

-- 
2.43.0


Re: [PATCH v4 00/27] hw/arm/virt: Add support for user-creatable accelerated SMMUv3
Posted by Zhangfei Gao 4 weeks ago
Hi, Shameer

On Mon, 29 Sept 2025 at 21:39, Shameer Kolothum <skolothumtho@nvidia.com> wrote:
>
> Hi,
>
> Changes from RFCv3:
>
>  -Removed the RFC tag, as the user-creatable SMMUv3 series has now been
>   applied [0].
>  -Addressed feedback from RFCv3. Thanks to all! (I believe I have addressed
>   all comments; apologies if I missed any.)
>  -Removed dependency on “at least one cold-plugged vfio-pci device.” The
>   accelerated SMMUv3 features are now initialized based on QEMU SMMUv3
>   defaults, and each time a device is attached, the host SMMUv3 info is
>   retrieved and features are cross-checked.
>  -Includes IORT RMR support to enable MSI doorbell address translation.
>   Thanks to Eric, this is based on his earlier attempt on DSM #5 and
>   IORT RMR support.
>  -Added optional properties (like ATS, RIL, etc.) for the user to override
>   the default QEMU SMMUv3 features.
>  -Deferred batched invalidation of commands for now. This series supports
>   basic single in-order command issuing to the host. Batched support will
>   be added as a follow-up series.
>  -Includes synthesizing PASID capability for the assigned vfio-pci device.
>   Thanks to Yi’s effort, this is based on his out-of-tree patches.
>  -Added a migration blocker for now. The plan is to enable migration support
>   later.
>  -Has a dependency (patches 4/5/8) on Zhenzhong's pass-through support
>   series [1].
>
> PATCH organization:
>  1–20: Enables accelerated SMMUv3 with features based on default QEMU SMMUv3,
>        including IORT RMR based MSI support.
>  21–23: Adds options for specifying RIL, ATS, and OAS features.
>  24–27: Adds PASID support, including VFIO changes.
>
> Tests:
> Performed basic sanity tests on an NVIDIA GRACE platform with GPU device
> assignments. A CUDA test application was used to verify the SVA use case.
> Further tests are always welcome.
>
> E.g., QEMU command line:
>
> qemu-system-aarch64 -machine virt,gic-version=3,highmem-mmio-size=2T \
> -cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
> -bios QEMU_EFI.fd -object iommufd,id=iommufd0 -enable-kvm \
> -object memory-backend-ram,size=8G,id=m0 \
> -object memory-backend-ram,size=8G,id=m1 \
> -numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
> -numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
> -numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
> -device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
> -device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.0,accel=on,ats=on,ril=off,pasid=on,oas=48 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,pref64-reserve=512G \
> -device vfio-pci,host=0019:06:00.0,rombar=0,id=dev0,iommufd=iommufd0,bus=pcie.port1 \
> -object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
> ...
> -object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
> -device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
> -device arm-smmuv3,primary-bus=pcie.2,id=smmuv3.1,accel=on,ats=on,ril=off,pasid=on \
> -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2,pref64-reserve=512G \
> -device vfio-pci,host=0018:06:00.0,rombar=0,id=dev1,iommufd=iommufd0,bus=pcie.port2 \
> -device virtio-blk-device,drive=fs \
> -drive file=image.qcow2,index=0,media=disk,format=qcow2,if=none,id=fs \
> -net none \
> -nographic
>
> A complete branch can be found here:
> https://github.com/shamiali2008/qemu-master smmuv3-accel-v4


I have tested this series with stall enabled.
 https://github.com/Linaro/qemu/pull/new/10.1.50-wip

Tested-by:  Zhangfei Gao <zhangfei.gao@linaro.org>

By the way, the stall feature requires some additional patches,
including page fault handling.
Shall we handle that after this series?

Thanks
RE: [PATCH v4 00/27] hw/arm/virt: Add support for user-creatable accelerated SMMUv3
Posted by Shameer Kolothum 4 weeks ago
Hi Zhangfei,

> -----Original Message-----
> From: Zhangfei Gao <zhangfei.gao@linaro.org>
> Sent: 17 October 2025 07:25
> To: Shameer Kolothum <skolothumtho@nvidia.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; Jason Gunthorpe
> <jgg@nvidia.com>; Nicolin Chen <nicolinc@nvidia.com>;
> ddutile@redhat.com; berrange@redhat.com; Nathan Chen
> <nathanc@nvidia.com>; Matt Ochs <mochs@nvidia.com>;
> smostafa@google.com; wangzhou1@hisilicon.com;
> jiangkunkun@huawei.com; jonathan.cameron@huawei.com;
> zhenzhong.duan@intel.com; yi.l.liu@intel.com;
> shameerkolothum@gmail.com
> Subject: Re: [PATCH v4 00/27] hw/arm/virt: Add support for user-creatable
> accelerated SMMUv3
> 
 
> I have tested this series with stall enabled.
 
> Tested-by:  Zhangfei Gao <zhangfei.gao@linaro.org>

Thanks for that.
 
> By the way, the stall feature requires some additional patches, including page
> fault handling.
> Shall we handle that after this series?

Yes. I am working on v5 of the series, addressing the comments and feedback
received so far. STALL can be enabled in a follow-up series, as it is not that
straightforward 😊

Thanks,
Shameer