Hi,
Changes from v4:
https://lore.kernel.org/qemu-devel/20250929133643.38961-1-skolothumtho@nvidia.com/
- Addressed feedback from v4 and picked up R-by and T-by tags.
Thanks to all!
- Split out the _DSM fix into a separate mini series which has
already been sent out [0].
- Introduced a global shared address space aliasing to the system
address space instead of directly using "address_space_memory" in the
get_address_space() callback(patch #6).
- Fixed pci_find_device() returning NULL in the get_address_space()
path (Patch #7).
- Introduced an optional supports_address_space() callback for
rejecting devices attached to a vIOMMU (Patch #8). This allows us
to reject emulated endpoints when using SMMUv3 with accel=on.
- Added BIOS table tests for the IORT revision change.
- Added support to install vSTE based on SMMUv3 GBPA (Patch #14).
- Factored out ID register initialization from the reset path so
that it can be used early in the SMMUv3 accel path for HW
compatibility checks (Patch #18).
- GBPA-based vSTE update depends on Nicolin's kernel patch [1].
- VFIO/IOMMUFD has dependency on Zhenzhong's patches: 4/5/8 from the
pass-through support series [3].
PATCH organization:
1–25: Enables accelerated SMMUv3 with features based on default QEMU SMMUv3,
including IORT RMR based MSI support.
26–28: Adds options for specifying RIL, ATS, and OAS features.
29–32: Adds PASID support, including VFIO changes.
Tests:
Performed basic sanity tests on an NVIDIA GRACE platform with GPU device
assignments. A CUDA test application was used to verify the SVA use case.
Further tests are always welcome.
Eg: Qemu Cmd line:
qemu-system-aarch64 -machine virt,gic-version=3,highmem-mmio-size=2T \
-cpu host -smp cpus=4 -m size=16G,slots=2,maxmem=66G -nographic \
-bios QEMU_EFI.fd -object iommufd,id=iommufd0 -enable-kvm \
-object memory-backend-ram,size=8G,id=m0 \
-object memory-backend-ram,size=8G,id=m1 \
-numa node,memdev=m0,cpus=0-3,nodeid=0 -numa node,memdev=m1,nodeid=1 \
-numa node,nodeid=2 -numa node,nodeid=3 -numa node,nodeid=4 -numa node,nodeid=5 \
-numa node,nodeid=6 -numa node,nodeid=7 -numa node,nodeid=8 -numa node,nodeid=9 \
-device pxb-pcie,id=pcie.1,bus_nr=1,bus=pcie.0 \
-device arm-smmuv3,primary-bus=pcie.1,id=smmuv3.0,accel=on,ats=on,ril=off,pasid=on,oas=48 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1,pref64-reserve=512G,id=dev0 \
-device vfio-pci,host=0019:06:00.0,rombar=0,id=dev0,iommufd=iommufd0,bus=pcie.port1 \
-object acpi-generic-initiator,id=gi0,pci-dev=dev0,node=2 \
...
-object acpi-generic-initiator,id=gi7,pci-dev=dev0,node=9 \
-device pxb-pcie,id=pcie.2,bus_nr=8,bus=pcie.0 \
-device arm-smmuv3,primary-bus=pcie.2,id=smmuv3.1,accel=on,ats=on,ril=off,pasid=on \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2,pref64-reserve=512G \
-device vfio-pci,host=0018:06:00.0,rombar=0,id=dev1,iommufd=iommufd0,bus=pcie.port2 \
-device virtio-blk-device,drive=fs \
-drive file=image.qcow2,index=0,media=disk,format=qcow2,if=none,id=fs \
-net none \
-nographic
A complete branch can be found here,
https://github.com/shamiali2008/qemu-master master-smmuv3-accel-v5
Please take a look and let me know your feedback.
Thanks,
Shameer
[0] https://lore.kernel.org/qemu-devel/20251022080639.243965-1-skolothumtho@nvidia.com/
[1] https://lore.kernel.org/linux-iommu/20251024040551.1711281-1-nicolinc@nvidia.com/
[2] https://lore.kernel.org/qemu-devel/20251024084349.102322-1-zhenzhong.duan@intel.com/
Details from RFCv3 Cover letter:
-------------------------------
https://lore.kernel.org/qemu-devel/20250714155941.22176-1-shameerali.kolothum.thodi@huawei.com/
This patch series introduces initial support for a user-creatable,
accelerated SMMUv3 device (-device arm-smmuv3,accel=on) in QEMU.
This is based on the user-creatable SMMUv3 device series [0].
Why this is needed:
On ARM, to enable vfio-pci pass-through devices in a VM, the host SMMUv3
must be set up in nested translation mode (Stage 1 + Stage 2), with
Stage 1 (S1) controlled by the guest and Stage 2 (S2) managed by the host.
This series introduces an optional accel property for the SMMUv3 device,
indicating that the guest will try to leverage host SMMUv3 features for
acceleration. By default, enabling accel configures the host SMMUv3 in
nested mode to support vfio-pci pass-through.
This new accelerated, user-creatable SMMUv3 device lets you:
-Set up a VM with multiple SMMUv3s, each tied to a different physical SMMUv3
on the host. Typically, you’d have multiple PCIe PXB root complexes in the
VM (one per virtual NUMA node), and each of them can have its own SMMUv3.
This setup mirrors the host's layout, where each NUMA node has its own
SMMUv3, and helps build VMs that are more aligned with the host's NUMA
topology.
-The host–guest SMMUv3 association results in reduced invalidation broadcasts
and lookups for devices behind different physical SMMUv3s.
-Simplifies handling of host SMMUv3s with differing feature sets.
-Lays the groundwork for additional capabilities like vCMDQ support.
-------------------------------
Eric Auger (2):
hw/pci-host/gpex: Allow to generate preserve boot config DSM #5
hw/arm/virt-acpi-build: Add IORT RMR regions to handle MSI nested
binding
Nicolin Chen (4):
backends/iommufd: Introduce iommufd_backend_alloc_viommu
backends/iommufd: Introduce iommufd_backend_alloc_vdev
hw/arm/smmuv3-accel: Add set/unset_iommu_device callback
hw/arm/smmuv3-accel: Add nested vSTE install/uninstall support
Shameer Kolothum (25):
hw/arm/smmu-common: Factor out common helper functions and export
hw/arm/smmu-common: Make iommu ops part of SMMUState
hw/arm/smmuv3-accel: Introduce smmuv3 accel device
hw/arm/smmuv3-accel: Initialize shared system address space
hw/pci/pci: Move pci_init_bus_master() after adding device to bus
hw/pci/pci: Add optional supports_address_space() callback
hw/pci-bridge/pci_expander_bridge: Move TYPE_PXB_PCIE_DEV to header
hw/arm/smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints
with iommufd
hw/arm/smmuv3: Implement get_viommu_cap() callback
hw/arm/smmuv3-accel: Install SMMUv3 GBPA based hwpt
hw/pci/pci: Introduce optional get_msi_address_space() callback
hw/arm/smmuv3-accel: Make use of get_msi_address_space() callback
hw/arm/smmuv3-accel: Add support to issue invalidation cmd to host
hw/arm/smmuv3: Initialize ID registers early during realize()
hw/arm/smmuv3-accel: Get host SMMUv3 hw info and validate
hw/arm/virt: Set PCI preserve_config for accel SMMUv3
tests/qtest/bios-tables-test: Prepare for IORT revison upgrade
tests/qtest/bios-tables-test: Update IORT blobs after revision upgrade
hw/arm/smmuv3: Add accel property for SMMUv3 device
hw/arm/smmuv3-accel: Add a property to specify RIL support
hw/arm/smmuv3-accel: Add support for ATS
hw/arm/smmuv3-accel: Add property to specify OAS bits
backends/iommufd: Retrieve PASID width from
iommufd_backend_get_device_info()
Extend get_cap() callback to support PASID
hw/arm/smmuv3-accel: Add support for PASID enable
Yi Liu (1):
vfio: Synthesize vPASID capability to VM
backends/iommufd.c | 77 +-
backends/trace-events | 2 +
hw/arm/Kconfig | 5 +
hw/arm/meson.build | 3 +-
hw/arm/smmu-common.c | 51 +-
hw/arm/smmuv3-accel.c | 763 ++++++++++++++++++
hw/arm/smmuv3-accel.h | 92 +++
hw/arm/smmuv3-internal.h | 29 +-
hw/arm/smmuv3.c | 161 +++-
hw/arm/trace-events | 6 +
hw/arm/virt-acpi-build.c | 128 ++-
hw/arm/virt.c | 31 +-
hw/i386/intel_iommu.c | 5 +-
hw/pci-bridge/pci_expander_bridge.c | 1 -
hw/pci-host/gpex-acpi.c | 29 +-
hw/pci/pci.c | 44 +-
hw/vfio/container-legacy.c | 8 +-
hw/vfio/iommufd.c | 7 +-
hw/vfio/pci.c | 37 +
include/hw/arm/smmu-common.h | 7 +
include/hw/arm/smmuv3.h | 9 +
include/hw/arm/virt.h | 1 +
include/hw/iommu.h | 1 +
include/hw/pci-host/gpex.h | 1 +
include/hw/pci/pci.h | 33 +
include/hw/pci/pci_bridge.h | 1 +
include/system/host_iommu_device.h | 17 +-
include/system/iommufd.h | 29 +-
target/arm/kvm.c | 2 +-
tests/data/acpi/aarch64/virt/IORT | Bin 128 -> 128 bytes
tests/data/acpi/aarch64/virt/IORT.its_off | Bin 172 -> 172 bytes
tests/data/acpi/aarch64/virt/IORT.smmuv3-dev | Bin 364 -> 364 bytes
.../data/acpi/aarch64/virt/IORT.smmuv3-legacy | Bin 276 -> 276 bytes
33 files changed, 1506 insertions(+), 74 deletions(-)
create mode 100644 hw/arm/smmuv3-accel.c
create mode 100644 hw/arm/smmuv3-accel.h
--
2.43.0