Hi Shameer,
On 12/10/25 2:37 PM, Shameer Kolothum wrote:
> Hi,
>
> This RFC series adds initial support for NVIDIA Tegra241 CMDQV
> (Command Queue Virtualisation), an extension to ARM SMMUv3 that
> provides hardware accelerated virtual command queues (VCMDQs) for
> guests. CMDQV allows guests to issue SMMU invalidation commands
> directly to hardware without VM exits, significantly reducing TLBI
> overhead.
>
> Thanks to Nicolin for the initial patches and testing on which this RFC
> is based.
>
> This is based on v6[0] of the SMMUv3 accel series, which is still under
> review, though nearing convergence. This is sent as an RFC, with the goal
> of gathering early feedback on the CMDQV design and its integration with
> the SMMUv3 acceleration path.
>
> Background:
>
> Tegra241 CMDQV extends SMMUv3 by allocating per-VM "virtual interfaces"
> (VINTFs), each hosting up to 128 VCMDQs.
>
> Each VINTF exposes two 64KB MMIO pages:
> - Page0 – guest owned control and status registers (directly mapped
> into the VM)
> - Page1 – queue configuration registers (trapped/emulated by QEMU)
>
> Unlike the standard SMMU CMDQ, a guest owned Tegra241 VCMDQ does not
> support the full command set. Only a subset, primarily invalidation
> related commands, is accepted by the CMDQV hardware. For this reason,
> a distinct CMDQV device must be exposed to the guest, and the guest OS
> must include a Tegra241 CMDQV aware driver to take advantage of the
> hardware acceleration.
>
> VCMDQ support is integrated via the IOMMU_HW_QUEUE_ALLOC mechanism,
> allowing QEMU to attach guest configured VCMDQ buffers to the
> underlying CMDQV hardware through IOMMUFD. The Linux kernel already
> supports the full CMDQV virtualisation model via IOMMUFD[1].
>
> Summary of QEMU changes:
>
> - Integrated into the existing SMMUv3 accel path via a
> "tegra241-cmdqv" property.
> - Support for allocating vIOMMU objects of type
> IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV.
> - Mapping and emulation of the CMDQV MMIO register layout.
> - VCMDQ/VINTF read/write handling and queue allocation using IOMMUFD
> APIs.
> - Reset and initialisation hooks, including checks for at least one
> cold-plugged device.
> - CMDQV hardware reads guest queue memory using host physical addresses
> provided through IOMMUFD, which requires that the VCMDQ buffer be
> physically contiguous not only in guest PA space but also in host
> PA space. When Tegra241 CMDQV is enabled, QEMU must therefore only
> expose a CMDQV size that the host can reliably back with contiguous
> physical memory. Because of this constraint, it is suggested to use
> huge pages to back the guest RAM.
> - ACPI DSDT node generation for CMDQV devices on the virt machine.
>
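To make the contiguity constraint above concrete, here is a rough sketch of how a queue size limit could be derived from the backing page size. It is only an illustration, not the code from this series: the function names and the hw_max_log2size parameter are hypothetical, and it assumes 16-byte command queue entries and a queue that must sit entirely inside one host backing page.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: each command queue entry is 16 bytes. */
#define CMDQ_ENTRY_SIZE 16u

/*
 * Hypothetical helper: largest log2(number of entries) such that the
 * whole queue fits in one backing page of backing_page_size bytes,
 * capped at hw_max_log2size. Returns -1 if the page size is not a
 * power of two or is smaller than a single entry.
 */
static int vcmdq_max_log2size(uint64_t backing_page_size, int hw_max_log2size)
{
    int order = 0;
    uint64_t entries;

    if (backing_page_size < CMDQ_ENTRY_SIZE ||
        (backing_page_size & (backing_page_size - 1)) != 0) {
        return -1;
    }

    entries = backing_page_size / CMDQ_ENTRY_SIZE;
    while (entries > 1) {
        entries >>= 1;
        order++;
    }
    return order < hw_max_log2size ? order : hw_max_log2size;
}

/*
 * Hypothetical helper: true if a queue of 2^log2size entries starting at
 * guest PA base does not cross a backing page boundary, so a single host
 * page can back it contiguously. page_size must be a power of two.
 */
static bool vcmdq_fits_in_page(uint64_t base, int log2size, uint64_t page_size)
{
    uint64_t bytes = ((uint64_t)1 << log2size) * CMDQ_ENTRY_SIZE;
    uint64_t first = base & ~(page_size - 1);
    uint64_t last = (base + bytes - 1) & ~(page_size - 1);

    return first == last;
}
```

Under these assumptions, 4KB backing pages would limit a queue to 256 entries, while 2MB huge pages would allow 2^17 entries before any hardware cap applies, which illustrates why huge pages are the suggested backing.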
> These patches have been sanity tested on NVIDIA Grace platforms.
>
> ToDo / revisit:
> - Prevent hot-unplug of the last device associated with vIOMMU as
> this might allow associating a different host SMMU/CMDQV.
> - Locking requirements around error event propagation.
>
> Feedback and testing are very welcome.
>
> Thanks,
> Shameer
> [0] https://lore.kernel.org/qemu-devel/20251120132213.56581-1-skolothumtho@nvidia.com/
> [1] https://lore.kernel.org/all/cover.1752126748.git.nicolinc@nvidia.com/
Do you have a branch to share with all the bits?
Thanks
Eric
>
> Nicolin Chen (12):
> backends/iommufd: Update iommufd_backend_get_device_info
> backends/iommufd: Update iommufd_backend_alloc_viommu to allow user
> ptr
> backends/iommufd: Introduce iommufd_backend_alloc_hw_queue
> backends/iommufd: Introduce iommufd_backend_viommu_mmap
> hw/arm/tegra241-cmdqv: Add initial Tegra241 CMDQ-Virtualisation
> support
> hw/arm/tegra241-cmdqv: Map VINTF Page0 into guest
> hw/arm/tegra241-cmdqv: Add read emulation support for registers
> system/physmem: Add helper to check whether a guest PA maps to RAM
> hw/arm/tegra241-cmdqv: Add write emulation for registers
> hw/arm/tegra241-cmdqv: Add reset handler
> hw/arm/tegra241-cmdqv: Limit queue size based on backend page size
> hw/arm/virt-acpi: Advertise Tegra241 CMDQV nodes in DSDT
>
> Shameer Kolothum (4):
> hw/arm/tegra241-cmdqv: Allocate vEVENTQ object
> hw/arm/tegra241-cmdqv: Read and propagate Tegra241 CMDQV errors
> virt-acpi-build: Rename AcpiIortSMMUv3Dev to AcpiSMMUv3Dev
> hw/arm/smmuv3: Add tegra241-cmdqv property for SMMUv3 device
>
> backends/iommufd.c | 65 ++++
> backends/trace-events | 2 +
> hw/arm/Kconfig | 5 +
> hw/arm/meson.build | 1 +
> hw/arm/smmuv3-accel.c | 16 +-
> hw/arm/smmuv3.c | 18 +
> hw/arm/tegra241-cmdqv.c | 759 ++++++++++++++++++++++++++++++++++++++
> hw/arm/tegra241-cmdqv.h | 337 +++++++++++++++++
> hw/arm/trace-events | 5 +
> hw/arm/virt-acpi-build.c | 110 +++++-
> hw/vfio/iommufd.c | 6 +-
> include/exec/cpu-common.h | 2 +
> include/hw/arm/smmuv3.h | 3 +
> include/hw/arm/virt.h | 2 +
> include/system/iommufd.h | 16 +
> system/physmem.c | 12 +
> 16 files changed, 1332 insertions(+), 27 deletions(-)
> create mode 100644 hw/arm/tegra241-cmdqv.c
> create mode 100644 hw/arm/tegra241-cmdqv.h
>