[PATCH v5 10/32] hw/arm/smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints with iommufd

Posted by Shameer Kolothum 1 week, 6 days ago
Accelerated SMMUv3 is only meaningful when a device can leverage the
host SMMUv3 in nested mode (S1+S2 translation). To keep the model
consistent and correct, this mode is restricted to vfio-pci endpoint
devices using the iommufd backend.

Non-endpoint emulated devices such as PCIe root ports and bridges are
also permitted so that vfio-pci devices can be attached beneath them.
All other device types are unsupported in accelerated mode.

Implement the supports_address_space() callback to reject all such
unsupported devices.
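
As a rough illustration of the policy described above, here is a minimal
standalone C sketch. It is not the code in this patch: DevKind, DevInfo and
accel_reject_reason() are hypothetical names that stand in for the QOM type
checks, and they exist only to show the three possible outcomes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device kinds; the actual patch uses QOM type casts instead. */
typedef enum {
    DEV_BRIDGE_OR_ROOT_PORT,  /* non-endpoint: bridge, root port, expander */
    DEV_VFIO_PCI,             /* passthrough endpoint */
    DEV_EMULATED_ENDPOINT,    /* any other emulated endpoint */
} DevKind;

typedef struct {
    DevKind kind;
    bool has_iommufd;         /* only meaningful for DEV_VFIO_PCI */
} DevInfo;

/* Return NULL if the device is allowed, otherwise a short reason string. */
const char *accel_reject_reason(const DevInfo *dev)
{
    assert(dev != NULL);

    switch (dev->kind) {
    case DEV_BRIDGE_OR_ROOT_PORT:
        /* Allowed so that vfio-pci devices can sit behind it. */
        return NULL;
    case DEV_VFIO_PCI:
        /* Allowed only when backed by iommufd. */
        return dev->has_iommufd
               ? NULL
               : "vfio-pci endpoint without an iommufd backend";
    case DEV_EMULATED_ENDPOINT:
        return "emulated endpoint";
    }
    return "unknown device kind";
}
```

The sketch returns a reason string rather than a bare bool so that a caller
can report why a device was rejected, which is the role the two distinct
error messages play in the patch.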

This restriction also avoids complications with IOTLB invalidations.
Some TLBI commands (e.g. CMD_TLBI_NH_ASID) lack an associated SID,
making it difficult to trace the originating device. Allowing emulated
endpoints would require invalidating both QEMU’s software IOTLB and the
host’s hardware IOTLB, which can significantly degrade performance.

For vfio-pci devices in nested mode, get_address_space() returns an
address space aliased to the system address space so that the VFIO core
can set up the correct stage-2 mappings for guest RAM.

In summary:
 - vfio-pci devices (with iommufd as backend) return an address space
   aliased to the system address space.
 - bridges and root ports return the IOMMU address space.
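
The summary above can be sketched as a small standalone C model. This is an
illustration under stated assumptions, not the patch itself: AddrSpace, Dev,
shared_sysmem_as and pick_address_space() are hypothetical stand-ins for
QEMU's AddressSpace, the per-device SMMU state, the shared system-memory
alias and the get_address_space() callback.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for QEMU's AddressSpace; only identity matters in this sketch. */
typedef struct {
    const char *name;
} AddrSpace;

/* Hypothetical per-device state. */
typedef struct {
    bool is_vfio_pci;
    bool has_iommufd;
    AddrSpace iommu_as;       /* per-device IOMMU address space */
} Dev;

/* One alias of system memory, shared by every vfio-pci device. */
AddrSpace shared_sysmem_as = { "sysmem-alias" };

AddrSpace *pick_address_space(Dev *dev)
{
    if (dev->is_vfio_pci) {
        /* Devices without iommufd are assumed to be rejected earlier. */
        assert(dev->has_iommufd);
        return &shared_sysmem_as;
    }
    /* Bridges and root ports keep their own IOMMU address space. */
    return &dev->iommu_as;
}
```

Because every vfio-pci device receives the same pointer in this model, code
that keys its stage-2 setup on the address space sees a single object to
back, which is the sharing property the commit message relies on.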

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
---
 hw/arm/smmuv3-accel.c | 66 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index f62b6cf2c9..550a0496fe 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -7,8 +7,13 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/error-report.h"
 
 #include "hw/arm/smmuv3.h"
+#include "hw/pci/pci_bridge.h"
+#include "hw/pci-host/gpex.h"
+#include "hw/vfio/pci.h"
+
 #include "smmuv3-accel.h"
 
 /*
@@ -38,6 +43,41 @@ static SMMUv3AccelDevice *smmuv3_accel_get_dev(SMMUState *bs, SMMUPciBus *sbus,
     return accel_dev;
 }
 
+static bool smmuv3_accel_pdev_allowed(PCIDevice *pdev, bool *vfio_pci)
+{
+
+    if (object_dynamic_cast(OBJECT(pdev), TYPE_PCI_BRIDGE) ||
+        object_dynamic_cast(OBJECT(pdev), TYPE_PXB_PCIE_DEV) ||
+        object_dynamic_cast(OBJECT(pdev), TYPE_GPEX_ROOT_DEVICE)) {
+        return true;
+    } else if ((object_dynamic_cast(OBJECT(pdev), TYPE_VFIO_PCI))) {
+        *vfio_pci = true;
+        if (object_property_get_link(OBJECT(pdev), "iommufd", NULL)) {
+            return true;
+        }
+    }
+    return false;
+}
+
+static bool smmuv3_accel_supports_as(PCIBus *bus, void *opaque, int devfn,
+                                     Error **errp)
+{
+    PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
+    bool vfio_pci = false;
+
+    if (pdev && !smmuv3_accel_pdev_allowed(pdev, &vfio_pci)) {
+        if (vfio_pci) {
+            error_setg(errp, "vfio-pci endpoint devices without an iommufd "
+                       "backend not allowed when using arm-smmuv3,accel=on");
+
+        } else {
+            error_setg(errp, "Emulated endpoint devices are not allowed when "
+                       "using arm-smmuv3,accel=on");
+        }
+        return false;
+    }
+    return true;
+}
 /*
  * Find or add an address space for the given PCI device.
  *
@@ -48,15 +88,39 @@ static SMMUv3AccelDevice *smmuv3_accel_get_dev(SMMUState *bs, SMMUPciBus *sbus,
 static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
                                               int devfn)
 {
+    PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
     SMMUState *bs = opaque;
     SMMUPciBus *sbus = smmu_get_sbus(bs, bus);
     SMMUv3AccelDevice *accel_dev = smmuv3_accel_get_dev(bs, sbus, bus, devfn);
     SMMUDevice *sdev = &accel_dev->sdev;
+    bool vfio_pci = false;
 
-    return &sdev->as;
+    if (pdev && !smmuv3_accel_pdev_allowed(pdev, &vfio_pci)) {
+        /* Should never be here: supports_address_space() filters these out */
+        g_assert_not_reached();
+    }
+
+    /*
+     * In the accelerated mode, a vfio-pci device attached via the iommufd
+     * backend must remain in the system address space. Such a device is
+     * always translated by its physical SMMU (using either a stage-2-only
+     * STE or a nested STE), where the parent stage-2 page table is allocated
+     * by the VFIO core to back the system address space.
+     *
+     * Return the shared_as_sysmem aliased to the global system memory in this
+     * case. Sharing address_space_memory also allows devices under different
+     * vSMMU instances in the same VM to reuse a single nesting parent HWPT in
+     * the VFIO core.
+     */
+    if (vfio_pci) {
+        return shared_as_sysmem;
+    } else {
+        return &sdev->as;
+    }
 }
 
 static const PCIIOMMUOps smmuv3_accel_ops = {
+    .supports_address_space = smmuv3_accel_supports_as,
     .get_address_space = smmuv3_accel_find_add_as,
 };
 
-- 
2.43.0


Re: [PATCH v5 10/32] hw/arm/smmuv3-accel: Restrict accelerated SMMUv3 to vfio-pci endpoints with iommufd
Posted by Eric Auger 1 week, 3 days ago

On 10/31/25 11:49 AM, Shameer Kolothum wrote:
> Accelerated SMMUv3 is only meaningful when a device can leverage the
> host SMMUv3 in nested mode (S1+S2 translation). To keep the model
> consistent and correct, this mode is restricted to vfio-pci endpoint
> devices using the iommufd backend.
>
> Non-endpoint emulated devices such as PCIe root ports and bridges are
> also permitted so that vfio-pci devices can be attached beneath them.
s/beneath them/downstream?
> All other device types are unsupported in accelerated mode.
>
> Implement supports_address_space() callaback to reject all such
callback
> unsupported devices.
>
> This restriction also avoids complications with IOTLB invalidations.
> Some TLBI commands (e.g. CMD_TLBI_NH_ASID) lack an associated SID,
> making it difficult to trace the originating device. Allowing emulated
> endpoints would require invalidating both QEMU’s software IOTLB and the
> host’s hardware IOTLB, which can significantly degrade performance.
>
> For vfio-pci devices in nested mode, get_address_space() returns an
> address space aliased to system address space so that the VFIO core
> can set up the correct stage-2 mappings for guest RAM.
>
> In summary:
>  - vfio-pci devices(with iommufd as backend) return an address space
>    aliased to system address space.
>  - bridges and root ports return the IOMMU address space.
>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <skolothumtho@nvidia.com>
> ---
>  hw/arm/smmuv3-accel.c | 66 ++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index f62b6cf2c9..550a0496fe 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -7,8 +7,13 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/error-report.h"
>  
>  #include "hw/arm/smmuv3.h"
> +#include "hw/pci/pci_bridge.h"
> +#include "hw/pci-host/gpex.h"
> +#include "hw/vfio/pci.h"
> +
>  #include "smmuv3-accel.h"
>  
>  /*
> @@ -38,6 +43,41 @@ static SMMUv3AccelDevice *smmuv3_accel_get_dev(SMMUState *bs, SMMUPciBus *sbus,
>      return accel_dev;
>  }
>  
> +static bool smmuv3_accel_pdev_allowed(PCIDevice *pdev, bool *vfio_pci)
> +{
> +
> +    if (object_dynamic_cast(OBJECT(pdev), TYPE_PCI_BRIDGE) ||
> +        object_dynamic_cast(OBJECT(pdev), TYPE_PXB_PCIE_DEV) ||
> +        object_dynamic_cast(OBJECT(pdev), TYPE_GPEX_ROOT_DEVICE)) {
> +        return true;
> +    } else if ((object_dynamic_cast(OBJECT(pdev), TYPE_VFIO_PCI))) {
> +        *vfio_pci = true;
> +        if (object_property_get_link(OBJECT(pdev), "iommufd", NULL)) {
> +            return true;
> +        }
> +    }
> +    return false;
> +}
> +
> +static bool smmuv3_accel_supports_as(PCIBus *bus, void *opaque, int devfn,
> +                                     Error **errp)
> +{
> +    PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
> +    bool vfio_pci = false;
> +
> +    if (pdev && !smmuv3_accel_pdev_allowed(pdev, &vfio_pci)) {
> +        if (vfio_pci) {
> +            error_setg(errp, "vfio-pci endpoint devices without an iommufd "
> +                       "backend not allowed when using arm-smmuv3,accel=on");
> +
> +        } else {
> +            error_setg(errp, "Emulated endpoint devices are not allowed when "
> +                       "using arm-smmuv3,accel=on");
> +        }
> +        return false;
> +    }
> +    return true;
> +}
>  /*
>   * Find or add an address space for the given PCI device.
>   *
> @@ -48,15 +88,39 @@ static SMMUv3AccelDevice *smmuv3_accel_get_dev(SMMUState *bs, SMMUPciBus *sbus,
>  static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
>                                                int devfn)
>  {
> +    PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
>      SMMUState *bs = opaque;
>      SMMUPciBus *sbus = smmu_get_sbus(bs, bus);
>      SMMUv3AccelDevice *accel_dev = smmuv3_accel_get_dev(bs, sbus, bus, devfn);
>      SMMUDevice *sdev = &accel_dev->sdev;
> +    bool vfio_pci = false;
>  
> -    return &sdev->as;
> +    if (pdev && !smmuv3_accel_pdev_allowed(pdev, &vfio_pci)) {
> +        /* Should never be here: supports_address_space() filters these out */
> +        g_assert_not_reached();
> +    }
> +
> +    /*
> +     * In the accelerated mode, a vfio-pci device attached via the iommufd
> +     * backend must remain in the system address space. Such a device is
> +     * always translated by its physical SMMU (using either a stage-2-only
> +     * STE or a nested STE), where the parent stage-2 page table is allocated
> +     * by the VFIO core to back the system address space.
> +     *
> +     * Return the shared_as_sysmem aliased to the global system memory in this
> +     * case. Sharing address_space_memory also allows devices under different
> +     * vSMMU instances in the same VM to reuse a single nesting parent HWPT in
> +     * the VFIO core.
> +     */
> +    if (vfio_pci) {
> +        return shared_as_sysmem;
> +    } else {
> +        return &sdev->as;
> +    }
>  }
>  
>  static const PCIIOMMUOps smmuv3_accel_ops = {
> +    .supports_address_space = smmuv3_accel_supports_as,
>      .get_address_space = smmuv3_accel_find_add_as,
>  };
>  
Besides the two comments above:

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric