[v3] hw/arm/virt: Add support for user-creatable accelerated SMMUv3

[RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Shameer Kolothum via 7 months ago

From: Nicolin Chen <nicolinc@nvidia.com>

Not all fields in the SMMU IDR registers are meaningful for userspace.
Only the following fields can be used:

  - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16, TTF  
  - IDR1: SIDSIZE, SSIDSIZE  
  - IDR3: BBML, RIL  
  - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K

Use the relevant fields from these to check whether the host and emulated
SMMUv3 features are sufficiently aligned to enable accelerated SMMUv3
support.

To retrieve this information from the host, at least one vfio-pci device
must be assigned with "arm-smmuv3,accel=on" usage. Add a check to enforce
this.

Note:

ATS, PASID, and PRI features are currently not supported. Only devices
that do not require or make use of these features are expected to work.

Also, requiring at least one vfio-pci device to be cold-plugged
complicates hot-unplug and replug scenarios. For example, if all devices
behind the vSMMUv3 are unplugged after the guest boots, and a new device
is later hot-plugged into the same PCI bus, there is no guarantee that
the underlying host SMMUv3 will expose the same feature set as the one
originally used when the vSMMU was initialized.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/arm/smmuv3-accel.c | 103 ++++++++++++++++++++++++++++++++++++++++++
 hw/arm/smmuv3-accel.h |   5 ++
 hw/arm/smmuv3.c       |   4 ++
 hw/arm/trace-events   |   2 +-
 4 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 1298b4f6d0..3b2f45bd88 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -23,6 +23,109 @@
 #define SMMU_STE_VALID      (1ULL << 0)
 #define SMMU_STE_CFG_BYPASS (1ULL << 3)
 
+static int
+smmuv3_accel_host_hw_info(SMMUv3AccelDevice *accel_dev, uint32_t *data_type,
+                          uint32_t data_len, void *data)
+{
+    uint64_t caps;
+
+    if (!accel_dev || !accel_dev->idev) {
+        return -ENOENT;
+    }
+
+    return !iommufd_backend_get_device_info(accel_dev->idev->iommufd,
+                                            accel_dev->idev->devid,
+                                            data_type, data,
+                                            data_len, &caps, NULL);
+}
+
+void smmuv3_accel_init_regs(SMMUv3State *s)
+{
+    SMMUv3AccelState *s_accel = s->s_accel;
+    SMMUv3AccelDevice *accel_dev;
+    uint32_t data_type;
+    uint32_t val;
+    int ret;
+
+    if (s_accel->info.idr[0]) {
+        /* We already got this */
+        return;
+    }
+
+    if (!s_accel->viommu || QLIST_EMPTY(&s_accel->viommu->device_list)) {
+        error_report("For arm-smmuv3,accel=on case, atleast one cold-plugged "
+                     "vfio-pci dev needs to be assigned");
+        goto out_err;
+    }
+
+    accel_dev = QLIST_FIRST(&s_accel->viommu->device_list);
+    ret = smmuv3_accel_host_hw_info(accel_dev, &data_type,
+                                    sizeof(s_accel->info), &s_accel->info);
+    if (ret) {
+        error_report("Failed to get Host SMMU device info");
+        goto out_err;
+    }
+
+    if (data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
+        error_report("Wrong data type (%d) for Host SMMU device info",
+                     data_type);
+        goto out_err;
+    }
+
+    trace_smmuv3_accel_host_hw_info(s_accel->info.idr[0], s_accel->info.idr[1],
+                                    s_accel->info.idr[3], s_accel->info.idr[5]);
+    /*
+     * QEMU SMMUv3 supports both linear and 2-level stream tables. If host
+     * SMMUv3 supports only linear stream table, report that to Guest.
+     */
+    val = FIELD_EX32(s_accel->info.idr[0], IDR0, STLEVEL);
+    if (val < FIELD_EX32(s->idr[0], IDR0, STLEVEL)) {
+        s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, val);
+    }
+
+    /*
+     * QEMU SMMUv3 supports little-endian support for translation table walks.
+     * If host SMMUv3 supports only big-endian, report error.
+     */
+    val = FIELD_EX32(s_accel->info.idr[0], IDR0, TTENDIAN);
+    if (val > FIELD_EX32(s->idr[0], IDR0, TTENDIAN)) {
+        error_report("Host SUUMU device translation table walk endianess "
+                     "not supported");
+        goto out_err;
+    }
+
+    /*
+     * QEMU SMMUv3 supports AArch64 Translation table format.
+     * If host SMMUv3 supports only AArch32, report error.
+     */
+    val = FIELD_EX32(s_accel->info.idr[0], IDR0, TTF);
+    if (val < FIELD_EX32(s->idr[0], IDR0, TTF)) {
+        error_report("Host SMMU device Translation table format not supported");
+        goto out_err;
+    }
+
+    /*
+     * QEMU SMMUv3 supports 4K/16K/64K translation granules. If host SMMUv3
+     * does't support any of these, report the supported ones only to Guest.
+     */
+    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
+    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
+        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
+    }
+    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
+    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
+        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
+    }
+    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
+    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
+        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
+    }
+    return;
+
+out_err:
+    exit(1);
+}
+
 static void
 smmuv3_accel_dev_uninstall_nested_ste(SMMUv3AccelDevice *accel_dev, bool abort)
 {
diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
index d06c9664ba..e1e99598b4 100644
--- a/hw/arm/smmuv3-accel.h
+++ b/hw/arm/smmuv3-accel.h
@@ -49,6 +49,7 @@ typedef struct SMMUv3AccelState {
     MemoryRegion root;
     MemoryRegion sysmem;
     SMMUViommu *viommu;
+    struct iommu_hw_info_arm_smmuv3 info;
 } SMMUv3AccelState;
 
 #if defined(CONFIG_ARM_SMMUV3) && defined(CONFIG_IOMMUFD)
@@ -60,6 +61,7 @@ bool smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch);
 void smmuv3_accel_batch_cmd(SMMUState *bs, SMMUDevice *sdev,
                            SMMUCommandBatch *batch, struct Cmd *cmd,
                            uint32_t *cons);
+void smmuv3_accel_init_regs(SMMUv3State *s);
 #else
 static inline void smmuv3_accel_init(SMMUv3State *d)
 {
@@ -83,6 +85,9 @@ static inline void smmuv3_accel_batch_cmd(SMMUState *bs, SMMUDevice *sdev,
 {
     return;
 }
+static inline void smmuv3_accel_init_regs(SMMUv3State *s)
+{
+}
 #endif
 
 #endif /* HW_ARM_SMMUV3_ACCEL_H */
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 97ecca0764..100e3c8929 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1894,6 +1894,7 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev)
  */
 static void smmu_reset_exit(Object *obj, ResetType type)
 {
+    SMMUState *sys = ARM_SMMU(obj);
     SMMUv3State *s = ARM_SMMUV3(obj);
     SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
 
@@ -1903,6 +1904,9 @@ static void smmu_reset_exit(Object *obj, ResetType type)
     }
 
     smmuv3_init_regs(s);
+    if (sys->accel) {
+        smmuv3_accel_init_regs(s);
+    }
 }
 
 static void smmu_realize(DeviceState *d, Error **errp)
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 7d232ca17c..37ecab10a0 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -70,7 +70,7 @@ smmu_reset_exit(void) ""
 smmuv3_accel_set_iommu_device(int devfn, uint32_t sid) "devfn=0x%x (sid=0x%x)"
 smmuv3_accel_unset_iommu_device(int devfn, uint32_t sid) "devfn=0x%x (sid=0x%x"
 smmuv3_accel_install_nested_ste(uint32_t sid, uint64_t ste_1, uint64_t ste_0) "sid=%d ste=%"PRIx64":%"PRIx64
-
+smmuv3_accel_host_hw_info(uint32_t idr0, uint32_t idr1, uint32_t idr3, uint32_t idr5) "idr0=0x%x idr1=0x%x idr3=0x%x idr5=0x%x"
 # strongarm.c
 strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
 strongarm_ssp_read_underrun(void) "SSP rx underrun"
-- 
2.34.1

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Eric Auger 5 months, 1 week ago


On 7/14/25 5:59 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Not all fields in the SMMU IDR registers are meaningful for userspace.
> Only the following fields can be used:
>
>   - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16, TTF  
>   - IDR1: SIDSIZE, SSIDSIZE  
>   - IDR3: BBML, RIL  
>   - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K
>
> Use the relevant fields from these to check whether the host and emulated
> SMMUv3 features are sufficiently aligned to enable accelerated SMMUv3
> support.
>
> To retrieve this information from the host, at least one vfio-pci device
> must be assigned with "arm-smmuv3,accel=on" usage. Add a check to enforce
> this.
>
> Note:
>
> ATS, PASID, and PRI features are currently not supported. Only devices
> that do not require or make use of these features are expected to work.
>
> Also, requiring at least one vfio-pci device to be cold-plugged
> complicates hot-unplug and replug scenarios. For example, if all devices
> behind the vSMMUv3 are unplugged after the guest boots, and a new device
> is later hot-plugged into the same PCI bus, there is no guarantee that
> the underlying host SMMUv3 will expose the same feature set as the one
> originally used when the vSMMU was initialized.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/smmuv3-accel.c | 103 ++++++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3-accel.h |   5 ++
>  hw/arm/smmuv3.c       |   4 ++
>  hw/arm/trace-events   |   2 +-
>  4 files changed, 113 insertions(+), 1 deletion(-)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 1298b4f6d0..3b2f45bd88 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -23,6 +23,109 @@
>  #define SMMU_STE_VALID      (1ULL << 0)
>  #define SMMU_STE_CFG_BYPASS (1ULL << 3)
>  
> +static int
> +smmuv3_accel_host_hw_info(SMMUv3AccelDevice *accel_dev, uint32_t *data_type,
> +                          uint32_t data_len, void *data)
> +{
> +    uint64_t caps;
> +
> +    if (!accel_dev || !accel_dev->idev) {
> +        return -ENOENT;
> +    }
> +
> +    return !iommufd_backend_get_device_info(accel_dev->idev->iommufd,
> +                                            accel_dev->idev->devid,
> +                                            data_type, data,
> +                                            data_len, &caps, NULL);
> +}
> +
> +void smmuv3_accel_init_regs(SMMUv3State *s)
> +{
> +    SMMUv3AccelState *s_accel = s->s_accel;
> +    SMMUv3AccelDevice *accel_dev;
> +    uint32_t data_type;
> +    uint32_t val;
> +    int ret;
> +
> +    if (s_accel->info.idr[0]) {
> +        /* We already got this */
what does it mean?
> +        return;
> +    }
> +
> +    if (!s_accel->viommu || QLIST_EMPTY(&s_accel->viommu->device_list)) {
> +        error_report("For arm-smmuv3,accel=on case, atleast one cold-plugged "
> +                     "vfio-pci dev needs to be assigned");
to me this shall be relaxed. If you don't have any way to fetch HW info,
you apply the defaults and if a device is hotplugged any mismatch would
fail the hotplug
> +        goto out_err;
> +    }
> +
> +    accel_dev = QLIST_FIRST(&s_accel->viommu->device_list);
> +    ret = smmuv3_accel_host_hw_info(accel_dev, &data_type,
> +                                    sizeof(s_accel->info), &s_accel->info);
> +    if (ret) {
> +        error_report("Failed to get Host SMMU device info");
> +        goto out_err;
> +    }
> +
> +    if (data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
> +        error_report("Wrong data type (%d) for Host SMMU device info",
> +                     data_type);
> +        goto out_err;
> +    }
> +
> +    trace_smmuv3_accel_host_hw_info(s_accel->info.idr[0], s_accel->info.idr[1],
> +                                    s_accel->info.idr[3], s_accel->info.idr[5]);
> +    /*
> +     * QEMU SMMUv3 supports both linear and 2-level stream tables. If host
> +     * SMMUv3 supports only linear stream table, report that to Guest.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[0], IDR0, STLEVEL);
> +    if (val < FIELD_EX32(s->idr[0], IDR0, STLEVEL)) {
> +        s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, val);
> +    }
> +
> +    /*
> +     * QEMU SMMUv3 supports little-endian support for translation table walks.
> +     * If host SMMUv3 supports only big-endian, report error.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[0], IDR0, TTENDIAN);
> +    if (val > FIELD_EX32(s->idr[0], IDR0, TTENDIAN)) {
> +        error_report("Host SUUMU device translation table walk endianess "
> +                     "not supported");
> +        goto out_err;
> +    }
> +
> +    /*
> +     * QEMU SMMUv3 supports AArch64 Translation table format.
> +     * If host SMMUv3 supports only AArch32, report error.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[0], IDR0, TTF);
> +    if (val < FIELD_EX32(s->idr[0], IDR0, TTF)) {
> +        error_report("Host SMMU device Translation table format not supported");
> +        goto out_err;
> +    }
> +
> +    /*
> +     * QEMU SMMUv3 supports 4K/16K/64K translation granules. If host SMMUv3
> +     * does't support any of these, report the supported ones only to Guest.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> +    }
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> +    }
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
for migration purpose we would definitively need to store those values
in a VMState.

Thanks

Eric
> +    }
> +    return;
> +
> +out_err:
> +    exit(1);
> +}
> +
>  static void
>  smmuv3_accel_dev_uninstall_nested_ste(SMMUv3AccelDevice *accel_dev, bool abort)
>  {
> diff --git a/hw/arm/smmuv3-accel.h b/hw/arm/smmuv3-accel.h
> index d06c9664ba..e1e99598b4 100644
> --- a/hw/arm/smmuv3-accel.h
> +++ b/hw/arm/smmuv3-accel.h
> @@ -49,6 +49,7 @@ typedef struct SMMUv3AccelState {
>      MemoryRegion root;
>      MemoryRegion sysmem;
>      SMMUViommu *viommu;
> +    struct iommu_hw_info_arm_smmuv3 info;
>  } SMMUv3AccelState;
>  
>  #if defined(CONFIG_ARM_SMMUV3) && defined(CONFIG_IOMMUFD)
> @@ -60,6 +61,7 @@ bool smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch);
>  void smmuv3_accel_batch_cmd(SMMUState *bs, SMMUDevice *sdev,
>                             SMMUCommandBatch *batch, struct Cmd *cmd,
>                             uint32_t *cons);
> +void smmuv3_accel_init_regs(SMMUv3State *s);
>  #else
>  static inline void smmuv3_accel_init(SMMUv3State *d)
>  {
> @@ -83,6 +85,9 @@ static inline void smmuv3_accel_batch_cmd(SMMUState *bs, SMMUDevice *sdev,
>  {
>      return;
>  }
> +static inline void smmuv3_accel_init_regs(SMMUv3State *s)
> +{
> +}
>  #endif
>  
>  #endif /* HW_ARM_SMMUV3_ACCEL_H */
> diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
> index 97ecca0764..100e3c8929 100644
> --- a/hw/arm/smmuv3.c
> +++ b/hw/arm/smmuv3.c
> @@ -1894,6 +1894,7 @@ static void smmu_init_irq(SMMUv3State *s, SysBusDevice *dev)
>   */
>  static void smmu_reset_exit(Object *obj, ResetType type)
>  {
> +    SMMUState *sys = ARM_SMMU(obj);
>      SMMUv3State *s = ARM_SMMUV3(obj);
>      SMMUv3Class *c = ARM_SMMUV3_GET_CLASS(s);
>  
> @@ -1903,6 +1904,9 @@ static void smmu_reset_exit(Object *obj, ResetType type)
>      }
>  
>      smmuv3_init_regs(s);
> +    if (sys->accel) {
> +        smmuv3_accel_init_regs(s);
> +    }
>  }
>  
>  static void smmu_realize(DeviceState *d, Error **errp)
> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 7d232ca17c..37ecab10a0 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -70,7 +70,7 @@ smmu_reset_exit(void) ""
>  smmuv3_accel_set_iommu_device(int devfn, uint32_t sid) "devfn=0x%x (sid=0x%x)"
>  smmuv3_accel_unset_iommu_device(int devfn, uint32_t sid) "devfn=0x%x (sid=0x%x"
>  smmuv3_accel_install_nested_ste(uint32_t sid, uint64_t ste_1, uint64_t ste_0) "sid=%d ste=%"PRIx64":%"PRIx64
> -
> +smmuv3_accel_host_hw_info(uint32_t idr0, uint32_t idr1, uint32_t idr3, uint32_t idr5) "idr0=0x%x idr1=0x%x idr3=0x%x idr5=0x%x"
>  # strongarm.c
>  strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
>  strongarm_ssp_read_underrun(void) "SSP rx underrun"

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen 6 months, 3 weeks ago

On Mon, Jul 14, 2025 at 04:59:40PM +0100, Shameer Kolothum wrote:
> +void smmuv3_accel_init_regs(SMMUv3State *s)
> +{
> +    SMMUv3AccelState *s_accel = s->s_accel;
> +    SMMUv3AccelDevice *accel_dev;
> +    uint32_t data_type;
> +    uint32_t val;
> +    int ret;
> +
> +    if (s_accel->info.idr[0]) {
> +        /* We already got this */
> +        return;

We can avoid duplicated HW_INFO ioctls but probably shouldn't return
here, but just goto ..

> +    }
> +
> +    if (!s_accel->viommu || QLIST_EMPTY(&s_accel->viommu->device_list)) {
> +        error_report("For arm-smmuv3,accel=on case, atleast one cold-plugged "
> +                     "vfio-pci dev needs to be assigned");
> +        goto out_err;
> +    }
> +
> +    accel_dev = QLIST_FIRST(&s_accel->viommu->device_list);
> +    ret = smmuv3_accel_host_hw_info(accel_dev, &data_type,
> +                                    sizeof(s_accel->info), &s_accel->info);
> +    if (ret) {
> +        error_report("Failed to get Host SMMU device info");
> +        goto out_err;
> +    }
> +
> +    if (data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
> +        error_report("Wrong data type (%d) for Host SMMU device info",
> +                     data_type);
> +        goto out_err;
> +    }

.. likely here:

> +    trace_smmuv3_accel_host_hw_info(s_accel->info.idr[0], s_accel->info.idr[1],
> +                                    s_accel->info.idr[3], s_accel->info.idr[5]);

The following register initializations shouldn't be skipped.

Otherwise, caps would not be the same after a system reboot.

Thanks
Nicolin

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen via 6 months, 4 weeks ago

On Mon, Jul 14, 2025 at 04:59:40PM +0100, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> Not all fields in the SMMU IDR registers are meaningful for userspace.
> Only the following fields can be used:
> 
>   - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16, TTF  
>   - IDR1: SIDSIZE, SSIDSIZE  
>   - IDR3: BBML, RIL  
>   - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K

But half of these fields are not validated in the patch :-/

My vSMMU didn't work until I added entries like SIDSIZE, SSIDSIZE,
TERM_MODEL, STALL_MODEL, and RIL.

I think IDR5.OAS should be also added in the list. Maybe we should
update the kernel uAPI meanwhile.

> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> +    }
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> +    }
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);

Unless there is some conflicts between the QEMU emulation and the
SMMU HW, I think we should probably just override these fields to
the HW values, instead of running comparisons. The justification
could be that these fields are unlikely going to be controlled by
the QEMU but supported directly by the real HW.

For example, if HW supports SSIDSIZE=5, there seems to be no good
reason to limit it to SSIDSIZE=4? Even if the default SSIDSIZE in
the smmuv3_init_regs() is 4.

> @@ -1903,6 +1904,9 @@ static void smmu_reset_exit(Object *obj, ResetType type)
>      }
>  
>      smmuv3_init_regs(s);
> +    if (sys->accel) {
> +        smmuv3_accel_init_regs(s);
> +    }

I feel that we should likely do an if-else instead, i.e.

    if (sys->accel) {
        smmuv3_accel_init_regs(s);
    } else {
        smmuv3_init_regs(s);
    }

The smmuv3_init_regs() enables certain bits that really should be
set by the returned IDRs from hw_info in smmuv3_accel_init_regs().

Doing an overriding call can potentially give us some trouble in
the future if there are new bits being introduced and enabled in
smmuv3_init_regs() but missed in smmuv3_accel_init_regs().

So, it can be simpler in the long run if smmuv3_accel_init_regs()
initializes in its own way, IMHO.

Thanks
Nicolin

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Jason Gunthorpe 6 months, 4 weeks ago

On Tue, Jul 15, 2025 at 07:57:57PM -0700, Nicolin Chen wrote:
> > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> > +    }
> > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> > +    }
> > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
> 
> Unless there is some conflicts between the QEMU emulation and the
> SMMU HW, I think we should probably just override these fields to
> the HW values,

The qemu model should be fully independent of the underlying HW, it
should not override from HW.

It should check if the underlying supports the model and fail if it
doesn't.

Jason

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen 6 months, 4 weeks ago

On Wed, Jul 16, 2025 at 08:51:23AM -0300, Jason Gunthorpe wrote:
> On Tue, Jul 15, 2025 at 07:57:57PM -0700, Nicolin Chen wrote:
> > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> > > +    }
> > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> > > +    }
> > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
> > 
> > Unless there is some conflicts between the QEMU emulation and the
> > SMMU HW, I think we should probably just override these fields to
> > the HW values,
> 
> The qemu model should be fully independent of the underlying HW, it
> should not override from HW.
> 
> It should check if the underlying supports the model and fail if it
> doesn't.

For every bit? If there is a conflict at a certain field (e.g.
VMM only supports little endian while HW supports big endian),
it must fail.

But here, I mean for these specific fields such as GRANxK and
RIL (range-based invalidation), we should override them with
the HW values. Otherwise, the guest OS seeing RIL for example
will issue TLBI commands that the host can't support. Right?

Thanks
Nicolin

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Jason Gunthorpe 6 months, 4 weeks ago

On Wed, Jul 16, 2025 at 10:35:25AM -0700, Nicolin Chen wrote:
> On Wed, Jul 16, 2025 at 08:51:23AM -0300, Jason Gunthorpe wrote:
> > On Tue, Jul 15, 2025 at 07:57:57PM -0700, Nicolin Chen wrote:
> > > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> > > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> > > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> > > > +    }
> > > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> > > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> > > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> > > > +    }
> > > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> > > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> > > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
> > > 
> > > Unless there is some conflicts between the QEMU emulation and the
> > > SMMU HW, I think we should probably just override these fields to
> > > the HW values,
> > 
> > The qemu model should be fully independent of the underlying HW, it
> > should not override from HW.
> > 
> > It should check if the underlying supports the model and fail if it
> > doesn't.
> 
> For every bit? If there is a conflict at a certain field (e.g.
> VMM only supports little endian while HW supports big endian),
> it must fail.

Yes every bit.

> But here, I mean for these specific fields such as GRANxK and
> RIL (range-based invalidation), we should override them with
> the HW values. Otherwise, the guest OS seeing RIL for example
> will issue TLBI commands that the host can't support. Right?

No.

If the SMMU model does not include RIL then RIL is not available to
the guest.

If the SMMU model only supports GRAN4K, then the guest only uses 4k.

This exactness is critical for live migration. We cannot have the IDRs
change during live migration.

So there should be some built in models in qemu that define exactly
what kind of SMMU you get, and things like if 4k/16k/64k or RIL are
included in that model or not should be command line parameters/etc
like everything else in qemu..

Jason

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Eric Auger 5 months, 1 week ago

Hi,

On 7/16/25 7:45 PM, Jason Gunthorpe wrote:
> On Wed, Jul 16, 2025 at 10:35:25AM -0700, Nicolin Chen wrote:
>> On Wed, Jul 16, 2025 at 08:51:23AM -0300, Jason Gunthorpe wrote:
>>> On Tue, Jul 15, 2025 at 07:57:57PM -0700, Nicolin Chen wrote:
>>>>> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
>>>>> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
>>>>> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
>>>>> +    }
>>>>> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
>>>>> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
>>>>> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
>>>>> +    }
>>>>> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
>>>>> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
>>>>> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
>>>> Unless there is some conflicts between the QEMU emulation and the
>>>> SMMU HW, I think we should probably just override these fields to
>>>> the HW values,
>>> The qemu model should be fully independent of the underlying HW, it
>>> should not override from HW.
>>>
>>> It should check if the underlying supports the model and fail if it
>>> doesn't.
>> For every bit? If there is a conflict at a certain field (e.g.
>> VMM only supports little endian while HW supports big endian),
>> it must fail.
> Yes every bit.
>
>> But here, I mean for these specific fields such as GRANxK and
>> RIL (range-based invalidation), we should override them with
>> the HW values. Otherwise, the guest OS seeing RIL for example
>> will issue TLBI commands that the host can't support. Right?
> No.
>
> If the SMMU model does not include RIL then RIL is not available to
> the guest.

For virtio-iommu several parameters are dynamically computed: the pgsize
mask, the aw (using ReservedRegion info). They are computed according to
the assigned device requirements, if not conflicting with anything else.

For instance you can have a look at 5c3cfe33f418 ("virtio-iommu: Set
supported page size mask"). I don't quite remember but intel-iommu might
also have such dynamic settings depending on the host.

However I am unsure we enforce the computed granule/aw on dest (VFIO mig
was not supported when the feature were implemented). as this is part of
the device config it may be but it would be worth to check.

Thanks

Eric
>
> If the SMMU model only supports GRAN4K, then the guest only uses 4k.
>
> This exactness is critical for live migration. We cannot have the IDRs
> change during live migration.
>
> So there should be some built in models in qemu that define exactly
> what kind of SMMU you get, and things like if 4k/16k/64k or RIL are
> included in that model or not should be command line parameters/etc
> like everything else in qemu..
>
> Jason
>

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen 6 months, 4 weeks ago

On Wed, Jul 16, 2025 at 02:45:06PM -0300, Jason Gunthorpe wrote:
> On Wed, Jul 16, 2025 at 10:35:25AM -0700, Nicolin Chen wrote:
> > On Wed, Jul 16, 2025 at 08:51:23AM -0300, Jason Gunthorpe wrote:
> > > On Tue, Jul 15, 2025 at 07:57:57PM -0700, Nicolin Chen wrote:
> > > > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> > > > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> > > > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> > > > > +    }
> > > > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> > > > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> > > > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> > > > > +    }
> > > > > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> > > > > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> > > > > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
> > > > 
> > > > Unless there is some conflicts between the QEMU emulation and the
> > > > SMMU HW, I think we should probably just override these fields to
> > > > the HW values,
> > > 
> > > The qemu model should be fully independent of the underlying HW, it
> > > should not override from HW.
> > > 
> > > It should check if the underlying supports the model and fail if it
> > > doesn't.
> > 
> > For every bit? If there is a conflict at a certain field (e.g.
> > VMM only supports little endian while HW supports big endian),
> > it must fail.
> 
> Yes every bit.
> 
> > But here, I mean for these specific fields such as GRANxK and
> > RIL (range-based invalidation), we should override them with
> > the HW values. Otherwise, the guest OS seeing RIL for example
> > will issue TLBI commands that the host can't support. Right?
> 
> No.
> 
> If the SMMU model does not include RIL then RIL is not available to
> the guest.
> 
> If the SMMU model only supports GRAN4K, then the guest only uses 4k.
> 
> This exactness is critical for live migration. We cannot have the IDRs
> change during live migration.
> 
> So there should be some built in models in qemu that define exactly
> what kind of SMMU you get, and things like if 4k/16k/64k or RIL are
> included in that model or not should be command line parameters/etc
> like everything else in qemu..

OK. I see your point. That will leads to a very long list of
parameters.

So, a vSMMU model is defined following the parameters in the
command line. A device (and its attaching SMMU HW) that's not
compatibile should just fail the cold-plug at the beginning.

Then, it shouldn't run into any problem that I encountered.

Thanks
Nicolin

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Jason Gunthorpe 6 months, 4 weeks ago

On Wed, Jul 16, 2025 at 11:09:45AM -0700, Nicolin Chen wrote:
> OK. I see your point. That will leads to a very long list of
> parameters.

I would have some useful prebaked ones. Realistically there are not
that many combinations of HW capabilities that are
interesting/exist.

> So, a vSMMU model is defined following the parameters in the
> command line. A device (and its attaching SMMU HW) that's not
> compatibile should just fail the cold-plug at the beginning.

Yes

And if you want to do hotplug the SMMU is already fully defined so you
don't need to discover anything at VM startup time.

Jason

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen 6 months, 4 weeks ago

On Wed, Jul 16, 2025 at 03:42:39PM -0300, Jason Gunthorpe wrote:
> On Wed, Jul 16, 2025 at 11:09:45AM -0700, Nicolin Chen wrote:
> > OK. I see your point. That will leads to a very long list of
> > parameters.
> 
> I would have some useful prebaked ones. Realistically there are not
> that many combinations of HW capabilities that are
> interesting/exist.

Maybe starting with a configurable subversion could be a good one.

But I suspect there can be some case where somebody wants certain
bits to be off on top of a subversion prebake..

> > So, a vSMMU model is defined following the parameters in the
> > command line. A device (and its attaching SMMU HW) that's not
> > compatibile should just fail the cold-plug at the beginning.
> 
> Yes
> 
> And if you want to do hotplug the SMMU is already fully defined so you
> don't need to discover anything at VM startup time.

Yea, basically every device (whether hotplug or coldplug) should
run hw_info to make sure it's compatible to the predefined vSMMU,
not the other way around that I expected.

Thanks
Nicolin

RE: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Shameerali Kolothum Thodi via 6 months, 4 weeks ago


> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Wednesday, July 16, 2025 3:58 AM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com;
> ddutile@redhat.com; berrange@redhat.com; nathanc@nvidia.com;
> mochs@nvidia.com; smostafa@google.com; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org;
> zhenzhong.duan@intel.com; shameerkolothum@gmail.com
> Subject: Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature
> bits
> 
> On Mon, Jul 14, 2025 at 04:59:40PM +0100, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Not all fields in the SMMU IDR registers are meaningful for userspace.
> > Only the following fields can be used:
> >
> >   - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16,
> TTF
> >   - IDR1: SIDSIZE, SSIDSIZE
> >   - IDR3: BBML, RIL
> >   - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K
> 
> But half of these fields are not validated in the patch :-/

That’s why I said " Use the relevant fields from these to check.." .
But sorry, I was conservative ☹ and not sure the SSIDSIZE/STALL mattered
for non pasid cases.

> My vSMMU didn't work until I added entries like SIDSIZE, SSIDSIZE,
> TERM_MODEL, STALL_MODEL, and RIL.

How come your vSMMU not working? Or you meant the assigned
dev is not working?

The emulation supports SIDSIZE = 16 and RIL. Could you please
share the difference between these values w.r.t host SMMUv3.

> I think IDR5.OAS should be also added in the list. Maybe we should
> update the kernel uAPI meanwhile.

Ok.
 
> > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> > +    }
> > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> > +    }
> > +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> > +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> > +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
> 
> Unless there is some conflicts between the QEMU emulation and the
> SMMU HW, I think we should probably just override these fields to
> the HW values, instead of running comparisons. The justification
> could be that these fields are unlikely going to be controlled by
> the QEMU but supported directly by the real HW.
> 
> For example, if HW supports SSIDSIZE=5, there seems to be no good
> reason to limit it to SSIDSIZE=4? Even if the default SSIDSIZE in
> the smmuv3_init_regs() is 4.
> 
> > @@ -1903,6 +1904,9 @@ static void smmu_reset_exit(Object *obj,
> ResetType type)
> >      }
> >
> >      smmuv3_init_regs(s);
> > +    if (sys->accel) {
> > +        smmuv3_accel_init_regs(s);
> > +    }
> 
> I feel that we should likely do an if-else instead, i.e.
> 
>     if (sys->accel) {
>         smmuv3_accel_init_regs(s);
>     } else {
>         smmuv3_init_regs(s);
>     }
> 
> The smmuv3_init_regs() enables certain bits that really should be
> set by the returned IDRs from hw_info in smmuv3_accel_init_regs().
> 
> Doing an overriding call can potentially give us some trouble in
> the future if there are new bits being introduced and enabled in
> smmuv3_init_regs() but missed in smmuv3_accel_init_regs().
> 
> So, it can be simpler in the long run if smmuv3_accel_init_regs()
> initializes in its own way, IMHO.

Ok. Are you suggesting we simply override the IDR values from Host?
I don't think that is a good idea as it is not just the IDR values that
determines the host features. And we had a discussion on this 
in v2 and the suggestion was " vmm should not be copying IDR fields
blindly..."

https://lore.kernel.org/qemu-devel/Z+VNA+hFu0LJn19l@nvidia.com/

Probably we should take a look at Intel vtd implementation mentioned
by Zhenzhong in the other thread where it looks like there seems to be
a property for each capability they care about.

Probably something like,
-device arm-smmuv3,accel=on,pasid_cap=on,

And then enabling all features related to pasid and on later when
we retrieve the HW_INFO on device plug, compare and fail if not?

But I think on ARM, we still we have limitations in knowing the actual
host supported features through IDR. In that case, we can only assume
that user is making an informed decision while enabling these features.

Thanks,
Shameer

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen 6 months, 4 weeks ago

On Wed, Jul 16, 2025 at 10:26:21AM +0000, Shameerali Kolothum Thodi wrote:
> > On Mon, Jul 14, 2025 at 04:59:40PM +0100, Shameer Kolothum wrote:
> > My vSMMU didn't work until I added entries like SIDSIZE, SSIDSIZE,
> > TERM_MODEL, STALL_MODEL, and RIL.
> 
> How come your vSMMU not working? Or you meant the assigned
> dev is not working?

The "dev" (behind the vSMMU) running in the guest I mean.

> The emulation supports SIDSIZE = 16 and RIL. Could you please
> share the difference between these values w.r.t host SMMUv3.

My hardware doesn't support RIL while the VMM sets RIL.

There are other conflicts like STALL_MODEL that affected
the final two-stage STE in the host too.

> Probably we should take a look at Intel vtd implementation mentioned
> by Zhenzhong in the other thread where it looks like there seems to be
> a property for each capability they care about.
> 
> Probably something like,
> -device arm-smmuv3,accel=on,pasid_cap=on,
> 
> And then enabling all features related to pasid and on later when
> we retrieve the HW_INFO on device plug, compare and fail if not?
> 
> But I think on ARM, we still we have limitations in knowing the actual
> host supported features through IDR. In that case, we can only assume
> that user is making an informed decision while enabling these features.

Yes, I think you are right about this approach.

Maybe a "subversion" parameter could mask away quite a few bits
like RIL BBML. But the tricky thing is that user might want a
customization to those individual bits, because it has to match
with the HW values to use the device correctly.

Thanks
Nicolin

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Jonathan Cameron via 7 months ago

On Mon, 14 Jul 2025 16:59:40 +0100
Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote:

> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> Not all fields in the SMMU IDR registers are meaningful for userspace.
> Only the following fields can be used:
> 
>   - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16, TTF  
>   - IDR1: SIDSIZE, SSIDSIZE  
>   - IDR3: BBML, RIL  
>   - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K
> 
> Use the relevant fields from these to check whether the host and emulated
> SMMUv3 features are sufficiently aligned to enable accelerated SMMUv3
> support.
> 
> To retrieve this information from the host, at least one vfio-pci device
> must be assigned with "arm-smmuv3,accel=on" usage. Add a check to enforce
> this.
> 
> Note:
> 
> ATS, PASID, and PRI features are currently not supported. Only devices
> that do not require or make use of these features are expected to work.
> 
> Also, requiring at least one vfio-pci device to be cold-plugged
> complicates hot-unplug and replug scenarios. For example, if all devices
> behind the vSMMUv3 are unplugged after the guest boots, and a new device
> is later hot-plugged into the same PCI bus, there is no guarantee that
> the underlying host SMMUv3 will expose the same feature set as the one
> originally used when the vSMMU was initialized.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
=
> +
> +void smmuv3_accel_init_regs(SMMUv3State *s)
> +{
> +    SMMUv3AccelState *s_accel = s->s_accel;
> +    SMMUv3AccelDevice *accel_dev;
> +    uint32_t data_type;
> +    uint32_t val;
> +    int ret;
> +
> +    if (s_accel->info.idr[0]) {
> +        /* We already got this */
> +        return;
> +    }
> +
> +    if (!s_accel->viommu || QLIST_EMPTY(&s_accel->viommu->device_list)) {
> +        error_report("For arm-smmuv3,accel=on case, atleast one cold-plugged "
> +                     "vfio-pci dev needs to be assigned");
> +        goto out_err;
> +    }
> +
> +    accel_dev = QLIST_FIRST(&s_accel->viommu->device_list);
> +    ret = smmuv3_accel_host_hw_info(accel_dev, &data_type,
> +                                    sizeof(s_accel->info), &s_accel->info);
> +    if (ret) {
> +        error_report("Failed to get Host SMMU device info");
> +        goto out_err;
> +    }
> +
> +    if (data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
> +        error_report("Wrong data type (%d) for Host SMMU device info",
> +                     data_type);
> +        goto out_err;
> +    }
> +
> +    trace_smmuv3_accel_host_hw_info(s_accel->info.idr[0], s_accel->info.idr[1],
> +                                    s_accel->info.idr[3], s_accel->info.idr[5]);
> +    /*
> +     * QEMU SMMUv3 supports both linear and 2-level stream tables. If host
> +     * SMMUv3 supports only linear stream table, report that to Guest.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[0], IDR0, STLEVEL);
> +    if (val < FIELD_EX32(s->idr[0], IDR0, STLEVEL)) {
> +        s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STLEVEL, val);
> +    }
> +
> +    /*
> +     * QEMU SMMUv3 supports little-endian support for translation table walks.
> +     * If host SMMUv3 supports only big-endian, report error.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[0], IDR0, TTENDIAN);
> +    if (val > FIELD_EX32(s->idr[0], IDR0, TTENDIAN)) {
> +        error_report("Host SUUMU device translation table walk endianess "
> +                     "not supported");
> +        goto out_err;
> +    }
> +
> +    /*
> +     * QEMU SMMUv3 supports AArch64 Translation table format.
> +     * If host SMMUv3 supports only AArch32, report error.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[0], IDR0, TTF);
> +    if (val < FIELD_EX32(s->idr[0], IDR0, TTF)) {
> +        error_report("Host SMMU device Translation table format not supported");
> +        goto out_err;
> +    }
> +
> +    /*
> +     * QEMU SMMUv3 supports 4K/16K/64K translation granules. If host SMMUv3
> +     * does't support any of these, report the supported ones only to Guest.
> +     */
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN4K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN4K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN4K, val);
> +    }
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN16K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN16K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN16K, val);
> +    }
> +    val = FIELD_EX32(s_accel->info.idr[5], IDR5, GRAN64K);
> +    if (val < FIELD_EX32(s->idr[5], IDR5, GRAN64K)) {
> +        s->idr[5] = FIELD_DP32(s->idr[5], IDR5, GRAN64K, val);
> +    }
> +    return;
> +
> +out_err:
> +    exit(1);

Maybe just do this at each error path rather than goto?
Makes it clear that the result is brutal.


> +}
> +
>  static void
>  smmuv3_accel_dev_uninstall_nested_ste(SMMUv3AccelDevice *accel_dev, bool abort)
>  {


> diff --git a/hw/arm/trace-events b/hw/arm/trace-events
> index 7d232ca17c..37ecab10a0 100644
> --- a/hw/arm/trace-events
> +++ b/hw/arm/trace-events
> @@ -70,7 +70,7 @@ smmu_reset_exit(void) ""
>  smmuv3_accel_set_iommu_device(int devfn, uint32_t sid) "devfn=0x%x (sid=0x%x)"
>  smmuv3_accel_unset_iommu_device(int devfn, uint32_t sid) "devfn=0x%x (sid=0x%x"
>  smmuv3_accel_install_nested_ste(uint32_t sid, uint64_t ste_1, uint64_t ste_0) "sid=%d ste=%"PRIx64":%"PRIx64
> -

Stray

> +smmuv3_accel_host_hw_info(uint32_t idr0, uint32_t idr1, uint32_t idr3, uint32_t idr5) "idr0=0x%x idr1=0x%x idr3=0x%x idr5=0x%x"
>  # strongarm.c
>  strongarm_uart_update_parameters(const char *label, int speed, char parity, int data_bits, int stop_bits) "%s speed=%d parity=%c data=%d stop=%d"
>  strongarm_ssp_read_underrun(void) "SSP rx underrun"

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen via 7 months ago

On Mon, Jul 14, 2025 at 04:59:40PM +0100, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> Not all fields in the SMMU IDR registers are meaningful for userspace.
> Only the following fields can be used:
> 
>   - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16, TTF  
>   - IDR1: SIDSIZE, SSIDSIZE  
>   - IDR3: BBML, RIL  
>   - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K
> 
> Use the relevant fields from these to check whether the host and emulated
> SMMUv3 features are sufficiently aligned to enable accelerated SMMUv3
> support.
> 
> To retrieve this information from the host, at least one vfio-pci device
> must be assigned with "arm-smmuv3,accel=on" usage. Add a check to enforce
> this.
> 
> Note:
> 
> ATS, PASID, and PRI features are currently not supported. Only devices
> that do not require or make use of these features are expected to work.

Can we support ATS/PASID at least? I need to double check intel's
series, but I somehow recall that there is a PASID cap support in
the VFIO level, so VM could actually report ATS/PASID caps?

The invalidation part could forward ATC_INV command too, as kernel
supports that.

Thanks
Nicolin

Re: [RFC PATCH v3 14/15] Read and validate host SMMUv3 feature bits

Posted by Nicolin Chen via 7 months ago

On Mon, Jul 14, 2025 at 01:04:02PM -0700, Nicolin Chen wrote:
> On Mon, Jul 14, 2025 at 04:59:40PM +0100, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > 
> > Not all fields in the SMMU IDR registers are meaningful for userspace.
> > Only the following fields can be used:
> > 
> >   - IDR0: ST_LEVEL, TERM_MODEL, STALL_MODEL, TTENDIAN, CD2L, ASID16, TTF  
> >   - IDR1: SIDSIZE, SSIDSIZE  
> >   - IDR3: BBML, RIL  
> >   - IDR5: VAX, GRAN64K, GRAN16K, GRAN4K
> > 
> > Use the relevant fields from these to check whether the host and emulated
> > SMMUv3 features are sufficiently aligned to enable accelerated SMMUv3
> > support.
> > 
> > To retrieve this information from the host, at least one vfio-pci device
> > must be assigned with "arm-smmuv3,accel=on" usage. Add a check to enforce
> > this.
> > 
> > Note:
> > 
> > ATS, PASID, and PRI features are currently not supported. Only devices
> > that do not require or make use of these features are expected to work.
> 
> Can we support ATS/PASID at least? I need to double check intel's
> series, but I somehow recall that there is a PASID cap support in
> the VFIO level, so VM could actually report ATS/PASID caps?
> 
> The invalidation part could forward ATC_INV command too, as kernel
> supports that.

Also, the Subject is missing prefix "hw/arm/smmuv3-accel:".

Nicolin