[RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations

Shameer Kolothum via posted 20 patches 3 weeks, 4 days ago
[RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Shameer Kolothum via 3 weeks, 4 days ago
From: Nicolin Chen <nicolinc@nvidia.com>

Inroduce an SMMUCommandBatch and some helpers to batch and issue the
commands.  Currently separate out TLBI commands and device cache commands
to avoid some errata on certain versions of SMMUs. Later it should check
IIDR register to detect if underlying SMMU hw has such an erratum.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/arm/smmuv3-accel.c    | 69 ++++++++++++++++++++++++++++++++++++++++
 hw/arm/smmuv3-internal.h | 29 +++++++++++++++++
 2 files changed, 98 insertions(+)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 76134d106a..09be838d22 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -160,6 +160,75 @@ void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
                                           nested_data.ste[0]);
 }
 
+/* Update batch->ncmds to the number of execute cmds */
+int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
+{
+    SMMUv3AccelState *s_accel = ARM_SMMUV3_ACCEL(bs);
+    uint32_t total = batch->ncmds;
+    IOMMUFDViommu *viommu_core;
+    int ret;
+
+    if (!bs->accel) {
+        return 0;
+    }
+
+    if (!s_accel->viommu) {
+        return 0;
+    }
+    viommu_core = &s_accel->viommu->core;
+    ret = iommufd_backend_invalidate_cache(viommu_core->iommufd,
+                                           viommu_core->viommu_id,
+                                           IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
+                                           sizeof(Cmd), &batch->ncmds,
+                                           batch->cmds);
+    if (total != batch->ncmds) {
+        error_report("%s failed: ret=%d, total=%d, done=%d",
+                      __func__, ret, total, batch->ncmds);
+        return ret;
+    }
+
+    batch->ncmds = 0;
+    batch->dev_cache = false;
+    return ret;
+}
+
+int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
+                            SMMUCommandBatch *batch, Cmd *cmd,
+                            uint32_t *cons, bool dev_cache)
+{
+    int ret;
+
+    if (!bs->accel) {
+        return 0;
+    }
+
+    if (sdev) {
+        SMMUv3AccelDevice *accel_dev;
+        accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
+        if (!accel_dev->s1_hwpt) {
+            return 0;
+        }
+    }
+
+    /*
+     * Currently separate out dev_cache and hwpt for safety, which might
+     * not be necessary if underlying HW SMMU does not have the errata.
+     *
+     * TODO check IIDR register values read from hw_info.
+     */
+    if (batch->ncmds && (dev_cache != batch->dev_cache)) {
+        ret = smmuv3_accel_issue_cmd_batch(bs, batch);
+        if (ret) {
+            *cons = batch->cons[batch->ncmds];
+            return ret;
+        }
+    }
+    batch->dev_cache = dev_cache;
+    batch->cmds[batch->ncmds] = *cmd;
+    batch->cons[batch->ncmds++] = *cons;
+    return 0;
+}
+
 static bool
 smmuv3_accel_dev_attach_viommu(SMMUv3AccelDevice *accel_dev,
                                HostIOMMUDeviceIOMMUFD *idev, Error **errp)
diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 46c8bcae14..4602ae6728 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -549,13 +549,42 @@ typedef struct CD {
     uint32_t word[16];
 } CD;
 
+/**
+ * SMMUCommandBatch - batch of invalidation commands for smmuv3-accel
+ * @cmds: Pointer to list of commands
+ * @cons: Pointer to list of CONS corresponding to the commands
+ * @ncmds: Total ncmds in the batch
+ * @dev_cache: Issue to a device cache
+ */
+typedef struct SMMUCommandBatch {
+    Cmd *cmds;
+    uint32_t *cons;
+    uint32_t ncmds;
+    bool dev_cache;
+} SMMUCommandBatch;
+
 int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
                   SMMUEventInfo *event);
 void smmuv3_flush_config(SMMUDevice *sdev);
 
 #if defined(CONFIG_ARM_SMMUV3_ACCEL) && defined(CONFIG_IOMMUFD)
+int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch);
+int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
+                            SMMUCommandBatch *batch, Cmd *cmd,
+                            uint32_t *cons, bool dev_cache);
 void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid);
 #else
+static inline int smmuv3_accel_issue_cmd_batch(SMMUState *bs,
+                                               SMMUCommandBatch *batch)
+{
+    return 0;
+}
+static inline int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
+                                          SMMUCommandBatch *batch, Cmd *cmd,
+                                          uint32_t *cons, bool dev_cache)
+{
+    return 0;
+}
 static inline void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
 {
 }
-- 
2.34.1
Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Eric Auger 1 week, 3 days ago

On 3/11/25 3:10 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Inroduce an SMMUCommandBatch and some helpers to batch and issue the
> commands.  Currently separate out TLBI commands and device cache commands
> to avoid some errata on certain versions of SMMUs. Later it should check
> IIDR register to detect if underlying SMMU hw has such an erratum.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/smmuv3-accel.c    | 69 ++++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3-internal.h | 29 +++++++++++++++++
>  2 files changed, 98 insertions(+)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 76134d106a..09be838d22 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -160,6 +160,75 @@ void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
>                                            nested_data.ste[0]);
>  }
>  
> +/* Update batch->ncmds to the number of execute cmds */
> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
> +{
> +    SMMUv3AccelState *s_accel = ARM_SMMUV3_ACCEL(bs);
> +    uint32_t total = batch->ncmds;
> +    IOMMUFDViommu *viommu_core;
> +    int ret;
> +
> +    if (!bs->accel) {
> +        return 0;
> +    }
> +
> +    if (!s_accel->viommu) {
> +        return 0;
> +    }
> +    viommu_core = &s_accel->viommu->core;
> +    ret = iommufd_backend_invalidate_cache(viommu_core->iommufd,
> +                                           viommu_core->viommu_id,
> +                                           IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
> +                                           sizeof(Cmd), &batch->ncmds,
> +                                           batch->cmds);
> +    if (total != batch->ncmds) {
> +        error_report("%s failed: ret=%d, total=%d, done=%d",
> +                      __func__, ret, total, batch->ncmds);
> +        return ret;
> +    }
> +
> +    batch->ncmds = 0;
> +    batch->dev_cache = false;
> +    return ret;
> +}
> +
> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
I think you shall document that sdev can be NULL and also when this
helper shall be called with sdev != NULL

Thanks

Eric
> +                            SMMUCommandBatch *batch, Cmd *cmd,
> +                            uint32_t *cons, bool dev_cache)
> +{
> +    int ret;
> +
> +    if (!bs->accel) {
> +        return 0;
> +    }
> +
> +    if (sdev) {
> +        SMMUv3AccelDevice *accel_dev;
> +        accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
> +        if (!accel_dev->s1_hwpt) {
> +            return 0;
> +        }
> +    }
> +
> +    /*
> +     * Currently separate out dev_cache and hwpt for safety, which might
> +     * not be necessary if underlying HW SMMU does not have the errata.
> +     *
> +     * TODO check IIDR register values read from hw_info.
> +     */
> +    if (batch->ncmds && (dev_cache != batch->dev_cache)) {
> +        ret = smmuv3_accel_issue_cmd_batch(bs, batch);
> +        if (ret) {
> +            *cons = batch->cons[batch->ncmds];
> +            return ret;
> +        }
> +    }
> +    batch->dev_cache = dev_cache;
> +    batch->cmds[batch->ncmds] = *cmd;
> +    batch->cons[batch->ncmds++] = *cons;
> +    return 0;
> +}
> +
>  static bool
>  smmuv3_accel_dev_attach_viommu(SMMUv3AccelDevice *accel_dev,
>                                 HostIOMMUDeviceIOMMUFD *idev, Error **errp)
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 46c8bcae14..4602ae6728 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -549,13 +549,42 @@ typedef struct CD {
>      uint32_t word[16];
>  } CD;
>  
> +/**
> + * SMMUCommandBatch - batch of invalidation commands for smmuv3-accel
> + * @cmds: Pointer to list of commands
> + * @cons: Pointer to list of CONS corresponding to the commands
> + * @ncmds: Total ncmds in the batch
> + * @dev_cache: Issue to a device cache
> + */
> +typedef struct SMMUCommandBatch {
> +    Cmd *cmds;
> +    uint32_t *cons;
> +    uint32_t ncmds;
> +    bool dev_cache;
> +} SMMUCommandBatch;
> +
>  int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
>                    SMMUEventInfo *event);
>  void smmuv3_flush_config(SMMUDevice *sdev);
>  
>  #if defined(CONFIG_ARM_SMMUV3_ACCEL) && defined(CONFIG_IOMMUFD)
> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch);
> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                            SMMUCommandBatch *batch, Cmd *cmd,
> +                            uint32_t *cons, bool dev_cache);
>  void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid);
>  #else
> +static inline int smmuv3_accel_issue_cmd_batch(SMMUState *bs,
> +                                               SMMUCommandBatch *batch)
> +{
> +    return 0;
> +}
> +static inline int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                                          SMMUCommandBatch *batch, Cmd *cmd,
> +                                          uint32_t *cons, bool dev_cache)
> +{
> +    return 0;
> +}
>  static inline void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
>  {
>  }
Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Eric Auger 1 week, 3 days ago
Hi Shameer, Nicolin,

On 3/11/25 3:10 PM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
>
> Inroduce an SMMUCommandBatch and some helpers to batch and issue the
> commands.  Currently separate out TLBI commands and device cache commands
can you precise which "device cache commands" you are talking about?
> to avoid some errata on certain versions of SMMUs. Later it should check
worth to give more details about this famous errata here.
> IIDR register to detect if underlying SMMU hw has such an erratum.
>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/smmuv3-accel.c    | 69 ++++++++++++++++++++++++++++++++++++++++
>  hw/arm/smmuv3-internal.h | 29 +++++++++++++++++
>  2 files changed, 98 insertions(+)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 76134d106a..09be838d22 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -160,6 +160,75 @@ void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
>                                            nested_data.ste[0]);
>  }
>  
> +/* Update batch->ncmds to the number of execute cmds */
> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
> +{
> +    SMMUv3AccelState *s_accel = ARM_SMMUV3_ACCEL(bs);
> +    uint32_t total = batch->ncmds;
> +    IOMMUFDViommu *viommu_core;
> +    int ret;
> +
> +    if (!bs->accel) {
> +        return 0;
> +    }
> +
> +    if (!s_accel->viommu) {
> +        return 0;
> +    }
> +    viommu_core = &s_accel->viommu->core;
> +    ret = iommufd_backend_invalidate_cache(viommu_core->iommufd,
> +                                           viommu_core->viommu_id,
> +                                           IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
> +                                           sizeof(Cmd), &batch->ncmds,
> +                                           batch->cmds);
> +    if (total != batch->ncmds) {
> +        error_report("%s failed: ret=%d, total=%d, done=%d",
> +                      __func__, ret, total, batch->ncmds);
some commands may have been executed (batch->ncmds !=0). Is the
batch_cmds array updated accordingly? In the kernel doc I don't see any
mention of that. Do you need to report a cmd_error as we do for some
other cmds?
> +        return ret;
> +    }
> +
> +    batch->ncmds = 0;
> +    batch->dev_cache = false;
> +    return ret;
> +}
> +
> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
I was confused by the name. The helper adds a single Cmd to the batch,
right? so batch_cmd would better suited.
> +                            SMMUCommandBatch *batch, Cmd *cmd,
> +                            uint32_t *cons, bool dev_cache)
> +{
> +    int ret;
> +
> +    if (!bs->accel) {
> +        return 0;
> +    }
> +
> +    if (sdev) {
> +        SMMUv3AccelDevice *accel_dev;
> +        accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
> +        if (!accel_dev->s1_hwpt) {
can it happen? in the positive can you add some comment to describe in
which condition?
> +            return 0;
> +        }
> +    }
> +
> +    /*
> +     * Currently separate out dev_cache and hwpt for safety, which might
> +     * not be necessary if underlying HW SMMU does not have the errata.
> +     *
> +     * TODO check IIDR register values read from hw_info.
> +     */
> +    if (batch->ncmds && (dev_cache != batch->dev_cache)) {
> +        ret = smmuv3_accel_issue_cmd_batch(bs, batch);
> +        if (ret) {
> +            *cons = batch->cons[batch->ncmds];
> +            return ret;
> +        }
> +    }
> +    batch->dev_cache = dev_cache;
> +    batch->cmds[batch->ncmds] = *cmd;
> +    batch->cons[batch->ncmds++] = *cons;
> +    return 0;
> +}
> +
>  static bool
>  smmuv3_accel_dev_attach_viommu(SMMUv3AccelDevice *accel_dev,
>                                 HostIOMMUDeviceIOMMUFD *idev, Error **errp)
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 46c8bcae14..4602ae6728 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -549,13 +549,42 @@ typedef struct CD {
>      uint32_t word[16];
>  } CD;
>  
> +/**
> + * SMMUCommandBatch - batch of invalidation commands for smmuv3-accel
> + * @cmds: Pointer to list of commands
> + * @cons: Pointer to list of CONS corresponding to the commands
> + * @ncmds: Total ncmds in the batch
number of commands
> + * @dev_cache: Issue to a device cache
indicate whether the invalidation command batch targets device cache?
> + */
> +typedef struct SMMUCommandBatch {
> +    Cmd *cmds;
> +    uint32_t *cons;
> +    uint32_t ncmds;
> +    bool dev_cache;
> +} SMMUCommandBatch;
> +
>  int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
>                    SMMUEventInfo *event);
>  void smmuv3_flush_config(SMMUDevice *sdev);
>  
>  #if defined(CONFIG_ARM_SMMUV3_ACCEL) && defined(CONFIG_IOMMUFD)
> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch);
> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                            SMMUCommandBatch *batch, Cmd *cmd,
> +                            uint32_t *cons, bool dev_cache);
>  void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid);
>  #else
> +static inline int smmuv3_accel_issue_cmd_batch(SMMUState *bs,
> +                                               SMMUCommandBatch *batch)
> +{
> +    return 0;
> +}
> +static inline int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                                          SMMUCommandBatch *batch, Cmd *cmd,
> +                                          uint32_t *cons, bool dev_cache)
> +{
> +    return 0;
> +}
>  static inline void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
>  {
>  }
Thanks

Eric
Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Nicolin Chen 1 week, 3 days ago
On Wed, Mar 26, 2025 at 02:38:04PM +0100, Eric Auger wrote:
> > +/* Update batch->ncmds to the number of execute cmds */
> > +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
> > +{
> > +    SMMUv3AccelState *s_accel = ARM_SMMUV3_ACCEL(bs);
> > +    uint32_t total = batch->ncmds;
> > +    IOMMUFDViommu *viommu_core;
> > +    int ret;
> > +
> > +    if (!bs->accel) {
> > +        return 0;
> > +    }
> > +
> > +    if (!s_accel->viommu) {
> > +        return 0;
> > +    }
> > +    viommu_core = &s_accel->viommu->core;
> > +    ret = iommufd_backend_invalidate_cache(viommu_core->iommufd,
> > +                                           viommu_core->viommu_id,
> > +                                           IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
> > +                                           sizeof(Cmd), &batch->ncmds,
> > +                                           batch->cmds);
> > +    if (total != batch->ncmds) {
> > +        error_report("%s failed: ret=%d, total=%d, done=%d",
> > +                      __func__, ret, total, batch->ncmds);
> some commands may have been executed (batch->ncmds !=0). Is the
> batch_cmds array updated accordingly? In the kernel doc I don't see any
> mention of that.

The uAPI kdoc of ioctl(IOMMU_HWPT_INVALIDATE) mentions:
 * @entry_num: Input the number of cache invalidation requests in the array.
 *             Output the number of requests successfully handled by kernel.

> Do you need to report a cmd_error as we do for some
> other cmds?

Yes, we do. And we did (in this patch)? cons would be updated:
+    if (batch->ncmds && (dev_cache != batch->dev_cache)) {
+        ret = smmuv3_accel_issue_cmd_batch(bs, batch);
+        if (ret) {
+            *cons = batch->cons[batch->ncmds];
+            return ret;
+        }
+    }

> > +        return ret;
> > +    }
> > +
> > +    batch->ncmds = 0;
> > +    batch->dev_cache = false;
> > +    return ret;
> > +}
> > +
> > +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> I was confused by the name. The helper adds a single Cmd to the batch,
> right? so batch_cmd would better suited.

Yea, it could be "smmuv3_accel_batch_cmd".

> > +                            SMMUCommandBatch *batch, Cmd *cmd,
> > +                            uint32_t *cons, bool dev_cache)
> > +{
> > +    int ret;
> > +
> > +    if (!bs->accel) {
> > +        return 0;
> > +    }
> > +
> > +    if (sdev) {
> > +        SMMUv3AccelDevice *accel_dev;
> > +        accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
> > +        if (!accel_dev->s1_hwpt) {

> can it happen? in the positive can you add some comment to describe in
> which condition?

I recall this is for device cache specifically (i.e. CGFI_CD[_ALL]
and ATC_INV) that I had in smmuv3_cmdq_consume(). Perhaps it gets
here because Shameer separated the accel code from the non-accel
smmuv3 file.

This condition is to check if the device is attached to an accel
HWPT, particularly to exclude commands being issued for emulated
devices. Surely, if a device isn't attached to an accel stage-1
HWPT any more, we probably shouldn't forward the commands to the
kernel? Though I start to suspect that we might need a lock for
accel_dev->s1_hwpt?

> > +/**
> > + * SMMUCommandBatch - batch of invalidation commands for smmuv3-accel
> > + * @cmds: Pointer to list of commands
> > + * @cons: Pointer to list of CONS corresponding to the commands
> > + * @ncmds: Total ncmds in the batch

> number of commands

OK.

> > + * @dev_cache: Issue to a device cache

> indicate whether the invalidation command batch targets device cache?

Maybe "invalidation command batch targeting device cache or TLB".

Thanks
Nicolin
Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Eric Auger 1 week, 2 days ago

On 3/26/25 8:16 PM, Nicolin Chen wrote:
> On Wed, Mar 26, 2025 at 02:38:04PM +0100, Eric Auger wrote:
>>> +/* Update batch->ncmds to the number of execute cmds */
>>> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
>>> +{
>>> +    SMMUv3AccelState *s_accel = ARM_SMMUV3_ACCEL(bs);
>>> +    uint32_t total = batch->ncmds;
>>> +    IOMMUFDViommu *viommu_core;
>>> +    int ret;
>>> +
>>> +    if (!bs->accel) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (!s_accel->viommu) {
>>> +        return 0;
>>> +    }
>>> +    viommu_core = &s_accel->viommu->core;
>>> +    ret = iommufd_backend_invalidate_cache(viommu_core->iommufd,
>>> +                                           viommu_core->viommu_id,
>>> +                                           IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
>>> +                                           sizeof(Cmd), &batch->ncmds,
>>> +                                           batch->cmds);
>>> +    if (total != batch->ncmds) {
>>> +        error_report("%s failed: ret=%d, total=%d, done=%d",
>>> +                      __func__, ret, total, batch->ncmds);
>> some commands may have been executed (batch->ncmds !=0). Is the
>> batch_cmds array updated accordingly? In the kernel doc I don't see any
>> mention of that.
> The uAPI kdoc of ioctl(IOMMU_HWPT_INVALIDATE) mentions:
>  * @entry_num: Input the number of cache invalidation requests in the array.
>  *             Output the number of requests successfully handled by kernel.
I was rather talking about the array of cmd itself but I guess it is
left unchanged.  don't know if we are supposed to retry sending failed
cmds or not.
>
>> Do you need to report a cmd_error as we do for some
>> other cmds?
> Yes, we do. And we did (in this patch)? cons would be updated:
> +    if (batch->ncmds && (dev_cache != batch->dev_cache)) {
> +        ret = smmuv3_accel_issue_cmd_batch(bs, batch);
> +        if (ret) {
> +            *cons = batch->cons[batch->ncmds];
> +            return ret;
cons is updated but error is not logged in this patch.
> +        }
> +    }
>
>>> +        return ret;
>>> +    }
>>> +
>>> +    batch->ncmds = 0;
>>> +    batch->dev_cache = false;
>>> +    return ret;
>>> +}
>>> +
>>> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
>> I was confused by the name. The helper adds a single Cmd to the batch,
>> right? so batch_cmd would better suited.
> Yea, it could be "smmuv3_accel_batch_cmd".
>
>>> +                            SMMUCommandBatch *batch, Cmd *cmd,
>>> +                            uint32_t *cons, bool dev_cache)
>>> +{
>>> +    int ret;
>>> +
>>> +    if (!bs->accel) {
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (sdev) {
>>> +        SMMUv3AccelDevice *accel_dev;
>>> +        accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
>>> +        if (!accel_dev->s1_hwpt) {
>> can it happen? in the positive can you add some comment to describe in
>> which condition?
> I recall this is for device cache specifically (i.e. CGFI_CD[_ALL]
> and ATC_INV) that I had in smmuv3_cmdq_consume(). Perhaps it gets
> here because Shameer separated the accel code from the non-accel
> smmuv3 file.
>
> This condition is to check if the device is attached to an accel
> HWPT, particularly to exclude commands being issued for emulated
> devices. Surely, if a device isn't attached to an accel stage-1
> HWPT any more, we probably shouldn't forward the commands to the
> kernel? Though I start to suspect that we might need a lock for
> accel_dev->s1_hwpt?
Yes worth to dig in and add a comment

Thanks

Eric
>
>>> +/**
>>> + * SMMUCommandBatch - batch of invalidation commands for smmuv3-accel
>>> + * @cmds: Pointer to list of commands
>>> + * @cons: Pointer to list of CONS corresponding to the commands
>>> + * @ncmds: Total ncmds in the batch
>> number of commands
> OK.
>
>>> + * @dev_cache: Issue to a device cache
>> indicate whether the invalidation command batch targets device cache?
> Maybe "invalidation command batch targeting device cache or TLB".
>
> Thanks
> Nicolin
>


RE: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Shameerali Kolothum Thodi via 1 week, 2 days ago

> -----Original Message-----
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Wednesday, March 26, 2025 7:16 PM
> To: Eric Auger <eric.auger@redhat.com>
> Cc: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org; peter.maydell@linaro.org; jgg@nvidia.com;
> ddutile@redhat.com; berrange@redhat.com; nathanc@nvidia.com;
> mochs@nvidia.com; smostafa@google.com; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org
> Subject: Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers
> to batch and issue cache invalidations
> 
 
> > can it happen? in the positive can you add some comment to describe in
> > which condition?
> 
> I recall this is for device cache specifically (i.e. CGFI_CD[_ALL]
> and ATC_INV) that I had in smmuv3_cmdq_consume(). Perhaps it gets
> here because Shameer separated the accel code from the non-accel
> smmuv3 file.

Yes. It is because I moved the code the around.
> 
> This condition is to check if the device is attached to an accel
> HWPT, particularly to exclude commands being issued for emulated
> devices. Surely, if a device isn't attached to an accel stage-1
> HWPT any more, we probably shouldn't forward the commands to the
> kernel? Though I start to suspect that we might need a lock for
> accel_dev->s1_hwpt?

I will double check this whether we require this or not.

Thanks,
Shameer
Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Donald Dutile 2 weeks, 4 days ago
Shameer,

Hi,


On 3/11/25 10:10 AM, Shameer Kolothum wrote:
> From: Nicolin Chen <nicolinc@nvidia.com>
> 
> Inroduce an SMMUCommandBatch and some helpers to batch and issue the
   ^^^^^^^^ Introduce
> commands.  Currently separate out TLBI commands and device cache commands
> to avoid some errata on certain versions of SMMUs. Later it should check
> IIDR register to detect if underlying SMMU hw has such an erratum.
Where is all this info about 'certain versions of SMMUs' and
'check IIDR register' has something to do with 'underlying SMMU hw such an erratum',
-- which IIDR (& bits)? or are we talking about rsvd SMMU_IDR<> registers?


And can't these helpers be used for emulated smmuv3 as well as accelerated?

Thanks,
- Don
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>   hw/arm/smmuv3-accel.c    | 69 ++++++++++++++++++++++++++++++++++++++++
>   hw/arm/smmuv3-internal.h | 29 +++++++++++++++++
>   2 files changed, 98 insertions(+)
> 
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 76134d106a..09be838d22 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -160,6 +160,75 @@ void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
>                                             nested_data.ste[0]);
>   }
>   
> +/* Update batch->ncmds to the number of execute cmds */
> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch)
> +{
> +    SMMUv3AccelState *s_accel = ARM_SMMUV3_ACCEL(bs);
> +    uint32_t total = batch->ncmds;
> +    IOMMUFDViommu *viommu_core;
> +    int ret;
> +
> +    if (!bs->accel) {
> +        return 0;
> +    }
> +
> +    if (!s_accel->viommu) {
> +        return 0;
> +    }
> +    viommu_core = &s_accel->viommu->core;
> +    ret = iommufd_backend_invalidate_cache(viommu_core->iommufd,
> +                                           viommu_core->viommu_id,
> +                                           IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3,
> +                                           sizeof(Cmd), &batch->ncmds,
> +                                           batch->cmds);
> +    if (total != batch->ncmds) {
> +        error_report("%s failed: ret=%d, total=%d, done=%d",
> +                      __func__, ret, total, batch->ncmds);
> +        return ret;
> +    }
> +
> +    batch->ncmds = 0;
> +    batch->dev_cache = false;
> +    return ret;
> +}
> +
> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                            SMMUCommandBatch *batch, Cmd *cmd,
> +                            uint32_t *cons, bool dev_cache)
> +{
> +    int ret;
> +
> +    if (!bs->accel) {
> +        return 0;
> +    }
> +
> +    if (sdev) {
> +        SMMUv3AccelDevice *accel_dev;
> +        accel_dev = container_of(sdev, SMMUv3AccelDevice, sdev);
> +        if (!accel_dev->s1_hwpt) {
> +            return 0;
> +        }
> +    }
> +
> +    /*
> +     * Currently separate out dev_cache and hwpt for safety, which might
> +     * not be necessary if underlying HW SMMU does not have the errata.
> +     *
> +     * TODO check IIDR register values read from hw_info.
> +     */
> +    if (batch->ncmds && (dev_cache != batch->dev_cache)) {
> +        ret = smmuv3_accel_issue_cmd_batch(bs, batch);
> +        if (ret) {
> +            *cons = batch->cons[batch->ncmds];
> +            return ret;
> +        }
> +    }
> +    batch->dev_cache = dev_cache;
> +    batch->cmds[batch->ncmds] = *cmd;
> +    batch->cons[batch->ncmds++] = *cons;
> +    return 0;
> +}
> +
>   static bool
>   smmuv3_accel_dev_attach_viommu(SMMUv3AccelDevice *accel_dev,
>                                  HostIOMMUDeviceIOMMUFD *idev, Error **errp)
> diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
> index 46c8bcae14..4602ae6728 100644
> --- a/hw/arm/smmuv3-internal.h
> +++ b/hw/arm/smmuv3-internal.h
> @@ -549,13 +549,42 @@ typedef struct CD {
>       uint32_t word[16];
>   } CD;
>   
> +/**
> + * SMMUCommandBatch - batch of invalidation commands for smmuv3-accel
> + * @cmds: Pointer to list of commands
> + * @cons: Pointer to list of CONS corresponding to the commands
> + * @ncmds: Total ncmds in the batch
> + * @dev_cache: Issue to a device cache
> + */
> +typedef struct SMMUCommandBatch {
> +    Cmd *cmds;
> +    uint32_t *cons;
> +    uint32_t ncmds;
> +    bool dev_cache;
> +} SMMUCommandBatch;
> +
>   int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
>                     SMMUEventInfo *event);
>   void smmuv3_flush_config(SMMUDevice *sdev);
>   
>   #if defined(CONFIG_ARM_SMMUV3_ACCEL) && defined(CONFIG_IOMMUFD)
> +int smmuv3_accel_issue_cmd_batch(SMMUState *bs, SMMUCommandBatch *batch);
> +int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                            SMMUCommandBatch *batch, Cmd *cmd,
> +                            uint32_t *cons, bool dev_cache);
>   void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid);
>   #else
> +static inline int smmuv3_accel_issue_cmd_batch(SMMUState *bs,
> +                                               SMMUCommandBatch *batch)
> +{
> +    return 0;
> +}
> +static inline int smmuv3_accel_batch_cmds(SMMUState *bs, SMMUDevice *sdev,
> +                                          SMMUCommandBatch *batch, Cmd *cmd,
> +                                          uint32_t *cons, bool dev_cache)
> +{
> +    return 0;
> +}
>   static inline void smmuv3_accel_install_nested_ste(SMMUDevice *sdev, int sid)
>   {
>   }
RE: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Shameerali Kolothum Thodi via 2 weeks, 3 days ago

> -----Original Message-----
> From: Donald Dutile <ddutile@redhat.com>
> Sent: Wednesday, March 19, 2025 1:31 AM
> To: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org
> Cc: eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com;
> nicolinc@nvidia.com; berrange@redhat.com; nathanc@nvidia.com;
> mochs@nvidia.com; smostafa@google.com; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org
> Subject: Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers
> to batch and issue cache invalidations
> 
> Shameer,
> 
> Hi,
> 
> 
> On 3/11/25 10:10 AM, Shameer Kolothum wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> >
> > Inroduce an SMMUCommandBatch and some helpers to batch and issue
> the
>    ^^^^^^^^ Introduce
> > commands.  Currently separate out TLBI commands and device cache
> > commands to avoid some errata on certain versions of SMMUs. Later it
> > should check IIDR register to detect if underlying SMMU hw has such an
> erratum.
> Where is all this info about 'certain versions of SMMUs' and 'check IIDR
> register' has something to do with 'underlying SMMU hw such an erratum',
> -- which IIDR (& bits)? or are we talking about rsvd SMMU_IDR<> registers?

I guess the batching has constraints on some platforms, IIRC, this was discussed
somewhere in a kernel thread.  

Nicolin, could you please provide some background on this.

> 
> And can't these helpers be used for emulated smmuv3 as well as
> accelerated?

Could be I guess. But no benefit in terms of performance. May be will make
code look nicer. I will take a look if not much of changes in the emulated path.

Thanks,
Shameer


Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Donald Dutile 2 weeks, 3 days ago

On 3/19/25 5:48 AM, Shameerali Kolothum Thodi wrote:
> 
> 
>> -----Original Message-----
>> From: Donald Dutile <ddutile@redhat.com>
>> Sent: Wednesday, March 19, 2025 1:31 AM
>> To: Shameerali Kolothum Thodi
>> <shameerali.kolothum.thodi@huawei.com>; qemu-arm@nongnu.org;
>> qemu-devel@nongnu.org
>> Cc: eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com;
>> nicolinc@nvidia.com; berrange@redhat.com; nathanc@nvidia.com;
>> mochs@nvidia.com; smostafa@google.com; Linuxarm
>> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
>> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
>> <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org
>> Subject: Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers
>> to batch and issue cache invalidations
>>
>> Shameer,
>>
>> Hi,
>>
>>
>> On 3/11/25 10:10 AM, Shameer Kolothum wrote:
>>> From: Nicolin Chen <nicolinc@nvidia.com>
>>>
>>> Inroduce an SMMUCommandBatch and some helpers to batch and issue
>> the
>>     ^^^^^^^^ Introduce
>>> commands.  Currently separate out TLBI commands and device cache
>>> commands to avoid some errata on certain versions of SMMUs. Later it
>>> should check IIDR register to detect if underlying SMMU hw has such an
>> erratum.
>> Where is all this info about 'certain versions of SMMUs' and 'check IIDR
>> register' has something to do with 'underlying SMMU hw such an erratum',
>> -- which IIDR (& bits)? or are we talking about rsvd SMMU_IDR<> registers?
> 
> I guess the batching has constraints on some platforms, IIRC, this was discussed
> somewhere in a kernel thread.
> 
> Nicolin, could you please provide some background on this.
> 
A lore link if it's discussed upstream, thanks.

>>
>> And can't these helpers be used for emulated smmuv3 as well as
>> accelerated?
> 
> Could be I guess. But no benefit in terms of performance. May be will make
> code look nicer. I will take a look if not much of changes in the emulated path.
> 
Thanks for looking into it.  The push is to use common code path(s) to invoke (or not)
an accel path.

> Thanks,
> Shameer
> 
>
Re: [RFC PATCH v2 13/20] hw/arm/smmuv3-accel: Introduce helpers to batch and issue cache invalidations
Posted by Nicolin Chen 2 weeks, 3 days ago
On Wed, Mar 19, 2025 at 12:24:32PM -0400, Donald Dutile wrote:
> > > On 3/11/25 10:10 AM, Shameer Kolothum wrote:
> > > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > > 
> > > > Inroduce an SMMUCommandBatch and some helpers to batch and issue
> > > the
> > >     ^^^^^^^^ Introduce
> > > > commands.  Currently separate out TLBI commands and device cache
> > > > commands to avoid some errata on certain versions of SMMUs. Later it
> > > > should check IIDR register to detect if underlying SMMU hw has such an
> > > erratum.
> > > Where is all this info about 'certain versions of SMMUs' and 'check IIDR
> > > register' has something to do with 'underlying SMMU hw such an erratum',
> > > -- which IIDR (& bits)? or are we talking about rsvd SMMU_IDR<> registers?
> > 
> > I guess the batching has constraints on some platforms, IIRC, this was discussed
> > somewhere in a kernel thread.
> > 
> > Nicolin, could you please provide some background on this.
> > 
> A lore link if it's discussed upstream, thanks.

https://lore.kernel.org/all/696da78d32bb4491f898f11b0bb4d850a8aa7c6a.1683731256.git.robin.murphy@arm.com/

IIRC, some of them forbid command issuing like mixing leaf TLBI
commands with non-leaf TLBI commands or mixing device commands
with TLBI commands.

Currently, kernel masks away the ARM_SMMU_FEAT_NESTING from the
affected SMMU versions/subversions. So, I think we are fine for
now, though probably doesn't hurt to check IIDR?

Thanks
Nicolin