[RFC PATCH v3 07/15] hw/arm/smmuv3: Implement get_viommu_cap() callback

Shameer Kolothum via posted 15 patches 4 months ago
There is a newer version of this series
Posted by Shameer Kolothum via 4 months ago
For accelerated SMMUv3, we need nested parent domain creation. Add the
callback support so that VFIO can create a nested parent.

Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
in Stage 1 mode, ensure that the 'stage' property is explicitly set to
Stage 1.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 hw/arm/smmuv3-accel.c | 15 +++++++++++++++
 hw/arm/virt.c         | 12 ++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 0b0ddb03e2..66cd4f5ece 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -10,6 +10,7 @@
 #include "qemu/error-report.h"
 
 #include "hw/arm/smmuv3.h"
+#include "hw/iommu.h"
 #include "hw/pci/pci_bridge.h"
 #include "hw/pci-host/gpex.h"
 #include "hw/vfio/pci.h"
@@ -81,8 +82,22 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
     }
 }
 
+static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
+{
+    /*
+     * Accelerated smmuv3 support only allowes Guest S1
+     * configuration. Hence report VIOMMU_CAP_STAGE1
+     * so that VFIO can create nested parent domain.
+     * The real nested support should be reported from host
+     * SMMUv3 and if it doesn't, the nested parent allocation
+     * will fail anyway.
+     */
+    return VIOMMU_CAP_STAGE1;
+}
+
 static const PCIIOMMUOps smmuv3_accel_ops = {
     .get_address_space = smmuv3_accel_find_add_as,
+    .get_viommu_cap = smmuv3_accel_get_viommu_cap,
 };
 
 void smmuv3_accel_init(SMMUv3State *s)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 22393cf39e..fdb47eda6a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3053,6 +3053,18 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
                 return;
             }
 
+            if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
+                char *stage;
+
+                stage = object_property_get_str(OBJECT(dev), "stage",
+                                                &error_fatal);
+                if (*stage && strcmp("1", stage)) {
+                    error_setg(errp, "Only stage1 is supported for SMMUV3 with "
+                               "accel=on");
+                    return;
+                }
+            }
+
             create_smmuv3_dev_dtb(vms, dev, bus);
         }
     }
-- 
2.34.1
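
To illustrate the flow the commit message describes, here is a minimal
standalone sketch of how a capability bitmask returned by a
get_viommu_cap()-style callback could gate the creation of a nested parent.
Only the names VIOMMU_CAP_STAGE1 and get_viommu_cap come from the patch. The
bit value, the ToyIOMMUOps struct and want_nested_parent() are assumptions made
for illustration and are not QEMU's real definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed bit position, for illustration only. The patch takes the real
 * definition from "hw/iommu.h".
 */
#define VIOMMU_CAP_STAGE1 (1ULL << 0)

/* Hypothetical stand-in for a callback table such as PCIIOMMUOps. */
typedef struct ToyIOMMUOps {
    uint64_t (*get_viommu_cap)(void *opaque);
} ToyIOMMUOps;

/* Same shape as the callback in the patch: always report stage-1 support. */
static uint64_t toy_accel_get_viommu_cap(void *opaque)
{
    (void)opaque;
    return VIOMMU_CAP_STAGE1;
}

/*
 * Hypothetical consumer-side check: request a nested parent only when the
 * vIOMMU provides the callback and advertises VIOMMU_CAP_STAGE1.
 */
static bool want_nested_parent(const ToyIOMMUOps *ops, void *opaque)
{
    if (!ops || !ops->get_viommu_cap) {
        return false;
    }
    return (ops->get_viommu_cap(opaque) & VIOMMU_CAP_STAGE1) != 0;
}

/* An accelerated vIOMMU that implements the callback. */
static const ToyIOMMUOps toy_accel_ops = {
    .get_viommu_cap = toy_accel_get_viommu_cap,
};

/* A vIOMMU without the callback, which never asks for a nested parent. */
static const ToyIOMMUOps toy_plain_ops = {
    .get_viommu_cap = NULL,
};
```

With these definitions, want_nested_parent(&toy_accel_ops, NULL) is true and
want_nested_parent(&toy_plain_ops, NULL) is false. As the code comment in the
patch notes, the capability bit only expresses what the guest configuration
needs. Whether the host can actually provide nesting is decided later, when the
allocation is attempted.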
Re: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement get_viommu_cap() callback
Posted by Eric Auger 2 months, 1 week ago

On 7/14/25 5:59 PM, Shameer Kolothum wrote:
> For accelerated SMMUv3, we need nested parent domain creation. Add the
> callback support so that VFIO can create a nested parent.
>
> Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
> in Stage 1 mode, ensure that the 'stage' property is explicitly set to
> Stage 1.
nit: strictly speaking couldn't we have a stage2 being used at guest
level implemented by a stage1 at physical level?
but it is totally fair to restrict the support.
>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  hw/arm/smmuv3-accel.c | 15 +++++++++++++++
>  hw/arm/virt.c         | 12 ++++++++++++
>  2 files changed, 27 insertions(+)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 0b0ddb03e2..66cd4f5ece 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -10,6 +10,7 @@
>  #include "qemu/error-report.h"
>  
>  #include "hw/arm/smmuv3.h"
> +#include "hw/iommu.h"
>  #include "hw/pci/pci_bridge.h"
>  #include "hw/pci-host/gpex.h"
>  #include "hw/vfio/pci.h"
> @@ -81,8 +82,22 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
>      }
>  }
>  
> +static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
> +{
> +    /*
> +     * Accelerated smmuv3 support only allowes Guest S1
> +     * configuration. Hence report VIOMMU_CAP_STAGE1
> +     * so that VFIO can create nested parent domain.
> +     * The real nested support should be reported from host
the actual nested support at host level will be queried from the host
later on?
> +     * SMMUv3 and if it doesn't, the nested parent allocation
> +     * will fail anyway.
> +     */
> +    return VIOMMU_CAP_STAGE1;
> +}
> +
>  static const PCIIOMMUOps smmuv3_accel_ops = {
>      .get_address_space = smmuv3_accel_find_add_as,
> +    .get_viommu_cap = smmuv3_accel_get_viommu_cap,
>  };
>  
>  void smmuv3_accel_init(SMMUv3State *s)
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 22393cf39e..fdb47eda6a 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -3053,6 +3053,18 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
>                  return;
>              }
>  
> +            if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
> +                char *stage;
> +
> +                stage = object_property_get_str(OBJECT(dev), "stage",
> +                                                &error_fatal);
> +                if (*stage && strcmp("1", stage)) {
I am not sure you need to check *stage
> +                    error_setg(errp, "Only stage1 is supported for SMMUV3 with "
> +                               "accel=on");
> +                    return;
> +                }
> +            }
> +
>              create_smmuv3_dev_dtb(vms, dev, bus);
>          }
>      }
Thanks

Eric
RE: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement get_viommu_cap() callback
Posted by Shameer Kolothum 2 months, 1 week ago

> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 05 September 2025 09:50
> To: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Shameer Kolothum
> <skolothumtho@nvidia.com>
> Cc: peter.maydell@linaro.org; Jason Gunthorpe <jgg@nvidia.com>; Nicolin
> Chen <nicolinc@nvidia.com>; ddutile@redhat.com; berrange@redhat.com;
> Nathan Chen <nathanc@nvidia.com>; Matt Ochs <mochs@nvidia.com>;
> smostafa@google.com; linuxarm@huawei.com; wangzhou1@hisilicon.com;
> jiangkunkun@huawei.com; jonathan.cameron@huawei.com;
> zhangfei.gao@linaro.org; zhenzhong.duan@intel.com;
> shameerkolothum@gmail.com
> Subject: Re: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement
> get_viommu_cap() callback
> 
> 
> On 7/14/25 5:59 PM, Shameer Kolothum wrote:
> > For accelerated SMMUv3, we need nested parent domain creation. Add the
> > callback support so that VFIO can create a nested parent.
> >
> > Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
> > in Stage 1 mode, ensure that the 'stage' property is explicitly set to
> > Stage 1.
> nit: strictly speaking couldn't we have a stage2 being used at guest
> level implemented by a stage1 at physical level?
> but it is totally fair to restrict the support.

Yeah it is possible I guess. But then we have to use the S2TTB to configure
Host SMMUv3 S1 instead of S1ContextPtr which is used now. I have not 
tried that yet. Since stage 2 is more restrictive in nature (e.g. no PASID
is possible), I don't think we should support that for the accel case.

I will change the commit log accordingly.
  
> >
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  hw/arm/smmuv3-accel.c | 15 +++++++++++++++
> >  hw/arm/virt.c         | 12 ++++++++++++
> >  2 files changed, 27 insertions(+)
> >
> > diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> > index 0b0ddb03e2..66cd4f5ece 100644
> > --- a/hw/arm/smmuv3-accel.c
> > +++ b/hw/arm/smmuv3-accel.c
> > @@ -10,6 +10,7 @@
> >  #include "qemu/error-report.h"
> >
> >  #include "hw/arm/smmuv3.h"
> > +#include "hw/iommu.h"
> >  #include "hw/pci/pci_bridge.h"
> >  #include "hw/pci-host/gpex.h"
> >  #include "hw/vfio/pci.h"
> > @@ -81,8 +82,22 @@ static AddressSpace
> *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
> >      }
> >  }
> >
> > +static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
> > +{
> > +    /*
> > +     * Accelerated smmuv3 support only allowes Guest S1
> > +     * configuration. Hence report VIOMMU_CAP_STAGE1
> > +     * so that VFIO can create nested parent domain.
> > +     * The real nested support should be reported from host
> the actual nested support at host level will be queried from the host
> later on?

Yes. It is handled by vfio/iommufd. See:
https://lore.kernel.org/qemu-devel/20250822064101.123526-6-zhenzhong.duan@intel.com/

> > +     * SMMUv3 and if it doesn't, the nested parent allocation
> > +     * will fail anyway.
> > +     */
> > +    return VIOMMU_CAP_STAGE1;
> > +}
> > +
> >  static const PCIIOMMUOps smmuv3_accel_ops = {
> >      .get_address_space = smmuv3_accel_find_add_as,
> > +    .get_viommu_cap = smmuv3_accel_get_viommu_cap,
> >  };
> >
> >  void smmuv3_accel_init(SMMUv3State *s)
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 22393cf39e..fdb47eda6a 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -3053,6 +3053,18 @@ static void
> virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> >                  return;
> >              }
> >
> > +            if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
> > +                char *stage;
> > +
> > +                stage = object_property_get_str(OBJECT(dev), "stage",
> > +                                                &error_fatal);
> > +                if (*stage && strcmp("1", stage)) {
> I am not sure you need to check *stage

Ok. I will double check that.

Thanks,
Shameer
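
For reference on the *stage question, here is a small standalone sketch of what
the two forms of the condition accept. The helper names are hypothetical.
Whether the property getter can actually return an empty string when 'stage'
is left unset is an assumption that needs checking against the property
definition.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Mirrors the condition in the patch: the configuration is rejected only
 * when 'stage' is a non-empty string other than "1". An empty string passes.
 */
static bool stage_ok_with_empty_check(const char *stage)
{
    return !(*stage && strcmp("1", stage));
}

/*
 * Variant without the *stage test: anything other than exactly "1",
 * including the empty string, is rejected.
 */
static bool stage_ok_strict(const char *stage)
{
    return strcmp("1", stage) == 0;
}
```

The two helpers differ only for the empty string. If an unset property reads
back as "", dropping the *stage test would force users to pass the stage
explicitly, which is what the commit message wording ("explicitly set to
Stage 1") suggests. Keeping it lets the unset case through.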

Re: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement get_viommu_cap() callback
Posted by Jason Gunthorpe 2 months, 1 week ago
On Mon, Sep 08, 2025 at 08:22:59AM +0000, Shameer Kolothum wrote:
> > nit: strictly speaking couldn't we have a stage2 being used at guest
> > level implemented by a stage1 at physical level?
> > but it is totally fair to restrict the support.
> 
> Yeah it is possible I guess. But then we have to use the S2TTB to configure
> Host SMMUv3 S1 instead of S1ContextPtr which is used now. 

S1 and S2 have different PTE formats, you cannot take a guest S2 table
with S2 PTEs and have the hypervisor program it to a S1.

The guest must see a SMMU with no S2 support in the IDRs.

Jason
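
As a rough sketch of the last point, hiding stage-2 support from the guest
amounts to clearing the S2P bit in the guest-visible SMMU_IDR0 value. The bit
positions below follow the SMMUv3 architecture (S2P is bit 0, S1P is bit 1).
The helper names are hypothetical and this is not how QEMU builds its ID
registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* SMMU_IDR0 translation-stage feature bits. */
#define IDR0_S2P (1u << 0)   /* stage-2 translation supported */
#define IDR0_S1P (1u << 1)   /* stage-1 translation supported */

/* Hypothetical helper: drop stage-2 support from an IDR0 value. */
static uint32_t guest_idr0_hide_s2(uint32_t idr0)
{
    return idr0 & ~IDR0_S2P;
}

/* Report whether an IDR0 value advertises stage-2 support. */
static bool idr0_has_s2(uint32_t idr0)
{
    return (idr0 & IDR0_S2P) != 0;
}
```

A guest that reads an IDR0 without S2P has no reason to build stage-2 tables,
so the hypervisor is never handed S2-format PTEs.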
Re: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement get_viommu_cap() callback
Posted by Eric Auger 2 months ago
Hi Jason, Shameer,

On 9/8/25 3:40 PM, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 08:22:59AM +0000, Shameer Kolothum wrote:
>>> nit: strictly speaking couldn't we have a stage2 being used at guest
>>> level implemented by a stage1 at physical level?
>>> but it is totally fair to restrict the support.
>> Yeah it is possible I guess. But then we have to use the S2TTB to configure
>> Host SMMUv3 S1 instead of S1ContextPtr which is used now. 
> S1 and S2 have different PTE formats, you cannot take a guest S2 table
> with S2 PTEs and have the hypervisor program it to a S1.
>
> The guest must see a SMMU with no S2 support in the IDRs.

Yes, you're right. As the PTE formats are different, we cannot do it.

Sorry for the noise.

Eric
>
> Jason
>
Re: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement get_viommu_cap() callback
Posted by Nicolin Chen 4 months ago
On Mon, Jul 14, 2025 at 04:59:33PM +0100, Shameer Kolothum wrote:
> For accelerated SMMUv3, we need nested parent domain creation. Add the
> callback support so that VFIO can create a nested parent.
> 
> Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
> in Stage 1 mode, ensure that the 'stage' property is explicitly set to
> Stage 1.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>

> @@ -81,8 +82,22 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
>      }
>  }
>  
> +static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
> +{
> +    /*
> +     * Accelerated smmuv3 support only allowes Guest S1

s/allowes/allows

> +     * configuration. Hence report VIOMMU_CAP_STAGE1
> +     * so that VFIO can create nested parent domain.

Aligning with the kernel uAPI docs:

s/nested/a nesting

> +     * The real nested support should be reported from host

"The real HW nested stage-1 translation must be supported by the .."

> +     * SMMUv3 and if it doesn't, the nested parent allocation

s/nested/nesting

> +     * will fail anyway.
> +     */

And I think the lines are wrapped a bit too early. Doesn't QEMU allow
up to 80 characters per line?

Thanks
Nicolin