For accelerated SMMUv3, we need nested parent domain creation. Add the
callback support so that VFIO can create a nested parent.
Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
in Stage 1 mode, ensure that the 'stage' property is explicitly set to
Stage 1.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
hw/arm/smmuv3-accel.c | 15 +++++++++++++++
hw/arm/virt.c | 12 ++++++++++++
2 files changed, 27 insertions(+)
diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
index 0b0ddb03e2..66cd4f5ece 100644
--- a/hw/arm/smmuv3-accel.c
+++ b/hw/arm/smmuv3-accel.c
@@ -10,6 +10,7 @@
#include "qemu/error-report.h"
#include "hw/arm/smmuv3.h"
+#include "hw/iommu.h"
#include "hw/pci/pci_bridge.h"
#include "hw/pci-host/gpex.h"
#include "hw/vfio/pci.h"
@@ -81,8 +82,22 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
}
}
+static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
+{
+ /*
+ * Accelerated smmuv3 support only allowes Guest S1
+ * configuration. Hence report VIOMMU_CAP_STAGE1
+ * so that VFIO can create nested parent domain.
+ * The real nested support should be reported from host
+ * SMMUv3 and if it doesn't, the nested parent allocation
+ * will fail anyway.
+ */
+ return VIOMMU_CAP_STAGE1;
+}
+
static const PCIIOMMUOps smmuv3_accel_ops = {
.get_address_space = smmuv3_accel_find_add_as,
+ .get_viommu_cap = smmuv3_accel_get_viommu_cap,
};
void smmuv3_accel_init(SMMUv3State *s)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 22393cf39e..fdb47eda6a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3053,6 +3053,18 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
return;
}
+ if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
+ char *stage;
+
+ stage = object_property_get_str(OBJECT(dev), "stage",
+ &error_fatal);
+ if (*stage && strcmp("1", stage)) {
+ error_setg(errp, "Only stage1 is supported for SMMUV3 with "
+ "accel=on");
+ return;
+ }
+ }
+
create_smmuv3_dev_dtb(vms, dev, bus);
}
}
--
2.34.1
On 7/14/25 5:59 PM, Shameer Kolothum wrote:
> For accelerated SMMUv3, we need nested parent domain creation. Add the
> callback support so that VFIO can create a nested parent.
>
> Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
> in Stage 1 mode, ensure that the 'stage' property is explicitly set to
> Stage 1.
nit: strictly speaking couldn't we have a stage2 being used at guest
level implemented by a stage1 at physical level?
but it is totally fair to restrict the support.
>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
> hw/arm/smmuv3-accel.c | 15 +++++++++++++++
> hw/arm/virt.c | 12 ++++++++++++
> 2 files changed, 27 insertions(+)
>
> diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> index 0b0ddb03e2..66cd4f5ece 100644
> --- a/hw/arm/smmuv3-accel.c
> +++ b/hw/arm/smmuv3-accel.c
> @@ -10,6 +10,7 @@
> #include "qemu/error-report.h"
>
> #include "hw/arm/smmuv3.h"
> +#include "hw/iommu.h"
> #include "hw/pci/pci_bridge.h"
> #include "hw/pci-host/gpex.h"
> #include "hw/vfio/pci.h"
> @@ -81,8 +82,22 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
> }
> }
>
> +static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
> +{
> + /*
> + * Accelerated smmuv3 support only allowes Guest S1
> + * configuration. Hence report VIOMMU_CAP_STAGE1
> + * so that VFIO can create nested parent domain.
> + * The real nested support should be reported from host
the actual nested support at host level will be queried from the host
later on?
> + * SMMUv3 and if it doesn't, the nested parent allocation
> + * will fail anyway.
> + */
> + return VIOMMU_CAP_STAGE1;
> +}
> +
> static const PCIIOMMUOps smmuv3_accel_ops = {
> .get_address_space = smmuv3_accel_find_add_as,
> + .get_viommu_cap = smmuv3_accel_get_viommu_cap,
> };
>
> void smmuv3_accel_init(SMMUv3State *s)
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 22393cf39e..fdb47eda6a 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -3053,6 +3053,18 @@ static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> return;
> }
>
> + if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
> + char *stage;
> +
> + stage = object_property_get_str(OBJECT(dev), "stage",
> + &error_fatal);
> + if (*stage && strcmp("1", stage)) {
I am not sure you need to check *stage
> + error_setg(errp, "Only stage1 is supported for SMMUV3 with "
> + "accel=on");
> + return;
> + }
> + }
> +
> create_smmuv3_dev_dtb(vms, dev, bus);
> }
> }
Thanks
Eric
> -----Original Message-----
> From: Eric Auger <eric.auger@redhat.com>
> Sent: 05 September 2025 09:50
> To: qemu-arm@nongnu.org; qemu-devel@nongnu.org; Shameer Kolothum
> <skolothumtho@nvidia.com>
> Cc: peter.maydell@linaro.org; Jason Gunthorpe <jgg@nvidia.com>; Nicolin
> Chen <nicolinc@nvidia.com>; ddutile@redhat.com; berrange@redhat.com;
> Nathan Chen <nathanc@nvidia.com>; Matt Ochs <mochs@nvidia.com>;
> smostafa@google.com; linuxarm@huawei.com; wangzhou1@hisilicon.com;
> jiangkunkun@huawei.com; jonathan.cameron@huawei.com;
> zhangfei.gao@linaro.org; zhenzhong.duan@intel.com;
> shameerkolothum@gmail.com
> Subject: Re: [RFC PATCH v3 07/15] hw/arm/smmuv3: Implement
> get_viommu_cap() callback
>
>
> On 7/14/25 5:59 PM, Shameer Kolothum wrote:
> > For accelerated SMMUv3, we need nested parent domain creation. Add the
> > callback support so that VFIO can create a nested parent.
> >
> > Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
> > in Stage 1 mode, ensure that the 'stage' property is explicitly set to
> > Stage 1.
> nit: strictly speaking couldn't we have a stage2 being used at guest
> level implemented by a stage1 at physical level?
> but it is totally fair to restrict the support.
Yeah it is possible I guess. But then we have to use the S2TTB to configure
Host SMMUv3 S1 instead of S1ContextPtr which is used now. I have not
tried that yet. Since the S2 stage is more restrictive in nature (e.g. no PASID
possible), I don't think we should support that for the accel case.
I will change the commit log accordingly.
> >
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > ---
> > hw/arm/smmuv3-accel.c | 15 +++++++++++++++
> > hw/arm/virt.c | 12 ++++++++++++
> > 2 files changed, 27 insertions(+)
> >
> > diff --git a/hw/arm/smmuv3-accel.c b/hw/arm/smmuv3-accel.c
> > index 0b0ddb03e2..66cd4f5ece 100644
> > --- a/hw/arm/smmuv3-accel.c
> > +++ b/hw/arm/smmuv3-accel.c
> > @@ -10,6 +10,7 @@
> > #include "qemu/error-report.h"
> >
> > #include "hw/arm/smmuv3.h"
> > +#include "hw/iommu.h"
> > #include "hw/pci/pci_bridge.h"
> > #include "hw/pci-host/gpex.h"
> > #include "hw/vfio/pci.h"
> > @@ -81,8 +82,22 @@ static AddressSpace
> *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
> > }
> > }
> >
> > +static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
> > +{
> > + /*
> > + * Accelerated smmuv3 support only allowes Guest S1
> > + * configuration. Hence report VIOMMU_CAP_STAGE1
> > + * so that VFIO can create nested parent domain.
> > + * The real nested support should be reported from host
> the actual nested support at host level will be queried from the host
> later on?
Yes. It is handled by vfio/iommufd. See this,
https://lore.kernel.org/qemu-devel/20250822064101.123526-6-zhenzhong.duan@intel.com/
> > + * SMMUv3 and if it doesn't, the nested parent allocation
> > + * will fail anyway.
> > + */
> > + return VIOMMU_CAP_STAGE1;
> > +}
> > +
> > static const PCIIOMMUOps smmuv3_accel_ops = {
> > .get_address_space = smmuv3_accel_find_add_as,
> > + .get_viommu_cap = smmuv3_accel_get_viommu_cap,
> > };
> >
> > void smmuv3_accel_init(SMMUv3State *s)
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 22393cf39e..fdb47eda6a 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -3053,6 +3053,18 @@ static void
> virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
> > return;
> > }
> >
> > + if (object_property_get_bool(OBJECT(dev), "accel", &error_abort)) {
> > + char *stage;
> > +
> > + stage = object_property_get_str(OBJECT(dev), "stage",
> > + &error_fatal);
> > + if (*stage && strcmp("1", stage)) {
> I am not sure you need to check *stage
Ok. I will double check that.
Thanks,
Shameer
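To make the question about the `*stage` test concrete, here is a minimal standalone sketch of the condition from the virt.c hunk, inverted so that it returns true when the configuration is accepted. The helper names are hypothetical and not part of the patch, and the sketch assumes that an unset 'stage' property reads back as an empty string.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): accepts the same inputs that
 * the hunk's "*stage && strcmp("1", stage)" test does NOT reject.
 */
bool accel_stage_is_acceptable(const char *stage)
{
    /* Assumption: an empty string means the property was left unset. */
    if (*stage == '\0') {
        return true;
    }
    /* Otherwise only an explicit "1" is accepted. */
    return strcmp(stage, "1") == 0;
}

/* The same predicate with the "*stage" test dropped, for comparison. */
bool accel_stage_is_acceptable_strict(const char *stage)
{
    return strcmp(stage, "1") == 0;
}
```

Under that assumption, the only behavioural difference between the two variants is the empty string: with the `*stage` test an unset property is let through, without it the user would have to pass stage=1 explicitly. Separately, object_property_get_str() returns an allocated string that the caller is expected to free, so the hunk as posted appears to leak `stage`; declaring it g_autofree would be one way to address that.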
On Mon, Sep 08, 2025 at 08:22:59AM +0000, Shameer Kolothum wrote:
> > nit: strictly speaking couldn't we have a stage2 being used at guest
> > level implemented by a stage1 at physical level?
> > but it is totally fair to restrict the support.
>
> Yeah it is possible I guess. But then we have to use the S2TTB to configure
> Host SMMUv3 S1 instead of S1ContextPtr which is used now.
S1 and S2 have different PTE formats, you cannot take a guest S2 table
with S2 PTEs and have the hypervisor program it to a S1.
The guest must see a SMMU with no S2 support in the IDRs.
Jason
Hi Jason, Shameer,
On 9/8/25 3:40 PM, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 08:22:59AM +0000, Shameer Kolothum wrote:
>>> nit: strictly speaking couldn't we have a stage2 being used at guest
>>> level implemented by a stage1 at physical level?
>>> but it is totally fair to restrict the support.
>> Yeah it is possible I guess. But then we have to use the S2TTB to configure
>> Host SMMUv3 S1 instead of S1ContextPtr which is used now.
> S1 and S2 have different PTE formats, you cannot take a guest S2 table
> with S2 PTEs and have the hypervisor program it to a S1.
>
> The guest must see a SMMU with no S2 support in the IDRs.
Yes you're right. As PTE are different we cannot do it. Sorry for the noise.
Eric
>
> Jason
>
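As an illustration of "no S2 support in the IDRs", the sketch below clears the stage-2 bit from an IDR0 value and checks that only stage 1 remains advertised. The bit positions (S2P at bit 0, S1P at bit 1 of SMMU_IDR0) are given as recalled from the SMMUv3 architecture and should be checked against the specification; the macro and function names are hypothetical and are not QEMU's own register accessors.

```c
#include <stdbool.h>
#include <stdint.h>

/* SMMU_IDR0 stage-support bits (positions assumed, see note above). */
#define IDR0_S2P (UINT32_C(1) << 0) /* stage 2 translation supported */
#define IDR0_S1P (UINT32_C(1) << 1) /* stage 1 translation supported */

/* Return the IDR0 value with stage-2 support hidden from the guest. */
uint32_t idr0_hide_stage2(uint32_t idr0)
{
    return idr0 & ~IDR0_S2P;
}

/* True when the value advertises stage 1 and does not advertise stage 2. */
bool idr0_is_stage1_only(uint32_t idr0)
{
    return (idr0 & IDR0_S1P) != 0 && (idr0 & IDR0_S2P) == 0;
}
```

A guest that reads such a value has no reason to build stage-2 tables, which avoids the PTE format mismatch described above.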
On Mon, Jul 14, 2025 at 04:59:33PM +0100, Shameer Kolothum wrote:
> For accelerated SMMUv3, we need nested parent domain creation. Add the
> callback support so that VFIO can create a nested parent.
>
> Since 'accel=on' for SMMUv3 requires the guest SMMUv3 to be configured
> in Stage 1 mode, ensure that the 'stage' property is explicitly set to
> Stage 1.
>
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
> @@ -81,8 +82,22 @@ static AddressSpace *smmuv3_accel_find_add_as(PCIBus *bus, void *opaque,
> }
> }
>
> +static uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
> +{
> + /*
> + * Accelerated smmuv3 support only allowes Guest S1
s/allowes/allows
> + * configuration. Hence report VIOMMU_CAP_STAGE1
> + * so that VFIO can create nested parent domain.
Aligning with the kernel uAPI docs:
s/nested/a nesting
> + * The real nested support should be reported from host
"The real HW nested stage-1 translation must be supported by the .."
> + * SMMUv3 and if it doesn't, the nested parent allocation
s/nested/nesting
> + * will fail anyway.
> + */
And I think the lines are wrapped a bit too early. Should QEMU allow
up-to-80 characters?
Thanks
Nicolin
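Taken together, the wording and wrapping suggestions in this reply could give a comment along the following lines. This is only one possible reading, not the text of any later revision: the third suggestion is elided with ".." in the review, and "host" is filled in here from the original comment. The VIOMMU_CAP_STAGE1 value below is a stand-in so that the fragment compiles on its own; the real definition comes from "hw/iommu.h".

```c
#include <stdint.h>

/* Stand-in for illustration only; the real flag lives in "hw/iommu.h". */
#define VIOMMU_CAP_STAGE1 (UINT64_C(1) << 0)

uint64_t smmuv3_accel_get_viommu_cap(void *opaque)
{
    /*
     * Accelerated smmuv3 support only allows Guest S1 configuration. Hence
     * report VIOMMU_CAP_STAGE1 so that VFIO can create a nesting parent
     * domain. The real HW nested stage-1 translation must be supported by
     * the host SMMUv3 and if it doesn't, the nesting parent allocation will
     * fail anyway.
     */
    (void)opaque; /* unused in this sketch */
    return VIOMMU_CAP_STAGE1;
}
```

The function is static in the patch; it is left non-static here only so the fragment can be compiled and called in isolation.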