RE: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3

Shameerali Kolothum Thodi via posted 5 patches 9 months, 2 weeks ago
Only 0 patches received!
There is a newer version of this series
RE: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Posted by Shameerali Kolothum Thodi via 9 months, 2 weeks ago

> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, January 31, 2025 2:54 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org; eric.auger@redhat.com;
> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
> Jonathan Cameron <jonathan.cameron@huawei.com>;
> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> nested SMMUv3
> 
> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
> wrote:
> 
> > > > And Qemu does some checking to make sure that the device is indeed
> > > associated
> > > > with the specified phys-smmuv3.  This can be done going through the
> > > sysfs path checking
> > > > which is what I guess libvirt is currently doing to populate the
> topology.
> > > So basically
> > > > Qemu is just replicating that to validate again.
> > >
> > > I would prefer that iommufd users not have to go out to sysfs..
> > >
> > > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to
> > > return the phys
> > > > smmuv3 base address which can avoid going through the sysfs.
> > >
> > > It also doesn't seem great to expose a physical address. But we could
> > > have an 'iommu instance id' that was a unique small integer?
> >
> > Ok. But how the user space can map that to the device?
> 
> Why does it need to?
> 
> libvirt picks some label for the vsmmu instance, it doesn't matter
> what the string is.
> 
> qemu validates that all of the vsmmu instances are only linked to PCI
> device that have the same iommu ID. This is already happening in the
> kernel, it will fail attaches to mismatched instances.
> 
> Nothing further is needed?

-device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
-device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
-device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \

-device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
-device arm-smmuv3-accel,pci-bus=pcie.2,id=smmuv2 \
-device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \

I think it works from a functionality point of view. A  particular
instance of arm-smmuv3-accel(say id=smmuv1) can only have devices attached
to the same phys smmuv3 "iommu instance id"

But not sure from a libvirt/Qemu interface point of view[0] the concerns
are addressed. Daniel/Nathan?

Thanks,
Shameer
https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/X6R52JRBYDFZ5PSJFR534A655UZ3RHKN/
Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Posted by Eric Auger 9 months, 2 weeks ago
Hi,


On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Jason Gunthorpe <jgg@nvidia.com>
>> Sent: Friday, January 31, 2025 2:54 PM
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
>> qemu-devel@nongnu.org; eric.auger@redhat.com;
>> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
>> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
>> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
>> Jonathan Cameron <jonathan.cameron@huawei.com>;
>> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
>> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
>> nested SMMUv3
>>
>> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
>> wrote:
>>
>>>>> And Qemu does some checking to make sure that the device is indeed
>>>> associated
>>>>> with the specified phys-smmuv3.  This can be done going through the
>>>> sysfs path checking
>>>>> which is what I guess libvirt is currently doing to populate the
>> topology.
>>>> So basically
>>>>> Qemu is just replicating that to validate again.
>>>> I would prefer that iommufd users not have to go out to sysfs..
>>>>
>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to
>>>> return the phys
>>>>> smmuv3 base address which can avoid going through the sysfs.
>>>> It also doesn't seem great to expose a physical address. But we could
>>>> have an 'iommu instance id' that was a unique small integer?
>>> Ok. But how the user space can map that to the device?
>> Why does it need to?
>>
>> libvirt picks some label for the vsmmu instance, it doesn't matter
>> what the string is.
>>
>> qemu validates that all of the vsmmu instances are only linked to PCI
>> device that have the same iommu ID. This is already happening in the
>> kernel, it will fail attaches to mismatched instances.
>>
>> Nothing further is needed?
> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
I don't get what is the point of adding such an id if it is not
referenced anywhere?

Eric
> -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \
>
> -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
> -device arm-smmuv3-accel,pci-bus=pcie.2,id=smmuv2 \
> -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \
>
> I think it works from a functionality point of view. A  particular
> instance of arm-smmuv3-accel(say id=smmuv1) can only have devices attached
> to the same phys smmuv3 "iommu instance id"
>
> But not sure from a libvirt/Qemu interface point of view[0] the concerns
> are addressed. Daniel/Nathan?
>
> Thanks,
> Shameer
> https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/X6R52JRBYDFZ5PSJFR534A655UZ3RHKN/
>


Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Posted by Daniel P. Berrangé 9 months, 1 week ago
On Fri, Jan 31, 2025 at 05:08:28PM +0100, Eric Auger wrote:
> Hi,
> 
> 
> On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote:
> >
> >> -----Original Message-----
> >> From: Jason Gunthorpe <jgg@nvidia.com>
> >> Sent: Friday, January 31, 2025 2:54 PM
> >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> >> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
> >> qemu-devel@nongnu.org; eric.auger@redhat.com;
> >> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
> >> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
> >> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
> >> Jonathan Cameron <jonathan.cameron@huawei.com>;
> >> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
> >> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> >> nested SMMUv3
> >>
> >> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
> >> wrote:
> >>
> >>>>> And Qemu does some checking to make sure that the device is indeed
> >>>> associated
> >>>>> with the specified phys-smmuv3.  This can be done going through the
> >>>> sysfs path checking
> >>>>> which is what I guess libvirt is currently doing to populate the
> >> topology.
> >>>> So basically
> >>>>> Qemu is just replicating that to validate again.
> >>>> I would prefer that iommufd users not have to go out to sysfs..
> >>>>
> >>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to
> >>>> return the phys
> >>>>> smmuv3 base address which can avoid going through the sysfs.
> >>>> It also doesn't seem great to expose a physical address. But we could
> >>>> have an 'iommu instance id' that was a unique small integer?
> >>> Ok. But how the user space can map that to the device?
> >> Why does it need to?
> >>
> >> libvirt picks some label for the vsmmu instance, it doesn't matter
> >> what the string is.
> >>
> >> qemu validates that all of the vsmmu instances are only linked to PCI
> >> device that have the same iommu ID. This is already happening in the
> >> kernel, it will fail attaches to mismatched instances.
> >>
> >> Nothing further is needed?
> > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
> I don't get what is the point of adding such an id if it is not
> referenced anywhere?

Every QDev device instance has an 'id' property - if you don't
set one explicitly, QEMU will generate one internally. Libvirt
will always set the 'id' property to avoid the internal auto-
generated IDs, as it wants full knowledge of naming.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Posted by Eric Auger 9 months, 1 week ago


On 2/6/25 9:53 AM, Daniel P. Berrangé wrote:
> On Fri, Jan 31, 2025 at 05:08:28PM +0100, Eric Auger wrote:
>> Hi,
>>
>>
>> On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote:
>>>> -----Original Message-----
>>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>>> Sent: Friday, January 31, 2025 2:54 PM
>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>>>> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
>>>> qemu-devel@nongnu.org; eric.auger@redhat.com;
>>>> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
>>>> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
>>>> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
>>>> Jonathan Cameron <jonathan.cameron@huawei.com>;
>>>> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
>>>> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
>>>> nested SMMUv3
>>>>
>>>> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
>>>> wrote:
>>>>
>>>>>>> And Qemu does some checking to make sure that the device is indeed
>>>>>> associated
>>>>>>> with the specified phys-smmuv3.  This can be done going through the
>>>>>> sysfs path checking
>>>>>>> which is what I guess libvirt is currently doing to populate the
>>>> topology.
>>>>>> So basically
>>>>>>> Qemu is just replicating that to validate again.
>>>>>> I would prefer that iommufd users not have to go out to sysfs..
>>>>>>
>>>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to
>>>>>> return the phys
>>>>>>> smmuv3 base address which can avoid going through the sysfs.
>>>>>> It also doesn't seem great to expose a physical address. But we could
>>>>>> have an 'iommu instance id' that was a unique small integer?
>>>>> Ok. But how the user space can map that to the device?
>>>> Why does it need to?
>>>>
>>>> libvirt picks some label for the vsmmu instance, it doesn't matter
>>>> what the string is.
>>>>
>>>> qemu validates that all of the vsmmu instances are only linked to PCI
>>>> device that have the same iommu ID. This is already happening in the
>>>> kernel, it will fail attaches to mismatched instances.
>>>>
>>>> Nothing further is needed?
>>> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
>>> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
>>> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
>> I don't get what is the point of adding such an id if it is not
>> referenced anywhere?
> Every QDev device instance has an 'id' property - if you don't
> set one explicitly, QEMU will generate one internally. Libvirt
> will always set the 'id' property to avoid the internal auto-
> generated IDs, as it wants full knowledge of naming.

OK thank you for the explanation

Eric
>
> With regards,
> Daniel


Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Posted by Nathan Chen 9 months, 1 week ago

On 1/31/2025 8:08 AM, Eric Auger wrote:
>>>>>> And Qemu does some checking to make sure that the device is indeed
>>>>> associated
>>>>>> with the specified phys-smmuv3.  This can be done going through the
>>>>> sysfs path checking
>>>>>> which is what I guess libvirt is currently doing to populate the
>>> topology.
>>>>> So basically
>>>>>> Qemu is just replicating that to validate again.
>>>>> I would prefer that iommufd users not have to go out to sysfs..
>>>>>
>>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to
>>>>> return the phys
>>>>>> smmuv3 base address which can avoid going through the sysfs.
>>>>> It also doesn't seem great to expose a physical address. But we could
>>>>> have an 'iommu instance id' that was a unique small integer?
>>>> Ok. But how the user space can map that to the device?
>>> Why does it need to?
>>>
>>> libvirt picks some label for the vsmmu instance, it doesn't matter
>>> what the string is.
>>>
>>> qemu validates that all of the vsmmu instances are only linked to PCI
>>> device that have the same iommu ID. This is already happening in the
>>> kernel, it will fail attaches to mismatched instances.
>>>
>>> Nothing further is needed?
>> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
>> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
>> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
> I don't get what is the point of adding such an id if it is not
> referenced anywhere?
> 
> Eric

Daniel mentions that the host-to-guest SMMU pairing must be chosen such 
that it makes conceptual sense w.r.t. the guest NUMA to host NUMA 
pairing [0]. The current implementation allows for incorrect host to 
guest numa node pairings, e.g. pSMMU has affinity to host numa node 0, 
but it’s paired with a vSMMU paired with a guest numa node pinned to 
host numa node 1.

By specifying the host SMMU id, we can explicitly pair a host SMMU with 
a guest SMMU associated with the correct PXB NUMA node, vs. implying the 
host-to-guest SMMU pairing based on what devices are attached to the 
PXB. While it would not completely prevent the incorrect pSMMU/vSMMU 
pairing w.r.t. host to guest numa node pairings, specifying the pSMMU id 
would make the implications of host to guest numa node pairings more 
clear when specifying a vSMMU instance.

 From the libvirt discussion with Daniel [1], he also states "libvirt's 
goal has always been to make everything that's functionally impacting a 
guest device be 100% explicit. So I don't think we should be implying 
mappings to the host SMMU in QEMU at all, QEMU must be told what to map 
to." Specifying the id would be a means of explicitly specifying host to 
guest SMMU mapping instead of implying the mapping.

[0] https://lore.kernel.org/qemu-devel/Z51DmtP83741RAsb@redhat.com/
[1] 
https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/7GDT6RX5LPAJMPP4ZSC4ACME6GVMG236/#X6R52JRBYDFZ5PSJFR534A655UZ3RHKN

Thanks,
Nathan

Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable nested SMMUv3
Posted by Daniel P. Berrangé 9 months, 1 week ago
On Wed, Feb 05, 2025 at 12:53:42PM -0800, Nathan Chen wrote:
> 
> 
> On 1/31/2025 8:08 AM, Eric Auger wrote:
> > > > > > > And Qemu does some checking to make sure that the device is indeed
> > > > > > associated
> > > > > > > with the specified phys-smmuv3.  This can be done going through the
> > > > > > sysfs path checking
> > > > > > > which is what I guess libvirt is currently doing to populate the
> > > > topology.
> > > > > > So basically
> > > > > > > Qemu is just replicating that to validate again.
> > > > > > I would prefer that iommufd users not have to go out to sysfs..
> > > > > > 
> > > > > > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to
> > > > > > return the phys
> > > > > > > smmuv3 base address which can avoid going through the sysfs.
> > > > > > It also doesn't seem great to expose a physical address. But we could
> > > > > > have an 'iommu instance id' that was a unique small integer?
> > > > > Ok. But how the user space can map that to the device?
> > > > Why does it need to?
> > > > 
> > > > libvirt picks some label for the vsmmu instance, it doesn't matter
> > > > what the string is.
> > > > 
> > > > qemu validates that all of the vsmmu instances are only linked to PCI
> > > > device that have the same iommu ID. This is already happening in the
> > > > kernel, it will fail attaches to mismatched instances.
> > > > 
> > > > Nothing further is needed?
> > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> > > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
> > I don't get what is the point of adding such an id if it is not
> > referenced anywhere?
> > 
> > Eric
> 
> Daniel mentions that the host-to-guest SMMU pairing must be chosen such that
> it makes conceptual sense w.r.t. the guest NUMA to host NUMA pairing [0].
> The current implementation allows for incorrect host to guest numa node
> pairings, e.g. pSMMU has affinity to host numa node 0, but it’s paired with
> a vSMMU paired with a guest numa node pinned to host numa node 1.
> 
> By specifying the host SMMU id, we can explicitly pair a host SMMU with a
> guest SMMU associated with the correct PXB NUMA node, vs. implying the
> host-to-guest SMMU pairing based on what devices are attached to the PXB.
> While it would not completely prevent the incorrect pSMMU/vSMMU pairing
> w.r.t. host to guest numa node pairings, specifying the pSMMU id would make
> the implications of host to guest numa node pairings more clear when
> specifying a vSMMU instance.

You've not specified any host SMMU id in the above CLI args though,
only the PXB association.

It needs something like

 -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1,host-smmu=XXXXX

where 'XXXX' is some value to identify the host SMMU

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|