> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, January 31, 2025 2:24 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org; eric.auger@redhat.com;
> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
> Jonathan Cameron <jonathan.cameron@huawei.com>;
> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> nested SMMUv3
>
> On Fri, Jan 31, 2025 at 09:33:16AM +0000, Shameerali Kolothum Thodi
> wrote:
>
> > And Qemu does some checking to make sure that the device is indeed
> > associated with the specified phys-smmuv3. This can be done going
> > through the sysfs path checking which is what I guess libvirt is
> > currently doing to populate the topology. So basically Qemu is just
> > replicating that to validate again.
>
> I would prefer that iommufd users not have to go out to sysfs..
>
> > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
> > the phys smmuv3 base address which can avoid going through the sysfs.
>
> It also doesn't seem great to expose a physical address. But we could
> have an 'iommu instance id' that was a unique small integer?

Ok. But how the user space can map that to the device?

Something like,
/sys/bus/pci/devices/0000:7d:00.1/iommu/instance.X ?

Thanks,
Shameer
On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi wrote:

> > > And Qemu does some checking to make sure that the device is indeed
> > > associated with the specified phys-smmuv3. This can be done going
> > > through the sysfs path checking which is what I guess libvirt is
> > > currently doing to populate the topology. So basically Qemu is just
> > > replicating that to validate again.
> >
> > I would prefer that iommufd users not have to go out to sysfs..
> >
> > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
> > > the phys smmuv3 base address which can avoid going through the sysfs.
> >
> > It also doesn't seem great to expose a physical address. But we could
> > have an 'iommu instance id' that was a unique small integer?
>
> Ok. But how the user space can map that to the device?

Why does it need to?

libvirt picks some label for the vsmmu instance, it doesn't matter
what the string is.

qemu validates that all of the vsmmu instances are only linked to PCI
device that have the same iommu ID. This is already happening in the
kernel, it will fail attaches to mismatched instances.

Nothing further is needed?

Jason
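The consistency rule described above (every device behind one vSMMU instance must share the same physical IOMMU) can be sketched roughly as follows. This is only an illustration, not QEMU or kernel code: the 'iommu instance id' is a proposal in this thread rather than an existing interface, and the function name `validate_vsmmu_groups`, the two input mappings, and the example ids are all hypothetical.

```python
from collections import defaultdict


def validate_vsmmu_groups(assignments, iommu_id_of):
    """Report vSMMU labels whose devices span several physical IOMMUs.

    assignments: dict of device BDF -> vSMMU label (the label is arbitrary).
    iommu_id_of: dict of device BDF -> hypothetical 'iommu instance id'.
    Returns {label: sorted list of ids} for inconsistent labels only,
    so an empty dict means every vSMMU is consistent.
    """
    ids_per_vsmmu = defaultdict(set)
    for bdf, label in assignments.items():
        ids_per_vsmmu[label].add(iommu_id_of[bdf])
    return {label: sorted(ids)
            for label, ids in ids_per_vsmmu.items() if len(ids) > 1}


# Hypothetical example data: made-up BDFs and instance ids.
iommu_id_of = {"0000:7d:02.1": 0, "0000:75:00.1": 1}

# One vSMMU per physical IOMMU: consistent.
ok = validate_vsmmu_groups(
    {"0000:7d:02.1": "smmuv1", "0000:75:00.1": "smmuv2"}, iommu_id_of)

# Both devices behind the same vSMMU: mismatch is reported.
bad = validate_vsmmu_groups(
    {"0000:7d:02.1": "smmuv1", "0000:75:00.1": "smmuv1"}, iommu_id_of)

print(ok)   # {}
print(bad)  # {'smmuv1': [0, 1]}
```

Note that the label itself carries no meaning in this sketch; only the grouping of devices under a label is checked, which matches the point that the string chosen by libvirt does not matter.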
> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, January 31, 2025 2:54 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org; eric.auger@redhat.com;
> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
> Jonathan Cameron <jonathan.cameron@huawei.com>;
> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> nested SMMUv3
>
> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
> wrote:
>
> > > > And Qemu does some checking to make sure that the device is indeed
> > > > associated with the specified phys-smmuv3. This can be done going
> > > > through the sysfs path checking which is what I guess libvirt is
> > > > currently doing to populate the topology. So basically Qemu is just
> > > > replicating that to validate again.
> > >
> > > I would prefer that iommufd users not have to go out to sysfs..
> > >
> > > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
> > > > the phys smmuv3 base address which can avoid going through the sysfs.
> > >
> > > It also doesn't seem great to expose a physical address. But we could
> > > have an 'iommu instance id' that was a unique small integer?
> >
> > Ok. But how the user space can map that to the device?
>
> Why does it need to?
>
> libvirt picks some label for the vsmmu instance, it doesn't matter
> what the string is.
>
> qemu validates that all of the vsmmu instances are only linked to PCI
> device that have the same iommu ID. This is already happening in the
> kernel, it will fail attaches to mismatched instances.
>
> Nothing further is needed?

-device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
-device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
-device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \

-device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
-device arm-smmuv3-accel,pci-bus=pcie.2,id=smmuv2 \
-device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \

I think it works from a functionality point of view. A particular
instance of arm-smmuv3-accel(say id=smmuv1) can only have devices attached
to the same phys smmuv3 "iommu instance id"

But not sure from a libvirt/Qemu interface point of view[0] the concerns
are addressed. Daniel/Nathan?

Thanks,
Shameer

[0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/X6R52JRBYDFZ5PSJFR534A655UZ3RHKN/
Hi,

On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote:
>
>> -----Original Message-----
>> From: Jason Gunthorpe <jgg@nvidia.com>
>> Sent: Friday, January 31, 2025 2:54 PM
>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
>> qemu-devel@nongnu.org; eric.auger@redhat.com;
>> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
>> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
>> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
>> Jonathan Cameron <jonathan.cameron@huawei.com>;
>> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
>> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
>> nested SMMUv3
>>
>> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
>> wrote:
>>
>>>>> And Qemu does some checking to make sure that the device is indeed
>>>>> associated with the specified phys-smmuv3. This can be done going
>>>>> through the sysfs path checking which is what I guess libvirt is
>>>>> currently doing to populate the topology. So basically Qemu is just
>>>>> replicating that to validate again.
>>>>
>>>> I would prefer that iommufd users not have to go out to sysfs..
>>>>
>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
>>>>> the phys smmuv3 base address which can avoid going through the sysfs.
>>>>
>>>> It also doesn't seem great to expose a physical address. But we could
>>>> have an 'iommu instance id' that was a unique small integer?
>>>
>>> Ok. But how the user space can map that to the device?
>>
>> Why does it need to?
>>
>> libvirt picks some label for the vsmmu instance, it doesn't matter
>> what the string is.
>>
>> qemu validates that all of the vsmmu instances are only linked to PCI
>> device that have the same iommu ID. This is already happening in the
>> kernel, it will fail attaches to mismatched instances.
>>
>> Nothing further is needed?
>
> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \

I don't get what is the point of adding such an id if it is not
referenced anywhere?

Eric

> -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \
>
> -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \
> -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
> -device arm-smmuv3-accel,pci-bus=pcie.2,id=smmuv2 \
> -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \
>
> I think it works from a functionality point of view. A particular
> instance of arm-smmuv3-accel(say id=smmuv1) can only have devices attached
> to the same phys smmuv3 "iommu instance id"
>
> But not sure from a libvirt/Qemu interface point of view[0] the concerns
> are addressed. Daniel/Nathan?
>
> Thanks,
> Shameer
>
> [0] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/X6R52JRBYDFZ5PSJFR534A655UZ3RHKN/
On Fri, Jan 31, 2025 at 05:08:28PM +0100, Eric Auger wrote:
> Hi,
>
> On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote:
> >
> >> -----Original Message-----
> >> From: Jason Gunthorpe <jgg@nvidia.com>
> >> Sent: Friday, January 31, 2025 2:54 PM
> >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
> >> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
> >> qemu-devel@nongnu.org; eric.auger@redhat.com;
> >> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
> >> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
> >> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
> >> Jonathan Cameron <jonathan.cameron@huawei.com>;
> >> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
> >> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> >> nested SMMUv3
> >>
> >> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
> >> wrote:
> >>
> >>>>> And Qemu does some checking to make sure that the device is indeed
> >>>>> associated with the specified phys-smmuv3. This can be done going
> >>>>> through the sysfs path checking which is what I guess libvirt is
> >>>>> currently doing to populate the topology. So basically Qemu is just
> >>>>> replicating that to validate again.
> >>>>
> >>>> I would prefer that iommufd users not have to go out to sysfs..
> >>>>
> >>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
> >>>>> the phys smmuv3 base address which can avoid going through the sysfs.
> >>>>
> >>>> It also doesn't seem great to expose a physical address. But we could
> >>>> have an 'iommu instance id' that was a unique small integer?
> >>>
> >>> Ok. But how the user space can map that to the device?
> >>
> >> Why does it need to?
> >>
> >> libvirt picks some label for the vsmmu instance, it doesn't matter
> >> what the string is.
> >>
> >> qemu validates that all of the vsmmu instances are only linked to PCI
> >> device that have the same iommu ID. This is already happening in the
> >> kernel, it will fail attaches to mismatched instances.
> >>
> >> Nothing further is needed?
> >
> > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
>
> I don't get what is the point of adding such an id if it is not
> referenced anywhere?

Every QDev device instance has an 'id' property - if you don't
set one explicitly, QEMU will generate one internally. Libvirt
will always set the 'id' property to avoid the internal
auto-generated IDs, as it wants full knowledge of naming.

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 2/6/25 9:53 AM, Daniel P. Berrangé wrote:
> On Fri, Jan 31, 2025 at 05:08:28PM +0100, Eric Auger wrote:
>> Hi,
>>
>> On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote:
>>>> -----Original Message-----
>>>> From: Jason Gunthorpe <jgg@nvidia.com>
>>>> Sent: Friday, January 31, 2025 2:54 PM
>>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
>>>> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org;
>>>> qemu-devel@nongnu.org; eric.auger@redhat.com;
>>>> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com;
>>>> Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
>>>> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
>>>> Jonathan Cameron <jonathan.cameron@huawei.com>;
>>>> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com>
>>>> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
>>>> nested SMMUv3
>>>>
>>>> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi
>>>> wrote:
>>>>
>>>>>>> And Qemu does some checking to make sure that the device is indeed
>>>>>>> associated with the specified phys-smmuv3. This can be done going
>>>>>>> through the sysfs path checking which is what I guess libvirt is
>>>>>>> currently doing to populate the topology. So basically Qemu is just
>>>>>>> replicating that to validate again.
>>>>>>
>>>>>> I would prefer that iommufd users not have to go out to sysfs..
>>>>>>
>>>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
>>>>>>> the phys smmuv3 base address which can avoid going through the sysfs.
>>>>>>
>>>>>> It also doesn't seem great to expose a physical address. But we could
>>>>>> have an 'iommu instance id' that was a unique small integer?
>>>>>
>>>>> Ok. But how the user space can map that to the device?
>>>>
>>>> Why does it need to?
>>>>
>>>> libvirt picks some label for the vsmmu instance, it doesn't matter
>>>> what the string is.
>>>>
>>>> qemu validates that all of the vsmmu instances are only linked to PCI
>>>> device that have the same iommu ID. This is already happening in the
>>>> kernel, it will fail attaches to mismatched instances.
>>>>
>>>> Nothing further is needed?
>>>
>>> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
>>> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
>>> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
>>
>> I don't get what is the point of adding such an id if it is not
>> referenced anywhere?
>
> Every QDev device instance has an 'id' property - if you don't
> set one explicitly, QEMU will generate one internally. Libvirt
> will always set the 'id' property to avoid the internal
> auto-generated IDs, as it wants full knowledge of naming.

OK thank you for the explanation

Eric

>
> With regards,
> Daniel
On 1/31/2025 8:08 AM, Eric Auger wrote:
>>>>>> And Qemu does some checking to make sure that the device is indeed
>>>>>> associated with the specified phys-smmuv3. This can be done going
>>>>>> through the sysfs path checking which is what I guess libvirt is
>>>>>> currently doing to populate the topology. So basically Qemu is just
>>>>>> replicating that to validate again.
>>>>>
>>>>> I would prefer that iommufd users not have to go out to sysfs..
>>>>>
>>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
>>>>>> the phys smmuv3 base address which can avoid going through the sysfs.
>>>>>
>>>>> It also doesn't seem great to expose a physical address. But we could
>>>>> have an 'iommu instance id' that was a unique small integer?
>>>>
>>>> Ok. But how the user space can map that to the device?
>>>
>>> Why does it need to?
>>>
>>> libvirt picks some label for the vsmmu instance, it doesn't matter
>>> what the string is.
>>>
>>> qemu validates that all of the vsmmu instances are only linked to PCI
>>> device that have the same iommu ID. This is already happening in the
>>> kernel, it will fail attaches to mismatched instances.
>>>
>>> Nothing further is needed?
>>
>> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
>> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
>> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
>
> I don't get what is the point of adding such an id if it is not
> referenced anywhere?
>
> Eric

Daniel mentions that the host-to-guest SMMU pairing must be chosen such that
it makes conceptual sense w.r.t. the guest NUMA to host NUMA pairing [0].
The current implementation allows for incorrect host to guest numa node
pairings, e.g. pSMMU has affinity to host numa node 0, but it’s paired with
a vSMMU paired with a guest numa node pinned to host numa node 1.

By specifying the host SMMU id, we can explicitly pair a host SMMU with a
guest SMMU associated with the correct PXB NUMA node, vs. implying the
host-to-guest SMMU pairing based on what devices are attached to the PXB.
While it would not completely prevent the incorrect pSMMU/vSMMU pairing
w.r.t. host to guest numa node pairings, specifying the pSMMU id would make
the implications of host to guest numa node pairings more clear when
specifying a vSMMU instance.

From the libvirt discussion with Daniel [1], he also states "libvirt's goal
has always been to make everything that's functionally impacting a guest
device be 100% explicit. So I don't think we should be implying mappings to
the host SMMU in QEMU at all, QEMU must be told what to map to." Specifying
the id would be a means of explicitly specifying host to guest SMMU mapping
instead of implying the mapping.

[0] https://lore.kernel.org/qemu-devel/Z51DmtP83741RAsb@redhat.com/
[1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/7GDT6RX5LPAJMPP4ZSC4ACME6GVMG236/#X6R52JRBYDFZ5PSJFR534A655UZ3RHKN

Thanks,
Nathan
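The NUMA mismatch described above can be illustrated with a small check. This is a rough sketch under stated assumptions, not something QEMU or libvirt implements: the function `find_numa_mismatches`, the four input mappings, and the names `psmmu0` and `smmuv1` in the example data are hypothetical and exist only to restate the example from this message (pSMMU on host node 0, vSMMU on a guest node pinned to host node 1).

```python
def find_numa_mismatches(pairings, psmmu_host_node, vsmmu_guest_node,
                         guest_node_pinning):
    """Find vSMMU/pSMMU pairs whose NUMA placement disagrees.

    pairings:           vSMMU label -> paired pSMMU name.
    psmmu_host_node:    pSMMU name -> host NUMA node it has affinity to.
    vsmmu_guest_node:   vSMMU label -> guest NUMA node of its PXB.
    guest_node_pinning: guest NUMA node -> host NUMA node it is pinned to.
    Returns a list of (vsmmu, psmmu, psmmu_host_node, pinned_host_node).
    """
    mismatches = []
    for vsmmu, psmmu in pairings.items():
        host_node = psmmu_host_node[psmmu]
        pinned_node = guest_node_pinning[vsmmu_guest_node[vsmmu]]
        if host_node != pinned_node:
            mismatches.append((vsmmu, psmmu, host_node, pinned_node))
    return mismatches


# Hypothetical data restating the example: the pSMMU sits on host node 0,
# but the vSMMU's guest node 0 is pinned to host node 1.
mismatches = find_numa_mismatches(
    pairings={"smmuv1": "psmmu0"},
    psmmu_host_node={"psmmu0": 0},
    vsmmu_guest_node={"smmuv1": 0},
    guest_node_pinning={0: 1},
)
print(mismatches)  # [('smmuv1', 'psmmu0', 0, 1)]
```

A check like this only becomes possible once the pairing is stated explicitly, since it needs the pSMMU identity as an input rather than inferring it from attached devices.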
On Wed, Feb 05, 2025 at 12:53:42PM -0800, Nathan Chen wrote:
>
> On 1/31/2025 8:08 AM, Eric Auger wrote:
> > > > > > > And Qemu does some checking to make sure that the device is indeed
> > > > > > > associated with the specified phys-smmuv3. This can be done going
> > > > > > > through the sysfs path checking which is what I guess libvirt is
> > > > > > > currently doing to populate the topology. So basically Qemu is just
> > > > > > > replicating that to validate again.
> > > > > >
> > > > > > I would prefer that iommufd users not have to go out to sysfs..
> > > > > >
> > > > > > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return
> > > > > > > the phys smmuv3 base address which can avoid going through the sysfs.
> > > > > >
> > > > > > It also doesn't seem great to expose a physical address. But we could
> > > > > > have an 'iommu instance id' that was a unique small integer?
> > > > >
> > > > > Ok. But how the user space can map that to the device?
> > > >
> > > > Why does it need to?
> > > >
> > > > libvirt picks some label for the vsmmu instance, it doesn't matter
> > > > what the string is.
> > > >
> > > > qemu validates that all of the vsmmu instances are only linked to PCI
> > > > device that have the same iommu ID. This is already happening in the
> > > > kernel, it will fail attaches to mismatched instances.
> > > >
> > > > Nothing further is needed?
> > >
> > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> > > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \
> >
> > I don't get what is the point of adding such an id if it is not
> > referenced anywhere?
> >
> > Eric
>
> Daniel mentions that the host-to-guest SMMU pairing must be chosen such that
> it makes conceptual sense w.r.t. the guest NUMA to host NUMA pairing [0].
> The current implementation allows for incorrect host to guest numa node
> pairings, e.g. pSMMU has affinity to host numa node 0, but it’s paired with
> a vSMMU paired with a guest numa node pinned to host numa node 1.
>
> By specifying the host SMMU id, we can explicitly pair a host SMMU with a
> guest SMMU associated with the correct PXB NUMA node, vs. implying the
> host-to-guest SMMU pairing based on what devices are attached to the PXB.
> While it would not completely prevent the incorrect pSMMU/vSMMU pairing
> w.r.t. host to guest numa node pairings, specifying the pSMMU id would make
> the implications of host to guest numa node pairings more clear when
> specifying a vSMMU instance.

You've not specified any host SMMU id in the above CLI args though,
only the PXB association. It needs something like

 -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1,host-smmu=XXXXX

where 'XXXXX' is some value to identify the host SMMU

With regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
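A management layer following this suggestion might build the argument along the following lines. This is a hedged sketch only: `host-smmu` is the property name floated in this message, not a confirmed QEMU option, the helper `smmuv3_accel_args` is hypothetical, and the value `XXXXX` is kept as the same unspecified placeholder because the thread has not settled what identifies a host SMMU.

```python
def smmuv3_accel_args(vsmmu_id, pxb_bus, host_smmu):
    """Build a '-device arm-smmuv3-accel,...' argument pair.

    'host-smmu' is the property proposed in this thread (an assumption,
    not an existing QEMU option). The helper refuses to emit a device
    whose host SMMU is left implicit.
    """
    if not host_smmu:
        raise ValueError("host SMMU must be specified explicitly")
    props = f"bus={pxb_bus},id={vsmmu_id},host-smmu={host_smmu}"
    return ["-device", f"arm-smmuv3-accel,{props}"]


# 'XXXXX' stands in for whatever value ends up identifying the host SMMU.
args = smmuv3_accel_args("smmuv1", "pcie.1", "XXXXX")
print(" ".join(args))
```

Raising an error when the host SMMU is missing mirrors the stated goal that the mapping be explicit rather than implied by which devices sit behind the PXB.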