> -----Original Message----- > From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Friday, January 31, 2025 2:54 PM > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> > Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org; > qemu-devel@nongnu.org; eric.auger@redhat.com; > peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > Jonathan Cameron <jonathan.cameron@huawei.com>; > zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com> > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > nested SMMUv3 > > On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi > wrote: > > > > > And Qemu does some checking to make sure that the device is indeed > > > associated > > > > with the specified phys-smmuv3. This can be done going through the > > > sysfs path checking > > > > which is what I guess libvirt is currently doing to populate the > topology. > > > So basically > > > > Qemu is just replicating that to validate again. > > > > > > I would prefer that iommufd users not have to go out to sysfs.. > > > > > > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to > > > return the phys > > > > smmuv3 base address which can avoid going through the sysfs. > > > > > > It also doesn't seem great to expose a physical address. But we could > > > have an 'iommu instance id' that was a unique small integer? > > > > Ok. But how the user space can map that to the device? > > Why does it need to? > > libvirt picks some label for the vsmmu instance, it doesn't matter > what the string is. > > qemu validates that all of the vsmmu instances are only linked to PCI > device that have the same iommu ID. This is already happening in the > kernel, it will fail attaches to mismatched instances. > > Nothing further is needed? -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \ -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ -device arm-smmuv3-accel,pci-bus=pcie.2,id=smmuv2 \ -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ I think it works from a functionality point of view. A particular instance of arm-smmuv3-accel(say id=smmuv1) can only have devices attached to the same phys smmuv3 "iommu instance id" But not sure from a libvirt/Qemu interface point of view[0] the concerns are addressed. Daniel/Nathan? Thanks, Shameer https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/X6R52JRBYDFZ5PSJFR534A655UZ3RHKN/
Hi, On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote: > >> -----Original Message----- >> From: Jason Gunthorpe <jgg@nvidia.com> >> Sent: Friday, January 31, 2025 2:54 PM >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> >> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org; >> qemu-devel@nongnu.org; eric.auger@redhat.com; >> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; >> Linuxarm <linuxarm@huawei.com>; Wangzhou (B) >> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; >> Jonathan Cameron <jonathan.cameron@huawei.com>; >> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com> >> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable >> nested SMMUv3 >> >> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi >> wrote: >> >>>>> And Qemu does some checking to make sure that the device is indeed >>>> associated >>>>> with the specified phys-smmuv3. This can be done going through the >>>> sysfs path checking >>>>> which is what I guess libvirt is currently doing to populate the >> topology. >>>> So basically >>>>> Qemu is just replicating that to validate again. >>>> I would prefer that iommufd users not have to go out to sysfs.. >>>> >>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to >>>> return the phys >>>>> smmuv3 base address which can avoid going through the sysfs. >>>> It also doesn't seem great to expose a physical address. But we could >>>> have an 'iommu instance id' that was a unique small integer? >>> Ok. But how the user space can map that to the device? >> Why does it need to? >> >> libvirt picks some label for the vsmmu instance, it doesn't matter >> what the string is. >> >> qemu validates that all of the vsmmu instances are only linked to PCI >> device that have the same iommu ID. This is already happening in the >> kernel, it will fail attaches to mismatched instances. >> >> Nothing further is needed? > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \ I don't get what is the point of adding such an id if it is not referenced anywhere? Eric > -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ > > -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ > -device arm-smmuv3-accel,pci-bus=pcie.2,id=smmuv2 \ > -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ > > I think it works from a functionality point of view. A particular > instance of arm-smmuv3-accel(say id=smmuv1) can only have devices attached > to the same phys smmuv3 "iommu instance id" > > But not sure from a libvirt/Qemu interface point of view[0] the concerns > are addressed. Daniel/Nathan? > > Thanks, > Shameer > https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/message/X6R52JRBYDFZ5PSJFR534A655UZ3RHKN/ >
On Fri, Jan 31, 2025 at 05:08:28PM +0100, Eric Auger wrote: > Hi, > > > On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote: > > > >> -----Original Message----- > >> From: Jason Gunthorpe <jgg@nvidia.com> > >> Sent: Friday, January 31, 2025 2:54 PM > >> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> > >> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org; > >> qemu-devel@nongnu.org; eric.auger@redhat.com; > >> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; > >> Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > >> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > >> Jonathan Cameron <jonathan.cameron@huawei.com>; > >> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com> > >> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > >> nested SMMUv3 > >> > >> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi > >> wrote: > >> > >>>>> And Qemu does some checking to make sure that the device is indeed > >>>> associated > >>>>> with the specified phys-smmuv3. This can be done going through the > >>>> sysfs path checking > >>>>> which is what I guess libvirt is currently doing to populate the > >> topology. > >>>> So basically > >>>>> Qemu is just replicating that to validate again. > >>>> I would prefer that iommufd users not have to go out to sysfs.. > >>>> > >>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to > >>>> return the phys > >>>>> smmuv3 base address which can avoid going through the sysfs. > >>>> It also doesn't seem great to expose a physical address. But we could > >>>> have an 'iommu instance id' that was a unique small integer? > >>> Ok. But how the user space can map that to the device? > >> Why does it need to? > >> > >> libvirt picks some label for the vsmmu instance, it doesn't matter > >> what the string is. > >> > >> qemu validates that all of the vsmmu instances are only linked to PCI > >> device that have the same iommu ID. This is already happening in the > >> kernel, it will fail attaches to mismatched instances. > >> > >> Nothing further is needed? > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ > > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \ > I don't get what is the point of adding such an id if it is not > referenced anywhere? Every QDev device instance has an 'id' property - if you don't set one explicitly, QEMU will generate one internally. Libvirt will always set the 'id' property to avoid the internal auto- generated IDs, as it wants full knowledge of naming. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 2/6/25 9:53 AM, Daniel P. Berrangé wrote: > On Fri, Jan 31, 2025 at 05:08:28PM +0100, Eric Auger wrote: >> Hi, >> >> >> On 1/31/25 4:23 PM, Shameerali Kolothum Thodi wrote: >>>> -----Original Message----- >>>> From: Jason Gunthorpe <jgg@nvidia.com> >>>> Sent: Friday, January 31, 2025 2:54 PM >>>> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> >>>> Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org; >>>> qemu-devel@nongnu.org; eric.auger@redhat.com; >>>> peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; >>>> Linuxarm <linuxarm@huawei.com>; Wangzhou (B) >>>> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; >>>> Jonathan Cameron <jonathan.cameron@huawei.com>; >>>> zhangfei.gao@linaro.org; Nathan Chen <nathanc@nvidia.com> >>>> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable >>>> nested SMMUv3 >>>> >>>> On Fri, Jan 31, 2025 at 02:39:53PM +0000, Shameerali Kolothum Thodi >>>> wrote: >>>> >>>>>>> And Qemu does some checking to make sure that the device is indeed >>>>>> associated >>>>>>> with the specified phys-smmuv3. This can be done going through the >>>>>> sysfs path checking >>>>>>> which is what I guess libvirt is currently doing to populate the >>>> topology. >>>>>> So basically >>>>>>> Qemu is just replicating that to validate again. >>>>>> I would prefer that iommufd users not have to go out to sysfs.. >>>>>> >>>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to >>>>>> return the phys >>>>>>> smmuv3 base address which can avoid going through the sysfs. >>>>>> It also doesn't seem great to expose a physical address. But we could >>>>>> have an 'iommu instance id' that was a unique small integer? >>>>> Ok. But how the user space can map that to the device? >>>> Why does it need to? >>>> >>>> libvirt picks some label for the vsmmu instance, it doesn't matter >>>> what the string is. >>>> >>>> qemu validates that all of the vsmmu instances are only linked to PCI >>>> device that have the same iommu ID. This is already happening in the >>>> kernel, it will fail attaches to mismatched instances. >>>> >>>> Nothing further is needed? >>> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ >>> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ >>> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \ >> I don't get what is the point of adding such an id if it is not >> referenced anywhere? > Every QDev device instance has an 'id' property - if you don't > set one explicitly, QEMU will generate one internally. Libvirt > will always set the 'id' property to avoid the internal auto- > generated IDs, as it wants full knowledge of naming. OK thank you for the explanation Eric > > With regards, > Daniel
On 1/31/2025 8:08 AM, Eric Auger wrote: >>>>>> And Qemu does some checking to make sure that the device is indeed >>>>> associated >>>>>> with the specified phys-smmuv3. This can be done going through the >>>>> sysfs path checking >>>>>> which is what I guess libvirt is currently doing to populate the >>> topology. >>>>> So basically >>>>>> Qemu is just replicating that to validate again. >>>>> I would prefer that iommufd users not have to go out to sysfs.. >>>>> >>>>>> Or another option is extending the IOMMU_GET_HW_INFO IOCTL to >>>>> return the phys >>>>>> smmuv3 base address which can avoid going through the sysfs. >>>>> It also doesn't seem great to expose a physical address. But we could >>>>> have an 'iommu instance id' that was a unique small integer? >>>> Ok. But how the user space can map that to the device? >>> Why does it need to? >>> >>> libvirt picks some label for the vsmmu instance, it doesn't matter >>> what the string is. >>> >>> qemu validates that all of the vsmmu instances are only linked to PCI >>> device that have the same iommu ID. This is already happening in the >>> kernel, it will fail attaches to mismatched instances. >>> >>> Nothing further is needed? >> -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ >> -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ >> -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \ > I don't get what is the point of adding such an id if it is not > referenced anywhere? > > Eric Daniel mentions that the host-to-guest SMMU pairing must be chosen such that it makes conceptual sense w.r.t. the guest NUMA to host NUMA pairing [0]. The current implementation allows for incorrect host to guest numa node pairings, e.g. pSMMU has affinity to host numa node 0, but it’s paired with a vSMMU paired with a guest numa node pinned to host numa node 1. By specifying the host SMMU id, we can explicitly pair a host SMMU with a guest SMMU associated with the correct PXB NUMA node, vs. implying the host-to-guest SMMU pairing based on what devices are attached to the PXB. While it would not completely prevent the incorrect pSMMU/vSMMU pairing w.r.t. host to guest numa node pairings, specifying the pSMMU id would make the implications of host to guest numa node pairings more clear when specifying a vSMMU instance. From the libvirt discussion with Daniel [1], he also states "libvirt's goal has always been to make everything that's functionally impacting a guest device be 100% explicit. So I don't think we should be implying mappings to the host SMMU in QEMU at all, QEMU must be told what to map to." Specifying the id would be a means of explicitly specifying host to guest SMMU mapping instead of implying the mapping. [0] https://lore.kernel.org/qemu-devel/Z51DmtP83741RAsb@redhat.com/ [1] https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/7GDT6RX5LPAJMPP4ZSC4ACME6GVMG236/#X6R52JRBYDFZ5PSJFR534A655UZ3RHKN Thanks, Nathan
On Wed, Feb 05, 2025 at 12:53:42PM -0800, Nathan Chen wrote: > > > On 1/31/2025 8:08 AM, Eric Auger wrote: > > > > > > > And Qemu does some checking to make sure that the device is indeed > > > > > > associated > > > > > > > with the specified phys-smmuv3. This can be done going through the > > > > > > sysfs path checking > > > > > > > which is what I guess libvirt is currently doing to populate the > > > > topology. > > > > > > So basically > > > > > > > Qemu is just replicating that to validate again. > > > > > > I would prefer that iommufd users not have to go out to sysfs.. > > > > > > > > > > > > > Or another option is extending the IOMMU_GET_HW_INFO IOCTL to > > > > > > return the phys > > > > > > > smmuv3 base address which can avoid going through the sysfs. > > > > > > It also doesn't seem great to expose a physical address. But we could > > > > > > have an 'iommu instance id' that was a unique small integer? > > > > > Ok. But how the user space can map that to the device? > > > > Why does it need to? > > > > > > > > libvirt picks some label for the vsmmu instance, it doesn't matter > > > > what the string is. > > > > > > > > qemu validates that all of the vsmmu instances are only linked to PCI > > > > device that have the same iommu ID. This is already happening in the > > > > kernel, it will fail attaches to mismatched instances. > > > > > > > > Nothing further is needed? > > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ > > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ > > > -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1 \ > > I don't get what is the point of adding such an id if it is not > > referenced anywhere? > > > > Eric > > Daniel mentions that the host-to-guest SMMU pairing must be chosen such that > it makes conceptual sense w.r.t. the guest NUMA to host NUMA pairing [0]. > The current implementation allows for incorrect host to guest numa node > pairings, e.g. pSMMU has affinity to host numa node 0, but it’s paired with > a vSMMU paired with a guest numa node pinned to host numa node 1. > > By specifying the host SMMU id, we can explicitly pair a host SMMU with a > guest SMMU associated with the correct PXB NUMA node, vs. implying the > host-to-guest SMMU pairing based on what devices are attached to the PXB. > While it would not completely prevent the incorrect pSMMU/vSMMU pairing > w.r.t. host to guest numa node pairings, specifying the pSMMU id would make > the implications of host to guest numa node pairings more clear when > specifying a vSMMU instance. You've not specified any host SMMU id in the above CLI args though, only the PXB association. It needs something like -device arm-smmuv3-accel,bus=pcie.1,id=smmuv1,host-smmu=XXXXX where 'XXXX' is some value to identify the host SMMU With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
© 2016 - 2025 Red Hat, Inc.