> -----Original Message----- > From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Thursday, February 6, 2025 5:59 PM > To: Daniel P. Berrangé <berrange@redhat.com> > Cc: Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com>; qemu-arm@nongnu.org; > qemu-devel@nongnu.org; eric.auger@redhat.com; > peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > Jonathan Cameron <jonathan.cameron@huawei.com>; > zhangfei.gao@linaro.org; nathanc@nvidia.com > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > nested SMMUv3 > > On Thu, Feb 06, 2025 at 05:54:57PM +0000, Daniel P. Berrangé wrote: > > > > We shouldn't assume any VFIO device exists in the QEMU config at the > time > > > > we realize the virtual smmu. I expect the SMMU may be cold plugged, > while > > > > the VFIO devices may be hot plugged arbitrarily later, and we should > have > > > > the association initialized the SMMU is realized. > > > > > > This is not supported kernel side, you can't instantiate a vIOMMU > > > without a VFIO device that uses it. For security. > > > > What are the security concerns here ? > > You should not be able to open iommufd and manipulate iommu HW that > you don't have a VFIO descriptor for, including creating physical > vIOMMU resources, allocating command queues and whatever else. > > Some kind of hot plug smmu would have to create a vSMMU without any > kernel backing and then later bind it to a kernel implementation. Not sure I get the problem with associating a vSMMU with a pSMMU. Something like the iommu instance id mentioned before: -device arm-smmuv3-accel,id=smmuv2,bus=pcie.2,host-smmu=iommu.1 This can realize the vSMMU without actually creating a vIOMMU in the kernel. And when the dev gets attached/realized, check (via GET_HW_INFO) whether the specified iommu instance id matches or not. 
Or is the concern here about exporting an iommu instance id to user space? Thanks, Shameer
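A minimal sketch of the flow proposed in this message, for illustration only and not QEMU code: the vSMMU is realized with just a host SMMU id and no kernel object, and the id is checked when a VFIO device is attached. The class names, the `host_smmu` field, and the `reported_smmu` value (a stand-in for whatever a GET_HW_INFO query would return) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VfioDevice:
    name: str
    # Hypothetical: the IOMMU instance id the kernel would report for this
    # device, standing in for the result of a GET_HW_INFO query.
    reported_smmu: str


@dataclass
class VSmmu:
    host_smmu: str                        # id given on the command line
    kernel_viommu_created: bool = False   # no kernel backing at realize time
    devices: List[str] = field(default_factory=list)

    def attach(self, dev: VfioDevice) -> None:
        """Check the device's pSMMU against the id assigned at realize time."""
        if dev.reported_smmu != self.host_smmu:
            raise ValueError(
                f"{dev.name} sits behind {dev.reported_smmu}, "
                f"expected {self.host_smmu}"
            )
        # Only now, with a VFIO device in hand, is a kernel vIOMMU possible.
        self.kernel_viommu_created = True
        self.devices.append(dev.name)
```

In this model a vSMMU realized as `VSmmu(host_smmu="iommu.1")` has no kernel backing until the first matching device is attached, which is the window the replies below are concerned about.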
On Thu, Feb 06, 2025 at 06:04:57PM +0000, Shameerali Kolothum Thodi wrote: > > Some kind of hot plug smmu would have to create a vSMMU without any > > kernel backing and then later bind it to a kernel implementation. > > Not sure I get the problem with associating vSMMU with a pSMMU. Something > like an iommu instance id mentioned before, > > -device arm-smmuv3-accel,id=smmuv2,bus=pcie.2,host-smmu=iommu.1 > > This can realize the vSMMU without actually creating a vIOMMU in kernel. > And when the dev gets attached/realized, check (GET_HW_INFO)the specified > iommu instance id matches or not. > > Or the concern here is exporting an iommu instance id to user space? Philosophically, we do not permit any HW access through iommufd without a VFIO fd to "prove" the process has rights to touch hardware. We don't have any way to prove the process has rights to touch the iommu hardware separately from VFIO. So even if you invent an iommu ID we cannot accept it as a handle to create a viommu in iommufd. Jason
> -----Original Message----- > From: Jason Gunthorpe <jgg@nvidia.com> > Sent: Thursday, February 6, 2025 6:13 PM > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> > Cc: Daniel P. Berrangé <berrange@redhat.com>; qemu-arm@nongnu.org; > qemu-devel@nongnu.org; eric.auger@redhat.com; > peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > Jonathan Cameron <jonathan.cameron@huawei.com>; > zhangfei.gao@linaro.org; nathanc@nvidia.com > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > nested SMMUv3 > > On Thu, Feb 06, 2025 at 06:04:57PM +0000, Shameerali Kolothum Thodi > wrote: > > > Some kind of hot plug smmu would have to create a vSMMU without > any > > > kernel backing and then later bind it to a kernel implementation. > > > > Not sure I get the problem with associating vSMMU with a pSMMU. > Something > > like an iommu instance id mentioned before, > > > > -device arm-smmuv3-accel,id=smmuv2,bus=pcie.2,host-smmu=iommu.1 > > > > This can realize the vSMMU without actually creating a vIOMMU in kernel. > > And when the dev gets attached/realized, check (GET_HW_INFO)the > specified > > iommu instance id matches or not. > > > > Or the concern here is exporting an iommu instance id to user space? > > Philisophically we do not permit any HW access through iommufd without > a VFIO fd to "prove" the process has rights to touch hardware. > > We don't have any way to prove the process has rights to touch the > iommu hardware seperately from VFIO. It is not. Qemu just instantiates a vSMMU and assigns the IOMMU instance id to it. > > So even if you invent an iommu ID we cannot accept it as a handle to > create viommu in iommufd. Creating the vIOMMU only happens when the user does a cold/hot plug of a VFIO device. At that time Qemu checks whether the assigned id matches whatever the kernel tells it. 
Thanks, Shameer
On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi wrote: > > So even if you invent an iommu ID we cannot accept it as a handle to > > create viommu in iommufd. > > Creating the vIOMMU only happens when the user does a cold/hot plug of > a VFIO device. At that time Qemu checks whether the assigned id matches > with whatever the kernel tell it. This is not hard up until the guest is started. If you boot a guest without a backing viommu iommufd object then there will be some more complexities. Jason
On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi wrote: > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > create viommu in iommufd. > > > > Creating the vIOMMU only happens when the user does a cold/hot plug of > > a VFIO device. At that time Qemu checks whether the assigned id matches > > with whatever the kernel tell it. > > This is not hard up until the guest is started. If you boot a guest > without a backing viommu iommufd object then there will be some more > complexities. Yeah, I imagined that things would get complicated with hotplug. On one hand, I get that we need some fixed link beforehand to ease migration/hotplug. On the other hand, all IOMMUFD ioctls need a VFIO device FD, which immediately raises the problem that we cannot even decide the vSMMU's capabilities, as reflected in its IDR/IIDR registers, without a coldplug device: if we boot a VM (one vSMMU<->pSMMU) with only a hotplug device, IOMMU_GET_HW_INFO cannot be issued while the guest kernel is probing the vSMMU instance. So we would have to reset the vSMMU "HW" after the device hotplug? Nicolin
> -----Original Message----- > From: Nicolin Chen <nicolinc@nvidia.com> > Sent: Thursday, February 6, 2025 8:33 PM > To: Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com>; Daniel P. Berrangé > <berrange@redhat.com>; Jason Gunthorpe <jgg@nvidia.com> > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; > eric.auger@redhat.com; peter.maydell@linaro.org; ddutile@redhat.com; > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > Jonathan Cameron <jonathan.cameron@huawei.com>; > zhangfei.gao@linaro.org; nathanc@nvidia.com > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > nested SMMUv3 > > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi > wrote: > > > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > > create viommu in iommufd. > > > > > > Creating the vIOMMU only happens when the user does a cold/hot > plug of > > > a VFIO device. At that time Qemu checks whether the assigned id > matches > > > with whatever the kernel tell it. > > > > This is not hard up until the guest is started. If you boot a guest > > without a backing viommu iommufd object then there will be some more > > complexities. > > Yea, I imagined that things would be complicated with hotplugs.. > > On one hand, I got the part that we need some fixed link forehand > to ease migration/hotplugs. > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > brings the immediate attention that we cannot even decide vSMMU's > capabilities being reflected in its IDR/IIDR registers, without a > coldplug device -- if we boot a VM (one vSMMU<->pSMMU) with only a > hotplug device, the IOMMU_GET_HW_INFO cannot be done during guest Right. I forgot about the call to smmu_dev_get_info() during the reset. That means we need at least one dev per Guest SMMU during Guest boot :( Thanks, Shameer
On Fri, Feb 07, 2025 at 10:21:17AM +0000, Shameerali Kolothum Thodi wrote: > > > > -----Original Message----- > > From: Nicolin Chen <nicolinc@nvidia.com> > > Sent: Thursday, February 6, 2025 8:33 PM > > To: Shameerali Kolothum Thodi > > <shameerali.kolothum.thodi@huawei.com>; Daniel P. Berrangé > > <berrange@redhat.com>; Jason Gunthorpe <jgg@nvidia.com> > > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; > > eric.auger@redhat.com; peter.maydell@linaro.org; ddutile@redhat.com; > > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > > Jonathan Cameron <jonathan.cameron@huawei.com>; > > zhangfei.gao@linaro.org; nathanc@nvidia.com > > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > > nested SMMUv3 > > > > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi > > wrote: > > > > > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > > > create viommu in iommufd. > > > > > > > > Creating the vIOMMU only happens when the user does a cold/hot > > plug of > > > > a VFIO device. At that time Qemu checks whether the assigned id > > matches > > > > with whatever the kernel tell it. > > > > > > This is not hard up until the guest is started. If you boot a guest > > > without a backing viommu iommufd object then there will be some more > > > complexities. > > > > Yea, I imagined that things would be complicated with hotplugs.. > > > > On one hand, I got the part that we need some fixed link forehand > > to ease migration/hotplugs. 
> > > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > > brings the immediate attention that we cannot even decide vSMMU's > > capabilities being reflected in its IDR/IIDR registers, without a > > coldplug device -- if we boot a VM (one vSMMU<->pSMMU) with only a > > hotplug device, the IOMMU_GET_HW_INFO cannot be done during guest > > Right. I forgot about the call to smmu_dev_get_info() during the reset. > That means we need at least one dev per Guest SMMU during Guest > boot :( That's pretty unpleasant as a usage restriction. It sounds like there needs to be a way to configure & control the vIOMMU independently of attaching a specific VFIO device. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
> -----Original Message----- > From: Daniel P. Berrangé <berrange@redhat.com> > Sent: Friday, February 7, 2025 10:32 AM > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> > Cc: Nicolin Chen <nicolinc@nvidia.com>; Jason Gunthorpe > <jgg@nvidia.com>; qemu-arm@nongnu.org; qemu-devel@nongnu.org; > eric.auger@redhat.com; peter.maydell@linaro.org; ddutile@redhat.com; > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > Jonathan Cameron <jonathan.cameron@huawei.com>; > zhangfei.gao@linaro.org; nathanc@nvidia.com > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable > nested SMMUv3 > > On Fri, Feb 07, 2025 at 10:21:17AM +0000, Shameerali Kolothum Thodi > wrote: > > > > > > > -----Original Message----- > > > From: Nicolin Chen <nicolinc@nvidia.com> > > > Sent: Thursday, February 6, 2025 8:33 PM > > > To: Shameerali Kolothum Thodi > > > <shameerali.kolothum.thodi@huawei.com>; Daniel P. Berrangé > > > <berrange@redhat.com>; Jason Gunthorpe <jgg@nvidia.com> > > > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; > > > eric.auger@redhat.com; peter.maydell@linaro.org; > ddutile@redhat.com; > > > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) > > > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; > > > Jonathan Cameron <jonathan.cameron@huawei.com>; > > > zhangfei.gao@linaro.org; nathanc@nvidia.com > > > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user- > creatable > > > nested SMMUv3 > > > > > > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum > Thodi > > > wrote: > > > > > > > > > > So even if you invent an iommu ID we cannot accept it as a handle > to > > > > > > create viommu in iommufd. > > > > > > > > > > Creating the vIOMMU only happens when the user does a cold/hot > > > plug of > > > > > a VFIO device. 
At that time Qemu checks whether the assigned id > > > matches > > > > > with whatever the kernel tell it. > > > > > > > > This is not hard up until the guest is started. If you boot a guest > > > > without a backing viommu iommufd object then there will be some > more > > > > complexities. > > > > > > Yea, I imagined that things would be complicated with hotplugs.. > > > > > > On one hand, I got the part that we need some fixed link forehand > > > to ease migration/hotplugs. > > > > > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > > > brings the immediate attention that we cannot even decide vSMMU's > > > capabilities being reflected in its IDR/IIDR registers, without a > > > coldplug device -- if we boot a VM (one vSMMU<->pSMMU) with only a > > > hotplug device, the IOMMU_GET_HW_INFO cannot be done during > guest > > > > Right. I forgot about the call to smmu_dev_get_info() during the reset. > > That means we need at least one dev per Guest SMMU during Guest > > boot :( > > That's pretty unpleasant as a usage restriction. It sounds like there > needs to be a way to configure & control the vIOMMU independantly of > attaching a specific VFIO device. Yes, that would be ideal. Just wondering whether we can have something like vfio_register_iommu_driver() for the iommufd subsystem, by which it can directly access the iommu driver ops (maybe a restricted set). Not sure about the layering violations and other security issues with that... Thanks, Shameer
On Fri, Feb 07, 2025 at 12:21:54PM +0000, Shameerali Kolothum Thodi wrote: > Just wondering whether we can have something like the > vfio_register_iommu_driver() for iommufd subsystem by which it can directly > access iommu drivers ops(may be a restricted set). I very much want to try hard to avoid that. AFAICT you do not need a VFIO device, or access to the HW_INFO of the smmu, to start up an SMMU driver. Yes, you cannot later attach a VFIO device with a pSMMU that materially differs from the vSMMU setup, but that is fine. qemu has long had a duality where you can either "inherit from host" for an easy setup or be "fully specified" and support live migration/etc. CPUID is a simple example. So, what the smmu patches are doing now is "inherit from host" and that requires a VFIO device to work. I think that is fine. If you want to do full hotplug then you need to be "fully specified" on the command line so a working vSMMU can be shown to the guest with no devices, and no kernel involvement. Obviously this is a highly advanced operating mode as things like IIDR and errata need to be considered, but I would guess booting with no vPCI devices is already abnormal. Jason
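One way to picture the "cannot later attach a pSMMU that materially differs" constraint from this message is the small model below. It is a sketch under a stated assumption, not anything from the patches: "materially differs" is taken to mean that the vSMMU advertises a feature the pSMMU lacks, and the feature names are placeholders rather than real SMMUv3 register fields.

```python
from typing import FrozenSet


def can_attach(vsmmu_features: FrozenSet[str],
               psmmu_features: FrozenSet[str]) -> bool:
    """Return True if a device behind this pSMMU may join the vSMMU.

    Assumed rule (hypothetical): everything the guest was told the vSMMU
    supports must also be supported by the physical SMMU.
    """
    return vsmmu_features <= psmmu_features


# A fully specified vSMMU that was shown to the guest at boot.
guest_visible = frozenset({"feature_a", "feature_b"})

# Hot-plugged device behind a pSMMU that is a superset: acceptable.
ok = can_attach(guest_visible, frozenset({"feature_a", "feature_b", "feature_c"}))

# Hot-plugged device behind a pSMMU missing a guest-visible feature: rejected.
bad = can_attach(guest_visible, frozenset({"feature_a"}))
```

Under this rule the check needs no kernel involvement at boot; it only runs once a VFIO device, and hence the host information, is available.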
On Thu, Feb 06, 2025 at 12:33:19PM -0800, Nicolin Chen wrote: > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi wrote: > > > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > > create viommu in iommufd. > > > > > > Creating the vIOMMU only happens when the user does a cold/hot plug of > > > a VFIO device. At that time Qemu checks whether the assigned id matches > > > with whatever the kernel tell it. > > > > This is not hard up until the guest is started. If you boot a guest > > without a backing viommu iommufd object then there will be some more > > complexities. > > Yea, I imagined that things would be complicated with hotplugs.. > > On one hand, I got the part that we need some fixed link forehand > to ease migration/hotplugs. > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > brings the immediate attention that we cannot even decide vSMMU's > capabilities being reflected in its IDR/IIDR registers, without a > coldplug device As Daniel was saying, this all has to be specifiable on the command line. IMHO if the vSMMU is not fully specified by the time the boot happens (either explicitly via command line or implicitly by querying the live HW) then qemu should fail. Jason
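The rule stated in this message can be sketched as a small decision function. This is an illustrative model only: the function name, the dictionaries of register values, and the error text are hypothetical, and `host_regs` stands in for what could be queried from the live HW when a cold-plugged VFIO device is present.

```python
from typing import Dict, Optional


def resolve_vsmmu_regs(cmdline_regs: Optional[Dict[str, int]],
                       host_regs: Optional[Dict[str, int]]) -> Dict[str, int]:
    """Pick the register values the guest will see at boot.

    cmdline_regs: values given explicitly on the command line, or None.
    host_regs: values queried from the host via a cold-plugged VFIO
               device, or None if no such device exists at boot.
    """
    if cmdline_regs is not None:
        # "Fully specified": no device and no kernel involvement needed.
        return dict(cmdline_regs)
    if host_regs is not None:
        # "Inherit from host": only possible with a VFIO device present.
        return dict(host_regs)
    # Neither source is available by boot time, so startup must fail.
    raise RuntimeError("vSMMU is not fully specified at boot")
```

Giving the command line precedence reflects the point made later in the thread that nothing from the host should reach the guest without the option to control it from the command line.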
On Thu, Feb 06, 2025 at 04:38:55PM -0400, Jason Gunthorpe wrote: > On Thu, Feb 06, 2025 at 12:33:19PM -0800, Nicolin Chen wrote: > > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi wrote: > > > > > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > > > create viommu in iommufd. > > > > > > > > Creating the vIOMMU only happens when the user does a cold/hot plug of > > > > a VFIO device. At that time Qemu checks whether the assigned id matches > > > > with whatever the kernel tell it. > > > > > > This is not hard up until the guest is started. If you boot a guest > > > without a backing viommu iommufd object then there will be some more > > > complexities. > > > > Yea, I imagined that things would be complicated with hotplugs.. > > > > On one hand, I got the part that we need some fixed link forehand > > to ease migration/hotplugs. > > > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > > brings the immediate attention that we cannot even decide vSMMU's > > capabilities being reflected in its IDR/IIDR registers, without a > > coldplug device > > As Daniel was saying this all has to be specifiable on the command > line. > > IMHO if the vSMMU is not fully specified by the time the boot happens > (either explicity via command line or implicitly by querying the live > HW) then it qemu should fail. That makes sense, but wouldn't it mean we could only support the case where a VM has at least one cold-plugged device per vSMMU? Otherwise, even if we specify which pSMMU a vSMMU maps to on the command line, we can't query the pSMMU via IOMMU_GET_HW_INFO. Thanks, Nicolin
On Thu, Feb 06, 2025 at 12:48:40PM -0800, Nicolin Chen wrote: > On Thu, Feb 06, 2025 at 04:38:55PM -0400, Jason Gunthorpe wrote: > > On Thu, Feb 06, 2025 at 12:33:19PM -0800, Nicolin Chen wrote: > > > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi wrote: > > > > > > > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > > > > create viommu in iommufd. > > > > > > > > > > Creating the vIOMMU only happens when the user does a cold/hot plug of > > > > > a VFIO device. At that time Qemu checks whether the assigned id matches > > > > > with whatever the kernel tell it. > > > > > > > > This is not hard up until the guest is started. If you boot a guest > > > > without a backing viommu iommufd object then there will be some more > > > > complexities. > > > > > > Yea, I imagined that things would be complicated with hotplugs.. > > > > > > On one hand, I got the part that we need some fixed link forehand > > > to ease migration/hotplugs. > > > > > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > > > brings the immediate attention that we cannot even decide vSMMU's > > > capabilities being reflected in its IDR/IIDR registers, without a > > > coldplug device > > > > As Daniel was saying this all has to be specifiable on the command > > line. > > > > IMHO if the vSMMU is not fully specified by the time the boot happens > > (either explicity via command line or implicitly by querying the live > > HW) then it qemu should fail. > > Though that makes sense, that would assume we could only support > the case where a VM has at least one cold plug device per vSMMU? > > Otherwise, even if we specify vSMMU to which pSMMU via a command > line, we can't get access to the pSMMU via IOMMU_GET_HW_INFO.. You'd use the command line information and wouldn't need GET_HW_INFO; it would be complicated. Jason
On Thu, Feb 06, 2025 at 05:11:13PM -0400, Jason Gunthorpe wrote: > On Thu, Feb 06, 2025 at 12:48:40PM -0800, Nicolin Chen wrote: > > On Thu, Feb 06, 2025 at 04:38:55PM -0400, Jason Gunthorpe wrote: > > > On Thu, Feb 06, 2025 at 12:33:19PM -0800, Nicolin Chen wrote: > > > > On Thu, Feb 06, 2025 at 02:22:01PM -0400, Jason Gunthorpe wrote: > > > > > On Thu, Feb 06, 2025 at 06:18:14PM +0000, Shameerali Kolothum Thodi wrote: > > > > > > > > > > > > So even if you invent an iommu ID we cannot accept it as a handle to > > > > > > > create viommu in iommufd. > > > > > > > > > > > > Creating the vIOMMU only happens when the user does a cold/hot plug of > > > > > > a VFIO device. At that time Qemu checks whether the assigned id matches > > > > > > with whatever the kernel tell it. > > > > > > > > > > This is not hard up until the guest is started. If you boot a guest > > > > > without a backing viommu iommufd object then there will be some more > > > > > complexities. > > > > > > > > Yea, I imagined that things would be complicated with hotplugs.. > > > > > > > > On one hand, I got the part that we need some fixed link forehand > > > > to ease migration/hotplugs. > > > > > > > > On the other hand, all IOMMUFD ioctls need a VFIO device FD, which > > > > brings the immediate attention that we cannot even decide vSMMU's > > > > capabilities being reflected in its IDR/IIDR registers, without a > > > > coldplug device > > > > > > As Daniel was saying this all has to be specifiable on the command > > > line. > > > > > > IMHO if the vSMMU is not fully specified by the time the boot happens > > > (either explicity via command line or implicitly by querying the live > > > HW) then it qemu should fail. > > > > Though that makes sense, that would assume we could only support > > the case where a VM has at least one cold plug device per vSMMU? > > > > Otherwise, even if we specify vSMMU to which pSMMU via a command > > line, we can't get access to the pSMMU via IOMMU_GET_HW_INFO.. 
> > You'd use the command line information and wouldn't need GET_HW_INFO, > it would be complicated Do you mean the "-device arm-smmuv3-accel,id=xx" line? This still won't give us the host IDR/IIDR register values to probe a vSMMU, unless it has a VFIO device assigned to vSMMU's associated PXB in that command line? Nicolin
On Thu, Feb 06, 2025 at 02:46:42PM -0800, Nicolin Chen wrote: > > You'd use the command line information and wouldn't need GET_HW_INFO, > > it would be complicated > > Do you mean the "-device arm-smmuv3-accel,id=xx" line? This still > won't give us the host IDR/IIDR register values to probe a vSMMU, > unless it has a VFIO device assigned to vSMMU's associated PXB in > that command line? Yes, put the IDR registers on the command line too. Nothing from the host should be copied to the guest without the option to control it through the command line. Jason