1 | Hi Daniel, | ||
---|---|---|---|
2 | 1 | ||
3 | > -----Original Message----- | 2 | > -----Original Message----- |
4 | > From: Daniel P. Berrangé <berrange@redhat.com> | 3 | > From: Jason Gunthorpe <jgg@nvidia.com> |
5 | > Sent: Thursday, January 30, 2025 4:00 PM | 4 | > Sent: Thursday, February 6, 2025 5:59 PM |
6 | > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> | 5 | > To: Daniel P. Berrangé <berrange@redhat.com> |
7 | > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; | 6 | > Cc: Shameerali Kolothum Thodi |
8 | > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; | 7 | > <shameerali.kolothum.thodi@huawei.com>; qemu-arm@nongnu.org; |
9 | > nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm | 8 | > qemu-devel@nongnu.org; eric.auger@redhat.com; |
10 | > <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; | 9 | > peter.maydell@linaro.org; nicolinc@nvidia.com; ddutile@redhat.com; |
11 | > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron | 10 | > Linuxarm <linuxarm@huawei.com>; Wangzhou (B) |
12 | > <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org | 11 | > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; |
12 | > Jonathan Cameron <jonathan.cameron@huawei.com>; | ||
13 | > zhangfei.gao@linaro.org; nathanc@nvidia.com | ||
13 | > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable | 14 | > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable |
14 | > nested SMMUv3 | 15 | > nested SMMUv3 |
15 | > | 16 | > |
16 | > On Fri, Nov 08, 2024 at 12:52:37PM +0000, Shameer Kolothum via wrote: | 17 | > On Thu, Feb 06, 2025 at 05:54:57PM +0000, Daniel P. Berrangé wrote: |
17 | > > How to use it(Eg:): | 18 | > > > > We shouldn't assume any VFIO device exists in the QEMU config at the |
19 | > time | ||
20 | > > > > we realize the virtual SMMU. I expect the SMMU may be cold plugged, | ||
21 | > while | ||
22 | > > > > the VFIO devices may be hot plugged arbitrarily later, and we should | ||
23 | > have | ||
24 | > > > > the association initialized when the SMMU is realized. | ||
25 | > > > | ||
26 | > > > This is not supported kernel side, you can't instantiate a vIOMMU | ||
27 | > > > without a VFIO device that uses it. For security. | ||
18 | > > | 28 | > > |
19 | > > On a HiSilicon platform that has multiple physical SMMUv3s, the ACC ZIP | 29 | > > What are the security concerns here ? |
20 | > VF | ||
21 | > > devices and HNS VF devices are behind different SMMUv3s. So for a | ||
22 | > Guest, | ||
23 | > > specify two smmuv3-nested devices each behind a pxb-pcie as below, | ||
24 | > > | ||
25 | > > ./qemu-system-aarch64 -machine virt,gic-version=3,default-bus-bypass- | ||
26 | > iommu=on \ | ||
27 | > > -enable-kvm -cpu host -m 4G -smp cpus=8,maxcpus=8 \ | ||
28 | > > -object iommufd,id=iommufd0 \ | ||
29 | > > -bios QEMU_EFI.fd \ | ||
30 | > > -kernel Image \ | ||
31 | > > -device virtio-blk-device,drive=fs \ | ||
32 | > > -drive if=none,file=rootfs.qcow2,id=fs \ | ||
33 | > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ | ||
34 | > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ | ||
35 | > > -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \ | ||
36 | > > -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ | ||
37 | > > -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ | ||
38 | > > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ | ||
39 | > > -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \ | ||
40 | > > -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ | ||
41 | > > -append "rdinit=init console=ttyAMA0 root=/dev/vda2 rw | ||
42 | > earlycon=pl011,0x9000000" \ | ||
43 | > > -device virtio-9p-pci,fsdev=p9fs2,mount_tag=p9,bus=pcie.0 \ | ||
44 | > > -fsdev local,id=p9fs2,path=p9root,security_model=mapped \ | ||
45 | > > -net none \ | ||
46 | > > -nographic | ||
47 | > | 30 | > |
48 | > Above you say the host has 2 SMMUv3 devices, and you've created 2 | 31 | > You should not be able to open iommufd and manipulate iommu HW that |
49 | > SMMUv3 | 32 | > you don't have a VFIO descriptor for, including creating physical |
50 | > guest devices to match. | 33 | > vIOMMU resources, allocating command queues and whatever else. |
51 | > | 34 | > |
52 | > The various emails in this thread & libvirt thread, indicate that each | 35 | > Some kind of hot plug smmu would have to create a vSMMU without any |
53 | > guest SMMUv3 is associated with a host SMMUv3, but I don't see any | 36 | > kernel backing and then later bind it to a kernel implementation. |
54 | > property on the command line for 'arm-smmuv3-nested' that tells it which | ||
55 | > host SMMUv3 it is to be associated with. | ||
56 | > | ||
57 | > How does this association work ? | ||
58 | 37 | ||
59 | You are right. The association is not very obvious in QEMU. The association | 38 | Not sure I get the problem with associating vSMMU with a pSMMU. Something |
60 | and checking are done implicitly by the kernel at the moment. I will try to explain | 39 | like an iommu instance id mentioned before, |
61 | it here. | ||
62 | 40 | ||
63 | Each "arm-smmuv3-nested" instance, when the first device gets attached | 41 | -device arm-smmuv3-accel,id=smmuv2,bus=pcie.2,host-smmu=iommu.1 |
64 | to it, will create an S2 HWPT and a corresponding SMMUv3 domain in the kernel | ||
65 | SMMUv3 driver. This domain will have a pointer representing the physical | ||
66 | SMMUv3 that the device belongs to. Any other device that belongs to | ||
67 | the same physical SMMUv3 can share this S2 domain. | ||
68 | 42 | ||
69 | If a device that belongs to a different physical SMMUv3 gets attached to | 43 | This can realize the vSMMU without actually creating a vIOMMU in the kernel. |
70 | the above domain, the HWPT attach will eventually fail because the physical | 44 | Then, when the dev gets attached/realized, check (via GET_HW_INFO) whether the specified |
71 | SMMUv3 recorded in the domain will not match: | 45 | iommu instance id matches. |
72 | https://elixir.bootlin.com/linux/v6.13/source/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c#L2860 | ||
73 | 46 | ||
74 | And as I mentioned in the cover letter, QEMU will report: | 47 | Or is the concern here exporting an iommu instance id to user space? |
75 | |||
76 | " | ||
77 | Attempt to add the HNS VF to a different SMMUv3 will result in, | ||
78 | |||
79 | -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: Unable to attach viommu | ||
80 | -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: vfio 0000:7d:02.2: | ||
81 | Failed to set iommu_device: [iommufd=29] error attach 0000:7d:02.2 (38) to id=11: Invalid argument | ||
82 | |||
83 | At present Qemu is not doing any extra validation other than the above | ||
84 | failure to make sure the user configuration is correct or not. The | ||
85 | assumption is libvirt will take care of this. | ||
86 | " | ||
87 | So in summary, if libvirt gets it wrong, QEMU will fail with an error. |||
88 | |||
89 | If a more explicit association is required, some help from the kernel is needed | ||
90 | to identify the physical SMMUv3 associated with the device. | ||
91 | |||
92 | Jason/Nicolin, any thoughts on this? | ||
93 | 48 | ||
94 | Thanks, | 49 | Thanks, |
95 | Shameer | 50 | Shameer |
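
For illustration only, the implicit association described in the earlier mail (the first attach pins the S2 domain to one physical SMMUv3, and a later attach from a device behind a different physical SMMUv3 fails with "Invalid argument") can be modelled roughly as below. This is a toy Python sketch, not kernel or QEMU code: `S2Domain`, `attach`, and the labels `psmmu0`/`psmmu1` are hypothetical names, and which host BDF sits behind which physical SMMUv3 is an assumption made for the example.

```python
import errno
from dataclasses import dataclass
from typing import Optional


@dataclass
class S2Domain:
    """Toy stand-in for the S2 domain behind one guest SMMUv3 instance.

    Hypothetical model: 'phys_smmu' plays the role of the pointer to the
    physical SMMUv3 that the domain is tied to.
    """
    phys_smmu: Optional[str] = None

    def attach(self, bdf: str, dev_phys_smmu: str) -> None:
        # First device attached: the domain becomes tied to that
        # device's physical SMMUv3.
        if self.phys_smmu is None:
            self.phys_smmu = dev_phys_smmu
            return
        # Later devices may share the domain only if they sit behind
        # the same physical SMMUv3; otherwise the attach is rejected.
        if self.phys_smmu != dev_phys_smmu:
            raise OSError(errno.EINVAL,
                          f"error attach {bdf}: physical SMMUv3 mismatch")


if __name__ == "__main__":
    domain = S2Domain()
    # Assumed mapping of host devices to physical SMMUs, for illustration.
    domain.attach("0000:7d:02.1", "psmmu0")   # first attach pins the domain
    domain.attach("0000:7d:02.2", "psmmu0")   # same physical SMMU: shared
    try:
        domain.attach("0000:75:00.1", "psmmu1")  # different physical SMMU
    except OSError as err:
        print("attach failed:", err.strerror)
```

The point of the sketch is that nothing is checked until a device is attached, which is why a wrong user configuration only surfaces as an attach failure.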