1 | 1 | ||
---|---|---|---|
2 | > -----Original Message----- | 2 | > -----Original Message----- |
3 | > From: Nicolin Chen <nicolinc@nvidia.com> | 3 | > From: Shameerali Kolothum Thodi |
4 | > Sent: Wednesday, November 13, 2024 9:43 PM | 4 | > Sent: Thursday, January 30, 2025 6:09 PM |
5 | > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> | 5 | > To: 'Daniel P. Berrangé' <berrange@redhat.com> |
6 | > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; | 6 | > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; |
7 | > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; | 7 | > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; |
8 | > ddutile@redhat.com; Linuxarm <linuxarm@huawei.com>; Wangzhou (B) | 8 | > nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm |
9 | > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; | 9 | > <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; |
10 | > Jonathan Cameron <jonathan.cameron@huawei.com>; | 10 | > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron |
11 | > zhangfei.gao@linaro.org | 11 | > <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org |
12 | > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable | 12 | > Subject: RE: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable |
13 | > nested SMMUv3 | 13 | > nested SMMUv3 |
14 | > | 14 | > |
15 | > On Fri, Nov 08, 2024 at 12:52:37PM +0000, Shameer Kolothum wrote: | 15 | > Hi Daniel, |
16 | > > This RFC is for initial discussion/test purposes only and includes | ||
17 | > > patches that are only relevant for adding the "arm-smmuv3-nested" | ||
18 | > > support. For the complete branch please find, | ||
19 | > > https://github.com/hisilicon/qemu/commits/private-smmuv3-nested-dev-rfc-v1/ | ||
22 | > | 16 | > |
23 | > I guess the QEMU branch above pairs with this (vIOMMU v6)? | 17 | > > -----Original Message----- |
24 | > https://github.com/nicolinc/iommufd/commits/smmuv3_nesting-with-rmr | 18 | > > From: Daniel P. Berrangé <berrange@redhat.com> |
19 | > > Sent: Thursday, January 30, 2025 4:00 PM | ||
20 | > > To: Shameerali Kolothum Thodi | ||
21 | > <shameerali.kolothum.thodi@huawei.com> | ||
22 | > > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; | ||
23 | > > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; | ||
24 | > > nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm | ||
25 | > > <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; | ||
26 | > > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron | ||
27 | > > <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org | ||
28 | > > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable | ||
29 | > > nested SMMUv3 | ||
30 | > > | ||
31 | > > On Fri, Nov 08, 2024 at 12:52:37PM +0000, Shameer Kolothum via wrote: | ||
32 | > > > How to use it (e.g.): | ||
33 | > > > | ||
34 | > > > On a HiSilicon platform that has multiple physical SMMUv3s, the ACC ZIP VF | ||
35 | > > > devices and HNS VF devices are behind different SMMUv3s. So for a guest, | ||
36 | > > > specify two smmuv3-nested devices, each behind a pxb-pcie, as below: | ||
40 | > > > | ||
41 | > > > ./qemu-system-aarch64 -machine virt,gic-version=3,default-bus-bypass-iommu=on \ | ||
43 | > > > -enable-kvm -cpu host -m 4G -smp cpus=8,maxcpus=8 \ | ||
44 | > > > -object iommufd,id=iommufd0 \ | ||
45 | > > > -bios QEMU_EFI.fd \ | ||
46 | > > > -kernel Image \ | ||
47 | > > > -device virtio-blk-device,drive=fs \ | ||
48 | > > > -drive if=none,file=rootfs.qcow2,id=fs \ | ||
49 | > > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ | ||
50 | > > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ | ||
51 | > > > -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \ | ||
52 | > > > -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ | ||
53 | > > > -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ | ||
54 | > > > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ | ||
55 | > > > -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \ | ||
56 | > > > -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ | ||
57 | > > > -append "rdinit=init console=ttyAMA0 root=/dev/vda2 rw earlycon=pl011,0x9000000" \ | ||
59 | > > > -device virtio-9p-pci,fsdev=p9fs2,mount_tag=p9,bus=pcie.0 \ | ||
60 | > > > -fsdev local,id=p9fs2,path=p9root,security_model=mapped \ | ||
61 | > > > -net none \ | ||
62 | > > > -nographic | ||
63 | > > | ||
64 | > > Above you say the host has 2 SMMUv3 devices, and you've created 2 SMMUv3 | ||
65 | > > guest devices to match. | ||
67 | > > | ||
68 | > > The various emails in this thread & libvirt thread, indicate that each | ||
69 | > > guest SMMUv3 is associated with a host SMMUv3, but I don't see any | ||
70 | > > property on the command line for 'arm-smmuv3-nested' that tells it which | ||
71 | > > host SMMUv3 it is to be associated with. | ||
72 | > > | ||
73 | > > How does this association work ? | ||
74 | > | ||
75 | > You are right. The association is not very obvious in Qemu. The association | ||
76 | > and the checking are done implicitly by the kernel at the moment. I will try | ||
77 | > to explain it here. | ||
79 | > | ||
80 | > Each "arm-smmuv3-nested" instance, when the first device gets attached | ||
81 | > to it, will create a S2 HWPT and a corresponding SMMUv3 domain in kernel | ||
82 | > SMMUv3 driver. This domain will have a pointer representing the physical | ||
83 | > SMMUv3 that the device belongs to. Any other device which belongs to | ||
84 | > the same physical SMMUv3 can share this S2 domain. | ||
85 | > | ||
86 | > If a device that belongs to a different physical SMMUv3 gets attached to | ||
87 | > the above domain, the HWPT attach will eventually fail because the physical | ||
88 | > SMMUv3 recorded in the domain will not match: | ||
89 | > https://elixir.bootlin.com/linux/v6.13/source/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c#L2860 | ||
91 | > | ||
92 | > And as I mentioned in the cover letter, Qemu will report, | ||
93 | > | ||
94 | > " | ||
95 | > Attempt to add the HNS VF to a different SMMUv3 will result in, | ||
96 | > | ||
97 | > -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: | ||
98 | > Unable to attach viommu | ||
99 | > -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: vfio | ||
100 | > 0000:7d:02.2: | ||
101 | > Failed to set iommu_device: [iommufd=29] error attach 0000:7d:02.2 (38) | ||
102 | > to id=11: Invalid argument | ||
103 | > | ||
104 | > At present Qemu is not doing any extra validation other than the above | ||
105 | > failure to make sure the user configuration is correct or not. The | ||
106 | > assumption is libvirt will take care of this. | ||
107 | > " | ||
108 | > So in summary, if libvirt gets it wrong, Qemu will fail with an error. | ||
109 | > | ||
110 | > If a more explicit association is required, some help from the kernel is needed | ||
111 | > to identify the physical SMMUv3 associated with the device. | ||
25 | 112 | ||
26 | I actually based it on top of a kernel branch that Zhangfei is keeping for his verification tests. | 113 | Thinking about this again, to have an explicit association on the Qemu command |
27 | https://github.com/Linaro/linux-kernel-uadk/commits/6.12-wip-10.26/ | 114 | line between the vSMMUv3 and the physical SMMUv3, |
28 | 115 | ||
29 | But yes, it does indeed look like it is based on the branch you mentioned above. | 116 | we could possibly add something like, |
117 | |||
118 | -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ | ||
119 | -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ | ||
120 | -device arm-smmuv3-accel,bus=pcie.1,phys-smmuv3=smmu3.0x0000000100000000 \ | ||
121 | -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ | ||
122 | |||
123 | -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ | ||
124 | -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ | ||
125 | -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2,phys-smmuv3=smmu3.0x0000000200000000 \ | ||
126 | -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ | ||
127 | |||
128 | etc. | ||
129 | |||
130 | Qemu would then check that the device is indeed associated with the | ||
131 | specified phys-smmuv3. This can be done by walking the sysfs path, which | ||
132 | I guess is what libvirt currently does to populate the topology. So basically | ||
133 | Qemu just replicates that to validate again. | ||
134 | |||
135 | Another option is to extend the IOMMU_GET_HW_INFO ioctl to return the physical | ||
136 | SMMUv3 base address, which avoids going through sysfs. | ||
137 | |||
138 | The only difference between the current approach (the kernel failing the attach implicitly) | ||
139 | and the above is that Qemu can validate its inputs and maybe report a better | ||
140 | error message than just saying "Unable to attach viommu/: Invalid argument". | ||
141 | |||
142 | If the command line looks OK, I will go with the sysfs path validation method first in my | ||
143 | next respin. | ||
144 | |||
145 | Please let me know. | ||
30 | 146 | ||
31 | Thanks, | 147 | Thanks, |
32 | Shameer. | 148 | Shameer |
149 | |||
150 | |||
151 | |||
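
The sysfs-based validation discussed above could be sketched roughly as below. This is only an illustration and not code from the series: it assumes the kernel exposes an "iommu" symlink in each PCI device's sysfs directory whose target basename is the IOMMU device name (for example smmu3.0x0000000100000000). The function names and the SYSFS_ROOT override are hypothetical and exist only to keep the sketch testable.

```shell
#!/bin/sh
# Sketch only: resolve which physical IOMMU a PCI device sits behind via sysfs.
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.

# phys_iommu_of BDF
# Prints the name of the IOMMU device the given PCI device is linked to.
phys_iommu_of() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/iommu"
    # The link is absent if the device is not behind an IOMMU.
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

# check_phys_smmuv3 BDF EXPECTED_NAME
# Returns 0 on a match, 1 on a mismatch, 2 if no IOMMU link is found.
check_phys_smmuv3() {
    actual=$(phys_iommu_of "$1") || {
        echo "$1: no IOMMU link found in sysfs" >&2
        return 2
    }
    if [ "$actual" != "$2" ]; then
        echo "$1: behind $actual, not $2" >&2
        return 1
    fi
}
```

For example, check_phys_smmuv3 0000:7d:02.1 smmu3.0x0000000100000000 would succeed only if that VF is linked to that SMMUv3 instance, which gives a clearer error than the implicit attach failure.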