1 | 1 | ||
---|---|---|---|
2 | > -----Original Message----- | 2 | > -----Original Message----- |
3 | > From: Shameerali Kolothum Thodi | 3 | > From: Nathan Chen <nathanc@nvidia.com> |
4 | > Sent: Thursday, January 30, 2025 6:09 PM | 4 | > Sent: Friday, November 22, 2024 1:42 AM |
5 | > To: 'Daniel P. Berrangé' <berrange@redhat.com> | 5 | > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> |
6 | > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; | 6 | > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; |
7 | > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; | 7 | > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; |
8 | > nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm | 8 | > ddutile@redhat.com; Linuxarm <linuxarm@huawei.com>; Wangzhou (B) |
9 | > <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; | 9 | > <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>; |
10 | > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron | 10 | > Jonathan Cameron <jonathan.cameron@huawei.com>; |
11 | > <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org | 11 | > zhangfei.gao@linaro.org; Nicolin Chen <nicolinc@nvidia.com> |
12 | > Subject: RE: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable | 12 | > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable |
13 | > nested SMMUv3 | 13 | > nested SMMUv3 |
14 | > | 14 | > |
15 | > Hi Daniel, | 15 | > >> Also as a heads up, I've added support for auto-inserting PCIe switch |
16 | > >> between the PXB and GPUs in libvirt to attach multiple devices to a | ||
17 | > SMMU | ||
18 | > >> node per libvirt's documentation - "If you intend to plug multiple | ||
19 | > >> devices into a pcie-expander-bus, you must connect a | ||
20 | > >> pcie-switch-upstream-port to the pcie-root-port that is plugged into the | ||
21 | > >> pcie-expander-bus, and multiple pcie-switch-downstream-ports to the | ||
22 | > >> pcie-switch-upstream-port". Future unit-tests should follow this | ||
23 | > >> topology configuration. | ||
24 | > > | ||
25 | > > Ok. Could you please give me an example Qemu equivalent command | ||
26 | > option, | ||
27 | > > if possible, for the above case. I am not that familiar with libvirt | ||
28 | > and I would | ||
29 | > > also like to test the above scenario. | ||
16 | > | 30 | > |
17 | > > -----Original Message----- | 31 | > You can use "-device x3130-upstream" for the upstream switch port, and |
18 | > > From: Daniel P. Berrangé <berrange@redhat.com> | 32 | > "-device xio3130-downstream" for the downstream port: |
19 | > > Sent: Thursday, January 30, 2025 4:00 PM | ||
20 | > > To: Shameerali Kolothum Thodi | ||
21 | > <shameerali.kolothum.thodi@huawei.com> | ||
22 | > > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org; | ||
23 | > > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com; | ||
24 | > > nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm | ||
25 | > > <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; | ||
26 | > > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron | ||
27 | > > <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org | ||
28 | > > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable | ||
29 | > > nested SMMUv3 | ||
30 | > > | ||
31 | > > On Fri, Nov 08, 2024 at 12:52:37PM +0000, Shameer Kolothum via wrote: | ||
32 | > > > How to use it(Eg:): | ||
33 | > > > | ||
34 | > > > On a HiSilicon platform that has multiple physical SMMUv3s, the ACC | ||
35 | > ZIP | ||
36 | > > VF | ||
37 | > > > devices and HNS VF devices are behind different SMMUv3s. So for a | ||
38 | > > Guest, | ||
39 | > > > specify two smmuv3-nested devices each behind a pxb-pcie as below, | ||
40 | > > > | ||
41 | > > > ./qemu-system-aarch64 -machine virt,gic-version=3,default-bus-bypass- | ||
42 | > > iommu=on \ | ||
43 | > > > -enable-kvm -cpu host -m 4G -smp cpus=8,maxcpus=8 \ | ||
44 | > > > -object iommufd,id=iommufd0 \ | ||
45 | > > > -bios QEMU_EFI.fd \ | ||
46 | > > > -kernel Image \ | ||
47 | > > > -device virtio-blk-device,drive=fs \ | ||
48 | > > > -drive if=none,file=rootfs.qcow2,id=fs \ | ||
49 | > > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ | ||
50 | > > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ | ||
51 | > > > -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \ | ||
52 | > > > -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ | ||
53 | > > > -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ | ||
54 | > > > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ | ||
55 | > > > -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \ | ||
56 | > > > -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ | ||
57 | > > > -append "rdinit=init console=ttyAMA0 root=/dev/vda2 rw | ||
58 | > > earlycon=pl011,0x9000000" \ | ||
59 | > > > -device virtio-9p-pci,fsdev=p9fs2,mount_tag=p9,bus=pcie.0 \ | ||
60 | > > > -fsdev local,id=p9fs2,path=p9root,security_model=mapped \ | ||
61 | > > > -net none \ | ||
62 | > > > -nographic | ||
63 | > > | ||
64 | > > Above you say the host has 2 SMMUv3 devices, and you've created 2 | ||
65 | > > SMMUv3 | ||
66 | > > guest devices to match. | ||
67 | > > | ||
68 | > > The various emails in this thread & libvirt thread, indicate that each | ||
69 | > > guest SMMUv3 is associated with a host SMMUv3, but I don't see any | ||
70 | > > property on the command line for 'arm-smmuv3-nested' that tells it which | ||
71 | > > host SMMUv3 it is to be associated with. | ||
72 | > > | ||
73 | > > How does this association work ? | ||
74 | > | 33 | > |
75 | > You are right. The association is not very obvious in Qemu. The association | 34 | > -device pxb-pcie,bus_nr=250,id=pci.1,bus=pcie.0,addr=0x1 \ |
76 | > and checking is done implicitly by kernel at the moment. I will try to | 35 | > -device pcie-root-port,id=pci.2,bus=pci.1,addr=0x0 \ |
77 | > explain | 36 | > -device x3130-upstream,id=pci.3,bus=pci.2,addr=0x0 \ |
78 | > it here. | 37 | > -device xio3130- |
79 | > | 38 | > downstream,id=pci.4,bus=pci.3,addr=0x0,chassis=17,port=1 \ |
80 | > Each "arm-smmuv3-nested" instance, when the first device gets attached | 39 | > -device vfio-pci,host=0009:01:00.0,id=hostdev0,bus=pci.4,addr=0x0 \ |
81 | > to it, will create a S2 HWPT and a corresponding SMMUv3 domain in kernel | 40 | > -device arm-smmuv3-nested,pci-bus=pci.1 |
82 | > SMMUv3 driver. This domain will have a pointer representing the physical | ||
83 | > SMMUv3 that the device belongs to. Any other device which belongs to | ||
84 | > the same physical SMMUv3 can share this S2 domain. | ||
85 | > | ||
86 | > If a device that belongs to a different physical SMMUv3 gets attached to | ||
87 | > the above domain, the HWPT attach will eventually fail as the physical | ||
88 | > smmuv3 in the domains will have a mismatch, | ||
89 | > https://elixir.bootlin.com/linux/v6.13/source/drivers/iommu/arm/arm- | ||
90 | > smmu-v3/arm-smmu-v3.c#L2860 | ||
91 | > | ||
92 | > And as I mentioned in cover letter, Qemu will report, | ||
93 | > | ||
94 | > " | ||
95 | > Attempt to add the HNS VF to a different SMMUv3 will result in, | ||
96 | > | ||
97 | > -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: | ||
98 | > Unable to attach viommu | ||
99 | > -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: vfio | ||
100 | > 0000:7d:02.2: | ||
101 | > Failed to set iommu_device: [iommufd=29] error attach 0000:7d:02.2 (38) | ||
102 | > to id=11: Invalid argument | ||
103 | > | ||
104 | > At present Qemu is not doing any extra validation other than the above | ||
105 | > failure to make sure the user configuration is correct or not. The | ||
106 | > assumption is libvirt will take care of this. | ||
107 | > " | ||
108 | > So in summary, if libvirt gets it wrong, Qemu will fail with an error. | ||
109 | > | ||
110 | > If a more explicit association is required, some help from kernel is required | ||
111 | > to identify the physical SMMUv3 associated with the device. | ||
112 | 41 | ||
113 | Again thinking about this, to have an explicit association in the Qemu command | 42 | Thanks. Just wondering why libvirt mandates usage of pcie-switch for multiple |
114 | line between the vSMMUv3 and the phys smmuv3, | 43 | device plugging rather than just using pcie-root-ports? |
115 | 44 | ||
116 | We can possibly add something like, | 45 | Please let me know if there is any advantage in doing so that you are aware of. |
117 | |||
118 | -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \ | ||
119 | -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \ | ||
120 | -device arm-smmuv3-accel,bus=pcie.1,phys-smmuv3=smmu3.0x0000000100000000 \ | ||
121 | -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \ | ||
122 | |||
123 | -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \ | ||
124 | -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \ | ||
125 | -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2,phys-smmuv3=smmu3.0x0000000200000000 \ | ||
126 | -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \ | ||
127 | |||
128 | etc. | ||
129 | |||
130 | Qemu would then check that the device is indeed associated | ||
131 | with the specified phys-smmuv3. This can be done by walking the sysfs path, | ||
132 | which I guess is what libvirt currently does to populate the topology. So basically | ||
133 | Qemu would just replicate that check to validate the input again. | ||
134 | |||
135 | Another option is to extend the IOMMU_GET_HW_INFO ioctl to return the phys | ||
136 | smmuv3 base address, which would avoid going through sysfs. | ||
137 | |||
138 | The only difference between the current approach (kernel failing the attach implicitly) | ||
139 | and the above is that Qemu can validate the inputs and maybe report a better | ||
140 | error message than just "Unable to attach viommu" and "Invalid argument". | ||
141 | |||
142 | If the command line looks Ok, I will go with the sysfs path validation method first in my | ||
143 | next respin. | ||
144 | |||
145 | Please let me know. | ||
146 | 46 | ||
147 | Thanks, | 47 | Thanks, |
148 | Shameer | 48 | Shameer |
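
A minimal sketch of the sysfs check discussed above, for illustration only. It assumes that each PCI device exposes an `iommu` symlink under `/sys/bus/pci/devices/<BDF>/` and that the last path component of the link target names the IOMMU instance (for example `smmu3.0x0000000100000000`, matching the `phys-smmuv3=` value proposed in the thread). The function names and the `sysfs_root` parameter are hypothetical; this is not how Qemu or libvirt implement it.

```python
import os


def phys_smmu_for_device(bdf, sysfs_root="/sys"):
    """Return the name of the IOMMU instance a PCI device sits behind.

    Assumption: <sysfs_root>/bus/pci/devices/<bdf>/iommu is a symlink whose
    target's final component identifies the physical IOMMU.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "iommu")
    if not os.path.islink(link):
        raise FileNotFoundError(f"{bdf}: no 'iommu' link at {link}")
    # Resolve the symlink and keep only the last path component.
    return os.path.basename(os.path.realpath(link))


def check_association(bdf, expected, sysfs_root="/sys"):
    """Raise ValueError if the device is not behind the expected IOMMU."""
    actual = phys_smmu_for_device(bdf, sysfs_root)
    if actual != expected:
        raise ValueError(
            f"{bdf} is behind {actual}, not {expected}"
        )
    return actual
```

With something like this, a mismatch can be reported by name before the attach is attempted, rather than surfacing only as "Invalid argument" from the kernel.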