> -----Original Message-----
> From: Peter Maydell <peter.maydell@linaro.org>
> Sent: Friday, December 13, 2024 1:33 PM
> To: Jason Gunthorpe <jgg@nvidia.com>
> Cc: Daniel P. Berrangé <berrange@redhat.com>; Shameerali Kolothum
> Thodi <shameerali.kolothum.thodi@huawei.com>; qemu-arm@nongnu.org;
> qemu-devel@nongnu.org; eric.auger@redhat.com; nicolinc@nvidia.com;
> ddutile@redhat.com; Linuxarm <linuxarm@huawei.com>; Wangzhou (B)
> <wangzhou1@hisilicon.com>; jiangkunkun <jiangkunkun@huawei.com>;
> Jonathan Cameron <jonathan.cameron@huawei.com>;
> zhangfei.gao@linaro.org
> Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> nested SMMUv3
>
> On Fri, 13 Dec 2024 at 12:46, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Fri, Dec 13, 2024 at 12:00:43PM +0000, Daniel P. Berrangé wrote:
> > > On Fri, Nov 08, 2024 at 12:52:37PM +0000, Shameer Kolothum via wrote:
> > > > Hi,
> > > >
> > > > This series adds initial support for a user-creatable "arm-smmuv3-nested"
> > > > device to Qemu. At present the Qemu ARM SMMUv3 emulation is per machine
> > > > and cannot support multiple SMMUv3s.
> > > >
> > > > In order to support vfio-pci dev assignment with vSMMUv3, the physical
> > > > SMMUv3 has to be configured in nested mode. Having a pluggable
> > > > "arm-smmuv3-nested" device enables us to have multiple vSMMUv3 for Guests
> > > > running on a host with multiple physical SMMUv3s. A few benefits of doing
> > > > this are,
> > >
> > > I'm not very familiar with arm, but from this description I'm not
> > > really seeing how "nesting" is involved here. You're only talking
> > > about the host and 1 L1 guest, no L2 guest.
> >
> > nesting is the term the iommu side is using to refer to the 2
> > dimensional paging, ie a guest page table on top of a hypervisor page
> > table.
>
> Isn't that more usually called "two stage" paging? Calling
> that "nesting" seems like it is going to be massively confusing...

Yes. This will be renamed to arm-smmuv3-accel in future revisions.

>
> Also, how does it relate to what this series seems to be
> doing, where we provide the guest with two separate SMMUs?
> (Are those two SMMUs "nested" in the sense that one is sitting
> behind the other?)

I don't think it requires two SMMUs in the Guest. "Nested" or "two
stage" here means that the stage 1 page table is owned by the Guest and
stage 2 by the host, and this is achieved through the IOMMUFD-provided
IOCTLs.

There is a precursor to this series where support for HW-accelerated
two-stage translation is added to the QEMU SMMUv3 code.

Please see the complete branch here:
https://github.com/hisilicon/qemu/commits/private-smmuv3-nested-dev-rfc-v1/

The patches prior to this commit add that support:
4ccdbe3: ("cover-letter: Add HW accelerated nesting support for arm SMMUv3")

Nicolin is going to send those out for review soon. Or I can include
them in this series so that it gives a complete picture. Nicolin?

Hope this clarifies any confusion.

Thanks,
Shameer


> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Thursday, January 30, 2025 6:09 PM
> To: 'Daniel P. Berrangé' <berrange@redhat.com>
> Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com;
> nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org
> Subject: RE: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> nested SMMUv3
>
> Hi Daniel,
>
> > -----Original Message-----
> > From: Daniel P. Berrangé <berrange@redhat.com>
> > Sent: Thursday, January 30, 2025 4:00 PM
> > To: Shameerali Kolothum Thodi
> > <shameerali.kolothum.thodi@huawei.com>
> > Cc: qemu-arm@nongnu.org; qemu-devel@nongnu.org;
> > eric.auger@redhat.com; peter.maydell@linaro.org; jgg@nvidia.com;
> > nicolinc@nvidia.com; ddutile@redhat.com; Linuxarm
> > <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron
> > <jonathan.cameron@huawei.com>; zhangfei.gao@linaro.org
> > Subject: Re: [RFC PATCH 0/5] hw/arm/virt: Add support for user-creatable
> > nested SMMUv3
> >
> > On Fri, Nov 08, 2024 at 12:52:37PM +0000, Shameer Kolothum via wrote:
> > > How to use it(Eg:):
> > >
> > > On a HiSilicon platform that has multiple physical SMMUv3s, the ACC ZIP VF
> > > devices and HNS VF devices are behind different SMMUv3s. So for a Guest,
> > > specify two smmuv3-nested devices each behind a pxb-pcie as below,
> > >
> > > ./qemu-system-aarch64 -machine virt,gic-version=3,default-bus-bypass-iommu=on \
> > > -enable-kvm -cpu host -m 4G -smp cpus=8,maxcpus=8 \
> > > -object iommufd,id=iommufd0 \
> > > -bios QEMU_EFI.fd \
> > > -kernel Image \
> > > -device virtio-blk-device,drive=fs \
> > > -drive if=none,file=rootfs.qcow2,id=fs \
> > > -device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
> > > -device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
> > > -device arm-smmuv3-nested,id=smmuv1,pci-bus=pcie.1 \
> > > -device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \
> > > -device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \
> > > -device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
> > > -device arm-smmuv3-nested,id=smmuv2,pci-bus=pcie.2 \
> > > -device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \
> > > -append "rdinit=init console=ttyAMA0 root=/dev/vda2 rw earlycon=pl011,0x9000000" \
> > > -device virtio-9p-pci,fsdev=p9fs2,mount_tag=p9,bus=pcie.0 \
> > > -fsdev local,id=p9fs2,path=p9root,security_model=mapped \
> > > -net none \
> > > -nographic
> >
> > Above you say the host has 2 SMMUv3 devices, and you've created 2 SMMUv3
> > guest devices to match.
> >
> > The various emails in this thread & libvirt thread, indicate that each
> > guest SMMUv3 is associated with a host SMMUv3, but I don't see any
> > property on the command line for 'arm-ssmv3-nested' that tells it which
> > host eSMMUv3 it is to be associated with.
> >
> > How does this association work ?
>
> You are right. The association is not very obvious in QEMU. The
> association and checking is done implicitly by the kernel at the
> moment. I will try to explain it here.
>
> Each "arm-smmuv3-nested" instance, when the first device gets attached
> to it, will create an S2 HWPT and a corresponding SMMUv3 domain in the
> kernel SMMUv3 driver. This domain will have a pointer representing the
> physical SMMUv3 that the device belongs to, and any other device which
> belongs to the same physical SMMUv3 can share this S2 domain.
>
> If a device that belongs to a different physical SMMUv3 gets attached
> to the above domain, the HWPT attach will eventually fail because the
> physical SMMUv3 in the domains will not match:
> https://elixir.bootlin.com/linux/v6.13/source/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c#L2860
>
> And as I mentioned in the cover letter, QEMU will report:
>
> "
> Attempt to add the HNS VF to a different SMMUv3 will result in,
>
> -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0:
> Unable to attach viommu
> -device vfio-pci,host=0000:7d:02.2,bus=pcie.port3,iommufd=iommufd0: vfio
> 0000:7d:02.2:
> Failed to set iommu_device: [iommufd=29] error attach 0000:7d:02.2 (38)
> to id=11: Invalid argument
>
> At present Qemu is not doing any extra validation other than the above
> failure to make sure the user configuration is correct or not. The
> assumption is libvirt will take care of this.
> "
> So in summary, if libvirt gets it wrong, QEMU will fail with an error.
>
> If a more explicit association is required, some help from the kernel
> is needed to identify the physical SMMUv3 associated with the device.
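
(For reference, the S2 HWPT / vIOMMU flow described above corresponds
roughly to the iommufd calls below. This is only a minimal sketch against
the Linux v6.13 uAPI in <linux/iommufd.h>, not the actual QEMU code; the
helper name is made up, and the device binding, IOAS setup and error
handling are omitted.)

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/iommufd.h>

/*
 * Sketch: when the first vfio-pci device (dev_id) behind a given vSMMUv3
 * instance is attached, allocate the S2 nesting-parent HWPT over the IOAS
 * that holds the guest RAM mappings, then a vIOMMU object on top of it.
 */
static int vsmmu_init_s2(int iommufd, uint32_t dev_id, uint32_t ioas_id,
                         uint32_t *out_s2_hwpt, uint32_t *out_viommu)
{
    struct iommu_hwpt_alloc s2 = {
        .size = sizeof(s2),
        .flags = IOMMU_HWPT_ALLOC_NEST_PARENT,  /* stage-2, owned by the host */
        .dev_id = dev_id,
        .pt_id = ioas_id,
    };
    struct iommu_viommu_alloc viommu = {
        .size = sizeof(viommu),
        .type = IOMMU_VIOMMU_TYPE_ARM_SMMUV3,
        .dev_id = dev_id,
    };

    if (ioctl(iommufd, IOMMU_HWPT_ALLOC, &s2))
        return -1;
    viommu.hwpt_id = s2.out_hwpt_id;
    if (ioctl(iommufd, IOMMU_VIOMMU_ALLOC, &viommu))
        return -1;

    *out_s2_hwpt = s2.out_hwpt_id;
    *out_viommu = viommu.out_viommu_id;
    /*
     * A second device behind the same physical SMMUv3 can share this
     * S2/vIOMMU.  A device behind a different physical SMMUv3 fails the
     * attach in the kernel SMMUv3 driver with -EINVAL, which is where the
     * "Unable to attach viommu ... Invalid argument" error above comes from.
     */
    return 0;
}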

Again thinking about this: to have an explicit association in the QEMU
command line between the vSMMUv3 and the physical SMMUv3, we can
possibly add something like,

-device pxb-pcie,id=pcie.1,bus_nr=8,bus=pcie.0 \
-device pcie-root-port,id=pcie.port1,bus=pcie.1,chassis=1 \
-device arm-smmuv3-accel,bus=pcie.1,phys-smmuv3=smmu3.0x0000000100000000 \
-device vfio-pci,host=0000:7d:02.1,bus=pcie.port1,iommufd=iommufd0 \

-device pxb-pcie,id=pcie.2,bus_nr=16,bus=pcie.0 \
-device pcie-root-port,id=pcie.port2,bus=pcie.2,chassis=2 \
-device arm-smmuv3-accel,id=smmuv2,bus=pcie.2,phys-smmuv3=smmu3.0x0000000200000000 \
-device vfio-pci,host=0000:75:00.1,bus=pcie.port2,iommufd=iommufd0 \

etc.

And QEMU would do some checking to make sure that the device is indeed
associated with the specified phys-smmuv3. This can be done by going
through the sysfs path, which I guess is what libvirt is currently doing
to populate the topology. So basically QEMU is just replicating that
check to validate the configuration again.
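
(The sysfs check could look roughly like the sketch below; the helper name
is made up and this is not libvirt's or QEMU's actual code. The
/sys/bus/pci/devices/<bdf>/iommu symlink points at the IOMMU the device
sits behind, and its target name is the same "smmu3.0x..." string used
above.)

#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Return 1 if the PCI device (e.g. "0000:7d:02.1") sits behind the given
 * physical SMMUv3 (e.g. "smmu3.0x0000000100000000"), 0 otherwise. */
static int phys_smmuv3_matches(const char *bdf, const char *phys_smmuv3)
{
    char link[PATH_MAX], target[PATH_MAX];
    ssize_t len;

    snprintf(link, sizeof(link), "/sys/bus/pci/devices/%s/iommu", bdf);
    len = readlink(link, target, sizeof(target) - 1);
    if (len < 0)
        return 0;               /* no IOMMU link for this device */
    target[len] = '\0';

    /* The last path component is the IOMMU name, e.g. "smmu3.0x...". */
    return strcmp(basename(target), phys_smmuv3) == 0;
}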

Or another option is extending the IOMMU_GET_HW_INFO IOCTL to return the
physical SMMUv3 base address, which would avoid going through sysfs.
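
(For context, a sketch of what that query might look like. The
IOMMU_GET_HW_INFO call and the ARM SMMUv3 info struct below are from the
v6.13 uAPI, but they only report the ID registers today; the base-address
field is purely the hypothetical extension being proposed here, not
something the kernel currently provides.)

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/iommufd.h>

static int query_phys_smmuv3(int iommufd, uint32_t dev_id)
{
    struct iommu_hw_info_arm_smmuv3 smmu_info = {};
    struct iommu_hw_info cmd = {
        .size = sizeof(cmd),
        .dev_id = dev_id,
        .data_len = sizeof(smmu_info),
        .data_uptr = (uintptr_t)&smmu_info,
    };

    if (ioctl(iommufd, IOMMU_GET_HW_INFO, &cmd))
        return -1;
    if (cmd.out_data_type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3)
        return -1;

    /*
     * Today smmu_info only carries the IDR/IIDR/AIDR register values.
     * A hypothetical smmu_info.base_addr (the proposed extension) would
     * let QEMU match the device against phys-smmuv3 without sysfs.
     */
    return 0;
}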

The only difference between the current approach (the kernel failing the
attach implicitly) and the above is that QEMU can validate the inputs and
report a better error message than just saying "Unable to attach viommu:
Invalid argument".

If the command line looks OK, I will go with the sysfs path validation
method first in my next respin.

Please let me know.

Thanks,
Shameer