[PATCH 3/3] virtio-iommu: Support PCI device aliases

Zhenzhong Duan posted 3 patches 10 months, 1 week ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Eric Auger <eric.auger@redhat.com>, Peter Maydell <peter.maydell@linaro.org>, "Michael S. Tsirkin" <mst@redhat.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Eduardo Habkost <eduardo@habkost.net>, Peter Xu <peterx@redhat.com>, Jason Wang <jasowang@redhat.com>, Helge Deller <deller@gmx.de>, Andrey Smirnov <andrew.smirnov@gmail.com>, "Cédric Le Goater" <clg@kaod.org>, Nicholas Piggin <npiggin@gmail.com>, "Frédéric Barrat" <fbarrat@linux.ibm.com>, "Hervé Poussineau" <hpoussin@reactos.org>, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>, BALATON Zoltan <balaton@eik.bme.hu>, Daniel Henrique Barboza <danielhb413@gmail.com>, David Gibson <david@gibson.dropbear.id.au>, Harsh Prateek Bora <harshpb@linux.ibm.com>, Elena Ufimtseva <elena.ufimtseva@oracle.com>, Jagannathan Raman <jag.raman@oracle.com>, Matthew Rosato <mjrosato@linux.ibm.com>, Eric Farman <farman@linux.ibm.com>, Thomas Huth <thuth@redhat.com>, David Hildenbrand <david@redhat.com>, Ilya Leoshkevich <iii@linux.ibm.com>, Halil Pasic <pasic@linux.ibm.com>, Christian Borntraeger <borntraeger@linux.ibm.com>
[PATCH 3/3] virtio-iommu: Support PCI device aliases
Posted by Zhenzhong Duan 10 months, 1 week ago
Currently virtio-iommu doesn't work well when there are multiple devices
in the same iommu group. In the example config below, the guest
virtio-iommu driver can successfully probe the first device but fails on
the others. Only one device under the bridge works normally.

-device virtio-iommu \
-device pcie-pci-bridge,id=root0 \
-device vfio-pci,host=81:11.0,bus=root0 \
-device vfio-pci,host=6f:01.0,bus=root0 \

The reason is that virtio-iommu stores the AS (address space) in a hash
table keyed by the aliased BDF, but correlates it with an endpoint that is
indexed by the device's real BDF. That is, virtio_iommu_mr() is passed a
real BDF to look up the AS hash table, so we get either the wrong AS or
NULL.

Fix it by storing the AS indexed by the real BDF. This also allows the
iova_ranges from a vfio device to be stored in the IOMMUDevice of the real
BDF.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/virtio/virtio-iommu.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index d99c1f0d64..6880d92a44 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
                                               int real_devfn)
 {
     VirtIOIOMMU *s = opaque;
-    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
+    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
     static uint32_t mr_index;
     IOMMUDevice *sdev;
 
     if (!sbus) {
         sbus = g_malloc0(sizeof(IOMMUPciBus) +
                          sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
-        sbus->bus = bus;
-        g_hash_table_insert(s->as_by_busptr, bus, sbus);
+        sbus->bus = real_bus;
+        g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
     }
 
-    sdev = sbus->pbdev[devfn];
+    sdev = sbus->pbdev[real_devfn];
     if (!sdev) {
         char *name = g_strdup_printf("%s-%d-%d",
                                      TYPE_VIRTIO_IOMMU_MEMORY_REGION,
-                                     mr_index++, devfn);
-        sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
+                                     mr_index++, real_devfn);
+        sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
 
         sdev->viommu = s;
-        sdev->bus = bus;
-        sdev->devfn = devfn;
+        sdev->bus = real_bus;
+        sdev->devfn = real_devfn;
 
         trace_virtio_iommu_init_iommu_mr(name);
 
-- 
2.34.1
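
The lookup mismatch described in the commit message above can be
illustrated with a toy model. This is only a sketch, not QEMU code: the
ToyDev/ToyTable types, the toy_* helpers, and the concrete bus/devfn
values (an alias of 01:00.0, devices at devfn 0x08 and 0x10) are all
hypothetical, and a plain 2D array stands in for the as_by_busptr hash
table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUS_MAX   4    /* hypothetical: tiny toy topology */
#define TOY_DEVFN_MAX 256  /* 32 slots x 8 functions per bus */

/* Toy stand-in for an IOMMUDevice; not QEMU's real structure. */
typedef struct ToyDev {
    int bus;
    int devfn;
} ToyDev;

/* Toy address-space table indexed by [bus][devfn]. */
typedef struct ToyTable {
    ToyDev *slot[TOY_BUS_MAX][TOY_DEVFN_MAX];
} ToyTable;

/* Record a device under the given (bus, devfn) key. */
static void toy_store(ToyTable *t, int bus, int devfn, ToyDev *d)
{
    d->bus = bus;
    d->devfn = devfn;
    t->slot[bus][devfn] = d;
}

/* Look a device up by (bus, devfn); NULL if nothing was stored there. */
static ToyDev *toy_lookup(const ToyTable *t, int bus, int devfn)
{
    return t->slot[bus][devfn];
}

/*
 * Store two devices that sit behind the same bridge, then count how many
 * of them can be found again by their real BDF. All values are assumed.
 */
int toy_count_found(int key_by_real_bdf)
{
    ToyTable t;
    ToyDev dev_a, dev_b;
    const int alias_bus = 1, alias_devfn = 0x00;
    const int real_bus = 1, devfn_a = 0x08, devfn_b = 0x10;

    memset(&t, 0, sizeof(t));

    if (key_by_real_bdf) {
        /* Patched behaviour: each device gets its own slot. */
        toy_store(&t, real_bus, devfn_a, &dev_a);
        toy_store(&t, real_bus, devfn_b, &dev_b);
    } else {
        /* Old behaviour: both devices land on the shared alias key. */
        toy_store(&t, alias_bus, alias_devfn, &dev_a);
        toy_store(&t, alias_bus, alias_devfn, &dev_b);
    }

    return (toy_lookup(&t, real_bus, devfn_a) != NULL) +
           (toy_lookup(&t, real_bus, devfn_b) != NULL);
}

/* Keyed by alias, neither real BDF resolves; keyed by real BDF, both do. */
void toy_self_check(void)
{
    assert(toy_count_found(0) == 0);
    assert(toy_count_found(1) == 2);
}
```

Calling toy_self_check() from a main() exercises both cases. In this toy
setup the alias-keyed table misses both devices only because the assumed
alias differs from both real devfns; a device whose real BDF happened to
equal the alias would still be found.
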
Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases
Posted by Eric Auger 10 months ago
Hi Zhenzhong,

On 1/22/24 07:40, Zhenzhong Duan wrote:
> Currently virtio-iommu doesn't work well when there are multiple devices
> in the same iommu group. In the example config below, the guest
> virtio-iommu driver can successfully probe the first device but fails on
> the others. Only one device under the bridge works normally.
>
> -device virtio-iommu \
> -device pcie-pci-bridge,id=root0 \
> -device vfio-pci,host=81:11.0,bus=root0 \
> -device vfio-pci,host=6f:01.0,bus=root0 \
>
> The reason is that virtio-iommu stores the AS (address space) in a hash
> table keyed by the aliased BDF, but correlates it with an endpoint that is
> indexed by the device's real BDF. That is, virtio_iommu_mr() is passed a
> real BDF to look up the AS hash table, so we get either the wrong AS or
> NULL.
>
> Fix it by storing the AS indexed by the real BDF. This also allows the
> iova_ranges from a vfio device to be stored in the IOMMUDevice of the real
> BDF.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  hw/virtio/virtio-iommu.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> index d99c1f0d64..6880d92a44 100644
> --- a/hw/virtio/virtio-iommu.c
> +++ b/hw/virtio/virtio-iommu.c
> @@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>                                                int real_devfn)
>  {
>      VirtIOIOMMU *s = opaque;
> -    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
> +    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
>      static uint32_t mr_index;
>      IOMMUDevice *sdev;
>  
>      if (!sbus) {
>          sbus = g_malloc0(sizeof(IOMMUPciBus) +
>                           sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
> -        sbus->bus = bus;
> -        g_hash_table_insert(s->as_by_busptr, bus, sbus);
> +        sbus->bus = real_bus;
> +        g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
>      }
>  
> -    sdev = sbus->pbdev[devfn];
> +    sdev = sbus->pbdev[real_devfn];
>      if (!sdev) {
>          char *name = g_strdup_printf("%s-%d-%d",
>                                       TYPE_VIRTIO_IOMMU_MEMORY_REGION,
> -                                     mr_index++, devfn);
> -        sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
> +                                     mr_index++, real_devfn);
> +        sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
>  
>          sdev->viommu = s;
> -        sdev->bus = bus;
> -        sdev->devfn = devfn;
> +        sdev->bus = real_bus;
> +        sdev->devfn = real_devfn;
But then this means the two devices would be abstracted by two different
IOMMU MRs, whereas in practice they cannot be distinguished from an IOMMU
point of view. Shouldn't the virtio-iommu driver use the same ep_id for
both devices within the same group?

Note there are some known issues about virtio-iommu and pcie-to-pci
bridges which were reported early last year and confirmed by Robin
Murphy. See:

[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr() <https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/#r>

Thanks

Eric




>  
>          trace_virtio_iommu_init_iommu_mr(name);
>
RE: [PATCH 3/3] virtio-iommu: Support PCI device aliases
Posted by Duan, Zhenzhong 10 months ago

>-----Original Message-----
>From: Eric Auger <eric.auger@redhat.com>
>Subject: Re: [PATCH 3/3] virtio-iommu: Support PCI device aliases
>
>Hi Zhenzhong,
>
>On 1/22/24 07:40, Zhenzhong Duan wrote:
>> Currently virtio-iommu doesn't work well when there are multiple devices
>> in the same iommu group. In the example config below, the guest
>> virtio-iommu driver can successfully probe the first device but fails on
>> the others. Only one device under the bridge works normally.
>>
>> -device virtio-iommu \
>> -device pcie-pci-bridge,id=root0 \
>> -device vfio-pci,host=81:11.0,bus=root0 \
>> -device vfio-pci,host=6f:01.0,bus=root0 \
>>
>> The reason is that virtio-iommu stores the AS (address space) in a hash
>> table keyed by the aliased BDF, but correlates it with an endpoint that is
>> indexed by the device's real BDF. That is, virtio_iommu_mr() is passed a
>> real BDF to look up the AS hash table, so we get either the wrong AS or
>> NULL.
>>
>> Fix it by storing the AS indexed by the real BDF. This also allows the
>> iova_ranges from a vfio device to be stored in the IOMMUDevice of the real
>> BDF.
>>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>  hw/virtio/virtio-iommu.c | 16 ++++++++--------
>>  1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
>> index d99c1f0d64..6880d92a44 100644
>> --- a/hw/virtio/virtio-iommu.c
>> +++ b/hw/virtio/virtio-iommu.c
>> @@ -399,27 +399,27 @@ static AddressSpace *virtio_iommu_find_add_as(PCIBus *bus, void *opaque,
>>                                                int real_devfn)
>>  {
>>      VirtIOIOMMU *s = opaque;
>> -    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, bus);
>> +    IOMMUPciBus *sbus = g_hash_table_lookup(s->as_by_busptr, real_bus);
>>      static uint32_t mr_index;
>>      IOMMUDevice *sdev;
>>
>>      if (!sbus) {
>>          sbus = g_malloc0(sizeof(IOMMUPciBus) +
>>                           sizeof(IOMMUDevice *) * PCI_DEVFN_MAX);
>> -        sbus->bus = bus;
>> -        g_hash_table_insert(s->as_by_busptr, bus, sbus);
>> +        sbus->bus = real_bus;
>> +        g_hash_table_insert(s->as_by_busptr, real_bus, sbus);
>>      }
>>
>> -    sdev = sbus->pbdev[devfn];
>> +    sdev = sbus->pbdev[real_devfn];
>>      if (!sdev) {
>>          char *name = g_strdup_printf("%s-%d-%d",
>>                                       TYPE_VIRTIO_IOMMU_MEMORY_REGION,
>> -                                     mr_index++, devfn);
>> -        sdev = sbus->pbdev[devfn] = g_new0(IOMMUDevice, 1);
>> +                                     mr_index++, real_devfn);
>> +        sdev = sbus->pbdev[real_devfn] = g_new0(IOMMUDevice, 1);
>>
>>          sdev->viommu = s;
>> -        sdev->bus = bus;
>> -        sdev->devfn = devfn;
>> +        sdev->bus = real_bus;
>> +        sdev->devfn = real_devfn;
>But then this means the two devices would be abstracted by two different
>IOMMU MRs, whereas in practice they cannot be distinguished from an
>IOMMU point of view.

Yes, normally the two different IOMMU MRs should link to the same guest
domain, so the translation result will be the same. But if a malicious
guest tries to break that, this change fails to block it.
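
As a rough illustration of the point above (a toy model, not QEMU code):
the ToyDomain and ToyEndpoint types, the offset-based "translation", and
both helper functions are hypothetical. Each endpoint has its own
translation object that simply follows whatever domain the guest
attached it to.

```c
#include <assert.h>
#include <stdint.h>

/* Toy domain: translation is just iova + offset (assumed for brevity). */
typedef struct ToyDomain {
    uint64_t offset;
} ToyDomain;

/* Toy endpoint: one per real BDF, pointing at its attached domain. */
typedef struct ToyEndpoint {
    const ToyDomain *domain;
} ToyEndpoint;

/* Translate through whichever domain this endpoint is attached to. */
uint64_t toy_translate(const ToyEndpoint *ep, uint64_t iova)
{
    return iova + ep->domain->offset;
}

/*
 * Two endpoints behind one alias are only consistent if they share a
 * domain; nothing in this model forces the guest to keep them that way.
 */
int toy_group_is_consistent(const ToyEndpoint *a, const ToyEndpoint *b)
{
    return a->domain == b->domain;
}
```

With both endpoints attached to the same ToyDomain, toy_translate()
returns the same address for either one. If a guest attaches them to
domains with different offsets, the results diverge and
toy_group_is_consistent() returns 0, which is the case the per-real-BDF
MRs do not block.
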

>Shouldn't the virtio-iommu driver use the same ep_id for both
>devices within the same group?

IIUC, you mean for the domain attach and not the probe request?
I was thinking ep_id represented an existing device in the guest, not the aliased one.

>
>Note there are some known issues about virtio-iommu and pcie-to-pci
>bridges which were reported early last year and confirmed by Robin
>Murphy. See:
>
>[RFC] virtio-iommu: Take into account possible aliasing in virtio_iommu_mr()
><https://lore.kernel.org/all/20230116124709.793084-1-eric.auger@redhat.com/#r>

Thanks for sharing, it’s valuable😊

BRs.
Zhenzhong