[PATCH 3/3] intel_iommu: Fix DMA failure when guest switches IOMMU domain

Zhenzhong Duan posted 3 patches 1 month ago
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Yi Liu <yi.l.liu@intel.com>, "Clément Mathieu--Drif" <clement.mathieu--drif@eviden.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Eduardo Habkost <eduardo@habkost.net>
Posted by Zhenzhong Duan 1 month ago
The kernel allows the user to switch the IOMMU domain, e.g., between the
DMA and identity domains. When this happens in IOMMU scalable mode, a
PASID cache invalidation request is sent. The vIOMMU ignores this request,
which leaves the device bound to the wrong address space, so DMA fails.

This issue exists in scalable mode with both first-stage and second-stage
translation, and affects both emulated and passthrough devices.

Taking a network device as an example, the sequence below triggers the issue:

1. start a guest with iommu=pt
2. echo 0000:01:00.0 > /sys/bus/pci/drivers/virtio-pci/unbind
3. echo DMA > /sys/kernel/iommu_groups/6/type
4. echo 0000:01:00.0 > /sys/bus/pci/drivers/virtio-pci/bind
5. Ping test

Fix it by switching the address space in the invalidation handler.

Fixes: 4a4f219e8a10 ("intel_iommu: add scalable-mode option to make scalable mode work")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 hw/i386/intel_iommu.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index d656e9c256..30275a4f23 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3104,7 +3104,7 @@ static void vtd_pasid_cache_sync_locked(gpointer key, gpointer value,
          * reset where the whole guest memory is treated as zeroed.
          */
         pc_entry->valid = false;
-        return;
+        goto switch_as;
     }
 
     /*
@@ -3134,6 +3134,10 @@ static void vtd_pasid_cache_sync_locked(gpointer key, gpointer value,
 
     pc_entry->pasid_entry = pe;
     pc_entry->valid = true;
+
+switch_as:
+    vtd_switch_address_space(vtd_as);
+    vtd_address_space_sync(vtd_as);
 }
 
 static void vtd_pasid_cache_sync(IntelIOMMUState *s, VTDPASIDCacheInfo *pc_info)
-- 
2.47.1
Re: [PATCH 3/3] intel_iommu: Fix DMA failure when guest switches IOMMU domain
Posted by Yi Liu 1 month ago
On 2025/10/15 18:20, Zhenzhong Duan wrote:
> The kernel allows the user to switch the IOMMU domain, e.g., between the
> DMA and identity domains. When this happens in IOMMU scalable mode, a
> PASID cache invalidation request is sent. The vIOMMU ignores this request,
> which leaves the device bound to the wrong address space, so DMA fails.
> 
> This issue exists in scalable mode with both first-stage and second-stage
> translation, and affects both emulated and passthrough devices.

Does it affect emulated devices? The domain switch should come with an
IOTLB/PIOTLB invalidation, right? Then emulated devices should not be
affected.

> 
> Taking a network device as an example, the sequence below triggers the issue:
> 
> 1. start a guest with iommu=pt
> 2. echo 0000:01:00.0 > /sys/bus/pci/drivers/virtio-pci/unbind
> 3. echo DMA > /sys/kernel/iommu_groups/6/type
> 4. echo 0000:01:00.0 > /sys/bus/pci/drivers/virtio-pci/bind
> 5. Ping test
> 
> Fix it by switching the address space in the invalidation handler.

Good catch.

> 
> Fixes: 4a4f219e8a10 ("intel_iommu: add scalable-mode option to make scalable mode work")
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>   hw/i386/intel_iommu.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index d656e9c256..30275a4f23 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3104,7 +3104,7 @@ static void vtd_pasid_cache_sync_locked(gpointer key, gpointer value,
>            * reset where the whole guest memory is treated as zeroed.
>            */
>           pc_entry->valid = false;
> -        return;
> +        goto switch_as;
>       }
>   
>       /*
> @@ -3134,6 +3134,10 @@ static void vtd_pasid_cache_sync_locked(gpointer key, gpointer value,
>   
>       pc_entry->pasid_entry = pe;
>       pc_entry->valid = true;
> +
> +switch_as:
> +    vtd_switch_address_space(vtd_as);
> +    vtd_address_space_sync(vtd_as);
>   }
>   
>   static void vtd_pasid_cache_sync(IntelIOMMUState *s, VTDPASIDCacheInfo *pc_info)

The change looks good to me. You might want to adjust it a bit per the
comment on patch 01.

Regards,
Yi Liu
RE: [PATCH 3/3] intel_iommu: Fix DMA failure when guest switches IOMMU domain
Posted by Duan, Zhenzhong 4 weeks, 1 day ago

>-----Original Message-----
>From: Liu, Yi L <yi.l.liu@intel.com>
>Subject: Re: [PATCH 3/3] intel_iommu: Fix DMA failure when guest switches
>IOMMU domain
>
>On 2025/10/15 18:20, Zhenzhong Duan wrote:
>> The kernel allows the user to switch the IOMMU domain, e.g., between the
>> DMA and identity domains. When this happens in IOMMU scalable mode, a
>> PASID cache invalidation request is sent. The vIOMMU ignores this request,
>> which leaves the device bound to the wrong address space, so DMA fails.
>>
>> This issue exists in scalable mode with both first-stage and second-stage
>> translation, and affects both emulated and passthrough devices.
>
>Does it affect emulated devices? The domain switch should come with an
>IOTLB/PIOTLB invalidation, right? Then emulated devices should not be
>affected.

Yes. Because the vIOMMU misses the address space switch, vtd_iommu_translate() isn't called even with the DMA domain.

With a vhost emulated network card, I get the errors below and the guest hangs.

qemu-system-x86_64: Fail to lookup the translated address fffff000
qemu-system-x86_64: unable to start vhost net: 14: falling back on userspace virtio
qemu-system-x86_64: Guest says index 65535 is available
qemu-system-x86_64: Guest moved used index from 0 to 65535

Thanks
Zhenzhong
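
For readers following the thread, the control-flow change in the patch can be illustrated with a small standalone toy model. This is only a sketch under stated assumptions, not QEMU code: every type and function below (ToyAddressSpace, toy_switch_address_space, toy_pasid_cache_sync, toy_domain_switch) is hypothetical, and the rule used to pick the active region is a simplification chosen for illustration rather than what vtd_switch_address_space() actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these exist in QEMU. */
typedef enum { TOY_AS_PASSTHROUGH, TOY_AS_TRANSLATED } ToyRegion;

typedef struct {
    bool guest_pe_present;  /* guest currently has a PASID entry */
    bool guest_pe_pt;       /* that entry requests pass-through */
    bool cache_valid;       /* cached copy of the entry is valid */
    bool cached_pt;         /* cached copy requests pass-through */
    ToyRegion active;       /* region the device's DMA currently uses */
} ToyAddressSpace;

/* Assumed rule: pass-through only while a valid cached entry asks for it. */
static void toy_switch_address_space(ToyAddressSpace *as)
{
    assert(as != NULL);
    as->active = (as->cache_valid && as->cached_pt) ? TOY_AS_PASSTHROUGH
                                                    : TOY_AS_TRANSLATED;
}

/*
 * Refresh the cached entry after an invalidation request.
 * apply_fix == false models the early return before the patch;
 * apply_fix == true models the "goto switch_as" path after it.
 */
static void toy_pasid_cache_sync(ToyAddressSpace *as, bool apply_fix)
{
    assert(as != NULL);

    if (!as->guest_pe_present) {
        as->cache_valid = false;
        if (!apply_fix) {
            return;             /* pre-patch: active region left untouched */
        }
        goto switch_as;
    }

    as->cached_pt = as->guest_pe_pt;
    as->cache_valid = true;
    if (!apply_fix) {
        return;                 /* pre-patch: cache updated, region stale */
    }

switch_as:
    toy_switch_address_space(as);
}

/* Guest boots in pass-through, then switches the device to a DMA domain. */
static ToyRegion toy_domain_switch(bool apply_fix)
{
    ToyAddressSpace as = {
        .guest_pe_present = true,
        .guest_pe_pt = true,
        .cache_valid = true,
        .cached_pt = true,
        .active = TOY_AS_PASSTHROUGH,
    };

    as.guest_pe_pt = false;     /* guest rewrites its entry for the DMA domain */
    toy_pasid_cache_sync(&as, apply_fix);
    return as.active;
}
```

In this toy model, toy_domain_switch(false) leaves the device on TOY_AS_PASSTHROUGH even though the guest now expects translation, which mirrors the stale binding described in the commit message, while toy_domain_switch(true) ends on TOY_AS_TRANSLATED.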