[PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap

Zhenzhong Duan posted 5 patches 2 weeks, 4 days ago
Maintainers: Yi Liu <yi.l.liu@intel.com>, Eric Auger <eric.auger@redhat.com>, Zhenzhong Duan <zhenzhong.duan@intel.com>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, "Clément Mathieu--Drif" <clement.mathieu--drif@eviden.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Eduardo Habkost <eduardo@habkost.net>, Alex Williamson <alex.williamson@redhat.com>, "Cédric Le Goater" <clg@redhat.com>
[PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Zhenzhong Duan 2 weeks, 4 days ago
Currently we support device and iommu dirty tracking, device dirty
tracking is preferred.

Add the framework code in iommufd_cdev_unmap_one() to choose either
device or iommu dirty tracking, just like vfio_legacy_dma_unmap_one().

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
 hw/vfio/iommufd.c | 56 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 45 insertions(+), 11 deletions(-)

diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 48c590b6a9..b5d6e54c45 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -58,33 +58,67 @@ static int iommufd_cdev_map_file(const VFIOContainerBase *bcontainer,
                                         iova, size, fd, start, readonly);
 }
 
-static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,
-                              hwaddr iova, ram_addr_t size,
-                              IOMMUTLBEntry *iotlb, bool unmap_all)
+static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
+                                  hwaddr iova, ram_addr_t size,
+                                  IOMMUTLBEntry *iotlb)
 {
     const VFIOIOMMUFDContainer *container =
         container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
+    bool need_dirty_sync = false;
+    Error *local_err = NULL;
+    IOMMUFDBackend *be = container->be;
+    uint32_t ioas_id = container->ioas_id;
+    int ret;
+
+    if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
+        if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
+            bcontainer->dirty_pages_supported) {
+            /* TODO: query dirty bitmap before DMA unmap */
+            return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
+        }
+
+        need_dirty_sync = true;
+    }
+
+    ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
+    if (ret) {
+        return ret;
+    }
 
+    if (need_dirty_sync) {
+        ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+                                                iotlb->translated_addr,
+                                                &local_err);
+        if (ret) {
+            error_report_err(local_err);
+            return ret;
+        }
+    }
+
+    return 0;
+}
+
+static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,
+                              hwaddr iova, ram_addr_t size,
+                              IOMMUTLBEntry *iotlb, bool unmap_all)
+{
     /* unmap in halves */
     if (unmap_all) {
         Int128 llsize = int128_rshift(int128_2_64(), 1);
         int ret;
 
-        ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
-                                        0, int128_get64(llsize));
+        ret = iommufd_cdev_unmap_one(bcontainer, 0, int128_get64(llsize),
+                                     iotlb);
 
         if (ret == 0) {
-            ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
-                                            int128_get64(llsize),
-                                            int128_get64(llsize));
+            ret = iommufd_cdev_unmap_one(bcontainer, int128_get64(llsize),
+                                         int128_get64(llsize), iotlb);
         }
 
         return ret;
     }
 
-    /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
-    return iommufd_backend_unmap_dma(container->be,
-                                     container->ioas_id, iova, size);
+    return iommufd_cdev_unmap_one(bcontainer, iova, size, iotlb);
 }
 
 static bool iommufd_cdev_kvm_device_add(VFIODevice *vbasedev, Error **errp)
-- 
2.47.1
Re: [PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Cédric Le Goater 1 week, 2 days ago
Hello Zhenzhong

On 9/10/25 04:36, Zhenzhong Duan wrote:
> Currently we support device and iommu dirty tracking, device dirty
> tracking is preferred.
> 
> Add the framework code in iommufd_cdev_unmap_one() to choose either
> device or iommu dirty tracking, just like vfio_legacy_dma_unmap_one().

I wonder if commit 567d7d3e6be5 ("vfio/common: Work around kernel
overflow bug in DMA unmap") could be removed now to make the code
common to both VFIO IOMMU Type1 and IOMMUFD backends.

I asked Alex and Peter in another thread.

Thanks,

C.




> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
>   hw/vfio/iommufd.c | 56 +++++++++++++++++++++++++++++++++++++----------
>   1 file changed, 45 insertions(+), 11 deletions(-)
> 
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 48c590b6a9..b5d6e54c45 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -58,33 +58,67 @@ static int iommufd_cdev_map_file(const VFIOContainerBase *bcontainer,
>                                           iova, size, fd, start, readonly);
>   }
>   
> -static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,
> -                              hwaddr iova, ram_addr_t size,
> -                              IOMMUTLBEntry *iotlb, bool unmap_all)
> +static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
> +                                  hwaddr iova, ram_addr_t size,
> +                                  IOMMUTLBEntry *iotlb)
>   {
>       const VFIOIOMMUFDContainer *container =
>           container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
> +    bool need_dirty_sync = false;
> +    Error *local_err = NULL;
> +    IOMMUFDBackend *be = container->be;
> +    uint32_t ioas_id = container->ioas_id;
> +    int ret;
> +
> +    if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
> +        if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
> +            bcontainer->dirty_pages_supported) {
> +            /* TODO: query dirty bitmap before DMA unmap */
> +            return iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> +        }
> +
> +        need_dirty_sync = true;
> +    }
> +
> +    ret = iommufd_backend_unmap_dma(be, ioas_id, iova, size);
> +    if (ret) {
> +        return ret;
> +    }
>   
> +    if (need_dirty_sync) {
> +        ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
> +                                                iotlb->translated_addr,
> +                                                &local_err);
> +        if (ret) {
> +            error_report_err(local_err);
> +            return ret;
> +        }
> +    }
> +
> +    return 0;
> +}
> +static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,
> +                              hwaddr iova, ram_addr_t size,
> +                              IOMMUTLBEntry *iotlb, bool unmap_all)
> +{
>       /* unmap in halves */
>       if (unmap_all) {
>           Int128 llsize = int128_rshift(int128_2_64(), 1);
>           int ret;
>   
> -        ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
> -                                        0, int128_get64(llsize));
> +        ret = iommufd_cdev_unmap_one(bcontainer, 0, int128_get64(llsize),
> +                                     iotlb);
>   
>           if (ret == 0) {
> -            ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
> -                                            int128_get64(llsize),
> -                                            int128_get64(llsize));
> +            ret = iommufd_cdev_unmap_one(bcontainer, int128_get64(llsize),
> +                                         int128_get64(llsize), iotlb);
>           }
>   
>           return ret;
>       }
>   
> -    /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
> -    return iommufd_backend_unmap_dma(container->be,
> -                                     container->ioas_id, iova, size);
> +    return iommufd_cdev_unmap_one(bcontainer, iova, size, iotlb);
>   }
>   
>   static bool iommufd_cdev_kvm_device_add(VFIODevice *vbasedev, Error **errp)
RE: [PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Duan, Zhenzhong 6 days, 13 hours ago
Hi Cedric,

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>getting dirty bitmap before unmap
>
>Hello Zhenzhong
>
>On 9/10/25 04:36, Zhenzhong Duan wrote:
>> Currently we support device and iommu dirty tracking, device dirty
>> tracking is preferred.
>>
>> Add the framework code in iommufd_cdev_unmap_one() to choose either
>> device or iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
>
>I wonder if commit 567d7d3e6be5 ("vfio/common: Work around kernel
>overflow bug in DMA unmap") could be removed now to make the code
>common to both VFIO IOMMU Type1 and IOMMUFD backends.

I am not clear if there is other reason to keep the workaround, but the original
kernel issue had been fixed with below commit:

commit 58fec830fc19208354895d9832785505046d6c01
Author: Alex Williamson <alex.williamson@redhat.com>
Date:   Mon Jan 7 22:13:22 2019 -0700

    vfio/type1: Fix unmap overflow off-by-one

    The below referenced commit adds a test for integer overflow, but in
    doing so prevents the unmap ioctl from ever including the last page of
    the address space.  Subtract one to compare to the last address of the
    unmap to avoid the overflow and wrap-around.

    Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow warning")
    Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
    Cc: stable@vger.kernel.org # v4.15+
    Reported-by: Pei Zhang <pezhang@redhat.com>
    Debugged-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: Peter Xu <peterx@redhat.com>
    Tested-by: Peter Xu <peterx@redhat.com>
    Reviewed-by: Cornelia Huck <cohuck@redhat.com>
    Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

>
>I asked Alex and Peter in another thread.

Just curious on the answer, may I ask which thread?

btw: I just found unmapping in halves seems unnecessary as both backends of kernel side support unmap_all now.

    if (unmap_all) {
        /* The unmap ioctl doesn't accept a full 64-bit span. */
        Int128 llsize = int128_rshift(int128_2_64(), 1);

        ret = vfio_legacy_dma_unmap_one(bcontainer, 0, int128_get64(llsize),
                                        iotlb);

        if (ret == 0) {
            ret = vfio_legacy_dma_unmap_one(bcontainer, int128_get64(llsize),
                                            int128_get64(llsize), iotlb);
        }

    } else {
        ret = vfio_legacy_dma_unmap_one(bcontainer, iova, size, iotlb);
    }

Thanks
Zhenzhong
Re: [PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Cédric Le Goater 6 days, 8 hours ago
On 9/22/25 05:17, Duan, Zhenzhong wrote:
> Hi Cedric,
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>> getting dirty bitmap before unmap
>>
>> Hello Zhenzhong
>>
>> On 9/10/25 04:36, Zhenzhong Duan wrote:
>>> Currently we support device and iommu dirty tracking, device dirty
>>> tracking is preferred.
>>>
>>> Add the framework code in iommufd_cdev_unmap_one() to choose either
>>> device or iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
>>
>> I wonder if commit 567d7d3e6be5 ("vfio/common: Work around kernel
>> overflow bug in DMA unmap") could be removed now to make the code
>> common to both VFIO IOMMU Type1 and IOMMUFD backends.
> 
> I am not clear if there is other reason to keep the workaround, but the original
> kernel issue had been fixed with below commit:
> 
> commit 58fec830fc19208354895d9832785505046d6c01
> Author: Alex Williamson <alex.williamson@redhat.com>
> Date:   Mon Jan 7 22:13:22 2019 -0700
> 
>      vfio/type1: Fix unmap overflow off-by-one
> 
>      The below referenced commit adds a test for integer overflow, but in
>      doing so prevents the unmap ioctl from ever including the last page of
>      the address space.  Subtract one to compare to the last address of the
>      unmap to avoid the overflow and wrap-around.
> 
>      Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow warning")
>      Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
>      Cc: stable@vger.kernel.org # v4.15+
>      Reported-by: Pei Zhang <pezhang@redhat.com>
>      Debugged-by: Peter Xu <peterx@redhat.com>
>      Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
>      Reviewed-by: Peter Xu <peterx@redhat.com>
>      Tested-by: Peter Xu <peterx@redhat.com>
>      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> 
>>
>> I asked Alex and Peter in another thread.
> 
> Just curious on the answer, may I ask which thread?

According to Alex, the QEMU workaround can be removed :

https://lore.kernel.org/qemu-devel/20250919102447.748e17fe.alex.williamson@redhat.com/

> btw: I just found unmapping in halves seems unnecessary as both backends of kernel side support unmap_all now.
> 
>      if (unmap_all) {
>          /* The unmap ioctl doesn't accept a full 64-bit span. */
>          Int128 llsize = int128_rshift(int128_2_64(), 1);
> 
>          ret = vfio_legacy_dma_unmap_one(bcontainer, 0, int128_get64(llsize),
>                                          iotlb);
> 
>          if (ret == 0) {
>              ret = vfio_legacy_dma_unmap_one(bcontainer, int128_get64(llsize),
>                                              int128_get64(llsize), iotlb);
>          }
> 
>      } else {
>          ret = vfio_legacy_dma_unmap_one(bcontainer, iova, size, iotlb);
>      }

Good. So we can simply both backends it seems.

Thanks,

C.







RE: [PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Duan, Zhenzhong 5 days, 14 hours ago

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>getting dirty bitmap before unmap
>
>On 9/22/25 05:17, Duan, Zhenzhong wrote:
>> Hi Cedric,
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>>> getting dirty bitmap before unmap
>>>
>>> Hello Zhenzhong
>>>
>>> On 9/10/25 04:36, Zhenzhong Duan wrote:
>>>> Currently we support device and iommu dirty tracking, device dirty
>>>> tracking is preferred.
>>>>
>>>> Add the framework code in iommufd_cdev_unmap_one() to choose
>either
>>>> device or iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
>>>
>>> I wonder if commit 567d7d3e6be5 ("vfio/common: Work around kernel
>>> overflow bug in DMA unmap") could be removed now to make the code
>>> common to both VFIO IOMMU Type1 and IOMMUFD backends.
>>
>> I am not clear if there is other reason to keep the workaround, but the
>original
>> kernel issue had been fixed with below commit:
>>
>> commit 58fec830fc19208354895d9832785505046d6c01
>> Author: Alex Williamson <alex.williamson@redhat.com>
>> Date:   Mon Jan 7 22:13:22 2019 -0700
>>
>>      vfio/type1: Fix unmap overflow off-by-one
>>
>>      The below referenced commit adds a test for integer overflow, but in
>>      doing so prevents the unmap ioctl from ever including the last page
>of
>>      the address space.  Subtract one to compare to the last address of
>the
>>      unmap to avoid the overflow and wrap-around.
>>
>>      Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow warning")
>>      Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
>>      Cc: stable@vger.kernel.org # v4.15+
>>      Reported-by: Pei Zhang <pezhang@redhat.com>
>>      Debugged-by: Peter Xu <peterx@redhat.com>
>>      Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
>>      Reviewed-by: Peter Xu <peterx@redhat.com>
>>      Tested-by: Peter Xu <peterx@redhat.com>
>>      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>
>>>
>>> I asked Alex and Peter in another thread.
>>
>> Just curious on the answer, may I ask which thread?
>
>According to Alex, the QEMU workaround can be removed :
>
>https://lore.kernel.org/qemu-devel/20250919102447.748e17fe.alex.williams
>on@redhat.com/
>
>> btw: I just found unmapping in halves seems unnecessary as both backends
>of kernel side support unmap_all now.
>>
>>      if (unmap_all) {
>>          /* The unmap ioctl doesn't accept a full 64-bit span. */
>>          Int128 llsize = int128_rshift(int128_2_64(), 1);
>>
>>          ret = vfio_legacy_dma_unmap_one(bcontainer, 0,
>int128_get64(llsize),
>>                                          iotlb);
>>
>>          if (ret == 0) {
>>              ret = vfio_legacy_dma_unmap_one(bcontainer,
>int128_get64(llsize),
>>                                              int128_get64(llsize),
>iotlb);
>>          }
>>
>>      } else {
>>          ret = vfio_legacy_dma_unmap_one(bcontainer, iova, size,
>iotlb);
>>      }
>
>Good. So we can simply both backends it seems.

Will you handle them or not? I mean the workaround removing and unmapping_all optimization.

Thanks
Zhenzhong
Re: [PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Cédric Le Goater 5 days, 9 hours ago
On 9/23/25 04:45, Duan, Zhenzhong wrote:
> 
> 
>> -----Original Message-----
>> From: Cédric Le Goater <clg@redhat.com>
>> Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>> getting dirty bitmap before unmap
>>
>> On 9/22/25 05:17, Duan, Zhenzhong wrote:
>>> Hi Cedric,
>>>
>>>> -----Original Message-----
>>>> From: Cédric Le Goater <clg@redhat.com>
>>>> Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>>>> getting dirty bitmap before unmap
>>>>
>>>> Hello Zhenzhong
>>>>
>>>> On 9/10/25 04:36, Zhenzhong Duan wrote:
>>>>> Currently we support device and iommu dirty tracking, device dirty
>>>>> tracking is preferred.
>>>>>
>>>>> Add the framework code in iommufd_cdev_unmap_one() to choose
>> either
>>>>> device or iommu dirty tracking, just like vfio_legacy_dma_unmap_one().
>>>>
>>>> I wonder if commit 567d7d3e6be5 ("vfio/common: Work around kernel
>>>> overflow bug in DMA unmap") could be removed now to make the code
>>>> common to both VFIO IOMMU Type1 and IOMMUFD backends.
>>>
>>> I am not clear if there is other reason to keep the workaround, but the
>> original
>>> kernel issue had been fixed with below commit:
>>>
>>> commit 58fec830fc19208354895d9832785505046d6c01
>>> Author: Alex Williamson <alex.williamson@redhat.com>
>>> Date:   Mon Jan 7 22:13:22 2019 -0700
>>>
>>>       vfio/type1: Fix unmap overflow off-by-one
>>>
>>>       The below referenced commit adds a test for integer overflow, but in
>>>       doing so prevents the unmap ioctl from ever including the last page
>> of
>>>       the address space.  Subtract one to compare to the last address of
>> the
>>>       unmap to avoid the overflow and wrap-around.
>>>
>>>       Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow warning")
>>>       Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
>>>       Cc: stable@vger.kernel.org # v4.15+
>>>       Reported-by: Pei Zhang <pezhang@redhat.com>
>>>       Debugged-by: Peter Xu <peterx@redhat.com>
>>>       Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>       Reviewed-by: Peter Xu <peterx@redhat.com>
>>>       Tested-by: Peter Xu <peterx@redhat.com>
>>>       Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>>       Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>>
>>>>
>>>> I asked Alex and Peter in another thread.
>>>
>>> Just curious on the answer, may I ask which thread?
>>
>> According to Alex, the QEMU workaround can be removed :
>>
>> https://lore.kernel.org/qemu-devel/20250919102447.748e17fe.alex.williams
>> on@redhat.com/
>>
>>> btw: I just found unmapping in halves seems unnecessary as both backends
>> of kernel side support unmap_all now.
>>>
>>>       if (unmap_all) {
>>>           /* The unmap ioctl doesn't accept a full 64-bit span. */
>>>           Int128 llsize = int128_rshift(int128_2_64(), 1);
>>>
>>>           ret = vfio_legacy_dma_unmap_one(bcontainer, 0,
>> int128_get64(llsize),
>>>                                           iotlb);
>>>
>>>           if (ret == 0) {
>>>               ret = vfio_legacy_dma_unmap_one(bcontainer,
>> int128_get64(llsize),
>>>                                               int128_get64(llsize),
>> iotlb);
>>>           }
>>>
>>>       } else {
>>>           ret = vfio_legacy_dma_unmap_one(bcontainer, iova, size,
>> iotlb);
>>>       }
>>
>> Good. So we can simply both backends it seems.

*ify

> 
> Will you handle them or not? I mean the workaround removing and unmapping_all optimization.

I can revert 567d7d3e6be5 ("vfio/common: Work around kernel overflow
bug in DMA unmap") but, AFAICT, the "unmap DMAs in halves" method (see
1b296c3def4b) should be kept.

Thanks,

C.


RE: [PATCH 1/5] vfio/iommufd: Add framework code to support getting dirty bitmap before unmap
Posted by Duan, Zhenzhong 5 days, 7 hours ago


>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>getting dirty bitmap before unmap
>
>On 9/23/25 04:45, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Cédric Le Goater <clg@redhat.com>
>>> Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>>> getting dirty bitmap before unmap
>>>
>>> On 9/22/25 05:17, Duan, Zhenzhong wrote:
>>>> Hi Cedric,
>>>>
>>>>> -----Original Message-----
>>>>> From: Cédric Le Goater <clg@redhat.com>
>>>>> Subject: Re: [PATCH 1/5] vfio/iommufd: Add framework code to support
>>>>> getting dirty bitmap before unmap
>>>>>
>>>>> Hello Zhenzhong
>>>>>
>>>>> On 9/10/25 04:36, Zhenzhong Duan wrote:
>>>>>> Currently we support device and iommu dirty tracking, device dirty
>>>>>> tracking is preferred.
>>>>>>
>>>>>> Add the framework code in iommufd_cdev_unmap_one() to choose
>>> either
>>>>>> device or iommu dirty tracking, just like
>vfio_legacy_dma_unmap_one().
>>>>>
>>>>> I wonder if commit 567d7d3e6be5 ("vfio/common: Work around kernel
>>>>> overflow bug in DMA unmap") could be removed now to make the code
>>>>> common to both VFIO IOMMU Type1 and IOMMUFD backends.
>>>>
>>>> I am not clear if there is other reason to keep the workaround, but the
>>> original
>>>> kernel issue had been fixed with below commit:
>>>>
>>>> commit 58fec830fc19208354895d9832785505046d6c01
>>>> Author: Alex Williamson <alex.williamson@redhat.com>
>>>> Date:   Mon Jan 7 22:13:22 2019 -0700
>>>>
>>>>       vfio/type1: Fix unmap overflow off-by-one
>>>>
>>>>       The below referenced commit adds a test for integer overflow,
>but in
>>>>       doing so prevents the unmap ioctl from ever including the last
>page
>>> of
>>>>       the address space.  Subtract one to compare to the last address
>of
>>> the
>>>>       unmap to avoid the overflow and wrap-around.
>>>>
>>>>       Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow
>warning")
>>>>       Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
>>>>       Cc: stable@vger.kernel.org # v4.15+
>>>>       Reported-by: Pei Zhang <pezhang@redhat.com>
>>>>       Debugged-by: Peter Xu <peterx@redhat.com>
>>>>       Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
>>>>       Reviewed-by: Peter Xu <peterx@redhat.com>
>>>>       Tested-by: Peter Xu <peterx@redhat.com>
>>>>       Reviewed-by: Cornelia Huck <cohuck@redhat.com>
>>>>       Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
>>>>
>>>>>
>>>>> I asked Alex and Peter in another thread.
>>>>
>>>> Just curious on the answer, may I ask which thread?
>>>
>>> According to Alex, the QEMU workaround can be removed :
>>>
>>>
>https://lore.kernel.org/qemu-devel/20250919102447.748e17fe.alex.williams
>>> on@redhat.com/
>>>
>>>> btw: I just found unmapping in halves seems unnecessary as both
>backends
>>> of kernel side support unmap_all now.
>>>>
>>>>       if (unmap_all) {
>>>>           /* The unmap ioctl doesn't accept a full 64-bit span. */
>>>>           Int128 llsize = int128_rshift(int128_2_64(), 1);
>>>>
>>>>           ret = vfio_legacy_dma_unmap_one(bcontainer, 0,
>>> int128_get64(llsize),
>>>>                                           iotlb);
>>>>
>>>>           if (ret == 0) {
>>>>               ret = vfio_legacy_dma_unmap_one(bcontainer,
>>> int128_get64(llsize),
>>>>
>int128_get64(llsize),
>>> iotlb);
>>>>           }
>>>>
>>>>       } else {
>>>>           ret = vfio_legacy_dma_unmap_one(bcontainer, iova, size,
>>> iotlb);
>>>>       }
>>>
>>> Good. So we can simply both backends it seems.
>
>*ify
>
>>
>> Will you handle them or not? I mean the workaround removing and
>unmapping_all optimization.
>
>I can revert 567d7d3e6be5 ("vfio/common: Work around kernel overflow
>bug in DMA unmap") but, AFAICT, the "unmap DMAs in halves" method (see
>1b296c3def4b) should be kept.

OK, let me take the unmap_all part. For example, for iommufd, it can be simplified to:
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -67,19 +67,8 @@ static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,

     /* unmap in halves */
     if (unmap_all) {
-        Int128 llsize = int128_rshift(int128_2_64(), 1);
-        int ret;
-
-        ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
-                                        0, int128_get64(llsize));
-
-        if (ret == 0) {
-            ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
-                                            int128_get64(llsize),
-                                            int128_get64(llsize));
-        }
-
-        return ret;
+        assert(!iova && !size);
+        size = UINT64_MAX;
     }

     /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */

Thanks
Zhenzhong