[PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support

Zhenzhong Duan posted 5 patches 2 weeks, 4 days ago
Maintainers: Yi Liu <yi.l.liu@intel.com>, Eric Auger <eric.auger@redhat.com>, Zhenzhong Duan <zhenzhong.duan@intel.com>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, "Clément Mathieu--Drif" <clement.mathieu--drif@eviden.com>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Eduardo Habkost <eduardo@habkost.net>, Alex Williamson <alex.williamson@redhat.com>, "Cédric Le Goater" <clg@redhat.com>
[PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Zhenzhong Duan 2 weeks, 4 days ago
Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last dirty
bitmap query right before unmap, no PTEs flushes. This accelerates the
query without issue because unmap will tear down the mapping anyway.

Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
be used for the flags of iommufd dirty tracking. Currently it is
set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based on
the scenario.

Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Tested-by: Xudong Hao <xudong.hao@intel.com>
Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
---
 hw/vfio/vfio-iommufd.h   | 1 +
 include/system/iommufd.h | 2 +-
 backends/iommufd.c       | 5 +++--
 hw/vfio/iommufd.c        | 6 +++++-
 backends/trace-events    | 2 +-
 5 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
index 07ea0f4304..e0af241c75 100644
--- a/hw/vfio/vfio-iommufd.h
+++ b/hw/vfio/vfio-iommufd.h
@@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
     VFIOContainerBase bcontainer;
     IOMMUFDBackend *be;
     uint32_t ioas_id;
+    uint64_t dirty_tracking_flags;
     QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
 } VFIOIOMMUFDContainer;
 
diff --git a/include/system/iommufd.h b/include/system/iommufd.h
index c9c72ffc45..63898e7b0d 100644
--- a/include/system/iommufd.h
+++ b/include/system/iommufd.h
@@ -64,7 +64,7 @@ bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
 bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
                                       uint64_t iova, ram_addr_t size,
                                       uint64_t page_size, uint64_t *data,
-                                      Error **errp);
+                                      uint64_t flags, Error **errp);
 bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
                                       uint32_t data_type, uint32_t entry_len,
                                       uint32_t *entry_num, void *data,
diff --git a/backends/iommufd.c b/backends/iommufd.c
index 2a33c7ab0b..3c4f6157e2 100644
--- a/backends/iommufd.c
+++ b/backends/iommufd.c
@@ -361,7 +361,7 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
                                       uint32_t hwpt_id,
                                       uint64_t iova, ram_addr_t size,
                                       uint64_t page_size, uint64_t *data,
-                                      Error **errp)
+                                      uint64_t flags, Error **errp)
 {
     int ret;
     struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
@@ -371,11 +371,12 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
         .length = size,
         .page_size = page_size,
         .data = (uintptr_t)data,
+        .flags = flags,
     };
 
     ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP, &get_dirty_bitmap);
     trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova, size,
-                                           page_size, ret ? errno : 0);
+                                           flags, page_size, ret ? errno : 0);
     if (ret) {
         error_setg_errno(errp, errno,
                          "IOMMU_HWPT_GET_DIRTY_BITMAP (iova: 0x%"HWADDR_PRIx
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index 0057488ce9..c897aa6b17 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
                                   hwaddr iova, ram_addr_t size,
                                   IOMMUTLBEntry *iotlb)
 {
-    const VFIOIOMMUFDContainer *container =
+    VFIOIOMMUFDContainer *container =
         container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
     bool need_dirty_sync = false;
     Error *local_err = NULL;
@@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
     if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
         if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
             bcontainer->dirty_pages_supported) {
+            container->dirty_tracking_flags =
+                                      IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
             ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
                                                     iotlb->translated_addr,
                                                     &local_err);
+            container->dirty_tracking_flags = 0;
             if (ret) {
                 error_report_err(local_err);
             }
@@ -248,6 +251,7 @@ static int iommufd_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
         if (!iommufd_backend_get_dirty_bitmap(container->be, hwpt->hwpt_id,
                                               iova, size, page_size,
                                               (uint64_t *)vbmap->bitmap,
+                                              container->dirty_tracking_flags,
                                               errp)) {
             return -EINVAL;
         }
diff --git a/backends/trace-events b/backends/trace-events
index 56132d3fd2..e1992ba12f 100644
--- a/backends/trace-events
+++ b/backends/trace-events
@@ -19,5 +19,5 @@ iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
 iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
 iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
 iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
-iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
+iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t flags, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" flags=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
 iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
-- 
2.47.1
Re: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Joao Martins 1 week, 2 days ago
On 10/09/2025 03:36, Zhenzhong Duan wrote:
> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last dirty
> bitmap query right before unmap, no PTEs flushes. This accelerates the
> query without issue because unmap will tear down the mapping anyway.
> 
> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
> be used for the flags of iommufd dirty tracking. Currently it is
> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based on
> the scenario.
> 
> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> Tested-by: Xudong Hao <xudong.hao@intel.com>
> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
> ---
>  hw/vfio/vfio-iommufd.h   | 1 +
>  include/system/iommufd.h | 2 +-
>  backends/iommufd.c       | 5 +++--
>  hw/vfio/iommufd.c        | 6 +++++-
>  backends/trace-events    | 2 +-
>  5 files changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
> index 07ea0f4304..e0af241c75 100644
> --- a/hw/vfio/vfio-iommufd.h
> +++ b/hw/vfio/vfio-iommufd.h
> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>      VFIOContainerBase bcontainer;
>      IOMMUFDBackend *be;
>      uint32_t ioas_id;
> +    uint64_t dirty_tracking_flags;
>      QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>  } VFIOIOMMUFDContainer;
>  
> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
> index c9c72ffc45..63898e7b0d 100644
> --- a/include/system/iommufd.h
> +++ b/include/system/iommufd.h
> @@ -64,7 +64,7 @@ bool iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t hwpt_id,
>  bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be, uint32_t hwpt_id,
>                                        uint64_t iova, ram_addr_t size,
>                                        uint64_t page_size, uint64_t *data,
> -                                      Error **errp);
> +                                      uint64_t flags, Error **errp);
>  bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be, uint32_t id,
>                                        uint32_t data_type, uint32_t entry_len,
>                                        uint32_t *entry_num, void *data,
> diff --git a/backends/iommufd.c b/backends/iommufd.c
> index 2a33c7ab0b..3c4f6157e2 100644
> --- a/backends/iommufd.c
> +++ b/backends/iommufd.c
> @@ -361,7 +361,7 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>                                        uint32_t hwpt_id,
>                                        uint64_t iova, ram_addr_t size,
>                                        uint64_t page_size, uint64_t *data,
> -                                      Error **errp)
> +                                      uint64_t flags, Error **errp)
>  {
>      int ret;
>      struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
> @@ -371,11 +371,12 @@ bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>          .length = size,
>          .page_size = page_size,
>          .data = (uintptr_t)data,
> +        .flags = flags,
>      };
>  
>      ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP, &get_dirty_bitmap);
>      trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova, size,
> -                                           page_size, ret ? errno : 0);
> +                                           flags, page_size, ret ? errno : 0);
>      if (ret) {
>          error_setg_errno(errp, errno,
>                           "IOMMU_HWPT_GET_DIRTY_BITMAP (iova: 0x%"HWADDR_PRIx
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 0057488ce9..c897aa6b17 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
>                                    hwaddr iova, ram_addr_t size,
>                                    IOMMUTLBEntry *iotlb)
>  {
> -    const VFIOIOMMUFDContainer *container =
> +    VFIOIOMMUFDContainer *container =
>          container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);
>      bool need_dirty_sync = false;
>      Error *local_err = NULL;
> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
>      if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>          if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>              bcontainer->dirty_pages_supported) {
> +            container->dirty_tracking_flags =
> +                                      IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>              ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
>                                                      iotlb->translated_addr,
>                                                      &local_err);
> +            container->dirty_tracking_flags = 0;

Why not changing vfio_container_query_dirty_bitmap to pass a flags too, like the
original patches? This is a little unnecssary odd style to pass a flag via
container structure rather and then clearing.

Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH for generic
container abstraction was to not mix IOMMUFD UAPI specifics into base container
API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend could just
ignore the flag, while IOMMUFD translates it to IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR

>              if (ret) {
>                  error_report_err(local_err);
>              }
> @@ -248,6 +251,7 @@ static int iommufd_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
>          if (!iommufd_backend_get_dirty_bitmap(container->be, hwpt->hwpt_id,
>                                                iova, size, page_size,
>                                                (uint64_t *)vbmap->bitmap,
> +                                              container->dirty_tracking_flags,
>                                                errp)) {
>              return -EINVAL;
>          }
> diff --git a/backends/trace-events b/backends/trace-events
> index 56132d3fd2..e1992ba12f 100644
> --- a/backends/trace-events
> +++ b/backends/trace-events
> @@ -19,5 +19,5 @@ iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d"
>  iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t flags, uint32_t hwpt_type, uint32_t len, uint64_t data_ptr, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u flags=0x%x hwpt_type=%u len=%u data_ptr=0x%"PRIx64" out_hwpt=%u (%d)"
>  iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)"
>  iommufd_backend_set_dirty(int iommufd, uint32_t hwpt_id, bool start, int ret) " iommufd=%d hwpt=%u enable=%d (%d)"
> -iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
> +iommufd_backend_get_dirty_bitmap(int iommufd, uint32_t hwpt_id, uint64_t iova, uint64_t size, uint64_t flags, uint64_t page_size, int ret) " iommufd=%d hwpt=%u iova=0x%"PRIx64" size=0x%"PRIx64" flags=0x%"PRIx64" page_size=0x%"PRIx64" (%d)"
>  iommufd_backend_invalidate_cache(int iommufd, uint32_t id, uint32_t data_type, uint32_t entry_len, uint32_t entry_num, uint32_t done_num, uint64_t data_ptr, int ret) " iommufd=%d id=%u data_type=%u entry_len=%u entry_num=%u done_num=%u data_ptr=0x%"PRIx64" (%d)"
RE: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Duan, Zhenzhong 6 days, 11 hours ago
Hi Joao,

>-----Original Message-----
>From: Joao Martins <joao.m.martins@oracle.com>
>Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>
>On 10/09/2025 03:36, Zhenzhong Duan wrote:
>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last
>dirty
>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>> query without issue because unmap will tear down the mapping anyway.
>>
>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>> be used for the flags of iommufd dirty tracking. Currently it is
>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based on
>> the scenario.
>>
>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>> ---
>>  hw/vfio/vfio-iommufd.h   | 1 +
>>  include/system/iommufd.h | 2 +-
>>  backends/iommufd.c       | 5 +++--
>>  hw/vfio/iommufd.c        | 6 +++++-
>>  backends/trace-events    | 2 +-
>>  5 files changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>> index 07ea0f4304..e0af241c75 100644
>> --- a/hw/vfio/vfio-iommufd.h
>> +++ b/hw/vfio/vfio-iommufd.h
>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>      VFIOContainerBase bcontainer;
>>      IOMMUFDBackend *be;
>>      uint32_t ioas_id;
>> +    uint64_t dirty_tracking_flags;
>>      QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>  } VFIOIOMMUFDContainer;
>>
>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>> index c9c72ffc45..63898e7b0d 100644
>> --- a/include/system/iommufd.h
>> +++ b/include/system/iommufd.h
>> @@ -64,7 +64,7 @@ bool
>iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>hwpt_id,
>>  bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>uint32_t hwpt_id,
>>                                        uint64_t iova, ram_addr_t
>size,
>>                                        uint64_t page_size,
>uint64_t *data,
>> -                                      Error **errp);
>> +                                      uint64_t flags, Error
>**errp);
>>  bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>uint32_t id,
>>                                        uint32_t data_type,
>uint32_t entry_len,
>>                                        uint32_t *entry_num, void
>*data,
>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>> index 2a33c7ab0b..3c4f6157e2 100644
>> --- a/backends/iommufd.c
>> +++ b/backends/iommufd.c
>> @@ -361,7 +361,7 @@ bool
>iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>                                        uint32_t hwpt_id,
>>                                        uint64_t iova, ram_addr_t
>size,
>>                                        uint64_t page_size,
>uint64_t *data,
>> -                                      Error **errp)
>> +                                      uint64_t flags, Error **errp)
>>  {
>>      int ret;
>>      struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>> @@ -371,11 +371,12 @@ bool
>iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>          .length = size,
>>          .page_size = page_size,
>>          .data = (uintptr_t)data,
>> +        .flags = flags,
>>      };
>>
>>      ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>&get_dirty_bitmap);
>>      trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova,
>size,
>> -                                           page_size, ret ? errno :
>0);
>> +                                           flags, page_size, ret ?
>errno : 0);
>>      if (ret) {
>>          error_setg_errno(errp, errno,
>>                           "IOMMU_HWPT_GET_DIRTY_BITMAP
>(iova: 0x%"HWADDR_PRIx
>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>> index 0057488ce9..c897aa6b17 100644
>> --- a/hw/vfio/iommufd.c
>> +++ b/hw/vfio/iommufd.c
>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>VFIOContainerBase *bcontainer,
>>                                    hwaddr iova, ram_addr_t size,
>>                                    IOMMUTLBEntry *iotlb)
>>  {
>> -    const VFIOIOMMUFDContainer *container =
>> +    VFIOIOMMUFDContainer *container =
>>          container_of(bcontainer, VFIOIOMMUFDContainer,
>bcontainer);
>>      bool need_dirty_sync = false;
>>      Error *local_err = NULL;
>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>VFIOContainerBase *bcontainer,
>>      if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>          if
>(!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>              bcontainer->dirty_pages_supported) {
>> +            container->dirty_tracking_flags =
>> +
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>              ret = vfio_container_query_dirty_bitmap(bcontainer, iova,
>size,
>>
>iotlb->translated_addr,
>>
>&local_err);
>> +            container->dirty_tracking_flags = 0;
>
>Why not changing vfio_container_query_dirty_bitmap to pass a flags too, like
>the
>original patches? This is a little unnecssary odd style to pass a flag via
>container structure rather and then clearing.

Just want to be simpler, original patch introduced a new parameter to almost all
variants of *_query_dirty_bitmap() while the flags parameter is only used by
IOMMUFD backend when doing unmap_bitmap. Currently we already have three
backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need the flag.

I take container->dirty_tracking_flags as a notification mechanism, so set it before
vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe clearing it in
iommufd_query_dirty_bitmap() is easier to be acceptable?

>
>Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH for
>generic
>container abstraction was to not mix IOMMUFD UAPI specifics into base
>container
>API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>could just
>ignore the flag, while IOMMUFD translates it to
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR

I did port original patch https://github.com/yiliu1765/qemu/commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
But it looks complex to have 'flags' parameter everywhere.

Thanks
Zhenzhong
Re: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Cédric Le Goater 6 days, 1 hour ago
On 9/22/25 07:49, Duan, Zhenzhong wrote:
> Hi Joao,
> 
>> -----Original Message-----
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>
>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last
>> dirty
>>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>>> query without issue because unmap will tear down the mapping anyway.
>>>
>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>>> be used for the flags of iommufd dirty tracking. Currently it is
>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based on
>>> the scenario.
>>>
>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>> ---
>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>   include/system/iommufd.h | 2 +-
>>>   backends/iommufd.c       | 5 +++--
>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>   backends/trace-events    | 2 +-
>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>> index 07ea0f4304..e0af241c75 100644
>>> --- a/hw/vfio/vfio-iommufd.h
>>> +++ b/hw/vfio/vfio-iommufd.h
>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>       VFIOContainerBase bcontainer;
>>>       IOMMUFDBackend *be;
>>>       uint32_t ioas_id;
>>> +    uint64_t dirty_tracking_flags;
>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>   } VFIOIOMMUFDContainer;
>>>
>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>> index c9c72ffc45..63898e7b0d 100644
>>> --- a/include/system/iommufd.h
>>> +++ b/include/system/iommufd.h
>>> @@ -64,7 +64,7 @@ bool
>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>> hwpt_id,
>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>> uint32_t hwpt_id,
>>>                                         uint64_t iova, ram_addr_t
>> size,
>>>                                         uint64_t page_size,
>> uint64_t *data,
>>> -                                      Error **errp);
>>> +                                      uint64_t flags, Error
>> **errp);
>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>> uint32_t id,
>>>                                         uint32_t data_type,
>> uint32_t entry_len,
>>>                                         uint32_t *entry_num, void
>> *data,
>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>> index 2a33c7ab0b..3c4f6157e2 100644
>>> --- a/backends/iommufd.c
>>> +++ b/backends/iommufd.c
>>> @@ -361,7 +361,7 @@ bool
>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>                                         uint32_t hwpt_id,
>>>                                         uint64_t iova, ram_addr_t
>> size,
>>>                                         uint64_t page_size,
>> uint64_t *data,
>>> -                                      Error **errp)
>>> +                                      uint64_t flags, Error **errp)
>>>   {
>>>       int ret;
>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>> @@ -371,11 +371,12 @@ bool
>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>           .length = size,
>>>           .page_size = page_size,
>>>           .data = (uintptr_t)data,
>>> +        .flags = flags,
>>>       };
>>>
>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>> &get_dirty_bitmap);
>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova,
>> size,
>>> -                                           page_size, ret ? errno :
>> 0);
>>> +                                           flags, page_size, ret ?
>> errno : 0);
>>>       if (ret) {
>>>           error_setg_errno(errp, errno,
>>>                            "IOMMU_HWPT_GET_DIRTY_BITMAP
>> (iova: 0x%"HWADDR_PRIx
>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>> index 0057488ce9..c897aa6b17 100644
>>> --- a/hw/vfio/iommufd.c
>>> +++ b/hw/vfio/iommufd.c
>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>> VFIOContainerBase *bcontainer,
>>>                                     hwaddr iova, ram_addr_t size,
>>>                                     IOMMUTLBEntry *iotlb)
>>>   {
>>> -    const VFIOIOMMUFDContainer *container =
>>> +    VFIOIOMMUFDContainer *container =
>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>> bcontainer);
>>>       bool need_dirty_sync = false;
>>>       Error *local_err = NULL;
>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>> VFIOContainerBase *bcontainer,
>>>       if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>           if
>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>               bcontainer->dirty_pages_supported) {
>>> +            container->dirty_tracking_flags =
>>> +
>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>               ret = vfio_container_query_dirty_bitmap(bcontainer, iova,
>> size,
>>>
>> iotlb->translated_addr,
>>>
>> &local_err);
>>> +            container->dirty_tracking_flags = 0;
>>
>> Why not changing vfio_container_query_dirty_bitmap to pass a flags too, like
>> the
>> original patches? This is a little unnecssary odd style to pass a flag via
>> container structure rather and then clearing.
> 
> Just want to be simpler, original patch introduced a new parameter to almost all
> variants of *_query_dirty_bitmap() while the flags parameter is only used by
> IOMMUFD backend when doing unmap_bitmap. Currently we already have three
> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need the flag.
> 
> I take container->dirty_tracking_flags as a notification mechanism, so set it before
> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe clearing it in
> iommufd_query_dirty_bitmap() is easier to be acceptable?
> 
>>
>> Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH for
>> generic
>> container abstraction was to not mix IOMMUFD UAPI specifics into base
>> container
>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>> could just
>> ignore the flag, while IOMMUFD translates it to
>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
> 
> I did port original patch https://github.com/yiliu1765/qemu/commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
> But it looks complex to have 'flags' parameter everywhere.
I think I would prefer like Joao to avoid caching information if possible
but I haven't check closely the mess it would introduce in the code. Let
me check.

Thanks,

C.
RE: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Duan, Zhenzhong 5 days, 14 hours ago

>-----Original Message-----
>From: Cédric Le Goater <clg@redhat.com>
>Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>
>On 9/22/25 07:49, Duan, Zhenzhong wrote:
>> Hi Joao,
>>
>>> -----Original Message-----
>>> From: Joao Martins <joao.m.martins@oracle.com>
>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>
>>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the
>last
>>> dirty
>>>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>>>> query without issue because unmap will tear down the mapping anyway.
>>>>
>>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>>>> be used for the flags of iommufd dirty tracking. Currently it is
>>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based
>on
>>>> the scenario.
>>>>
>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>> ---
>>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>>   include/system/iommufd.h | 2 +-
>>>>   backends/iommufd.c       | 5 +++--
>>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>>   backends/trace-events    | 2 +-
>>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>>> index 07ea0f4304..e0af241c75 100644
>>>> --- a/hw/vfio/vfio-iommufd.h
>>>> +++ b/hw/vfio/vfio-iommufd.h
>>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>>       VFIOContainerBase bcontainer;
>>>>       IOMMUFDBackend *be;
>>>>       uint32_t ioas_id;
>>>> +    uint64_t dirty_tracking_flags;
>>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>>   } VFIOIOMMUFDContainer;
>>>>
>>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>>> index c9c72ffc45..63898e7b0d 100644
>>>> --- a/include/system/iommufd.h
>>>> +++ b/include/system/iommufd.h
>>>> @@ -64,7 +64,7 @@ bool
>>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>>> hwpt_id,
>>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>> uint32_t hwpt_id,
>>>>                                         uint64_t iova,
>ram_addr_t
>>> size,
>>>>                                         uint64_t page_size,
>>> uint64_t *data,
>>>> -                                      Error **errp);
>>>> +                                      uint64_t flags, Error
>>> **errp);
>>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>>> uint32_t id,
>>>>                                         uint32_t data_type,
>>> uint32_t entry_len,
>>>>                                         uint32_t *entry_num,
>void
>>> *data,
>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>> index 2a33c7ab0b..3c4f6157e2 100644
>>>> --- a/backends/iommufd.c
>>>> +++ b/backends/iommufd.c
>>>> @@ -361,7 +361,7 @@ bool
>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>                                         uint32_t hwpt_id,
>>>>                                         uint64_t iova,
>ram_addr_t
>>> size,
>>>>                                         uint64_t page_size,
>>> uint64_t *data,
>>>> -                                      Error **errp)
>>>> +                                      uint64_t flags, Error
>**errp)
>>>>   {
>>>>       int ret;
>>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>>> @@ -371,11 +371,12 @@ bool
>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>           .length = size,
>>>>           .page_size = page_size,
>>>>           .data = (uintptr_t)data,
>>>> +        .flags = flags,
>>>>       };
>>>>
>>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>>> &get_dirty_bitmap);
>>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id,
>iova,
>>> size,
>>>> -                                           page_size, ret ?
>errno :
>>> 0);
>>>> +                                           flags, page_size,
>ret ?
>>> errno : 0);
>>>>       if (ret) {
>>>>           error_setg_errno(errp, errno,
>>>>                            "IOMMU_HWPT_GET_DIRTY_BITMAP
>>> (iova: 0x%"HWADDR_PRIx
>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>> index 0057488ce9..c897aa6b17 100644
>>>> --- a/hw/vfio/iommufd.c
>>>> +++ b/hw/vfio/iommufd.c
>>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>>> VFIOContainerBase *bcontainer,
>>>>                                     hwaddr iova, ram_addr_t
>size,
>>>>                                     IOMMUTLBEntry *iotlb)
>>>>   {
>>>> -    const VFIOIOMMUFDContainer *container =
>>>> +    VFIOIOMMUFDContainer *container =
>>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>>> bcontainer);
>>>>       bool need_dirty_sync = false;
>>>>       Error *local_err = NULL;
>>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>>> VFIOContainerBase *bcontainer,
>>>>       if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer))
>{
>>>>           if
>>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>               bcontainer->dirty_pages_supported) {
>>>> +            container->dirty_tracking_flags =
>>>> +
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>>               ret = vfio_container_query_dirty_bitmap(bcontainer,
>iova,
>>> size,
>>>>
>>> iotlb->translated_addr,
>>>>
>>> &local_err);
>>>> +            container->dirty_tracking_flags = 0;
>>>
>>> Why not changing vfio_container_query_dirty_bitmap to pass a flags too,
>like
>>> the
>>> original patches? This is a little unnecssary odd style to pass a flag via
>>> container structure rather and then clearing.
>>
>> Just want to be simpler, original patch introduced a new parameter to
>almost all
>> variants of *_query_dirty_bitmap() while the flags parameter is only used by
>> IOMMUFD backend when doing unmap_bitmap. Currently we already have
>three
>> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need the
>flag.
>>
>> I take container->dirty_tracking_flags as a notification mechanism, so set it
>before
>> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe clearing
>it in
>> iommufd_query_dirty_bitmap() is easier to be acceptable?
>>
>>>
>>> Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH
>for
>>> generic
>>> container abstraction was to not mix IOMMUFD UAPI specifics into base
>>> container
>>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>>> could just
>>> ignore the flag, while IOMMUFD translates it to
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
>>
>> I did port original patch
>https://github.com/yiliu1765/qemu/commit/99f83595d79d2e4170c9e456cf1
>a7b9521bd4f80
>> But it looks complex to have 'flags' parameter everywhere.
>I think I would prefer like Joao to avoid caching information if possible
>but I haven't check closely the mess it would introduce in the code. Let
>me check.

OK, waiting for your finial decision, I'm not that eager on a simpler change,
I can cherry-pick Joao's change if you prefer it.

Thanks
Zhenzhong
Re: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Joao Martins 6 days ago
On 22/09/2025 17:02, Cédric Le Goater wrote:
> On 9/22/25 07:49, Duan, Zhenzhong wrote:
>> Hi Joao,
>>
>>> -----Original Message-----
>>> From: Joao Martins <joao.m.martins@oracle.com>
>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>
>>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last
>>> dirty
>>>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>>>> query without issue because unmap will tear down the mapping anyway.
>>>>
>>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>>>> be used for the flags of iommufd dirty tracking. Currently it is
>>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based on
>>>> the scenario.
>>>>
>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>> ---
>>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>>   include/system/iommufd.h | 2 +-
>>>>   backends/iommufd.c       | 5 +++--
>>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>>   backends/trace-events    | 2 +-
>>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>>> index 07ea0f4304..e0af241c75 100644
>>>> --- a/hw/vfio/vfio-iommufd.h
>>>> +++ b/hw/vfio/vfio-iommufd.h
>>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>>       VFIOContainerBase bcontainer;
>>>>       IOMMUFDBackend *be;
>>>>       uint32_t ioas_id;
>>>> +    uint64_t dirty_tracking_flags;
>>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>>   } VFIOIOMMUFDContainer;
>>>>
>>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>>> index c9c72ffc45..63898e7b0d 100644
>>>> --- a/include/system/iommufd.h
>>>> +++ b/include/system/iommufd.h
>>>> @@ -64,7 +64,7 @@ bool
>>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>>> hwpt_id,
>>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>> uint32_t hwpt_id,
>>>>                                         uint64_t iova, ram_addr_t
>>> size,
>>>>                                         uint64_t page_size,
>>> uint64_t *data,
>>>> -                                      Error **errp);
>>>> +                                      uint64_t flags, Error
>>> **errp);
>>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>>> uint32_t id,
>>>>                                         uint32_t data_type,
>>> uint32_t entry_len,
>>>>                                         uint32_t *entry_num, void
>>> *data,
>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>> index 2a33c7ab0b..3c4f6157e2 100644
>>>> --- a/backends/iommufd.c
>>>> +++ b/backends/iommufd.c
>>>> @@ -361,7 +361,7 @@ bool
>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>                                         uint32_t hwpt_id,
>>>>                                         uint64_t iova, ram_addr_t
>>> size,
>>>>                                         uint64_t page_size,
>>> uint64_t *data,
>>>> -                                      Error **errp)
>>>> +                                      uint64_t flags, Error **errp)
>>>>   {
>>>>       int ret;
>>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>>> @@ -371,11 +371,12 @@ bool
>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>           .length = size,
>>>>           .page_size = page_size,
>>>>           .data = (uintptr_t)data,
>>>> +        .flags = flags,
>>>>       };
>>>>
>>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>>> &get_dirty_bitmap);
>>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova,
>>> size,
>>>> -                                           page_size, ret ? errno :
>>> 0);
>>>> +                                           flags, page_size, ret ?
>>> errno : 0);
>>>>       if (ret) {
>>>>           error_setg_errno(errp, errno,
>>>>                            "IOMMU_HWPT_GET_DIRTY_BITMAP
>>> (iova: 0x%"HWADDR_PRIx
>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>> index 0057488ce9..c897aa6b17 100644
>>>> --- a/hw/vfio/iommufd.c
>>>> +++ b/hw/vfio/iommufd.c
>>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>>> VFIOContainerBase *bcontainer,
>>>>                                     hwaddr iova, ram_addr_t size,
>>>>                                     IOMMUTLBEntry *iotlb)
>>>>   {
>>>> -    const VFIOIOMMUFDContainer *container =
>>>> +    VFIOIOMMUFDContainer *container =
>>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>>> bcontainer);
>>>>       bool need_dirty_sync = false;
>>>>       Error *local_err = NULL;
>>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>>> VFIOContainerBase *bcontainer,
>>>>       if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>>           if
>>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>               bcontainer->dirty_pages_supported) {
>>>> +            container->dirty_tracking_flags =
>>>> +
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>>               ret = vfio_container_query_dirty_bitmap(bcontainer, iova,
>>> size,
>>>>
>>> iotlb->translated_addr,
>>>>
>>> &local_err);
>>>> +            container->dirty_tracking_flags = 0;
>>>
>>> Why not changing vfio_container_query_dirty_bitmap to pass a flags too, like
>>> the
>>> original patches? This is a little unnecssary odd style to pass a flag via
>>> container structure rather and then clearing.
>>
>> Just want to be simpler, original patch introduced a new parameter to almost all
>> variants of *_query_dirty_bitmap() while the flags parameter is only used by
>> IOMMUFD backend when doing unmap_bitmap. Currently we already have three
>> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need the flag.
>>
>> I take container->dirty_tracking_flags as a notification mechanism, so set it
>> before
>> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe clearing it in
>> iommufd_query_dirty_bitmap() is easier to be acceptable?
>>
>>>
>>> Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH for
>>> generic
>>> container abstraction was to not mix IOMMUFD UAPI specifics into base
>>> container
>>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>>> could just
>>> ignore the flag, while IOMMUFD translates it to
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
>>
>> I did port original patch https://github.com/yiliu1765/qemu/
>> commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
>> But it looks complex to have 'flags' parameter everywhere.
> I think I would prefer like Joao to avoid caching information if possible
> but I haven't check closely the mess it would introduce in the code. Let
> me check.
> 

My recollection was that it wasn't that much churn added as this series is
already doing the most of the churn. But I am checking how the code would look
like to properly respond to his suggestion on why he changing it towards
structuref field.

Re: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Joao Martins 6 days ago
On 22/09/2025 17:06, Joao Martins wrote:
> On 22/09/2025 17:02, Cédric Le Goater wrote:
>> On 9/22/25 07:49, Duan, Zhenzhong wrote:
>>> Hi Joao,
>>>
>>>> -----Original Message-----
>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>>
>>>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the last
>>>> dirty
>>>>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>>>>> query without issue because unmap will tear down the mapping anyway.
>>>>>
>>>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>>>>> be used for the flags of iommufd dirty tracking. Currently it is
>>>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0 based on
>>>>> the scenario.
>>>>>
>>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>>> ---
>>>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>>>   include/system/iommufd.h | 2 +-
>>>>>   backends/iommufd.c       | 5 +++--
>>>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>>>   backends/trace-events    | 2 +-
>>>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>>>
>>>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>>>> index 07ea0f4304..e0af241c75 100644
>>>>> --- a/hw/vfio/vfio-iommufd.h
>>>>> +++ b/hw/vfio/vfio-iommufd.h
>>>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>>>       VFIOContainerBase bcontainer;
>>>>>       IOMMUFDBackend *be;
>>>>>       uint32_t ioas_id;
>>>>> +    uint64_t dirty_tracking_flags;
>>>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>>>   } VFIOIOMMUFDContainer;
>>>>>
>>>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>>>> index c9c72ffc45..63898e7b0d 100644
>>>>> --- a/include/system/iommufd.h
>>>>> +++ b/include/system/iommufd.h
>>>>> @@ -64,7 +64,7 @@ bool
>>>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>>>> hwpt_id,
>>>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>> uint32_t hwpt_id,
>>>>>                                         uint64_t iova, ram_addr_t
>>>> size,
>>>>>                                         uint64_t page_size,
>>>> uint64_t *data,
>>>>> -                                      Error **errp);
>>>>> +                                      uint64_t flags, Error
>>>> **errp);
>>>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>>>> uint32_t id,
>>>>>                                         uint32_t data_type,
>>>> uint32_t entry_len,
>>>>>                                         uint32_t *entry_num, void
>>>> *data,
>>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>>> index 2a33c7ab0b..3c4f6157e2 100644
>>>>> --- a/backends/iommufd.c
>>>>> +++ b/backends/iommufd.c
>>>>> @@ -361,7 +361,7 @@ bool
>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>                                         uint32_t hwpt_id,
>>>>>                                         uint64_t iova, ram_addr_t
>>>> size,
>>>>>                                         uint64_t page_size,
>>>> uint64_t *data,
>>>>> -                                      Error **errp)
>>>>> +                                      uint64_t flags, Error **errp)
>>>>>   {
>>>>>       int ret;
>>>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>>>> @@ -371,11 +371,12 @@ bool
>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>           .length = size,
>>>>>           .page_size = page_size,
>>>>>           .data = (uintptr_t)data,
>>>>> +        .flags = flags,
>>>>>       };
>>>>>
>>>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>>>> &get_dirty_bitmap);
>>>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id, iova,
>>>> size,
>>>>> -                                           page_size, ret ? errno :
>>>> 0);
>>>>> +                                           flags, page_size, ret ?
>>>> errno : 0);
>>>>>       if (ret) {
>>>>>           error_setg_errno(errp, errno,
>>>>>                            "IOMMU_HWPT_GET_DIRTY_BITMAP
>>>> (iova: 0x%"HWADDR_PRIx
>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>> index 0057488ce9..c897aa6b17 100644
>>>>> --- a/hw/vfio/iommufd.c
>>>>> +++ b/hw/vfio/iommufd.c
>>>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>>>> VFIOContainerBase *bcontainer,
>>>>>                                     hwaddr iova, ram_addr_t size,
>>>>>                                     IOMMUTLBEntry *iotlb)
>>>>>   {
>>>>> -    const VFIOIOMMUFDContainer *container =
>>>>> +    VFIOIOMMUFDContainer *container =
>>>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>>>> bcontainer);
>>>>>       bool need_dirty_sync = false;
>>>>>       Error *local_err = NULL;
>>>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>>>> VFIOContainerBase *bcontainer,
>>>>>       if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>>>           if
>>>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>>               bcontainer->dirty_pages_supported) {
>>>>> +            container->dirty_tracking_flags =
>>>>> +
>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>>>               ret = vfio_container_query_dirty_bitmap(bcontainer, iova,
>>>> size,
>>>>>
>>>> iotlb->translated_addr,
>>>>>
>>>> &local_err);
>>>>> +            container->dirty_tracking_flags = 0;
>>>>
>>>> Why not changing vfio_container_query_dirty_bitmap to pass a flags too, like
>>>> the
>>>> original patches? This is a little unnecssary odd style to pass a flag via
>>>> container structure rather and then clearing.
>>>
>>> Just want to be simpler, original patch introduced a new parameter to almost all
>>> variants of *_query_dirty_bitmap() while the flags parameter is only used by
>>> IOMMUFD backend when doing unmap_bitmap. Currently we already have three
>>> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need the flag.
>>>
>>> I take container->dirty_tracking_flags as a notification mechanism, so set it
>>> before
>>> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe clearing it in
>>> iommufd_query_dirty_bitmap() is easier to be acceptable?
>>>
>>>>
>>>> Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH for
>>>> generic
>>>> container abstraction was to not mix IOMMUFD UAPI specifics into base
>>>> container
>>>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>>>> could just
>>>> ignore the flag, while IOMMUFD translates it to
>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
>>>
>>> I did port original patch https://github.com/yiliu1765/qemu/
>>> commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
>>> But it looks complex to have 'flags' parameter everywhere.
>> I think I would prefer like Joao to avoid caching information if possible
>> but I haven't check closely the mess it would introduce in the code. Let
>> me check.
>>
> 
> My recollection was that it wasn't that much churn added as this series is
> already doing the most of the churn. But I am checking how the code would look
> like to properly respond to his suggestion on why he changing it towards
> structuref field.

The churn should be similar to this patch:

https://github.com/jpemartins/qemu/commit/5e1f11075299a5fa9564a26788dc9cc1717e297c

It's mostly an interface parameter addition of flags.
But there's now a few more callsites and I suppose that's the extra churn. But I don't think
vfio-user would need to change. See snip below.

I've pushed this here https://github.com/jpemartins/qemu/commits/relax-viommu-lm but showing the
series wide changes for discussion. The thing I didn't do was have an intermediate 'backend
agnostic' flag thingie like the original one
(https://github.com/jpemartins/qemu/commit/1ef8abc896ae69be8896a68705fe4a87204709e0) with
VFIO_GET_DIRTY_NO_FLUSH

diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
index 56304978e1e8..2fe08e010405 100644
--- a/hw/vfio/container-base.c
+++ b/hw/vfio/container-base.c
@@ -211,17 +211,16 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
 }

 static int vfio_container_iommu_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
-                   VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp)
+                   VFIOBitmap *vbmap, hwaddr iova, hwaddr size, uint64_t flags, Error **errp)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);

     g_assert(vioc->query_dirty_bitmap);
-    return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size,
-                                               errp);
+    return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size, flags, errp);
 }

 static int vfio_container_devices_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
-                 VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp)
+                 VFIOBitmap *vbmap, hwaddr iova, hwaddr size, uint64_t flags, Error **errp)
 {
     VFIODevice *vbasedev;
     int ret;
@@ -243,7 +242,7 @@ static int vfio_container_devices_query_dirty_bitmap(const VFIOContainerBase *bc
 }

 int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer, uint64_t iova,
-                          uint64_t size, ram_addr_t ram_addr, Error **errp)
+                          uint64_t size, uint64_t flags, ram_addr_t ram_addr, Error **errp)
 {
     bool all_device_dirty_tracking =
         vfio_container_devices_dirty_tracking_is_supported(bcontainer);
@@ -267,10 +266,10 @@ int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer, uint6

     if (all_device_dirty_tracking) {
         ret = vfio_container_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
-                                                        errp);
+                                                        flags, errp);
     } else {
         ret = vfio_container_iommu_query_dirty_bitmap(bcontainer, &vbmap, iova, size,
-                                                     errp);
+                                                      flags, errp);
     }

     if (ret) {
@@ -280,7 +279,7 @@ int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer, uint6
     dirty_pages = cpu_physical_memory_set_dirty_lebitmap(vbmap.bitmap, ram_addr,
                                                          vbmap.pages);

-    trace_vfio_container_query_dirty_bitmap(iova, size, vbmap.size, ram_addr,
+    trace_vfio_container_query_dirty_bitmap(iova, size, flags, vbmap.size, ram_addr,
                                             dirty_pages);
 out:
     g_free(vbmap.bitmap);
diff --git a/hw/vfio/container.c b/hw/vfio/container.c
index 030c6d3f89cf..86c4d06298f7 100644
--- a/hw/vfio/container.c
+++ b/hw/vfio/container.c
@@ -169,7 +169,7 @@ static int vfio_legacy_dma_unmap_one(const VFIOContainerBase *bcontainer,
     }

     if (need_dirty_sync) {
-        ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+        ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
                                     iotlb->translated_addr, &local_err);
         if (ret) {
             error_report_err(local_err);
@@ -267,7 +267,8 @@ vfio_legacy_set_dirty_page_tracking(const VFIOContainerBase *bcontainer,
 }

 static int vfio_legacy_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
-                      VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp)
+                      VFIOBitmap *vbmap, hwaddr iova, hwaddr size, uint64_t flags,
+                      Error **errp)
 {
     const VFIOContainer *container = VFIO_IOMMU_LEGACY(bcontainer);
     struct vfio_iommu_type1_dirty_bitmap *dbitmap;
diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
index b82597de9116..c1c8755c6d7c 100644
--- a/hw/vfio/iommufd.c
+++ b/hw/vfio/iommufd.c
@@ -73,12 +73,10 @@ static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
     if (iotlb && vfio_container_dirty_tracking_is_started(bcontainer)) {
         if (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
             bcontainer->dirty_pages_supported) {
-            container->dirty_tracking_flags =
-                                      IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
             ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+                                                    IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR,
                                                     iotlb->translated_addr,
                                                     &local_err);
-            container->dirty_tracking_flags = 0;
             if (ret) {
                 error_report_err(local_err);
             }
@@ -95,7 +93,7 @@ static int iommufd_cdev_unmap_one(const VFIOContainerBase *bcontainer,
     }

     if (need_dirty_sync) {
-        ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size,
+        ret = vfio_container_query_dirty_bitmap(bcontainer, iova, size, 0,
                                                 iotlb->translated_addr,
                                                 &local_err);
         if (ret) {
@@ -235,7 +233,7 @@ err:

 static int iommufd_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
                                       VFIOBitmap *vbmap, hwaddr iova,
-                                      hwaddr size, Error **errp)
+                                      hwaddr size, uint64_t flags, Error **errp)
 {
     VFIOIOMMUFDContainer *container = container_of(bcontainer,
                                                    VFIOIOMMUFDContainer,
@@ -251,8 +249,7 @@ static int iommufd_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
         if (!iommufd_backend_get_dirty_bitmap(container->be, hwpt->hwpt_id,
                                               iova, size, page_size,
                                               (uint64_t *)vbmap->bitmap,
-                                              container->dirty_tracking_flags,
-                                              errp)) {
+                                              flags, errp)) {
             return -EINVAL;
         }
     }
diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
index e09383316598..27cbb45282ae 100644
--- a/hw/vfio/listener.c
+++ b/hw/vfio/listener.c
@@ -1082,7 +1082,7 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     translated_addr = memory_region_get_ram_addr(mr) + xlat;

     ret = vfio_container_query_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1,
-                                translated_addr, &local_err);
+                                0, translated_addr, &local_err);
     if (ret) {
         error_prepend(&local_err,
                       "vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", "
@@ -1118,7 +1118,7 @@ static int vfio_ram_discard_query_dirty_bitmap(MemoryRegionSection *section,
      * Sync the whole mapped region (spanning multiple individual mappings)
      * in one go.
      */
-    ret = vfio_container_query_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr,
+    ret = vfio_container_query_dirty_bitmap(vrdl->bcontainer, iova, size, 0, ram_addr,
                                 &local_err);
     if (ret) {
         error_report_err(local_err);
@@ -1203,7 +1203,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer,

     return vfio_container_query_dirty_bitmap(bcontainer,
                    REAL_HOST_PAGE_ALIGN(section->offset_within_address_space),
-                                 int128_get64(section->size), ram_addr, errp);
+                                 int128_get64(section->size), 0, ram_addr, errp);
 }

 static void vfio_listener_log_sync(MemoryListener *listener,
diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
index e3d571f8c845..ee18dafdb2ae 100644
--- a/hw/vfio/trace-events
+++ b/hw/vfio/trace-events
@@ -105,7 +105,7 @@ vfio_device_dirty_tracking_start(int nr_ranges, uint64_t min32, uint64_t max32,
 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" -
0x%"PRIx64

 # container-base.c
-vfio_container_query_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t
start, uint64_t dirty_pages) "
iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64
+vfio_container_query_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t flags, uint64_t
bitmap_size, uint64_t start, uint64_
t dirty_pages) "iova=0x%"PRIx64" size= 0x%"PRIx64" flags= 0x%"PRIx64" bitmap_size=0x%"PRIx64"
start=0x%"PRIx64" dirty_pages=%"
PRIu64

 # container.c
 vfio_container_disconnect(int fd) "close container->fd=%d"
diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
index e0af241c75cf..07ea0f430496 100644
--- a/hw/vfio/vfio-iommufd.h
+++ b/hw/vfio/vfio-iommufd.h
@@ -26,7 +26,6 @@ typedef struct VFIOIOMMUFDContainer {
     VFIOContainerBase bcontainer;
     IOMMUFDBackend *be;
     uint32_t ioas_id;
-    uint64_t dirty_tracking_flags;
     QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
 } VFIOIOMMUFDContainer;
diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h
index acbd48a18a3a..e88b690cf423 100644
--- a/include/hw/vfio/vfio-container-base.h
+++ b/include/hw/vfio/vfio-container-base.h
@@ -98,7 +98,7 @@ bool vfio_container_dirty_tracking_is_started(
 bool vfio_container_devices_dirty_tracking_is_supported(
     const VFIOContainerBase *bcontainer);
 int vfio_container_query_dirty_bitmap(const VFIOContainerBase *bcontainer,
-    uint64_t iova, uint64_t size, ram_addr_t ram_addr, Error **errp);
+    uint64_t iova, uint64_t size, uint64_t flags, ram_addr_t ram_addr, Error **errp);

 GList *vfio_container_get_iova_ranges(const VFIOContainerBase *bcontainer);

@@ -252,12 +252,14 @@ struct VFIOIOMMUClass {
      * @vbmap: #VFIOBitmap internal bitmap structure
      * @iova: iova base address
      * @size: size of iova range
+     * @flags: flags to the dirty tracking query
      * @errp: pointer to Error*, to store an error if it happens.
      *
      * Returns zero to indicate success and negative for error.
      */
     int (*query_dirty_bitmap)(const VFIOContainerBase *bcontainer,
-                VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error **errp);
+                VFIOBitmap *vbmap, hwaddr iova, hwaddr size, uint64_t flags,
+                Error **errp);
     /* PCI specific */
     int (*pci_hot_reset)(VFIODevice *vbasedev, bool single);

RE: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Duan, Zhenzhong 5 days, 14 hours ago

>-----Original Message-----
>From: Joao Martins <joao.m.martins@oracle.com>
>Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>
>On 22/09/2025 17:06, Joao Martins wrote:
>> On 22/09/2025 17:02, Cédric Le Goater wrote:
>>> On 9/22/25 07:49, Duan, Zhenzhong wrote:
>>>> Hi Joao,
>>>>
>>>>> -----Original Message-----
>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>>>
>>>>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>>>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the
>last
>>>>> dirty
>>>>>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>>>>>> query without issue because unmap will tear down the mapping
>anyway.
>>>>>>
>>>>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>>>>>> be used for the flags of iommufd dirty tracking. Currently it is
>>>>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0
>based on
>>>>>> the scenario.
>>>>>>
>>>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>>>> ---
>>>>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>>>>   include/system/iommufd.h | 2 +-
>>>>>>   backends/iommufd.c       | 5 +++--
>>>>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>>>>   backends/trace-events    | 2 +-
>>>>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>>>>> index 07ea0f4304..e0af241c75 100644
>>>>>> --- a/hw/vfio/vfio-iommufd.h
>>>>>> +++ b/hw/vfio/vfio-iommufd.h
>>>>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>>>>       VFIOContainerBase bcontainer;
>>>>>>       IOMMUFDBackend *be;
>>>>>>       uint32_t ioas_id;
>>>>>> +    uint64_t dirty_tracking_flags;
>>>>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>>>>   } VFIOIOMMUFDContainer;
>>>>>>
>>>>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>>>>> index c9c72ffc45..63898e7b0d 100644
>>>>>> --- a/include/system/iommufd.h
>>>>>> +++ b/include/system/iommufd.h
>>>>>> @@ -64,7 +64,7 @@ bool
>>>>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>>>>> hwpt_id,
>>>>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>> uint32_t hwpt_id,
>>>>>>                                         uint64_t
>iova, ram_addr_t
>>>>> size,
>>>>>>                                         uint64_t
>page_size,
>>>>> uint64_t *data,
>>>>>> -                                      Error
>**errp);
>>>>>> +                                      uint64_t
>flags, Error
>>>>> **errp);
>>>>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>>>>> uint32_t id,
>>>>>>                                         uint32_t
>data_type,
>>>>> uint32_t entry_len,
>>>>>>                                         uint32_t
>*entry_num, void
>>>>> *data,
>>>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>>>> index 2a33c7ab0b..3c4f6157e2 100644
>>>>>> --- a/backends/iommufd.c
>>>>>> +++ b/backends/iommufd.c
>>>>>> @@ -361,7 +361,7 @@ bool
>>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>>                                         uint32_t
>hwpt_id,
>>>>>>                                         uint64_t
>iova, ram_addr_t
>>>>> size,
>>>>>>                                         uint64_t
>page_size,
>>>>> uint64_t *data,
>>>>>> -                                      Error **errp)
>>>>>> +                                      uint64_t
>flags, Error **errp)
>>>>>>   {
>>>>>>       int ret;
>>>>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>>>>> @@ -371,11 +371,12 @@ bool
>>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>>           .length = size,
>>>>>>           .page_size = page_size,
>>>>>>           .data = (uintptr_t)data,
>>>>>> +        .flags = flags,
>>>>>>       };
>>>>>>
>>>>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>>>>> &get_dirty_bitmap);
>>>>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id,
>iova,
>>>>> size,
>>>>>>
>-                                           page_size,
>ret ? errno :
>>>>> 0);
>>>>>> +                                           flags,
>page_size, ret ?
>>>>> errno : 0);
>>>>>>       if (ret) {
>>>>>>           error_setg_errno(errp, errno,
>>>>>>                            "IOMMU_HWPT_GET_DIRTY_
>BITMAP
>>>>> (iova: 0x%"HWADDR_PRIx
>>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>>> index 0057488ce9..c897aa6b17 100644
>>>>>> --- a/hw/vfio/iommufd.c
>>>>>> +++ b/hw/vfio/iommufd.c
>>>>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>>>>> VFIOContainerBase *bcontainer,
>>>>>>                                     hwaddr iova,
>ram_addr_t size,
>>>>>>                                     IOMMUTLBEntr
>y *iotlb)
>>>>>>   {
>>>>>> -    const VFIOIOMMUFDContainer *container =
>>>>>> +    VFIOIOMMUFDContainer *container =
>>>>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>>>>> bcontainer);
>>>>>>       bool need_dirty_sync = false;
>>>>>>       Error *local_err = NULL;
>>>>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>>>>> VFIOContainerBase *bcontainer,
>>>>>>       if (iotlb &&
>vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>>>>           if
>>>>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>>>               bcontainer->dirty_pages_supported) {
>>>>>> +            container->dirty_tracking_flags =
>>>>>> +
>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>>>>               ret =
>vfio_container_query_dirty_bitmap(bcontainer, iova,
>>>>> size,
>>>>>>
>>>>> iotlb->translated_addr,
>>>>>>
>>>>> &local_err);
>>>>>> +            container->dirty_tracking_flags = 0;
>>>>>
>>>>> Why not changing vfio_container_query_dirty_bitmap to pass a flags
>too, like
>>>>> the
>>>>> original patches? This is a little unnecssary odd style to pass a flag via
>>>>> container structure rather and then clearing.
>>>>
>>>> Just want to be simpler, original patch introduced a new parameter to
>almost all
>>>> variants of *_query_dirty_bitmap() while the flags parameter is only used
>by
>>>> IOMMUFD backend when doing unmap_bitmap. Currently we already
>have three
>>>> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need
>the flag.
>>>>
>>>> I take container->dirty_tracking_flags as a notification mechanism, so set
>it
>>>> before
>>>> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe
>clearing it in
>>>> iommufd_query_dirty_bitmap() is easier to be acceptable?
>>>>
>>>>>
>>>>> Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH
>for
>>>>> generic
>>>>> container abstraction was to not mix IOMMUFD UAPI specifics into base
>>>>> container
>>>>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>>>>> could just
>>>>> ignore the flag, while IOMMUFD translates it to
>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
>>>>
>>>> I did port original patch https://github.com/yiliu1765/qemu/
>>>> commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
>>>> But it looks complex to have 'flags' parameter everywhere.
>>> I think I would prefer like Joao to avoid caching information if possible
>>> but I haven't check closely the mess it would introduce in the code. Let
>>> me check.
>>>
>>
>> My recollection was that it wasn't that much churn added as this series is
>> already doing the most of the churn. But I am checking how the code would
>look
>> like to properly respond to his suggestion on why he changing it towards
>> structuref field.
>
>The churn should be similar to this patch:
>
>https://github.com/jpemartins/qemu/commit/5e1f11075299a5fa9564a26788
>dc9cc1717e297c
>
>It's mostly an interface parameter addition of flags.

Yes, it's a general interface parameter but it's only used by IOMMUFD.
That's my only argument.

>But there's now a few more callsites and I suppose that's the extra churn. But
>I don't think
>vfio-user would need to change. See snip below.

vfio-user need same change as vfio-legacy though they don't use 'flags' parameter,
or else there is build error:

../hw/vfio-user/container.c:340:30: error: assignment to ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, hwaddr,  hwaddr,  uint64_t,  Error **)’ {aka ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, long unsigned int,  long unsigned int,  long unsigned int,  Error **)’} from incompatible pointer type ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, hwaddr,  hwaddr,  Error **)’ {aka ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, long unsigned int,  long unsigned int,  Error **)’} [-Werror=incompatible-pointer-types]
  340 |     vioc->query_dirty_bitmap = vfio_user_query_dirty_bitmap;

>
>I've pushed this here
>https://github.com/jpemartins/qemu/commits/relax-viommu-lm but showing
>the
>series wide changes for discussion. The thing I didn't do was have an
>intermediate 'backend
>agnostic' flag thingie like the original one
>(https://github.com/jpemartins/qemu/commit/1ef8abc896ae69be8896a6870
>5fe4a87204709e0) with
>VFIO_GET_DIRTY_NO_FLUSH
>
>diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
>index 56304978e1e8..2fe08e010405 100644
>--- a/hw/vfio/container-base.c
>+++ b/hw/vfio/container-base.c
>@@ -211,17 +211,16 @@ static int
>vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
> }
>
> static int vfio_container_iommu_query_dirty_bitmap(const
>VFIOContainerBase *bcontainer,
>-                   VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error
>**errp)
>+                   VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
>uint64_t flags, Error **errp)
> {
>     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
>
>     g_assert(vioc->query_dirty_bitmap);
>-    return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size,
>-                                               errp);
>+    return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size, flags,
>errp);
> }
>
> static int vfio_container_devices_query_dirty_bitmap(const
>VFIOContainerBase *bcontainer,
>-                 VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error
>**errp)
>+                 VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
>uint64_t flags, Error **errp)

I think we can drop 'flags' here as you did in your old patch?

> {
>     VFIODevice *vbasedev;
>     int ret;
>@@ -243,7 +242,7 @@ static int
>vfio_container_devices_query_dirty_bitmap(const VFIOContainerBase *bc
> }
>
> int vfio_container_query_dirty_bitmap(const VFIOContainerBase
>*bcontainer, uint64_t iova,
>-                          uint64_t size, ram_addr_t ram_addr, Error
>**errp)
>+                          uint64_t size, uint64_t flags, ram_addr_t
>ram_addr, Error **errp)
> {
>     bool all_device_dirty_tracking =
>         vfio_container_devices_dirty_tracking_is_supported(bcontainer);
>@@ -267,10 +266,10 @@ int vfio_container_query_dirty_bitmap(const
>VFIOContainerBase *bcontainer, uint6
>
>     if (all_device_dirty_tracking) {
>         ret = vfio_container_devices_query_dirty_bitmap(bcontainer,
>&vbmap, iova, size,
>-                                                        errp);
>+                                                        flags,
>errp);

Same here.

Thanks
Zhenzhong

Re: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Joao Martins 5 days, 7 hours ago
On 23/09/2025 03:50, Duan, Zhenzhong wrote:
>> -----Original Message-----
>> From: Joao Martins <joao.m.martins@oracle.com>
>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>
>> On 22/09/2025 17:06, Joao Martins wrote:
>>> On 22/09/2025 17:02, Cédric Le Goater wrote:
>>>> On 9/22/25 07:49, Duan, Zhenzhong wrote:
>>>>> Hi Joao,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>>>>
>>>>>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>>>>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing the
>> last
>>>>>> dirty
>>>>>>> bitmap query right before unmap, no PTEs flushes. This accelerates the
>>>>>>> query without issue because unmap will tear down the mapping
>> anyway.
>>>>>>>
>>>>>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer to
>>>>>>> be used for the flags of iommufd dirty tracking. Currently it is
>>>>>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0
>> based on
>>>>>>> the scenario.
>>>>>>>
>>>>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>>>>> ---
>>>>>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>>>>>   include/system/iommufd.h | 2 +-
>>>>>>>   backends/iommufd.c       | 5 +++--
>>>>>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>>>>>   backends/trace-events    | 2 +-
>>>>>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>>>>>> index 07ea0f4304..e0af241c75 100644
>>>>>>> --- a/hw/vfio/vfio-iommufd.h
>>>>>>> +++ b/hw/vfio/vfio-iommufd.h
>>>>>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>>>>>       VFIOContainerBase bcontainer;
>>>>>>>       IOMMUFDBackend *be;
>>>>>>>       uint32_t ioas_id;
>>>>>>> +    uint64_t dirty_tracking_flags;
>>>>>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>>>>>   } VFIOIOMMUFDContainer;
>>>>>>>
>>>>>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>>>>>> index c9c72ffc45..63898e7b0d 100644
>>>>>>> --- a/include/system/iommufd.h
>>>>>>> +++ b/include/system/iommufd.h
>>>>>>> @@ -64,7 +64,7 @@ bool
>>>>>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be, uint32_t
>>>>>> hwpt_id,
>>>>>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>> uint32_t hwpt_id,
>>>>>>>                                         uint64_t
>> iova, ram_addr_t
>>>>>> size,
>>>>>>>                                         uint64_t
>> page_size,
>>>>>> uint64_t *data,
>>>>>>> -                                      Error
>> **errp);
>>>>>>> +                                      uint64_t
>> flags, Error
>>>>>> **errp);
>>>>>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend *be,
>>>>>> uint32_t id,
>>>>>>>                                         uint32_t
>> data_type,
>>>>>> uint32_t entry_len,
>>>>>>>                                         uint32_t
>> *entry_num, void
>>>>>> *data,
>>>>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>>>>> index 2a33c7ab0b..3c4f6157e2 100644
>>>>>>> --- a/backends/iommufd.c
>>>>>>> +++ b/backends/iommufd.c
>>>>>>> @@ -361,7 +361,7 @@ bool
>>>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>>>                                         uint32_t
>> hwpt_id,
>>>>>>>                                         uint64_t
>> iova, ram_addr_t
>>>>>> size,
>>>>>>>                                         uint64_t
>> page_size,
>>>>>> uint64_t *data,
>>>>>>> -                                      Error **errp)
>>>>>>> +                                      uint64_t
>> flags, Error **errp)
>>>>>>>   {
>>>>>>>       int ret;
>>>>>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>>>>>> @@ -371,11 +371,12 @@ bool
>>>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>>>           .length = size,
>>>>>>>           .page_size = page_size,
>>>>>>>           .data = (uintptr_t)data,
>>>>>>> +        .flags = flags,
>>>>>>>       };
>>>>>>>
>>>>>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>>>>>> &get_dirty_bitmap);
>>>>>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd, hwpt_id,
>> iova,
>>>>>> size,
>>>>>>>
>> -                                           page_size,
>> ret ? errno :
>>>>>> 0);
>>>>>>> +                                           flags,
>> page_size, ret ?
>>>>>> errno : 0);
>>>>>>>       if (ret) {
>>>>>>>           error_setg_errno(errp, errno,
>>>>>>>                            "IOMMU_HWPT_GET_DIRTY_
>> BITMAP
>>>>>> (iova: 0x%"HWADDR_PRIx
>>>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>>>> index 0057488ce9..c897aa6b17 100644
>>>>>>> --- a/hw/vfio/iommufd.c
>>>>>>> +++ b/hw/vfio/iommufd.c
>>>>>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>>>>>> VFIOContainerBase *bcontainer,
>>>>>>>                                     hwaddr iova,
>> ram_addr_t size,
>>>>>>>                                     IOMMUTLBEntr
>> y *iotlb)
>>>>>>>   {
>>>>>>> -    const VFIOIOMMUFDContainer *container =
>>>>>>> +    VFIOIOMMUFDContainer *container =
>>>>>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>>>>>> bcontainer);
>>>>>>>       bool need_dirty_sync = false;
>>>>>>>       Error *local_err = NULL;
>>>>>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>>>>>> VFIOContainerBase *bcontainer,
>>>>>>>       if (iotlb &&
>> vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>>>>>           if
>>>>>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>>>>               bcontainer->dirty_pages_supported) {
>>>>>>> +            container->dirty_tracking_flags =
>>>>>>> +
>>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>>>>>               ret =
>> vfio_container_query_dirty_bitmap(bcontainer, iova,
>>>>>> size,
>>>>>>>
>>>>>> iotlb->translated_addr,
>>>>>>>
>>>>>> &local_err);
>>>>>>> +            container->dirty_tracking_flags = 0;
>>>>>>
>>>>>> Why not changing vfio_container_query_dirty_bitmap to pass a flags
>> too, like
>>>>>> the
>>>>>> original patches? This is a little unnecssary odd style to pass a flag via
>>>>>> container structure rather and then clearing.
>>>>>
>>>>> Just want to be simpler, original patch introduced a new parameter to
>> almost all
>>>>> variants of *_query_dirty_bitmap() while the flags parameter is only used
>> by
>>>>> IOMMUFD backend when doing unmap_bitmap. Currently we already
>> have three
>>>>> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need
>> the flag.
>>>>>
>>>>> I take container->dirty_tracking_flags as a notification mechanism, so set
>> it
>>>>> before
>>>>> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe
>> clearing it in
>>>>> iommufd_query_dirty_bitmap() is easier to be acceptable?
>>>>>
>>>>>>
>>>>>> Part of the reason the original series had a VFIO_GET_DIRTY_NO_FLUSH
>> for
>>>>>> generic
>>>>>> container abstraction was to not mix IOMMUFD UAPI specifics into base
>>>>>> container
>>>>>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1 backend
>>>>>> could just
>>>>>> ignore the flag, while IOMMUFD translates it to
>>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
>>>>>
>>>>> I did port original patch https://github.com/yiliu1765/qemu/
>>>>> commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
>>>>> But it looks complex to have 'flags' parameter everywhere.
>>>> I think I would prefer like Joao to avoid caching information if possible
>>>> but I haven't check closely the mess it would introduce in the code. Let
>>>> me check.
>>>>
>>>
>>> My recollection was that it wasn't that much churn added as this series is
>>> already doing the most of the churn. But I am checking how the code would
>> look
>>> like to properly respond to his suggestion on why he changing it towards
>>> structuref field.
>>
>> The churn should be similar to this patch:
>>
>> https://github.com/jpemartins/qemu/commit/5e1f11075299a5fa9564a26788
>> dc9cc1717e297c
>>
>> It's mostly an interface parameter addition of flags.
> 
> Yes, it's a general interface parameter but it's only used by IOMMUFD.
> That's my only argument.
> 

You could use that argument for anything that's not common and unique to each
backend. type1 dirty tracking is effectively "frozen" in UAPI (IIRC) so I won't
expect flags. There's device dirty tracking, IOMMUFD dirty tracking and
vfio-user dirty tracking. While I am not anticipating any new thing here (at
least in the needs I have ahead, can't speak for other companies/contributors);
but you could see the flags as future proofing it.

The ugly part of the alternative is that it sort of limits the interface  to be
single-user only as nobody call into IOMMUFD with the dirty_tracking_flags if
it's called concurrently for some other purpose than unmap. We sort of use the
container data structure as a argument storage for a single call as opposed to
store some intermediary state.

I mean I agree with you that there's churn, but it's essentially a argument
change that's only churn-y because there's 3 users.

>> But there's now a few more callsites and I suppose that's the extra churn. But
>> I don't think
>> vfio-user would need to change. See snip below.
> 
> vfio-user need same change as vfio-legacy though they don't use 'flags' parameter,
> or else there is build error:
> 
> ../hw/vfio-user/container.c:340:30: error: assignment to ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, hwaddr,  hwaddr,  uint64_t,  Error **)’ {aka ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, long unsigned int,  long unsigned int,  long unsigned int,  Error **)’} from incompatible pointer type ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, hwaddr,  hwaddr,  Error **)’ {aka ‘int (*)(const VFIOContainerBase *, VFIOBitmap *, long unsigned int,  long unsigned int,  Error **)’} [-Werror=incompatible-pointer-types]
>   340 |     vioc->query_dirty_bitmap = vfio_user_query_dirty_bitmap;
> 

yes -- I didn't have VFIO_USER compiled in probably why I didn't notice it.

>>
>> I've pushed this here
>> https://github.com/jpemartins/qemu/commits/relax-viommu-lm but showing
>> the
>> series wide changes for discussion. The thing I didn't do was have an
>> intermediate 'backend
>> agnostic' flag thingie like the original one
>> (https://github.com/jpemartins/qemu/commit/1ef8abc896ae69be8896a6870
>> 5fe4a87204709e0) with
>> VFIO_GET_DIRTY_NO_FLUSH
>>

Instead of a flags, an alternative would be to have a 'bool unmap' which then
IOMMUFD can translate to IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR.

That's a bit cleaner as to avoid leaking backend UAPIs into the API but it
limits to 'unmap' case.

>> diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c
>> index 56304978e1e8..2fe08e010405 100644
>> --- a/hw/vfio/container-base.c
>> +++ b/hw/vfio/container-base.c
>> @@ -211,17 +211,16 @@ static int
>> vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
>> }
>>
>> static int vfio_container_iommu_query_dirty_bitmap(const
>> VFIOContainerBase *bcontainer,
>> -                   VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error
>> **errp)
>> +                   VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
>> uint64_t flags, Error **errp)
>> {
>>     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
>>
>>     g_assert(vioc->query_dirty_bitmap);
>> -    return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size,
>> -                                               errp);
>> +    return vioc->query_dirty_bitmap(bcontainer, vbmap, iova, size, flags,
>> errp);
>> }
>>
>> static int vfio_container_devices_query_dirty_bitmap(const
>> VFIOContainerBase *bcontainer,
>> -                 VFIOBitmap *vbmap, hwaddr iova, hwaddr size, Error
>> **errp)
>> +                 VFIOBitmap *vbmap, hwaddr iova, hwaddr size,
>> uint64_t flags, Error **errp)
> 
> I think we can drop 'flags' here as you did in your old patch?
> 

Yeah

>> {
>>     VFIODevice *vbasedev;
>>     int ret;
>> @@ -243,7 +242,7 @@ static int
>> vfio_container_devices_query_dirty_bitmap(const VFIOContainerBase *bc
>> }
>>
>> int vfio_container_query_dirty_bitmap(const VFIOContainerBase
>> *bcontainer, uint64_t iova,
>> -                          uint64_t size, ram_addr_t ram_addr, Error
>> **errp)
>> +                          uint64_t size, uint64_t flags, ram_addr_t
>> ram_addr, Error **errp)
>> {
>>     bool all_device_dirty_tracking =
>>         vfio_container_devices_dirty_tracking_is_supported(bcontainer);
>> @@ -267,10 +266,10 @@ int vfio_container_query_dirty_bitmap(const
>> VFIOContainerBase *bcontainer, uint6
>>
>>     if (all_device_dirty_tracking) {
>>         ret = vfio_container_devices_query_dirty_bitmap(bcontainer,
>> &vbmap, iova, size,
>> -                                                        errp);
>> +                                                        flags,
>> errp);
> 
> Same here.

/me nods

RE: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Duan, Zhenzhong 5 days, 7 hours ago

>-----Original Message-----
>From: Joao Martins <joao.m.martins@oracle.com>
>Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>
>On 23/09/2025 03:50, Duan, Zhenzhong wrote:
>>> -----Original Message-----
>>> From: Joao Martins <joao.m.martins@oracle.com>
>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>
>>> On 22/09/2025 17:06, Joao Martins wrote:
>>>> On 22/09/2025 17:02, Cédric Le Goater wrote:
>>>>> On 9/22/25 07:49, Duan, Zhenzhong wrote:
>>>>>> Hi Joao,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>>>> Subject: Re: [PATCH 3/5] vfio/iommufd: Add
>>>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>>>>>>
>>>>>>> On 10/09/2025 03:36, Zhenzhong Duan wrote:
>>>>>>>> Pass IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR when doing
>the
>>> last
>>>>>>> dirty
>>>>>>>> bitmap query right before unmap, no PTEs flushes. This accelerates
>the
>>>>>>>> query without issue because unmap will tear down the mapping
>>> anyway.
>>>>>>>>
>>>>>>>> Add a new element dirty_tracking_flags in VFIOIOMMUFDContainer
>to
>>>>>>>> be used for the flags of iommufd dirty tracking. Currently it is
>>>>>>>> set to either IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR or 0
>>> based on
>>>>>>>> the scenario.
>>>>>>>>
>>>>>>>> Co-developed-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>>>>> Tested-by: Xudong Hao <xudong.hao@intel.com>
>>>>>>>> Tested-by: Giovannio Cabiddu <giovanni.cabiddu@intel.com>
>>>>>>>> ---
>>>>>>>>   hw/vfio/vfio-iommufd.h   | 1 +
>>>>>>>>   include/system/iommufd.h | 2 +-
>>>>>>>>   backends/iommufd.c       | 5 +++--
>>>>>>>>   hw/vfio/iommufd.c        | 6 +++++-
>>>>>>>>   backends/trace-events    | 2 +-
>>>>>>>>   5 files changed, 11 insertions(+), 5 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/vfio/vfio-iommufd.h b/hw/vfio/vfio-iommufd.h
>>>>>>>> index 07ea0f4304..e0af241c75 100644
>>>>>>>> --- a/hw/vfio/vfio-iommufd.h
>>>>>>>> +++ b/hw/vfio/vfio-iommufd.h
>>>>>>>> @@ -26,6 +26,7 @@ typedef struct VFIOIOMMUFDContainer {
>>>>>>>>       VFIOContainerBase bcontainer;
>>>>>>>>       IOMMUFDBackend *be;
>>>>>>>>       uint32_t ioas_id;
>>>>>>>> +    uint64_t dirty_tracking_flags;
>>>>>>>>       QLIST_HEAD(, VFIOIOASHwpt) hwpt_list;
>>>>>>>>   } VFIOIOMMUFDContainer;
>>>>>>>>
>>>>>>>> diff --git a/include/system/iommufd.h b/include/system/iommufd.h
>>>>>>>> index c9c72ffc45..63898e7b0d 100644
>>>>>>>> --- a/include/system/iommufd.h
>>>>>>>> +++ b/include/system/iommufd.h
>>>>>>>> @@ -64,7 +64,7 @@ bool
>>>>>>> iommufd_backend_set_dirty_tracking(IOMMUFDBackend *be,
>uint32_t
>>>>>>> hwpt_id,
>>>>>>>>   bool iommufd_backend_get_dirty_bitmap(IOMMUFDBackend
>*be,
>>>>>>> uint32_t hwpt_id,
>>>>>>>>                                         uint64_
>t
>>> iova, ram_addr_t
>>>>>>> size,
>>>>>>>>                                         uint64_
>t
>>> page_size,
>>>>>>> uint64_t *data,
>>>>>>>> -                                      Error
>>> **errp);
>>>>>>>> +                                      uint64_t
>>> flags, Error
>>>>>>> **errp);
>>>>>>>>   bool iommufd_backend_invalidate_cache(IOMMUFDBackend
>*be,
>>>>>>> uint32_t id,
>>>>>>>>                                         uint32_
>t
>>> data_type,
>>>>>>> uint32_t entry_len,
>>>>>>>>                                         uint32_
>t
>>> *entry_num, void
>>>>>>> *data,
>>>>>>>> diff --git a/backends/iommufd.c b/backends/iommufd.c
>>>>>>>> index 2a33c7ab0b..3c4f6157e2 100644
>>>>>>>> --- a/backends/iommufd.c
>>>>>>>> +++ b/backends/iommufd.c
>>>>>>>> @@ -361,7 +361,7 @@ bool
>>>>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>>>>                                         uint32_
>t
>>> hwpt_id,
>>>>>>>>                                         uint64_
>t
>>> iova, ram_addr_t
>>>>>>> size,
>>>>>>>>                                         uint64_
>t
>>> page_size,
>>>>>>> uint64_t *data,
>>>>>>>> -                                      Error
>**errp)
>>>>>>>> +                                      uint64_t
>>> flags, Error **errp)
>>>>>>>>   {
>>>>>>>>       int ret;
>>>>>>>>       struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap = {
>>>>>>>> @@ -371,11 +371,12 @@ bool
>>>>>>> iommufd_backend_get_dirty_bitmap(IOMMUFDBackend *be,
>>>>>>>>           .length = size,
>>>>>>>>           .page_size = page_size,
>>>>>>>>           .data = (uintptr_t)data,
>>>>>>>> +        .flags = flags,
>>>>>>>>       };
>>>>>>>>
>>>>>>>>       ret = ioctl(be->fd, IOMMU_HWPT_GET_DIRTY_BITMAP,
>>>>>>> &get_dirty_bitmap);
>>>>>>>>       trace_iommufd_backend_get_dirty_bitmap(be->fd,
>hwpt_id,
>>> iova,
>>>>>>> size,
>>>>>>>>
>>>
>-                                           page_size,
>>> ret ? errno :
>>>>>>> 0);
>>>>>>>>
>+                                           flags,
>>> page_size, ret ?
>>>>>>> errno : 0);
>>>>>>>>       if (ret) {
>>>>>>>>           error_setg_errno(errp, errno,
>>>>>>>>                            "IOMMU_HWPT_GET_DIRT
>Y_
>>> BITMAP
>>>>>>> (iova: 0x%"HWADDR_PRIx
>>>>>>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>>>>>>> index 0057488ce9..c897aa6b17 100644
>>>>>>>> --- a/hw/vfio/iommufd.c
>>>>>>>> +++ b/hw/vfio/iommufd.c
>>>>>>>> @@ -62,7 +62,7 @@ static int iommufd_cdev_unmap_one(const
>>>>>>> VFIOContainerBase *bcontainer,
>>>>>>>>                                     hwaddr iova,
>>> ram_addr_t size,
>>>>>>>>                                     IOMMUTLBE
>ntr
>>> y *iotlb)
>>>>>>>>   {
>>>>>>>> -    const VFIOIOMMUFDContainer *container =
>>>>>>>> +    VFIOIOMMUFDContainer *container =
>>>>>>>>           container_of(bcontainer, VFIOIOMMUFDContainer,
>>>>>>> bcontainer);
>>>>>>>>       bool need_dirty_sync = false;
>>>>>>>>       Error *local_err = NULL;
>>>>>>>> @@ -73,9 +73,12 @@ static int iommufd_cdev_unmap_one(const
>>>>>>> VFIOContainerBase *bcontainer,
>>>>>>>>       if (iotlb &&
>>> vfio_container_dirty_tracking_is_started(bcontainer)) {
>>>>>>>>           if
>>>>>>> (!vfio_container_devices_dirty_tracking_is_supported(bcontainer) &&
>>>>>>>>               bcontainer->dirty_pages_supported) {
>>>>>>>> +            container->dirty_tracking_flags =
>>>>>>>> +
>>>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR;
>>>>>>>>               ret =
>>> vfio_container_query_dirty_bitmap(bcontainer, iova,
>>>>>>> size,
>>>>>>>>
>>>>>>> iotlb->translated_addr,
>>>>>>>>
>>>>>>> &local_err);
>>>>>>>> +            container->dirty_tracking_flags = 0;
>>>>>>>
>>>>>>> Why not changing vfio_container_query_dirty_bitmap to pass a flags
>>> too, like
>>>>>>> the
>>>>>>> original patches? This is a little unnecssary odd style to pass a flag via
>>>>>>> container structure rather and then clearing.
>>>>>>
>>>>>> Just want to be simpler, original patch introduced a new parameter to
>>> almost all
>>>>>> variants of *_query_dirty_bitmap() while the flags parameter is only
>used
>>> by
>>>>>> IOMMUFD backend when doing unmap_bitmap. Currently we already
>>> have three
>>>>>> backends, legacy VFIO, IOMMUFD and VFIO-user, only IOMMUFD need
>>> the flag.
>>>>>>
>>>>>> I take container->dirty_tracking_flags as a notification mechanism, so
>set
>>> it
>>>>>> before
>>>>>> vfio_container_query_dirty_bitmap() and clear it thereafter. Maybe
>>> clearing it in
>>>>>> iommufd_query_dirty_bitmap() is easier to be acceptable?
>>>>>>
>>>>>>>
>>>>>>> Part of the reason the original series had a
>VFIO_GET_DIRTY_NO_FLUSH
>>> for
>>>>>>> generic
>>>>>>> container abstraction was to not mix IOMMUFD UAPI specifics into
>base
>>>>>>> container
>>>>>>> API. Then in getting a VFIO_GET_DIRTY_NO_FLUSH, then type1
>backend
>>>>>>> could just
>>>>>>> ignore the flag, while IOMMUFD translates it to
>>>>>>> IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR
>>>>>>
>>>>>> I did port original patch https://github.com/yiliu1765/qemu/
>>>>>> commit/99f83595d79d2e4170c9e456cf1a7b9521bd4f80
>>>>>> But it looks complex to have 'flags' parameter everywhere.
>>>>> I think I would prefer like Joao to avoid caching information if possible
>>>>> but I haven't check closely the mess it would introduce in the code. Let
>>>>> me check.
>>>>>
>>>>
>>>> My recollection was that it wasn't that much churn added as this series is
>>>> already doing the most of the churn. But I am checking how the code
>would
>>> look
>>>> like to properly respond to his suggestion on why he changing it towards
>>>> structuref field.
>>>
>>> The churn should be similar to this patch:
>>>
>>>
>https://github.com/jpemartins/qemu/commit/5e1f11075299a5fa9564a26788
>>> dc9cc1717e297c
>>>
>>> It's mostly an interface parameter addition of flags.
>>
>> Yes, it's a general interface parameter but it's only used by IOMMUFD.
>> That's my only argument.
>>
>
>You could use that argument for anything that's not common and unique to
>each
>backend. type1 dirty tracking is effectively "frozen" in UAPI (IIRC) so I won't
>expect flags. There's device dirty tracking, IOMMUFD dirty tracking and
>vfio-user dirty tracking. While I am not anticipating any new thing here (at
>least in the needs I have ahead, can't speak for other
>companies/contributors);
>but you could see the flags as future proofing it.

Yes, agree it's future proofing.

>
>The ugly part of the alternative is that it sort of limits the interface  to be
>single-user only as nobody call into IOMMUFD with the dirty_tracking_flags if
>it's called concurrently for some other purpose than unmap. We sort of use
>the
>container data structure as a argument storage for a single call as opposed to
>store some intermediary state.

dirty_tracking_flags only take effect during the unmap path except you take it
for other purpose. But yes, I made the change based on the fact that unmap path
is protected by BQL, if the concurrent access is allowed in future, this change will
break.

Thanks
Zhenzhong

RE: [PATCH 3/5] vfio/iommufd: Add IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
Posted by Duan, Zhenzhong 5 days, 6 hours ago

>-----Original Message-----
>From: Duan, Zhenzhong
>Subject: RE: [PATCH 3/5] vfio/iommufd: Add
>IOMMU_HWPT_GET_DIRTY_BITMAP_NO_CLEAR flag support
>>The ugly part of the alternative is that it sort of limits the interface  to be
>>single-user only as nobody call into IOMMUFD with the dirty_tracking_flags
>if
>>it's called concurrently for some other purpose than unmap. We sort of use
>>the
>>container data structure as a argument storage for a single call as opposed
>to
>>store some intermediary state.
>
>dirty_tracking_flags only take effect during the unmap path except you take it
>for other purpose. But yes, I made the change based on the fact that unmap
>path
>is protected by BQL, if the concurrent access is allowed in future, this change
>will
>break.

I found I misunderstood your comment above. Yes, I planned dirty_tracking_flags
only for unmap usage, not for others. I didn't take it as argument but a notification
mechanism only for unmap_all path optimization.

Thanks
Zhenzhong