[PATCH v3 00/13] vfio/migration: Device dirty page tracking

Joao Martins posted 13 patches 1 year, 1 month ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230304014343.33646-1-joao.m.martins@oracle.com
Maintainers: Alex Williamson <alex.williamson@redhat.com>, "Cédric Le Goater" <clg@redhat.com>
There is a newer version of this series
Posted by Joao Martins 1 year, 1 month ago
Hey,

Presented herewith is a series based on the basic VFIO migration protocol v2
implementation [1].

It is split from its parent series [5] to focus solely on device dirty
page tracking. Device dirty page tracking allows the VFIO device to
record its DMAs and report them back when needed. This is part of VFIO
migration and is used during the pre-copy phase of migration to track the
RAM pages that the device has written to and mark those pages dirty, so
they can later be re-sent to the target.
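
To make the idea concrete, here is a minimal sketch (not code from this
series) of turning a device dirty bitmap into the IOVAs that need to be
marked dirty. The function name, its parameters and the one-bit-per-page,
least-significant-bit-first layout are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: walk a dirty bitmap covering 'npages' pages that
 * start at 'iova' and store the IOVA of every dirty page in 'out'.
 * Assumes bit i of the bitmap describes page i, LSB first in each word.
 * Returns the number of dirty pages found, even if 'out' is too small
 * to hold all of them.
 */
static size_t collect_dirty_iovas(const uint64_t *bitmap, size_t npages,
                                  uint64_t iova, uint64_t page_size,
                                  uint64_t *out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < npages; i++) {
        if (bitmap[i / 64] & (UINT64_C(1) << (i % 64))) {
            if (found < out_cap) {
                out[found] = iova + (uint64_t)i * page_size;
            }
            found++;
        }
    }
    return found;
}
```

In a real migration flow the collected pages would be handed to the RAM
dirty log instead of an output array; the array only keeps the sketch
self-contained.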

Device dirty page tracking uses the DMA logging uAPI to discover device
capabilities, to start and stop tracking, and to get the dirty page bitmap
report. Extra details and the uAPI definition can be found here [3].

Device dirty page tracking operates in VFIOContainer scope. That is, when
dirty tracking is started or stopped, or a dirty page report is queried,
all devices within the VFIOContainer are iterated and the same operation
is applied to each of them.
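
The container-scope behaviour described above could look roughly like the
sketch below. It is not the code from this series: the Device and
Container types, both function names and the fail_start flag are
hypothetical stand-ins, and rolling back on a failed start is an
assumption about what a sensible implementation would do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for VFIODevice and VFIOContainer. */
typedef struct Device {
    bool tracking;        /* is device dirty tracking currently running? */
    bool fail_start;      /* simulate a device that refuses to start */
    struct Device *next;
} Device;

typedef struct Container {
    Device *devices;      /* singly linked list of devices */
} Container;

/* Stand-in for the per-device DMA logging start/stop request. */
static int device_set_dirty_tracking(Device *dev, bool start)
{
    if (start && dev->fail_start) {
        return -1;
    }
    dev->tracking = start;
    return 0;
}

/*
 * Apply start/stop to every device in the container. If a start fails
 * part-way through, stop tracking on all devices so none is left running.
 */
static int container_set_dirty_tracking(Container *c, bool start)
{
    for (Device *dev = c->devices; dev; dev = dev->next) {
        if (device_set_dirty_tracking(dev, start) < 0) {
            if (start) {
                for (Device *d = c->devices; d; d = d->next) {
                    device_set_dirty_tracking(d, false);
                }
            }
            return -1;
        }
    }
    return 0;
}
```

A dirty page report query would follow the same loop shape, merging each
device's bitmap into a single result.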

Device dirty page tracking is used only if all devices within a
VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
used, and if that is not supported either, memory is perpetually marked
dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
support, the last two usually have the same effect of perpetually
marking all pages dirty.
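
The fallback order above can be summarised in a few lines of C. This is
only an illustration: the enum, the function name and its parameters are
made up for the example, and treating an empty container as "no device
tracking" is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the three outcomes described above. */
typedef enum {
    DIRTY_TRACK_DEVICE,    /* every device tracks its own DMA writes */
    DIRTY_TRACK_IOMMU,     /* VFIO IOMMU dirty page tracking */
    DIRTY_TRACK_ALL_DIRTY  /* no tracking: memory is always marked dirty */
} DirtyTrackMode;

/*
 * Pick the tracking mode for one container. Device tracking is chosen
 * only if every device supports it; otherwise fall back to the IOMMU,
 * and failing that to marking everything dirty.
 */
static DirtyTrackMode pick_dirty_track_mode(const bool *dev_supports,
                                            size_t ndevs,
                                            bool iommu_supports)
{
    bool all_devices = ndevs > 0;

    for (size_t i = 0; i < ndevs; i++) {
        if (!dev_supports[i]) {
            all_devices = false;
            break;
        }
    }

    if (all_devices) {
        return DIRTY_TRACK_DEVICE;
    }
    return iommu_supports ? DIRTY_TRACK_IOMMU : DIRTY_TRACK_ALL_DIRTY;
}
```

Note that a single device without support is enough to move the whole
container off device dirty page tracking.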

Normally, when asked to start dirty tracking, all the currently DMA
mapped ranges are tracked by device dirty page tracking. If a vIOMMU is
in use, we block live migration. This is temporary; a separate series is
going to add support for it. Thus this series focuses on getting the
groundwork done first.

The series is organized as follows:

- Patches 1-7: Fix bugs and do some preparatory work required prior to
  adding device dirty page tracking.
- Patches 8-10: Implement device dirty page tracking.
- Patch 11: Blocks live migration with vIOMMU.
- Patches 12-13: Enable device dirty page tracking and document it.

Comments and improvements are, as usual, appreciated.

Thanks,
	Joao

Changes from v2 [5]:
- Split the initial dirty page tracking support from the parent series,
  to break it into smaller parts.
- Replace the IOVATree with a simple two-range setup: one range for the
  32-bit address space and another for the 64-bit one. Discussion settled
  on this because the IOVATree added unnecessary complexity, while two
  ranges are also more efficient and stress the uAPI limits less.
  (patches 7 and 8)
- For now exclude vIOMMU, and so add a live migration blocker if a
  vIOMMU is passed in. This will be followed up with vIOMMU support in
  a separate series. (patch 10)
- Add new patches to reuse most helpers used across memory listeners.
  This is useful for reuse when recording DMA ranges. (patches 5 and 6)
- Adjust the documentation to avoid mentioning the vIOMMU and instead
  state that vIOMMU with device dirty page tracking is blocked. Cedric
  gave an R-b, but I've dropped it in light of the split and the lack of
  vIOMMU support. (patch 13)
- Improve VFIOBitmap: place the 16-byte structure on the stack instead of
  allocating it. Remove the free helper function. (patch 4)
- Fix the compilation issues (patches 8 and 10). Possibly not 100%
  addressed, as I am still working out the environment to reproduce them.
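
As a rough illustration of the two-range setup mentioned in the list
above, the sketch below keeps one [min, max] span per address space and
widens it as DMA mappings are recorded. The min32/max32 names appear in
the fixup diff in this thread; the DirtyRanges type, the min64/max64
names, both functions and the choice to classify a mapping by its end
address are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical container for the two tracked spans (bounds inclusive). */
typedef struct {
    uint64_t min32, max32;   /* span of mappings that end below 4 GiB */
    uint64_t min64, max64;   /* span of all other mappings */
} DirtyRanges;

/* Start with empty spans: min above max means "nothing recorded". */
static void dirty_ranges_init(DirtyRanges *r)
{
    r->min32 = r->min64 = UINT64_MAX;
    r->max32 = r->max64 = 0;
}

/* Record the inclusive range [iova, end], widening the matching span. */
static void dirty_ranges_record(DirtyRanges *r, uint64_t iova, uint64_t end)
{
    uint64_t *min = end <= UINT32_MAX ? &r->min32 : &r->min64;
    uint64_t *max = end <= UINT32_MAX ? &r->max32 : &r->max64;

    if (iova < *min) {
        *min = iova;
    }
    if (end > *max) {
        *max = end;
    }
}
```

Each span then translates into a single range whose length is
(max - min) + 1, so the number of ranges stays at two however many
mappings exist, at the cost of also tracking the holes between them.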

Changes from v1 [4]:
- Rebased on latest master branch. As part of it, made some changes in
  pre-copy to adjust it to Juan's new patches:
  1. Added a new patch that passes threshold_size parameter to
     .state_pending_{estimate,exact}() handlers.
  2. Added a new patch that refactors vfio_save_block().
  3. Changed the pre-copy patch to cache and report pending pre-copy
     size in the .state_pending_estimate() handler.
- Removed unnecessary P2P code. This should be added later on when P2P
  support is added. (Alex)
- Moved the dirty sync to be after the DMA unmap in vfio_dma_unmap()
  (patch #11). (Alex)
- Stored vfio_devices_all_device_dirty_tracking()'s value in a local
  variable in vfio_get_dirty_bitmap() so it can be re-used (patch #11).
- Refactored the viommu device dirty tracking ranges creation code to
  make it clearer (patch #15).
- Changed overflow check in vfio_iommu_range_is_device_tracked() to
  emphasize that we specifically check for 2^64 wrap around (patch #15).
- Added R-bs / Acks.

[1] https://lore.kernel.org/qemu-devel/167658846945.932837.1420176491103357684.stgit@omen/

[2] https://lore.kernel.org/kvm/20221206083438.37807-3-yishaih@nvidia.com/

[3] https://lore.kernel.org/netdev/20220908183448.195262-4-yishaih@nvidia.com/

[4] https://lore.kernel.org/qemu-devel/20230126184948.10478-1-avihaih@nvidia.com/

[5] https://lore.kernel.org/qemu-devel/20230222174915.5647-1-avihaih@nvidia.com/


Avihai Horon (6):
  vfio/common: Fix error reporting in vfio_get_dirty_bitmap()
  vfio/common: Fix wrong %m usages
  vfio/common: Abort migration if dirty log start/stop/sync fails
  vfio/common: Add VFIOBitmap and alloc function
  vfio/common: Extract code from vfio_get_dirty_bitmap() to new function
  docs/devel: Document VFIO device dirty page tracking

Joao Martins (7):
  vfio/common: Add helper to validate iova/end against hostwin
  vfio/common: Consolidate skip/invalid section into helper
  vfio/common: Record DMA mapped IOVA ranges
  vfio/common: Add device dirty page tracking start/stop
  vfio/common: Add device dirty page bitmap sync
  vfio/migration: Block migration with vIOMMU
  vfio/migration: Query device dirty page tracking support

 docs/devel/vfio-migration.rst |  46 ++-
 hw/vfio/common.c              | 668 ++++++++++++++++++++++++++++------
 hw/vfio/migration.c           |  21 ++
 hw/vfio/trace-events          |   2 +
 include/hw/vfio/vfio-common.h |  15 +
 5 files changed, 617 insertions(+), 135 deletions(-)

-- 
2.17.2
Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Cédric Le Goater 1 year, 1 month ago
On 3/4/23 02:43, Joao Martins wrote:
> Hey,
> 
> Presented herewith a series based on the basic VFIO migration protocol v2
> implementation [1].
> 
> It is split from its parent series[5] to solely focus on device dirty
> page tracking. Device dirty page tracking allows the VFIO device to
> record its DMAs and report them back when needed. This is part of VFIO
> migration and is used during pre-copy phase of migration to track the
> RAM pages that the device has written to and mark those pages dirty, so
> they can later be re-sent to target.
> 
> Device dirty page tracking uses the DMA logging uAPI to discover device
> capabilities, to start and stop tracking, and to get dirty page bitmap
> report. Extra details and uAPI definition can be found here [3].
> 
> Device dirty page tracking operates in VFIOContainer scope. I.e., When
> dirty tracking is started, stopped or dirty page report is queried, all
> devices within a VFIOContainer are iterated and for each of them device
> dirty page tracking is started, stopped or dirty page report is queried,
> respectively.
> 
> Device dirty page tracking is used only if all devices within a
> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
> used, and if that is not supported as well, memory is perpetually marked
> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
> support, the last two usually have the same effect of perpetually
> marking all pages dirty.
> 
> Normally, when asked to start dirty tracking, all the currently DMA
> mapped ranges are tracked by device dirty page tracking. If using a
> vIOMMU we block live migration. It's temporary and a separate series is
> going to add support for it. Thus this series focus on getting the
> ground work first.
> 
> The series is organized as follows:
> 
> - Patches 1-7: Fix bugs and do some preparatory work required prior to
>    adding device dirty page tracking.
> - Patches 8-10: Implement device dirty page tracking.
> - Patch 11: Blocks live migration with vIOMMU.
> - Patches 12-13 enable device dirty page tracking and document it.
> 
> Comments, improvements as usual appreciated.

It would be helpful to have some feedback from Avihai on the new patches
introduced in v3 or v4 before merging.

Also (out of curiosity), did you test migration with a TCG guest?

Thanks,

C.
Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Avihai Horon 1 year, 1 month ago
On 06/03/2023 19:23, Cédric Le Goater wrote:
> On 3/4/23 02:43, Joao Martins wrote:
>> Hey,
>>
>> Presented herewith a series based on the basic VFIO migration 
>> protocol v2
>> implementation [1].
>>
>> It is split from its parent series[5] to solely focus on device dirty
>> page tracking. Device dirty page tracking allows the VFIO device to
>> record its DMAs and report them back when needed. This is part of VFIO
>> migration and is used during pre-copy phase of migration to track the
>> RAM pages that the device has written to and mark those pages dirty, so
>> they can later be re-sent to target.
>>
>> Device dirty page tracking uses the DMA logging uAPI to discover device
>> capabilities, to start and stop tracking, and to get dirty page bitmap
>> report. Extra details and uAPI definition can be found here [3].
>>
>> Device dirty page tracking operates in VFIOContainer scope. I.e., When
>> dirty tracking is started, stopped or dirty page report is queried, all
>> devices within a VFIOContainer are iterated and for each of them device
>> dirty page tracking is started, stopped or dirty page report is queried,
>> respectively.
>>
>> Device dirty page tracking is used only if all devices within a
>> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
>> used, and if that is not supported as well, memory is perpetually marked
>> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
>> support, the last two usually have the same effect of perpetually
>> marking all pages dirty.
>>
>> Normally, when asked to start dirty tracking, all the currently DMA
>> mapped ranges are tracked by device dirty page tracking. If using a
>> vIOMMU we block live migration. It's temporary and a separate series is
>> going to add support for it. Thus this series focus on getting the
>> ground work first.
>>
>> The series is organized as follows:
>>
>> - Patches 1-7: Fix bugs and do some preparatory work required prior to
>>    adding device dirty page tracking.
>> - Patches 8-10: Implement device dirty page tracking.
>> - Patch 11: Blocks live migration with vIOMMU.
>> - Patches 12-13 enable device dirty page tracking and document it.
>>
>> Comments, improvements as usual appreciated.
>
> It would be helpful to have some feed back from Avihai on the new patches
> introduced in v3 or v4 before merging.

Sure, will send it shortly.

Thanks.


Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Joao Martins 1 year, 1 month ago
On 06/03/2023 17:23, Cédric Le Goater wrote:
> On 3/4/23 02:43, Joao Martins wrote:
>> Hey,
>>
>> Presented herewith a series based on the basic VFIO migration protocol v2
>> implementation [1].
>>
>> It is split from its parent series[5] to solely focus on device dirty
>> page tracking. Device dirty page tracking allows the VFIO device to
>> record its DMAs and report them back when needed. This is part of VFIO
>> migration and is used during pre-copy phase of migration to track the
>> RAM pages that the device has written to and mark those pages dirty, so
>> they can later be re-sent to target.
>>
>> Device dirty page tracking uses the DMA logging uAPI to discover device
>> capabilities, to start and stop tracking, and to get dirty page bitmap
>> report. Extra details and uAPI definition can be found here [3].
>>
>> Device dirty page tracking operates in VFIOContainer scope. I.e., When
>> dirty tracking is started, stopped or dirty page report is queried, all
>> devices within a VFIOContainer are iterated and for each of them device
>> dirty page tracking is started, stopped or dirty page report is queried,
>> respectively.
>>
>> Device dirty page tracking is used only if all devices within a
>> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
>> used, and if that is not supported as well, memory is perpetually marked
>> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
>> support, the last two usually have the same effect of perpetually
>> marking all pages dirty.
>>
>> Normally, when asked to start dirty tracking, all the currently DMA
>> mapped ranges are tracked by device dirty page tracking. If using a
>> vIOMMU we block live migration. It's temporary and a separate series is
>> going to add support for it. Thus this series focus on getting the
>> ground work first.
>>
>> The series is organized as follows:
>>
>> - Patches 1-7: Fix bugs and do some preparatory work required prior to
>>    adding device dirty page tracking.
>> - Patches 8-10: Implement device dirty page tracking.
>> - Patch 11: Blocks live migration with vIOMMU.
>> - Patches 12-13 enable device dirty page tracking and document it.
>>
>> Comments, improvements as usual appreciated.
> 
> It would be helpful to have some feed back from Avihai on the new patches
> introduced in v3 or v4 before merging.
> 
I'll let him comment, but Avihai is definitely looking at and testing it too.
For example, one comment he mentioned to me, which I have preemptively slated
for v4, is to remove the two unnecessary iova-tree.h includes in patch 7 (given
that I removed the need for the IOVATree altogether).

> Also, (being curious) did you test migration with a TCG guest ?
> 
KVM guests only.

	Joao

Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Alex Williamson 1 year, 1 month ago
On Sat,  4 Mar 2023 01:43:30 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> Hey,
> 
> Presented herewith a series based on the basic VFIO migration protocol v2
> implementation [1].
> 
> It is split from its parent series[5] to solely focus on device dirty
> page tracking. Device dirty page tracking allows the VFIO device to
> record its DMAs and report them back when needed. This is part of VFIO
> migration and is used during pre-copy phase of migration to track the
> RAM pages that the device has written to and mark those pages dirty, so
> they can later be re-sent to target.
> 
> Device dirty page tracking uses the DMA logging uAPI to discover device
> capabilities, to start and stop tracking, and to get dirty page bitmap
> report. Extra details and uAPI definition can be found here [3].
> 
> Device dirty page tracking operates in VFIOContainer scope. I.e., When
> dirty tracking is started, stopped or dirty page report is queried, all
> devices within a VFIOContainer are iterated and for each of them device
> dirty page tracking is started, stopped or dirty page report is queried,
> respectively.
> 
> Device dirty page tracking is used only if all devices within a
> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
> used, and if that is not supported as well, memory is perpetually marked
> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
> support, the last two usually have the same effect of perpetually
> marking all pages dirty.
> 
> Normally, when asked to start dirty tracking, all the currently DMA
> mapped ranges are tracked by device dirty page tracking. If using a
> vIOMMU we block live migration. It's temporary and a separate series is
> going to add support for it. Thus this series focus on getting the
> ground work first.
> 
> The series is organized as follows:
> 
> - Patches 1-7: Fix bugs and do some preparatory work required prior to
>   adding device dirty page tracking.
> - Patches 8-10: Implement device dirty page tracking.
> - Patch 11: Blocks live migration with vIOMMU.
> - Patches 12-13 enable device dirty page tracking and document it.
> 
> Comments, improvements as usual appreciated.

Still some CI failures:

https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474

The docker failures are normal, afaict the rest are not.  Thanks,

Alex
Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Joao Martins 1 year, 1 month ago
On 05/03/2023 20:57, Alex Williamson wrote:
> On Sat,  4 Mar 2023 01:43:30 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> Hey,
>>
>> Presented herewith a series based on the basic VFIO migration protocol v2
>> implementation [1].
>>
>> It is split from its parent series[5] to solely focus on device dirty
>> page tracking. Device dirty page tracking allows the VFIO device to
>> record its DMAs and report them back when needed. This is part of VFIO
>> migration and is used during pre-copy phase of migration to track the
>> RAM pages that the device has written to and mark those pages dirty, so
>> they can later be re-sent to target.
>>
>> Device dirty page tracking uses the DMA logging uAPI to discover device
>> capabilities, to start and stop tracking, and to get dirty page bitmap
>> report. Extra details and uAPI definition can be found here [3].
>>
>> Device dirty page tracking operates in VFIOContainer scope. I.e., When
>> dirty tracking is started, stopped or dirty page report is queried, all
>> devices within a VFIOContainer are iterated and for each of them device
>> dirty page tracking is started, stopped or dirty page report is queried,
>> respectively.
>>
>> Device dirty page tracking is used only if all devices within a
>> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
>> used, and if that is not supported as well, memory is perpetually marked
>> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
>> support, the last two usually have the same effect of perpetually
>> marking all pages dirty.
>>
>> Normally, when asked to start dirty tracking, all the currently DMA
>> mapped ranges are tracked by device dirty page tracking. If using a
>> vIOMMU we block live migration. It's temporary and a separate series is
>> going to add support for it. Thus this series focus on getting the
>> ground work first.
>>
>> The series is organized as follows:
>>
>> - Patches 1-7: Fix bugs and do some preparatory work required prior to
>>   adding device dirty page tracking.
>> - Patches 8-10: Implement device dirty page tracking.
>> - Patch 11: Blocks live migration with vIOMMU.
>> - Patches 12-13 enable device dirty page tracking and document it.
>>
>> Comments, improvements as usual appreciated.
> 
> Still some CI failures:
> 
> https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474
> 
> The docker failures are normal, afaict the rest are not.  Thanks,
> 

Ugh, sorry

The patch below the scissors mark (also attached as a file) fixes those build
issues. I managed to reproduce them on i386 target builds, and these changes fix
my 32-bit build.

I don't have a working Gitlab setup[*] though, so I can't trigger the CI to
exercise the wealth of targets it build-tests. Could you kindly test the patch
below in a new pipeline (applied on top of the branch you just built) to see
whether the CI is happy? I will fold these changes into the right patches
(patches 8 and 10) for the v4 spin.

	Joao

[*] I'm working with Gitlab support to understand what's wrong there with my
account.

----------------->8-----------------

From bbf2c3bbb9c9e97f12dfe49f85dac8cc1f0c5d97 Mon Sep 17 00:00:00 2001
From: Joao Martins <joao.m.martins@oracle.com>
Date: Sun, 5 Mar 2023 18:12:29 -0500
Subject: [PATCH v3 14/13] vfio/common: Fix 32-bit builds

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
---
 hw/vfio/common.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 9b909f856722..eecff5bb16c6 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1554,7 +1554,7 @@ vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
         return NULL;
     }
 
-    control->ranges = (__aligned_u64)ranges;
+    control->ranges = (__u64)(uintptr_t)ranges;
     if (tracking->max32) {
         ranges->iova = tracking->min32;
         ranges->length = (tracking->max32 - tracking->min32) + 1;
@@ -1578,7 +1578,7 @@ static void vfio_device_feature_dma_logging_start_destroy(
     struct vfio_device_feature_dma_logging_control *control =
         (struct vfio_device_feature_dma_logging_control *)feature->data;
     struct vfio_device_feature_dma_logging_range *ranges =
-        (struct vfio_device_feature_dma_logging_range *)control->ranges;
+        (struct vfio_device_feature_dma_logging_range *)(uintptr_t) control->ranges;
 
     g_free(ranges);
     g_free(feature);
@@ -1646,7 +1646,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
 {
     uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature) +
                         sizeof(struct vfio_device_feature_dma_logging_report),
-                        sizeof(__aligned_u64))] = {};
+                        sizeof(__u64))] = {};
     struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
     struct vfio_device_feature_dma_logging_report *report =
         (struct vfio_device_feature_dma_logging_report *)feature->data;
@@ -1654,7 +1654,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
     report->iova = iova;
     report->length = size;
     report->page_size = qemu_real_host_page_size();
-    report->bitmap = (__aligned_u64)bitmap;
+    report->bitmap = (__u64)(uintptr_t)bitmap;
 
     feature->argsz = sizeof(buf);
     feature->flags =
-- 
2.17.2
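
For readers wondering why the casts in the patch above matter: on a 32-bit
host a pointer is narrower than a 64-bit uAPI field, and compilers
typically warn about casts between a pointer and an integer of a different
size, which breaks builds that treat warnings as errors. Going through
uintptr_t first makes the conversion well-sized in both directions. The
sketch below shows the pattern on its own; the two helper names are
hypothetical and not part of QEMU.

```c
#include <stdint.h>

/* Pointer -> 64-bit field: convert to uintptr_t first, then widen. */
static inline uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* 64-bit field -> pointer: narrow to uintptr_t first, then convert. */
static inline void *u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

The round trip is lossless as long as the 64-bit value originally came
from a pointer on the same host, which is the case for the ranges and
bitmap fields touched by the patch.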

Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Alex Williamson 1 year, 1 month ago
On Sun, 5 Mar 2023 23:33:35 +0000
Joao Martins <joao.m.martins@oracle.com> wrote:

> On 05/03/2023 20:57, Alex Williamson wrote:
> > On Sat,  4 Mar 2023 01:43:30 +0000
> > Joao Martins <joao.m.martins@oracle.com> wrote:
> >   
> >> Hey,
> >>
> >> Presented herewith a series based on the basic VFIO migration protocol v2
> >> implementation [1].
> >>
> >> It is split from its parent series[5] to solely focus on device dirty
> >> page tracking. Device dirty page tracking allows the VFIO device to
> >> record its DMAs and report them back when needed. This is part of VFIO
> >> migration and is used during pre-copy phase of migration to track the
> >> RAM pages that the device has written to and mark those pages dirty, so
> >> they can later be re-sent to target.
> >>
> >> Device dirty page tracking uses the DMA logging uAPI to discover device
> >> capabilities, to start and stop tracking, and to get dirty page bitmap
> >> report. Extra details and uAPI definition can be found here [3].
> >>
> >> Device dirty page tracking operates in VFIOContainer scope. I.e., When
> >> dirty tracking is started, stopped or dirty page report is queried, all
> >> devices within a VFIOContainer are iterated and for each of them device
> >> dirty page tracking is started, stopped or dirty page report is queried,
> >> respectively.
> >>
> >> Device dirty page tracking is used only if all devices within a
> >> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
> >> used, and if that is not supported as well, memory is perpetually marked
> >> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
> >> support, the last two usually have the same effect of perpetually
> >> marking all pages dirty.
> >>
> >> Normally, when asked to start dirty tracking, all the currently DMA
> >> mapped ranges are tracked by device dirty page tracking. If using a
> >> vIOMMU we block live migration. It's temporary and a separate series is
> >> going to add support for it. Thus this series focus on getting the
> >> ground work first.
> >>
> >> The series is organized as follows:
> >>
> >> - Patches 1-7: Fix bugs and do some preparatory work required prior to
> >>   adding device dirty page tracking.
> >> - Patches 8-10: Implement device dirty page tracking.
> >> - Patch 11: Blocks live migration with vIOMMU.
> >> - Patches 12-13 enable device dirty page tracking and document it.
> >>
> >> Comments, improvements as usual appreciated.  
> > 
> > Still some CI failures:
> > 
> > https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474
> > 
> > The docker failures are normal, afaict the rest are not.  Thanks,
> >   
> 
> Ugh, sorry
> 
> The patch below scissors mark (and also attached as a file) fixes those build
> issues. I managed to reproduce on i386 target builds, and these changes fix my
> 32-bit build.
> 
> I don't have a working Gitlab setup[*] though to trigger the CI to enable to
> wealth of targets it build-tests. If you could kindly test the patch attached in
> a new pipeline (applied on top of the branch you just build) below to understand
> if the CI gets happy. I will include these changes in the right patches (patch 8
> and 10) for the v4 spin.

Looks like this passes:

https://gitlab.com/alex.williamson/qemu/-/pipelines/796750136

Thanks,
Alex
Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Joao Martins 1 year, 1 month ago
On 06/03/2023 02:19, Alex Williamson wrote:
> On Sun, 5 Mar 2023 23:33:35 +0000
> Joao Martins <joao.m.martins@oracle.com> wrote:
> 
>> On 05/03/2023 20:57, Alex Williamson wrote:
>>> On Sat,  4 Mar 2023 01:43:30 +0000
>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>   
>>>> Hey,
>>>>
>>>> Presented herewith a series based on the basic VFIO migration protocol v2
>>>> implementation [1].
>>>>
>>>> It is split from its parent series[5] to solely focus on device dirty
>>>> page tracking. Device dirty page tracking allows the VFIO device to
>>>> record its DMAs and report them back when needed. This is part of VFIO
>>>> migration and is used during pre-copy phase of migration to track the
>>>> RAM pages that the device has written to and mark those pages dirty, so
>>>> they can later be re-sent to target.
>>>>
>>>> Device dirty page tracking uses the DMA logging uAPI to discover device
>>>> capabilities, to start and stop tracking, and to get dirty page bitmap
>>>> report. Extra details and uAPI definition can be found here [3].
>>>>
>>>> Device dirty page tracking operates in VFIOContainer scope. I.e., When
>>>> dirty tracking is started, stopped or dirty page report is queried, all
>>>> devices within a VFIOContainer are iterated and for each of them device
>>>> dirty page tracking is started, stopped or dirty page report is queried,
>>>> respectively.
>>>>
>>>> Device dirty page tracking is used only if all devices within a
>>>> VFIOContainer support it. Otherwise, VFIO IOMMU dirty page tracking is
>>>> used, and if that is not supported as well, memory is perpetually marked
>>>> dirty by QEMU. Note that since VFIO IOMMU dirty page tracking has no HW
>>>> support, the last two usually have the same effect of perpetually
>>>> marking all pages dirty.
>>>>
>>>> Normally, when asked to start dirty tracking, all the currently DMA
>>>> mapped ranges are tracked by device dirty page tracking. If using a
>>>> vIOMMU we block live migration. It's temporary and a separate series is
>>>> going to add support for it. Thus this series focus on getting the
>>>> ground work first.
>>>>
>>>> The series is organized as follows:
>>>>
>>>> - Patches 1-7: Fix bugs and do some preparatory work required prior to
>>>>   adding device dirty page tracking.
>>>> - Patches 8-10: Implement device dirty page tracking.
>>>> - Patch 11: Blocks live migration with vIOMMU.
>>>> - Patches 12-13 enable device dirty page tracking and document it.
>>>>
>>>> Comments, improvements as usual appreciated.  
>>>
>>> Still some CI failures:
>>>
>>> https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474
>>>
>>> The docker failures are normal, afaict the rest are not.  Thanks,
>>>   
>>
>> Ugh, sorry
>>
>> The patch below scissors mark (and also attached as a file) fixes those build
>> issues. I managed to reproduce on i386 target builds, and these changes fix my
>> 32-bit build.
>>
>> I don't have a working Gitlab setup[*] though to trigger the CI to enable to
>> wealth of targets it build-tests. If you could kindly test the patch attached in
>> a new pipeline (applied on top of the branch you just build) below to understand
>> if the CI gets happy. I will include these changes in the right patches (patch 8
>> and 10) for the v4 spin.
> 
> Looks like this passes:
> 
> https://gitlab.com/alex.williamson/qemu/-/pipelines/796750136
> 
Great, I've staged these fixes in patches 8 and 10!

I have a sliver of hope that we might still make it by soft freeze (tomorrow?).
If you think it can still make it, provided the rest of the series is good, then
I can follow up with v4 today/tomorrow. Thoughts?

	Joao
Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Cédric Le Goater 1 year, 1 month ago
On 3/6/23 10:45, Joao Martins wrote:
> On 06/03/2023 02:19, Alex Williamson wrote:
>> On Sun, 5 Mar 2023 23:33:35 +0000
>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>>> On 05/03/2023 20:57, Alex Williamson wrote:
>>>> On Sat,  4 Mar 2023 01:43:30 +0000
>>>> Joao Martins <joao.m.martins@oracle.com> wrote:
>>>>    
>>>>> [...]
>>>>
>>>> Still some CI failures:
>>>>
>>>> https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474
>>>>
>>>> The docker failures are normal, afaict the rest are not.  Thanks,
>>>>    
>>>
>>> Ugh, sorry
>>>
>>> The patch below the scissors mark (also attached as a file) fixes those build
>>> issues. I managed to reproduce them on i386 target builds, and these changes
>>> fix my 32-bit build.
>>>
>>> I don't have a working GitLab setup[*], though, to trigger the CI and exercise
>>> the wealth of targets it build-tests. Could you kindly test the attached patch
>>> in a new pipeline (applied on top of the branch you just built) to see whether
>>> the CI is happy? I will include these changes in the right patches (patches 8
>>> and 10) for the v4 spin.
>>
>> Looks like this passes:
>>
>> https://gitlab.com/alex.williamson/qemu/-/pipelines/796750136
>>
> Great, I've staged these fixes in patches 8&10!
> 
> I have a sliver of hope that we might still make it by soft freeze (tomorrow?).
> If you think it can still make it, provided the rest of the series is good, then
> I can follow up with v4 today/tomorrow. Thoughts?

I would say, wait and see if a v4 is needed first. These changes are
relatively easy to fold in.

C.



> 
> 	Joao
>
Re: [PATCH v3 00/13] vfio/migration: Device dirty page tracking
Posted by Alex Williamson 1 year, 1 month ago
On Mon, 6 Mar 2023 12:05:06 +0100
Cédric Le Goater <clg@redhat.com> wrote:

> On 3/6/23 10:45, Joao Martins wrote:
> > On 06/03/2023 02:19, Alex Williamson wrote:  
> >> On Sun, 5 Mar 2023 23:33:35 +0000
> >> Joao Martins <joao.m.martins@oracle.com> wrote:
> >>  
> >>> On 05/03/2023 20:57, Alex Williamson wrote:  
> >>>> On Sat,  4 Mar 2023 01:43:30 +0000
> >>>> Joao Martins <joao.m.martins@oracle.com> wrote:
> >>>>      
> >>>>> [...]
> >>>>
> >>>> Still some CI failures:
> >>>>
> >>>> https://gitlab.com/alex.williamson/qemu/-/pipelines/796657474
> >>>>
> >>>> The docker failures are normal, afaict the rest are not.  Thanks,
> >>>>      
> >>>
> >>> Ugh, sorry
> >>>
> >>> The patch below the scissors mark (also attached as a file) fixes those build
> >>> issues. I managed to reproduce them on i386 target builds, and these changes
> >>> fix my 32-bit build.
> >>>
> >>> I don't have a working GitLab setup[*], though, to trigger the CI and exercise
> >>> the wealth of targets it build-tests. Could you kindly test the attached patch
> >>> in a new pipeline (applied on top of the branch you just built) to see whether
> >>> the CI is happy? I will include these changes in the right patches (patches 8
> >>> and 10) for the v4 spin.
> >>
> >> Looks like this passes:
> >>
> >> https://gitlab.com/alex.williamson/qemu/-/pipelines/796750136
> >>  
> > Great, I've staged these fixes in patches 8&10!
> > 
> > I have a sliver of hope that we might still make it by soft freeze (tomorrow?).
> > If you think it can still make it, provided the rest of the series is good, then
> > I can follow up with v4 today/tomorrow. Thoughts?
> 
> I would say, wait and see if a v4 is needed first. These changes are
> relatively easy to fold in.

I think we have enough changes and fixes to post a v4 once you're happy
with it.  We should have tomorrow, the 7th, to get final reviews and
post a pull request.  Thanks,

Alex