On 2/22/23 18:49, Avihai Horon wrote:
> Adjust the VFIO dirty page tracking documentation and add a section to
> describe device dirty page tracking.
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Thanks,
C.
> ---
> docs/devel/vfio-migration.rst | 50 ++++++++++++++++++++++-------------
> 1 file changed, 32 insertions(+), 18 deletions(-)
>
> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> index ba80b9150d..a432cda081 100644
> --- a/docs/devel/vfio-migration.rst
> +++ b/docs/devel/vfio-migration.rst
> @@ -71,22 +71,37 @@ System memory dirty pages tracking
> ----------------------------------
>
> A ``log_global_start`` and ``log_global_stop`` memory listener callback informs
> -the VFIO IOMMU module to start and stop dirty page tracking. A ``log_sync``
> -memory listener callback marks those system memory pages as dirty which are
> -used for DMA by the VFIO device. The dirty pages bitmap is queried per
> -container. All pages pinned by the vendor driver through external APIs have to
> -be marked as dirty during migration. When there are CPU writes, CPU dirty page
> -tracking can identify dirtied pages, but any page pinned by the vendor driver
> -can also be written by the device. There is currently no device or IOMMU
> -support for dirty page tracking in hardware.
> +the VFIO dirty tracking module to start and stop dirty page tracking. A
> +``log_sync`` memory listener callback queries the dirty page bitmap from the
> +dirty tracking module and marks system memory pages which were DMA-ed by the
> +VFIO device as dirty. The dirty page bitmap is queried per container.
> +
> +Currently there are two ways dirty page tracking can be done:
> +(1) Device dirty tracking:
> +In this method the device is responsible to log and report its DMAs. This
> +method can be used only if the device is capable of tracking its DMAs.
> +Discovering device capability, starting and stopping dirty tracking, and
> +syncing the dirty bitmaps from the device are done using the DMA logging uAPI.
> +More info about the uAPI can be found in the comments of the
> +``vfio_device_feature_dma_logging_control`` and
> +``vfio_device_feature_dma_logging_report`` structures in the header file
> +linux-headers/linux/vfio.h.
> +
> +(2) VFIO IOMMU module:
> +In this method dirty tracking is done by IOMMU. However, there is currently no
> +IOMMU support for dirty page tracking. For this reason, all pages are
> +perpetually marked dirty, unless the device driver pins pages through external
> +APIs in which case only those pinned pages are perpetually marked dirty.
> +
> +If the above two methods are not supported, all pages are perpetually marked
> +dirty by QEMU.
>
> By default, dirty pages are tracked during pre-copy as well as stop-and-copy
> -phase. So, a page pinned by the vendor driver will be copied to the destination
> -in both phases. Copying dirty pages in pre-copy phase helps QEMU to predict if
> -it can achieve its downtime tolerances. If QEMU during pre-copy phase keeps
> -finding dirty pages continuously, then it understands that even in stop-and-copy
> -phase, it is likely to find dirty pages and can predict the downtime
> -accordingly.
> +phase. So, a page marked as dirty will be copied to the destination in both
> +phases. Copying dirty pages in pre-copy phase helps QEMU to predict if it can
> +achieve its downtime tolerances. If QEMU during pre-copy phase keeps finding
> +dirty pages continuously, then it understands that even in stop-and-copy phase,
> +it is likely to find dirty pages and can predict the downtime accordingly.
>
> QEMU also provides a per device opt-out option ``pre-copy-dirty-page-tracking``
> which disables querying the dirty bitmap during pre-copy phase. If it is set to
> @@ -97,10 +112,9 @@ System memory dirty pages tracking when vIOMMU is enabled
> ---------------------------------------------------------
>
> With vIOMMU, an IO virtual address range can get unmapped while in pre-copy
> -phase of migration. In that case, the unmap ioctl returns any dirty pages in
> -that range and QEMU reports corresponding guest physical pages dirty. During
> -stop-and-copy phase, an IOMMU notifier is used to get a callback for mapped
> -pages and then dirty pages bitmap is fetched from VFIO IOMMU modules for those
> +phase of migration. In that case, dirty page bitmap for this range is queried
> +and synced with QEMU. During stop-and-copy phase, an IOMMU notifier is used to
> +get a callback for mapped pages and then dirty page bitmap is fetched for those
> mapped ranges.
>
> Flow of state changes during Live migration