[PATCH v2 0/7] UFFD write-tracking migration/snapshots

Andrey Gruzdev posted 7 patches 3 years, 4 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20201118132048.429092-1-andrey.gruzdev@virtuozzo.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>, Juan Quintela <quintela@redhat.com>
There is a newer version of this series
[PATCH v2 0/7] UFFD write-tracking migration/snapshots
Posted by Andrey Gruzdev 3 years, 4 months ago
Currently the only way to make an (external) live VM snapshot is to use the
existing dirty page logging migration mechanism. The main problem is that it
tends to produce a lot of page duplicates while the running VM keeps updating
already saved pages. As a result, the vmstate image is commonly several times
bigger than the non-zero part of the virtual machine's RSS. The time required
to converge RAM migration and the size of the snapshot image depend heavily on
the guest memory write rate, sometimes resulting in unacceptably long snapshot
creation time and a huge image size.

This series proposes a way to solve the aforementioned problems, using a
different RAM migration mechanism based on the UFFD write protection support
introduced in the v5.7 kernel. The migration strategy is to 'freeze' guest RAM
content using write-protection and iteratively release protection for memory
ranges that have already been saved to the migration stream. At the same time
we read pending UFFD write fault events and save those pages out-of-order
with higher priority.
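
For illustration only, the strategy described above can be modelled with a toy
Python simulation. This is not code from the series and it does not touch
userfaultfd at all: the function name snapshot(), the scan_chunk parameter and
the list-based 'RAM' are assumptions made up for this sketch. It only shows why
each page reaches the stream exactly once and with its content as of the start
of the snapshot.

```python
from collections import deque


def snapshot(ram, guest_writes, scan_chunk=2):
    """Toy model of a write-protect based snapshot (hypothetical helper).

    ram          -- page contents at the moment the snapshot starts
    guest_writes -- (page_index, new_value) pairs; the guest attempts one
                    write before each linear scan step
    Returns (stream, final_ram), where stream lists (page_index, value)
    in the order the pages were saved.
    """
    ram = list(ram)
    protected = [True] * len(ram)  # 'freeze': all pages start write-protected
    stream = []                    # stands in for the migration stream
    pending = deque(guest_writes)
    cursor = 0                     # position of the linear scan

    def save_and_unprotect(idx):
        # Save a page only while it is still protected, then release it.
        if protected[idx]:
            stream.append((idx, ram[idx]))
            protected[idx] = False

    while cursor < len(ram) or pending:
        if pending:
            idx, value = pending.popleft()
            # A write to a protected page models a write fault: the old
            # content is saved out-of-order before the write is let through.
            save_and_unprotect(idx)
            ram[idx] = value
        # Linear scan: save the next chunk of still-protected pages.
        end = min(cursor + scan_chunk, len(ram))
        for idx in range(cursor, end):
            save_and_unprotect(idx)
        cursor = end

    return stream, ram
```

With six pages and a guest that writes to page 4 first, page 4 is saved ahead of
the linear scan with its original content, and later writes to already saved
pages cost nothing extra in the stream.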

How to use:
1. Enable write-tracking migration capability
   virsh qemu-monitor-command <domain> --hmp migrate_set_capability track-writes-ram on

2. Start the external migration to a file
   virsh qemu-monitor-command <domain> --hmp migrate exec:'cat > ./vm_state'

3. Wait for the migration to finish and check that it has reached the
   'completed' state.
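
As a rough sketch of how step 3 could be scripted, the Python helpers below
inspect monitor output for the migration state. They are not part of the
series. They assume that the text comes from the HMP 'info migrate' command and
that it contains a line of the form 'Migration status: <state>'; please verify
this against your QEMU version. The function names are hypothetical.

```python
import re


def migration_status(info_migrate_text):
    """Return the state from the 'Migration status:' line, or None.

    Assumes the 'Migration status: <state>' line format described above.
    """
    match = re.search(r"^Migration status:\s*(\S+)", info_migrate_text,
                      re.MULTILINE)
    return match.group(1) if match else None


def snapshot_finished(info_migrate_text):
    """Return True once the state is 'completed', False while still running.

    Raises RuntimeError if the migration failed or was cancelled, so that a
    polling loop does not wait forever.
    """
    status = migration_status(info_migrate_text)
    if status in ("failed", "cancelled"):
        raise RuntimeError("migration ended with status: " + status)
    return status == "completed"
```

A caller would fetch the monitor output periodically (for example through
virsh qemu-monitor-command <domain> --hmp) and stop polling once
snapshot_finished() returns True.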

Andrey Gruzdev (7):
  Introduce 'track-writes-ram' migration capability.
  Introduced UFFD-WP low-level interface helpers. Implemented support
    for the whole RAM block memory protection/un-protection. Higher
    level ram_write_tracking_start() and ram_write_tracking_stop() to
    start/stop tracking memory writes on the whole VM memory.
  Support UFFD write fault processing in ram_save_iterate().
  Implementation of write-tracking migration thread.
  Implementation of vm_start() BH.
  The rest of write tracking migration code.
  Introduced simple linear scan rate limiting mechanism for write
    tracking migration.

 include/exec/memory.h |   7 +
 migration/migration.c | 338 +++++++++++++++++++++++++++++++-
 migration/migration.h |   4 +
 migration/ram.c       | 439 +++++++++++++++++++++++++++++++++++++++++-
 migration/ram.h       |   4 +
 migration/savevm.c    |   1 -
 migration/savevm.h    |   2 +
 qapi/migration.json   |   7 +-
 8 files changed, 790 insertions(+), 12 deletions(-)

-- 
2.25.1


Re: [PATCH v2 0/7] UFFD write-tracking migration/snapshots
Posted by Eric Blake 3 years, 4 months ago
On 11/18/20 7:20 AM, Andrey Gruzdev wrote:
> Currently the only way to make (external) live VM snapshot is using existing
> dirty page logging migration mechanism. The main problem is that it tends to
> produce a lot of page duplicates while running VM goes on updating already
> saved pages. That leads to the fact that vmstate image size is commonly several
> times bigger than non-zero part of virtual machine's RSS. Time required to
> converge RAM migration and the size of snapshot image severely depend on the
> guest memory write rate, sometimes resulting in unacceptably long snapshot
> creation time and huge image size.
> 
> This series proposes a way to solve the aforementioned problems. This is done
> by using different RAM migration mechanism based on UFFD write protection
> management introduced in v5.7 kernel. The migration strategy is to 'freeze'
> guest RAM content using write-protection and iteratively release protection
> for memory ranges that have already been saved to the migration stream.
> At the same time we read in pending UFFD write fault events and save those
> pages out-of-order with higher priority.
> 
> How to use:
> 1. Enable write-tracking migration capability
>    virsh qemu-monitor-command <domain> --hmp migrate_set_capability track-writes-ram on
> 
> 2. Start the external migration to a file
>    virsh qemu-monitor-command <domain> --hmp migrate exec:'cat > ./vm_state'
> 
> 3. Wait for the migration to finish and check that it has reached the
>    'completed' state.
> 
> Andrey Gruzdev (7):
>   Introduce 'track-writes-ram' migration capability.
>   Introduced UFFD-WP low-level interface helpers. Implemented support
>     for the whole RAM block memory protection/un-protection. Higher
>     level ram_write_tracking_start() and ram_write_tracking_stop() to
>     start/stop tracking memory writes on the whole VM memory.

Subject line is too long on that patch. You probably forgot a newline.
Also, it is more common to not include a trailing '.' in the subject line.

>   Support UFFD write fault processing in ram_save_iterate().
>   Implementation of write-tracking migration thread.
>   Implementation of vm_start() BH.
>   The rest of write tracking migration code.
>   Introduced simple linear scan rate limiting mechanism for write
>     tracking migration.
> 

How does v2 differ from v1?  It makes life easier for reviewers to know
what to look for in the respin that was fixed based on the problems in
the earlier version.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [PATCH v2 0/7] UFFD write-tracking migration/snapshots
Posted by Andrey Gruzdev 3 years, 4 months ago
On 18.11.2020 17:54, Eric Blake wrote:
> On 11/18/20 7:20 AM, Andrey Gruzdev wrote:
>> Currently the only way to make (external) live VM snapshot is using existing
>> dirty page logging migration mechanism. The main problem is that it tends to
>> produce a lot of page duplicates while running VM goes on updating already
>> saved pages. That leads to the fact that vmstate image size is commonly several
>> times bigger than non-zero part of virtual machine's RSS. Time required to
>> converge RAM migration and the size of snapshot image severely depend on the
>> guest memory write rate, sometimes resulting in unacceptably long snapshot
>> creation time and huge image size.
>>
>> This series proposes a way to solve the aforementioned problems. This is done
>> by using different RAM migration mechanism based on UFFD write protection
>> management introduced in v5.7 kernel. The migration strategy is to 'freeze'
>> guest RAM content using write-protection and iteratively release protection
>> for memory ranges that have already been saved to the migration stream.
>> At the same time we read in pending UFFD write fault events and save those
>> pages out-of-order with higher priority.
>>
>> How to use:
>> 1. Enable write-tracking migration capability
>>     virsh qemu-monitor-command <domain> --hmp migrate_set_capability track-writes-ram on
>>
>> 2. Start the external migration to a file
>>     virsh qemu-monitor-command <domain> --hmp migrate exec:'cat > ./vm_state'
>>
>> 3. Wait for the migration to finish and check that it has reached the
>>    'completed' state.
>>
>> Andrey Gruzdev (7):
>>    Introduce 'track-writes-ram' migration capability.
>>    Introduced UFFD-WP low-level interface helpers. Implemented support
>>      for the whole RAM block memory protection/un-protection. Higher
>>      level ram_write_tracking_start() and ram_write_tracking_stop() to
>>      start/stop tracking memory writes on the whole VM memory.
> 
> Subject line is too long on that patch. You probably forgot a newline.
> Also, it is more common to not include a trailing '.' in the subject line.
> 

Sorry, I'm a bit new to the mailing list process; I did indeed miss the
newline that separates the subject.

>>    Support UFFD write fault processing in ram_save_iterate().
>>    Implementation of write-tracking migration thread.
>>    Implementation of vm_start() BH.
>>    The rest of write tracking migration code.
>>    Introduced simple linear scan rate limiting mechanism for write
>>      tracking migration.
>>
> 
> How does v2 differ from v1?  It makes life easier for reviewers to know
> what to look for in the respin that was fixed based on the problems in
> the earlier version.
> 

The only difference v1->v2 is that the latter passes the checkpatch
test, nothing else. I had accidentally disabled the checkpatch commit hook
after finding that it gives a false positive on do {} while (0); - for the
semicolon.

-- 
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH  +7-903-247-6397
                 virtuzzo.com

Re: [PATCH v2 0/7] UFFD write-tracking migration/snapshots
Posted by Andrey Gruzdev 3 years, 4 months ago
On 18.11.2020 16:20, Andrey Gruzdev wrote:
> Currently the only way to make (external) live VM snapshot is using existing
> dirty page logging migration mechanism. The main problem is that it tends to
> produce a lot of page duplicates while running VM goes on updating already
> saved pages. That leads to the fact that vmstate image size is commonly several
> times bigger than non-zero part of virtual machine's RSS. Time required to
> converge RAM migration and the size of snapshot image severely depend on the
> guest memory write rate, sometimes resulting in unacceptably long snapshot
> creation time and huge image size.
> 
> This series proposes a way to solve the aforementioned problems. This is done
> by using different RAM migration mechanism based on UFFD write protection
> management introduced in v5.7 kernel. The migration strategy is to 'freeze'
> guest RAM content using write-protection and iteratively release protection
> for memory ranges that have already been saved to the migration stream.
> At the same time we read in pending UFFD write fault events and save those
> pages out-of-order with higher priority.
> 
> How to use:
> 1. Enable write-tracking migration capability
>     virsh qemu-monitor-command <domain> --hmp migrate_set_capability track-writes-ram on
> 
> 2. Start the external migration to a file
>     virsh qemu-monitor-command <domain> --hmp migrate exec:'cat > ./vm_state'
> 
> 3. Wait for the migration to finish and check that it has reached the
>    'completed' state.
> 
> Andrey Gruzdev (7):
>    Introduce 'track-writes-ram' migration capability.
>    Introduced UFFD-WP low-level interface helpers. Implemented support
>      for the whole RAM block memory protection/un-protection. Higher
>      level ram_write_tracking_start() and ram_write_tracking_stop() to
>      start/stop tracking memory writes on the whole VM memory.
>    Support UFFD write fault processing in ram_save_iterate().
>    Implementation of write-tracking migration thread.
>    Implementation of vm_start() BH.
>    The rest of write tracking migration code.
>    Introduced simple linear scan rate limiting mechanism for write
>      tracking migration.
> 
>   include/exec/memory.h |   7 +
>   migration/migration.c | 338 +++++++++++++++++++++++++++++++-
>   migration/migration.h |   4 +
>   migration/ram.c       | 439 +++++++++++++++++++++++++++++++++++++++++-
>   migration/ram.h       |   4 +
>   migration/savevm.c    |   1 -
>   migration/savevm.h    |   2 +
>   qapi/migration.json   |   7 +-
>   8 files changed, 790 insertions(+), 12 deletions(-)
> 

I should also note that this patch series is a kind of 'rethinking' of
the ideas Denis Plotnikov implemented in his series
'[PATCH v0 0/4] migration: add background snapshot'.

-- 
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH  +7-903-247-6397
                 virtuzzo.com