On 08.12.2020 21:24, Peter Xu wrote:
> On Fri, Dec 04, 2020 at 12:30:59PM +0300, Andrey Gruzdev wrote:
>> This patch series is a kind of 'rethinking' of the ideas Denis Plotnikov
>> implemented in his series '[PATCH v0 0/4] migration: add background snapshot'.
>>
>> Currently the only way to make an (external) live VM snapshot is to use the
>> existing dirty page logging migration mechanism. The main problem is that it
>> tends to produce a lot of page duplicates, because the running VM keeps
>> updating pages that have already been saved. As a result, the vmstate image is
>> commonly several times bigger than the non-zero part of the virtual machine's
>> RSS. The time required to converge RAM migration and the size of the snapshot
>> image depend heavily on the guest memory write rate, sometimes resulting in
>> unacceptably long snapshot creation time and huge image size.
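
To make the duplication problem concrete, here is a toy Python model of iterative dirty-page-logging migration. It is only a sketch: `write_ratio` (pages the guest dirties per page sent), `downtime_pages` and `max_rounds` are hypothetical parameters, not QEMU settings, and the real convergence logic is more involved.

```python
def precopy_pages_sent(num_pages, write_ratio, downtime_pages=16, max_rounds=50):
    """Count pages written to the stream by a toy iterative pre-copy.

    write_ratio is a hypothetical knob: how many pages the guest dirties
    for every page we manage to send (guest write rate / send bandwidth).
    """
    pending = num_pages          # the first pass sends every page
    total_sent = 0
    for _ in range(max_rounds):
        if pending <= downtime_pages:
            break                # small enough for the final stop-and-copy
        total_sent += pending
        # While this pass was running, the guest re-dirtied some pages;
        # all of them have to be sent again in the next pass.
        pending = min(num_pages, int(pending * write_ratio))
    return total_sent + pending  # final pass with the VM stopped


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.9):
        sent = precopy_pages_sent(1024, ratio)
        print("write_ratio=%.1f: %d pages sent for 1024 pages of RAM" % (ratio, sent))
```

With no guest writes every page is sent once; as the write ratio approaches 1 the total grows to many times the RAM size and the loop only stops at `max_rounds`.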
>>
>> This series proposes a way to solve the aforementioned problems. It uses a
>> different RAM migration mechanism based on UFFD write protection management,
>> introduced in the v5.7 kernel. The migration strategy is to 'freeze' guest RAM
>> content using write protection and iteratively release protection for memory
>> ranges that have already been saved to the migration stream. At the same time
>> we read pending UFFD write fault events and save those pages out of order
>> with higher priority.
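
The strategy above can be sketched as a toy Python simulation. This is not the QEMU implementation and does not touch userfaultfd: `background_snapshot`, the `protected` set and the one-write-per-scan-step interleaving are assumptions made for illustration only.

```python
from collections import deque


def background_snapshot(memory, guest_writes):
    """Toy model of a write-protect based snapshot.

    memory       -- list of page contents at the moment RAM is 'frozen'
    guest_writes -- iterable of (page_index, new_value); one write is
                    attempted per scan step (an assumption of this model)
    Returns (stream, ram): pages in the order they were saved, and the
    live RAM content when the snapshot finished.
    """
    ram = list(memory)                  # live guest RAM, keeps changing
    protected = set(range(len(ram)))    # every page starts write-protected
    stream = []                         # (page_index, saved_content)
    writes = deque(guest_writes)
    next_page = 0                       # position of the linear scan

    def save(page):
        stream.append((page, ram[page]))
        protected.discard(page)         # release protection once saved

    while protected:
        # Pending write faults are handled first, with higher priority.
        if writes:
            page, value = writes.popleft()
            if page in protected:
                save(page)              # save the old content out of order
            ram[page] = value           # now the guest write may proceed
        # Then the linear scan saves the next still-protected page.
        while next_page < len(ram) and next_page not in protected:
            next_page += 1
        if next_page < len(ram):
            save(next_page)
    return stream, ram


if __name__ == "__main__":
    stream, ram = background_snapshot(
        ["a", "b", "c", "d"], [(3, "D"), (0, "A"), (3, "X")])
    print("saved stream:", stream)
    print("live RAM    :", ram)
```

Each page appears in the stream exactly once, holding its content at freeze time, regardless of how often the guest rewrites it afterwards.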
>>
>> How to use:
>> 1. Enable the write-tracking migration capability
>> virsh qemu-monitor-command <domain> --hmp migrate_set_capability track-writes-ram on
>>
>> 2. Start the external migration to a file
>> virsh qemu-monitor-command <domain> --hmp migrate exec:'cat > ./vm_state'
>>
>> 3. Wait for the migration to finish and check that it has reached the
>> 'completed' state.
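
For scripting, the three steps can be assembled as argv lists, for example in Python. A sketch only: `snapshot_commands` is a hypothetical helper, and using the HMP `info migrate` command for the final check is an assumption, since the steps above do not say how the state is checked.

```python
import shlex


def snapshot_commands(domain, state_file):
    """Build the virsh invocations for the three steps as argv lists."""
    base = ["virsh", "qemu-monitor-command", domain, "--hmp"]
    return [
        # Step 1: enable the write-tracking capability.
        base + ["migrate_set_capability", "track-writes-ram", "on"],
        # Step 2: migrate to a file through an exec: target.
        base + ["migrate", "exec:cat > " + state_file],
        # Step 3 (assumed): query the migration status and look for
        # the 'completed' state in the output.
        base + ["info", "migrate"],
    ]


if __name__ == "__main__":
    # Print the commands in shell-quoted form instead of running them.
    for argv in snapshot_commands("my-domain", "./vm_state"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Each list can be passed as-is to `subprocess.run(argv, check=True)` on a host where virsh is available.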
>>
>> Changes v4->v5:
>>
>> * 1. Refactored util/userfaultfd.c code to support features required by postcopy.
>> * 2. Introduced checks for host kernel and guest memory backend compatibility
>> * to 'background-snapshot' branch in migrate_caps_check().
>> * 3. Switched to using trace_xxx instead of info_report()/error_report() for
>> * cases where the error message must be hidden (probing UFFD-IO) or where
>> * info messages would clutter the output if sent to stderr.
>> * 4. Added RCU_READ_LOCK_GUARDs to the code dealing with RAM block list.
>> * 5. Added memory_region_ref() for each RAM block being wr-protected.
>> * 6. Reused qemu_ram_block_from_host() instead of custom RAM block lookup routine.
>> * 7. Stopped using specific hwaddr/ram_addr_t in favour of void */uint64_t.
>> * 8. Dropped the 'linear-scan-rate-limiting' patch for now. The reason is that
>> * the chosen criterion for high-latency fault detection (i.e. the timestamp
>> * of the UFFD event fetch) is not representative enough for this task.
>> * At the moment it looks somewhat like a premature optimization effort.
>> * 9. Dropped some unnecessary/unused code.
> I went over the series and it looks nice!
>
> There're a few todos for this series, so I added them into the wiki page (I
> created a "feature" section for migration todo and put live snapshot there):
>
> https://wiki.qemu.org/ToDo/LiveMigration#Features
>
> Anyone feel free to add..
>
> Thanks,
>
Thanks, Peter!
--
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH +7-903-247-6397
virtuzzo.com