[PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code

Andrey Gruzdev posted 3 patches 3 years, 1 month ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20210319145249.425189-1-andrey.gruzdev@virtuozzo.com
Maintainers: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, David Hildenbrand <david@redhat.com>, Juan Quintela <quintela@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>
There is a newer version of this series
hw/virtio/virtio-balloon.c |  8 +++++--
include/migration/misc.h   |  2 ++
migration/migration.c      | 18 +++++++++++++-
migration/ram.c            | 48 ++++++++++++++++++++++++++++++++++++++
migration/ram.h            |  1 +
5 files changed, 74 insertions(+), 3 deletions(-)
[PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code
Posted by Andrey Gruzdev 3 years, 1 month ago
Changes v0->v1:
 * Using qemu_real_host_page_size instead of TARGET_PAGE_SIZE for host
   page size in ram_block_populate_pages()
 * More elegant implementation of ram_block_populate_pages()

This patch series contains:
 * Fix to the issue with occasionally truncated non-iterable device state
 * Solution to compatibility issues with virtio-balloon device
 * Fix to the issue when discarded or never populated pages miss UFFD
   write protection and get into migration stream in dirty state

Andrey Gruzdev (3):
  migration: Fix missing qemu_fflush() on buffer file in
    bg_migration_thread
  migration: Inhibit virtio-balloon for the duration of background
    snapshot
  migration: Pre-fault memory before starting background snasphot

 hw/virtio/virtio-balloon.c |  8 +++++--
 include/migration/misc.h   |  2 ++
 migration/migration.c      | 18 +++++++++++++-
 migration/ram.c            | 48 ++++++++++++++++++++++++++++++++++++++
 migration/ram.h            |  1 +
 5 files changed, 74 insertions(+), 3 deletions(-)

-- 
2.25.1


Re: [PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code
Posted by Peter Xu 3 years, 1 month ago
On Fri, Mar 19, 2021 at 05:52:46PM +0300, Andrey Gruzdev wrote:
> Changes v0->v1:
>  * Using qemu_real_host_page_size instead of TARGET_PAGE_SIZE for host
>    page size in ram_block_populate_pages()
>  * More elegant implementation of ram_block_populate_pages()
> 
> This patch series contains:
>  * Fix to the issue with occasionally truncated non-iterable device state
>  * Solution to compatibility issues with virtio-balloon device
>  * Fix to the issue when discarded or never populated pages miss UFFD
>    write protection and get into migration stream in dirty state
> 
> Andrey Gruzdev (3):
>   migration: Fix missing qemu_fflush() on buffer file in
>     bg_migration_thread
>   migration: Inhibit virtio-balloon for the duration of background
>     snapshot
>   migration: Pre-fault memory before starting background snasphot

Unless Andrey would like to respin a new version, this version looks good to me
(I don't think the adding new helper issue in patch 1 is a blocker):

Reviewed-by: Peter Xu <peterx@redhat.com>

I'm also looking into introducing UFFD_FEATURE_WP_UNALLOCATED so as to
wr-protect page holes too for a uffd-wp region when the feature bit is set.
With that feature we should be able to avoid pre-fault as what we do in the
last patch of this series.  However even if that can work out, we'll still need
this for old kernel anyways.

Thanks,

-- 
Peter Xu


Re: [PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code
Posted by Andrey Gruzdev 3 years, 1 month ago
On 24.03.2021 01:21, Peter Xu wrote:
> On Fri, Mar 19, 2021 at 05:52:46PM +0300, Andrey Gruzdev wrote:
>> Changes v0->v1:
>>   * Using qemu_real_host_page_size instead of TARGET_PAGE_SIZE for host
>>     page size in ram_block_populate_pages()
>>   * More elegant implementation of ram_block_populate_pages()
>>
>> This patch series contains:
>>   * Fix to the issue with occasionally truncated non-iterable device state
>>   * Solution to compatibility issues with virtio-balloon device
>>   * Fix to the issue when discarded or never populated pages miss UFFD
>>     write protection and get into migration stream in dirty state
>>
>> Andrey Gruzdev (3):
>>    migration: Fix missing qemu_fflush() on buffer file in
>>      bg_migration_thread
>>    migration: Inhibit virtio-balloon for the duration of background
>>      snapshot
>>    migration: Pre-fault memory before starting background snasphot
> Unless Andrey would like to respin a new version, this version looks good to me
> (I don't think the adding new helper issue in patch 1 is a blocker):
>
> Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks.

> I'm also looking into introducing UFFD_FEATURE_WP_UNALLOCATED so as to
> wr-protect page holes too for a uffd-wp region when the feature bit is set.
> With that feature we should be able to avoid pre-fault as what we do in the
> last patch of this series.  However even if that can work out, we'll still need
> this for old kernel anyways.

I'm curious this new feature is based on adding wr-protection at the level of VMAs,
so we won't miss write faults for missing pages?

> Thanks,
>

Re: [PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code
Posted by Peter Xu 3 years, 1 month ago
On Wed, Mar 24, 2021 at 11:09:27AM +0300, Andrey Gruzdev wrote:
> > I'm also looking into introducing UFFD_FEATURE_WP_UNALLOCATED so as to
> > wr-protect page holes too for a uffd-wp region when the feature bit is set.
> > With that feature we should be able to avoid pre-fault as what we do in the
> > last patch of this series.  However even if that can work out, we'll still need
> > this for old kernel anyways.
> 
> I'm curious this new feature is based on adding wr-protection at the level of VMAs,
> so we won't miss write faults for missing pages?

I think we can do it with multiple ways.

The most efficient one would be wr-protect the range during uffd-wp
registration, so as you said it'll be per-vma attribute.  However that'll
change the general semantics of uffd-wp as normally we need registration and
explicit wr-protect.  Then it'll still be pte-based for faulted in pages (the
ones we wr-protected during registration will still be), however for the rest
it'll become vma-based.  It's indeed a bit confusing.

The other way is we can fault in zero page during UFFDIO_WRITEPROTECT.  However
that's less efficient, since it's close to pre-fault on read but it's just
slightly more cleaner than doing it in userspace.  When I rethink about this it
may not worth it to do in kernel if userspace can achieve things similar.

So let's stick with current solution; that idea may need more thoughts..

Thanks,

-- 
Peter Xu


Re: [PATCH v1 0/3] migration: Fixes to the 'background-snapshot' code
Posted by Andrey Gruzdev 3 years, 1 month ago
On 24.03.2021 18:41, Peter Xu wrote:
> On Wed, Mar 24, 2021 at 11:09:27AM +0300, Andrey Gruzdev wrote:
>>> I'm also looking into introducing UFFD_FEATURE_WP_UNALLOCATED so as to
>>> wr-protect page holes too for a uffd-wp region when the feature bit is set.
>>> With that feature we should be able to avoid pre-fault as what we do in the
>>> last patch of this series.  However even if that can work out, we'll still need
>>> this for old kernel anyways.
>> I'm curious this new feature is based on adding wr-protection at the level of VMAs,
>> so we won't miss write faults for missing pages?
> I think we can do it with multiple ways.
>
> The most efficient one would be wr-protect the range during uffd-wp
> registration, so as you said it'll be per-vma attribute.  However that'll
> change the general semantics of uffd-wp as normally we need registration and
> explicit wr-protect.  Then it'll still be pte-based for faulted in pages (the
> ones we wr-protected during registration will still be), however for the rest
> it'll become vma-based.  It's indeed a bit confusing.
>
> The other way is we can fault in zero page during UFFDIO_WRITEPROTECT.  However
> that's less efficient, since it's close to pre-fault on read but it's just
> slightly more cleaner than doing it in userspace.  When I rethink about this it
> may not worth it to do in kernel if userspace can achieve things similar.
>
> So let's stick with current solution; that idea may need more thoughts..
>
> Thanks,
>
Agree, let's stick with current solution. For the future I think having 
a registration
flag like WP_MISSING to induce per-vma wr-protection is not a bad choice.

The reason is that usage of UFFDIO_WRITEPROTECT ioctl is often 
asymmetrical; we usually
write-protect the whole registration range but un-protect by small chunks.

So if we stay with current current symmetric protect/un-protect API but 
add the registration
flag to handle protection for unpopulated pages - that may be worth to do.

-- 
Andrey Gruzdev, Principal Engineer
Virtuozzo GmbH  +7-903-247-6397
                 virtuzzo.com