[v1] migration: Postcopy Preemption

[PATCH RFC 02/15] migration: Allow pss->page jump over clean pages

Posted by Peter Xu 4 years ago

Commit ba1b7c812c ("migration/ram: Optimize ram_save_host_page()") managed to
optimize host huge page use case by scanning the dirty bitmap when looking for
the next dirty small page to migrate.

However when updating the pss->page before returning from that function, we
used MIN() of these two values: (1) next dirty bit, or (2) end of current sent
huge page, to fix up pss->page.

That sounds unnecessary, because I see nowhere that requires pss->page to be
not going over current huge page boundary.

What we need here is probably MAX() instead of MIN() so that we'll start
scanning from the next dirty bit next time. Since pss->page can't be smaller
than hostpage_boundary (the loop guarantees it), it probably means we don't
need to fix it up at all.

Cc: Keqian Zhu <zhukeqian1@huawei.com>
Cc: Kunkun Jiang <jiangkunkun@huawei.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/ram.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 381ad56d26..94b0ad4234 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2229,8 +2229,6 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
     } while ((pss->page < hostpage_boundary) &&
              offset_in_ramblock(pss->block,
                                 ((ram_addr_t)pss->page) << TARGET_PAGE_BITS));
-    /* The offset we leave with is the min boundary of host page and block */
-    pss->page = MIN(pss->page, hostpage_boundary);
 
     res = ram_save_release_protection(rs, pss, start_page);
     return (res < 0 ? res : pages);
-- 
2.32.0

Re: [PATCH RFC 02/15] migration: Allow pss->page jump over clean pages

Posted by Dr. David Alan Gilbert 4 years ago

* Peter Xu (peterx@redhat.com) wrote:
> Commit ba1b7c812c ("migration/ram: Optimize ram_save_host_page()") managed to
> optimize host huge page use case by scanning the dirty bitmap when looking for
> the next dirty small page to migrate.
> 
> However when updating the pss->page before returning from that function, we
> used MIN() of these two values: (1) next dirty bit, or (2) end of current sent
> huge page, to fix up pss->page.
> 
> That sounds unnecessary, because I see nowhere that requires pss->page to be
> not going over current huge page boundary.
> 
> What we need here is probably MAX() instead of MIN() so that we'll start
> scanning from the next dirty bit next time. Since pss->page can't be smaller
> than hostpage_boundary (the loop guarantees it), it probably means we don't
> need to fix it up at all.
> 
> Cc: Keqian Zhu <zhukeqian1@huawei.com>
> Cc: Kunkun Jiang <jiangkunkun@huawei.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>


Hmm, I think that's potentially necessary.  note that the start of
ram_save_host_page stores the 'start_page' at entry.
That' start_page' goes to the ram_save_release_protection and so
I think it needs to be pagesize aligned for the mmap/uffd that happens.

Dave

> ---
>  migration/ram.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 381ad56d26..94b0ad4234 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2229,8 +2229,6 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
>      } while ((pss->page < hostpage_boundary) &&
>               offset_in_ramblock(pss->block,
>                                  ((ram_addr_t)pss->page) << TARGET_PAGE_BITS));
> -    /* The offset we leave with is the min boundary of host page and block */
> -    pss->page = MIN(pss->page, hostpage_boundary);
>  
>      res = ram_save_release_protection(rs, pss, start_page);
>      return (res < 0 ? res : pages);
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [PATCH RFC 02/15] migration: Allow pss->page jump over clean pages

Posted by Peter Xu 4 years ago

On Wed, Jan 19, 2022 at 01:42:47PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > Commit ba1b7c812c ("migration/ram: Optimize ram_save_host_page()") managed to
> > optimize host huge page use case by scanning the dirty bitmap when looking for
> > the next dirty small page to migrate.
> > 
> > However when updating the pss->page before returning from that function, we
> > used MIN() of these two values: (1) next dirty bit, or (2) end of current sent
> > huge page, to fix up pss->page.
> > 
> > That sounds unnecessary, because I see nowhere that requires pss->page to be
> > not going over current huge page boundary.
> > 
> > What we need here is probably MAX() instead of MIN() so that we'll start
> > scanning from the next dirty bit next time. Since pss->page can't be smaller
> > than hostpage_boundary (the loop guarantees it), it probably means we don't
> > need to fix it up at all.
> > 
> > Cc: Keqian Zhu <zhukeqian1@huawei.com>
> > Cc: Kunkun Jiang <jiangkunkun@huawei.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> 
> Hmm, I think that's potentially necessary.  note that the start of
> ram_save_host_page stores the 'start_page' at entry.
> That' start_page' goes to the ram_save_release_protection and so
> I think it needs to be pagesize aligned for the mmap/uffd that happens.

Right, that's indeed a functional change, but IMHO it's also fine.

When reaching ram_save_release_protection(), what we guaranteed is that below
page range contains no dirty bits in ramblock dirty bitmap:

  range0 = [start_page, pss->page)

Side note: inclusive on start, but not inclusive on the end side of range0
(that is, pss->page can be pointing to a dirty page).

What ram_save_release_protection() does is to unprotect the pages and let them
run free.  If we're sure range0 contains no dirty page, it means we have
already copied them over into the snapshot, so IIUC it's safe to unprotect all
of it (even if it's already bigger than the host page size)?

That can be slightly less efficient for live snapshot in some extreme cases
(when unprotect, we'll need to walk the pgtables in the uffd ioctl()), but I
don't assume live snapshot to be run on a huge VM, so hopefully it's still
fine?  Not to mention it should make live migration a little bit faster,
assuming that's more frequently used..

Thanks,

-- 
Peter Xu

Re: [PATCH RFC 02/15] migration: Allow pss->page jump over clean pages

Posted by Dr. David Alan Gilbert 4 years ago

* Peter Xu (peterx@redhat.com) wrote:
> On Wed, Jan 19, 2022 at 01:42:47PM +0000, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > Commit ba1b7c812c ("migration/ram: Optimize ram_save_host_page()") managed to
> > > optimize host huge page use case by scanning the dirty bitmap when looking for
> > > the next dirty small page to migrate.
> > > 
> > > However when updating the pss->page before returning from that function, we
> > > used MIN() of these two values: (1) next dirty bit, or (2) end of current sent
> > > huge page, to fix up pss->page.
> > > 
> > > That sounds unnecessary, because I see nowhere that requires pss->page to be
> > > not going over current huge page boundary.
> > > 
> > > What we need here is probably MAX() instead of MIN() so that we'll start
> > > scanning from the next dirty bit next time. Since pss->page can't be smaller
> > > than hostpage_boundary (the loop guarantees it), it probably means we don't
> > > need to fix it up at all.
> > > 
> > > Cc: Keqian Zhu <zhukeqian1@huawei.com>
> > > Cc: Kunkun Jiang <jiangkunkun@huawei.com>
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > 
> > 
> > Hmm, I think that's potentially necessary.  note that the start of
> > ram_save_host_page stores the 'start_page' at entry.
> > That' start_page' goes to the ram_save_release_protection and so
> > I think it needs to be pagesize aligned for the mmap/uffd that happens.
> 
> Right, that's indeed a functional change, but IMHO it's also fine.
> 
> When reaching ram_save_release_protection(), what we guaranteed is that below
> page range contains no dirty bits in ramblock dirty bitmap:
> 
>   range0 = [start_page, pss->page)
> 
> Side note: inclusive on start, but not inclusive on the end side of range0
> (that is, pss->page can be pointing to a dirty page).
> 
> What ram_save_release_protection() does is to unprotect the pages and let them
> run free.  If we're sure range0 contains no dirty page, it means we have
> already copied them over into the snapshot, so IIUC it's safe to unprotect all
> of it (even if it's already bigger than the host page size)?

I think what's worrying me is the alignment of the address going into
UFFDIO_WRITEPROTECT in uffd_change_protection - if it was previously
huge page aligned and now isn't, what breaks? (Did it support
hugepages?)

> That can be slightly less efficient for live snapshot in some extreme cases
> (when unprotect, we'll need to walk the pgtables in the uffd ioctl()), but I
> don't assume live snapshot to be run on a huge VM, so hopefully it's still
> fine?  Not to mention it should make live migration a little bit faster,
> assuming that's more frequently used..

Hmm I don't think I understand that statement.

Dave

> 
> Thanks,
> 
> -- 
> Peter Xu
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [PATCH RFC 02/15] migration: Allow pss->page jump over clean pages

Posted by Peter Xu 3 years, 12 months ago

On Thu, Feb 03, 2022 at 06:19:22PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Wed, Jan 19, 2022 at 01:42:47PM +0000, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (peterx@redhat.com) wrote:
> > > > Commit ba1b7c812c ("migration/ram: Optimize ram_save_host_page()") managed to
> > > > optimize host huge page use case by scanning the dirty bitmap when looking for
> > > > the next dirty small page to migrate.
> > > > 
> > > > However when updating the pss->page before returning from that function, we
> > > > used MIN() of these two values: (1) next dirty bit, or (2) end of current sent
> > > > huge page, to fix up pss->page.
> > > > 
> > > > That sounds unnecessary, because I see nowhere that requires pss->page to be
> > > > not going over current huge page boundary.
> > > > 
> > > > What we need here is probably MAX() instead of MIN() so that we'll start
> > > > scanning from the next dirty bit next time. Since pss->page can't be smaller
> > > > than hostpage_boundary (the loop guarantees it), it probably means we don't
> > > > need to fix it up at all.
> > > > 
> > > > Cc: Keqian Zhu <zhukeqian1@huawei.com>
> > > > Cc: Kunkun Jiang <jiangkunkun@huawei.com>
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > 
> > > 
> > > Hmm, I think that's potentially necessary.  note that the start of
> > > ram_save_host_page stores the 'start_page' at entry.
> > > That' start_page' goes to the ram_save_release_protection and so
> > > I think it needs to be pagesize aligned for the mmap/uffd that happens.
> > 
> > Right, that's indeed a functional change, but IMHO it's also fine.
> > 
> > When reaching ram_save_release_protection(), what we guaranteed is that below
> > page range contains no dirty bits in ramblock dirty bitmap:
> > 
> >   range0 = [start_page, pss->page)
> > 
> > Side note: inclusive on start, but not inclusive on the end side of range0
> > (that is, pss->page can be pointing to a dirty page).
> > 
> > What ram_save_release_protection() does is to unprotect the pages and let them
> > run free.  If we're sure range0 contains no dirty page, it means we have
> > already copied them over into the snapshot, so IIUC it's safe to unprotect all
> > of it (even if it's already bigger than the host page size)?
> 
> I think what's worrying me is the alignment of the address going into
> UFFDIO_WRITEPROTECT in uffd_change_protection - if it was previously
> huge page aligned and now isn't, what breaks? (Did it support
> hugepages?)

Good point..

It doesn't support huge pages yet, but we'd better keep it always page aligned
for the unprotect ioctl.

> 
> > That can be slightly less efficient for live snapshot in some extreme cases
> > (when unprotect, we'll need to walk the pgtables in the uffd ioctl()), but I
> > don't assume live snapshot to be run on a huge VM, so hopefully it's still
> > fine?  Not to mention it should make live migration a little bit faster,
> > assuming that's more frequently used..
> 
> Hmm I don't think I understand that statement.

I meant since we've scanned over those clean pages we don't need to scan it
again in the next find_dirty_block() call for precopy, per the "faster"
statement.

But to make it simple I think I'll drop this patch in the next version.

Thanks!

-- 
Peter Xu