[PATCH 13/20] migration: Move channel setup out of postcopy_try_recover()

Peter Xu posted 20 patches 3 years, 11 months ago
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eric Blake <eblake@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Juan Quintela <quintela@redhat.com>, Thomas Huth <thuth@redhat.com>, Markus Armbruster <armbru@redhat.com>
There is a newer version of this series
[PATCH 13/20] migration: Move channel setup out of postcopy_try_recover()
Posted by Peter Xu 3 years, 11 months ago
We used to use postcopy_try_recover() to replace migration_incoming_setup() to
setup incoming channels.  That's fine for the old world, but in the new world
there can be more than one channels that need setup.  Better move the channel
setup out of it so that postcopy_try_recover() only handles the last phase of
switching to the recovery phase.

To do that in migration_fd_process_incoming(), move the postcopy_try_recover()
call to be after migration_incoming_setup(), which will setup the channels.
While in migration_ioc_process_incoming(), postpone the recover() routine right
before we'll jump into migration_incoming_process().

A side benefit is we don't need to pass in QEMUFile* to postcopy_try_recover()
anymore.  Remove it.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/migration.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 67520d3105..b2e6446457 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -665,19 +665,20 @@ void migration_incoming_process(void)
 }
 
 /* Returns true if recovered from a paused migration, otherwise false */
-static bool postcopy_try_recover(QEMUFile *f)
+static bool postcopy_try_recover(void)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
 
     if (mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
         /* Resumed from a paused postcopy migration */
 
-        mis->from_src_file = f;
+        /* This should be set already in migration_incoming_setup() */
+        assert(mis->from_src_file);
         /* Postcopy has standalone thread to do vm load */
-        qemu_file_set_blocking(f, true);
+        qemu_file_set_blocking(mis->from_src_file, true);
 
         /* Re-configure the return path */
-        mis->to_src_file = qemu_file_get_return_path(f);
+        mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
 
         migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
                           MIGRATION_STATUS_POSTCOPY_RECOVER);
@@ -698,11 +699,10 @@ static bool postcopy_try_recover(QEMUFile *f)
 
 void migration_fd_process_incoming(QEMUFile *f, Error **errp)
 {
-    if (postcopy_try_recover(f)) {
+    if (!migration_incoming_setup(f, errp)) {
         return;
     }
-
-    if (!migration_incoming_setup(f, errp)) {
+    if (postcopy_try_recover()) {
         return;
     }
     migration_incoming_process();
@@ -718,11 +718,6 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
         /* The first connection (multifd may have multiple) */
         QEMUFile *f = qemu_fopen_channel_input(ioc);
 
-        /* If it's a recovery, we're done */
-        if (postcopy_try_recover(f)) {
-            return;
-        }
-
         if (!migration_incoming_setup(f, errp)) {
             return;
         }
@@ -743,6 +738,10 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
     }
 
     if (start_migration) {
+        /* If it's a recovery, we're done */
+        if (postcopy_try_recover()) {
+            return;
+        }
         migration_incoming_process();
     }
 }
-- 
2.32.0


Re: [PATCH 13/20] migration: Move channel setup out of postcopy_try_recover()
Posted by Dr. David Alan Gilbert 3 years, 11 months ago
* Peter Xu (peterx@redhat.com) wrote:
> We used to use postcopy_try_recover() to replace migration_incoming_setup() to
> setup incoming channels.  That's fine for the old world, but in the new world
> there can be more than one channels that need setup.  Better move the channel
> setup out of it so that postcopy_try_recover() only handles the last phase of
> switching to the recovery phase.
> 
> To do that in migration_fd_process_incoming(), move the postcopy_try_recover()
> call to be after migration_incoming_setup(), which will setup the channels.
> While in migration_ioc_process_incoming(), postpone the recover() routine right
> before we'll jump into migration_incoming_process().
> 
> A side benefit is we don't need to pass in QEMUFile* to postcopy_try_recover()
> anymore.  Remove it.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>

OK, but note one question below:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 67520d3105..b2e6446457 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -665,19 +665,20 @@ void migration_incoming_process(void)
>  }
>  
>  /* Returns true if recovered from a paused migration, otherwise false */
> -static bool postcopy_try_recover(QEMUFile *f)
> +static bool postcopy_try_recover(void)
>  {
>      MigrationIncomingState *mis = migration_incoming_get_current();
>  
>      if (mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
>          /* Resumed from a paused postcopy migration */
>  
> -        mis->from_src_file = f;
> +        /* This should be set already in migration_incoming_setup() */
> +        assert(mis->from_src_file);
>          /* Postcopy has standalone thread to do vm load */
> -        qemu_file_set_blocking(f, true);
> +        qemu_file_set_blocking(mis->from_src_file, true);

Does that set_blocking happen on the 2nd channel somewhere?

Dave

>  
>          /* Re-configure the return path */
> -        mis->to_src_file = qemu_file_get_return_path(f);
> +        mis->to_src_file = qemu_file_get_return_path(mis->from_src_file);
>  
>          migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
>                            MIGRATION_STATUS_POSTCOPY_RECOVER);
> @@ -698,11 +699,10 @@ static bool postcopy_try_recover(QEMUFile *f)
>  
>  void migration_fd_process_incoming(QEMUFile *f, Error **errp)
>  {
> -    if (postcopy_try_recover(f)) {
> +    if (!migration_incoming_setup(f, errp)) {
>          return;
>      }
> -
> -    if (!migration_incoming_setup(f, errp)) {
> +    if (postcopy_try_recover()) {
>          return;
>      }
>      migration_incoming_process();
> @@ -718,11 +718,6 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>          /* The first connection (multifd may have multiple) */
>          QEMUFile *f = qemu_fopen_channel_input(ioc);
>  
> -        /* If it's a recovery, we're done */
> -        if (postcopy_try_recover(f)) {
> -            return;
> -        }
> -
>          if (!migration_incoming_setup(f, errp)) {
>              return;
>          }
> @@ -743,6 +738,10 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>      }
>  
>      if (start_migration) {
> +        /* If it's a recovery, we're done */
> +        if (postcopy_try_recover()) {
> +            return;
> +        }
>          migration_incoming_process();
>      }
>  }
> -- 
> 2.32.0
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH 13/20] migration: Move channel setup out of postcopy_try_recover()
Posted by Peter Xu 3 years, 11 months ago
On Tue, Feb 22, 2022 at 10:57:34AM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > We used to use postcopy_try_recover() to replace migration_incoming_setup() to
> > setup incoming channels.  That's fine for the old world, but in the new world
> > there can be more than one channels that need setup.  Better move the channel
> > setup out of it so that postcopy_try_recover() only handles the last phase of
> > switching to the recovery phase.
> > 
> > To do that in migration_fd_process_incoming(), move the postcopy_try_recover()
> > call to be after migration_incoming_setup(), which will setup the channels.
> > While in migration_ioc_process_incoming(), postpone the recover() routine right
> > before we'll jump into migration_incoming_process().
> > 
> > A side benefit is we don't need to pass in QEMUFile* to postcopy_try_recover()
> > anymore.  Remove it.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> 
> OK, but note one question below:
> 
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Thanks.

> 
> > ---
> >  migration/migration.c | 23 +++++++++++------------
> >  1 file changed, 11 insertions(+), 12 deletions(-)
> > 
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 67520d3105..b2e6446457 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -665,19 +665,20 @@ void migration_incoming_process(void)
> >  }
> >  
> >  /* Returns true if recovered from a paused migration, otherwise false */
> > -static bool postcopy_try_recover(QEMUFile *f)
> > +static bool postcopy_try_recover(void)
> >  {
> >      MigrationIncomingState *mis = migration_incoming_get_current();
> >  
> >      if (mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
> >          /* Resumed from a paused postcopy migration */
> >  
> > -        mis->from_src_file = f;
> > +        /* This should be set already in migration_incoming_setup() */
> > +        assert(mis->from_src_file);
> >          /* Postcopy has standalone thread to do vm load */
> > -        qemu_file_set_blocking(f, true);
> > +        qemu_file_set_blocking(mis->from_src_file, true);
> 
> Does that set_blocking happen on the 2nd channel somewhere?

Nop.  I think the rational is that by default all channels are blocking.

Then what happened is: migration code only sets the main channel to
non-blocking on incoming, that's in migration_incoming_setup().  Hence for
postcopy recovery we need to tweak it to blocking here.

The 2nd new channel is not operated by migration_incoming_setup(), but by
postcopy_preempt_new_channel(), so it keeps the original blocking state,
which should be blocking.

If we want to make that clear, we can proactively set non-blocking too in
postcopy_preempt_new_channel() on the 2nd channel.  It's just that it
should be optional as long as blocking is the default for any new fd of a
socket.

Thanks,

-- 
Peter Xu