[PATCH v3 2/5] migration: Create MigrationState active field

Juan Quintela posted 5 patches 6 years ago
There is a newer version of this series
[PATCH v3 2/5] migration: Create MigrationState active field
Posted by Juan Quintela 6 years ago
Right now, there is no easy way to dectect if we have already
cancelled/finished/failed a migration.  This field is setup to true
when we start a migration, and it is set to false as soon as we stop
it.

It fixes a real bug, in ram_save_iterate() we call functions that
wrote to the channel even if we know that migration has stopped for
any reason.  This gives problems with multifd because we need to
synchronize various semoaphores that we don't want to take.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 5 +++++
 migration/migration.h | 5 +++++
 migration/ram.c       | 2 +-
 migration/savevm.c    | 2 ++
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 990bff00c0..60bc8710b6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1583,6 +1583,8 @@ static void migrate_fd_cancel(MigrationState *s)
     QEMUFile *f = migrate_get_current()->to_dst_file;
     trace_migrate_fd_cancel();
 
+    s->active = false;
+
     if (s->rp_state.from_dst_file) {
         /* shutdown the rp socket, so causing the rp thread to shutdown */
         qemu_file_shutdown(s->rp_state.from_dst_file);
@@ -2834,6 +2836,7 @@ static void migration_completion(MigrationState *s)
     }
 
     if (!migrate_colo_enabled()) {
+        s->active = false;
         migrate_set_state(&s->state, current_active_state,
                           MIGRATION_STATUS_COMPLETED);
     }
@@ -2859,6 +2862,7 @@ fail_invalidate:
     }
 
 fail:
+    s->active = false;
     migrate_set_state(&s->state, current_active_state,
                       MIGRATION_STATUS_FAILED);
 }
@@ -3289,6 +3293,7 @@ static void *migration_thread(void *opaque)
     }
 
     qemu_savevm_state_setup(s->to_dst_file);
+    s->active = true;
 
     if (qemu_savevm_nr_failover_devices()) {
         migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
diff --git a/migration/migration.h b/migration/migration.h
index aa9ff6f27b..e0386efe95 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -169,6 +169,11 @@ struct MigrationState
 
     int state;
 
+    /* Is the migration channel still open.  When migration finish,
+     * gets an error or is cancelled this becomes false.
+     */
+
+    bool active;
     /* State related to return path */
     struct {
         QEMUFile     *from_dst_file;
diff --git a/migration/ram.c b/migration/ram.c
index 8f9f3bba5b..44ca56e1ea 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3521,7 +3521,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     ram_control_after_iterate(f, RAM_CONTROL_ROUND);
 
 out:
-    if (ret >= 0) {
+    if (ret >= 0 && migrate_get_current()->active) {
         multifd_send_sync_main(rs);
         qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
         qemu_fflush(f);
diff --git a/migration/savevm.c b/migration/savevm.c
index adfdca26ac..3efde5b3dd 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1550,6 +1550,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     qemu_mutex_unlock_iothread();
     qemu_savevm_state_header(f);
     qemu_savevm_state_setup(f);
+    ms->active = true;
     qemu_mutex_lock_iothread();
 
     while (qemu_file_get_error(f) == 0) {
@@ -1574,6 +1575,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
         status = MIGRATION_STATUS_COMPLETED;
     }
     migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
+    ms->active = false;
 
     /* f is outer parameter, it should not stay in global migration state after
      * this function finished */
-- 
2.24.1


Re: [PATCH v3 2/5] migration: Create MigrationState active field
Posted by Dr. David Alan Gilbert 6 years ago
* Juan Quintela (quintela@redhat.com) wrote:
> Right now, there is no easy way to dectect if we have already
> cancelled/finished/failed a migration.  This field is setup to true
> when we start a migration, and it is set to false as soon as we stop
> it.
> 
> It fixes a real bug, in ram_save_iterate() we call functions that
> wrote to the channel even if we know that migration has stopped for
> any reason.  This gives problems with multifd because we need to
> synchronize various semoaphores that we don't want to take.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>

Why can't you use migration_is_active() in the ram.c case?
My preference would be just to stick with something derived
from the state rather than tacking another state bit on.

> ---
>  migration/migration.c | 5 +++++
>  migration/migration.h | 5 +++++
>  migration/ram.c       | 2 +-
>  migration/savevm.c    | 2 ++
>  4 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 990bff00c0..60bc8710b6 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1583,6 +1583,8 @@ static void migrate_fd_cancel(MigrationState *s)
>      QEMUFile *f = migrate_get_current()->to_dst_file;
>      trace_migrate_fd_cancel();
>  
> +    s->active = false;
> +
>      if (s->rp_state.from_dst_file) {
>          /* shutdown the rp socket, so causing the rp thread to shutdown */
>          qemu_file_shutdown(s->rp_state.from_dst_file);
> @@ -2834,6 +2836,7 @@ static void migration_completion(MigrationState *s)
>      }
>  
>      if (!migrate_colo_enabled()) {
> +        s->active = false;
>          migrate_set_state(&s->state, current_active_state,
>                            MIGRATION_STATUS_COMPLETED);

You've not always got these two the same way around - i.e. do you change
the state first or do you set the active state first?  I think it needs
to be consistent.

>      }
> @@ -2859,6 +2862,7 @@ fail_invalidate:
>      }
>  
>  fail:
> +    s->active = false;
>      migrate_set_state(&s->state, current_active_state,
>                        MIGRATION_STATUS_FAILED);
>  }
> @@ -3289,6 +3293,7 @@ static void *migration_thread(void *opaque)
>      }
>  
>      qemu_savevm_state_setup(s->to_dst_file);
> +    s->active = true;
>  
>      if (qemu_savevm_nr_failover_devices()) {
>          migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> diff --git a/migration/migration.h b/migration/migration.h
> index aa9ff6f27b..e0386efe95 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -169,6 +169,11 @@ struct MigrationState
>  
>      int state;
>  
> +    /* Is the migration channel still open.  When migration finish,
> +     * gets an error or is cancelled this becomes false.
> +     */
> +
> +    bool active;
>      /* State related to return path */
>      struct {
>          QEMUFile     *from_dst_file;
> diff --git a/migration/ram.c b/migration/ram.c
> index 8f9f3bba5b..44ca56e1ea 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3521,7 +3521,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      ram_control_after_iterate(f, RAM_CONTROL_ROUND);
>  
>  out:
> -    if (ret >= 0) {
> +    if (ret >= 0 && migrate_get_current()->active) {
>          multifd_send_sync_main(rs);
>          qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>          qemu_fflush(f);
> diff --git a/migration/savevm.c b/migration/savevm.c
> index adfdca26ac..3efde5b3dd 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1550,6 +1550,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
>      qemu_mutex_unlock_iothread();
>      qemu_savevm_state_header(f);
>      qemu_savevm_state_setup(f);
> +    ms->active = true;
>      qemu_mutex_lock_iothread();
>  
>      while (qemu_file_get_error(f) == 0) {
> @@ -1574,6 +1575,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
>          status = MIGRATION_STATUS_COMPLETED;
>      }
>      migrate_set_state(&ms->state, MIGRATION_STATUS_SETUP, status);
> +    ms->active = false;
>  
>      /* f is outer parameter, it should not stay in global migration state after
>       * this function finished */
> -- 
> 2.24.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v3 2/5] migration: Create MigrationState active field
Posted by Juan Quintela 6 years ago
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Right now, there is no easy way to dectect if we have already
>> cancelled/finished/failed a migration.  This field is setup to true
>> when we start a migration, and it is set to false as soon as we stop
>> it.
>> 
>> It fixes a real bug, in ram_save_iterate() we call functions that
>> wrote to the channel even if we know that migration has stopped for
>> any reason.  This gives problems with multifd because we need to
>> synchronize various semoaphores that we don't want to take.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> Why can't you use migration_is_active() in the ram.c case?
> My preference would be just to stick with something derived
> from the state rather than tacking another state bit on.

My plan was to go the other way around.
We need to use the state with atomics, I wanted a single way of deciding
if we are/or not in the middle of a migration.  Just now it is too
confusing on my opinion.

>> ---
>>  migration/migration.c | 5 +++++
>>  migration/migration.h | 5 +++++
>>  migration/ram.c       | 2 +-
>>  migration/savevm.c    | 2 ++
>>  4 files changed, 13 insertions(+), 1 deletion(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 990bff00c0..60bc8710b6 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1583,6 +1583,8 @@ static void migrate_fd_cancel(MigrationState *s)
>>      QEMUFile *f = migrate_get_current()->to_dst_file;
>>      trace_migrate_fd_cancel();
>>  
>> +    s->active = false;
>> +
>>      if (s->rp_state.from_dst_file) {
>>          /* shutdown the rp socket, so causing the rp thread to shutdown */
>>          qemu_file_shutdown(s->rp_state.from_dst_file);
>> @@ -2834,6 +2836,7 @@ static void migration_completion(MigrationState *s)
>>      }
>>  
>>      if (!migrate_colo_enabled()) {
>> +        s->active = false;
>>          migrate_set_state(&s->state, current_active_state,
>>                            MIGRATION_STATUS_COMPLETED);
>
> You've not always got these two the same way around - i.e. do you change
> the state first or do you set the active state first?  I think it needs
> to be consistent.

Ok.

Thanks, Juan.


Re: [PATCH v3 2/5] migration: Create MigrationState active field
Posted by Juan Quintela 6 years ago
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> Right now, there is no easy way to dectect if we have already
>> cancelled/finished/failed a migration.  This field is setup to true
>> when we start a migration, and it is set to false as soon as we stop
>> it.
>> 
>> It fixes a real bug, in ram_save_iterate() we call functions that
>> wrote to the channel even if we know that migration has stopped for
>> any reason.  This gives problems with multifd because we need to
>> synchronize various semoaphores that we don't want to take.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> Why can't you use migration_is_active() in the ram.c case?
> My preference would be just to stick with something derived
> from the state rather than tacking another state bit on.

Trying to redo this as something more reasonable.
Problem that I was trying to do is being sure that we know in what state
we are.  Real migration states are:

- NOT_STARTED: We haven't even started
- SETUP: We have started with local stuff but haven't yet transmitted
  anything
- ACTIVE: Migration is donig well, we are trasnmitting data
- FINISHED: We have finished migration (COMPLETED/FAILED/CANCELLED/CANCELLING)
- COLO: Yet a completelly different can of worms

To make things even more interesting, we export ->state, so code can do
whatever they want with that variable.

What do we need in a lot of places:
- migration_is_running() (i.e. channel is still open).

And we go left and right to be sure what is going on.

>> @@ -2834,6 +2836,7 @@ static void migration_completion(MigrationState *s)
>>      }
>>  
>>      if (!migrate_colo_enabled()) {
>> +        s->active = false;
>>          migrate_set_state(&s->state, current_active_state,
>>                            MIGRATION_STATUS_COMPLETED);
>
> You've not always got these two the same way around - i.e. do you change
> the state first or do you set the active state first?  I think it needs
> to be consistent.

As said, I will try to move that to inside migrate_set_state()

thanks, Juan.