[Qemu-devel] [PATCH] migration: fix exec/fd migrations

Juan Quintela posted 1 patch 7 years, 5 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20180523091411.1073-1-quintela@redhat.com
Test checkpatch passed
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test s390x passed
migration/exec.c | 4 ++++
migration/fd.c   | 4 ++++
2 files changed, 8 insertions(+)
[Qemu-devel] [PATCH] migration: fix exec/fd migrations
Posted by Juan Quintela 7 years, 5 months ago
Commit:

commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
Author: Juan Quintela <quintela@redhat.com>
Date:   Wed Mar 7 08:40:52 2018 +0100

    migration: Delay start of migration main routines

Missed tcp and fd transports.  This fix its.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/exec.c | 4 ++++
 migration/fd.c   | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/migration/exec.c b/migration/exec.c
index 9d0f82f1f0..0bbeb63c97 100644
--- a/migration/exec.c
+++ b/migration/exec.c
@@ -20,6 +20,7 @@
 #include "qemu/osdep.h"
 #include "channel.h"
 #include "exec.h"
+#include "migration.h"
 #include "io/channel-command.h"
 #include "trace.h"
 
@@ -48,6 +49,9 @@ static gboolean exec_accept_incoming_migration(QIOChannel *ioc,
 {
     migration_channel_process_incoming(ioc);
     object_unref(OBJECT(ioc));
+    if (!migrate_use_multifd()) {
+        migration_incoming_process();
+    }
     return G_SOURCE_REMOVE;
 }
 
diff --git a/migration/fd.c b/migration/fd.c
index 9a380bbbc4..fee34ffdc0 100644
--- a/migration/fd.c
+++ b/migration/fd.c
@@ -17,6 +17,7 @@
 #include "qemu/osdep.h"
 #include "channel.h"
 #include "fd.h"
+#include "migration.h"
 #include "monitor/monitor.h"
 #include "io/channel-util.h"
 #include "trace.h"
@@ -48,6 +49,9 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
 {
     migration_channel_process_incoming(ioc);
     object_unref(OBJECT(ioc));
+    if (!migrate_use_multifd()) {
+        migration_incoming_process();
+    }
     return G_SOURCE_REMOVE;
 }
 
-- 
2.17.0


Re: [Qemu-devel] [PATCH] migration: fix exec/fd migrations
Posted by Kevin Wolf 7 years, 5 months ago
Am 23.05.2018 um 11:14 hat Juan Quintela geschrieben:
> Commit:
> 
> commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
> Author: Juan Quintela <quintela@redhat.com>
> Date:   Wed Mar 7 08:40:52 2018 +0100
> 
>     migration: Delay start of migration main routines
> 
> Missed tcp and fd transports.  This fix its.
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>

In my first test run, qemu-iotests 169 still timed out with a message
that qemu recevied signal 6, but I haven't been able to reproduce it
since. Quite possible that that's an old, unrelated problem. Other than
that, the tests are passing again. I suppose this is enough for:

Tested-by: Kevin Wolf <kwolf@redhat.com>

Re: [Qemu-devel] [PATCH] migration: fix exec/fd migrations
Posted by Peter Xu 7 years, 5 months ago
On Wed, May 23, 2018 at 11:14:11AM +0200, Juan Quintela wrote:
> Commit:
> 
> commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
> Author: Juan Quintela <quintela@redhat.com>
> Date:   Wed Mar 7 08:40:52 2018 +0100
> 
>     migration: Delay start of migration main routines
> 
> Missed tcp and fd transports.  This fix its.
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  migration/exec.c | 4 ++++
>  migration/fd.c   | 4 ++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/migration/exec.c b/migration/exec.c
> index 9d0f82f1f0..0bbeb63c97 100644
> --- a/migration/exec.c
> +++ b/migration/exec.c
> @@ -20,6 +20,7 @@
>  #include "qemu/osdep.h"
>  #include "channel.h"
>  #include "exec.h"
> +#include "migration.h"
>  #include "io/channel-command.h"
>  #include "trace.h"
>  
> @@ -48,6 +49,9 @@ static gboolean exec_accept_incoming_migration(QIOChannel *ioc,
>  {
>      migration_channel_process_incoming(ioc);
>      object_unref(OBJECT(ioc));
> +    if (!migrate_use_multifd()) {
> +        migration_incoming_process();
> +    }
>      return G_SOURCE_REMOVE;
>  }
>  
> diff --git a/migration/fd.c b/migration/fd.c
> index 9a380bbbc4..fee34ffdc0 100644
> --- a/migration/fd.c
> +++ b/migration/fd.c
> @@ -17,6 +17,7 @@
>  #include "qemu/osdep.h"
>  #include "channel.h"
>  #include "fd.h"
> +#include "migration.h"
>  #include "monitor/monitor.h"
>  #include "io/channel-util.h"
>  #include "trace.h"
> @@ -48,6 +49,9 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc,
>  {
>      migration_channel_process_incoming(ioc);
>      object_unref(OBJECT(ioc));
> +    if (!migrate_use_multifd()) {
> +        migration_incoming_process();
> +    }
>      return G_SOURCE_REMOVE;
>  }

We are calling migration_incoming_process() everywhere...  But
actually we have a single entrance at
migration_ioc_process_incoming().  How about we still use a single
entrance for migration instead (and actually I just noticed that
postcopy recovery should be broken now on master since now we don't
call migration_fd_process_incoming at all...).  I mean something like
this (even not tested with compile, just to show what I mean):

diff --git a/migration/migration.c b/migration/migration.c
index 05aec2c905..fb27daf940 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -468,7 +468,6 @@ void migration_fd_process_incoming(QEMUFile *f)
         qemu_sem_post(&mis->postcopy_pause_sem_dst);
     } else {
         /* New incoming migration */
-        migration_incoming_setup(f);
         migration_incoming_process();
     }
 }
@@ -476,13 +475,24 @@ void migration_fd_process_incoming(QEMUFile *f)
 void migration_ioc_process_incoming(QIOChannel *ioc)
 {
     MigrationIncomingState *mis = migration_incoming_get_current();
+    bool start_migration = true;
 
     if (!mis->from_src_file) {
         QEMUFile *f = qemu_fopen_channel_input(ioc);
         migration_incoming_setup(f);
-        return;
+        if (migrate_use_multifd()) {
+            /* We need to wait until all channels settled */
+            return;
+        }
+    }
+
+    if (migrate_use_multifd()) {
+        start_migration = multifd_recv_new_channel(ioc);
+    }
+
+    if (start_migration) {
+        migration_fd_process_incoming(mis->from_src_file);
     }
-    multifd_recv_new_channel(ioc);
 }
 
 /**
diff --git a/migration/ram.c b/migration/ram.c
index 5bcbf7a9f9..dad3ed03b8 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -866,7 +866,8 @@ bool multifd_recv_all_channels_created(void)
     return thread_count == atomic_read(&multifd_recv_state->count);
 }
 
-void multifd_recv_new_channel(QIOChannel *ioc)
+/* Return true if we have all the channels ready; otherwise false */
+bool multifd_recv_new_channel(QIOChannel *ioc)
 {
     MultiFDRecvParams *p;
     Error *local_err = NULL;
@@ -892,9 +893,8 @@ void multifd_recv_new_channel(QIOChannel *ioc)
     qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p,
                        QEMU_THREAD_JOINABLE);
     atomic_inc(&multifd_recv_state->count);
-    if (multifd_recv_state->count == migrate_multifd_channels()) {
-        migration_incoming_process();
-    }
+
+    return multifd_recv_state->count == migrate_multifd_channels();
 }
 
 /**
diff --git a/migration/rdma.c b/migration/rdma.c
index 7d233b0820..6f8c6d9abc 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3696,6 +3696,7 @@ static void rdma_accept_incoming_migration(void *opaque)
     }
 
     rdma->migration_started_on_destination = 1;
+    migration_incoming_setup(f);
     migration_fd_process_incoming(f);
 }
 
diff --git a/migration/socket.c b/migration/socket.c
index 3456eb76e9..18321c9605 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -170,10 +170,6 @@ static void socket_accept_incoming_migration(QIONetListener *listener,
         qio_net_listener_disconnect(listener);
 
         object_unref(OBJECT(listener));
-
-        if (!migrate_use_multifd()) {
-            migration_incoming_process();
-        }
     }
 }
 
This patch will make sure we call migration_incoming_process() only
once, logically speaking it should also fix the breakage of postcopy
recovery.

What do you think?

CC Dan too.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH] migration: fix exec/fd migrations
Posted by John Snow 7 years, 5 months ago

On 05/23/2018 05:14 AM, Juan Quintela wrote:
> Commit:
> 
> commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
> Author: Juan Quintela <quintela@redhat.com>
> Date:   Wed Mar 7 08:40:52 2018 +0100
> 
>     migration: Delay start of migration main routines
> 
> Missed tcp and fd transports.  This fix its.
> 
> Reported-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>

Fixes things for me, but I see that Peter Xu has more concerns.

Would be happy with checking this in for now and following up with
better refactors so that iotests works again in the meantime.

Tested-by: John Snow <jsnow@redhat.com>

Re: [Qemu-devel] [PATCH] migration: fix exec/fd migrations
Posted by Juan Quintela 7 years, 5 months ago
John Snow <jsnow@redhat.com> wrote:
> On 05/23/2018 05:14 AM, Juan Quintela wrote:
>> Commit:
>> 
>> commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
>> Author: Juan Quintela <quintela@redhat.com>
>> Date:   Wed Mar 7 08:40:52 2018 +0100
>> 
>>     migration: Delay start of migration main routines
>> 
>> Missed tcp and fd transports.  This fix its.
>> 
>> Reported-by: Kevin Wolf <kwolf@redhat.com>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> Fixes things for me, but I see that Peter Xu has more concerns.
>
> Would be happy with checking this in for now and following up with
> better refactors so that iotests works again in the meantime.
>
> Tested-by: John Snow <jsnow@redhat.com>

That is my plan.  Will send a pull request Tomorrow, and we can go from
there.  Peter suggestion is good but requires developtemt and testing.

Thanks, Juan.