QEMU will abort when the vhost-user process is restarted during migration
and vhost_log_global_start/stop is called. The reason is that
vhost_dev_set_log returns -1 because the network connection is lost.

To handle this situation, let's cancel the migration by setting the migrate
state to failure and reporting it to the user.
---
hw/virtio/vhost.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index ddc42f0..92725f7 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -26,6 +26,8 @@
#include "hw/virtio/virtio-bus.h"
#include "hw/virtio/virtio-access.h"
#include "migration/blocker.h"
+#include "migration/migration.h"
+#include "migration/qemu-file.h"
#include "sysemu/dma.h"
/* enabled until disconnected backend stabilizes */
@@ -885,7 +887,10 @@ static void vhost_log_global_start(MemoryListener *listener)
r = vhost_migration_log(listener, true);
if (r < 0) {
- abort();
+ error_report("Failed to start vhost dirty log");
+ if (migrate_get_current()->migration_thread_running) {
+ qemu_file_set_error(migrate_get_current()->to_dst_file, -ECHILD);
+ }
}
}
@@ -895,7 +900,10 @@ static void vhost_log_global_stop(MemoryListener *listener)
r = vhost_migration_log(listener, false);
if (r < 0) {
- abort();
+ error_report("Failed to stop vhost dirty log");
+ if (migrate_get_current()->migration_thread_running) {
+ qemu_file_set_error(migrate_get_current()->to_dst_file, -ECHILD);
+ }
}
}
--
1.8.3.1
On Fri, Dec 01, 2017 at 01:58:32PM +0800, fangying wrote:
> QEMU will abort when the vhost-user process is restarted during migration
> and vhost_log_global_start/stop is called. The reason is that
> vhost_dev_set_log returns -1 because the network connection is lost.
>
> To handle this situation, let's cancel the migration by setting the migrate
> state to failure and reporting it to the user.
In fact I don't see this as the right way to fix it. Backend is dead so why
not just proceed with migration? We just need to make sure we re-send
migration data on re-connect.
> ---
> hw/virtio/vhost.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> index ddc42f0..92725f7 100644
> --- a/hw/virtio/vhost.c
> +++ b/hw/virtio/vhost.c
> @@ -26,6 +26,8 @@
> #include "hw/virtio/virtio-bus.h"
> #include "hw/virtio/virtio-access.h"
> #include "migration/blocker.h"
> +#include "migration/migration.h"
> +#include "migration/qemu-file.h"
> #include "sysemu/dma.h"
>
> /* enabled until disconnected backend stabilizes */
> @@ -885,7 +887,10 @@ static void vhost_log_global_start(MemoryListener *listener)
>
> r = vhost_migration_log(listener, true);
> if (r < 0) {
> - abort();
> + error_report("Failed to start vhost dirty log");
> + if (migrate_get_current()->migration_thread_running) {
> + qemu_file_set_error(migrate_get_current()->to_dst_file, -ECHILD);
> + }
> }
> }
>
> @@ -895,7 +900,10 @@ static void vhost_log_global_stop(MemoryListener *listener)
>
> r = vhost_migration_log(listener, false);
> if (r < 0) {
> - abort();
> + error_report("Failed to stop vhost dirty log");
> + if (migrate_get_current()->migration_thread_running) {
> + qemu_file_set_error(migrate_get_current()->to_dst_file, -ECHILD);
> + }
> }
> }
>
> --
> 1.8.3.1
>
On 2017/12/1 22:39, Michael S. Tsirkin wrote:
> On Fri, Dec 01, 2017 at 01:58:32PM +0800, fangying wrote:
>> QEMU will abort when the vhost-user process is restarted during migration
>> and vhost_log_global_start/stop is called. The reason is that
>> vhost_dev_set_log returns -1 because the network connection is lost.
>>
>> To handle this situation, let's cancel the migration by setting the migrate
>> state to failure and reporting it to the user.
>
> In fact I don't see this as the right way to fix it. Backend is dead so why
> not just proceed with migration? We just need to make sure we re-send
> migration data on re-connect.
This is where vhost starts/stops the migration dirty log. The original code
aborts qemu here because the vhost data stream may break down if we fail to
start/stop the vhost dirty log during migration. The backend may be active
again after vhost_log_global_start:

dirty log start ----------------- dirty log stop
      ^                       ^
      |                       |
----- backend dead ----- backend active

Currently we don't re-send migration data on re-connect in this situation.
Maybe we should work it out.
>> ---
>> hw/virtio/vhost.c | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index ddc42f0..92725f7 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -26,6 +26,8 @@
>> #include "hw/virtio/virtio-bus.h"
>> #include "hw/virtio/virtio-access.h"
>> #include "migration/blocker.h"
>> +#include "migration/migration.h"
>> +#include "migration/qemu-file.h"
>> #include "sysemu/dma.h"
>>
>> /* enabled until disconnected backend stabilizes */
>> @@ -885,7 +887,10 @@ static void vhost_log_global_start(MemoryListener *listener)
>>
>> r = vhost_migration_log(listener, true);
>> if (r < 0) {
>> - abort();
>> + error_report("Failed to start vhost dirty log");
>> + if (migrate_get_current()->migration_thread_running) {
>> + qemu_file_set_error(migrate_get_current()->to_dst_file, -ECHILD);
>> + }
>> }
>> }
>>
>> @@ -895,7 +900,10 @@ static void vhost_log_global_stop(MemoryListener *listener)
>>
>> r = vhost_migration_log(listener, false);
>> if (r < 0) {
>> - abort();
>> + error_report("Failed to stop vhost dirty log");
>> + if (migrate_get_current()->migration_thread_running) {
>> + qemu_file_set_error(migrate_get_current()->to_dst_file, -ECHILD);
>> + }
>> }
>> }
>>
>> --
>> 1.8.3.1
>>
>
> .
>
On Wed, Dec 06, 2017 at 09:30:27PM +0800, Ying Fang wrote:
>
> On 2017/12/1 22:39, Michael S. Tsirkin wrote:
> > On Fri, Dec 01, 2017 at 01:58:32PM +0800, fangying wrote:
> >> QEMU will abort when the vhost-user process is restarted during migration
> >> and vhost_log_global_start/stop is called. The reason is that
> >> vhost_dev_set_log returns -1 because the network connection is lost.
> >>
> >> To handle this situation, let's cancel the migration by setting the migrate
> >> state to failure and reporting it to the user.
> >
> > In fact I don't see this as the right way to fix it. Backend is dead so why
> > not just proceed with migration? We just need to make sure we re-send
> > migration data on re-connect.
>
> This is where vhost starts/stops the migration dirty log. The original code
> aborts qemu here because the vhost data stream may break down if we fail to
> start/stop the vhost dirty log during migration. The backend may be active
> again after vhost_log_global_start:
>
> dirty log start ----------------- dirty log stop
>       ^                       ^
>       |                       |
> ----- backend dead ----- backend active

I'm sorry, I don't understand yet. Backend is active after logging started -
why is this a problem?

> Currently we don't re-send migration data on re-connect in this situation.
> Maybe we should work it out.

So basically backend connects after logging started, and we do not tell it
to start logging and where - is that the issue? I agree, that would be a
bug then.

-- 
MST
On 2017/12/7 0:34, Michael S. Tsirkin wrote:
> On Wed, Dec 06, 2017 at 09:30:27PM +0800, Ying Fang wrote:
>>
>> On 2017/12/1 22:39, Michael S. Tsirkin wrote:
>>> On Fri, Dec 01, 2017 at 01:58:32PM +0800, fangying wrote:
>>>> QEMU will abort when the vhost-user process is restarted during migration
>>>> and vhost_log_global_start/stop is called. The reason is that
>>>> vhost_dev_set_log returns -1 because the network connection is lost.
>>>>
>>>> To handle this situation, let's cancel the migration by setting the migrate
>>>> state to failure and reporting it to the user.
>>>
>>> In fact I don't see this as the right way to fix it. Backend is dead so why
>>> not just proceed with migration? We just need to make sure we re-send
>>> migration data on re-connect.
>>
>> This is where vhost starts/stops the migration dirty log. The original code
>> aborts qemu here because the vhost data stream may break down if we fail to
>> start/stop the vhost dirty log during migration. The backend may be active
>> again after vhost_log_global_start:
>>
>> dirty log start ----------------- dirty log stop
>>       ^                       ^
>>       |                       |
>> ----- backend dead ----- backend active
>
> I'm sorry, I don't understand yet. Backend is active after logging started -
> why is this a problem?

Sorry, I did not explain it well. If the backend is dead when dirty log start
is called, vhost_dev_set_log/vhost_dev_set_features may fail because the
connection is temporarily lost. So even if migration is in progress and the
vhost-user backend is active again later, vhost-user dirty memory is not
logged.

>
>> Currently we don't re-send migration data on re-connect in this situation.
>> Maybe we should work it out.
>
> So basically backend connects after logging started, and we do not tell it
> to start logging and where - is that the issue? I agree, that would be a
> bug then.
>

Yes, this is just the issue.