[Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy

Vladimir Sementsov-Ogievskiy posted 16 patches 8 years, 3 months ago
There is a newer version of this series
[Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Vladimir Sementsov-Ogievskiy 8 years, 3 months ago
Split common postcopy stuff from ram postcopy stuff.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 migration/migration.c | 39 ++++++++++++++++++++++++++-------------
 migration/migration.h |  2 ++
 migration/savevm.c    | 48 +++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 67 insertions(+), 22 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 51ccd1a4c5..d3a2fd405a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1221,6 +1221,11 @@ bool migrate_postcopy_ram(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_RAM];
 }
 
+bool migrate_postcopy(void)
+{
+    return migrate_postcopy_ram();
+}
+
 bool migrate_auto_converge(void)
 {
     MigrationState *s;
@@ -1577,9 +1582,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
      * need to tell the destination to throw any pages it's already received
      * that are dirty
      */
-    if (ram_postcopy_send_discard_bitmap(ms)) {
-        error_report("postcopy send discard bitmap failed");
-        goto fail;
+    if (migrate_postcopy_ram()) {
+        if (ram_postcopy_send_discard_bitmap(ms)) {
+            error_report("postcopy send discard bitmap failed");
+            goto fail;
+        }
     }
 
     /*
@@ -1588,8 +1595,10 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
      * wrap their state up here
      */
     qemu_file_set_rate_limit(ms->to_dst_file, INT64_MAX);
-    /* Ping just for debugging, helps line traces up */
-    qemu_savevm_send_ping(ms->to_dst_file, 2);
+    if (migrate_postcopy_ram()) {
+        /* Ping just for debugging, helps line traces up */
+        qemu_savevm_send_ping(ms->to_dst_file, 2);
+    }
 
     /*
      * While loading the device state we may trigger page transfer
@@ -1614,7 +1623,9 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
     qemu_savevm_send_postcopy_listen(fb);
 
     qemu_savevm_state_complete_precopy(fb, false, false);
-    qemu_savevm_send_ping(fb, 3);
+    if (migrate_postcopy_ram()) {
+        qemu_savevm_send_ping(fb, 3);
+    }
 
     qemu_savevm_send_postcopy_run(fb);
 
@@ -1649,11 +1660,13 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
 
     qemu_mutex_unlock_iothread();
 
-    /*
-     * Although this ping is just for debug, it could potentially be
-     * used for getting a better measurement of downtime at the source.
-     */
-    qemu_savevm_send_ping(ms->to_dst_file, 4);
+    if (migrate_postcopy_ram()) {
+        /*
+         * Although this ping is just for debug, it could potentially be
+         * used for getting a better measurement of downtime at the source.
+         */
+        qemu_savevm_send_ping(ms->to_dst_file, 4);
+    }
 
     if (migrate_release_ram()) {
         ram_postcopy_migrated_memory_release(ms);
@@ -1831,7 +1844,7 @@ static void *migration_thread(void *opaque)
         qemu_savevm_send_ping(s->to_dst_file, 1);
     }
 
-    if (migrate_postcopy_ram()) {
+    if (migrate_postcopy()) {
         /*
          * Tell the destination that we *might* want to do postcopy later;
          * if the other end can't do postcopy it should fail now, nice and
@@ -1864,7 +1877,7 @@ static void *migration_thread(void *opaque)
             if (pending_size && pending_size >= threshold_size) {
                 /* Still a significant amount to transfer */
 
-                if (migrate_postcopy_ram() &&
+                if (migrate_postcopy() &&
                     s->state != MIGRATION_STATUS_POSTCOPY_ACTIVE &&
                     pend_nonpost <= threshold_size &&
                     atomic_read(&s->start_postcopy)) {
diff --git a/migration/migration.h b/migration/migration.h
index 148c9facbc..1d974bacce 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -165,6 +165,8 @@ bool migration_is_blocked(Error **errp);
 bool migration_in_postcopy(void);
 MigrationState *migrate_get_current(void);
 
+bool migrate_postcopy(void);
+
 bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
diff --git a/migration/savevm.c b/migration/savevm.c
index 37da83509f..cf79e1d3ac 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -89,7 +89,7 @@ static struct mig_cmd_args {
     [MIG_CMD_INVALID]          = { .len = -1, .name = "INVALID" },
     [MIG_CMD_OPEN_RETURN_PATH] = { .len =  0, .name = "OPEN_RETURN_PATH" },
     [MIG_CMD_PING]             = { .len = sizeof(uint32_t), .name = "PING" },
-    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = 16, .name = "POSTCOPY_ADVISE" },
+    [MIG_CMD_POSTCOPY_ADVISE]  = { .len = -1, .name = "POSTCOPY_ADVISE" },
     [MIG_CMD_POSTCOPY_LISTEN]  = { .len =  0, .name = "POSTCOPY_LISTEN" },
     [MIG_CMD_POSTCOPY_RUN]     = { .len =  0, .name = "POSTCOPY_RUN" },
     [MIG_CMD_POSTCOPY_RAM_DISCARD] = {
@@ -98,6 +98,23 @@ static struct mig_cmd_args {
     [MIG_CMD_MAX]              = { .len = -1, .name = "MAX" },
 };
 
+/* Note for MIG_CMD_POSTCOPY_ADVISE:
+ * The format of the arguments depends on the postcopy mode:
+ * - postcopy RAM only
+ *   uint64_t host page size
+ *   uint64_t target page size
+ *
+ * - postcopy RAM and postcopy dirty bitmaps
+ *   format is the same as for postcopy RAM only
+ *
+ * - postcopy dirty bitmaps only
+ *   Nothing. Command length field is 0.
+ *
+ * Be careful: adding a new postcopy entity with some other parameters should
+ * not break the format's self-description ability. A good way is to introduce
+ * some generic extensible format with an exception for the two old entities.
+ */
+
 static int announce_self_create(uint8_t *buf,
                                 uint8_t *mac_addr)
 {
@@ -861,12 +878,17 @@ int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len)
 /* Send prior to any postcopy transfer */
 void qemu_savevm_send_postcopy_advise(QEMUFile *f)
 {
-    uint64_t tmp[2];
-    tmp[0] = cpu_to_be64(ram_pagesize_summary());
-    tmp[1] = cpu_to_be64(qemu_target_page_size());
+    if (migrate_postcopy_ram()) {
+        uint64_t tmp[2];
+        tmp[0] = cpu_to_be64(ram_pagesize_summary());
+        tmp[1] = cpu_to_be64(qemu_target_page_size());
 
-    trace_qemu_savevm_send_postcopy_advise();
-    qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 16, (uint8_t *)tmp);
+        trace_qemu_savevm_send_postcopy_advise();
+        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE,
+                                 16, (uint8_t *)tmp);
+    } else {
+        qemu_savevm_command_send(f, MIG_CMD_POSTCOPY_ADVISE, 0, NULL);
+    }
 }
 
 /* Sent prior to starting the destination running in postcopy, discard pages
@@ -1352,6 +1374,10 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
         return -1;
     }
 
+    if (!migrate_postcopy_ram()) {
+        return 0;
+    }
+
     if (!postcopy_ram_supported_by_host()) {
         postcopy_state_set(POSTCOPY_INCOMING_NONE);
         return -1;
@@ -1562,7 +1588,9 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
          * A rare case, we entered listen without having to do any discards,
          * so do the setup that's normally done at the time of the 1st discard.
          */
-        postcopy_ram_prepare_discard(mis);
+        if (migrate_postcopy_ram()) {
+            postcopy_ram_prepare_discard(mis);
+        }
     }
 
     /*
@@ -1570,8 +1598,10 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
      * However, at this point the CPU shouldn't be running, and the IO
      * shouldn't be doing anything yet so don't actually expect requests
      */
-    if (postcopy_ram_enable_notify(mis)) {
-        return -1;
+    if (migrate_postcopy_ram()) {
+        if (postcopy_ram_enable_notify(mis)) {
+            return -1;
+        }
     }
 
     if (mis->have_listen_thread) {
-- 
2.11.1


Re: [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Dr. David Alan Gilbert 8 years, 3 months ago
* Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> Split common postcopy staff from ram postcopy staff.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I'm OK with this; I'm not sure I'd have bothered making the pings
dependent on it being RAM.

(These first few are pretty much a separable series).

Note a few things below I'd prefer if you reroll:

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 39 ++++++++++++++++++++++++++-------------
>  migration/migration.h |  2 ++
>  migration/savevm.c    | 48 +++++++++++++++++++++++++++++++++++++++---------
>  3 files changed, 67 insertions(+), 22 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 51ccd1a4c5..d3a2fd405a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1221,6 +1221,11 @@ bool migrate_postcopy_ram(void)
>      return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_RAM];
>  }
>  
> +bool migrate_postcopy(void)
> +{
> +    return migrate_postcopy_ram();
> +}
> +
>  bool migrate_auto_converge(void)
>  {
>      MigrationState *s;
> @@ -1577,9 +1582,11 @@ static int postcopy_start(MigrationState *ms, bool *old_vm_running)
>       * need to tell the destination to throw any pages it's already received
>       * that are dirty
>       */
> -    if (ram_postcopy_send_discard_bitmap(ms)) {
> -        error_report("postcopy send discard bitmap failed");
> -        goto fail;
> +    if (migrate_postcopy_ram()) {
> +        if (ram_postcopy_send_discard_bitmap(ms)) {

I'd rather:
       if (migrate_postcopy_ram() &&
           ram_postcopy_send_discard_bitmap(ms)) {

avoiding one extra layer of {}'s

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Vladimir Sementsov-Ogievskiy 8 years, 3 months ago
11.07.2017 16:06, Dr. David Alan Gilbert wrote:
> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>> Split common postcopy staff from ram postcopy staff.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> I'm OK with this; I'm not sure I'd have bothered making the ping's
> dependent on it being RAM.
>
> (These first few are pretty much a separable series).

It would be great if you (or Juan?) could take them separately.



-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by John Snow 8 years, 2 months ago

On 07/11/2017 09:38 AM, Vladimir Sementsov-Ogievskiy wrote:
> 11.07.2017 16:06, Dr. David Alan Gilbert wrote:
>> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>>> Split common postcopy staff from ram postcopy staff.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> I'm OK with this; I'm not sure I'd have bothered making the ping's
>> dependent on it being RAM.
>>
>> (These first few are pretty much a separable series).
> 
> It would be grate if you (or Juan?) can take them separately.
> 

Yes please. I don't think this ever happened, did it? Can we split off
1-3 and re-roll?

[Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Vladimir Sementsov-Ogievskiy 8 years, 1 month ago
ping for 1-3
Can we merge them?




-- 
Best regards,
Vladimir


Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Dr. David Alan Gilbert 8 years, 1 month ago
* Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> ping for 1-3
> Can we merge them?

I see all of them have R-b's; so let's try and put them in the next
migration merge.

Quintela: Sound good?

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Juan Quintela 8 years, 1 month ago
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
>> ping for 1-3
>> Can we merge them?
>
> I see all of them have R-b's; so lets try and put them in the next
> migration merge.
>
> Quintela: Sound good?

Yeap.

Later, Juan.

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Kevin Wolf 8 years, 1 month ago
Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> >> ping for 1-3
> >> Can we merge them?
> >
> > I see all of them have R-b's; so lets try and put them in the next
> > migration merge.
> >
> > Quintela: Sound good?
> 
> Yeap.

This patch broke qemu-iotests 181 ('Test postcopy live migration with
shared storage'):

--- /home/kwolf/source/qemu/tests/qemu-iotests/181.out  2017-06-16 19:19:53.000000000 +0200
+++ 181.out.bad 2017-09-25 15:20:40.787582000 +0200
@@ -21,18 +21,16 @@
 === Do some I/O on the destination ===
 
 QEMU X.Y.Z monitor - type 'help' for more information
-(qemu) qemu-io disk "read -P 0x55 0 64k"
+(qemu) QEMU_PROG: Expected vmdescription section, but got 0
+QEMU_PROG: Failed to get "write" lock
+Is another process using the image?
+qemu-io disk "read -P 0x55 0 64k"
 read 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 (qemu) 
 (qemu) qemu-io disk "write -P 0x66 1M 64k"
-wrote 65536/65536 bytes at offset 1048576
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-
-=== Shut down and check image ===
-
-(qemu) quit
-(qemu) 
-(qemu) quit
-No errors were found on the image.
-*** done
+QEMU_PROG: block/io.c:1359: bdrv_aligned_pwritev: Assertion `child->perm & BLK_PERM_WRITE' failed.
+./common.config: Aborted                 (core dumped) ( if [ -n "${QEMU_NEED_PID}" ]; then
+echo $BASHPID > "${QEMU_TEST_DIR}/qemu-${_QEMU_HANDLE}.pid";
+fi; exec "$QEMU_PROG" $QEMU_OPTIONS "$@" )
+Timeout waiting for ops/sec on handle 1

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Vladimir Sementsov-Ogievskiy 8 years, 1 month ago
25.09.2017 16:23, Kevin Wolf wrote:
> Am 20.09.2017 um 13:45 hat Juan Quintela geschrieben:
>> [...]
> This patch broke qemu-iotests 181 ('Test postcopy live migration with
> shared storage'):
>
> [...]

Not sure about the locking (I don't see this error on my old kernel
without OFD locking), but it looks like the 181 test should be fixed to
set the postcopy-ram capability on the target too (which was considered
the correct way on the list).
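
Concretely, "set on the target too" means enabling the capability on the
destination monitor as well, the same way the test already does on the
source; roughly like this (the socket path is illustrative, and 181 goes
through its helper functions rather than typing this directly):

```
# Source monitor:
(qemu) migrate_set_capability postcopy-ram on

# Destination monitor (the QEMU started with -incoming) as well:
(qemu) migrate_set_capability postcopy-ram on

# Then, on the source:
(qemu) migrate -d unix:/tmp/migrate.sock
(qemu) migrate_start_postcopy
```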

-- 
Best regards,
Vladimir


Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Kevin Wolf 8 years, 1 month ago
Am 25.09.2017 um 16:31 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 25.09.2017 16:23, Kevin Wolf wrote:
> > [...]
> > This patch broke qemu-iotests 181 ('Test postcopy live migration with
> > shared storage'):
> > 
> > [...]
> 
> Not sure about the locking (I don't see this error on my old kernel without
> OFD locking), but it looks like the 181 test should be fixed to set the
> postcopy-ram capability on the target too (which was considered the correct
> way on the list).

Whatever you think the preferred way to set up postcopy migration is: If
something worked before this patch and doesn't after it, that's a
regression and breaks backwards compatibility.

If we were talking about a graceful failure, maybe we could discuss
whether carefully and deliberately breaking compatibility could be
justified in this specific case. But the breakage is neither mentioned
in the commit message nor is it graceful, so I can only call it a bug.

Kevin

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Vladimir Sementsov-Ogievskiy 8 years, 1 month ago
25.09.2017 17:58, Kevin Wolf wrote:
> Am 25.09.2017 um 16:31 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 25.09.2017 16:23, Kevin Wolf wrote:
>>> [...]
>>> This patch broke qemu-iotests 181 ('Test postcopy live migration with
>>> shared storage'):
>>>
>>> [...]
>> Not sure about the locking (I don't see this error on my old kernel without
>> OFD locking), but it looks like the 181 test should be fixed to set the
>> postcopy-ram capability on the target too (which was considered the correct
>> way on the list).
> Whatever you think the preferred way to set up postcopy migration is: If
> something worked before this patch and doesn't after it, that's a
> regression and breaks backwards compatibility.
>
> If we were talking about a graceful failure, maybe we could discuss
> whether carefully and deliberately breaking compatibility could be
> justified in this specific case. But the breakage is neither mentioned
> in the commit message nor is it graceful, so I can only call it a bug.
>
> Kevin

It's of course my fault; I don't mean "it's the wrong test, so it's not my
problem" :) And I've already sent a patch.

-- 
Best regards,
Vladimir


Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Dr. David Alan Gilbert 8 years, 1 month ago
* Vladimir Sementsov-Ogievskiy (vsementsov@virtuozzo.com) wrote:
> 25.09.2017 17:58, Kevin Wolf wrote:
> > > [...]
> > Whatever you think the preferred way to set up postcopy migration is: If
> > something worked before this patch and doesn't after it, that's a
> > regression and breaks backwards compatibility.
> > 
> > If we were talking about a graceful failure, maybe we could discuss
> > whether carefully and deliberately breaking compatibility could be
> > justified in this specific case. But the breakage is neither mentioned
> > in the commit message nor is it graceful, so I can only call it a bug.
> > 
> > Kevin
> 
> It's of course my fault; I don't mean "it's the wrong test, so it's not my
> problem" :) And I've already sent a patch.

Why does this fail so badly, asserts etc - I was hoping for something
a bit more obvious from the migration code.

postcopy did originally work without the destination having the flag on,
but setting the flag on the destination was always good practice because
it detected early on whether the host support was there.

Dave

> -- 
> Best regards,
> Vladimir
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Kevin Wolf 8 years, 1 month ago
Am 25.09.2017 um 17:27 hat Dr. David Alan Gilbert geschrieben:
> > [...]
> 
> Why does this fail so badly, asserts etc - I was hoping for something
> a bit more obvious from the migration code.
> 
> postcopy did originally work without the destination having the flag on,
> but setting the flag on the destination was always good practice because
> it detected early on whether the host support was there.

So what does this mean for 2.11? Do you think it is acceptable breaking
cases where the flag isn't set on the destination?

If so, just changing the test case is enough. But if not, I'd rather
keep the test case as it is and fix only the migration code.

Kevin

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Dr. David Alan Gilbert 8 years, 1 month ago
* Kevin Wolf (kwolf@redhat.com) wrote:
> Am 25.09.2017 um 17:27 hat Dr. David Alan Gilbert geschrieben:
> > > [...]
> > Why does this fail so badly, asserts etc - I was hoping for something
> > a bit more obvious from the migration code.
> > 
> > postcopy did originally work without the destination having the flag on,
> > but setting the flag on the destination was always good practice because
> > it detected early on whether the host support was there.
> 
> So what does this mean for 2.11? Do you think it is acceptable breaking
> cases where the flag isn't set on the destination?

I think so, because we've always recommended setting it on the
destination for the early detection.

> If so, just changing the test case is enough. But if not, I'd rather
> keep the test case as it is and fix only the migration code.

I'd take the test case fix, but I also want to dig into why it fails
so badly; it would be nice just to have a clean failure telling you
that postcopy was expected.

Dave

> 
> Kevin
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] ping Re: [PATCH v7 03/16] migration: split common postcopy out of ram postcopy
Posted by Kevin Wolf 8 years, 1 month ago
Am 26.09.2017 um 12:21 hat Dr. David Alan Gilbert geschrieben:
> * Kevin Wolf (kwolf@redhat.com) wrote:
> > [...]
> > 
> > So what does this mean for 2.11? Do you think it is acceptable breaking
> > cases where the flag isn't set on the destination?
> 
> I think so, because we've always recommended setting it on the
> destination for the early detection.

Okay, I'll include the test case patch in my pull request today then.

> > If so, just changing the test case is enough. But if not, I'd rather
> > keep the test case as it is and fix only the migration code.
> 
> I'd take the test case fix, but I also want to dig into why it fails
> so badly; it would be nice just to have a clean failure telling you
> that postcopy was expected.

Yes, that would be nice.

Kevin