[Qemu-devel] [PATCH v2] migration: incoming postcopy advise sanity checks

Greg Kurz posted 1 patch 6 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/151791154436.32601.15403203498591276038.stgit@bahia.lan
Test checkpatch passed
Test docker-mingw@fedora passed
Test docker-quick@centos6 passed
Test ppc passed
Test s390x passed
There is a newer version of this series
migration/savevm.c |   21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
[Qemu-devel] [PATCH v2] migration: incoming postcopy advise sanity checks
Posted by Greg Kurz 6 years, 1 month ago
If postcopy-ram was set on the source but not on the destination,
migration doesn't occur, the destination prints an error and boots
the guest:

qemu-system-ppc64: Expected vmdescription section, but got 0

We end up with two running instances.

This behaviour was introduced in 2.11 by commit 58110f0acb1a "migration:
split common postcopy out of ram postcopy" to prepare ground for the
upcoming dirty bitmap postcopy support. It adds a new case where the
source may send an empty postcopy advise because dirty bitmap doesn't
need to check page sizes like RAM postcopy does.

If the source has enabled postcopy-ram, then it sends an advise with
the page size values. If the destination hasn't enabled postcopy-ram,
then loadvm_postcopy_handle_advise() leaves the page size values on
the stream and returns. This confuses qemu_loadvm_state() later on
and causes the destination to start execution.

As discussed several times, postcopy-ram should be enabled both sides
to be functional. This patch changes the destination to perform some
extra checks on the advise length to ensure this is the case. Otherwise
an error is returned and migration is aborted.

Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
---
v2: - error out if postcopy-ram is enabled but the source hasn't sent the
      expected 16 byte advise
    - more descriptive message if postcopy-ram is disabled but the source
      has sent a 16 byte advise
---
 migration/savevm.c |   21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index b7908f62be3c..e97671c1f7bd 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1376,7 +1376,8 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
  * *might* happen - it might be skipped if precopy transferred everything
  * quickly.
  */
-static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
+static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
+                                         uint16_t len)
 {
     PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
     uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
@@ -1387,8 +1388,22 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
         return -1;
     }
 
-    if (!migrate_postcopy_ram()) {
+    switch (len) {
+    case 0:
+        if (migrate_postcopy_ram()) {
+            warn_report("RAM postcopy is enabled but have 0 byte advise");
+            return -EINVAL;
+        }
         return 0;
+    case 8 + 8:
+        if (!migrate_postcopy_ram()) {
+            error_report("RAM postcopy is disabled but have 16 byte advise");
+            return -EINVAL;
+        }
+        break;
+    default:
+        error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
+        return -EINVAL;
     }
 
     if (!postcopy_ram_supported_by_host(mis)) {
@@ -1807,7 +1822,7 @@ static int loadvm_process_command(QEMUFile *f)
         return loadvm_handle_cmd_packaged(mis);
 
     case MIG_CMD_POSTCOPY_ADVISE:
-        return loadvm_postcopy_handle_advise(mis);
+        return loadvm_postcopy_handle_advise(mis, len);
 
     case MIG_CMD_POSTCOPY_LISTEN:
         return loadvm_postcopy_handle_listen(mis);


Re: [Qemu-devel] [PATCH v2] migration: incoming postcopy advise sanity checks
Posted by Daniel Henrique Barboza 6 years, 1 month ago

On 02/06/2018 08:05 AM, Greg Kurz wrote:
> If postcopy-ram was set on the source but not on the destination,
> migration doesn't occur, the destination prints an error and boots
> the guest:
>
> qemu-system-ppc64: Expected vmdescription section, but got 0
>
> We end up with two running instances.
>
> This behaviour was introduced in 2.11 by commit 58110f0acb1a "migration:
> split common postcopy out of ram postcopy" to prepare ground for the
> upcoming dirty bitmap postcopy support. It adds a new case where the
> source may send an empty postcopy advise because dirty bitmap doesn't
> need to check page sizes like RAM postcopy does.
>
> If the source has enabled postcopy-ram, then it sends an advise with
> the page size values. If the destination hasn't enabled postcopy-ram,
> then loadvm_postcopy_handle_advise() leaves the page size values on
> the stream and returns. This confuses qemu_loadvm_state() later on
> and causes the destination to start execution.
>
> As discussed several times, postcopy-ram should be enabled both sides
> to be functional. This patch changes the destination to perform some
> extra checks on the advise length to ensure this is the case. Otherwise
> an error is returned and migration is aborted.
>
> Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>

> v2: - error out if postcopy-ram is enabled but the source hasn't sent the
>        expected 16 byte advise
>      - more descriptive message if postcopy-ram is disabled but the source
>        has sent a 16 byte advise
> ---
>   migration/savevm.c |   21 ++++++++++++++++++---
>   1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index b7908f62be3c..e97671c1f7bd 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1376,7 +1376,8 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>    * *might* happen - it might be skipped if precopy transferred everything
>    * quickly.
>    */
> -static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> +static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
> +                                         uint16_t len)
>   {
>       PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
>       uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
> @@ -1387,8 +1388,22 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
>           return -1;
>       }
>
> -    if (!migrate_postcopy_ram()) {
> +    switch (len) {
> +    case 0:
> +        if (migrate_postcopy_ram()) {
> +            warn_report("RAM postcopy is enabled but have 0 byte advise");
> +            return -EINVAL;
> +        }
>           return 0;
> +    case 8 + 8:
> +        if (!migrate_postcopy_ram()) {
> +            error_report("RAM postcopy is disabled but have 16 byte advise");
> +            return -EINVAL;
> +        }
> +        break;
> +    default:
> +        error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
> +        return -EINVAL;
>       }
>
>       if (!postcopy_ram_supported_by_host(mis)) {
> @@ -1807,7 +1822,7 @@ static int loadvm_process_command(QEMUFile *f)
>           return loadvm_handle_cmd_packaged(mis);
>
>       case MIG_CMD_POSTCOPY_ADVISE:
> -        return loadvm_postcopy_handle_advise(mis);
> +        return loadvm_postcopy_handle_advise(mis, len);
>
>       case MIG_CMD_POSTCOPY_LISTEN:
>           return loadvm_postcopy_handle_listen(mis);
>
>


Re: [Qemu-devel] [PATCH v2] migration: incoming postcopy advise sanity checks
Posted by Vladimir Sementsov-Ogievskiy 6 years, 1 month ago
06.02.2018 13:05, Greg Kurz wrote:
> If postcopy-ram was set on the source but not on the destination,
> migration doesn't occur, the destination prints an error and boots
> the guest:
>
> qemu-system-ppc64: Expected vmdescription section, but got 0
>
> We end up with two running instances.
>
> This behaviour was introduced in 2.11 by commit 58110f0acb1a "migration:
> split common postcopy out of ram postcopy" to prepare ground for the
> upcoming dirty bitmap postcopy support. It adds a new case where the
> source may send an empty postcopy advise because dirty bitmap doesn't
> need to check page sizes like RAM postcopy does.
>
> If the source has enabled postcopy-ram, then it sends an advise with
> the page size values. If the destination hasn't enabled postcopy-ram,
> then loadvm_postcopy_handle_advise() leaves the page size values on
> the stream and returns. This confuses qemu_loadvm_state() later on
> and causes the destination to start execution.
>
> As discussed several times, postcopy-ram should be enabled both sides
> to be functional. This patch changes the destination to perform some
> extra checks on the advise length to ensure this is the case. Otherwise
> an error is returned and migration is aborted.
>
> Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> ---
> v2: - error out if postcopy-ram is enabled but the source hasn't sent the
>        expected 16 byte advise
>      - more descriptive message if postcopy-ram is disabled but the source
>        has sent a 16 byte advise
> ---
>   migration/savevm.c |   21 ++++++++++++++++++---
>   1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index b7908f62be3c..e97671c1f7bd 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1376,7 +1376,8 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>    * *might* happen - it might be skipped if precopy transferred everything
>    * quickly.
>    */
> -static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> +static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
> +                                         uint16_t len)
>   {
>       PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
>       uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
> @@ -1387,8 +1388,22 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
>           return -1;
>       }
>   
> -    if (!migrate_postcopy_ram()) {
> +    switch (len) {
> +    case 0:
> +        if (migrate_postcopy_ram()) {
> +            warn_report("RAM postcopy is enabled but have 0 byte advise");
> +            return -EINVAL;

I think, if we return -EINVAL, it should be error_report, not 
warn_report. However, I don't insist, so with or without s/warn/error/,

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

And thank you for investigating this, it's actually my fault.

> +        }
>           return 0;
> +    case 8 + 8:
> +        if (!migrate_postcopy_ram()) {
> +            error_report("RAM postcopy is disabled but have 16 byte advise");
> +            return -EINVAL;
> +        }
> +        break;
> +    default:
> +        error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
> +        return -EINVAL;
>       }
>   
>       if (!postcopy_ram_supported_by_host(mis)) {
> @@ -1807,7 +1822,7 @@ static int loadvm_process_command(QEMUFile *f)
>           return loadvm_handle_cmd_packaged(mis);
>   
>       case MIG_CMD_POSTCOPY_ADVISE:
> -        return loadvm_postcopy_handle_advise(mis);
> +        return loadvm_postcopy_handle_advise(mis, len);
>   
>       case MIG_CMD_POSTCOPY_LISTEN:
>           return loadvm_postcopy_handle_listen(mis);
>


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH v2] migration: incoming postcopy advise sanity checks
Posted by Dr. David Alan Gilbert 6 years, 1 month ago
* Greg Kurz (groug@kaod.org) wrote:
> If postcopy-ram was set on the source but not on the destination,
> migration doesn't occur, the destination prints an error and boots
> the guest:
> 
> qemu-system-ppc64: Expected vmdescription section, but got 0
> 
> We end up with two running instances.
> 
> This behaviour was introduced in 2.11 by commit 58110f0acb1a "migration:
> split common postcopy out of ram postcopy" to prepare ground for the
> upcoming dirty bitmap postcopy support. It adds a new case where the
> source may send an empty postcopy advise because dirty bitmap doesn't
> need to check page sizes like RAM postcopy does.
> 
> If the source has enabled postcopy-ram, then it sends an advise with
> the page size values. If the destination hasn't enabled postcopy-ram,
> then loadvm_postcopy_handle_advise() leaves the page size values on
> the stream and returns. This confuses qemu_loadvm_state() later on
> and causes the destination to start execution.
> 
> As discussed several times, postcopy-ram should be enabled both sides
> to be functional. This patch changes the destination to perform some
> extra checks on the advise length to ensure this is the case. Otherwise
> an error is returned and migration is aborted.
> 
> Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>

Thanks for finding and fixing this.

Queued.

Dave

> ---
> v2: - error out if postcopy-ram is enabled but the source hasn't sent the
>       expected 16 byte advise
>     - more descriptive message if postcopy-ram is disabled but the source
>       has sent a 16 byte advise
> ---
>  migration/savevm.c |   21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index b7908f62be3c..e97671c1f7bd 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1376,7 +1376,8 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis);
>   * *might* happen - it might be skipped if precopy transferred everything
>   * quickly.
>   */
> -static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
> +static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis,
> +                                         uint16_t len)
>  {
>      PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
>      uint64_t remote_pagesize_summary, local_pagesize_summary, remote_tps;
> @@ -1387,8 +1388,22 @@ static int loadvm_postcopy_handle_advise(MigrationIncomingState *mis)
>          return -1;
>      }
>  
> -    if (!migrate_postcopy_ram()) {
> +    switch (len) {
> +    case 0:
> +        if (migrate_postcopy_ram()) {
> +            warn_report("RAM postcopy is enabled but have 0 byte advise");
> +            return -EINVAL;
> +        }
>          return 0;
> +    case 8 + 8:
> +        if (!migrate_postcopy_ram()) {
> +            error_report("RAM postcopy is disabled but have 16 byte advise");
> +            return -EINVAL;
> +        }
> +        break;
> +    default:
> +        error_report("CMD_POSTCOPY_ADVISE invalid length (%d)", len);
> +        return -EINVAL;
>      }
>  
>      if (!postcopy_ram_supported_by_host(mis)) {
> @@ -1807,7 +1822,7 @@ static int loadvm_process_command(QEMUFile *f)
>          return loadvm_handle_cmd_packaged(mis);
>  
>      case MIG_CMD_POSTCOPY_ADVISE:
> -        return loadvm_postcopy_handle_advise(mis);
> +        return loadvm_postcopy_handle_advise(mis, len);
>  
>      case MIG_CMD_POSTCOPY_LISTEN:
>          return loadvm_postcopy_handle_listen(mis);
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK