[PATCH v6 3/3] multifd: Only flush once each full round of memory

Juan Quintela posted 3 patches 2 years, 11 months ago
Maintainers: Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Yanan Wang <wangyanan55@huawei.com>, Juan Quintela <quintela@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>
[PATCH v6 3/3] multifd: Only flush once each full round of memory
Posted by Juan Quintela 2 years, 11 months ago
We need to add a new flag that marks the point at which to flush.
Note that we still flush at the end of the setup and complete
stages.

Signed-off-by: Juan Quintela <quintela@redhat.com>

---

Add missing qemu_fflush(); all tests now pass consistently.
---
 qapi/migration.json   |  3 ---
 migration/migration.c |  6 +-----
 migration/ram.c       | 28 +++++++++++++++++++++++++++-
 3 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 3afd81174d..34e1657c4e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -491,9 +491,6 @@
 #                                    suboptimal (we flush too many
 #                                    times).
 #                                    Default value is false.
-#                                    Setting this capability has no
-#                                    effect until the patch that
-#                                    removes this comment.
 #                                    (since 8.0)
 #
 # Features:
diff --git a/migration/migration.c b/migration/migration.c
index cfba0da005..74bcc16848 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2706,11 +2706,7 @@ bool migrate_multifd_flush_after_each_section(void)
 {
     MigrationState *s = migrate_get_current();
 
-    /*
-     * Until the patch that remove this comment, we always return that
-     * the capability is enabled.
-     */
-    return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
 }
 
 bool migrate_pause_before_switchover(void)
diff --git a/migration/ram.c b/migration/ram.c
index 6191dac9af..bc5eb1640b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -85,6 +85,7 @@
 #define RAM_SAVE_FLAG_XBZRLE   0x40
 /* 0x80 is reserved in qemu-file.h for RAM_SAVE_FLAG_HOOK */
 #define RAM_SAVE_FLAG_COMPRESS_PAGE    0x100
+#define RAM_SAVE_FLAG_MULTIFD_FLUSH    0x200
 /* We can't use any flag that is bigger than 0x200 */
 
 int (*xbzrle_encode_buffer_func)(uint8_t *, uint8_t *, int,
@@ -1595,6 +1596,7 @@ retry:
  * associated with the search process.
  *
  * Returns:
+ *         <0: An error happened
  *         PAGE_ALL_CLEAN: no dirty page found, give up
  *         PAGE_TRY_AGAIN: no dirty page found, retry for next block
  *         PAGE_DIRTY_FOUND: dirty page found
@@ -1622,6 +1624,15 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
         pss->page = 0;
         pss->block = QLIST_NEXT_RCU(pss->block, next);
         if (!pss->block) {
+            if (!migrate_multifd_flush_after_each_section()) {
+                QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
+                int ret = multifd_send_sync_main(f);
+                if (ret < 0) {
+                    return ret;
+                }
+                qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+                qemu_fflush(f);
+            }
             /*
              * If memory migration starts over, we will meet a dirtied page
              * which may still exists in compression threads's ring, so we
@@ -2614,6 +2625,9 @@ static int ram_find_and_save_block(RAMState *rs)
                     break;
                 } else if (res == PAGE_TRY_AGAIN) {
                     continue;
+                } else if (res < 0) {
+                    pages = res;
+                    break;
                 }
             }
         }
@@ -3300,6 +3314,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         return ret;
     }
 
+    if (!migrate_multifd_flush_after_each_section()) {
+        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+    }
+
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     qemu_fflush(f);
 
@@ -3485,6 +3503,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         return ret;
     }
 
+    if (!migrate_multifd_flush_after_each_section()) {
+        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+    }
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     qemu_fflush(f);
 
@@ -4169,7 +4190,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
             }
             decompress_data_with_multi_threads(f, page_buffer, len);
             break;
-
+        case RAM_SAVE_FLAG_MULTIFD_FLUSH:
+            multifd_recv_sync_main();
+            break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
             if (migrate_multifd_flush_after_each_section()) {
@@ -4443,6 +4466,9 @@ static int ram_load_precopy(QEMUFile *f)
                 break;
             }
             break;
+        case RAM_SAVE_FLAG_MULTIFD_FLUSH:
+            multifd_recv_sync_main();
+            break;
         case RAM_SAVE_FLAG_EOS:
             /* normal exit */
             if (migrate_multifd_flush_after_each_section()) {
-- 
2.39.1
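
To make the framing change concrete, here is a minimal standalone sketch
(plain C, not QEMU code) of where the sender now syncs.  The FLAG_* values
mirror the RAM_SAVE_FLAG_* constants above (EOS is 0x10 in migration/ram.c);
emit(), sync_channels() and send_round() are illustrative stand-ins for
qemu_put_be64(), multifd_send_sync_main() + qemu_fflush(), and the
dirty-page loop:

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define FLAG_MULTIFD_FLUSH 0x200 /* new: one sync per full pass over RAM */
#define FLAG_EOS           0x10  /* end of section: emitted far more often */

static void emit(uint64_t flag)
{
    printf("    put_be64(0x%" PRIx64 ")\n", flag);
}

static void sync_channels(const char *why)
{
    printf("    sync + flush all multifd channels (%s)\n", why);
}

static void send_round(bool flush_after_each_section, int sections)
{
    for (int s = 0; s < sections; s++) {
        printf("  section %d: send dirty pages...\n", s);
        if (flush_after_each_section) {
            /* Old stream format: sync at every section boundary. */
            sync_channels("per section");
        }
        emit(FLAG_EOS);
    }
    /*
     * In the patch the sync actually happens when find_dirty_block()
     * wraps past the last RAMBlock, i.e. once per complete pass over
     * guest memory rather than literally after a fixed section count.
     */
    if (!flush_after_each_section) {
        sync_channels("per full pass");
        emit(FLAG_MULTIFD_FLUSH);
    }
}

int main(void)
{
    printf("capability on (flush after each section):\n");
    send_round(true, 3);
    printf("capability off (flush once per full round):\n");
    send_round(false, 3);
    return 0;
}
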
Re: [PATCH v6 3/3] multifd: Only flush once each full round of memory
Posted by Peter Xu 2 years, 11 months ago
On Wed, Feb 15, 2023 at 07:02:31PM +0100, Juan Quintela wrote:
> We need to add a new flag that marks the point at which to flush.
> Note that we still flush at the end of the setup and complete
> stages.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>

One nitpick below.

> [...]
> @@ -4169,7 +4190,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
>              }
>              decompress_data_with_multi_threads(f, page_buffer, len);
>              break;
> -
> +        case RAM_SAVE_FLAG_MULTIFD_FLUSH:
> +            multifd_recv_sync_main();
> +            break;
>          case RAM_SAVE_FLAG_EOS:
>              /* normal exit */
>              if (migrate_multifd_flush_after_each_section()) {

We could have dropped RAM_SAVE_FLAG_MULTIFD_FLUSH and RAM_SAVE_FLAG_EOS for
now until we support postcopy+multifd.

Here it's not only about enabling them together, it's about running
them in parallel, which I doubt can really be implemented at all due
to the fundamental conceptual differences between multifd & postcopy.. :(

> [...]

-- 
Peter Xu
Re: [PATCH v6 3/3] multifd: Only flush once each full round of memory
Posted by Juan Quintela 2 years, 11 months ago
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Feb 15, 2023 at 07:02:31PM +0100, Juan Quintela wrote:
>> We need to add a new flag that marks the point at which to flush.
>> Note that we still flush at the end of the setup and complete
>> stages.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>
>
> One nitpick below.

Thanks.
>> [...]
>
> We could have dropped RAM_SAVE_FLAG_MULTIFD_FLUSH and RAM_SAVE_FLAG_EOS for
> now until we support postcopy+multifd.

I don't think so.

We have this curse of biblical proportions called backwards compatibility.

We need to mark the beginning and end of sections.  That is independent
of multifd.
And for multifd we have to flush all channels at the end of each
iteration through RAM.  We could do that without involving the main
thread, but I don't see the point of doing that.

> Here it's not only about enabling them together, it's about running
> them in parallel, which I doubt can really be implemented at all due
> to the fundamental conceptual differences between multifd & postcopy.. :(

Later, Juan.
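
A sketch of the receiver-side dispatch Juan describes: EOS always marks a
section boundary, and whether the sync happens there or at the new flag
depends on the negotiated capability.  This is illustrative plain C, not
QEMU code; handle_flag() stands in for the switch in ram_load_precopy()
(ram_load_postcopy() gains the same case in the patch):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define FLAG_EOS           0x10
#define FLAG_MULTIFD_FLUSH 0x200

static void multifd_recv_sync(void)
{
    printf("  sync: wait for every multifd channel\n");
}

/* One dispatch step, mirroring the switch in ram_load_precopy(). */
static void handle_flag(uint64_t flag, bool flush_after_each_section)
{
    switch (flag) {
    case FLAG_MULTIFD_FLUSH:
        /* Only sent when the capability is off: a full RAM pass ended. */
        multifd_recv_sync();
        break;
    case FLAG_EOS:
        printf("  end of section\n");
        if (flush_after_each_section) {
            /* Old stream format: every section boundary is a sync point. */
            multifd_recv_sync();
        }
        break;
    }
}

int main(void)
{
    /* A stream as the new sender produces it: sections, then one flush. */
    uint64_t stream[] = { FLAG_EOS, FLAG_EOS, FLAG_MULTIFD_FLUSH, FLAG_EOS };
    for (size_t i = 0; i < sizeof(stream) / sizeof(stream[0]); i++) {
        handle_flag(stream[i], false);
    }
    return 0;
}
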
Re: [PATCH v6 3/3] multifd: Only flush once each full round of memory
Posted by Peter Xu 2 years, 11 months ago
On Thu, Feb 16, 2023 at 12:00:55PM +0100, Juan Quintela wrote:
> >> [...]
> >
> > We could have dropped RAM_SAVE_FLAG_MULTIFD_FLUSH and RAM_SAVE_FLAG_EOS for
> > now until we support postcopy+multifd.
> 
> I don't think so.
> 
> We have this curse of biblical proportions called backwards compatibility.
> 
> We need to mark the beginning and end of sections.  That is independent
> of multifd.
> And for multifd we have to flush all channels at the end of each
> iteration through RAM.  We could do that without involving the main
> thread, but I don't see the point of doing that.

Oops, sorry, I didn't mean to drop the RAM_SAVE_FLAG_EOS flag itself,
but the calls to multifd_recv_sync_main().

RAM_SAVE_FLAG_MULTIFD_FLUSH as a whole can be dropped.

Thanks,

-- 
Peter Xu