From: Ivan Ren <ivanren@tencent.com>
This patch fixes a multifd migration bug in migration speed calculation.
The problem can be reproduced as follows:
1. start a VM and apply heavy memory write stress so that it cannot be
   successfully migrated to the destination
2. begin a migration with multifd
3. migrate for a long time [in practice, this can be measured by
   transferred bytes]
4. cancel the migration
5. begin a new migration with multifd; the migration will directly run
   into the migration_completion phase
The reason is as follows:

Migration updates the bandwidth and s->threshold_size in
migration_update_counters after BUFFER_DELAY time:

    current_bytes = migration_total_bytes(s);
    transferred = current_bytes - s->iteration_initial_bytes;
    time_spent = current_time - s->iteration_start_time;
    bandwidth = (double)transferred / time_spent;
    s->threshold_size = bandwidth * s->parameters.downtime_limit;
In multifd migration, migration_total_bytes returns
qemu_ftell(s->to_dst_file) + ram_counters.multifd_bytes.
s->iteration_initial_bytes is initialized to 0 at every new migration,
but ram_counters is a global variable, so data from previous migrations
accumulates in it. If ram_counters.multifd_bytes is big enough, it can
make pending_size >= s->threshold_size evaluate to false in
migration_iteration_run right after the first migration_update_counters
call.
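
To make the failure mode concrete, here is a worked example with made-up
numbers (it assumes QEMU's BUFFER_DELAY of 100 ms and the default
downtime_limit of 300 ms; the 100 GiB figure is purely illustrative):

    /* leftover in the global counters from the cancelled migration */
    ram_counters.multifd_bytes ~= 100 GiB

    /* first migration_update_counters of the new migration */
    transferred    = (100 GiB + a few MiB actually sent) - 0
    time_spent     = 100 ms
    bandwidth      = ~1 GiB per millisecond   /* wildly inflated */
    threshold_size = bandwidth * 300 ms = ~300 GiB

No realistic pending_size reaches ~300 GiB, so pending_size >=
s->threshold_size is false and migration_iteration_run proceeds straight
to migration_completion.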
Signed-off-by: Ivan Ren <ivanren@tencent.com>
---
 migration/migration.c | 15 ++++++++++++++-
 migration/savevm.c    |  1 +
 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 8a607fe1e2..d35a6ae6f9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1908,6 +1908,11 @@ static bool migrate_prepare(MigrationState *s, bool blk, bool blk_inc,
     }

     migrate_init(s);
+    /*
+     * Set ram_counters to zero for a
+     * new migration.
+     */
+    memset(&ram_counters, 0, sizeof(ram_counters));

     return true;
 }
@@ -3187,6 +3192,10 @@ static void *migration_thread(void *opaque)

     object_ref(OBJECT(s));
     s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+    /*
+     * Update s->iteration_initial_bytes to match s->iteration_start_time.
+     */
+    s->iteration_initial_bytes = migration_total_bytes(s);

     qemu_savevm_state_header(s->to_dst_file);
@@ -3252,7 +3261,11 @@ static void *migration_thread(void *opaque)
              * breaking transferred_bytes and bandwidth calculation
              */
             s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
-            s->iteration_initial_bytes = 0;
+            /*
+             * Update s->iteration_initial_bytes to the current total so
+             * that stale data does not lead to a wrong bandwidth.
+             */
+            s->iteration_initial_bytes = migration_total_bytes(s);
         }

         current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
diff --git a/migration/savevm.c b/migration/savevm.c
index 79ed44d475..480c511b19 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1424,6 +1424,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
     }

     migrate_init(ms);
+    memset(&ram_counters, 0, sizeof(ram_counters));
     ms->to_dst_file = f;

     qemu_mutex_unlock_iothread();
--
2.17.2 (Apple Git-113)
Ivan Ren <renyime@gmail.com> wrote:
> From: Ivan Ren <ivanren@tencent.com>
>
> This patch fixes a multifd migration bug in migration speed calculation.
> [...]

Reviewed-by: Juan Quintela <quintela@redhat.com>
On Tue, Jul 30, 2019 at 01:36:32PM +0800, Ivan Ren wrote:
> [...]
>
> @@ -3187,6 +3192,10 @@ static void *migration_thread(void *opaque)
>
>      object_ref(OBJECT(s));
>      s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +    /*
> +     * Update s->iteration_initial_bytes to match s->iteration_start_time.
> +     */
> +    s->iteration_initial_bytes = migration_total_bytes(s);

Is this one necessary? We have sent out nothing yet.

> [...]

--
Wei Yang
Help you, Help me
> >      s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > +    s->iteration_initial_bytes = migration_total_bytes(s);
>
> Is this one necessary? We have sent out nothing yet.

Yes, currently nothing has been sent at this point.

Is it better to always update iteration_initial_bytes and
iteration_start_time together, in an explicit way, to avoid a potential
mismatch?

Thanks.

On Thu, Aug 1, 2019 at 10:21 AM Wei Yang <richardw.yang@linux.intel.com>
wrote:
> [...]
On Thu, Aug 01, 2019 at 04:10:34PM +0800, Ivan Ren wrote:
> Yes, currently nothing has been sent at this point.
>
> Is it better to always update iteration_initial_bytes and
> iteration_start_time together, in an explicit way, to avoid a potential
> mismatch?

You may have a point. And after a closer look, there are other potential
problems:

1. For consistency, we need to update iteration_initial_pages too.
   So my opinion is to wrap the update of these three counters into a
   helper function, so that each time all of them are updated together.

2. In function ram_get_total_transferred_pages, did we miss multifd_bytes?

--
Wei Yang
Help you, Help me
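A minimal sketch of the helper suggested above (the function name and its
use of iteration_initial_pages are assumptions for illustration; the
thread does not spell them out):

    /*
     * Hypothetical helper: reset all per-iteration counters in one
     * place so they can never get out of sync with each other.
     */
    static void update_iteration_initial_status(MigrationState *s)
    {
        s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
        s->iteration_initial_bytes = migration_total_bytes(s);
        s->iteration_initial_pages = ram_get_total_transferred_pages();
    }

Every place that currently resets iteration_start_time by hand (migration
thread startup, the recovery path, and migration_update_counters itself)
would call this one helper instead.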
> 2. In function ram_get_total_transferred_pages, did we miss multifd_bytes?

In function ram_save_multifd_page, ram pages transferred by multifd threads
are counted in ram_counters.normal.
Do you mean other multifd bytes, like the multifd packet or multifd sync
info?

Thanks.

On Fri, Aug 2, 2019 at 8:49 AM Wei Yang <richardw.yang@linux.intel.com>
wrote:
> [...]
On Fri, Aug 02, 2019 at 01:46:41PM +0800, Ivan Ren wrote:
> [...]

Well, I guess you need to use another tool to send mail. The format is
corrupted.

> > 1. For consistency, we need to update iteration_initial_pages too.
> >    So my opinion is to wrap the update of these three counters into a
> >    helper function, so that each time all of them are updated together.

I don't see a reply to this one, or the mail is corrupted.

If we don't update iteration_initial_pages, the initial_pages will mismatch
the initial_bytes. Am I right?

> > 2. In function ram_get_total_transferred_pages, did we miss
> >    multifd_bytes?
>
> In function ram_save_multifd_page, ram pages transferred by multifd
> threads are counted in ram_counters.normal.
> Do you mean other multifd bytes, like the multifd packet or multifd sync
> info?

OK, it is counted in normal.

But if my understanding is correct, normal is meant to count pages sent by
save_normal_page(). Doesn't that sound misused?

--
Wei Yang
Help you, Help me
On Fri, Aug 2, 2019 at 1:59 PM Wei Yang <richardw.yang@linux.intel.com>
wrote:
> Well, I guess you need to use another tool to send mail. The format is
> corrupted.

OK.

> I don't see a reply to this one, or the mail is corrupted.
>
> If we don't update iteration_initial_pages, the initial_pages will
> mismatch the initial_bytes. Am I right?

Yes, agreed. I'll send a new version, thanks.

> OK, it is counted in normal.
>
> But if my understanding is correct, normal is meant to count pages sent
> by save_normal_page(). Doesn't that sound misused?

Yes, currently it is counted in normal; a dedicated counter would be more
accurate.

Thanks
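A rough sketch of the dedicated-counter idea (the multifd_pages field is
hypothetical; nothing in this thread defines it):

    --- a/migration/ram.c
    +++ b/migration/ram.c
    @@ ... @@ static int ram_save_multifd_page(...)
    -    ram_counters.normal++;
    +    /* hypothetical dedicated counter instead of reusing "normal" */
    +    ram_counters.multifd_pages++;

Any code that sums transferred pages for statistics would then include
multifd_pages in its total as well.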