From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Various parts of the migration code do different things when they're
in postcopy mode; prior to this patch this has been 'postcopy-active'.
This patch extends 'in_postcopy' to include 'postcopy-paused' and
'postcopy-recover'.

In particular, when you set the max-postcopy-bandwidth parameter, this
only affects the current migration fd if we're 'in_postcopy';
this leads to a race in the postcopy recovery test where it increases
the speed from 4k/sec to unlimited, but that increase can get ignored
if the change is made between the point at which the reconnection
happens and it transitions back to active.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
migration/migration.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 01863a95f5..5f7e4d15e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1659,7 +1659,14 @@ bool migration_in_postcopy(void)
 {
     MigrationState *s = migrate_get_current();
 
-    return (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
+    switch (s->state) {
+    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
+    case MIGRATION_STATUS_POSTCOPY_PAUSED:
+    case MIGRATION_STATUS_POSTCOPY_RECOVER:
+        return true;
+    default:
+        return false;
+    }
 }
 
 bool migration_in_postcopy_after_devices(MigrationState *s)
--
2.21.0
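
For anyone who wants to poke at the predicate outside the QEMU tree, here is a minimal standalone sketch of the change. It rests on several assumptions: the enum is a cut-down stand-in for QEMU's MigrationStatus (only the three names from the patch are real, and the numeric values are arbitrary), and in_postcopy_old() and in_postcopy_new() are hypothetical names that take the state as an argument instead of calling migrate_get_current().

```c
#include <stdbool.h>

/*
 * Cut-down stand-in for QEMU's MigrationStatus. Only the three
 * POSTCOPY_* names come from the patch; the rest are fillers and
 * the numeric values are not the real ones.
 */
typedef enum {
    MIGRATION_STATUS_NONE,
    MIGRATION_STATUS_ACTIVE,
    MIGRATION_STATUS_POSTCOPY_ACTIVE,
    MIGRATION_STATUS_POSTCOPY_PAUSED,
    MIGRATION_STATUS_POSTCOPY_RECOVER,
    MIGRATION_STATUS_COMPLETED,
} MigrationStatus;

/* Behaviour before the patch: only postcopy-active counts. */
static bool in_postcopy_old(MigrationStatus state)
{
    return state == MIGRATION_STATUS_POSTCOPY_ACTIVE;
}

/* Behaviour after the patch: paused and recover count as well. */
static bool in_postcopy_new(MigrationStatus state)
{
    switch (state) {
    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
    case MIGRATION_STATUS_POSTCOPY_PAUSED:
    case MIGRATION_STATUS_POSTCOPY_RECOVER:
        return true;
    default:
        return false;
    }
}
```

The two functions agree on every state except postcopy-paused and postcopy-recover, which is the whole of the behavioural change.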
Dr. David Alan Gilbert (git) <dgilbert@redhat.com> writes:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

I'm stress testing it now.

> ---
>  migration/migration.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 01863a95f5..5f7e4d15e9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1659,7 +1659,14 @@ bool migration_in_postcopy(void)
>  {
>      MigrationState *s = migrate_get_current();
>
> -    return (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> +    switch (s->state) {
> +    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
> +    case MIGRATION_STATUS_POSTCOPY_PAUSED:
> +    case MIGRATION_STATUS_POSTCOPY_RECOVER:
> +        return true;
> +    default:
> +        return false;
> +    }
>  }
>
>  bool migration_in_postcopy_after_devices(MigrationState *s)

--
Alex Bennée
* Dr. David Alan Gilbert (git) (dgilbert@redhat.com) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Queued

> ---
>  migration/migration.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 01863a95f5..5f7e4d15e9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1659,7 +1659,14 @@ bool migration_in_postcopy(void)
>  {
>      MigrationState *s = migrate_get_current();
>
> -    return (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> +    switch (s->state) {
> +    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
> +    case MIGRATION_STATUS_POSTCOPY_PAUSED:
> +    case MIGRATION_STATUS_POSTCOPY_RECOVER:
> +        return true;
> +    default:
> +        return false;
> +    }
>  }
>
>  bool migration_in_postcopy_after_devices(MigrationState *s)
> --
> 2.21.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> writes: > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > > Various parts of the migration code do different things when they're > in postcopy mode; prior to this patch this has been 'postcopy-active'. > This patch extends 'in_postcopy' to include 'postcopy-paused' and > 'postcopy-recover'. > > In particular, when you set the max-postcopy-bandwidth parameter, this > only affects the current migration fd if we're 'in_postcopy'; > this leads to a race in the postcopy recovery test where it increases > the speed from 4k/sec to unlimited, but that increase can get ignored > if the change is made between the point at which the reconnection > happens and it transitions back to active. > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> This seems to fix the intermittent hangs I observed and bisected to commit 8504ddeca0 "migration: Fix postcopy bw for recovery". Tested-by: Markus Armbruster <armbru@redhat.com>
Dr. David Alan Gilbert (git) <dgilbert@redhat.com> writes:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

In my xenial stress test I run 100 times and it never triggered the 180s
timeout I set on my retry.py script:

Tested-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  migration/migration.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 01863a95f5..5f7e4d15e9 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1659,7 +1659,14 @@ bool migration_in_postcopy(void)
>  {
>      MigrationState *s = migrate_get_current();
>
> -    return (s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> +    switch (s->state) {
> +    case MIGRATION_STATUS_POSTCOPY_ACTIVE:
> +    case MIGRATION_STATUS_POSTCOPY_PAUSED:
> +    case MIGRATION_STATUS_POSTCOPY_RECOVER:
> +        return true;
> +    default:
> +        return false;
> +    }
>  }
>
>  bool migration_in_postcopy_after_devices(MigrationState *s)

--
Alex Bennée
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote: > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com> > > Various parts of the migration code do different things when they're > in postcopy mode; prior to this patch this has been 'postcopy-active'. > This patch extends 'in_postcopy' to include 'postcopy-paused' and > 'postcopy-recover'. > > In particular, when you set the max-postcopy-bandwidth parameter, this > only affects the current migration fd if we're 'in_postcopy'; > this leads to a race in the postcopy recovery test where it increases > the speed from 4k/sec to unlimited, but that increase can get ignored > if the change is made between the point at which the reconnection > happens and it transitions back to active. > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com>
On Mon, Sep 23, 2019 at 06:49:42PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Various parts of the migration code do different things when they're
> in postcopy mode; prior to this patch this has been 'postcopy-active'.
> This patch extends 'in_postcopy' to include 'postcopy-paused' and
> 'postcopy-recover'.
>
> In particular, when you set the max-postcopy-bandwidth parameter, this
> only affects the current migration fd if we're 'in_postcopy';
> this leads to a race in the postcopy recovery test where it increases
> the speed from 4k/sec to unlimited, but that increase can get ignored
> if the change is made between the point at which the reconnection
> happens and it transitions back to active.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Yeh this makes quite a lot of sense to me...

Reviewed-by: Peter Xu <peterx@redhat.com>

--
Peter Xu