[RFC PATCH 1/2] migration: Report error in incoming migration

Fabiano Rosas posted 2 patches 2 years ago
Maintainers: Juan Quintela <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Leonardo Bras <leobras@redhat.com>
There is a newer version of this series
Posted by Fabiano Rosas 2 years ago
We're not currently reporting the errors set with migrate_set_error()
when incoming migration fails.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index 28a34c9068..cca32c553c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -698,6 +698,13 @@ process_incoming_migration_co(void *opaque)
     }
 
     if (ret < 0) {
+        MigrationState *s = migrate_get_current();
+
+        if (migrate_has_error(s)) {
+            WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
+                error_report_err(s->error);
+            }
+        }
         error_report("load of migration failed: %s", strerror(-ret));
         goto fail;
     }
-- 
2.35.3
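
For readers without the QEMU tree at hand, here is a minimal standalone
sketch of the pattern the hunk adds: on a failed load, print the error that
was stored earlier (if any) while holding the error lock, then print the
generic strerror() line. Every Demo*/demo_* name is a hypothetical stand-in
and not a QEMU API: a plain pthread mutex replaces WITH_QEMU_LOCK_GUARD(),
and a heap string replaces the Error object. One deliberate difference from
the hunk: QEMU's error_report_err() frees the Error it reports, so this
sketch also resets the stored pointer after freeing it, which the posted
hunk does not do. Whether that matters depends on whether s->error is read
again on the failure path, which the thread does not discuss.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for MigrationState: only the fields the hunk uses. */
typedef struct {
    pthread_mutex_t error_mutex;
    char *error;                /* first recorded error, or NULL */
} DemoState;

/* Plays the role of migrate_set_error(): only the first error is kept. */
static void demo_set_error(DemoState *s, const char *msg)
{
    pthread_mutex_lock(&s->error_mutex);
    if (!s->error) {
        size_t len = strlen(msg) + 1;
        s->error = malloc(len);
        if (s->error) {
            memcpy(s->error, msg, len);
        }
    }
    pthread_mutex_unlock(&s->error_mutex);
}

/* Plays the role of migrate_has_error(). */
static int demo_has_error(DemoState *s)
{
    int has;

    pthread_mutex_lock(&s->error_mutex);
    has = s->error != NULL;
    pthread_mutex_unlock(&s->error_mutex);
    return has;
}

/*
 * Failure path of the load: report the stored error first, then the
 * generic message. Returns the number of lines written to 'out'.
 */
static int demo_report_load_failure(DemoState *s, int ret, FILE *out)
{
    int lines = 0;

    assert(ret < 0);            /* only called with a negative errno value */

    if (demo_has_error(s)) {
        pthread_mutex_lock(&s->error_mutex);
        if (s->error) {         /* re-check under the lock */
            fprintf(out, "%s\n", s->error);
            free(s->error);
            s->error = NULL;    /* assumption: avoid leaving a stale pointer */
            lines++;
        }
        pthread_mutex_unlock(&s->error_mutex);
    }
    fprintf(out, "load of migration failed: %s\n", strerror(-ret));
    return lines + 1;
}
```

Calling demo_set_error(&s, "...") and then
demo_report_load_failure(&s, -EIO, stderr) prints the stored message first
and the generic line second; with no stored error only the generic line is
printed.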
Re: [RFC PATCH 1/2] migration: Report error in incoming migration
Posted by Peter Xu 2 years ago
On Thu, Nov 09, 2023 at 01:58:55PM -0300, Fabiano Rosas wrote:
> We're not currently reporting the errors set with migrate_set_error()
> when incoming migration fails.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/migration.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 28a34c9068..cca32c553c 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -698,6 +698,13 @@ process_incoming_migration_co(void *opaque)
>      }
>  
>      if (ret < 0) {
> +        MigrationState *s = migrate_get_current();
> +
> +        if (migrate_has_error(s)) {
> +            WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
> +                error_report_err(s->error);
> +            }
> +        }

What's the major benefit of dumping this explicitly?

And this is not relevant to the multifd problem, correct?

>          error_report("load of migration failed: %s", strerror(-ret));
>          goto fail;
>      }
> -- 
> 2.35.3
> 

-- 
Peter Xu
Re: [RFC PATCH 1/2] migration: Report error in incoming migration
Posted by Fabiano Rosas 2 years ago
Peter Xu <peterx@redhat.com> writes:

> On Thu, Nov 09, 2023 at 01:58:55PM -0300, Fabiano Rosas wrote:
>> We're not currently reporting the errors set with migrate_set_error()
>> when incoming migration fails.
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  migration/migration.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 28a34c9068..cca32c553c 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -698,6 +698,13 @@ process_incoming_migration_co(void *opaque)
>>      }
>>  
>>      if (ret < 0) {
>> +        MigrationState *s = migrate_get_current();
>> +
>> +        if (migrate_has_error(s)) {
>> +            WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
>> +                error_report_err(s->error);
>> +            }
>> +        }
>
> What's the major benefit of dumping this explicitly?

This is incoming migration, so there's no centralized error reporting
aside from the useless "load of migration failed: -5". If the code has
not called error_report we just never see the error message.

> And this is not relevant to the multifd problem, correct?

Yes, I'm being sneaky.
Re: [RFC PATCH 1/2] migration: Report error in incoming migration
Posted by Peter Xu 2 years ago
On Fri, Nov 10, 2023 at 07:58:00AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Nov 09, 2023 at 01:58:55PM -0300, Fabiano Rosas wrote:
> >> We're not currently reporting the errors set with migrate_set_error()
> >> when incoming migration fails.
> >> 
> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >> ---
> >>  migration/migration.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >> 
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index 28a34c9068..cca32c553c 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -698,6 +698,13 @@ process_incoming_migration_co(void *opaque)
> >>      }
> >>  
> >>      if (ret < 0) {
> >> +        MigrationState *s = migrate_get_current();
> >> +
> >> +        if (migrate_has_error(s)) {
> >> +            WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
> >> +                error_report_err(s->error);
> >> +            }
> >> +        }
> >
> > What's the major benefit of dumping this explicitly?
> 
> This is incoming migration, so there's no centralized error reporting
> aside from the useless "load of migration failed: -5". If the code has
> not called error_report we just never see the error message.
> 
> > And this is not relevant to the multifd problem, correct?
> 
> Yes, I'm being sneaky.

Trying to sneak one patch into a two-patch series is likely to be noticed and
lose the effect. :-)

I remember we had the verbose error before. Was it lost in some commit? In
any case, feel free to post that separately if you think we should get it
back.

The multifd fixes do not look like they address a regression in this release
either. If so, both of them may be better as material for the next release?

-- 
Peter Xu
Re: [RFC PATCH 1/2] migration: Report error in incoming migration
Posted by Fabiano Rosas 2 years ago
Peter Xu <peterx@redhat.com> writes:

> On Fri, Nov 10, 2023 at 07:58:00AM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Thu, Nov 09, 2023 at 01:58:55PM -0300, Fabiano Rosas wrote:
>> >> We're not currently reporting the errors set with migrate_set_error()
>> >> when incoming migration fails.
>> >> 
>> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> >> ---
>> >>  migration/migration.c | 7 +++++++
>> >>  1 file changed, 7 insertions(+)
>> >> 
>> >> diff --git a/migration/migration.c b/migration/migration.c
>> >> index 28a34c9068..cca32c553c 100644
>> >> --- a/migration/migration.c
>> >> +++ b/migration/migration.c
>> >> @@ -698,6 +698,13 @@ process_incoming_migration_co(void *opaque)
>> >>      }
>> >>  
>> >>      if (ret < 0) {
>> >> +        MigrationState *s = migrate_get_current();
>> >> +
>> >> +        if (migrate_has_error(s)) {
>> >> +            WITH_QEMU_LOCK_GUARD(&s->error_mutex) {
>> >> +                error_report_err(s->error);
>> >> +            }
>> >> +        }
>> >
>> > What's the major benefit of dumping this explicitly?
>> 
>> This is incoming migration, so there's no centralized error reporting
>> aside from the useless "load of migration failed: -5". If the code has
>> not called error_report we just never see the error message.
>> 
>> > And this is not relevant to the multifd problem, correct?
>> 
>> Yes, I'm being sneaky.
>
> Trying to sneak one patch into a two-patch series is likely to be noticed and
> lose the effect. :-)
>
> I remember we had the verbose error before. Was it lost in some commit? In
> any case, feel free to post that separately if you think we should get it
> back.
>
> The multifd fixes do not look like they address a regression in this release
> either. If so, both of them may be better as material for the next release?

People have complained about it on IRC and I hit it twice in a week. I
would call it a regression. However, we _do_ have an indication that it
might have been there all along since someone already tried to fix a
very similar issue, maybe even the same one. So I'm fine with punting to
the next release.