Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error
parameter" changed migration_incoming_setup() to take an Error **
argument, and adjusted the callers accordingly. It neglected to
change adjust multifd_load_setup(): it still exit()s on error. Clean
that up.
The error now gets propagated up two call chains: via
migration_fd_process_incoming() to rdma_accept_incoming_migration(),
and via migration_ioc_process_incoming() to
migration_channel_process_incoming(). Both chain ends report the
error with error_report_err(), but otherwise ignore it. Behavioral
change: we no longer exit() on this error.
This is consistent with how we handle other errors here, e.g. from
multifd_recv_new_channel() via migration_ioc_process_incoming() to
migration_channel_process_incoming(). Wether it's consistently right
or consistently wrong I can't tell.
Also clean up the return value from the unusual 0 on success, 1 on
error to the more common true on success, false on error.
Cc: Juan Quintela <quintela@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
migration/migration.c | 27 +++++++++------------------
1 file changed, 9 insertions(+), 18 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 231dc24414..c1c0a48647 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -609,30 +609,25 @@ fail:
}
/**
- * @migration_incoming_setup: Setup incoming migration
- *
- * Returns 0 for no error or 1 for error
- *
+ * migration_incoming_setup: Setup incoming migration
* @f: file for main migration channel
* @errp: where to put errors
+ *
+ * Returns: %true on success, %false on error.
*/
-static int migration_incoming_setup(QEMUFile *f, Error **errp)
+static bool migration_incoming_setup(QEMUFile *f, Error **errp)
{
MigrationIncomingState *mis = migration_incoming_get_current();
- Error *local_err = NULL;
- if (multifd_load_setup(&local_err) != 0) {
- /* We haven't been able to create multifd threads
- nothing better to do */
- error_report_err(local_err);
- exit(EXIT_FAILURE);
+ if (multifd_load_setup(errp) != 0) {
+ return false;
}
if (!mis->from_src_file) {
mis->from_src_file = f;
}
qemu_file_set_blocking(f, false);
- return 0;
+ return true;
}
void migration_incoming_process(void)
@@ -675,14 +670,11 @@ static bool postcopy_try_recover(QEMUFile *f)
void migration_fd_process_incoming(QEMUFile *f, Error **errp)
{
- Error *local_err = NULL;
-
if (postcopy_try_recover(f)) {
return;
}
- if (migration_incoming_setup(f, &local_err)) {
- error_propagate(errp, local_err);
+ if (!migration_incoming_setup(f, errp)) {
return;
}
migration_incoming_process();
@@ -703,8 +695,7 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
return;
}
- if (migration_incoming_setup(f, &local_err)) {
- error_propagate(errp, local_err);
+ if (!migration_incoming_setup(f, errp)) {
return;
}
--
2.31.1
On Tue, Jul 20, 2021 at 02:54:02PM +0200, Markus Armbruster wrote: > Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error > parameter" changed migration_incoming_setup() to take an Error ** > argument, and adjusted the callers accordingly. It neglected to > change adjust multifd_load_setup(): it still exit()s on error. Clean > that up. > > The error now gets propagated up two call chains: via > migration_fd_process_incoming() to rdma_accept_incoming_migration(), > and via migration_ioc_process_incoming() to > migration_channel_process_incoming(). Both chain ends report the > error with error_report_err(), but otherwise ignore it. Behavioral > change: we no longer exit() on this error. > > This is consistent with how we handle other errors here, e.g. from > multifd_recv_new_channel() via migration_ioc_process_incoming() to > migration_channel_process_incoming(). Wether it's consistently right Whether > or consistently wrong I can't tell. > > Also clean up the return value from the unusual 0 on success, 1 on > error to the more common true on success, false on error. > > Cc: Juan Quintela <quintela@redhat.com> > Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> > Signed-off-by: Markus Armbruster <armbru@redhat.com> > --- > migration/migration.c | 27 +++++++++------------------ > 1 file changed, 9 insertions(+), 18 deletions(-) > Reviewed-by: Eric Blake <eblake@redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Eric Blake <eblake@redhat.com> writes: > On Tue, Jul 20, 2021 at 02:54:02PM +0200, Markus Armbruster wrote: >> Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error >> parameter" changed migration_incoming_setup() to take an Error ** >> argument, and adjusted the callers accordingly. It neglected to >> change adjust multifd_load_setup(): it still exit()s on error. Clean >> that up. >> >> The error now gets propagated up two call chains: via >> migration_fd_process_incoming() to rdma_accept_incoming_migration(), >> and via migration_ioc_process_incoming() to >> migration_channel_process_incoming(). Both chain ends report the >> error with error_report_err(), but otherwise ignore it. Behavioral >> change: we no longer exit() on this error. >> >> This is consistent with how we handle other errors here, e.g. from >> multifd_recv_new_channel() via migration_ioc_process_incoming() to >> migration_channel_process_incoming(). Wether it's consistently right > > Whether ACK >> or consistently wrong I can't tell. >> >> Also clean up the return value from the unusual 0 on success, 1 on >> error to the more common true on success, false on error. >> >> Cc: Juan Quintela <quintela@redhat.com> >> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> >> Signed-off-by: Markus Armbruster <armbru@redhat.com> >> --- >> migration/migration.c | 27 +++++++++------------------ >> 1 file changed, 9 insertions(+), 18 deletions(-) >> > > Reviewed-by: Eric Blake <eblake@redhat.com> Thanks!
> Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error
> parameter" changed migration_incoming_setup() to take an Error **
> argument, and adjusted the callers accordingly. It neglected to
> change adjust multifd_load_setup(): it still exit()s on error. Clean
> that up.
>
> The error now gets propagated up two call chains: via
> migration_fd_process_incoming() to rdma_accept_incoming_migration(),
> and via migration_ioc_process_incoming() to
> migration_channel_process_incoming(). Both chain ends report the
> error with error_report_err(), but otherwise ignore it. Behavioral
> change: we no longer exit() on this error.
>
> This is consistent with how we handle other errors here, e.g. from
> multifd_recv_new_channel() via migration_ioc_process_incoming() to
> migration_channel_process_incoming(). Wether it's consistently right
> or consistently wrong I can't tell.
>
> Also clean up the return value from the unusual 0 on success, 1 on
> error to the more common true on success, false on error.
>
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
> migration/migration.c | 27 +++++++++------------------
> 1 file changed, 9 insertions(+), 18 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 231dc24414..c1c0a48647 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -609,30 +609,25 @@ fail:
> }
>
> /**
> - * @migration_incoming_setup: Setup incoming migration
> - *
> - * Returns 0 for no error or 1 for error
> - *
> + * migration_incoming_setup: Setup incoming migration
> * @f: file for main migration channel
> * @errp: where to put errors
> + *
> + * Returns: %true on success, %false on error.
> */
> -static int migration_incoming_setup(QEMUFile *f, Error **errp)
> +static bool migration_incoming_setup(QEMUFile *f, Error **errp)
> {
> MigrationIncomingState *mis = migration_incoming_get_current();
> - Error *local_err = NULL;
>
> - if (multifd_load_setup(&local_err) != 0) {
> - /* We haven't been able to create multifd threads
> - nothing better to do */
> - error_report_err(local_err);
> - exit(EXIT_FAILURE);
> + if (multifd_load_setup(errp) != 0) {
> + return false;
> }
>
> if (!mis->from_src_file) {
> mis->from_src_file = f;
> }
> qemu_file_set_blocking(f, false);
> - return 0;
> + return true;
> }
>
> void migration_incoming_process(void)
> @@ -675,14 +670,11 @@ static bool postcopy_try_recover(QEMUFile *f)
>
> void migration_fd_process_incoming(QEMUFile *f, Error **errp)
> {
> - Error *local_err = NULL;
> -
> if (postcopy_try_recover(f)) {
> return;
> }
>
> - if (migration_incoming_setup(f, &local_err)) {
> - error_propagate(errp, local_err);
> + if (!migration_incoming_setup(f, errp)) {
> return;
> }
> migration_incoming_process();
> @@ -703,8 +695,7 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> return;
> }
>
> - if (migration_incoming_setup(f, &local_err)) {
> - error_propagate(errp, local_err);
> + if (!migration_incoming_setup(f, errp)) {
> return;
> }
>
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
* Markus Armbruster (armbru@redhat.com) wrote:
> Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error
> parameter" changed migration_incoming_setup() to take an Error **
> argument, and adjusted the callers accordingly. It neglected to
> change adjust multifd_load_setup(): it still exit()s on error. Clean
> that up.
>
> The error now gets propagated up two call chains: via
> migration_fd_process_incoming() to rdma_accept_incoming_migration(),
> and via migration_ioc_process_incoming() to
> migration_channel_process_incoming(). Both chain ends report the
> error with error_report_err(), but otherwise ignore it. Behavioral
> change: we no longer exit() on this error.
>
> This is consistent with how we handle other errors here, e.g. from
> multifd_recv_new_channel() via migration_ioc_process_incoming() to
> migration_channel_process_incoming(). Wether it's consistently right
> or consistently wrong I can't tell.
>
> Also clean up the return value from the unusual 0 on success, 1 on
> error to the more common true on success, false on error.
>
> Cc: Juan Quintela <quintela@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
> migration/migration.c | 27 +++++++++------------------
> 1 file changed, 9 insertions(+), 18 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 231dc24414..c1c0a48647 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -609,30 +609,25 @@ fail:
> }
>
> /**
> - * @migration_incoming_setup: Setup incoming migration
> - *
> - * Returns 0 for no error or 1 for error
> - *
> + * migration_incoming_setup: Setup incoming migration
> * @f: file for main migration channel
> * @errp: where to put errors
> + *
> + * Returns: %true on success, %false on error.
> */
> -static int migration_incoming_setup(QEMUFile *f, Error **errp)
> +static bool migration_incoming_setup(QEMUFile *f, Error **errp)
> {
> MigrationIncomingState *mis = migration_incoming_get_current();
> - Error *local_err = NULL;
>
> - if (multifd_load_setup(&local_err) != 0) {
> - /* We haven't been able to create multifd threads
> - nothing better to do */
> - error_report_err(local_err);
> - exit(EXIT_FAILURE);
> + if (multifd_load_setup(errp) != 0) {
> + return false;
> }
What I don't know is how well that will survive; for example in
multifd_load_setup it creates multiple threads and calls the recv_setup
member on each thread; now if one of those fails what happens - if we
don't exit, is it cleaned up enough so you can try another
migrate_incoming or is it just going to fall over?
Dave
>
> if (!mis->from_src_file) {
> mis->from_src_file = f;
> }
> qemu_file_set_blocking(f, false);
> - return 0;
> + return true;
> }
>
> void migration_incoming_process(void)
> @@ -675,14 +670,11 @@ static bool postcopy_try_recover(QEMUFile *f)
>
> void migration_fd_process_incoming(QEMUFile *f, Error **errp)
> {
> - Error *local_err = NULL;
> -
> if (postcopy_try_recover(f)) {
> return;
> }
>
> - if (migration_incoming_setup(f, &local_err)) {
> - error_propagate(errp, local_err);
> + if (!migration_incoming_setup(f, errp)) {
> return;
> }
> migration_incoming_process();
> @@ -703,8 +695,7 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> return;
> }
>
> - if (migration_incoming_setup(f, &local_err)) {
> - error_propagate(errp, local_err);
> + if (!migration_incoming_setup(f, errp)) {
> return;
> }
>
> --
> 2.31.1
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
> * Markus Armbruster (armbru@redhat.com) wrote:
>> Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error
>> parameter" changed migration_incoming_setup() to take an Error **
>> argument, and adjusted the callers accordingly. It neglected to
>> change adjust multifd_load_setup(): it still exit()s on error. Clean
>> that up.
>>
>> The error now gets propagated up two call chains: via
>> migration_fd_process_incoming() to rdma_accept_incoming_migration(),
>> and via migration_ioc_process_incoming() to
>> migration_channel_process_incoming(). Both chain ends report the
>> error with error_report_err(), but otherwise ignore it. Behavioral
>> change: we no longer exit() on this error.
>>
>> This is consistent with how we handle other errors here, e.g. from
>> multifd_recv_new_channel() via migration_ioc_process_incoming() to
>> migration_channel_process_incoming(). Wether it's consistently right
>> or consistently wrong I can't tell.
>>
>> Also clean up the return value from the unusual 0 on success, 1 on
>> error to the more common true on success, false on error.
>>
>> Cc: Juan Quintela <quintela@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>> migration/migration.c | 27 +++++++++------------------
>> 1 file changed, 9 insertions(+), 18 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 231dc24414..c1c0a48647 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -609,30 +609,25 @@ fail:
>> }
>>
>> /**
>> - * @migration_incoming_setup: Setup incoming migration
>> - *
>> - * Returns 0 for no error or 1 for error
>> - *
>> + * migration_incoming_setup: Setup incoming migration
>> * @f: file for main migration channel
>> * @errp: where to put errors
>> + *
>> + * Returns: %true on success, %false on error.
>> */
>> -static int migration_incoming_setup(QEMUFile *f, Error **errp)
>> +static bool migration_incoming_setup(QEMUFile *f, Error **errp)
>> {
>> MigrationIncomingState *mis = migration_incoming_get_current();
>> - Error *local_err = NULL;
>>
>> - if (multifd_load_setup(&local_err) != 0) {
>> - /* We haven't been able to create multifd threads
>> - nothing better to do */
>> - error_report_err(local_err);
>> - exit(EXIT_FAILURE);
>> + if (multifd_load_setup(errp) != 0) {
>> + return false;
>> }
>
> What I don't know is how well that will survive; for example in
> multifd_load_setup it creates multiple threads and calls the recv_setup
> member on each thread; now if one of those fails what happens - if we
> don't exit, is it cleaned up enough so you can try another
> migrate_incoming or is it just going to fall over?
I don't know, either.
The inconsistent error handling we have is not good. More consistent
error handling that unmasks bad error handling could be worse, unless we
fix the unmasked badness.
How can we move forward with this patch?
One way *I* could move forward is to tack a FIXME comment to the
problematic exit(1) instead of removing it.
[...]
Markus Armbruster <armbru@redhat.com> writes:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes:
>
>> * Markus Armbruster (armbru@redhat.com) wrote:
>>> Commit b673eab4e2 "multifd: Make multifd_load_setup() get an Error
>>> parameter" changed migration_incoming_setup() to take an Error **
>>> argument, and adjusted the callers accordingly. It neglected to
>>> change adjust multifd_load_setup(): it still exit()s on error. Clean
>>> that up.
>>>
>>> The error now gets propagated up two call chains: via
>>> migration_fd_process_incoming() to rdma_accept_incoming_migration(),
>>> and via migration_ioc_process_incoming() to
>>> migration_channel_process_incoming(). Both chain ends report the
>>> error with error_report_err(), but otherwise ignore it. Behavioral
>>> change: we no longer exit() on this error.
>>>
>>> This is consistent with how we handle other errors here, e.g. from
>>> multifd_recv_new_channel() via migration_ioc_process_incoming() to
>>> migration_channel_process_incoming(). Wether it's consistently right
>>> or consistently wrong I can't tell.
>>>
>>> Also clean up the return value from the unusual 0 on success, 1 on
>>> error to the more common true on success, false on error.
>>>
>>> Cc: Juan Quintela <quintela@redhat.com>
>>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>>> ---
>>> migration/migration.c | 27 +++++++++------------------
>>> 1 file changed, 9 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index 231dc24414..c1c0a48647 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -609,30 +609,25 @@ fail:
>>> }
>>>
>>> /**
>>> - * @migration_incoming_setup: Setup incoming migration
>>> - *
>>> - * Returns 0 for no error or 1 for error
>>> - *
>>> + * migration_incoming_setup: Setup incoming migration
>>> * @f: file for main migration channel
>>> * @errp: where to put errors
>>> + *
>>> + * Returns: %true on success, %false on error.
>>> */
>>> -static int migration_incoming_setup(QEMUFile *f, Error **errp)
>>> +static bool migration_incoming_setup(QEMUFile *f, Error **errp)
>>> {
>>> MigrationIncomingState *mis = migration_incoming_get_current();
>>> - Error *local_err = NULL;
>>>
>>> - if (multifd_load_setup(&local_err) != 0) {
>>> - /* We haven't been able to create multifd threads
>>> - nothing better to do */
>>> - error_report_err(local_err);
>>> - exit(EXIT_FAILURE);
>>> + if (multifd_load_setup(errp) != 0) {
>>> + return false;
>>> }
>>
>> What I don't know is how well that will survive; for example in
>> multifd_load_setup it creates multiple threads and calls the recv_setup
>> member on each thread; now if one of those fails what happens - if we
>> don't exit, is it cleaned up enough so you can try another
>> migrate_incoming or is it just going to fall over?
>
> I don't know, either.
>
> The inconsistent error handling we have is not good. More consistent
> error handling that unmasks bad error handling could be worse, unless we
> fix the unmasked badness.
>
> How can we move forward with this patch?
>
> One way *I* could move forward is to tack a FIXME comment to the
> problematic exit(1) instead of removing it.
>
> [...]
I forgot this was still undecided, and included the patch in my pull
request. It has become commit 7d6f6933aa7. Feel free to revert it, or
to ask me to restore the exit() on failure.
© 2016 - 2026 Red Hat, Inc.