[PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state

Markus Armbruster posted 6 patches 6 months, 2 weeks ago
Maintainers: Fam Zheng <fam@euphon.net>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Markus Armbruster <armbru@redhat.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <richard.henderson@linaro.org>
[PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Posted by Markus Armbruster 6 months, 2 weeks ago
Functions that use an Error **errp parameter to return errors should
not also report them to the user, because reporting is the caller's
job.  When the caller does, the error is reported twice.  When it
doesn't (because it recovered from the error), there is no error to
report, i.e. the report is bogus.

qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
this principle: they call qemu_save_device_state() and
qemu_loadvm_state(), which call error_report_err().

I wish I could clean this up now, but migration's error reporting is
too complicated (confused?) for me to mess with it.

Instead, I'm merely improving the error reported by
qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
QMP core from

    An IO error has occurred

to
    saving Xen device state failed

and

    loading Xen device state failed

respectively.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 migration/savevm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 4509482ec4..a4a856982a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -45,7 +45,6 @@
 #include "qapi/qapi-commands-migration.h"
 #include "qapi/clone-visitor.h"
 #include "qapi/qapi-builtin-visit.h"
-#include "qapi/qmp/qerror.h"
 #include "qemu/error-report.h"
 #include "sysemu/cpus.h"
 #include "exec/memory.h"
@@ -3208,7 +3207,7 @@ void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
     object_unref(OBJECT(ioc));
     ret = qemu_save_device_state(f);
     if (ret < 0 || qemu_fclose(f) < 0) {
-        error_setg(errp, QERR_IO_ERROR);
+        error_setg(errp, "saving Xen device state failed");
     } else {
         /* libxl calls the QMP command "stop" before calling
          * "xen-save-devices-state" and in case of migration failure, libxl
@@ -3257,7 +3256,7 @@ void qmp_xen_load_devices_state(const char *filename, Error **errp)
     ret = qemu_loadvm_state(f);
     qemu_fclose(f);
     if (ret < 0) {
-        error_setg(errp, QERR_IO_ERROR);
+        error_setg(errp, "loading Xen device state failed");
     }
     migration_incoming_state_destroy();
 }
-- 
2.45.0
Re: [PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Posted by Peter Xu 6 months ago
On Mon, May 13, 2024 at 04:17:02PM +0200, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
> 
> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> this principle: they call qemu_save_device_state() and
> qemu_loadvm_state(), which call error_report_err().
> 
> I wish I could clean this up now, but migration's error reporting is
> too complicated (confused?) for me to mess with it.

:-(

> 
> Instead, I'm merely improving the error reported by
> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
> QMP core from
> 
>     An IO error has occurred
> 
> to
>     saving Xen device state failed
> 
> and
> 
>     loading Xen device state failed
> 
> respectively.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Acked-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu
Re: [PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Posted by Markus Armbruster 6 months ago
Peter Xu <peterx@redhat.com> writes:

> On Mon, May 13, 2024 at 04:17:02PM +0200, Markus Armbruster wrote:
>> Functions that use an Error **errp parameter to return errors should
>> not also report them to the user, because reporting is the caller's
>> job.  When the caller does, the error is reported twice.  When it
>> doesn't (because it recovered from the error), there is no error to
>> report, i.e. the report is bogus.
>> 
>> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
>> this principle: they call qemu_save_device_state() and
>> qemu_loadvm_state(), which call error_report_err().
>> 
>> I wish I could clean this up now, but migration's error reporting is
>> too complicated (confused?) for me to mess with it.
>
> :-(

If I understood how it's *supposed* to work, I might have a chance...

I can see a mixture of reporting errors directly (with error_report() &
friends), passing them to callers (via Error **errp), and storing them
in / retrieving them from MigrationState member @error.  This can't be
right.

I think a necessary first step towards getting it right is a shared
understanding how errors are to be handled in migration code.  This
includes how error data should flow from error source to error sink, and
what the possible sinks are.

>> Instead, I'm merely improving the error reported by
>> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
>> QMP core from
>> 
>>     An IO error has occurred
>> 
>> to
>>     saving Xen device state failed
>> 
>> and
>> 
>>     loading Xen device state failed
>> 
>> respectively.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>

Thanks!
Re: [PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Posted by Peter Xu 6 months ago
On Mon, May 27, 2024 at 12:53:22PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, May 13, 2024 at 04:17:02PM +0200, Markus Armbruster wrote:
> >> Functions that use an Error **errp parameter to return errors should
> >> not also report them to the user, because reporting is the caller's
> >> job.  When the caller does, the error is reported twice.  When it
> >> doesn't (because it recovered from the error), there is no error to
> >> report, i.e. the report is bogus.
> >> 
> >> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> >> this principle: they call qemu_save_device_state() and
> >> qemu_loadvm_state(), which call error_report_err().
> >> 
> >> I wish I could clean this up now, but migration's error reporting is
> >> too complicated (confused?) for me to mess with it.
> >
> > :-(
> 
> If I understood how it's *supposed* to work, I might have a chance...
> 
> I can see a mixture of reporting errors directly (with error_report() &
> friends), passing them to callers (via Error **errp), and storing them
> in / retrieving them from MigrationState member @error.  This can't be
> right.
> 
> I think a necessary first step towards getting it right is a shared
> understanding how errors are to be handled in migration code.  This
> includes how error data should flow from error source to error sink, and
> what the possible sinks are.

True.  I think the sink should always be MigrationState.error so far.

There's also the other complexity on detecting errors using either
qemu_file_get_error() or migrate_get_error().. the error handling in
migration is indeed prone to a cleanup.

I've added a cleanup entry for migration todo page:

https://wiki.qemu.org/ToDo/LiveMigration#Migration_error_detection_and_reporting

Thanks,

-- 
Peter Xu
Re: [PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Posted by Fabiano Rosas 6 months, 2 weeks ago
Markus Armbruster <armbru@redhat.com> writes:

> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
>
> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> this principle: they call qemu_save_device_state() and
> qemu_loadvm_state(), which call error_report_err().
>
> I wish I could clean this up now, but migration's error reporting is
> too complicated (confused?) for me to mess with it.
>
> Instead, I'm merely improving the error reported by
> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
> QMP core from
>
>     An IO error has occurred
>
> to
>     saving Xen device state failed
>
> and
>
>     loading Xen device state failed
>
> respectively.
>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Acked-by: Fabiano Rosas <farosas@suse.de>
Re: [PATCH 5/6] migration: Rephrase message on failure to save / load Xen device state
Posted by Philippe Mathieu-Daudé 6 months, 2 weeks ago
On 13/5/24 16:17, Markus Armbruster wrote:
> Functions that use an Error **errp parameter to return errors should
> not also report them to the user, because reporting is the caller's
> job.  When the caller does, the error is reported twice.  When it
> doesn't (because it recovered from the error), there is no error to
> report, i.e. the report is bogus.
> 
> qmp_xen_save_devices_state() and qmp_xen_load_devices_state() violate
> this principle: they call qemu_save_device_state() and
> qemu_loadvm_state(), which call error_report_err().
> 
> I wish I could clean this up now, but migration's error reporting is
> too complicated (confused?) for me to mess with it.
> 
> Instead, I'm merely improving the error reported by
> qmp_xen_load_devices_state() and qmp_xen_load_devices_state() to the
> QMP core from
> 
>      An IO error has occurred
> 
> to
>      saving Xen device state failed
> 
> and
> 
>      loading Xen device state failed
> 
> respectively.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>   migration/savevm.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>