[PATCH for-11.0 v2 0/7] migration: Error reporting cleanups

Peter Xu posted 7 patches 1 week, 5 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20251201194510.1121221-1-peterx@redhat.com
Maintainers: Markus Armbruster <armbru@redhat.com>, Michael Roth <michael.roth@amd.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>
include/qapi/error.h             | 20 ++++++++++++-
migration/migration.h            |  2 +-
migration/channel.c              |  1 -
migration/cpr-exec.c             |  5 ++--
migration/migration.c            | 51 +++++++++++++++-----------------
migration/multifd-device-state.c |  6 ++--
migration/multifd.c              | 30 +++++++++----------
migration/postcopy-ram.c         |  5 ++--
migration/ram.c                  |  4 +--
migration/savevm.c               | 17 +++++------
10 files changed, 73 insertions(+), 68 deletions(-)
[PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
Posted by Peter Xu 1 week, 5 days ago
Based-on: <20251125070554.2256181-1-armbru@redhat.com>

This series is based on Markus's recent fix:

[PATCH] migration: Fix double-free on error path
https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com

v2:
- Added R-bs
- Patch 1:
  - update commit message on s/accidentally merged/merged without proper
    review/ [Markus]
- Patch 2:
  - Added a new follow up patch here from Markus to poison Error's autoptr
- Patch 3:
  - Rename migration_connect_set_error to migration_connect_error_propagate
    [Markus]
  - Add comments in commit log for both migrate_connect() and the rename
    [Markus]
- Patch 4:
  - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
- Patch 6:
  - Make migrate_error_propagate() take MigrationState* as before [Markus]
  - Remove the one use case of g_clear_pointer() [Markus]
  - Touch up commit message for the change

This series should address the issues discussed in this thread here:

https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org

The problem is Error is not a good candidate of g_autoptr, however the
cleanup function was merged without enough review.  Luckily, we only have
two users so far (after Markus's patch above lands).  This series removes
the last two in migration code and reverts the auto cleanup function for
Error.  Instead, poison the auto cleanup function.

When at it, it'll also change migrate_set_error() to start taking ownership
of errors, just like what most error APIs do.  When at it, it is renamed to
migrate_error_propagate() to imply migration version of error_propagate().

Comments welcomed, thanks.

Markus Armbruster (1):
  error: Poison g_autoptr(Error) to prevent its use

Peter Xu (6):
  migration: Use explicit error_free() instead of g_autoptr
  Revert "error: define g_autoptr() cleanup function for the Error type"
  migration: Make migration_connect_set_error() own the error
  migration: Make multifd_send_set_error() own the error
  migration: Make multifd_recv_terminate_threads() own the error
  migration: Replace migrate_set_error() with migrate_error_propagate()

 include/qapi/error.h             | 20 ++++++++++++-
 migration/migration.h            |  2 +-
 migration/channel.c              |  1 -
 migration/cpr-exec.c             |  5 ++--
 migration/migration.c            | 51 +++++++++++++++-----------------
 migration/multifd-device-state.c |  6 ++--
 migration/multifd.c              | 30 +++++++++----------
 migration/postcopy-ram.c         |  5 ++--
 migration/ram.c                  |  4 +--
 migration/savevm.c               | 17 +++++------
 10 files changed, 73 insertions(+), 68 deletions(-)

-- 
2.50.1
Re: [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
Posted by Fabiano Rosas 1 week, 4 days ago
Peter Xu <peterx@redhat.com> writes:

> Based-on: <20251125070554.2256181-1-armbru@redhat.com>
>
> This series is based on Markus's recent fix:
>
> [PATCH] migration: Fix double-free on error path
> https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
>
> v2:
> - Added R-bs
> - Patch 1:
>   - update commit message on s/accidentally merged/merged without proper
>     review/ [Markus]
> - Patch 2:
>   - Added a new follow up patch here from Markus to poison Error's autoptr
> - Patch 3:
>   - Rename migration_connect_set_error to migration_connect_error_propagate
>     [Markus]
>   - Add comments in commit log for both migrate_connect() and the rename
>     [Markus]
> - Patch 4:
>   - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
> - Patch 6:
>   - Make migrate_error_propagate() take MigrationState* as before [Markus]
>   - Remove the one use case of g_clear_pointer() [Markus]
>   - Touch up commit message for the change
>
> This series should address the issues discussed in this thread here:
>
> https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org

Thank you Markus for this. It's very helpful to have someone keeping us
in check regarding the usage of generic QEMU interfaces. Migration code
tends to drift incredibly..

>
> The problem is Error is not a good candidate of g_autoptr, however the
> cleanup function was merged without enough review.  Luckily, we only have
> two users so far (after Markus's patch above lands).  This series removes
> the last two in migration code and reverts the auto cleanup function for
> Error.  Instead, poison the auto cleanup function.
>
> When at it, it'll also change migrate_set_error() to start taking ownership
> of errors, just like what most error APIs do.  When at it, it is renamed to
> migrate_error_propagate() to imply migration version of error_propagate().
>
> Comments welcomed, thanks.
>

I think with this series we could now work to reduce the complexity of
migration_connect():

The outgoing code in socket.c and tls.c could call
migration_connect_error_propagate directly so migration_channel_connect
only needs to check migrate_has_error() and then exit as early as
possible. From migration_connect onwards we can assume connection
success.

What do you think?

tangent:
(is it too much bikeshedding if I send a patch doing s/migrat*_/mig_/
all over the place? it's so annoying having to check the code to get the
prefix correct when writing emails)
Re: [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
Posted by Peter Xu 1 week, 4 days ago
On Tue, Dec 02, 2025 at 10:55:04AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Based-on: <20251125070554.2256181-1-armbru@redhat.com>
> >
> > This series is based on Markus's recent fix:
> >
> > [PATCH] migration: Fix double-free on error path
> > https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
> >
> > v2:
> > - Added R-bs
> > - Patch 1:
> >   - update commit message on s/accidentally merged/merged without proper
> >     review/ [Markus]
> > - Patch 2:
> >   - Added a new follow up patch here from Markus to poison Error's autoptr
> > - Patch 3:
> >   - Rename migration_connect_set_error to migration_connect_error_propagate
> >     [Markus]
> >   - Add comments in commit log for both migrate_connect() and the rename
> >     [Markus]
> > - Patch 4:
> >   - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
> > - Patch 6:
> >   - Make migrate_error_propagate() take MigrationState* as before [Markus]
> >   - Remove the one use case of g_clear_pointer() [Markus]
> >   - Touch up commit message for the change
> >
> > This series should address the issues discussed in this thread here:
> >
> > https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org
> 
> Thank you Markus for this. It's very helpful to have someone keeping us
> in check regarding the usage of generic QEMU interfaces. Migration code
> tends to drift incredibly..
> 
> >
> > The problem is Error is not a good candidate of g_autoptr, however the
> > cleanup function was merged without enough review.  Luckily, we only have
> > two users so far (after Markus's patch above lands).  This series removes
> > the last two in migration code and reverts the auto cleanup function for
> > Error.  Instead, poison the auto cleanup function.
> >
> > When at it, it'll also change migrate_set_error() to start taking ownership
> > of errors, just like what most error APIs do.  When at it, it is renamed to
> > migrate_error_propagate() to imply migration version of error_propagate().
> >
> > Comments welcomed, thanks.
> >
> 
> I think with this series we could now work to reduce the complexity of
> migration_connect():
> 
> The outgoing code in socket.c and tls.c could call
> migration_connect_error_propagate directly so migration_channel_connect
> only needs to check migrate_has_error() and then exit as early as
> possible. From migration_connect onwards we can assume connection
> success.
> 
> What do you think?

As long as you read commit 688a3dcba980bf and will manage all those, it
sounds like a good thing to try.

> 
> tangent:
> (is it too much bikeshedding if I send a patch doing s/migrat*_/mig_/
> all over the place? it's so annoying having to check the code to get the
> prefix correct when writing emails)

From downstream POV, it'll be a slight burden whenever we need to backport
later patches to "the world before the rename".  It's not a huge deal but
we should consider that.

I'd confess it's likely the best time to do this if you want it for
upstream POV - we don't have a lot concurrent projects ongoing, so if this
lands it can be in the 1st pull for 11.0.

If you have some "vibe coding" tools, maybe you can spend 2 mins to see how
it looks like and decide whether to send a patch.  I would say don't spend
too much time on this (while you're still keep rebasing the options series! :)

-- 
Peter Xu
Re: [PATCH for-11.0 v2 0/7] migration: Error reporting cleanups
Posted by Peter Xu 1 week, 3 days ago
On Mon, Dec 01, 2025 at 02:45:03PM -0500, Peter Xu wrote:
> Based-on: <20251125070554.2256181-1-armbru@redhat.com>
> 
> This series is based on Markus's recent fix:
> 
> [PATCH] migration: Fix double-free on error path
> https://lore.kernel.org/r/20251125070554.2256181-1-armbru@redhat.com
> 
> v2:
> - Added R-bs
> - Patch 1:
>   - update commit message on s/accidentally merged/merged without proper
>     review/ [Markus]
> - Patch 2:
>   - Added a new follow up patch here from Markus to poison Error's autoptr
> - Patch 3:
>   - Rename migration_connect_set_error to migration_connect_error_propagate
>     [Markus]
>   - Add comments in commit log for both migrate_connect() and the rename
>     [Markus]
> - Patch 4:
>   - Rename multifd_send_set_error to multifd_send_error_propagate [Markus]
> - Patch 6:
>   - Make migrate_error_propagate() take MigrationState* as before [Markus]
>   - Remove the one use case of g_clear_pointer() [Markus]
>   - Touch up commit message for the change
> 
> This series should address the issues discussed in this thread here:
> 
> https://lore.kernel.org/r/871plmk1bc.fsf@pond.sub.org
> 
> The problem is Error is not a good candidate of g_autoptr, however the
> cleanup function was merged without enough review.  Luckily, we only have
> two users so far (after Markus's patch above lands).  This series removes
> the last two in migration code and reverts the auto cleanup function for
> Error.  Instead, poison the auto cleanup function.
> 
> When at it, it'll also change migrate_set_error() to start taking ownership
> of errors, just like what most error APIs do.  When at it, it is renamed to
> migrate_error_propagate() to imply migration version of error_propagate().
> 
> Comments welcomed, thanks.
> 
> Markus Armbruster (1):
>   error: Poison g_autoptr(Error) to prevent its use
> 
> Peter Xu (6):
>   migration: Use explicit error_free() instead of g_autoptr
>   Revert "error: define g_autoptr() cleanup function for the Error type"
>   migration: Make migration_connect_set_error() own the error
>   migration: Make multifd_send_set_error() own the error
>   migration: Make multifd_recv_terminate_threads() own the error
>   migration: Replace migrate_set_error() with migrate_error_propagate()
> 
>  include/qapi/error.h             | 20 ++++++++++++-
>  migration/migration.h            |  2 +-
>  migration/channel.c              |  1 -
>  migration/cpr-exec.c             |  5 ++--
>  migration/migration.c            | 51 +++++++++++++++-----------------
>  migration/multifd-device-state.c |  6 ++--
>  migration/multifd.c              | 30 +++++++++----------
>  migration/postcopy-ram.c         |  5 ++--
>  migration/ram.c                  |  4 +--
>  migration/savevm.c               | 17 +++++------
>  10 files changed, 73 insertions(+), 68 deletions(-)

Thanks for the reviews, I queued all 8 patches for 11.0 (with small
tweaks per discussion).

-- 
Peter Xu