Changes in v2:
Removed some extra changes that were not adding much to the series,
left the multifd_recv_setup() call where it was and stopped merging
the connection code at the end of the series.
Added further cleanup to CPR code in qmp_migrate and moved it to
cpr-transfer.c.
Minor changes:
- patch 4: didn't add a new trace with the hostname, there's already
one at qio_channel_tls_new_client()
- patch 6: changed other instances of s->parameters.mode to migrate_mode()
- patch 7: changed migrate_init() to also use migrate_error_free()
- patch 8: removed extra error_report_err()
- patch 11: removed the extra cpr cleanup
- patch 12: stopped including migration_tls_channel_connect() in the
async portion
- patches 13-15: split the removal of QEMUFile from channel.c and
corrected the RDMA code
- patch 21: removed the comments about memory management
CI run: https://gitlab.com/farosas/qemu/-/pipelines/2245393469
v1 (rfc):
https://lore.kernel.org/r/20251226211930.27565-1-farosas@suse.de
Address some of the issues that make the early connection code a bit
too idiosyncratic. By "early connection" I mean from
qmp_migrate[_incoming] until the start of the migration
thread|coroutine.
(IOW, the whole dance of going into socket code, starting an async
routine, calling back to migration code, checking TLS, going back
again, coming back once more, etc. All while passing an error_in and
hostname string that eventually gets (maybe) ignored in tls code,
along with some is_resume checks along the way)
This series is mostly inspired by the work Markus and Peter did
recently in organizing some of the error handling code. The new
migration_connect_error_propagate() function seems like a good place
to centralize the error handling and call migration_cleanup() during
this early connection phase when everything is still fairly
linear. (apologies if I'm dirtying your design =)
Aside from the initial patches that are a bit disruptive, most of the
series is just refactoring to make the code easier to navigate, names
more consistent and some general cleanups.
- patches 1-8:
General cleanups, could be applied standalone, although they are
prerequisites for the rest of the series.
- patches 9-12:
Changes to allow calling migration_cleanup() from
migration_connect_error_propagate().
The idea here is to make sure error propagation and cleanup happen
when the error is detected, without calling into non-error-path
functions.
This is the riskier change because it causes cleanup to run in
places where it didn't before.
- patches 13 & 19:
The main change of this series, simplifying the
qmp-migrate --> migration_connect path. Stops calling the connection
functions when an error happens in the transport code, adds
clarification around which paths have asynchronous completion and
makes the synchronous paths return to their caller to start the
migration instead of initiating it themselves.
- patches 14-18, 20-22:
Moves code out of migration.c and into channel.c. Now that the code is
more compartmentalized, move it to a more appropriate source file.
- patches 23-25:
BONUS CONTENT, wrap the uri/channels parsing and move it to channel.c
as well.
- future work?
I think we could move all QMP command functions to a QAPI-specific
file, but I don't see any standardization in the tree: there's
block/qapi.c, various instances of foo-qmp-cmds.c and many more just
lying alongside the rest of the code. So I left this for another
time.
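To illustrate the shape that patches 9-12 and 13 & 19 aim for, here is
a minimal standalone sketch. It is not the code from the series: every
sketch_* name and the SketchState struct are hypothetical stand-ins.
It only shows the two ideas described above: the error is recorded and
cleanup runs at the point where the failure is detected, and the
synchronous transport path returns to its caller, which is the one
that starts the migration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the migration state, not a QEMU type. */
typedef struct SketchState {
    const char *error; /* first error recorded, or NULL */
    int cleanups;      /* how many times cleanup ran */
    bool started;      /* true once the migration was started */
} SketchState;

/* Stand-in for the cleanup step. */
static void sketch_cleanup(SketchState *s)
{
    s->cleanups++;
}

/*
 * Stand-in for the error propagation helper: record the first error
 * and clean up right away, without going through non-error paths.
 */
static void sketch_connect_error_propagate(SketchState *s, const char *msg)
{
    if (!s->error) {
        s->error = msg;
    }
    sketch_cleanup(s);
}

/*
 * Synchronous transport path: it only connects and reports the
 * result. It never starts the migration itself. The 'fail' flag is
 * an assumption used to simulate a transport error.
 */
static bool sketch_transport_connect(SketchState *s, bool fail)
{
    if (fail) {
        sketch_connect_error_propagate(s, "transport connect failed");
        return false;
    }
    return true;
}

/* Stand-in for starting the migration thread. */
static void sketch_start(SketchState *s)
{
    s->started = true;
}

/* Caller: connection functions are not called further on error. */
static bool sketch_qmp_migrate(SketchState *s, bool fail)
{
    if (!sketch_transport_connect(s, fail)) {
        /* Error already recorded and cleanup already done. */
        return false;
    }
    sketch_start(s);
    return true;
}
```

With this layout a failed connect leaves exactly one cleanup call and
no started migration, while the success path never touches the
cleanup code.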
CI run: https://gitlab.com/farosas/qemu/-/pipelines/2233810778
Fabiano Rosas (25):
migration: Remove redundant state change
migration: Fix state change at migration_channel_process_incoming
migration/tls: Remove unused parameter
migration: Cleanup TLS handshake hostname passing
migration: Move postcopy_try_recover into migration_incoming_process
migration: Use migrate_mode() to query for cpr-transfer
migration: Free the error earlier in the resume case
migration: Move error reporting out of migration_cleanup
migration: Expand migration_connect_error_propagate to cover
cancelling
migration: yank: Move register instance earlier
migration: Fold migration_cleanup() into
migration_connect_error_propagate()
migration: Handle error in the early async paths
migration: Move setting of QEMUFile into
migration_outgoing|incoming_setup
migration/rdma: Use common connection paths
migration: Start incoming from channel.c
migration/channel: Rename migration_channel_connect
migration: Rename instances of start
migration: Move channel code to channel.c
migration: Move transport connection code into channel.c
migration: Move channel parsing to channel.c
migration: Move URI parsing to channel.c
migration: Free cpr-transfer MigrationAddress along with gsource
migration: Move CPR HUP watch to cpr-transfer.c
migration: Remove qmp_migrate_finish
migration/channel: Centralize calling
migration_channel_connect_outgoing
include/migration/cpr.h | 5 +
migration/channel.c | 370 ++++++++++++++++++++++----
migration/channel.h | 27 +-
migration/cpr-exec.c | 2 +-
migration/cpr-transfer.c | 23 ++
migration/exec.c | 11 +-
migration/exec.h | 8 +-
migration/fd.c | 15 +-
migration/fd.h | 9 +-
migration/file.c | 20 +-
migration/file.h | 7 +-
migration/migration.c | 557 ++++++++++-----------------------------
migration/migration.h | 16 +-
migration/multifd.c | 17 +-
migration/multifd.h | 2 +-
migration/options.c | 5 +
migration/postcopy-ram.c | 2 +-
migration/rdma.c | 46 ++--
migration/rdma.h | 6 +-
migration/socket.c | 30 +--
migration/socket.h | 6 +-
migration/tls.c | 33 +--
migration/tls.h | 9 +-
migration/trace-events | 20 +-
24 files changed, 637 insertions(+), 609 deletions(-)
--
2.51.0