From: Prasad Pandit <pjp@fedoraproject.org> Hello, * Currently, Multifd and Postcopy migration cannot be used together. QEMU shows a "Postcopy is not yet compatible with multifd" message. When migrating guests with large (hundreds of GB) RAM, Multifd threads help to accelerate migration, but the inability to use them with Postcopy mode delays guest start-up on the destination side. * This patch series allows Multifd and Postcopy migration to be enabled together. The Precopy and Multifd threads work during the initial guest (RAM) transfer. When migration moves to the Postcopy phase, the Precopy and Multifd threads on the source side stop sending data on their channels, and the Postcopy threads on the destination start to request pages from the source side. * This series introduces a 4-byte magic value to be sent on the Postcopy channel. It helps to differentiate the channels and properly set up the incoming connections on the destination side. Thank you. --- Prasad Pandit (5): migration/multifd: move macros to multifd header migration/postcopy: magic value for postcopy channel migration: remove multifd check with postcopy migration: refactor ram_save_target_page functions migration: enable multifd and postcopy together migration/migration.c | 73 ++++++++++++++++++++++++---------------- migration/multifd.c | 4 --- migration/multifd.h | 5 +++ migration/options.c | 8 ++--- migration/postcopy-ram.c | 7 ++++ migration/postcopy-ram.h | 3 ++ migration/ram.c | 54 ++++++++++++----------------- 7 files changed, 83 insertions(+), 71 deletions(-) -- 2.47.0
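For readers new to the scheme the series relies on, the standalone sketch below (illustrative C, not QEMU code; the main-channel magic value and the classify_channel() helper are assumptions made here for demonstration) shows the idea: every incoming connection announces itself with a leading 4-byte, big-endian magic value, and the destination classifies the connection by comparing those first 4 bytes.

#include <stdint.h>
#include <stdio.h>

#define VM_FILE_MAGIC   0x5145564dU   /* main channel ("QEVM"); value assumed here */
#define MULTIFD_MAGIC   0x11223344U   /* multifd channel, as in multifd.c */
#define POSTCOPY_MAGIC  0x55667788U   /* postcopy channel, introduced in patch 2 */

/* Decode 4 bytes in network (big-endian) order. */
static uint32_t be32_load(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static const char *classify_channel(const unsigned char first4[4])
{
    switch (be32_load(first4)) {
    case VM_FILE_MAGIC:  return "main (precopy) channel";
    case MULTIFD_MAGIC:  return "multifd channel";
    case POSTCOPY_MAGIC: return "postcopy channel";
    default:             return "unknown channel";
    }
}

int main(void)
{
    unsigned char buf[4] = { 0x11, 0x22, 0x33, 0x44 };
    printf("%s\n", classify_channel(buf));   /* prints "multifd channel" */
    return 0;
}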
From: Prasad Pandit <pjp@fedoraproject.org> Move MULTIFD_ macros to the header file so that they are accessible from other files. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/multifd.c | 4 ---- migration/multifd.h | 5 +++++ 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index XXXXXXX..XXXXXXX 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -XXX,XX +XXX,XX @@ #include "io/channel-socket.h" #include "yank_functions.h" -/* Multiple fd's */ - -#define MULTIFD_MAGIC 0x11223344U -#define MULTIFD_VERSION 1 typedef struct { uint32_t magic; diff --git a/migration/multifd.h b/migration/multifd.h index XXXXXXX..XXXXXXX 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -XXX,XX +XXX,XX @@ bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); bool multifd_recv(void); MultiFDRecvData *multifd_get_recv_data(void); +/* Multiple fd's */ + +#define MULTIFD_MAGIC 0x11223344U +#define MULTIFD_VERSION 1 + /* Multifd Compression flags */ #define MULTIFD_FLAG_SYNC (1 << 0) -- 2.47.0
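With the constants exported, code outside multifd.c can test a peeked magic value against MULTIFD_MAGIC without duplicating the constant. A minimal fragment of the kind of check this enables (channel_magic is assumed to hold the first 4 bytes read from a new connection; the comparison matches the one used later in migration.c):

#include "multifd.h"   /* same-directory include, as used within migration/ */

    /* channel_magic: raw 4 bytes peeked from the incoming connection */
    if (channel_magic == cpu_to_be32(MULTIFD_MAGIC)) {
        /* hand this connection to the multifd receive side */
    }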
From: Prasad Pandit <pjp@fedoraproject.org> During migration, the precopy and multifd channels send a 4-byte magic value to the destination side so that it can identify the channel and properly establish the connection. The postcopy channel, however, did not send such a value. Introduce a magic value to be sent on the postcopy channel. It helps to identify the channels when both multifd and postcopy migration are enabled together. An explicitly defined magic value also makes the code easier to follow, because it is consistent with the other channels. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/postcopy-ram.c | 7 +++++++ migration/postcopy-ram.h | 3 +++ 2 files changed, 10 insertions(+) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index XXXXXXX..XXXXXXX 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -XXX,XX +XXX,XX @@ void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd) void postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file) { + if (mis->postcopy_qemufile_dst) { + return; + } /* * The new loading channel has its own threads, so it needs to be * blocked too. It's by default true, just be explicit. @@ -XXX,XX +XXX,XX @@ postcopy_preempt_send_channel_done(MigrationState *s, * postcopy_qemufile_src to know whether it failed or not. */ qemu_sem_post(&s->postcopy_qemufile_src_sem); + + /* Send magic value to identify postcopy channel on the destination */ + uint32_t magic = cpu_to_be32(POSTCOPY_MAGIC); + qio_channel_write_all(ioc, (char *)&magic, sizeof(magic), NULL); } static void diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index XXXXXXX..XXXXXXX 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -XXX,XX +XXX,XX @@ #include "qapi/qapi-types-migration.h" +/* Magic value to identify postcopy channel on the destination */ +#define POSTCOPY_MAGIC 0x55667788U + /* Return true if the host supports everything we need to do postcopy-ram */ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis, Error **errp); -- 2.47.0
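The handshake this adds is small; a sketch of both ends is shown below (a fragment with error handling elided; ioc and local_err come from the surrounding code, and the destination is shown doing a plain read rather than the MSG_PEEK-based lookup that the later migration.c patch performs):

    /* Source: announce the postcopy channel once it is set up */
    uint32_t magic = cpu_to_be32(POSTCOPY_MAGIC);
    qio_channel_write_all(ioc, (char *)&magic, sizeof(magic), NULL);

    /* Destination: read the leading 4 bytes and compare */
    uint32_t channel_magic;
    if (qio_channel_read_all(ioc, (char *)&channel_magic,
                             sizeof(channel_magic), &local_err) == 0 &&
        channel_magic == cpu_to_be32(POSTCOPY_MAGIC)) {
        /* treat this connection as the postcopy preempt channel */
    }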
From: Prasad Pandit <pjp@fedoraproject.org> Remove the multifd capability check against Postcopy mode. This helps to enable both multifd and postcopy together. Update migrate_multifd() to return false when migration reaches the Postcopy phase. In the Postcopy phase the source guest is paused, so the migration threads on the source stop sending/pushing data on their channels; the destination guest starts running, and the Postcopy threads there begin to request/pull data from the source side. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/options.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-) diff --git a/migration/options.c b/migration/options.c index XXXXXXX..XXXXXXX 100644 --- a/migration/options.c +++ b/migration/options.c @@ -XXX,XX +XXX,XX @@ bool migrate_multifd(void) { MigrationState *s = migrate_get_current(); - return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; + return s->capabilities[MIGRATION_CAPABILITY_MULTIFD] + && !migration_in_postcopy(); } bool migrate_pause_before_switchover(void) @@ -XXX,XX +XXX,XX @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy is not compatible with ignore-shared"); return false; } - - if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) { - error_setg(errp, "Postcopy is not yet compatible with multifd"); - return false; - } } if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) { -- 2.47.0
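The practical effect is that existing call sites need no new postcopy checks; an illustrative fragment (not a specific QEMU call site) of how a caller behaves after this change:

    if (migrate_multifd()) {
        /* precopy phase: queue the page to a multifd worker thread */
    } else {
        /* multifd off, or migration already in the postcopy phase:
         * migrate_multifd() now returns false once
         * migration_in_postcopy() is true, so the caller falls back to
         * the regular page path without any extra check */
    }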
From: Prasad Pandit <pjp@fedoraproject.org> Refactor ram_save_target_page legacy and multifd functions into one. Other than simplifying it, it avoids reinitialization of the 'migration_ops' object, when migration moves from multifd to postcopy phase. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/ram.c | 54 ++++++++++++++++++++----------------------------- 1 file changed, 22 insertions(+), 32 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index XXXXXXX..XXXXXXX 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -XXX,XX +XXX,XX @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len, } /** - * ram_save_target_page_legacy: save one target page + * ram_save_target_page_common: + * send one target page to multifd workers OR save one target page. * - * Returns the number of pages written + * Multifd mode: returns 1 if the page was queued, -1 otherwise. + * + * Non-multifd mode: returns the number of pages written * * @rs: current RAM state * @pss: data about the page we want to send */ -static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) +static int ram_save_target_page_common(RAMState *rs, PageSearchStatus *pss) { ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; int res; + if (migrate_multifd()) { + RAMBlock *block = pss->block; + /* + * While using multifd live migration, we still need to handle zero + * page checking on the migration main thread. + */ + if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { + if (save_zero_page(rs, pss, offset)) { + return 1; + } + } + + return ram_save_multifd_page(block, offset); + } + if (control_save_page(pss, offset, &res)) { return res; } @@ -XXX,XX +XXX,XX @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) return ram_save_page(rs, pss); } -/** - * ram_save_target_page_multifd: send one target page to multifd workers - * - * Returns 1 if the page was queued, -1 otherwise. - * - * @rs: current RAM state - * @pss: data about the page we want to send - */ -static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) -{ - RAMBlock *block = pss->block; - ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; - - /* - * While using multifd live migration, we still need to handle zero - * page checking on the migration main thread. - */ - if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { - if (save_zero_page(rs, pss, offset)) { - return 1; - } - } - - return ram_save_multifd_page(block, offset); -} - /* Should be called before sending a host page */ static void pss_host_page_prepare(PageSearchStatus *pss) { @@ -XXX,XX +XXX,XX @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp) } migration_ops = g_malloc0(sizeof(MigrationOps)); + migration_ops->ram_save_target_page = ram_save_target_page_common; if (migrate_multifd()) { multifd_ram_save_setup(); - migration_ops->ram_save_target_page = ram_save_target_page_multifd; - } else { - migration_ops->ram_save_target_page = ram_save_target_page_legacy; } bql_unlock(); -- 2.47.0
From: Prasad Pandit <pjp@fedoraproject.org> Enable Multifd and Postcopy migration together. The migration_ioc_process_incoming() routine checks magic value sent on each channel and helps to properly setup multifd and postcopy channels. Idea is to take advantage of the multifd threads to accelerate transfer of large guest RAM to the destination and switch to postcopy mode sooner. The Precopy and Multifd threads work during the initial guest RAM transfer. When migration moves to the Postcopy phase, the source guest is paused, so the Precopy and Multifd threads stop sending data on their channels. Postcopy threads on the destination request/pull data from the source side. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/migration.c | 73 ++++++++++++++++++++++++++----------------- 1 file changed, 44 insertions(+), 29 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index XXXXXXX..XXXXXXX 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -XXX,XX +XXX,XX @@ enum mig_rp_message_type { MIG_RP_MSG_MAX }; +/* Migration channel types */ +enum { CH_DEFAULT, CH_MULTIFD, CH_POSTCOPY }; + /* When we add fault tolerance, we could have several migrations at once. For now we don't need to add dynamic creation of migration */ @@ -XXX,XX +XXX,XX @@ void migration_fd_process_incoming(QEMUFile *f) * Returns true when we want to start a new incoming migration process, * false otherwise. */ -static bool migration_should_start_incoming(bool main_channel) +static bool migration_should_start_incoming(uint8_t channel) { + if (channel == CH_POSTCOPY) { + return false; + } + /* Multifd doesn't start unless all channels are established */ if (migrate_multifd()) { - return migration_has_all_channels(); - } - - /* Preempt channel only starts when the main channel is created */ - if (migrate_postcopy_preempt()) { - return main_channel; + return multifd_recv_all_channels_created(); } /* @@ -XXX,XX +XXX,XX @@ static bool migration_should_start_incoming(bool main_channel) * it's the main channel that's being created, and we should always * proceed with this channel. 
*/ - assert(main_channel); + assert(channel == CH_DEFAULT); return true; } @@ -XXX,XX +XXX,XX @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) MigrationIncomingState *mis = migration_incoming_get_current(); Error *local_err = NULL; QEMUFile *f; - bool default_channel = true; uint32_t channel_magic = 0; + uint8_t channel = CH_DEFAULT; int ret = 0; - if (migrate_multifd() && !migrate_mapped_ram() && - !migrate_postcopy_ram() && - qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { + if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { /* * With multiple channels, it is possible that we receive channels * out of order on destination side, causing incorrect mapping of @@ -XXX,XX +XXX,XX @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) return; } - default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)); - } else { - default_channel = !mis->from_src_file; + if (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)) { + channel = CH_DEFAULT; + } else if (channel_magic == cpu_to_be32(MULTIFD_MAGIC)) { + channel = CH_MULTIFD; + } else if (channel_magic == cpu_to_be32(POSTCOPY_MAGIC)) { + if (qio_channel_read_all(ioc, (char *)&channel_magic, + sizeof(channel_magic), &local_err)) { + error_report_err(local_err); + return; + } + channel = CH_POSTCOPY; + } else { + error_report("%s: could not identify channel, unknown magic: %u", + __func__, channel_magic); + return; + } } if (multifd_recv_setup(errp) != 0) { return; } - if (default_channel) { + if (channel == CH_DEFAULT) { f = qemu_file_new_input(ioc); migration_incoming_setup(f); - } else { + } else if (channel == CH_MULTIFD) { /* Multiple connections */ - assert(migration_needs_multiple_sockets()); if (migrate_multifd()) { multifd_recv_new_channel(ioc, &local_err); - } else { + } + if (local_err) { + error_propagate(errp, local_err); + return; + } + } else if (channel == CH_POSTCOPY) { + if (migrate_postcopy()) { assert(migrate_postcopy_preempt()); f = qemu_file_new_input(ioc); postcopy_preempt_new_channel(mis, f); } - if (local_err) { - error_propagate(errp, local_err); - return; - } } - if (migration_should_start_incoming(default_channel)) { + if (migration_should_start_incoming(channel)) { /* If it's a recovery, we're done */ if (postcopy_try_recover()) { return; @@ -XXX,XX +XXX,XX @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) */ bool migration_has_all_channels(void) { + bool ret = false; MigrationIncomingState *mis = migration_incoming_get_current(); if (!mis->from_src_file) { - return false; + return ret; } if (migrate_multifd()) { - return multifd_recv_all_channels_created(); + ret = multifd_recv_all_channels_created(); } - if (migrate_postcopy_preempt()) { - return mis->postcopy_qemufile_dst != NULL; + if (ret && migrate_postcopy_preempt()) { + ret = mis->postcopy_qemufile_dst != NULL; } - return true; + return ret; } int migrate_send_rp_switchover_ack(MigrationIncomingState *mis) -- 2.47.0
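Condensing the dispatch that this patch implements (a fragment for review purposes only; the calls and channel constants are the ones visible in the diff above):

    switch (channel) {
    case CH_DEFAULT:                     /* QEMU_VM_FILE_MAGIC seen */
        f = qemu_file_new_input(ioc);
        migration_incoming_setup(f);
        break;
    case CH_MULTIFD:                     /* MULTIFD_MAGIC seen */
        multifd_recv_new_channel(ioc, &local_err);
        break;
    case CH_POSTCOPY:                    /* POSTCOPY_MAGIC seen */
        f = qemu_file_new_input(ioc);
        postcopy_preempt_new_channel(mis, f);
        break;
    }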
From: Prasad Pandit <pjp@fedoraproject.org> Hello, * Currently, Multifd and Postcopy migration cannot be used together. QEMU shows a "Postcopy is not yet compatible with multifd" message. When migrating guests with large (hundreds of GB) RAM, Multifd threads help to accelerate migration, but the inability to use them with Postcopy mode delays guest start-up on the destination side. * This patch series allows Multifd and Postcopy migration to be enabled together. The Precopy and Multifd threads work during the initial guest (RAM) transfer. When migration moves to the Postcopy phase, the Multifd threads stop sending data and the Postcopy threads start to request pages from the source side. * This series removes the 4-byte magic value introduced in the previous revision for the Postcopy channel. The refactoring of the 'ram_save_target_page' function is also made independent of the multifd & postcopy change. v1: https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u v0: https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u Thank you. --- Prasad Pandit (3): migration/multifd: move macros to multifd header migration: refactor ram_save_target_page functions migration: enable multifd and postcopy together migration/migration.c | 90 +++++++++++++++++++++++--------------- migration/multifd-nocomp.c | 3 +- migration/multifd.c | 5 --- migration/multifd.h | 5 +++ migration/options.c | 5 --- migration/ram.c | 73 +++++++++---------------------- 6 files changed, 82 insertions(+), 99 deletions(-) -- 2.47.1
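Since the dedicated magic value is gone in this revision, the postcopy channel is identified by elimination. A condensed fragment of the rule that the final patch of this series implements (channel constants and helpers as introduced there):

    if (!migration_should_start_incoming(CH_DEFAULT) &&
        qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
        /* incoming migration has not started yet: peek the first 4 bytes
         * and classify the connection as CH_DEFAULT (QEMU_VM_FILE_MAGIC)
         * or CH_MULTIFD (MULTIFD_MAGIC) */
    } else {
        /* the main (and multifd) channels are already in place, so this
         * connection can only be the postcopy preempt channel */
        channel = CH_POSTCOPY;
    }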
From: Prasad Pandit <pjp@fedoraproject.org> Move MULTIFD_ macros to the header file so that they are accessible from other source files. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/multifd.c | 5 ----- migration/multifd.h | 5 +++++ 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index XXXXXXX..XXXXXXX 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -XXX,XX +XXX,XX @@ #include "io/channel-socket.h" #include "yank_functions.h" -/* Multiple fd's */ - -#define MULTIFD_MAGIC 0x11223344U -#define MULTIFD_VERSION 1 - typedef struct { uint32_t magic; uint32_t version; diff --git a/migration/multifd.h b/migration/multifd.h index XXXXXXX..XXXXXXX 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -XXX,XX +XXX,XX @@ bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); bool multifd_recv(void); MultiFDRecvData *multifd_get_recv_data(void); +/* Multiple fd's */ + +#define MULTIFD_MAGIC 0x11223344U +#define MULTIFD_VERSION 1 + /* Multifd Compression flags */ #define MULTIFD_FLAG_SYNC (1 << 0) -- 2.47.1
From: Prasad Pandit <pjp@fedoraproject.org> Refactor ram_save_target_page legacy and multifd functions into one. Other than simplifying it, it frees 'migration_ops' object from usage, so it is expunged. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/ram.c | 67 +++++++++++++------------------------------------ 1 file changed, 17 insertions(+), 50 deletions(-) v2: Make refactoring change independent of the multifd & postcopy change. v1: Further refactor ram_save_target_page() function to conflate save_zero_page() calls. - https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u v0: - https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u diff --git a/migration/ram.c b/migration/ram.c index XXXXXXX..XXXXXXX 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -XXX,XX +XXX,XX @@ void ram_transferred_add(uint64_t bytes) } } -struct MigrationOps { - int (*ram_save_target_page)(RAMState *rs, PageSearchStatus *pss); -}; -typedef struct MigrationOps MigrationOps; - -MigrationOps *migration_ops; - static int ram_save_host_page_urgent(PageSearchStatus *pss); /* NOTE: page is the PFN not real ram_addr_t. */ @@ -XXX,XX +XXX,XX @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len, } /** - * ram_save_target_page_legacy: save one target page - * - * Returns the number of pages written + * ram_save_target_page: save one target page to the precopy thread + * OR to multifd workers. * * @rs: current RAM state * @pss: data about the page we want to send */ -static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) +static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) { ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; int res; + if (!migrate_multifd() + || migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { + if (save_zero_page(rs, pss, offset)) { + return 1; + } + } + + if (migrate_multifd()) { + RAMBlock *block = pss->block; + return ram_save_multifd_page(block, offset); + } + if (control_save_page(pss, offset, &res)) { return res; } - if (save_zero_page(rs, pss, offset)) { - return 1; - } - return ram_save_page(rs, pss); } -/** - * ram_save_target_page_multifd: send one target page to multifd workers - * - * Returns 1 if the page was queued, -1 otherwise. - * - * @rs: current RAM state - * @pss: data about the page we want to send - */ -static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) -{ - RAMBlock *block = pss->block; - ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; - - /* - * While using multifd live migration, we still need to handle zero - * page checking on the migration main thread. - */ - if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) { - if (save_zero_page(rs, pss, offset)) { - return 1; - } - } - - return ram_save_multifd_page(block, offset); -} - /* Should be called before sending a host page */ static void pss_host_page_prepare(PageSearchStatus *pss) { @@ -XXX,XX +XXX,XX @@ static int ram_save_host_page_urgent(PageSearchStatus *pss) if (page_dirty) { /* Be strict to return code; it must be 1, or what else? 
*/ - if (migration_ops->ram_save_target_page(rs, pss) != 1) { + if (ram_save_target_page(rs, pss) != 1) { error_report_once("%s: ram_save_target_page failed", __func__); ret = -1; goto out; @@ -XXX,XX +XXX,XX @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss) if (preempt_active) { qemu_mutex_unlock(&rs->bitmap_mutex); } - tmppages = migration_ops->ram_save_target_page(rs, pss); + tmppages = ram_save_target_page(rs, pss); if (tmppages >= 0) { pages += tmppages; /* @@ -XXX,XX +XXX,XX @@ static void ram_save_cleanup(void *opaque) xbzrle_cleanup(); multifd_ram_save_cleanup(); ram_state_cleanup(rsp); - g_free(migration_ops); - migration_ops = NULL; } static void ram_state_reset(RAMState *rs) @@ -XXX,XX +XXX,XX @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp) return ret; } - migration_ops = g_malloc0(sizeof(MigrationOps)); - if (migrate_multifd()) { multifd_ram_save_setup(); - migration_ops->ram_save_target_page = ram_save_target_page_multifd; - } else { - migration_ops->ram_save_target_page = ram_save_target_page_legacy; } bql_unlock(); -- 2.47.1
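The subtle part of the merge is the combined zero-page condition. A standalone sketch of the equivalent logic (stub booleans stand in for migrate_multifd() and for zero-page-detection being set to legacy; this is not QEMU code):

#include <stdbool.h>
#include <stdio.h>

static bool multifd_enabled;        /* migrate_multifd() */
static bool zero_detection_legacy;  /* zero-page-detection == legacy */

/* Zero pages are still checked on the main migration thread either when
 * multifd is off (the old non-multifd path always checked them) or when
 * multifd is on but zero-page detection is set to "legacy". */
static bool check_zero_page_on_main_thread(void)
{
    return !multifd_enabled || zero_detection_legacy;
}

int main(void)
{
    multifd_enabled = true;
    zero_detection_legacy = false;
    printf("main thread checks zero pages: %d\n",
           check_zero_page_on_main_thread());   /* prints 0 here */
    return 0;
}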
From: Prasad Pandit <pjp@fedoraproject.org> Enable Multifd and Postcopy migration together. The migration_ioc_process_incoming() routine checks magic value sent on each channel and helps to properly setup multifd and postcopy channels. The Precopy and Multifd threads work during the initial guest RAM transfer. When migration moves to the Postcopy phase, the multifd threads are restrained and Postcopy threads on the destination request/pull data from the source side. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/migration.c | 90 +++++++++++++++++++++++--------------- migration/multifd-nocomp.c | 3 +- migration/options.c | 5 --- migration/ram.c | 8 ++-- 4 files changed, 61 insertions(+), 45 deletions(-) v2: Merge earlier options.c patch into this one. Also make !migration_in_postcopy() check in this patch, to separate refactoring change from this one. v1: Avoid using 4-bytes magic value for the Postcopy channel. Flush and synchronise Multifd thread before postcopy_start(). - https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u v0: - https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u diff --git a/migration/migration.c b/migration/migration.c index XXXXXXX..XXXXXXX 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -XXX,XX +XXX,XX @@ enum mig_rp_message_type { MIG_RP_MSG_MAX }; +/* Migration channel types */ +enum { CH_DEFAULT, CH_MULTIFD, CH_POSTCOPY }; + /* When we add fault tolerance, we could have several migrations at once. For now we don't need to add dynamic creation of migration */ @@ -XXX,XX +XXX,XX @@ void migration_fd_process_incoming(QEMUFile *f) /* * Returns true when we want to start a new incoming migration process, * false otherwise. + * + * All the required channels must be in place before a new incoming + * migration process starts. + * - Multifd enabled: + * The main channel and the multifd channels are required. + * - Multifd/Postcopy disabled: + * The main channel is required. + * - Postcopy enabled: + * We don't want to start a new incoming migration when + * the postcopy channel is created. Because it is created + * towards the end of the precopy migration. */ -static bool migration_should_start_incoming(bool main_channel) +static bool migration_should_start_incoming(uint8_t channel) { - /* Multifd doesn't start unless all channels are established */ - if (migrate_multifd()) { - return migration_has_all_channels(); - } + bool ret = false; + + if (channel != CH_POSTCOPY) { + MigrationIncomingState *mis = migration_incoming_get_current(); + ret = mis->from_src_file ? true : false; - /* Preempt channel only starts when the main channel is created */ - if (migrate_postcopy_preempt()) { - return main_channel; + if (ret && migrate_multifd()) { + ret = multifd_recv_all_channels_created(); + } } - /* - * For all the rest types of migration, we should only reach here when - * it's the main channel that's being created, and we should always - * proceed with this channel. 
- */ - assert(main_channel); - return true; + return ret; } void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) @@ -XXX,XX +XXX,XX @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) MigrationIncomingState *mis = migration_incoming_get_current(); Error *local_err = NULL; QEMUFile *f; - bool default_channel = true; uint32_t channel_magic = 0; + uint8_t channel = CH_DEFAULT; int ret = 0; - if (migrate_multifd() && !migrate_mapped_ram() && - !migrate_postcopy_ram() && - qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { + if (!migration_should_start_incoming(channel) + && qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { /* * With multiple channels, it is possible that we receive channels * out of order on destination side, causing incorrect mapping of @@ -XXX,XX +XXX,XX @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) return; } - default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)); + if (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)) { + channel = CH_DEFAULT; + } else if (channel_magic == cpu_to_be32(MULTIFD_MAGIC)) { + channel = CH_MULTIFD; + } else { + error_report("%s: could not identify channel, unknown magic: %u", + __func__, channel_magic); + return; + } + } else { - default_channel = !mis->from_src_file; + channel = CH_POSTCOPY; } if (multifd_recv_setup(errp) != 0) { return; } - if (default_channel) { + if (channel == CH_DEFAULT) { f = qemu_file_new_input(ioc); migration_incoming_setup(f); - } else { + } else if (channel == CH_MULTIFD) { /* Multiple connections */ - assert(migration_needs_multiple_sockets()); if (migrate_multifd()) { multifd_recv_new_channel(ioc, &local_err); - } else { + } + if (local_err) { + error_propagate(errp, local_err); + return; + } + } else if (channel == CH_POSTCOPY) { + if (migrate_postcopy()) { assert(migrate_postcopy_preempt()); + assert(!mis->postcopy_qemufile_dst); f = qemu_file_new_input(ioc); postcopy_preempt_new_channel(mis, f); } - if (local_err) { - error_propagate(errp, local_err); - return; - } } - if (migration_should_start_incoming(default_channel)) { + if (migration_should_start_incoming(channel)) { /* If it's a recovery, we're done */ if (postcopy_try_recover()) { return; @@ -XXX,XX +XXX,XX @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) */ bool migration_has_all_channels(void) { + bool ret = false; MigrationIncomingState *mis = migration_incoming_get_current(); if (!mis->from_src_file) { - return false; + return ret; } if (migrate_multifd()) { - return multifd_recv_all_channels_created(); + ret = multifd_recv_all_channels_created(); } - if (migrate_postcopy_preempt()) { - return mis->postcopy_qemufile_dst != NULL; + if (ret && migrate_postcopy_preempt()) { + ret = mis->postcopy_qemufile_dst != NULL; } - return true; + return ret; } int migrate_send_rp_switchover_ack(MigrationIncomingState *mis) diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index XXXXXXX..XXXXXXX 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -XXX,XX +XXX,XX @@ #include "exec/ramblock.h" #include "exec/target_page.h" #include "file.h" +#include "migration.h" #include "multifd.h" #include "options.h" #include "qapi/error.h" @@ -XXX,XX +XXX,XX @@ retry: int multifd_ram_flush_and_sync(void) { - if (!migrate_multifd()) { + if (!migrate_multifd() || migration_in_postcopy()) { return 0; } diff --git a/migration/options.c b/migration/options.c index XXXXXXX..XXXXXXX 100644 --- a/migration/options.c +++ 
b/migration/options.c @@ -XXX,XX +XXX,XX @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy is not compatible with ignore-shared"); return false; } - - if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) { - error_setg(errp, "Postcopy is not yet compatible with multifd"); - return false; - } } if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) { diff --git a/migration/ram.c b/migration/ram.c index XXXXXXX..XXXXXXX 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -XXX,XX +XXX,XX @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss) pss->page = 0; pss->block = QLIST_NEXT_RCU(pss->block, next); if (!pss->block) { - if (migrate_multifd() && - (!migrate_multifd_flush_after_each_section() || - migrate_mapped_ram())) { + if (migrate_multifd() && !migration_in_postcopy() + && (!migrate_multifd_flush_after_each_section() + || migrate_mapped_ram())) { QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel; int ret = multifd_ram_flush_and_sync(); if (ret < 0) { @@ -XXX,XX +XXX,XX @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) } } - if (migrate_multifd()) { + if (migrate_multifd() && !migration_in_postcopy()) { RAMBlock *block = pss->block; return ram_save_multifd_page(block, offset); } -- 2.47.1
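With this series applied, both capabilities can be enabled for the same migration. One possible way to exercise the combination from the HMP monitor (a sketch, not part of the series; command, capability, and parameter names are as in current QEMU, the destination is assumed to have been started with a matching -incoming option, and dest-host:4444 is a placeholder):

    (qemu) migrate_set_capability multifd on
    (qemu) migrate_set_parameter multifd-channels 8
    (qemu) migrate_set_capability postcopy-ram on
    (qemu) migrate -d tcp:dest-host:4444
    (qemu) migrate_start_postcopy

Both capabilities also need to be enabled on the destination before the incoming migration starts.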