From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>

A new version of the multifd device state transfer support with VFIO consumer
patch set is being prepared; the previous version and the associated
discussion are available here:
https://lore.kernel.org/qemu-devel/cover.1724701542.git.maciej.szmigiero@oracle.com/

This new version was originally targeting QEMU 9.2, but that schedule proved
to be too optimistic due to the sheer number of invasive changes/rework required,
especially with respect to the VFIO internal threads management and their
synchronization with the migration core.

In addition to these changes, recently merged commit 3b5948f808e3
("vfio/migration: Report only stop-copy size in vfio_state_pending_exact()")
seems to have uncovered a race between multifd RAM and device state transfers:
the RAM transfer sender finishes the multifd stream with a SYNC in
ram_save_complete(), but the multifd receive channels are only released
from this SYNC after the migration is wholly complete, in
process_incoming_migration_bh().

This causes problems if the multifd channels still need to be
running after the RAM transfer is completed, for example because
there is still remaining device state to be transferred.

Since the QEMU 9.2 code freeze is coming, I've separated small uncontroversial
commits from that WiP main patch set here, some of which were already
reviewed during previous main patch set iterations.

This way at least future code conflicts can be reduced, and so can the number
of patches that need to be carried in future versions of the main
patch set.


Maciej S. Szmigiero (4):
  vfio/migration: Add save_{iterate,complete_precopy}_started trace
    events
  migration/ram: Add load start trace event
  migration/multifd: Zero p->flags before starting filling a packet
  migration: Document the BQL behavior of load SaveVMHandlers

 hw/vfio/migration.c           | 13 +++++++++++++
 hw/vfio/trace-events          |  3 +++
 include/hw/vfio/vfio-common.h |  3 +++
 include/migration/register.h  |  4 ++++
 migration/multifd.c           |  2 +-
 migration/ram.c               |  1 +
 migration/trace-events        |  1 +
 7 files changed, 26 insertions(+), 1 deletion(-)
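
As a rough illustration of the race described in the cover letter, here is a
toy, single-threaded model of one receive channel. It is only a sketch under
stated assumptions: the packet kinds, struct recv_channel, channel_run() and
channel_release() are all hypothetical names invented for this example and are
not QEMU's multifd code. The only behavior taken from the description is that a
channel stops at a SYNC and stays stopped until something releases it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical packet kinds; this is NOT QEMU's real multifd packet format. */
enum pkt_kind { PKT_RAM, PKT_SYNC, PKT_DEVICE_STATE };

/* Toy model of a single receive channel (illustrative only). */
struct recv_channel {
    const enum pkt_kind *stream;  /* packets in arrival order */
    size_t len;                   /* number of packets in the stream */
    size_t next;                  /* index of the next unprocessed packet */
    bool parked_on_sync;          /* true while waiting to be released */
    size_t device_pkts_loaded;    /* device state packets processed so far */
};

/*
 * Process packets until the stream ends or the channel parks on a SYNC.
 * Returns the number of packets consumed by this call.
 */
size_t channel_run(struct recv_channel *c)
{
    size_t consumed = 0;

    assert(c != NULL && c->stream != NULL);

    while (!c->parked_on_sync && c->next < c->len) {
        enum pkt_kind kind = c->stream[c->next++];

        consumed++;
        if (kind == PKT_SYNC) {
            /* The channel stops here until someone releases it. */
            c->parked_on_sync = true;
        } else if (kind == PKT_DEVICE_STATE) {
            c->device_pkts_loaded++;
        }
    }
    return consumed;
}

/* Stand-in for the release that, per the description, happens only at the end. */
void channel_release(struct recv_channel *c)
{
    assert(c != NULL);
    c->parked_on_sync = false;
}
```

In this model, a stream of RAM, RAM, SYNC, DEVICE_STATE leaves the device state
packet unprocessed until channel_release() is called. If that release only
happens once the migration is wholly complete, any device state that is still
in flight after the final RAM SYNC cannot make progress, which is the problem
the cover letter points at.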
On Tue, Oct 29, 2024 at 03:58:12PM +0100, Maciej S. Szmigiero wrote:
> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>
> A new version of the multifd device state transfer support with VFIO consumer
> patch set is being prepared, the previous version and the associated
> discussion is available here:
> https://lore.kernel.org/qemu-devel/cover.1724701542.git.maciej.szmigiero@oracle.com/
>
> This new version was originally targeting QEMU 9.2 but such schedule proved
> to be too optimistic due to sheer number of invasive changes/rework required,
> especially with respect to the VFIO internal threads management and their
> synchronization with the migration core.
>
> In addition to these changes, recently merged commit 3b5948f808e3
> ("vfio/migration: Report only stop-copy size in vfio_state_pending_exact()")
> seems to have uncovered a race between multifd RAM and device state transfers:
> RAM transfer sender finishes the multifd stream with a SYNC in
> ram_save_complete() but the multifd receive channels are only released
> from this SYNC after the migration is wholly complete in
> process_incoming_migration_bh().
>
> The above causes problems if the multifd channels need to still be
> running after the RAM transfer is completed, for example because
> there is still remaining device state to be transferred.
>
> Since QEMU 9.2 code freeze is coming I've separated small uncontroversial
> commits from that WiP main patch set here, some of which were already
> reviewed during previous main patch set iterations.
>
> This way at least future code conflicts can be reduced and the amount
> of patches that need to be carried in the future versions of the main
> patch set is reduced.
>
>
> Maciej S. Szmigiero (4):
>   vfio/migration: Add save_{iterate,complete_precopy}_started trace
>     events
>   migration/ram: Add load start trace event
>   migration/multifd: Zero p->flags before starting filling a packet
>   migration: Document the BQL behavior of load SaveVMHandlers

I queued patch 2-3.  Patch 4 is ok to be merged even after softfreeze if
it's a doc only change, but we don't need to rush either..

Thanks,

-- 
Peter Xu
On 29.10.2024 21:40, Peter Xu wrote:
> On Tue, Oct 29, 2024 at 03:58:12PM +0100, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@oracle.com>
>>
>> A new version of the multifd device state transfer support with VFIO consumer
>> patch set is being prepared, the previous version and the associated
>> discussion is available here:
>> https://lore.kernel.org/qemu-devel/cover.1724701542.git.maciej.szmigiero@oracle.com/
>>
>> This new version was originally targeting QEMU 9.2 but such schedule proved
>> to be too optimistic due to sheer number of invasive changes/rework required,
>> especially with respect to the VFIO internal threads management and their
>> synchronization with the migration core.
>>
>> In addition to these changes, recently merged commit 3b5948f808e3
>> ("vfio/migration: Report only stop-copy size in vfio_state_pending_exact()")
>> seems to have uncovered a race between multifd RAM and device state transfers:
>> RAM transfer sender finishes the multifd stream with a SYNC in
>> ram_save_complete() but the multifd receive channels are only released
>> from this SYNC after the migration is wholly complete in
>> process_incoming_migration_bh().
>>
>> The above causes problems if the multifd channels need to still be
>> running after the RAM transfer is completed, for example because
>> there is still remaining device state to be transferred.
>>
>> Since QEMU 9.2 code freeze is coming I've separated small uncontroversial
>> commits from that WiP main patch set here, some of which were already
>> reviewed during previous main patch set iterations.
>>
>> This way at least future code conflicts can be reduced and the amount
>> of patches that need to be carried in the future versions of the main
>> patch set is reduced.
>>
>>
>> Maciej S. Szmigiero (4):
>>   vfio/migration: Add save_{iterate,complete_precopy}_started trace
>>     events
>>   migration/ram: Add load start trace event
>>   migration/multifd: Zero p->flags before starting filling a packet
>>   migration: Document the BQL behavior of load SaveVMHandlers
>
> I queued patch 2-3.  Patch 4 is ok to be merged even after softfreeze if
> it's a doc only change, but we don't need to rush either..
>
> Thanks,
>

Thanks!

Maciej