[PATCH v11 00/21] migration: Add COLO multifd support and COLO migration unit test

Lukas Straub posted 21 patches 1 month, 1 week ago
Failed in applying to current master (apply log)
Maintainers: Pierrick Bouvier <pierrick.bouvier@linaro.org>, Lukas Straub <lukasstraub2@web.de>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
[PATCH v11 00/21] migration: Add COLO multifd support and COLO migration unit test
Posted by Lukas Straub 1 month, 1 week ago
Screw b4, I'm back to git send-email again. Also screw my email provider's black box.

Hello everyone,
This series contains some cleanups for COLO migration and adds multifd
support and migration unit tests for it.

Regards,
Lukas

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
Changes in v11:
- Use colo with return-path capability
- Link to v10: https://lore.kernel.org/qemu-devel/20260220-colo_unit_test_multifd-v10-0-bfe67d422ef1@web.de

Changes in v10:
- multifd: always kick the main thread
- always open the return path socket on source
- Link to v9: https://lore.kernel.org/qemu-devel/20260218-colo_unit_test_multifd-v9-0-d8dbdb0ca6f6@web.de

Changes in v9:
- Rebase onto master
- Fix two rare bugs discovered while stress-testing the colo unit test
- Link to v8: https://lore.kernel.org/qemu-devel/20260210-colo_unit_test_multifd-v8-0-7f9e5f7d082b@web.de

Changes in v8:
- Fix Peter's review comments
- Link to v7: https://lore.kernel.org/qemu-devel/20260210-colo_unit_test_multifd-v7-0-23bd32f36828@web.de

Changes in v7:
- Fix Peter's review comments
- Link to v6: https://lore.kernel.org/qemu-devel/20260206-colo_unit_test_multifd-v6-0-27779dda139d@web.de

Changes in v6:
- Fix the crash when running COLO with TCG accel.
- Link to v5: https://lore.kernel.org/qemu-devel/20260203-colo_unit_test_multifd-v5-0-57508b7389f6@web.de

Changes in v5:
- Remove unused imports from multifd-colo.c
- Mention the checkpoint overhead of reset to the Q35 fix
- Link to v4: https://lore.kernel.org/qemu-devel/20260130-colo_unit_test_multifd-v4-0-7115ab6f0e77@web.de

Changes in v4:
- Add cleanup patches to remove migration_incoming_colo_enabled() and MIG_CMD_ENABLE_COLO
- Add more comments to the colo unit test
- Call colo_release_ram_cache() after multifd threads terminate
- Link to v3: https://lore.kernel.org/qemu-devel/20260125-colo_unit_test_multifd-v3-0-ae926ccd8eae@web.de

Changes in v3:
- Fix Peter's review comments.
- Fix COLO with Q35 machine
- Link to v2: https://lore.kernel.org/qemu-devel/20260117-colo_unit_test_multifd-v2-0-ab521777fa51@web.de

Changes in v2:
- Fix review comments
- Hide stderr in colo migration test since the logged errors are expected
- Add benchmarking data for multifd
- Add myself as maintainer for COLO migration framework
- Link to v1: https://lore.kernel.org/qemu-devel/20251230-colo_unit_test_multifd-v1-0-f9734bc74c71@web.de

---
Lukas Straub (21):
      MAINTAINERS: Add myself as maintainer for COLO migration framework
      MAINTAINERS: Remove Hailiang Zhang from COLO migration framework
      colo: Setup ram cache in normal migration path
      colo: Replace migration_incoming_colo_enabled() with migrate_colo()
      colo: Remove ENABLE_COLO savevm command and mark it as deprecated
      ram: Remove colo special-casing
      multifd: Move ram state receive into multifd_ram_state_recv()
      multifd: Add COLO support
      Call colo_release_ram_cache() after multifd threads terminate
      colo: Fix crash during device vmstate load
      colo: Hold the BQL while sending ram state
      colo: Do not hold the BQL while receiving ram state.
      migration-test: Add COLO migration unit test
      Convert colo main documentation to restructuredText
      qemu-colo.rst: Miscellaneous changes
      qemu-colo.rst: Add my copyright
      qemu-colo.rst: Simplify the block replication setup
      multifd: Fix hang if send thread errors during sync
      colo: Use file lock in primary_vm_do_failover()
      migration: Keep s->rp_state.from_dst_file open until migration ends
      colo: Reuse the return path from migration on primary and secondary side

 MAINTAINERS                        |   6 +-
 docs/COLO-FT.txt                   | 334 ----------------------------------
 docs/system/index.rst              |   1 +
 docs/system/qemu-colo.rst          | 362 +++++++++++++++++++++++++++++++++++++
 include/migration/colo.h           |   3 -
 migration/colo.c                   |  59 +++---
 migration/meson.build              |   2 +-
 migration/migration.c              |  80 ++++----
 migration/multifd-colo.c           |  44 +++++
 migration/multifd-colo.h           |  26 +++
 migration/multifd-nocomp.c         |  10 +-
 migration/multifd.c                |  26 ++-
 migration/multifd.h                |   5 +-
 migration/options.c                |  10 +-
 migration/ram.c                    |  12 +-
 migration/savevm.c                 |  37 +---
 migration/savevm.h                 |   1 -
 migration/trace-events             |   1 -
 tests/qtest/meson.build            |   7 +-
 tests/qtest/migration-test.c       |   1 +
 tests/qtest/migration/colo-tests.c | 199 ++++++++++++++++++++
 tests/qtest/migration/framework.c  |  13 ++
 tests/qtest/migration/framework.h  |   5 +
 23 files changed, 777 insertions(+), 467 deletions(-)
---
base-commit: d8a9d97317d03190b34498741f98f22e2a9afe3e
change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46

Best regards,
-- 
Lukas Straub <lukasstraub2@web.de>
Re: [PATCH v11 00/21] migration: Add COLO multifd support and COLO migration unit test
Posted by Lukas Straub 1 month ago
On Mon,  2 Mar 2026 12:43:30 +0100
Lukas Straub <lukasstraub2@web.de> wrote:

> 
> Hello everyone,
> This has some cleanups for and adds multifd support and migration unit tests
> for COLO migration.
> 
> Regards,
> Lukas

Hello Peter, Hello Fabiano,
Will you apply this?

Best regards,
Lukas Straub


Re: [PATCH v11 00/21] migration: Add COLO multifd support and COLO migration unit test
Posted by Fabiano Rosas 1 month ago
Lukas Straub <lukasstraub2@web.de> writes:

> On Mon,  2 Mar 2026 12:43:30 +0100
> Lukas Straub <lukasstraub2@web.de> wrote:
>
>> 
>> Hello everyone,
>> This has some cleanups for and adds multifd support and migration unit tests
>> for COLO migration.
>> 
>> Regards,
>> Lukas
>
> Hello Peter, Hello Fabiano,
> Will you apply this?
>

Very likely yes. I'm starting the tests on migration-next; if all goes
well, I'll send the pull request later today.

> Best regards,
> Lukas Straub