[PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test

Lukas Straub posted 3 patches 1 month, 1 week ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20251230-colo._5Funit._5Ftest._5Fmultifd-v1-0-f9734bc74c71@web.de
Maintainers: Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
There is a newer version of this series
[PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Lukas Straub 1 month, 1 week ago
Hello everyone,
This adds COLO multifd support and migration unit tests for COLO migration
and failover.

Regards,
Lukas

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
---
Lukas Straub (3):
      multifd: Add colo support
      migration-test: Add -snapshot option for COLO
      migration-test: Add COLO migration unit test

 migration/meson.build              |   2 +-
 migration/multifd-colo.c           |  57 ++++++++++++++++++
 migration/multifd-colo.h           |  26 +++++++++
 migration/multifd.c                |  14 ++++-
 tests/qtest/meson.build            |   7 ++-
 tests/qtest/migration-test.c       |   1 +
 tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
 tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
 tests/qtest/migration/framework.h  |  10 ++++
 9 files changed, 294 insertions(+), 7 deletions(-)
---
base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46

Best regards,
-- 
Lukas Straub <lukasstraub2@web.de>
Re: [PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Peter Xu 1 month ago
On Tue, Dec 30, 2025 at 03:05:43PM +0100, Lukas Straub wrote:
> Hello everyone,
> This adds COLO multifd support and migration unit tests for COLO migration
> and failover.
> 
> Regards,
> Lukas
> 
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> ---
> Lukas Straub (3):
>       multifd: Add colo support
>       migration-test: Add -snapshot option for COLO
>       migration-test: Add COLO migration unit test
> 
>  migration/meson.build              |   2 +-
>  migration/multifd-colo.c           |  57 ++++++++++++++++++
>  migration/multifd-colo.h           |  26 +++++++++
>  migration/multifd.c                |  14 ++++-
>  tests/qtest/meson.build            |   7 ++-
>  tests/qtest/migration-test.c       |   1 +
>  tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
>  tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
>  tests/qtest/migration/framework.h  |  10 ++++
>  9 files changed, 294 insertions(+), 7 deletions(-)
> ---
> base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
> change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
> 
> Best regards,
> -- 
> Lukas Straub <lukasstraub2@web.de>
> 

Lukas,

I ran the tests locally.  I saw a lot of errors, even though qtest did not
report any failure.  I do not know if it's only me.  Let's discuss
deprecation first; if we then decide to keep COLO, please have a look.
Log attached.

===8<===

$ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --full -r /x86_64/migration/colo
TAP version 14
# random seed: R02Sa4f442d17819fa84c9ab14620fa9dd5e
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -machine none -accel qtest
# Start of x86_64 tests
# Start of migration tests
# Start of colo tests
# Start of plain tests
# Running /x86_64/migration/colo/plain/secondary_failover
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming tcp:127.0.0.1:0  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Unable to write to socket: Broken pipe
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Can't send COLO message: Input/output error
ok 1 /x86_64/migration/colo/plain/secondary_failover
# slow test /x86_64/migration/colo/plain/secondary_failover executed in 2.96 secs
# Running /x86_64/migration/colo/plain/primary_failover
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming tcp:127.0.0.1:0  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
ok 2 /x86_64/migration/colo/plain/primary_failover
# slow test /x86_64/migration/colo/plain/primary_failover executed in 2.54 secs
# Running /x86_64/migration/colo/plain/primary_failover_checkpoint
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming tcp:127.0.0.1:0  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Failed to load section ID: stream error: -5: Channel error: Input/output error
ok 3 /x86_64/migration/colo/plain/primary_failover_checkpoint
# slow test /x86_64/migration/colo/plain/primary_failover_checkpoint executed in 2.87 secs
# Running /x86_64/migration/colo/plain/secondary_failover_checkpoint
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming tcp:127.0.0.1:0  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Failed to load section ID: stream error: -5: Unable to read from socket: Connection reset by peer
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Can't send COLO message: Input/output error
qemu-system-x86_64: Can't send COLO message: Input/output error
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
ok 4 /x86_64/migration/colo/plain/secondary_failover_checkpoint
# slow test /x86_64/migration/colo/plain/secondary_failover_checkpoint executed in 3.30 secs
# End of plain tests
# Start of multifd tests
# Running /x86_64/migration/colo/multifd/secondary_failover
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming defer  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Can't send COLO message: Input/output error
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
ok 5 /x86_64/migration/colo/multifd/secondary_failover
# slow test /x86_64/migration/colo/multifd/secondary_failover executed in 2.70 secs
# Running /x86_64/migration/colo/multifd/primary_failover
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming defer  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
ok 6 /x86_64/migration/colo/multifd/primary_failover
# slow test /x86_64/migration/colo/multifd/primary_failover executed in 2.00 secs
# Running /x86_64/migration/colo/multifd/primary_failover_checkpoint
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming defer  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Failed to load section ID: stream error: -5: Channel error: Input/output error
ok 7 /x86_64/migration/colo/multifd/primary_failover_checkpoint
# slow test /x86_64/migration/colo/multifd/primary_failover_checkpoint executed in 2.26 secs
# Running /x86_64/migration/colo/multifd/secondary_failover_checkpoint
# Using machine type: pc-i440fx-10.2
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name source,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/src_serial -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-2106378.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-2106378.qmp,id=char0 -mon chardev=char0,mode=control -display none -audio none -run-with exit-with-parent=on -accel kvm -accel tcg -machine pc-i440fx-10.2, -name target,debug-threads=on -machine memory-backend=mig.mem -object memory-backend-ram,id=mig.mem,size=150M,share=off -serial file:/tmp/migration-test-WWZRI3/dest_serial -incoming defer  -drive if=none,id=d0,file=/tmp/migration-test-WWZRI3/bootsect,format=raw -device ide-hd,drive=d0,secs=1,cyls=1,heads=1 -snapshot   -accel qtest
qemu-system-x86_64: Failed to load section ID: stream error: -5: Unable to read from socket: Connection reset by peer
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Can't send COLO message: Input/output error
qemu-system-x86_64: Can't send COLO message: Input/output error
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
ok 8 /x86_64/migration/colo/multifd/secondary_failover_checkpoint
# slow test /x86_64/migration/colo/multifd/secondary_failover_checkpoint executed in 3.07 secs
# End of multifd tests
# End of colo tests
# End of migration tests
# End of x86_64 tests
1..8
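
One way to make the mismatch in the log above easier to see is to group the
QEMU stderr lines by the test that was running when they appeared.  The
following is only a rough sketch, not part of the series: the function name
group_errors and the SAMPLE excerpt are made up for illustration, and it
assumes the log layout shown above ("# Running <path>", stderr lines starting
with "qemu-system-", then "ok N <path>" or "not ok N <path>").

```python
import re

# Line shapes taken from the log above; anything else is ignored.
RUNNING = re.compile(r"^# Running (\S+)")
RESULT = re.compile(r"^(ok|not ok) \d+ (\S+)")
STDERR = re.compile(r"^qemu-system-\S+: (.*)")


def group_errors(log_text):
    """Map each test path to (tap_status, [stderr messages seen while it ran])."""
    results = {}
    current = None   # test path currently running, if any
    pending = []     # stderr messages collected for that test
    for line in log_text.splitlines():
        m = RUNNING.match(line)
        if m:
            current = m.group(1)
            pending = []
            continue
        m = STDERR.match(line)
        if m and current is not None:
            pending.append(m.group(1))
            continue
        m = RESULT.match(line)
        if m and m.group(2) == current:
            results[current] = (m.group(1), pending)
            current = None
    return results


# Hypothetical excerpt, shortened from the log above.
SAMPLE = """\
# Running /x86_64/migration/colo/plain/primary_failover_checkpoint
# Using machine type: pc-i440fx-10.2
qemu-system-x86_64: Failed to load section ID: stream error: -5: Channel error: Input/output error
ok 3 /x86_64/migration/colo/plain/primary_failover_checkpoint
# Running /x86_64/migration/colo/multifd/secondary_failover
qemu-system-x86_64: Can't receive COLO message: Input/output error
qemu-system-x86_64: Can't send COLO message: Input/output error
qemu-system-x86_64: Unable to shutdown socket: Transport endpoint is not connected
ok 5 /x86_64/migration/colo/multifd/secondary_failover
"""

if __name__ == "__main__":
    for path, (status, messages) in group_errors(SAMPLE).items():
        if status == "ok" and messages:
            print(f"{path}: passed with {len(messages)} stderr line(s)")
            for msg in messages:
                print(f"    {msg}")
```

Fed the full log, this would list every test that TAP reports as "ok" while
QEMU still printed errors.  Whether those messages are expected noise from
the forced failover or a real problem is a separate question.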


-- 
Peter Xu
Re: [PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Peter Xu 1 month, 1 week ago
On Tue, Dec 30, 2025 at 03:05:43PM +0100, Lukas Straub wrote:
> Hello everyone,
> This adds COLO multifd support and migration unit tests for COLO migration
> and failover.

Hi, Lukas,

I'll review the series after the new year.

Could you still introduce some background on how you're deploying COLO?  Do
you use it in production, or for fun?

COLO is still a nice and interesting feature.  That said, COLO has quite a
lot of code plugged into migration core.  I wish it were like a multifd
compressor, which is much more self-contained, but it's not.  I wish we
could simplify the code in QEMU migration.

We've talked it through before with the current COLO maintainers.  It looks
to me like there aren't really many users using it in production; meanwhile,
COLO doesn't look like a feature that benefits individual QEMU users either.

I want to study the current use cases of COLO, and evaluate how much
effort we should put into it in the future.  Note that if it's for fun we
can always use a stable branch, which will be there forever.  We'll need to
think about how QEMU evolves in the future, and what's best for QEMU.

Thanks,

> 
> Regards,
> Lukas
> 
> Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> ---
> Lukas Straub (3):
>       multifd: Add colo support
>       migration-test: Add -snapshot option for COLO
>       migration-test: Add COLO migration unit test
> 
>  migration/meson.build              |   2 +-
>  migration/multifd-colo.c           |  57 ++++++++++++++++++
>  migration/multifd-colo.h           |  26 +++++++++
>  migration/multifd.c                |  14 ++++-
>  tests/qtest/meson.build            |   7 ++-
>  tests/qtest/migration-test.c       |   1 +
>  tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
>  tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
>  tests/qtest/migration/framework.h  |  10 ++++
>  9 files changed, 294 insertions(+), 7 deletions(-)
> ---
> base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
> change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
> 
> Best regards,
> -- 
> Lukas Straub <lukasstraub2@web.de>
> 

-- 
Peter Xu
Re: [PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Zhang Chen 1 month ago
On Tue, Dec 30, 2025 at 11:02 PM Peter Xu <peterx@redhat.com> wrote:
>
> On Tue, Dec 30, 2025 at 03:05:43PM +0100, Lukas Straub wrote:
> > Hello everyone,
> > This adds COLO multifd support and migration unit tests for COLO migration
> > and failover.
>
> Hi, Lukas,
>
> I'll review the series after the new year.
>
> Could you still introduce some background on how you're deploying COLO?  Do
> you use it in production, or for fun?
>
> COLO is still a nice and interesting feature, said that, COLO has quite a
> lot of code plugged into migration core.  I wished it's like a multifd
> compressor which was much more self-contained, but it's not.  I wished we
> can simplify the code in QEMU migration.
>
> We've talked it through before with current COLO maintainers, it looks to
> me there aren't really much users using it in production, meanwhile COLO
> doesn't look like a feature to benefit individual QEMU users either.
>
> I want to study the use case of COLO in status quo, and evaluate how much
> effort we should put on it in the future.  Note that if it's for fun we can
> always use a stable branch which will be there forever.  We'll need to
> think about QEMU evolving in the future, and what's best for QEMU.
>
> Thanks,
>

Hi Lukas and Peter,

Thanks for this series.  I can provide background info if Peter
has any questions.
CCing Hailiang Zhang, although he hasn't replied to emails for a long time.
If no one objects, I think Lukas can replace Hailiang as maintainer of the
COLO Framework.

COLO Framework
M: Hailiang Zhang <zhanghailiang@xfusion.com>
S: Maintained
F: migration/colo*
F: include/migration/colo.h
F: include/migration/failover.h
F: docs/COLO-FT.txt

Thanks
Chen

> >
> > Regards,
> > Lukas
> >
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > ---
> > Lukas Straub (3):
> >       multifd: Add colo support
> >       migration-test: Add -snapshot option for COLO
> >       migration-test: Add COLO migration unit test
> >
> >  migration/meson.build              |   2 +-
> >  migration/multifd-colo.c           |  57 ++++++++++++++++++
> >  migration/multifd-colo.h           |  26 +++++++++
> >  migration/multifd.c                |  14 ++++-
> >  tests/qtest/meson.build            |   7 ++-
> >  tests/qtest/migration-test.c       |   1 +
> >  tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
> >  tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
> >  tests/qtest/migration/framework.h  |  10 ++++
> >  9 files changed, 294 insertions(+), 7 deletions(-)
> > ---
> > base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
> > change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
> >
> > Best regards,
> > --
> > Lukas Straub <lukasstraub2@web.de>
> >
>
> --
> Peter Xu
>
Re: [PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Peter Xu 1 month ago
On Sun, Jan 04, 2026 at 01:44:52PM +0800, Zhang Chen wrote:
> On Tue, Dec 30, 2025 at 11:02 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Tue, Dec 30, 2025 at 03:05:43PM +0100, Lukas Straub wrote:
> > > Hello everyone,
> > > This adds COLO multifd support and migration unit tests for COLO migration
> > > and failover.
> >
> > Hi, Lukas,
> >
> > I'll review the series after the new year.
> >
> > Could you still introduce some background on how you're deploying COLO?  Do
> > you use it in production, or for fun?
> >
> > COLO is still a nice and interesting feature, said that, COLO has quite a
> > lot of code plugged into migration core.  I wished it's like a multifd
> > compressor which was much more self-contained, but it's not.  I wished we
> > can simplify the code in QEMU migration.
> >
> > We've talked it through before with current COLO maintainers, it looks to
> > me there aren't really much users using it in production, meanwhile COLO
> > doesn't look like a feature to benefit individual QEMU users either.
> >
> > I want to study the use case of COLO in status quo, and evaluate how much
> > effort we should put on it in the future.  Note that if it's for fun we can
> > always use a stable branch which will be there forever.  We'll need to
> > think about QEMU evolving in the future, and what's best for QEMU.
> >
> > Thanks,
> >
> 
> Hi Lukas and Peter,

Hi, Chen,

> 
> Thanks for this series, I will support for background info if Peter
> have any questions.

Thanks.  I believe my major question so far is whether we should deprecate
COLO in the migration framework. :)

The netfilters and the rest can be discussed separately.

Now looking back at my initial ask in Zhijian's fix, I still agree with
Zhijian on these two points mentioned:

https://lore.kernel.org/all/b2eadde7-57e9-426c-8487-e500ba06410e@fujitsu.com/

That is:

        - Active users who depend on it.
        - A unit test for the COLO framework.

Meanwhile, I can't see how COLO would win when compared with some
app-level HA infrastructure, considering the overhead of running two VMs
and comparing every packet.

Lukas, thanks for trying to fix the 2nd.  I apologize that I still
requested you to send these patches without making it clearer that I still
want to discuss deprecation.  I don't think anyone has yet proved that we
should keep COLO.  I do plan to send a patch adding the COLO framework to
the deprecation list, unless somebody stops me within a week by justifying
point 1 above.

Over the past few releases we have more or less shown that almost nobody is
actively using COLO anymore.  If nobody is using COLO, we should simply
drop it.

> And CC Hailiang Zhang, although he hasn't replied to emails for a long time.
> If no one objects, I think Lukas can replease Hailiang for COLO Framework.
> 
> COLO Framework
> M: Hailiang Zhang <zhanghailiang@xfusion.com>
> S: Maintained
> F: migration/colo*
> F: include/migration/colo.h
> F: include/migration/failover.h
> F: docs/COLO-FT.txt

Right, this is another reason why I think we may want to deprecate the
COLO framework.

Since I requested this series (sorry again, Lukas), I'll review it today,
regardless of whether we eventually merge this series or deprecate the COLO
framework.

Thanks,

> 
> Thanks
> Chen
> 
> > >
> > > Regards,
> > > Lukas
> > >
> > > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > > ---
> > > Lukas Straub (3):
> > >       multifd: Add colo support
> > >       migration-test: Add -snapshot option for COLO
> > >       migration-test: Add COLO migration unit test
> > >
> > >  migration/meson.build              |   2 +-
> > >  migration/multifd-colo.c           |  57 ++++++++++++++++++
> > >  migration/multifd-colo.h           |  26 +++++++++
> > >  migration/multifd.c                |  14 ++++-
> > >  tests/qtest/meson.build            |   7 ++-
> > >  tests/qtest/migration-test.c       |   1 +
> > >  tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
> > >  tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
> > >  tests/qtest/migration/framework.h  |  10 ++++
> > >  9 files changed, 294 insertions(+), 7 deletions(-)
> > > ---
> > > base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
> > > change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
> > >
> > > Best regards,
> > > --
> > > Lukas Straub <lukasstraub2@web.de>
> > >
> >
> > --
> > Peter Xu
> >
> 

-- 
Peter Xu


Re: [PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Lukas Straub 3 weeks, 3 days ago
On Tue, 6 Jan 2026 14:48:14 -0500
Peter Xu <peterx@redhat.com> wrote:

> On Sun, Jan 04, 2026 at 01:44:52PM +0800, Zhang Chen wrote:
> > On Tue, Dec 30, 2025 at 11:02 PM Peter Xu <peterx@redhat.com> wrote:  
> > >
> > > On Tue, Dec 30, 2025 at 03:05:43PM +0100, Lukas Straub wrote:  
> > > > Hello everyone,
> > > > This adds COLO multifd support and migration unit tests for COLO migration
> > > > and failover.  
> > >
> > > Hi, Lukas,
> > >
> > > I'll review the series after the new year.
> > >
> > > Could you still introduce some background on how you're deploying COLO?  Do
> > > you use it in production, or for fun?
> > >
> > > COLO is still a nice and interesting feature, said that, COLO has quite a
> > > lot of code plugged into migration core.  I wished it's like a multifd
> > > compressor which was much more self-contained, but it's not.  I wished we
> > > can simplify the code in QEMU migration.
> > >
> > > We've talked it through before with current COLO maintainers, it looks to
> > > me there aren't really much users using it in production, meanwhile COLO
> > > doesn't look like a feature to benefit individual QEMU users either.
> > >
> > > I want to study the use case of COLO in status quo, and evaluate how much
> > > effort we should put on it in the future.  Note that if it's for fun we can
> > > always use a stable branch which will be there forever.  We'll need to
> > > think about QEMU evolving in the future, and what's best for QEMU.
> > >
> > > Thanks,
> > >  
> > 
> > Hi Lukas and Peter,  
> 
> Hi, Chen,
> 
> > 
> > Thanks for this series, I will support for background info if Peter
> > have any questions.  
> 
> Thanks, I believe my major question so far was, whether we should deprecate
> COLO in migration framework. :)
> 
> The netfilters and rest can be discussed separately.
> 
> Now looking back at my initial ask in Zhijian's fix, I still agree with
> Zhijian on these two points mentioned:
> 
> https://lore.kernel.org/all/b2eadde7-57e9-426c-8487-e500ba06410e@fujitsu.com/
> 
> That is:
> 
>         - Active users who depend on it.
>         - A unit test for the COLO framework.
> 
> Meanwhile, I can't see how COLO would win if to be compared with some
> app-level HA infrastructure.. considering the overhead it requires on
> running two VMs and compare every packet.
> 
> Lukas, thanks for trying to fix the 2nd.  I apologize that I still
> requested you to send these patches, without further raising the attention
> that I still want to discuss deprecation.  I don't think anyone yet proved
> we should keep COLO.  I do plan to send one patch adding COLO framework to
> deprecation, if nobody would stop me in a week justifying question 1 above.

Hello Peter,

I am a consultant on open-source high availability and fault tolerance
solutions. I provide a complete cluster management solution with
automatic failover and failback for QEMU COLO.

QEMU COLO's lockstepping architecture has a big performance advantage
and it outperforms the market leader by 10x-100x in latency.
No one else provides this unique architecture.

I have customers that depend on this.

I occasionally get inquiries about QEMU COLO even without doing
any kind of marketing. So there is general interest in this.

Also, Canonical is considering providing this to one of their customers.

Regards,
Lukas Straub


> 
> We kind of proved almost nobody is actively using COLO anymore in the past
> few releases.  If nobody is using COLO, we should simply drop it.
> 
> > And CC Hailiang Zhang, although he hasn't replied to emails for a long time.
> > If no one objects, I think Lukas can replease Hailiang for COLO Framework.
> > 
> > COLO Framework
> > M: Hailiang Zhang <zhanghailiang@xfusion.com>
> > S: Maintained
> > F: migration/colo*
> > F: include/migration/colo.h
> > F: include/migration/failover.h
> > F: docs/COLO-FT.txt  
> 
> Right, this is also another reason why I think we may want to deprecate
> COLO framework.

I will take over maintainership.

> 
> Since I requested this series (sorry again, Lukas), I'll review it today no
> matter if we decide to merge this series at last, or deprecate COLO
> framework.
> 
> Thanks,
> 
> > 
> > Thanks
> > Chen
> >   
> > > >
> > > > Regards,
> > > > Lukas
> > > >
> > > > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > > > ---
> > > > Lukas Straub (3):
> > > >       multifd: Add colo support
> > > >       migration-test: Add -snapshot option for COLO
> > > >       migration-test: Add COLO migration unit test
> > > >
> > > >  migration/meson.build              |   2 +-
> > > >  migration/multifd-colo.c           |  57 ++++++++++++++++++
> > > >  migration/multifd-colo.h           |  26 +++++++++
> > > >  migration/multifd.c                |  14 ++++-
> > > >  tests/qtest/meson.build            |   7 ++-
> > > >  tests/qtest/migration-test.c       |   1 +
> > > >  tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
> > > >  tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
> > > >  tests/qtest/migration/framework.h  |  10 ++++
> > > >  9 files changed, 294 insertions(+), 7 deletions(-)
> > > > ---
> > > > base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
> > > > change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
> > > >
> > > > Best regards,
> > > > --
> > > > Lukas Straub <lukasstraub2@web.de>
> > > >  
> > >
> > > --
> > > Peter Xu
> > >  
> >   
> 

Re: [PATCH 0/3] migration: Add COLO multifd support and COLO migration unit test
Posted by Zhang Chen 1 month ago
On Sun, Jan 4, 2026 at 1:44 PM Zhang Chen <zhangckid@gmail.com> wrote:
>
> On Tue, Dec 30, 2025 at 11:02 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Tue, Dec 30, 2025 at 03:05:43PM +0100, Lukas Straub wrote:
> > > Hello everyone,
> > > This adds COLO multifd support and migration unit tests for COLO migration
> > > and failover.
> >
> > Hi, Lukas,
> >
> > I'll review the series after the new year.
> >
> > Could you still introduce some background on how you're deploying COLO?  Do
> > you use it in production, or for fun?
> >
> > COLO is still a nice and interesting feature, said that, COLO has quite a
> > lot of code plugged into migration core.  I wished it's like a multifd
> > compressor which was much more self-contained, but it's not.  I wished we
> > can simplify the code in QEMU migration.
> >
> > We've talked it through before with current COLO maintainers, it looks to
> > me there aren't really much users using it in production, meanwhile COLO
> > doesn't look like a feature to benefit individual QEMU users either.
> >
> > I want to study the use case of COLO in status quo, and evaluate how much
> > effort we should put on it in the future.  Note that if it's for fun we can
> > always use a stable branch which will be there forever.  We'll need to
> > think about QEMU evolving in the future, and what's best for QEMU.
> >
> > Thanks,
> >
>
> Hi Lukas and Peter,
>
> Thanks for this series, I will support for background info if Peter
> have any questions.
> And CC Hailiang Zhang, although he hasn't replied to emails for a long time.
> If no one objects, I think Lukas can replease Hailiang for COLO Framework.
>

S/replease/replace

> COLO Framework
> M: Hailiang Zhang <zhanghailiang@xfusion.com>
> S: Maintained
> F: migration/colo*
> F: include/migration/colo.h
> F: include/migration/failover.h
> F: docs/COLO-FT.txt
>
> Thanks
> Chen
>
> > >
> > > Regards,
> > > Lukas
> > >
> > > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > > ---
> > > Lukas Straub (3):
> > >       multifd: Add colo support
> > >       migration-test: Add -snapshot option for COLO
> > >       migration-test: Add COLO migration unit test
> > >
> > >  migration/meson.build              |   2 +-
> > >  migration/multifd-colo.c           |  57 ++++++++++++++++++
> > >  migration/multifd-colo.h           |  26 +++++++++
> > >  migration/multifd.c                |  14 ++++-
> > >  tests/qtest/meson.build            |   7 ++-
> > >  tests/qtest/migration-test.c       |   1 +
> > >  tests/qtest/migration/colo-tests.c | 115 +++++++++++++++++++++++++++++++++++++
> > >  tests/qtest/migration/framework.c  |  69 +++++++++++++++++++++-
> > >  tests/qtest/migration/framework.h  |  10 ++++
> > >  9 files changed, 294 insertions(+), 7 deletions(-)
> > > ---
> > > base-commit: 942b0d378a1de9649085ad6db5306d5b8cef3591
> > > change-id: 20251230-colo_unit_test_multifd-8bf58dcebd46
> > >
> > > Best regards,
> > > --
> > > Lukas Straub <lukasstraub2@web.de>
> > >
> >
> > --
> > Peter Xu
> >