tests/qtest/migration-test.c | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-)
Since commit: commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f Author: Dr. David Alan Gilbert <dgilbert@redhat.com> Date: Mon Mar 6 15:26:12 2023 +0000 tests/migration: Tweek auto converge limits check Thomas found an autoconverge test failure where the migration completed before the autoconverge had kicked in. [...] migration-test has become very slow. On my laptop, before that commit migration-test takes 2min10seconds After that commit, it takes around 11minutes We can't revert it because it fixes a real problem when the host machine is overloaded. See the comment on test_migrate_auto_converge(). So I did two things here: - Once that we have setup up precopy, we can move to full speed, no need to continue at 3MB/s. That slowed everything a lot. - Only run auto_converge tests when we are asking for slow tests. Only that test on my hardware requires more than 1min to run. We need to run it at 3MB/s, but we are asking it to do 15 iterations throgh 150MB of RAM. We can have a test that is (reasonably) fast, or one that also works when machine is very loaded. To test that things still works over load, I used my desktop (ancient core i9900), and run migration-test in a loop in 20 terminals (load was 40) and didn't see a single failure in more than 1 hour run. Please, review. PD. Yes, I am still looking at the dreaded multifd_cancel test, but even on this setup, I am unable to get a failure. I have never seen a failure on it when I am running it, but I am only running x86 kvm linux. Moving to arm or tcg and see how well that goes. Juan Quintela (2): tests/migration: Make precopy fast tests/migration: Only run auto_converge in slow mode tests/qtest/migration-test.c | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) -- 2.39.2
On 12/04/2023 16.19, Juan Quintela wrote: > Since commit: > > commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f > Author: Dr. David Alan Gilbert <dgilbert@redhat.com> > Date: Mon Mar 6 15:26:12 2023 +0000 > > tests/migration: Tweek auto converge limits check > > Thomas found an autoconverge test failure where the > migration completed before the autoconverge had kicked in. > [...] > > migration-test has become very slow. > On my laptop, before that commit migration-test takes 2min10seconds > After that commit, it takes around 11minutes > > We can't revert it because it fixes a real problem when the host > machine is overloaded. See the comment on test_migrate_auto_converge(). Thanks, your patches decrease the time to run the migration-test from 16 minutes down to 5 minutes on my system, that's a great improvement, indeed! Tested-by: Thomas Huth <thuth@redhat.com> (though 5 minutes are still quite a lot for qtests ... maybe some other parts could be moved to only run with g_test_slow() ?)
Thomas Huth <thuth@redhat.com> wrote: > On 12/04/2023 16.19, Juan Quintela wrote: >> Since commit: >> commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f >> Author: Dr. David Alan Gilbert <dgilbert@redhat.com> >> Date: Mon Mar 6 15:26:12 2023 +0000 >> tests/migration: Tweek auto converge limits check >> Thomas found an autoconverge test failure where the >> migration completed before the autoconverge had kicked in. >> [...] >> migration-test has become very slow. >> On my laptop, before that commit migration-test takes 2min10seconds >> After that commit, it takes around 11minutes >> We can't revert it because it fixes a real problem when the host >> machine is overloaded. See the comment on test_migrate_auto_converge(). > > Thanks, your patches decrease the time to run the migration-test from > 16 minutes down to 5 minutes on my system, that's a great improvement, > indeed! > > Tested-by: Thomas Huth <thuth@redhat.com> > > (though 5 minutes are still quite a lot for qtests ... maybe some > other parts could be moved to only run with g_test_slow() ?) And once that we are on this topic. Is there a way to launch several tests on the same binary on parallel? i.e. every migration thread uses a maximum of 2 cores, so in a server I can run several at the same time (I know that migration-test.c tests need to be modified so they don't interfere, but I have that changes on my tree), but I don't know of a way to launch them. Thanks, Juan. PD, and I don't know why launching a qemu is so slow, the minimal time that I am able to get for launching the two qemus is around 0.5 seconds. Later, Juan.
Thomas Huth <thuth@redhat.com> wrote: > On 12/04/2023 16.19, Juan Quintela wrote: >> Since commit: >> commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f >> Author: Dr. David Alan Gilbert <dgilbert@redhat.com> >> Date: Mon Mar 6 15:26:12 2023 +0000 >> tests/migration: Tweek auto converge limits check >> Thomas found an autoconverge test failure where the >> migration completed before the autoconverge had kicked in. >> [...] >> migration-test has become very slow. >> On my laptop, before that commit migration-test takes 2min10seconds >> After that commit, it takes around 11minutes >> We can't revert it because it fixes a real problem when the host >> machine is overloaded. See the comment on test_migrate_auto_converge(). > > Thanks, your patches decrease the time to run the migration-test from > 16 minutes down to 5 minutes on my system, that's a great improvement, > indeed! > > Tested-by: Thomas Huth <thuth@redhat.com> Thanks > (though 5 minutes are still quite a lot for qtests ... maybe some > other parts could be moved to only run with g_test_slow() ?) Hi Could you gime the output of: time for i in $(./tests/qtest/migration-test -l | grep "^/"); do echo $i; time ./tests/qtest/migration-test -p $i; done To see what tests are taking so long on your system? On my system (i9900K processor, i.e. not the latest) and auto_converge moved to slow the total of the tests take a bit more than 1 minute. qemu-system-x86_64 on x86_64 host: real 0m54.295s user 0m47.283s sys 0m16.969s qemu-system-aarch64 on x86_64 host: real 0m42.466s user 0m42.247s sys 0m13.747s s390x and ppc64 refuse to run non-natively. Later, Juan.
On 18/04/2023 13.42, Juan Quintela wrote: > Thomas Huth <thuth@redhat.com> wrote: >> On 12/04/2023 16.19, Juan Quintela wrote: >>> Since commit: >>> commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f >>> Author: Dr. David Alan Gilbert <dgilbert@redhat.com> >>> Date: Mon Mar 6 15:26:12 2023 +0000 >>> tests/migration: Tweek auto converge limits check >>> Thomas found an autoconverge test failure where the >>> migration completed before the autoconverge had kicked in. >>> [...] >>> migration-test has become very slow. >>> On my laptop, before that commit migration-test takes 2min10seconds >>> After that commit, it takes around 11minutes >>> We can't revert it because it fixes a real problem when the host >>> machine is overloaded. See the comment on test_migrate_auto_converge(). >> >> Thanks, your patches decrease the time to run the migration-test from >> 16 minutes down to 5 minutes on my system, that's a great improvement, >> indeed! >> >> Tested-by: Thomas Huth <thuth@redhat.com> > > Thanks > >> (though 5 minutes are still quite a lot for qtests ... maybe some >> other parts could be moved to only run with g_test_slow() ?) > > Hi > > Could you gime the output of: > > time for i in $(./tests/qtest/migration-test -l | grep "^/"); do echo $i; time ./tests/qtest/migration-test -p $i; done > > To see what tests are taking so long on your system? > > On my system (i9900K processor, i.e. not the latest) and auto_converge > moved to slow the total of the tests take a bit more than 1 minute. This is with both of your patches applied: /x86_64/migration/bad_dest /x86_64/migration/bad_dest: OK real 0m0,342s user 0m0,123s sys 0m0,088s /x86_64/migration/fd_proto /x86_64/migration/fd_proto: OK real 0m1,135s user 0m0,730s sys 0m0,514s /x86_64/migration/validate_uuid /x86_64/migration/validate_uuid: OK real 0m0,519s user 0m0,379s sys 0m0,230s /x86_64/migration/validate_uuid_error /x86_64/migration/validate_uuid_error: OK real 0m0,275s user 0m0,145s sys 0m0,114s /x86_64/migration/validate_uuid_src_not_set /x86_64/migration/validate_uuid_src_not_set: OK real 0m0,514s user 0m0,377s sys 0m0,225s /x86_64/migration/validate_uuid_dst_not_set /x86_64/migration/validate_uuid_dst_not_set: OK real 0m0,519s user 0m0,392s sys 0m0,220s /x86_64/migration/dirty_ring /x86_64/migration/dirty_ring: OK real 0m1,079s user 0m0,613s sys 0m0,532s /x86_64/migration/vcpu_dirty_limit /x86_64/migration/vcpu_dirty_limit: OK real 0m6,308s user 0m4,025s sys 0m1,224s /x86_64/migration/postcopy/plain /x86_64/migration/postcopy/plain: OK real 0m35,446s user 0m47,208s sys 0m11,828s /x86_64/migration/postcopy/recovery/plain /x86_64/migration/postcopy/recovery/plain: OK real 0m34,707s user 0m46,357s sys 0m11,366s /x86_64/migration/postcopy/recovery/tls/psk /x86_64/migration/postcopy/recovery/tls/psk: OK real 0m33,052s user 0m46,539s sys 0m11,537s /x86_64/migration/postcopy/preempt/plain /x86_64/migration/postcopy/preempt/plain: OK real 0m35,107s user 0m46,556s sys 0m11,755s /x86_64/migration/postcopy/preempt/recovery/plain /x86_64/migration/postcopy/preempt/recovery/plain: OK real 0m35,329s user 0m46,951s sys 0m11,529s /x86_64/migration/postcopy/preempt/recovery/tls/psk /x86_64/migration/postcopy/preempt/recovery/tls/psk: OK real 0m36,237s user 0m51,450s sys 0m12,419s /x86_64/migration/postcopy/preempt/tls/psk /x86_64/migration/postcopy/preempt/tls/psk: OK real 0m35,033s user 0m49,244s sys 0m12,123s /x86_64/migration/postcopy/tls/psk /x86_64/migration/postcopy/tls/psk: OK real 0m36,097s user 0m50,873s sys 0m12,569s /x86_64/migration/precopy/unix/plain /x86_64/migration/precopy/unix/plain: OK real 0m1,034s user 0m0,654s sys 0m0,463s /x86_64/migration/precopy/unix/xbzrle /x86_64/migration/precopy/unix/xbzrle: OK real 0m1,119s user 0m0,740s sys 0m0,499s /x86_64/migration/precopy/unix/tls/psk /x86_64/migration/precopy/unix/tls/psk: OK real 0m3,555s user 0m5,448s sys 0m0,655s /x86_64/migration/precopy/unix/tls/x509/default-host /x86_64/migration/precopy/unix/tls/x509/default-host: OK real 0m1,022s user 0m1,664s sys 0m0,112s /x86_64/migration/precopy/unix/tls/x509/override-host /x86_64/migration/precopy/unix/tls/x509/override-host: OK real 0m1,841s user 0m1,921s sys 0m0,739s /x86_64/migration/precopy/tcp/plain /x86_64/migration/precopy/tcp/plain: OK real 0m1,241s user 0m0,859s sys 0m0,584s /x86_64/migration/precopy/tcp/tls/psk/match /x86_64/migration/precopy/tcp/tls/psk/match: OK real 0m2,114s user 0m2,628s sys 0m0,613s /x86_64/migration/precopy/tcp/tls/psk/mismatch /x86_64/migration/precopy/tcp/tls/psk/mismatch: OK real 0m0,575s user 0m0,554s sys 0m0,116s /x86_64/migration/precopy/tcp/tls/x509/default-host /x86_64/migration/precopy/tcp/tls/x509/default-host: OK real 0m1,538s user 0m1,460s sys 0m0,608s /x86_64/migration/precopy/tcp/tls/x509/override-host /x86_64/migration/precopy/tcp/tls/x509/override-host: OK real 0m1,825s user 0m1,915s sys 0m0,703s /x86_64/migration/precopy/tcp/tls/x509/mismatch-host /x86_64/migration/precopy/tcp/tls/x509/mismatch-host: OK real 0m0,961s user 0m1,430s sys 0m0,111s /x86_64/migration/precopy/tcp/tls/x509/friendly-client /x86_64/migration/precopy/tcp/tls/x509/friendly-client: OK real 0m1,806s user 0m1,897s sys 0m0,679s /x86_64/migration/precopy/tcp/tls/x509/hostile-client /x86_64/migration/precopy/tcp/tls/x509/hostile-client: OK real 0m0,645s user 0m0,614s sys 0m0,136s /x86_64/migration/precopy/tcp/tls/x509/allow-anon-client /x86_64/migration/precopy/tcp/tls/x509/allow-anon-client: OK real 0m2,204s user 0m2,695s sys 0m0,667s /x86_64/migration/precopy/tcp/tls/x509/reject-anon-client /x86_64/migration/precopy/tcp/tls/x509/reject-anon-client: OK real 0m1,530s user 0m2,360s sys 0m0,156s /x86_64/migration/multifd/tcp/plain/none /x86_64/migration/multifd/tcp/plain/none: OK real 0m1,055s user 0m0,647s sys 0m0,592s /x86_64/migration/multifd/tcp/plain/zlib /x86_64/migration/multifd/tcp/plain/zlib: OK real 0m1,144s user 0m1,763s sys 0m0,437s /x86_64/migration/multifd/tcp/plain/zstd /x86_64/migration/multifd/tcp/plain/zstd: OK real 0m1,073s user 0m0,999s sys 0m0,537s /x86_64/migration/multifd/tcp/tls/psk/match /x86_64/migration/multifd/tcp/tls/psk/match: OK real 0m1,453s user 0m2,475s sys 0m0,704s /x86_64/migration/multifd/tcp/tls/psk/mismatch /x86_64/migration/multifd/tcp/tls/psk/mismatch: OK real 0m0,905s user 0m1,256s sys 0m0,106s /x86_64/migration/multifd/tcp/tls/x509/default-host /x86_64/migration/multifd/tcp/tls/x509/default-host: OK real 0m3,761s user 0m5,874s sys 0m0,985s /x86_64/migration/multifd/tcp/tls/x509/override-host /x86_64/migration/multifd/tcp/tls/x509/override-host: OK real 0m3,238s user 0m4,794s sys 0m0,998s /x86_64/migration/multifd/tcp/tls/x509/mismatch-host /x86_64/migration/multifd/tcp/tls/x509/mismatch-host: OK real 0m0,851s user 0m1,007s sys 0m0,120s /x86_64/migration/multifd/tcp/tls/x509/allow-anon-client /x86_64/migration/multifd/tcp/tls/x509/allow-anon-client: OK real 0m2,607s user 0m3,530s sys 0m1,013s /x86_64/migration/multifd/tcp/tls/x509/reject-anon-client /x86_64/migration/multifd/tcp/tls/x509/reject-anon-client: OK real 0m1,915s user 0m3,223s sys 0m0,180s real 5m32,733s user 7m24,380s sys 1m50,801s Thomas
Thomas Huth <thuth@redhat.com> wrote: > On 18/04/2023 13.42, Juan Quintela wrote: >> Thomas Huth <thuth@redhat.com> wrote: >>> On 12/04/2023 16.19, Juan Quintela wrote: >>>> Since commit: >>>> commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f >>>> Author: Dr. David Alan Gilbert <dgilbert@redhat.com> >>>> Date: Mon Mar 6 15:26:12 2023 +0000 >>>> tests/migration: Tweek auto converge limits check >>>> Thomas found an autoconverge test failure where the >>>> migration completed before the autoconverge had kicked in. >>>> [...] >>>> migration-test has become very slow. >>>> On my laptop, before that commit migration-test takes 2min10seconds >>>> After that commit, it takes around 11minutes >>>> We can't revert it because it fixes a real problem when the host >>>> machine is overloaded. See the comment on test_migrate_auto_converge(). >>> >>> Thanks, your patches decrease the time to run the migration-test from >>> 16 minutes down to 5 minutes on my system, that's a great improvement, >>> indeed! >>> >>> Tested-by: Thomas Huth <thuth@redhat.com> >> Thanks >> >>> (though 5 minutes are still quite a lot for qtests ... maybe some >>> other parts could be moved to only run with g_test_slow() ?) >> Hi >> Could you gime the output of: >> time for i in $(./tests/qtest/migration-test -l | grep "^/"); do >> echo $i; time ./tests/qtest/migration-test -p $i; done >> To see what tests are taking so long on your system? >> On my system (i9900K processor, i.e. not the latest) and >> auto_converge >> moved to slow the total of the tests take a bit more than 1 minute. > > This is with both of your patches applied: > /x86_64/migration/postcopy/plain > /x86_64/migration/postcopy/plain: OK > > real 0m35,446s > user 0m47,208s > sys 0m11,828s This is quite slower than on mine, basically almost all the code that does migration. $ time ./tests/qtest/migration-test -p /x86_64/migration/postcopy/plain # random seed: R02S42809b71f513e8524bd24df5facd5768 # Start of x86_64 tests # Start of migration tests # Start of postcopy tests # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-246853.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-246853.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-1MGL31/src_serial -drive file=/tmp/migration-test-1MGL31/bootsect,format=raw -accel qtest # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-246853.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-246853.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-1MGL31/dest_serial -incoming unix:/tmp/migration-test-1MGL31/migsocket -drive file=/tmp/migration-test-1MGL31/bootsect,format=raw -accel qtest ok 1 /x86_64/migration/postcopy/plain # End of postcopy tests # End of migration tests # End of x86_64 tests 1..1 real 0m1.104s user 0m0.697s sys 0m0.414s > /x86_64/migration/postcopy/recovery/plain > /x86_64/migration/postcopy/recovery/plain: OK > > real 0m34,707s > user 0m46,357s > sys 0m11,366s > /x86_64/migration/postcopy/recovery/tls/psk > /x86_64/migration/postcopy/recovery/tls/psk: OK > > real 0m33,052s > user 0m46,539s > sys 0m11,537s > /x86_64/migration/postcopy/preempt/plain > /x86_64/migration/postcopy/preempt/plain: OK > > real 0m35,107s > user 0m46,556s > sys 0m11,755s > /x86_64/migration/postcopy/preempt/recovery/plain > /x86_64/migration/postcopy/preempt/recovery/plain: OK > > real 0m35,329s > user 0m46,951s > sys 0m11,529s > /x86_64/migration/postcopy/preempt/recovery/tls/psk > /x86_64/migration/postcopy/preempt/recovery/tls/psk: OK > > real 0m36,237s > user 0m51,450s > sys 0m12,419s > /x86_64/migration/postcopy/preempt/tls/psk > /x86_64/migration/postcopy/preempt/tls/psk: OK > > real 0m35,033s > user 0m49,244s > sys 0m12,123s > /x86_64/migration/postcopy/tls/psk > /x86_64/migration/postcopy/tls/psk: OK > > real 0m36,097s > user 0m50,873s > sys 0m12,569s > real 5m32,733s > user 7m24,380s > sys 1m50,801s Ouch. Can I ask: - what is your machine? It is specially slow? Otherwise I want to know why it is happening. - as what is going slow to you is postcopy, can you told me what is this setting? # we want postcopy to work for normal users vm.unprivileged_userfaultfd = 1 And if it is not set, just change it and retest. Thanks, Juan.
On Tue, Apr 18, 2023 at 03:19:33PM +0200, Juan Quintela wrote: > Thomas Huth <thuth@redhat.com> wrote: > > On 18/04/2023 13.42, Juan Quintela wrote: > >> Thomas Huth <thuth@redhat.com> wrote: > >>> On 12/04/2023 16.19, Juan Quintela wrote: > >>>> Since commit: > >>>> commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f > >>>> Author: Dr. David Alan Gilbert <dgilbert@redhat.com> > >>>> Date: Mon Mar 6 15:26:12 2023 +0000 > >>>> tests/migration: Tweek auto converge limits check > >>>> Thomas found an autoconverge test failure where the > >>>> migration completed before the autoconverge had kicked in. > >>>> [...] > >>>> migration-test has become very slow. > >>>> On my laptop, before that commit migration-test takes 2min10seconds > >>>> After that commit, it takes around 11minutes > >>>> We can't revert it because it fixes a real problem when the host > >>>> machine is overloaded. See the comment on test_migrate_auto_converge(). > >>> > >>> Thanks, your patches decrease the time to run the migration-test from > >>> 16 minutes down to 5 minutes on my system, that's a great improvement, > >>> indeed! > >>> > >>> Tested-by: Thomas Huth <thuth@redhat.com> > >> Thanks > >> > >>> (though 5 minutes are still quite a lot for qtests ... maybe some > >>> other parts could be moved to only run with g_test_slow() ?) > >> Hi > >> Could you gime the output of: > >> time for i in $(./tests/qtest/migration-test -l | grep "^/"); do > >> echo $i; time ./tests/qtest/migration-test -p $i; done > >> To see what tests are taking so long on your system? > >> On my system (i9900K processor, i.e. not the latest) and > >> auto_converge > >> moved to slow the total of the tests take a bit more than 1 minute. > > > > This is with both of your patches applied: > > > > /x86_64/migration/postcopy/plain > > /x86_64/migration/postcopy/plain: OK > > > > real 0m35,446s > > user 0m47,208s > > sys 0m11,828s > > This is quite slower than on mine, basically almost all the code that > does migration. This is expected AFAIK. The migrate_postcopy_prepare method waits for 1 complete pre-copy pass to run at 3mbps, before switching to pre-copy mode. > > $ time ./tests/qtest/migration-test -p /x86_64/migration/postcopy/plain > # random seed: R02S42809b71f513e8524bd24df5facd5768 > # Start of x86_64 tests > # Start of migration tests > # Start of postcopy tests > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-246853.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-246853.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name source,debug-threads=on -m 150M -serial file:/tmp/migration-test-1MGL31/src_serial -drive file=/tmp/migration-test-1MGL31/bootsect,format=raw -accel qtest > # starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-246853.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-246853.qmp,id=char0 -mon chardev=char0,mode=control -display none -accel kvm -accel tcg -name target,debug-threads=on -m 150M -serial file:/tmp/migration-test-1MGL31/dest_serial -incoming unix:/tmp/migration-test-1MGL31/migsocket -drive file=/tmp/migration-test-1MGL31/bootsect,format=raw -accel qtest > ok 1 /x86_64/migration/postcopy/plain > # End of postcopy tests > # End of migration tests > # End of x86_64 tests > 1..1 > > real 0m1.104s > user 0m0.697s > sys 0m0.414s That is surprisingly fast - it is like it is not doing the pre-copy pass at all. > > real 5m32,733s > > user 7m24,380s > > sys 1m50,801s > > Ouch. > > Can I ask: > - what is your machine? It is specially slow? > Otherwise I want to know why it is happening. This matches what I see in my laptop - any test which runs a full pre-copy pass gets 30 seconds time added for this phase With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 18/04/2023 15.19, Juan Quintela wrote: > Thomas Huth <thuth@redhat.com> wrote: >> On 18/04/2023 13.42, Juan Quintela wrote: >>> Thomas Huth <thuth@redhat.com> wrote: >>>> On 12/04/2023 16.19, Juan Quintela wrote: >>>>> Since commit: >>>>> commit 1bfc8dde505f1e6a92697c52aa9b09e81b54c78f >>>>> Author: Dr. David Alan Gilbert <dgilbert@redhat.com> >>>>> Date: Mon Mar 6 15:26:12 2023 +0000 >>>>> tests/migration: Tweek auto converge limits check >>>>> Thomas found an autoconverge test failure where the >>>>> migration completed before the autoconverge had kicked in. >>>>> [...] >>>>> migration-test has become very slow. >>>>> On my laptop, before that commit migration-test takes 2min10seconds >>>>> After that commit, it takes around 11minutes >>>>> We can't revert it because it fixes a real problem when the host >>>>> machine is overloaded. See the comment on test_migrate_auto_converge(). >>>> >>>> Thanks, your patches decrease the time to run the migration-test from >>>> 16 minutes down to 5 minutes on my system, that's a great improvement, >>>> indeed! >>>> >>>> Tested-by: Thomas Huth <thuth@redhat.com> >>> Thanks >>> >>>> (though 5 minutes are still quite a lot for qtests ... maybe some >>>> other parts could be moved to only run with g_test_slow() ?) >>> Hi >>> Could you gime the output of: >>> time for i in $(./tests/qtest/migration-test -l | grep "^/"); do >>> echo $i; time ./tests/qtest/migration-test -p $i; done >>> To see what tests are taking so long on your system? >>> On my system (i9900K processor, i.e. not the latest) and >>> auto_converge >>> moved to slow the total of the tests take a bit more than 1 minute. >> >> This is with both of your patches applied: ... >> real 5m32,733s >> user 7m24,380s >> sys 1m50,801s > > Ouch. > > Can I ask: > - what is your machine? It is specially slow? It's a 4 year old T480s ThinkPad laptop. > Otherwise I want to know why it is happening. > > - as what is going slow to you is postcopy, can you told me what is this > setting? > > # we want postcopy to work for normal users > vm.unprivileged_userfaultfd = 1 $ sysctl vm.unprivileged_userfaultfd sysctl: cannot stat /proc/sys/vm/unprivileged_userfaultfd: No such file or directory > And if it is not set, just change it and retest. Seems like it is not available on RHEL 8 yet :-( Shall we maybe disable the postcopy tests if unprivileged_userfaultfd is not available? Thomas
© 2016 - 2024 Red Hat, Inc.