On 28/06/2022 12.54, Daniel P. Berrangé wrote:
> Since the TLS tests were added a few people have reported seeing
> hangs in some of the TLS test cases for migration. Debugging
> has revealed that in all cases the test was waiting for a STOP
> event that never arrived.
>
> The problem is that TLS performance is highly dependent on the
> crypto impl. Some people have been running tests on machines
> which are highly efficient at running the guest dirtying workload
> but relatively slow at TLS. This has prevented convergence from
> being reliably achieved in the configured max downtime.
>
> Since this test design has been long-standing I suspect the
> lack of convergence is a likely cause of previous hangs we've
> seen in various scenarios that resulted in us disabling the test
> on s390 TCG, ppc TCG and ppc KVM-PR.
>
> Thus I have suggested we drop these skip conditions, though I would
> note that I've not had the ability to actually test the effect that
> this has.
>
> Daniel P. Berrangé (5):
> tests: wait max 120 seconds for migration test status changes
> tests: wait for migration completion before looking for STOP event
> tests: increase migration test converge downtime to 30 seconds
> tests: use consistent bandwidth/downtime limits in migration tests
> tests: stop skipping migration test on s390x/ppc64
>
> tests/qtest/migration-helpers.c | 14 ++++++
> tests/qtest/migration-test.c | 80 ++++++++++-----------------------
> 2 files changed, 38 insertions(+), 56 deletions(-)

FYI, this fixes the hang that I saw with the
precopy/unix/tls/x509/override-host test on my RHEL8 s390x host.

Tested-by: Thomas Huth <thuth@redhat.com>