The migration tests have support for being passed two QEMU binaries to
test migration compatibility.

Add a CI job that builds the latest release of QEMU and another job
that uses that version plus an already present build of the current
version and runs the migration tests with the two, both as source and
destination. I.e.:

  old QEMU (n-1) -> current QEMU (development tree)
  current QEMU (development tree) -> old QEMU (n-1)

The purpose of this CI job is to ensure the code we're about to merge
will not cause a migration compatibility problem when migrating the
next release (which will contain that code) to/from the previous
release.

I'm leaving the jobs as manual for now because using an older QEMU in
tests could hit bugs that were already fixed in the current
development tree and we need to handle those case-by-case.

Note: for user forks, the version tags need to be pushed to GitLab,
otherwise it won't be able to check out a different version.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
.gitlab-ci.d/buildtest.yml | 53 ++++++++++++++++++++++++++++++++++++++
1 file changed, 53 insertions(+)
diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 91663946de..81163a3f6a 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -167,6 +167,59 @@ build-system-centos:
       x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
     MAKE_CHECK_ARGS: check-build
 
+build-previous-qemu:
+  extends: .native_build_job_template
+  artifacts:
+    when: on_success
+    expire_in: 2 days
+    paths:
+      - build-previous
+    exclude:
+      - build-previous/**/*.p
+      - build-previous/**/*.a.p
+      - build-previous/**/*.fa.p
+      - build-previous/**/*.c.o
+      - build-previous/**/*.c.o.d
+      - build-previous/**/*.fa
+  needs:
+    job: amd64-opensuse-leap-container
+  variables:
+    QEMU_JOB_OPTIONAL: 1
+    IMAGE: opensuse-leap
+    TARGETS: x86_64-softmmu aarch64-softmmu
+  before_script:
+    - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
+    - git checkout $QEMU_PREV_VERSION
+  after_script:
+    - mv build build-previous
+
+.migration-compat-common:
+  extends: .common_test_job_template
+  needs:
+    - job: build-previous-qemu
+    - job: build-system-opensuse
+  allow_failure: true
+  variables:
+    QEMU_JOB_OPTIONAL: 1
+    IMAGE: opensuse-leap
+    MAKE_CHECK_ARGS: check-build
+  script:
+    - cd build
+    - QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-${TARGET}
+          QTEST_QEMU_BINARY=./qemu-system-${TARGET} ./tests/qtest/migration-test
+    - QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-${TARGET}
+          QTEST_QEMU_BINARY=./qemu-system-${TARGET} ./tests/qtest/migration-test
+
+migration-compat-aarch64:
+  extends: .migration-compat-common
+  variables:
+    TARGET: aarch64
+
+migration-compat-x86_64:
+  extends: .migration-compat-common
+  variables:
+    TARGET: x86_64
+
 check-system-centos:
   extends: .native_test_job_template
   needs:
--
2.35.3
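
The tag derivation in before_script can be checked locally. The sketch below applies the same sed expression to a few version strings; the helper name prev_tag and the sample versions (8.2.50, 8.2.90) are illustrative assumptions, not values taken from the patch.

```shell
#!/bin/sh
# Sketch: derive the previous release tag from a VERSION string using the
# same sed expression as the build-previous-qemu job. prev_tag is a
# hypothetical helper and the version strings below are sample inputs.
prev_tag() {
    # Drop the last numeric component and replace it with ".0",
    # prefixing the result with "v" to form a tag name.
    printf '%s\n' "$1" | sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/'
}

prev_tag 8.2.50
prev_tag 8.2.90
```

The expression always maps x.y.z to vx.y.0, so on a tree whose VERSION is already x.y.0 it would resolve to the tag of that same version rather than an older one.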
On 1/5/24 19:04, Fabiano Rosas wrote:
> The migration tests have support for being passed two QEMU binaries to
> test migration compatibility.

[...]

> +migration-compat-aarch64:
> +  extends: .migration-compat-common
> +  variables:
> +    TARGET: aarch64
> +
> +migration-compat-x86_64:
> +  extends: .migration-compat-common
> +  variables:
> +    TARGET: x86_64

What about the other archs, s390x and ppc? Do you lack the resources,
or are there problems to address?

Thanks,

C.
Cédric Le Goater <clg@redhat.com> writes:

> On 1/5/24 19:04, Fabiano Rosas wrote:

[...]

>> +migration-compat-x86_64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +    TARGET: x86_64
>
> What about the other archs, s390x and ppc? Do you lack the resources,
> or are there problems to address?

Currently s390x and ppc are only tested on KVM, which means they are not
tested at all unless someone runs migration-test on a custom runner. The
same is true for this test.

The TCG tests have been disabled:

/*
 * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
 * is touchy due to race conditions on dirty bits (especially on PPC for
 * some reason)
 */

/*
 * Similar to ppc64, s390x seems to be touchy with TCG, so disable it
 * there until the problems are resolved
 */

It would be great if we could figure out what these issues are and fix
them so we can at least test with TCG like we do for aarch64.

Doing a TCG run of migration-test with both archs (one binary only, not
this series):

- ppc survived one run, taking 6 minutes longer than x86/aarch64.
- s390x survived one run, taking 40s less than x86/aarch64.

I'll leave them enabled on my machine and do some runs here and there,
see if I spot something. If not, we can consider re-enabling them once
we figure out why ppc takes so long.
On 09/01/2024 21.58, Fabiano Rosas wrote:
> Cédric Le Goater <clg@redhat.com> writes:
>
>> What about the other archs, s390x and ppc? Do you lack the resources,
>> or are there problems to address?
>
> Currently s390x and ppc are only tested on KVM, which means they are not
> tested at all unless someone runs migration-test on a custom runner. The
> same is true for this test.

[...]

> Doing a TCG run of migration-test with both archs (one binary only, not
> this series):
>
> - ppc survived one run, taking 6 minutes longer than x86/aarch64.
> - s390x survived one run, taking 40s less than x86/aarch64.
>
> I'll leave them enabled on my machine and do some runs here and there,
> see if I spot something. If not, we can consider re-enabling them once
> we figure out why ppc takes so long.

I was curious and re-enabled the ppc64 and s390x migration tests with TCG
on my laptop here, running "make check-tcg -j$(nproc)" in a loop.

s390x unfortunately hung already after the second iteration, but ppc64
survived 25 runs (then I stopped it). So we might want to try to
re-enable ppc64 at least. But we might need to cut the run time for
ppc64 with TCG a little bit; it is currently the longest test on my
system (it takes 240s to finish, while all other tests finish within
150s).

 Thomas
On Fri, Jan 05, 2024 at 03:04:48PM -0300, Fabiano Rosas wrote:

[...]

> I'm leaving the jobs as manual for now because using an older QEMU in
> tests could hit bugs that were already fixed in the current
> development tree and we need to handle those case-by-case.

Can we opt out of those broken tests using either your "since:" thing or
anything similar?

I hope we can start to run something by default in the CI in 9.0 to cover
n-1 -> n, even if starting with a subset of tests. Is it possible?

Thanks,

--
Peter Xu
Peter Xu <peterx@redhat.com> writes:

> On Fri, Jan 05, 2024 at 03:04:48PM -0300, Fabiano Rosas wrote:

[...]

>> I'm leaving the jobs as manual for now because using an older QEMU in
>> tests could hit bugs that were already fixed in the current
>> development tree and we need to handle those case-by-case.
>
> Can we opt out of those broken tests using either your "since:" thing or
> anything similar?

If it's something migration related, then yes. But there might be other
types of breakages that have nothing to do with migration. Our tests are
not resilient enough (nor should they be) to detect when QEMU aborted for
other reasons. Think about the -audio issue: the old QEMU would just say
"there's no -audio option, abort" and that's a test failure of course.

> I hope we can start to run something by default in the CI in 9.0 to cover
> n-1 -> n, even if starting with a subset of tests. Is it possible?

We could maybe have it enabled with "allow_failure" set. The important
thing here is that we don't want to get reports of "flaky test". These
tests are kind of flaky by definition: there's no way to backport a fix
to the older QEMU, so there's always the chance that this test will be
broken for a whole release cycle.

We should act fast in adding the "since" annotation or other workaround,
but that depends on our availability and the type of bug that we hit.
On Tue, Jan 09, 2024 at 10:00:17AM -0300, Fabiano Rosas wrote:
> > Can we opt out of those broken tests using either your "since:" thing or
> > anything similar?
>
> If it's something migration related, then yes. But there might be other
> types of breakages that have nothing to do with migration. Our tests are
> not resilient enough (nor should they be) to detect when QEMU aborted for
> other reasons. Think about the -audio issue: the old QEMU would just say
> "there's no -audio option, abort" and that's a test failure of course.

I'm wondering whether we can more or less remedy that by running
migration-test under the build-previous directory for cross-binary tests.
We don't necessarily need to cross-test anything new happening anyway.

IOW, we use both the old QEMU and the old migration-test for "n-1", and
we only use "n" for the new QEMU binary?

--
Peter Xu
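
The idea of driving the cross-binary runs from the old tree could look roughly like the sketch below. It is a dry run that only prints the command lines. The build/ and build-previous/ layout follows the CI jobs in the patch; the helper name cross_test_cmds is hypothetical, and the sketch assumes the previous release's migration-test already honours QTEST_QEMU_BINARY_SRC and QTEST_QEMU_BINARY_DST.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for running the *old* migration-test
# (from build-previous) against the new QEMU binary, in both directions.
# cross_test_cmds is a hypothetical helper, not part of the QEMU tree.
cross_test_cmds() {
    target=$1
    old="./qemu-system-${target}"          # n-1 binary, inside build-previous
    new="../build/qemu-system-${target}"   # development-tree binary

    # old -> new: the new binary replaces only the destination side
    echo "cd build-previous && QTEST_QEMU_BINARY_DST=${new} QTEST_QEMU_BINARY=${old} ./tests/qtest/migration-test"
    # new -> old: the new binary replaces only the source side
    echo "cd build-previous && QTEST_QEMU_BINARY_SRC=${new} QTEST_QEMU_BINARY=${old} ./tests/qtest/migration-test"
}

cross_test_cmds x86_64
```

Compared with the script in the patch, only the roles are swapped: the test program and the default binary come from build-previous, and the development-tree binary is passed through the _SRC/_DST override.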