From: Thomas Huth <thuth@redhat.com>
We've got this nice vmstate-static-checker.py script that can help
to detect screw-ups in the migration states. Unfortunately, it's
currently only run manually, so there could be regressions that nobody
notices immediately. Let's run it from a functional test automatically
so that we get at least basic coverage in each CI run.
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
MAINTAINERS | 1 +
tests/functional/meson.build | 13 +++++++-
tests/functional/test_vmstate.py | 56 ++++++++++++++++++++++++++++++++
3 files changed, 69 insertions(+), 1 deletion(-)
create mode 100755 tests/functional/test_vmstate.py
diff --git a/MAINTAINERS b/MAINTAINERS
index 65fb61844b3..6a8d81458ad 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3525,6 +3525,7 @@ F: migration/
 F: scripts/vmstate-static-checker.py
 F: tests/data/vmstate-static-checker/
 F: tests/functional/test_migration.py
+F: tests/functional/test_vmstate.py
 F: tests/qtest/migration/
 F: tests/qtest/migration-*
 F: docs/devel/migration/
diff --git a/tests/functional/meson.build b/tests/functional/meson.build
index b317ad42c5a..9f339e626f6 100644
--- a/tests/functional/meson.build
+++ b/tests/functional/meson.build
@@ -76,6 +76,7 @@ tests_generic_bsduser = [
 
 tests_aarch64_system_quick = [
   'migration',
+  'vmstate',
 ]
 
 tests_aarch64_system_thorough = [
@@ -164,6 +165,10 @@ tests_loongarch64_system_thorough = [
   'loongarch64_virt',
 ]
 
+tests_m68k_system_quick = [
+  'vmstate',
+]
+
 tests_m68k_system_thorough = [
   'm68k_mcf5208evb',
   'm68k_nextcube',
@@ -230,6 +235,7 @@ tests_ppc_system_thorough = [
 
 tests_ppc64_system_quick = [
   'migration',
+  'vmstate',
 ]
 
 tests_ppc64_system_thorough = [
@@ -265,6 +271,10 @@ tests_rx_system_thorough = [
   'rx_gdbsim',
 ]
 
+tests_s390x_system_quick = [
+  'vmstate',
+]
+
 tests_s390x_system_thorough = [
   's390x_ccw_virtio',
   's390x_replay',
@@ -305,8 +315,9 @@ tests_x86_64_system_quick = [
   'migration',
   'pc_cpu_hotplug_props',
   'virtio_version',
-  'x86_cpu_model_versions',
+  'vmstate',
   'vnc',
+  'x86_cpu_model_versions',
 ]
 
 tests_x86_64_system_thorough = [
diff --git a/tests/functional/test_vmstate.py b/tests/functional/test_vmstate.py
new file mode 100755
index 00000000000..3ba56d580db
--- /dev/null
+++ b/tests/functional/test_vmstate.py
@@ -0,0 +1,56 @@
+#!/usr/bin/env python3
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# This test runs the vmstate-static-checker script with the current QEMU
+
+import subprocess
+
+from qemu_test import QemuSystemTest
+
+
+class VmStateTest(QemuSystemTest):
+
+    def test_vmstate(self):
+        target_machine = {
+            'aarch64': 'virt-7.2',
+            'm68k': 'virt-7.2',
+            'ppc64': 'pseries-7.2',
+            's390x': 's390-ccw-virtio-7.2',
+            'x86_64': 'pc-q35-7.2',
+        }
+        self.set_machine(target_machine[self.arch])
+
+        # Run QEMU to get the current vmstate json file:
+        dst_json = self.scratch_file('dest.json')
+        self.log.info('Dumping vmstate from ' + self.qemu_bin)
+        cp = subprocess.run([self.qemu_bin, '-nodefaults',
+                             '-M', target_machine[self.arch],
+                             '-dump-vmstate', dst_json],
+                            stdout=subprocess.PIPE,
+                            stderr=subprocess.STDOUT,
+                            text=True)
+        if cp.returncode != 0:
+            self.fail('Running QEMU failed:\n' + cp.stdout)
+        if cp.stdout:
+            self.log.info('QEMU output: ' + cp.stdout)
+
+        # Check whether the old vmstate json file is still compatible:
+        src_json = self.data_file('..', 'data', 'vmstate-static-checker',
+                                  self.arch,
+                                  target_machine[self.arch] + '.json')
+        vmstate_checker = self.data_file('..', '..', 'scripts',
+                                         'vmstate-static-checker.py')
+        self.log.info('Comparing vmstate with ' + src_json)
+        cp = subprocess.run([vmstate_checker, '-s', src_json, '-d', dst_json],
+                            stdout=subprocess.PIPE,
+                            stderr=subprocess.STDOUT,
+                            text=True)
+        if cp.returncode != 0:
+            self.fail('Running vmstate-static-checker failed:\n' + cp.stdout)
+        if cp.stdout:
+            self.log.warning('vmstate-static-checker output: ' + cp.stdout)
+
+
+if __name__ == '__main__':
+    QemuSystemTest.main()
--
2.49.0
On 4/29/25 8:21 AM, Thomas Huth wrote:
> From: Thomas Huth <thuth@redhat.com>
>
> We've got this nice vmstate-static-checker.py script that can help
> to detect screw-ups in the migration states. Unfortunately, it's
> currently only run manually, so there could be regressions that nobody
> notices immediately. Let's run it from a functional test automatically
> so that we get at least basic coverage in each CI run.
>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thanks for this series Thomas, it's very useful.
Could we extend this automatically to test migration on all
combinations: {qemu-system-*} x {machine}?
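
A minimal sketch of how the {qemu-system-*} x {machine} list could be built. It assumes that `-machine help` prints a "Supported machines are:" header followed by one machine per line with the name in the first column; `parse_machine_help` and `all_combinations` are hypothetical helpers, not existing QEMU code.

```python
def parse_machine_help(text):
    """Return the machine names found in `-machine help` style output."""
    machines = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the assumed header line.
        if not line or line.startswith('Supported machines'):
            continue
        # The machine name is assumed to be the first column.
        machines.append(line.split()[0])
    return machines


def all_combinations(help_by_binary):
    """Build (binary, machine) pairs from a {binary: help_text} mapping."""
    return [(binary, machine)
            for binary, text in sorted(help_by_binary.items())
            for machine in parse_machine_help(text)]
```

In a real test the help text would come from running each built binary, for example with `subprocess.run([binary, '-machine', 'help'], ...)`.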
We could generate a single list of references containing hashes of all
outputs, plus a simple and clean command to regenerate them and the
associated JSON files, so we don't pollute the QEMU tree with tons of JSON.
This way, we can automatically detect that we never regress, not only
from release to release, but commit to commit.
In case we need to update a reference, people can point out the actual
difference in the commit message.
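
To make the hash-list idea concrete, here is a rough sketch. It assumes each dump is a JSON document and that the reference list is a plain mapping from a "binary/machine" key to a SHA-256 digest; both function names are hypothetical.

```python
import hashlib
import json


def dump_digest(dump_text):
    """Hash a vmstate dump after normalising its JSON formatting."""
    # Re-serialise so that whitespace and object key order do not affect
    # the digest; the order of array elements is preserved.
    canonical = json.dumps(json.loads(dump_text), sort_keys=True,
                           separators=(',', ':'))
    return hashlib.sha256(canonical.encode('utf-8')).hexdigest()


def changed_entries(reference, current):
    """Return the keys whose digest differs from, or is missing in, the reference."""
    return sorted(key for key, digest in current.items()
                  if reference.get(key) != digest)
```

A non-empty result from `changed_entries()` would fail the test, and the regenerate command would simply rewrite the reference mapping from the current digests.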
Also, since I looked into this before: this check alone is not enough for
migration. Beyond the VMState, we should also check that the default values
of every field are unchanged. For instance, we recently changed the default
pauth property of Arm CPUs, and without careful backward compatibility
handling it would have broken migration. It's a bit more tricky, since
nothing is available today to dump this (I hacked it together using a custom
trace). It's definitely not in the scope of your series, just worth
mentioning.
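
Purely as an illustration of the default-value check: the sketch below assumes some future dump format that maps each device to its property defaults. No such dump exists today, so the data layout and the `changed_defaults` helper are hypothetical.

```python
def changed_defaults(old, new):
    """List (device, property, old_value, new_value) for changed defaults.

    Both arguments are assumed to look like {device: {property: default}}.
    Only devices and properties present in both dumps are compared.
    """
    changes = []
    for device in sorted(old.keys() & new.keys()):
        for prop in sorted(old[device].keys() & new[device].keys()):
            if old[device][prop] != new[device][prop]:
                changes.append((device, prop,
                                old[device][prop], new[device][prop]))
    return changes
```

Added or removed properties are deliberately ignored here; a real check would probably want to report those separately.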
I hope we can one day get rid of all "Is this change safe regarding
migration?" comments because we know we can trust our CI instead.
Regards,
Pierrick
On Wed, Apr 30, 2025 at 09:10:30AM -0700, Pierrick Bouvier wrote:
> On 4/29/25 8:21 AM, Thomas Huth wrote:
> > From: Thomas Huth <thuth@redhat.com>
> >
> > We've got this nice vmstate-static-checker.py script that can help
> > to detect screw-ups in the migration states. Unfortunately, it's
> > currently only run manually, so there could be regressions that nobody
> > notices immediately. Let's run it from a functional test automatically
> > so that we get at least basic coverage in each CI run.
> >
> > Signed-off-by: Thomas Huth <thuth@redhat.com>
>
> Thanks for this series Thomas, it's very useful.
> Could we extend this automatically to test migration on all combinations:
> {qemu-system-*} x {machine}?
> We could generate a single list of references, containing hashes of all
> outputs, and a simple and clean command to regenerate all those, and
> associated jsons, so we don't pollute qemu code with tons of json.
I think a major challenge would be false positives, and how to filter
them, once we put anything into CI.
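
One possible shape for such a filter, as a sketch only: it treats the checker output as opaque lines and drops those matching a maintained list of regular expressions. The helper name and the sample patterns are made up for illustration and do not reflect the real vmstate-static-checker.py output format.

```python
import re


def unexpected_lines(checker_output, allowed_patterns):
    """Return the non-empty output lines that match no allowed pattern."""
    compiled = [re.compile(pattern) for pattern in allowed_patterns]
    return [line for line in checker_output.splitlines()
            if line.strip()
            and not any(regex.search(line) for regex in compiled)]
```

A CI job could then fail only when `unexpected_lines()` returns something, with each allowed pattern documented next to the reason it is considered harmless.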
Side note: just yesterday I wrote a script to do exactly this, by
automatically building binaries and checking all relevant archs over all
supported machine types. It looks like this:
https://gitlab.com/peterx/qemu/-/commit/c4abfa39f8943cd62f0d982ecb36537df398ae70
The plan is that I run this at the end of each release, though, not yet in
CI. I also have no plan to upstream this script; I may keep it to myself
for now unless someone thinks we should have it.
PS: I just ran it for v9.2..v10.0 on the default 4 archs
(x86,arm,ppc,s390) and a huge list was generated. I believe most of the
entries are false positives; I'll delay walking the list for some time. I
attached the result at the end in case anyone is interested.
>
> This way, we can automatically detect that we never regress, not only from
> release to release, but commit to commit.
>
> In case we need to update a reference, people can point out the actual
> difference in the commit message.
>
> Also, since I looked into this before: this check alone is not enough for
> migration. Beyond the VMState, we should also check that the default values
> of every field are unchanged. For instance, we recently changed the default
> pauth property of Arm CPUs, and without careful backward compatibility
> handling it would have broken migration. It's a bit more tricky, since
> nothing is available today to dump this (I hacked it together using a
> custom trace). It's definitely not in the scope of your series, just worth
> mentioning.
>
> I hope we can one day get rid of all "Is this change safe regarding
> migration?" comments because we know we can trust our CI instead.
IMHO it's extremely hard (if not impossible) to guarantee that, because
some migration bugs may only trigger in special paths that are not always
taken, e.g. they can even involve guest driver behavior.
That said, Fabiano used to work on supporting device-specific tests in
qtests/migration-test.c. I don't think it landed, but maybe we have room
for device-specific tests using the qtests/migration-test.c framework.
https://lore.kernel.org/all/20240523201922.28007-1-farosas@suse.de/
Thanks,
--
Peter Xu
On 5/1/25 7:28 AM, Peter Xu wrote:
> On Wed, Apr 30, 2025 at 09:10:30AM -0700, Pierrick Bouvier wrote:
>> On 4/29/25 8:21 AM, Thomas Huth wrote:
>>> From: Thomas Huth <thuth@redhat.com>
>>>
>>> We've got this nice vmstate-static-checker.py script that can help
>>> to detect screw-ups in the migration states. Unfortunately, it's
>>> currently only run manually, so there could be regressions that nobody
>>> notices immediately. Let's run it from a functional test automatically
>>> so that we get at least basic coverage in each CI run.
>>>
>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>
>> Thanks for this series Thomas, it's very useful.
>> Could we extend this automatically to test migration on all combinations:
>> {qemu-system-*} x {machine}?
>> We could generate a single list of references, containing hashes of all
>> outputs, and a simple and clean command to regenerate all those, and
>> associated jsons, so we don't pollute qemu code with tons of json.
>
> I think a major challenge would be false positives, and how to filter
> them, once we put anything into CI.
>
A failure would be expected every time something changes:
- it can be a default field that has a new value (particularly sensitive
for cpus)
- it can be a new cpu field that is added
- it can be a board definition change
- it can be a hardware related change
In all cases, even though it does not break migration, it's interesting
to know that such a change happened.
Also, if it's simple to update the references and to diff the various
dumps per {binary, board}, then it's trivial to identify and comment on the
"false positives". The more often it runs (ideally, per PR, or per
series), the easier it is to identify what changed.
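
Getting a readable difference between two dumps can be fairly cheap. The sketch below assumes the dumps are JSON text and uses only the Python standard library; `dump_diff` is a hypothetical helper.

```python
import difflib
import json


def dump_diff(old_text, new_text, old_name='old.json', new_name='new.json'):
    """Return a unified diff between two JSON dumps as a list of lines."""
    def pretty(text):
        # Re-indent both dumps the same way so that only real content
        # changes show up in the diff.
        return json.dumps(json.loads(text), indent=2,
                          sort_keys=True).splitlines()

    return list(difflib.unified_diff(pretty(old_text), pretty(new_text),
                                     fromfile=old_name, tofile=new_name,
                                     lineterm=''))
```

An empty list means the two dumps are equivalent; otherwise the lines can be pasted into a commit message or a CI log.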
> Side note: just yesterday I wrote a script to do exactly this, by
> automatically building binaries and checking all relevant archs over all
> supported machine types. It looks like this:
>
> https://gitlab.com/peterx/qemu/-/commit/c4abfa39f8943cd62f0d982ecb36537df398ae70
>
> The plan is that I run this at the end of each release, though, not yet in
> CI. I also have no plan to upstream this script; I may keep it to myself
> for now unless someone thinks we should have it.
>
Glad to hear we have a script, but sad to hear "will run manually once
every 6 months".
> PS: I just ran it for v9.2..v10.0 on the default 4 archs
> (x86,arm,ppc,s390) and a huge list was generated. I believe most of the
> entries are false positives; I'll delay walking the list for some time. I
> attached the result at the end in case anyone is interested.
>
Would it be possible to post this on an online forge like GitLab, and
commit the previous and new versions of the dumps (the data, not the list
of failures), so we can see all the differences in a nice way?
>>
>> This way, we can automatically detect that we never regress, not only from
>> release to release, but commit to commit.
>>
>> In case we need to update a reference, people can point out the actual
>> difference in the commit message.
>>
>> Also, since I looked into this before: this check alone is not enough for
>> migration. Beyond the VMState, we should also check that the default values
>> of every field are unchanged. For instance, we recently changed the default
>> pauth property of Arm CPUs, and without careful backward compatibility
>> handling it would have broken migration. It's a bit more tricky, since
>> nothing is available today to dump this (I hacked it together using a
>> custom trace). It's definitely not in the scope of your series, just worth
>> mentioning.
>>
>> I hope we can one day get rid of all "Is this change safe regarding
>> migration?" comments because we know we can trust our CI instead.
>
> IMHO it's extremely hard (if not impossible) to guarantee that, because
> some migration bugs may only trigger in special paths that are not always
> taken, e.g. they can even involve guest driver behavior.
>
It would not cover 100%, but if we can already make sure that a VM stopped
at the end of qemu_init() has a predictable dump, that's a huge win
over having nothing.
We can make the same argument about the current QEMU CI: it does not cover
100% of the code (*much* less than that, from what I tried), but it's still
better than no tests.
> That said, Fabiano used to work on supporting device-specific tests in
> qtests/migration-test.c. I don't think it landed, but maybe we have room
> for device-specific tests using the qtests/migration-test.c framework.
>
> https://lore.kernel.org/all/20240523201922.28007-1-farosas@suse.de/
>
> Thanks,
>
Regards,
Pierrick