It has always been possible to enable arbitrary migration capabilities
and attempt to take a snapshot of the VM with the savevm/loadvm
commands as well as their QMP counterparts
snapshot-save/snapshot-load.
Most migration capabilities are not meant to be used with snapshots
and there's a risk of crashing QEMU or producing incorrect
behavior. Ideally, every migration capability would either be
implemented for savevm or explicitly rejected.
Add a compatibility check routine and reject the snapshot command if
an incompatible capability is enabled. For now only act on the the two
that actually cause a crash: multifd and mapped-ram.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2881
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
migration/options.c | 26 ++++++++++++++++++++++++++
migration/options.h | 1 +
migration/savevm.c | 8 ++++++++
3 files changed, 35 insertions(+)
diff --git a/migration/options.c b/migration/options.c
index b0ac2ea408..8772b98dca 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -443,11 +443,37 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
MIGRATION_CAPABILITY_VALIDATE_UUID,
MIGRATION_CAPABILITY_ZERO_COPY_SEND);
+/* Snapshot compatibility check list */
+static const
+INITIALIZE_MIGRATE_CAPS_SET(check_caps_savevm,
+ MIGRATION_CAPABILITY_MULTIFD,
+ MIGRATION_CAPABILITY_MAPPED_RAM,
+);
+
static bool migrate_incoming_started(void)
{
return !!migration_incoming_get_current()->transport_data;
}
+bool migrate_can_snapshot(Error **errp)
+{
+ MigrationState *s = migrate_get_current();
+ int i;
+
+ for (i = 0; i < check_caps_savevm.size; i++) {
+ int incomp_cap = check_caps_savevm.caps[i];
+
+ if (s->capabilities[incomp_cap]) {
+ error_setg(errp,
+ "Snapshots are not compatible with %s",
+ MigrationCapability_str(incomp_cap));
+ return false;
+ }
+ }
+
+ return true;
+}
+
/**
* @migration_caps_check - check capability compatibility
*
diff --git a/migration/options.h b/migration/options.h
index 762be4e641..20b71b6f2a 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -58,6 +58,7 @@ bool migrate_tls(void);
/* capabilities helpers */
bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp);
+bool migrate_can_snapshot(Error **errp);
/* parameters */
diff --git a/migration/savevm.c b/migration/savevm.c
index ce158c3512..3be13bcfe8 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3229,6 +3229,10 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
GLOBAL_STATE_CODE();
+ if (!migrate_can_snapshot(errp)) {
+ return false;
+ }
+
if (migration_is_blocked(errp)) {
return false;
}
@@ -3413,6 +3417,10 @@ bool load_snapshot(const char *name, const char *vmstate,
int ret;
MigrationIncomingState *mis = migration_incoming_get_current();
+ if (!migrate_can_snapshot(errp)) {
+ return false;
+ }
+
if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
return false;
}
--
2.35.3
Hello Fabiano,
First of all thanks a lot for the quick follow up to my issue!
I just want to point out that with only mapped-ram enabled (without
multifd) savevm/loadvm do not lead to a crash but just to an error
according to my (few) experiments (on upstream).
Ciao
Marco
On Thursday, March 27, 2025 15:39 CET, Fabiano Rosas <farosas@suse.de> wrote:
> It has always been possible to enable arbitrary migration capabilities
> and attempt to take a snapshot of the VM with the savevm/loadvm
> commands as well as their QMP counterparts
> snapshot-save/snapshot-load.
>
> Most migration capabilities are not meant to be used with snapshots
> and there's a risk of crashing QEMU or producing incorrect
> behavior. Ideally, every migration capability would either be
> implemented for savevm or explicitly rejected.
>
> Add a compatibility check routine and reject the snapshot command if
> an incompatible capability is enabled. For now only act on the the two
> that actually cause a crash: multifd and mapped-ram.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2881
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> migration/options.c | 26 ++++++++++++++++++++++++++
> migration/options.h | 1 +
> migration/savevm.c | 8 ++++++++
> 3 files changed, 35 insertions(+)
>
> diff --git a/migration/options.c b/migration/options.c
> index b0ac2ea408..8772b98dca 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -443,11 +443,37 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
> MIGRATION_CAPABILITY_VALIDATE_UUID,
> MIGRATION_CAPABILITY_ZERO_COPY_SEND);
>
> +/* Snapshot compatibility check list */
> +static const
> +INITIALIZE_MIGRATE_CAPS_SET(check_caps_savevm,
> + MIGRATION_CAPABILITY_MULTIFD,
> + MIGRATION_CAPABILITY_MAPPED_RAM,
> +);
> +
> static bool migrate_incoming_started(void)
> {
> return !!migration_incoming_get_current()->transport_data;
> }
>
> +bool migrate_can_snapshot(Error **errp)
> +{
> + MigrationState *s = migrate_get_current();
> + int i;
> +
> + for (i = 0; i < check_caps_savevm.size; i++) {
> + int incomp_cap = check_caps_savevm.caps[i];
> +
> + if (s->capabilities[incomp_cap]) {
> + error_setg(errp,
> + "Snapshots are not compatible with %s",
> + MigrationCapability_str(incomp_cap));
> + return false;
> + }
> + }
> +
> + return true;
> +}
> +
> /**
> * @migration_caps_check - check capability compatibility
> *
> diff --git a/migration/options.h b/migration/options.h
> index 762be4e641..20b71b6f2a 100644
> --- a/migration/options.h
> +++ b/migration/options.h
> @@ -58,6 +58,7 @@ bool migrate_tls(void);
> /* capabilities helpers */
>
> bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp);
> +bool migrate_can_snapshot(Error **errp);
>
> /* parameters */
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index ce158c3512..3be13bcfe8 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -3229,6 +3229,10 @@ bool save_snapshot(const char *name, bool overwrite, const char *vmstate,
>
> GLOBAL_STATE_CODE();
>
> + if (!migrate_can_snapshot(errp)) {
> + return false;
> + }
> +
> if (migration_is_blocked(errp)) {
> return false;
> }
> @@ -3413,6 +3417,10 @@ bool load_snapshot(const char *name, const char *vmstate,
> int ret;
> MigrationIncomingState *mis = migration_incoming_get_current();
>
> + if (!migrate_can_snapshot(errp)) {
> + return false;
> + }
> +
> if (!bdrv_all_can_snapshot(has_devices, devices, errp)) {
> return false;
> }
> --
> 2.35.3
>
"Marco Cavenati" <Marco.Cavenati@eurecom.fr> writes: > Hello Fabiano, > > First of all thanks a lot for the quick follow up to my issue! > > I just want to point out that with only mapped-ram enabled (without > multifd) savevm/loadvm do not lead to a crash but just to an error > according to my (few) experiments (on upstream). > Yes, absolutely. I used imprecise language. Thanks for the correction, I'll explain it better in the following versions.
On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote: > It has always been possible to enable arbitrary migration capabilities > and attempt to take a snapshot of the VM with the savevm/loadvm > commands as well as their QMP counterparts > snapshot-save/snapshot-load. > > Most migration capabilities are not meant to be used with snapshots > and there's a risk of crashing QEMU or producing incorrect > behavior. Ideally, every migration capability would either be > implemented for savevm or explicitly rejected. IMHO, this a prime example of why migration config shouldn't be held as global state, and instead passed as parameters to the commands that need them. The snapshot-save/load commands would then only be able to accept what few settings are actually relevant, instead of inheriting any/all global migration state. > Add a compatibility check routine and reject the snapshot command if > an incompatible capability is enabled. For now only act on the the two > that actually cause a crash: multifd and mapped-ram. > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2818 Issue is 2881 not 2818 ^^^^^^^ > Signed-off-by: Fabiano Rosas <farosas@suse.de> > --- > migration/options.c | 26 ++++++++++++++++++++++++++ > migration/options.h | 1 + > migration/savevm.c | 8 ++++++++ > 3 files changed, 35 insertions(+) With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Daniel P. Berrangé <berrange@redhat.com> writes: > On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote: >> It has always been possible to enable arbitrary migration capabilities >> and attempt to take a snapshot of the VM with the savevm/loadvm >> commands as well as their QMP counterparts >> snapshot-save/snapshot-load. >> >> Most migration capabilities are not meant to be used with snapshots >> and there's a risk of crashing QEMU or producing incorrect >> behavior. Ideally, every migration capability would either be >> implemented for savevm or explicitly rejected. > > IMHO, this a prime example of why migration config shouldn't be held > as global state, and instead passed as parameters to the commands > that need them. The snapshot-save/load commands would then only > be able to accept what few settings are actually relevant, instead > of inheriting any/all global migration state. > Right, I remember we got caught around the fact that some migration options are needed during runtime as well... but I don't remember the details, let try to find that thread. >> Add a compatibility check routine and reject the snapshot command if >> an incompatible capability is enabled. For now only act on the the two >> that actually cause a crash: multifd and mapped-ram. >> >> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2818 > > Issue is 2881 not 2818 ^^^^^^^ > It seem that was your own C-t =) >> Signed-off-by: Fabiano Rosas <farosas@suse.de> >> --- >> migration/options.c | 26 ++++++++++++++++++++++++++ >> migration/options.h | 1 + >> migration/savevm.c | 8 ++++++++ >> 3 files changed, 35 insertions(+) > > With regards, > Daniel
Fabiano Rosas <farosas@suse.de> writes:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
>> On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote:
>>> It has always been possible to enable arbitrary migration capabilities
>>> and attempt to take a snapshot of the VM with the savevm/loadvm
>>> commands as well as their QMP counterparts
>>> snapshot-save/snapshot-load.
>>>
>>> Most migration capabilities are not meant to be used with snapshots
>>> and there's a risk of crashing QEMU or producing incorrect
>>> behavior. Ideally, every migration capability would either be
>>> implemented for savevm or explicitly rejected.
>>
>> IMHO, this a prime example of why migration config shouldn't be held
>> as global state, and instead passed as parameters to the commands
>> that need them. The snapshot-save/load commands would then only
>> be able to accept what few settings are actually relevant, instead
>> of inheriting any/all global migration state.
>>
>
> Right, I remember we got caught around the fact that some migration
> options are needed during runtime as well... but I don't remember the
> details, let try to find that thread.
>
Found it: https://lore.kernel.org/r/ZVM5xmsaE41WJYgb@redhat.com
I don't think it's *too* hard to start passing the configuration to
qmp_migrate & friends. We just need to figure out a path for the
compatibility.
I'm thiking of:
1) Unifying capabilities and parameters in a MigrationConfig
structure. We take the opportunity to fix the tls options to 'str'
instead of StrOrNull.
2) Deprecate migrate-set-capabilities. There are no capabilities
anymore.
3) Deprecate migate-set-parameters. There are no parameters
anymore. Alternatively, reuse the existing command, but have it take the
additional capabilities as optional (everything else is already
optional).
4) Depending on what we do on (3), add a new migrate-set-config command
that sets every option. All as optional. This would be nice because we
wouldn't need to worry about breaking compat on the tls options, we just
define the new command in the correct way.
5) Add a {'*config': MigrationConfig} entry to qmp_migrate and
migrate_set_{config|parameters}. Here is where I have questions, because
ideally we'd have a way to limit the migrate_set_config command to only
the options that can be set at runtime. But I can't see a way of doing
that without the duplication of the options in the QAPI .json file. I'm
inclined to allow the whole set of options and do some tracking on the
side in options.c in the migration code.
(same issue for savevm really. To allow it to (say) work with
mapped-ram, we'd need a duplicate mapped-ram entry in migration.json)
About (2) and (3). If we use this unified MigrationConfig, I can keep
the old commands working (along with the query_* variants), by defining
a compat function that converts from those commands specific format into
the new format. But then there's the question of what do we do when a
new capability/parameter comes along? Can we declare that the old
commands will not see the new data and that's it? If there's no
distinction between caps and params anymore, there isn't even a way to
decide which command to use.
>>> Add a compatibility check routine and reject the snapshot command if
>>> an incompatible capability is enabled. For now only act on the the
>>> two that actually cause a crash: multifd and mapped-ram.
>>>
>>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2818
>>
>> Issue is 2881 not 2818 ^^^^^^^
>>
>
> It seem that was your own C-t =)
>
>>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>>> ---
>>> migration/options.c | 26 ++++++++++++++++++++++++++
>>> migration/options.h | 1 +
>>> migration/savevm.c | 8 ++++++++
>>> 3 files changed, 35 insertions(+)
>>
>> With regards,
>> Daniel
Fabiano Rosas <farosas@suse.de> writes:
+Cc Markus
context:
This series was trying to stop savevm from crashing when arbitrary
migration capabilities are enabled. Daniel brought up the previous
discussion around unifying capabilities + parameters and passing it all
via the migrate (or snapshot in this case) command arguments. I'm
looking into that now.
> Fabiano Rosas <farosas@suse.de> writes:
>
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>
>>> On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote:
>>>> It has always been possible to enable arbitrary migration capabilities
>>>> and attempt to take a snapshot of the VM with the savevm/loadvm
>>>> commands as well as their QMP counterparts
>>>> snapshot-save/snapshot-load.
>>>>
>>>> Most migration capabilities are not meant to be used with snapshots
>>>> and there's a risk of crashing QEMU or producing incorrect
>>>> behavior. Ideally, every migration capability would either be
>>>> implemented for savevm or explicitly rejected.
>>>
>>> IMHO, this a prime example of why migration config shouldn't be held
>>> as global state, and instead passed as parameters to the commands
>>> that need them. The snapshot-save/load commands would then only
>>> be able to accept what few settings are actually relevant, instead
>>> of inheriting any/all global migration state.
>>>
>>
>> Right, I remember we got caught around the fact that some migration
>> options are needed during runtime as well... but I don't remember the
>> details, let try to find that thread.
>>
>
> Found it: https://lore.kernel.org/r/ZVM5xmsaE41WJYgb@redhat.com
>
> I don't think it's *too* hard to start passing the configuration to
> qmp_migrate & friends. We just need to figure out a path for the
> compatibility.
>
> I'm thiking of:
>
> 1) Unifying capabilities and parameters in a MigrationConfig
> structure. We take the opportunity to fix the tls options to 'str'
> instead of StrOrNull.
>
> 2) Deprecate migrate-set-capabilities. There are no capabilities
> anymore.
>
> 3) Deprecate migate-set-parameters. There are no parameters
> anymore. Alternatively, reuse the existing command, but have it take the
> additional capabilities as optional (everything else is already
> optional).
>
> 4) Depending on what we do on (3), add a new migrate-set-config command
> that sets every option. All as optional. This would be nice because we
> wouldn't need to worry about breaking compat on the tls options, we just
> define the new command in the correct way.
>
> 5) Add a {'*config': MigrationConfig} entry to qmp_migrate and
> migrate_set_{config|parameters}. Here is where I have questions, because
> ideally we'd have a way to limit the migrate_set_config command to only
> the options that can be set at runtime. But I can't see a way of doing
> that without the duplication of the options in the QAPI .json file. I'm
> inclined to allow the whole set of options and do some tracking on the
> side in options.c in the migration code.
>
> (same issue for savevm really. To allow it to (say) work with
> mapped-ram, we'd need a duplicate mapped-ram entry in migration.json)
>
> About (2) and (3). If we use this unified MigrationConfig, I can keep
> the old commands working (along with the query_* variants), by defining
> a compat function that converts from those commands specific format into
> the new format. But then there's the question of what do we do when a
> new capability/parameter comes along? Can we declare that the old
> commands will not see the new data and that's it? If there's no
> distinction between caps and params anymore, there isn't even a way to
> decide which command to use.
>
Fabiano Rosas <farosas@suse.de> writes:
> Fabiano Rosas <farosas@suse.de> writes:
>
> +Cc Markus
>
> context:
> This series was trying to stop savevm from crashing when arbitrary
> migration capabilities are enabled. Daniel brought up the previous
> discussion around unifying capabilities + parameters and passing it all
> via the migrate (or snapshot in this case) command arguments. I'm
> looking into that now.
>
>> Fabiano Rosas <farosas@suse.de> writes:
>>
>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>
>>>> On Thu, Mar 27, 2025 at 11:39:31AM -0300, Fabiano Rosas wrote:
>>>>> It has always been possible to enable arbitrary migration capabilities
>>>>> and attempt to take a snapshot of the VM with the savevm/loadvm
>>>>> commands as well as their QMP counterparts
>>>>> snapshot-save/snapshot-load.
>>>>>
>>>>> Most migration capabilities are not meant to be used with snapshots
>>>>> and there's a risk of crashing QEMU or producing incorrect
>>>>> behavior. Ideally, every migration capability would either be
>>>>> implemented for savevm or explicitly rejected.
>>>>
>>>> IMHO, this a prime example of why migration config shouldn't be held
>>>> as global state, and instead passed as parameters to the commands
>>>> that need them. The snapshot-save/load commands would then only
>>>> be able to accept what few settings are actually relevant, instead
>>>> of inheriting any/all global migration state.
>>>>
>>>
>>> Right, I remember we got caught around the fact that some migration
>>> options are needed during runtime as well... but I don't remember the
>>> details, let try to find that thread.
>>>
>>
>> Found it: https://lore.kernel.org/r/ZVM5xmsaE41WJYgb@redhat.com
>>
>> I don't think it's *too* hard to start passing the configuration to
>> qmp_migrate & friends. We just need to figure out a path for the
>> compatibility.
>>
>> I'm thiking of:
>>
>> 1) Unifying capabilities and parameters in a MigrationConfig
>> structure. We take the opportunity to fix the tls options to 'str'
>> instead of StrOrNull.
>>
>> 2) Deprecate migrate-set-capabilities. There are no capabilities
>> anymore.
>>
>> 3) Deprecate migate-set-parameters. There are no parameters
>> anymore. Alternatively, reuse the existing command, but have it take the
>> additional capabilities as optional (everything else is already
>> optional).
>>
>> 4) Depending on what we do on (3), add a new migrate-set-config command
>> that sets every option. All as optional. This would be nice because we
>> wouldn't need to worry about breaking compat on the tls options, we just
>> define the new command in the correct way.
>>
>> 5) Add a {'*config': MigrationConfig} entry to qmp_migrate and
>> migrate_set_{config|parameters}. Here is where I have questions, because
>> ideally we'd have a way to limit the migrate_set_config command to only
>> the options that can be set at runtime. But I can't see a way of doing
>> that without the duplication of the options in the QAPI .json file. I'm
>> inclined to allow the whole set of options and do some tracking on the
>> side in options.c in the migration code.
>>
>> (same issue for savevm really. To allow it to (say) work with
>> mapped-ram, we'd need a duplicate mapped-ram entry in migration.json)
>>
>> About (2) and (3). If we use this unified MigrationConfig, I can keep
>> the old commands working (along with the query_* variants), by defining
>> a compat function that converts from those commands specific format into
>> the new format. But then there's the question of what do we do when a
>> new capability/parameter comes along? Can we declare that the old
>> commands will not see the new data and that's it? If there's no
>> distinction between caps and params anymore, there isn't even a way to
>> decide which command to use.
>>
Hi all, please disregard my messages on this thread. I've posted a
series which has a cover letter that explains the situation better:
[RFC PATCH 00/13] migration: Unify capabilities and parameters
https://lore.kernel.org/r/20250411191443.22565-1-farosas@suse.de
© 2016 - 2026 Red Hat, Inc.