[RFC PATCH 04/13] migration: Fix parameter validation

Fabiano Rosas posted 13 patches 7 months, 1 week ago
There is a newer version of this series
[RFC PATCH 04/13] migration: Fix parameter validation
Posted by Fabiano Rosas 7 months, 1 week ago
The migration parameters validation involves producing a temporary
structure which merges the current parameter values with the new
parameters set by the user.

The has_ boolean fields of MigrateSetParameter are taken into
consideration when writing the temporary structure, however the copy
of the current parameters also copies all the has_ fields of
s->parameters and those are (almost) all true due to being initialized
by migrate_params_init().

Since the temporary structure copy does not carry over the has_ fields
from MigrateSetParameters, only the values which were initialized in
migrate_params_init() will end up being validated. This causes
(almost) all of the migration parameters to be validated again every
time a parameter is set, which could be considered a bug. But it also
skips validation of those values which are not set in
migrate_params_init(), which is a worse issue.

Fix by initializing the missing values in migrate_params_init().
Currently 'avail_switchover_bandwidth' and 'block_bitmap_mapping' are
affected.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/options.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration/options.c b/migration/options.c
index cac28540dd..625d597a85 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -987,6 +987,8 @@ void migrate_params_init(MigrationParameters *params)
     params->has_mode = true;
     params->has_zero_page_detection = true;
     params->has_direct_io = true;
+    params->has_avail_switchover_bandwidth = true;
+    params->has_block_bitmap_mapping = true;
 }
 
 /*
-- 
2.35.3
Re: [RFC PATCH 04/13] migration: Fix parameter validation
Posted by Peter Xu 6 months ago
On Fri, Apr 11, 2025 at 04:14:34PM -0300, Fabiano Rosas wrote:
> The migration parameters validation involves producing a temporary
> structure which merges the current parameter values with the new
> parameters set by the user.
> 
> The has_ boolean fields of MigrateSetParameter are taken into
> consideration when writing the temporary structure, however the copy
> of the current parameters also copies all the has_ fields of
> s->parameters and those are (almost) all true due to being initialized
> by migrate_params_init().
> 
> Since the temporary structure copy does not carry over the has_ fields
> from MigrateSetParameters, only the values which were initialized in
> migrate_params_init() will end up being validated. This causes
> (almost) all of the migration parameters to be validated again every
> time a parameter is set, which could be considered a bug. But it also
> skips validation of those values which are not set in
> migrate_params_init(), which is a worse issue.

IMHO it's ok to double check all parameters in slow path.  Definitely not
ok to skip them.. So now the question is, if migrate_params_test_apply() so
far should check all params anyway...

Shall we drop the checking for all has_ there, then IIUC we also don't need
any initializations for has_* in migrate_params_init() here?

So, admittedly s->parameters.has_* is still ugly to be present.. we declare
all of them not used and ignore them always at least in s->parameters.

> 
> Fix by initializing the missing values in migrate_params_init().
> Currently 'avail_switchover_bandwidth' and 'block_bitmap_mapping' are
> affected.
> 
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/options.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/migration/options.c b/migration/options.c
> index cac28540dd..625d597a85 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -987,6 +987,8 @@ void migrate_params_init(MigrationParameters *params)
>      params->has_mode = true;
>      params->has_zero_page_detection = true;
>      params->has_direct_io = true;
> +    params->has_avail_switchover_bandwidth = true;
> +    params->has_block_bitmap_mapping = true;
>  }
>  
>  /*
> -- 
> 2.35.3
> 

-- 
Peter Xu
Re: [RFC PATCH 04/13] migration: Fix parameter validation
Posted by Fabiano Rosas 5 months, 3 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Fri, Apr 11, 2025 at 04:14:34PM -0300, Fabiano Rosas wrote:
>> The migration parameters validation involves producing a temporary
>> structure which merges the current parameter values with the new
>> parameters set by the user.
>> 
>> The has_ boolean fields of MigrateSetParameter are taken into
>> consideration when writing the temporary structure, however the copy
>> of the current parameters also copies all the has_ fields of
>> s->parameters and those are (almost) all true due to being initialized
>> by migrate_params_init().
>> 
>> Since the temporary structure copy does not carry over the has_ fields
>> from MigrateSetParameters, only the values which were initialized in
>> migrate_params_init() will end up being validated. This causes
>> (almost) all of the migration parameters to be validated again every
>> time a parameter is set, which could be considered a bug. But it also
>> skips validation of those values which are not set in
>> migrate_params_init(), which is a worse issue.
>
> IMHO it's ok to double check all parameters in slow path.  Definitely not
> ok to skip them.. So now the question is, if migrate_params_test_apply() so
> far should check all params anyway...
>

Well, either way is fine by me. In the current code, I can't tell what
the intention was, unfortunately.

We could check all params, but then we need to make sure they never
change in between calls to migrate-set-params. Looking at the code I
don't see any place where we allow s->parameters to change.

But then I worry about checks that:
- might be too costly
- only make sense the first time
- depend on the order of setting (a param/cap that should only be set
before/after some other param/cap)

> Shall we drop the checking for all has_ there, then IIUC we also don't need
> any initializations for has_* in migrate_params_init() here?
>

It'd be nice to not have to touch the has_ fields, yes. I'll experiment
with it.

> So, admittedly s->parameters.has_* is still ugly to be present.. we declare
> all of them not used and ignore them always at least in s->parameters.
>
>> 
>> Fix by initializing the missing values in migrate_params_init().
>> Currently 'avail_switchover_bandwidth' and 'block_bitmap_mapping' are
>> affected.
>> 
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>> ---
>>  migration/options.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/migration/options.c b/migration/options.c
>> index cac28540dd..625d597a85 100644
>> --- a/migration/options.c
>> +++ b/migration/options.c
>> @@ -987,6 +987,8 @@ void migrate_params_init(MigrationParameters *params)
>>      params->has_mode = true;
>>      params->has_zero_page_detection = true;
>>      params->has_direct_io = true;
>> +    params->has_avail_switchover_bandwidth = true;
>> +    params->has_block_bitmap_mapping = true;
>>  }
>>  
>>  /*
>> -- 
>> 2.35.3
>>
Re: [RFC PATCH 04/13] migration: Fix parameter validation
Posted by Fabiano Rosas 5 months, 3 weeks ago
Fabiano Rosas <farosas@suse.de> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> On Fri, Apr 11, 2025 at 04:14:34PM -0300, Fabiano Rosas wrote:
>>> The migration parameters validation involves producing a temporary
>>> structure which merges the current parameter values with the new
>>> parameters set by the user.
>>> 
>>> The has_ boolean fields of MigrateSetParameter are taken into
>>> consideration when writing the temporary structure, however the copy
>>> of the current parameters also copies all the has_ fields of
>>> s->parameters and those are (almost) all true due to being initialized
>>> by migrate_params_init().
>>> 
>>> Since the temporary structure copy does not carry over the has_ fields
>>> from MigrateSetParameters, only the values which were initialized in
>>> migrate_params_init() will end up being validated. This causes
>>> (almost) all of the migration parameters to be validated again every
>>> time a parameter is set, which could be considered a bug. But it also
>>> skips validation of those values which are not set in
>>> migrate_params_init(), which is a worse issue.
>>
>> IMHO it's ok to double check all parameters in slow path.  Definitely not
>> ok to skip them.. So now the question is, if migrate_params_test_apply() so
>> far should check all params anyway...

Actually, this doesn't work...

The migrate-set-* commands have optional fields, so we need some form of
checking has_* to know which fields the user is setting. Otherwise
MigrationSetParameters will have zeros all over that will trip the
check.

Then, we need some form of checking has_* to be able to enventually get
the values into s->config (former s->parameters/capabilities), otherwise
we'll overwrite the already-set values with the potentially empty ones
coming from QAPI.

Then there's also the issue of knowing whether a value is 0 because the
user set it 0 or because it was never set.

We also can't apply an invalid value to s->config and validate it after
because some parameters are allowed to be changed during migration.
Re: [RFC PATCH 04/13] migration: Fix parameter validation
Posted by Peter Xu 5 months, 3 weeks ago
On Thu, May 22, 2025 at 02:39:48PM -0300, Fabiano Rosas wrote:
> Actually, this doesn't work...
> 
> The migrate-set-* commands have optional fields, so we need some form of
> checking has_* to know which fields the user is setting. Otherwise
> MigrationSetParameters will have zeros all over that will trip the
> check.
> 
> Then, we need some form of checking has_* to be able to enventually get
> the values into s->config (former s->parameters/capabilities), otherwise
> we'll overwrite the already-set values with the potentially empty ones
> coming from QAPI.
> 
> Then there's also the issue of knowing whether a value is 0 because the
> user set it 0 or because it was never set.
> 
> We also can't apply an invalid value to s->config and validate it after
> because some parameters are allowed to be changed during migration.

What I meant was we only conditionally ignore the has_* fields in below:

  (1) migrate_params_check(), so that QEMU always checks all parameters in
      the MigrationParameters* specified when invoking the function.

  (2) MigrationState.parameters, so that as long as the parameters are
      applied (it should only happen after sanity check all pass..) then we
      ignore these has_* fields (until MigrationState.parameters can have a
      better struct to not include these has_* fields).

We can keep the has_* checks in migrate_params_test_apply() and
migrate_params_apply(), so that we won't touch the ones the user didn't
specify in the QMP commands as you said.

The benefits of having above 1/2 ignoring has_* is some code removal where
we assume has_* always are TRUEs.

This can be still a bit confusing, but at least we don't need to init has_*
fields in migrate_params_init() anymore as they'll be all ignored, then
there's no chance we forget set TRUEs to any new params either.

-- 
Peter Xu
Re: [RFC PATCH 04/13] migration: Fix parameter validation
Posted by Fabiano Rosas 5 months, 3 weeks ago
Peter Xu <peterx@redhat.com> writes:

> On Thu, May 22, 2025 at 02:39:48PM -0300, Fabiano Rosas wrote:
>> Actually, this doesn't work...
>> 
>> The migrate-set-* commands have optional fields, so we need some form of
>> checking has_* to know which fields the user is setting. Otherwise
>> MigrationSetParameters will have zeros all over that will trip the
>> check.
>> 
>> Then, we need some form of checking has_* to be able to enventually get
>> the values into s->config (former s->parameters/capabilities), otherwise
>> we'll overwrite the already-set values with the potentially empty ones
>> coming from QAPI.
>> 
>> Then there's also the issue of knowing whether a value is 0 because the
>> user set it 0 or because it was never set.
>> 
>> We also can't apply an invalid value to s->config and validate it after
>> because some parameters are allowed to be changed during migration.
>
> What I meant was we only conditionally ignore the has_* fields in below:
>
>   (1) migrate_params_check(), so that QEMU always checks all parameters in
>       the MigrationParameters* specified when invoking the function.
>

Yes, I realised later that's what you meant. I'm looking into
it. Hitting some issues with the block_bitmap_mapping option, which
seems to be able to become NULL even if has_block_bitmap_mapping is
true. According to commit 3cba22c9ad ("migration: Fix
block_bitmap_mapping migration").

>   (2) MigrationState.parameters, so that as long as the parameters are
>       applied (it should only happen after sanity check all pass..) then we
>       ignore these has_* fields (until MigrationState.parameters can have a
>       better struct to not include these has_* fields).
>
> We can keep the has_* checks in migrate_params_test_apply() and
> migrate_params_apply(), so that we won't touch the ones the user didn't
> specify in the QMP commands as you said.
>
> The benefits of having above 1/2 ignoring has_* is some code removal where
> we assume has_* always are TRUEs.
>
> This can be still a bit confusing, but at least we don't need to init has_*
> fields in migrate_params_init() anymore as they'll be all ignored, then
> there's no chance we forget set TRUEs to any new params either.