[PATCH v8 2/6] target/arm/machine: Take account cpreg mig tolerances in case of mismatch

Eric Auger posted 6 patches 1 month, 1 week ago
There is a newer version of this series
Posted by Eric Auger 1 month, 1 week ago
If there is a mismatch between the cpreg indexes found on both ends,
check whether a tolerance was registered for the given kvmidx. If one
was, silence the warning or error.

Create dedicated helper functions that print the name of the culprit reg
and check whether a tolerance is set. Set the trace level accordingly
and decide whether the migration must fail.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
 target/arm/machine.c    | 30 ++++++++++++++++++++----------
 target/arm/trace-events |  2 ++
 2 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/target/arm/machine.c b/target/arm/machine.c
index 476dad00ee7..13b8a0b2220 100644
--- a/target/arm/machine.c
+++ b/target/arm/machine.c
@@ -1065,24 +1065,34 @@ static void handle_cpreg_missing_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx
 {
     g_autofree gchar *name = print_register_name(kvmidx);
 
-    warn_report("%s: %s "
-                "expected by the destination but not in the incoming stream: "
-                 "skip it", __func__, name);
+    if (!arm_cpu_cpreg_has_mig_tolerance(cpu, kvmidx,
+                                         0, 0, ToleranceNotOnBothEnds)) {
+        warn_report("%s: %s "
+                    "expected by the destination but not in the incoming stream: "
+                     "skip it", __func__, name);
+    } else {
+        trace_tolerate_cpreg_missing_in_incoming_stream(name);
+    }
 }
 
 /*
- * Handle the situation where @kvmidx is in the incoming stream
- * but not on destination. This currently fails the migration but
- * we plan to accomodate some exceptions, hence the boolean returned value.
+ * Handle the situation where @kvmidx is in the incoming
+ * stream but not on destination. This fails the migration if
+ * no cpreg mig tolerance is set for this @kvmidx
  */
 static bool handle_cpreg_only_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx)
 {
     g_autofree gchar *name = print_register_name(kvmidx);
-    bool fail = true;
-
-    error_report("%s: %s in the incoming stream but unknown on the "
-                 "destination: fail migration", __func__, name);
+    bool fail = false;
 
+    if (!arm_cpu_cpreg_has_mig_tolerance(cpu, kvmidx,
+                                         0, 0, ToleranceNotOnBothEnds)) {
+        error_report("%s: %s in the incoming stream but unknown on the "
+                     "destination: fail migration", __func__, name);
+        fail = true;
+    } else {
+        trace_tolerate_cpreg_only_in_incoming_stream(name);
+    }
     return fail;
 }
 
diff --git a/target/arm/trace-events b/target/arm/trace-events
index 2de0406f784..8502fb3265c 100644
--- a/target/arm/trace-events
+++ b/target/arm/trace-events
@@ -29,3 +29,5 @@ arm_psci_call(uint64_t x0, uint64_t x1, uint64_t x2, uint64_t x3, uint32_t cpuid
 
 # machine.c
 cpu_post_load(uint32_t cpreg_vmstate_array_len, uint32_t cpreg_array_len) "cpreg_vmstate_array_len=%d cpreg_array_len=%d"
+tolerate_cpreg_missing_in_incoming_stream(char *name) "%s is missing in incoming stream but this is explicitly tolerated"
+tolerate_cpreg_only_in_incoming_stream(char *name) "%s is in incoming stream but not on destination but this is explicitly tolerated"
-- 
2.53.0
Re: [PATCH v8 2/6] target/arm/machine: Take account cpreg mig tolerances in case of mismatch
Posted by Sebastian Ott 1 month ago
On Wed, 4 Mar 2026, Eric Auger wrote:
> If there is a mismatch between the cpreg indexes found on both ends,
> check whether a tolerance was registered for the given kvmidx. If any,
> silence warning/errors.
>
> Create dedicated helper functions that print the name of the culprit reg
> and analyze whether a tolerance is set. According set the level of traces
> and analyze whether the migration must eventually fail.
>
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> ---
> target/arm/machine.c    | 30 ++++++++++++++++++++----------
> target/arm/trace-events |  2 ++
> 2 files changed, 22 insertions(+), 10 deletions(-)
>
> diff --git a/target/arm/machine.c b/target/arm/machine.c
> index 476dad00ee7..13b8a0b2220 100644
> --- a/target/arm/machine.c
> +++ b/target/arm/machine.c
> @@ -1065,24 +1065,34 @@ static void handle_cpreg_missing_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx
> {
>     g_autofree gchar *name = print_register_name(kvmidx);
>
> -    warn_report("%s: %s "
> -                "expected by the destination but not in the incoming stream: "
> -                 "skip it", __func__, name);
> +    if (!arm_cpu_cpreg_has_mig_tolerance(cpu, kvmidx,
> +                                         0, 0, ToleranceNotOnBothEnds)) {

I don't get this. You apply a different tolerance type than the one
(potentially) registered. This check will only fail when there was no
tolerance registered for this kvmidx. But this can be changed by someone
registering a different type.

Re: [PATCH v8 2/6] target/arm/machine: Take account cpreg mig tolerances in case of mismatch
Posted by Eric Auger 1 month ago

On 3/6/26 11:51 AM, Sebastian Ott wrote:
> On Wed, 4 Mar 2026, Eric Auger wrote:
>> If there is a mismatch between the cpreg indexes found on both ends,
>> check whether a tolerance was registered for the given kvmidx. If any,
>> silence warning/errors.
>>
>> Create dedicated helper functions that print the name of the culprit reg
>> and analyze whether a tolerance is set. According set the level of
>> traces
>> and analyze whether the migration must eventually fail.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> target/arm/machine.c    | 30 ++++++++++++++++++++----------
>> target/arm/trace-events |  2 ++
>> 2 files changed, 22 insertions(+), 10 deletions(-)
>>
>> diff --git a/target/arm/machine.c b/target/arm/machine.c
>> index 476dad00ee7..13b8a0b2220 100644
>> --- a/target/arm/machine.c
>> +++ b/target/arm/machine.c
>> @@ -1065,24 +1065,34 @@ static void
>> handle_cpreg_missing_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx
>> {
>>     g_autofree gchar *name = print_register_name(kvmidx);
>>
>> -    warn_report("%s: %s "
>> -                "expected by the destination but not in the incoming
>> stream: "
>> -                 "skip it", __func__, name);
>> +    if (!arm_cpu_cpreg_has_mig_tolerance(cpu, kvmidx,
>> +                                         0, 0,
>> ToleranceNotOnBothEnds)) {
>
> I don't get this. You apply a different tolerance type than the one
> (potentially) registered. This check will only fail when there was no
> tolerance registered for this kvmidx. But this can be changed by someone
> registering a different type. 

Note that at the moment only one tolerance can be registered per kvmidx.

Eric

Re: [PATCH v8 2/6] target/arm/machine: Take account cpreg mig tolerances in case of mismatch
Posted by Eric Auger 1 month ago

On 3/6/26 11:51 AM, Sebastian Ott wrote:
> On Wed, 4 Mar 2026, Eric Auger wrote:
>> If there is a mismatch between the cpreg indexes found on both ends,
>> check whether a tolerance was registered for the given kvmidx. If any,
>> silence warning/errors.
>>
>> Create dedicated helper functions that print the name of the culprit reg
>> and analyze whether a tolerance is set. According set the level of
>> traces
>> and analyze whether the migration must eventually fail.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> ---
>> target/arm/machine.c    | 30 ++++++++++++++++++++----------
>> target/arm/trace-events |  2 ++
>> 2 files changed, 22 insertions(+), 10 deletions(-)
>>
>> diff --git a/target/arm/machine.c b/target/arm/machine.c
>> index 476dad00ee7..13b8a0b2220 100644
>> --- a/target/arm/machine.c
>> +++ b/target/arm/machine.c
>> @@ -1065,24 +1065,34 @@ static void
>> handle_cpreg_missing_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx
>> {
>>     g_autofree gchar *name = print_register_name(kvmidx);
>>
>> -    warn_report("%s: %s "
>> -                "expected by the destination but not in the incoming
>> stream: "
>> -                 "skip it", __func__, name);
>> +    if (!arm_cpu_cpreg_has_mig_tolerance(cpu, kvmidx,
>> +                                         0, 0,
>> ToleranceNotOnBothEnds)) {
>
> I don't get this. You apply a different tolerance type than the one
> (potentially) registered. This check will only fail when there was no
> tolerance registered for this kvmidx. But this can be changed by someone
> registering a different type. 

OK, so we know the cpreg is missing from the incoming stream. If there is
a tolerance of type ToleranceNotOnBothEnds for this kvmidx, that's fine:
we just trace and do not report any warning. Otherwise we warn_report
the issue but do not fail.
Does that clarify?

Eric
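
The two outcomes described above can be modeled with a small standalone
sketch. This is not the QEMU code: the Tolerance table, has_tolerance(),
the enum values and the example indexes are hypothetical stand-ins, and
the sketch assumes the lookup matches on both kvmidx and tolerance type.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the QEMU definitions. */
typedef enum {
    TOLERANCE_NOT_ON_BOTH_ENDS,
    TOLERANCE_OTHER,            /* some other, unrelated tolerance type */
} ToleranceType;

typedef struct {
    uint64_t kvmidx;
    ToleranceType type;
} Tolerance;

/* Example table with at most one tolerance per kvmidx. */
static const Tolerance tolerances[] = {
    { 0x1000, TOLERANCE_NOT_ON_BOTH_ENDS },
    { 0x3000, TOLERANCE_OTHER },
};

/* Assumption: a tolerance applies only if kvmidx and type both match. */
static bool has_tolerance(uint64_t kvmidx, ToleranceType type)
{
    for (size_t i = 0; i < sizeof(tolerances) / sizeof(tolerances[0]); i++) {
        if (tolerances[i].kvmidx == kvmidx && tolerances[i].type == type) {
            return true;
        }
    }
    return false;
}

/*
 * Register expected by the destination but absent from the stream.
 * Never fails the migration; only the verbosity changes.
 * Returns true if the migration must fail (always false here).
 */
static bool missing_in_incoming_stream(uint64_t kvmidx)
{
    if (!has_tolerance(kvmidx, TOLERANCE_NOT_ON_BOTH_ENDS)) {
        fprintf(stderr, "warning: 0x%" PRIx64 " not in stream: skip it\n",
                kvmidx);
    }
    return false;
}

/*
 * Register present in the stream but unknown on the destination.
 * Returns true (fail the migration) unless the mismatch is tolerated.
 */
static bool only_in_incoming_stream(uint64_t kvmidx)
{
    if (has_tolerance(kvmidx, TOLERANCE_NOT_ON_BOTH_ENDS)) {
        return false;
    }
    fprintf(stderr, "error: 0x%" PRIx64 " unknown on destination\n", kvmidx);
    return true;
}
```

Under these assumptions, index 0x3000 still fails in
only_in_incoming_stream() because its registered tolerance has a
different type, which is the case Sebastian's question is about.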

Re: [PATCH v8 2/6] target/arm/machine: Take account cpreg mig tolerances in case of mismatch
Posted by Sebastian Ott 1 month ago
On Fri, 6 Mar 2026, Eric Auger wrote:
> On 3/6/26 11:51 AM, Sebastian Ott wrote:
>> On Wed, 4 Mar 2026, Eric Auger wrote:
>>> If there is a mismatch between the cpreg indexes found on both ends,
>>> check whether a tolerance was registered for the given kvmidx. If any,
>>> silence warning/errors.
>>>
>>> Create dedicated helper functions that print the name of the culprit reg
>>> and analyze whether a tolerance is set. According set the level of
>>> traces
>>> and analyze whether the migration must eventually fail.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>> ---
>>> target/arm/machine.c    | 30 ++++++++++++++++++++----------
>>> target/arm/trace-events |  2 ++
>>> 2 files changed, 22 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/target/arm/machine.c b/target/arm/machine.c
>>> index 476dad00ee7..13b8a0b2220 100644
>>> --- a/target/arm/machine.c
>>> +++ b/target/arm/machine.c
>>> @@ -1065,24 +1065,34 @@ static void
>>> handle_cpreg_missing_in_incoming_stream(ARMCPU *cpu, uint64_t kvmidx
>>> {
>>>     g_autofree gchar *name = print_register_name(kvmidx);
>>>
>>> -    warn_report("%s: %s "
>>> -                "expected by the destination but not in the incoming
>>> stream: "
>>> -                 "skip it", __func__, name);
>>> +    if (!arm_cpu_cpreg_has_mig_tolerance(cpu, kvmidx,
>>> +                                         0, 0,
>>> ToleranceNotOnBothEnds)) {
>>
>> I don't get this. You apply a different tolerance type than the one
>> (potentially) registered. This check will only fail when there was no
>> tolerance registered for this kvmidx. But this can be changed by someone
>> registering a different type. 
>
> OK so we know the cpreg is missing in the input stream. If there is a
> tolerance of type ToleranceNotOnBothEnds for this kvmidx, that's fine,
> we just trace and do not report any warning. Otherwise we warn_report
> the issue but do not fail.
> Does it clarify?

Yes, it does.

It just feels a bit odd that we identify specific issues when comparing
destination vs incoming stream and then test whether that specific issue
is tolerated. With the infrastructure from patch 1 I would have expected
arm_cpu_cpreg_has_mig_tolerance() to handle the specific issues
already, something like:

if (dest[i] == incoming[i]) {
 	/* all good */
} else if (arm_cpu_cpreg_has_mig_tolerance()) {
 	/* all good */
} else {
 	fail();
}


...but I guess I'm just missing something.

Sebastian
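
For comparison, the unified flow suggested above could look roughly like
the following standalone sketch. It is only an illustration: the function
name, the callback type and the example predicate are hypothetical, and it
assumes both index lists are sorted in ascending order with no duplicates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate: true if a mismatch on kvmidx is tolerated. */
typedef bool (*tolerance_fn)(uint64_t kvmidx);

/*
 * Walk two sorted index lists in lockstep. Any index present on one
 * side only must be tolerated, otherwise the lists are incompatible.
 * Returns true if the migration may proceed.
 */
static bool cpreg_lists_compatible(const uint64_t *dest, size_t ndest,
                                   const uint64_t *in, size_t nin,
                                   tolerance_fn tolerated)
{
    size_t i = 0, j = 0;

    while (i < ndest || j < nin) {
        if (i < ndest && j < nin && dest[i] == in[j]) {
            /* Same register on both ends: all good. */
            i++;
            j++;
            continue;
        }
        /* The smaller index exists on one side only. */
        bool dest_only = (j == nin) || (i < ndest && dest[i] < in[j]);
        uint64_t idx = dest_only ? dest[i++] : in[j++];

        if (!tolerated(idx)) {
            return false;
        }
    }
    return true;
}

/* Example predicate: only index 0x20 is tolerated. */
static bool tolerate_0x20(uint64_t kvmidx)
{
    return kvmidx == 0x20;
}
```

This sketch treats both directions identically, whereas the patch only
warns, and never fails, when a register is missing from the incoming
stream. That difference may be one reason for keeping two separate
handlers.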