Hi,
On 9/11/25 3:40 PM, Eric Auger wrote:
> When migrating ARM guests across identical machines with different host
> kernels we are likely to encounter failures such as:
>
> "failed to load cpu:cpreg_vmstate_array_len"
>
> This happens because KVM exposes a different number of registers
> to qemu on source and destination. When trying to migrate a bigger
> register set to a smaller one, qemu cannot load the CPU state on the
> destination.
>
> For example, we recently faced this kind of situation with:
> - unconditional exposure of the KVM_REG_ARM_VENDOR_HYP_BMAP_2 FW pseudo
> register from v6.16 onwards. Causes backward migration failure.
> - removal of the unconditional exposure of TCR2_EL1, PIRE0_EL1, PIR_EL1
> from v6.13 onwards. Causes forward migration failure.
>
> More details can be found in individual patches.
>
> This situation is really problematic for distributions which want to
> guarantee forward and backward migration of a given machine type
> between different releases.
>
> This small series tries to address that issue by introducing CPU
> array properties that list the registers to ignore or to fake according
> to the situation. An example is given to illustrate how those props
> could be used to apply compats for machine types supposed to "see" the
> same register set across various host kernels.
>
> Obviously this is a last resort and should be avoided as much as
> possible.
Gentle ping. Any other comments or advice on how to mitigate this kind
of issue?
I think I will split the series. Although both parts try to address the
same "failed to load cpu:cpreg_vmstate_array_len" class of error, hiding
and faking KVM registers have different side effects and risks, so it
may be better to handle them separately.
I forgot to mention that when registers disappear without notice or a
KVM knob, one could argue that the easiest fix is to backport the change
to older kernels. However, there will always be a point where a VM
running on a host kernel without the fix cannot be live migrated to a
kernel with the backport, which I think defeats the purpose of live
migration.
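
To make the two failure cases from the cover letter concrete, here is a
minimal Python model of the register-set mismatch. It is only a sketch
under stated assumptions, not QEMU's actual code: registers are
represented by name instead of 64-bit KVM register indexes, the helpers
can_load() and effective_set() are hypothetical, and the "base" registers
are arbitrary placeholders. The hidden/faked arguments only illustrate
the idea of the proposed properties, not their real names or syntax.

```python
# Simplified model of the register-set check; NOT QEMU's implementation.
# Registers are identified by name here purely for readability.

def can_load(incoming, destination):
    """Return True if every incoming register is known to the destination."""
    return set(incoming) <= set(destination)


def effective_set(kvm_regs, hidden=(), faked=()):
    """Register set presented after applying hypothetical hide/fake lists."""
    return (set(kvm_regs) - set(hidden)) | set(faked)


# Arbitrary placeholder registers common to both kernels (assumption).
BASE = {"SCTLR_EL1", "TCR_EL1"}

# Backward migration: the newer kernel exposes one extra pseudo register.
extra = {"KVM_REG_ARM_VENDOR_HYP_BMAP_2"}
new_host = BASE | extra
old_host = BASE
backward_plain = can_load(new_host, old_host)
backward_hidden = can_load(effective_set(new_host, hidden=extra), old_host)

# Forward migration: the newer kernel no longer exposes some registers.
dropped = {"TCR2_EL1", "PIRE0_EL1", "PIR_EL1"}
older_host = BASE | dropped
newer_host = BASE
forward_plain = can_load(older_host, newer_host)
forward_faked = can_load(older_host, effective_set(newer_host, faked=dropped))

print(backward_plain, backward_hidden, forward_plain, forward_faked)
```

In this model the unmodified migrations fail in both directions, while
hiding the extra register on the newer source, or faking the dropped
registers on the newer destination, lets the sets line up again. It says
nothing about the side effects of either approach, which is exactly why
I would rather discuss them separately.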
Best Regards
Eric
>
> Eric Auger (3):
> target/arm/cpu: Add new CPU property for KVM regs to hide
> target/arm/kvm: Add new CPU property for KVM regs to enforce
> hw/arm/virt: [DO NOT UPSTREAM] Enforce compatibility with older
> kernels
>
> target/arm/cpu.h | 15 +++++++
> hw/arm/virt.c | 19 ++++++++
> target/arm/kvm.c | 99 +++++++++++++++++++++++++++++++++++++++--
> target/arm/trace-events | 6 +++
> 4 files changed, 135 insertions(+), 4 deletions(-)
>