Setting nested state upon migration needs to happen after kvm_put_sregs2()
to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
when vCPU is in VMX root operation, certain CR bits are locked and
kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
kvm_put_nested_state() before kvm_put_sregs2() is OK; this ensures
that the vCPU is *not* in VMX root operation.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
target/i386/kvm/kvm.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4f8dacc1d4b5..73e3880fa57b 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
-    /* must be before kvm_put_nested_state so that EFER.SVME is set */
+    /*
+     * When resetting a vCPU, make sure to reset nested state first to
+     * e.g clear VMXON state and unlock certain CR4 bits.
+     */
+    if (level == KVM_PUT_RESET_STATE) {
+        ret = kvm_put_nested_state(x86_cpu);
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
     ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
     if (ret < 0) {
         return ret;
     }
 
-    if (level >= KVM_PUT_RESET_STATE) {
+    /*
+     * When putting full CPU state, kvm_put_nested_state() must happen after
+     * kvm_put_sregs{,2} so that e.g. EFER.SVME is already set.
+     */
+    if (level == KVM_PUT_FULL_STATE) {
         ret = kvm_put_nested_state(x86_cpu);
         if (ret < 0) {
             return ret;
         }
+    }
 
+    if (level >= KVM_PUT_RESET_STATE) {
         ret = kvm_put_msr_feature_control(x86_cpu);
         if (ret < 0) {
             return ret;
--
2.37.1
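
The ordering argument in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions: it does not use the KVM API, every `toy_*` name and every struct field is hypothetical, and the two failure rules are simplifications of the behaviour the commit message describes (sregs cannot be loaded while locked CR bits would be cleared, and nested state may require EFER.SVME to be set already).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's KVM_PUT_* levels (values are illustrative). */
enum toy_level {
    TOY_PUT_RUNTIME_STATE = 1,
    TOY_PUT_RESET_STATE,
    TOY_PUT_FULL_STATE
};

/* Toy model of the state currently loaded on the kernel side. */
struct toy_vcpu {
    bool vmxon; /* vCPU is in VMX root operation, so some CR bits are locked */
    bool svme;  /* EFER.SVME */
};

/* Toy model of the state userspace wants to load. */
struct toy_target {
    bool clears_locked_cr_bits; /* sregs would clear bits that VMX root locks */
    bool svme;                  /* EFER.SVME value carried by sregs */
    bool nested_vmxon;          /* nested state keeps the vCPU in VMX root */
    bool nested_needs_svme;     /* nested state is only valid with EFER.SVME */
};

/* Models kvm_put_sregs{,2}(): fails if locked CR bits would be cleared. */
static int toy_put_sregs(struct toy_vcpu *v, const struct toy_target *t)
{
    if (v->vmxon && t->clears_locked_cr_bits) {
        return -1;
    }
    v->svme = t->svme;
    return 0;
}

/* Models kvm_put_nested_state(): may depend on EFER.SVME being set. */
static int toy_put_nested_state(struct toy_vcpu *v, const struct toy_target *t)
{
    if (t->nested_needs_svme && !v->svme) {
        return -1;
    }
    v->vmxon = t->nested_vmxon;
    return 0;
}

/* Ordering before the patch: sregs first, then nested state. */
static int toy_put_registers_old(struct toy_vcpu *v,
                                 const struct toy_target *t, int level)
{
    int ret = toy_put_sregs(v, t);
    if (ret < 0) {
        return ret;
    }
    if (level >= TOY_PUT_RESET_STATE) {
        ret = toy_put_nested_state(v, t);
    }
    return ret;
}

/* Ordering after the patch: nested state first on reset, last on full put. */
static int toy_put_registers_new(struct toy_vcpu *v,
                                 const struct toy_target *t, int level)
{
    int ret;

    assert(v && t);
    if (level == TOY_PUT_RESET_STATE) {
        ret = toy_put_nested_state(v, t);
        if (ret < 0) {
            return ret;
        }
    }
    ret = toy_put_sregs(v, t);
    if (ret < 0) {
        return ret;
    }
    if (level == TOY_PUT_FULL_STATE) {
        ret = toy_put_nested_state(v, t);
        if (ret < 0) {
            return ret;
        }
    }
    return 0;
}
```

In this model, resetting a vCPU that is in VMX root operation fails with the old ordering and succeeds with the new one, while a full-state put that needs EFER.SVME for its nested state still succeeds because sregs are loaded first.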
On Wed, 2022-08-10 at 16:00 +0200, Vitaly Kuznetsov wrote:
> Setting nested state upon migration needs to happen after kvm_put_sregs2()
> to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
> when vCPU is in VMX root operation, certain CR bits are locked and
> kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
> vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
> kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
> that vCPU is *not* in VMX root operation.
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
> target/i386/kvm/kvm.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 4f8dacc1d4b5..73e3880fa57b 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>
>      assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
>
> -    /* must be before kvm_put_nested_state so that EFER.SVME is set */
> +    /*
> +     * When resetting a vCPU, make sure to reset nested state first to
> +     * e.g clear VMXON state and unlock certain CR4 bits.
> +     */
> +    if (level == KVM_PUT_RESET_STATE) {
> +        ret = kvm_put_nested_state(x86_cpu);
> +        if (ret < 0) {
> +            return ret;
> +        }
I should have mentioned this, I actually already debugged the same issue while
trying to reproduce the smm int window bug.
100% my fault.
I also share the same feeling that this might be yet another 'whack a mole' and
break somewhere else, but overall it does make sense.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Best regards,
Maxim Levitsky
> +    }
> +
>      ret = has_sregs2 ? kvm_put_sregs2(x86_cpu) : kvm_put_sregs(x86_cpu);
>      if (ret < 0) {
>          return ret;
>      }
>
> -    if (level >= KVM_PUT_RESET_STATE) {
> +    /*
> +     * When putting full CPU state, kvm_put_nested_state() must happen after
> +     * kvm_put_sregs{,2} so that e.g. EFER.SVME is already set.
> +     */
> +    if (level == KVM_PUT_FULL_STATE) {
>          ret = kvm_put_nested_state(x86_cpu);
>          if (ret < 0) {
>              return ret;
>          }
> +    }
>
> +    if (level >= KVM_PUT_RESET_STATE) {
>          ret = kvm_put_msr_feature_control(x86_cpu);
>          if (ret < 0) {
>              return ret;
Maxim Levitsky <mlevitsk@redhat.com> writes:
> On Wed, 2022-08-10 at 16:00 +0200, Vitaly Kuznetsov wrote:
>> Setting nested state upon migration needs to happen after kvm_put_sregs2()
>> to e.g. have EFER.SVME set. This, however, doesn't work for vCPU reset:
>> when vCPU is in VMX root operation, certain CR bits are locked and
>> kvm_put_sregs2() may fail. As nested state is fully cleaned up upon
>> vCPU reset (kvm_arch_reset_vcpu() -> kvm_init_nested_state()), calling
>> kvm_put_nested_state() before kvm_put_sregs2() is OK, this will ensure
>> that vCPU is *not* in VMX root operation.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>> target/i386/kvm/kvm.c | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
>> index 4f8dacc1d4b5..73e3880fa57b 100644
>> --- a/target/i386/kvm/kvm.c
>> +++ b/target/i386/kvm/kvm.c
>> @@ -4529,18 +4529,34 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
>>
>>      assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
>>
>> -    /* must be before kvm_put_nested_state so that EFER.SVME is set */
>> +    /*
>> +     * When resetting a vCPU, make sure to reset nested state first to
>> +     * e.g clear VMXON state and unlock certain CR4 bits.
>> +     */
>> +    if (level == KVM_PUT_RESET_STATE) {
>> +        ret = kvm_put_nested_state(x86_cpu);
>> +        if (ret < 0) {
>> +            return ret;
>> +        }
>
> I should have mentioned this, I actually already debugged the same issue while
> trying to reproduce the smm int window bug.
> 100% my fault.
>
> I also share the same feeling that this might be yet another 'whack a mole' and
> break somewhere else, but overall it does make sense.
This certainly *is* a 'whack a mole' and I'm sure there are other cases
where one of the calls in kvm_arch_put_registers() fails. We need to work
on what's missing so we can expose kvm_vcpu_reset() to VMMs.
>
>
> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
>
Thanks!
--
Vitaly