[PATCH v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation

Vitaly Kuznetsov posted 2 patches 1 year, 8 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220818150113.479917-1-vkuznets@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Marcelo Tosatti <mtosatti@redhat.com>
target/i386/kvm/kvm.c | 54 +++++++++++++++++++++++++++++++------------
1 file changed, 39 insertions(+), 15 deletions(-)
[PATCH v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation
Posted by Vitaly Kuznetsov 1 year, 8 months ago
Changes since RFC:
- Call kvm_put_msr_feature_control() before kvm_put_sregs2() to achieve
 the same result [Paolo].
- Add Maxim's R-b to PATCH1.

It was discovered that Windows 11 with WSL2 (Hyper-V) enabled guests fail
to reboot when QEMU's 'system_reset' command is issued. The problem appears
to be that KVM_SET_SREGS2 fails because zeroed CR4 register value doesn't
pass vmx_is_valid_cr4() check in KVM as certain bits can't be zero while in
VMX root operation (post-VMXON). kvm_arch_put_registers() does call 
kvm_put_nested_state() which is supposed to kick vCPU out of VMX root
operation, however, it only does so after kvm_put_sregs2() and there's
a good reason for that: 'real' nested state requires e.g. EFER.SVME to
be set. 

The root cause of the issue seems to be that QEMU is doing quite a lot
to forcefully reset a vCPU as KVM doesn't export kvm_vcpu_reset() (or,
rather, it's super-set) yet. While all the numerous existing APIs for
setting a vCPU state work fine for a newly created vCPU, using them for
vCPU reset is a mess caused by various dependencies between different
components of the state (VMX, SMM, MSRs, XCRs, CPUIDs, ...). It would've
been possible to allow to set 'inconsistent' state and only validate it
upon VCPU_RUN from the very beginning but that ship has long sailed for
KVM. A new, dedicated API for vCPU reset is likely the way to go.

Resolve the immediate issue by setting MSR_IA32_FEATURE_CONTROL before
kvm_put_sregs2() (and kvm_put_nested_state()), this ensures vCPU gets
kicked out of VMX root operation.

Vitaly Kuznetsov (2):
  i386: reset KVM nested state upon CPU reset
  i386: do kvm_put_msr_feature_control() first thing when vCPU is reset

 target/i386/kvm/kvm.c | 54 +++++++++++++++++++++++++++++++------------
 1 file changed, 39 insertions(+), 15 deletions(-)

-- 
2.37.1
Re: [PATCH v1 0/2] i386: KVM: Fix 'system_reset' failures when vCPU is in VMX root operation
Posted by Paolo Bonzini 1 year, 8 months ago
On 8/18/22 17:01, Vitaly Kuznetsov wrote:
> Changes since RFC:
> - Call kvm_put_msr_feature_control() before kvm_put_sregs2() to achieve
>   the same result [Paolo].
> - Add Maxim's R-b to PATCH1.
> 
> It was discovered that Windows 11 with WSL2 (Hyper-V) enabled guests fail
> to reboot when QEMU's 'system_reset' command is issued. The problem appears
> to be that KVM_SET_SREGS2 fails because zeroed CR4 register value doesn't
> pass vmx_is_valid_cr4() check in KVM as certain bits can't be zero while in
> VMX root operation (post-VMXON). kvm_arch_put_registers() does call
> kvm_put_nested_state() which is supposed to kick vCPU out of VMX root
> operation, however, it only does so after kvm_put_sregs2() and there's
> a good reason for that: 'real' nested state requires e.g. EFER.SVME to
> be set.
> 
> The root cause of the issue seems to be that QEMU is doing quite a lot
> to forcefully reset a vCPU as KVM doesn't export kvm_vcpu_reset() (or,
> rather, it's super-set) yet. While all the numerous existing APIs for
> setting a vCPU state work fine for a newly created vCPU, using them for
> vCPU reset is a mess caused by various dependencies between different
> components of the state (VMX, SMM, MSRs, XCRs, CPUIDs, ...). It would've
> been possible to allow to set 'inconsistent' state and only validate it
> upon VCPU_RUN from the very beginning but that ship has long sailed for
> KVM. A new, dedicated API for vCPU reset is likely the way to go.
> 
> Resolve the immediate issue by setting MSR_IA32_FEATURE_CONTROL before
> kvm_put_sregs2() (and kvm_put_nested_state()), this ensures vCPU gets
> kicked out of VMX root operation.

Queued, thanks.

Paolo