[Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime

Jan Kiszka posted 1 patch 4 years, 9 months ago
Test checkpatch passed
Test s390x passed
Test asan passed
Test docker-mingw@fedora passed
Test FreeBSD passed
Test docker-clang@ubuntu failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/bdd53f40-4e60-f3ae-7ec6-162198214953@siemens.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Richard Henderson <rth@twiddle.net>, Marcelo Tosatti <mtosatti@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>
target/i386/kvm.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
[Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Posted by Jan Kiszka 4 years, 9 months ago
Writing the nested state e.g. after a vmport access can invalidate
important parts of the kernel-internal state, and it is not needed as
well. So leave this out from KVM_PUT_RUNTIME_STATE.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 target/i386/kvm.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index ec7870c6af..da98b2cbca 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -3577,12 +3577,12 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
-    ret = kvm_put_nested_state(x86_cpu);
-    if (ret < 0) {
-        return ret;
-    }
-
     if (level >= KVM_PUT_RESET_STATE) {
+        ret = kvm_put_nested_state(x86_cpu);
+        if (ret < 0) {
+            return ret;
+        }
+
         ret = kvm_put_msr_feature_control(x86_cpu);
         if (ret < 0) {
             return ret;
-- 
2.16.4

Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Posted by Liran Alon 4 years, 9 months ago

> On 22 Jul 2019, at 7:00, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> 
> Writing the nested state e.g. after a vmport access can invalidate
> important parts of the kernel-internal state, and it is not needed as
> well. So leave this out from KVM_PUT_RUNTIME_STATE.
> 
> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

As QEMU never modifies vCPU nested-state in userspace besides in vmload and vCPU creation,
shouldn’t this be under KVM_PUT_FULL_STATE? Same as the call to kvm_arch_set_tsc_khz().

-Liran 

> ---
> target/i386/kvm.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index ec7870c6af..da98b2cbca 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -3577,12 +3577,12 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
> 
>     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
> 
> -    ret = kvm_put_nested_state(x86_cpu);
> -    if (ret < 0) {
> -        return ret;
> -    }
> -
>     if (level >= KVM_PUT_RESET_STATE) {
> +        ret = kvm_put_nested_state(x86_cpu);
> +        if (ret < 0) {
> +            return ret;
> +        }
> +
>         ret = kvm_put_msr_feature_control(x86_cpu);
>         if (ret < 0) {
>             return ret;
> -- 
> 2.16.4


Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Posted by Jan Kiszka 4 years, 9 months ago
On 22.07.19 11:44, Liran Alon wrote:
> 
> 
>> On 22 Jul 2019, at 7:00, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> Writing the nested state e.g. after a vmport access can invalidate
>> important parts of the kernel-internal state, and it is not needed as
>> well. So leave this out from KVM_PUT_RUNTIME_STATE.
>>
>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> 
> As QEMU never modifies vCPU nested-state in userspace besides in vmload and vCPU creation,
> shouldn’t this be under KVM_PUT_FULL_STATE? Same as the call to kvm_arch_set_tsc_khz().

Reset is a relevant modification as well. If we do not write back a state that
is disabling virtualization, we break.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Posted by Liran Alon 4 years, 9 months ago

> On 22 Jul 2019, at 13:20, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> 
> On 22.07.19 11:44, Liran Alon wrote:
>> 
>> 
>>> On 22 Jul 2019, at 7:00, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> 
>>> Writing the nested state e.g. after a vmport access can invalidate
>>> important parts of the kernel-internal state, and it is not needed as
>>> well. So leave this out from KVM_PUT_RUNTIME_STATE.
>>> 
>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>> 
>> As QEMU never modifies vCPU nested-state in userspace besides in vmload and vCPU creation,
>> shouldn’t this be under KVM_PUT_FULL_STATE? Same as the call to kvm_arch_set_tsc_khz().
> 
> Reset is a relevant modification as well. If we do not write back a state that
> is disabling virtualization, we break.
> 
> Jan

Currently QEMU writes to userspace maintained nested-state only at kvm_arch_init_vcpu() and
when loading vmstate_nested_state vmstate subsection.
kvm_arch_reset_vcpu() do not modify userspace maintained nested-state.

I would expect KVM internal nested-state to be reset as part of KVM’s vmx_vcpu_reset().
Are we missing a call to vmx_leave_nested() there? 

-Liran
Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Posted by Jan Kiszka 4 years, 9 months ago
On 22.07.19 12:31, Liran Alon wrote:
>> On 22 Jul 2019, at 13:20, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> On 22.07.19 11:44, Liran Alon wrote:
>>>
>>>
>>>> On 22 Jul 2019, at 7:00, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>>>
>>>> Writing the nested state e.g. after a vmport access can invalidate
>>>> important parts of the kernel-internal state, and it is not needed as
>>>> well. So leave this out from KVM_PUT_RUNTIME_STATE.
>>>>
>>>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>
>>> As QEMU never modifies vCPU nested-state in userspace besides in vmload and vCPU creation,
>>> shouldn’t this be under KVM_PUT_FULL_STATE? Same as the call to kvm_arch_set_tsc_khz().
>>
>> Reset is a relevant modification as well. If we do not write back a state that
>> is disabling virtualization, we break.
>>
>> Jan
> 
> Currently QEMU writes to userspace maintained nested-state only at kvm_arch_init_vcpu() and
> when loading vmstate_nested_state vmstate subsection.
> kvm_arch_reset_vcpu() do not modify userspace maintained nested-state.

Hmm, then we probably achieve that effect by clearing the related bit in CR4. If
doing that implies an invalidation of the nested state by KVM, we are fine.
Otherwise, I would expect userspace to reset the state to VMCLEAR and purge any
traces of prior use.

> 
> I would expect KVM internal nested-state to be reset as part of KVM’s vmx_vcpu_reset().

vmx_vcpu_reset is not issued on userspace initiated VM reset, only on INIT/SIPI
cycles. It's misleading, and I remember myself running into that when I hacked
on KVM back then.

Jan

> Are we missing a call to vmx_leave_nested() there? 
> 
> -Liran
> 

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [PATCH] i386/kvm: Do not sync nested state during runtime
Posted by Paolo Bonzini 4 years, 9 months ago
On 22/07/19 12:43, Jan Kiszka wrote:
>> Currently QEMU writes to userspace maintained nested-state only at kvm_arch_init_vcpu() and
>> when loading vmstate_nested_state vmstate subsection.
>> kvm_arch_reset_vcpu() do not modify userspace maintained nested-state.
> Hmm, then we probably achieve that effect by clearing the related bit in CR4.

Almost: by clearing the VMX enable bit in MSR_IA32_FEATURE_CONTROL.
Actually I think you contributed that. :)

I think we could in principle skip that MSR write if env->nested_state
!= NULL, but it doesn't hurt either, and it makes sense that nested virt
state goes together with MSR_IA32_FEATURE_CONTROL since the latter is
indede nested virtualization related.

Paolo

> If doing that implies an invalidation of the nested state by KVM, we
> are fine. Otherwise, I would expect userspace to reset the state to
> VMCLEAR and purge any traces of prior use.