Hi,
This series aims to add support for QEMU to be able to migrate VMs that
are running nested hypervisors. In order to do so, it utilizes the new
IOCTLs introduced in KVM commit 8fcc4b5923af ("kvm: nVMX: Introduce
KVM_CAP_NESTED_STATE") which was created for this purpose.
1st patch adds a missing cleanup that deletes the VMX migration blocker in
case vCPU init fails.
2nd patch introduces kvm_arch_destroy_vcpu() to perform per-vCPU
destruction logic that is arch-dependent.
3rd patch is just refactoring to use symbolic constants instead of hard-coded
numbers.
4th patch fixes QEMU to update DR6 when QEMU re-injects #DB to the guest after
it was intercepted by KVM while the guest is being debugged.
5th patch adds a migration blocker for vCPUs exposed with either Intel VMX
or AMD SVM. Until now, migration was blocked only for Intel VMX.
6th patch updates linux-headers to have the updated struct kvm_nested_state.
The updated struct now has explicit fields for the data portion.
7th patch adds vmstate support for saving/restoring kernel integer types (e.g. __u16).
8th patch adds support for saving and restoring nested state in order to migrate
guests which run a nested hypervisor.
9th patch adds support for KVM_CAP_EXCEPTION_PAYLOAD. This new KVM capability
allows userspace to properly distinguish between pending and injected exceptions.
10th patch changes the nested virtualization migration blocker to only
be added when the kernel lacks support for one of the capabilities required
for correct nested migration, i.e. either KVM_CAP_NESTED_STATE or
KVM_CAP_EXCEPTION_PAYLOAD.
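The decision described for the 10th patch can be sketched roughly as below.
This is only an illustration and not the code from the series: struct
nested_caps and need_nested_migration_blocker() are hypothetical names, and
the two capability flags stand in for the result of probing
KVM_CAP_NESTED_STATE and KVM_CAP_EXCEPTION_PAYLOAD on the host kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the vCPU model and the host kernel support. */
struct nested_caps {
    bool vcpu_has_vmx;          /* vCPU is exposed with Intel VMX */
    bool vcpu_has_svm;          /* vCPU is exposed with AMD SVM */
    bool has_nested_state;      /* stands in for KVM_CAP_NESTED_STATE */
    bool has_exception_payload; /* stands in for KVM_CAP_EXCEPTION_PAYLOAD */
};

/* Returns true when a migration blocker should be added. */
static bool need_nested_migration_blocker(const struct nested_caps *c)
{
    assert(c != NULL);

    /* No nested virtualization exposed: nothing extra to migrate. */
    if (!c->vcpu_has_vmx && !c->vcpu_has_svm) {
        return false;
    }

    /* Block only if one of the required capabilities is missing. */
    return !c->has_nested_state || !c->has_exception_payload;
}
```

With this shape, a vCPU without VMX/SVM is never blocked, and a vCPU with
VMX or SVM is blocked only on kernels that lack either capability.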
Regards,
-Liran
v1->v2 changes:
* Add patch to fix bug when re-inject #DB to guest.
* Add support for KVM_CAP_EXCEPTION_PAYLOAD.
* Use explicit fields for struct kvm_nested_state data portion.
* Use vmstate subsections to save/restore nested state in order to properly
support forward & backwards migration compatibility.
* Remove VMX migration blocker.
v2->v3 changes:
* Add kvm_arch_destroy_vcpu().
* Use DR6_BS where appropriate.
* Add cpu_pre_save() logic to convert pending exception to injected
exception if guest is running L2.
* Converted max_nested_state_len to int instead of uint32_t.
* Use kvm_arch_destroy_vcpu() to free nested_state.
* Add migration blocker for vCPU exposed with AMD SVM.
* Don't rely on CR4 or MSR_EFER to know if it is required to
migrate new VMState subsections.
* Signal if vCPU is in guest-mode in hflags, as originally intended by Paolo.
v3->v4 changes:
* Delete the nested migration blocker in case vCPU init fails.
* Change detection of VMX/SVM to not consider CPU vendor.
* Modify linux-headers to not have SVM stubs and use constant for vmcs12 size.
* Wrapped definition of kernel integer vmstate macros with #ifdef CONFIG_LINUX.
* Add migration blocker in case vCPU is exposed with VMX/SVM and kernel
is lacking one of the KVM caps required to migrate nested workloads.
On 19/06/19 18:21, Liran Alon wrote:
> Hi,
>
> This series aims to add support for QEMU to be able to migrate VMs that
> are running nested hypervisors. In order to do so, it utilizes the new
> IOCTLs introduced in KVM commit 8fcc4b5923af ("kvm: nVMX: Introduce
> KVM_CAP_NESTED_STATE") which was created for this purpose.
Applied with just three minor changes that should be uncontroversial:
> 6th patch updates linux-headers to have the updated struct kvm_nested_state.
> The updated struct now has explicit fields for the data portion.
Changed patch title to "linux-headers: sync with latest KVM headers from
Linux 5.2"
> 7th patch adds vmstate support for saving/restoring kernel integer types (e.g. __u16).
>
> 8th patch adds support for saving and restoring nested state in order to migrate
> guests which run a nested hypervisor.
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index e924663f32..f3cf6e1b27 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1671,10 +1671,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
             struct kvm_vmx_nested_state_hdr *vmx_hdr =
                 &env->nested_state->hdr.vmx;
 
+            env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
             vmx_hdr->vmxon_pa = -1ull;
             vmx_hdr->vmcs12_pa = -1ull;
         }
-
     }
 
     cpu->kvm_msr_buf = g_malloc0(MSR_BUF_SIZE);
which is a no-op since KVM_STATE_NESTED_FORMAT_VMX is zero, but it's tidy.
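To illustrate why the added assignment does not change behaviour, here is a
minimal standalone sketch. It assumes, as stated above, that the VMX format
constant is zero and that the state buffer comes from a zeroing allocator.
The struct is a simplified stand-in and not the real struct kvm_nested_state,
NESTED_FORMAT_VMX_SKETCH is a hypothetical name, and calloc() stands in for
g_malloc0().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for KVM_STATE_NESTED_FORMAT_VMX (zero per the thread). */
#define NESTED_FORMAT_VMX_SKETCH 0

/* Simplified stand-in for the leading fields of the nested state header. */
struct nested_state_sketch {
    uint16_t flags;
    uint16_t format;
    uint32_t size;
};

/* Allocate a zero-filled state, as a zeroing allocator would. */
static struct nested_state_sketch *alloc_nested_state(void)
{
    struct nested_state_sketch *s = calloc(1, sizeof(*s));

    assert(s != NULL);
    return s;
}

/* Explicitly record the format; a no-op on a freshly zeroed buffer. */
static void set_vmx_format(struct nested_state_sketch *s)
{
    s->format = NESTED_FORMAT_VMX_SKETCH;
}
```

The explicit assignment only matters if the constant is ever non-zero or the
buffer is reused, which is why it is tidy rather than necessary.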
> 9th patch adds support for KVM_CAP_EXCEPTION_PAYLOAD. This new KVM capability
> allows userspace to properly distinguish between pending and injected exceptions.
>
> 10th patch changes the nested virtualization migration blocker to only
> be added when the kernel lacks support for one of the capabilities required
> for correct nested migration, i.e. either KVM_CAP_NESTED_STATE or
> KVM_CAP_EXCEPTION_PAYLOAD.
Had to disable this for SVM unfortunately.
> On 20 Jun 2019, at 15:38, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 19/06/19 18:21, Liran Alon wrote:
>> Hi,
>>
>> This series aims to add support for QEMU to be able to migrate VMs that
>> are running nested hypervisors. In order to do so, it utilizes the new
>> IOCTLs introduced in KVM commit 8fcc4b5923af ("kvm: nVMX: Introduce
>> KVM_CAP_NESTED_STATE") which was created for this purpose.
>
> Applied with just three minor changes that should be uncontroversial:
ACK. Where can I see the applied patches for review?
>
>> 6th patch updates linux-headers to have the updated struct kvm_nested_state.
>> The updated struct now has explicit fields for the data portion.
>
> Changed patch title to "linux-headers: sync with latest KVM headers from
> Linux 5.2"
ACK.
>
>> 7th patch adds vmstate support for saving/restoring kernel integer types (e.g. __u16).
>>
>> 8th patch adds support for saving and restoring nested state in order to migrate
>> guests which run a nested hypervisor.
>
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index e924663f32..f3cf6e1b27 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1671,10 +1671,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
>              struct kvm_vmx_nested_state_hdr *vmx_hdr =
>                  &env->nested_state->hdr.vmx;
>
> +            env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
>              vmx_hdr->vmxon_pa = -1ull;
>              vmx_hdr->vmcs12_pa = -1ull;
>          }
> -
>      }
>
>      cpu->kvm_msr_buf = g_malloc0(MSR_BUF_SIZE);
>
> which is a no-op since KVM_STATE_NESTED_FORMAT_VMX is zero, but it's tidy.
I agree. My bad. Thanks for adding this :)
>
>> 9th patch adds support for KVM_CAP_EXCEPTION_PAYLOAD. This new KVM capability
>> allows userspace to properly distinguish between pending and injected exceptions.
>>
>> 10th patch changes the nested virtualization migration blocker to only
>> be added when the kernel lacks support for one of the capabilities required
>> for correct nested migration, i.e. either KVM_CAP_NESTED_STATE or
>> KVM_CAP_EXCEPTION_PAYLOAD.
>
> Had to disable this for SVM unfortunately.
For backwards compatibility I assume… Sounds reasonable to me so ACK.
That said, I would really like to hear your opinion on the thread I had with David Gilbert regarding QEMU’s migration backwards compatibility:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg622274.html
Thanks for the assistance pushing this forward,
-Liran
> On 20 Jun 2019, at 16:28, Liran Alon <liran.alon@oracle.com> wrote:
>
>
>
>> On 20 Jun 2019, at 15:38, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 19/06/19 18:21, Liran Alon wrote:
>>> Hi,
>>>
>>> This series aims to add support for QEMU to be able to migrate VMs that
>>> are running nested hypervisors. In order to do so, it utilizes the new
>>> IOCTLs introduced in KVM commit 8fcc4b5923af ("kvm: nVMX: Introduce
>>> KVM_CAP_NESTED_STATE") which was created for this purpose.
>>
>> Applied with just three minor changes that should be uncontroversial:
>
> ACK. Where can I see the applied patches for review?
>
>>
>>> 6th patch updates linux-headers to have the updated struct kvm_nested_state.
>>> The updated struct now has explicit fields for the data portion.
>>
>> Changed patch title to "linux-headers: sync with latest KVM headers from
>> Linux 5.2"
>
> ACK.
>
>>
>>> 7th patch adds vmstate support for saving/restoring kernel integer types (e.g. __u16).
>>>
>>> 8th patch adds support for saving and restoring nested state in order to migrate
>>> guests which run a nested hypervisor.
>>
>> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
>> index e924663f32..f3cf6e1b27 100644
>> --- a/target/i386/kvm.c
>> +++ b/target/i386/kvm.c
>> @@ -1671,10 +1671,10 @@ int kvm_arch_init_vcpu(CPUState *cs)
>>              struct kvm_vmx_nested_state_hdr *vmx_hdr =
>>                  &env->nested_state->hdr.vmx;
>>
>> +            env->nested_state->format = KVM_STATE_NESTED_FORMAT_VMX;
>>              vmx_hdr->vmxon_pa = -1ull;
>>              vmx_hdr->vmcs12_pa = -1ull;
>>          }
>> -
>>      }
>>
>>      cpu->kvm_msr_buf = g_malloc0(MSR_BUF_SIZE);
>>
>> which is a no-op since KVM_STATE_NESTED_FORMAT_VMX is zero, but it's tidy.
>
> I agree. My bad. Thanks for adding this :)
Actually, I think it makes more sense to condition this on cpu_has_vmx(env) instead of IS_INTEL_CPU(env),
and also to add an "else if (cpu_has_svm(env))" branch that sets env->nested_state->format to KVM_STATE_NESTED_FORMAT_SVM.
Could you change that when applying? :)
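Roughly the shape I have in mind, as a standalone sketch rather than an
actual patch. The names below are hypothetical stand-ins: has_vmx/has_svm
play the role of cpu_has_vmx(env)/cpu_has_svm(env), and the two FORMAT_*
constants stand in for the KVM_STATE_NESTED_FORMAT_* values (VMX is zero as
noted above; the value 1 for SVM is an assumption here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for KVM_STATE_NESTED_FORMAT_VMX / _SVM. */
enum {
    FORMAT_VMX_SKETCH = 0,
    FORMAT_SVM_SKETCH = 1, /* assumed value, for illustration only */
};

/* Hypothetical stand-in for the feature checks done on env. */
struct vcpu_features {
    bool has_vmx; /* plays the role of cpu_has_vmx(env) */
    bool has_svm; /* plays the role of cpu_has_svm(env) */
};

/*
 * Pick the nested state format from the exposed feature, not from the
 * CPU vendor. Returns false if no nested virtualization is exposed, in
 * which case *format is left untouched.
 */
static bool select_nested_format(const struct vcpu_features *f,
                                 uint16_t *format)
{
    assert(f != NULL && format != NULL);

    if (f->has_vmx) {
        *format = FORMAT_VMX_SKETCH;
        return true;
    } else if (f->has_svm) {
        *format = FORMAT_SVM_SKETCH;
        return true;
    }
    return false;
}
```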
-Liran
>
>>
>>> 9th patch adds support for KVM_CAP_EXCEPTION_PAYLOAD. This new KVM capability
>>> allows userspace to properly distinguish between pending and injected exceptions.
>>>
>>> 10th patch changes the nested virtualization migration blocker to only
>>> be added when the kernel lacks support for one of the capabilities required
>>> for correct nested migration, i.e. either KVM_CAP_NESTED_STATE or
>>> KVM_CAP_EXCEPTION_PAYLOAD.
>>
>> Had to disable this for SVM unfortunately.
>
> For backwards compatibility I assume… Sounds reasonable to me so ACK.
>
> That said, I would really like to hear your opinion on the thread I had with David Gilbert regarding QEMU’s migration backwards compatibility:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg622274.html
>
> Thanks for the assistance pushing this forward,
> -Liran
>
>