From: Liran Alon
To: qemu-devel@nongnu.org
Cc: ehabkost@redhat.com, kvm@vger.kernel.org, maran.wilson@oracle.com,
 mtosatti@redhat.com, dgilbert@redhat.com, Liran Alon, Nikita Leshenko,
 pbonzini@redhat.com, rth@twiddle.net, jmattson@google.com
Date: Mon, 17 Jun 2019 20:56:56 +0300
Message-Id: <20190617175658.135869-8-liran.alon@oracle.com>
In-Reply-To: <20190617175658.135869-1-liran.alon@oracle.com>
References: <20190617175658.135869-1-liran.alon@oracle.com>
Subject: [Qemu-devel] [QEMU PATCH v3 7/9] KVM: i386: Add support for save and restore nested state

Kernel commit 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")
introduced new IOCTLs to extract and restore vCPU state related to
Intel VMX & AMD SVM.

Utilize these IOCTLs to add support for migration of VMs which are
running nested hypervisors.

Reviewed-by: Nikita Leshenko
Signed-off-by: Liran Alon
Reviewed-by: Maran Wilson
---
 accel/kvm/kvm-all.c   |   8 ++
 include/sysemu/kvm.h  |   1 +
 target/i386/cpu.h     |   3 +
 target/i386/kvm.c     |  80 +++++++++++++++++
 target/i386/machine.c | 196 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 288 insertions(+)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 59a3aa3a40da..4fdf5b04b131 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -88,6 +88,7 @@ struct KVMState
 #ifdef KVM_CAP_SET_GUEST_DEBUG
     QTAILQ_HEAD(, kvm_sw_breakpoint) kvm_sw_breakpoints;
 #endif
+    int max_nested_state_len;
     int many_ioeventfds;
     int intx_set_mask;
     bool sync_mmu;
@@ -1678,6 +1679,8 @@ static int kvm_init(MachineState *ms)
     s->debugregs = kvm_check_extension(s, KVM_CAP_DEBUGREGS);
 #endif
 
+    s->max_nested_state_len = kvm_check_extension(s, KVM_CAP_NESTED_STATE);
+
 #ifdef KVM_CAP_IRQ_ROUTING
     kvm_direct_msi_allowed = (kvm_check_extension(s, KVM_CAP_SIGNAL_MSI) > 0);
 #endif
@@ -2245,6 +2248,11 @@ int kvm_has_debugregs(void)
     return kvm_state->debugregs;
 }
 
+int kvm_max_nested_state_length(void)
+{
+    return kvm_state->max_nested_state_len;
+}
+
 int kvm_has_many_ioeventfds(void)
 {
     if (!kvm_enabled()) {
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 64f55e519df7..acd90aebb6c4 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -210,6 +210,7 @@ bool kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);
 int kvm_has_debugregs(void);
+int kvm_max_nested_state_length(void);
 int kvm_has_pit_state2(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_gsi_routing(void);
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 79d9495ceb0c..a6bb71849869 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1350,6 +1350,9 @@ typedef struct CPUX86State {
 #if defined(CONFIG_KVM) || defined(CONFIG_HVF)
     void *xsave_buf;
 #endif
+#if defined(CONFIG_KVM)
+    struct kvm_nested_state *nested_state;
+#endif
 #if defined(CONFIG_HVF)
     HVFX86EmulatorState *hvf_emul;
 #endif
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index f43e2d69859e..5950c3ed0d1c 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -931,6 +931,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
     struct kvm_cpuid_entry2 *c;
     uint32_t signature[3];
     int kvm_base = KVM_CPUID_SIGNATURE;
+    int max_nested_state_len;
     int r;
     Error *local_err = NULL;
 
@@ -1331,6 +1332,24 @@ int kvm_arch_init_vcpu(CPUState *cs)
     if (has_xsave) {
         env->xsave_buf = qemu_memalign(4096, sizeof(struct kvm_xsave));
     }
+
+    max_nested_state_len = kvm_max_nested_state_length();
+    if (max_nested_state_len > 0) {
+        assert(max_nested_state_len >= offsetof(struct kvm_nested_state, data));
+        env->nested_state = g_malloc0(max_nested_state_len);
+
+        env->nested_state->size = max_nested_state_len;
+
+        if (IS_INTEL_CPU(env)) {
+            struct kvm_vmx_nested_state_hdr *vmx_hdr =
+                &env->nested_state->hdr.vmx;
+
+            vmx_hdr->vmxon_pa = -1ull;
+            vmx_hdr->vmcs12_pa = -1ull;
+        }
+
+    }
+
     cpu->kvm_msr_buf = g_malloc0(MSR_BUF_SIZE);
 
     if (!(env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP)) {
@@ -1352,12 +1371,18 @@ int kvm_arch_init_vcpu(CPUState *cs)
 int kvm_arch_destroy_vcpu(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
+    CPUX86State *env = &cpu->env;
 
     if (cpu->kvm_msr_buf) {
         g_free(cpu->kvm_msr_buf);
         cpu->kvm_msr_buf = NULL;
     }
 
+    if (env->nested_state) {
+        g_free(env->nested_state);
+        env->nested_state = NULL;
+    }
+
     return 0;
 }
 
@@ -3072,6 +3097,52 @@ static int kvm_get_debugregs(X86CPU *cpu)
     return 0;
 }
 
+static int kvm_put_nested_state(X86CPU *cpu)
+{
+    CPUX86State *env = &cpu->env;
+    int max_nested_state_len = kvm_max_nested_state_length();
+
+    if (max_nested_state_len <= 0) {
+        return 0;
+    }
+
+    assert(env->nested_state->size <= max_nested_state_len);
+    return kvm_vcpu_ioctl(CPU(cpu), KVM_SET_NESTED_STATE, env->nested_state);
+}
+
+static int kvm_get_nested_state(X86CPU *cpu)
+{
+    CPUX86State *env = &cpu->env;
+    int max_nested_state_len = kvm_max_nested_state_length();
+    int ret;
+
+    if (max_nested_state_len <= 0) {
+        return 0;
+    }
+
+    /*
+     * It is possible that migration restored a smaller size into
+     * nested_state->size than what our kernel supports.
+     * We preserve the migration origin's nested_state->size for the
+     * call to KVM_SET_NESTED_STATE, but want our next call to
+     * KVM_GET_NESTED_STATE to use the max size our kernel supports.
+     */
+    env->nested_state->size = max_nested_state_len;
+
+    ret = kvm_vcpu_ioctl(CPU(cpu), KVM_GET_NESTED_STATE, env->nested_state);
+    if (ret < 0) {
+        return ret;
+    }
+
+    if (env->nested_state->flags & KVM_STATE_NESTED_GUEST_MODE) {
+        env->hflags |= HF_GUEST_MASK;
+    } else {
+        env->hflags &= ~HF_GUEST_MASK;
+    }
+
+    return ret;
+}
+
 int kvm_arch_put_registers(CPUState *cpu, int level)
 {
     X86CPU *x86_cpu = X86_CPU(cpu);
@@ -3079,6 +3150,11 @@ int kvm_arch_put_registers(CPUState *cpu, int level)
 
     assert(cpu_is_stopped(cpu) || qemu_cpu_is_self(cpu));
 
+    ret = kvm_put_nested_state(x86_cpu);
+    if (ret < 0) {
+        return ret;
+    }
+
     if (level >= KVM_PUT_RESET_STATE) {
         ret = kvm_put_msr_feature_control(x86_cpu);
         if (ret < 0) {
@@ -3194,6 +3270,10 @@ int kvm_arch_get_registers(CPUState *cs)
     if (ret < 0) {
         goto out;
     }
+    ret = kvm_get_nested_state(cpu);
+    if (ret < 0) {
+        goto out;
+    }
     ret = 0;
  out:
     cpu_sync_bndcs_hflags(&cpu->env);
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 225b5d433bc4..95299ebff44a 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -231,6 +231,15 @@ static int cpu_pre_save(void *opaque)
         env->segs[R_SS].flags &= ~(env->segs[R_SS].flags & DESC_DPL_MASK);
     }
 
+#ifdef CONFIG_KVM
+    /* Verify we have nested virtualization state from kernel if required */
+    if (cpu_has_nested_virt(env) && !env->nested_state) {
+        error_report("Guest enabled nested virtualization but kernel "
+                     "does not support saving of nested state");
+        return -EINVAL;
+    }
+#endif
+
     return 0;
 }
 
@@ -278,6 +287,16 @@ static int cpu_post_load(void *opaque, int version_id)
     env->hflags &= ~HF_CPL_MASK;
     env->hflags |= (env->segs[R_SS].flags >> DESC_DPL_SHIFT) & HF_CPL_MASK;
 
+#ifdef CONFIG_KVM
+    if ((env->hflags & HF_GUEST_MASK) &&
+        (!env->nested_state ||
+         !(env->nested_state->flags & KVM_STATE_NESTED_GUEST_MODE))) {
+        error_report("vCPU set in guest-mode inconsistent with "
+                     "migrated kernel nested state");
+        return -EINVAL;
+    }
+#endif
+
     env->fpstt = (env->fpus_vmstate >> 11) & 7;
     env->fpus = env->fpus_vmstate & ~0x3800;
     env->fptag_vmstate ^= 0xff;
@@ -851,6 +870,180 @@ static const VMStateDescription vmstate_tsc_khz = {
     }
 };
 
+#ifdef CONFIG_KVM
+
+static bool vmx_vmcs12_needed(void *opaque)
+{
+    struct kvm_nested_state *nested_state = opaque;
+    return (nested_state->size >
+            offsetof(struct kvm_nested_state, data.vmx[0].vmcs12));
+}
+
+static const VMStateDescription vmstate_vmx_vmcs12 = {
+    .name = "cpu/kvm_nested_state/vmx/vmcs12",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = vmx_vmcs12_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8_ARRAY(data.vmx[0].vmcs12,
+                            struct kvm_nested_state, 0x1000),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static bool vmx_shadow_vmcs12_needed(void *opaque)
+{
+    struct kvm_nested_state *nested_state = opaque;
+    return (nested_state->size >
+            offsetof(struct kvm_nested_state, data.vmx[0].shadow_vmcs12));
+}
+
+static const VMStateDescription vmstate_vmx_shadow_vmcs12 = {
+    .name = "cpu/kvm_nested_state/vmx/shadow_vmcs12",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = vmx_shadow_vmcs12_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT8_ARRAY(data.vmx[0].shadow_vmcs12,
+                            struct kvm_nested_state, 0x1000),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static bool vmx_nested_state_needed(void *opaque)
+{
+    struct kvm_nested_state *nested_state = opaque;
+
+    return ((nested_state->format == KVM_STATE_NESTED_FORMAT_VMX) &&
+            ((nested_state->hdr.vmx.vmxon_pa != -1ull) ||
+             (nested_state->hdr.vmx.smm.flags & KVM_STATE_NESTED_SMM_VMXON)));
+}
+
+static const VMStateDescription vmstate_vmx_nested_state = {
+    .name = "cpu/kvm_nested_state/vmx",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = vmx_nested_state_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_U64(hdr.vmx.vmxon_pa, struct kvm_nested_state),
+        VMSTATE_U64(hdr.vmx.vmcs12_pa, struct kvm_nested_state),
+        VMSTATE_U16(hdr.vmx.smm.flags, struct kvm_nested_state),
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription*[]) {
+        &vmstate_vmx_vmcs12,
+        &vmstate_vmx_shadow_vmcs12,
+        NULL,
+    }
+};
+
+static bool svm_nested_state_needed(void *opaque)
+{
+    struct kvm_nested_state *nested_state = opaque;
+
+    return (nested_state->format == KVM_STATE_NESTED_FORMAT_SVM);
+}
+
+static const VMStateDescription vmstate_svm_nested_state = {
+    .name = "cpu/kvm_nested_state/svm",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = svm_nested_state_needed,
+    .fields = (VMStateField[]) {
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static bool nested_state_needed(void *opaque)
+{
+    X86CPU *cpu = opaque;
+    CPUX86State *env = &cpu->env;
+
+    return (env->nested_state &&
+            (vmx_nested_state_needed(env->nested_state) ||
+             svm_nested_state_needed(env->nested_state)));
+}
+
+static int nested_state_post_load(void *opaque, int version_id)
+{
+    X86CPU *cpu = opaque;
+    CPUX86State *env = &cpu->env;
+    struct kvm_nested_state *nested_state = env->nested_state;
+    int min_nested_state_len = offsetof(struct kvm_nested_state, data);
+    int max_nested_state_len = kvm_max_nested_state_length();
+
+    /*
+     * If our kernel doesn't support setting nested state
+     * and we have received nested state from the migration stream,
+     * we need to fail migration
+     */
+    if (max_nested_state_len <= 0) {
+        error_report("Received nested state when kernel cannot restore it");
+        return -EINVAL;
+    }
+
+    /*
+     * Verify that the size of the received nested_state struct
+     * at least covers the required header and is not larger
+     * than the max size that our kernel supports
+     */
+    if (nested_state->size < min_nested_state_len) {
+        error_report("Received nested state size less than min: "
+                     "len=%d, min=%d",
+                     nested_state->size, min_nested_state_len);
+        return -EINVAL;
+    }
+    if (nested_state->size > max_nested_state_len) {
+        error_report("Received unsupported nested state size: "
+                     "nested_state->size=%d, max=%d",
+                     nested_state->size, max_nested_state_len);
+        return -EINVAL;
+    }
+
+    /* Verify format is valid */
+    if ((nested_state->format != KVM_STATE_NESTED_FORMAT_VMX) &&
+        (nested_state->format != KVM_STATE_NESTED_FORMAT_SVM)) {
+        error_report("Received invalid nested state format: %d",
+                     nested_state->format);
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static const VMStateDescription vmstate_kvm_nested_state = {
+    .name = "cpu/kvm_nested_state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .fields = (VMStateField[]) {
+        VMSTATE_U16(flags, struct kvm_nested_state),
+        VMSTATE_U16(format, struct kvm_nested_state),
+        VMSTATE_U32(size, struct kvm_nested_state),
+        VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription*[]) {
+        &vmstate_vmx_nested_state,
+        &vmstate_svm_nested_state,
+        NULL
+    }
+};
+
+static const VMStateDescription vmstate_nested_state = {
+    .name = "cpu/nested_state",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = nested_state_needed,
+    .post_load = nested_state_post_load,
+    .fields = (VMStateField[]) {
+        VMSTATE_STRUCT_POINTER(env.nested_state, X86CPU,
+                               vmstate_kvm_nested_state,
+                               struct kvm_nested_state),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+#endif
+
 static bool mcg_ext_ctl_needed(void *opaque)
 {
     X86CPU *cpu = opaque;
@@ -1089,6 +1282,9 @@ VMStateDescription vmstate_x86_cpu = {
         &vmstate_msr_intel_pt,
         &vmstate_msr_virt_ssbd,
         &vmstate_svm_npt,
+#ifdef CONFIG_KVM
+        &vmstate_nested_state,
+#endif
         NULL
     }
 };
-- 
2.20.1
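
Illustrative note, not part of the patch: the size and format checks that nested_state_post_load() applies to an incoming migration stream can be tried on their own. The sketch below is a stand-alone approximation that builds without kernel or QEMU headers. Everything prefixed with demo_ or DEMO_ is hypothetical and invented for this note: struct demo_nested_state stands in for struct kvm_nested_state (the 120-byte hdr area is an assumption), the DEMO_FORMAT_* values stand in for KVM_STATE_NESTED_FORMAT_*, and the max_len argument plays the role of kvm_max_nested_state_length().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct kvm_nested_state: a fixed header
 * followed by a variable-length payload. The 120-byte hdr area is an
 * assumption made for this sketch.
 */
struct demo_nested_state {
    uint16_t flags;
    uint16_t format;
    uint32_t size;      /* total size in bytes, header included */
    uint8_t hdr[120];
    uint8_t data[];     /* optional payload */
};

/* Hypothetical format identifiers for this sketch only. */
enum { DEMO_FORMAT_VMX = 0, DEMO_FORMAT_SVM = 1 };

/*
 * Return 0 if the received state is acceptable, -EINVAL otherwise.
 * max_len is the largest state the destination kernel reports it can
 * handle; a value <= 0 means nested state is unsupported there.
 */
int demo_validate_nested_state(const struct demo_nested_state *s, int max_len)
{
    const size_t min_len = offsetof(struct demo_nested_state, data);

    assert(s != NULL);

    if (max_len <= 0) {
        return -EINVAL;     /* destination cannot restore nested state */
    }
    if (s->size < min_len) {
        return -EINVAL;     /* does not even cover the fixed header */
    }
    if (s->size > (size_t)max_len) {
        return -EINVAL;     /* larger than the destination supports */
    }
    if (s->format != DEMO_FORMAT_VMX && s->format != DEMO_FORMAT_SVM) {
        return -EINVAL;     /* unknown format */
    }
    return 0;
}
```

The ordering mirrors the patch: the capability check comes first, so a destination without KVM_CAP_NESTED_STATE rejects the stream before any field of the received struct is interpreted.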