From nobody Tue Nov 11 14:25:36 2025
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: Joao Martins, Liran Alon, Maran Wilson
Date: Sat, 20 Jul 2019 17:18:35 +0200
Message-Id: <20190720151846.7450-2-pbonzini@redhat.com>
In-Reply-To: <20190720151846.7450-1-pbonzini@redhat.com>
References: <20190720151846.7450-1-pbonzini@redhat.com>
Subject: [Qemu-devel] [PULL 01/12] target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX

From: Liran Alon

Previous to this change, a vCPU exposed with VMX running on a kernel
without KVM_CAP_NESTED_STATE or KVM_CAP_EXCEPTION_PAYLOAD resulted in
adding a migration blocker. This was because when the code was written
it was thought there is no way to reliably know if a vCPU is utilising
VMX or not at runtime. However, it turns out that this can be known to
some extent:
In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
Since it was set, CR4.VMXE must remain set as long as the vCPU is in
VMX operation. This is because CR4.VMXE is one of the bits set
in MSR_IA32_VMX_CR4_FIXED1.
There is one exception to the above statement when vCPU enters SMM mode.
When a vCPU enters SMM mode, it temporarily exits VMX operation and
may also reset CR4.VMXE during execution in SMM mode.
When the vCPU exits SMM mode, vCPU state is restored to be in VMX operation
and CR4.VMXE is restored to its original state of being set.
Therefore, when the vCPU is not in SMM mode, we can infer whether
VMX is being used by examining CR4.VMXE. Otherwise, we cannot
know for certain but assume the worst that vCPU may utilise VMX.

Summarizing all the above, a vCPU may have enabled VMX in case
CR4.VMXE is set or vCPU is in SMM mode.

Therefore, remove migration blocker and check before migration
(cpu_pre_save()) if the vCPU may have enabled VMX. If true, only then
require relevant kernel capabilities.

While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in
guest-mode and there is a pending/injected exception. Otherwise, this
kernel capability is not required for proper migration.

Reviewed-by: Joao Martins
Signed-off-by: Liran Alon
Reviewed-by: Maran Wilson
Tested-by: Maran Wilson
Signed-off-by: Paolo Bonzini
---
 target/i386/cpu.h      | 22 ++++++++++++++++++++++
 target/i386/kvm.c      | 26 ++++++--------------------
 target/i386/kvm_i386.h |  1 +
 target/i386/machine.c  | 24 ++++++++++++++++++++----
 4 files changed, 49 insertions(+), 24 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 05393cf..8b3dc55 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1877,6 +1877,28 @@ static inline bool cpu_has_vmx(CPUX86State *env)
     return env->features[FEAT_1_ECX] & CPUID_EXT_VMX;
 }
 
+/*
+ * In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
+ * Since it was set, CR4.VMXE must remain set as long as vCPU is in
+ * VMX operation. This is because CR4.VMXE is one of the bits set
+ * in MSR_IA32_VMX_CR4_FIXED1.
+ *
+ * There is one exception to above statement when vCPU enters SMM mode.
+ * When a vCPU enters SMM mode, it temporarily exit VMX operation and
+ * may also reset CR4.VMXE during execution in SMM mode.
+ * When vCPU exits SMM mode, vCPU state is restored to be in VMX operation
+ * and CR4.VMXE is restored to it's original value of being set.
+ *
+ * Therefore, when vCPU is not in SMM mode, we can infer whether
+ * VMX is being used by examining CR4.VMXE. Otherwise, we cannot
+ * know for certain.
+ */
+static inline bool cpu_vmx_maybe_enabled(CPUX86State *env)
+{
+    return cpu_has_vmx(env) &&
+           ((env->cr[4] & CR4_VMXE_MASK) || (env->hflags & HF_SMM_MASK));
+}
+
 /* fpu_helper.c */
 void update_fp_status(CPUX86State *env);
 void update_mxcsr_status(CPUX86State *env);
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index ec7870c..4542f0f 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -128,6 +128,11 @@ bool kvm_has_adjust_clock_stable(void)
     return (ret == KVM_CLOCK_TSC_STABLE);
 }
 
+bool kvm_has_exception_payload(void)
+{
+    return has_exception_payload;
+}
+
 bool kvm_allows_irq0_override(void)
 {
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
@@ -1342,7 +1347,6 @@ static int hyperv_init_vcpu(X86CPU *cpu)
 }
 
 static Error *invtsc_mig_blocker;
-static Error *nested_virt_mig_blocker;
 
 #define KVM_MAX_CPUID_ENTRIES  100
 
@@ -1653,22 +1657,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
                                   !!(c->ecx & CPUID_EXT_SMX);
     }
 
-    if (cpu_has_vmx(env) && !nested_virt_mig_blocker &&
-        ((kvm_max_nested_state_length() <= 0) || !has_exception_payload)) {
-        error_setg(&nested_virt_mig_blocker,
-                   "Kernel do not provide required capabilities for "
-                   "nested virtualization migration. "
-                   "(CAP_NESTED_STATE=%d, CAP_EXCEPTION_PAYLOAD=%d)",
-                   kvm_max_nested_state_length() > 0,
-                   has_exception_payload);
-        r = migrate_add_blocker(nested_virt_mig_blocker, &local_err);
-        if (local_err) {
-            error_report_err(local_err);
-            error_free(nested_virt_mig_blocker);
-            return r;
-        }
-    }
-
     if (env->mcg_cap & MCG_LMCE_P) {
         has_msr_mcg_ext_ctl = has_msr_feature_control = true;
     }
@@ -1683,7 +1671,7 @@ int kvm_arch_init_vcpu(CPUState *cs)
             if (local_err) {
                 error_report_err(local_err);
                 error_free(invtsc_mig_blocker);
-                goto fail2;
+                return r;
             }
         }
     }
@@ -1752,8 +1740,6 @@ int kvm_arch_init_vcpu(CPUState *cs)
 
  fail:
     migrate_del_blocker(invtsc_mig_blocker);
- fail2:
-    migrate_del_blocker(nested_virt_mig_blocker);
 
     return r;
 }
diff --git a/target/i386/kvm_i386.h b/target/i386/kvm_i386.h
index 3057ba4..06fe06b 100644
--- a/target/i386/kvm_i386.h
+++ b/target/i386/kvm_i386.h
@@ -35,6 +35,7 @@
 bool kvm_allows_irq0_override(void);
 bool kvm_has_smm(void);
 bool kvm_has_adjust_clock_stable(void);
+bool kvm_has_exception_payload(void);
 void kvm_synchronize_all_tsc(void);
 void kvm_arch_reset_vcpu(X86CPU *cs);
 void kvm_arch_do_init_vcpu(X86CPU *cs);
diff --git a/target/i386/machine.c b/target/i386/machine.c
index 704ba6d..ac2d1d1 100644
--- a/target/i386/machine.c
+++ b/target/i386/machine.c
@@ -7,6 +7,7 @@
 #include "hw/isa/isa.h"
 #include "migration/cpu.h"
 #include "hyperv.h"
+#include "kvm_i386.h"
 
 #include "sysemu/kvm.h"
 #include "sysemu/tcg.h"
@@ -232,10 +233,25 @@ static int cpu_pre_save(void *opaque)
     }
 
 #ifdef CONFIG_KVM
-    /* Verify we have nested virtualization state from kernel if required */
-    if (kvm_enabled() && cpu_has_vmx(env) && !env->nested_state) {
-        error_report("Guest enabled nested virtualization but kernel "
-                "does not support saving of nested state");
+    /*
+     * In case vCPU may have enabled VMX, we need to make sure kernel have
+     * required capabilities in order to perform migration correctly:
+     *
+     * 1) We must be able to extract vCPU nested-state from KVM.
+     *
+     * 2) In case vCPU is running in guest-mode and it has a pending exception,
+     * we must be able to determine if it's in a pending or injected state.
+     * Note that in case KVM don't have required capability to do so,
+     * a pending/injected exception will always appear as an
+     * injected exception.
+     */
+    if (kvm_enabled() && cpu_vmx_maybe_enabled(env) &&
+        (!env->nested_state ||
+         (!kvm_has_exception_payload() && (env->hflags & HF_GUEST_MASK) &&
+          env->exception_injected))) {
+        error_report("Guest maybe enabled nested virtualization but kernel "
+                "does not support required capabilities to save vCPU "
+                "nested state");
         return -EINVAL;
     }
 #endif
-- 
1.8.3.1
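
For readers who want to try the pre-save decision outside of QEMU, here is a minimal standalone sketch of the rule the commit message describes. It is not the QEMU code: struct sk_vcpu, the SK_* masks and both sk_* functions are hypothetical stand-ins for CPUX86State, CR4_VMXE_MASK, HF_SMM_MASK, HF_GUEST_MASK, cpu_vmx_maybe_enabled() and the check in cpu_pre_save(), and the bit positions are chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values (assumption); QEMU defines the real masks. */
#define SK_CR4_VMXE  (1u << 13)
#define SK_HF_SMM    (1u << 19)
#define SK_HF_GUEST  (1u << 21)

/* Hypothetical, reduced vCPU state: only the fields the check reads. */
struct sk_vcpu {
    bool has_vmx;             /* VMX is exposed to the guest via CPUID */
    uint32_t cr4;
    uint32_t hflags;
    bool have_nested_state;   /* kernel gave us a nested-state buffer */
    bool exception_injected;  /* an exception is pending or injected */
};

/* VMX may be in use if CR4.VMXE is set, or if the vCPU is in SMM
 * (where CR4.VMXE can be temporarily clear). */
static bool sk_vmx_maybe_enabled(const struct sk_vcpu *c)
{
    return c->has_vmx &&
           ((c->cr4 & SK_CR4_VMXE) || (c->hflags & SK_HF_SMM));
}

/* Returns true when saving the vCPU must be refused. */
static bool sk_pre_save_must_fail(const struct sk_vcpu *c,
                                  bool has_exception_payload)
{
    if (!sk_vmx_maybe_enabled(c)) {
        return false;         /* no kernel capability is demanded */
    }
    if (!c->have_nested_state) {
        return true;          /* cannot extract nested state */
    }
    /* The payload capability only matters in guest-mode with an
     * exception outstanding. */
    return !has_exception_payload &&
           (c->hflags & SK_HF_GUEST) && c->exception_injected;
}

/* Small self-check covering the two extremes of the rule. */
static void sk_self_check(void)
{
    struct sk_vcpu idle = { .has_vmx = true };
    struct sk_vcpu smm = { .has_vmx = true, .hflags = SK_HF_SMM };

    assert(!sk_pre_save_must_fail(&idle, false));
    assert(sk_pre_save_must_fail(&smm, true));
}
```

In this sketch a guest that has VMX exposed but never set CR4.VMXE migrates even on a kernel lacking both capabilities, which is the behaviour change the patch is after. A vCPU in SMM is treated conservatively and still requires nested state.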