From nobody Sun Feb 8 00:12:07 2026 Received: from out-182.mta0.migadu.com (out-182.mta0.migadu.com [91.218.175.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 898A62EDD50 for ; Mon, 10 Nov 2025 22:29:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813788; cv=none; b=eHFHWzaBMEots1lzWW17aNrGxDYNAHxQ/rNEuu3SM9l1SaaBZnzOWGdtMYQkuUM8knMT0qGAYz/Lk7uGZv6ej/Yv3VYrg5un+rWw6Zvaj24wmosMbSF3suNQZBTqscW6vRMCq+u9PinWP512UQB8cT/T5apf/3Sn2siC1KOgZnc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813788; c=relaxed/simple; bh=Ka5ryjJbye53/XhbocNd3xnqwC4X/HCy4iBxVgetqus=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=l6sNlc/t6IjYKf3FmXedI1mt8xo34sRm8DdFPuK/cmEOEUN48jBBiY5XkH8NcuPAnIelWu2cp8O5C5h5j8URtRJAt0puyWEkWNdGzkFtjVmDmqn4Re5+ZFLgFICzNQwrAwOHsyhdGVuMh52PLxxQ4h/8zqCTMw4eCovbVrT5aCY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=QT8FRsqi; arc=none smtp.client-ip=91.218.175.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="QT8FRsqi" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813783; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=B5i4M/pS5dso2YXNgZqjF8i6MR1L41opDR7U4EI5R90=; b=QT8FRsqi11bGmc8qgEysnXteU/hhMPMPerNeV2iYIiyykAdoUVtJN5CGELNbQp+ghr7mur 2u6ynutZ5nhnRRS7g9weQ/Opt/0KVnkKa3JFwK2+lHS6WTCcYKoXr2Kb8ATXfwNRxb2QHA lBaFLoJcjajODwtnIdTldePnaAsa0HI= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed , stable@vger.kernel.org Subject: [PATCH v2 01/13] KVM: SVM: Switch svm_copy_lbrs() to a macro Date: Mon, 10 Nov 2025 22:29:10 +0000 Message-ID: <20251110222922.613224-2-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" In preparation for using svm_copy_lbrs() with 'struct vmcb_save_area' without a containing 'struct vmcb', and later even 'struct vmcb_save_area_cached', make it a macro. Pull the call to vmcb_mark_dirty() out to the callers. Macros are generally not preferred compared to functions, mainly due to type-safety. However, in this case it seems like having a simple macro copying a few fields is better than copy-pasting the same 5 lines of code in different places. On the bright side, pulling vmcb_mark_dirty() calls to the callers makes it clear that in one case, vmcb_mark_dirty() was being called on VMCB12. It is not architecturally defined for the CPU to clear arbitrary clean bits, and it is not needed, so drop that one call. 
Technically fixes the non-architectural behavior of setting the dirty bit on VMCB12. Fixes: d20c796ca370 ("KVM: x86: nSVM: implement nested LBR virtualization") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 16 ++++++++++------ arch/x86/kvm/svm/svm.c | 11 ----------- arch/x86/kvm/svm/svm.h | 10 +++++++++- 3 files changed, 19 insertions(+), 18 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index da6e80b3ac353..a37bd5c1f36fa 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -675,10 +675,12 @@ static void nested_vmcb02_prepare_save(struct vcpu_sv= m *svm, struct vmcb *vmcb12 * Reserved bits of DEBUGCTL are ignored. Be consistent with * svm_set_msr's definition of reserved bits. */ - svm_copy_lbrs(vmcb02, vmcb12); + svm_copy_lbrs(&vmcb02->save, &vmcb12->save); + vmcb_mark_dirty(vmcb02, VMCB_LBR); vmcb02->save.dbgctl &=3D ~DEBUGCTL_RESERVED_BITS; } else { - svm_copy_lbrs(vmcb02, vmcb01); + svm_copy_lbrs(&vmcb02->save, &vmcb01->save); + vmcb_mark_dirty(vmcb02, VMCB_LBR); } svm_update_lbrv(&svm->vcpu); } @@ -1184,10 +1186,12 @@ int nested_svm_vmexit(struct vcpu_svm *svm) kvm_make_request(KVM_REQ_EVENT, &svm->vcpu); =20 if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) && - (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) - svm_copy_lbrs(vmcb12, vmcb02); - else - svm_copy_lbrs(vmcb01, vmcb02); + (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + svm_copy_lbrs(&vmcb12->save, &vmcb02->save); + } else { + svm_copy_lbrs(&vmcb01->save, &vmcb02->save); + vmcb_mark_dirty(vmcb01, VMCB_LBR); + } =20 svm_update_lbrv(vcpu); =20 diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 10c21e4c5406f..711276e8ee84f 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -795,17 +795,6 @@ static void svm_recalc_msr_intercepts(struct kvm_vcpu = *vcpu) */ } =20 -void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb) -{ - to_vmcb->save.dbgctl =3D 
from_vmcb->save.dbgctl; - to_vmcb->save.br_from =3D from_vmcb->save.br_from; - to_vmcb->save.br_to =3D from_vmcb->save.br_to; - to_vmcb->save.last_excp_from =3D from_vmcb->save.last_excp_from; - to_vmcb->save.last_excp_to =3D from_vmcb->save.last_excp_to; - - vmcb_mark_dirty(to_vmcb, VMCB_LBR); -} - static void __svm_enable_lbrv(struct kvm_vcpu *vcpu) { to_svm(vcpu)->vmcb->control.virt_ext |=3D LBR_CTL_ENABLE_MASK; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index c856d8e0f95e7..f6fb70ddf7272 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -687,8 +687,16 @@ static inline void *svm_vcpu_alloc_msrpm(void) return svm_alloc_permissions_map(MSRPM_SIZE, GFP_KERNEL_ACCOUNT); } =20 +#define svm_copy_lbrs(to, from) \ +({ \ + (to)->dbgctl =3D (from)->dbgctl; \ + (to)->br_from =3D (from)->br_from; \ + (to)->br_to =3D (from)->br_to; \ + (to)->last_excp_from =3D (from)->last_excp_from; \ + (to)->last_excp_to =3D (from)->last_excp_to; \ +}) + void svm_vcpu_free_msrpm(void *msrpm); -void svm_copy_lbrs(struct vmcb *to_vmcb, struct vmcb *from_vmcb); void svm_enable_lbrv(struct kvm_vcpu *vcpu); void svm_update_lbrv(struct kvm_vcpu *vcpu); =20 --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-184.mta0.migadu.com (out-184.mta0.migadu.com [91.218.175.184]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 43818330B1F for ; Mon, 10 Nov 2025 22:29:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.184 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813790; cv=none; b=KTZ4p+tyu3FzujrtF0Mg69ebmSo0evMlMC3KPEu1WOHXKrQrR1vjl8DeG+Ku5PLP9lV8S5GsoQWCg8L7aQNyPRr6J8QZX40wIq3H43CBpKgyj07+8Ual4VVGSSoKdweyPre2k8UK20LoY+R+8hMQ8+PYZzxYHrXw/tGceSQygWo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813790; 
c=relaxed/simple; bh=xFbdQlplq0WqCRJ78rpGzqY3fFN0yCjW8nbOWqfddkQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=o0DQYMiadavZNChI7DaaUHuv32/qNQwEW28c8z0ImWSjf2gtDm5S7G1EG9Fm7T2cRGL0lVxHKYGFFLZsLqr1JuxeuK4qBWP6UNoaJMqVqtpLh/lvVnBnlXi8OhqpPw01iHc3/iSKbMjduiEYLArHOJeAJqclKH1C1DQFFAlqAo0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=taVGA8jA; arc=none smtp.client-ip=91.218.175.184 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="taVGA8jA" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813785; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TcWwaqQvIuhlseQmCNLeqyZH+QJu2fp9KDcT1knut6I=; b=taVGA8jAEgPSWJPPg0uEVYU8GRVY9Xh4j3a/JvYdRh2ygnMYKSqAuEZWbd5HTf0vBdwVLi 2tsH7RW8Wwj0lJAX5P3zSLWHqmUBZssnWdMYuxQwaiK1ZFPySvQtoTM8kDAGRzkQBEO09h Ml9UzgCgxBoHEMu1Ekk5qaOXhLyvoE0= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed , stable@vger.kernel.org Subject: [PATCH v2 02/13] KVM: SVM: Add missing save/restore handling of LBR MSRs Date: Mon, 10 Nov 2025 22:29:11 +0000 Message-ID: <20251110222922.613224-3-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: 
bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" MSR_IA32_DEBUGCTLMSR and LBR MSRs are currently not enumerated by KVM_GET_MSR_INDEX_LIST, and LBR MSRs cannot be set with KVM_SET_MSRS. So save/restore is completely broken. Fix it by adding the MSRs to msrs_to_save_base, and allowing writes to LBR MSRs from userspace only (as they are read-only MSRs). Additionally, to correctly restore L1's LBRs while L2 is running, make sure the LBRs are copied from the captured VMCB01 save area in svm_copy_vmrun_state(). Fixes: 24e09cbf480a ("KVM: SVM: enable LBR virtualization") Cc: stable@vger.kernel.org Reported-by: Jim Mattson Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 3 +++ arch/x86/kvm/svm/svm.c | 20 ++++++++++++++++++++ arch/x86/kvm/x86.c | 3 +++ 3 files changed, 26 insertions(+) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index a37bd5c1f36fa..74211c5c68026 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1055,6 +1055,9 @@ void svm_copy_vmrun_state(struct vmcb_save_area *to_s= ave, to_save->isst_addr =3D from_save->isst_addr; to_save->ssp =3D from_save->ssp; } + + if (lbrv) + svm_copy_lbrs(to_save, from_save); } =20 void svm_copy_vmloadsave_state(struct vmcb *to_vmcb, struct vmcb *from_vmc= b) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 711276e8ee84f..af0e9c26527e3 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -2983,6 +2983,26 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct= msr_data *msr) vmcb_mark_dirty(svm->vmcb, VMCB_LBR); svm_update_lbrv(vcpu); break; + case MSR_IA32_LASTBRANCHFROMIP: + if (!msr->host_initiated) + return 1; + svm->vmcb->save.br_from =3D data; + break; + case MSR_IA32_LASTBRANCHTOIP: + if (!msr->host_initiated) + return 1; + svm->vmcb->save.br_to =3D data; + break; + 
case MSR_IA32_LASTINTFROMIP: + if (!msr->host_initiated) + return 1; + svm->vmcb->save.last_excp_from =3D data; + break; + case MSR_IA32_LASTINTTOIP: + if (!msr->host_initiated) + return 1; + svm->vmcb->save.last_excp_to =3D data; + break; case MSR_VM_HSAVE_PA: /* * Old kernels did not validate the value written to diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c9c2aa6f4705e..9cb824f9cf644 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -348,6 +348,9 @@ static const u32 msrs_to_save_base[] =3D { MSR_IA32_U_CET, MSR_IA32_S_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, + MSR_IA32_DEBUGCTLMSR, + MSR_IA32_LASTBRANCHFROMIP, MSR_IA32_LASTBRANCHTOIP, + MSR_IA32_LASTINTFROMIP, MSR_IA32_LASTINTTOIP, }; =20 static const u32 msrs_to_save_pmu[] =3D { --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-185.mta0.migadu.com (out-185.mta0.migadu.com [91.218.175.185]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF8BB334683 for ; Mon, 10 Nov 2025 22:29:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.185 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813790; cv=none; b=nF+1SpEGxGKz7sT4utgCQkjhdddSKEa0wGAfLdudSKYndk9T0oQilPF7vVVd6hiuB72v54R2+kg32OwifLuzk6zzwa3tY5ODthOaNOLLwhawpfuDjYoqorguhmTuk1+LZe5syj/Foe2J8pc9vHbT0H5QtH5KWTjMvpghXmTqKgQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813790; c=relaxed/simple; bh=xLkyk4AAk1tD5E/VGsWEZsoK4aysRr0fOWyfwvnKtik=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bOgj3suyLmnMIWmXGMsEpmA/UJVMjeftieB6rdC8ujEzhafy0zFFeLteqop3IJTYrF0xKAbVEM6sxQhQ41fOyeTWjFwgvmmgC04+sjPZMpCajMSd9jycEarTqUXCnziUOcZwHjr457gBigikThFZTfg/FYynvPTJNrVf6WeDA3I= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=ruD9vbzY; arc=none smtp.client-ip=91.218.175.185 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="ruD9vbzY" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813787; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=itMlBIbN1PqufgHbzCymjjQHzxqWc8TraVJJMw770PU=; b=ruD9vbzY9dUe0vOPOLwxzofM4ZsBilpwvh16XUJLWubQXbIEZA8UAyPlavFNSqZ+FN31y4 Dihpk4qm1zW+0/zmKw6po7HX6/pRW0rus3z8pnol0CT4htPoOFH+BX+2beOw++euwuE9K2 pS6T16myesMOyb9BUPW9lKuhKcUDCDY= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 03/13] KVM: selftests: Add a test for LBR save/restore (ft. nested) Date: Mon, 10 Nov 2025 22:29:12 +0000 Message-ID: <20251110222922.613224-4-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Add a selftest exercising save/restore with usage of LBRs in both L1 and L2, and making sure all LBRs remain intact. 
Signed-off-by: Yosry Ahmed --- tools/testing/selftests/kvm/Makefile.kvm | 1 + .../selftests/kvm/include/x86/processor.h | 5 + .../selftests/kvm/x86/svm_lbr_nested_state.c | 155 ++++++++++++++++++ 3 files changed, 161 insertions(+) create mode 100644 tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selft= ests/kvm/Makefile.kvm index 148d427ff24be..9a19554ffd3c1 100644 --- a/tools/testing/selftests/kvm/Makefile.kvm +++ b/tools/testing/selftests/kvm/Makefile.kvm @@ -105,6 +105,7 @@ TEST_GEN_PROGS_x86 +=3D x86/svm_vmcall_test TEST_GEN_PROGS_x86 +=3D x86/svm_int_ctl_test TEST_GEN_PROGS_x86 +=3D x86/svm_nested_shutdown_test TEST_GEN_PROGS_x86 +=3D x86/svm_nested_soft_inject_test +TEST_GEN_PROGS_x86 +=3D x86/svm_lbr_nested_state TEST_GEN_PROGS_x86 +=3D x86/tsc_scaling_sync TEST_GEN_PROGS_x86 +=3D x86/sync_regs_test TEST_GEN_PROGS_x86 +=3D x86/ucna_injection_test diff --git a/tools/testing/selftests/kvm/include/x86/processor.h b/tools/te= sting/selftests/kvm/include/x86/processor.h index 51cd84b9ca664..aee4b83c47b19 100644 --- a/tools/testing/selftests/kvm/include/x86/processor.h +++ b/tools/testing/selftests/kvm/include/x86/processor.h @@ -1367,6 +1367,11 @@ static inline bool kvm_is_ignore_msrs(void) return get_kvm_param_bool("ignore_msrs"); } =20 +static inline bool kvm_is_lbrv_enabled(void) +{ + return !!get_kvm_amd_param_integer("lbrv"); +} + uint64_t *__vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr, int *level); uint64_t *vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr); diff --git a/tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c b/tools= /testing/selftests/kvm/x86/svm_lbr_nested_state.c new file mode 100644 index 0000000000000..a343279546fd8 --- /dev/null +++ b/tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c @@ -0,0 +1,155 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * svm_lbr_nested_state + * + * Test that LBRs are maintained correctly in both L1 
and L2 during + * save/restore. + * + * Copyright (C) 2025, Google, Inc. + */ + +#include "test_util.h" +#include "kvm_util.h" +#include "processor.h" +#include "svm_util.h" + + +#define L2_GUEST_STACK_SIZE 64 + +#define DO_BRANCH() asm volatile("jmp 1f\n 1: nop") + +struct lbr_branch { + u64 from, to; +}; + +volatile struct lbr_branch l2_branch; + +#define RECORD_BRANCH(b, s) \ +({ \ + wrmsr(MSR_IA32_DEBUGCTLMSR, DEBUGCTLMSR_LBR); \ + DO_BRANCH(); \ + (b)->from =3D rdmsr(MSR_IA32_LASTBRANCHFROMIP); \ + (b)->to =3D rdmsr(MSR_IA32_LASTBRANCHTOIP); \ + /* Disable LBR right after to avoid overwriting the IPs */ \ + wrmsr(MSR_IA32_DEBUGCTLMSR, 0); \ + \ + GUEST_ASSERT_NE((b)->from, 0); \ + GUEST_ASSERT_NE((b)->to, 0); \ + GUEST_PRINTF("%s: (0x%lx, 0x%lx)\n", (s), (b)->from, (b)->to); \ +}) \ + +#define CHECK_BRANCH_MSRS(b) \ +({ \ + GUEST_ASSERT_EQ((b)->from, rdmsr(MSR_IA32_LASTBRANCHFROMIP)); \ + GUEST_ASSERT_EQ((b)->to, rdmsr(MSR_IA32_LASTBRANCHTOIP)); \ +}) + +#define CHECK_BRANCH_VMCB(b, vmcb) \ +({ \ + GUEST_ASSERT_EQ((b)->from, vmcb->save.br_from); \ + GUEST_ASSERT_EQ((b)->to, vmcb->save.br_to); \ +}) \ + +static void l2_guest_code(struct svm_test_data *svm) +{ + /* Record a branch, trigger save/restore, and make sure LBRs are intact */ + RECORD_BRANCH(&l2_branch, "L2 branch"); + GUEST_SYNC(true); + CHECK_BRANCH_MSRS(&l2_branch); + vmmcall(); +} + +static void l1_guest_code(struct svm_test_data *svm, bool nested_lbrv) +{ + unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE]; + struct vmcb *vmcb =3D svm->vmcb; + struct lbr_branch l1_branch; + + /* Record a branch, trigger save/restore, and make sure LBRs are intact */ + RECORD_BRANCH(&l1_branch, "L1 branch"); + GUEST_SYNC(true); + CHECK_BRANCH_MSRS(&l1_branch); + + /* Run L2, which will also do the same */ + generic_svm_setup(svm, l2_guest_code, + &l2_guest_stack[L2_GUEST_STACK_SIZE]); + + if (nested_lbrv) + vmcb->control.virt_ext =3D LBR_CTL_ENABLE_MASK; + else + vmcb->control.virt_ext &=3D ~LBR_CTL_ENABLE_MASK; + 
run_guest(vmcb, svm->vmcb_gpa); + GUEST_ASSERT(svm->vmcb->control.exit_code =3D=3D SVM_EXIT_VMMCALL); + + /* Trigger save/restore one more time before checking, just for kicks */ + GUEST_SYNC(true); + + /* + * If LBR_CTL_ENABLE is set, L1 and L2 should have separate LBR MSRs, so + * expect L1's LBRs to remain intact and L2 LBRs to be in the VMCB. + * Otherwise, the MSRs are shared between L1 & L2 so expect L2's LBRs. + */ + if (nested_lbrv) { + CHECK_BRANCH_MSRS(&l1_branch); + CHECK_BRANCH_VMCB(&l2_branch, vmcb); + } else { + CHECK_BRANCH_MSRS(&l2_branch); + } + GUEST_DONE(); +} + +void test_lbrv_nested_state(bool nested_lbrv) +{ + struct kvm_x86_state *state =3D NULL; + struct kvm_vcpu *vcpu; + vm_vaddr_t svm_gva; + struct kvm_vm *vm; + struct ucall uc; + + pr_info("Testing with nested LBRV %s\n", nested_lbrv ? "enabled" : "disab= led"); + + vm =3D vm_create_with_one_vcpu(&vcpu, l1_guest_code); + vcpu_alloc_svm(vm, &svm_gva); + vcpu_args_set(vcpu, 2, svm_gva, nested_lbrv); + + for (;;) { + vcpu_run(vcpu); + TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_IO); + switch (get_ucall(vcpu, &uc)) { + case UCALL_SYNC: + /* Save the vCPU state and restore it in a new VM on sync */ + pr_info("Guest triggered save/restore.\n"); + state =3D vcpu_save_state(vcpu); + kvm_vm_release(vm); + vcpu =3D vm_recreate_with_one_vcpu(vm); + vcpu_load_state(vcpu, state); + break; + case UCALL_ABORT: + REPORT_GUEST_ASSERT(uc); + /* NOT REACHED */ + case UCALL_DONE: + goto done; + case UCALL_PRINTF: + pr_info("%s", uc.buffer); + break; + default: + TEST_FAIL("Unknown ucall %lu", uc.cmd); + } + } +done: + if (state) + kvm_x86_state_cleanup(state); + kvm_vm_free(vm); +} + +int main(int argc, char *argv[]) +{ + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM)); + TEST_REQUIRE(kvm_is_lbrv_enabled()); + + test_lbrv_nested_state(/*nested_lbrv=3D*/false); + test_lbrv_nested_state(/*nested_lbrv=3D*/true); + + return 0; +} --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from 
out-174.mta0.migadu.com (out-174.mta0.migadu.com [91.218.175.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0E9AE33FE22; Mon, 10 Nov 2025 22:29:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813793; cv=none; b=PiIu3XGN9k3RN2GncWNBi/W3aMxCLyazW+d97w7hUVsZ21FMmE2vrqpvxKcHr9dgrAgw0BWBpvzAcTZBHdC1TEaTMbgNLQaQwAineLRZ878TsS6XMzEw0mQAXY2dQwlwtvl+htiCVUMIBO5S3pf09aoUIC5c7NWbBkSZr/DHGLI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813793; c=relaxed/simple; bh=9DRor7lV4Re2abUUUUzQvXd6E2q83n/HLL7sD5H893U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=AsDOnGRR5OORB+7szjl+xtC/FayPE9iDmwXRKGIFrLZBLxCvU5p2tv/sni6kzAaIkq2uOupwx7LALGtpz6F+CUgVqZl35r7Jgyde8E1fAo9mhEJULOY5fm4Cl+kU3UmmMmjwRPOKF52dLxKp8klLhgQwwYO+uCsdmgj6wHLXgCs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=RzC3TX2/; arc=none smtp.client-ip=91.218.175.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="RzC3TX2/" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813788; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fp6La+n8eNRyqu4lEfbbzhDmUUR1lpJy82ZrTTptZlk=; b=RzC3TX2/itPXUGhg55lbr/4d2BdQ77NpaA7a6Dpfj3QkiVIFVNWwah2/Um4MWsl0Cjitjr dbgE1fyl9m5csn8qLT/hMTkbA22BR/oZa8025eLrL1v8gxWO/OYMVpvU/J+U2B/VNHhI5G z7AnoyC3T3E/++NR8J3hpn22mMCd0nk= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed , stable@vger.kernel.org Subject: [PATCH v2 04/13] KVM: nSVM: Fix consistency checks for NP_ENABLE Date: Mon, 10 Nov 2025 22:29:13 +0000 Message-ID: <20251110222922.613224-5-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT KVM currently fails a nested VMRUN and injects VMEXIT_INVALID (aka SVM_EXIT_ERR) if L1 sets NP_ENABLE and the host does not support NPTs. At first glance, it seems like the check should actually be for guest_cpu_cap_has(X86_FEATURE_NPT) instead, as it is possible for the host to support NPTs but the guest CPUID to not advertise it. However, the consistency check is not architectural to begin with. The APM does not mention VMEXIT_INVALID if NP_ENABLE is set on a processor that does not have X86_FEATURE_NPT. Hence, NP_ENABLE should be ignored if X86_FEATURE_NPT is not available for L1. Apart from the consistency check, this is currently the case because NP_ENABLE is actually copied from VMCB01 to VMCB02, not from VMCB12. 
On the other hand, the APM does mention two other consistency checks for NP_ENABLE, both of which are missing (paraphrased): In Volume #2, 15.25.3 (24593=E2=80=94Rev. 3.42=E2=80=94March 2024): If VMRUN is executed with hCR0.PG cleared to zero and NP_ENABLE set to 1, VMRUN terminates with #VMEXIT(VMEXIT_INVALID) In Volume #2, 15.25.4 (24593=E2=80=94Rev. 3.42=E2=80=94March 2024): When VMRUN is executed with nested paging enabled (NP_ENABLE =3D 1), the following conditions are considered illegal state combinations, in addition to those mentioned in =E2=80=9CCanonicalization and Consistency Checks=E2=80=9D: =E2=80=A2 Any MBZ bit of nCR3 is set. =E2=80=A2 Any G_PAT.PA field has an unsupported type encoding or any reserved field in G_PAT has a nonzero value. Replace the existing consistency check with consistency checks on hCR0.PG and nCR3. Only perform the consistency checks if L1 has X86_FEATURE_NPT and NP_ENABLE is set in VMCB12. The G_PAT consistency check will be addressed separately. As it is now possible for an L1 to run L2 with NP_ENABLE set but ignored, also check that L1 has X86_FEATURE_NPT in nested_npt_enabled(). Pass L1's CR0 to __nested_vmcb_check_controls(). In nested_vmcb_check_controls(), L1's CR0 is available through kvm_read_cr0(), as vcpu->arch.cr0 is not updated to L2's CR0 until later through nested_vmcb02_prepare_save() -> svm_set_cr0(). In svm_set_nested_state(), L1's CR0 is available in the captured save area, as svm_get_nested_state() captures L1's save area when running L2, and L1's CR0 is stashed in VMCB01 on nested VMRUN (in nested_svm_vmrun()). 
Fixes: 4b16184c1cca ("KVM: SVM: Initialize Nested Nested MMU context on VMR= UN") Cc: stable@vger.kernel.org Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 21 ++++++++++++++++----- arch/x86/kvm/svm/svm.h | 3 ++- 2 files changed, 18 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 74211c5c68026..87bcc5eff96e8 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -325,7 +325,8 @@ static bool nested_svm_check_bitmap_pa(struct kvm_vcpu = *vcpu, u64 pa, u32 size) } =20 static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu, - struct vmcb_ctrl_area_cached *control) + struct vmcb_ctrl_area_cached *control, + unsigned long l1_cr0) { if (CC(!vmcb12_is_intercept(control, INTERCEPT_VMRUN))) return false; @@ -333,8 +334,12 @@ static bool __nested_vmcb_check_controls(struct kvm_vc= pu *vcpu, if (CC(control->asid =3D=3D 0)) return false; =20 - if (CC((control->nested_ctl & SVM_NESTED_CTL_NP_ENABLE) && !npt_enabled)) - return false; + if (nested_npt_enabled(to_svm(vcpu))) { + if (CC(!kvm_vcpu_is_legal_gpa(vcpu, control->nested_cr3))) + return false; + if (CC(!(l1_cr0 & X86_CR0_PG))) + return false; + } =20 if (CC(!nested_svm_check_bitmap_pa(vcpu, control->msrpm_base_pa, MSRPM_SIZE))) @@ -400,7 +405,12 @@ static bool nested_vmcb_check_controls(struct kvm_vcpu= *vcpu) struct vcpu_svm *svm =3D to_svm(vcpu); struct vmcb_ctrl_area_cached *ctl =3D &svm->nested.ctl; =20 - return __nested_vmcb_check_controls(vcpu, ctl); + /* + * Make sure we did not enter guest mode yet, in which case + * kvm_read_cr0() could return L2's CR0. 
+ */ + WARN_ON_ONCE(is_guest_mode(vcpu)); + return __nested_vmcb_check_controls(vcpu, ctl, kvm_read_cr0(vcpu)); } =20 static @@ -1831,7 +1841,8 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu, =20 ret =3D -EINVAL; __nested_copy_vmcb_control_to_cache(vcpu, &ctl_cached, ctl); - if (!__nested_vmcb_check_controls(vcpu, &ctl_cached)) + /* 'save' contains L1 state saved from before VMRUN */ + if (!__nested_vmcb_check_controls(vcpu, &ctl_cached, save->cr0)) goto out_free; =20 /* diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index f6fb70ddf7272..3e805a43ffcdb 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -552,7 +552,8 @@ static inline bool gif_set(struct vcpu_svm *svm) =20 static inline bool nested_npt_enabled(struct vcpu_svm *svm) { - return svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; + return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_NPT) && + svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; } =20 static inline bool nested_vnmi_enabled(struct vcpu_svm *svm) --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-178.mta0.migadu.com (out-178.mta0.migadu.com [91.218.175.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66904342C80 for ; Mon, 10 Nov 2025 22:29:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813794; cv=none; b=ZL8bWPUEczoavJb+s9nAzj7HEtsbSujWjwZgHFfMe0gMUy19yYMz2J3EgOA+8IKs9WbNCN/gAqO1L3PWjyT1S8HhpDyObGHZCLFyKKIGWJDdZn1dpUJQFWg47/7uFnBm62xwfZSxEACX/OGceAR3/5lUzfaZnO4z94Rh2Jd7vI8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813794; c=relaxed/simple; bh=xXgXymYjKd8awN8u9TFDRWtwimq9S5R/oZg0SkTw7gA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: 
MIME-Version:Content-Type; b=cKa31+CYiT/oxaSJ4CBqk/M5ry96Qwb12sgh9sEG5zoyVvdG850/lRjbBPtd/NA0ALUeikfuren00dFEsW+/3Omfnw/bGSpSA7h33UeOjDWNCsq2AhgvK46TgwXaZ+c2BSRk5wABFGMrRCkDICw8Sf7+6TvlKpgSHIXeFLlM6zw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=ht9GTSe+; arc=none smtp.client-ip=91.218.175.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="ht9GTSe+" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813790; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fBel3nHLRqwGeYPYTmzGhGXE3bf83TbOfhnE0QM9ssY=; b=ht9GTSe+EfQWzBpePdaASOBrsBwq+kpeEyjaW7akc8lQORRekfL/pOUZAlJ39B3aJu0ojM Ffrm6Py0GBBsyo2iLoTYOf2hgI/ruSaSnA51Rg2b4PHr2rl8MkZ6HaYd1A19bWn/2DQ/ZY pCH2h8s7Tq8ckM4XwidZT9/6Dz8kYy4= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 05/13] KVM: nSVM: Add missing consistency check for EFER, CR0, CR4, and CS Date: Mon, 10 Nov 2025 22:29:14 +0000 Message-ID: <20251110222922.613224-6-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT According to the APM Volume #2, 15.5, Canonicalization and Consistency Checks (24593=E2=80=94Rev. 3.42=E2=80=94March 2024), the following conditio= n (among others) results in a #VMEXIT with VMEXIT_INVALID (aka SVM_EXIT_ERR): EFER.LME, CR0.PG, CR4.PAE, CS.L, and CS.D are all non-zero. Add the missing consistency check. This is functionally a nop because the nested VMRUN results in SVM_EXIT_ERR in HW, which is forwarded to L1, but KVM makes all consistency checks before a VMRUN is actually attempted. Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 7 +++++++ arch/x86/kvm/svm/svm.h | 1 + 2 files changed, 8 insertions(+) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 87bcc5eff96e8..abdaacb04dd9e 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -380,6 +380,11 @@ static bool __nested_vmcb_check_save(struct kvm_vcpu *= vcpu, CC(!(save->cr0 & X86_CR0_PE)) || CC(!kvm_vcpu_is_legal_cr3(vcpu, save->cr3))) return false; + + if (CC((save->cr4 & X86_CR4_PAE) && + (save->cs.attrib & SVM_SELECTOR_L_MASK) && + (save->cs.attrib & SVM_SELECTOR_DB_MASK))) + return false; } =20 /* Note, SVM doesn't have any additional restrictions on CR4. */ @@ -473,6 +478,8 @@ static void __nested_copy_vmcb_save_to_cache(struct vmc= b_save_area_cached *to, * Copy only fields that are validated, as we need them * to avoid TOC/TOU races. 
*/ + to->cs =3D from->cs; + to->efer =3D from->efer; to->cr0 =3D from->cr0; to->cr3 =3D from->cr3; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 3e805a43ffcdb..a6913a0820125 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -142,6 +142,7 @@ struct kvm_vmcb_info { }; =20 struct vmcb_save_area_cached { + struct vmcb_seg cs; u64 efer; u64 cr4; u64 cr3; --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-189.mta0.migadu.com (out-189.mta0.migadu.com [91.218.175.189]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 18162343D6F; Mon, 10 Nov 2025 22:29:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.189 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813796; cv=none; b=NE7P0yLADE9MEyt57rVOavFoz8Gsnw4hXu7C7sWSfYxFahu96yKo9+avbf/rI0jfbublwQSrgvYVmzn5MfpcWDfO8UfiPnY6HTdIUmCqbdbW7XkM02j4J2hr4WuncytptBQ/RdaCQPjxP7mNILoK4JOBbes1hQBTyzG30IQfHpo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813796; c=relaxed/simple; bh=Cxffpccj5xFRhcWG0Ed5lwYp4MppoHShIW8aXH7ntGs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=V1mjyAONchwptIrziA1kiZTkBMaQAjaiHCeJYAlo+0R5fIbgHKNk6pVgDhtfY6/Jil658PE3AhyR29vMmc/MPfmO4d8F086Xb9VzGpPlHhNCcuYlgSZWoWFd8pAi77gDu+mAMNkGNqU3kPGchK9KnmKqhWt2jlEYEHANVB/N1Yk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=pTzAyI1G; arc=none smtp.client-ip=91.218.175.189 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="pTzAyI1G" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813792; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L3Fblg04o99kVliEUA69thwScEzBmyTkPPe3+2CH9YU=; b=pTzAyI1GuhP5YHK2FctXI/Ml8OZ2oWaqIqzhepcKnk4b33UZzo5+Zc+eepxsOJ5n5HugAr jLD8k0X9hiPpq0xgaaB+BDLmsCM2XCf1IkAk8XECIVkkTvPm2QT8yqHszTp90FerXuA82/ IOqoXWglxkCV2ZogdYQB7GvxeJOUk7I= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 06/13] KVM: nSVM: Add missing consistency check for event_inj Date: Mon, 10 Nov 2025 22:29:15 +0000 Message-ID: <20251110222922.613224-7-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT According to the APM Volume #2, 15.20 (24593=E2=80=94Rev. 3.42=E2=80=94Marc= h 2024): VMRUN exits with VMEXIT_INVALID error code if either: =E2=80=A2 Reserved values of TYPE have been specified, or =E2=80=A2 TYPE =3D 3 (exception) has been specified with a vector that do= es not correspond to an exception (this includes vector 2, which is an NMI, not an exception). Add the missing consistency checks to KVM. For the second point, inject VMEXIT_INVALID if the vector is anything but the vectors defined by the APM for exceptions. 
Reserved vectors are also considered invalid, which matches the HW behavior. Vector 9 (i.e. #CSO) is considered invalid because it is reserved on modern CPUs, and according to LLMs no CPUs exist supporting SVM and producing #CSOs. Defined exceptions could be different between virtual CPUs as new CPUs define new vectors. In a best effort to dynamically define the valid vectors, treat all currently defined vectors as valid except those obviously tied to a CPU feature: SHSTK -> #CP and SEV-ES -> #VC. As new vectors are defined, they can similarly be tied to corresponding CPU features. Invalid vectors on specific (e.g. old) CPUs that are missed by KVM should be rejected by HW anyway. Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 51 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index abdaacb04dd9e..418d6aa4e32e8 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -324,6 +324,54 @@ static bool nested_svm_check_bitmap_pa(struct kvm_vcpu= *vcpu, u64 pa, u32 size) kvm_vcpu_is_legal_gpa(vcpu, addr + size - 1); } =20 +static bool nested_svm_event_inj_valid_exept(struct kvm_vcpu *vcpu, u8 vec= tor) +{ + /* + * Vectors that do not correspond to a defined exception are invalid + * (including #NMI and reserved vectors). In a best effort to define + * valid exceptions based on the virtual CPU, make all exceptions always + * valid except those obviously tied to a CPU feature. 
+ */ + switch (vector) { + case DE_VECTOR: case DB_VECTOR: case BP_VECTOR: case OF_VECTOR: + case BR_VECTOR: case UD_VECTOR: case NM_VECTOR: case DF_VECTOR: + case TS_VECTOR: case NP_VECTOR: case SS_VECTOR: case GP_VECTOR: + case PF_VECTOR: case MF_VECTOR: case AC_VECTOR: case MC_VECTOR: + case XM_VECTOR: case HV_VECTOR: case SX_VECTOR: + return true; + case CP_VECTOR: + return guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK); + case VC_VECTOR: + return guest_cpu_cap_has(vcpu, X86_FEATURE_SEV_ES); + } + return false; +} + +/* + * According to the APM, VMRUN exits with SVM_EXIT_ERR if SVM_EVTINJ_VALID= is + * set and: + * - The type of event_inj is not one of the defined values. + * - The type is SVM_EVTINJ_TYPE_EXEPT, but the vector is not a valid exce= ption. + */ +static bool nested_svm_check_event_inj(struct kvm_vcpu *vcpu, u32 event_in= j) +{ + u32 type =3D event_inj & SVM_EVTINJ_TYPE_MASK; + u8 vector =3D event_inj & SVM_EVTINJ_VEC_MASK; + + if (!(event_inj & SVM_EVTINJ_VALID)) + return true; + + if (type !=3D SVM_EVTINJ_TYPE_INTR && type !=3D SVM_EVTINJ_TYPE_NMI && + type !=3D SVM_EVTINJ_TYPE_EXEPT && type !=3D SVM_EVTINJ_TYPE_SOFT) + return false; + + if (type =3D=3D SVM_EVTINJ_TYPE_EXEPT && + !nested_svm_event_inj_valid_exept(vcpu, vector)) + return false; + + return true; +} + static bool __nested_vmcb_check_controls(struct kvm_vcpu *vcpu, struct vmcb_ctrl_area_cached *control, unsigned long l1_cr0) @@ -353,6 +401,9 @@ static bool __nested_vmcb_check_controls(struct kvm_vcp= u *vcpu, return false; } =20 + if (CC(!nested_svm_check_event_inj(vcpu, control->event_inj))) + return false; + return true; } =20 --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB4EC3446D2 for ; Mon, 10 Nov 2025 22:29:55 +0000 
(UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813797; cv=none; b=faEGM6H5ODGrES4QFpWFZuQndblC6RSguw9ftAMbWV3UZw23kfQOrEA01MRd5TRMCtMixUEjcM7RTm+NYhJY7EbxMCLbCydu6QeCbNft8qBXj1pKmTgCbWPlgbbL4eDa/tWMwEBZ9a6hvLGhNEps6mFs1uSrGhNtAz+Lu2kqh2c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813797; c=relaxed/simple; bh=l0UxroigB+PnIz3u8yPbFO/o/PZWiBbGseabESzE+EQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=I9fQKr319c72ylcvwqa5cm8PSIelVY8qWb3V6r436/jdLMdfY5I7RfsiVqdrva/tiREQwQC6t+xVdB0ouAbjktPk0xT1+NTAEERMUrtje29qtx235jtBzQVRbKq6qY4OVq5hBlt3I63SwGXpOMTacz7cN7M6hWgWN66DolE1wXg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=SpfcTVyA; arc=none smtp.client-ip=91.218.175.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="SpfcTVyA" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813794; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UiUrvZex1V5sHdMUXPyCFkReR+0+PJj+VJ66yyn3+XA=; b=SpfcTVyAwmiHdZKC9dShgrdjPVTMpXL1t5MZz+hLdvq5W6Am8uImV4XY5Q0ZkcWR8OxAKM UwRnmZMiz7q586aXHB9hGQdv2IfEpM1IRTx1aZ5bxKqTeGe4r9iGtUVDU1asr46/pMfWGl 8aSvd/7surbHtQ8KCuKnizFy+UFPAZo= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 07/13] KVM: SVM: Rename vmcb->nested_ctl to vmcb->misc_ctl Date: Mon, 10 Nov 2025 22:29:16 +0000 Message-ID: <20251110222922.613224-8-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" The 'nested_ctl' field is misnamed. Although the first bit is for nested paging, the other defined bits are for SEV/SEV-ES. Other bits in the same field according to the APM (but not defined by KVM) include "Guest Mode Execution Trap", "Enable INVLPGB/TLBSYNC", and other control bits unrelated to 'nested'. There is nothing common among these bits, so just name the field misc_ctl. Also rename the flags accordingly. 
Signed-off-by: Yosry Ahmed --- arch/x86/include/asm/svm.h | 8 ++++---- arch/x86/kvm/svm/nested.c | 8 ++++---- arch/x86/kvm/svm/sev.c | 4 ++-- arch/x86/kvm/svm/svm.c | 4 ++-- arch/x86/kvm/svm/svm.h | 4 ++-- tools/testing/selftests/kvm/include/x86/svm.h | 6 +++--- 6 files changed, 17 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index 17f6c3fedeee7..76ec1d40e6461 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -142,7 +142,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area { u64 exit_info_2; u32 exit_int_info; u32 exit_int_info_err; - u64 nested_ctl; + u64 misc_ctl; u64 avic_vapic_bar; u64 ghcb_gpa; u32 event_inj; @@ -236,9 +236,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define SVM_IOIO_SIZE_MASK (7 << SVM_IOIO_SIZE_SHIFT) #define SVM_IOIO_ASIZE_MASK (7 << SVM_IOIO_ASIZE_SHIFT) =20 -#define SVM_NESTED_CTL_NP_ENABLE BIT(0) -#define SVM_NESTED_CTL_SEV_ENABLE BIT(1) -#define SVM_NESTED_CTL_SEV_ES_ENABLE BIT(2) +#define SVM_MISC_CTL_NP_ENABLE BIT(0) +#define SVM_MISC_CTL_SEV_ENABLE BIT(1) +#define SVM_MISC_CTL_SEV_ES_ENABLE BIT(2) =20 =20 #define SVM_TSC_RATIO_RSVD 0xffffff0000000000ULL diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 418d6aa4e32e8..2a5c3788f954b 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -492,7 +492,7 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcp= u *vcpu, to->exit_info_2 =3D from->exit_info_2; to->exit_int_info =3D from->exit_int_info; to->exit_int_info_err =3D from->exit_int_info_err; - to->nested_ctl =3D from->nested_ctl; + to->misc_ctl =3D from->misc_ctl; to->event_inj =3D from->event_inj; to->event_inj_err =3D from->event_inj_err; to->next_rip =3D from->next_rip; @@ -818,7 +818,7 @@ static void nested_vmcb02_prepare_control(struct vcpu_s= vm *svm, } =20 /* Copied from vmcb01. msrpm_base can be overwritten later. 
*/ - vmcb02->control.nested_ctl =3D vmcb01->control.nested_ctl; + vmcb02->control.misc_ctl =3D vmcb01->control.misc_ctl; vmcb02->control.iopm_base_pa =3D vmcb01->control.iopm_base_pa; vmcb02->control.msrpm_base_pa =3D vmcb01->control.msrpm_base_pa; =20 @@ -964,7 +964,7 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmc= b12_gpa, vmcb12->save.rip, vmcb12->control.int_ctl, vmcb12->control.event_inj, - vmcb12->control.nested_ctl, + vmcb12->control.misc_ctl, vmcb12->control.nested_cr3, vmcb12->save.cr3, KVM_ISA_SVM); @@ -1759,7 +1759,7 @@ static void nested_copy_vmcb_cache_to_control(struct = vmcb_control_area *dst, dst->exit_info_2 =3D from->exit_info_2; dst->exit_int_info =3D from->exit_int_info; dst->exit_int_info_err =3D from->exit_int_info_err; - dst->nested_ctl =3D from->nested_ctl; + dst->misc_ctl =3D from->misc_ctl; dst->event_inj =3D from->event_inj; dst->event_inj_err =3D from->event_inj_err; dst->next_rip =3D from->next_rip; diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index 0835c664fbfdb..4eff5cc43821a 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -4553,7 +4553,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm, bo= ol init_event) struct kvm_sev_info *sev =3D to_kvm_sev_info(svm->vcpu.kvm); struct vmcb *vmcb =3D svm->vmcb01.ptr; =20 - svm->vmcb->control.nested_ctl |=3D SVM_NESTED_CTL_SEV_ES_ENABLE; + svm->vmcb->control.misc_ctl |=3D SVM_MISC_CTL_SEV_ES_ENABLE; =20 /* * An SEV-ES guest requires a VMSA area that is a separate from the @@ -4624,7 +4624,7 @@ void sev_init_vmcb(struct vcpu_svm *svm, bool init_ev= ent) { struct kvm_vcpu *vcpu =3D &svm->vcpu; =20 - svm->vmcb->control.nested_ctl |=3D SVM_NESTED_CTL_SEV_ENABLE; + svm->vmcb->control.misc_ctl |=3D SVM_MISC_CTL_SEV_ENABLE; clr_exception_intercept(svm, UD_VECTOR); =20 /* diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index af0e9c26527e3..b5b4965e6bfdd 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1098,7 +1098,7 @@ 
static void init_vmcb(struct kvm_vcpu *vcpu, bool ini= t_event) =20 if (npt_enabled) { /* Setup VMCB for Nested Paging */ - control->nested_ctl |=3D SVM_NESTED_CTL_NP_ENABLE; + control->misc_ctl |=3D SVM_MISC_CTL_NP_ENABLE; svm_clr_intercept(svm, INTERCEPT_INVLPG); clr_exception_intercept(svm, PF_VECTOR); svm_clr_intercept(svm, INTERCEPT_CR3_READ); @@ -3273,7 +3273,7 @@ static void dump_vmcb(struct kvm_vcpu *vcpu) pr_err("%-20s%016llx\n", "exit_info2:", control->exit_info_2); pr_err("%-20s%08x\n", "exit_int_info:", control->exit_int_info); pr_err("%-20s%08x\n", "exit_int_info_err:", control->exit_int_info_err); - pr_err("%-20s%lld\n", "nested_ctl:", control->nested_ctl); + pr_err("%-20s%lld\n", "misc_ctl:", control->misc_ctl); pr_err("%-20s%016llx\n", "nested_cr3:", control->nested_cr3); pr_err("%-20s%016llx\n", "avic_vapic_bar:", control->avic_vapic_bar); pr_err("%-20s%016llx\n", "ghcb:", control->ghcb_gpa); diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index a6913a0820125..861ed9c33977b 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -169,7 +169,7 @@ struct vmcb_ctrl_area_cached { u64 exit_info_2; u32 exit_int_info; u32 exit_int_info_err; - u64 nested_ctl; + u64 misc_ctl; u32 event_inj; u32 event_inj_err; u64 next_rip; @@ -554,7 +554,7 @@ static inline bool gif_set(struct vcpu_svm *svm) static inline bool nested_npt_enabled(struct vcpu_svm *svm) { return guest_cpu_cap_has(&svm->vcpu, X86_FEATURE_NPT) && - svm->nested.ctl.nested_ctl & SVM_NESTED_CTL_NP_ENABLE; + svm->nested.ctl.misc_ctl & SVM_MISC_CTL_NP_ENABLE; } =20 static inline bool nested_vnmi_enabled(struct vcpu_svm *svm) diff --git a/tools/testing/selftests/kvm/include/x86/svm.h b/tools/testing/= selftests/kvm/include/x86/svm.h index 29cffd0a91816..5d2bcce34c019 100644 --- a/tools/testing/selftests/kvm/include/x86/svm.h +++ b/tools/testing/selftests/kvm/include/x86/svm.h @@ -98,7 +98,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area { u64 exit_info_2; u32 
exit_int_info; u32 exit_int_info_err; - u64 nested_ctl; + u64 misc_ctl; u64 avic_vapic_bar; u8 reserved_4[8]; u32 event_inj; @@ -176,8 +176,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define SVM_VM_CR_SVM_LOCK_MASK 0x0008ULL #define SVM_VM_CR_SVM_DIS_MASK 0x0010ULL =20 -#define SVM_NESTED_CTL_NP_ENABLE BIT(0) -#define SVM_NESTED_CTL_SEV_ENABLE BIT(1) +#define SVM_MISC_CTL_CTL_NP_ENABLE BIT(0) +#define SVM_MISC_CTL_SEV_ENABLE BIT(1) =20 struct __attribute__ ((__packed__)) vmcb_seg { u16 selector; --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 36A1D3469EF for ; Mon, 10 Nov 2025 22:29:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813799; cv=none; b=t01l6fhIq4t8x8S4qfmYFiar4ntgXdWGLtQ5l0aQxNLLtQ7JfsusNC+wgrZIlrB6dgcXraJjpiGVHlZFC4dibGo4iGiDTgYmee2vFkm1SUQC9zMDb5WY+nGvkBo3EV/z4G1Zyfai5NRpB9Qar0uqkmYzlHzyCA3mXszCKjiFnas= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813799; c=relaxed/simple; bh=CTpWQljGo7SIhry3GJxSntCBsiNB6KyCQmBp87MfkfU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=H1LhPC/PNrv8wEwy5tgWMkiPubJicHrbQlOFhI8jjUBanF2SwW8nIjYp8Vl4aR0SSyLu0kg6WGwS4cRbC2PMloWDJI9rO/VzMRBLbPqEjqL39VBS40kyYG9DEwT8oHw3uF6Guix4siWN+x50s0kL+Byh4T0buVUXhO8jBd1RI1E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=gAkH0EFa; arc=none smtp.client-ip=91.218.175.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) 
header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="gAkH0EFa" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813795; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xHtrDPC89v7+CUaEieWClxIgYnDZlUoVlT/Iig2+OTQ=; b=gAkH0EFaP/dIEiCZyGQ4bGodwZ/t/x+YDmnTsjt7UuP9RkSRGn2MWCpb2H/ToyfxHS1463 vfmJZdZw+pgDANcERyopR3Noqpkl6JzyYbNaD6c7rJjdVtMpheSAqG8uT6jA6KBNGXah9D UJCtQVwGOkYa4ojQj9cu/FHAcJyn38I= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 08/13] KVM: SVM: Rename vmcb->virt_ext to vmcb->misc_ctl2 Date: Mon, 10 Nov 2025 22:29:17 +0000 Message-ID: <20251110222922.613224-9-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" 'virt' is confusing in the VMCB because it is relative and ambiguous. The 'virt_ext' field includes bits for LBR virtualization and VMSAVE/VMLOAD virtualization, so it's just another miscellaneous control field. Name it as such. While at it, move the definitions of the bits below those for 'misc_ctl' and rename them for consistency. 
Signed-off-by: Yosry Ahmed --- arch/x86/include/asm/svm.h | 7 +++---- arch/x86/kvm/svm/nested.c | 18 ++++++++--------- arch/x86/kvm/svm/svm.c | 20 +++++++++---------- arch/x86/kvm/svm/svm.h | 2 +- tools/testing/selftests/kvm/include/x86/svm.h | 8 ++++---- .../selftests/kvm/x86/svm_lbr_nested_state.c | 4 ++-- 6 files changed, 29 insertions(+), 30 deletions(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index 76ec1d40e6461..a842018952d2c 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -148,7 +148,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area { u32 event_inj; u32 event_inj_err; u64 nested_cr3; - u64 virt_ext; + u64 misc_ctl2; u32 clean; u32 reserved_5; u64 next_rip; @@ -219,9 +219,6 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define X2APIC_MODE_SHIFT 30 #define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT) =20 -#define LBR_CTL_ENABLE_MASK BIT_ULL(0) -#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1) - #define SVM_INTERRUPT_SHADOW_MASK BIT_ULL(0) #define SVM_GUEST_INTERRUPT_MASK BIT_ULL(1) =20 @@ -240,6 +237,8 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define SVM_MISC_CTL_SEV_ENABLE BIT(1) #define SVM_MISC_CTL_SEV_ES_ENABLE BIT(2) =20 +#define SVM_MISC_CTL2_LBR_CTL_ENABLE BIT_ULL(0) +#define SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE BIT_ULL(1) =20 #define SVM_TSC_RATIO_RSVD 0xffffff0000000000ULL #define SVM_TSC_RATIO_MIN 0x0000000000000001ULL diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 2a5c3788f954b..b8d65832c64de 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -117,7 +117,7 @@ static bool nested_vmcb_needs_vls_intercept(struct vcpu= _svm *svm) if (!nested_npt_enabled(svm)) return true; =20 - if (!(svm->nested.ctl.virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)) + if (!(svm->nested.ctl.misc_ctl2 & SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE)) return true; =20 return false; @@ -180,7 +180,7 @@ void recalc_intercepts(struct vcpu_svm 
*svm) vmcb_set_intercept(c, INTERCEPT_VMLOAD); vmcb_set_intercept(c, INTERCEPT_VMSAVE); } else { - WARN_ON(!(c->virt_ext & VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK)); + WARN_ON(!(c->misc_ctl2 & SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE)); } } =20 @@ -497,7 +497,7 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcp= u *vcpu, to->event_inj_err =3D from->event_inj_err; to->next_rip =3D from->next_rip; to->nested_cr3 =3D from->nested_cr3; - to->virt_ext =3D from->virt_ext; + to->misc_ctl2 =3D from->misc_ctl2; to->pause_filter_count =3D from->pause_filter_count; to->pause_filter_thresh =3D from->pause_filter_thresh; =20 @@ -738,7 +738,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm = *svm, struct vmcb *vmcb12 } =20 if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) && - (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + (svm->nested.ctl.misc_ctl2 & SVM_MISC_CTL2_LBR_CTL_ENABLE))) { /* * Reserved bits of DEBUGCTL are ignored. Be consistent with * svm_set_msr's definition of reserved bits. 
@@ -902,10 +902,10 @@ static void nested_vmcb02_prepare_control(struct vcpu= _svm *svm, svm->soft_int_next_rip =3D vmcb12_rip; } =20 - /* LBR_CTL_ENABLE_MASK is controlled by svm_update_lbrv() */ + /* SVM_MISC_CTL2_LBR_CTL_ENABLE is controlled by svm_update_lbrv() */ =20 if (!nested_vmcb_needs_vls_intercept(svm)) - vmcb02->control.virt_ext |=3D VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + vmcb02->control.misc_ctl2 |=3D SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE; =20 if (guest_cpu_cap_has(vcpu, X86_FEATURE_PAUSEFILTER)) pause_count12 =3D svm->nested.ctl.pause_filter_count; @@ -1257,7 +1257,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm) kvm_make_request(KVM_REQ_EVENT, &svm->vcpu); =20 if (unlikely(guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) && - (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) { + (svm->nested.ctl.misc_ctl2 & SVM_MISC_CTL2_LBR_CTL_ENABLE))) { svm_copy_lbrs(&vmcb12->save, &vmcb02->save); } else { svm_copy_lbrs(&vmcb01->save, &vmcb02->save); @@ -1763,8 +1763,8 @@ static void nested_copy_vmcb_cache_to_control(struct = vmcb_control_area *dst, dst->event_inj =3D from->event_inj; dst->event_inj_err =3D from->event_inj_err; dst->next_rip =3D from->next_rip; - dst->nested_cr3 =3D from->nested_cr3; - dst->virt_ext =3D from->virt_ext; + dst->nested_cr3 =3D from->nested_cr3; + dst->misc_ctl2 =3D from->misc_ctl2; dst->pause_filter_count =3D from->pause_filter_count; dst->pause_filter_thresh =3D from->pause_filter_thresh; /* 'clean' and 'hv_enlightenments' are not changed by KVM */ diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index b5b4965e6bfdd..9789f7e72ae97 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -705,7 +705,7 @@ void *svm_alloc_permissions_map(unsigned long size, gfp= _t gfp_mask) =20 static void svm_recalc_lbr_msr_intercepts(struct kvm_vcpu *vcpu) { - bool intercept =3D !(to_svm(vcpu)->vmcb->control.virt_ext & LBR_CTL_ENABL= E_MASK); + bool intercept =3D !(to_svm(vcpu)->vmcb->control.misc_ctl2 & SVM_MISC_CTL= 
2_LBR_CTL_ENABLE); =20 svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHFROMIP, MSR_TYPE_RW, i= ntercept); svm_set_intercept_for_msr(vcpu, MSR_IA32_LASTBRANCHTOIP, MSR_TYPE_RW, int= ercept); @@ -797,7 +797,7 @@ static void svm_recalc_msr_intercepts(struct kvm_vcpu *= vcpu) =20 static void __svm_enable_lbrv(struct kvm_vcpu *vcpu) { - to_svm(vcpu)->vmcb->control.virt_ext |=3D LBR_CTL_ENABLE_MASK; + to_svm(vcpu)->vmcb->control.misc_ctl2 |=3D SVM_MISC_CTL2_LBR_CTL_ENABLE; } =20 void svm_enable_lbrv(struct kvm_vcpu *vcpu) @@ -809,16 +809,16 @@ void svm_enable_lbrv(struct kvm_vcpu *vcpu) static void __svm_disable_lbrv(struct kvm_vcpu *vcpu) { KVM_BUG_ON(sev_es_guest(vcpu->kvm), vcpu->kvm); - to_svm(vcpu)->vmcb->control.virt_ext &=3D ~LBR_CTL_ENABLE_MASK; + to_svm(vcpu)->vmcb->control.misc_ctl2 &=3D ~SVM_MISC_CTL2_LBR_CTL_ENABLE; } =20 void svm_update_lbrv(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm =3D to_svm(vcpu); - bool current_enable_lbrv =3D svm->vmcb->control.virt_ext & LBR_CTL_ENABLE= _MASK; + bool current_enable_lbrv =3D svm->vmcb->control.misc_ctl2 & SVM_MISC_CTL2= _LBR_CTL_ENABLE; bool enable_lbrv =3D (svm->vmcb->save.dbgctl & DEBUGCTLMSR_LBR) || (is_guest_mode(vcpu) && guest_cpu_cap_has(vcpu, X86_FEATURE_LBRV) && - (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK)); + (svm->nested.ctl.misc_ctl2 & SVM_MISC_CTL2_LBR_CTL_ENABLE)); =20 if (enable_lbrv && !current_enable_lbrv) __svm_enable_lbrv(vcpu); @@ -979,7 +979,7 @@ static void svm_recalc_instruction_intercepts(struct kv= m_vcpu *vcpu) if (guest_cpuid_is_intel_compatible(vcpu)) { svm_set_intercept(svm, INTERCEPT_VMLOAD); svm_set_intercept(svm, INTERCEPT_VMSAVE); - svm->vmcb->control.virt_ext &=3D ~VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + svm->vmcb->control.misc_ctl2 &=3D ~SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE; } else { /* * If hardware supports Virtual VMLOAD VMSAVE then enable it @@ -988,7 +988,7 @@ static void svm_recalc_instruction_intercepts(struct kv= m_vcpu *vcpu) if (vls) { svm_clr_intercept(svm, 
INTERCEPT_VMLOAD); svm_clr_intercept(svm, INTERCEPT_VMSAVE); - svm->vmcb->control.virt_ext |=3D VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK; + svm->vmcb->control.misc_ctl2 |=3D SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE; } } } @@ -3279,7 +3279,7 @@ static void dump_vmcb(struct kvm_vcpu *vcpu) pr_err("%-20s%016llx\n", "ghcb:", control->ghcb_gpa); pr_err("%-20s%08x\n", "event_inj:", control->event_inj); pr_err("%-20s%08x\n", "event_inj_err:", control->event_inj_err); - pr_err("%-20s%lld\n", "virt_ext:", control->virt_ext); + pr_err("%-20s%lld\n", "misc_ctl2:", control->misc_ctl2); pr_err("%-20s%016llx\n", "next_rip:", control->next_rip); pr_err("%-20s%016llx\n", "avic_backing_page:", control->avic_backing_page= ); pr_err("%-20s%016llx\n", "avic_logical_id:", control->avic_logical_id); @@ -4261,7 +4261,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_= vcpu *vcpu, u64 run_flags) * VM-Exit), as running with the host's DEBUGCTL can negatively affect * guest state and can even be fatal, e.g. due to Bus Lock Detect. 
*/ - if (!(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) && + if (!(svm->vmcb->control.misc_ctl2 & SVM_MISC_CTL2_LBR_CTL_ENABLE) && vcpu->arch.host_debugctl !=3D svm->vmcb->save.dbgctl) update_debugctlmsr(svm->vmcb->save.dbgctl); =20 @@ -4292,7 +4292,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_= vcpu *vcpu, u64 run_flags) if (unlikely(svm->vmcb->control.exit_code =3D=3D SVM_EXIT_NMI)) kvm_before_interrupt(vcpu, KVM_HANDLING_NMI); =20 - if (!(svm->vmcb->control.virt_ext & LBR_CTL_ENABLE_MASK) && + if (!(svm->vmcb->control.misc_ctl2 & SVM_MISC_CTL2_LBR_CTL_ENABLE) && vcpu->arch.host_debugctl !=3D svm->vmcb->save.dbgctl) update_debugctlmsr(vcpu->arch.host_debugctl); =20 diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 861ed9c33977b..68be3a08e3e62 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -174,7 +174,7 @@ struct vmcb_ctrl_area_cached { u32 event_inj_err; u64 next_rip; u64 nested_cr3; - u64 virt_ext; + u64 misc_ctl2; u32 clean; u64 bus_lock_rip; union { diff --git a/tools/testing/selftests/kvm/include/x86/svm.h b/tools/testing/= selftests/kvm/include/x86/svm.h index 5d2bcce34c019..a3f4eadffeb46 100644 --- a/tools/testing/selftests/kvm/include/x86/svm.h +++ b/tools/testing/selftests/kvm/include/x86/svm.h @@ -104,7 +104,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area { u32 event_inj; u32 event_inj_err; u64 nested_cr3; - u64 virt_ext; + u64 misc_ctl2; u32 clean; u32 reserved_5; u64 next_rip; @@ -156,9 +156,6 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define AVIC_ENABLE_SHIFT 31 #define AVIC_ENABLE_MASK (1 << AVIC_ENABLE_SHIFT) =20 -#define LBR_CTL_ENABLE_MASK BIT_ULL(0) -#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1) - #define SVM_INTERRUPT_SHADOW_MASK 1 =20 #define SVM_IOIO_STR_SHIFT 2 @@ -179,6 +176,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area { #define SVM_MISC_CTL_CTL_NP_ENABLE BIT(0) #define SVM_MISC_CTL_SEV_ENABLE BIT(1) =20 +#define 
SVM_MISC_CTL2_LBR_CTL_ENABLE BIT_ULL(0) +#define SVM_MISC_CTL2_V_VMLOAD_VMSAVE_ENABLE BIT_ULL(1) + struct __attribute__ ((__packed__)) vmcb_seg { u16 selector; u16 attrib; diff --git a/tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c b/tools= /testing/selftests/kvm/x86/svm_lbr_nested_state.c index a343279546fd8..4a9e644b8931e 100644 --- a/tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c +++ b/tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c @@ -75,9 +75,9 @@ static void l1_guest_code(struct svm_test_data *svm, bool= nested_lbrv) &l2_guest_stack[L2_GUEST_STACK_SIZE]); =20 if (nested_lbrv) - vmcb->control.virt_ext =3D LBR_CTL_ENABLE_MASK; + vmcb->control.misc_ctl2 =3D SVM_MISC_CTL2_LBR_CTL_ENABLE; else - vmcb->control.virt_ext &=3D ~LBR_CTL_ENABLE_MASK; + vmcb->control.misc_ctl2 &=3D ~SVM_MISC_CTL2_LBR_CTL_ENABLE; =20 run_guest(vmcb, svm->vmcb_gpa); GUEST_ASSERT(svm->vmcb->control.exit_code =3D=3D SVM_EXIT_VMMCALL); --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E6220346E6C for ; Mon, 10 Nov 2025 22:29:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813801; cv=none; b=nNJLMC2wS6i3PqcKU13FaZsAi1GHxSGOqAkD14DNwyFjIVi93zWUWHX4Y7op5qjpFFcLsvVeCVHahoo4ypGfYWGqlnCWrdDCWhFbVxdVfTOFdtLgVxJx9GEGT2jEYGHWiGXeOQFxdvjDtifzWxr4mObGViE70dYfrCha6cizfpU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813801; c=relaxed/simple; bh=XmsDnzdQAX9LKcAhN7aZbv+ptAkPWEAfiTfq26F9t5A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=lZkxs2CbxZbuMXl31A2q/odmFl0g35zAUEPNj2mdcMS3tXLaFla9kJ2zQg1dDiVIZmMYcEKzFy41lPvC3j3UAnJMrtfULmJHvcniauDTdDj10cSVvKPsVMDWuiDrgqcFrs+3/wQjl2Zrja7cE+HWSaZRZPFW0oDMTNTlQ5xnLLQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=BMmfvu/a; arc=none smtp.client-ip=91.218.175.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="BMmfvu/a" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813797; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SA6nMXNvLyR/7xLQ8U0SivnG2zt17+l7CbjrPgJtDuI=; b=BMmfvu/aYJECBGML9vXNN+SMUUxAXE11j8WNcPmEdJo84vytVc1G17e8Kkdj4b8gLebcSW pFOMNfRwFdk7yCrDsnKGdvEUVfD/4qWTV1OIJdAaOoxTy7nnOnZvUDS+6renGmbZj+xUdY 6eQCsf41/IDcx72Kc13yQDz0gAiCF4Q= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 09/13] KVM: nSVM: Cache all used fields from VMCB12 Date: Mon, 10 Nov 2025 22:29:18 +0000 Message-ID: <20251110222922.613224-10-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT 
Content-Type: text/plain; charset="utf-8" Currently, most fields used from VMCB12 are cached in svm->nested.{ctl/save}. This is mainly to avoid TOC-TOU bugs. However, for the save area, only the fields used in the consistency checks (i.e. nested_vmcb_check_save()) were being cached. Other fields are read directly from guest memory in nested_vmcb02_prepare_save(). While probably benign, this still makes it possible for TOC-TOU bugs to happen. For example, RAX, RSP, and RIP are read twice, once to store in VMCB02, and once to store in vcpu->arch.regs. It is possible for the guest to modify the value between both reads, potentially causing nasty bugs. Harden against such bugs by caching everything in svm->nested.save. Cache all the needed fields, and keep all accesses to the VMCB12 strictly in nested_svm_vmrun() for caching and early error injection. Following changes will further limit the access to the VMCB12 in the nested VMRUN path. Introduce vmcb12_is_dirty() to use with the cached control fields instead of vmcb_is_dirty(), similar to vmcb12_is_intercept(). Opportunistically order the copies in __nested_copy_vmcb_save_to_cache() by the order in which the fields are defined in struct vmcb_save_area. Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 116 ++++++++++++++++++++++---------------- arch/x86/kvm/svm/svm.c | 2 +- arch/x86/kvm/svm/svm.h | 27 ++++++++- 3 files changed, 93 insertions(+), 52 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index b8d65832c64de..ddcd545ec1c3c 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -525,19 +525,34 @@ void nested_copy_vmcb_control_to_cache(struct vcpu_sv= m *svm, static void __nested_copy_vmcb_save_to_cache(struct vmcb_save_area_cached = *to, struct vmcb_save_area *from) { - /* - * Copy only fields that are validated, as we need them - * to avoid TOC/TOU races. 
- */ + to->es =3D from->es; to->cs =3D from->cs; + to->ss =3D from->ss; + to->ds =3D from->ds; + to->gdtr =3D from->gdtr; + to->idtr =3D from->idtr; + + to->cpl =3D from->cpl; =20 to->efer =3D from->efer; - to->cr0 =3D from->cr0; - to->cr3 =3D from->cr3; to->cr4 =3D from->cr4; - - to->dr6 =3D from->dr6; + to->cr3 =3D from->cr3; + to->cr0 =3D from->cr0; to->dr7 =3D from->dr7; + to->dr6 =3D from->dr6; + + to->rflags =3D from->rflags; + to->rip =3D from->rip; + to->rsp =3D from->rsp; + + to->s_cet =3D from->s_cet; + to->ssp =3D from->ssp; + to->isst_addr =3D from->isst_addr; + + to->rax =3D from->rax; + to->cr2 =3D from->cr2; + + svm_copy_lbrs(to, from); } =20 void nested_copy_vmcb_save_to_cache(struct vcpu_svm *svm, @@ -673,8 +688,10 @@ void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm) svm->nested.vmcb02.ptr->save.g_pat =3D svm->vmcb01.ptr->save.g_pat; } =20 -static void nested_vmcb02_prepare_save(struct vcpu_svm *svm, struct vmcb *= vmcb12) +static void nested_vmcb02_prepare_save(struct vcpu_svm *svm) { + struct vmcb_ctrl_area_cached *control =3D &svm->nested.ctl; + struct vmcb_save_area_cached *save =3D &svm->nested.save; bool new_vmcb12 =3D false; struct vmcb *vmcb01 =3D svm->vmcb01.ptr; struct vmcb *vmcb02 =3D svm->nested.vmcb02.ptr; @@ -689,49 +706,49 @@ static void nested_vmcb02_prepare_save(struct vcpu_sv= m *svm, struct vmcb *vmcb12 svm->nested.force_msr_bitmap_recalc =3D true; } =20 - if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_SEG))) { - vmcb02->save.es =3D vmcb12->save.es; - vmcb02->save.cs =3D vmcb12->save.cs; - vmcb02->save.ss =3D vmcb12->save.ss; - vmcb02->save.ds =3D vmcb12->save.ds; - vmcb02->save.cpl =3D vmcb12->save.cpl; + if (unlikely(new_vmcb12 || vmcb12_is_dirty(control, VMCB_SEG))) { + vmcb02->save.es =3D save->es; + vmcb02->save.cs =3D save->cs; + vmcb02->save.ss =3D save->ss; + vmcb02->save.ds =3D save->ds; + vmcb02->save.cpl =3D save->cpl; vmcb_mark_dirty(vmcb02, VMCB_SEG); } =20 - if (unlikely(new_vmcb12 || 
vmcb_is_dirty(vmcb12, VMCB_DT))) { - vmcb02->save.gdtr =3D vmcb12->save.gdtr; - vmcb02->save.idtr =3D vmcb12->save.idtr; + if (unlikely(new_vmcb12 || vmcb12_is_dirty(control, VMCB_DT))) { + vmcb02->save.gdtr =3D save->gdtr; + vmcb02->save.idtr =3D save->idtr; vmcb_mark_dirty(vmcb02, VMCB_DT); } =20 if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) && - (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_CET)))) { - vmcb02->save.s_cet =3D vmcb12->save.s_cet; - vmcb02->save.isst_addr =3D vmcb12->save.isst_addr; - vmcb02->save.ssp =3D vmcb12->save.ssp; + (unlikely(new_vmcb12 || vmcb12_is_dirty(control, VMCB_CET)))) { + vmcb02->save.s_cet =3D save->s_cet; + vmcb02->save.isst_addr =3D save->isst_addr; + vmcb02->save.ssp =3D save->ssp; vmcb_mark_dirty(vmcb02, VMCB_CET); } =20 - kvm_set_rflags(vcpu, vmcb12->save.rflags | X86_EFLAGS_FIXED); + kvm_set_rflags(vcpu, save->rflags | X86_EFLAGS_FIXED); =20 svm_set_efer(vcpu, svm->nested.save.efer); =20 svm_set_cr0(vcpu, svm->nested.save.cr0); svm_set_cr4(vcpu, svm->nested.save.cr4); =20 - svm->vcpu.arch.cr2 =3D vmcb12->save.cr2; + svm->vcpu.arch.cr2 =3D save->cr2; =20 - kvm_rax_write(vcpu, vmcb12->save.rax); - kvm_rsp_write(vcpu, vmcb12->save.rsp); - kvm_rip_write(vcpu, vmcb12->save.rip); + kvm_rax_write(vcpu, save->rax); + kvm_rsp_write(vcpu, save->rsp); + kvm_rip_write(vcpu, save->rip); =20 /* In case we don't even reach vcpu_run, the fields are not updated */ - vmcb02->save.rax =3D vmcb12->save.rax; - vmcb02->save.rsp =3D vmcb12->save.rsp; - vmcb02->save.rip =3D vmcb12->save.rip; + vmcb02->save.rax =3D save->rax; + vmcb02->save.rsp =3D save->rsp; + vmcb02->save.rip =3D save->rip; =20 /* These bits will be set properly on the first execution when new_vmc12 = is true */ - if (unlikely(new_vmcb12 || vmcb_is_dirty(vmcb12, VMCB_DR))) { + if (unlikely(new_vmcb12 || vmcb12_is_dirty(control, VMCB_DR))) { vmcb02->save.dr7 =3D svm->nested.save.dr7 | DR7_FIXED_1; svm->vcpu.arch.dr6 =3D svm->nested.save.dr6 | DR6_ACTIVE_LOW; 
vmcb_mark_dirty(vmcb02, VMCB_DR); @@ -743,7 +760,7 @@ static void nested_vmcb02_prepare_save(struct vcpu_svm = *svm, struct vmcb *vmcb12 * Reserved bits of DEBUGCTL are ignored. Be consistent with * svm_set_msr's definition of reserved bits. */ - svm_copy_lbrs(&vmcb02->save, &vmcb12->save); + svm_copy_lbrs(&vmcb02->save, save); vmcb_mark_dirty(vmcb02, VMCB_LBR); vmcb02->save.dbgctl &=3D ~DEBUGCTL_RESERVED_BITS; } else { @@ -953,28 +970,29 @@ static void nested_svm_copy_common_state(struct vmcb = *from_vmcb, struct vmcb *to to_vmcb->save.spec_ctrl =3D from_vmcb->save.spec_ctrl; } =20 -int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, - struct vmcb *vmcb12, bool from_vmrun) +int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa, bool from_= vmrun) { struct vcpu_svm *svm =3D to_svm(vcpu); + struct vmcb_ctrl_area_cached *control =3D &svm->nested.ctl; + struct vmcb_save_area_cached *save =3D &svm->nested.save; int ret; =20 trace_kvm_nested_vmenter(svm->vmcb->save.rip, vmcb12_gpa, - vmcb12->save.rip, - vmcb12->control.int_ctl, - vmcb12->control.event_inj, - vmcb12->control.misc_ctl, - vmcb12->control.nested_cr3, - vmcb12->save.cr3, + save->rip, + control->int_ctl, + control->event_inj, + control->misc_ctl, + control->nested_cr3, + save->cr3, KVM_ISA_SVM); =20 - trace_kvm_nested_intercepts(vmcb12->control.intercepts[INTERCEPT_CR] & 0x= ffff, - vmcb12->control.intercepts[INTERCEPT_CR] >> 16, - vmcb12->control.intercepts[INTERCEPT_EXCEPTION], - vmcb12->control.intercepts[INTERCEPT_WORD3], - vmcb12->control.intercepts[INTERCEPT_WORD4], - vmcb12->control.intercepts[INTERCEPT_WORD5]); + trace_kvm_nested_intercepts(control->intercepts[INTERCEPT_CR] & 0xffff, + control->intercepts[INTERCEPT_CR] >> 16, + control->intercepts[INTERCEPT_EXCEPTION], + control->intercepts[INTERCEPT_WORD3], + control->intercepts[INTERCEPT_WORD4], + control->intercepts[INTERCEPT_WORD5]); =20 =20 svm->nested.vmcb12_gpa =3D vmcb12_gpa; @@ -984,8 +1002,8 @@ int 
enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vm= cb12_gpa, nested_svm_copy_common_state(svm->vmcb01.ptr, svm->nested.vmcb02.ptr); =20 svm_switch_vmcb(svm, &svm->nested.vmcb02); - nested_vmcb02_prepare_control(svm, vmcb12->save.rip, vmcb12->save.cs.base= ); - nested_vmcb02_prepare_save(svm, vmcb12); + nested_vmcb02_prepare_control(svm, save->rip, save->cs.base); + nested_vmcb02_prepare_save(svm); =20 ret =3D nested_svm_load_cr3(&svm->vcpu, svm->nested.save.cr3, nested_npt_enabled(svm), from_vmrun); @@ -1074,7 +1092,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) =20 svm->nested.nested_run_pending =3D 1; =20 - if (enter_svm_guest_mode(vcpu, vmcb12_gpa, vmcb12, true)) + if (enter_svm_guest_mode(vcpu, vmcb12_gpa, true)) goto out_exit_err; =20 if (nested_svm_merge_msrpm(vcpu)) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 9789f7e72ae97..2fbb0b88c6a3e 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -4762,7 +4762,7 @@ static int svm_leave_smm(struct kvm_vcpu *vcpu, const= union kvm_smram *smram) vmcb12 =3D map.hva; nested_copy_vmcb_control_to_cache(svm, &vmcb12->control); nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); - ret =3D enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa, vmcb12, f= alse); + ret =3D enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa, false); =20 if (ret) goto unmap_save; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 68be3a08e3e62..ef6bdce630dc0 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -142,13 +142,32 @@ struct kvm_vmcb_info { }; =20 struct vmcb_save_area_cached { + struct vmcb_seg es; struct vmcb_seg cs; + struct vmcb_seg ss; + struct vmcb_seg ds; + struct vmcb_seg gdtr; + struct vmcb_seg idtr; + u8 cpl; u64 efer; u64 cr4; u64 cr3; u64 cr0; u64 dr7; u64 dr6; + u64 rflags; + u64 rip; + u64 rsp; + u64 s_cet; + u64 ssp; + u64 isst_addr; + u64 rax; + u64 cr2; + u64 dbgctl; + u64 br_from; + u64 br_to; + u64 last_excp_from; + u64 last_excp_to; }; =20 
struct vmcb_ctrl_area_cached { @@ -422,6 +441,11 @@ static inline bool vmcb_is_dirty(struct vmcb *vmcb, in= t bit) return !test_bit(bit, (unsigned long *)&vmcb->control.clean); } =20 +static inline bool vmcb12_is_dirty(struct vmcb_ctrl_area_cached *control, = int bit) +{ + return !test_bit(bit, (unsigned long *)&control->clean); +} + static __always_inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu) { return container_of(vcpu, struct vcpu_svm, vcpu); @@ -760,8 +784,7 @@ static inline bool nested_exit_on_nmi(struct vcpu_svm *= svm) =20 int __init nested_svm_init_msrpm_merge_offsets(void); =20 -int enter_svm_guest_mode(struct kvm_vcpu *vcpu, - u64 vmcb_gpa, struct vmcb *vmcb12, bool from_vmrun); +int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb_gpa, bool from_vm= run); void svm_leave_nested(struct kvm_vcpu *vcpu); void svm_free_nested(struct vcpu_svm *svm); int svm_allocate_nested(struct vcpu_svm *svm); --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-179.mta0.migadu.com (out-179.mta0.migadu.com [91.218.175.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 91B47346FAA for ; Mon, 10 Nov 2025 22:30:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813802; cv=none; b=RVtL8KPVZUS1ftSm71v/opZ3gyI4RnXhbn2AHYBisxzBjbExuqxLIlsCde0Fo0BhTPDHzh273b6g8Dl3FOpTh+vC2infda8VdBkw2gWCcxZXMuZnv1wJS9tIWgwWEECxxPfKbHLF+JhS5xpDnjrZ9kKCvIc+FJDMlw5+h6eofuU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813802; c=relaxed/simple; bh=kZLGhdUqKziP6wJo1TK4s2e4k+8TAd9mui2sIZ1T0qY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=iiibRt/8phaVDLFnB1Sg6CS5yFAUpsIDtqqSOtz9FBVsCu3HiY6jfCdO7UgxKMuy3kzpfMVzExaEDI8VnCv2Dk8yDune3nOYqVmRvUh4YOG69avqlSXPIYWED1L9/YHnW93IZhAdwyV+PJwgDi80bqHzwtPhcOqX8x8cVAc+Kfk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=aLTbydnd; arc=none smtp.client-ip=91.218.175.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="aLTbydnd" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813798; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7JW+Xtx3ikg89gAAbE+q15fKd9gDpo0qaghJYGfsrC8=; b=aLTbydndnCcCg/o3XJY19Fs68ss+e46QxMPlmXcr5lOZbUnB2iFb11wfUa4iRO0Xd6XmsJ AXEUFyajr0Os6sNsoAcQQNpRi031kYT6fJGjRH/KMO0Ck1sPPRZtbfl105lUNg0c9O64EQ cosx476VX3iknBTmybESrW+jrzGjkDU= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 10/13] KVM: nSVM: Restrict mapping VMCB12 on nested VMRUN Date: Mon, 10 Nov 2025 22:29:19 +0000 Message-ID: <20251110222922.613224-11-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: 
FLOW_OUT Content-Type: text/plain; charset="utf-8" All accesses to the VMCB12 in the guest memory are limited to nested_svm_vmrun(). However, the VMCB12 remains mapped until the end of the function execution. Unmapping right after the consistency checks is possible, but it becomes easy-ish to introduce bugs where 'vmcb12' is used after being unmapped. Move all accesses to the VMCB12 into a new helper, nested_svm_vmrun_read_vmcb12(), that maps the VMCB12, caches the needed fields, performs consistency checks, and unmaps it. This limits the scope of the VMCB12 mapping appropriately. It also slightly simplifies the cleanup path of nested_svm_vmrun(). nested_svm_vmrun_read_vmcb12() returns -1 if the consistency checks fail, maintaining the current behavior of skipping the instructions and unmapping the VMCB12 (although in the opposite order). Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 59 ++++++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 25 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index ddcd545ec1c3c..a48668c36a191 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1023,12 +1023,39 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64= vmcb12_gpa, bool from_vmrun) return 0; } =20 +static int nested_svm_vmrun_read_vmcb12(struct kvm_vcpu *vcpu, u64 vmcb12_= gpa) +{ + struct vcpu_svm *svm =3D to_svm(vcpu); + struct kvm_host_map map; + struct vmcb *vmcb12; + int ret; + + ret =3D kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map); + if (ret) + return ret; + + vmcb12 =3D map.hva; + + nested_copy_vmcb_control_to_cache(svm, &vmcb12->control); + nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); + + if (!nested_vmcb_check_save(vcpu) || + !nested_vmcb_check_controls(vcpu)) { + vmcb12->control.exit_code =3D SVM_EXIT_ERR; + vmcb12->control.exit_code_hi =3D 0; + vmcb12->control.exit_info_1 =3D 0; + vmcb12->control.exit_info_2 =3D 0; + ret =3D -1; + } + + kvm_vcpu_unmap(vcpu, &map); 
+ return ret; +} + int nested_svm_vmrun(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm =3D to_svm(vcpu); int ret; - struct vmcb *vmcb12; - struct kvm_host_map map; u64 vmcb12_gpa; struct vmcb *vmcb01 =3D svm->vmcb01.ptr; =20 @@ -1049,8 +1076,11 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) return ret; } =20 + if (WARN_ON_ONCE(!svm->nested.initialized)) + return -EINVAL; + vmcb12_gpa =3D svm->vmcb->save.rax; - ret =3D kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map); + ret =3D nested_svm_vmrun_read_vmcb12(vcpu, vmcb12_gpa); if (ret =3D=3D -EINVAL) { kvm_inject_gp(vcpu, 0); return 1; @@ -1060,23 +1090,6 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) =20 ret =3D kvm_skip_emulated_instruction(vcpu); =20 - vmcb12 =3D map.hva; - - if (WARN_ON_ONCE(!svm->nested.initialized)) - return -EINVAL; - - nested_copy_vmcb_control_to_cache(svm, &vmcb12->control); - nested_copy_vmcb_save_to_cache(svm, &vmcb12->save); - - if (!nested_vmcb_check_save(vcpu) || - !nested_vmcb_check_controls(vcpu)) { - vmcb12->control.exit_code =3D SVM_EXIT_ERR; - vmcb12->control.exit_code_hi =3D 0; - vmcb12->control.exit_info_1 =3D 0; - vmcb12->control.exit_info_2 =3D 0; - goto out; - } - /* * Since vmcb01 is not in use, we can use it to store some of the L1 * state. 
@@ -1096,7 +1109,7 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) goto out_exit_err; =20 if (nested_svm_merge_msrpm(vcpu)) - goto out; + return ret; =20 out_exit_err: svm->nested.nested_run_pending =3D 0; @@ -1109,10 +1122,6 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) svm->vmcb->control.exit_info_2 =3D 0; =20 nested_svm_vmexit(svm); - -out: - kvm_vcpu_unmap(vcpu, &map); - return ret; } =20 --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-177.mta0.migadu.com (out-177.mta0.migadu.com [91.218.175.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 62308340A7A for ; Mon, 10 Nov 2025 22:30:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813804; cv=none; b=dqMdu8UzBIpmYZwMBgyhjIZ6wKOFkU1h7BtC2tQPTA3Mj7TRAr//KpAo8btKn/GDFMYlPX/rditGEYhcSMK9QQkmrbPaC0PhgnFjGoXTSqSvKn1jddqzpYthbmBNfl8uIgR006l4zcX9H/NmAccai230n/56mKp6dNGIPRIoLs4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813804; c=relaxed/simple; bh=SrmYL7HiCpgm4vjtAgnGlIDPHOZzMLiFMexWKIIKQI8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=a0bPHCmweFYeAwrVvBjoWumTOn5iClszgKvn9cM7VpngDYCjG/8eeVimmvXGAowlS/WwSnTanb8P2BTtfpePfxa6QO9bpDi3xyWm33qqQfdBJsTLanhJ+5e65+vZWATRGsdJ/Gtwb0qZ3/C5Uuj4BplG3YfoOucXQyOGnpe3Qbk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=rv3iYpdh; arc=none smtp.client-ip=91.218.175.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="rv3iYpdh" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813800; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QRChNnbAOtY4MB+fTlu6wng1ikuj7N/aEOeEGMy7akE=; b=rv3iYpdh4KaQaxQCo5x0yPiyHlkzW9TIqCdH+whctoMq5oQ48vB8aU+HcGhCIYIzGmZEtI 0+DpLL7R4aBMP2IJnuAG7A5Eh3DIBvPCp4RPO9yb1a+PTBjjlEvrJrMe6hT6UhmT0fnMGp Do4v1nqVGGcx4nX9TraxVJF4m+C0e78= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 11/13] KVM: nSVM: Simplify nested_svm_vmrun() Date: Mon, 10 Nov 2025 22:29:20 +0000 Message-ID: <20251110222922.613224-12-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Call nested_svm_merge_msrpm() from enter_svm_guest_mode() if called from the VMRUN path, instead of making the call in nested_svm_vmrun(). This simplifies the flow of nested_svm_vmrun() and removes all jumps to cleanup labels. 
Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index a48668c36a191..89830380cebc5 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -1020,6 +1020,9 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 v= mcb12_gpa, bool from_vmrun) =20 nested_svm_hv_update_vm_vp_ids(vcpu); =20 + if (from_vmrun && !nested_svm_merge_msrpm(vcpu)) + return -1; + return 0; } =20 @@ -1105,23 +1108,18 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu) =20 svm->nested.nested_run_pending =3D 1; =20 - if (enter_svm_guest_mode(vcpu, vmcb12_gpa, true)) - goto out_exit_err; - - if (nested_svm_merge_msrpm(vcpu)) - return ret; - -out_exit_err: - svm->nested.nested_run_pending =3D 0; - svm->nmi_l1_to_l2 =3D false; - svm->soft_int_injected =3D false; + if (enter_svm_guest_mode(vcpu, vmcb12_gpa, true)) { + svm->nested.nested_run_pending =3D 0; + svm->nmi_l1_to_l2 =3D false; + svm->soft_int_injected =3D false; =20 - svm->vmcb->control.exit_code =3D SVM_EXIT_ERR; - svm->vmcb->control.exit_code_hi =3D 0; - svm->vmcb->control.exit_info_1 =3D 0; - svm->vmcb->control.exit_info_2 =3D 0; + svm->vmcb->control.exit_code =3D SVM_EXIT_ERR; + svm->vmcb->control.exit_code_hi =3D 0; + svm->vmcb->control.exit_info_1 =3D 0; + svm->vmcb->control.exit_info_2 =3D 0; =20 - nested_svm_vmexit(svm); + nested_svm_vmexit(svm); + } return ret; } =20 --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-181.mta0.migadu.com (out-181.mta0.migadu.com [91.218.175.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BFCE3347FF4 for ; Mon, 10 Nov 2025 22:30:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.181 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1762813806; cv=none; b=RKBQURMEuLokh6j7ymv4lb/y+BHqFW0nXeskYHApRdGmFzBzmZptvty+Jy1A5jVIfmbphWeHsCfXAtIM+KOq1lNdLgjncC2IjkrF0c1+7nEEnbUXjIJO/eO/2RQLHSaqyPJgQ35PQUmalciIN3GARX1Gi7+Po8naH5qKgSwTQTs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813806; c=relaxed/simple; bh=kGizhQ7m/sgNuZnwwNjLm+zZLBRfW3mEmN4qVOwIypU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YjrsvwEvChONh6YArmksQAgudTfx1j0JM7Mppl5X47ApYYtEcZ8vBN3mydlBWESLyuqxESfomJYXpsqyEvoh/Yc9Yb5ISYTfKIhJql63slHeOhSkGSGcPUf75RwIEjYISV1gB5NqSb06PrUXHQGswziIbHlwTnaaX0ssky02DxA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=c4V2QBgc; arc=none smtp.client-ip=91.218.175.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="c4V2QBgc" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813802; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qBFHmbgjJrLlphH4Z+jfT21/4SbDscWkunJKPSDgXC4=; b=c4V2QBgcF/xfriCzaW3lVtKclpiusLPYJSIt/G/VSmrAFquasn3ltRRmORfHPbPq3CGYFx XEpFUdSRxl1GT3JPJhheAh1Sk5qe/wMaEY9X0u4Juwxu2vTjgWT7EvXbBmddX7/vHL2Xo1 nWXXo/8xo8v6yvaKXKnuU1X4r2LL/ag= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 12/13] KVM: nSVM: Sanitize control fields copied from VMCB12 Date: Mon, 10 Nov 2025 22:29:21 +0000 Message-ID: <20251110222922.613224-13-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" Make sure all fields used from VMCB12 in creating the VMCB02 are sanitized, such that no unhandled or reserved bits end up in the VMCB02. The following control fields are read from VMCB12 and have bits that are either reserved or not handled/advertised by KVM: tlb_ctl, int_ctl, int_state, int_vector, event_inj, misc_ctl, and misc_ctl2. The following fields do not require any extra sanitizing: - int_ctl: bits from VMCB12 are copied bit-by-bit as needed. - misc_ctl: only used in consistency checks (particularly NP_ENABLE). - misc_ctl2: bits from VMCB12 are copied bit-by-bit as needed. For the remaining fields, make sure only defined bits are copied from VMCB12 by defining appropriate masks where needed. The only exception is tlb_ctl, which is unused, so remove it.
Opportunistically move some existing definitions in svm.h around such that they are ordered by bit position, and clean up ignoring the lower bits of {io/msr}pm_base_pa in __nested_copy_vmcb_control_to_cache() by using PAGE_MASK. Also, expand the comment about the ASID being copied only for consistency checks. Suggested-by: Jim Mattson Signed-off-by: Yosry Ahmed --- arch/x86/include/asm/svm.h | 11 ++++++++--- arch/x86/kvm/svm/nested.c | 26 ++++++++++++++------------ arch/x86/kvm/svm/svm.h | 1 - 3 files changed, 22 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index a842018952d2c..44f2cfcd8d4ff 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -213,11 +213,13 @@ struct __attribute__ ((__packed__)) vmcb_control_area= { #define V_NMI_ENABLE_SHIFT 26 #define V_NMI_ENABLE_MASK (1 << V_NMI_ENABLE_SHIFT) =20 +#define X2APIC_MODE_SHIFT 30 +#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT) + #define AVIC_ENABLE_SHIFT 31 #define AVIC_ENABLE_MASK (1 << AVIC_ENABLE_SHIFT) =20 -#define X2APIC_MODE_SHIFT 30 -#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT) +#define SVM_INT_VECTOR_MASK (0xff) =20 #define SVM_INTERRUPT_SHADOW_MASK BIT_ULL(0) #define SVM_GUEST_INTERRUPT_MASK BIT_ULL(1) @@ -626,8 +628,11 @@ static inline void __unused_size_checks(void) #define SVM_EVTINJ_TYPE_EXEPT (3 << SVM_EVTINJ_TYPE_SHIFT) #define SVM_EVTINJ_TYPE_SOFT (4 << SVM_EVTINJ_TYPE_SHIFT) =20 -#define SVM_EVTINJ_VALID (1 << 31) #define SVM_EVTINJ_VALID_ERR (1 << 11) +#define SVM_EVTINJ_VALID (1 << 31) + +#define SVM_EVTINJ_RESERVED_BITS ~(SVM_EVTINJ_VEC_MASK | SVM_EVTINJ_TYPE_M= ASK | \ + SVM_EVTINJ_VALID_ERR | SVM_EVTINJ_VALID) =20 #define SVM_EXITINTINFO_VEC_MASK SVM_EVTINJ_VEC_MASK #define SVM_EXITINTINFO_TYPE_MASK SVM_EVTINJ_TYPE_MASK diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 89830380cebc5..503cb7f5a4c5f 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -479,10 +479,11
@@ void __nested_copy_vmcb_control_to_cache(struct kvm_v= cpu *vcpu, for (i =3D 0; i < MAX_INTERCEPT; i++) to->intercepts[i] =3D from->intercepts[i]; =20 - to->iopm_base_pa =3D from->iopm_base_pa; - to->msrpm_base_pa =3D from->msrpm_base_pa; + /* Lower bits of IOPM_BASE_PA and MSRPM_BASE_PA are ignored */ + to->iopm_base_pa =3D from->iopm_base_pa & PAGE_MASK; + to->msrpm_base_pa =3D from->msrpm_base_pa & PAGE_MASK; + to->tsc_offset =3D from->tsc_offset; - to->tlb_ctl =3D from->tlb_ctl; to->int_ctl =3D from->int_ctl; to->int_vector =3D from->int_vector; to->int_state =3D from->int_state; @@ -492,19 +493,21 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_v= cpu *vcpu, to->exit_info_2 =3D from->exit_info_2; to->exit_int_info =3D from->exit_int_info; to->exit_int_info_err =3D from->exit_int_info_err; - to->misc_ctl =3D from->misc_ctl; + to->misc_ctl =3D from->misc_ctl; to->event_inj =3D from->event_inj; to->event_inj_err =3D from->event_inj_err; to->next_rip =3D from->next_rip; to->nested_cr3 =3D from->nested_cr3; - to->misc_ctl2 =3D from->misc_ctl2; + to->misc_ctl2 =3D from->misc_ctl2; to->pause_filter_count =3D from->pause_filter_count; to->pause_filter_thresh =3D from->pause_filter_thresh; =20 - /* Copy asid here because nested_vmcb_check_controls will check it. */ + /* + * Copy asid here because nested_vmcb_check_controls() will check it. + * The ASID could be invalid, or conflict with another VM's ASID , so it + * should never be used directly to run L2. 
+ */ to->asid =3D from->asid; - to->msrpm_base_pa &=3D ~0x0fffULL; - to->iopm_base_pa &=3D ~0x0fffULL; =20 #ifdef CONFIG_KVM_HYPERV /* Hyper-V extensions (Enlightened VMCB) */ @@ -890,9 +893,9 @@ static void nested_vmcb02_prepare_control(struct vcpu_s= vm *svm, (svm->nested.ctl.int_ctl & int_ctl_vmcb12_bits) | (vmcb01->control.int_ctl & int_ctl_vmcb01_bits); =20 - vmcb02->control.int_vector =3D svm->nested.ctl.int_vector; - vmcb02->control.int_state =3D svm->nested.ctl.int_state; - vmcb02->control.event_inj =3D svm->nested.ctl.event_inj; + vmcb02->control.int_vector =3D svm->nested.ctl.int_vector & SVM_= INT_VECTOR_MASK; + vmcb02->control.int_state =3D svm->nested.ctl.int_state & SVM_I= NTERRUPT_SHADOW_MASK; + vmcb02->control.event_inj =3D svm->nested.ctl.event_inj & ~SVM_= EVTINJ_RESERVED_BITS; vmcb02->control.event_inj_err =3D svm->nested.ctl.event_inj_err; =20 /* @@ -1774,7 +1777,6 @@ static void nested_copy_vmcb_cache_to_control(struct = vmcb_control_area *dst, dst->msrpm_base_pa =3D from->msrpm_base_pa; dst->tsc_offset =3D from->tsc_offset; dst->asid =3D from->asid; - dst->tlb_ctl =3D from->tlb_ctl; dst->int_ctl =3D from->int_ctl; dst->int_vector =3D from->int_vector; dst->int_state =3D from->int_state; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index ef6bdce630dc0..c8d43793aa9d6 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -178,7 +178,6 @@ struct vmcb_ctrl_area_cached { u64 msrpm_base_pa; u64 tsc_offset; u32 asid; - u8 tlb_ctl; u32 int_ctl; u32 int_vector; u32 int_state; --=20 2.51.2.1041.gc1ab5b90ca-goog From nobody Sun Feb 8 00:12:07 2026 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [91.218.175.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83E94340D8C for ; Mon, 10 Nov 2025 22:30:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.171 
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813807; cv=none; b=NuPHfrflkcg083rF8oXnqBoakG6MGirMLiBiKrchri0tGfx4sENgbY8jSfehUHnOCUZvN3tvJSvbIDk0K9LgHYCBk81BaRfhmy+a+MxcvIY6mwJgv8xxaLlNbcJjwGjWynpCcWB/sv/8yFUChADygTI7y/Uxw7d0HULekjWHb1g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1762813807; c=relaxed/simple; bh=cp97R3GdqDcm5NylaZccmJl3gy9Tx4+loyUIVJzusNM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=P7/5au0x2w0b+acqcyF9fSqNKe8PZ94qdpp9ccNjwmh0t9bIrhDZMz3gCNCkmje3wMa57z2fmUiLUj4Hc85y5hJ1kbladVLpYPFomoxAMj0olo13r+xJf+w0Auy9cjo1rP/GsyOy/yGeItLTzbzG2ojWIkN8gf5H47OTP6KDgKE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=CtLichXf; arc=none smtp.client-ip=91.218.175.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="CtLichXf" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1762813803; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LaX7+H5ws1ElWVssSHUHDq3sKJLlQwFG3aehtAQ6AmU=; b=CtLichXfu636fEu6ihIFPj85dXsPwIG87igdhtmaKtn4HEsdv3VJoDz8xSuOAUeWr+WUbK V+FIAJ9Jnp2J/ozVl8MalyTkVZFTP4v6demE4QVN9SogKudzwYvq4d3j5m5rjhHqtjnPkm ovAtlj52IaKooGM2el2ZgIU2cD3unXU= From: Yosry Ahmed To: Sean Christopherson Cc: Paolo Bonzini , Jim Mattson , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed Subject: [PATCH v2 13/13] KVM: nSVM: Only copy NP_ENABLE from VMCB01's misc_ctl Date: Mon, 10 Nov 2025 22:29:22 +0000 Message-ID: <20251110222922.613224-14-yosry.ahmed@linux.dev> In-Reply-To: <20251110222922.613224-1-yosry.ahmed@linux.dev> References: <20251110222922.613224-1-yosry.ahmed@linux.dev> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Migadu-Flow: FLOW_OUT Content-Type: text/plain; charset="utf-8" The 'misc_ctl' field in VMCB02 is taken as-is from VMCB01. However, the only bit that needs to be copied is NP_ENABLE. This is a nop now because the other bits are for SEV guests, which do not support nested virtualization. Nonetheless, this hardens against future bugs if/when other bits are set for L1 but should not be set for L2. Opportunistically add a comment explaining why NP_ENABLE is taken from VMCB01 and not VMCB12. 
Suggested-by: Jim Mattson Signed-off-by: Yosry Ahmed --- arch/x86/kvm/svm/nested.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c index 503cb7f5a4c5f..4e278c1f9e6b3 100644 --- a/arch/x86/kvm/svm/nested.c +++ b/arch/x86/kvm/svm/nested.c @@ -837,8 +837,16 @@ static void nested_vmcb02_prepare_control(struct vcpu_= svm *svm, V_NMI_BLOCKING_MASK); } =20 - /* Copied from vmcb01. msrpm_base can be overwritten later. */ - vmcb02->control.misc_ctl =3D vmcb01->control.misc_ctl; + /* + * Copied from vmcb01. msrpm_base can be overwritten later. + * + * NP_ENABLE in vmcb12 is only used for consistency checks. If L1 + * enables NPTs, KVM shadows L1's NPTs and uses those to run L2. If L1 + * disables NPT, KVM runs L2 with the same NPTs used to run L1. For the + * latter, L1 runs L2 with shadow page tables that translate L2 GVAs to + * L1 GPAs, so the same NPTs can be used for L1 and L2. + */ + vmcb02->control.misc_ctl =3D vmcb01->control.misc_ctl & SVM_MISC_CTL_NP_E= NABLE; vmcb02->control.iopm_base_pa =3D vmcb01->control.iopm_base_pa; vmcb02->control.msrpm_base_pa =3D vmcb01->control.msrpm_base_pa; =20 --=20 2.51.2.1041.gc1ab5b90ca-goog
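For illustration only, the masking done in the hunk above can be sketched as a standalone userspace snippet. The SKETCH_* names, the helper functions and the bit positions are assumptions made for this example and are not taken from the patch; the only point is that bits other than NP_ENABLE set in vmcb01's misc_ctl never reach vmcb02.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed bit layout for this sketch only: bit 0 stands in for NP_ENABLE,
 * bits 1-2 stand in for other (e.g. SEV-related) controls.
 */
#define SKETCH_MISC_CTL_NP_ENABLE	(1ULL << 0)
#define SKETCH_MISC_CTL_OTHER_BITS	((1ULL << 1) | (1ULL << 2))

/* Hypothetical helper: keep only NP_ENABLE from vmcb01's misc_ctl. */
uint64_t sketch_vmcb02_misc_ctl(uint64_t vmcb01_misc_ctl)
{
	return vmcb01_misc_ctl & SKETCH_MISC_CTL_NP_ENABLE;
}

/* Self-check: NP_ENABLE is preserved, every other bit is dropped. */
void sketch_selfcheck(void)
{
	uint64_t all = SKETCH_MISC_CTL_NP_ENABLE | SKETCH_MISC_CTL_OTHER_BITS;

	assert(sketch_vmcb02_misc_ctl(all) == SKETCH_MISC_CTL_NP_ENABLE);
	assert(sketch_vmcb02_misc_ctl(SKETCH_MISC_CTL_OTHER_BITS) == 0);
	assert(sketch_vmcb02_misc_ctl(0) == 0);
}
```

With only NP_ENABLE set in the input, the result is identical to a plain copy, which matches the commit message's note that the change is a nop today.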