[PATCH v7 09/26] KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT

Yosry Ahmed posted 26 patches 1 month ago
[PATCH v7 09/26] KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT
Posted by Yosry Ahmed 1 month ago
If loading L1's CR3 fails on a nested #VMEXIT, nested_svm_vmexit()
returns an error code that is ignored by most callers, and continues to
run L1 with corrupted state. A sane recovery is not possible in this
case, and HW behavior is to cause a shutdown. Inject a triple fault
,nstead, and do not return early from nested_svm_vmexit(). Continue
cleaning up the vCPU state (e.g. clear pending exceptions), to handle
the failure as gracefully as possible.

From the APM:
	Upon #VMEXIT, the processor performs the following actions in
	order to return to the host execution context:

	...
	if (illegal host state loaded, or exception while loading
	    host state)
		shutdown
	else
		execute first host instruction following the VMRUN

Remove the return value of nested_svm_vmexit(), which is mostly
unchecked anyway.

Fixes: d82aaef9c88a ("KVM: nSVM: use nested_svm_load_cr3() on guest->host switch")
CC: stable@vger.kernel.org
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/svm/nested.c | 10 +++-------
 arch/x86/kvm/svm/svm.c    | 11 ++---------
 arch/x86/kvm/svm/svm.h    |  6 +++---
 3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 5ad0ac3680fdd..bb2cec5fd0434 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1193,12 +1193,11 @@ static int nested_svm_vmexit_update_vmcb12(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-int nested_svm_vmexit(struct vcpu_svm *svm)
+void nested_svm_vmexit(struct vcpu_svm *svm)
 {
 	struct kvm_vcpu *vcpu = &svm->vcpu;
 	struct vmcb *vmcb01 = svm->vmcb01.ptr;
 	struct vmcb *vmcb02 = svm->nested.vmcb02.ptr;
-	int rc;
 
 	if (nested_svm_vmexit_update_vmcb12(vcpu))
 		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
@@ -1317,9 +1316,8 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	nested_svm_uninit_mmu_context(vcpu);
 
-	rc = nested_svm_load_cr3(vcpu, vmcb01->save.cr3, false, true);
-	if (rc)
-		return 1;
+	if (nested_svm_load_cr3(vcpu, vmcb01->save.cr3, false, true))
+		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
 
 	/*
 	 * Drop what we picked up for L2 via svm_complete_interrupts() so it
@@ -1344,8 +1342,6 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	 */
 	if (kvm_apicv_activated(vcpu->kvm))
 		__kvm_vcpu_update_apicv(vcpu);
-
-	return 0;
 }
 
 static void nested_svm_triple_fault(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index cb53174583a26..1b31b033d79b0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2234,13 +2234,9 @@ static int emulate_svm_instr(struct kvm_vcpu *vcpu, int opcode)
 		[SVM_INSTR_VMSAVE] = vmsave_interception,
 	};
 	struct vcpu_svm *svm = to_svm(vcpu);
-	int ret;
 
 	if (is_guest_mode(vcpu)) {
-		/* Returns '1' or -errno on failure, '0' on success. */
-		ret = nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
-		if (ret)
-			return ret;
+		nested_svm_simple_vmexit(svm, guest_mode_exit_codes[opcode]);
 		return 1;
 	}
 	return svm_instr_handlers[opcode](vcpu);
@@ -4796,7 +4792,6 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, union kvm_smram *smram)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	struct kvm_host_map map_save;
-	int ret;
 
 	if (!is_guest_mode(vcpu))
 		return 0;
@@ -4816,9 +4811,7 @@ static int svm_enter_smm(struct kvm_vcpu *vcpu, union kvm_smram *smram)
 	svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
 	svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
 
-	ret = nested_svm_simple_vmexit(svm, SVM_EXIT_SW);
-	if (ret)
-		return ret;
+	nested_svm_simple_vmexit(svm, SVM_EXIT_SW);
 
 	/*
 	 * KVM uses VMCB01 to store L1 host state while L2 runs but
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 44d767cd1d25a..7629cb37c9302 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -793,14 +793,14 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu);
 void svm_copy_vmrun_state(struct vmcb_save_area *to_save,
 			  struct vmcb_save_area *from_save);
 void svm_copy_vmloadsave_state(struct vmcb *to_vmcb, struct vmcb *from_vmcb);
-int nested_svm_vmexit(struct vcpu_svm *svm);
+void nested_svm_vmexit(struct vcpu_svm *svm);
 
-static inline int nested_svm_simple_vmexit(struct vcpu_svm *svm, u32 exit_code)
+static inline void nested_svm_simple_vmexit(struct vcpu_svm *svm, u32 exit_code)
 {
 	svm->vmcb->control.exit_code	= exit_code;
 	svm->vmcb->control.exit_info_1	= 0;
 	svm->vmcb->control.exit_info_2	= 0;
-	return nested_svm_vmexit(svm);
+	nested_svm_vmexit(svm);
 }
 
 int nested_svm_exit_handled(struct vcpu_svm *svm);
-- 
2.53.0.473.g4a7958ca14-goog
Re: [PATCH v7 09/26] KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT
Posted by Sean Christopherson 4 weeks, 1 day ago
On Tue, Mar 03, 2026, Yosry Ahmed wrote:
> If loading L1's CR3 fails on a nested #VMEXIT, nested_svm_vmexit()
> returns an error code that is ignored by most callers, and continues to
> run L1 with corrupted state. A sane recovery is not possible in this
> case, and HW behavior is to cause a shutdown. Inject a triple fault
> ,nstead, and do not return early from nested_svm_vmexit(). Continue

s/,/i

> cleaning up the vCPU state (e.g. clear pending exceptions), to handle
> the failure as gracefully as possible.
> 
> >From the APM:
> 	Upon #VMEXIT, the processor performs the following actions in
> 	order to return to the host execution context:
> 
> 	...
> 	if (illegal host state loaded, or exception while loading
> 	    host state)
> 		shutdown
> 	else
> 		execute first host instruction following the VMRUN

Uber nit, use spaces instead of tabs in changelogs, as indenting eight chars is
almost always overkill and changelogs are more likely to be viewed in a reader
that has tab-stops set to something other than eight.  E.g. using two spaces as
the margin and then manual indentation of four:

From the APM:

  Upon #VMEXIT, the processor performs the following actions in order to
  return to the host execution context:

  ...

  if (illegal host state loaded, or exception while loading host state)
      shutdown
  else
      execute first host instruction following the VMRUN

Remove the return value of nested_svm_vmexit(), which is mostly
unchecked anyway.


> Remove the return value of nested_svm_vmexit(), which is mostly
> unchecked anyway.
> 
> Fixes: d82aaef9c88a ("KVM: nSVM: use nested_svm_load_cr3() on guest->host switch")
> CC: stable@vger.kernel.org

Heh, and super duper uber nit, "Cc:" is much more common than "CC:" (I'm actually
somewhat surprised checkpatch didn't complain since it's so particular about case
for other trailers).

$ git log -10000 | grep "CC:" | wc -l
38
$ git log -10000 | grep "Cc:" | wc -l
11238
Re: [PATCH v7 09/26] KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT
Posted by Yosry Ahmed 4 weeks, 1 day ago
On Tue, Mar 3, 2026 at 8:49 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Mar 03, 2026, Yosry Ahmed wrote:
> > If loading L1's CR3 fails on a nested #VMEXIT, nested_svm_vmexit()
> > returns an error code that is ignored by most callers, and continues to
> > run L1 with corrupted state. A sane recovery is not possible in this
> > case, and HW behavior is to cause a shutdown. Inject a triple fault
> > ,nstead, and do not return early from nested_svm_vmexit(). Continue
>
> s/,/i

Not sure how that happened lol.

>
> > cleaning up the vCPU state (e.g. clear pending exceptions), to handle
> > the failure as gracefully as possible.
> >
> > >From the APM:
> >       Upon #VMEXIT, the processor performs the following actions in
> >       order to return to the host execution context:
> >
> >       ...
> >       if (illegal host state loaded, or exception while loading
> >           host state)
> >               shutdown
> >       else
> >               execute first host instruction following the VMRUN
>
> Uber nit, use spaces instead of tabs in changelogs, as indenting eight chars is
> almost always overkill and changelogs are more likely to be viewed in a reader
> that has tab-stops set to something other than eight.  E.g. using two spaces as
> the margin and then manual indentation of four:

Yeah I started doing that recently but I didn't go back to change old ones.

[..]
> >
> > Fixes: d82aaef9c88a ("KVM: nSVM: use nested_svm_load_cr3() on guest->host switch")
> > CC: stable@vger.kernel.org
>
> Heh, and super duper uber nit, "Cc:" is much more common than "CC:" (I'm actually
> somewhat surprised checkpatch didn't complain since it's so particular about case
> for other trailers).
>
> $ git log -10000 | grep "CC:" | wc -l
> 38
> $ git log -10000 | grep "Cc:" | wc -l
> 11238

That was a mistake, I think I generally use Cc.