KVM currently injects a #GP if mapping vmcb12 fails when emulating
VMRUN/VMLOAD/VMSAVE. This is not architectural behavior: #GP should
only be injected if the physical address is unsupported or misaligned,
checks that hardware performs before the VMRUN intercept is evaluated.
Instead, handle it as an emulation failure, similar to how nVMX handles
failures to read/write guest memory in several emulation paths.
When virtual VMLOAD/VMSAVE is enabled, if vmcb12's GPA is not mapped in
the NPTs, a #NPF VMEXIT is generated, and KVM installs an MMIO SPTE and
emulates the instruction if there is no corresponding memslot.
x86_emulate_insn() returns EMULATION_FAILED because VMLOAD/VMSAVE are
not handled in the twobyte_insn cases.
Even though this also results in an emulation failure, it returns
straight to userspace only if KVM_CAP_EXIT_ON_EMULATION_FAILURE is set.
Otherwise, KVM injects a #UD and exits to userspace only if the vCPU is
not in guest mode. So the behavior is slightly different when virtual
VMLOAD/VMSAVE is enabled.
Fixes: 3d6368ef580a ("KVM: SVM: Add VMRUN handler")
Reported-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
arch/x86/kvm/svm/nested.c | 13 ++++++-------
arch/x86/kvm/svm/svm.c | 6 ++----
2 files changed, 8 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 6d4c053778b21..089cdcfd60340 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1107,15 +1107,14 @@ int nested_svm_vmrun(struct kvm_vcpu *vcpu)
ret = nested_svm_copy_vmcb12_to_cache(vcpu, vmcb12_gpa);
/*
- * Advance RIP if #GP or #UD are not injected, but otherwise
- * stop if copying and checking vmcb12 failed.
+ * Advance RIP if instruction emulation completes, whether it's a
+ * successful VMRUN or a failed one with #VMEXIT(INVALID), but not if
+ * #GP/#UD is injected, or if reading vmcb12 fails.
*/
- if (ret == -EFAULT) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- } else if (ret) {
+ if (ret == -EFAULT)
+ return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
+ else if (ret)
return kvm_skip_emulated_instruction(vcpu);
- }
ret = kvm_skip_emulated_instruction(vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7ec0b0e8945fe..35433bd345eff 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2190,10 +2190,8 @@ static int vmload_vmsave_interception(struct kvm_vcpu *vcpu, bool vmload)
if (nested_svm_check_permissions(vcpu))
return 1;
- if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map)) {
- kvm_inject_gp(vcpu, 0);
- return 1;
- }
+ if (kvm_vcpu_map(vcpu, gpa_to_gfn(svm->vmcb->save.rax), &map))
+ return kvm_handle_memory_failure(vcpu, X86EMUL_IO_NEEDED, NULL);
vmcb12 = map.hva;
--
2.53.0.473.g4a7958ca14-goog
On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> [...]
Nice find from the AI bot. We should probably also update
gp_interception() to make sure we reinject a #GP if the address exceeds
MAXPHYADDR. Something like:
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 5362443f4bbce..1c52d6d59c480 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2320,7 +2320,8 @@ static int gp_interception(struct kvm_vcpu *vcpu)
EMULTYPE_VMWARE_GP | EMULTYPE_NO_DECODE);
} else {
/* All SVM instructions expect page aligned RAX */
- if (svm->vmcb->save.rax & ~PAGE_MASK)
+ if (svm->vmcb->save.rax & ~PAGE_MASK ||
+ svm->vmcb->save.rax & rsvd_bits(cpuid_maxphyaddr(vcpu), 63))
goto reinject;
return emulate_svm_instr(vcpu, opcode);
On Fri, Mar 6, 2026 at 5:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
>
> On Fri, Mar 6, 2026 at 1:09 PM Yosry Ahmed <yosry@kernel.org> wrote:
> >
> > [...]
>
> Nice find from AI bot, we should probably update gp_interception() to
> make sure we reinject a #GP if the address exceeds MAXPHYADDR.
> Something like:
Actually we should probably add a helper (e.g. svm_instr_should_gp()
or svm_instr_check_rax()) to figure out if we need to #GP on RAX for
VMRUN/VMLOAD/VMSAVE, and use it in both gp_interception() and
check_svme_pa() -- the latter doesn't have the misaligned page check
(although I think in practice we might not need it).