[PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator

Yosry Ahmed posted 7 patches 3 weeks, 4 days ago
There is a newer version of this series
[PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Yosry Ahmed 3 weeks, 4 days ago
The check for legal GPA in RAX hardcodes a mask for 48 bits of physical
address width. This incorrectly injects a #GP for valid 52-bit GPAs.
However, instead of fixing the check, remove it completely as it is
unnecessary.

If RAX contains an illegal GPA, the CPU should inject #GP. If KVM
intercepts #GP, the emulator is only used for decoding the instruction.
Otherwise, if RAX is illegal from the guest's perspective but not the
host's (due to allow_smaller_maxphyaddr), then KVM should always
intercept the instructions (as NPT and VLS should both be disabled). The
interception path for VMRUN/VMSAVE/VMLOAD also does not invoke the
emulator either. Hence, the emulator can never be invoked with an
actually illegal RAX.

Outside of forced emulation or code stream rewriting, the emulator
should only be invoked for these instructions in cases such as RAX
having a legal GPA that lies outside guest memory, as the #NPF
interception handler will try to emulate the instruction after failing
to create a proper mapping in the NPT. In this case, the emulator's
responsibility ends with checking pre-intercept exceptions and
intercepts, it does not actually emulate these instructions.

According to the APM, #GP due to invalid op happens after the
interception check:

  Generally, instruction intercepts are checked after simple exceptions
  (such as #GP—when CPL is incorrect—or #UD) have been checked, but
  before exceptions related to memory accesses (such as page faults) and
  exceptions based on specific operand values.

Arguably, the emulator's checks for EFER.SVME and intercepts are also
unnecessary. If EFER.SVME is cleared or if L1 intercepts
VMRUN/VMSAVE/VMLOAD (for nested), then KVM should always be intercepting
these instructions anyway, and the emulator should not be invoked (see
above). Leave dealing with that for later.

Fixes: 01de8b09e606 ("KVM: SVM: Add intercept checks for SVM instructions")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
 arch/x86/kvm/emulate.c | 17 +++--------------
 1 file changed, 3 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 6145dac4a605a..a449a00555da1 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3883,17 +3883,6 @@ static int check_svme(struct x86_emulate_ctxt *ctxt)
 	return X86EMUL_CONTINUE;
 }
 
-static int check_svme_pa(struct x86_emulate_ctxt *ctxt)
-{
-	u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
-
-	/* Valid physical address? */
-	if (rax & 0xffff000000000000ULL)
-		return emulate_gp(ctxt, 0);
-
-	return check_svme(ctxt);
-}
-
 static int check_rdtsc(struct x86_emulate_ctxt *ctxt)
 {
 	u64 cr4 = ctxt->ops->get_cr(ctxt, 4);
@@ -3997,10 +3986,10 @@ static const struct opcode group7_rm2[] = {
 };
 
 static const struct opcode group7_rm3[] = {
-	DIP(SrcNone | Prot | Priv,		vmrun,		check_svme_pa),
+	DIP(SrcNone | Prot | Priv,		vmrun,		check_svme),
 	II(SrcNone  | Prot | EmulateOnUD,	em_hypercall,	vmmcall),
-	DIP(SrcNone | Prot | Priv,		vmload,		check_svme_pa),
-	DIP(SrcNone | Prot | Priv,		vmsave,		check_svme_pa),
+	DIP(SrcNone | Prot | Priv,		vmload,		check_svme),
+	DIP(SrcNone | Prot | Priv,		vmsave,		check_svme),
 	DIP(SrcNone | Prot | Priv,		stgi,		check_svme),
 	DIP(SrcNone | Prot | Priv,		clgi,		check_svme),
 	DIP(SrcNone | Prot | Priv,		skinit,		check_svme),
-- 
2.53.0.851.ga537e3e6e9-goog

Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Paolo Bonzini 3 weeks, 2 days ago
On 3/13/26 01:10, Yosry Ahmed wrote:
> Outside of forced emulation or code stream rewriting,

But isn't that the point? Due to code stream rewriting or intentional 
usage of stale TLBs (so that the processor executes one instruction and 
the emulator another), the emulator cannot assume that it will "never be 
invoked with an actually illegal RAX".

I realize that I'm late to the show, so I apologize in advance if this 
has been discussed before.

Paolo
Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Yosry Ahmed 3 weeks, 1 day ago
On Sun, Mar 15, 2026 at 5:55 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 3/13/26 01:10, Yosry Ahmed wrote:
> > Outside of forced emulation or code stream rewriting,
>
> But isn't that the point? Due to code stream rewriting or intentional
> usage of stale TLBs (so that the processor executes one instruction and
> the emulator another), the emulator cannot assume that it will "never be
> invoked with an actually illegal RAX".
>
> I realize that I'm late to the show, so I apologize in advance if this
> has been discussed before.

Thanks for chiming in. FWIW, I initially intended to fix this check
instead of removing it, the removal came after a discussion with Sean,
see https://lore.kernel.org/kvm/abH0RdnM29Xyh_4G@google.com for more
context.

TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
broken/unsupported anyway, beyond checking for intercepts and
pre-intercept exceptions (well, even that is broken), and the RAX
check should be after that architecturally.

The emulator isn't wired up to actually emulate these instructions.
Also, it performs the #GP on CPL check before the #UD on EFER.SVME
check, which is wrong. The right thing to do for the illegal RAX check
would be to move it after the intercept check, but we don't have an
execute() callback to put it in. I think what we really should do is
actually wire up the emulator to call into the intercept handlers for
VMRUN/VMLOAD/VMSAVE, such that it actually supports these
instructions, and the RAX check is done in this intercept handlers (or
at least will be after this series).

Given that the goal of this series is actually to stop injecting a
non-architectural #GP when KVM fails to read vmcb12, I didn't really
want to derail it anymore by making it emulator-focused. We can
probably do a follow-up on just improving the emulator support for
these instructions as I mentioned above, as that would take its own
series AFAICT.

The only reason this patch exists here is because the buggy RAX check
in the emulator results in injecting a non-architectural #GP on 52-bit
MAXPHYADDR platforms (the #NPF scenario described in the changelog).
So the check is blocking the goal of the series. I didn't really care
much whether we fixed it or removed it, but Sean did not want to fix
it and have the emulator directly peek into vCPU state, so removal it
was.

I hope this makes things clearer.
Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Paolo Bonzini 3 weeks ago
Il lun 16 mar 2026, 14:49 Yosry Ahmed <yosry@kernel.org> ha scritto:
>
> TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
> broken/unsupported anyway, beyond checking for intercepts and
> pre-intercept exceptions (well, even that is broken), and the RAX
> check should be after that architecturally.

Agreed about that, just ripping it out completely would be fine. It's
only if the emulation happens that it has to be complete.

> Given that the goal of this series is actually to stop injecting a
> non-architectural #GP when KVM fails to read vmcb12, I didn't really
> want to derail it anymore by making it emulator-focused. We can
> probably do a follow-up on just improving the emulator support for
> these instructions as I mentioned above, as that would take its own
> series AFAICT.

Agreed.

> The only reason this patch exists here is because the buggy RAX check
> in the emulator results in injecting a non-architectural #GP on 52-bit
> MAXPHYADDR platforms (the #NPF scenario described in the changelog).

Yeah, that part was clearly broken when physical address bits broke
AMD's 48-bit limit (and I don't think the details of what is
considered an invalid physical address are part of the architectural
description, for example some SMM ranges are carved out by the
VMRUN/VMLOAD/VMSAVE microcode).

Thanks,

Paolo

> So the check is blocking the goal of the series. I didn't really care
> much whether we fixed it or removed it, but Sean did not want to fix
> it and have the emulator directly peek into vCPU state, so removal it
> was.
>
> I hope this makes things clearer.
>
Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Jim Mattson 2 weeks, 6 days ago
On Tue, Mar 17, 2026 at 6:15 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Il lun 16 mar 2026, 14:49 Yosry Ahmed <yosry@kernel.org> ha scritto:
> >
> > TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
> > broken/unsupported anyway, beyond checking for intercepts and
> > pre-intercept exceptions (well, even that is broken), and the RAX
> > check should be after that architecturally.
>
> Agreed about that, just ripping it out completely would be fine. It's
> only if the emulation happens that it has to be complete.
>
> > Given that the goal of this series is actually to stop injecting a
> > non-architectural #GP when KVM fails to read vmcb12, I didn't really
> > want to derail it anymore by making it emulator-focused. We can
> > probably do a follow-up on just improving the emulator support for
> > these instructions as I mentioned above, as that would take its own
> > series AFAICT.
>
> Agreed.
>
> > The only reason this patch exists here is because the buggy RAX check
> > in the emulator results in injecting a non-architectural #GP on 52-bit
> > MAXPHYADDR platforms (the #NPF scenario described in the changelog).
>
> Yeah, that part was clearly broken when physical address bits broke
> AMD's 48-bit limit (and I don't think the details of what is
> considered an invalid physical address are part of the architectural
> description, for example some SMM ranges are carved out by the
> VMRUN/VMLOAD/VMSAVE microcode).

IIUC, VLS is broken because it applies the host's microarchitectural
physical address restrictions to untranslated *guest* physical
addresses. Is that right?
Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Paolo Bonzini 2 weeks, 5 days ago
On Tue, Mar 17, 2026 at 3:58 PM Jim Mattson <jmattson@google.com> wrote:
> > Yeah, that part was clearly broken when physical address bits broke
> > AMD's 48-bit limit (and I don't think the details of what is
> > considered an invalid physical address are part of the architectural
> > description, for example some SMM ranges are carved out by the
> > VMRUN/VMLOAD/VMSAVE microcode).
>
> IIUC, VLS is broken because it applies the host's microarchitectural
> physical address restrictions to untranslated *guest* physical
> addresses. Is that right?

Yes, look at gp_interception() in svm.c. I'm not sure if AMD ever said
publicly whether it was an erratum, or intentional even with VLS
enabled (https://lkml.org/lkml/2021/1/13/513).

(Also I hope that, with VLS enabled, the check is performed *again*
after translating the address).

Paolo
Re: [PATCH v3 1/7] KVM: SVM: Drop RAX check for SVM instructions from the emulator
Posted by Yosry Ahmed 3 weeks ago
On Mon, Mar 16, 2026 at 06:49:35AM -0700, Yosry Ahmed wrote:
> On Sun, Mar 15, 2026 at 5:55 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 3/13/26 01:10, Yosry Ahmed wrote:
> > > Outside of forced emulation or code stream rewriting,
> >
> > But isn't that the point? Due to code stream rewriting or intentional
> > usage of stale TLBs (so that the processor executes one instruction and
> > the emulator another), the emulator cannot assume that it will "never be
> > invoked with an actually illegal RAX".
> >
> > I realize that I'm late to the show, so I apologize in advance if this
> > has been discussed before.
> 
> Thanks for chiming in. FWIW, I initially intended to fix this check
> instead of removing it, the removal came after a discussion with Sean,
> see https://lore.kernel.org/kvm/abH0RdnM29Xyh_4G@google.com for more
> context.
> 
> TL;DR is that the emulator support for VMRUN/VMLOAD/VMSAVE is
> broken/unsupported anyway, beyond checking for intercepts and
> pre-intercept exceptions (well, even that is broken), and the RAX
> check should be after that architecturally.

No, this is wrong. I believe Sean's read of the APM was incomplete, he
quoted this part (which I did include in my changelog):

  Generally, instruction intercepts are checked after simple exceptions
  (such as #GP—when CPL is incorrect—or #UD) have been checked, but
  before exceptions related to memory accesses (such as page faults) and
  exceptions based on specific operand values.

But in table 15-7 for instruction intercepts, the rows for
VMRUN/VMLOAD/VMSAVE have this in the priority column:

  Checks exceptions (#GP) before the intercept.

Unlike other rows that specify #GP on CPL != 0.

Additionally, in the VMRUN pseudocode, we have this:

  IF ((MSR_EFER.SVME == 0) || (!PROTECTED_MODE))
    EXCEPTION [#UD] // This instruction can only be executed in
                       protected mode with SVM enabled
  IF (CPL != 0) // This instruction is only allowed at CPL 0
    EXCEPTION [#GP]
  IF (rAX contains an unsupported physical address)
    EXCEPTION [#GP]
  IF (intercepted(VMRUN))
    #VMEXIT (VMRUN)

The pseudocode for VMLOAD/VMSAVE does not have the intercepted() check
though, but I assume it's the save as VMRUN. I did confirm that with
vls=0, vmload_interception() is not being called on VMLOAD if RAX=-1ULL.

So I think the RAX check is actually intended to happen before the
intercept. I think I will go back to fixing the RAX check instead of
dropping it.