When KVM intercepts #GP on an SVM instruction, it reinjects the #GP if
the instruction was executed with a misaligned RAX. However, a #GP
should also be reinjected if RAX contains an illegal GPA. According to
the APM, one of the #GP conditions is:

  rAX referenced a physical address above the maximum
  supported physical address.
Replace the PAGE_MASK check with page_address_valid(), which checks both
page-alignment as well as the legality of the GPA based on the vCPU's
MAXPHYADDR. Use kvm_register_read() to read RAX to avoid
page_address_valid() failing on 32-bit due to garbage in the higher
bits.
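
For illustration only (not part of the patch), the combined check that
page_address_valid() performs can be sketched as standalone C, with
"maxphyaddr" standing in for cpuid_maxphyaddr(vcpu):

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Sketch of the two conditions page_address_valid() enforces: the GPA
 * must be page-aligned, and must not set any bit at or above the vCPU's
 * MAXPHYADDR. Helper name and signature are illustrative, not KVM API.
 */
static bool page_address_valid_sketch(uint64_t gpa, unsigned int maxphyaddr)
{
	/* Non-zero offset within the page: misaligned RAX, reinject #GP. */
	if (gpa & (SKETCH_PAGE_SIZE - 1))
		return false;

	/* Any bit at or above MAXPHYADDR makes the GPA illegal per the APM. */
	if (maxphyaddr < 64 && (gpa >> maxphyaddr))
		return false;

	return true;
}
```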
Note that this is currently only a problem if KVM is running an L2 guest
and ends up synthesizing a #VMEXIT to L1, as the RAX check takes
precedence over the intercept. Otherwise, if KVM emulates the
instruction, kvm_vcpu_map() should fail on illegal GPAs and inject a #GP
anyway. However, following patches will change the failure behavior of
kvm_vcpu_map(), so make sure the #GP interception handler does this
appropriately.
Opportunistically add a teaser FIXME noting that the handling of SVM
instructions on #GP belongs in the emulator.
Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround")
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
---
arch/x86/kvm/svm/svm.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 392a5088f20bf..3122a98745ab7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2277,10 +2277,12 @@ static int gp_interception(struct kvm_vcpu *vcpu)
 	if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
 		goto reinject;
 
+	/* FIXME: Handle SVM instructions through the emulator */
 	svm_exit_code = svm_instr_exit_code(vcpu);
 	if (svm_exit_code) {
-		/* All SVM instructions expect page aligned RAX */
-		if (svm->vmcb->save.rax & ~PAGE_MASK)
+		unsigned long rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
+
+		if (!page_address_valid(vcpu, rax))
 			goto reinject;
 
 		if (is_guest_mode(vcpu)) {
--
2.53.0.851.ga537e3e6e9-goog
On Mon, Mar 16, 2026, Yosry Ahmed wrote:
> Replace the PAGE_MASK check with page_address_valid(), which checks both
> page-alignment as well as the legality of the GPA based on the vCPU's
> MAXPHYADDR. Use kvm_register_read() to read RAX to avoid
> page_address_valid() failing on 32-bit due to garbage in the higher
> bits.
Nit, not "on" 32-bit, correct? I think you actually mean "to avoid false positives
when the vCPU is in 32-bit mode, in the unlikely case the vCPU transitioned from
64-bit back to 32-bit, without writing EAX". Because regs[] is an unsigned long,
the upper bits of save.rax will be cleared by svm_vcpu_run() on every VM-Entry,
and it should be impossible for a purely 32-bit guest to get a non-zero value in
RAX[63:32].
And even for a 64-bit host with a 32-bit guest, the only way to get a non-zero
value in RAX[63:32] while in 32-bit mode would be to transition from 64-bit mode,
back to 32-bit mode, without writing EAX.
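
The stale-upper-bits scenario described above can be sketched as follows
(the helper is a hypothetical stand-in for the mode-aware truncation that
kvm_register_read() provides, not actual KVM code):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Scenario: the guest writes a 64-bit RAX in long mode, transitions back
 * to 32-bit mode without touching EAX, then executes an intercepted SVM
 * instruction. A raw read of save.rax would still see the stale bits in
 * RAX[63:32]; a mode-aware read truncates to the architecturally visible
 * EAX when the vCPU is not in 64-bit mode.
 */
static uint64_t mode_aware_rax_read(uint64_t raw_rax, bool long_mode)
{
	return long_mode ? raw_rax : (uint32_t)raw_rax;
}
```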
> Note that this is currently only a problem if KVM is running an L2 guest
> and ends up synthesizing a #VMEXIT to L1, as the RAX check takes
> precedence over the intercept. Otherwise, if KVM emulates the
> instruction, kvm_vcpu_map() should fail on illegal GPAs and inject a #GP
> anyway. However, following patches will change the failure behavior of
> kvm_vcpu_map(), so make sure the #GP interception handler does this
> appropriately.
>
> Opportunistically add a teaser FIXME noting that the handling of SVM
> instructions on #GP belongs in the emulator.
>
> Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
> Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround")
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Yosry Ahmed <yosry@kernel.org>
> ---
> arch/x86/kvm/svm/svm.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 392a5088f20bf..3122a98745ab7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2277,10 +2277,12 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
> goto reinject;
>
> + /* FIXME: Handle SVM instructions through the emulator */
> svm_exit_code = svm_instr_exit_code(vcpu);
> if (svm_exit_code) {
> - /* All SVM instructions expect page aligned RAX */
> - if (svm->vmcb->save.rax & ~PAGE_MASK)
> + unsigned long rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> +
> + if (!page_address_valid(vcpu, rax))
Eh, let it poke out, i.e.

	if (!page_address_valid(vcpu, kvm_register_read(vcpu, VCPU_REGS_RAX)))
		goto reinject;
> goto reinject;
>
> if (is_guest_mode(vcpu)) {
> --
> 2.53.0.851.ga537e3e6e9-goog
>
On Fri, Apr 03, 2026, Sean Christopherson wrote:
> On Mon, Mar 16, 2026, Yosry Ahmed wrote:
> > Replace the PAGE_MASK check with page_address_valid(), which checks both
> > page-alignment as well as the legality of the GPA based on the vCPU's
> > MAXPHYADDR. Use kvm_register_read() to read RAX to avoid
> > page_address_valid() failing on 32-bit due to garbage in the higher
> > bits.
>
> Nit, not "on" 32-bit, correct? I think you actually mean "to avoid false positives
> when the vCPU is in 32-bit mode, in the unlikely case the vCPU transitioned from
> 64-bit back to 32-bit, without writing EAX". Because regs[] is an unsigned long,
> the upper bits of save.rax will be cleared by svm_vcpu_run() on every VM-Entry,
> and it should be impossible for a purely 32-bit guest to get a non-zero value in
> RAX[63:32].
>
> And even for a 64-bit host with a 32-bit guest, the only way to get a non-zero
> value in RAX[63:32] while in 32-bit mode would be to transition from 64-bit mode,
> back to 32-bit mode, without writing EAX.
>
> > Note that this is currently only a problem if KVM is running an L2 guest
> > and ends up synthesizing a #VMEXIT to L1, as the RAX check takes
> > precedence over the intercept. Otherwise, if KVM emulates the
> > instruction, kvm_vcpu_map() should fail on illegal GPAs and inject a #GP
> > anyway. However, following patches will change the failure behavior of
> > kvm_vcpu_map(), so make sure the #GP interception handler does this
> > appropriately.
> >
> > Opportunistically add a teaser FIXME noting that the handling of SVM
> > instructions on #GP belongs in the emulator.
> >
> > Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
> > Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround")
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Yosry Ahmed <yosry@kernel.org>
> > ---
> > arch/x86/kvm/svm/svm.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 392a5088f20bf..3122a98745ab7 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -2277,10 +2277,12 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> > if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
> > goto reinject;
> >
> > + /* FIXME: Handle SVM instructions through the emulator */
> > svm_exit_code = svm_instr_exit_code(vcpu);
> > if (svm_exit_code) {
> > - /* All SVM instructions expect page aligned RAX */
> > - if (svm->vmcb->save.rax & ~PAGE_MASK)
> > + unsigned long rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> > +
> > + if (!page_address_valid(vcpu, rax))
>
> Eh, let it poke out, i.e.
>
> if (!page_address_valid(vcpu, kvm_register_read(vcpu, VCPU_REGS_RAX)))
Argh, looking at the rest of this series, and at KVM's existing code, having to
use kvm_register_read() is awful. This really should be able to use kvm_rax_read(),
but that won't handle the truncation.
There are only a handful of likely-benign goofs due to this mess, but there is a
pile of manual truncation and casting going on. In addition to _raw() variants
and mode-aware defaults, adding "e" versions would be helpful, as many of the
explicit truncation flows are cases where e.g. EAX, ECX, and EDX are
architecturally accessed.
I'll put together patches, and think more on how to handle this series (the
dependencies aren't terrible, but they certainly are annoying). I'm tempted
to squeeze this into 7.1 to make future life easier...
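
To make the proposal concrete, the accessor family sketched above might
take shapes like the following (all names and signatures are assumptions
about a future API, not existing KVM code):

```c
#include <stdbool.h>
#include <stdint.h>

/* _raw() variant: full 64-bit register value, no mode awareness. */
static uint64_t rax_read_raw(uint64_t regs_rax)
{
	return regs_rax;
}

/*
 * "e" variant: architectural 32-bit access, always truncates; useful for
 * flows where EAX/ECX/EDX are architecturally accessed regardless of mode.
 */
static uint32_t eax_read(uint64_t regs_rax)
{
	return (uint32_t)regs_rax;
}

/* Mode-aware default: truncate to 32 bits outside 64-bit mode. */
static uint64_t rax_read(uint64_t regs_rax, bool long_mode)
{
	return long_mode ? regs_rax : (uint32_t)regs_rax;
}
```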
> goto reinject;
>
> > goto reinject;
> >
> > if (is_guest_mode(vcpu)) {
> > --
> > 2.53.0.851.ga537e3e6e9-goog
> >
On Fri, Apr 3, 2026 at 12:00 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Apr 03, 2026, Sean Christopherson wrote:
> > On Mon, Mar 16, 2026, Yosry Ahmed wrote:
> > > Replace the PAGE_MASK check with page_address_valid(), which checks both
> > > page-alignment as well as the legality of the GPA based on the vCPU's
> > > MAXPHYADDR. Use kvm_register_read() to read RAX to avoid
> > > page_address_valid() failing on 32-bit due to garbage in the higher
> > > bits.
> >
> > Nit, not "on" 32-bit, correct? I think you actually mean "to avoid false positives
> > when the vCPU is in 32-bit mode, in the unlikely case the vCPU transitioned from
> > 64-bit back to 32-bit, without writing EAX". Because regs[] is an unsigned long,
> > the upper bits of save.rax will be cleared by svm_vcpu_run() on every VM-Entry,
> > and it should be impossible for a purely 32-bit guest to get a non-zero value in
> > RAX[63:32].
> >
> > And even for a 64-bit host with a 32-bit guest, the only way to get a non-zero
> > value in RAX[63:32] while in 32-bit mode would be to transition from 64-bit mode,
> > back to 32-bit mode, without writing EAX.
> >
> > > Note that this is currently only a problem if KVM is running an L2 guest
> > > and ends up synthesizing a #VMEXIT to L1, as the RAX check takes
> > > precedence over the intercept. Otherwise, if KVM emulates the
> > > instruction, kvm_vcpu_map() should fail on illegal GPAs and inject a #GP
> > > anyway. However, following patches will change the failure behavior of
> > > kvm_vcpu_map(), so make sure the #GP interception handler does this
> > > appropriately.
> > >
> > > Opportunistically add a teaser FIXME noting that the handling of SVM
> > > instructions on #GP belongs in the emulator.
> > >
> > > Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions")
> > > Fixes: d1cba6c92237 ("KVM: x86: nSVM: test eax for 4K alignment for GP errata workaround")
> > > Suggested-by: Sean Christopherson <seanjc@google.com>
> > > Signed-off-by: Yosry Ahmed <yosry@kernel.org>
> > > ---
> > > arch/x86/kvm/svm/svm.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > > index 392a5088f20bf..3122a98745ab7 100644
> > > --- a/arch/x86/kvm/svm/svm.c
> > > +++ b/arch/x86/kvm/svm/svm.c
> > > @@ -2277,10 +2277,12 @@ static int gp_interception(struct kvm_vcpu *vcpu)
> > > if (x86_decode_emulated_instruction(vcpu, 0, NULL, 0) != EMULATION_OK)
> > > goto reinject;
> > >
> > > + /* FIXME: Handle SVM instructions through the emulator */
> > > svm_exit_code = svm_instr_exit_code(vcpu);
> > > if (svm_exit_code) {
> > > - /* All SVM instructions expect page aligned RAX */
> > > - if (svm->vmcb->save.rax & ~PAGE_MASK)
> > > + unsigned long rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> > > +
> > > + if (!page_address_valid(vcpu, rax))
> >
> > Eh, let it poke out, i.e.
> >
> > if (!page_address_valid(vcpu, kvm_register_read(vcpu, VCPU_REGS_RAX)))
>
> Argh, looking at the rest of this series, and at KVM's existing code, having to
> use kvm_register_read() is awful. This really should be able to use kvm_rax_read(),
> but that won't handle the truncation.
>
> There are only a handful of likely-benign goofs due to this mess, but there is a
> pile of manual truncation and casting going on. In addition to _raw() variants
> and mode-aware defaults, adding "e" versions would be helpful, as many of the
> explicit truncation flows are cases where e.g. EAX, ECX, and EDX are
> architecturally accessed.
>
> I'll put together patches, and think more on how to handle this series (the
> dependencies aren't terrible, but they certainly are annoying). I'm tempted
> to squeeze this into 7.1 to make future life easier...
Just to make sure I understand this correctly, you'll keep this series
using kvm_register_read() and send patches on top to make
kvm_rax_read() a viable alternative and switch it, right?
On Fri, Apr 03, 2026, Yosry Ahmed wrote:
> On Fri, Apr 3, 2026 at 12:00 PM Sean Christopherson <seanjc@google.com> wrote:
> > On Fri, Apr 03, 2026, Sean Christopherson wrote:
> > > > svm_exit_code = svm_instr_exit_code(vcpu);
> > > > if (svm_exit_code) {
> > > > - /* All SVM instructions expect page aligned RAX */
> > > > - if (svm->vmcb->save.rax & ~PAGE_MASK)
> > > > + unsigned long rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
> > > > +
> > > > + if (!page_address_valid(vcpu, rax))
> > >
> > > Eh, let it poke out, i.e.
> > >
> > > if (!page_address_valid(vcpu, kvm_register_read(vcpu, VCPU_REGS_RAX)))
> >
> > Argh, looking at the rest of this series, and at KVM's existing code, having to
> > use kvm_register_read() is awful. This really should be able to use kvm_rax_read(),
> > but that won't handle the truncation.
> >
> > There are only a handful of likely-benign goofs due to this mess, but there is a
> > pile of manual truncation and casting going on. In addition to _raw() variants
> > and mode-aware defaults, adding "e" versions would be helpful, as many of the
> > explicit truncation flows are cases where e.g. EAX, ECX, and EDX are
> > architecturally accessed.
> >
> > I'll put together patches, and think more on how to handle this series (the
> > dependencies aren't terrible, but they certainly are annoying). I'm tempted
> > to squeeze this into 7.1 to make future life easier...
>
> Just to make sure I understand this correctly, you'll keep this series
> using kvm_register_read() and send patches on top to make
> kvm_rax_read() a viable alternative and switch it, right?
Yep, exactly!