Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
match the target level of the fault, and instead have the vCPU retry the
faulting instruction after warning. Continuing on is completely
unnecessary as the absolute worst case scenario of retrying is DoSing
the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu/tdp_mmu.c:559!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 1025 Comm: nx_huge_pages_t Tainted: G W 6.1.0-rc4+ #64
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:__handle_changed_spte.cold+0x95/0x9c
RSP: 0018:ffffc9000072faf8 EFLAGS: 00010246
RAX: 00000000000000c1 RBX: ffffc90000731000 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 00000000ffffdfff RDI: ffff888277c5b4c8
RBP: 0600000112400bf3 R08: ffff888277c5b4c0 R09: ffffc9000072f9a0
R10: 0000000000000001 R11: 0000000000000001 R12: 06000001126009f3
R13: 0000000000000002 R14: 0000000012600901 R15: 0000000012400b01
FS: 00007fba9f853740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000010aa7a003 CR4: 0000000000172ea0
Call Trace:
<TASK>
kvm_tdp_mmu_map+0x3b0/0x510
kvm_tdp_page_fault+0x10c/0x130
kvm_mmu_page_fault+0x103/0x680
vmx_handle_exit+0x132/0x5a0 [kvm_intel]
vcpu_enter_guest+0x60c/0x16f0
kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
kvm_vcpu_ioctl+0x271/0x660
__x64_sys_ioctl+0x80/0xb0
do_syscall_64+0x2b/0x50
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
Modules linked in: kvm_intel
---[ end trace 0000000000000000 ]---
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mmu/tdp_mmu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index fd4ae99790d7..cc1fb9a65620 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1063,7 +1063,9 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	int ret = RET_PF_FIXED;
 	bool wrprot = false;
 
-	WARN_ON(sp->role.level != fault->goal_level);
+	if (WARN_ON_ONCE(sp->role.level != fault->goal_level))
+		return RET_PF_RETRY;
+
 	if (unlikely(!fault->slot))
 		new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL);
 	else
--
2.39.0.rc1.256.g54fd8350bd-goog
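For readers who want to poke at the pattern outside the kernel, here is a minimal userspace sketch of "warn once, then bail with a retry code before touching any state". It is not KVM code: every `model_*` / `MODEL_*` name is hypothetical, the enum values are illustrative, and `model_warn_on_once()` only approximates what WARN_ON_ONCE() does (return the condition, complain at most once per call site).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for KVM's RET_PF_* codes; values are illustrative. */
enum model_ret { MODEL_RET_PF_RETRY, MODEL_RET_PF_FIXED };

struct model_sp    { int level; };       /* models sp->role.level */
struct model_fault { int goal_level; };  /* models fault->goal_level */

static int model_warnings;   /* number of warnings actually printed */
static int model_installed;  /* number of leaf entries "installed" */

/* Rough analogue of WARN_ON_ONCE(): returns cond, warns at most once. */
static bool model_warn_on_once(bool cond, bool *warned, const char *msg)
{
	if (cond && !*warned) {
		*warned = true;
		model_warnings++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
	return cond;
}

static enum model_ret model_map_target_level(const struct model_sp *sp,
					     const struct model_fault *fault)
{
	static bool warned;

	/* Bail before modifying anything, so the caller can simply retry. */
	if (model_warn_on_once(sp->level != fault->goal_level, &warned,
			       "parent level != goal level"))
		return MODEL_RET_PF_RETRY;

	model_installed++;
	return MODEL_RET_PF_FIXED;
}
```

Calling `model_map_target_level()` twice with mismatched levels returns the retry code both times, prints a single warning, and leaves `model_installed` untouched, which is the property the patch relies on.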
On Mon, Dec 12, 2022 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
>
> Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
> match the target level of the fault, and instead have the vCPU retry the
> faulting instruction after warning. Continuing on is completely
> unnecessary as the absolute worst case scenario of retrying is DoSing
> the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.

Would it make sense to kill the VM instead via KVM_BUG()?
On Tue, Dec 13, 2022, David Matlack wrote:
> On Mon, Dec 12, 2022 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
> > match the target level of the fault, and instead have the vCPU retry the
> > faulting instruction after warning. Continuing on is completely
> > unnecessary as the absolute worst case scenario of retrying is DoSing
> > the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.
>
> Would it make sense to kill the VM instead via KVM_BUG()?

No, because if a bug that hits this escapes to a release, odds are quite high that
retrying will succeed. E.g. the fix earlier in this series is for a rare corner
case that I was able to hit consistently only by hacking KVM to effectively
synchronize the page fault and zap. Other than an extra page fault, no harm has
been done to the guest, e.g. there's no need to kill the VM to protect it from
data corruption.
On Tue, Dec 13, 2022 at 06:15:56PM +0000, Sean Christopherson wrote:
> On Tue, Dec 13, 2022, David Matlack wrote:
> > On Mon, Dec 12, 2022 at 7:30 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > Don't install a leaf TDP MMU SPTE if the parent page's level doesn't
> > > match the target level of the fault, and instead have the vCPU retry the
> > > faulting instruction after warning. Continuing on is completely
> > > unnecessary as the absolute worst case scenario of retrying is DoSing
> > > the vCPU, whereas continuing on all but guarantees bigger explosions, e.g.
> >
> > Would it make sense to kill the VM instead via KVM_BUG()?
>
> No, because if a bug that hits this escapes to a release, odds are quite high that
> retrying will succeed. E.g. the fix earlier in this series is for a rare corner
> case that I was able to hit consistently only by hacking KVM to effectively
> synchronize the page fault and zap. Other than an extra page fault, no harm has
> been done to the guest, e.g. there's no need to kill the VM to protect it from
> data corruption.

Good points, agreed!
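The retry-versus-kill argument above can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the mismatch is modeled as a transient condition that clears after a fixed number of attempts (standing in for a racing zap finishing), and `sim_vm`, `sim_policy`, and `sim_run()` are hypothetical names with no counterpart in KVM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical policies: retry the fault, or kill the VM on first mismatch. */
enum sim_policy { SIM_POLICY_RETRY, SIM_POLICY_KILL };

struct sim_vm {
	int stale_attempts;  /* attempts that will still observe the mismatch */
	bool dead;           /* set only by the kill policy */
};

/* Returns true if the fault was fixed, false if the mismatch was observed. */
static bool sim_handle_fault(struct sim_vm *vm)
{
	if (vm->stale_attempts > 0) {
		vm->stale_attempts--;
		return false;
	}
	return true;
}

/*
 * Returns the number of attempts needed to fix the fault, or -1 if the
 * VM was killed or max_attempts was exhausted.
 */
static int sim_run(struct sim_vm *vm, enum sim_policy policy, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (sim_handle_fault(vm))
			return attempt;
		if (policy == SIM_POLICY_KILL) {
			vm->dead = true;
			return -1;
		}
	}
	return -1;
}
```

With `stale_attempts = 1`, the retry policy costs one extra fault and the VM keeps running, while the kill policy loses the VM on a condition that would have cleared by itself. If the condition never clears, the retry policy degrades to the stuck-vCPU worst case that the changelog already accepts.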