Reply-To: Sean Christopherson
Date: Fri, 10 Apr 2026 16:58:25 -0700
In-Reply-To: <20260410235832.2312342-1-seanjc@google.com>
References: <20260410235832.2312342-1-seanjc@google.com>
Message-ID: <20260410235832.2312342-7-seanjc@google.com>
Subject: [GIT PULL] KVM: x86: Nested SVM changes for 7.1
From: Sean Christopherson
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Sean Christopherson

A massive pile of nSVM changes, the majority of which are fixes of varying
urgency (though nothing so urgent as to warrant a mid-cycle pull request).

FWIW, there are a few more nSVM series lined up for 7.2 (gPAT, PMU host/guest
bits, and #NPF error code fixes), and I'm also hoping to see a series to
optimize TLB flushing sooner than later (but certainly not for 7.2).

As noted in the "svm" PULL request, the virt_ext => misc_ctl2 rename has a
minor conflict with the sev_es_guest() => is_sev_es_guest() overhaul.

There are several much-less-fun conflicts with kvm/master due to the RSM fixes.
Here's what git shows for my merge commit (or just make it look like
kvm-x86/next and hope I didn't screw up? :-D).

diff --cc arch/x86/kvm/svm/nested.c
index b36c33255bed,b42d95fc8499..961804df5f45
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@@ -402,31 -448,6 +448,17 @@@ static bool nested_vmcb_check_save(stru
  	return true;
  }
  
- static bool nested_vmcb_check_save(struct kvm_vcpu *vcpu)
- {
- 	struct vcpu_svm *svm = to_svm(vcpu);
- 	struct vmcb_save_area_cached *save = &svm->nested.save;
- 
- 	return __nested_vmcb_check_save(vcpu, save);
- }
- 
- static bool nested_vmcb_check_controls(struct kvm_vcpu *vcpu)
- {
- 	struct vcpu_svm *svm = to_svm(vcpu);
- 	struct vmcb_ctrl_area_cached *ctl = &svm->nested.ctl;
- 
- 	return __nested_vmcb_check_controls(vcpu, ctl);
- }
- 
 +int nested_svm_check_cached_vmcb12(struct kvm_vcpu *vcpu)
 +{
- 	if (!nested_vmcb_check_save(vcpu) ||
- 	    !nested_vmcb_check_controls(vcpu))
++	struct vcpu_svm *svm = to_svm(vcpu);
++
++	if (!nested_vmcb_check_save(vcpu, &svm->nested.save) ||
++	    !nested_vmcb_check_controls(vcpu, &svm->nested.ctl))
 +		return -EINVAL;
 +
 +	return 0;
 +}
 +
  /*
   * If a feature is not advertised to L1, clear the corresponding vmcb12
   * intercept.
@@@ -992,6 -1047,35 +1058,34 @@@ int enter_svm_guest_mode(struct kvm_vcp
  	return 0;
  }
  
+ static int nested_svm_copy_vmcb12_to_cache(struct kvm_vcpu *vcpu, u64 vmcb12_gpa)
+ {
+ 	struct vcpu_svm *svm = to_svm(vcpu);
+ 	struct kvm_host_map map;
+ 	struct vmcb *vmcb12;
+ 	int r = 0;
+ 
+ 	if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcb12_gpa), &map))
+ 		return -EFAULT;
+ 
+ 	vmcb12 = map.hva;
+ 	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
+ 	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
+ 
 -	if (!nested_vmcb_check_save(vcpu, &svm->nested.save) ||
 -	    !nested_vmcb_check_controls(vcpu, &svm->nested.ctl)) {
++	if (nested_svm_check_cached_vmcb12(vcpu) < 0) {
+ 		vmcb12->control.exit_code = SVM_EXIT_ERR;
+ 		vmcb12->control.exit_info_1 = 0;
+ 		vmcb12->control.exit_info_2 = 0;
+ 		vmcb12->control.event_inj = 0;
+ 		vmcb12->control.event_inj_err = 0;
+ 		svm_set_gif(svm, false);
+ 		r = -EINVAL;
+ 	}
+ 
+ 	kvm_vcpu_unmap(vcpu, &map);
+ 	return r;
+ }
+ 
  int nested_svm_vmrun(struct kvm_vcpu *vcpu)
  {
  	struct vcpu_svm *svm = to_svm(vcpu);
diff --cc arch/x86/kvm/svm/svm.c
index d304568588c7,1e51cbb80e86..07ed964dacf5
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@@ -4880,16 -4999,12 +5000,15 @@@ static int svm_leave_smm(struct kvm_vcp
  	vmcb12 = map.hva;
  	nested_copy_vmcb_control_to_cache(svm, &vmcb12->control);
  	nested_copy_vmcb_save_to_cache(svm, &vmcb12->save);
 -	ret = enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa, false);
  
 -	if (ret)
 +	if (nested_svm_check_cached_vmcb12(vcpu) < 0)
  		goto unmap_save;
  
- 	if (enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa,
- 				 vmcb12, false) != 0)
++	if (enter_svm_guest_mode(vcpu, smram64->svm_guest_vmcb_gpa, false) != 0)
 +		goto unmap_save;
 +
 +	ret = 0;
- 	svm->nested.nested_run_pending = 1;
+ 	vcpu->arch.nested_run_pending = KVM_NESTED_RUN_PENDING;
  
  unmap_save:
  	kvm_vcpu_unmap(vcpu, &map_save);
diff --cc arch/x86/kvm/vmx/vmx.c
index d16427a079f6,d75f6b22d74c..d76a21c38506
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@@ -8528,15 -8528,11 +8528,15 @@@ int vmx_leave_smm(struct kvm_vcpu *vcpu
  	}
  
  	if (vmx->nested.smm.guest_mode) {
 +		/* Triple fault if the state is invalid.  */
 +		if (nested_vmx_check_restored_vmcs12(vcpu) < 0)
 +			return 1;
 +
  		ret = nested_vmx_enter_non_root_mode(vcpu, false);
 -		if (ret)
 -			return ret;
 +		if (ret != NVMX_VMENTRY_SUCCESS)
 +			return 1;
  
- 		vmx->nested.nested_run_pending = 1;
+ 		vcpu->arch.nested_run_pending = KVM_NESTED_RUN_PENDING;
  		vmx->nested.smm.guest_mode = false;
  	}
  	return 0;

The following changes since commit 11439c4635edd669ae435eec308f4ab8a0804808:

  Linux 7.0-rc2 (2026-03-01 15:39:31 -0800)

are available in the Git repository at:

  https://github.com/kvm-x86/linux.git tags/kvm-x86-nested-7.1

for you to fetch changes up to 052ca584bd7c51de0de96e684631570459d46cda:

  KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name (2026-04-03 16:08:05 -0700)

----------------------------------------------------------------
KVM nested SVM changes for 7.1 (with one common x86 fix)

 - To minimize the probability of corrupting guest state, defer KVM's
   non-architectural delivery of exception payloads (e.g. CR2 and DR6) until
   consumption of the payload is imminent, and force delivery of the payload
   in all paths where userspace saves relevant state.

 - Use vcpu->arch.cr2 when updating vmcb12's CR2 on nested #VMEXIT to fix a
   bug where L2's CR2 can get corrupted after a save/restore, e.g. if the VM
   is migrated while L2 is faulting in memory.

 - Fix a class of nSVM bugs where some fields written by the CPU are not
   synchronized from vmcb02 to cached vmcb12 after VMRUN, and so are not
   up-to-date when saved by KVM_GET_NESTED_STATE.

 - Fix a class of bugs where the ordering between KVM_SET_NESTED_STATE and
   KVM_SET_{S}REGS could cause vmcb02 to be incorrectly initialized after
   save+restore.

 - Add a variety of missing nSVM consistency checks.

 - Fix several bugs where KVM failed to correctly update VMCB fields on nested
   #VMEXIT.

 - Fix several bugs where KVM failed to correctly synthesize #UD or #GP for
   SVM-related instructions.

 - Add support for save+restore of virtualized LBRs (on SVM).

 - Refactor various helpers and macros to improve clarity and (hopefully) make
   the code easier to maintain.

 - Aggressively sanitize fields when copying from vmcb12 to guard against
   unintentionally allowing L1 to utilize yet-to-be-defined features.

 - Fix several bugs where KVM botched rAX legality checks when emulating SVM
   instructions.  Note, KVM is still flawed in that KVM doesn't address size
   prefix overrides for 64-bit guests; this should probably be documented as a
   KVM erratum.

 - Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails instead of
   somewhat arbitrarily synthesizing #GP (i.e. don't bastardize AMD's
   already-sketchy behavior of generating #GP for "unsupported" addresses).

 - Cache all used vmcb12 fields to further harden against TOCTOU bugs.

----------------------------------------------------------------
Jim Mattson (1):
      KVM: x86: SVM: Remove vmcb_is_dirty()

Kevin Cheng (4):
      KVM: SVM: Inject #UD for INVLPGA if EFER.SVME=0
      KVM: nSVM: Raise #UD if unhandled VMMCALL isn't intercepted by L1
      KVM: SVM: Move STGI and CLGI intercept handling
      KVM: SVM: Recalc instructions intercepts when EFER.SVME is toggled

Sean Christopherson (12):
      KVM: x86: Defer non-architectural deliver of exception payload to userspace read
      KVM: nSVM: Delay setting soft IRQ RIP tracking fields until vCPU run
      KVM: SVM: Explicitly mark vmcb01 dirty after modifying VMCB intercepts
      KVM: nSVM: Always intercept VMMCALL when L2 is active
      KVM: SVM: Separate recalc_intercepts() into nested vs. non-nested parts
      KVM: nSVM: Directly (re)calc vmcb02 intercepts from nested_vmcb02_prepare_control()
      KVM: nSVM: Use intuitive local variables in nested_vmcb02_recalc_intercepts()
      KVM: nSVM: Move vmcb_ctrl_area_cached.bus_lock_rip to svm_nested_state
      KVM: nSVM: Capture svm->nested.ctl as vmcb12_ctrl when preparing vmcb02
      KVM: SVM: Rename vmcb->nested_ctl to vmcb->misc_ctl
      KVM: SVM: Add a helper to get LBR field pointer to dedup MSR accesses
      KVM: x86: Suppress WARNs on nested_run_pending after userspace exit

Yosry Ahmed (49):
      KVM: nSVM: Use vcpu->arch.cr2 when updating vmcb12 on nested #VMEXIT
      KVM: nSVM: Mark all of vmcb02 dirty when restoring nested state
      KVM: nSVM: Ensure AVIC is inhibited when restoring a vCPU to guest mode
      KVM: nSVM: Sync NextRIP to cached vmcb12 after VMRUN of L2
      KVM: nSVM: Sync interrupt shadow to cached vmcb12 after VMRUN of L2
      KVM: selftests: Extend state_test to check vGIF
      KVM: selftests: Extend state_test to check next_rip
      KVM: nSVM: Always use NextRIP as vmcb02's NextRIP after first L2 VMRUN
      KVM: nSVM: Delay stuffing L2's current RIP into NextRIP until vCPU run
      KVM: nSVM: Avoid clearing VMCB_LBR in vmcb12
      KVM: SVM: Switch svm_copy_lbrs() to a macro
      KVM: SVM: Add missing save/restore handling of LBR MSRs
      KVM: selftests: Add a test for LBR save/restore (ft. nested)
      KVM: nSVM: Always inject a #GP if mapping VMCB12 fails on nested VMRUN
      KVM: nSVM: Refactor checking LBRV enablement in vmcb12 into a helper
      KVM: nSVM: Refactor writing vmcb12 on nested #VMEXIT as a helper
      KVM: nSVM: Triple fault if mapping VMCB12 fails on nested #VMEXIT
      KVM: nSVM: Triple fault if restore host CR3 fails on nested #VMEXIT
      KVM: nSVM: Clear GIF on nested #VMEXIT(INVALID)
      KVM: nSVM: Clear EVENTINJ fields in vmcb12 on nested #VMEXIT
      KVM: nSVM: Clear tracking of L1->L2 NMI and soft IRQ on nested #VMEXIT
      KVM: nSVM: Drop nested_vmcb_check_{save/control}() wrappers
      KVM: nSVM: Drop the non-architectural consistency check for NP_ENABLE
      KVM: nSVM: Add missing consistency check for nCR3 validity
      KVM: nSVM: Add missing consistency check for EFER, CR0, CR4, and CS
      KVM: nSVM: Add missing consistency check for EVENTINJ
      KVM: nSVM: WARN and abort vmcb02 intercepts recalc if vmcb02 isn't active
      KVM: nSVM: Use vmcb12_is_intercept() in nested_sync_control_from_vmcb02()
      KVM: SVM: Rename vmcb->virt_ext to vmcb->misc_ctl2
      KVM: nSVM: Cache all used fields from VMCB12
      KVM: nSVM: Restrict mapping vmcb12 on nested VMRUN
      KVM: nSVM: Use PAGE_MASK to drop lower bits of bitmap GPAs from vmcb12
      KVM: nSVM: Sanitize TLB_CONTROL field when copying from vmcb12
      KVM: nSVM: Sanitize INT/EVENTINJ fields when copying from vmcb12
      KVM: nSVM: Only copy SVM_MISC_ENABLE_NP from VMCB01's misc_ctl
      KVM: selftest: Add a selftest for VMRUN/#VMEXIT with unmappable vmcb12
      KVM: SVM: Triple fault L1 on unintercepted EFER.SVME clear by L2
      KVM: selftests: Add a test for L2 clearing EFER.SVME without intercept
      KVM: nSVM: Simplify error handling of nested_svm_copy_vmcb12_to_cache()
      KVM: x86: Move nested_run_pending to kvm_vcpu_arch
      KVM: SVM: Properly check RAX in the emulator for SVM instructions
      KVM: SVM: Refactor SVM instruction handling on #GP intercept
      KVM: SVM: Properly check RAX on #GP intercept of SVM instructions
      KVM: SVM: Move RAX legality check to SVM insn interception handlers
      KVM: SVM: Check EFER.SVME and CPL on #GP intercept of SVM instructions
      KVM: SVM: Treat mapping failures equally in VMLOAD/VMSAVE emulation
      KVM: nSVM: Fail emulation of VMRUN/VMLOAD/VMSAVE if mapping vmcb12 fails
      KVM: selftests: Rework svm_nested_invalid_vmcb12_gpa
      KVM: selftests: Drop 'invalid' from svm_nested_invalid_vmcb12_gpa's name

 arch/x86/include/asm/kvm_host.h                    |  15 +
 arch/x86/include/asm/svm.h                         |  20 +-
 arch/x86/kvm/emulate.c                             |   3 +-
 arch/x86/kvm/hyperv.h                              |   8 -
 arch/x86/kvm/kvm_emulate.h                         |   2 +
 arch/x86/kvm/svm/hyperv.h                          |   9 +-
 arch/x86/kvm/svm/nested.c                          | 613 ++++++++++++---------
 arch/x86/kvm/svm/sev.c                             |   6 +-
 arch/x86/kvm/svm/svm.c                             | 352 ++++++++----
 arch/x86/kvm/svm/svm.h                             |  81 ++-
 arch/x86/kvm/vmx/nested.c                          |  50 +-
 arch/x86/kvm/vmx/vmx.c                             |  16 +-
 arch/x86/kvm/vmx/vmx.h                             |   3 -
 arch/x86/kvm/x86.c                                 |  78 ++-
 arch/x86/kvm/x86.h                                 |  10 +
 tools/testing/selftests/kvm/Makefile.kvm           |   3 +
 .../testing/selftests/kvm/include/x86/processor.h  |   5 +
 tools/testing/selftests/kvm/include/x86/svm.h      |  14 +-
 tools/testing/selftests/kvm/lib/x86/svm.c          |   2 +-
 .../selftests/kvm/x86/nested_vmsave_vmload_test.c  |  16 +-
 tools/testing/selftests/kvm/x86/state_test.c       |  35 ++
 .../selftests/kvm/x86/svm_lbr_nested_state.c       | 145 +++++
 .../selftests/kvm/x86/svm_nested_clear_efer_svme.c |  55 ++
 .../selftests/kvm/x86/svm_nested_vmcb12_gpa.c      | 176 ++++++
 24 files changed, 1228 insertions(+), 489 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86/svm_lbr_nested_state.c
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_clear_efer_svme.c
 create mode 100644 tools/testing/selftests/kvm/x86/svm_nested_vmcb12_gpa.c
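
For anyone skimming the tag message, the "cache, sanitize, then check" idea behind the TOCTOU hardening can be sketched in plain userspace C. This is only an illustration of the general pattern, not KVM code: struct guest_ctl, struct cached_ctl, SKETCH_PAGE_MASK and all three helpers are hypothetical names invented for this sketch, and the single non-zero ASID check stands in for the real consistency checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a guest-writable control block; NOT KVM's struct vmcb. */
struct guest_ctl {
	uint32_t asid;
	uint64_t bitmap_gpa;
};

/* Hypothetical host-private snapshot of the fields the host actually consumes. */
struct cached_ctl {
	uint32_t asid;
	uint64_t bitmap_gpa;
};

#define SKETCH_PAGE_MASK (~UINT64_C(0xfff))	/* assumes 4KiB pages */

/* Read each guest field exactly once, sanitizing while copying. */
static void copy_ctl_to_cache(struct cached_ctl *to,
			      const volatile struct guest_ctl *from)
{
	to->asid = from->asid;
	to->bitmap_gpa = from->bitmap_gpa & SKETCH_PAGE_MASK;
}

/* Validate only the snapshot, never the guest-writable original. */
static bool check_cached_ctl(const struct cached_ctl *c)
{
	return c->asid != 0;	/* illustrative check only */
}

/* Returns 0 if the snapshot is usable, -1 otherwise. */
static int snapshot_and_check(const volatile struct guest_ctl *guest,
			      struct cached_ctl *cache)
{
	assert(guest != NULL && cache != NULL);

	copy_ctl_to_cache(cache, guest);
	return check_cached_ctl(cache) ? 0 : -1;
}
```

Because later consumers read only the cached copy, a guest write to the original block after snapshot_and_check() returns cannot change a value that has already been validated.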