From nobody Thu Oct 2 19:27:58 2025 Received: from mail-pg1-f202.google.com (mail-pg1-f202.google.com [209.85.215.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 575462DA765 for ; Fri, 12 Sep 2025 23:24:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757719445; cv=none; b=qGn9B6K2rOaVPYG6aKAtFmszjHgaR4Zy3kP7rdxM47rahBBuutvtpENVsQv9kteNqaN7EJdQBL6IINRvgZYcNd4/FqwkWmIjznz0rbnOqv/BTn2Y4vj6VV8lOKUVzl0uflu2NEmz6skApJXIivSXcYHWR428z3Sfuoaze6kFM/A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757719445; c=relaxed/simple; bh=SVc3vKHvqM8aL2a6O6g/g2oxGAeijGGSxnClHNB4C6w=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=jdiDGaEJO7vvy1+HRUCvous9mkhkdZS35ItXoCXTRy8FG7A3tPBUnxYHcS2Qzj2oFnRbHBdj+JBFPbSWyHvYyMGBvnlhVEI27xvlkpgWJcntlf8PE5Eq86lLMnyRwCGz5s4BsP/+qg2NmmB96iJKHYz8PQm7kcJIcfdhFaeDqT0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Qf/fmkt3; arc=none smtp.client-ip=209.85.215.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Qf/fmkt3" Received: by mail-pg1-f202.google.com with SMTP id 41be03b00d2f7-b4e63a34f3fso1552407a12.1 for ; Fri, 12 Sep 2025 16:24:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757719442; x=1758324242; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=lhvDAsdFtNhqbS2apMIeQwTS5dWNE1AFLyo1ltoYXSk=; b=Qf/fmkt3V8IKaZB8FaJtbbgeeTxSIDnHiXVslVz1QrInPRD2gKOe52XjFQ8Jdqr7Y1 3dtqjN1ZEq+uEYg5nwcd0OEfDIDIN0SvbpfnhJ75q90cp9KwdN0UsFdHL/GWI44RuPIM Rfr5qbwJr0ZvBGkHzjbUNp/e4sxqfUr/RlqAJMqYCVE5yTFzo71BgHRt8Ou071BSHF/2 zpwhBKAoRnq9QJ06CeO6qBZkoQzr4r8Gc8J/h06T7PmwmjneFjzeD8pGQKVtjkYe6yH9 IT3MBz5PDOj/wgF5R61hdeYlB8jC4G1DtTay1XshOhxKq1R4yIMcweLqQoJB/oY3w8dG LkVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757719442; x=1758324242; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=lhvDAsdFtNhqbS2apMIeQwTS5dWNE1AFLyo1ltoYXSk=; b=b1aKdGDVVZTvzv3xYGKaIaG8ygLMVziEsc0aWrpZBiIMwv4mCuAAqzqKYQor94i9RZ ubaYhVpC9RffEc+izS1h8G2mBXPDSTTUHQu9fUgAXMOCm+VY7W0hPklPPdR2DZ/cpdro OSuzS+LBHrtAWvzEkTRRb1vYF6P1VFepmyrnS8R3Bqw6zsVhP/Y3bPTtAeUZxt1ftk96 epig3VsVI6GUj2Sq8BawZPu5H40pwhaeVQ730YmG8p1Jb5Tvxn0I0CDEwlQ84vAPtWbT n2WD84EbHWGvriJBaYV//8SfL2uXG+ID10yOgKzBFlzCWndX0p7gQbvV2Y0wlE0LZZ2O DtGA== X-Forwarded-Encrypted: i=1; AJvYcCXrgLgs8tXdrmKrMEgiaZF34dfcwnqdnFvlri+zW8ErdyQQriMstxFANlY1lr4kLP1q6etLIBaePd5qw0w=@vger.kernel.org X-Gm-Message-State: AOJu0YybIzUVnRoXDouVJ3cZKJEyIP4aLLOHt0BRs4FtDu8QGoM8clMW O7ONZiwDOJu8J+nIIbxS1ZVc2NffgnstIVCCWS5tXSdrT2ml8eJvwEfZevprkNQt1L/Ww22OXo5 nkJ2rzA== X-Google-Smtp-Source: AGHT+IHXSqQr69qglcbP9OJDJfDaAXmQsfFsYSZg18i2zG0BgbxqeL4rgGFDlGflwb3BFyRBRQzFdblnKeQ= X-Received: from pglu27.prod.google.com ([2002:a63:141b:0:b0:b52:14a4:3f3f]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:e293:b0:250:595e:a69e with SMTP id adf61e73a8af0-2602c04f9ddmr5890176637.43.1757719441731; Fri, 12 Sep 2025 16:24:01 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 12 Sep 2025 16:22:59 -0700 In-Reply-To: <20250912232319.429659-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250912232319.429659-1-seanjc@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250912232319.429659-22-seanjc@google.com> Subject: [PATCH v15 21/41] KVM: nVMX: Prepare for enabling CET support for nested guest From: Sean Christopherson To: Paolo Bonzini , Sean Christopherson Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Tom Lendacky , Mathias Krause , John Allen , Rick Edgecombe , Chao Gao , Maxim Levitsky , Xiaoyao Li , Zhang Yi Z Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Set up CET MSRs, related VM_ENTRY/EXIT control bits and fixed CR4 setting to enable CET for nested VM. vmcs12 and vmcs02 needs to be synced when L2 exits to L1 or when L1 wants to resume L2, that way correct CET states can be observed by one another. Please note that consistency checks regarding CET state during VM-Entry will be added later to prevent this patch from becoming too large. Advertising the new CET VM_ENTRY/EXIT control bits are also be deferred until after the consistency checks are added. Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Tested-by: Rick Edgecombe Signed-off-by: Chao Gao Signed-off-by: Sean Christopherson Reviewed-by: Xin Li (Intel) Tested-by: Xin Li (Intel) --- arch/x86/kvm/vmx/nested.c | 77 +++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmcs12.c | 6 +++ arch/x86/kvm/vmx/vmcs12.h | 14 ++++++- arch/x86/kvm/vmx/vmx.c | 2 + arch/x86/kvm/vmx/vmx.h | 3 ++ 5 files changed, 101 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 14f9822b611d..51d69f368689 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -721,6 +721,24 @@ static inline bool nested_vmx_prepare_msr_bitmap(struc= t kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_MPERF, MSR_TYPE_R); =20 + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_U_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_S_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL0_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL1_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL2_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL3_SSP, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &map); =20 vmx->nested.force_msr_bitmap_recalc =3D false; @@ -2521,6 +2539,32 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vm= x, struct loaded_vmcs *vmcs0 } } =20 +static void vmcs_read_cet_state(struct kvm_vcpu *vcpu, u64 *s_cet, + u64 *ssp, u64 *ssp_tbl) +{ + if (guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) || + guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) + *s_cet =3D vmcs_readl(GUEST_S_CET); + + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) { + *ssp =3D vmcs_readl(GUEST_SSP); + *ssp_tbl =3D vmcs_readl(GUEST_INTR_SSP_TABLE); + } +} + +static void vmcs_write_cet_state(struct kvm_vcpu *vcpu, u64 s_cet, + u64 ssp, u64 ssp_tbl) +{ + if (guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) || + guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) + vmcs_writel(GUEST_S_CET, s_cet); + + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) { + vmcs_writel(GUEST_SSP, ssp); + vmcs_writel(GUEST_INTR_SSP_TABLE, ssp_tbl); + } +} + static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs1= 2) { struct hv_enlightened_vmcs *hv_evmcs =3D nested_vmx_evmcs(vmx); @@ -2637,6 +2681,10 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx= , struct vmcs12 *vmcs12) vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr); vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr); =20 + if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE) + vmcs_write_cet_state(&vmx->vcpu, vmcs12->guest_s_cet, + vmcs12->guest_ssp, vmcs12->guest_ssp_tbl); + set_cr4_guest_host_mask(vmx); } =20 @@ -2676,6 +2724,13 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, str= uct vmcs12 *vmcs12, kvm_set_dr(vcpu, 7, vcpu->arch.dr7); vmx_guest_debugctl_write(vcpu, vmx->nested.pre_vmenter_debugctl); } + + if (!vmx->nested.nested_run_pending || + !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) + vmcs_write_cet_state(vcpu, vmx->nested.pre_vmenter_s_cet, + vmx->nested.pre_vmenter_ssp, + vmx->nested.pre_vmenter_ssp_tbl); + if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending || !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))) vmcs_write64(GUEST_BNDCFGS, vmx->nested.pre_vmenter_bndcfgs); @@ -3552,6 +3607,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_m= ode(struct kvm_vcpu *vcpu, !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))) vmx->nested.pre_vmenter_bndcfgs =3D vmcs_read64(GUEST_BNDCFGS); =20 + if (!vmx->nested.nested_run_pending || + !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)) + vmcs_read_cet_state(vcpu, &vmx->nested.pre_vmenter_s_cet, + &vmx->nested.pre_vmenter_ssp, + &vmx->nested.pre_vmenter_ssp_tbl); + /* * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and* * nested early checks are disabled. In the event of a "late" VM-Fail, @@ -4635,6 +4696,10 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *v= cpu, struct vmcs12 *vmcs12) =20 if (vmcs12->vm_exit_controls & VM_EXIT_SAVE_IA32_EFER) vmcs12->guest_ia32_efer =3D vcpu->arch.efer; + + vmcs_read_cet_state(&vmx->vcpu, &vmcs12->guest_s_cet, + &vmcs12->guest_ssp, + &vmcs12->guest_ssp_tbl); } =20 /* @@ -4760,6 +4825,18 @@ static void load_vmcs12_host_state(struct kvm_vcpu *= vcpu, if (vmcs12->vm_exit_controls & VM_EXIT_CLEAR_BNDCFGS) vmcs_write64(GUEST_BNDCFGS, 0); =20 + /* + * Load CET state from host state if VM_EXIT_LOAD_CET_STATE is set. + * otherwise CET state should be retained across VM-exit, i.e., + * guest values should be propagated from vmcs12 to vmcs01. + */ + if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_CET_STATE) + vmcs_write_cet_state(vcpu, vmcs12->host_s_cet, vmcs12->host_ssp, + vmcs12->host_ssp_tbl); + else + vmcs_write_cet_state(vcpu, vmcs12->guest_s_cet, vmcs12->guest_ssp, + vmcs12->guest_ssp_tbl); + if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PAT) { vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat); vcpu->arch.pat =3D vmcs12->host_ia32_pat; diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c index 106a72c923ca..4233b5ca9461 100644 --- a/arch/x86/kvm/vmx/vmcs12.c +++ b/arch/x86/kvm/vmx/vmcs12.c @@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] =3D { FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions), FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp), FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip), + FIELD(GUEST_S_CET, guest_s_cet), + FIELD(GUEST_SSP, guest_ssp), + FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl), FIELD(HOST_CR0, host_cr0), FIELD(HOST_CR3, host_cr3), FIELD(HOST_CR4, host_cr4), @@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] =3D { FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip), FIELD(HOST_RSP, host_rsp), FIELD(HOST_RIP, host_rip), + FIELD(HOST_S_CET, host_s_cet), + FIELD(HOST_SSP, host_ssp), + FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl), }; const unsigned int nr_vmcs12_fields =3D ARRAY_SIZE(vmcs12_field_offsets); diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h index 56fd150a6f24..4ad6b16525b9 100644 --- a/arch/x86/kvm/vmx/vmcs12.h +++ b/arch/x86/kvm/vmx/vmcs12.h @@ -117,7 +117,13 @@ struct __packed vmcs12 { natural_width host_ia32_sysenter_eip; natural_width host_rsp; natural_width host_rip; - natural_width paddingl[8]; /* room for future expansion */ + natural_width host_s_cet; + natural_width host_ssp; + natural_width host_ssp_tbl; + natural_width guest_s_cet; + natural_width guest_ssp; + natural_width guest_ssp_tbl; + natural_width paddingl[2]; /* room for future expansion */ u32 pin_based_vm_exec_control; u32 cpu_based_vm_exec_control; u32 exception_bitmap; @@ -294,6 +300,12 @@ static inline void vmx_check_vmcs12_offsets(void) CHECK_OFFSET(host_ia32_sysenter_eip, 656); CHECK_OFFSET(host_rsp, 664); CHECK_OFFSET(host_rip, 672); + CHECK_OFFSET(host_s_cet, 680); + CHECK_OFFSET(host_ssp, 688); + CHECK_OFFSET(host_ssp_tbl, 696); + CHECK_OFFSET(guest_s_cet, 704); + CHECK_OFFSET(guest_ssp, 712); + CHECK_OFFSET(guest_ssp_tbl, 720); CHECK_OFFSET(pin_based_vm_exec_control, 744); CHECK_OFFSET(cpu_based_vm_exec_control, 748); CHECK_OFFSET(exception_bitmap, 752); diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 8d2186d6549f..989008f5307e 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -7749,6 +7749,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct k= vm_vcpu *vcpu) cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU)); cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP)); cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57)); + cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK)); + cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT)); =20 entry =3D kvm_find_cpuid_entry_index(vcpu, 0x7, 1); cr4_fixed1_update(X86_CR4_LAM_SUP, eax, feature_bit(LAM)); diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 08a9a0075404..ecfdba666465 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -181,6 +181,9 @@ struct nested_vmx { */ u64 pre_vmenter_debugctl; u64 pre_vmenter_bndcfgs; + u64 pre_vmenter_s_cet; + u64 pre_vmenter_ssp; + u64 pre_vmenter_ssp_tbl; =20 /* to migrate it to L1 if L2 writes to L1's CR8 directly */ int l1_tpr_threshold; --=20 2.51.0.384.g4c02a37b29-goog