From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D54EE13D8A4; Tue, 12 Aug 2025 02:56:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967389; cv=none; b=otabR51NYlpp+zNby2G/qSGa7brN5NikuC3/4XZQpD4nSGIqXVpJLoWc8rRN5HttREyiyti+unrDAjy1yQtx9JuwZQtNOe/a2cXewpyScfATMmbnJYLcf3XIBBf29lZDsKjJXdk/oIoy4Mp2ej2Ptg56eF3sXJMdm8Nvd9AFSBA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967389; c=relaxed/simple; bh=59Lddcpsx0ltVEXxI4JbBZLMy6Ums8t0Jrpm7HGtKn4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ujdgFOTVslHKFr9dfNhHVmnGuRZKAOOQx2u2C8EJEFau63A4WG/Vgzpg1DidGngEo3gp4lR7BAUyTIgIM/1nViaqCKXyc/qdyrIiLeXX9F5c50LqKp5MHnfXUY6OO3TzfSHw3SFeZghK6sYCzz9y3PoIElnkLw09wyEMV4m1vls= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=kxcTuiT7; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="kxcTuiT7" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967388; x=1786503388; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=59Lddcpsx0ltVEXxI4JbBZLMy6Ums8t0Jrpm7HGtKn4=; 
b=kxcTuiT7oWQSy1DwjWRfD1Yg7eZwv9AL41GDnEJPwaFPFrlk30+FkjJc ASHHrKpnSk83MJhd8ixSj5KhDkiGFmvR5yWtkB4J9uDOUzTbGACL5vUrd HZKBDrdN54w9BYB2Gr+VzFotC0GoT3osPA8MQUGBQRoUWipmYLYu/ZVbx 6BZ71DMqvteJs2wndxATMba4XsDXykcVUCDSU0JNYl1O3iAo6ixDHlpVJ yEb/MOmap9XMNt4Pbti5A9sAFHDeDZj560hEEDy0OnAeEx0fLp2VJPJBn 7NTTxIu87tCOIRLpSp1SKHch9ryqGwh8febwj1tEfyTxTk++PDwQnniHC Q==; X-CSE-ConnectionGUID: uSDb3QFySuS0zKwuerezbg== X-CSE-MsgGUID: 1G7TgS+rSDaVCawxWa/hkQ== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100429" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100429" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:23 -0700 X-CSE-ConnectionGUID: ZlS2m3n+TtaREeseP9aRng== X-CSE-MsgGUID: aZ+matr2R/qcLazvMLmoSg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321218" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:23 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Chao Gao , Mathias Krause , John Allen , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 01/24] KVM: x86: Rename kvm_{g,s}et_msr()* to show that they emulate guest accesses Date: Mon, 11 Aug 2025 19:55:09 -0700 Message-ID: <20250812025606.74625-2-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Rename kvm_{g,s}et_msr_with_filter() and kvm_{g,s}et_msr() to kvm_emulate_msr_{read,write}() and __kvm_emulate_msr_{read,write}(), respectively, to make it more obvious that KVM uses these helpers to emulate guest behaviors, i.e., host_initiated == false in these helpers. Suggested-by: Sean Christopherson Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao Signed-off-by: Sean Christopherson Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- v12: use less verbose function names -- Sean/Xin --- arch/x86/include/asm/kvm_host.h | 8 ++++---- arch/x86/kvm/smm.c | 4 ++-- arch/x86/kvm/vmx/nested.c | 14 +++++++------- arch/x86/kvm/x86.c | 26 +++++++++++++------------- 4 files changed, 26 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index f19a76d3ca0e..86e4d0b8469b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2149,11 +2149,11 @@ void kvm_prepare_event_vectoring_exit(struct kvm_vc= pu *vcpu, gpa_t gpa); =20 void kvm_enable_efer_bits(u64); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); -int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data); -int kvm_set_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 data); +int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); +int 
kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_i= nitiated); -int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data); -int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data); +int __kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); +int __kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu); int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu); int kvm_emulate_as_nop(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c index 9864c057187d..5dd8a1646800 100644 --- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -529,7 +529,7 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *c= txt, =20 vcpu->arch.smbase =3D smstate->smbase; =20 - if (kvm_set_msr(vcpu, MSR_EFER, smstate->efer & ~EFER_LMA)) + if (__kvm_emulate_msr_write(vcpu, MSR_EFER, smstate->efer & ~EFER_LMA)) return X86EMUL_UNHANDLEABLE; =20 rsm_load_seg_64(vcpu, &smstate->tr, VCPU_SREG_TR); @@ -620,7 +620,7 @@ int emulator_leave_smm(struct x86_emulate_ctxt *ctxt) =20 /* And finally go back to 32-bit mode. 
*/ efer =3D 0; - kvm_set_msr(vcpu, MSR_EFER, efer); + __kvm_emulate_msr_write(vcpu, MSR_EFER, efer); } #endif =20 diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index b8ea1969113d..7dc2e1c09ea6 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -997,7 +997,7 @@ static u32 nested_vmx_load_msr(struct kvm_vcpu *vcpu, u= 64 gpa, u32 count) __func__, i, e.index, e.reserved); goto fail; } - if (kvm_set_msr_with_filter(vcpu, e.index, e.value)) { + if (kvm_emulate_msr_write(vcpu, e.index, e.value)) { pr_debug_ratelimited( "%s cannot write MSR (%u, 0x%x, 0x%llx)\n", __func__, i, e.index, e.value); @@ -1033,7 +1033,7 @@ static bool nested_vmx_get_vmexit_msr_value(struct kv= m_vcpu *vcpu, } } =20 - if (kvm_get_msr_with_filter(vcpu, msr_index, data)) { + if (kvm_emulate_msr_read(vcpu, msr_index, data)) { pr_debug_ratelimited("%s cannot read MSR (0x%x)\n", __func__, msr_index); return false; @@ -2770,8 +2770,8 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, stru= ct vmcs12 *vmcs12, =20 if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL) && kvm_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu)) && - WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, - vmcs12->guest_ia32_perf_global_ctrl))) { + WARN_ON_ONCE(__kvm_emulate_msr_write(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, + vmcs12->guest_ia32_perf_global_ctrl))) { *entry_failure_code =3D ENTRY_FAIL_DEFAULT; return -EINVAL; } @@ -4758,8 +4758,8 @@ static void load_vmcs12_host_state(struct kvm_vcpu *v= cpu, } if ((vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL) && kvm_pmu_has_perf_global_ctrl(vcpu_to_pmu(vcpu))) - WARN_ON_ONCE(kvm_set_msr(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, - vmcs12->host_ia32_perf_global_ctrl)); + WARN_ON_ONCE(__kvm_emulate_msr_write(vcpu, MSR_CORE_PERF_GLOBAL_CTRL, + vmcs12->host_ia32_perf_global_ctrl)); =20 /* Set L1 segment info according to Intel SDM 27.5.2 Loading Host Segment and Descriptor-Table Registers */ @@ -4937,7 +4937,7 @@ static 
void nested_vmx_restore_host_state(struct kvm_= vcpu *vcpu) goto vmabort; } =20 - if (kvm_set_msr_with_filter(vcpu, h.index, h.value)) { + if (kvm_emulate_msr_write(vcpu, h.index, h.value)) { pr_debug_ratelimited( "%s WRMSR failed (%u, 0x%x, 0x%llx)\n", __func__, j, h.index, h.value); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index a1c49bc681c4..09b106a5afdf 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1932,33 +1932,33 @@ static int kvm_get_msr_ignored_check(struct kvm_vcp= u *vcpu, __kvm_get_msr); } =20 -int kvm_get_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 *data) +int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) { if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ)) return KVM_MSR_RET_FILTERED; return kvm_get_msr_ignored_check(vcpu, index, data, false); } -EXPORT_SYMBOL_GPL(kvm_get_msr_with_filter); +EXPORT_SYMBOL_GPL(kvm_emulate_msr_read); =20 -int kvm_set_msr_with_filter(struct kvm_vcpu *vcpu, u32 index, u64 data) +int kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) { if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_WRITE)) return KVM_MSR_RET_FILTERED; return kvm_set_msr_ignored_check(vcpu, index, data, false); } -EXPORT_SYMBOL_GPL(kvm_set_msr_with_filter); +EXPORT_SYMBOL_GPL(kvm_emulate_msr_write); =20 -int kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data) +int __kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) { return kvm_get_msr_ignored_check(vcpu, index, data, false); } -EXPORT_SYMBOL_GPL(kvm_get_msr); +EXPORT_SYMBOL_GPL(__kvm_emulate_msr_read); =20 -int kvm_set_msr(struct kvm_vcpu *vcpu, u32 index, u64 data) +int __kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) { return kvm_set_msr_ignored_check(vcpu, index, data, false); } -EXPORT_SYMBOL_GPL(kvm_set_msr); +EXPORT_SYMBOL_GPL(__kvm_emulate_msr_write); =20 static void complete_userspace_rdmsr(struct kvm_vcpu *vcpu) { @@ -2030,7 +2030,7 @@ int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu) 
u64 data; int r; =20 - r =3D kvm_get_msr_with_filter(vcpu, ecx, &data); + r =3D kvm_emulate_msr_read(vcpu, ecx, &data); =20 if (!r) { trace_kvm_msr_read(ecx, data); @@ -2055,7 +2055,7 @@ int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu) u64 data =3D kvm_read_edx_eax(vcpu); int r; =20 - r =3D kvm_set_msr_with_filter(vcpu, ecx, data); + r =3D kvm_emulate_msr_write(vcpu, ecx, data); =20 if (!r) { trace_kvm_msr_write(ecx, data); @@ -8353,7 +8353,7 @@ static int emulator_get_msr_with_filter(struct x86_em= ulate_ctxt *ctxt, struct kvm_vcpu *vcpu =3D emul_to_vcpu(ctxt); int r; =20 - r =3D kvm_get_msr_with_filter(vcpu, msr_index, pdata); + r =3D kvm_emulate_msr_read(vcpu, msr_index, pdata); if (r < 0) return X86EMUL_UNHANDLEABLE; =20 @@ -8376,7 +8376,7 @@ static int emulator_set_msr_with_filter(struct x86_em= ulate_ctxt *ctxt, struct kvm_vcpu *vcpu =3D emul_to_vcpu(ctxt); int r; =20 - r =3D kvm_set_msr_with_filter(vcpu, msr_index, data); + r =3D kvm_emulate_msr_write(vcpu, msr_index, data); if (r < 0) return X86EMUL_UNHANDLEABLE; =20 @@ -8396,7 +8396,7 @@ static int emulator_set_msr_with_filter(struct x86_em= ulate_ctxt *ctxt, static int emulator_get_msr(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata) { - return kvm_get_msr(emul_to_vcpu(ctxt), msr_index, pdata); + return __kvm_emulate_msr_read(emul_to_vcpu(ctxt), msr_index, pdata); } =20 static int emulator_check_rdpmc_early(struct x86_emulate_ctxt *ctxt, u32 p= mc) --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D355D26529E; Tue, 12 Aug 2025 02:56:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967390; cv=none; 
b=guRPvrxAQ7LFNK+j1vzfkSNvUx5jL3KyVGNRpdKFX2BoE386ktxE0LzpbnmWKpWCy4hDC7w1CKAUlYfm3y4VD4tqgOwGc8YCPt07cjYuA4z6g5V1Lk8syzILs4MfNkdOcOrlVqwtbDTqwvoqtcdbp4ZrwJYQRFa1Fm43OIFhtmg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967390; c=relaxed/simple; bh=F3k+MYbu7ZOn0p/0kCjrCIf8V5yPNvy8jKQh5YApCvQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iQod1jZTb1I6JmKcZrdv90JIirH7JnX7O5Ot0n/YIhm129V2pAydOExB+V6xXlO2IT+AXxCz8Bpxu9SCsNWzepKcMoaCaP1elD48pHjzbjsvh+uC+/WBls+/lTTZMTDuFOsS08Z6t/CaI+G5kl0tri3S29U/tbZn2A8vLO5GbXI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=KQctbW2o; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="KQctbW2o" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967389; x=1786503389; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=F3k+MYbu7ZOn0p/0kCjrCIf8V5yPNvy8jKQh5YApCvQ=; b=KQctbW2oHTFUv/7wj8BfmBRbr8+WfFCbns/TOzlLjkL9aMAm3R0H7RSg G6ddlv+/SAtLGL5fHcoakykRvBPad0UhaOVmpCnUoV67AWh0OLjA2Yp3j WUZKI/2jzsOxN5/lP2ptB8i/zvq5y+q00Pk6hkEr1fa5sEseYvcGNeg+f OLELA2Da4i0GzlnRCWeSeQFras0Gu2YxpS01OXg2wzzV1/RTGwU1x+y0r dme6sBacMP/VpFTY5JMIESb7ezjZDw8Xp+GmRmJCNdD5Fgj6VS+eWFuUv cqQiT/Ji+N1IqePcs/3GdzF/MLgqQV6THjqHEWcmIxJW1K9IFwvEhjHGL A==; X-CSE-ConnectionGUID: 1yO4Z3ZAQVyB3z8Z1RYCpw== X-CSE-MsgGUID: XlouCPtjSoqHOeg60pYjFg== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100438" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; 
d="scan'208";a="57100438" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:23 -0700 X-CSE-ConnectionGUID: 6wsujkrgT16mCQij2E8mgg== X-CSE-MsgGUID: I/opl937T7yHkMyrIgBSSA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321222" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:23 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" Subject: [PATCH v12 02/24] KVM: x86: Use double-underscore read/write MSR helpers as appropriate Date: Mon, 11 Aug 2025 19:55:10 -0700 Message-ID: <20250812025606.74625-3-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Sean Christopherson Use the double-underscore helpers for emulating MSR reads and writes in the no-underscore versions to better capture the relationship between the two sets of APIs (the double-underscore versions don't honor userspace MSR filters). No functional change intended. 
Signed-off-by: Sean Christopherson Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/x86.c | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 09b106a5afdf..65c787bcfe8b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1932,11 +1932,24 @@ static int kvm_get_msr_ignored_check(struct kvm_vcp= u *vcpu, __kvm_get_msr); } =20 +int __kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) +{ + return kvm_get_msr_ignored_check(vcpu, index, data, false); +} +EXPORT_SYMBOL_GPL(__kvm_emulate_msr_read); + +int __kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) +{ + return kvm_set_msr_ignored_check(vcpu, index, data, false); +} +EXPORT_SYMBOL_GPL(__kvm_emulate_msr_write); + int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) { if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ)) return KVM_MSR_RET_FILTERED; - return kvm_get_msr_ignored_check(vcpu, index, data, false); + + return __kvm_emulate_msr_read(vcpu, index, data); } EXPORT_SYMBOL_GPL(kvm_emulate_msr_read); =20 @@ -1944,21 +1957,11 @@ int kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u3= 2 index, u64 data) { if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_WRITE)) return KVM_MSR_RET_FILTERED; - return kvm_set_msr_ignored_check(vcpu, index, data, false); -} -EXPORT_SYMBOL_GPL(kvm_emulate_msr_write); =20 -int __kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) -{ - return kvm_get_msr_ignored_check(vcpu, index, data, false); + return __kvm_emulate_msr_write(vcpu, index, data); } -EXPORT_SYMBOL_GPL(__kvm_emulate_msr_read); +EXPORT_SYMBOL_GPL(kvm_emulate_msr_write); =20 -int __kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) -{ - return kvm_set_msr_ignored_check(vcpu, index, data, false); -} -EXPORT_SYMBOL_GPL(__kvm_emulate_msr_write); =20 static void complete_userspace_rdmsr(struct kvm_vcpu *vcpu) { --=20 2.47.1 From nobody Sat 
Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CFBD72D2390; Tue, 12 Aug 2025 02:56:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967391; cv=none; b=TQkFcjXHyvJ8WB9fFd0LHluE3FH/RKdRNMDPTroWZDegu8Jw26NG8eLMcsdquQJ+3vkhTE3yDDgNR01XrxlKoxthwMGl62d9KkJc5MuZbBbYP69vJQ+sNvbrNMfNxIT8sBV9ES7b2NizEmTUtQu1y1J8CFYmCp9Bkdbb5xi5gKU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967391; c=relaxed/simple; bh=E/2yq9i/iT6Va/MtK28zHkNUQ3UnGlyNG+Ds6CvNXpM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=BCp6upsr9apyRwlRMDbTEz2JCJs3tTtrTpxGghB3XNR/C6vlMmr/bKkLWfLt9mbV8lcUr0gyJ9TJruuhaRRLblBalHm+ELtZwFjc3J012+g06BTPrE03HR3r8Sa5JaM0yFSvqovXPXIXcmMWaagTqeImul2sHzV9Cmb2emR3KHQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Im49Yzqm; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Im49Yzqm" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967390; x=1786503390; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=E/2yq9i/iT6Va/MtK28zHkNUQ3UnGlyNG+Ds6CvNXpM=; b=Im49YzqmzPfkMyJPfRsdCMIL9tKpV5IFOn9DZw8ZktzywAU/Q0iPr2/w 
Nb51Nkg/c7whj5hjSliipN/VNhtvnH+HoRUKS86SrqoNCMjgIQTp7AnUk bPEOvRlAQVnsuRI8s3+v5CK6aJyAHjvDHOyZpdsbh+wRq3mhL1/O56wDL LlTYdiRI67/NEA3APT33D+aLyN0iyc4K2GpPqTkPHDVjkvZatuktGzpG6 tkzKxqp707kkChU8SIaR9v29TbHYfRfE7t2UXI/dEJsXdyztla7QnL6Lc o8BOxAy5wLwHFU2ZdtZHpaMTInk+2GaoBPbEYosX+6TTJ38bxeSk9R2Tu Q==; X-CSE-ConnectionGUID: I9DnUG7kQQCIkKDHWqTf3g== X-CSE-MsgGUID: 3qe4zZZPSlOOVM5EY0Ii7w== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100448" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100448" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:24 -0700 X-CSE-ConnectionGUID: lGi/MtTPRsydcVqe2AIBcQ== X-CSE-MsgGUID: vnU5tKEPTDK4vn1nsBj4WA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321232" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:24 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Mathias Krause , John Allen , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 03/24] KVM: x86: Add kvm_msr_{read,write}() helpers Date: Mon, 11 Aug 2025 19:55:11 -0700 Message-ID: <20250812025606.74625-4-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Wrap __kvm_{get,set}_msr() in two new helpers for KVM-internal use, and replace existing uses of the raw functions with the helpers. kvm_msr_{read,write}() are KVM-internal helpers, i.e. they are used when KVM itself needs to get/set an MSR value while emulating CPU behavior, with host_initiated == %true in the helpers. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/include/asm/kvm_host.h | 3 ++- arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/x86.c | 16 +++++++++++++--- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 86e4d0b8469b..39b93642e7d2 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2151,9 +2151,10 @@ void kvm_enable_efer_bits(u64); bool kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer); int kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); int kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); -int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_i= nitiated); int __kvm_emulate_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); int __kvm_emulate_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data); +int kvm_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data); +int kvm_msr_write(struct kvm_vcpu *vcpu, 
u32 index, u64 data); int kvm_emulate_rdmsr(struct kvm_vcpu *vcpu); int kvm_emulate_wrmsr(struct kvm_vcpu *vcpu); int kvm_emulate_as_nop(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index e2836a255b16..30fd18700972 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -2001,7 +2001,7 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *= ebx, if (function =3D=3D 7 && index =3D=3D 0) { u64 data; if ((*ebx & (feature_bit(RTM) | feature_bit(HLE))) && - !__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) && + !kvm_msr_read(vcpu, MSR_IA32_TSX_CTRL, &data) && (data & TSX_CTRL_CPUID_CLEAR)) *ebx &=3D ~(feature_bit(RTM) | feature_bit(HLE)); } else if (function =3D=3D 0x80000007) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 65c787bcfe8b..726028eb647b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1898,8 +1898,8 @@ static int kvm_set_msr_ignored_check(struct kvm_vcpu = *vcpu, * Returns 0 on success, non-0 otherwise. * Assumes vcpu_load() was already called. 
*/ -int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, - bool host_initiated) +static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data, + bool host_initiated) { struct msr_data msr; int ret; @@ -1925,6 +1925,16 @@ int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, = u64 *data, return ret; } =20 +int kvm_msr_write(struct kvm_vcpu *vcpu, u32 index, u64 data) +{ + return __kvm_set_msr(vcpu, index, data, true); +} + +int kvm_msr_read(struct kvm_vcpu *vcpu, u32 index, u64 *data) +{ + return __kvm_get_msr(vcpu, index, data, true); +} + static int kvm_get_msr_ignored_check(struct kvm_vcpu *vcpu, u32 index, u64 *data, bool host_initiated) { @@ -12463,7 +12473,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool ini= t_event) MSR_IA32_MISC_ENABLE_BTS_UNAVAIL; =20 __kvm_set_xcr(vcpu, 0, XFEATURE_MASK_FP); - __kvm_set_msr(vcpu, MSR_IA32_XSS, 0, true); + kvm_msr_write(vcpu, MSR_IA32_XSS, 0); } =20 /* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */ --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0AF992D8365; Tue, 12 Aug 2025 02:56:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967391; cv=none; b=Gnk1fJUDK5hAczU7LMIODq7GexscdFo6y8DgmNM8j4YxBJVmwX3LLE9MyLq6CerG1smcIZYIiSzxCrLglIlnq5fWF8Mn+Nq7tgP4o5ADMEv90g33tpX1a26uY2o5t+GYTsSVJY/6RwCkHCWbW6Tw3/VbtHLiR4vnWwNGWS1gC8g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967391; c=relaxed/simple; bh=FVanvm/VpXucreMGhf99RZyR3UD92uB5Vfal1SBRqZ8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=HY2tqkpV5pixS7Fug7B/zWWhaiQQeqi8UNktzy3Wux+81Niclap/0CPf7tVRkKfbmo2qsph2QPrzLJEvqmmqYSX+jYU6b1eDGJpILvqCkeC7ARnm7n/ZcOF/2P0bvD1GjJ4Il4vjzBifU8zwmaGRUjbjW/uJJsro9g02xbt3/bk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ZIiKLTvB; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ZIiKLTvB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967390; x=1786503390; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FVanvm/VpXucreMGhf99RZyR3UD92uB5Vfal1SBRqZ8=; b=ZIiKLTvB+calwWSZQWsqtbUud5/v8EgJDdxVSxWUlXlvzbaLQi/BYRDP q9HCW1dfRIw1rSKO8nFnvsWJSo9MWqvH1bM8ZCV4RKu/xA7/eCNbEp6UQ D40IEf0ZanH6k43Z4rgo0Uc6p57FewGY55jZ3oOvOWHN8JBvAQ9jehZEy aDm6ODScQPrhL00FZNRi89l42xdCwfhQu5188vxaDP5tPji/TkJkT/IWZ bXKuUbvTvZza8P4yjaU7tS55LAORcNjKSP+4ev2c+KDRO7Wpm46eypne4 +CmnyFzAAjjD5sgwJngGf0mBf0wKQdcKqMm5sLc4CQbniVmlk7SJcoqnG g==; X-CSE-ConnectionGUID: RvO1Unu9Ry6E+ZdmvoGTvQ== X-CSE-MsgGUID: J+KUbK0URuCQLt8o0k2gBA== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100460" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100460" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:25 -0700 X-CSE-ConnectionGUID: 0350Ci/XTv6+YnVdLh8+Ww== X-CSE-MsgGUID: grSDObJtRymncPkTWyMBPg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321240" Received: from 984fee019967.jf.intel.com 
([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:25 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Mathias Krause , John Allen , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" Subject: [PATCH v12 04/24] KVM: x86: Manually clear MPX state only on INIT Date: Mon, 11 Aug 2025 19:55:12 -0700 Message-ID: <20250812025606.74625-5-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Sean Christopherson Don't manually clear/zero MPX state on RESET, as the guest FPU state is zero allocated and KVM only does RESET during vCPU creation, i.e. the relevant state is guaranteed to be all zeroes. Opportunistically move the relevant code into a helper in anticipation of adding support for CET shadow stacks, which also has state that is zeroed on INIT. 
Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/x86.c | 46 ++++++++++++++++++++++++++++++---------------- 1 file changed, 30 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 726028eb647b..6cf0d15a7a64 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12389,6 +12389,35 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) kvfree(vcpu->arch.cpuid_entries); } =20 +static void kvm_xstate_reset(struct kvm_vcpu *vcpu, bool init_event) +{ + struct fpstate *fpstate =3D vcpu->arch.guest_fpu.fpstate; + + /* + * Guest FPU state is zero allocated and so doesn't need to be manually + * cleared on RESET, i.e. during vCPU creation. + */ + if (!init_event || !fpstate) + return; + + /* + * On INIT, only select XSTATE components are zeroed, most components + * are unchanged. Currently, the only components that are zeroed and + * supported by KVM are MPX related. + */ + if (!kvm_mpx_supported()) + return; + + /* + * All paths that lead to INIT are required to load the guest's FPU + * state (because most paths are buried in KVM_RUN). + */ + kvm_put_guest_fpu(vcpu); + fpstate_clear_xstate_component(fpstate, XFEATURE_BNDREGS); + fpstate_clear_xstate_component(fpstate, XFEATURE_BNDCSR); + kvm_load_guest_fpu(vcpu); +} + void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) { struct kvm_cpuid_entry2 *cpuid_0x1; @@ -12446,22 +12475,7 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool in= it_event) kvm_async_pf_hash_reset(vcpu); vcpu->arch.apf.halted =3D false; =20 - if (vcpu->arch.guest_fpu.fpstate && kvm_mpx_supported()) { - struct fpstate *fpstate =3D vcpu->arch.guest_fpu.fpstate; - - /* - * All paths that lead to INIT are required to load the guest's - * FPU state (because most paths are buried in KVM_RUN). 
- */ - if (init_event) - kvm_put_guest_fpu(vcpu); - - fpstate_clear_xstate_component(fpstate, XFEATURE_BNDREGS); - fpstate_clear_xstate_component(fpstate, XFEATURE_BNDCSR); - - if (init_event) - kvm_load_guest_fpu(vcpu); - } + kvm_xstate_reset(vcpu, init_event); =20 if (!init_event) { vcpu->arch.smbase =3D 0x30000; --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6D8082E2F09; Tue, 12 Aug 2025 02:56:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967392; cv=none; b=r/EGUSAZDgI9HJo+IR8KQ+VT5SJ1usfK6V//KO/nnknf4TyJHcM/OkrcJrL6GrA3mNdLKyIrXpQ9LMPjZ1OO7cRmMxlQgSE7CgE66PxMdGZ5fPmf0Ux55QvHVm2hcLmLEQqhrkFkSe4Dm8ANcf0eNZf90j3MdEC4ndkUPDBqiPY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967392; c=relaxed/simple; bh=qK4bs4dpKAzsNUcm95NUj00AyCZjbxXuuQ8IYy7JQMo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TRcTD21FrwF35e0oe/v2I8M6G2TprYhk+2zI3Innw5uRwfsZ63V8Ozhcxrlfor+uXmdQWa3pn/HgGJBX01g6R+X8Bb6Ty4T2IuYG2yE/sXQLop/r4yUIC7NxwQwq+ajSLvyNJKo865NeUXWSSfQhLMOCWslO4pIQOyZ3yZkoZ90= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ehcsjtly; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Ehcsjtly" 
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
 xin@zytor.com, Chao Gao, Sean Christopherson, Mathias Krause, John Allen,
 Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H.
 Peter Anvin"
Subject: [PATCH v12 05/24] KVM: x86: Zero XSTATE components on INIT by iterating over supported features
Date: Mon, 11 Aug 2025 19:55:13 -0700
Message-ID: <20250812025606.74625-6-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

Tweak the code a bit to facilitate resetting more xstate components in
the future, e.g., CET's xstate-managed MSRs.

No functional change intended.

Suggested-by: Sean Christopherson
Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/kvm/x86.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6cf0d15a7a64..0010aa45bf9f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12392,6 +12392,8 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 static void kvm_xstate_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct fpstate *fpstate = vcpu->arch.guest_fpu.fpstate;
+	u64 xfeatures_mask;
+	int i;
 
 	/*
 	 * Guest FPU state is zero allocated and so doesn't need to be manually
@@ -12405,16 +12407,20 @@ static void kvm_xstate_reset(struct kvm_vcpu *vcpu, bool init_event)
 	 * are unchanged. Currently, the only components that are zeroed and
 	 * supported by KVM are MPX related.
 	 */
-	if (!kvm_mpx_supported())
+	xfeatures_mask = (kvm_caps.supported_xcr0 | kvm_caps.supported_xss) &
+			 (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
+	if (!xfeatures_mask)
 		return;
 
+	BUILD_BUG_ON(sizeof(xfeatures_mask) * BITS_PER_BYTE <= XFEATURE_MAX);
+
 	/*
 	 * All paths that lead to INIT are required to load the guest's FPU
 	 * state (because most paths are buried in KVM_RUN).
 	 */
 	kvm_put_guest_fpu(vcpu);
-	fpstate_clear_xstate_component(fpstate, XFEATURE_BNDREGS);
-	fpstate_clear_xstate_component(fpstate, XFEATURE_BNDCSR);
+	for_each_set_bit(i, (unsigned long *)&xfeatures_mask, XFEATURE_MAX)
+		fpstate_clear_xstate_component(fpstate, i);
 	kvm_load_guest_fpu(vcpu);
 }
 
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
 xin@zytor.com, Sean Christopherson, Mathias Krause, John Allen, Chao Gao,
 Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H.
 Peter Anvin"
Subject: [PATCH v12 06/24] KVM: x86: Introduce KVM_{G,S}ET_ONE_REG uAPIs support
Date: Mon, 11 Aug 2025 19:55:14 -0700
Message-ID: <20250812025606.74625-7-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

From: Yang Weijiang

Enable the KVM_{G,S}ET_ONE_REG uAPIs so that userspace can access a HW MSR
or a KVM synthetic MSR through them.

In the CET KVM series [1], KVM "steals" an MSR from the PV MSR space and
accesses it via the KVM_{G,S}ET_MSRS uAPIs, but that approach pollutes the
PV MSR space and hides the difference between synthetic MSRs and normal
HW-defined MSRs. Now carve out a separate room in the KVM-customized MSR
address space for synthetic MSRs. The synthetic MSRs are not exposed to
userspace via KVM_GET_MSR_INDEX_LIST; instead, userspace complies with
KVM's setup and composes the uAPI params. KVM synthetic MSR indices start
from 0 and increase linearly. The userspace caller should tag the MSR type
correctly in order to access the intended HW or synthetic MSR.
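
The register ID layout described above can be sketched from the userspace
side as follows. This is a minimal illustration, assuming the struct
kvm_x86_reg_id layout and type values from this patch and a little-endian
host (index in bits 31:0, type in bits 39:32 of the 64-bit ID). The
X86_REG_* names and the x86_reg_id_encode()/x86_reg_id_validate() helpers
are hypothetical and not part of the uAPI:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local mirrors of the type values added by this patch (illustrative names). */
#define X86_REG_MSR        (1u << 2)
#define X86_REG_SYNTHETIC  (1u << 3)

/*
 * Pack {index, type} into the 64-bit ID carried in kvm_one_reg.id.
 * Assumes a little-endian view of struct kvm_x86_reg_id: index in
 * bits 31:0, type in bits 39:32, reserved fields (bits 63:40) zero.
 */
static uint64_t x86_reg_id_encode(uint32_t index, uint8_t type)
{
	return (uint64_t)index | ((uint64_t)type << 32);
}

/*
 * Mirror the checks the ioctl handler applies before touching the MSR:
 * reserved bits must be zero and the type must be MSR or SYNTHETIC.
 * Returns 0 on success, -EINVAL otherwise.
 */
static int x86_reg_id_validate(uint64_t id)
{
	uint8_t type = (uint8_t)(id >> 32);

	if (id >> 40)
		return -EINVAL;
	if (type != X86_REG_MSR && type != X86_REG_SYNTHETIC)
		return -EINVAL;
	return 0;
}
```

With such an encoding, a caller would set kvm_one_reg.id to
x86_reg_id_encode(msr, X86_REG_MSR) and point kvm_one_reg.addr at a u64
buffer before issuing the ioctl.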
Suggested-by: Sean Christopherson
Signed-off-by: Yang Weijiang
Link: https://lore.kernel.org/all/20240219074733.122080-18-weijiang.yang@intel.com/ [1]
Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/include/uapi/asm/kvm.h | 10 +++++
 arch/x86/kvm/x86.c              | 66 +++++++++++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 0f15d683817d..e72d9e6c1739 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -411,6 +411,16 @@ struct kvm_xcrs {
 	__u64 padding[16];
 };
 
+#define KVM_X86_REG_MSR (1 << 2)
+#define KVM_X86_REG_SYNTHETIC (1 << 3)
+
+struct kvm_x86_reg_id {
+	__u32 index;
+	__u8 type;
+	__u8 rsvd;
+	__u16 rsvd16;
+};
+
 #define KVM_SYNC_X86_REGS (1UL << 0)
 #define KVM_SYNC_X86_SREGS (1UL << 1)
 #define KVM_SYNC_X86_EVENTS (1UL << 2)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0010aa45bf9f..f77949c3e9dd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2241,6 +2241,31 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 	return kvm_set_msr_ignored_check(vcpu, index, *data, true);
 }
 
+static int kvm_get_one_msr(struct kvm_vcpu *vcpu, u32 msr, u64 __user *value)
+{
+	u64 val;
+	int r;
+
+	r = do_get_msr(vcpu, msr, &val);
+	if (r)
+		return r;
+
+	if (put_user(val, value))
+		return -EFAULT;
+
+	return 0;
+}
+
+static int kvm_set_one_msr(struct kvm_vcpu *vcpu, u32 msr, u64 __user *value)
+{
+	u64 val;
+
+	if (get_user(val, value))
+		return -EFAULT;
+
+	return do_set_msr(vcpu, msr, &val);
+}
+
 #ifdef CONFIG_X86_64
 struct pvclock_clock {
 	int vclock_mode;
@@ -5902,6 +5927,11 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
 	}
 }
 
+static int kvm_translate_synthetic_msr(struct kvm_x86_reg_id *reg)
+{
+	return -EINVAL;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -6018,6 +6048,42 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		srcu_read_unlock(&vcpu->kvm->srcu, idx);
 		break;
 	}
+	case KVM_GET_ONE_REG:
+	case KVM_SET_ONE_REG: {
+		struct kvm_x86_reg_id *id;
+		struct kvm_one_reg reg;
+		u64 __user *value;
+
+		r = -EFAULT;
+		if (copy_from_user(&reg, argp, sizeof(reg)))
+			break;
+
+		r = -EINVAL;
+		id = (struct kvm_x86_reg_id *)&reg.id;
+		if (id->rsvd || id->rsvd16)
+			break;
+
+		if (id->type != KVM_X86_REG_MSR &&
+		    id->type != KVM_X86_REG_SYNTHETIC)
+			break;
+
+		if (id->type == KVM_X86_REG_SYNTHETIC) {
+			r = kvm_translate_synthetic_msr(id);
+			if (r)
+				break;
+		}
+
+		r = -EINVAL;
+		if (id->type != KVM_X86_REG_MSR)
+			break;
+
+		value = u64_to_user_ptr(reg.addr);
+		if (ioctl == KVM_GET_ONE_REG)
+			r = kvm_get_one_msr(vcpu, id->index, value);
+		else
+			r = kvm_set_one_msr(vcpu, id->index, value);
+		break;
+	}
 	case KVM_TPR_ACCESS_REPORTING: {
 		struct kvm_tpr_access_ctl tac;
 
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
 xin@zytor.com, Sean Christopherson, Chao Gao, Mathias Krause, John Allen,
 Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH v12 07/24] KVM: x86: Report XSS as to-be-saved if there are supported features
Date: Mon, 11 Aug 2025 19:55:15 -0700
Message-ID: <20250812025606.74625-8-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

From: Sean Christopherson

Add MSR_IA32_XSS to the list of MSRs reported to userspace if
supported_xss is non-zero, i.e. KVM supports at least one XSS based
feature.

Before the CET virtualization series is enabled, the guest's MSR_IA32_XSS
is guaranteed to be 0, i.e., XSAVES/XRSTORS is executed in non-root mode
with XSS == 0, which has the same effect as XSAVE/XRSTOR.
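
The reporting rule above can be modeled in a few lines. This is only a
sketch under stated assumptions: filter_msrs_to_save() is a hypothetical
stand-in for KVM's probing of its to-be-saved MSR list, and the only
condition it models is the supported_xss check for IA32_XSS (index 0xda0):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_XSS 0x00000da0u  /* architectural index of IA32_XSS */

/*
 * Copy the MSR indices from 'base' that should be reported to userspace
 * into 'out' and return how many were kept. IA32_XSS is kept only when at
 * least one XSS-based feature is supported; every other index is kept.
 */
static size_t filter_msrs_to_save(const uint32_t *base, size_t n,
				  uint64_t supported_xss, uint32_t *out)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (base[i] == MSR_IA32_XSS && !supported_xss)
			continue;
		out[kept++] = base[i];
	}
	return kept;
}
```

With supported_xss == 0 the XSS entry is dropped from the reported list,
which matches the behavior userspace sees before any XSS feature is
enabled.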
Signed-off-by: Sean Christopherson
Signed-off-by: Yang Weijiang
Reviewed-by: Maxim Levitsky
Reviewed-by: Chao Gao
Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/kvm/x86.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f77949c3e9dd..e0e440607e02 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -335,7 +335,7 @@ static const u32 msrs_to_save_base[] = {
 	MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
 	MSR_IA32_UMWAIT_CONTROL,
 
-	MSR_IA32_XFD, MSR_IA32_XFD_ERR,
+	MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS,
 };
 
 static const u32 msrs_to_save_pmu[] = {
@@ -7444,6 +7444,10 @@ static void kvm_probe_msr_to_save(u32 msr_index)
 		if (!(kvm_get_arch_capabilities() & ARCH_CAP_TSX_CTRL_MSR))
 			return;
 		break;
+	case MSR_IA32_XSS:
+		if (!kvm_caps.supported_xss)
+			return;
+		break;
 	default:
 		break;
 	}
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
 xin@zytor.com, Sean Christopherson, Zhang Yi Z, Chao Gao, Mathias Krause,
 John Allen, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH v12 08/24] KVM: x86: Refresh CPUID on write to guest MSR_IA32_XSS
Date: Mon, 11 Aug 2025 19:55:16 -0700
Message-ID: <20250812025606.74625-9-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

From: Yang Weijiang

Update CPUID.(EAX=0DH,ECX=1).EBX to reflect the currently required xstate
size after the XSS MSR is modified.

CPUID.(EAX=0DH,ECX=1).EBX reports the required storage size of all enabled
xstate features in (XCR0 | IA32_XSS). The guest can use this CPUID value
to allocate a sufficiently large XSAVE buffer.

Note, KVM does not yet support any XSS based features, i.e. supported_xss
is guaranteed to be zero at this time.

Opportunistically return KVM_MSR_RET_UNSUPPORTED if guest CPUID doesn't
enumerate XSAVES. Since KVM_MSR_RET_UNSUPPORTED takes care of
host_initiated cases, drop the host_initiated check.
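
To illustrate why EBX has to be refreshed when XSS changes, the sketch
below computes a compacted XSAVE area size for (XCR0 | IA32_XSS). It is a
simplified model, not KVM's xstate_required_size(): the per-component size
table is illustrative (real sizes come from CPUID.(EAX=0DH,ECX=i).EAX),
no component is assumed to need 64-byte alignment, and the example_*
names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define XSAVE_LEGACY_SIZE 512u  /* x87/SSE legacy region */
#define XSAVE_HDR_SIZE     64u  /* XSAVE header */

/*
 * Illustrative per-component sizes in bytes, indexed by xfeature bit.
 * Only a few entries are filled in; the rest are treated as absent.
 */
static const uint32_t example_xstate_size[64] = {
	[2]  = 256,	/* AVX */
	[11] = 16,	/* CET user state */
	[12] = 24,	/* CET supervisor state */
};

/*
 * Size of a compacted XSAVE area covering every component set in
 * (xcr0 | xss). Components 0 and 1 live in the legacy region, so the
 * walk starts at bit 2.
 */
static uint32_t example_required_size(uint64_t xcr0, uint64_t xss)
{
	uint64_t mask = (xcr0 | xss) >> 2;
	uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;

	for (int bit = 2; mask; bit++, mask >>= 1) {
		if (mask & 1)
			size += example_xstate_size[bit];
	}
	return size;
}
```

Under these assumptions, enabling a supervisor component through XSS grows
the reported size even though XCR0 is unchanged, which is the case the
CPUID refresh covers.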
Suggested-by: Sean Christopherson
Co-developed-by: Zhang Yi Z
Signed-off-by: Zhang Yi Z
Signed-off-by: Yang Weijiang
Reviewed-by: Maxim Levitsky
Reviewed-by: Chao Gao
Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/include/asm/kvm_host.h |  3 ++-
 arch/x86/kvm/cpuid.c            | 15 ++++++++++++++-
 arch/x86/kvm/x86.c              |  9 +++++----
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 39b93642e7d2..1faf53df6259 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -811,7 +811,6 @@ struct kvm_vcpu_arch {
 	bool at_instruction_boundary;
 	bool tpr_access_reporting;
 	bool xfd_no_write_intercept;
-	u64 ia32_xss;
 	u64 microcode_version;
 	u64 arch_capabilities;
 	u64 perf_capabilities;
@@ -872,6 +871,8 @@ struct kvm_vcpu_arch {
 
 	u64 xcr0;
 	u64 guest_supported_xcr0;
+	u64 guest_supported_xss;
+	u64 ia32_xss;
 
 	struct kvm_pio_request pio;
 	void *pio_data;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 30fd18700972..85079caaf507 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -263,6 +263,17 @@ static u64 cpuid_get_supported_xcr0(struct kvm_vcpu *vcpu)
 	return (best->eax | ((u64)best->edx << 32)) & kvm_caps.supported_xcr0;
 }
 
+static u64 cpuid_get_supported_xss(struct kvm_vcpu *vcpu)
+{
+	struct kvm_cpuid_entry2 *best;
+
+	best = kvm_find_cpuid_entry_index(vcpu, 0xd, 1);
+	if (!best)
+		return 0;
+
+	return (best->ecx | ((u64)best->edx << 32)) & kvm_caps.supported_xss;
+}
+
 static __always_inline void kvm_update_feature_runtime(struct kvm_vcpu *vcpu,
 						       struct kvm_cpuid_entry2 *entry,
 						       unsigned int x86_feature,
@@ -305,7 +316,8 @@ static void kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu)
 	best = kvm_find_cpuid_entry_index(vcpu, 0xD, 1);
 	if (best && (cpuid_entry_has(best, X86_FEATURE_XSAVES) ||
 		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
-		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
+		best->ebx = xstate_required_size(vcpu->arch.xcr0 |
+						 vcpu->arch.ia32_xss, true);
 }
 
 static bool kvm_cpuid_has_hyperv(struct kvm_vcpu *vcpu)
@@ -424,6 +436,7 @@ void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	}
 
 	vcpu->arch.guest_supported_xcr0 = cpuid_get_supported_xcr0(vcpu);
+	vcpu->arch.guest_supported_xss = cpuid_get_supported_xss(vcpu);
 
 	vcpu->arch.pv_cpuid.features = kvm_apply_cpuid_pv_features_quirk(vcpu);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e0e440607e02..c91472d36717 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3998,16 +3998,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		}
 		break;
 	case MSR_IA32_XSS:
-		if (!msr_info->host_initiated &&
-		    !guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
-			return 1;
+		if (!guest_cpuid_has(vcpu, X86_FEATURE_XSAVES))
+			return KVM_MSR_RET_UNSUPPORTED;
 		/*
 		 * KVM supports exposing PT to the guest, but does not support
 		 * IA32_XSS[bit 8]. Guests have to use RDMSR/WRMSR rather than
 		 * XSAVES/XRSTORS to save/restore PT MSRs.
 		 */
-		if (data & ~kvm_caps.supported_xss)
+		if (data & ~vcpu->arch.guest_supported_xss)
 			return 1;
+		if (vcpu->arch.ia32_xss == data)
+			break;
 		vcpu->arch.ia32_xss = data;
 		vcpu->arch.cpuid_dynamic_bits_dirty = true;
 		break;
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
 xin@zytor.com, Chao Gao, Mathias Krause, John Allen, Sean Christopherson,
 Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H.
 Peter Anvin"
Subject: [PATCH v12 09/24] KVM: x86: Initialize kvm_caps.supported_xss
Date: Mon, 11 Aug 2025 19:55:17 -0700
Message-ID: <20250812025606.74625-10-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

From: Yang Weijiang

Set the initial kvm_caps.supported_xss to (host_xss & KVM_SUPPORTED_XSS)
if XSAVES is supported. host_xss contains the host-supported xstate
feature bits for thread FPU context switch, and KVM_SUPPORTED_XSS includes
all KVM-enabled XSS feature bits. The resulting value represents the
supervisor xstates that are available to the guest and are backed by the
host FPU framework for swapping {guest,host} XSAVE-managed registers/MSRs.

Signed-off-by: Yang Weijiang
Reviewed-by: Maxim Levitsky
Reviewed-by: Chao Gao
Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/kvm/x86.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c91472d36717..235a5de66e68 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -220,6 +220,8 @@ static struct kvm_user_return_msrs __percpu *user_return_msrs;
 				| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
 				| XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE)
 
+#define KVM_SUPPORTED_XSS 0
+
 bool __read_mostly allow_smaller_maxphyaddr = 0;
 EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr);
 
@@ -9762,14 +9764,17 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
 		kvm_host.xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
 		kvm_caps.supported_xcr0 = kvm_host.xcr0 & KVM_SUPPORTED_XCR0;
 	}
+
+	if (boot_cpu_has(X86_FEATURE_XSAVES)) {
+		rdmsrq(MSR_IA32_XSS, kvm_host.xss);
+		kvm_caps.supported_xss = kvm_host.xss & KVM_SUPPORTED_XSS;
+	}
+
 	kvm_caps.supported_quirks = KVM_X86_VALID_QUIRKS;
 	kvm_caps.inapplicable_quirks = KVM_X86_CONDITIONAL_QUIRKS;
 
 	rdmsrq_safe(MSR_EFER, &kvm_host.efer);
 
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		rdmsrq(MSR_IA32_XSS, kvm_host.xss);
-
 	kvm_init_pmu_capability(ops->pmu_ops);
 
 	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
 xin@zytor.com, Sean Christopherson, Mathias Krause, John Allen, Chao Gao,
 Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H.
Peter Anvin" Subject: [PATCH v12 10/24] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs Date: Mon, 11 Aug 2025 19:55:18 -0700 Message-ID: <20250812025606.74625-11-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Sean Christopherson Load the guest's FPU state if userspace is accessing MSRs whose values are managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(), to facilitate access to such MSRs. If MSRs covered by kvm_caps.supported_xss are passed through to the guest, the guest's MSR values are swapped with the host's before the vCPU exits to userspace, and swapped again after it re-enters the kernel, before the next VM-entry. Because the modified code is also used for the KVM_GET_MSRS device ioctl(), explicitly check that @vcpu is non-NULL before attempting to load guest state. The XSAVE-managed MSRs cannot be retrieved via the device ioctl() without loading guest FPU state (which doesn't exist). Note that guest_cpuid_has() is not queried, as host userspace is allowed to access MSRs that have not been exposed to the guest, e.g. it might do KVM_SET_MSRS prior to KVM_SET_CPUID2. The two helpers are provided to make it explicit that accessing XSAVE-managed MSRs requires special checks and handling to guarantee that reads and writes of the MSRs are correct.
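For illustration only (not part of the patch), the load-once/put-once pattern described above can be modeled in plain userspace C. This is a minimal sketch: struct msr_entry, struct fpu_model, msr_io_model() and read_zero() are hypothetical stand-ins, the counters merely record where kvm_load_guest_fpu()/kvm_put_guest_fpu() would be called, and the MSR index values are assumed from the Intel SDM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MSR indices, assumed from the Intel SDM. */
#define MSR_IA32_U_CET   0x6a0u
#define MSR_IA32_PL0_SSP 0x6a4u
#define MSR_IA32_PL3_SSP 0x6a7u

/* Hypothetical stand-in for one userspace-provided MSR entry. */
struct msr_entry {
	uint32_t index;
	uint64_t data;
};

/* Hypothetical counters marking where the guest FPU would be loaded/put. */
struct fpu_model {
	int loads;
	int puts;
};

typedef int (*do_msr_fn)(uint32_t index, uint64_t *data);

/* True if the MSR is context switched with the rest of guest FPU state. */
static bool is_xstate_managed_msr(uint32_t index)
{
	return index == MSR_IA32_U_CET ||
	       (index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_PL3_SSP);
}

/*
 * Walk the entries; load the (modeled) guest FPU at most once, on the first
 * XSTATE-managed MSR, and put it once after the loop. Returns the number of
 * entries processed before the first failure.
 */
static size_t msr_io_model(struct fpu_model *fpu, bool have_vcpu,
			   bool xss_supported, struct msr_entry *entries,
			   size_t n, do_msr_fn do_msr)
{
	bool fpu_loaded = false;
	size_t i;

	assert(fpu && entries && do_msr);

	for (i = 0; i < n; i++) {
		if (have_vcpu && !fpu_loaded && xss_supported &&
		    is_xstate_managed_msr(entries[i].index)) {
			fpu->loads++;
			fpu_loaded = true;
		}
		if (do_msr(entries[i].index, &entries[i].data))
			break;
	}
	if (fpu_loaded)
		fpu->puts++;

	return i;
}

/* Trivial accessor that always succeeds and reads back zero. */
static int read_zero(uint32_t index, uint64_t *data)
{
	(void)index;
	*data = 0;
	return 0;
}
```

Calling msr_io_model() on a batch that mixes ordinary and XSTATE-managed MSRs records exactly one load and one put. With have_vcpu false (the device ioctl() case) it records none.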
Signed-off-by: Sean Christopherson Co-developed-by: Yang Weijiang Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/x86.c | 35 ++++++++++++++++++++++++++++++++++- arch/x86/kvm/x86.h | 24 ++++++++++++++++++++++++ 2 files changed, 58 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 235a5de66e68..f87974e0682b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -136,6 +136,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct k= vm_sregs2 *sregs2); static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2); =20 static DEFINE_MUTEX(vendor_module_lock); +static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu); +static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu); + struct kvm_x86_ops kvm_x86_ops __read_mostly; =20 #define KVM_X86_OP(func) \ @@ -4553,6 +4556,21 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct= msr_data *msr_info) } EXPORT_SYMBOL_GPL(kvm_get_msr_common); =20 +/* + * Returns true if the MSR in question is managed via XSTATE, i.e. is con= text + * switched with the rest of guest FPU state. + */ +static bool is_xstate_managed_msr(u32 index) +{ + switch (index) { + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + return true; + default: + return false; + } +} + /* * Read or write a bunch of msrs. All parameters are kernel addresses. * @@ -4563,11 +4581,26 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct k= vm_msrs *msrs, int (*do_msr)(struct kvm_vcpu *vcpu, unsigned index, u64 *data)) { + bool fpu_loaded =3D false; int i; =20 - for (i =3D 0; i < msrs->nmsrs; ++i) + for (i =3D 0; i < msrs->nmsrs; ++i) { + /* + * If userspace is accessing one or more XSTATE-managed MSRs, + * temporarily load the guest's FPU state so that the guest's + * MSR value(s) is resident in hardware, i.e. so that KVM can + * get/set the MSR via RDMSR/WRMSR. 
+ */ + if (vcpu && !fpu_loaded && kvm_caps.supported_xss && + is_xstate_managed_msr(entries[i].index)) { + kvm_load_guest_fpu(vcpu); + fpu_loaded =3D true; + } if (do_msr(vcpu, entries[i].index, &entries[i].data)) break; + } + if (fpu_loaded) + kvm_put_guest_fpu(vcpu); =20 return i; } diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index bcfd9b719ada..31ce76369cc7 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -699,4 +699,28 @@ int ____kvm_emulate_hypercall(struct kvm_vcpu *vcpu, i= nt cpl, =20 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); =20 +/* + * Lock and/or reload guest FPU and access xstate MSRs. For accesses initi= ated + * by host, guest FPU is loaded in __msr_io(). For accesses initiated by g= uest, + * guest FPU should have been loaded already. + */ + +static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm); + kvm_fpu_get(); + rdmsrl(msr_info->index, msr_info->data); + kvm_fpu_put(); +} + +static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info) +{ + KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm); + kvm_fpu_get(); + wrmsrl(msr_info->index, msr_info->data); + kvm_fpu_put(); +} + #endif --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D936A2E5B1E; Tue, 12 Aug 2025 02:56:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967395; cv=none; b=qcgBKnN470Ojhvny0IheLbHdbU9pWkC+YDnm7thnvQalwnMm/N0xqBQYfnOhHvbLzSek/aI02SWTy8+2RNQpoN2oV0brm9c24msFkIB2da405soQXKfkncm9p8et8P1oUpGnjQO9GMnsp5s7m2KyDmuL9NeMuGG2c7yKmWiwhjw= ARC-Message-Signature: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967395; c=relaxed/simple; bh=iHtDX6Kw8jTF13K/I8VPDptLJ5RQ2S1gc1SzihMXvNY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MhpWb+YlqLq14UBaZoyBNFBKGv/j32brlNFfNINk3ntdaxS9xPrvTe2l4R35Oc2lDV9TqXJr0+UpDOY67gi3/5rsTzCWDTjob72JkQHL7dAKxI+w0kwbCq0bkKcWi1xn6NWqYn4DgnE6ovhB3ybOkT145J8aSW/6j67UtfKgUdI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=A1MPcKyr; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="A1MPcKyr" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967394; x=1786503394; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iHtDX6Kw8jTF13K/I8VPDptLJ5RQ2S1gc1SzihMXvNY=; b=A1MPcKyr6q6ya8mA8/qvrO8i4O5EL+cj1HSxmgyPgrlYDpvXYHzh2PCI N2Wj4oo91lktRSH2/FNkHMWEwjiBUlcjSCNplTe6HcEeMhVbRUd8dEsrh p7xXSDwZyd12rjYlbkkV9NJOZzAssw12EbZCagiovGHHRFh+0/1uEjS3S jtCwH1JrZAUA+bvZ5TKVsPxq242FiN7Bzz/FoEWUO3VIexQjqZqm5m8E8 k2JjOXOwv+Ml3rliEMcO9OSF6CtRIJWlrsaD0Ubvwv0AY9o08gmSq2mbH JuWwudNV94b485tjTJ02tPu+yUYud0R093vr2k88j0B3EFCbju3kuo0Lz A==; X-CSE-ConnectionGUID: zSQehnEhR9eIRXEflHxLFw== X-CSE-MsgGUID: Sup5we1PQdizMOHLOMpitA== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100532" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100532" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:31 -0700 X-CSE-ConnectionGUID: 
ut7sF5w2SHCDJjL2hssxAQ== X-CSE-MsgGUID: 4flUTBgvT425LPM4atel9Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321288" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:31 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Chao Gao , Mathias Krause , John Allen , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" Subject: [PATCH v12 11/24] KVM: x86: Add fault checks for guest CR4.CET setting Date: Mon, 11 Aug 2025 19:55:19 -0700 Message-ID: <20250812025606.74625-12-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Check for potential faults when setting CR4.CET, per Intel SDM requirements. CR4.CET can be set only while CR0.WP is 1, i.e. setting CR4.CET to 1 faults if CR0.WP is 0, and clearing CR0.WP faults if CR4.CET is 1.
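As a sketch of the invariant above (illustrative only, not the KVM code), the two checks can be modeled on a pair of plain integers. struct cr_model and the model_set_cr{0,4}() helpers are hypothetical names. The bit positions (CR0.WP is bit 16, CR4.CET is bit 23) follow the Intel SDM, and a return value of 1 mirrors the "fault" convention of kvm_set_cr0()/kvm_set_cr4().

```c
#include <assert.h>
#include <stdint.h>

#define X86_CR0_WP  (1ull << 16)	/* CR0.WP: write protect */
#define X86_CR4_CET (1ull << 23)	/* CR4.CET: control-flow enforcement */

/* Hypothetical model of the two control registers of one vCPU. */
struct cr_model {
	uint64_t cr0;
	uint64_t cr4;
};

/* Returns 1 (fault) if the write would clear CR0.WP while CR4.CET is set. */
static int model_set_cr0(struct cr_model *m, uint64_t cr0)
{
	assert(m);

	if (!(cr0 & X86_CR0_WP) && (m->cr4 & X86_CR4_CET))
		return 1;

	m->cr0 = cr0;
	return 0;
}

/* Returns 1 (fault) if the write would set CR4.CET while CR0.WP is clear. */
static int model_set_cr4(struct cr_model *m, uint64_t cr4)
{
	assert(m);

	if ((cr4 & X86_CR4_CET) && !(m->cr0 & X86_CR0_WP))
		return 1;

	m->cr4 = cr4;
	return 0;
}
```

Starting from both registers zeroed, setting CR4.CET is rejected until CR0.WP has been set. Once CR4.CET is set, CR0.WP cannot be cleared until CR4.CET is cleared first.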
Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Yang Weijiang Reviewed-by: Chao Gao Reviewed-by: Maxim Levitsky Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/x86.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index f87974e0682b..cca05908f989 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1172,6 +1172,9 @@ int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long = cr0) (is_64_bit_mode(vcpu) || kvm_is_cr4_bit_set(vcpu, X86_CR4_PCIDE))) return 1; =20 + if (!(cr0 & X86_CR0_WP) && kvm_is_cr4_bit_set(vcpu, X86_CR4_CET)) + return 1; + kvm_x86_call(set_cr0)(vcpu, cr0); =20 kvm_post_set_cr0(vcpu, old_cr0, cr0); @@ -1371,6 +1374,9 @@ int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long = cr4) return 1; } =20 + if ((cr4 & X86_CR4_CET) && !kvm_is_cr0_bit_set(vcpu, X86_CR0_WP)) + return 1; + kvm_x86_call(set_cr4)(vcpu, cr4); =20 kvm_post_set_cr4(vcpu, old_cr4, cr4); --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F229C2E62BB; Tue, 12 Aug 2025 02:56:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967396; cv=none; b=bhKxW1tDGzgVLmGflu+XK+DcMmblkLQXnO6fo+GK7rrmcY5q/4sg0iqFLoeym7rI3ACVUf8iqP0eTiCZUlGgEvCg7y+H13aw06Z3hIXniDwC7t4IXRxTFAorYHnsuT10Z9/VyREqi1YGvOQ0GfmjmEiffMlOtmIbI4MVnfEI3Z0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967396; c=relaxed/simple; bh=MpGUFCMJ91Ij3rW0ZCW1FCN8YsfJ3Roxt36gb8FH7A8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=telKxcblB6ljnIU9FFpTql8PNc664SePUqzNT5bzfWSJq65MmJRUw7nUlGPN9xB2FUZU5gZDo0jMvaDGCUK2ejTC+FuHoWwwmqS+NNdxBsUDm0+NJQGtZl7tErwC/UmSct1xXjXBO0y2Jo4JeEIlaRwqhFwqg1GiDpp3bmJAaTY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=c3dbvqPz; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="c3dbvqPz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967395; x=1786503395; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MpGUFCMJ91Ij3rW0ZCW1FCN8YsfJ3Roxt36gb8FH7A8=; b=c3dbvqPzOasYAFV9noYHAvL+fynCk0Y0mla6VLkLo+HYfR1Sptnth9dM 32R77qERBM8J+JlwHiYd/lHvG553+BE34Olorst3CXsZ5Wob5z+U1WvQO jyyt4lfoyGvAGPdKAzDJ7s6iA+dCneJg8a6nrpm2iP9u/DR1DmkEKADzk /chlbHMj4KI4Kvk3NLWyIV12AtqQGKBKoZ2D1zobPzEq9YZYkHNAkdAY/ VLDDrcipA3jNKR4s0QJHxsa558mUezHbYIz5reqXGVEAasIrA7SlAxZNQ tHLWMGCw3ZF5eaOdsce8c+X3qPXjTJ0fxX+NNpdDOPJl5vFfVHABpMS1u Q==; X-CSE-ConnectionGUID: ZrFK0yLNQr+amAHQy+LYPA== X-CSE-MsgGUID: xfdOx8JzRy2/XVNIvr0ZKA== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100542" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100542" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:31 -0700 X-CSE-ConnectionGUID: nc/eP1EhRwu2iB7zbB0MkQ== X-CSE-MsgGUID: vXH1hqgtQfia22S5wE6k1Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321292" Received: from 984fee019967.jf.intel.com 
([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:31 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Chao Gao , Mathias Krause , John Allen , Sean Christopherson , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" Subject: [PATCH v12 12/24] KVM: x86: Report KVM supported CET MSRs as to-be-saved Date: Mon, 11 Aug 2025 19:55:20 -0700 Message-ID: <20250812025606.74625-13-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Add CET MSRs to the list of MSRs reported to userspace if the feature, i.e. IBT or SHSTK, associated with the MSRs is supported by KVM. 
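The per-MSR conditions can be summarized with a small userspace model (a sketch only). struct cpu_caps and cet_msr_is_reported() are hypothetical stand-ins for the kvm_cpu_cap_has() queries, and the MSR index values are assumed from the Intel SDM. The sketch encodes the rule that the {U,S}_CET MSRs need either feature, the PLx_SSP MSRs need SHSTK, and the interrupt SSP table additionally needs long mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* MSR indices, assumed from the Intel SDM. */
#define MSR_IA32_U_CET       0x6a0u
#define MSR_IA32_S_CET       0x6a2u
#define MSR_IA32_PL0_SSP     0x6a4u
#define MSR_IA32_PL3_SSP     0x6a7u
#define MSR_IA32_INT_SSP_TAB 0x6a8u

/* Hypothetical stand-in for the features KVM reports as supported. */
struct cpu_caps {
	bool shstk;	/* shadow stack */
	bool ibt;	/* indirect branch tracking */
	bool lm;	/* long mode */
};

/* True if the given CET MSR would be reported to userspace as to-be-saved. */
static bool cet_msr_is_reported(const struct cpu_caps *caps, uint32_t msr)
{
	assert(caps);

	if (msr == MSR_IA32_U_CET || msr == MSR_IA32_S_CET)
		return caps->shstk || caps->ibt;
	if (msr == MSR_IA32_INT_SSP_TAB)
		return caps->lm && caps->shstk;
	if (msr >= MSR_IA32_PL0_SSP && msr <= MSR_IA32_PL3_SSP)
		return caps->shstk;

	/* Non-CET MSRs are not filtered by this sketch. */
	return true;
}
```

For example, with only IBT supported, MSR_IA32_U_CET is reported while the SSP MSRs are not.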
Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/x86.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cca05908f989..7134f428b9f7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -341,6 +341,10 @@ static const u32 msrs_to_save_base[] =3D { MSR_IA32_UMWAIT_CONTROL, =20 MSR_IA32_XFD, MSR_IA32_XFD_ERR, MSR_IA32_XSS, + + MSR_IA32_U_CET, MSR_IA32_S_CET, + MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP, MSR_IA32_PL2_SSP, + MSR_IA32_PL3_SSP, MSR_IA32_INT_SSP_TAB, }; =20 static const u32 msrs_to_save_pmu[] =3D { @@ -7490,6 +7494,20 @@ static void kvm_probe_msr_to_save(u32 msr_index) if (!kvm_caps.supported_xss) return; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + return; + break; + case MSR_IA32_INT_SSP_TAB: + if (!kvm_cpu_cap_has(X86_FEATURE_LM)) + return; + fallthrough; + case MSR_IA32_PL0_SSP ... 
MSR_IA32_PL3_SSP: + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK)) + return; + break; default: break; } --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5AE802E62D5; Tue, 12 Aug 2025 02:56:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967397; cv=none; b=mX/caUYy5pnxHOYPqaIMH+8qL61lgSmmqHrfLYP2mGDMKd+4j2DCZUOR/YVgDVZOv1clxArC3JXlrnqZiIs882m6EvUxvmIhW5u+XqCu4uS6Lm0iN4le47YBZLoGF4U9Et+h0VR+CDKOXFYQaZgtdK7t5lXSzYg6IKbpsz4OpTA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967397; c=relaxed/simple; bh=zOyDr5CS/rL3JFy6/NMts830Rt8jyBcRLPgkk2BF6W0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=L084K484KZj6gp7+VoOKeJkbpQah7z1h9IlTgKhVnSwdJ6d7tUKWmaJFjyq5n2sjI5coJku0dQCJusqWTq4j6WuvaJdQq4O6xbuQHekGtor4ckr7jxkCcmKLEqjEbV1nqenLO9PLoJVfYja+yRJLO3soAAtNEckBG2MfJ7JulOo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=QeRa5M4z; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="QeRa5M4z" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967395; x=1786503395; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=zOyDr5CS/rL3JFy6/NMts830Rt8jyBcRLPgkk2BF6W0=; b=QeRa5M4zqRZSdJrChFBIqcsGge3ifjPOjC1Fvh0A/fM+ZpOhBtVFNvBA vLqqUpuWbq4IoPQlnpFrTbWIEpUyK2CX35abyu5QCQYux1ICq0eDui2vb txhnmjgopAuNaw45KIZEKH/2gVSOfAZB3kS/uBeRIxVXI4eVX6ZHkns74 hM8dbBORqATtWxOI7L8xOkyN4GTEXEtkarkcuLnqf/LszZsw7yLjupWpt qlCEjEAFVO0QQySXCNHVjSUCbaRndXbXq2tVs0aVlfNYLI7IHyQiRE9nF 9ZRiEhdwGSAwoUo7TnnA704MVByZYY+Q6DRF67HNTsXwmA5BjjA5tW4Wl A==; X-CSE-ConnectionGUID: DuLJTC19SoqP/1npjK7ZpA== X-CSE-MsgGUID: v81VXOR5TzqtoDDS/UVqKg== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100551" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100551" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:32 -0700 X-CSE-ConnectionGUID: EKagoBrmT0iuia85/8A4pQ== X-CSE-MsgGUID: t3c7DHX2QgeP2oq0UV8B8g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321303" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:32 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Zhang Yi Z , Chao Gao , Mathias Krause , John Allen , Sean Christopherson , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 13/24] KVM: VMX: Introduce CET VMCS fields and control bits Date: Mon, 11 Aug 2025 19:55:21 -0700 Message-ID: <20250812025606.74625-14-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Control-flow Enforcement Technology (CET) is a CPU feature used to prevent Return/Call/Jump-Oriented Programming (ROP/COP/JOP) attacks. It provides two sub-features, Shadow Stack (SHSTK) and Indirect Branch Tracking (IBT), to defend against such control-flow subversion attacks. Shadow Stack (SHSTK): A shadow stack is a second stack used exclusively for control transfer operations. The shadow stack is separate from the data/normal stack and can be enabled individually in user and kernel mode. When shadow stack is enabled, CALL pushes the return address onto both the data and shadow stacks. RET pops the return address from both stacks and compares them. If the return addresses from the two stacks do not match, the processor generates a #CP. Indirect Branch Tracking (IBT): IBT introduces an instruction (ENDBRANCH) to mark valid target addresses of indirect branches (CALL, JMP, etc.). If an indirect branch is executed and the next instruction is _not_ an ENDBRANCH, the processor generates a #CP. The ENDBRANCH instruction behaves as a NOP on platforms that have no CET. Several new MSRs are defined to support CET: MSR_IA32_{U,S}_CET: CET settings for {user,supervisor} mode respectively. MSR_IA32_PL{0,1,2,3}_SSP: SHSTK pointer linear address for CPL{0,1,2,3}. MSR_IA32_INT_SSP_TAB: Linear address of the SHSTK pointer table, whose entries are indexed by the IST of the interrupt gate descriptor.
Two XSAVES state bits are introduced for CET: IA32_XSS:[bit 11]: Control saving/restoring user mode CET states IA32_XSS:[bit 12]: Control saving/restoring supervisor mode CET states. Six VMCS fields are introduced for CET: {HOST,GUEST}_S_CET: Stores CET settings for kernel mode. {HOST,GUEST}_SSP: Stores current active SSP. {HOST,GUEST}_INTR_SSP_TABLE: Stores current active MSR_IA32_INT_SSP_TAB. On Intel platforms, two additional bits are defined in VM_EXIT and VM_ENTRY control fields: If VM_EXIT_LOAD_CET_STATE =3D 1, host CET states are loaded from following VMCS fields at VM-Exit: HOST_S_CET HOST_SSP HOST_INTR_SSP_TABLE If VM_ENTRY_LOAD_CET_STATE =3D 1, guest CET states are loaded from following VMCS fields at VM-Entry: GUEST_S_CET GUEST_SSP GUEST_INTR_SSP_TABLE Co-developed-by: Zhang Yi Z Signed-off-by: Zhang Yi Z Signed-off-by: Yang Weijiang Reviewed-by: Chao Gao Reviewed-by: Maxim Levitsky Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/include/asm/vmx.h | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index cca7d6641287..ce10a7e2d3d9 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -106,6 +106,7 @@ #define VM_EXIT_CLEAR_BNDCFGS 0x00800000 #define VM_EXIT_PT_CONCEAL_PIP 0x01000000 #define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000 +#define VM_EXIT_LOAD_CET_STATE 0x10000000 =20 #define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff =20 @@ -119,6 +120,7 @@ #define VM_ENTRY_LOAD_BNDCFGS 0x00010000 #define VM_ENTRY_PT_CONCEAL_PIP 0x00020000 #define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000 +#define VM_ENTRY_LOAD_CET_STATE 0x00100000 =20 #define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff =20 @@ -369,6 +371,9 @@ enum vmcs_field { GUEST_PENDING_DBG_EXCEPTIONS =3D 0x00006822, GUEST_SYSENTER_ESP =3D 0x00006824, GUEST_SYSENTER_EIP =3D 0x00006826, + GUEST_S_CET =3D 0x00006828, + GUEST_SSP =3D 0x0000682a, + 
GUEST_INTR_SSP_TABLE =3D 0x0000682c, HOST_CR0 =3D 0x00006c00, HOST_CR3 =3D 0x00006c02, HOST_CR4 =3D 0x00006c04, @@ -381,6 +386,9 @@ enum vmcs_field { HOST_IA32_SYSENTER_EIP =3D 0x00006c12, HOST_RSP =3D 0x00006c14, HOST_RIP =3D 0x00006c16, + HOST_S_CET =3D 0x00006c18, + HOST_SSP =3D 0x00006c1a, + HOST_INTR_SSP_TABLE =3D 0x00006c1c }; =20 /* --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A74792E7174; Tue, 12 Aug 2025 02:56:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967397; cv=none; b=JbaAPBZaaRr4y8BeE2egazN3B9j3yvZVhBfzSWV4PEg46iNANcud7gAyCJvp3fPUByECxzvvTxSMnqHijCN0R6WUfw9oMB7SzdgyobhBN/PdL/F1BpXl/a5H8yb1hB+QtW+MMiiBVM1Bj8AXg1+Q+EychGYjBUwMmvfdMQu5y+4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967397; c=relaxed/simple; bh=Pbs8pVnxI8ZQEBDx6QcNwUCWk1rHplcYcFzwnv6FgGk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JihPdL/JmsdoIAOpkVE6KMQK7vSvVBC8UDG/g8RMx8YJPKO6RvgSBNIu2N0ZDEroDdzC3hgBnfETalZWtLDmmLc7BDlwv1GojGVJAb0k+tero3AddSZKpcdhiNam/AVGfgsobugd/EKIPK6HjnW4FNTXG4vku5DLcMntjO3m+K4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Ula7wo2V; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com 
header.b="Ula7wo2V" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967395; x=1786503395; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Pbs8pVnxI8ZQEBDx6QcNwUCWk1rHplcYcFzwnv6FgGk=; b=Ula7wo2V7I2/ttvq+1Oeh408odFcQwJH46ysMnUjTKWgjruKyqN685wf 7hdnLMI/3MxW1abJd4q3rpGw57zkNjXXxQYU/gzwuqTpN5G580ljIiIQF RJZQzwoOWFpAAIsF+9vMCMBBLwsw5ZoyVdj1G5splfXWCipUWdBWDqkQw Dyms15yXXFao73YUnr0kr+yIu2ELwnMMWqAvU9HkccezG7QlzW8imgqFh jlXGH1oHxEsuMfkeUUv8XRmuz5vxV/3IdI9sThMrcdtAfny4MEG7Izx6g l77vjmPcEQUwhvQemQxIxFOCv+W4ZUBguHVTurZugZGfqcJzYUGYHzfBi Q==; X-CSE-ConnectionGUID: ce8vz3G+QQ2U3wGwdI7hZw== X-CSE-MsgGUID: 6EJ5FVWIRrSqxYSA01t22w== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100561" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100561" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:33 -0700 X-CSE-ConnectionGUID: uJzIiBdfR1ODzuxn0Q34/g== X-CSE-MsgGUID: YKTVyIHVQYCxBG0rbHHOqg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321314" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:33 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Mathias Krause , John Allen , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 14/24] KVM: x86: Enable guest SSP read/write interface with new uAPIs Date: Mon, 11 Aug 2025 19:55:22 -0700 Message-ID: <20250812025606.74625-15-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Enable a guest shadow stack pointer (SSP) access interface with new uAPIs. The CET guest SSP is a hardware register that has a corresponding VMCS field to save/restore the guest value on VM-{Exit,Entry}. KVM exposes SSP to userspace as a synthetic MSR. Use a translation helper to map the synthetic SSP index to a KVM-internal MSR index, so that userspace doesn't need to care how KVM manages synthetic MSRs internally and index conflicts are avoided.
Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/include/uapi/asm/kvm.h | 3 +++ arch/x86/kvm/x86.c | 10 +++++++++- arch/x86/kvm/x86.h | 10 ++++++++++ 3 files changed, 22 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kv= m.h index e72d9e6c1739..a4870d9c9279 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -421,6 +421,9 @@ struct kvm_x86_reg_id { __u16 rsvd16; }; =20 +/* KVM synthetic MSR index starting from 0 */ +#define KVM_SYNTHETIC_GUEST_SSP 0 + #define KVM_SYNC_X86_REGS (1UL << 0) #define KVM_SYNC_X86_SREGS (1UL << 1) #define KVM_SYNC_X86_EVENTS (1UL << 2) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 7134f428b9f7..b5c4db4b7e04 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5975,7 +5975,15 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu= *vcpu, =20 static int kvm_translate_synthetic_msr(struct kvm_x86_reg_id *reg) { - return -EINVAL; + switch (reg->index) { + case KVM_SYNTHETIC_GUEST_SSP: + reg->type =3D KVM_X86_REG_MSR; + reg->index =3D MSR_KVM_INTERNAL_GUEST_SSP; + break; + default: + return -EINVAL; + } + return 0; } =20 long kvm_arch_vcpu_ioctl(struct file *filp, diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 31ce76369cc7..f8fbd33db067 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -101,6 +101,16 @@ do { \ #define KVM_SVM_DEFAULT_PLE_WINDOW_MAX USHRT_MAX #define KVM_SVM_DEFAULT_PLE_WINDOW 3000 =20 +/* + * KVM's internal, non-ABI indices for synthetic MSRs. The values themselv= es + * are arbitrary and have no meaning, the only requirement is that they do= n't + * conflict with "real" MSRs that KVM supports. Use values at the upper end + * of KVM's reserved paravirtual MSR range to minimize churn, i.e.
these v= alues + * will be usable until KVM exhausts its supply of paravirtual MSR indices. + */ + +#define MSR_KVM_INTERNAL_GUEST_SSP 0x4b564dff + static inline unsigned int __grow_ple_window(unsigned int val, unsigned int base, unsigned int modifier, unsigned int max) { --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A43862E764B; Tue, 12 Aug 2025 02:56:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967398; cv=none; b=YiPd0BBQMYM6Bg071rW5TLibZnZnct8cXB16np6Q8uE7FSrOFcIarKkppRgqYdwCEwMUsB50YqFtNT2rF/7Y8jj/pPCw3rrlhZaZh1Wv3U+uWx0239CM5+dIb0jYZzAj+U4FqY0UwHM3h2saj46iqvXL9BvxbUbUCICd2PfMwes= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967398; c=relaxed/simple; bh=pclJ9Rtt4zR94pPpscgVs4cYRU+wMG+LAMS6GmGnH9I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TAwIsGYJIpnpsWUmWlH+B1gDZLrxnv5v69rPDbFxmfEbzEev23LcaWDyyzlHEpbxIq2C5m4hR36RFeCPr7CROKugq5yNvPdHBjUrx1Fs2uRhX8/JcHl0mU5agtzhxRrwg9eq2jttDztMH2yGOmq0scXPA8NDoKNnIohkDHn/NnY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=maE7j/fy; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="maE7j/fy" DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967396; x=1786503396; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pclJ9Rtt4zR94pPpscgVs4cYRU+wMG+LAMS6GmGnH9I=; b=maE7j/fyjmJ/2uPhVLNYsA1C7wbrxI3l+bKTIv8ToXv8X4OoMvcGkiWT 32y18Ql3EnsuOhTr+LHP31Ypv6d13n0qxgbBGYoIrs/Xgcf4KBdwHo54c JE1shrmVqLX1sDnog31LY+E8j0TdENZgP6881c/qohxmZMM4oG59j+TiW Dbhj8O8yYdxex0V+bf7ZlkoNZ9CJ/B48KAG/tgDsMK0YSu3QvRR3vYBMf iHAuiHtdz11v8tUbIFNusY4UtI2QVWiClbMdcPzWrHwUGQps80YRsS232 kJDIb/VDk0mNEL+RwWWLH3D6Lknol4/tXow5tOc1JfxfHI59smXoxr/wX Q==; X-CSE-ConnectionGUID: B8/FFYlKSfiRQzgQRThfkw== X-CSE-MsgGUID: wrLTWqGsQJ2rnSgJBM1ptQ== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100575" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100575" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:34 -0700 X-CSE-ConnectionGUID: X0Rxla+dTqy2YMIQxwBEzQ== X-CSE-MsgGUID: BbbaVR8eTamPsR51vX4QUw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321324" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:34 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Mathias Krause , John Allen , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 15/24] KVM: VMX: Emulate read and write to CET MSRs Date: Mon, 11 Aug 2025 19:55:23 -0700 Message-ID: <20250812025606.74625-16-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Add an emulation interface for CET MSR accesses. The emulation code is split into a common part and a vendor-specific part. The common part performs the checks shared by all vendors, e.g. accessibility and data validity, then hands the access either to the helpers for XSAVE-managed MSRs or to the vendor code that reads and writes the CET VMCS fields. SSP can only be read via RDSSP, and writing it requires destructive and potentially faulting operations such as SAVEPREVSSP/RSTORSSP or SETSSBSY/CLRSSBSY. Let the host use a pseudo-MSR that is just a wrapper for the GUEST_SSP field of the VMCS.
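For readers who want to experiment with the data-validity rules outside the kernel, the following is a minimal userspace sketch of the common write checks. It is not the kernel code: the function names, the boolean feature parameters and the vaddr_bits parameter are hypothetical stand-ins for guest_cpu_cap_has() and is_noncanonical_msr_address(), and the bit masks are spelled out by hand to match the masks used in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit groups of IA32_{U,S}_CET, following the masks in the patch. */
#define CET_SHSTK_BITS    0x3ULL                 /* bits 1:0 */
#define CET_IBT_BITS      (0x3cULL | ~0x3ffULL)  /* bits 5:2 and 63:10 */
#define CET_RESERVED_BITS 0x3c0ULL               /* bits 9:6 */
#define CET_SUPPRESS      (1ULL << 10)
#define CET_WAIT_ENDBR    (1ULL << 11)

/* Hypothetical model of the IA32_{U,S}_CET value check. */
bool cet_msr_value_ok(uint64_t data, bool has_shstk, bool has_ibt)
{
	if (data & CET_RESERVED_BITS)
		return false;
	if (!has_shstk && (data & CET_SHSTK_BITS))
		return false;
	if (!has_ibt && (data & CET_IBT_BITS))
		return false;
	/* Mirrors IS_ALIGNED(data >> 12, 4) from the patch as posted. */
	if ((data >> 12) & 0x3)
		return false;
	/* SUPPRESS and WAIT_ENDBR must not be set together. */
	if ((data & CET_SUPPRESS) && (data & CET_WAIT_ENDBR))
		return false;
	return true;
}

/*
 * Canonical if bits 63:(vaddr_bits - 1) are all zeros or all ones.
 * Assumes vaddr_bits is 48 or 57.
 */
bool is_canonical(uint64_t addr, unsigned int vaddr_bits)
{
	uint64_t upper = addr >> (vaddr_bits - 1);
	uint64_t ones = (1ULL << (65 - vaddr_bits)) - 1;

	return upper == 0 || upper == ones;
}

/* Hypothetical model of the check for the SSP MSRs and INT_SSP_TAB. */
bool ssp_msr_value_ok(uint64_t data, bool is_int_ssp_tab,
		      unsigned int vaddr_bits)
{
	if (!is_canonical(data, vaddr_bits))
		return false;
	/* Every SSP MSR except INT_SSP_TAB must be 4-byte aligned. */
	if (!is_int_ssp_tab && (data & 0x3))
		return false;
	return true;
}
```

In this model a write of CET_SUPPRESS | CET_WAIT_ENDBR is rejected even when both features are exposed, and an SSP value of 0x1002 is rejected for the PLx_SSP MSRs but accepted for INT_SSP_TAB.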
Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/vmx/vmx.c | 18 ++++++++++++++++++ arch/x86/kvm/x86.c | 43 ++++++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.h | 23 ++++++++++++++++++++++ 3 files changed, 84 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index aa157fe5b7b3..bd572c8c7bc3 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2093,6 +2093,15 @@ int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_da= ta *msr_info) else msr_info->data =3D vmx->pt_desc.guest.addr_a[index / 2]; break; + case MSR_IA32_S_CET: + msr_info->data =3D vmcs_readl(GUEST_S_CET); + break; + case MSR_KVM_INTERNAL_GUEST_SSP: + msr_info->data =3D vmcs_readl(GUEST_SSP); + break; + case MSR_IA32_INT_SSP_TAB: + msr_info->data =3D vmcs_readl(GUEST_INTR_SSP_TABLE); + break; case MSR_IA32_DEBUGCTLMSR: msr_info->data =3D vmx_guest_debugctl_read(); break; @@ -2411,6 +2420,15 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_da= ta *msr_info) else vmx->pt_desc.guest.addr_a[index / 2] =3D data; break; + case MSR_IA32_S_CET: + vmcs_writel(GUEST_S_CET, data); + break; + case MSR_KVM_INTERNAL_GUEST_SSP: + vmcs_writel(GUEST_SSP, data); + break; + case MSR_IA32_INT_SSP_TAB: + vmcs_writel(GUEST_INTR_SSP_TABLE, data); + break; case MSR_IA32_PERF_CAPABILITIES: if (data & PMU_CAP_LBR_FMT) { if ((data & PMU_CAP_LBR_FMT) !=3D diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index b5c4db4b7e04..cc39ace47262 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1885,6 +1885,27 @@ static int __kvm_set_msr(struct kvm_vcpu *vcpu, u32 = index, u64 data, =20 data =3D (u32)data; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) && + !guest_cpu_cap_has(vcpu, X86_FEATURE_IBT)) + return KVM_MSR_RET_UNSUPPORTED; + if (!is_cet_msr_valid(vcpu, data)) + return 1; + break; + case 
MSR_KVM_INTERNAL_GUEST_SSP: + if (!host_initiated) + return 1; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB: + if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) + return KVM_MSR_RET_UNSUPPORTED; + if (is_noncanonical_msr_address(data, vcpu)) + return 1; + /* All SSP MSRs except MSR_IA32_INT_SSP_TAB must be 4-byte aligned */ + if (index !=3D MSR_IA32_INT_SSP_TAB && !IS_ALIGNED(data, 4)) + return 1; + break; } =20 msr.data =3D data; @@ -1929,6 +1950,20 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 = index, u64 *data, !guest_cpu_cap_has(vcpu, X86_FEATURE_RDPID)) return 1; break; + case MSR_IA32_U_CET: + case MSR_IA32_S_CET: + if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) && + !guest_cpu_cap_has(vcpu, X86_FEATURE_IBT)) + return KVM_MSR_RET_UNSUPPORTED; + break; + case MSR_KVM_INTERNAL_GUEST_SSP: + if (!host_initiated) + return 1; + fallthrough; + case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB: + if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) + return KVM_MSR_RET_UNSUPPORTED; + break; } =20 msr.index =3D index; @@ -4207,6 +4242,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct= msr_data *msr_info) vcpu->arch.guest_fpu.xfd_err =3D data; break; #endif + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP: + kvm_set_xstate_msr(vcpu, msr_info); + break; default: if (kvm_pmu_is_valid_msr(vcpu, msr)) return kvm_pmu_set_msr(vcpu, msr_info); @@ -4556,6 +4595,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct= msr_data *msr_info) msr_info->data =3D vcpu->arch.guest_fpu.xfd_err; break; #endif + case MSR_IA32_U_CET: + case MSR_IA32_PL0_SSP ... 
MSR_IA32_PL3_SSP: + kvm_get_xstate_msr(vcpu, msr_info); + break; default: if (kvm_pmu_is_valid_msr(vcpu, msr_info->index)) return kvm_pmu_get_msr(vcpu, msr_info); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index f8fbd33db067..d5b039addd11 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -733,4 +733,27 @@ static inline void kvm_set_xstate_msr(struct kvm_vcpu = *vcpu, kvm_fpu_put(); } =20 +#define CET_US_RESERVED_BITS GENMASK(9, 6) +#define CET_US_SHSTK_MASK_BITS GENMASK(1, 0) +#define CET_US_IBT_MASK_BITS (GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10)) +#define CET_US_LEGACY_BITMAP_BASE(data) ((data) >> 12) + +static inline bool is_cet_msr_valid(struct kvm_vcpu *vcpu, u64 data) +{ + if (data & CET_US_RESERVED_BITS) + return false; + if (!guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) && + (data & CET_US_SHSTK_MASK_BITS)) + return false; + if (!guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) && + (data & CET_US_IBT_MASK_BITS)) + return false; + if (!IS_ALIGNED(CET_US_LEGACY_BITMAP_BASE(data), 4)) + return false; + /* IBT can be suppressed iff the TRACKER isn't WAIT_ENDBR. 
*/ + if ((data & CET_SUPPRESS) && (data & CET_WAIT_ENDBR)) + return false; + + return true; +} #endif --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2E21A2E7BB1; Tue, 12 Aug 2025 02:56:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967398; cv=none; b=lR8MrZlRFyfkyIc+muJIllS9+RYbj92AfdaVhyQCDRBOcR62XGCytQqJVOLcYsgeJhDKzD6gfP608DkAXGBoe60b1Muo1zc/sHfQa8wmrA5Iz9LskLnuuMag0FfFR0RsnI/hrjHw+ITrG3u4toDBvS0uesFUNjLmQl+GRoJNEM4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967398; c=relaxed/simple; bh=5WHPn4yw0uta1t4HIbKm9txODdqNNrH1SC+NG8IkwRk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YhUhKko3BV0Ra3k4kv/vkqF1HsR1Kg747ZTIHOsRjieD1v7odfhAU/0/MbahAVmu6YhHFLKZPmH1qsR6hxYH9HJVSUBNTqHTXUFZUI+NPhgYPglkI70i1RO7B+1zxIDxY1qwdx9ijNToiH1gWkftpCWV2PF2Vff4SW2PJK7gfW4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=PAOHQC2L; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="PAOHQC2L" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967397; x=1786503397; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=5WHPn4yw0uta1t4HIbKm9txODdqNNrH1SC+NG8IkwRk=; b=PAOHQC2LG+haFcgWvZQ25XYiSRzUajqfPV87S6jIfGItf5OMGmk6lh5b w6mB92s7FQZXyXWyKs5d1/S2gfGafIJu5JQcEwdBeaQKLaJA9tNN8TMe2 /rwb/BiM0vNUgU3MzphSblQuceEPw9w81HpX7SNYvGfg2fyrgWh1exj3W C5Byf7CrBv+2tU3tpgGeqN4MgKrsrW1BM7P3BkJqHzsKv9kV5L6yXA9/e 3FLakrPDer7SSJIizDfwINbzOobyBh2E+tiIXwJR9GsQEhh6nWPNfEHaZ gwbhcexRImHBnkf+OqtwYXUACmcWfNiK6pGGpMDa3FBND3rQ+e46NA/Su A==; X-CSE-ConnectionGUID: M2iImkelRKyAZqnpYO2TlQ== X-CSE-MsgGUID: cr7xWPdIQLSUTtQwwgSZgQ== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100586" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100586" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:35 -0700 X-CSE-ConnectionGUID: qDJbjBKRSHmKRijU9TyeDw== X-CSE-MsgGUID: gxFrywQ8Qda7fLs2AaAFFQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321332" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:35 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Mathias Krause , John Allen , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 16/24] KVM: x86: Save and reload SSP to/from SMRAM Date: Mon, 11 Aug 2025 19:55:24 -0700 Message-ID: <20250812025606.74625-17-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Save CET SSP to SMRAM on SMI and reload it on RSM. KVM emulates the architectural hardware behavior when the guest enters or leaves SMM, i.e., it saves registers to SMRAM on SMM entry and reloads them on SMM exit. Per the SDM, SSP is one of these registers on 64-bit architectures, so add support for saving and reloading SSP. Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/smm.c | 8 ++++++++ arch/x86/kvm/smm.h | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/smm.c b/arch/x86/kvm/smm.c index 5dd8a1646800..b0b14ba37f9a 100644 --- a/arch/x86/kvm/smm.c +++ b/arch/x86/kvm/smm.c @@ -269,6 +269,10 @@ static void enter_smm_save_state_64(struct kvm_vcpu *v= cpu, enter_smm_save_seg_64(vcpu, &smram->gs, VCPU_SREG_GS); =20 smram->int_shadow =3D kvm_x86_call(get_interrupt_shadow)(vcpu); + + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) && + kvm_msr_read(vcpu, MSR_KVM_INTERNAL_GUEST_SSP, &smram->ssp)) + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); } #endif =20 @@ -558,6 +562,10 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *= ctxt, kvm_x86_call(set_interrupt_shadow)(vcpu, 0); ctxt->interruptibility =3D (u8)smstate->int_shadow; =20 + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK) && + kvm_msr_write(vcpu, MSR_KVM_INTERNAL_GUEST_SSP, smstate->ssp)) + return X86EMUL_UNHANDLEABLE; + return X86EMUL_CONTINUE; }
#endif diff --git a/arch/x86/kvm/smm.h b/arch/x86/kvm/smm.h index 551703fbe200..db3c88f16138 100644 --- a/arch/x86/kvm/smm.h +++ b/arch/x86/kvm/smm.h @@ -116,8 +116,8 @@ struct kvm_smram_state_64 { u32 smbase; u32 reserved4[5]; =20 - /* ssp and svm_* fields below are not implemented by KVM */ u64 ssp; + /* svm_* fields below are not implemented by KVM */ u64 svm_guest_pat; u64 svm_host_efer; u64 svm_host_cr4; --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 93DD92E7BD2; Tue, 12 Aug 2025 02:56:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967399; cv=none; b=rjiKjS/QEH2JJc+T8ufW5tdzW13nBTg2tV4rWLwfM7QCMyaypMVVwOkW1X+/xrpuYm+2ofKbLqRpPuZd747u4UIX+6tLJ5eMoTHY81ImH+0FB2/SIJAdHTS6lnqa9Urg0fdxLWIHZgfD+7/5t57zgZxFth1LrKWyyEaqzgU+gTs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967399; c=relaxed/simple; bh=HufWc81H77CIL8AVtjRq6/E/Pufpr+TtO1E3be/dcII=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Q7mxnFwCPoKh4oO1M8IMv5VmD+N/FfT7+DHajEthfa0psRCvUAIky5zDj8nFqmmpTpEDZq/hzGnktFaurT4Dqo4dUUBE65ZwGw243uB0VWK6ptEOzoOXWAKPr/PQc9Eff8JkHqkbrd7irOnyCBf0cLyOi7QPw+RNxL9q9+6nxlg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=BdI2NoqB; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="BdI2NoqB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967397; x=1786503397; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HufWc81H77CIL8AVtjRq6/E/Pufpr+TtO1E3be/dcII=; b=BdI2NoqB5VSudflI+mNtxQKp39EGWSpbtmnfP6IZfcSCGIirmLtwZfwK U1oJLNIQmkpdMYdYa42161+iV8SrunkGzdPNv5IfvUqKbWk+BBnx1xopt c4dz0lr8iHP+/g33JvCPtmp2BIZ60FU6xajsoUPLq1KMI2VFJDVaE6k/P 7qra6r97gJ718umtegdtAkJ/Ru3aLXalHwFldssnTDc/hUyR+ecJCvK/4 3BT1LIsXnK9WoP1+bUHU+mgLZa4RwIwg1+PiynJ3X2KUheJox7X6gViex lq6KvPI5q8SZD4MfLBCGUWt6+jKq/P/3jYzI1qgxDA8qU1LNJBpdXmGa7 Q==; X-CSE-ConnectionGUID: i7KvvX1WSFmKH2iQu6WYWg== X-CSE-MsgGUID: OehvMwGyRLONXsfr0Wg0zA== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100596" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100596" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:36 -0700 X-CSE-ConnectionGUID: oLT/2wJYSlyYYHvFYBZ2zA== X-CSE-MsgGUID: 5aP0q0ybQBWwDpa4RSI9gg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321337" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:36 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Mathias Krause , John Allen , Chao Gao , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 17/24] KVM: VMX: Set up interception for CET MSRs Date: Mon, 11 Aug 2025 19:55:25 -0700 Message-ID: <20250812025606.74625-18-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Enable or disable interception of the CET MSRs according to the associated feature configuration. The Shadow Stack feature requires all CET MSRs to be passed through to the guest so that it works in both user and supervisor mode, while the IBT feature depends only on MSR_IA32_{U,S}_CET to enable user and supervisor IBT. Note that this MSR design introduces an architectural limitation on controlling SHSTK and IBT separately for a guest: when SHSTK is exposed, IBT is also available to the guest from an architectural perspective, because IBT relies on a subset of the SHSTK-relevant MSRs.
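The pass-through policy described above can be summarized in a small standalone sketch. The struct and function names are hypothetical and exist only for illustration. The sketch also assumes the host itself supports both features, which the real code checks separately with kvm_cpu_cap_has().

```c
#include <stdbool.h>

/* true means "intercept", false means "pass through to the guest". */
struct cet_intercepts {
	bool ssp_msrs;	/* IA32_PL0_SSP..IA32_PL3_SSP and IA32_INT_SSP_TAB */
	bool cet_msrs;	/* IA32_U_CET and IA32_S_CET */
};

/* Hypothetical helper: derive the intercept state from guest features. */
struct cet_intercepts cet_intercept_policy(bool guest_shstk, bool guest_ibt)
{
	struct cet_intercepts p;

	/* The SSP MSRs are only meaningful with shadow stacks. */
	p.ssp_msrs = !guest_shstk;
	/* U_CET/S_CET are shared, so either feature passes them through. */
	p.cet_msrs = !guest_shstk && !guest_ibt;
	return p;
}
```

With only IBT exposed, the SSP MSRs stay intercepted while U_CET and S_CET are passed through. With only SHSTK exposed, everything is passed through, which is the limitation the note above refers to.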
Suggested-by: Sean Christopherson Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/vmx/vmx.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index bd572c8c7bc3..130ffbe7dc1a 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -4088,6 +4088,8 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcp= u) =20 void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu) { + bool set; + if (!cpu_has_vmx_msr_bitmap()) return; =20 @@ -4133,6 +4135,24 @@ void vmx_recalc_msr_intercepts(struct kvm_vcpu *vcpu) vmx_set_intercept_for_msr(vcpu, MSR_IA32_FLUSH_CMD, MSR_TYPE_W, !guest_cpu_cap_has(vcpu, X86_FEATURE_FLUSH_L1D)); =20 + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) { + set =3D !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW, set); + } + + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK) || kvm_cpu_cap_has(X86_FEATURE_IBT= )) { + set =3D !guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) && + !guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK); + + vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET, MSR_TYPE_RW, set); + } + /* * x2APIC and LBR MSR intercepts are modified on-demand and cannot be * filtered by userspace. 
--=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0A2A2E8892; Tue, 12 Aug 2025 02:56:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967400; cv=none; b=hpNU0h79pO/QgaIgRuzzkCeAbIIFh5lqpuTxLF52iqDF5/hqumCwovQzDq/m0XgE3I54Qs0WaJW1YyDxZ8SDMeN1WiER8gVcNeT26i192zmi/wLx3+kTGVN71Ghs8r4hSdKwMTsVwzw+XbuuJK1GSDgxF2U1YnMTMDYuq8jQ15E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967400; c=relaxed/simple; bh=3Y66VREBi/y59CbbNcrB0Qn9rmB5pHGIdRJdQs94hio=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EUuMDhZic3m0gn9/ltbu7OWG6P4eIvFd1yNTesptP6FFSCt/D3FIbp/HjgsjWcvqntL/BpxZgPKB6x4A9rBXZZVAfmCg3Gyz4CE0znMnQpzd2jKBdZAET0cZLaW1lLgn6SSkcSx1lYgaaY+KHWzAAhzffsa7Rc7WHsEvsO1AgS4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=b/UgHn+n; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="b/UgHn+n" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967398; x=1786503398; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3Y66VREBi/y59CbbNcrB0Qn9rmB5pHGIdRJdQs94hio=; 
b=b/UgHn+nh1U816OsYwmcHGbDlhbSq5++ZQjSPe9NrWO9J3v4QmpCy/dC n8QhmUKcRjbKDeAYbTcwtBORshvLUYLuU4NiCvK9Ktpc/qVfqgfxUy44L D/v2EYz1cghPkB92Oo5RCMqngrqn2ABsfK0onxBGQ5aSe0KP+vQueyz9y Ad7I3SJ8J6U40DYU56vK28TqgnMd+X0H6pyEXRILKxyINTaFKz2aiG4nW LlukhECvBiU5GcBRvCNBW3fVmgXgZbXpKKrOZBGpwyn6UOSpoSFyny8O8 xgMp2E7/MdaWqCjMMgprLSgYXHIeeYpxKZyQOo+AytUj2JWlEgfE+ecyK w==; X-CSE-ConnectionGUID: irF6PIxgQ6me/o9lI76D5w== X-CSE-MsgGUID: i/IWXe+aRiCB852MGyN8+Q== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100608" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100608" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:37 -0700 X-CSE-ConnectionGUID: x86NpfnZR+a5rUYq1cM3yQ== X-CSE-MsgGUID: fmI2FYDTQmavoeu+FXALgw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321348" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:37 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Sean Christopherson , Chao Gao , Mathias Krause , John Allen , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin" Subject: [PATCH v12 18/24] KVM: VMX: Set host constant supervisor states to VMCS fields Date: Mon, 11 Aug 2025 19:55:26 -0700 Message-ID: <20250812025606.74625-19-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Explicitly write constant values to the HOST_{S_CET,SSP,INTR_SSP_TABLE} VMCS fields. Kernel IBT is supported, and the setting in MSR_IA32_S_CET is static after boot (the exception is the BIOS call case, but a vCPU thread never goes through it), so KVM doesn't need to refresh the HOST_S_CET field before every VM-Enter/VM-Exit sequence. Host supervisor shadow stack is not enabled now and SSP is not accessible in kernel mode, so it is safe to set the host IA32_INT_SSP_TAB and SSP VMCS fields to 0. When shadow stack is enabled for CPL3, SSP is reloaded from PL3_SSP before the CPU returns to userspace. See SDM Vol 2A/B, Chapters 3 and 4, for SYSCALL/SYSRET/SYSENTER/SYSEXIT/RDSSP/CALL etc. Refuse to load the KVM module if the host has supervisor shadow stack enabled, i.e. SHSTK_EN is set in MSR_IA32_S_CET, because KVM cannot coexist with it correctly.
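As a rough illustration of the two decisions above, the sketch below combines the load-time check and the constant host values in one hypothetical userspace function. In the patch these happen in different places (module init and per-vCPU VMCS setup). The struct and function names are made up for this example.

```c
#include <errno.h>
#include <stdint.h>

#define CET_SHSTK_EN (1ULL << 0)

/* Hypothetical stand-in for the three host-state VMCS fields. */
struct host_cet_state {
	uint64_t s_cet;
	uint64_t ssp;
	uint64_t intr_ssp_table;
};

/*
 * Returns -EIO if the host runs with supervisor shadow stack enabled,
 * otherwise fills in the constant host values and returns 0.
 */
int init_host_cet_state(uint64_t host_s_cet, struct host_cet_state *out)
{
	if (host_s_cet & CET_SHSTK_EN)
		return -EIO;

	out->s_cet = host_s_cet;	/* static after boot, e.g. kernel IBT */
	out->ssp = 0;			/* unused without supervisor SHSTK */
	out->intr_ssp_table = 0;
	return 0;
}
```

A host value with only ENDBR_EN set (kernel IBT) is accepted and copied as-is, while any value with SHSTK_EN set is refused.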
Suggested-by: Sean Christopherson Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/vmx/capabilities.h | 4 ++++ arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++ arch/x86/kvm/x86.c | 12 ++++++++++++ arch/x86/kvm/x86.h | 1 + 4 files changed, 32 insertions(+) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilitie= s.h index 5316c27f6099..7d290b2cb0f4 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -103,6 +103,10 @@ static inline bool cpu_has_load_perf_global_ctrl(void) return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL; } =20 +static inline bool cpu_has_load_cet_ctrl(void) +{ + return (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_CET_STATE); +} static inline bool cpu_has_vmx_mpx(void) { return vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_BNDCFGS; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 130ffbe7dc1a..ba44223405cd 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -4308,6 +4308,21 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vm= x) =20 if (cpu_has_load_ia32_efer()) vmcs_write64(HOST_IA32_EFER, kvm_host.efer); + + /* + * Supervisor shadow stack is not enabled on host side, i.e., + * host IA32_S_CET.SHSTK_EN bit is guaranteed to 0 now, per SDM + * description(RDSSP instruction), SSP is not readable in CPL0, + * so resetting the two registers to 0s at VM-Exit does no harm + * to kernel execution. When execution flow exits to userspace, + * SSP is reloaded from IA32_PL3_SSP. Check SDM Vol.2A/B Chapter + * 3 and 4 for details. 
+ */ + if (cpu_has_load_cet_ctrl()) { + vmcs_writel(HOST_S_CET, kvm_host.s_cet); + vmcs_writel(HOST_SSP, 0); + vmcs_writel(HOST_INTR_SSP_TABLE, 0); + } } =20 void set_cr4_guest_host_mask(struct vcpu_vmx *vmx) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index cc39ace47262..91e78c506105 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9845,6 +9845,18 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) return -EIO; } =20 + if (boot_cpu_has(X86_FEATURE_SHSTK)) { + rdmsrl(MSR_IA32_S_CET, kvm_host.s_cet); + /* + * Linux doesn't yet support supervisor shadow stacks (SSS), so + * KVM doesn't save/restore the associated MSRs, i.e. KVM may + * clobber the host values. Yell and refuse to load if SSS is + * unexpectedly enabled, e.g. to avoid crashing the host. + */ + if (WARN_ON_ONCE(kvm_host.s_cet & CET_SHSTK_EN)) + return -EIO; + } + memset(&kvm_caps, 0, sizeof(kvm_caps)); =20 x86_emulator_cache =3D kvm_alloc_emulator_cache(); diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index d5b039addd11..d612ddcae247 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -50,6 +50,7 @@ struct kvm_host_values { u64 efer; u64 xcr0; u64 xss; + u64 s_cet; u64 arch_capabilities; }; =20 --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF49D2E88A1; Tue, 12 Aug 2025 02:56:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967400; cv=none; b=VhBI44M13e/w0X/VY367IO4hhE0jqz0PHyhu/Q9eUeyBf8HTm5CwxsI2dZTVQWIqWRoNFXqXTZzM0d2bp5QTps21Y0749KwQqdeTbjA2jApLQGb71oCxXGZsE01/TJ4hndEG3xaZZJHjyBm6UGqUUxNKv5Kh3R0Q7X7+i+ipAww= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967400; 
c=relaxed/simple; bh=dXQQ6a4EW2NVwV3ot4UhfBim9c1Mpdn5wm1COxgLrU0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DxRk1Nj5FLH8wkAPdC4ACAT3u/JGmIh+oF59yHGuz7N8m1OZdr8AEMUs+fy0iFhM9sF7LNXg2r1XP/k7pNve3ELQvA5vNS82KPOk3N6LFHD6Y0FR1NnWz/iMWA8S1Z51aJYdaNoxLSQ4mw09de88od8STRAAV9pOMmSonBoF/bw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=k/fpH+Qi; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="k/fpH+Qi" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967399; x=1786503399; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dXQQ6a4EW2NVwV3ot4UhfBim9c1Mpdn5wm1COxgLrU0=; b=k/fpH+QiGL7jtZ9xxSrBZZoRYf1zVPlFKO4S+8ng0NmJpxoNL3xNkt7f cOiwWaQe0G/ChYMZBz6FgUhntATelOP6k/nQi4oZzGq0njknIm87IEtPI +GzR/c+V2T3+u0kcmWBnGGIRiHHkeGK4ynmOlP5FFJDDrk8wOU2q0nej9 7ML9IPCZbfNf7kK9hwoMh5NT6VA2RQTE3qvxDnHRvMgswkNSR/9DArFtl dndseUmXaeI+bv70675Nh41TRvQHRnGu9OYbuMIaLi85jlom88vrgXRhX rWOOkrZ4B44mkJu6FGs7EYk0sUUWogR2tDqIxD/oy4K0Tzf6WjeMZL11I A==; X-CSE-ConnectionGUID: pKWcvkDvQua7Gk4TOjLU6w== X-CSE-MsgGUID: eExxvCgARz27Q+NmyRI4Fg== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100620" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100620" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:37 -0700 X-CSE-ConnectionGUID: 8M7JzkofTFO43FrUak2t3g== X-CSE-MsgGUID: sVLhmJzKTnyRE2+a4R7a8A== 
X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321354" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:37 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Chao Gao , Mathias Krause , John Allen , Sean Christopherson , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin" Subject: [PATCH v12 19/24] KVM: x86: Don't emulate instructions guarded by CET Date: Mon, 11 Aug 2025 19:55:27 -0700 Message-ID: <20250812025606.74625-20-chao.gao@intel.com> X-Mailer: git-send-email 2.47.1 In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com> References: <20250812025606.74625-1-chao.gao@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Yang Weijiang Don't emulate branch instructions, e.g. CALL/RET/JMP, when CET is active in the guest; instead, return KVM_INTERNAL_ERROR_EMULATION to userspace and let it handle the failure. KVM doesn't emulate the CET checks that the CPU performs on such instructions, so it stops emulation as soon as it detects that the instruction being decoded is CET-protected. This avoids injecting a bogus #CP into the guest and prevents the guest from using the emulator to subvert CET-protected control flow.
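The decision described above can be modeled with a short standalone sketch. The function name and the OP_* flag values are hypothetical (the patch uses opcode flag bits named ShadowStack and IndirBrnTrk). The CET_SHSTK_EN and CET_ENDBR_EN bit positions follow the IA32_{U,S}_CET layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define CET_SHSTK_EN (1ULL << 0)
#define CET_ENDBR_EN (1ULL << 2)

/* Hypothetical per-opcode flags, standing in for ShadowStack/IndirBrnTrk. */
#define OP_SHADOW_STACK  (1u << 0)
#define OP_INDIR_BRN_TRK (1u << 1)

/*
 * Returns true if emulation must stop because the instruction is
 * protected by a CET feature that is enabled in user or supervisor mode.
 */
bool must_stop_emulation(bool cr4_cet, uint64_t u_cet, uint64_t s_cet,
			 unsigned int op_flags)
{
	bool shstk_on, ibt_on;

	/* CET has no effect unless CR4.CET is set. */
	if (!cr4_cet)
		return false;

	shstk_on = ((u_cet | s_cet) & CET_SHSTK_EN) != 0;
	ibt_on = ((u_cet | s_cet) & CET_ENDBR_EN) != 0;

	return (shstk_on && (op_flags & OP_SHADOW_STACK)) ||
	       (ibt_on && (op_flags & OP_INDIR_BRN_TRK));
}
```

For example, a near indirect JMP carries only the IBT flag, so in this model it is still emulated when the guest has enabled shadow stacks but not IBT.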
Suggested-by: Chao Gao Signed-off-by: Yang Weijiang Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/emulate.c | 46 ++++++++++++++++++++++++++++++++---------- 1 file changed, 35 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 1349e278cd2a..80b9d1e4a50a 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -178,6 +178,8 @@ #define IncSP ((u64)1 << 54) /* SP is incremented before ModRM calc= */ #define TwoMemOp ((u64)1 << 55) /* Instruction has two memory operand = */ #define IsBranch ((u64)1 << 56) /* Instruction is considered a branch.= */ +#define ShadowStack ((u64)1 << 57) /* Instruction protected by Shadow Sta= ck. */ +#define IndirBrnTrk ((u64)1 << 58) /* Instruction protected by IBT. */ =20 #define DstXacc (DstAccLo | SrcAccHi | SrcWrite) =20 @@ -4068,9 +4070,11 @@ static const struct opcode group4[] =3D { static const struct opcode group5[] =3D { F(DstMem | SrcNone | Lock, em_inc), F(DstMem | SrcNone | Lock, em_dec), - I(SrcMem | NearBranch | IsBranch, em_call_near_abs), - I(SrcMemFAddr | ImplicitOps | IsBranch, em_call_far), - I(SrcMem | NearBranch | IsBranch, em_jmp_abs), + I(SrcMem | NearBranch | IsBranch | ShadowStack | IndirBrnTrk, + em_call_near_abs), + I(SrcMemFAddr | ImplicitOps | IsBranch | ShadowStack | IndirBrnTrk, + em_call_far), + I(SrcMem | NearBranch | IsBranch | IndirBrnTrk, em_jmp_abs), I(SrcMemFAddr | ImplicitOps | IsBranch, em_jmp_far), I(SrcMem | Stack | TwoMemOp, em_push), D(Undefined), }; @@ -4332,11 +4336,11 @@ static const struct opcode opcode_table[256] =3D { /* 0xC8 - 0xCF */ I(Stack | SrcImmU16 | Src2ImmByte | IsBranch, em_enter), I(Stack | IsBranch, em_leave), - I(ImplicitOps | SrcImmU16 | IsBranch, em_ret_far_imm), - I(ImplicitOps | IsBranch, em_ret_far), - D(ImplicitOps | IsBranch), DI(SrcImmByte | IsBranch, intn), + I(ImplicitOps | SrcImmU16 | IsBranch | ShadowStack, em_ret_far_imm), + I(ImplicitOps | 
IsBranch | ShadowStack, em_ret_far), + D(ImplicitOps | IsBranch), DI(SrcImmByte | IsBranch | ShadowStack, intn), D(ImplicitOps | No64 | IsBranch), - II(ImplicitOps | IsBranch, em_iret, iret), + II(ImplicitOps | IsBranch | ShadowStack, em_iret, iret), /* 0xD0 - 0xD7 */ G(Src2One | ByteOp, group2), G(Src2One, group2), G(Src2CL | ByteOp, group2), G(Src2CL, group2), @@ -4352,7 +4356,7 @@ static const struct opcode opcode_table[256] =3D { I2bvIP(SrcImmUByte | DstAcc, em_in, in, check_perm_in), I2bvIP(SrcAcc | DstImmUByte, em_out, out, check_perm_out), /* 0xE8 - 0xEF */ - I(SrcImm | NearBranch | IsBranch, em_call), + I(SrcImm | NearBranch | IsBranch | ShadowStack, em_call), D(SrcImm | ImplicitOps | NearBranch | IsBranch), I(SrcImmFAddr | No64 | IsBranch, em_jmp_far), D(SrcImmByte | ImplicitOps | NearBranch | IsBranch), @@ -4371,7 +4375,8 @@ static const struct opcode opcode_table[256] =3D { static const struct opcode twobyte_table[256] =3D { /* 0x00 - 0x0F */ G(0, group6), GD(0, &group7), N, N, - N, I(ImplicitOps | EmulateOnUD | IsBranch, em_syscall), + N, I(ImplicitOps | EmulateOnUD | IsBranch | ShadowStack | IndirBrnTrk, + em_syscall), II(ImplicitOps | Priv, em_clts, clts), N, DI(ImplicitOps | Priv, invd), DI(ImplicitOps | Priv, wbinvd), N, N, N, D(ImplicitOps | ModRM | SrcMem | NoAccess), N, N, @@ -4402,8 +4407,9 @@ static const struct opcode twobyte_table[256] =3D { IIP(ImplicitOps, em_rdtsc, rdtsc, check_rdtsc), II(ImplicitOps | Priv, em_rdmsr, rdmsr), IIP(ImplicitOps, em_rdpmc, rdpmc, check_rdpmc), - I(ImplicitOps | EmulateOnUD | IsBranch, em_sysenter), - I(ImplicitOps | Priv | EmulateOnUD | IsBranch, em_sysexit), + I(ImplicitOps | EmulateOnUD | IsBranch | ShadowStack | IndirBrnTrk, + em_sysenter), + I(ImplicitOps | Priv | EmulateOnUD | IsBranch | ShadowStack, em_sysexit), N, N, N, N, N, N, N, N, N, N, /* 0x40 - 0x4F */ @@ -4941,6 +4947,24 @@ int x86_decode_insn(struct x86_emulate_ctxt *ctxt, v= oid *insn, int insn_len, int if (ctxt->d =3D=3D 0) return 
EMULATION_FAILED; =20 + if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_CET) { + u64 u_cet, s_cet; + bool stop_em; + + if (ctxt->ops->get_msr(ctxt, MSR_IA32_U_CET, &u_cet) || + ctxt->ops->get_msr(ctxt, MSR_IA32_S_CET, &s_cet)) + return EMULATION_FAILED; + + stop_em =3D ((u_cet & CET_SHSTK_EN) || (s_cet & CET_SHSTK_EN)) && + (opcode.flags & ShadowStack); + + stop_em |=3D ((u_cet & CET_ENDBR_EN) || (s_cet & CET_ENDBR_EN)) && + (opcode.flags & IndirBrnTrk); + + if (stop_em) + return EMULATION_FAILED; + } + ctxt->execute =3D opcode.u.execute; =20 if (unlikely(emulation_type & EMULTYPE_TRAP_UD) && --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AACF42E8E05; Tue, 12 Aug 2025 02:56:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967402; cv=none; b=vGBtf6wDlo9XkPKsQYtu9IF1CYTQOpNlxRkQliqxUMtMYIbKMsb5yq4rqZ4RbxIh/QLMk8gsVKaX1OQW7yODBg9VW5xKPxgF6A8Uitrp+BiuyrHeJC64napxOPWktbNv6lhPmXQzR04giMXvmBF7+AsMnMvc6babScqB2o6o6Ek= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967402; c=relaxed/simple; bh=RgdBQnv5nEZ4Ya93rAMDIcyFwaVa60Mts0RgzRWW0Xg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HtVbMX+m/Wp+5kzMb184SDU1Aj/5qzSnw40naWw5bodq1rf6vn1twmCj+zdjqCLZJKldsVkIjDdQztxiXctqPz+pFbYA/tUQL9n/StrAGdXXizc7VPRlBa9GU3jIHQYVFDQ06FuKHGeDJWzMOq6ad1kjiQV9u2FRj1Extv+BlQ4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=RFZTok0D; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="RFZTok0D" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967399; x=1786503399; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=RgdBQnv5nEZ4Ya93rAMDIcyFwaVa60Mts0RgzRWW0Xg=; b=RFZTok0D3msALxpzektaWtLVg2ITjKOtxDoZSejVF0gNrfD9PPgPIcnu PlZMDH8g1CSOUwrjG12F6QrvWKIKYwegS6XrXCHKVGQd43+TURpRs4Uc0 lvXHOXylgP08WSybn+9VSq1jyYEviSPB+7vaiWdvDpLbjKXO4OKs64peG +wh468nAgczFaiDzmyZTnDBNsEPxqqIRnfNVYPaX+kLunjzXt2kdRDoyR +A+GnJuLhNw6ieSAdyNl8ETGWrZr/c/XLRK9uizB8uBOPuA320kO6XIwD a+eX8tumevjh73IM9vaLpYHiuU85k4znZVUtPZ4dsNrskEuvUPEOuTP14 Q==; X-CSE-ConnectionGUID: C2ocCJBiTTqrDAc2HObA2Q== X-CSE-MsgGUID: 7dN9soMMSLmTFuVG8WFJrA== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100631" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100631" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:39 -0700 X-CSE-ConnectionGUID: JwCVGIGwQFudV5QIkLk+zw== X-CSE-MsgGUID: fTHO4pipSgWGOqbaJNeQgg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321359" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:39 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Mathias Krause , John Allen , Chao Gao , Sean Christopherson , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin"
Subject: [PATCH v12 20/24] KVM: x86: Enable CET virtualization for VMX and advertise to userspace
Date: Mon, 11 Aug 2025 19:55:28 -0700
Message-ID: <20250812025606.74625-21-chao.gao@intel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Yang Weijiang

Expose CET features to the guest if both KVM and the host support them, and
clear the corresponding CPUID feature bits if they do not.

Set the CPUID feature bits so that CET features are available in guest CPUID.
Add CR4.CET support so that the guest can set the CET master control bit.

Disable the KVM CET feature if unrestricted_guest is unsupported or disabled,
as KVM does not support emulating CET.

The CET load bits in the VM_ENTRY/VM_EXIT control fields should be set so that
guest CET xstate is isolated from the host's.

On platforms with VMX_BASIC[bit56] == 0, injecting #CP at VM entry with an
error code fails; if VMX_BASIC[bit56] == 1, #CP injection with or without an
error code is allowed. Disable the CET feature bits if that MSR bit is clear,
so that a nested VMM can inject #CP if and only if VMX_BASIC[bit56] == 1.

Don't expose CET if either of the {U,S}_CET xstate bits is cleared in the host
XSS or if XSAVES isn't supported.

CET MSRs are reset to 0 after RESET, power-up and INIT, so clear the guest CET
xsave-area fields to ensure the guest CET MSRs are reset to 0 after those
events.

Meanwhile, explicitly disable SHSTK and IBT for SVM because KVM's CET enabling
for SVM is not ready.
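
As a rough, standalone sketch of the gating described above (not kernel code:
struct cet_host_caps and cet_can_be_advertised() are hypothetical names, host
CPU support for SHSTK/IBT is assumed, and SHSTK and IBT are folded into a
single decision for brevity), the conditions under which CET stays advertised
could be modeled as:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* VMX_BASIC bit 56 and the two CET supervisor-state bits in IA32_XSS. */
#define VMX_BASIC_NO_HW_ERROR_CODE_CC	(1ULL << 56)
#define XFEATURE_MASK_CET_USER		(1ULL << 11)
#define XFEATURE_MASK_CET_KERNEL	(1ULL << 12)

/* Hypothetical snapshot of the inputs the commit message refers to. */
struct cet_host_caps {
	uint64_t vmx_basic;		/* IA32_VMX_BASIC value */
	uint64_t supported_xss;		/* XSS bits KVM is able to virtualize */
	bool has_load_cet_ctrl;		/* VM-entry/VM-exit "load CET state" */
	bool unrestricted_guest;	/* unrestricted guest is enabled */
	bool has_xsaves;		/* XSAVES is usable */
	bool is_svm;			/* running on SVM rather than VMX */
};

/* Returns true if CET would remain advertised to userspace in this model. */
static bool cet_can_be_advertised(const struct cet_host_caps *c)
{
	const uint64_t cet_xss = XFEATURE_MASK_CET_USER |
				 XFEATURE_MASK_CET_KERNEL;

	/* CET enabling for SVM is not ready. */
	if (c->is_svm)
		return false;

	/* No emulator support for CET, and guest state must be isolated. */
	if (!c->has_load_cet_ctrl || !c->unrestricted_guest)
		return false;

	/* #CP with an error code can only be injected if bit 56 is set. */
	if (!(c->vmx_basic & VMX_BASIC_NO_HW_ERROR_CODE_CC))
		return false;

	/* Both CET xstate components must be available via XSAVES. */
	if (!c->has_xsaves)
		return false;

	return (c->supported_xss & cet_xss) == cet_xss;
}
```

Losing any single prerequisite (for example, only one of the two XSS bits
being available) turns the whole feature off in this model, which matches the
all-or-nothing handling of the two xstate bits in the message.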
Signed-off-by: Yang Weijiang Signed-off-by: Mathias Krause Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/vmx.h | 1 + arch/x86/kvm/cpuid.c | 2 ++ arch/x86/kvm/svm/svm.c | 4 ++++ arch/x86/kvm/vmx/capabilities.h | 5 +++++ arch/x86/kvm/vmx/vmx.c | 30 +++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 6 ++++-- arch/x86/kvm/x86.c | 22 +++++++++++++++++++--- arch/x86/kvm/x86.h | 3 +++ 9 files changed, 68 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_hos= t.h index 1faf53df6259..9a667d7efe83 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -142,7 +142,7 @@ | X86_CR4_OSXSAVE | X86_CR4_SMEP | X86_CR4_FSGSBASE \ | X86_CR4_OSXMMEXCPT | X86_CR4_LA57 | X86_CR4_VMXE \ | X86_CR4_SMAP | X86_CR4_PKE | X86_CR4_UMIP \ - | X86_CR4_LAM_SUP)) + | X86_CR4_LAM_SUP | X86_CR4_CET)) =20 #define CR8_RESERVED_BITS (~(unsigned long)X86_CR8_TPR) =20 diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index ce10a7e2d3d9..c85c50019523 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -134,6 +134,7 @@ #define VMX_BASIC_DUAL_MONITOR_TREATMENT BIT_ULL(49) #define VMX_BASIC_INOUT BIT_ULL(54) #define VMX_BASIC_TRUE_CTLS BIT_ULL(55) +#define VMX_BASIC_NO_HW_ERROR_CODE_CC BIT_ULL(56) =20 static inline u32 vmx_basic_vmcs_revision_id(u64 vmx_basic) { diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index 85079caaf507..2515b2623fb1 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -944,6 +944,7 @@ void kvm_set_cpu_caps(void) VENDOR_F(WAITPKG), F(SGX_LC), F(BUS_LOCK_DETECT), + F(SHSTK), ); =20 /* @@ -970,6 +971,7 @@ void kvm_set_cpu_caps(void) F(AMX_INT8), F(AMX_BF16), F(FLUSH_L1D), + F(IBT), ); =20 if (boot_cpu_has(X86_FEATURE_AMD_IBPB_RET) && diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 
d9931c6c4bc6..fb36774b8046 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -5231,6 +5231,10 @@ static __init void svm_set_cpu_caps(void) kvm_caps.supported_perf_cap =3D 0; kvm_caps.supported_xss =3D 0; =20 + /* KVM doesn't yet support CET virtualization for SVM. */ + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + /* CPUID 0x80000001 and 0x8000000A (SVM features) */ if (nested) { kvm_cpu_cap_set(X86_FEATURE_SVM); diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilitie= s.h index 7d290b2cb0f4..47b0dec8665a 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -76,6 +76,11 @@ static inline bool cpu_has_vmx_basic_inout(void) return vmcs_config.basic & VMX_BASIC_INOUT; } =20 +static inline bool cpu_has_vmx_basic_no_hw_errcode(void) +{ + return vmcs_config.basic & VMX_BASIC_NO_HW_ERROR_CODE_CC; +} + static inline bool cpu_has_virtual_nmis(void) { return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS && diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index ba44223405cd..ae8be898e1df 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2602,6 +2602,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs= _conf, { VM_ENTRY_LOAD_IA32_EFER, VM_EXIT_LOAD_IA32_EFER }, { VM_ENTRY_LOAD_BNDCFGS, VM_EXIT_CLEAR_BNDCFGS }, { VM_ENTRY_LOAD_IA32_RTIT_CTL, VM_EXIT_CLEAR_IA32_RTIT_CTL }, + { VM_ENTRY_LOAD_CET_STATE, VM_EXIT_LOAD_CET_STATE }, }; =20 memset(vmcs_conf, 0, sizeof(*vmcs_conf)); @@ -4870,6 +4871,14 @@ void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init= _event) =20 vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0); /* 22.2.1 */ =20 + if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) { + vmcs_writel(GUEST_SSP, 0); + vmcs_writel(GUEST_INTR_SSP_TABLE, 0); + } + if (kvm_cpu_cap_has(X86_FEATURE_IBT) || + kvm_cpu_cap_has(X86_FEATURE_SHSTK)) + vmcs_writel(GUEST_S_CET, 0); + kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu); =20 vpid_sync_context(vmx->vpid); 
@@ -6318,6 +6327,10 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (vmcs_read32(VM_EXIT_MSR_STORE_COUNT) > 0) vmx_dump_msrs("guest autostore", &vmx->msr_autostore.guest); =20 + if (vmentry_ctl & VM_ENTRY_LOAD_CET_STATE) + pr_err("S_CET =3D 0x%016lx, SSP =3D 0x%016lx, SSP TABLE =3D 0x%016lx\n", + vmcs_readl(GUEST_S_CET), vmcs_readl(GUEST_SSP), + vmcs_readl(GUEST_INTR_SSP_TABLE)); pr_err("*** Host State ***\n"); pr_err("RIP =3D 0x%016lx RSP =3D 0x%016lx\n", vmcs_readl(HOST_RIP), vmcs_readl(HOST_RSP)); @@ -6348,6 +6361,10 @@ void dump_vmcs(struct kvm_vcpu *vcpu) vmcs_read64(HOST_IA32_PERF_GLOBAL_CTRL)); if (vmcs_read32(VM_EXIT_MSR_LOAD_COUNT) > 0) vmx_dump_msrs("host autoload", &vmx->msr_autoload.host); + if (vmexit_ctl & VM_EXIT_LOAD_CET_STATE) + pr_err("S_CET =3D 0x%016lx, SSP =3D 0x%016lx, SSP TABLE =3D 0x%016lx\n", + vmcs_readl(HOST_S_CET), vmcs_readl(HOST_SSP), + vmcs_readl(HOST_INTR_SSP_TABLE)); =20 pr_err("*** Control State ***\n"); pr_err("CPUBased=3D0x%08x SecondaryExec=3D0x%08x TertiaryExec=3D0x%016llx= \n", @@ -7925,7 +7942,6 @@ static __init void vmx_set_cpu_caps(void) kvm_cpu_cap_set(X86_FEATURE_UMIP); =20 /* CPUID 0xD.1 */ - kvm_caps.supported_xss =3D 0; if (!cpu_has_vmx_xsaves()) kvm_cpu_cap_clear(X86_FEATURE_XSAVES); =20 @@ -7937,6 +7953,18 @@ static __init void vmx_set_cpu_caps(void) =20 if (cpu_has_vmx_waitpkg()) kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG); + + /* + * Disable CET if unrestricted_guest is unsupported as KVM doesn't + * enforce CET HW behaviors in emulator. On platforms with + * VMX_BASIC[bit56] =3D=3D 0, inject #CP at VMX entry with error code + * fails, so disable CET in this case too. 
+ */ + if (!cpu_has_load_cet_ctrl() || !enable_unrestricted_guest || + !cpu_has_vmx_basic_no_hw_errcode()) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + } } =20 static bool vmx_is_io_intercepted(struct kvm_vcpu *vcpu, diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index d3389baf3ab3..89200586b35a 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -484,7 +484,8 @@ static inline u8 vmx_get_rvi(void) VM_ENTRY_LOAD_IA32_EFER | \ VM_ENTRY_LOAD_BNDCFGS | \ VM_ENTRY_PT_CONCEAL_PIP | \ - VM_ENTRY_LOAD_IA32_RTIT_CTL) + VM_ENTRY_LOAD_IA32_RTIT_CTL | \ + VM_ENTRY_LOAD_CET_STATE) =20 #define __KVM_REQUIRED_VMX_VM_EXIT_CONTROLS \ (VM_EXIT_SAVE_DEBUG_CONTROLS | \ @@ -506,7 +507,8 @@ static inline u8 vmx_get_rvi(void) VM_EXIT_LOAD_IA32_EFER | \ VM_EXIT_CLEAR_BNDCFGS | \ VM_EXIT_PT_CONCEAL_PIP | \ - VM_EXIT_CLEAR_IA32_RTIT_CTL) + VM_EXIT_CLEAR_IA32_RTIT_CTL | \ + VM_EXIT_LOAD_CET_STATE) =20 #define KVM_REQUIRED_VMX_PIN_BASED_VM_EXEC_CONTROL \ (PIN_BASED_EXT_INTR_MASK | \ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 91e78c506105..081d676fd259 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -223,7 +223,8 @@ static struct kvm_user_return_msrs __percpu *user_retur= n_msrs; | XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \ | XFEATURE_MASK_PKRU | XFEATURE_MASK_XTILE) =20 -#define KVM_SUPPORTED_XSS 0 +#define KVM_SUPPORTED_XSS (XFEATURE_MASK_CET_USER | \ + XFEATURE_MASK_CET_KERNEL) =20 bool __read_mostly allow_smaller_maxphyaddr =3D 0; EXPORT_SYMBOL_GPL(allow_smaller_maxphyaddr); @@ -9943,6 +9944,20 @@ int kvm_x86_vendor_init(struct kvm_x86_init_ops *ops) if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES)) kvm_caps.supported_xss =3D 0; =20 + if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) && + !kvm_cpu_cap_has(X86_FEATURE_IBT)) + kvm_caps.supported_xss &=3D ~(XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); + + if ((kvm_caps.supported_xss & (XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL)) !=3D + 
(XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL)) { + kvm_cpu_cap_clear(X86_FEATURE_SHSTK); + kvm_cpu_cap_clear(X86_FEATURE_IBT); + kvm_caps.supported_xss &=3D ~(XFEATURE_MASK_CET_USER | + XFEATURE_MASK_CET_KERNEL); + } + if (kvm_caps.has_tsc_control) { /* * Make sure the user can only configure tsc_khz values that @@ -12601,10 +12616,11 @@ static void kvm_xstate_reset(struct kvm_vcpu *vcp= u, bool init_event) /* * On INIT, only select XSTATE components are zeroed, most components * are unchanged. Currently, the only components that are zeroed and - * supported by KVM are MPX related. + * supported by KVM are MPX and CET related. */ xfeatures_mask =3D (kvm_caps.supported_xcr0 | kvm_caps.supported_xss) & - (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR); + (XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR | + XFEATURE_MASK_CET_USER | XFEATURE_MASK_CET_KERNEL); if (!xfeatures_mask) return; =20 diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index d612ddcae247..c2053b753980 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -679,6 +679,9 @@ static inline bool __kvm_is_valid_cr4(struct kvm_vcpu *= vcpu, unsigned long cr4) __reserved_bits |=3D X86_CR4_PCIDE; \ if (!__cpu_has(__c, X86_FEATURE_LAM)) \ __reserved_bits |=3D X86_CR4_LAM_SUP; \ + if (!__cpu_has(__c, X86_FEATURE_SHSTK) && \ + !__cpu_has(__c, X86_FEATURE_IBT)) \ + __reserved_bits |=3D X86_CR4_CET; \ __reserved_bits; \ }) =20 --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8EDDE2E92C7; Tue, 12 Aug 2025 02:56:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967402; cv=none; 
b=emlItQ0UKzgJELVRN9aeidgWxbQQmEkUDYy/JRCPO+3VkdMi6/36ZeYK8HcqHgTzncMPh6FVxrfAEZZrgu0D+tVskBURR1IEUubRWvAU6ERGmRUHakPReszC1OWzjPqu9smfUvaRPsZEpZ+TQI79WqfPiZPMGE7ohkKisC+OigI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967402; c=relaxed/simple; bh=w3Q4yyxdp8dsJ+JpwOlJec8eWmhQVLyb4ShiT1attu4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DgUMb/fqLDkHg58dY1loQfVJiP+EJ5fndvrC7I2BQoeKajjOLeYj9cEPTZhyXjUypa2EtHa4yG5DzAGP0lNjkydd7RrhG/IPr7VmhnZGRwF8AyX/F5IYvqpUCSXUxLz0DdOcmT91hjbqQnti8RZ1CvO7dGuRvgByq4bRj2JshFs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=P6uuxJe5; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="P6uuxJe5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967400; x=1786503400; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=w3Q4yyxdp8dsJ+JpwOlJec8eWmhQVLyb4ShiT1attu4=; b=P6uuxJe5NLgRU3KTAK2g7v6//rhcEGd1dcZW7At9QG5QJ64+JNp/S0TE Wr9fGFSRBaCflaiC76aRsWHK8Wz7701e+rHNJuQm0Z/gHrNKGXxoabSpf emLh0AxXwJ99QQuptmwcm2Xcu5XDFxg12UBjweIq5zFbYYUPkmh52x6ND 8DWnq5me7BiOuWBBdT30QhPsHWrgKHSnQJF4IrvubYg1KcQP8eTvwt86O PUGvmqI1JZRqcXVEu9EmpTF8Hjhfa7QjOJ7gmoEdxISvuQThVp7wIXrA0 qTmSvzNm/Vw7QvvjV9X/IXYYuD0XZuPGzCVfyS480nRrXQ1B3M93h6MdV Q==; X-CSE-ConnectionGUID: sQzD5SsTQQy1RWLLOs8g6g== X-CSE-MsgGUID: N9d43MPuQS+jdLb6EzH6nQ== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100640" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; 
d="scan'208";a="57100640"
Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:40 -0700
X-CSE-ConnectionGUID: ntP5WHjEQ0C+n4bKPkE5pA==
X-CSE-MsgGUID: 5BQt3YmVRDiuxFSmxzYHvw==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321364"
Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:40 -0700
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Chao Gao , Mathias Krause , John Allen , Sean Christopherson , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH v12 21/24] KVM: nVMX: Virtualize NO_HW_ERROR_CODE_CC for L1 event injection to L2
Date: Mon, 11 Aug 2025 19:55:29 -0700
Message-ID: <20250812025606.74625-22-chao.gao@intel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Yang Weijiang

Per the SDM (Vol. 3D, Appendix A.1):

  "If bit 56 is read as 1, software can use VM entry to deliver a hardware
  exception with or without an error code, regardless of vector"

Modify the has_error_code check performed before injecting events into a
nested guest. Always reject an error code when the guest is in real mode or
the event is not a hardware exception. For hardware exceptions in protected
mode, enforce the vector-based consistency check only if the vCPU does not
enumerate bit 56 in VMX_BASIC; in all other cases skip the check, to keep the
logic consistent with the SDM.
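
The resulting decision can be sketched as a small standalone function. This
is only an illustration of the rule described above, not the kernel code:
errcode_check_ok() and its boolean parameters are hypothetical, and whether a
given vector architecturally carries an error code is passed in rather than
computed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if the "deliver error code" bit of the VM-entry
 * interruption-information field is acceptable.
 *
 * prot_mode             - guest is in protected mode
 * hard_exception        - event type is "hardware exception"
 * has_error_code        - "deliver error code" bit requested by L1
 * vector_has_error_code - the vector architecturally pushes an error code
 * no_hw_errcode_cc      - vCPU enumerates VMX_BASIC bit 56
 */
static bool errcode_check_ok(bool prot_mode, bool hard_exception,
			     bool has_error_code, bool vector_has_error_code,
			     bool no_hw_errcode_cc)
{
	/* Real mode or non-hardware-exception: never deliver an error code. */
	if (!prot_mode || !hard_exception)
		return !has_error_code;

	/* Without bit 56, the request must match the vector's behavior. */
	if (!no_hw_errcode_cc)
		return has_error_code == vector_has_error_code;

	/* With bit 56, either choice is allowed regardless of vector. */
	return true;
}
```

With bit 56 enumerated, a hardware exception whose vector normally carries an
error code may be injected without one, and vice versa; the real-mode and
event-type restrictions still apply in every case.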
Signed-off-by: Yang Weijiang Reviewed-by: Maxim Levitsky Reviewed-by: Chao Gao Tested-by: Mathias Krause Tested-by: John Allen Signed-off-by: Chao Gao Tested-by: Rick Edgecombe --- arch/x86/kvm/vmx/nested.c | 28 +++++++++++++++++++--------- arch/x86/kvm/vmx/nested.h | 5 +++++ 2 files changed, 24 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 7dc2e1c09ea6..618cc6c6425c 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -1272,9 +1272,10 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vm= x, u64 data) { const u64 feature_bits =3D VMX_BASIC_DUAL_MONITOR_TREATMENT | VMX_BASIC_INOUT | - VMX_BASIC_TRUE_CTLS; + VMX_BASIC_TRUE_CTLS | + VMX_BASIC_NO_HW_ERROR_CODE_CC; =20 - const u64 reserved_bits =3D GENMASK_ULL(63, 56) | + const u64 reserved_bits =3D GENMASK_ULL(63, 57) | GENMASK_ULL(47, 45) | BIT_ULL(31); =20 @@ -2949,7 +2950,6 @@ static int nested_check_vm_entry_controls(struct kvm_= vcpu *vcpu, u8 vector =3D intr_info & INTR_INFO_VECTOR_MASK; u32 intr_type =3D intr_info & INTR_INFO_INTR_TYPE_MASK; bool has_error_code =3D intr_info & INTR_INFO_DELIVER_CODE_MASK; - bool should_have_error_code; bool urg =3D nested_cpu_has2(vmcs12, SECONDARY_EXEC_UNRESTRICTED_GUEST); bool prot_mode =3D !urg || vmcs12->guest_cr0 & X86_CR0_PE; @@ -2966,12 +2966,20 @@ static int nested_check_vm_entry_controls(struct kv= m_vcpu *vcpu, CC(intr_type =3D=3D INTR_TYPE_OTHER_EVENT && vector !=3D 0)) return -EINVAL; =20 - /* VM-entry interruption-info field: deliver error code */ - should_have_error_code =3D - intr_type =3D=3D INTR_TYPE_HARD_EXCEPTION && prot_mode && - x86_exception_has_error_code(vector); - if (CC(has_error_code !=3D should_have_error_code)) - return -EINVAL; + /* + * Cannot deliver error code in real mode or if the interrupt + * type is not hardware exception. For other cases, do the + * consistency check only if the vCPU doesn't enumerate + * VMX_BASIC_NO_HW_ERROR_CODE_CC. 
+ */ + if (!prot_mode || intr_type !=3D INTR_TYPE_HARD_EXCEPTION) { + if (CC(has_error_code)) + return -EINVAL; + } else if (!nested_cpu_has_no_hw_errcode_cc(vcpu)) { + if (CC(has_error_code !=3D + x86_exception_has_error_code(vector))) + return -EINVAL; + } =20 /* VM-entry exception error code */ if (CC(has_error_code && @@ -7205,6 +7213,8 @@ static void nested_vmx_setup_basic(struct nested_vmx_= msrs *msrs) msrs->basic |=3D VMX_BASIC_TRUE_CTLS; if (cpu_has_vmx_basic_inout()) msrs->basic |=3D VMX_BASIC_INOUT; + if (cpu_has_vmx_basic_no_hw_errcode()) + msrs->basic |=3D VMX_BASIC_NO_HW_ERROR_CODE_CC; } =20 static void nested_vmx_setup_cr_fixed(struct nested_vmx_msrs *msrs) diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h index 6eedcfc91070..983484d42ebf 100644 --- a/arch/x86/kvm/vmx/nested.h +++ b/arch/x86/kvm/vmx/nested.h @@ -309,6 +309,11 @@ static inline bool nested_cr4_valid(struct kvm_vcpu *v= cpu, unsigned long val) __kvm_is_valid_cr4(vcpu, val); } =20 +static inline bool nested_cpu_has_no_hw_errcode_cc(struct kvm_vcpu *vcpu) +{ + return to_vmx(vcpu)->nested.msrs.basic & VMX_BASIC_NO_HW_ERROR_CODE_CC; +} + /* No difference in the restrictions on guest and host CR4 in VMX operatio= n. 
*/ #define nested_guest_cr4_valid nested_cr4_valid #define nested_host_cr4_valid nested_cr4_valid --=20 2.47.1 From nobody Sat Oct 4 22:35:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B5C852E9EA3; Tue, 12 Aug 2025 02:56:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.19 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967403; cv=none; b=A00O6H3mQrtoK02suFuy2QOAp/3OKPHRC4sQsCYt32PAlcgdyX7Z1ngTBvkDjNq3eocBJZF9N7sQz29g2yBSgOglptZMIQTkN121ViNU1aUSeMlNP75gOfmEHQc2+ve12deaLKzr5N1AERPasQDeNK0NPRq0oL9AoOcKDRI5sig= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754967403; c=relaxed/simple; bh=HpNMVy/JD/QbfDg5M+Dl2AXw1+lzyY5V2vaaZ5hfoGM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=m/ZPt0skDsWFTtx/3ztLyDtVmw9fBy61nAHn8YIn8BaqbttCeXN61fK98rPUkf65pAtcbwAjkHrqhMCweyZlkB1VyEFPZtkEV7OocxCYKk7DuFW2qkb0+6jcSMOoPKo/1mjIaO9+qdf3agXNdg/TCv5Mb7rBpbHjh1QNcrhqcxI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=iqsmVDUt; arc=none smtp.client-ip=198.175.65.19 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="iqsmVDUt" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1754967401; x=1786503401; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=HpNMVy/JD/QbfDg5M+Dl2AXw1+lzyY5V2vaaZ5hfoGM=; b=iqsmVDUtoHaQHSQIDNmzYxu5pJBBAaaaQXY4emlMzcargvJWpCOdbnxa vPW+/zlGnG4q6oRavMtQEUt/4rw/pWEV+zLzWhXCOM3yaOTNDR+jcYB22 1xpdh5ejzkoXrQGeGuDhwLnqFYm8jaSplENjLl0O7Uax+P/Qkmf8A6xtG 6oj0fbXHpz37tv6O0LlAYEcmJ4j95Busi6uptuLUwrWx19In+f3p5XPqT yUJiWdA9iCZm7qOe6bz/9jGOOlnAvhN6RUHo/hTGt8vwNAQ7rAmKFssUl lBqIM9O+hQChdikSXHSxkZYBbp2Trjjr7VoAzC5wfSfYV/d4EC8103h+L g==; X-CSE-ConnectionGUID: yYOCEJtaTRSaqggm0lz/Rg== X-CSE-MsgGUID: wYH2nRsLSUCJpgoBnv+0WQ== X-IronPort-AV: E=McAfee;i="6800,10657,11518"; a="57100650" X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="57100650" Received: from fmviesa004.fm.intel.com ([10.60.135.144]) by orvoesa111.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:41 -0700 X-CSE-ConnectionGUID: Iz33e7F2TluOWpN6VNxPoQ== X-CSE-MsgGUID: IAyRF1E3She7p0bGIJA9yA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.17,284,1747724400"; d="scan'208";a="171321369" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by fmviesa004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Aug 2025 19:56:41 -0700 From: Chao Gao To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Mathias Krause , John Allen , Chao Gao , Sean Christopherson , Paolo Bonzini , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org, "H. 
Peter Anvin"
Subject: [PATCH v12 22/24] KVM: nVMX: Enable CET support for nested guest
Date: Mon, 11 Aug 2025 19:55:30 -0700
Message-ID: <20250812025606.74625-23-chao.gao@intel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Yang Weijiang

Set up the CET MSRs, the related VM_ENTRY/EXIT control bits and the fixed CR4
setting to enable CET for nested VMs. vmcs12 and vmcs02 need to be synced
when L2 exits to L1 or when L1 wants to resume L2, so that L1 and L2 each
observe the correct CET state.

Signed-off-by: Yang Weijiang
Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
v12: use a consistent order when referring to s_cet, ssp, and ssp_table
throughout this patch.
-- Xin --- arch/x86/kvm/vmx/nested.c | 80 ++++++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmcs12.c | 6 +++ arch/x86/kvm/vmx/vmcs12.h | 14 ++++++- arch/x86/kvm/vmx/vmx.c | 2 + arch/x86/kvm/vmx/vmx.h | 3 ++ 5 files changed, 102 insertions(+), 3 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 618cc6c6425c..f20f205c6560 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -721,6 +721,27 @@ static inline bool nested_vmx_prepare_msr_bitmap(struc= t kvm_vcpu *vcpu, nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, MSR_IA32_MPERF, MSR_TYPE_R); =20 + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_U_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_S_CET, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL0_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL1_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL2_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_PL3_SSP, MSR_TYPE_RW); + + nested_vmx_set_intercept_for_msr(vmx, msr_bitmap_l1, msr_bitmap_l0, + MSR_IA32_INT_SSP_TAB, MSR_TYPE_RW); + kvm_vcpu_unmap(vcpu, &map); =20 vmx->nested.force_msr_bitmap_recalc =3D false; @@ -2521,6 +2542,32 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vm= x, struct loaded_vmcs *vmcs0 } } =20 +static inline void cet_vmcs_fields_get(struct kvm_vcpu *vcpu, u64 *s_cet, + u64 *ssp, u64 *ssp_tbl) +{ + if (guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) || + guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) + *s_cet =3D vmcs_readl(GUEST_S_CET); + + if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) { + *ssp =3D vmcs_readl(GUEST_SSP); + *ssp_tbl =3D vmcs_readl(GUEST_INTR_SSP_TABLE); + } +} + +static inline void cet_vmcs_fields_set(struct 
kvm_vcpu *vcpu, u64 s_cet,
+				u64 ssp, u64 ssp_tbl)
+{
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_IBT) ||
+	    guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK))
+		vmcs_writel(GUEST_S_CET, s_cet);
+
+	if (guest_cpu_cap_has(vcpu, X86_FEATURE_SHSTK)) {
+		vmcs_writel(GUEST_SSP, ssp);
+		vmcs_writel(GUEST_INTR_SSP_TABLE, ssp_tbl);
+	}
+}
+
 static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 {
 	struct hv_enlightened_vmcs *hv_evmcs = nested_vmx_evmcs(vmx);
@@ -2637,6 +2684,10 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
 	vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
 	vmcs_write32(VM_ENTRY_MSR_LOAD_COUNT, vmx->msr_autoload.guest.nr);
 
+	if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE)
+		cet_vmcs_fields_set(&vmx->vcpu, vmcs12->guest_s_cet,
+				    vmcs12->guest_ssp, vmcs12->guest_ssp_tbl);
+
 	set_cr4_guest_host_mask(vmx);
 }
 
@@ -2676,6 +2727,13 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 		kvm_set_dr(vcpu, 7, vcpu->arch.dr7);
 		vmx_guest_debugctl_write(vcpu, vmx->nested.pre_vmenter_debugctl);
 	}
+
+	if (!vmx->nested.nested_run_pending ||
+	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE))
+		cet_vmcs_fields_set(vcpu, vmx->nested.pre_vmenter_s_cet,
+				    vmx->nested.pre_vmenter_ssp,
+				    vmx->nested.pre_vmenter_ssp_tbl);
+
 	if (kvm_mpx_supported() && (!vmx->nested.nested_run_pending ||
 	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
 		vmcs_write64(GUEST_BNDCFGS, vmx->nested.pre_vmenter_bndcfgs);
@@ -3552,6 +3610,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS)))
 		vmx->nested.pre_vmenter_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
 
+	if (!vmx->nested.nested_run_pending ||
+	    !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE))
+		cet_vmcs_fields_get(vcpu, &vmx->nested.pre_vmenter_s_cet,
+				    &vmx->nested.pre_vmenter_ssp,
+				    &vmx->nested.pre_vmenter_ssp_tbl);
+
 	/*
 	 * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and*
 	 * nested early checks are disabled. In the event of a "late" VM-Fail,
@@ -4479,6 +4543,9 @@ static bool is_vmcs12_ext_field(unsigned long field)
 	case GUEST_IDTR_BASE:
 	case GUEST_PENDING_DBG_EXCEPTIONS:
 	case GUEST_BNDCFGS:
+	case GUEST_S_CET:
+	case GUEST_SSP:
+	case GUEST_INTR_SSP_TABLE:
 		return true;
 	default:
 		break;
@@ -4529,6 +4596,10 @@ static void sync_vmcs02_to_vmcs12_rare(struct kvm_vcpu *vcpu,
 	vmcs12->guest_pending_dbg_exceptions =
 		vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS);
 
+	cet_vmcs_fields_get(&vmx->vcpu, &vmcs12->guest_s_cet,
+			    &vmcs12->guest_ssp,
+			    &vmcs12->guest_ssp_tbl);
+
 	vmx->nested.need_sync_vmcs02_to_vmcs12_rare = false;
 }
 
@@ -4760,6 +4831,10 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 	if (vmcs12->vm_exit_controls & VM_EXIT_CLEAR_BNDCFGS)
 		vmcs_write64(GUEST_BNDCFGS, 0);
 
+	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_CET_STATE)
+		cet_vmcs_fields_set(vcpu, vmcs12->host_s_cet, vmcs12->host_ssp,
+				    vmcs12->host_ssp_tbl);
+
 	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_IA32_PAT) {
 		vmcs_write64(GUEST_IA32_PAT, vmcs12->host_ia32_pat);
 		vcpu->arch.pat = vmcs12->host_ia32_pat;
@@ -7037,7 +7112,7 @@ static void nested_vmx_setup_exit_ctls(struct vmcs_config *vmcs_conf,
 		VM_EXIT_HOST_ADDR_SPACE_SIZE |
 #endif
 		VM_EXIT_LOAD_IA32_PAT | VM_EXIT_SAVE_IA32_PAT |
-		VM_EXIT_CLEAR_BNDCFGS;
+		VM_EXIT_CLEAR_BNDCFGS | VM_EXIT_LOAD_CET_STATE;
 	msrs->exit_ctls_high |=
 		VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
 		VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
@@ -7059,7 +7134,8 @@ static void nested_vmx_setup_entry_ctls(struct vmcs_config *vmcs_conf,
 #ifdef CONFIG_X86_64
 		VM_ENTRY_IA32E_MODE |
 #endif
-		VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS;
+		VM_ENTRY_LOAD_IA32_PAT | VM_ENTRY_LOAD_BNDCFGS |
+		VM_ENTRY_LOAD_CET_STATE;
 	msrs->entry_ctls_high |=
 		(VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR |
 		 VM_ENTRY_LOAD_IA32_EFER | VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL);
diff --git a/arch/x86/kvm/vmx/vmcs12.c b/arch/x86/kvm/vmx/vmcs12.c
index 106a72c923ca..4233b5ca9461 100644
--- a/arch/x86/kvm/vmx/vmcs12.c
+++ b/arch/x86/kvm/vmx/vmcs12.c
@@ -139,6 +139,9 @@ const unsigned short vmcs12_field_offsets[] = {
 	FIELD(GUEST_PENDING_DBG_EXCEPTIONS, guest_pending_dbg_exceptions),
 	FIELD(GUEST_SYSENTER_ESP, guest_sysenter_esp),
 	FIELD(GUEST_SYSENTER_EIP, guest_sysenter_eip),
+	FIELD(GUEST_S_CET, guest_s_cet),
+	FIELD(GUEST_SSP, guest_ssp),
+	FIELD(GUEST_INTR_SSP_TABLE, guest_ssp_tbl),
 	FIELD(HOST_CR0, host_cr0),
 	FIELD(HOST_CR3, host_cr3),
 	FIELD(HOST_CR4, host_cr4),
@@ -151,5 +154,8 @@ const unsigned short vmcs12_field_offsets[] = {
 	FIELD(HOST_IA32_SYSENTER_EIP, host_ia32_sysenter_eip),
 	FIELD(HOST_RSP, host_rsp),
 	FIELD(HOST_RIP, host_rip),
+	FIELD(HOST_S_CET, host_s_cet),
+	FIELD(HOST_SSP, host_ssp),
+	FIELD(HOST_INTR_SSP_TABLE, host_ssp_tbl),
 };
 const unsigned int nr_vmcs12_fields = ARRAY_SIZE(vmcs12_field_offsets);
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index 56fd150a6f24..4ad6b16525b9 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -117,7 +117,13 @@ struct __packed vmcs12 {
 	natural_width host_ia32_sysenter_eip;
 	natural_width host_rsp;
 	natural_width host_rip;
-	natural_width paddingl[8]; /* room for future expansion */
+	natural_width host_s_cet;
+	natural_width host_ssp;
+	natural_width host_ssp_tbl;
+	natural_width guest_s_cet;
+	natural_width guest_ssp;
+	natural_width guest_ssp_tbl;
+	natural_width paddingl[2]; /* room for future expansion */
 	u32 pin_based_vm_exec_control;
 	u32 cpu_based_vm_exec_control;
 	u32 exception_bitmap;
@@ -294,6 +300,12 @@ static inline void vmx_check_vmcs12_offsets(void)
 	CHECK_OFFSET(host_ia32_sysenter_eip, 656);
 	CHECK_OFFSET(host_rsp, 664);
 	CHECK_OFFSET(host_rip, 672);
+	CHECK_OFFSET(host_s_cet, 680);
+	CHECK_OFFSET(host_ssp, 688);
+	CHECK_OFFSET(host_ssp_tbl, 696);
+	CHECK_OFFSET(guest_s_cet, 704);
+	CHECK_OFFSET(guest_ssp, 712);
+	CHECK_OFFSET(guest_ssp_tbl, 720);
 	CHECK_OFFSET(pin_based_vm_exec_control, 744);
 	CHECK_OFFSET(cpu_based_vm_exec_control, 748);
 	CHECK_OFFSET(exception_bitmap, 752);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ae8be898e1df..3ab0fe6e47c9 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7711,6 +7711,8 @@ static void nested_vmx_cr_fixed1_bits_update(struct kvm_vcpu *vcpu)
 	cr4_fixed1_update(X86_CR4_PKE, ecx, feature_bit(PKU));
 	cr4_fixed1_update(X86_CR4_UMIP, ecx, feature_bit(UMIP));
 	cr4_fixed1_update(X86_CR4_LA57, ecx, feature_bit(LA57));
+	cr4_fixed1_update(X86_CR4_CET, ecx, feature_bit(SHSTK));
+	cr4_fixed1_update(X86_CR4_CET, edx, feature_bit(IBT));
 
 	entry = kvm_find_cpuid_entry_index(vcpu, 0x7, 1);
 	cr4_fixed1_update(X86_CR4_LAM_SUP, eax, feature_bit(LAM));
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 89200586b35a..3bf7748ddfc2 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -181,6 +181,9 @@ struct nested_vmx {
 	 */
 	u64 pre_vmenter_debugctl;
 	u64 pre_vmenter_bndcfgs;
+	u64 pre_vmenter_s_cet;
+	u64 pre_vmenter_ssp;
+	u64 pre_vmenter_ssp_tbl;
 
 	/* to migrate it to L1 if L2 writes to L1's CR8 directly */
 	int l1_tpr_threshold;
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Chao Gao, Mathias Krause, John Allen, Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH v12 23/24] KVM: nVMX: Add consistency checks for CR0.WP and CR4.CET
Date: Mon, 11 Aug 2025 19:55:31 -0700
Message-ID: <20250812025606.74625-24-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

Add consistency checks for CR4.CET and CR0.WP in the guest-state and
host-state areas of VMCS12. This ensures that a configuration with
CR4.CET set and CR0.WP clear results in VM-entry failure, matching
architectural behavior.

Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/kvm/vmx/nested.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index f20f205c6560..47e413e56764 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3114,6 +3114,9 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
 	    CC(!kvm_vcpu_is_legal_cr3(vcpu, vmcs12->host_cr3)))
 		return -EINVAL;
 
+	if (CC(vmcs12->host_cr4 & X86_CR4_CET && !(vmcs12->host_cr0 & X86_CR0_WP)))
+		return -EINVAL;
+
 	if (CC(is_noncanonical_msr_address(vmcs12->host_ia32_sysenter_esp, vcpu)) ||
 	    CC(is_noncanonical_msr_address(vmcs12->host_ia32_sysenter_eip, vcpu)))
 		return -EINVAL;
@@ -3228,6 +3231,9 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
 	    CC(!nested_guest_cr4_valid(vcpu, vmcs12->guest_cr4)))
 		return -EINVAL;
 
+	if (CC(vmcs12->guest_cr4 & X86_CR4_CET && !(vmcs12->guest_cr0 & X86_CR0_WP)))
+		return -EINVAL;
+
 	if ((vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS) &&
 	    (CC(!kvm_dr7_valid(vmcs12->guest_dr7)) ||
 	     CC(!vmx_is_valid_debugctl(vcpu, vmcs12->guest_ia32_debugctl, false))))
-- 
2.47.1

From nobody Sat Oct 4 22:35:30 2025
From: Chao Gao
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: mlevitsk@redhat.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, xin@zytor.com, Chao Gao, Mathias Krause, John Allen, Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H. Peter Anvin"
Subject: [PATCH v12 24/24] KVM: nVMX: Add consistency checks for CET states
Date: Mon, 11 Aug 2025 19:55:32 -0700
Message-ID: <20250812025606.74625-25-chao.gao@intel.com>
In-Reply-To: <20250812025606.74625-1-chao.gao@intel.com>
References: <20250812025606.74625-1-chao.gao@intel.com>

Introduce consistency checks for CET states during nested VM-entry.

A VMCS contains both guest and host CET states, each comprising the
IA32_S_CET MSR, SSP, and IA32_INTERRUPT_SSP_TABLE_ADDR MSR. Various
checks are applied to CET states during VM-entry as documented in SDM
Vol3 Chapter "VM ENTRIES". Implement all these checks during nested
VM-entry to emulate the architectural behavior.

In summary, there are three kinds of checks on guest/host CET states
during VM-entry:

A. Checks applied to both guest states and host states:

 * The IA32_S_CET field must not set any reserved bits; bits 10 (SUPPRESS)
   and 11 (TRACKER) cannot both be set.
 * SSP must not have bits 1:0 set.
 * The IA32_INTERRUPT_SSP_TABLE_ADDR field must be canonical.

B. Checks applied to host states only:

 * IA32_S_CET MSR and SSP must be canonical if the CPU enters 64-bit mode
   after VM-exit.
   Otherwise, IA32_S_CET and SSP must have their upper 32 bits cleared.

C. Checks applied to guest states only:

 * IA32_S_CET MSR and SSP are not required to be canonical (i.e., to have
   bits 63:N-1 identical, where N is the CPU's maximum linear-address
   width). However, bits 63:N of SSP must be identical.

Tested-by: Mathias Krause
Tested-by: John Allen
Signed-off-by: Chao Gao
Tested-by: Rick Edgecombe
---
 arch/x86/kvm/vmx/nested.c | 47 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 47e413e56764..7c88fedc27c7 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3104,6 +3104,17 @@ static bool is_l1_noncanonical_address_on_vmexit(u64 la, struct vmcs12 *vmcs12)
 	return !__is_canonical_address(la, l1_address_bits_on_exit);
 }
 
+static bool is_valid_cet_state(struct kvm_vcpu *vcpu, u64 s_cet, u64 ssp, u64 ssp_tbl)
+{
+	if (!is_cet_msr_valid(vcpu, s_cet) || !IS_ALIGNED(ssp, 4))
+		return false;
+
+	if (is_noncanonical_msr_address(ssp_tbl, vcpu))
+		return false;
+
+	return true;
+}
+
 static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
 				       struct vmcs12 *vmcs12)
 {
@@ -3173,6 +3184,26 @@ static int nested_vmx_check_host_state(struct kvm_vcpu *vcpu,
 			return -EINVAL;
 	}
 
+	if (vmcs12->vm_exit_controls & VM_EXIT_LOAD_CET_STATE) {
+		if (CC(!is_valid_cet_state(vcpu, vmcs12->host_s_cet, vmcs12->host_ssp,
+					   vmcs12->host_ssp_tbl)))
+			return -EINVAL;
+
+		/*
+		 * IA32_S_CET and SSP must be canonical if the host will
+		 * enter 64-bit mode after VM-exit; otherwise, higher
+		 * 32-bits must be all 0s.
+		 */
+		if (ia32e) {
+			if (CC(is_noncanonical_msr_address(vmcs12->host_s_cet, vcpu)) ||
+			    CC(is_noncanonical_msr_address(vmcs12->host_ssp, vcpu)))
+				return -EINVAL;
+		} else {
+			if (CC(vmcs12->host_s_cet >> 32) || CC(vmcs12->host_ssp >> 32))
+				return -EINVAL;
+		}
+	}
+
 	return 0;
 }
 
@@ -3283,6 +3314,22 @@ static int nested_vmx_check_guest_state(struct kvm_vcpu *vcpu,
 	     CC((vmcs12->guest_bndcfgs & MSR_IA32_BNDCFGS_RSVD))))
 		return -EINVAL;
 
+	if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_CET_STATE) {
+		if (CC(!is_valid_cet_state(vcpu, vmcs12->guest_s_cet, vmcs12->guest_ssp,
+					   vmcs12->guest_ssp_tbl)))
+			return -EINVAL;
+
+		/*
+		 * Guest SSP must have 63:N bits identical, rather than
+		 * be canonical (i.e., 63:N-1 bits identical), where N is
+		 * the CPU's maximum linear-address width. Similar to
+		 * is_noncanonical_msr_address(), use the host's
+		 * linear-address width.
+		 */
+		if (CC(!__is_canonical_address(vmcs12->guest_ssp, max_host_virt_addr_bits() + 1)))
+			return -EINVAL;
+	}
+
 	if (nested_check_guest_non_reg_state(vmcs12))
 		return -EINVAL;
 
-- 
2.47.1