From nobody Wed Dec 17 08:54:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F047E25B691; Mon, 12 May 2025 08:57:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747040268; cv=none; b=B9VvuZX+XtRzG8Y5MUKvcU/KOie1rcr/3kUorVpzlqb25x8eu3iZtyi/q33y4hh2vQr9U1WTivhVlOsmd/8pUEX2GSY8ztPSGvqDKUqHtksGWWRKBNEjOeL4/JV5JCoRor/RubAUbr6JH9nr0A3YTws8WXJ/gG8SCa2kRUbO3H8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747040268; c=relaxed/simple; bh=+hNZSzs+UU0rkjlXcA84b2Bq54FhNUxCpNk6I3td9qg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rvSPOl7c+wzGJ2AI/QJR4NsVd1tqiRes2bxnXeD4dPuQXA6Ww6+JtwwV9XbT3EichOJVbv8nuT8rDo6S+nFKIKy6QtPJk5/e/Mg6QYu3nd4Ob6p4YWEVaeZPQ+SRsSydVogYADQl6Oa+zVpsXOCPzXQgEAve/K/NHIqt31Gj+EQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=dyqNmqy8; arc=none smtp.client-ip=192.198.163.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="dyqNmqy8" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1747040265; x=1778576265; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+hNZSzs+UU0rkjlXcA84b2Bq54FhNUxCpNk6I3td9qg=; 
b=dyqNmqy8GaJtLdGswb7BcpF0xj2EIyhgk1VbsuFiAxemcf9XhwiFi+tD KWnelDuywME4RAmGhCEd5FZrHkeyoRmn/KKEhKXeLQnM08j+Q6GgrfYwR weCSa90Y+wTb1FYJ5/y3UutaL2QA8p8jcigu4h+91VkWOaVFvx1cX9+yQ lCyPtkfN2HZPtwFSMXChlIBYuKDwFRDZoJdM5ybMzJsyjj0O1Y2P5Jk2t pY5XENSMQJmCOE9wB8+BnVhboZnQ+XePRiO1qeyXMrwfOHYWYXG5kUVb2 PzPYACIpLNwGzyPG3OUeZ3lQsD8eSJ9Xa+Ig9i1JKVV/8sXP5+r90Yob5 Q==; X-CSE-ConnectionGUID: dLcqbt58SW2jcvyKMWFFgw== X-CSE-MsgGUID: 6PkiyQKwRoumycdJSwSTqw== X-IronPort-AV: E=McAfee;i="6700,10204,11430"; a="59488692" X-IronPort-AV: E=Sophos;i="6.15,281,1739865600"; d="scan'208";a="59488692" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 May 2025 01:57:44 -0700 X-CSE-ConnectionGUID: eDU5l9WKQp2Mq9YEp1dZiw== X-CSE-MsgGUID: /0EaOvfmQACyWE7Up/w+1w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,281,1739865600"; d="scan'208";a="138235776" Received: from 984fee019967.jf.intel.com ([10.165.54.94]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 May 2025 01:57:44 -0700 From: Chao Gao To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao , Ingo Molnar , Dave Hansen , "H. 
Peter Anvin", Samuel Holland, Mitchell Levy, Kees Cook, Stanislav Spassov, Eric Biggers, Nikolay Borisov, Oleg Nesterov, Vignesh Balasubramanian
Subject: [PATCH v7 1/6] x86/fpu/xstate: Differentiate default features for host and guest FPUs
Date: Mon, 12 May 2025 01:57:04 -0700
Message-ID: <20250512085735.564475-2-chao.gao@intel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250512085735.564475-1-chao.gao@intel.com>
References: <20250512085735.564475-1-chao.gao@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Currently, guest and host FPUs share the same default features. However,
the CET supervisor xstate is the first feature that needs to be enabled
exclusively for guest FPUs. Enabling it for host FPUs leads to a waste of
24 bytes in the XSAVE buffer.

To support "guest-only" features, add a new structure to hold the
default features and sizes for guest FPUs to clearly differentiate them
from those for host FPUs.

Note that,
1) for now, the default features for guest and host FPUs remain the
same. This will change in a follow-up patch once guest permissions,
default xfeatures, and fpstate size are all converted to use the guest
defaults.

2) only supervisor features will diverge between guest FPUs and host
FPUs, while user features will remain the same [1][2]. So, the new
vcpu_fpu_config struct does not include default user features and size
for the UABI buffer.

An alternative approach is adding a guest_only_xfeatures member to
fpu_kernel_cfg and adding two helper functions to calculate the guest
default xfeatures and size. However, calculating these defaults at runtime
would introduce unnecessary overhead.

Suggested-by: Chang S. Bae
Signed-off-by: Chao Gao
Reviewed-by: Rick Edgecombe
Link: https://lore.kernel.org/kvm/aAwdQ759Y6V7SGhv@google.com/ [1]
Link: https://lore.kernel.org/kvm/9ca17e1169805f35168eb722734fbf3579187886.camel@intel.com/ [2]
reviewed-by/acked-by if appropriate?
---
v7: add Rick's Reviewed-by
v6: Drop vcpu_fpu_config.user_* (Rick)
    Reset guest default size when XSAVE is unavailable or disabled (Chang)
v5: Add a new vcpu_fpu_config instead of adding new members to
    fpu_state_config (Chang)
    Extract a helper to set default values (Chang)
---
 arch/x86/include/asm/fpu/types.h | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/fpu/core.c       |  1 +
 arch/x86/kernel/fpu/init.c       |  1 +
 arch/x86/kernel/fpu/xstate.c     | 27 +++++++++++++++++++++------
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 1c94121acd3d..abd193a1a52e 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -551,6 +551,31 @@ struct fpu_guest {
 	struct fpstate *fpstate;
 };
 
+/*
+ * FPU state configuration data for fpu_guest.
+ * Initialized at boot time. Read only after init.
+ */
+struct vcpu_fpu_config {
+	/*
+	 * @size:
+	 *
+	 * The default size of the register state buffer in guest FPUs.
+	 * Includes all supported features except independent managed
+	 * features and features which have to be requested by user space
+	 * before usage.
+	 */
+	unsigned int size;
+
+	/*
+	 * @features:
+	 *
+	 * The default supported features bitmap in guest FPUs. Does not
+	 * include independent managed features and features which have to
+	 * be requested by user space before usage.
+	 */
+	u64 features;
+};
+
 /*
  * FPU state configuration data. Initialized at boot time. Read only after init.
  */
@@ -606,5 +631,6 @@ struct fpu_state_config {
 
 /* FPU state configuration information */
 extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg;
+extern struct vcpu_fpu_config guest_default_cfg;
 
 #endif /* _ASM_X86_FPU_TYPES_H */
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1cda5b78540b..2cd5e1910ff8 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -36,6 +36,7 @@ DEFINE_PER_CPU(u64, xfd_state);
 /* The FPU state configuration data for kernel and user space */
 struct fpu_state_config fpu_kernel_cfg __ro_after_init;
 struct fpu_state_config fpu_user_cfg __ro_after_init;
+struct vcpu_fpu_config guest_default_cfg __ro_after_init;
 
 /*
  * Represents the initial FPU state. It's mostly (but not completely) zeroes,
diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index 6bb3e35c40e2..e19660cdc70c 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -202,6 +202,7 @@ static void __init fpu__init_system_xstate_size_legacy(void)
 	fpu_kernel_cfg.default_size = size;
 	fpu_user_cfg.max_size = size;
 	fpu_user_cfg.default_size = size;
+	guest_default_cfg.size = size;
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 1c8410b68108..f32047e12500 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -742,6 +742,9 @@ static int __init init_xstate_size(void)
 	fpu_user_cfg.default_size =
 		xstate_calculate_size(fpu_user_cfg.default_features, false);
 
+	guest_default_cfg.size =
+		xstate_calculate_size(guest_default_cfg.features, compacted);
+
 	return 0;
 }
 
@@ -762,6 +765,7 @@ static void __init fpu__init_disable_system_xstate(unsigned int legacy_size)
 	fpu_kernel_cfg.default_size = legacy_size;
 	fpu_user_cfg.max_size = legacy_size;
 	fpu_user_cfg.default_size = legacy_size;
+	guest_default_cfg.size = legacy_size;
 
 	/*
 	 * Prevent enabling the static branch which enables writes to the
@@ -772,6 +776,21 @@ static void __init fpu__init_disable_system_xstate(unsigned int legacy_size)
 	fpstate_reset(x86_task_fpu(current));
 }
 
+static void __init init_default_features(u64 kernel_max_features, u64 user_max_features)
+{
+	u64 kfeatures = kernel_max_features;
+	u64 ufeatures = user_max_features;
+
+	/* Default feature sets should not include dynamic xfeatures. */
+	kfeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
+	ufeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
+
+	fpu_kernel_cfg.default_features = kfeatures;
+	fpu_user_cfg.default_features = ufeatures;
+
+	guest_default_cfg.features = kfeatures;
+}
+
 /*
  * Enable and initialize the xsave feature.
  * Called once per system bootup.
@@ -854,12 +873,8 @@ void __init fpu__init_system_xstate(unsigned int legacy_size)
 	fpu_user_cfg.max_features = fpu_kernel_cfg.max_features;
 	fpu_user_cfg.max_features &= XFEATURE_MASK_USER_SUPPORTED;
 
-	/* Clean out dynamic features from default */
-	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
-	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
-
-	fpu_user_cfg.default_features = fpu_user_cfg.max_features;
-	fpu_user_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;
+	/* Now, given maximum feature set, determine default values */
+	init_default_features(fpu_kernel_cfg.max_features, fpu_user_cfg.max_features);
 
 	/* Store it for paranoia check at the end */
 	xfeatures = fpu_kernel_cfg.max_features;
-- 
2.47.1

From nobody Wed Dec 17 08:54:48 2025
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Ingo Molnar, Dave Hansen, "H. Peter Anvin", Oleg Nesterov, Stanislav Spassov, Kees Cook, Eric Biggers
Subject: [PATCH v7 2/6] x86/fpu: Initialize guest FPU permissions from guest defaults
Date: Mon, 12 May 2025 01:57:05 -0700
Message-ID: <20250512085735.564475-3-chao.gao@intel.com>
In-Reply-To: <20250512085735.564475-1-chao.gao@intel.com>
References: <20250512085735.564475-1-chao.gao@intel.com>

Currently, fpu->guest_perm is copied from fpu->perm, which is derived from
fpu_kernel_cfg.default_features.

Guest defaults were introduced to differentiate the features and sizes of
host and guest FPUs. Copying guest FPU permissions from the host will lead
to inconsistencies between the guest default features and permissions.

Initialize guest FPU permissions from guest defaults instead of host
defaults.
This ensures that any changes to guest default features are automatically
reflected in guest permissions, which in turn guarantees that
fpstate_realloc() allocates a correctly sized XSAVE buffer for guest FPUs.

Suggested-by: Chang S. Bae
Signed-off-by: Chao Gao
Reviewed-by: Rick Edgecombe
reviewed-by/acked-by if appropriate?
---
v6: Drop vcpu_fpu_config.user_* and collect reviews (Rick)
---
 arch/x86/kernel/fpu/core.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 2cd5e1910ff8..444e517a8648 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -553,8 +553,14 @@ void fpstate_reset(struct fpu *fpu)
 	fpu->perm.__state_perm = fpu_kernel_cfg.default_features;
 	fpu->perm.__state_size = fpu_kernel_cfg.default_size;
 	fpu->perm.__user_state_size = fpu_user_cfg.default_size;
-	/* Same defaults for guests */
-	fpu->guest_perm = fpu->perm;
+
+	fpu->guest_perm.__state_perm = guest_default_cfg.features;
+	fpu->guest_perm.__state_size = guest_default_cfg.size;
+	/*
+	 * User features and sizes remain the same between guest FPUs
+	 * and host FPUs.
+	 */
+	fpu->guest_perm.__user_state_size = fpu_user_cfg.default_size;
 }
 
 static inline void fpu_inherit_perms(struct fpu *dst_fpu)
-- 
2.47.1

From nobody Wed Dec 17 08:54:48 2025
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Ingo Molnar, Dave Hansen, "H. Peter Anvin", Eric Biggers, Kees Cook
Subject: [PATCH v7 3/6] x86/fpu: Initialize guest fpstate and FPU pseudo container from guest defaults
Date: Mon, 12 May 2025 01:57:06 -0700
Message-ID: <20250512085735.564475-4-chao.gao@intel.com>
In-Reply-To: <20250512085735.564475-1-chao.gao@intel.com>
References: <20250512085735.564475-1-chao.gao@intel.com>

fpu_alloc_guest_fpstate() currently uses host defaults to initialize guest
fpstate and pseudo containers. Guest defaults were introduced to
differentiate the features and sizes of host and guest FPUs. Switch to
using guest defaults instead.

Adjust __fpstate_reset() to handle different defaults for host and guest
FPUs. And to distinguish between the types of FPUs, move the initialization
of indicators (is_guest and is_valloc) before the reset.

Suggested-by: Chang S. Bae
Signed-off-by: Chao Gao
reviewed-by/acked-by if appropriate?
---
v7: tweak __fpstate_reset() instead of adding a guest-specific reset
    function (Sean/Dave)
v6: Drop vcpu_fpu_config.user_* (Rick)
v5: init is_valloc/is_guest in the guest-specific reset function (Chang)

 arch/x86/kernel/fpu/core.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 444e517a8648..0d501bd25d79 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -236,19 +236,22 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu)
 	struct fpstate *fpstate;
 	unsigned int size;
 
-	size = fpu_kernel_cfg.default_size + ALIGN(offsetof(struct fpstate, regs), 64);
+	size = guest_default_cfg.size + ALIGN(offsetof(struct fpstate, regs), 64);
+
 	fpstate = vzalloc(size);
 	if (!fpstate)
 		return false;
 
+	/* Initialize indicators to reflect properties of the fpstate */
+	fpstate->is_valloc = true;
+	fpstate->is_guest = true;
+
 	/* Leave xfd to 0 (the reset value defined by spec) */
 	__fpstate_reset(fpstate, 0);
 	fpstate_init_user(fpstate);
-	fpstate->is_valloc = true;
-	fpstate->is_guest = true;
 
 	gfpu->fpstate = fpstate;
-	gfpu->xfeatures = fpu_kernel_cfg.default_features;
+	gfpu->xfeatures = guest_default_cfg.features;
 
 	/*
 	 * KVM sets the FP+SSE bits in the XSAVE header when copying FPU state
@@ -535,10 +538,20 @@ void fpstate_init_user(struct fpstate *fpstate)
 
 static void __fpstate_reset(struct fpstate *fpstate, u64 xfd)
 {
-	/* Initialize sizes and feature masks */
-	fpstate->size = fpu_kernel_cfg.default_size;
+	/*
+	 * Initialize sizes and feature masks. Supervisor features and
+	 * sizes may diverge between guest FPUs and host FPUs, whereas
+	 * user features and sizes are always the same.
+	 */
+	if (fpstate->is_guest) {
+		fpstate->size = guest_default_cfg.size;
+		fpstate->xfeatures = guest_default_cfg.features;
+	} else {
+		fpstate->size = fpu_kernel_cfg.default_size;
+		fpstate->xfeatures = fpu_kernel_cfg.default_features;
+	}
+
 	fpstate->user_size = fpu_user_cfg.default_size;
-	fpstate->xfeatures = fpu_kernel_cfg.default_features;
 	fpstate->user_xfeatures = fpu_user_cfg.default_features;
 	fpstate->xfd = xfd;
 }
-- 
2.47.1

From nobody Wed Dec 17 08:54:48 2025
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com, pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com, xin3.li@intel.com, Chao Gao, Ingo Molnar, Dave Hansen, "H. Peter Anvin", Kees Cook, Stanislav Spassov, Eric Biggers
Subject: [PATCH v7 4/6] x86/fpu: Remove xfd argument from __fpstate_reset()
Date: Mon, 12 May 2025 01:57:07 -0700
Message-ID: <20250512085735.564475-5-chao.gao@intel.com>
In-Reply-To: <20250512085735.564475-1-chao.gao@intel.com>
References: <20250512085735.564475-1-chao.gao@intel.com>

The initial values for fpstate::xfd differ between guest and host fpstates.
Currently, the initial values are passed as an argument to
__fpstate_reset(). But, __fpstate_reset() already assigns different default
features and sizes based on the type of fpstates (i.e., guest or host). So,
handle fpstate::xfd in a similar way to highlight the differences in the
initial xfd value between guest and host fpstates.

Suggested-by: Sean Christopherson
Signed-off-by: Chao Gao
Link: https://lore.kernel.org/all/aBuf7wiiDT0Wflhk@google.com/
reviewed-by/acked-by if appropriate?
---
v7: new.
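
For readers who want to experiment outside the kernel, the following is a minimal userspace sketch of the selection logic described above: the reset routine reads the is_guest indicator and picks size, features, and the initial xfd value itself, so the caller no longer passes xfd. All names (fpstate_model, defaults_model, fpstate_reset_model, make_fpstate_model) and all numeric values are hypothetical stand-ins chosen for illustration; only the 24-byte size difference echoes the figure quoted in patch 1. This is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct fpstate; only the fields discussed here. */
struct fpstate_model {
	unsigned int size;
	uint64_t xfeatures;
	uint64_t xfd;
	bool is_guest;
};

/* Simplified stand-in for a (size, features) pair of defaults. */
struct defaults_model {
	unsigned int size;
	uint64_t features;
};

/* Hypothetical values: the guest buffer is 24 bytes larger than the host one. */
static const struct defaults_model host_defaults  = { .size = 1024, .features = 0x7 };
static const struct defaults_model guest_defaults = { .size = 1048, .features = 0x7 | (1ULL << 12) };
/* Hypothetical non-zero initial xfd value for host fpstates. */
static const uint64_t host_init_xfd = 1ULL << 18;

/* The reset takes no xfd argument; it derives everything from is_guest. */
static void fpstate_reset_model(struct fpstate_model *fps)
{
	const struct defaults_model *d;

	assert(fps);
	d = fps->is_guest ? &guest_defaults : &host_defaults;

	fps->size = d->size;
	fps->xfeatures = d->features;
	/* Guest fpstates start with xfd = 0, host fpstates use the init value. */
	fps->xfd = fps->is_guest ? 0 : host_init_xfd;
}

/* Allocation-path sketch: the indicator must be set before the reset runs. */
static struct fpstate_model make_fpstate_model(bool is_guest)
{
	struct fpstate_model fps = { 0 };

	fps.is_guest = is_guest;
	fpstate_reset_model(&fps);
	return fps;
}
```

Calling make_fpstate_model(true) and make_fpstate_model(false) yields two states that differ in size, xfeatures, and xfd. If the indicator were set only after the reset, as in the code before patch 3, both would receive host defaults.
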
arch/x86/kernel/fpu/core.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c index 0d501bd25d79..a3cafed350e0 100644 --- a/arch/x86/kernel/fpu/core.c +++ b/arch/x86/kernel/fpu/core.c @@ -211,7 +211,7 @@ void fpu_reset_from_exception_fixup(void) } =20 #if IS_ENABLED(CONFIG_KVM) -static void __fpstate_reset(struct fpstate *fpstate, u64 xfd); +static void __fpstate_reset(struct fpstate *fpstate); =20 static void fpu_lock_guest_permissions(void) { @@ -246,8 +246,7 @@ bool fpu_alloc_guest_fpstate(struct fpu_guest *gfpu) fpstate->is_valloc =3D true; fpstate->is_guest =3D true; =20 - /* Leave xfd to 0 (the reset value defined by spec) */ - __fpstate_reset(fpstate, 0); + __fpstate_reset(fpstate); fpstate_init_user(fpstate); =20 gfpu->fpstate =3D fpstate; @@ -536,7 +535,7 @@ void fpstate_init_user(struct fpstate *fpstate) fpstate_init_fstate(fpstate); } =20 -static void __fpstate_reset(struct fpstate *fpstate, u64 xfd) +static void __fpstate_reset(struct fpstate *fpstate) { /* * Initialize sizes and feature masks. 
Supervisor features and @@ -546,21 +545,23 @@ static void __fpstate_reset(struct fpstate *fpstate, = u64 xfd) if (fpstate->is_guest) { fpstate->size =3D guest_default_cfg.size; fpstate->xfeatures =3D guest_default_cfg.features; + /* Leave xfd to 0 (the reset value defined by spec) */ + fpstate->xfd =3D 0; } else { fpstate->size =3D fpu_kernel_cfg.default_size; fpstate->xfeatures =3D fpu_kernel_cfg.default_features; + fpstate->xfd =3D init_fpstate.xfd; } =20 fpstate->user_size =3D fpu_user_cfg.default_size; fpstate->user_xfeatures =3D fpu_user_cfg.default_features; - fpstate->xfd =3D xfd; } =20 void fpstate_reset(struct fpu *fpu) { /* Set the fpstate pointer to the default fpstate */ fpu->fpstate =3D &fpu->__fpstate; - __fpstate_reset(fpu->fpstate, init_fpstate.xfd); + __fpstate_reset(fpu->fpstate); =20 /* Initialize the permission related info in fpu */ fpu->perm.__state_perm =3D fpu_kernel_cfg.default_features; --=20 2.47.1 From nobody Wed Dec 17 08:54:48 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6B29E266572; Mon, 12 May 2025 08:57:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747040271; cv=none; b=DBuZZlQM57Z+leQbqNYBEQWdgamtwGtX9haqbsP6LikvVNSq9vn4paxSF41io1/J83IjguQxmaJzaXLX+3AtP9PqtkXHvgHESw6N0W65bVk8NFnPY9robySt+0pmHn6ducah0khIhyP53TrU/qW8J07bc14tREgn+prpH5bOu6Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747040271; c=relaxed/simple; bh=aJVYBo6drvmcTv/txl/DmxL0wX3+CLuv7hTrsvr1hSU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=WzgS+zrjFYWxusw9voDb4c/8cD2Fwly0F31JYXGNVr7L7UAGn0LUNSHhH0s7WsjIuQVA1mS94hiODIydWv6xmPmUQudVd+uNHWikEkkZAa6SHfE0Nedl4BxfOVVJcxxvvTDts4eprfbdCAwF8O8F7GML8ONWQr700rHgPAEVmxU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=A7Ze9dFD; arc=none smtp.client-ip=192.198.163.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="A7Ze9dFD" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1747040269; x=1778576269; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aJVYBo6drvmcTv/txl/DmxL0wX3+CLuv7hTrsvr1hSU=; b=A7Ze9dFDm4+7OHmeaUft9e74uPlFKfojGItUNUtQc9a6XJvKbuy9lC9N rErNBqunqfCgVdl5vToow2tSM3hSjGHmxUhdGd+Abt3q+axEIbCu6b5E2 YiLD9Yl2NO/2+J57a/nNgXEPUNahBQJD+uyOMOn7PM3JR00xDMDLKpPOf QcOFEqZiitIr9eRI2mOK1lhdJ0uV8W4ng2j0wkg3j6zG5fJj/sx7cvW1N kWtw+n5WpvO1prOyUR6hA+MsmZPg/klGs+EXVLjfPgrTZaLMyjBDzTlt0 VJhtpzly9xmvHINjbwf77GotpelUtci71Zi7sddj0mXXMjonDpDh+UK0q w==; X-CSE-ConnectionGUID: p0O27EjWQ7ez1skDfDZJ+w== X-CSE-MsgGUID: Scm3hfoNRTKrQZzltwiM1w== X-IronPort-AV: E=McAfee;i="6700,10204,11430"; a="59488745" X-IronPort-AV: E=Sophos;i="6.15,281,1739865600"; d="scan'208";a="59488745" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 May 2025 01:57:48 -0700 X-CSE-ConnectionGUID: 2zwa3coCQs6jAFW/rSa7aQ== X-CSE-MsgGUID: rimokqMmTV6vBceyTLUEBQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,281,1739865600"; d="scan'208";a="138235794" Received: from 984fee019967.jf.intel.com 
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com,
    pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
    john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com,
    xin3.li@intel.com, Chao Gao, Ingo Molnar, Dave Hansen, "H. Peter Anvin",
    Mitchell Levy, Samuel Holland, Zhao Liu, Maxim Levitsky,
    Vignesh Balasubramanian, Aruna Ramakrishna, Uros Bizjak
Subject: [PATCH v7 5/6] x86/fpu/xstate: Introduce "guest-only" supervisor xfeature set
Date: Mon, 12 May 2025 01:57:08 -0700
Message-ID: <20250512085735.564475-6-chao.gao@intel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250512085735.564475-1-chao.gao@intel.com>
References: <20250512085735.564475-1-chao.gao@intel.com>

From: Yang Weijiang

In preparation for upcoming CET virtualization support, the CET supervisor
state will be added as a "guest-only" feature, since it is required only by
KVM (i.e., guest FPUs). Establish the infrastructure for "guest-only"
features.

Define a new XFEATURE_MASK_GUEST_SUPERVISOR mask to specify features that
are enabled by default in guest FPUs but not in host FPUs. Specifically,
for any bit in this set, permission is granted and XSAVE space is allocated
during vCPU creation. Non-guest FPUs cannot enable guest-only features,
even dynamically, and no XSAVE space will be allocated for them.

The mask is currently empty, but this will be changed by a subsequent
patch.

Co-developed-by: Chao Gao
Signed-off-by: Chao Gao
Signed-off-by: Yang Weijiang
Reviewed-by: Rick Edgecombe
reviewed-by/acked-by if appropriate?
---
v6: Collect reviews

v5: Explain in detail the reasoning behind the mask name choice below the
"---" separator line.

In previous versions, the mask was named "XFEATURE_MASK_SUPERVISOR_DYNAMIC".
Dave suggested this name [1], but he also noted, "I don't feel strongly about
it and I've said my piece. I won't NAK it one way or the other."

The term "dynamic" was initially preferred because it reflects the impact
on XSAVE buffers: some buffers accommodate dynamic features while others do
not. This naming allows for the introduction of dynamic features that are
not strictly "guest-only", offering flexibility beyond KVM.

However, using "dynamic" has led to confusion [2]. Chang pointed out that
permission granting and buffer allocation are actually static at VCPU
allocation, diverging from the model for user dynamic features. He also
questioned the rationale for introducing a kernel dynamic feature mask
while using it as a guest-only feature mask [3]. Moreover, Thomas remarked
that "the dynamic naming is really bad" [4]. Although his specific concerns
are unclear, we should be cautious about reinstating the "kernel dynamic
feature" naming.

Therefore, in v4, I renamed the mask to "XFEATURE_MASK_SUPERVISOR_GUEST"
and further refined it to "XFEATURE_MASK_GUEST_SUPERVISOR" in this v5.
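For reviewers who want to see the mask arithmetic in isolation, here is a
minimal userspace sketch, not kernel code. It assumes a non-empty guest-only
mask (bit 12, the CET supervisor component) purely for illustration, since the
mask defined in this patch is 0. The SKETCH_* macros, struct default_cfgs and
split_default_features() are hypothetical names that do not exist in the
kernel.

```c
#include <stdint.h>

/*
 * Illustrative bit positions only. Bit 18 is the AMX tile data component;
 * bit 12 is ASSUMED to be guest-only for the sake of the example.
 */
#define SKETCH_MASK_USER_DYNAMIC     (UINT64_C(1) << 18)
#define SKETCH_MASK_GUEST_SUPERVISOR (UINT64_C(1) << 12)

/* Hypothetical container for the two default feature sets. */
struct default_cfgs {
	uint64_t host_default;  /* stands in for fpu_kernel_cfg.default_features */
	uint64_t guest_default; /* stands in for guest_default_cfg.features */
};

struct default_cfgs split_default_features(uint64_t kernel_max_features)
{
	struct default_cfgs cfg;
	uint64_t kfeatures = kernel_max_features;

	/* Host defaults drop both user-dynamic and guest-only features. */
	kfeatures &= ~(SKETCH_MASK_USER_DYNAMIC | SKETCH_MASK_GUEST_SUPERVISOR);
	cfg.host_default = kfeatures;

	/* Guest defaults add back only the guest-only bits that are supported. */
	cfg.guest_default = kfeatures |
			    (kernel_max_features & SKETCH_MASK_GUEST_SUPERVISOR);
	return cfg;
}
```

With bits 0-2, 12 and 18 set in kernel_max_features, the host default keeps
only bits 0-2, while the guest default keeps bits 0-2 plus bit 12. Bit 18 is
in neither default set because it must still be requested explicitly.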
[1]: https://lore.kernel.org/all/893ac578-baaf-4f4f-96ee-e012dfc073a8@intel.com/#t
[2]: https://lore.kernel.org/kvm/e15d1074-d5ec-431d-86e5-a58bc6297df8@intel.com/
[3]: https://lore.kernel.org/kvm/7bee70fd-b2b9-4466-a694-4bf3486b19c7@intel.com/
[4]: https://lore.kernel.org/all/87sg1owmth.ffs@nanos.tec.linutronix.de/
---
 arch/x86/include/asm/fpu/types.h  |  9 +++++----
 arch/x86/include/asm/fpu/xstate.h |  6 +++++-
 arch/x86/kernel/fpu/xstate.c      | 14 +++++++++++---
 arch/x86/kernel/fpu/xstate.h      |  5 +++++
 4 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index abd193a1a52e..54ba567258d6 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -592,8 +592,9 @@ struct fpu_state_config {
 	 * @default_size:
 	 *
 	 * The default size of the register state buffer. Includes all
-	 * supported features except independent managed features and
-	 * features which have to be requested by user space before usage.
+	 * supported features except independent managed features,
+	 * guest-only features and features which have to be requested by
+	 * user space before usage.
 	 */
 	unsigned int default_size;
 
@@ -609,8 +610,8 @@ struct fpu_state_config {
 	 * @default_features:
 	 *
 	 * The default supported features bitmap. Does not include
-	 * independent managed features and features which have to
-	 * be requested by user space before usage.
+	 * independent managed features, guest-only features and features
+	 * which have to be requested by user space before usage.
 	 */
 	u64 default_features;
 	/*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index b308a76afbb7..a3cd25453f94 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -46,9 +46,13 @@
 /* Features which are dynamically enabled for a process on request */
 #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
 
+/* Supervisor features which are enabled only in guest FPUs */
+#define XFEATURE_MASK_GUEST_SUPERVISOR	0
+
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
-					    XFEATURE_MASK_CET_USER)
+					    XFEATURE_MASK_CET_USER | \
+					    XFEATURE_MASK_GUEST_SUPERVISOR)
 
 /*
  * A supervisor state component may not always contain valuable information,
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index f32047e12500..e77cbfd18094 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -781,14 +781,22 @@ static void __init init_default_features(u64 kernel_max_features, u64 user_max_f
 	u64 kfeatures = kernel_max_features;
 	u64 ufeatures = user_max_features;
 
-	/* Default feature sets should not include dynamic xfeatures. */
-	kfeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
+	/*
+	 * Default feature sets should not include dynamic and guest-only
+	 * xfeatures at all.
+	 */
+	kfeatures &= ~(XFEATURE_MASK_USER_DYNAMIC | XFEATURE_MASK_GUEST_SUPERVISOR);
 	ufeatures &= ~XFEATURE_MASK_USER_DYNAMIC;
 
 	fpu_kernel_cfg.default_features = kfeatures;
 	fpu_user_cfg.default_features = ufeatures;
 
-	guest_default_cfg.features = kfeatures;
+	/*
+	 * Ensure VCPU FPU container only reserves a space for guest-only
+	 * xfeatures. This distinction can save kernel memory by
+	 * maintaining a necessary amount of XSAVE buffer.
+	 */
+	guest_default_cfg.features = kfeatures | xfeatures_mask_guest_supervisor();
 }
 
 /*
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a0256ef34ecb..5ced1a92e666 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -61,6 +61,11 @@ static inline u64 xfeatures_mask_supervisor(void)
 	return fpu_kernel_cfg.max_features & XFEATURE_MASK_SUPERVISOR_SUPPORTED;
 }
 
+static inline u64 xfeatures_mask_guest_supervisor(void)
+{
+	return fpu_kernel_cfg.max_features & XFEATURE_MASK_GUEST_SUPERVISOR;
+}
+
 static inline u64 xfeatures_mask_independent(void)
 {
 	if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR))
-- 
2.47.1

From nobody Wed Dec 17 08:54:48 2025
From: Chao Gao
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    tglx@linutronix.de, dave.hansen@intel.com, seanjc@google.com,
    pbonzini@redhat.com
Cc: peterz@infradead.org, rick.p.edgecombe@intel.com, weijiang.yang@intel.com,
    john.allen@amd.com, bp@alien8.de, chang.seok.bae@intel.com,
    xin3.li@intel.com, Chao Gao, Maxim Levitsky, Ingo Molnar, Dave Hansen,
    "H. Peter Anvin", Mitchell Levy, Samuel Holland, Sohil Mehta,
    Vignesh Balasubramanian
Subject: [PATCH v7 6/6] x86/fpu/xstate: Add CET supervisor xfeature support as a guest-only feature
Date: Mon, 12 May 2025 01:57:09 -0700
Message-ID: <20250512085735.564475-7-chao.gao@intel.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250512085735.564475-1-chao.gao@intel.com>
References: <20250512085735.564475-1-chao.gao@intel.com>

From: Yang Weijiang

== Background ==

CET defines two register states: CET user, which includes user-mode control
registers, and CET supervisor, which consists of shadow-stack pointers for
privilege levels 0-2.

Current kernels disable shadow stacks in kernel mode, making the CET
supervisor state unused and eliminating the need for context switching.

== Problem ==

To virtualize CET for guests, KVM must accurately emulate hardware
behavior. A key challenge arises because there is no CPUID flag to indicate
that shadow stack is supported only in user mode. Therefore, KVM cannot
assume guests will not enable shadow stacks in kernel mode and must
preserve the CET supervisor state of vCPUs.

== Solution ==

An initial proposal to manually save and restore CET supervisor states
using raw RDMSR/WRMSR in KVM was rejected due to performance concerns and
its impact on KVM's ABI. Instead, leveraging the kernel's FPU
infrastructure for context switching was favored [1].

The main question then became whether to enable the CET supervisor state
globally for all processes or restrict it to vCPU processes.
This decision involves a trade-off between a 24-byte XSTATE buffer waste
for all non-vCPU processes and approximately 100 lines of code complexity
in the kernel [2]. The agreed approach is to first try this optimal
solution [3], i.e., restricting the CET supervisor state to guest FPUs only
and eliminating unnecessary space waste.

The guest-only xfeature infrastructure has already been added. Now,
introduce CET supervisor xstate support as the first guest-only feature
to prepare for the upcoming CET virtualization in KVM.

Signed-off-by: Yang Weijiang
Signed-off-by: Chao Gao
Reviewed-by: Rick Edgecombe
Reviewed-by: Maxim Levitsky
Link: https://lore.kernel.org/kvm/ZM1jV3UPL0AMpVDI@google.com/ [1]
Link: https://lore.kernel.org/kvm/1c2fd06e-2e97-4724-80ab-8695aa4334e7@intel.com/ [2]
Link: https://lore.kernel.org/kvm/2597a87b-1248-b8ce-ce60-94074bc67ea4@intel.com/ [3]
reviewed-by/acked-by if appropriate?
---
v5:
Introduce CET supervisor xfeature directly as a guest-only feature, rather
than first introducing it in one patch and then converting it to guest-only
in a subsequent patch. (Chang)
Add new features after cleanups/bug fixes (Chang, Dave, Ingo)
Improve the commit message to follow the suggested
background-problem-solution pattern.
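As a quick sanity check of the 24-byte figure quoted above, the layout can be
reproduced in userspace. This is only a sketch: it mirrors the three-pointer
structure this patch adds, but uses uint64_t in place of the kernel's u64 and
the GCC/Clang attribute spelling in place of __packed. The
cet_supervisor_state_size() helper is a hypothetical name added for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the structure; one shadow-stack pointer per ring 0-2. */
struct cet_supervisor_state {
	uint64_t pl0_ssp;
	uint64_t pl1_ssp;
	uint64_t pl2_ssp;
} __attribute__((packed));

/* Three 8-byte pointers: this is the space saved per non-vCPU task. */
_Static_assert(sizeof(struct cet_supervisor_state) == 24,
	       "CET supervisor state is expected to be 24 bytes");

/* Hypothetical helper so the size can also be checked at run time. */
size_t cet_supervisor_state_size(void)
{
	return sizeof(struct cet_supervisor_state);
}
```

The compile-time assertion plays a role loosely similar to the XCHECK_SZ()
case added for XFEATURE_CET_KERNEL, which compares the structure size with the
size the CPU reports for the component.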
---
 arch/x86/include/asm/fpu/types.h  | 14 ++++++++++++--
 arch/x86/include/asm/fpu/xstate.h |  5 ++---
 arch/x86/kernel/fpu/xstate.c      |  5 ++++-
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 54ba567258d6..93e99d2583d6 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -118,7 +118,7 @@ enum xfeature {
 	XFEATURE_PKRU,
 	XFEATURE_PASID,
 	XFEATURE_CET_USER,
-	XFEATURE_CET_KERNEL_UNUSED,
+	XFEATURE_CET_KERNEL,
 	XFEATURE_RSRVD_COMP_13,
 	XFEATURE_RSRVD_COMP_14,
 	XFEATURE_LBR,
@@ -142,7 +142,7 @@ enum xfeature {
 #define XFEATURE_MASK_PKRU		(1 << XFEATURE_PKRU)
 #define XFEATURE_MASK_PASID		(1 << XFEATURE_PASID)
 #define XFEATURE_MASK_CET_USER		(1 << XFEATURE_CET_USER)
-#define XFEATURE_MASK_CET_KERNEL	(1 << XFEATURE_CET_KERNEL_UNUSED)
+#define XFEATURE_MASK_CET_KERNEL	(1 << XFEATURE_CET_KERNEL)
 #define XFEATURE_MASK_LBR		(1 << XFEATURE_LBR)
 #define XFEATURE_MASK_XTILE_CFG		(1 << XFEATURE_XTILE_CFG)
 #define XFEATURE_MASK_XTILE_DATA	(1 << XFEATURE_XTILE_DATA)
@@ -268,6 +268,16 @@ struct cet_user_state {
 	u64 user_ssp;
 };
 
+/*
+ * State component 12 is Control-flow Enforcement supervisor states.
+ * This state includes SSP pointers for privilege levels 0 through 2.
+ */
+struct cet_supervisor_state {
+	u64 pl0_ssp;
+	u64 pl1_ssp;
+	u64 pl2_ssp;
+} __packed;
+
 /*
  * State component 15: Architectural LBR configuration state.
  * The size of Arch LBR state depends on the number of LBRs (lbr_depth).
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index a3cd25453f94..7a7dc9d56027 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -47,7 +47,7 @@
 #define XFEATURE_MASK_USER_DYNAMIC	XFEATURE_MASK_XTILE_DATA
 
 /* Supervisor features which are enabled only in guest FPUs */
-#define XFEATURE_MASK_GUEST_SUPERVISOR	0
+#define XFEATURE_MASK_GUEST_SUPERVISOR	XFEATURE_MASK_CET_KERNEL
 
 /* All currently supported supervisor features */
 #define XFEATURE_MASK_SUPERVISOR_SUPPORTED (XFEATURE_MASK_PASID | \
@@ -79,8 +79,7 @@
  * Unsupported supervisor features. When a supervisor feature in this mask is
  * supported in the future, move it to the supported supervisor feature mask.
  */
-#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT | \
-					      XFEATURE_MASK_CET_KERNEL)
+#define XFEATURE_MASK_SUPERVISOR_UNSUPPORTED (XFEATURE_MASK_PT)
 
 /* All supervisor states including supported and unsupported states. */
 #define XFEATURE_MASK_SUPERVISOR_ALL (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index e77cbfd18094..549cc8929407 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -56,7 +56,7 @@ static const char *xfeature_names[] =
 	"Protection Keys User registers",
 	"PASID state",
 	"Control-flow User registers",
-	"Control-flow Kernel registers (unused)",
+	"Control-flow Kernel registers (KVM only)",
 	"unknown xstate feature",
 	"unknown xstate feature",
 	"unknown xstate feature",
@@ -80,6 +80,7 @@ static unsigned short xsave_cpuid_features[] __initdata = {
 	[XFEATURE_PKRU]				= X86_FEATURE_OSPKE,
 	[XFEATURE_PASID]			= X86_FEATURE_ENQCMD,
 	[XFEATURE_CET_USER]			= X86_FEATURE_SHSTK,
+	[XFEATURE_CET_KERNEL]			= X86_FEATURE_SHSTK,
 	[XFEATURE_XTILE_CFG]			= X86_FEATURE_AMX_TILE,
 	[XFEATURE_XTILE_DATA]			= X86_FEATURE_AMX_TILE,
 	[XFEATURE_APX]				= X86_FEATURE_APX,
@@ -371,6 +372,7 @@ static __init void os_xrstor_booting(struct xregs_state *xstate)
 				 XFEATURE_MASK_BNDCSR |	\
 				 XFEATURE_MASK_PASID |	\
 				 XFEATURE_MASK_CET_USER |	\
+				 XFEATURE_MASK_CET_KERNEL |	\
 				 XFEATURE_MASK_XTILE |	\
 				 XFEATURE_MASK_APX)
 
@@ -572,6 +574,7 @@ static bool __init check_xstate_against_struct(int nr)
 	case XFEATURE_PASID:	  return XCHECK_SZ(sz, nr, struct ia32_pasid_state);
 	case XFEATURE_XTILE_CFG:  return XCHECK_SZ(sz, nr, struct xtile_cfg);
 	case XFEATURE_CET_USER:	  return XCHECK_SZ(sz, nr, struct cet_user_state);
+	case XFEATURE_CET_KERNEL: return XCHECK_SZ(sz, nr, struct cet_supervisor_state);
 	case XFEATURE_APX:	  return XCHECK_SZ(sz, nr, struct apx_state);
 	case XFEATURE_XTILE_DATA: check_xtile_data_against_struct(sz); return true;
 	default:
-- 
2.47.1