arch/x86/include/asm/fpu/types.h | 58 ++++++++++++++++++++++--------- arch/x86/include/asm/fpu/xstate.h | 9 +++-- arch/x86/kernel/fpu/core.c | 36 ++++++++++++------- arch/x86/kernel/fpu/xstate.c | 42 +++++++++++++++------- arch/x86/kernel/fpu/xstate.h | 2 ++ 5 files changed, 102 insertions(+), 45 deletions(-)
==Changelog==
v3->v4:
- Remove fpu_guest_cfg.
The fact that only the default_features and default_size fields of
fpu_guest_cfg are used suggests that a full fpu_guest_cfg may not be
necessary. Adding two members, "guest_default_xfeatures" and
"guest_default_size", or even a single "guest_only_xfeatures" member in
fpu_kernel_cfg, similar to "independent_xfeatures", is more logical. To
facilitate discussion, implement this approach in this version.
- Extract the fix for inconsistencies in fpu_guest and post it separately
(Chang)
- Rename XFEATURE_MASK_KERNEL_DYNAMIC to XFEATURE_MASK_SUPERVISOR_GUEST as
tglx noted "this dynamic naming is really bad":
https://lore.kernel.org/all/87sg1owmth.ffs@nanos.tec.linutronix.de/
- Rerun performance tests and update the performance claims in the cover-letter
(Dave)
- Tighten down the changelogs and drop useless comments (Dave)
- Reorder the patches to put the CET supervisor state patch before the
"guest-only" optimization, allowing maintainers to easily adopt or omit the
optimization.
- v3: https://lore.kernel.org/kvm/20250307164123.1613414-1-chao.gao@intel.com/
v2->v3:
- reorder patches to add fpu_guest_cfg first and then introduce dynamic kernel
feature concept (Dave)
- Revise changelog for all patches except the first and the last one (Dave)
- Split up patches that do multiple things into separate patches.
- collect tags for patch 1
- v2: https://lore.kernel.org/kvm/20241126101710.62492-1-chao.gao@intel.com/
v1->v2:
- rebase onto the latest kvm-x86/next
- Add performance data to the cover-letter
- v1: https://lore.kernel.org/kvm/73802bff-833c-4233-9a5b-88af0d062c82@intel.com/
==Background==
This series spins off from CET KVM virtualization enabling series [1].
The purpose is to get these preparation work resolved ahead of KVM part
landing. There was a discussion about introducing CET supervisor state
support [2] [3].
CET supervisor state, i.e., IA32_PL{0,1,2}_SSP, are xsave-managed MSRs,
it can be enabled via IA32_XSS[bit 12]. KVM relies on host side CET
supervisor state support to fully enable guest CET MSR contents storage.
The benefits are: 1) No need to manually save/restore the 3 MSRs when
vCPU fpu context is sched in/out. 2) Omit manually swapping the three
MSRs at VM-Exit/VM-Entry for guest/host. 3) Make guest CET user/supervisor
states managed in a consistent manner within host kernel FPU framework.
==Solution==
This series tries to:
1) Fix existing issue regarding enabling guest supervisor states support.
2) Add CET supervisor state support in core kernel.
3) Introduce guest default features and size for guest fpstate setup.
With the preparation work landed, for guest fpstate, we have xstate_bv[12]
== xcomp_bv[12] == 1 and CET supervisor state is saved/reloaded when
xsaves/xrstors executes on guest fpstate.
For non-guest/normal fpstate, we have xstate_bv[12] == xcomp_bv[12] == 0,
then HW can optimize xsaves/xrstors operations.
==Performance==
We measured context-switching performance with the benchmark [4] in following
three cases.
case 1: the baseline. i.e., this series isn't applied
case 2: baseline + this series. CET-S space is allocated for guest fpu only.
case 3: baseline + allocate CET-S space for all tasks. Hardware init
optimization avoids writing out CET-S space on each XSAVES.
The performance differences in the three cases are very small and fall within the
run-to-run variation.
Case 2 is preferred over Case 3 because it can save 24B of CET-S space for all
non-vCPU threads with just a one-line change:
+ fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_SUPERVISOR_GUEST
We believe adding guest defaults has its own merits. It improves readability,
decouples host FPUs and guest FPUs, and arguably enhances extensibility.
[1]: https://lore.kernel.org/all/20240219074733.122080-1-weijiang.yang@intel.com/
[2]: https://lore.kernel.org/all/ZM1jV3UPL0AMpVDI@google.com/
[3]: https://lore.kernel.org/all/2597a87b-1248-b8ce-ce60-94074bc67ea4@intel.com/
[4]: https://github.com/antonblanchard/will-it-scale/blob/master/tests/context_switch1.c
Chao Gao (4):
x86/fpu: Drop @perm from guest pseudo FPU container
x86/fpu/xstate: Differentiate default features for host and guest FPUs
x86/fpu: Initialize guest FPU permissions from guest defaults
x86/fpu: Initialize guest fpstate and FPU pseudo container from guest
defaults
Sean Christopherson (1):
x86/fpu/xstate: Always preserve non-user xfeatures/flags in
__state_perm
Yang Weijiang (3):
x86/fpu/xstate: Add CET supervisor xfeature support
x86/fpu/xstate: Introduce "guest-only" supervisor xfeature set
x86/fpu/xstate: Warn if guest-only supervisor states are detected in
normal fpstate
arch/x86/include/asm/fpu/types.h | 58 ++++++++++++++++++++++---------
arch/x86/include/asm/fpu/xstate.h | 9 +++--
arch/x86/kernel/fpu/core.c | 36 ++++++++++++-------
arch/x86/kernel/fpu/xstate.c | 42 +++++++++++++++-------
arch/x86/kernel/fpu/xstate.h | 2 ++
5 files changed, 102 insertions(+), 45 deletions(-)
--
2.46.1
On 3/18/2025 8:31 AM, Chao Gao wrote: > > arch/x86/include/asm/fpu/types.h | 58 ++++++++++++++++++++++--------- > arch/x86/include/asm/fpu/xstate.h | 9 +++-- > arch/x86/kernel/fpu/core.c | 36 ++++++++++++------- > arch/x86/kernel/fpu/xstate.c | 42 +++++++++++++++------- > arch/x86/kernel/fpu/xstate.h | 2 ++ > 5 files changed, 102 insertions(+), 45 deletions(-) Hi Chao Gao, I've left a few comments on your patches. It looks like the structure of your patch set has been shifting between postings. I’d recommend reviewing the build-up of changes more carefully — see my reply on patch3. Additionally, your approach to the ongoing discussion comes across as somewhat reserved. Since you’re presenting a counterpoint to the maintainer’s suggestion, I’d encourage you to articulate your reasoning more clearly (see my comment on patch 7). Most of other feedback focuses on refining individual changes. Specifically, your upcoming modifications to the initialization sequence do not look super clear. My suggestions in patch 4 reflect my interpretation, but please consider whether there's a clearer way to present the new configuration settings. Thanks, Chang
Xin and I were discussing the dependencies we have going here. For arch reasons, CET supervisor shadow stack was waiting for FRED support. And for KVM development process reasons KVM FRED support is waiting for KVM CET support. It’s almost a circle. It looks like Chao has re-arranged the patches such that the space saving optimization could be dropped. It would just look like switching to the pretty trivial patches 1-3 I guess. However, he didn’t actually advocate for that to happen. But the subtext seems to be that it should be considered? I think there aren't concerns with the bones of the optimization in the later patches. It’s just polishing that is needed. But some of the polishing needed (i.e. the long debated naming of for the guest category of features) is downstream of the growing complexity of the FPU, which we were planning to accept. So maybe it’s turning out more costly than we expected? In any case, at this point I think we need to either double down on polishing this thing up (by pausing other work) and have a clear “please do this with these patches" request, or declare failure and argue for the smaller version. I guess I still lean towards keeping the optimization. But I do think it's worth considering at this point. Rick
On 4/2/25 14:12, Edgecombe, Rick P wrote: > In any case, at this point I think we need to either double down on polishing > this thing up (by pausing other work) and have a clear “please do this with > these patches" request, or declare failure and argue for the smaller version. > > I guess I still lean towards keeping the optimization. But I do think it's worth > considering at this point. I'm not quite feeling the same sense of panic and some need to pause other work. This is thing is at v4. Between spring break and the merge window, v4 hasn't gotten many eyeballs, except Chang's (thanks Chang!).
On Wed, Apr 02, 2025, Dave Hansen wrote: > On 4/2/25 14:12, Edgecombe, Rick P wrote: > > In any case, at this point I think we need to either double down on polishing > > this thing up (by pausing other work) and have a clear “please do this with > > these patches" request, or declare failure and argue for the smaller version. > > > > I guess I still lean towards keeping the optimization. But I do think it's worth > > considering at this point. > > I'm not quite feeling the same sense of panic and some need to pause > other work. This is thing is at v4. Between spring break and the merge > window, v4 hasn't gotten many eyeballs, except Chang's (thanks Chang!). This particular series is at v4, but KVM CET support was more or less ready to roll 4 *years* ago. I agree there's no need to panic, but some sense of urgency would be nice.
© 2016 - 2025 Red Hat, Inc.