arch/x86/kernel/fpu/xstate.c | 2 +- tools/testing/selftests/x86/amx.c | 16 ++++++++++++++-- 2 files changed, 15 insertions(+), 3 deletions(-)
Changes from V3: * Rebased onto 5.17-rc1. V3: https://lore.kernel.org/lkml/20211110003209.21666-1-chang.seok.bae@intel.com/ --- The recent x86 dynamic state support incorporates the arch_prctl option to request permission before using a dynamic state. It was designed to add the requested feature in the group leader's permission bitmask so that every thread can reference this master bitmask. The group leader is assumed to be unchanged here. The mainline is the case as a group leader is identified at fork() or exec() time only. This master bitmask should include non-dynamic features always, as they are permitted by default. Users may check them via ARCH_GET_XCOMP_PERM. But, in hindsight, the implementation does overwrite the bitmask with the requested bit only, instead of adding the bit to the existing one. This overwrite effectively revokes the permission that is granted already. Fix the code and also update the selftest to disclose the issue if there is. Chang S. Bae (1): selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test Yang Zhong (1): x86/arch_prctl: Fix the ARCH_REQ_XCOMP_PERM implementation arch/x86/kernel/fpu/xstate.c | 2 +- tools/testing/selftests/x86/amx.c | 16 ++++++++++++++-- 2 files changed, 15 insertions(+), 3 deletions(-) base-commit: e783362eb54cd99b2cac8b3a9aeac942e6f6ac07 -- 2.17.1
Thomas, Dave, can this series be included in 5.18 and CCed to stable? The bug makes the __state_perm field completely wrong. As a result, arch_prctl(ARCH_GET_XCOMP_PERM) only returns the features that were requested last, forgetting what was already in __state_perm (the "permitted" argument to __xstate_request_perm). In KVM, it is a bit worse. It affects arch_prctl(ARCH_GET_XCOMP_GUEST_PERM) in the same way and also ioctl(KVM_GET_SUPPORTED_CPUID), but the bug can also make KVM return the wrong xsave state size to userspace. It's likely to go unnoticed by userspace until Intel adds non-dynamic states above a dynamic one, but potentially userspace could allocate a buffer that is too small. Paolo On 1/29/22 18:36, Chang S. Bae wrote: > Changes from V3: > * Rebased onto 5.17-rc1. > > V3: https://lore.kernel.org/lkml/20211110003209.21666-1-chang.seok.bae@intel.com/ > > --- > > The recent x86 dynamic state support incorporates the arch_prctl option to > request permission before using a dynamic state. > > It was designed to add the requested feature in the group leader's > permission bitmask so that every thread can reference this master bitmask. > The group leader is assumed to be unchanged here. The mainline is the case > as a group leader is identified at fork() or exec() time only. > > This master bitmask should include non-dynamic features always, as they > are permitted by default. Users may check them via ARCH_GET_XCOMP_PERM. > > But, in hindsight, the implementation does overwrite the bitmask with the > requested bit only, instead of adding the bit to the existing one. This > overwrite effectively revokes the permission that is granted already. > > Fix the code and also update the selftest to disclose the issue if there > is. > > Chang S. Bae (1): > selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test > > Yang Zhong (1): > x86/arch_prctl: Fix the ARCH_REQ_XCOMP_PERM implementation > > arch/x86/kernel/fpu/xstate.c | 2 +- > tools/testing/selftests/x86/amx.c | 16 ++++++++++++++-- > 2 files changed, 15 insertions(+), 3 deletions(-) > > > base-commit: e783362eb54cd99b2cac8b3a9aeac942e6f6ac07
Paolo,
On Wed, Mar 23 2022 at 12:04, Paolo Bonzini wrote:
> can this series be included in 5.18 and CCed to stable?
working on it. There is another issue with that which I'm currently
looking into.
Thanks,
tglx
On Wed, Mar 23 2022 at 13:27, Thomas Gleixner wrote:
> On Wed, Mar 23 2022 at 12:04, Paolo Bonzini wrote:
>> can this series be included in 5.18 and CCed to stable?
>
> working on it. There is another issue with that which I'm currently
> looking into.
The size calculation for the kernel state fails to take supervisor
states into account. Up to 5.18 that did not matter because ENQCMD/PASID
was disabled. But now it matters...
Thanks,
tglx
---
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1625,6 +1625,8 @@ static int __xstate_request_perm(u64 per
/* Calculate the resulting kernel state size */
mask = permitted | requested;
+ /* Take supervisor states into account */
+ mask |= xfeatures_mask_supervisor();
ksize = xstate_calculate_size(mask, compacted);
/* Calculate the resulting user state size */
On 3/23/22 13:55, Thomas Gleixner wrote: > --- a/arch/x86/kernel/fpu/xstate.c > +++ b/arch/x86/kernel/fpu/xstate.c > @@ -1625,6 +1625,8 @@ static int __xstate_request_perm(u64 per > > /* Calculate the resulting kernel state size */ > mask = permitted | requested; > + /* Take supervisor states into account */ > + mask |= xfeatures_mask_supervisor(); > ksize = xstate_calculate_size(mask, compacted); > This should be only added in for the !guest case. Paolo
On Wed, Mar 23 2022 at 15:24, Paolo Bonzini wrote: > On 3/23/22 13:55, Thomas Gleixner wrote: >> --- a/arch/x86/kernel/fpu/xstate.c >> +++ b/arch/x86/kernel/fpu/xstate.c >> @@ -1625,6 +1625,8 @@ static int __xstate_request_perm(u64 per >> >> /* Calculate the resulting kernel state size */ >> mask = permitted | requested; >> + /* Take supervisor states into account */ >> + mask |= xfeatures_mask_supervisor(); >> ksize = xstate_calculate_size(mask, compacted); >> > > This should be only added in for the !guest case. Yes, I figured that out already :)
On Wed, Mar 23 2022 at 18:07, Thomas Gleixner wrote:
> On Wed, Mar 23 2022 at 15:24, Paolo Bonzini wrote:
>> On 3/23/22 13:55, Thomas Gleixner wrote:
>>> --- a/arch/x86/kernel/fpu/xstate.c
>>> +++ b/arch/x86/kernel/fpu/xstate.c
>>> @@ -1625,6 +1625,8 @@ static int __xstate_request_perm(u64 per
>>>
>>> /* Calculate the resulting kernel state size */
>>> mask = permitted | requested;
>>> + /* Take supervisor states into account */
>>> + mask |= xfeatures_mask_supervisor();
>>> ksize = xstate_calculate_size(mask, compacted);
>>>
>>
>> This should be only added in for the !guest case.
>
> Yes, I figured that out already :)
Hrm, that has more consequences vs. the buffer conversion functions. Let
me stare some more.
Thanks,
tglx
© 2016 - 2026 Red Hat, Inc.