[PATCH v3 2/2] x86/pkeys: Set XSTATE_BV[PKRU] to 1 so that PKRU is XRSTOR'd correctly

Aruna Ramakrishna posted 2 patches 1 year, 2 months ago
[PATCH v3 2/2] x86/pkeys: Set XSTATE_BV[PKRU] to 1 so that PKRU is XRSTOR'd correctly
Posted by Aruna Ramakrishna 1 year, 2 months ago
PKRU value is not XRSTOR'd from the XSAVE area if the corresponding
XSTATE_BV[i] bit is 0. A wrpkru(0) sets XSTATE_BV[PKRU] to 0 on AMD
systems, which means the PKRU value updated on the sigframe later on,
in update_pkru_in_sigframe(), is ignored.

To make this behavior consistent across Intel and AMD systems, and to
ensure that the PKRU value updated on the sigframe is always restored
correctly, explicitly set XSTATE_BV[PKRU] to 1.

Fixes: 70044df250d0 ("x86/pkeys: Update PKRU to enable all pkeys before XSAVE")

Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Suggested-by: Rudi Horn <rudi.horn@oracle.com>
---
 arch/x86/kernel/fpu/xstate.h | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 6b2924fbe5b8..aa16f1a1bbcf 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -72,10 +72,22 @@ static inline u64 xfeatures_mask_independent(void)
 /*
  * Update the value of PKRU register that was already pushed onto the signal frame.
  */
-static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
+static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u64 mask, u32 pkru)
 {
+	u64 xstate_bv;
+	int err;
+
 	if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE)))
 		return 0;
+
+	/* Mark PKRU as in-use so that it is restored correctly. */
+	xstate_bv = (mask & xfeatures_in_use()) | XFEATURE_MASK_PKRU;
+
+	err =  __put_user(xstate_bv, &buf->header.xfeatures);
+	if (err)
+		return err;
+
+	/* Update PKRU value in the userspace xsave buffer. */
 	return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU));
 }
 
@@ -292,7 +304,7 @@ static inline int xsave_to_user_sigframe(struct xregs_state __user *buf, u32 pkr
 	clac();
 
 	if (!err)
-		err = update_pkru_in_sigframe(buf, pkru);
+		err = update_pkru_in_sigframe(buf, mask, pkru);
 
 	return err;
 }
-- 
2.43.5
Re: [PATCH v3 2/2] x86/pkeys: Set XSTATE_BV[PKRU] to 1 so that PKRU is XRSTOR'd correctly
Posted by Dave Hansen 1 year, 2 months ago
On 11/19/24 09:45, Aruna Ramakrishna wrote:
> PKRU value is not XRSTOR'd from the XSAVE area if the corresponding
> XSTATE_BV[i] bit is 0. A wrpkru(0) sets XSTATE_BV[PKRU] to 0 on AMD
> systems, which means the PKRU value updated on the sigframe later on,
> in update_pkru_in_sigframe(), is ignored.
> 
> To make this behavior consistent across Intel and AMD systems, and to
> ensure that the PKRU value updated on the sigframe is always restored
> correctly, explicitly set XSTATE_BV[PKRU] to 1.
> 
> Fixes: 70044df250d0 ("x86/pkeys: Update PKRU to enable all pkeys before XSAVE")
> 
> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
> Suggested-by: Rudi Horn <rudi.horn@oracle.com>

I still think this changelog needs quite a bit of work for someone to
make sense of this if they read it in a year.  Perhaps:

--

When XSTATE_BV[i] is 0, and XRSTOR attempts to restore state component
'i' it ignores any value in the XSAVE buffer and instead restores the
state component's init value.

This means that if XSAVE writes XSTATE_BV[PKRU]=0 then XRSTOR will
ignore the value that update_pkru_in_sigframe() writes to the XSAVE buffer.

XSTATE_BV[PKRU] only gets written as 0 if PKRU is in its init state. On
Intel CPUs, basically never happens because the kernel usually
overwrites the init value (aside: this is why we didn't notice this bug
until now). But on AMD, the init tracker is more aggressive and will
track PKRU as being in its init state upon any wrpkru(0x0).
Unfortunately, sig_prepare_pkru() does just that: wrpkru(0x0).

To fix this, always overwrite the sigframe XSTATE_BV with a value that
has XSTATE_BV[PKRU]==1.  This ensures that XRSTOR will not ignore what
update_pkru_in_sigframe() wrote.

The problematic sequence of events is something like this:

Userspace does:
	* wrpkru(0xffff0000) (or whatever)
	* Hardware sets: XINUSE[PKRU]=1
Signal happens, kernel is entered:
	* sig_prepare_pkru() => wrpkru(0x00000000)
	* Hardware sets: XINUSE[PKRU]=0 (aggressive AMD init tracker)
	* XSAVE writes most of XSAVE buffer, including
	  XSTATE_BV[PKRU]=XINUSE[PKRU]=0
	* update_pkru_in_sigframe() overwrite PKRU in XSAVE buffer
... signal handling
	* XRSTOR sees XSTATE_BV[PKRU]==0, ignores just-written value
	  from update_pkru_in_sigframe()

But otherwise, I think the code is fine:

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

I can fix up the changelog at application time if everyone is OK with it.
Re: [PATCH v3 2/2] x86/pkeys: Set XSTATE_BV[PKRU] to 1 so that PKRU is XRSTOR'd correctly
Posted by Aruna Ramakrishna 1 year, 2 months ago

> On Nov 22, 2024, at 4:10 PM, Dave Hansen <dave.hansen@intel.com> wrote:
> 
> On 11/19/24 09:45, Aruna Ramakrishna wrote:
>> PKRU value is not XRSTOR'd from the XSAVE area if the corresponding
>> XSTATE_BV[i] bit is 0. A wrpkru(0) sets XSTATE_BV[PKRU] to 0 on AMD
>> systems, which means the PKRU value updated on the sigframe later on,
>> in update_pkru_in_sigframe(), is ignored.
>> 
>> To make this behavior consistent across Intel and AMD systems, and to
>> ensure that the PKRU value updated on the sigframe is always restored
>> correctly, explicitly set XSTATE_BV[PKRU] to 1.
>> 
>> Fixes: 70044df250d0 ("x86/pkeys: Update PKRU to enable all pkeys before XSAVE")
>> 
>> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
>> Suggested-by: Rudi Horn <rudi.horn@oracle.com>
> 
> I still think this changelog needs quite a bit of work for someone to
> make sense of this if they read it in a year.  Perhaps:
> 
> --
> 
> When XSTATE_BV[i] is 0, and XRSTOR attempts to restore state component
> 'i' it ignores any value in the XSAVE buffer and instead restores the
> state component's init value.
> 
> This means that if XSAVE writes XSTATE_BV[PKRU]=0 then XRSTOR will
> ignore the value that update_pkru_in_sigframe() writes to the XSAVE buffer.
> 
> XSTATE_BV[PKRU] only gets written as 0 if PKRU is in its init state. On
> Intel CPUs, basically never happens because the kernel usually
> overwrites the init value (aside: this is why we didn't notice this bug
> until now). But on AMD, the init tracker is more aggressive and will
> track PKRU as being in its init state upon any wrpkru(0x0).
> Unfortunately, sig_prepare_pkru() does just that: wrpkru(0x0).
> 
> To fix this, always overwrite the sigframe XSTATE_BV with a value that
> has XSTATE_BV[PKRU]==1.  This ensures that XRSTOR will not ignore what
> update_pkru_in_sigframe() wrote.
> 
> The problematic sequence of events is something like this:
> 
> Userspace does:
> * wrpkru(0xffff0000) (or whatever)
> * Hardware sets: XINUSE[PKRU]=1
> Signal happens, kernel is entered:
> * sig_prepare_pkru() => wrpkru(0x00000000)
> * Hardware sets: XINUSE[PKRU]=0 (aggressive AMD init tracker)
> * XSAVE writes most of XSAVE buffer, including
>  XSTATE_BV[PKRU]=XINUSE[PKRU]=0
> * update_pkru_in_sigframe() overwrite PKRU in XSAVE buffer
> ... signal handling
> * XRSTOR sees XSTATE_BV[PKRU]==0, ignores just-written value
>  from update_pkru_in_sigframe()
> 
> But otherwise, I think the code is fine:
> 
> Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
> 
> I can fix up the changelog at application time if everyone is OK with it.

Thank you Dave. I agree, this reads better.

I’m a little unclear if I should send out a v4 with the updated changelog.

Thanks,
Aruna


Re: [PATCH v3 2/2] x86/pkeys: Set XSTATE_BV[PKRU] to 1 so that PKRU is XRSTOR'd correctly
Posted by Dave Hansen 1 year, 2 months ago
On 12/2/24 10:33, Aruna Ramakrishna wrote:
> I’m a little unclear if I should send out a v4 with the updated
> changelog.

No need, but thanks for clarifying.  I'm planning on applying the patch
with the fixed up changelog shortly.