In the case where a user thread sets up an alternate signal stack
protected by the default pkey (i.e. pkey 0), while the thread's stack
is protected by a non-zero pkey, both these pkeys have to be enabled in
the PKRU register for the signal to be delivered to the application
correctly. However, the PKRU value restored after handling the signal
must not enable this extra pkey (i.e. pkey 0), so the PKRU value on the
on the sigframe should be overwritten with the user-defined value.
Add helper functions that will update PKRU value on the sigframe after
XSAVE. These functions will be called in a later patch; this patch does not
change any behavior as yet.
Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
---
arch/x86/kernel/fpu/signal.c | 11 +++++++++++
arch/x86/kernel/fpu/xstate.c | 13 +++++++++++++
arch/x86/kernel/fpu/xstate.h | 1 +
arch/x86/kernel/signal.c | 15 +++++++++++++++
4 files changed, 40 insertions(+)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 2b3b9e140dd4..b0b254b931fd 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -63,6 +63,16 @@ static inline bool check_xstate_in_sigframe(struct fxregs_state __user *fxbuf,
return true;
}
+/*
+ * Update the value of PKRU register that was already pushed onto the signal frame.
+ */
+static inline int update_pkru_in_sigframe(struct xregs_state __user *buf, u32 pkru)
+{
+ if (unlikely(!cpu_feature_enabled(X86_FEATURE_OSPKE)))
+ return 0;
+ return __put_user(pkru, (unsigned int __user *)get_xsave_addr_user(buf, XFEATURE_PKRU));
+}
+
/*
* Signal frame handlers.
*/
@@ -160,6 +170,7 @@ static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pk
{
if (use_xsave())
return xsave_to_user_sigframe(buf);
+
if (use_fxsr())
return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
else
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 33a214b1a4ce..e257478a0962 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -992,6 +992,19 @@ void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr)
return __raw_xsave_addr(xsave, xfeature_nr);
}
+/*
+ * Given an xstate feature nr, calculate where in the xsave buffer the state is.
+ * The xsave buffer should be in standard format, not compacted (e.g. user mode
+ * signal frames).
+ */
+void __user *get_xsave_addr_user(struct xregs_state __user *xsave, int xfeature_nr)
+{
+ if (WARN_ON_ONCE(!xfeature_enabled(xfeature_nr)))
+ return NULL;
+
+ return (void __user *)xsave + xstate_offsets[xfeature_nr];
+}
+
#ifdef CONFIG_ARCH_HAS_PKEYS
/*
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index 19ca623ffa2a..236742db69fa 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -55,6 +55,7 @@ extern void fpu__init_cpu_xstate(void);
extern void fpu__init_system_xstate(unsigned int legacy_size);
extern void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
+extern void __user *get_xsave_addr_user(struct xregs_state *xsave, int xfeature_nr);
static inline u64 xfeatures_mask_supervisor(void)
{
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 94b894437327..3fa66b2fe753 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -224,6 +224,21 @@ setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs, u32 pkru)
}
}
+/*
+ * Enable init_pkru pkey as well as the user-defined pkey to ensure that both
+ * the current stack and the alternate signal stack are writeable.
+ * Note: this function assumes that the alternate signal stack is accessible
+ * with the init_pkru_value. If the sigaltstack is protected by a different,
+ * non-zero pkey, then the application will segfault.
+ */
+static inline u32 sig_prepare_pkru(void)
+{
+ u32 orig_pkru = read_pkru();
+
+ write_pkru(orig_pkru & pkru_get_init_value());
+ return orig_pkru;
+}
+
static void
handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
--
2.39.3
The orig_pkru & init_pkru_value is quite difficult to understand. case 1> init_pkru: 00 (allow all) orig_pkru all cases => allow all case 2> init_pkru: 01 (disable all) Orig_pkru: allow all 00 => 00 allow all. disable all 01 => 01 disable all. disable write 10 => 00 allow all <--- *** odd *** disable all 11 => 01 disable all case 3> init pkru: 10 (disable write) allow all 00 => 00 allow all. disable all 01 => 00 (allow all) <----*** odd *** disable write 10 => 10 allow all disable all 11 => 10 disable write <--- *** odd *** case 4> init pkru: 11 (disable all) orig_pkru all cases => unchanged. set PKRU(0) seems to be better, easy to understand. In addition, kernel overwrites the PKRU during the signal handleing is a new ABI, it might be the best to add a flag during sigaltstack(), similar to how SS_AUTODISARM is set. > + return orig_pkru; > +} > + -Jeff
> On Jun 10, 2024, at 2:39 PM, jeffxu@chromium.org wrote: > > The orig_pkru & init_pkru_value is quite difficult to understand. > > case 1> init_pkru: 00 (allow all) > orig_pkru all cases => allow all > > case 2> init_pkru: 01 (disable all) > Orig_pkru: > allow all 00 => 00 allow all. > disable all 01 => 01 disable all. > disable write 10 => 00 allow all <--- *** odd *** > disable all 11 => 01 disable all > > case 3> init pkru: 10 (disable write) > allow all 00 => 00 allow all. > disable all 01 => 00 (allow all) <----*** odd *** > disable write 10 => 10 allow all > disable all 11 => 10 disable write <--- *** odd *** > > case 4> init pkru: 11 (disable all) > orig_pkru all cases => unchanged. > > set PKRU(0) seems to be better, easy to understand. > I’m not sure I follow. The default init_pkru is 0x55555554 (enable only pkey 0). Let’s assume the application sets up PKRU = 0x55555545 (i.e. enable only pkey 2). We want to set up the PKRU to enable both pkey 0 and pkey 2, before the XSAVE, so that both the current stack as well as the alternate signal stack are writable. So with write_pkru(orig_pkru & pkru_get_init_value()); It changes PKRU to 0x55555544 - enabling both pkey 0 and pkey 2. After the XSAVE, it calls update_pkru_in_sigframe(), which overwrites this (new) PKRU saved on the sigframe with orig_pkru, which is 0x55555545 in this example. Setting PKRU to 0 would be simpler, it would enable all pkeys - 0 through 15 - which, as Thomas pointed out, seems unnecessary. The application needs the pkey it enabled for access to its own stack, and we need to enable pkey 0 under the hood to enable access to the alternate signal stack. Thanks, Aruna
On Tue, Jun 11, 2024 at 7:05 AM Aruna Ramakrishna <aruna.ramakrishna@oracle.com> wrote: > > > > > On Jun 10, 2024, at 2:39 PM, jeffxu@chromium.org wrote: > > > > The orig_pkru & init_pkru_value is quite difficult to understand. > > > > case 1> init_pkru: 00 (allow all) > > orig_pkru all cases => allow all > > > > case 2> init_pkru: 01 (disable all) > > Orig_pkru: > > allow all 00 => 00 allow all. > > disable all 01 => 01 disable all. > > disable write 10 => 00 allow all <--- *** odd *** > > disable all 11 => 01 disable all > > > > case 3> init pkru: 10 (disable write) > > allow all 00 => 00 allow all. > > disable all 01 => 00 (allow all) <----*** odd *** > > disable write 10 => 10 allow all > > disable all 11 => 10 disable write <--- *** odd *** > > > > case 4> init pkru: 11 (disable all) > > orig_pkru all cases => unchanged. > > > > set PKRU(0) seems to be better, easy to understand. > > > > I’m not sure I follow. > > The default init_pkru is 0x55555554 (enable only pkey 0). Let’s assume the application > sets up PKRU = 0x55555545 (i.e. enable only pkey 2). We want to set up the PKRU > to enable both pkey 0 and pkey 2, before the XSAVE, so that both the current stack as > well as the alternate signal stack are writable. > > So with > write_pkru(orig_pkru & pkru_get_init_value()); > > It changes PKRU to 0x55555544 - enabling both pkey 0 and pkey 2. > Consider below examples: 1> The default init_pkru is 0x55555554 (pkey 0 has all access, 1-15 disable all) and thread has PKRU of 0xaaaaaaa8 (pkey 0 has all access, 1-15 disable write) init_pkru & curr_pkru will have 0x0 If altstack is protected by pkey 1, your code will change PKRU to 0, so the kernel is able to read/write to altstack. 2> However when the thread's PKRU disable all access to 1-15: The default init_pkru is 0x55555554 (pkey 0 has all access, 1-15 disable all) and thread has PKRU of 0x5555554 (pkey 0 has all access, 1-15 disable all) init_pkru & curr_pkru will have 0x55555554 If altstack is protected by pkey 1, kernel doesn't change PKRU, so still not able to access altstack. 3> This algorithm is stranger if inti_pkru is configured differently: The init_pkru is 0xaaaaaaa8 (pkey 0 has all access, and 1-15 disables write.) and thread has PKRU of 0x55555554 (pkey 0 has all access, 1-15 disable all) init_pkru & curr_pkru will have 0x0 (0-15 has all access). Overall I think this is a confusing algorithm to decide the new PKRU to use. > After the XSAVE, it calls update_pkru_in_sigframe(), which overwrites this (new) > PKRU saved on the sigframe with orig_pkru, which is 0x55555545 in this example. > > Setting PKRU to 0 would be simpler, it would enable all pkeys - 0 through 15 - which, > as Thomas pointed out, seems unnecessary. The application needs the pkey it > enabled for access to its own stack, and we need to enable pkey 0 under the hood > to enable access to the alternate signal stack. > I think you are referring to Thomas's comments in V3, copy here for ease of response: >User space resumes with the default PKRU value and the first thing user >space does when entering the signal handler is to push stuff on the >signal stack. ... > If user space protects the task stack or the sigalt stack with a key > which is not in init_pkru_value then it does not matter at all whether > it dies in handle_signal() or later when returning to user space, no? The userspace could register a custom handler (written in assembly) to change PKRU and allow access to the stack, this could be written in such that it is before pushing stuff to the stack. So all it requires is that the kernel doesn't SIGSEGV when preparing the signal frame in sigaltstack, this is where PKRU=0 inside the kernel path helps. Even today, without patch, one can already do following: 1> use PKEY 1 to protect sigaltstack 3> let the thread have all access to PKEY 1 3> send a signal to the thread, kernel will save PKRU to the altstack correctly. 4> kernel set init_pkur before hands over control to userspace 5> userspace set PKRU to allow access to PKEY 1 as the first thing to do. 6> on sig_return, threads have PKRU restored correctly from the value in sigframe. That is why I consider "let kernel to modify PKRU to give access to altstack" is an ABI change, i.e. compare with current behavior that kernel deny such access. It might be good to consider adding a new flag in ss_flags. Thanks. -Jeff
© 2016 - 2026 Red Hat, Inc.