If the alternate signal stack is protected by a different pkey than the
current execution stack, copying xsave data to the sigaltstack will fail
if its pkey is not enabled in the PKRU register.
We do not know which pkey was used by the application for the altstack,
so enable all pkeys before xsave.
But this updated PKRU value is also pushed onto the sigframe, which
means the register value restored from sigcontext will be different from
the user-defined one, which is unexpected. Fix that by overwriting the
PKRU value on the sigframe with the original, user-defined PKRU.
Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
---
arch/x86/kernel/fpu/signal.c | 11 +++++++++--
arch/x86/kernel/signal.c | 12 ++++++++++--
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 931c5469d7f3..1065ab995305 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -168,8 +168,15 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
{
- if (use_xsave())
- return xsave_to_user_sigframe(buf);
+ int err = 0;
+
+ if (use_xsave()) {
+ err = xsave_to_user_sigframe(buf);
+ if (!err)
+ err = update_pkru_in_sigframe(buf, pkru);
+ return err;
+ }
+
if (use_fxsr())
return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
else
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 9dc77ad03a0e..5f441039b572 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -102,7 +102,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
unsigned long math_size = 0;
unsigned long sp = regs->sp;
unsigned long buf_fx = 0;
- u32 pkru = read_pkru();
+ u32 pkru;
/* redzone */
if (!ia32_frame)
@@ -157,9 +157,17 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
return (void __user *)-1L;
}
+ /* Update PKRU to enable access to the alternate signal stack. */
+ pkru = sig_prepare_pkru();
/* save i387 and extended state */
- if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
+ if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru)) {
+ /*
+ * Restore PKRU to the original, user-defined value; disable
+ * extra pkeys enabled for the alternate signal stack, if any.
+ */
+ write_pkru(pkru);
return (void __user *)-1L;
+ }
return (void __user *)sp;
}
--
2.39.3
Re commit 70044df250d022572e26cd301bddf75eac1fe50e: https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com/ > If the alternate signal stack is protected by a different pkey than the > current execution stack, copying xsave data to the sigaltstack will fail > if its pkey is not enabled in the PKRU register. > > We do not know which pkey was used by the application for the altstack, > so enable all pkeys before xsave. > > But this updated PKRU value is also pushed onto the sigframe, which > means the register value restored from sigcontext will be different from > the user-defined one, which is unexpected. Fix that by overwriting the > PKRU value on the sigframe with the original, user-defined PKRU. Hi, This unfortunatly seems to be broken for rseq user-space writes. If the signal is caused by rseq struct being inaccessible due to PKEYs, we try to write to rseq again at setup_rt_frame->rseq_signal_deliver, which happens _before_ sig_prepare_pkru and won't succeed (PKEY is still inaccessible, hard kills the process). Any PKEY sandbox would want to restict untrusted access to rseq as well (otherwise allows easy sandbox escapes). If we do sig_prepare_pkru before rseq_signal_deliver (and generally before any copy_to_userpace), then user-space handler gets SIGSEGV and could unregister rseq and retry. However, I am not sure if it's the best solution performance- and complexity-wise (for user-space). A better solution may be to change __rseq_handle_notify_resume to temporary switch to default PKEY if user accesses fail. Rseq is similar to signals in this respect. Since rseq updates happen asynchronously with respect to user-space control flow, if a program uses rseq and ever makes rseq inaccessible with PKEYs, it's in trouble and will be randomly killed. Since rseq updates are asynchronous as signals, they shouldn't assume PKEY is set to default value that allows access to rseq descriptor. Thoughts?
On 2/4/25 02:01, Dmitry Vyukov wrote: >> But this updated PKRU value is also pushed onto the sigframe, which >> means the register value restored from sigcontext will be different from >> the user-defined one, which is unexpected. Fix that by overwriting the >> PKRU value on the sigframe with the original, user-defined PKRU. > > This unfortunatly seems to be broken for rseq user-space writes. Hi Dmitry, Are you saying that Aruna's patch caused a regression in the rseq code? Or that the rseq code has a separate but similar issue to the one that Aruna's patch fixed?
On Thu, 6 Feb 2025 at 19:21, Dave Hansen <dave.hansen@intel.com> wrote: > > On 2/4/25 02:01, Dmitry Vyukov wrote: > >> But this updated PKRU value is also pushed onto the sigframe, which > >> means the register value restored from sigcontext will be different from > >> the user-defined one, which is unexpected. Fix that by overwriting the > >> PKRU value on the sigframe with the original, user-defined PKRU. > > > > This unfortunatly seems to be broken for rseq user-space writes. > > Hi Dmitry, > > Are you saying that Aruna's patch caused a regression in the rseq code? > Or that the rseq code has a separate but similar issue to the one that > Aruna's patch fixed? No, no regression. Just another issue for real PKEY uses.
On Tue, 4 Feb 2025 at 11:02, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> Re commit 70044df250d022572e26cd301bddf75eac1fe50e:
> https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com/
>
> > If the alternate signal stack is protected by a different pkey than the
> > current execution stack, copying xsave data to the sigaltstack will fail
> > if its pkey is not enabled in the PKRU register.
> >
> > We do not know which pkey was used by the application for the altstack,
> > so enable all pkeys before xsave.
> >
> > But this updated PKRU value is also pushed onto the sigframe, which
> > means the register value restored from sigcontext will be different from
> > the user-defined one, which is unexpected. Fix that by overwriting the
> > PKRU value on the sigframe with the original, user-defined PKRU.
>
> Hi,
>
> This unfortunatly seems to be broken for rseq user-space writes.
> If the signal is caused by rseq struct being inaccessible due to PKEYs,
> we try to write to rseq again at setup_rt_frame->rseq_signal_deliver,
> which happens _before_ sig_prepare_pkru and won't succeed
> (PKEY is still inaccessible, hard kills the process).
> Any PKEY sandbox would want to restict untrusted access to rseq
> as well (otherwise allows easy sandbox escapes).
>
> If we do sig_prepare_pkru before rseq_signal_deliver (and generally
> before any copy_to_userpace), then user-space handler gets SIGSEGV
> and could unregister rseq and retry.
>
> However, I am not sure if it's the best solution performance-
> and complexity-wise (for user-space). A better solution may be to
> change __rseq_handle_notify_resume to temporary switch to default
> PKEY if user accesses fail.
> Rseq is similar to signals in this respect. Since rseq updates
> happen asynchronously with respect to user-space control flow,
> if a program uses rseq and ever makes rseq inaccessible with PKEYs,
> it's in trouble and will be randomly killed.
> Since rseq updates are asynchronous as signals, they shouldn't
> assume PKEY is set to default value that allows access
> to rseq descriptor.
>
> Thoughts?
Another question about switching to pkey 0 and not switching back on all errors.
Can it create security problems by allowing sandboxed code to escape?
Namely, here:
+ /* Update PKRU to enable access to the alternate signal stack. */
+ pkru = sig_prepare_pkru();
/* save i387 and extended state */
- if (!copy_fpstate_to_sigframe(*fpstate, (void __user
*)buf_fx, math_size, pkru))
+ if (!copy_fpstate_to_sigframe(*fpstate, (void __user
*)buf_fx, math_size, pkru)) {
+ /*
+ * Restore PKRU to the original, user-defined value; disable
+ * extra pkeys enabled for the alternate signal stack, if any.
+ */
+ write_pkru(pkru);
return (void __user *)-1L;
+ }
we restore to the original pkru on this error, but there are other
failure paths later, e.g.:
https://elixir.bootlin.com/linux/v6.13.1/source/arch/x86/kernel/signal_64.c#L199
on these errors paths we will eventually get here to force_sig(SIGSEGV):
https://elixir.bootlin.com/linux/v6.13.1/source/kernel/signal.c#L1685
which just sends SIGSEGV and is not fatal.
So hypothetically, if there is a SIGUSR1 handler without SA_ONSTACK,
which fails, but SIGSEGV handler has SA_ONSTACK and doesn't fail, this
will result in resetting PKRU to 0 without restoring it back.
Or sandboxed code somehow arranges for the first signal setup for other reasons.
This is, of course, a tricky attack vector, and the program must
resume after SIGSEGV somehow (there are some such cases, e.g. mmaping
something lazily and retrying), but with security you never know how
creative an attacker can get and what you are missing that they are
not missing. So it looks safer to restore to the original PKRU on all
errors.
Hi Dmitry
On Thu, Feb 6, 2025 at 10:06 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Tue, 4 Feb 2025 at 11:02, Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > Re commit 70044df250d022572e26cd301bddf75eac1fe50e:
> > https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com/
> >
> > > If the alternate signal stack is protected by a different pkey than the
> > > current execution stack, copying xsave data to the sigaltstack will fail
> > > if its pkey is not enabled in the PKRU register.
> > >
> > > We do not know which pkey was used by the application for the altstack,
> > > so enable all pkeys before xsave.
> > >
> > > But this updated PKRU value is also pushed onto the sigframe, which
> > > means the register value restored from sigcontext will be different from
> > > the user-defined one, which is unexpected. Fix that by overwriting the
> > > PKRU value on the sigframe with the original, user-defined PKRU.
> >
> > Hi,
> >
> > This unfortunatly seems to be broken for rseq user-space writes.
> > If the signal is caused by rseq struct being inaccessible due to PKEYs,
> > we try to write to rseq again at setup_rt_frame->rseq_signal_deliver,
> > which happens _before_ sig_prepare_pkru and won't succeed
> > (PKEY is still inaccessible, hard kills the process).
> > Any PKEY sandbox would want to restict untrusted access to rseq
> > as well (otherwise allows easy sandbox escapes).
> >
> > If we do sig_prepare_pkru before rseq_signal_deliver (and generally
> > before any copy_to_userpace), then user-space handler gets SIGSEGV
> > and could unregister rseq and retry.
> >
> > However, I am not sure if it's the best solution performance-
> > and complexity-wise (for user-space). A better solution may be to
> > change __rseq_handle_notify_resume to temporary switch to default
> > PKEY if user accesses fail.
> > Rseq is similar to signals in this respect. Since rseq updates
> > happen asynchronously with respect to user-space control flow,
> > if a program uses rseq and ever makes rseq inaccessible with PKEYs,
> > it's in trouble and will be randomly killed.
> > Since rseq updates are asynchronous as signals, they shouldn't
> > assume PKEY is set to default value that allows access
> > to rseq descriptor.
> >
> > Thoughts?
>
> Another question about switching to pkey 0 and not switching back on all errors.
> Can it create security problems by allowing sandboxed code to escape?
>
Sandbox escape would be bad , we wouldn't want the calling thread to
get PKRU = 0 in any error path.
> Namely, here:
>
> + /* Update PKRU to enable access to the alternate signal stack. */
> + pkru = sig_prepare_pkru();
> /* save i387 and extended state */
> - if (!copy_fpstate_to_sigframe(*fpstate, (void __user
> *)buf_fx, math_size, pkru))
> + if (!copy_fpstate_to_sigframe(*fpstate, (void __user
> *)buf_fx, math_size, pkru)) {
> + /*
> + * Restore PKRU to the original, user-defined value; disable
> + * extra pkeys enabled for the alternate signal stack, if any.
> + */
> + write_pkru(pkru);
> return (void __user *)-1L;
> + }
>
> we restore to the original pkru on this error, but there are other
> failure paths later, e.g.:
> https://elixir.bootlin.com/linux/v6.13.1/source/arch/x86/kernel/signal_64.c#L199
>
> on these errors paths we will eventually get here to force_sig(SIGSEGV):
> https://elixir.bootlin.com/linux/v6.13.1/source/kernel/signal.c#L1685
> which just sends SIGSEGV and is not fatal.
>
> So hypothetically, if there is a SIGUSR1 handler without SA_ONSTACK,
> which fails, but SIGSEGV handler has SA_ONSTACK and doesn't fail, this
> will result in resetting PKRU to 0 without restoring it back.
> Or sandboxed code somehow arranges for the first signal setup for other reasons.
>
Can you walk me through the setup and steps that led to this situation?
> This is, of course, a tricky attack vector, and the program must
> resume after SIGSEGV somehow (there are some such cases, e.g. mmaping
> something lazily and retrying), but with security you never know how
> creative an attacker can get and what you are missing that they are
> not missing. So it looks safer to restore to the original PKRU on all
> errors.
Thanks
-Jeff
On Mon, 10 Feb 2025 at 23:46, Jeff Xu <jeffxu@chromium.org> wrote:
>
> Hi Dmitry
>
> On Thu, Feb 6, 2025 at 10:06 AM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Tue, 4 Feb 2025 at 11:02, Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > Re commit 70044df250d022572e26cd301bddf75eac1fe50e:
> > > https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com/
> > >
> > > > If the alternate signal stack is protected by a different pkey than the
> > > > current execution stack, copying xsave data to the sigaltstack will fail
> > > > if its pkey is not enabled in the PKRU register.
> > > >
> > > > We do not know which pkey was used by the application for the altstack,
> > > > so enable all pkeys before xsave.
> > > >
> > > > But this updated PKRU value is also pushed onto the sigframe, which
> > > > means the register value restored from sigcontext will be different from
> > > > the user-defined one, which is unexpected. Fix that by overwriting the
> > > > PKRU value on the sigframe with the original, user-defined PKRU.
> > >
> > > Hi,
> > >
> > > This unfortunatly seems to be broken for rseq user-space writes.
> > > If the signal is caused by rseq struct being inaccessible due to PKEYs,
> > > we try to write to rseq again at setup_rt_frame->rseq_signal_deliver,
> > > which happens _before_ sig_prepare_pkru and won't succeed
> > > (PKEY is still inaccessible, hard kills the process).
> > > Any PKEY sandbox would want to restict untrusted access to rseq
> > > as well (otherwise allows easy sandbox escapes).
> > >
> > > If we do sig_prepare_pkru before rseq_signal_deliver (and generally
> > > before any copy_to_userpace), then user-space handler gets SIGSEGV
> > > and could unregister rseq and retry.
> > >
> > > However, I am not sure if it's the best solution performance-
> > > and complexity-wise (for user-space). A better solution may be to
> > > change __rseq_handle_notify_resume to temporary switch to default
> > > PKEY if user accesses fail.
> > > Rseq is similar to signals in this respect. Since rseq updates
> > > happen asynchronously with respect to user-space control flow,
> > > if a program uses rseq and ever makes rseq inaccessible with PKEYs,
> > > it's in trouble and will be randomly killed.
> > > Since rseq updates are asynchronous as signals, they shouldn't
> > > assume PKEY is set to default value that allows access
> > > to rseq descriptor.
> > >
> > > Thoughts?
> >
> > Another question about switching to pkey 0 and not switching back on all errors.
> > Can it create security problems by allowing sandboxed code to escape?
> >
> Sandbox escape would be bad , we wouldn't want the calling thread to
> get PKRU = 0 in any error path.
>
> > Namely, here:
> >
> > + /* Update PKRU to enable access to the alternate signal stack. */
> > + pkru = sig_prepare_pkru();
> > /* save i387 and extended state */
> > - if (!copy_fpstate_to_sigframe(*fpstate, (void __user
> > *)buf_fx, math_size, pkru))
> > + if (!copy_fpstate_to_sigframe(*fpstate, (void __user
> > *)buf_fx, math_size, pkru)) {
> > + /*
> > + * Restore PKRU to the original, user-defined value; disable
> > + * extra pkeys enabled for the alternate signal stack, if any.
> > + */
> > + write_pkru(pkru);
> > return (void __user *)-1L;
> > + }
> >
> > we restore to the original pkru on this error, but there are other
> > failure paths later, e.g.:
> > https://elixir.bootlin.com/linux/v6.13.1/source/arch/x86/kernel/signal_64.c#L199
> >
> > on these errors paths we will eventually get here to force_sig(SIGSEGV):
> > https://elixir.bootlin.com/linux/v6.13.1/source/kernel/signal.c#L1685
> > which just sends SIGSEGV and is not fatal.
> >
> > So hypothetically, if there is a SIGUSR1 handler without SA_ONSTACK,
> > which fails, but SIGSEGV handler has SA_ONSTACK and doesn't fail, this
> > will result in resetting PKRU to 0 without restoring it back.
> > Or sandboxed code somehow arranges for the first signal setup for other reasons.
> >
> Can you walk me through the setup and steps that led to this situation?
I don't anything more concrete steps, just the observation that PKRU
is restored only on 1 out of N error paths.
> > This is, of course, a tricky attack vector, and the program must
> > resume after SIGSEGV somehow (there are some such cases, e.g. mmaping
> > something lazily and retrying), but with security you never know how
> > creative an attacker can get and what you are missing that they are
> > not missing. So it looks safer to restore to the original PKRU on all
> > errors.
>
> Thanks
> -Jeff
On Thu, Aug 1, 2024 at 11:13 PM Aruna Ramakrishna
<aruna.ramakrishna@oracle.com> wrote:
>
> If the alternate signal stack is protected by a different pkey than the
> current execution stack, copying xsave data to the sigaltstack will fail
> if its pkey is not enabled in the PKRU register.
>
> We do not know which pkey was used by the application for the altstack,
> so enable all pkeys before xsave.
>
> But this updated PKRU value is also pushed onto the sigframe, which
> means the register value restored from sigcontext will be different from
> the user-defined one, which is unexpected. Fix that by overwriting the
> PKRU value on the sigframe with the original, user-defined PKRU.
>
> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
> ---
> arch/x86/kernel/fpu/signal.c | 11 +++++++++--
> arch/x86/kernel/signal.c | 12 ++++++++++--
> 2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index 931c5469d7f3..1065ab995305 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -168,8 +168,15 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
>
> static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
> {
> - if (use_xsave())
> - return xsave_to_user_sigframe(buf);
> + int err = 0;
> +
> + if (use_xsave()) {
> + err = xsave_to_user_sigframe(buf);
> + if (!err)
> + err = update_pkru_in_sigframe(buf, pkru);
> + return err;
> + }
> +
> if (use_fxsr())
> return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
> else
> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> index 9dc77ad03a0e..5f441039b572 100644
> --- a/arch/x86/kernel/signal.c
> +++ b/arch/x86/kernel/signal.c
> @@ -102,7 +102,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> unsigned long math_size = 0;
> unsigned long sp = regs->sp;
> unsigned long buf_fx = 0;
> - u32 pkru = read_pkru();
> + u32 pkru;
>
> /* redzone */
> if (!ia32_frame)
> @@ -157,9 +157,17 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> return (void __user *)-1L;
> }
>
> + /* Update PKRU to enable access to the alternate signal stack. */
> + pkru = sig_prepare_pkru();
I think, the better place to call sig_prepare_pkru() at begging of the
get_sigframe:
get_sigframe()
{
/* ... */
if (ka->sa.sa_flags & SA_ONSTACK) {
if (sas_ss_flags(sp) == 0) {
// set pkru = 0 <--- set pkru = 0 here.
entering_altstack = true;
}
}
For two reasons:
- We probably don't want all signaling handling going through "PKRU=0"
, this includes the regular stack and nested signaling handling. The
best place is when "entering the altstack". IIUC, this feature only
enabled when sigaltstack() is used.
- The thread might not have read-access to the altstack, so we will
want to make sure that pkru=0 is set before any read to the altstack.
And IIRC, fpu__alloc_mathframe needs read-access to the altstack.
To help with testing, I will send a test case to demo the usage.
(please give me sometime to organize the test code, I'm hoping to send
out before the end of next week)
Also, could you please consider adding a new flag SS_PKEYALTSTACK (see
SS_AUTODISARM for example). Most existing apps that use sigaltstack()
don't use PKEY and don't care about setting PKRU=0, and don't need to
overwrite the sig frame after XSAVE. The flag will limit the impact
of this patch.
Thanks
Best regards,
-Jeff
> /* save i387 and extended state */
> - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
> + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru)) {
> + /*
> + * Restore PKRU to the original, user-defined value; disable
> + * extra pkeys enabled for the alternate signal stack, if any.
> + */
> + write_pkru(pkru);
> return (void __user *)-1L;
> + }
>
> return (void __user *)sp;
> }
> --
> 2.39.3
>
Hi Aruna
On Fri, Aug 9, 2024 at 10:30 AM Jeff Xu <jeffxu@chromium.org> wrote:
>
> On Thu, Aug 1, 2024 at 11:13 PM Aruna Ramakrishna
> <aruna.ramakrishna@oracle.com> wrote:
> >
> > If the alternate signal stack is protected by a different pkey than the
> > current execution stack, copying xsave data to the sigaltstack will fail
> > if its pkey is not enabled in the PKRU register.
> >
> > We do not know which pkey was used by the application for the altstack,
> > so enable all pkeys before xsave.
> >
> > But this updated PKRU value is also pushed onto the sigframe, which
> > means the register value restored from sigcontext will be different from
> > the user-defined one, which is unexpected. Fix that by overwriting the
> > PKRU value on the sigframe with the original, user-defined PKRU.
> >
> > Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
> > ---
> > arch/x86/kernel/fpu/signal.c | 11 +++++++++--
> > arch/x86/kernel/signal.c | 12 ++++++++++--
> > 2 files changed, 19 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> > index 931c5469d7f3..1065ab995305 100644
> > --- a/arch/x86/kernel/fpu/signal.c
> > +++ b/arch/x86/kernel/fpu/signal.c
> > @@ -168,8 +168,15 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
> >
> > static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
> > {
> > - if (use_xsave())
> > - return xsave_to_user_sigframe(buf);
> > + int err = 0;
> > +
> > + if (use_xsave()) {
> > + err = xsave_to_user_sigframe(buf);
> > + if (!err)
> > + err = update_pkru_in_sigframe(buf, pkru);
> > + return err;
> > + }
> > +
> > if (use_fxsr())
> > return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
> > else
> > diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
> > index 9dc77ad03a0e..5f441039b572 100644
> > --- a/arch/x86/kernel/signal.c
> > +++ b/arch/x86/kernel/signal.c
> > @@ -102,7 +102,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> > unsigned long math_size = 0;
> > unsigned long sp = regs->sp;
> > unsigned long buf_fx = 0;
> > - u32 pkru = read_pkru();
> > + u32 pkru;
> >
> > /* redzone */
> > if (!ia32_frame)
> > @@ -157,9 +157,17 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
> > return (void __user *)-1L;
> > }
> >
> > + /* Update PKRU to enable access to the alternate signal stack. */
> > + pkru = sig_prepare_pkru();
> I think, the better place to call sig_prepare_pkru() at begging of the
> get_sigframe:
>
> get_sigframe()
> {
> /* ... */
> if (ka->sa.sa_flags & SA_ONSTACK) {
> if (sas_ss_flags(sp) == 0) {
> // set pkru = 0 <--- set pkru = 0 here.
> entering_altstack = true;
> }
> }
> For two reasons:
> - We probably don't want all signaling handling going through "PKRU=0"
> , this includes the regular stack and nested signaling handling. The
> best place is when "entering the altstack". IIUC, this feature only
> enabled when sigaltstack() is used.
> - The thread might not have read-access to the altstack, so we will
> want to make sure that pkru=0 is set before any read to the altstack.
> And IIRC, fpu__alloc_mathframe needs read-access to the altstack.
> To help with testing, I will send a test case to demo the usage.
Apologies, I remembered it wrong, the fpu__alloc_mathframe doesn't
need read-access to altstack.
But I think the best place to set pkru=0 is still when the stack
entering altstack, as shown above. the reason is: The signal stack can
be nested, i.e. trigger another signaling inside signal handler and we
don't need to set pkru=0 multiple times (only the first time enter
sigaltstack)
> (please give me sometime to organize the test code, I'm hoping to send
> out before the end of next week)
>
the test code is placed at [1]
[1] https://github.com/peaktocreek/pkeytest
> Also, could you please consider adding a new flag SS_PKEYALTSTACK (see
> SS_AUTODISARM for example). Most existing apps that use sigaltstack()
> don't use PKEY and don't care about setting PKRU=0, and don't need to
> overwrite the sig frame after XSAVE. The flag will limit the impact
> of this patch.
>
Thanks
-Jeff
> Thanks
> Best regards,
> -Jeff
>
>
>
>
>
> > /* save i387 and extended state */
> > - if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
> > + if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru)) {
> > + /*
> > + * Restore PKRU to the original, user-defined value; disable
> > + * extra pkeys enabled for the alternate signal stack, if any.
> > + */
> > + write_pkru(pkru);
> > return (void __user *)-1L;
> > + }
> >
> > return (void __user *)sp;
> > }
> > --
> > 2.39.3
> >
The following commit has been merged into the x86/mm branch of tip:
Commit-ID: 70044df250d022572e26cd301bddf75eac1fe50e
Gitweb: https://git.kernel.org/tip/70044df250d022572e26cd301bddf75eac1fe50e
Author: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
AuthorDate: Fri, 02 Aug 2024 06:13:16
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 02 Aug 2024 14:12:21 +02:00
x86/pkeys: Update PKRU to enable all pkeys before XSAVE
If the alternate signal stack is protected by a different PKEY than the
current execution stack, copying XSAVE data to the sigaltstack will fail
if its PKEY is not enabled in the PKRU register.
It's unknown which pkey was used by the application for the altstack, so
enable all PKEYS before XSAVE.
But this updated PKRU value is also pushed onto the sigframe, which
means the register value restored from sigcontext will be different from
the user-defined one, which is incorrect.
Fix that by overwriting the PKRU value on the sigframe with the original,
user-defined PKRU.
Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240802061318.2140081-4-aruna.ramakrishna@oracle.com
---
arch/x86/kernel/fpu/signal.c | 11 +++++++++--
arch/x86/kernel/signal.c | 12 ++++++++++--
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 931c546..1065ab9 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -168,8 +168,15 @@ static inline bool save_xstate_epilog(void __user *buf, int ia32_frame,
static inline int copy_fpregs_to_sigframe(struct xregs_state __user *buf, u32 pkru)
{
- if (use_xsave())
- return xsave_to_user_sigframe(buf);
+ int err = 0;
+
+ if (use_xsave()) {
+ err = xsave_to_user_sigframe(buf);
+ if (!err)
+ err = update_pkru_in_sigframe(buf, pkru);
+ return err;
+ }
+
if (use_fxsr())
return fxsave_to_user_sigframe((struct fxregs_state __user *) buf);
else
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 9dc77ad..5f44103 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -102,7 +102,7 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
unsigned long math_size = 0;
unsigned long sp = regs->sp;
unsigned long buf_fx = 0;
- u32 pkru = read_pkru();
+ u32 pkru;
/* redzone */
if (!ia32_frame)
@@ -157,9 +157,17 @@ get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
return (void __user *)-1L;
}
+ /* Update PKRU to enable access to the alternate signal stack. */
+ pkru = sig_prepare_pkru();
/* save i387 and extended state */
- if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru))
+ if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru)) {
+ /*
+ * Restore PKRU to the original, user-defined value; disable
+ * extra pkeys enabled for the alternate signal stack, if any.
+ */
+ write_pkru(pkru);
return (void __user *)-1L;
+ }
return (void __user *)sp;
}
© 2016 - 2025 Red Hat, Inc.