[v2] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

[PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 1 month, 1 week ago

PF_USER_WORKER threads don't really differ from PF_KTHREAD threads
at least in that they never return to usermode and never use their
FPU state.

However, ptrace or coredump paths can access their FPU state and this
is the only reason why x86_task_fpu(PF_USER_WORKER) needs to work and
and discriminate PF_USER_WORKER from PF_KTHREAD. Unlike all other x86
FPU code paths which do not distinguish them.

OTOH, arch/x86/kernel/fpu/regset.c doesn't really need "struct fpu *",
the .regset_get() functions actually need a "struct fpstate *". If the
target task is PF_USER_WORKER, they can safely use &init_fpstate. So
this series adds the new simple helper

	static struct fpstate *get_fpstate(struct task_struct *task)
	{
		struct fpu *fpu;

		if (unlikely(task->flags & PF_USER_WORKER))
			return &init_fpstate;

		fpu = x86_task_fpu(task);
		if (task == current)
			fpu_sync_fpstate(fpu);
		return fpu->fpstate;
	}

which can be used instead of x86_task_fpu(task)->fpstate pattern in
arch/x86/kernel/fpu/regset.c.

However, there is an annoying complication: shstk_alloc_thread_stack()
can alloc the pointless shadow stack for PF_USER_WORKER thread and set
the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active() can
return true, and in this case it wouldn't be right to use the "unrelated"
init_fpstate.

That is why this series includes 4/5, and to me it looks like a cleanup
which makes sense regardless.

Link to V1: https://lore.kernel.org/all/20250814101340.GA17288@redhat.com/

Changes:

	- improve the subject/changelog in 1/5

	- drop "x86/shstk: add "task_struct *tsk" argument to reset_thread_features()"

	- rework 4/5 to not use reset_thread_features()

TODO:
	update the fpregs_soft_get() and user_regset.set() paths as well

Oleg.
---

 arch/x86/include/asm/shstk.h |  4 ++--
 arch/x86/kernel/fpu/regset.c | 46 ++++++++++++++++++++++++++------------------
 arch/x86/kernel/fpu/xstate.c | 12 ++++++------
 arch/x86/kernel/fpu/xstate.h |  4 ++--
 arch/x86/kernel/process.c    |  2 +-
 arch/x86/kernel/shstk.c      |  9 +++++++--
 6 files changed, 45 insertions(+), 32 deletions(-)

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Edgecombe, Rick P 1 month, 1 week ago

On Fri, 2025-08-22 at 17:36 +0200, Oleg Nesterov wrote:
> PF_USER_WORKER threads don't really differ from PF_KTHREAD threads
> at least in that they never return to usermode and never use their
> FPU state.
> 
> However, ptrace or coredump paths can access their FPU state and this
> is the only reason why x86_task_fpu(PF_USER_WORKER) needs to work and
> and discriminate PF_USER_WORKER from PF_KTHREAD. Unlike all other x86
> FPU code paths which do not distinguish them.
> 
> OTOH, arch/x86/kernel/fpu/regset.c doesn't really need "struct fpu *",
> the .regset_get() functions actually need a "struct fpstate *". If the
> target task is PF_USER_WORKER, they can safely use &init_fpstate. So
> this series adds the new simple helper

PKRU affects kernel accesses to userspace. io threads and vhost access
userspace. So why don't we want PKRU state to be inherited for user workers? I
guess it is not today, but to me, conceptually we maybe don't want a special
case for them? So rather than add more special handling, could we actually just
remove special handling to make it consistent?

But again, what exactly is the problem here? Is there a crash or something for
user workers?

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 1 month, 1 week ago

On 08/22, Edgecombe, Rick P wrote:
>
> On Fri, 2025-08-22 at 17:36 +0200, Oleg Nesterov wrote:
> > PF_USER_WORKER threads don't really differ from PF_KTHREAD threads
> > at least in that they never return to usermode and never use their
> > FPU state.
> >
> > However, ptrace or coredump paths can access their FPU state and this
> > is the only reason why x86_task_fpu(PF_USER_WORKER) needs to work and
> > and discriminate PF_USER_WORKER from PF_KTHREAD. Unlike all other x86
> > FPU code paths which do not distinguish them.
> >
> > OTOH, arch/x86/kernel/fpu/regset.c doesn't really need "struct fpu *",
> > the .regset_get() functions actually need a "struct fpstate *". If the
> > target task is PF_USER_WORKER, they can safely use &init_fpstate. So
> > this series adds the new simple helper
>
> PKRU affects kernel accesses to userspace. io threads and vhost access
> userspace. So why don't we want PKRU state to be inherited for user workers?

Sorry I don't follow... Again, this is not my area, I am sure I've missed something.
But could you please explain how can this series affect the PKRU logic?

> I guess it is not today, but to me, conceptually we maybe don't want a special
> case for them? So rather than add more special handling, could we actually just
> remove special handling to make it consistent?

Could you spell please?

> But again, what exactly is the problem here? Is there a crash or something for
> user workers?

Well. I already tried to to explain this in the previous discussions. Apperently
I wasn't clear, my fault. So I guess this needs yet another email which I'll write
tomorrow, becauase I am already sleeping today.

Oleg.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Edgecombe, Rick P 1 month, 1 week ago

On Fri, 2025-08-22 at 21:21 +0200, Oleg Nesterov wrote:
> > PKRU affects kernel accesses to userspace. io threads and vhost access
> > userspace. So why don't we want PKRU state to be inherited for user workers?
> 
> Sorry I don't follow... Again, this is not my area, I am sure I've missed
> something.
> But could you please explain how can this series affect the PKRU logic?
> 
> > I guess it is not today

I'm sorry, this is incorrect. PKRU is not kept in the FPU structs anymore. So it
should be inherited over clone I guess. But despite not being in the actual FPU
buffer, for compatibility it's left in the uabi xstate copy stuff that the
regsets use.

> > , but to me, conceptually we maybe don't want a special case for them? So
> > rather than add more special handling, could we actually just remove special
> > handling to make it consistent?
> 
> Could you spell please?

PKRU is for "protection keys". These are memory permissions that can be applied
per-thread. So you can make a virtual address toggleable for read or write
without affecting the other threads. The kernel is supposed to have these
permissions enforced just like the normal ones (read-only, etc). So it's an
example of user FPU state that applied to the kernel. Except that, as above, it
had to be moved out of the actual FPU state because it was special in that way.

But I think you could argue that it should be part of ptrace regsets. A debugger
may want to inspect what PKRU enforcement was happening for the io thread.

> 
> > But again, what exactly is the problem here? Is there a crash or something
> > for
> > user workers?
> 
> Well. I already tried to to explain this in the previous discussions.
> Apperently I wasn't clear, my fault. So I guess this needs yet another email
> which I'll write tomorrow, becauase I am already sleeping today.

I believe you said something like "sorry my fault and I'll explain in another
mail"[0]. Did I miss it?

[0]
https://lore.kernel.org/lkml/20250815191306.GK11549@redhat.com/

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 1 month, 1 week ago

On 08/22, Edgecombe, Rick P wrote:
>
> On Fri, 2025-08-22 at 21:21 +0200, Oleg Nesterov wrote:
> > > PKRU affects kernel accesses to userspace. io threads and vhost access
> > > userspace. So why don't we want PKRU state to be inherited for user workers?
> >
> > Sorry I don't follow... Again, this is not my area, I am sure I've missed
> > something.
> > But could you please explain how can this series affect the PKRU logic?
> >
> > > I guess it is not today
>
> I'm sorry, this is incorrect. PKRU is not kept in the FPU structs anymore. So it
> should be inherited over clone I guess.

Yes,

> But despite not being in the actual FPU
> buffer, for compatibility it's left in the uabi xstate copy stuff that the
> regsets use.

Yes, and copy_xstate_to_uabi_buf() still reports target->thread.pkru for
io threads.

So this series doesn't make any difference in this respect...

> > > But again, what exactly is the problem here? Is there a crash or something
> > > for
> > > user workers?
> >
> > Well. I already tried to to explain this in the previous discussions.
> > Apperently I wasn't clear, my fault. So I guess this needs yet another email
> > which I'll write tomorrow, becauase I am already sleeping today.
>
> I believe you said something like "sorry my fault and I'll explain in another
> mail"[0]. Did I miss it?
>
> [0]
> https://lore.kernel.org/lkml/20250815191306.GK11549@redhat.com/

I tried to add more details in this "[PATCH v2 0/5]" cover letter, in particular
to explain why does this series include "[PATCH v2 4/5] x86/shstk: don't create the
shadow stack for PF_USER_WORKERs". I thought that your were asking to explain this
part...

So. Sorry if it wasn't clear, this series is not a bug fix or something like this.
This starts the cleanups I was thinking about year ago, see

	https://lore.kernel.org/all/20240606120038.GB22450@redhat.com/

Then later we can probably make more changes so that the kernel threads
(PF_KTHREADs and PF_USER_WORKERs) will run without "struct fpu" attached
to task_struct, so that x86_task_fpu() should return NULL regardless of
CONFIG_X86_DEBUG_FPU.

But even the WARN_ON_ONCE(task->flags & (PF_KTHREAD|PF_USER_WORKER)) in
x86_task_fpu() makes sense to me.

Say, switch_fpu() has no reason to check "PF_KTHREAD | PF_USER_WORKER",
this check should die. But if something goes wrong, it would be nice to
have a warning for io threads as well.

But as I said, I understand that cleanups are always subjective. It seems
that nobody is interested, and the only reviewer (you ;) doesn't like these
changes. I am going to give up.

That said... Could you explain why do you dislike 4/5 ?

Oleg.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Edgecombe, Rick P 1 month, 1 week ago

On Mon, 2025-08-25 at 15:47 +0200, Oleg Nesterov wrote:
> I tried to add more details in this "[PATCH v2 0/5]" cover letter, in particular
> to explain why does this series include "[PATCH v2 4/5] x86/shstk: don't create the
> shadow stack for PF_USER_WORKERs". I thought that your were asking to explain this
> part...
> 
> So. Sorry if it wasn't clear, this series is not a bug fix or something like this.
> This starts the cleanups I was thinking about year ago, see
> 
> 	https://lore.kernel.org/all/20240606120038.GB22450@redhat.com/
> 
> Then later we can probably make more changes so that the kernel threads
> (PF_KTHREADs and PF_USER_WORKERs) will run without "struct fpu" attached
> to task_struct, so that x86_task_fpu() should return NULL regardless of
> CONFIG_X86_DEBUG_FPU.

To save space?

> 
> But even the WARN_ON_ONCE(task->flags & (PF_KTHREAD|PF_USER_WORKER)) in
> x86_task_fpu() makes sense to me.
> 
> Say, switch_fpu() has no reason to check "PF_KTHREAD | PF_USER_WORKER",
> this check should die. 
> 

Digging through git, the reason for the PF_USER_WORKER check in switch_fpu() was
originally: "more of a cosmetic thing that was found while debugging and issue
and pondering why the FPU flag is set on these threads."

So a goal could be to make the code make more sense? If that is a reason, then I
have some concerns with it. The simpler code would need to somehow work with
that (I think...) user workers should still have a PKRU value. So then does
ptrace need branching logic for xstateregs_get/set() with a struct fpu and
without? If so, is that ultimately simpler?

> But if something goes wrong, it would be nice to
> have a warning for io threads as well.

I guess I question whether it really makes sense to add a special case for
PF_USER_WORKER, including the existing logic. But I'm still trying to piece
together a clearly stated benefit.

> 
> But as I said, I understand that cleanups are always subjective. It seems
> that nobody is interested, and the only reviewer (you ;) doesn't like these
> changes. I am going to give up.
> 
> That said... Could you explain why do you dislike 4/5 ?

As I said, shstk_alloc_thread_stack() shouldn't clear ARCH_SHSTK_SHSTK because
the function is about shadow stack allocation. It also doesn't make sense to
clear ARCH_SHSTK_SHSTK for user workers. It seemed like Mark (arm shadow stack
person) agreed on those...

I think Dave also questioned whether a rare extra shadow stack is really a
problem.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 1 month, 1 week ago

On 08/27, Edgecombe, Rick P wrote:
>
> On Mon, 2025-08-25 at 15:47 +0200, Oleg Nesterov wrote:
> >
> > So. Sorry if it wasn't clear, this series is not a bug fix or something like this.
> > This starts the cleanups I was thinking about year ago, see
> >
> > 	https://lore.kernel.org/all/20240606120038.GB22450@redhat.com/
> >
> > Then later we can probably make more changes so that the kernel threads
> > (PF_KTHREADs and PF_USER_WORKERs) will run without "struct fpu" attached
> > to task_struct, so that x86_task_fpu() should return NULL regardless of
> > CONFIG_X86_DEBUG_FPU.
>
> To save space?

Yes. And to make the fact that kernel threads never use (do not really have)
FPU state more clear.

> > But even the WARN_ON_ONCE(task->flags & (PF_KTHREAD|PF_USER_WORKER)) in
> > x86_task_fpu() makes sense to me.
> >
> > Say, switch_fpu() has no reason to check "PF_KTHREAD | PF_USER_WORKER",
> > this check should die. 
> >
>
> Digging through git, the reason for the PF_USER_WORKER check in switch_fpu() was
> originally: "more of a cosmetic thing that was found while debugging and issue
> and pondering why the FPU flag is set on these threads."

Whatever reasons we had, they're gone. We can rely on TIF_NEED_FPU_LOAD.
I'll send a coupld of patches tomorrow.

> So a goal could be to make the code make more sense? If that is a reason, then I
> have some concerns with it. The simpler code would need to somehow work with
> that (I think...) user workers should still have a PKRU value. So then does
> ptrace need branching logic for xstateregs_get/set() with a struct fpu and
> without? If so, is that ultimately simpler?

Sorry, I don't understand... In particular, I don't understand again how
this connects to PKRU. __switch_to()->x86_pkru_load() doesn't depend on
switch_fpu() ?

> > But if something goes wrong, it would be nice to
> > have a warning for io threads as well.
>
> I guess I question whether it really makes sense to add a special case for
> PF_USER_WORKER, including the existing logic. But I'm still trying to piece
> together a clearly stated benefit.

Again, I don't understand... To me, currently arch/x86/kernel/fpu/regset.c
adds a special case for PF_USER_WORKER, this series tries to remove it (but
we need a bit more of simple changes).

> > That said... Could you explain why do you dislike 4/5 ?
>
> As I said, shstk_alloc_thread_stack() shouldn't clear ARCH_SHSTK_SHSTK because
> the function is about shadow stack allocation.

OK, then how/where we can clear this flag if we avoid the pointless shadow stack
allocation for PF_USER_WORKER?

> It also doesn't make sense to
> clear ARCH_SHSTK_SHSTK for user workers.

Why?

> I think Dave also questioned whether a rare extra shadow stack is really a
> problem.

Sure, it is not really a problem. In that it is not a bug. But why we can't
avoid the pointless shadow stack / ARCH_SHSTK_SHSTK for user workers ? 4/5
doesn't complicate this code.

Plus, again, the current code is not consistent. fpu_clone() won't do
update_fpu_shstk() in this case. Not a bug too, but imo deserves a cleanup.

Oleg.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Edgecombe, Rick P 1 month ago

On Wed, 2025-08-27 at 16:51 +0200, Oleg Nesterov wrote:
> > 
> > I guess I question whether it really makes sense to add a special case for
> > PF_USER_WORKER, including the existing logic. But I'm still trying to piece
> > together a clearly stated benefit.
> 
> Again, I don't understand... To me, currently arch/x86/kernel/fpu/regset.c
> adds a special case for PF_USER_WORKER, this series tries to remove it (but
> we need a bit more of simple changes).

That commit I dug up? It didn't have a super strong justification either. Can
you say what your intended benefit is?

> 
> > > That said... Could you explain why do you dislike 4/5 ?
> > 
> > As I said, shstk_alloc_thread_stack() shouldn't clear ARCH_SHSTK_SHSTK
> > because
> > the function is about shadow stack allocation.
> 
> OK, then how/where we can clear this flag if we avoid the pointless shadow
> stack allocation for PF_USER_WORKER?

*If* we want to worry about an extra shadow stack allocation (which Dave seems
to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid allocations. Other
thread types already avoid it (vfork, etc). So just add to the existing logic
that skips shadow stack allocation. Make it do that for user workers too, and
leave ARCH_SHSTK_SHSTK alone.

> 
> > It also doesn't make sense to clear ARCH_SHSTK_SHSTK for user workers.
> 
> Why?

Because ARCH_SHSTK_SHSTK is supposed to be inherited by children. It adds a
special case for no reason.

> 
> > I think Dave also questioned whether a rare extra shadow stack is really a
> > problem.
> 
> Sure, it is not really a problem. In that it is not a bug. But why we can't
> avoid the pointless shadow stack / ARCH_SHSTK_SHSTK for user workers ? 4/5
> doesn't complicate this code.
> 
> Plus, again, the current code is not consistent. fpu_clone() won't do
> update_fpu_shstk() in this case. Not a bug too, but imo deserves a cleanup.

I thought we discussed that the user worker logic already wipes the whole FPU
state though, so we don't need to call update_fpu_shstk(). Did I get that wrong?

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 1 month ago

On 08/28, Edgecombe, Rick P wrote:
>
> On Wed, 2025-08-27 at 16:51 +0200, Oleg Nesterov wrote:
> > >
> > > I guess I question whether it really makes sense to add a special case for
> > > PF_USER_WORKER, including the existing logic. But I'm still trying to piece
> > > together a clearly stated benefit.
> >
> > Again, I don't understand... To me, currently arch/x86/kernel/fpu/regset.c
> > adds a special case for PF_USER_WORKER, this series tries to remove it (but
> > we need a bit more of simple changes).
>
> That commit I dug up? It didn't have a super strong justification either. Can
> you say what your intended benefit is?

I meant that arch/x86/kernel/fpu/regset.c adds a special case for PF_USER_WORKER
in that this is the only case when x86_task_fpu(PF_USER_WORKER) is used.

> > OK, then how/where we can clear this flag if we avoid the pointless shadow
> > stack allocation for PF_USER_WORKER?
>
> *If* we want to worry about an extra shadow stack allocation (which Dave seems
> to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid allocations. Other
> thread types already avoid it (vfork, etc). So just add to the existing logic
> that skips shadow stack allocation. Make it do that for user workers too, and
> leave ARCH_SHSTK_SHSTK alone.

From 0/5:

	However, there is an annoying complication: shstk_alloc_thread_stack()
	can alloc the pointless shadow stack for PF_USER_WORKER thread and set
	the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active() can
	return true, and in this case it wouldn't be right to use the "unrelated"
	init_fpstate.

> > Why?
>
> Because ARCH_SHSTK_SHSTK is supposed to be inherited by children. It adds a
> special case for no reason.

See above. And it has no meaning for io-threads, right?

> > Plus, again, the current code is not consistent. fpu_clone() won't do
> > update_fpu_shstk() in this case. Not a bug too, but imo deserves a cleanup.
>
> I thought we discussed that the user worker logic already wipes the whole FPU
> state though, so we don't need to call update_fpu_shstk(). Did I get that wrong?

Sure, but see the note from 0/5.

We don't need to call update_fpu_shstk() and initialize ->user_ssp.
Yet ssp_get() will report the bogus cetregs->user_ssp.

This all doesn't look right to me even if nothing really bad can happen.

Oleg.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Edgecombe, Rick P 1 month ago

On Fri, 2025-08-29 at 17:06 +0200, Oleg Nesterov wrote:
> > *If* we want to worry about an extra shadow stack allocation (which Dave
> > seems to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid
> > allocations.
> > Other thread types already avoid it (vfork, etc). So just add to the
> > existing logic that skips shadow stack allocation. Make it do that for user
> > workers too, and leave ARCH_SHSTK_SHSTK alone.
> 
> From 0/5:
> 
> 	However, there is an annoying complication:
> shstk_alloc_thread_stack()
> 	can alloc the pointless shadow stack for PF_USER_WORKER thread and
> set
> 	the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active()
> can
> 	return true, and in this case it wouldn't be right to use the
> "unrelated"
> 	init_fpstate.

Yea the ptrace code currently assumes there will be a non-init SHSTK FPU state.
But if the init state is currently associated with the FPU, whether it's via a
cleared copy, or some pointer redirection as you proposed, what is the
difference?

Hmm, I actually do see a potential concrete issue...

fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if
xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave,
XFEATURE_CET_USER)" could return NULL and trigger a warning. I would think we
could address this by just removing the warning, since the comment is incorrect.

diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c
index 0986c2200adc..094a891bfea8 100644
--- a/arch/x86/kernel/fpu/regset.c
+++ b/arch/x86/kernel/fpu/regset.c
@@ -196,15 +196,8 @@ int ssp_get(struct task_struct *target, const struct
user_regset *regset,

        sync_fpstate(fpu);
        cetregs = get_xsave_addr(&fpu->fpstate->regs.xsave, XFEATURE_CET_USER);
-       if (WARN_ON(!cetregs)) {
-               /*
-                * This shouldn't ever be NULL because shadow stack was
-                * verified to be enabled above. This means
-                * MSR_IA32_U_CET.CET_SHSTK_EN should be 1 and so
-                * XFEATURE_CET_USER should not be in the init state.
-                */
+       if (cetregs)
                return -ENODEV;
-       }

        return membuf_write(&to, (unsigned long *)&cetregs->user_ssp,
                            sizeof(cetregs->user_ssp));
@@ -241,15 +234,8 @@ int ssp_set(struct task_struct *target, const struct
user_regset *regset,
        fpu_force_restore(fpu);

        cetregs = get_xsave_addr(xsave, XFEATURE_CET_USER);
-       if (WARN_ON(!cetregs)) {
-               /*
-                * This shouldn't ever be NULL because shadow stack was
-                * verified to be enabled above. This means
-                * MSR_IA32_U_CET.CET_SHSTK_EN should be 1 and so
-                * XFEATURE_CET_USER should not be in the init state.
-                */
+       if (cetregs)
                return -ENODEV;
-       }

        cetregs->user_ssp = user_ssp;
        return 0;

If PF_USER_WORKER's ever do grow the ability to spawn threads, further changes
would be needed to restore CET_SHSTK_EN for the new thread. I actually think
this is a further point towards not having special logic for PF_USER_WORKER FPUs
(beyond the PKRU reasoning). As in, instead of making these proposed changes,
instead rollback the existing differences. But I'm not sure it's worth it at
this time.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 1 month ago

On 09/02, Edgecombe, Rick P wrote:
>
> On Fri, 2025-08-29 at 17:06 +0200, Oleg Nesterov wrote:
> > > *If* we want to worry about an extra shadow stack allocation (which Dave
> > > seems to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid
> > > allocations.
> > > Other thread types already avoid it (vfork, etc). So just add to the
> > > existing logic that skips shadow stack allocation. Make it do that for user
> > > workers too, and leave ARCH_SHSTK_SHSTK alone.
> >
> > From 0/5:
> >
> > 	However, there is an annoying complication:
> > shstk_alloc_thread_stack()
> > 	can alloc the pointless shadow stack for PF_USER_WORKER thread and
> > set
> > 	the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active()
> > can
> > 	return true, and in this case it wouldn't be right to use the
> > "unrelated"
> > 	init_fpstate.
>
> Yea the ptrace code currently assumes there will be a non-init SHSTK FPU state.
> But if the init state is currently associated with the FPU, whether it's via a
> cleared copy, or some pointer redirection as you proposed, what is the
> difference?

Well. Lets forget about other changes for the moment. Lets only discuss 4/5.

> Hmm, I actually do see a potential concrete issue...
>
> fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if
> xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave,
> XFEATURE_CET_USER)" could return NULL and trigger a warning.

Even if get_xsave_addr() returns a valid pointer, what is the point to try to
report cetregs->user_ssp which doesn't match the reality?
Again, update_fpu_shstk() was not called, ->user_ssp can't be correct.

Why not simply clear ARCH_SHSTK_SHSTK as 4/5 does? With this change ssp_get()
will return -ENODEV right after the ssp_active() check.

Unless I am totally confused, ARCH_SHSTK_SHSTK has no meaning for PF_USER_WORKER
kernel threads, so I don't understand your objections.

Oleg.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Edgecombe, Rick P 1 month ago

On Wed, 2025-09-03 at 11:54 +0200, Oleg Nesterov wrote:
> > Hmm, I actually do see a potential concrete issue...
> > 
> > fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if
> > xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave,
> > XFEATURE_CET_USER)" could return NULL and trigger a warning.
> 
> Even if get_xsave_addr() returns a valid pointer, what is the point to try to
> report cetregs->user_ssp which doesn't match the reality?
> Again, update_fpu_shstk() was not called, ->user_ssp can't be correct.

I think it would be better to have less special cases in the FPU. I'm not sure
what you mean by "correct". As above, it gets zeroed in fpu_clone(). I guess you
want it to be something else.

If we are talking pure correctness, and not functional issues, I'm questioning
whether it actually should be zeroed for user workers. The behavior would be
leave shadow stack enabled, but don't allocate a shadow stack. But all this adds
complexity cost, which seems there isn't appetite for.

I do see the potential to hit a warning from userspace. That seems like
something that could be fixed and does have a clear purpose.

Re: [PATCH v2 0/5] x86/fpu: don't abuse x86_task_fpu(PF_USER_WORKER) in .regset_get() paths

Posted by Oleg Nesterov 4 weeks, 1 day ago

On 09/03, Edgecombe, Rick P wrote:
>
> On Wed, 2025-09-03 at 11:54 +0200, Oleg Nesterov wrote:
> > > Hmm, I actually do see a potential concrete issue...
> > >
> > > fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if
> > > xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave,
> > > XFEATURE_CET_USER)" could return NULL and trigger a warning.
> >
> > Even if get_xsave_addr() returns a valid pointer, what is the point to try to
> > report cetregs->user_ssp which doesn't match the reality?
> > Again, update_fpu_shstk() was not called, ->user_ssp can't be correct.
>
> I think it would be better to have less special cases in the FPU.

Agreed,

> I'm not sure
> what you mean by "correct". As above, it gets zeroed in fpu_clone(). I guess you
> want it to be something else.

Well. I think that if copy_thread() path allocate the shadow stack, then
ssp_get() should report the value returned by shstk_alloc_thread_stack().
If the thread runs without shstk/ARCH_SHSTK_SHSTK ssp_get() should return
-ENODEV. Regardless of PF_USER_WORKER.

Now lets recall that my actual motivation is "don't abuse x86_task_fpu(PF_USER_WORKER)",
and we also have ssp_set(). Without this patch which clears ARCH_SHSTK_SHSTK
ssp_set() -> x86_task_fpu(PF_USER_WORKER) has to return a "real" FPU state.

Oleg.