arch/x86/include/asm/shstk.h | 4 ++-- arch/x86/kernel/fpu/regset.c | 46 ++++++++++++++++++++++++++------------------ arch/x86/kernel/fpu/xstate.c | 12 ++++++------ arch/x86/kernel/fpu/xstate.h | 4 ++-- arch/x86/kernel/process.c | 2 +- arch/x86/kernel/shstk.c | 9 +++++++-- 6 files changed, 45 insertions(+), 32 deletions(-)
PF_USER_WORKER threads don't really differ from PF_KTHREAD threads at least in that they never return to usermode and never use their FPU state. However, ptrace or coredump paths can access their FPU state and this is the only reason why x86_task_fpu(PF_USER_WORKER) needs to work and and discriminate PF_USER_WORKER from PF_KTHREAD. Unlike all other x86 FPU code paths which do not distinguish them. OTOH, arch/x86/kernel/fpu/regset.c doesn't really need "struct fpu *", the .regset_get() functions actually need a "struct fpstate *". If the target task is PF_USER_WORKER, they can safely use &init_fpstate. So this series adds the new simple helper static struct fpstate *get_fpstate(struct task_struct *task) { struct fpu *fpu; if (unlikely(task->flags & PF_USER_WORKER)) return &init_fpstate; fpu = x86_task_fpu(task); if (task == current) fpu_sync_fpstate(fpu); return fpu->fpstate; } which can be used instead of x86_task_fpu(task)->fpstate pattern in arch/x86/kernel/fpu/regset.c. However, there is an annoying complication: shstk_alloc_thread_stack() can alloc the pointless shadow stack for PF_USER_WORKER thread and set the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active() can return true, and in this case it wouldn't be right to use the "unrelated" init_fpstate. That is why this series includes 4/5, and to me it looks like a cleanup which makes sense regardless. Link to V1: https://lore.kernel.org/all/20250814101340.GA17288@redhat.com/ Changes: - improve the subject/changelog in 1/5 - drop "x86/shstk: add "task_struct *tsk" argument to reset_thread_features()" - rework 4/5 to not use reset_thread_features() TODO: update the fpregs_soft_get() and user_regset.set() paths as well Oleg. --- arch/x86/include/asm/shstk.h | 4 ++-- arch/x86/kernel/fpu/regset.c | 46 ++++++++++++++++++++++++++------------------ arch/x86/kernel/fpu/xstate.c | 12 ++++++------ arch/x86/kernel/fpu/xstate.h | 4 ++-- arch/x86/kernel/process.c | 2 +- arch/x86/kernel/shstk.c | 9 +++++++-- 6 files changed, 45 insertions(+), 32 deletions(-)
On Fri, 2025-08-22 at 17:36 +0200, Oleg Nesterov wrote: > PF_USER_WORKER threads don't really differ from PF_KTHREAD threads > at least in that they never return to usermode and never use their > FPU state. > > However, ptrace or coredump paths can access their FPU state and this > is the only reason why x86_task_fpu(PF_USER_WORKER) needs to work and > and discriminate PF_USER_WORKER from PF_KTHREAD. Unlike all other x86 > FPU code paths which do not distinguish them. > > OTOH, arch/x86/kernel/fpu/regset.c doesn't really need "struct fpu *", > the .regset_get() functions actually need a "struct fpstate *". If the > target task is PF_USER_WORKER, they can safely use &init_fpstate. So > this series adds the new simple helper PKRU affects kernel accesses to userspace. io threads and vhost access userspace. So why don't we want PKRU state to be inherited for user workers? I guess it is not today, but to me, conceptually we maybe don't want a special case for them? So rather than add more special handling, could we actually just remove special handling to make it consistent? But again, what exactly is the problem here? Is there a crash or something for user workers?
On 08/22, Edgecombe, Rick P wrote: > > On Fri, 2025-08-22 at 17:36 +0200, Oleg Nesterov wrote: > > PF_USER_WORKER threads don't really differ from PF_KTHREAD threads > > at least in that they never return to usermode and never use their > > FPU state. > > > > However, ptrace or coredump paths can access their FPU state and this > > is the only reason why x86_task_fpu(PF_USER_WORKER) needs to work and > > and discriminate PF_USER_WORKER from PF_KTHREAD. Unlike all other x86 > > FPU code paths which do not distinguish them. > > > > OTOH, arch/x86/kernel/fpu/regset.c doesn't really need "struct fpu *", > > the .regset_get() functions actually need a "struct fpstate *". If the > > target task is PF_USER_WORKER, they can safely use &init_fpstate. So > > this series adds the new simple helper > > PKRU affects kernel accesses to userspace. io threads and vhost access > userspace. So why don't we want PKRU state to be inherited for user workers? Sorry I don't follow... Again, this is not my area, I am sure I've missed something. But could you please explain how can this series affect the PKRU logic? > I guess it is not today, but to me, conceptually we maybe don't want a special > case for them? So rather than add more special handling, could we actually just > remove special handling to make it consistent? Could you spell please? > But again, what exactly is the problem here? Is there a crash or something for > user workers? Well. I already tried to to explain this in the previous discussions. Apperently I wasn't clear, my fault. So I guess this needs yet another email which I'll write tomorrow, becauase I am already sleeping today. Oleg.
On Fri, 2025-08-22 at 21:21 +0200, Oleg Nesterov wrote: > > PKRU affects kernel accesses to userspace. io threads and vhost access > > userspace. So why don't we want PKRU state to be inherited for user workers? > > Sorry I don't follow... Again, this is not my area, I am sure I've missed > something. > But could you please explain how can this series affect the PKRU logic? > > > I guess it is not today I'm sorry, this is incorrect. PKRU is not kept in the FPU structs anymore. So it should be inherited over clone I guess. But despite not being in the actual FPU buffer, for compatibility it's left in the uabi xstate copy stuff that the regsets use. > > , but to me, conceptually we maybe don't want a special case for them? So > > rather than add more special handling, could we actually just remove special > > handling to make it consistent? > > Could you spell please? PKRU is for "protection keys". These are memory permissions that can be applied per-thread. So you can make a virtual address toggleable for read or write without affecting the other threads. The kernel is supposed to have these permissions enforced just like the normal ones (read-only, etc). So it's an example of user FPU state that applied to the kernel. Except that, as above, it had to be moved out of the actual FPU state because it was special in that way. But I think you could argue that it should be part of ptrace regsets. A debugger may want to inspect what PKRU enforcement was happening for the io thread. > > > But again, what exactly is the problem here? Is there a crash or something > > for > > user workers? > > Well. I already tried to to explain this in the previous discussions. > Apperently I wasn't clear, my fault. So I guess this needs yet another email > which I'll write tomorrow, becauase I am already sleeping today. I believe you said something like "sorry my fault and I'll explain in another mail"[0]. Did I miss it? [0] https://lore.kernel.org/lkml/20250815191306.GK11549@redhat.com/
On 08/22, Edgecombe, Rick P wrote: > > On Fri, 2025-08-22 at 21:21 +0200, Oleg Nesterov wrote: > > > PKRU affects kernel accesses to userspace. io threads and vhost access > > > userspace. So why don't we want PKRU state to be inherited for user workers? > > > > Sorry I don't follow... Again, this is not my area, I am sure I've missed > > something. > > But could you please explain how can this series affect the PKRU logic? > > > > > I guess it is not today > > I'm sorry, this is incorrect. PKRU is not kept in the FPU structs anymore. So it > should be inherited over clone I guess. Yes, > But despite not being in the actual FPU > buffer, for compatibility it's left in the uabi xstate copy stuff that the > regsets use. Yes, and copy_xstate_to_uabi_buf() still reports target->thread.pkru for io threads. So this series doesn't make any difference in this respect... > > > But again, what exactly is the problem here? Is there a crash or something > > > for > > > user workers? > > > > Well. I already tried to to explain this in the previous discussions. > > Apperently I wasn't clear, my fault. So I guess this needs yet another email > > which I'll write tomorrow, becauase I am already sleeping today. > > I believe you said something like "sorry my fault and I'll explain in another > mail"[0]. Did I miss it? > > [0] > https://lore.kernel.org/lkml/20250815191306.GK11549@redhat.com/ I tried to add more details in this "[PATCH v2 0/5]" cover letter, in particular to explain why does this series include "[PATCH v2 4/5] x86/shstk: don't create the shadow stack for PF_USER_WORKERs". I thought that your were asking to explain this part... So. Sorry if it wasn't clear, this series is not a bug fix or something like this. This starts the cleanups I was thinking about year ago, see https://lore.kernel.org/all/20240606120038.GB22450@redhat.com/ Then later we can probably make more changes so that the kernel threads (PF_KTHREADs and PF_USER_WORKERs) will run without "struct fpu" attached to task_struct, so that x86_task_fpu() should return NULL regardless of CONFIG_X86_DEBUG_FPU. But even the WARN_ON_ONCE(task->flags & (PF_KTHREAD|PF_USER_WORKER)) in x86_task_fpu() makes sense to me. Say, switch_fpu() has no reason to check "PF_KTHREAD | PF_USER_WORKER", this check should die. But if something goes wrong, it would be nice to have a warning for io threads as well. But as I said, I understand that cleanups are always subjective. It seems that nobody is interested, and the only reviewer (you ;) doesn't like these changes. I am going to give up. That said... Could you explain why do you dislike 4/5 ? Oleg.
On Mon, 2025-08-25 at 15:47 +0200, Oleg Nesterov wrote: > I tried to add more details in this "[PATCH v2 0/5]" cover letter, in particular > to explain why does this series include "[PATCH v2 4/5] x86/shstk: don't create the > shadow stack for PF_USER_WORKERs". I thought that your were asking to explain this > part... > > So. Sorry if it wasn't clear, this series is not a bug fix or something like this. > This starts the cleanups I was thinking about year ago, see > > https://lore.kernel.org/all/20240606120038.GB22450@redhat.com/ > > Then later we can probably make more changes so that the kernel threads > (PF_KTHREADs and PF_USER_WORKERs) will run without "struct fpu" attached > to task_struct, so that x86_task_fpu() should return NULL regardless of > CONFIG_X86_DEBUG_FPU. To save space? > > But even the WARN_ON_ONCE(task->flags & (PF_KTHREAD|PF_USER_WORKER)) in > x86_task_fpu() makes sense to me. > > Say, switch_fpu() has no reason to check "PF_KTHREAD | PF_USER_WORKER", > this check should die. > Digging through git, the reason for the PF_USER_WORKER check in switch_fpu() was originally: "more of a cosmetic thing that was found while debugging and issue and pondering why the FPU flag is set on these threads." So a goal could be to make the code make more sense? If that is a reason, then I have some concerns with it. The simpler code would need to somehow work with that (I think...) user workers should still have a PKRU value. So then does ptrace need branching logic for xstateregs_get/set() with a struct fpu and without? If so, is that ultimately simpler? > But if something goes wrong, it would be nice to > have a warning for io threads as well. I guess I question whether it really makes sense to add a special case for PF_USER_WORKER, including the existing logic. But I'm still trying to piece together a clearly stated benefit. > > But as I said, I understand that cleanups are always subjective. It seems > that nobody is interested, and the only reviewer (you ;) doesn't like these > changes. I am going to give up. > > That said... Could you explain why do you dislike 4/5 ? As I said, shstk_alloc_thread_stack() shouldn't clear ARCH_SHSTK_SHSTK because the function is about shadow stack allocation. It also doesn't make sense to clear ARCH_SHSTK_SHSTK for user workers. It seemed like Mark (arm shadow stack person) agreed on those... I think Dave also questioned whether a rare extra shadow stack is really a problem.
On 08/27, Edgecombe, Rick P wrote: > > On Mon, 2025-08-25 at 15:47 +0200, Oleg Nesterov wrote: > > > > So. Sorry if it wasn't clear, this series is not a bug fix or something like this. > > This starts the cleanups I was thinking about year ago, see > > > > https://lore.kernel.org/all/20240606120038.GB22450@redhat.com/ > > > > Then later we can probably make more changes so that the kernel threads > > (PF_KTHREADs and PF_USER_WORKERs) will run without "struct fpu" attached > > to task_struct, so that x86_task_fpu() should return NULL regardless of > > CONFIG_X86_DEBUG_FPU. > > To save space? Yes. And to make the fact that kernel threads never use (do not really have) FPU state more clear. > > But even the WARN_ON_ONCE(task->flags & (PF_KTHREAD|PF_USER_WORKER)) in > > x86_task_fpu() makes sense to me. > > > > Say, switch_fpu() has no reason to check "PF_KTHREAD | PF_USER_WORKER", > > this check should die. > > > > Digging through git, the reason for the PF_USER_WORKER check in switch_fpu() was > originally: "more of a cosmetic thing that was found while debugging and issue > and pondering why the FPU flag is set on these threads." Whatever reasons we had, they're gone. We can rely on TIF_NEED_FPU_LOAD. I'll send a coupld of patches tomorrow. > So a goal could be to make the code make more sense? If that is a reason, then I > have some concerns with it. The simpler code would need to somehow work with > that (I think...) user workers should still have a PKRU value. So then does > ptrace need branching logic for xstateregs_get/set() with a struct fpu and > without? If so, is that ultimately simpler? Sorry, I don't understand... In particular, I don't understand again how this connects to PKRU. __switch_to()->x86_pkru_load() doesn't depend on switch_fpu() ? > > But if something goes wrong, it would be nice to > > have a warning for io threads as well. > > I guess I question whether it really makes sense to add a special case for > PF_USER_WORKER, including the existing logic. But I'm still trying to piece > together a clearly stated benefit. Again, I don't understand... To me, currently arch/x86/kernel/fpu/regset.c adds a special case for PF_USER_WORKER, this series tries to remove it (but we need a bit more of simple changes). > > That said... Could you explain why do you dislike 4/5 ? > > As I said, shstk_alloc_thread_stack() shouldn't clear ARCH_SHSTK_SHSTK because > the function is about shadow stack allocation. OK, then how/where we can clear this flag if we avoid the pointless shadow stack allocation for PF_USER_WORKER? > It also doesn't make sense to > clear ARCH_SHSTK_SHSTK for user workers. Why? > I think Dave also questioned whether a rare extra shadow stack is really a > problem. Sure, it is not really a problem. In that it is not a bug. But why we can't avoid the pointless shadow stack / ARCH_SHSTK_SHSTK for user workers ? 4/5 doesn't complicate this code. Plus, again, the current code is not consistent. fpu_clone() won't do update_fpu_shstk() in this case. Not a bug too, but imo deserves a cleanup. Oleg.
On Wed, 2025-08-27 at 16:51 +0200, Oleg Nesterov wrote: > > > > I guess I question whether it really makes sense to add a special case for > > PF_USER_WORKER, including the existing logic. But I'm still trying to piece > > together a clearly stated benefit. > > Again, I don't understand... To me, currently arch/x86/kernel/fpu/regset.c > adds a special case for PF_USER_WORKER, this series tries to remove it (but > we need a bit more of simple changes). That commit I dug up? It didn't have a super strong justification either. Can you say what your intended benefit is? > > > > That said... Could you explain why do you dislike 4/5 ? > > > > As I said, shstk_alloc_thread_stack() shouldn't clear ARCH_SHSTK_SHSTK > > because > > the function is about shadow stack allocation. > > OK, then how/where we can clear this flag if we avoid the pointless shadow > stack allocation for PF_USER_WORKER? *If* we want to worry about an extra shadow stack allocation (which Dave seems to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid allocations. Other thread types already avoid it (vfork, etc). So just add to the existing logic that skips shadow stack allocation. Make it do that for user workers too, and leave ARCH_SHSTK_SHSTK alone. > > > It also doesn't make sense to clear ARCH_SHSTK_SHSTK for user workers. > > Why? Because ARCH_SHSTK_SHSTK is supposed to be inherited by children. It adds a special case for no reason. > > > I think Dave also questioned whether a rare extra shadow stack is really a > > problem. > > Sure, it is not really a problem. In that it is not a bug. But why we can't > avoid the pointless shadow stack / ARCH_SHSTK_SHSTK for user workers ? 4/5 > doesn't complicate this code. > > Plus, again, the current code is not consistent. fpu_clone() won't do > update_fpu_shstk() in this case. Not a bug too, but imo deserves a cleanup. I thought we discussed that the user worker logic already wipes the whole FPU state though, so we don't need to call update_fpu_shstk(). Did I get that wrong?
On 08/28, Edgecombe, Rick P wrote: > > On Wed, 2025-08-27 at 16:51 +0200, Oleg Nesterov wrote: > > > > > > I guess I question whether it really makes sense to add a special case for > > > PF_USER_WORKER, including the existing logic. But I'm still trying to piece > > > together a clearly stated benefit. > > > > Again, I don't understand... To me, currently arch/x86/kernel/fpu/regset.c > > adds a special case for PF_USER_WORKER, this series tries to remove it (but > > we need a bit more of simple changes). > > That commit I dug up? It didn't have a super strong justification either. Can > you say what your intended benefit is? I meant that arch/x86/kernel/fpu/regset.c adds a special case for PF_USER_WORKER in that this is the only case when x86_task_fpu(PF_USER_WORKER) is used. > > OK, then how/where we can clear this flag if we avoid the pointless shadow > > stack allocation for PF_USER_WORKER? > > *If* we want to worry about an extra shadow stack allocation (which Dave seems > to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid allocations. Other > thread types already avoid it (vfork, etc). So just add to the existing logic > that skips shadow stack allocation. Make it do that for user workers too, and > leave ARCH_SHSTK_SHSTK alone. From 0/5: However, there is an annoying complication: shstk_alloc_thread_stack() can alloc the pointless shadow stack for PF_USER_WORKER thread and set the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active() can return true, and in this case it wouldn't be right to use the "unrelated" init_fpstate. > > Why? > > Because ARCH_SHSTK_SHSTK is supposed to be inherited by children. It adds a > special case for no reason. See above. And it has no meaning for io-threads, right? > > Plus, again, the current code is not consistent. fpu_clone() won't do > > update_fpu_shstk() in this case. Not a bug too, but imo deserves a cleanup. > > I thought we discussed that the user worker logic already wipes the whole FPU > state though, so we don't need to call update_fpu_shstk(). Did I get that wrong? Sure, but see the note from 0/5. We don't need to call update_fpu_shstk() and initialize ->user_ssp. Yet ssp_get() will report the bogus cetregs->user_ssp. This all doesn't look right to me even if nothing really bad can happen. Oleg.
On Fri, 2025-08-29 at 17:06 +0200, Oleg Nesterov wrote: > > *If* we want to worry about an extra shadow stack allocation (which Dave > > seems to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid > > allocations. > > Other thread types already avoid it (vfork, etc). So just add to the > > existing logic that skips shadow stack allocation. Make it do that for user > > workers too, and leave ARCH_SHSTK_SHSTK alone. > > From 0/5: > > However, there is an annoying complication: > shstk_alloc_thread_stack() > can alloc the pointless shadow stack for PF_USER_WORKER thread and > set > the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active() > can > return true, and in this case it wouldn't be right to use the > "unrelated" > init_fpstate. Yea the ptrace code currently assumes there will be a non-init SHSTK FPU state. But if the init state is currently associated with the FPU, whether it's via a cleared copy, or some pointer redirection as you proposed, what is the difference? Hmm, I actually do see a potential concrete issue... fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave, XFEATURE_CET_USER)" could return NULL and trigger a warning. I would think we could address this by just removing the warning, since the comment is incorrect. diff --git a/arch/x86/kernel/fpu/regset.c b/arch/x86/kernel/fpu/regset.c index 0986c2200adc..094a891bfea8 100644 --- a/arch/x86/kernel/fpu/regset.c +++ b/arch/x86/kernel/fpu/regset.c @@ -196,15 +196,8 @@ int ssp_get(struct task_struct *target, const struct user_regset *regset, sync_fpstate(fpu); cetregs = get_xsave_addr(&fpu->fpstate->regs.xsave, XFEATURE_CET_USER); - if (WARN_ON(!cetregs)) { - /* - * This shouldn't ever be NULL because shadow stack was - * verified to be enabled above. This means - * MSR_IA32_U_CET.CET_SHSTK_EN should be 1 and so - * XFEATURE_CET_USER should not be in the init state. - */ + if (cetregs) return -ENODEV; - } return membuf_write(&to, (unsigned long *)&cetregs->user_ssp, sizeof(cetregs->user_ssp)); @@ -241,15 +234,8 @@ int ssp_set(struct task_struct *target, const struct user_regset *regset, fpu_force_restore(fpu); cetregs = get_xsave_addr(xsave, XFEATURE_CET_USER); - if (WARN_ON(!cetregs)) { - /* - * This shouldn't ever be NULL because shadow stack was - * verified to be enabled above. This means - * MSR_IA32_U_CET.CET_SHSTK_EN should be 1 and so - * XFEATURE_CET_USER should not be in the init state. - */ + if (cetregs) return -ENODEV; - } cetregs->user_ssp = user_ssp; return 0; If PF_USER_WORKER's ever do grow the ability to spawn threads, further changes would be needed to restore CET_SHSTK_EN for the new thread. I actually think this is a further point towards not having special logic for PF_USER_WORKER FPUs (beyond the PKRU reasoning). As in, instead of making these proposed changes, instead rollback the existing differences. But I'm not sure it's worth it at this time.
On 09/02, Edgecombe, Rick P wrote: > > On Fri, 2025-08-29 at 17:06 +0200, Oleg Nesterov wrote: > > > *If* we want to worry about an extra shadow stack allocation (which Dave > > > seems to doubt), we don't need to clear ARCH_SHSTK_SHSTK to avoid > > > allocations. > > > Other thread types already avoid it (vfork, etc). So just add to the > > > existing logic that skips shadow stack allocation. Make it do that for user > > > workers too, and leave ARCH_SHSTK_SHSTK alone. > > > > From 0/5: > > > > However, there is an annoying complication: > > shstk_alloc_thread_stack() > > can alloc the pointless shadow stack for PF_USER_WORKER thread and > > set > > the ARCH_SHSTK_SHSTK flag. This means that ssp_get()->ssp_active() > > can > > return true, and in this case it wouldn't be right to use the > > "unrelated" > > init_fpstate. > > Yea the ptrace code currently assumes there will be a non-init SHSTK FPU state. > But if the init state is currently associated with the FPU, whether it's via a > cleared copy, or some pointer redirection as you proposed, what is the > difference? Well. Lets forget about other changes for the moment. Lets only discuss 4/5. > Hmm, I actually do see a potential concrete issue... > > fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if > xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave, > XFEATURE_CET_USER)" could return NULL and trigger a warning. Even if get_xsave_addr() returns a valid pointer, what is the point to try to report cetregs->user_ssp which doesn't match the reality? Again, update_fpu_shstk() was not called, ->user_ssp can't be correct. Why not simply clear ARCH_SHSTK_SHSTK as 4/5 does? With this change ssp_get() will return -ENODEV right after the ssp_active() check. Unless I am totally confused, ARCH_SHSTK_SHSTK has no meaning for PF_USER_WORKER kernel threads, so I don't understand your objections. Oleg.
On Wed, 2025-09-03 at 11:54 +0200, Oleg Nesterov wrote: > > Hmm, I actually do see a potential concrete issue... > > > > fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if > > xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave, > > XFEATURE_CET_USER)" could return NULL and trigger a warning. > > Even if get_xsave_addr() returns a valid pointer, what is the point to try to > report cetregs->user_ssp which doesn't match the reality? > Again, update_fpu_shstk() was not called, ->user_ssp can't be correct. I think it would be better to have less special cases in the FPU. I'm not sure what you mean by "correct". As above, it gets zeroed in fpu_clone(). I guess you want it to be something else. If we are talking pure correctness, and not functional issues, I'm questioning whether it actually should be zeroed for user workers. The behavior would be leave shadow stack enabled, but don't allocate a shadow stack. But all this adds complexity cost, which seems there isn't appetite for. I do see the potential to hit a warning from userspace. That seems like something that could be fixed and does have a clear purpose.
On 09/03, Edgecombe, Rick P wrote: > > On Wed, 2025-09-03 at 11:54 +0200, Oleg Nesterov wrote: > > > Hmm, I actually do see a potential concrete issue... > > > > > > fpu_clone() will wipe out the FPU state for PF_USER_WORKER, which means if > > > xsaves decides to use the init optimization for CET, "get_xsave_addr(xsave, > > > XFEATURE_CET_USER)" could return NULL and trigger a warning. > > > > Even if get_xsave_addr() returns a valid pointer, what is the point to try to > > report cetregs->user_ssp which doesn't match the reality? > > Again, update_fpu_shstk() was not called, ->user_ssp can't be correct. > > I think it would be better to have less special cases in the FPU. Agreed, > I'm not sure > what you mean by "correct". As above, it gets zeroed in fpu_clone(). I guess you > want it to be something else. Well. I think that if copy_thread() path allocate the shadow stack, then ssp_get() should report the value returned by shstk_alloc_thread_stack(). If the thread runs without shstk/ARCH_SHSTK_SHSTK ssp_get() should return -ENODEV. Regardless of PF_USER_WORKER. Now lets recall that my actual motivation is "don't abuse x86_task_fpu(PF_USER_WORKER)", and we also have ssp_set(). Without this patch which clears ARCH_SHSTK_SHSTK ssp_set() -> x86_task_fpu(PF_USER_WORKER) has to return a "real" FPU state. Oleg.
© 2016 - 2025 Red Hat, Inc.