If a features_enabled(ARCH_SHSTK_SHSTK) userspace thread creates a
PF_USER_WORKER thread, shstk_alloc_thread_stack() allocates the shadow
stack for no reason, the new (kernel) thread will never return to usermode.
Plus the current code doesn't even look correct, in this case fpu_clone()
won't call update_fpu_shstk().
Add the new "bool minimal = !!args->fn" argument (which matches that of
fpu_clone()) to shstk_alloc_thread_stack() and change it to do
reset_thread_features(tsk) if "minimal" is true.
With this patch ssp_get() -> ssp_active(target) should never return true
if target->flags & PF_USER_WORKER.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
arch/x86/include/asm/shstk.h | 4 ++--
arch/x86/kernel/process.c | 2 +-
arch/x86/kernel/shstk.c | 11 ++++++++++-
3 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 92d449cc352a..dfb2aeebc25f 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -17,7 +17,7 @@ struct thread_shstk {
long shstk_prctl(struct task_struct *task, int option, unsigned long arg2);
void reset_thread_features(struct task_struct *task);
unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clone_flags,
- unsigned long stack_size);
+ bool minimal, unsigned long stack_size);
void shstk_free(struct task_struct *p);
int setup_signal_shadow_stack(struct ksignal *ksig);
int restore_signal_shadow_stack(void);
@@ -28,7 +28,7 @@ static inline long shstk_prctl(struct task_struct *task, int option,
unsigned long arg2) { return -EINVAL; }
static inline void reset_thread_features(struct task_struct *task) {}
static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p,
- unsigned long clone_flags,
+ unsigned long clone_flags, bool minimal,
unsigned long stack_size) { return 0; }
static inline void shstk_free(struct task_struct *p) {}
static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; }
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 1b7960cf6eb0..e932e0e53972 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -209,7 +209,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
* is disabled, new_ssp will remain 0, and fpu_clone() will know not to
* update it.
*/
- new_ssp = shstk_alloc_thread_stack(p, clone_flags, args->stack_size);
+ new_ssp = shstk_alloc_thread_stack(p, clone_flags, args->fn, args->stack_size);
if (IS_ERR_VALUE(new_ssp))
return PTR_ERR((void *)new_ssp);
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index e6d3b1371b11..3da22c6f5874 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -192,11 +192,20 @@ void reset_thread_features(struct task_struct *tsk)
}
unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags,
- unsigned long stack_size)
+ bool minimal, unsigned long stack_size)
{
struct thread_shstk *shstk = &tsk->thread.shstk;
unsigned long addr, size;
+ /*
+ * Kernel threads cloned from userspace thread never return to
+ * usermode.
+ */
+ if (minimal) {
+ reset_thread_features(tsk);
+ return 0;
+ }
+
/*
* If shadow stack is not enabled on the new thread, skip any
* switch to a new shadow stack.
--
2.25.1.362.g51ebf55
+Mark and Deepak On Thu, 2025-08-14 at 12:14 +0200, Oleg Nesterov wrote: > If a features_enabled(ARCH_SHSTK_SHSTK) userspace thread creates a > PF_USER_WORKER thread, shstk_alloc_thread_stack() allocates the shadow > stack for no reason, the new (kernel) thread will never return to usermode. > > Plus the current code doesn't even look correct, in this case fpu_clone() > won't call update_fpu_shstk(). The actual SSP register gets set when returning to userspace, it will get this from MSR_IA32_PL3_SSP, which is restored from the xsave buffer. What I am seeing is that fpu_clone() clears out the buffer in the 'minimal' case. So the xstate copy of the SSP should already be zero and the problem is just that a shadow stack gets allocated when one doesn't need to be. I agree we don't need to allocate a shadow stack in this case, but I'm not sure it is right to fully disable shadow stack in thread.features. First of all, disabling it from shstk_alloc_thread_stack() seems weird. It just handles allocating shadow stacks. But also it seems the requirements are very similar to the vfork case where we don't allocate a shadow stack. If we implement it similarly we can have less special cases. Lastly, it doesn't seem there is any way to clone from IO uring today, but there were proposals. The general idea from the security POV is that copied threads will keep shadow stack enabled. So it seems to fit better with the concepts involved to not clear the thread.features. How about just adding the 'minimal' condition to: if (clone_flags & CLONE_VFORK) { shstk->base = 0; shstk->size = 0; return 0; } ...then update all the comments where vfork is called out as the only case that does this? > > Add the new "bool minimal = !!args->fn" argument (which matches that of > fpu_clone()) to shstk_alloc_thread_stack() and change it to do > reset_thread_features(tsk) if "minimal" is true. > > With this patch ssp_get() -> ssp_active(target) should never return true > if target->flags & PF_USER_WORKER. > > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > --- > arch/x86/include/asm/shstk.h | 4 ++-- > arch/x86/kernel/process.c | 2 +- > arch/x86/kernel/shstk.c | 11 ++++++++++- > 3 files changed, 13 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h > index 92d449cc352a..dfb2aeebc25f 100644 > --- a/arch/x86/include/asm/shstk.h > +++ b/arch/x86/include/asm/shstk.h > @@ -17,7 +17,7 @@ struct thread_shstk { > long shstk_prctl(struct task_struct *task, int option, unsigned long arg2); > void reset_thread_features(struct task_struct *task); > unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clone_flags, > - unsigned long stack_size); > + bool minimal, unsigned long stack_size); > void shstk_free(struct task_struct *p); > int setup_signal_shadow_stack(struct ksignal *ksig); > int restore_signal_shadow_stack(void); > @@ -28,7 +28,7 @@ static inline long shstk_prctl(struct task_struct *task, int option, > unsigned long arg2) { return -EINVAL; } > static inline void reset_thread_features(struct task_struct *task) {} > static inline unsigned long shstk_alloc_thread_stack(struct task_struct *p, > - unsigned long clone_flags, > + unsigned long clone_flags, bool minimal, > unsigned long stack_size) { return 0; } > static inline void shstk_free(struct task_struct *p) {} > static inline int setup_signal_shadow_stack(struct ksignal *ksig) { return 0; } > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c > index 1b7960cf6eb0..e932e0e53972 100644 > --- a/arch/x86/kernel/process.c > +++ b/arch/x86/kernel/process.c > @@ -209,7 +209,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args) > * is disabled, new_ssp will remain 0, and fpu_clone() will know not to > * update it. > */ > - new_ssp = shstk_alloc_thread_stack(p, clone_flags, args->stack_size); > + new_ssp = shstk_alloc_thread_stack(p, clone_flags, args->fn, args->stack_size); > if (IS_ERR_VALUE(new_ssp)) > return PTR_ERR((void *)new_ssp); > > diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c > index e6d3b1371b11..3da22c6f5874 100644 > --- a/arch/x86/kernel/shstk.c > +++ b/arch/x86/kernel/shstk.c > @@ -192,11 +192,20 @@ void reset_thread_features(struct task_struct *tsk) > } > > unsigned long shstk_alloc_thread_stack(struct task_struct *tsk, unsigned long clone_flags, > - unsigned long stack_size) > + bool minimal, unsigned long stack_size) > { > struct thread_shstk *shstk = &tsk->thread.shstk; > unsigned long addr, size; > > + /* > + * Kernel threads cloned from userspace thread never return to > + * usermode. > + */ > + if (minimal) { > + reset_thread_features(tsk); > + return 0; > + } > + > /* > * If shadow stack is not enabled on the new thread, skip any > * switch to a new shadow stack.
On 08/14, Edgecombe, Rick P wrote: > > +Mark and Deepak > > On Thu, 2025-08-14 at 12:14 +0200, Oleg Nesterov wrote: > > If a features_enabled(ARCH_SHSTK_SHSTK) userspace thread creates a > > PF_USER_WORKER thread, shstk_alloc_thread_stack() allocates the shadow > > stack for no reason, the new (kernel) thread will never return to usermode. > > > > Plus the current code doesn't even look correct, in this case fpu_clone() > > won't call update_fpu_shstk(). ... > I agree we don't need to allocate a shadow stack in this case, Great, > but I'm not sure > it is right to fully disable shadow stack in thread.features. Why? > First of all, > disabling it from shstk_alloc_thread_stack() seems weird. It just handles > allocating shadow stacks. I agree in advance with any other change. > Lastly, it doesn't seem there is any way to clone from IO uring today, Not sure I understand... create_io_thread() ? > How about just adding the 'minimal' condition to: > if (clone_flags & CLONE_VFORK) { > shstk->base = 0; > shstk->size = 0; > return 0; > } > ...then update all the comments where vfork is called out as the only case that > does this? but create_io_thread() and vhost_task_create() do not use CLONE_VFORK? Oleg.
On Fri, 2025-08-15 at 14:17 +0200, Oleg Nesterov wrote: > > but I'm not sure > > it is right to fully disable shadow stack in thread.features. > > Why? The bit in thread.features is like a sticky bit that is inherrited whenver a thread is cloned. How it works normally is that the first thread in the app (really glibc loader) enables shadow stacks, then new threads automatically inherit that shadow stack is enabled. So in practice it is like a process wide thing, but stored on each thread. This process-wide behavior is to add to the security. You don't want to allow a protected app to spawn a new thread that escapes the enforcement. There is a way to manually disable shadow stack per- thread, but it is protected by ARCH_SHSTK_LOCK, which gets set by glibc loader before jumping into the actual app. When shadow stack is enabled, depending on the circumstances a new shadow stack will automatically be allocated for a new thread. shstk->base and shstk->size are about that automatically enabled shadow stack. So what are we trying to do for PF_USER_WORKER? Prevent them from wasting a VMA with an unused shadow stack? Or set PF_USER_WORKER's aside from the logic that is about more than protecting the individual thread in the process? > > > First of all, > > disabling it from shstk_alloc_thread_stack() seems weird. It just handles > > allocating shadow stacks. > > I agree in advance with any other change. > > > Lastly, it doesn't seem there is any way to clone from IO uring today, > > Not sure I understand... create_io_thread() ? There was some discussion in the past about adding a clone, but the comment was more about whether it fit the concepts in involved. https://lwn.net/Articles/908268/ > > > How about just adding the 'minimal' condition to: > > if (clone_flags & CLONE_VFORK) { > > shstk->base = 0; > > shstk->size = 0; > > return 0; > > } > > ...then update all the comments where vfork is called out as the only case > > that > > does this? > > but create_io_thread() and vhost_task_create() do not use CLONE_VFORK? No, to make it have the same logic as the vfork case (which doesn't allocate a new shadow stack). Like: if ((clone_flags & CLONE_VFORK) || minimal) { shstk->base = 0; shstk->size = 0; return 0; } Or as Mark was suggesting, replace it with something like: if (needs_new_shstk(clone_args)) { shstk->base = 0; shstk->size = 0; return 0; }
On 08/15, Edgecombe, Rick P wrote: > > The bit in thread.features is like a sticky bit that is inherrited whenver a > thread is cloned. ... > You don't want to allow a protected app to spawn a new thread that > escapes the enforcement. Ah, this is clear. But again, PF_USER_WORKER is the kernel thread cloned by the kernel. Yes, it shares the same thread-group, but this is only to make SIGKILL/exit_group/etc work. It is not that userspace app can create it via something like pthread_create(). > So what are we trying to do for PF_USER_WORKER? Prevent them from wasting a VMA > with an unused shadow stack? Or set PF_USER_WORKER's aside from the logic that > is about more than protecting the individual thread in the process? Let me quote my answer to Mark: The fact that a kernel thread can have the pointless ARCH_SHSTK_SHSTK is the only reason I know why x86_task_fpu(PF_USER_WORKER) has to work. I'd like to make this logic consistent with PF_KTHREAD, and in the longer term change the x86 FPU code so that the kernel threads can run without without "struct fpu" attached to task_struct. And again, please see Warning from x86_task_fpu() https://lore.kernel.org/all/aJVuZZgYjEMxiUYq@ly-workstation/ PF_USER_WORKERs and shadow stack https://lore.kernel.org/all/20250813162824.GA15234@redhat.com/ and 6/6 in this series. > No, to make it have the same logic as the vfork case (which doesn't allocate a > new shadow stack). > > Like: > if ((clone_flags & CLONE_VFORK) || minimal) { > shstk->base = 0; > shstk->size = 0; > return 0; > } Aha, got it. Agreed, but I think we also need to clear ARCH_SHSTK_SHSTK copied by arch_dup_task_struct() ? Oleg.
On Fri, 2025-08-15 at 18:54 +0200, Oleg Nesterov wrote: > > So what are we trying to do for PF_USER_WORKER? Prevent them from wasting a > > VMA > > with an unused shadow stack? Or set PF_USER_WORKER's aside from the logic > > that > > is about more than protecting the individual thread in the process? > > Let me quote my answer to Mark: > > The fact that a kernel thread can have the pointless ARCH_SHSTK_SHSTK > is > the only reason I know why x86_task_fpu(PF_USER_WORKER) has to work. Maybe you can explain the exact failure mode here? ARCH_SHSTK_SHSTK isn't part of the FPU infrastructure, so maybe you can explain how there is some cascade. > > I'd like to make this logic consistent with PF_KTHREAD, and in the > longer > term change the x86 FPU code so that the kernel threads can run > without > without "struct fpu" attached to task_struct. For PF_USER_WORKER it still has access to the user MM, right? Shouldn't it inherit PKRU from the parent? Stop me if I'm telling you something you already know... For better or worse, the x86 FPU state has grown from the classic "extra math registers", to include other things like PKRU and supervisor state. Supervisor state controls other thread specific state that *only* the kernel is supposed to have complete access to. Some of the xfeatures are even about saving and restoring state that *only* affects the kernel. So I think each xfeature needs to be evaluated, and they are all annoyingly different. I think LBR works in the kernel too? > > And again, please see > > Warning from x86_task_fpu() > https://lore.kernel.org/all/aJVuZZgYjEMxiUYq@ly-workstation/ > > PF_USER_WORKERs and shadow stack > https://lore.kernel.org/all/20250813162824.GA15234@redhat.com/ > > and 6/6 in this series. I kind of think it would be more appropriate for you to explain more about what you are trying to do. I've read three things: - Prevent wasting a shstk VMA (which Dave suggested maybe wasn't worth it) - Prevent issue around update_fpu_shstk(), which I'm not sure is an issue - Prevent ptrace from setting FPU state on user workers because it caused problems (details unclear), by changing the FPU design in a way that apparently has impacts across FPU-using features (unclear why this is the best way to prevent it) TBH, based on my current understanding, all three sound dubious to me. Let's get on the same page as far as the goals before we discuss shstk changes further. You might be aware that the x86 FPU is seen as too complex already and so adding special cases tends to have a high bar. So consider making a strong, clear justification for the overall problem/solution you have in mind.
Rick, I'll try to write another email and send V2 next week. Although perhaps I'll send some other unrelated and "obvious" cleanups before that... Until then: On 08/15, Edgecombe, Rick P wrote: > > Maybe you can explain Damn yes, agreed, my fault. > Stop me if I'm telling you something you already know... Or, quite possibly, you should stop me to send the patches which change the code I can hardly understand ;) And that is why I CC'ed the experts like you. See you next week, Oleg.
On Thu, Aug 14, 2025 at 05:03:36PM +0000, Edgecombe, Rick P wrote: > On Thu, 2025-08-14 at 12:14 +0200, Oleg Nesterov wrote: > > If a features_enabled(ARCH_SHSTK_SHSTK) userspace thread creates a > > PF_USER_WORKER thread, shstk_alloc_thread_stack() allocates the shadow > > stack for no reason, the new (kernel) thread will never return to usermode. > I agree we don't need to allocate a shadow stack in this case, but I'm not sure > it is right to fully disable shadow stack in thread.features. First of all, > disabling it from shstk_alloc_thread_stack() seems weird. It just handles > allocating shadow stacks. I agree that it's better to leave userspace shadow stacks enabled, given that the reason we're not allocating the shadow stack is that we don't expect to ever return to userspace then it should be fine to leave the feature turned on for userspace. If we mess up and do somehow return to userspace it seems better to have the feature enabled and hopefully give us some protection against our mistake, and if something causes the worker thread to start a normal thread then things should work as expected. > How about just adding the 'minimal' condition to: > if (clone_flags & CLONE_VFORK) { > shstk->base = 0; > shstk->size = 0; > return 0; > } > ...then update all the comments where vfork is called out as the only case that > does this? Perhaps we should factor the logic for deciding if we need to allocate a userspace shadow stack out into the arch independent code and do something like set a flag in kernel_clone_args that the arches can check? I think the logic is the same for all arches at the minute and don't see a reason why it'd diverge. That'd collide a bit with my clone3() series, there's some overlap there with that creating another reason why the decision would change. Reducing the duplication would be nice.
On 08/14, Mark Brown wrote: > > On Thu, Aug 14, 2025 at 05:03:36PM +0000, Edgecombe, Rick P wrote: > > On Thu, 2025-08-14 at 12:14 +0200, Oleg Nesterov wrote: > > > > If a features_enabled(ARCH_SHSTK_SHSTK) userspace thread creates a > > > PF_USER_WORKER thread, shstk_alloc_thread_stack() allocates the shadow > > > stack for no reason, the new (kernel) thread will never return to usermode. > > > I agree we don't need to allocate a shadow stack in this case, but I'm not sure > > it is right to fully disable shadow stack in thread.features. First of all, > > disabling it from shstk_alloc_thread_stack() seems weird. It just handles > > allocating shadow stacks. > > I agree that it's better to leave userspace shadow stacks enabled, given > that the reason we're not allocating the shadow stack is that we don't > expect to ever return to userspace then it should be fine to leave the > feature turned on for userspace. If we mess up and do somehow return to > userspace But a PF_USER_WORKER task can never return to userspace. It doesn't differ from PF_KTHREAD in this respect. Oleg.
On 08/15, Oleg Nesterov wrote: > > On 08/14, Mark Brown wrote: > > > > On Thu, Aug 14, 2025 at 05:03:36PM +0000, Edgecombe, Rick P wrote: > > > On Thu, 2025-08-14 at 12:14 +0200, Oleg Nesterov wrote: > > > > > > If a features_enabled(ARCH_SHSTK_SHSTK) userspace thread creates a > > > > PF_USER_WORKER thread, shstk_alloc_thread_stack() allocates the shadow > > > > stack for no reason, the new (kernel) thread will never return to usermode. > > > > > I agree we don't need to allocate a shadow stack in this case, but I'm not sure > > > it is right to fully disable shadow stack in thread.features. First of all, > > > disabling it from shstk_alloc_thread_stack() seems weird. It just handles > > > allocating shadow stacks. > > > > I agree that it's better to leave userspace shadow stacks enabled, given > > that the reason we're not allocating the shadow stack is that we don't > > expect to ever return to userspace then it should be fine to leave the > > feature turned on for userspace. If we mess up and do somehow return to > > userspace > > But a PF_USER_WORKER task can never return to userspace. It doesn't differ > from PF_KTHREAD in this respect. ... of course unless it does exec. Oleg.
On Fri, Aug 15, 2025 at 03:08:52PM +0200, Oleg Nesterov wrote: > On 08/15, Oleg Nesterov wrote: > > On 08/14, Mark Brown wrote: > > > I agree that it's better to leave userspace shadow stacks enabled, given > > > that the reason we're not allocating the shadow stack is that we don't > > > expect to ever return to userspace then it should be fine to leave the > > > feature turned on for userspace. If we mess up and do somehow return to > > > userspace > > But a PF_USER_WORKER task can never return to userspace. It doesn't differ > > from PF_KTHREAD in this respect. > ... of course unless it does exec. Sure, but OTOH at least for arm64 there's no cost to leaving the feature enabled unless you actually execute userspace code so if we never return to userspace writing the code to disable isn't really buying us anything.
On 08/15, Mark Brown wrote: > > On Fri, Aug 15, 2025 at 03:08:52PM +0200, Oleg Nesterov wrote: > > On 08/15, Oleg Nesterov wrote: > > > On 08/14, Mark Brown wrote: > > > > > I agree that it's better to leave userspace shadow stacks enabled, given > > > > that the reason we're not allocating the shadow stack is that we don't > > > > expect to ever return to userspace then it should be fine to leave the > > > > feature turned on for userspace. If we mess up and do somehow return to > > > > userspace > > > > But a PF_USER_WORKER task can never return to userspace. It doesn't differ > > > from PF_KTHREAD in this respect. > > > ... of course unless it does exec. > > Sure, but OTOH at least for arm64 there's no cost to leaving the feature > enabled unless you actually execute userspace code so if we never return > to userspace writing the code to disable isn't really buying us anything. The fact that a kernel thread can have the pointless ARCH_SHSTK_SHSTK is the only reason I know why x86_task_fpu(PF_USER_WORKER) has to work. I'd like to make this logic consistent with PF_KTHREAD, and in the longer term change the x86 FPU code so that the kernel threads can run without without "struct fpu" attached to task_struct. Again, please see https://lore.kernel.org/all/20250813191441.GA26754@redhat.com/ Oleg.
On Fri, Aug 15, 2025 at 05:43:11PM +0200, Oleg Nesterov wrote: > On 08/15, Mark Brown wrote: > > Sure, but OTOH at least for arm64 there's no cost to leaving the feature > > enabled unless you actually execute userspace code so if we never return > > to userspace writing the code to disable isn't really buying us anything. > The fact that a kernel thread can have the pointless ARCH_SHSTK_SHSTK is > the only reason I know why x86_task_fpu(PF_USER_WORKER) has to work. > I'd like to make this logic consistent with PF_KTHREAD, and in the longer > term change the x86 FPU code so that the kernel threads can run without > without "struct fpu" attached to task_struct. OK, that's entirely x86 specific - there's no reason we'd want to do that for arm64.
On 08/15, Mark Brown wrote: > > On Fri, Aug 15, 2025 at 05:43:11PM +0200, Oleg Nesterov wrote: > > On 08/15, Mark Brown wrote: > > > > Sure, but OTOH at least for arm64 there's no cost to leaving the feature > > > enabled unless you actually execute userspace code so if we never return > > > to userspace writing the code to disable isn't really buying us anything. > > > The fact that a kernel thread can have the pointless ARCH_SHSTK_SHSTK is > > the only reason I know why x86_task_fpu(PF_USER_WORKER) has to work. > > > I'd like to make this logic consistent with PF_KTHREAD, and in the longer > > term change the x86 FPU code so that the kernel threads can run without > > without "struct fpu" attached to task_struct. > > OK, that's entirely x86 specific - there's no reason we'd want to do > that for arm64. Since I know nothing about arm64. Any reason we do want to have the unnecessary ARCH_SHSTK_SHSTK/shstk on arm64? And... do you agree that shstk_alloc_thread_stack() without update_fpu_shstk() in copy_thread() path doesn't look right? Even if nothing really bad can happen. Oleg.
On Fri, Aug 15, 2025 at 06:00:23PM +0200, Oleg Nesterov wrote: > On 08/15, Mark Brown wrote: > > OK, that's entirely x86 specific - there's no reason we'd want to do > > that for arm64. > Since I know nothing about arm64. Any reason we do want to have the unnecessary > ARCH_SHSTK_SHSTK/shstk on arm64? If you mean allocating the userspace shadow stack for threads that never go to userspace then no, it's not needed. > And... do you agree that shstk_alloc_thread_stack() without update_fpu_shstk() > in copy_thread() path doesn't look right? Even if nothing really bad can happen. Honestly the update_fpu_ stuff is sufficently x86 specific that it doesn't really register as something I'd expect to see in that path.
On Thu, 2025-08-14 at 19:33 +0100, Mark Brown wrote: > > How about just adding the 'minimal' condition to: > > if (clone_flags & CLONE_VFORK) { > > shstk->base = 0; > > shstk->size = 0; > > return 0; > > } > > ...then update all the comments where vfork is called out as the only case > > that > > does this? > > Perhaps we should factor the logic for deciding if we need to allocate a > userspace shadow stack out into the arch independent code and do > something like set a flag in kernel_clone_args that the arches can > check? I think the logic is the same for all arches at the minute and > don't see a reason why it'd diverge. Sounds good. Like a little start to this: https://lore.kernel.org/lkml/20241010-shstk_converge-v1-0-631beca676e7@rivosinc.com/ > That'd collide a bit with my > clone3() series, there's some overlap there with that creating another > reason why the decision would change. Reducing the duplication would be > nice. Argh, I need to test the latest of that still.
On Thu, Aug 14, 2025 at 10:43:45PM +0000, Edgecombe, Rick P wrote: > On Thu, 2025-08-14 at 19:33 +0100, Mark Brown wrote: > > Perhaps we should factor the logic for deciding if we need to allocate a > > userspace shadow stack out into the arch independent code and do > > something like set a flag in kernel_clone_args that the arches can > > check? I think the logic is the same for all arches at the minute and > > don't see a reason why it'd diverge. > Sounds good. Like a little start to this: > https://lore.kernel.org/lkml/20241010-shstk_converge-v1-0-631beca676e7@rivosinc.com/ Yes, exactly. > > That'd collide a bit with my > > clone3() series, there's some overlap there with that creating another > > reason why the decision would change. Reducing the duplication would be > > nice. > Argh, I need to test the latest of that still. Yury pointed me at some newer x86 systems I was able to get access to so I was finally able to test that one myself before sending it out, confirmation would be good but hopefully it's fine. I've been holding back on sending a rebased version out since Deepak was going to help me get set up to test it on RISC-V. Though I see now that the RISC-V code has vanished from -next (I guess due to fallout from the issues with the merge to Linus, it looks like there's almost nothing in the branch currently), not sure what the plan is there? Perhaps I should just send it out, but given the difficulty getting anyone to pay attention I was trying to avoid issues with missing updates for newly added RISC-V shadow stacks.
On Fri, Aug 15, 2025 at 12:44:14PM +0100, Mark Brown wrote: >On Thu, Aug 14, 2025 at 10:43:45PM +0000, Edgecombe, Rick P wrote: >> On Thu, 2025-08-14 at 19:33 +0100, Mark Brown wrote: > >> > Perhaps we should factor the logic for deciding if we need to allocate a >> > userspace shadow stack out into the arch independent code and do >> > something like set a flag in kernel_clone_args that the arches can >> > check? I think the logic is the same for all arches at the minute and >> > don't see a reason why it'd diverge. > >> Sounds good. Like a little start to this: >> https://lore.kernel.org/lkml/20241010-shstk_converge-v1-0-631beca676e7@rivosinc.com/ > >Yes, exactly. > >> > That'd collide a bit with my >> > clone3() series, there's some overlap there with that creating another >> > reason why the decision would change. Reducing the duplication would be >> > nice. > >> Argh, I need to test the latest of that still. > >Yury pointed me at some newer x86 systems I was able to get access to so >I was finally able to test that one myself before sending it out, >confirmation would be good but hopefully it's fine. I've been holding >back on sending a rebased version out since Deepak was going to help me >get set up to test it on RISC-V. Though I see now that the RISC-V code >has vanished from -next (I guess due to fallout from the issues with the >merge to Linus, it looks like there's almost nothing in the branch >currently), not sure what the plan is there? > >Perhaps I should just send it out, but given the difficulty getting >anyone to pay attention I was trying to avoid issues with missing >updates for newly added RISC-V shadow stacks. Yes I was trying to get that sorted as well. Because now I'll have to rebase my changes to 6.17. So I wanted to make sure that it applies cleanly. I suggest that you send it out because risc-v was left out anyways. I'll apply your patch series on my risc-v shadow stack changes (on top of 6.17) and will report back. It might be easier that way. How does that sound?
On Fri, Aug 15, 2025 at 12:11:47PM -0700, Deepak Gupta wrote: > On Fri, Aug 15, 2025 at 12:44:14PM +0100, Mark Brown wrote: > > confirmation would be good but hopefully it's fine. I've been holding > > back on sending a rebased version out since Deepak was going to help me > > get set up to test it on RISC-V. Though I see now that the RISC-V code > > has vanished from -next (I guess due to fallout from the issues with the > > merge to Linus, it looks like there's almost nothing in the branch > > currently), not sure what the plan is there? > > Perhaps I should just send it out, but given the difficulty getting > > anyone to pay attention I was trying to avoid issues with missing > > updates for newly added RISC-V shadow stacks. > Yes I was trying to get that sorted as well. Because now I'll have to > rebase my changes to 6.17. So I wanted to make sure that it applies > cleanly. I suggest that you send it out because risc-v was left out > anyways. I'll apply your patch series on my risc-v shadow stack changes > (on top of 6.17) and will report back. It might be easier that way. > How does that sound? Sounds good. My main concern is that I don't want to end up needlessly holding off either series due to dependencies/cross tree issues - I remain (endlessly!) hopeful that the everyone's happy with the clone3() work at this point and it could get merged, but if RISC-V support is going in then it should support the new interface too. Hopefully we can do something like apply this on a branch and then merge that into the RISC-V tree?
On Mon, Aug 18, 2025 at 06:27:02PM +0100, Mark Brown wrote: >On Fri, Aug 15, 2025 at 12:11:47PM -0700, Deepak Gupta wrote: >> On Fri, Aug 15, 2025 at 12:44:14PM +0100, Mark Brown wrote: > >> > confirmation would be good but hopefully it's fine. I've been holding >> > back on sending a rebased version out since Deepak was going to help me >> > get set up to test it on RISC-V. Though I see now that the RISC-V code >> > has vanished from -next (I guess due to fallout from the issues with the >> > merge to Linus, it looks like there's almost nothing in the branch >> > currently), not sure what the plan is there? > >> > Perhaps I should just send it out, but given the difficulty getting >> > anyone to pay attention I was trying to avoid issues with missing >> > updates for newly added RISC-V shadow stacks. > >> Yes I was trying to get that sorted as well. Because now I'll have to >> rebase my changes to 6.17. So I wanted to make sure that it applies >> cleanly. I suggest that you send it out because risc-v was left out >> anyways. I'll apply your patch series on my risc-v shadow stack changes >> (on top of 6.17) and will report back. It might be easier that way. > >> How does that sound? > >Sounds good. > >My main concern is that I don't want to end up needlessly holding off >either series due to dependencies/cross tree issues - I remain >(endlessly!) hopeful that the everyone's happy with the clone3() work at >this point and it could get merged, but if RISC-V support is going in >then it should support the new interface too. Hopefully we can do >something like apply this on a branch and then merge that into the >RISC-V tree? Yes they should make to upstream independent of each other. There are some changes which I can adopt in my current set of changes (like change `shstk_alloc_thread_stack` prototype). In your current patchsets "fork: Add shadow stack support to clone3()" will need riscv support. Other than that I see that support for `my_syscall2` for riscv arch will be needed. If clone3 changes make in before riscv shadow stack changes, then its easy I'll just add riscv specific changes pertaining to clone3 in my current patchsets. If clone3 changes are later than riscv shadow stack changes, then I'll point you to a branch from where you can pick riscv specific clone3 + shadow stack changes. If they both are going in together (may happen for 6.18 window), then we will have to coordinate on which one applies first. So let's keep an eye on that. If I push my branch somewhere on github (riscv shadow stack changes and then clone3 changes on top of them), Is that fine? -Deepak
© 2016 - 2025 Red Hat, Inc.