Commit 8671bad873eb ("sched: Do not call __put_task_struct() on rt
if pi_blocked_on is set") changed the behavior of put_task_struct()
unconditionally, even when PREEMPT_RT was not enabled, in clear mismatch
with the commit description.
Restore the previous behavior of put_task_struct() for the PREEMPT_RT
disabled case.
Fixes: 8671bad873eb ("sched: Do not call __put_task_struct() on rt if pi_blocked_on is set")
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
---
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index ea41795a352b..51678a541477 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -130,6 +133,16 @@ static inline void put_task_struct(struct task_struct *t)
if (!refcount_dec_and_test(&t->usage))
return;
+ /* In !RT, it is always safe to call __put_task_struct(). */
+ if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
+ static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP);
+
+ lock_map_acquire_try(&put_task_map);
+ __put_task_struct(t);
+ lock_map_release(&put_task_map);
+ return;
+ }
+
/*
* Under PREEMPT_RT, we can't call __put_task_struct
* in atomic context because it will indirectly
@@ -137,10 +150,6 @@ static inline void put_task_struct(struct task_struct *t)
* current process has a mutex enqueued (blocked on
* a PI chain).
*
- * In !RT, it is always safe to call __put_task_struct().
- * Though, in order to simplify the code, resort to the
- * deferred call too.
- *
* call_rcu() will schedule __put_task_struct_rcu_cb()
* to be called in process context.
*
On 2025-08-06 16:43:55 [-0300], Luis Claudio R. Goncalves wrote: > Commit 8671bad873eb ("sched: Do not call __put_task_struct() on rt > if pi_blocked_on is set") changed the behavior of put_task_struct() > unconditionally, even when PREEMPT_RT was not enabled, in clear mismatch > with the commit description. > > Restore the previous behavior of put_task_struct() for the PREEMPT_RT > disabled case. > > Fixes: 8671bad873eb ("sched: Do not call __put_task_struct() on rt if pi_blocked_on is set") > Acked-by: Oleg Nesterov <oleg@redhat.com> > Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> > --- > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > index ea41795a352b..51678a541477 100644 > --- a/include/linux/sched/task.h > +++ b/include/linux/sched/task.h > @@ -130,6 +133,16 @@ static inline void put_task_struct(struct task_struct *t) > if (!refcount_dec_and_test(&t->usage)) > return; > > + /* In !RT, it is always safe to call __put_task_struct(). */ I don't want to drag this but this comment is obvious for anyone who is fluent in C. It is just a statement with no explanation. An important note would be that the atomic context restriction only apply to PREEMPT_RT and therefore we have this context override for lockdep below. The other question would be why don't we do this unconditionally regardless of PREEMPT_RT. The only reason I could find is that releasing the task here from the "exit path" makes the vmap stack "earlier" available for reuse. > + if (!IS_ENABLED(CONFIG_PREEMPT_RT)) { > + static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP); > + > + lock_map_acquire_try(&put_task_map); > + __put_task_struct(t); > + lock_map_release(&put_task_map); > + return; > + } > + > /* > * Under PREEMPT_RT, we can't call __put_task_struct > * in atomic context because it will indirectly > @@ -137,10 +150,6 @@ static inline void put_task_struct(struct task_struct *t) > * current process has a mutex enqueued (blocked on > * a PI chain). > * > - * In !RT, it is always safe to call __put_task_struct(). > - * Though, in order to simplify the code, resort to the > - * deferred call too. > - * > * call_rcu() will schedule __put_task_struct_rcu_cb() > * to be called in process context. > * > Sebastian
On 08/11, Sebastian Andrzej Siewior wrote: > > I don't want to drag this but this comment is obvious for anyone who is > fluent in C. It is just a statement with no explanation. > An important note would be that the atomic context restriction only > apply to PREEMPT_RT and therefore we have this context override for > lockdep below. The other question would be why don't we do this > unconditionally regardless of PREEMPT_RT. The only reason I could find > is that releasing the task here from the "exit path" makes the vmap > stack "earlier" available for reuse. Sorry, could you clarify your "other" question? What exactly do you think we could do regardless of PREEMPT_RT? Oleg. > > > + if (!IS_ENABLED(CONFIG_PREEMPT_RT)) { > > + static DEFINE_WAIT_OVERRIDE_MAP(put_task_map, LD_WAIT_SLEEP); > > + > > + lock_map_acquire_try(&put_task_map); > > + __put_task_struct(t); > > + lock_map_release(&put_task_map); > > + return; > > + } > > + > > /* > > * Under PREEMPT_RT, we can't call __put_task_struct > > * in atomic context because it will indirectly > > @@ -137,10 +150,6 @@ static inline void put_task_struct(struct task_struct *t) > > * current process has a mutex enqueued (blocked on > > * a PI chain). > > * > > - * In !RT, it is always safe to call __put_task_struct(). > > - * Though, in order to simplify the code, resort to the > > - * deferred call too. > > - * > > * call_rcu() will schedule __put_task_struct_rcu_cb() > > * to be called in process context. > > * > > > > Sebastian >
On 2025-08-11 12:40:34 [+0200], Oleg Nesterov wrote: > On 08/11, Sebastian Andrzej Siewior wrote: > > > > I don't want to drag this but this comment is obvious for anyone who is > > fluent in C. It is just a statement with no explanation. > > An important note would be that the atomic context restriction only > > apply to PREEMPT_RT and therefore we have this context override for > > lockdep below. The other question would be why don't we do this > > unconditionally regardless of PREEMPT_RT. The only reason I could find > > is that releasing the task here from the "exit path" makes the vmap > > stack "earlier" available for reuse. > > Sorry, could you clarify your "other" question? > > What exactly do you think we could do regardless of PREEMPT_RT? Do what we do now and have one free path for task_struct regardless if PREEMPT_RT is enabled or not. The one via RCU which delays the freeing until after the grace period. > Oleg. Sebastian
On 08/11, Sebastian Andrzej Siewior wrote: > > On 2025-08-11 12:40:34 [+0200], Oleg Nesterov wrote: > > > > What exactly do you think we could do regardless of PREEMPT_RT? > > Do what we do now and have one free path for task_struct regardless if > PREEMPT_RT is enabled or not. The one via RCU which delays the freeing > until after the grace period. Ah, got it. I won't really argue, but... call_rcu() is not free, it obviously delays free_task/etc. To me this !PREEMPT_RT optimization makes sense. But lets forget it for the moment. This patch https://lore.kernel.org/all/aGvTz5VaPFyj0pBV@uudg.org [PATCH v6] sched: do not call __put_task_struct() on rt if pi_blocked_on is set removed that optimization by mistake, this doesn't even match the changelog. I think it should be restored, and this is what the new patch from Luis does. Then we can discuss this all again and possibly remove it, but this should be explicitly documented in the changelog. Oleg.
On 2025-08-11 13:21:20 [+0200], Oleg Nesterov wrote: > On 08/11, Sebastian Andrzej Siewior wrote: > > > > On 2025-08-11 12:40:34 [+0200], Oleg Nesterov wrote: > > > > > > What exactly do you think we could do regardless of PREEMPT_RT? > > > > Do what we do now and have one free path for task_struct regardless if > > PREEMPT_RT is enabled or not. The one via RCU which delays the freeing > > until after the grace period. > > Ah, got it. I won't really argue, but... > > call_rcu() is not free, it obviously delays free_task/etc. To me this > !PREEMPT_RT optimization makes sense. > > But lets forget it for the moment. This patch > > https://lore.kernel.org/all/aGvTz5VaPFyj0pBV@uudg.org > [PATCH v6] sched: do not call __put_task_struct() on rt if pi_blocked_on is set > > removed that optimization by mistake, this doesn't even match the changelog. > I think it should be restored, and this is what the new patch from Luis does. > > Then we can discuss this all again and possibly remove it, but this > should be explicitly documented in the changelog. Certainly. I am not saying we should keep it as is. The added comment appears wrong but I am all for getting this merged and then sorting out later. Thank you for spotting this ;) > Oleg. Sebastian
© 2016 - 2025 Red Hat, Inc.