[PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.

Sebastian Andrzej Siewior posted 2 patches 2 years, 8 months ago
There is a newer version of this series
Posted by Sebastian Andrzej Siewior 2 years, 8 months ago
On PREEMPT_RT keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires css_set_lock
which is a sleeping lock on PREEMPT_RT and must not be acquired with disabled
preemption.
The preempt-disabled section is only for performance optimisation
reasons and can be avoided.

Extend the comment and don't disable preemption before scheduling on
PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/signal.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index da017a5461163..9e07b3075c72e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * The preempt-disable section ensures that there will be no preemption
 	 * between unlock and schedule() and so improving the performance since
 	 * the ptracer has no reason to sleep.
+	 *
+	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
+	 * within the preempt-disable section.
 	 */
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
 	read_unlock(&tasklist_lock);
 	cgroup_enter_frozen();
-	preempt_enable_no_resched();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable_no_resched();
 	schedule();
 	cgroup_leave_frozen(true);
 
-- 
2.40.1
Re: [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
Posted by Oleg Nesterov 2 years, 8 months ago
The patch LGTM, but I am a bit confused by the changelog/comments,
I guess I missed something...

On 06/06, Sebastian Andrzej Siewior wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
>  	 * The preempt-disable section ensures that there will be no preemption
>  	 * between unlock and schedule() and so improving the performance since
>  	 * the ptracer has no reason to sleep.
> +	 *
> +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> +	 * within the preempt-disable section.
>  	 */
> -	preempt_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		preempt_disable();

Not only do we have the problems with cgroup_enter_frozen(), afaics (please correct me)
this optimisation doesn't work on RT anyway?

IIUC, read_lock() on RT disables migration but not preemption, so it is simply
too late to do preempt_disable() before unlock/schedule. The tracer can preempt
the tracee right after do_notify_parent_cldstop().

Oleg.
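
The point above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: the enum, the step names and preemptible_after() are hypothetical, and the only behaviour modelled is that a non-RT read_lock() raises the preempt count while an RT read_lock() does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the (unpatched) steps near the end of ptrace_stop(). */
enum step {
	STEP_READ_LOCK,		/* read_lock(&tasklist_lock) */
	STEP_NOTIFY_PARENT,	/* do_notify_parent_cldstop() */
	STEP_PREEMPT_DISABLE,	/* preempt_disable() */
	STEP_READ_UNLOCK,	/* read_unlock(&tasklist_lock) */
	STEP_PREEMPT_ENABLE,	/* preempt_enable_no_resched() */
	STEP_SCHEDULE,		/* schedule() */
	STEP_COUNT
};

/*
 * Return true if the modelled task is preemptible once all steps up to
 * and including 'last' have run. 'rt' selects the PREEMPT_RT behaviour.
 */
bool preemptible_after(enum step last, bool rt)
{
	int depth = 0;	/* models the preempt count */

	assert(last < STEP_COUNT);

	for (int s = STEP_READ_LOCK; s <= (int)last; s++) {
		switch (s) {
		case STEP_READ_LOCK:
			/* Assumption: only the non-RT rwlock disables preemption. */
			if (!rt)
				depth++;
			break;
		case STEP_READ_UNLOCK:
			if (!rt)
				depth--;
			break;
		case STEP_PREEMPT_DISABLE:
			depth++;
			break;
		case STEP_PREEMPT_ENABLE:
			depth--;
			break;
		default:
			break;
		}
	}
	return depth == 0;
}
```

In this model, preemptible_after(STEP_NOTIFY_PARENT, true) is true while the same call with rt set to false is not: on RT the window for the tracer to preempt the tracee is already open before preempt_disable() runs.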
Re: [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
Posted by Peter Zijlstra 2 years, 8 months ago
On Tue, Jun 06, 2023 at 01:04:48PM +0200, Oleg Nesterov wrote:
> The patch LGTM, but I am a bit confused by the changelog/comments,
> I guess I missed something...
> 
> On 06/06, Sebastian Andrzej Siewior wrote:
> >
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
> >  	 * The preempt-disable section ensures that there will be no preemption
> >  	 * between unlock and schedule() and so improving the performance since
> >  	 * the ptracer has no reason to sleep.
> > +	 *
> > +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> > +	 * within the preempt-disable section.
> >  	 */
> > -	preempt_disable();
> > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > +		preempt_disable();
> 
> Not only do we have the problems with cgroup_enter_frozen(), afaics (please correct me)
> this optimisation doesn't work on RT anyway?
> 
> IIUC, read_lock() on RT disables migration but not preemption, so it is simply
> too late to do preempt_disable() before unlock/schedule. The tracer can preempt
> the tracee right after do_notify_parent_cldstop().

Correct -- but I think you can disable preemption over what is
effectively rwsem_up_read(), but you can't over the effective
rtmutex_lock() that cgroup_enter_frozen() will then attempt.

(iow, unlock() doesn't tend to sleep, while lock() does)

But you're correct to point out that the whole preempt_disable() thing
is entirely pointless due to the whole task_lock region being
preemptible before it.
Re: [PATCH REPOST 2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
Posted by Oleg Nesterov 2 years, 8 months ago
On 06/06, Peter Zijlstra wrote:
>
> On Tue, Jun 06, 2023 at 01:04:48PM +0200, Oleg Nesterov wrote:
> > The patch LGTM, but I am a bit confused by the changelog/comments,
> > I guess I missed something...
> >
> > On 06/06, Sebastian Andrzej Siewior wrote:
> > >
> > > --- a/kernel/signal.c
> > > +++ b/kernel/signal.c
> > > @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
> > >  	 * The preempt-disable section ensures that there will be no preemption
> > >  	 * between unlock and schedule() and so improving the performance since
> > >  	 * the ptracer has no reason to sleep.
> > > +	 *
> > > +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> > > +	 * within the preempt-disable section.
> > >  	 */
> > > -	preempt_disable();
> > > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > > +		preempt_disable();
> >
> > Not only do we have the problems with cgroup_enter_frozen(), afaics (please correct me)
> > this optimisation doesn't work on RT anyway?
> >
> > IIUC, read_lock() on RT disables migration but not preemption, so it is simply
> > too late to do preempt_disable() before unlock/schedule. The tracer can preempt
> > the tracee right after do_notify_parent_cldstop().
>
> Correct -- but I think you can disable preemption over what is
> effectively rwsem_up_read(), but you can't over the effective
> rtmutex_lock() that cgroup_enter_frozen() will then attempt.
>
> (iow, unlock() doesn't tend to sleep, while lock() does)
>
> But you're correct to point out that the whole preempt_disable() thing
> is entirely pointless due to the whole task_lock region being
> preemptible before it.

Thanks Peter.

So I think the comment should be updated. Otherwise it looks as if it makes
sense to try to move cgroup_enter_frozen() up before preempt_disable().

Oleg.
[PATCH 2/2 v2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
Posted by Sebastian Andrzej Siewior 2 years, 8 months ago
On PREEMPT_RT keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires css_set_lock
which is a sleeping lock on PREEMPT_RT and must not be acquired with disabled
preemption.
The preempt-disabled section is only for performance optimisation
reasons and can be avoided.

Extend the comment and don't disable preemption before scheduling on
PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

Is this better?

v1…v2:
  - Extend the comment to note that preemption isn't disabled due to
    the lock to make it obvious that the optimisation isn't just
    harmful but also pointless.

 kernel/signal.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index da017a5461163..dcb0b1fbcb3a8 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2328,11 +2328,20 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * The preempt-disable section ensures that there will be no preemption
 	 * between unlock and schedule() and so improving the performance since
 	 * the ptracer has no reason to sleep.
+	 *
+	 * On PREEMPT_RT locking tasklist_lock does not disable preemption.
+	 * Therefore the task can be preempted (after
+	 * do_notify_parent_cldstop()) before unlocking tasklist_lock so there
+	 * is no benefit in doing this. The optimisation is harmful on
+	 * PREEMPT_RT because the spinlock_t (in cgroup_enter_frozen()) must not
+	 * be acquired with disabled preemption.
 	 */
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
 	read_unlock(&tasklist_lock);
 	cgroup_enter_frozen();
-	preempt_enable_no_resched();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable_no_resched();
 	schedule();
 	cgroup_leave_frozen(true);
 
-- 
2.40.1
Re: [PATCH 2/2 v2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.
Posted by Oleg Nesterov 2 years, 8 months ago
On 06/06, Sebastian Andrzej Siewior wrote:
>
> v1…v2:
>   - Extend the comment to note that preemption isn't disabled due to
>     the lock to make it obvious that the optimisation isn't just
>     harmful but also pointless.

Thanks,

Acked-by: Oleg Nesterov <oleg@redhat.com>
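
As a rough illustration of what the v2 patch changes, the following userspace sketch models the tail of ptrace_stop() with and without the IS_ENABLED(CONFIG_PREEMPT_RT) checks. Everything here is hypothetical (the model_* names, the counters, the 'patched' flag); it assumes only what the changelog states, namely that css_set_lock is a sleeping lock on PREEMPT_RT and must not be acquired with preemption disabled.

```c
#include <assert.h>
#include <stdbool.h>

static int preempt_depth;	/* models the preempt count */
static int violations;		/* sleeping-lock acquisitions while non-preemptible */

static void model_preempt_disable(void)
{
	preempt_depth++;
}

static void model_preempt_enable_no_resched(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/* Stand-in for taking a lock that may sleep (css_set_lock on PREEMPT_RT). */
static void model_sleeping_lock(void)
{
	if (preempt_depth > 0)
		violations++;
}

/*
 * Model the sequence from the diff and return the number of violations.
 * 'rt' selects PREEMPT_RT behaviour, 'patched' selects the v2 logic.
 */
int model_ptrace_stop_tail(bool rt, bool patched)
{
	bool skip_preempt_disable = rt && patched;

	preempt_depth = 0;
	violations = 0;

	if (!skip_preempt_disable)
		model_preempt_disable();

	/* read_unlock(&tasklist_lock): a release, not modelled as sleeping. */

	/* cgroup_enter_frozen(): only a sleeping lock in the RT model. */
	if (rt)
		model_sleeping_lock();

	if (!skip_preempt_disable)
		model_preempt_enable_no_resched();

	/* schedule() would follow here. */
	return violations;
}
```

In this model only the unpatched RT case, model_ptrace_stop_tail(true, false), reports a violation; the non-RT path is the same with or without the patch.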