sched/rt: use smp_wmb in rt_clear_overload

[PATCH] sched/rt: use smp_wmb in rt_clear_overload

Posted by Jon Kohler 1 week ago

Add smp_wmb in rt_clear_overload, which ensures that the cleared
cpumask bit is visible to properly iterate over any remaining
overloaded CPU(s).

The smp_wmb pairs with the smp_rmb in pull_rt_task(), ensuring that a
thread will observe rto_count and the correct cpumask.

This visibility is important for NO_RT_PUSH_IPI use cases where a
thread may iterate over an outdated view of rto_mask where target CPUs
are no longer overloaded.

Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 kernel/sched/rt.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 172c588de542..f68a454bb0e3 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -354,9 +354,13 @@ static inline void rt_clear_overload(struct rq *rq)
 	if (!rq->online)
 		return;
 
-	/* the order here really doesn't matter */
 	atomic_dec(&rq->rd->rto_count);
 	cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask);
+	/*
+	 * Barrier pairs with pull_rt_task(), such that threads will
+	 * observe the correct cpu mask for any remaining overloaded CPU(s).
+	 */
+	smp_wmb();
 }
 
 static inline int has_pushable_tasks(struct rq *rq)
-- 
2.43.0

Re: [PATCH] sched/rt: use smp_wmb in rt_clear_overload

Posted by Peter Zijlstra 1 week ago

On Thu, Nov 14, 2024 at 02:31:56PM -0700, Jon Kohler wrote:
> Add smp_wmb in rt_clear_overload, which ensures that the cleared
> cpumask bit is visible to properly iterate over any remaining
> overloaded CPU(s).
> 
> The smp_wmb pairs with the smp_rmb in pull_rt_task(), ensuring that a
> thread will observe rto_count and the correct cpumask.
> 
> This visibility is important for NO_RT_PUSH_IPI use cases where a
> thread may iterate over an outdated view of rto_mask where target CPUs
> are no longer overloaded.
> 
> Signed-off-by: Jon Kohler <jon@nutanix.com>
> ---
>  kernel/sched/rt.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 172c588de542..f68a454bb0e3 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -354,9 +354,13 @@ static inline void rt_clear_overload(struct rq *rq)
>  	if (!rq->online)
>  		return;
>  
> -	/* the order here really doesn't matter */
>  	atomic_dec(&rq->rd->rto_count);
>  	cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask);
> +	/*
> +	 * Barrier pairs with pull_rt_task(), such that threads will
> +	 * observe the correct cpu mask for any remaining overloaded CPU(s).
> +	 */
> +	smp_wmb();
>  }

There is a comment in pull_rt_task() that says there is a barrier in
rt_set_overloaded(), was there ever, when did it go away?

Also, both atomic_dec() and cpumask_clear_cpu() are atomic ops, sadly
they're both variants that don't imply much on our weak architectures
and smp_mb__after_atomic() would be too much for them :/

Oh well.

Also, when modifying rt, always look at dl because that shares a ton of
logic, dl_set_overload() weirdly has all this differently -- and
actually has the barrier on. Please make it all match.

Re: [PATCH] sched/rt: use smp_wmb in rt_clear_overload

Posted by Steven Rostedt 1 week ago

On Fri, 15 Nov 2024 11:03:41 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> There is a comment in pull_rt_task() that says there is a barrier in
> rt_set_overloaded(), was there ever, when did it go away?

That's a typo. It matches the barrier in rt_set_overload().

I'll take a look at this change too, since I wrote a bunch of this code a
long time ago. :-p

-- Steve