Add an smp_wmb() to rt_clear_overload() to ensure that the cleared
cpumask bit is visible before anyone iterates over the remaining
overloaded CPU(s).

The smp_wmb() pairs with the smp_rmb() in pull_rt_task(), ensuring
that a thread observes rto_count together with the matching cpumask.

This visibility matters for NO_RT_PUSH_IPI use cases, where a thread
could otherwise iterate over a stale view of rto_mask in which the
target CPUs are no longer overloaded.
Signed-off-by: Jon Kohler <jon@nutanix.com>
---
kernel/sched/rt.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 172c588de542..f68a454bb0e3 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -354,9 +354,13 @@ static inline void rt_clear_overload(struct rq *rq)
 	if (!rq->online)
 		return;
 
-	/* the order here really doesn't matter */
 	atomic_dec(&rq->rd->rto_count);
 	cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask);
+	/*
+	 * Barrier pairs with pull_rt_task(), such that threads will
+	 * observe the correct cpu mask for any remaining overloaded CPU(s).
+	 */
+	smp_wmb();
 }
 
 static inline int has_pushable_tasks(struct rq *rq)
--
2.43.0
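
For context, the write/read pairing the changelog refers to looks
roughly like this (a simplified sketch paraphrasing kernel/sched/rt.c,
not the verbatim source):

	/* writer side: rt_set_overload() */
	cpumask_set_cpu(rq->cpu, rq->rd->rto_mask);
	/* make the mask bit visible before the count */
	smp_wmb();
	atomic_inc(&rq->rd->rto_count);

	/* reader side: pull_rt_task() */
	if (likely(!rt_overloaded(this_rq)))
		return;
	/*
	 * Pairs with the smp_wmb() above: a reader that sees
	 * rto_count must also see the rto_mask bit.
	 */
	smp_rmb();
	/* ... iterate rto_mask, try to pull from overloaded CPUs ... */

The patch above adds the mirror-image barrier on the clear path, so
that a reader observing the decremented rto_count also observes the
cleared mask bit.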
On Thu, Nov 14, 2024 at 02:31:56PM -0700, Jon Kohler wrote:

> [...]
> -	/* the order here really doesn't matter */
> 	atomic_dec(&rq->rd->rto_count);
> 	cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask);
> +	/*
> +	 * Barrier pairs with pull_rt_task(), such that threads will
> +	 * observe the correct cpu mask for any remaining overloaded CPU(s).
> +	 */
> +	smp_wmb();
> }

There is a comment in pull_rt_task() that says there is a barrier in
rt_set_overloaded(), was there ever, when did it go away?

Also, both atomic_dec() and cpumask_clear_cpu() are atomic ops, sadly
they're both variants that don't imply much on our weak architectures
and smp_mb__after_atomic() would be too much for them :/ Oh well.

Also, when modifying rt, always look at dl because that shares a ton
of logic, dl_set_overload() weirdly has all this differently -- and
actually has the barrier on. Please make it all match.
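
For reference, the deadline-side setter Peter points at keeps the
barrier between the mask update and the count update, roughly (a
paraphrase of dl_set_overload() in kernel/sched/deadline.c, not the
verbatim source):

	cpumask_set_cpu(rq->cpu, rq->rd->dlo_mask);
	/*
	 * Must be visible before the overload count is set;
	 * matched by the barrier in pull_dl_task().
	 */
	smp_wmb();
	atomic_inc(&rq->rd->dlo_count);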
On Fri, 15 Nov 2024 11:03:41 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> There is a comment in pull_rt_task() that says there is a barrier in
> rt_set_overloaded(), was there ever, when did it go away?

That's a typo. It matches the barrier in rt_set_overload().

I'll take a look at this change too, since I wrote a bunch of this code
a long time ago. :-p

-- Steve