[PATCH] sched/rt: use for_each_cpu_wrap to iterate over rto_mask

Posted by Jon Kohler 1 week ago

When using NO_RT_PUSH_IPI, using for_each_cpu() over rto_mask may cause
many CPUs to attempt to pull load from the same CPU, causing RQ
lock contention.

Use for_each_cpu_wrap instead to spread out which RQ gets evaluated
first, similar to how _nohz_idle_balance iterates over idle_cpus_mask.
This strategy is beneficial when there are many CPUs in rto_mask and
many other CPUs going in and out of schedule() at the same time.

Signed-off-by: Jon Kohler <jon@nutanix.com>
---
 kernel/sched/rt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 172c588de542..c883ff122f5d 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2308,7 +2308,7 @@ static void pull_rt_task(struct rq *this_rq)
 	}
 #endif
 
-	for_each_cpu(cpu, this_rq->rd->rto_mask) {
+	for_each_cpu_wrap(cpu, this_rq->rd->rto_mask, this_cpu+1) {
 		if (this_cpu == cpu)
 			continue;
 
-- 
2.43.0

Re: [PATCH] sched/rt: use for_each_cpu_wrap to iterate over rto_mask
Posted by Peter Zijlstra 1 week ago

On Thu, Nov 14, 2024 at 03:05:58PM -0700, Jon Kohler wrote:
> When using NO_RT_PUSH_IPI, using for_each_cpu() over rto_mask may cause
> many CPUs to attempt to pull load from the same CPU, causing RQ
> lock contention.
> 
> Use for_each_cpu_wrap instead to spread out which RQ gets evaluated
> first, similar to how _nohz_idle_balance iterates over idle_cpus_mask.
> This strategy is beneficial when there are many CPUs in rto_mask and
> many other CPUs going in and out of schedule() at the same time.
> 
> Signed-off-by: Jon Kohler <jon@nutanix.com>
> ---
>  kernel/sched/rt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 172c588de542..c883ff122f5d 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2308,7 +2308,7 @@ static void pull_rt_task(struct rq *this_rq)
>  	}
>  #endif
>  
> -	for_each_cpu(cpu, this_rq->rd->rto_mask) {
> +	for_each_cpu_wrap(cpu, this_rq->rd->rto_mask, this_cpu+1) {
>  		if (this_cpu == cpu)
>  			continue;

Works for me I suppose, but as with that other rt patch, please do the
matching change for dl too.