[PATCH] sched_ext: improve WAKE_SYNC behavior for default idle CPU selection

Andrea Righi posted 1 patch 1 month, 1 week ago
There is a newer version of this series
Posted by Andrea Righi 1 month, 1 week ago
In the sched_ext built-in idle CPU selection logic, when handling a
WF_SYNC wakeup, we always attempt to migrate the task to the waker's
CPU, as the waker is expected to yield the CPU after waking the task.

However, it may be preferable to keep the task on its previous CPU if
that CPU is idle and shares a cache with the waker's CPU, since a
migration can then be avoided.

The same approach is also used by the fair class and in other scx
schedulers, like scx_rusty and scx_bpfland.

Therefore, apply the same logic to the built-in idle CPU selection
policy as well.
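
As an illustration only, the intended ordering of the two WF_SYNC checks
can be modelled in user space. This is a rough sketch and not kernel
code: struct model, the model_* helpers, NR_CPUS and the layout of four
CPUs per LLC are all hypothetical stand-ins for idle_masks.cpu,
cpus_share_cache(), test_and_clear_cpu_idle() and the per-CPU local DSQ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS      8	/* hypothetical machine size */
#define CPUS_PER_LLC 4	/* hypothetical: CPUs 0-3 and 4-7 each share an LLC */

/* Hypothetical stand-in for the scheduler state the selection reads. */
struct model {
	uint32_t idle_mask;			/* bit n set: CPU n is idle */
	unsigned int local_dsq_nr[NR_CPUS];	/* tasks queued on each CPU */
};

/* Stand-in for cpus_share_cache(), under the assumed LLC layout. */
static bool model_share_cache(int a, int b)
{
	return a / CPUS_PER_LLC == b / CPUS_PER_LLC;
}

/* Stand-in for test_and_clear_cpu_idle(): claim @cpu if it was idle. */
static bool model_test_and_clear_idle(struct model *m, int cpu)
{
	uint32_t bit = 1u << cpu;
	bool was_idle = (m->idle_mask & bit) != 0;

	m->idle_mask &= ~bit;
	return was_idle;
}

/*
 * Model of the WF_SYNC fast path described above. Returns the chosen
 * CPU, or -1 when neither check applies and the caller would fall
 * through to the regular idle CPU search.
 */
static int model_sync_select(struct model *m, int waker_cpu, int prev_cpu,
			     uint32_t allowed, bool waker_exiting)
{
	assert(waker_cpu >= 0 && waker_cpu < NR_CPUS);
	assert(prev_cpu >= 0 && prev_cpu < NR_CPUS);

	/* Cache-affine waker and idle prev_cpu: stay put, no migration. */
	if (model_share_cache(waker_cpu, prev_cpu) &&
	    model_test_and_clear_idle(m, prev_cpu))
		return prev_cpu;

	/*
	 * Otherwise use the waker's CPU, but only if some CPU is idle,
	 * the waker is not exiting, its local queue is empty and the
	 * wakee is allowed to run there.
	 */
	if (m->idle_mask != 0 && !waker_exiting &&
	    m->local_dsq_nr[waker_cpu] == 0 &&
	    (allowed & (1u << waker_cpu)))
		return waker_cpu;

	return -1;
}
```

With CPUs 1 and 2 idle, a waker on CPU 0 and prev_cpu 1, the model
returns 1 and clears CPU 1's idle bit. With prev_cpu 5, which sits in
the other LLC in this model, it returns the waker's CPU 0 instead.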

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
---
 kernel/sched/ext.c | 44 +++++++++++++++++++++++++++++++-------------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 6eae3b69bf6e..8b7d90db3cfb 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3087,20 +3087,38 @@ static s32 scx_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
 	*found = false;
 
 	/*
-	 * If WAKE_SYNC, the waker's local DSQ is empty, and the system is
-	 * under utilized, wake up @p to the local DSQ of the waker. Checking
-	 * only for an empty local DSQ is insufficient as it could give the
-	 * wakee an unfair advantage when the system is oversaturated.
-	 * Checking only for the presence of idle CPUs is also insufficient as
-	 * the local DSQ of the waker could have tasks piled up on it even if
-	 * there is an idle core elsewhere on the system.
-	 */
-	cpu = smp_processor_id();
-	if ((wake_flags & SCX_WAKE_SYNC) &&
-	    !cpumask_empty(idle_masks.cpu) && !(current->flags & PF_EXITING) &&
-	    cpu_rq(cpu)->scx.local_dsq.nr == 0) {
-		if (cpumask_test_cpu(cpu, p->cpus_ptr))
+	 * If WAKE_SYNC, try to migrate the wakee to the waker's CPU.
+	 */
+	if (wake_flags & SCX_WAKE_SYNC) {
+		cpu = smp_processor_id();
+
+		/*
+		 * If the waker's CPU is cache affine and prev_cpu is idle,
+		 * then avoid a migration.
+		 */
+		if (cpus_share_cache(cpu, prev_cpu) &&
+		    test_and_clear_cpu_idle(prev_cpu))
 			goto cpu_found;
+
+		/*
+		 * If the waker's local DSQ is empty, and the system is under
+		 * utilized, try to wake up @p to the local DSQ of the waker.
+		 *
+		 * Checking only for an empty local DSQ is insufficient as it
+		 * could give the wakee an unfair advantage when the system is
+		 * oversaturated.
+		 *
+		 * Checking only for the presence of idle CPUs is also
+		 * insufficient as the local DSQ of the waker could have tasks
+		 * piled up on it even if there is an idle core elsewhere on
+		 * the system.
+		 */
+		if (!cpumask_empty(idle_masks.cpu) &&
+		    !(current->flags & PF_EXITING) &&
+		    cpu_rq(cpu)->scx.local_dsq.nr == 0) {
+			if (cpumask_test_cpu(cpu, p->cpus_ptr))
+				goto cpu_found;
+		}
 	}
 
 	/*
-- 
2.47.0
Re: [PATCH] sched_ext: improve WAKE_SYNC behavior for default idle CPU selection
Posted by Tejun Heo 1 month, 1 week ago
Hello,

On Fri, Oct 18, 2024 at 11:40:33AM +0200, Andrea Righi wrote:
...
> +	 * If WAKE_SYNC, try to migrate the wakee to the waker's CPU.
> +	 */
> +	if (wake_flags & SCX_WAKE_SYNC) {
> +		cpu = smp_processor_id();
> +
> +		/*
> +		 * If the waker's CPU is cache affine and prev_cpu is idle,
> +		 * then avoid a migration.
> +		 */
> +		if (cpus_share_cache(cpu, prev_cpu) &&
> +		    test_and_clear_cpu_idle(prev_cpu))
>  			goto cpu_found;

Shouldn't this set cpu to prev_cpu before jumping to cpu_found?
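
To make the concern concrete, here is a small user-space sketch, not
the kernel code and not the actual v2, which is not shown in this
thread. The sketch_* names and the bitmask idle state are hypothetical.
The first function jumps to the label with cpu still holding the
waker's CPU, as in the hunk quoted above. The second shows one possible
shape of the branch with the assignment added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for test_and_clear_cpu_idle(). */
static bool sketch_test_and_clear_idle(uint32_t *idle_mask, int cpu)
{
	uint32_t bit;
	bool was_idle;

	assert(cpu >= 0 && cpu < 32);
	bit = 1u << cpu;
	was_idle = (*idle_mask & bit) != 0;
	*idle_mask &= ~bit;
	return was_idle;
}

/* Mirrors the quoted hunk: cpu is never updated before the jump. */
static int sketch_pick_as_posted(uint32_t *idle_mask, int waker_cpu,
				 int prev_cpu, bool share_cache)
{
	int cpu = waker_cpu;

	if (share_cache && sketch_test_and_clear_idle(idle_mask, prev_cpu))
		goto cpu_found;
	return -1;

cpu_found:
	return cpu;	/* waker's CPU, although prev_cpu was claimed */
}

/* Assumed fix: record prev_cpu as the selection before jumping. */
static int sketch_pick_with_assignment(uint32_t *idle_mask, int waker_cpu,
				       int prev_cpu, bool share_cache)
{
	int cpu = waker_cpu;

	if (share_cache && sketch_test_and_clear_idle(idle_mask, prev_cpu)) {
		cpu = prev_cpu;
		goto cpu_found;
	}
	return -1;

cpu_found:
	return cpu;
}
```

In this model, with CPU 1 idle, a waker on CPU 0 and prev_cpu 1, the
first variant consumes CPU 1's idle bit yet returns CPU 0, while the
second returns CPU 1.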

Thanks.

-- 
tejun
Re: [PATCH] sched_ext: improve WAKE_SYNC behavior for default idle CPU selection
Posted by Andrea Righi 1 month, 1 week ago
On Fri, Oct 18, 2024 at 07:38:10AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Fri, Oct 18, 2024 at 11:40:33AM +0200, Andrea Righi wrote:
> ...
> > +	 * If WAKE_SYNC, try to migrate the wakee to the waker's CPU.
> > +	 */
> > +	if (wake_flags & SCX_WAKE_SYNC) {
> > +		cpu = smp_processor_id();
> > +
> > +		/*
> > +		 * If the waker's CPU is cache affine and prev_cpu is idle,
> > +		 * then avoid a migration.
> > +		 */
> > +		if (cpus_share_cache(cpu, prev_cpu) &&
> > +		    test_and_clear_cpu_idle(prev_cpu))
> >  			goto cpu_found;
> 
> Shouldn't this set cpu to prev_cpu before jumping to cpu_found?

Ah! You're absolutely right!

Let me send a v2 (and test it).

Thanks,
-Andrea

> 
> Thanks.
> 
> -- 
> tejun