[PATCH v2] sched/core: avoid calling select_task_rq cb if bound to one CPU for exec

Jianyong Wu posted 1 patch 2 months, 1 week ago
kernel/sched/core.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
In the current implementation, sched_exec() invokes the select_task_rq
callback to select a CPU even when the calling task is bound to a
single CPU (or is not allowed to migrate). This is unnecessary and
wastes cycles.

Since select_task_rq() already includes checks for the above scenarios
(e.g., tasks bound to a single CPU or forbidden to migrate) and skips
the select_task_rq callback in such cases, we can directly use
select_task_rq() instead of invoking the callback here.

Test environment: 256-CPU X86 server
Test method: Run unixbench's execl test with task bound to a single CPU:

  $ numactl -C 10 ./Run execl -c 1

Test results: Average of 5 runs

baseline    patched    improvement
383.82      436.78     +13.8%

Change Log:

v1->v2
As suggested by Peter, replace manual corner-case checks with
select_task_rq() to align with existing logic.

Additional testing on a 256-CPU server on which all sched domains have
the SD_BALANCE_EXEC flag shows that sched_exec now searches all CPUs in
the system (previously, some SD_NUMA sched domains lacked
SD_BALANCE_EXEC). This increased the performance improvement to 13.8%.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Yibin Liu <liuyibin@hygon.cn>
Signed-off-by: Yibin Liu <liuyibin@hygon.cn>
Signed-off-by: Jianyong Wu <wujianyong@hygon.cn>
---
 kernel/sched/core.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f754a60de848..6e4ba3c27e5c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5439,10 +5439,11 @@ void sched_exec(void)
 {
 	struct task_struct *p = current;
 	struct migration_arg arg;
-	int dest_cpu;
+	int dest_cpu, wake_flag = WF_EXEC;
 
 	scoped_guard (raw_spinlock_irqsave, &p->pi_lock) {
-		dest_cpu = p->sched_class->select_task_rq(p, task_cpu(p), WF_EXEC);
+		dest_cpu = select_task_rq(p, task_cpu(p), &wake_flag);
+
 		if (dest_cpu == smp_processor_id())
 			return;
 
-- 
2.43.0
RE: [PATCH v2] sched/core: avoid calling select_task_rq cb if bound to one CPU for exec
Posted by Jianyong Wu 3 weeks, 6 days ago
Gentle ping. :)

> -----Original Message-----
> From: Jianyong Wu <wujianyong@hygon.cn>
> Sent: Monday, December 1, 2025 6:56 PM
> To: peterz@infradead.org; mingo@redhat.com; juri.lelli@redhat.com;
> vincent.guittot@linaro.org
> Cc: dietmar.eggemann@arm.com; rostedt@goodmis.org;
> bsegall@google.com; mgorman@suse.de; vschneid@redhat.com;
> linux-kernel@vger.kernel.org; jianyong.wu@outlook.com; Jianyong Wu
> <wujianyong@hygon.cn>; Yibin Liu <liuyibin@hygon.cn>
> Subject: [PATCH v2] sched/core: avoid calling select_task_rq cb if bound to
> one CPU for exec
> 