From: zhangwei123171 <zhangwei123171@jd.com>
When an idle core cannot be found, the first sched-idle CPU
or the first available idle CPU will be used, if one exists.
We can instead record the available idle CPU detected later
in the scan, to ensure it is used if it exists.
Signed-off-by: zhangwei123171 <zhangwei123171@jd.com>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 41b58387023d..653ca3ea09b6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7341,7 +7341,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
}
break;
}
- if (*idle_cpu == -1 && cpumask_test_cpu(cpu, cpus))
+ if (cpumask_test_cpu(cpu, cpus))
*idle_cpu = cpu;
}
--
2.33.0
Hello,
kernel test robot noticed a 2.9% improvement of stress-ng.vm-rw.ops_per_sec on:
commit: 9f2e02ee19cda318b3889a27c13aee04fdbeb179 ("[PATCH] sched/fair: prefer available idle cpu in select_idle_core")
url: https://github.com/intel-lab-lkp/linux/commits/zhangwei123171-gmail-com/sched-fair-prefer-available-idle-cpu-in-select_idle_core/20240612-195645
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git c793a62823d1ce8f70d9cfc7803e3ea436277cda
patch link: https://lore.kernel.org/all/20240612115410.1659149-1-zhangwei123171@jd.com/
patch subject: [PATCH] sched/fair: prefer available idle cpu in select_idle_core
testcase: stress-ng
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
parameters:
nr_threads: 100%
testtime: 60s
test: vm-rw
cpufreq_governor: performance
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240620/202406201547.f5077fa1-oliver.sang@intel.com
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
gcc-13/performance/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/vm-rw/stress-ng/60s
commit:
c793a62823 ("sched/core: Drop spinlocks on contention iff kernel is preemptible")
9f2e02ee19 ("sched/fair: prefer available idle cpu in select_idle_core")
c793a62823d1ce8f 9f2e02ee19cda318b3889a27c13
---------------- ---------------------------
%stddev %change %stddev
\ | \
295657 ± 6% +22.8% 362935 ± 8% meminfo.Active
295610 ± 6% +22.8% 362887 ± 8% meminfo.Active(anon)
150724 ± 21% +67.4% 252378 ± 3% sched_debug.cfs_rq:/.avg_vruntime.stddev
150724 ± 21% +67.4% 252378 ± 3% sched_debug.cfs_rq:/.min_vruntime.stddev
10941857 +3.9% 11367018 vmstat.system.cs
1076455 -1.9% 1055781 vmstat.system.in
74324 ± 6% +21.9% 90584 ± 7% proc-vmstat.nr_active_anon
3.512e+09 +3.3% 3.626e+09 proc-vmstat.nr_foll_pin_acquired
3.512e+09 +3.3% 3.626e+09 proc-vmstat.nr_foll_pin_released
74324 ± 6% +21.9% 90584 ± 7% proc-vmstat.nr_zone_active_anon
760031 -70.4% 224972 stress-ng.time.involuntary_context_switches
3.948e+08 +2.9% 4.062e+08 stress-ng.time.voluntary_context_switches
1.975e+08 +2.9% 2.032e+08 stress-ng.vm-rw.ops
3291726 +2.9% 3387191 stress-ng.vm-rw.ops_per_sec
4.035e+10 +2.2% 4.123e+10 perf-stat.i.branch-instructions
0.67 -0.0 0.65 perf-stat.i.branch-miss-rate%
6.491e+09 +1.1% 6.564e+09 perf-stat.i.cache-references
11493579 +3.5% 11900089 perf-stat.i.context-switches
2.41 -1.9% 2.37 perf-stat.i.cpi
4817773 +1.7% 4901418 perf-stat.i.cpu-migrations
2.16e+11 +2.2% 2.208e+11 perf-stat.i.instructions
0.43 +1.7% 0.43 perf-stat.i.ipc
71.91 +3.6% 74.50 perf-stat.i.metric.K/sec
0.62 -0.0 0.61 perf-stat.overall.branch-miss-rate%
2.38 -1.6% 2.34 perf-stat.overall.cpi
0.42 +1.6% 0.43 perf-stat.overall.ipc
3.903e+10 +2.5% 3.999e+10 perf-stat.ps.branch-instructions
6.286e+09 +1.7% 6.395e+09 perf-stat.ps.cache-references
11123522 +4.1% 11584380 perf-stat.ps.context-switches
4664338 +2.4% 4775066 perf-stat.ps.cpu-migrations
2.09e+11 +2.5% 2.143e+11 perf-stat.ps.instructions
1.266e+13 +2.8% 1.301e+13 perf-stat.total.instructions
16.29 -0.3 15.97 perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
16.70 -0.3 16.39 perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
14.46 -0.3 14.16 perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_sync_key.pipe_write.vfs_write.ksys_write
14.53 -0.3 14.23 perf-profile.calltrace.cycles-pp.__wake_up_sync_key.pipe_write.vfs_write.ksys_write.do_syscall_64
13.71 -0.3 13.41 perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.pipe_write
13.73 -0.3 13.44 perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.pipe_write.vfs_write
18.54 -0.2 18.31 perf-profile.calltrace.cycles-pp.__clone
8.93 -0.2 8.76 perf-profile.calltrace.cycles-pp.write.stress_vm_rw
8.77 -0.2 8.60 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_vm_rw
8.71 -0.2 8.54 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.stress_vm_rw
8.78 -0.2 8.62 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write.stress_vm_rw
62.61 -0.2 62.46 perf-profile.calltrace.cycles-pp.stress_vm_rw
8.39 -0.1 8.25 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.__clone
8.40 -0.1 8.26 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write.__clone
8.32 -0.1 8.18 perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write.__clone
8.52 -0.1 8.38 perf-profile.calltrace.cycles-pp.write.__clone
13.62 -0.1 13.48 perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
14.30 -0.1 14.17 perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
9.69 -0.1 9.56 perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
9.59 -0.1 9.47 perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
23.18 -0.1 23.07 perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.process_vm_rw_single_vec.process_vm_rw_core.process_vm_rw
23.42 -0.1 23.31 perf-profile.calltrace.cycles-pp.copy_page_to_iter.process_vm_rw_single_vec.process_vm_rw_core.process_vm_rw.__x64_sys_process_vm_readv
9.04 -0.1 8.95 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.stress_vm_rw
4.71 -0.1 4.62 perf-profile.calltrace.cycles-pp.available_idle_cpu.select_idle_core.select_idle_cpu.select_idle_sibling.select_task_rq_fair
9.05 -0.1 8.96 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read.stress_vm_rw
7.79 -0.1 7.71 perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.stress_vm_rw
0.63 -0.1 0.56 perf-profile.calltrace.cycles-pp.read
0.70 -0.0 0.67 perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.pipe_read.vfs_read
0.58 -0.0 0.55 perf-profile.calltrace.cycles-pp._raw_spin_lock.__schedule.schedule_idle.do_idle.cpu_startup_entry
1.42 -0.0 1.40 perf-profile.calltrace.cycles-pp.switch_fpu_return.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
0.84 -0.0 0.83 perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.read.__clone
0.71 +0.0 0.74 perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.sched_ttwu_pending
0.87 +0.0 0.90 perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule_idle.do_idle.cpu_startup_entry
6.19 +0.0 6.22 perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
6.24 +0.0 6.28 perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
5.16 +0.0 5.21 perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.process_vm_rw_single_vec.process_vm_rw_core.process_vm_rw
2.76 +0.0 2.81 perf-profile.calltrace.cycles-pp.pin_user_pages_remote.process_vm_rw_single_vec.process_vm_rw_core.process_vm_rw.__x64_sys_process_vm_writev
3.50 +0.0 3.54 perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.common_startup_64
3.36 +0.0 3.41 perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
5.35 +0.0 5.40 perf-profile.calltrace.cycles-pp.copy_page_from_iter.process_vm_rw_single_vec.process_vm_rw_core.process_vm_rw.__x64_sys_process_vm_writev
0.66 +0.1 0.73 perf-profile.calltrace.cycles-pp.sched_mm_cid_migrate_to.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue
6.76 +0.1 6.82 perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
2.15 +0.1 2.22 perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue
7.38 +0.1 7.45 perf-profile.calltrace.cycles-pp.select_idle_cpu.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up
9.00 +0.1 9.08 perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
9.37 +0.1 9.45 perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
2.58 +0.1 2.68 perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue
9.08 +0.1 9.19 perf-profile.calltrace.cycles-pp.process_vm_rw_single_vec.process_vm_rw_core.process_vm_rw.__x64_sys_process_vm_writev.do_syscall_64
9.71 +0.1 9.82 perf-profile.calltrace.cycles-pp.process_vm_rw_core.process_vm_rw.__x64_sys_process_vm_writev.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.48 +0.1 8.60 perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
10.27 +0.1 10.39 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.process_vm_writev.stress_vm_rw
10.12 +0.1 10.23 perf-profile.calltrace.cycles-pp.__x64_sys_process_vm_writev.do_syscall_64.entry_SYSCALL_64_after_hwframe.process_vm_writev.stress_vm_rw
10.24 +0.1 10.36 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.process_vm_writev.stress_vm_rw
10.10 +0.1 10.22 perf-profile.calltrace.cycles-pp.process_vm_rw.__x64_sys_process_vm_writev.do_syscall_64.entry_SYSCALL_64_after_hwframe.process_vm_writev
10.64 +0.1 10.77 perf-profile.calltrace.cycles-pp.process_vm_writev.stress_vm_rw
4.31 +0.2 4.46 perf-profile.calltrace.cycles-pp.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry
5.36 +0.2 5.52 perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary.common_startup_64
3.40 +0.2 3.57 perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle
4.81 +0.2 4.99 perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary
16.66 +0.3 16.96 perf-profile.calltrace.cycles-pp.common_startup_64
16.57 +0.3 16.88 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
16.54 +0.3 16.84 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
16.58 +0.3 16.89 perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
76.34 -0.4 75.95 perf-profile.children.cycles-pp.do_syscall_64
76.51 -0.4 76.12 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
16.31 -0.3 16.00 perf-profile.children.cycles-pp.pipe_write
14.46 -0.3 14.16 perf-profile.children.cycles-pp.__wake_up_common
14.54 -0.3 14.24 perf-profile.children.cycles-pp.__wake_up_sync_key
13.74 -0.3 13.44 perf-profile.children.cycles-pp.autoremove_wake_function
13.72 -0.3 13.43 perf-profile.children.cycles-pp.try_to_wake_up
16.83 -0.3 16.57 perf-profile.children.cycles-pp.vfs_write
17.16 -0.2 16.91 perf-profile.children.cycles-pp.ksys_write
17.70 -0.2 17.45 perf-profile.children.cycles-pp.write
18.54 -0.2 18.31 perf-profile.children.cycles-pp.__clone
18.78 -0.2 18.58 perf-profile.children.cycles-pp.read
2.91 -0.2 2.72 perf-profile.children.cycles-pp._raw_spin_lock
13.88 -0.2 13.71 perf-profile.children.cycles-pp.pipe_read
14.54 -0.2 14.37 perf-profile.children.cycles-pp.vfs_read
9.93 -0.2 9.76 perf-profile.children.cycles-pp.schedule
14.70 -0.2 14.54 perf-profile.children.cycles-pp.ksys_read
0.24 -0.2 0.08 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
62.61 -0.2 62.46 perf-profile.children.cycles-pp.stress_vm_rw
23.98 -0.1 23.87 perf-profile.children.cycles-pp._copy_to_iter
24.25 -0.1 24.14 perf-profile.children.cycles-pp.copy_page_to_iter
3.16 -0.1 3.10 perf-profile.children.cycles-pp.enqueue_task_fair
2.61 -0.0 2.57 perf-profile.children.cycles-pp.update_load_avg
1.78 -0.0 1.74 perf-profile.children.cycles-pp.prepare_task_switch
0.28 -0.0 0.24 perf-profile.children.cycles-pp.update_rq_clock_task
2.31 -0.0 2.28 perf-profile.children.cycles-pp.syscall_exit_to_user_mode
0.38 -0.0 0.35 perf-profile.children.cycles-pp.wake_affine
0.82 -0.0 0.79 perf-profile.children.cycles-pp.update_rq_clock
0.89 -0.0 0.86 perf-profile.children.cycles-pp.asm_sysvec_call_function_single
1.43 -0.0 1.40 perf-profile.children.cycles-pp.switch_fpu_return
0.10 ± 4% -0.0 0.08 ± 3% perf-profile.children.cycles-pp.cpuacct_charge
0.21 ± 2% -0.0 0.19 ± 2% perf-profile.children.cycles-pp.task_h_load
0.13 ± 3% -0.0 0.11 perf-profile.children.cycles-pp.wakeup_preempt
0.95 -0.0 0.93 perf-profile.children.cycles-pp.prepare_to_wait_event
0.16 +0.0 0.17 perf-profile.children.cycles-pp.tick_nohz_idle_enter
0.62 +0.0 0.63 perf-profile.children.cycles-pp.entry_SYSCALL_64
0.29 +0.0 0.30 perf-profile.children.cycles-pp.nohz_run_idle_balance
0.92 +0.0 0.94 perf-profile.children.cycles-pp.mod_node_page_state
0.25 +0.0 0.27 perf-profile.children.cycles-pp.update_min_vruntime
0.77 +0.0 0.79 perf-profile.children.cycles-pp.__switch_to_asm
0.46 +0.0 0.48 perf-profile.children.cycles-pp.llist_reverse_order
6.27 +0.0 6.30 perf-profile.children.cycles-pp.cpuidle_enter
0.48 ± 2% +0.0 0.52 ± 2% perf-profile.children.cycles-pp.pick_next_task_idle
0.47 ± 2% +0.0 0.51 ± 2% perf-profile.children.cycles-pp.__update_idle_core
0.81 +0.0 0.85 perf-profile.children.cycles-pp.sched_mm_cid_migrate_to
0.09 ± 6% +0.0 0.13 ± 5% perf-profile.children.cycles-pp.generic_perform_write
3.52 +0.0 3.57 perf-profile.children.cycles-pp.schedule_idle
5.40 +0.0 5.45 perf-profile.children.cycles-pp._copy_from_iter
0.00 +0.1 0.05 ± 5% perf-profile.children.cycles-pp.copy_page_from_iter_atomic
0.09 ± 6% +0.1 0.14 ± 7% perf-profile.children.cycles-pp.shmem_file_write_iter
5.64 +0.1 5.69 perf-profile.children.cycles-pp.copy_page_from_iter
0.00 +0.1 0.06 ± 7% perf-profile.children.cycles-pp.ring_buffer_read_head
0.12 ± 6% +0.1 0.18 ± 6% perf-profile.children.cycles-pp.record__pushfn
6.80 +0.1 6.86 perf-profile.children.cycles-pp.cpuidle_idle_call
0.12 ± 6% +0.1 0.18 ± 6% perf-profile.children.cycles-pp.writen
0.00 +0.1 0.07 ± 10% perf-profile.children.cycles-pp.perf_mmap__read_head
7.43 +0.1 7.51 perf-profile.children.cycles-pp.select_idle_cpu
9.00 +0.1 9.09 perf-profile.children.cycles-pp.select_task_rq_fair
9.37 +0.1 9.46 perf-profile.children.cycles-pp.select_task_rq
0.18 ± 5% +0.1 0.29 ± 6% perf-profile.children.cycles-pp.perf_mmap__push
0.19 ± 5% +0.1 0.30 ± 6% perf-profile.children.cycles-pp.record__mmap_read_evlist
0.19 ± 5% +0.1 0.30 ± 6% perf-profile.children.cycles-pp.cmd_record
0.19 ± 6% +0.1 0.31 ± 6% perf-profile.children.cycles-pp.main
0.19 ± 6% +0.1 0.31 ± 6% perf-profile.children.cycles-pp.run_builtin
8.50 +0.1 8.62 perf-profile.children.cycles-pp.select_idle_sibling
10.12 +0.1 10.24 perf-profile.children.cycles-pp.__x64_sys_process_vm_writev
10.80 +0.1 10.92 perf-profile.children.cycles-pp.process_vm_writev
4.92 +0.1 5.07 perf-profile.children.cycles-pp.sched_ttwu_pending
5.43 +0.1 5.58 perf-profile.children.cycles-pp.flush_smp_call_function_queue
5.54 +0.2 5.70 perf-profile.children.cycles-pp.__flush_smp_call_function_queue
16.66 +0.3 16.96 perf-profile.children.cycles-pp.common_startup_64
16.66 +0.3 16.96 perf-profile.children.cycles-pp.cpu_startup_entry
16.63 +0.3 16.93 perf-profile.children.cycles-pp.do_idle
16.58 +0.3 16.89 perf-profile.children.cycles-pp.start_secondary
0.24 -0.2 0.08 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
23.76 -0.1 23.65 perf-profile.self.cycles-pp._copy_to_iter
1.59 -0.0 1.54 perf-profile.self.cycles-pp.prepare_task_switch
1.31 -0.0 1.28 perf-profile.self.cycles-pp.update_load_avg
0.36 -0.0 0.33 perf-profile.self.cycles-pp.switch_fpu_return
0.24 ± 2% -0.0 0.21 ± 2% perf-profile.self.cycles-pp.update_rq_clock_task
0.70 -0.0 0.67 perf-profile.self.cycles-pp.update_rq_clock
5.43 -0.0 5.40 perf-profile.self.cycles-pp.intel_idle
0.21 ± 2% -0.0 0.18 ± 2% perf-profile.self.cycles-pp.task_h_load
0.10 -0.0 0.08 perf-profile.self.cycles-pp.cpuacct_charge
0.17 -0.0 0.15 ± 3% perf-profile.self.cycles-pp.ttwu_queue_wakelist
0.27 -0.0 0.26 perf-profile.self.cycles-pp.try_to_wake_up
0.13 -0.0 0.12 perf-profile.self.cycles-pp.pick_next_task_fair
0.22 -0.0 0.21 perf-profile.self.cycles-pp.set_task_cpu
0.07 -0.0 0.06 perf-profile.self.cycles-pp.wakeup_preempt
0.16 ± 2% +0.0 0.18 ± 2% perf-profile.self.cycles-pp.menu_select
0.78 +0.0 0.80 perf-profile.self.cycles-pp.__switch_to
0.49 +0.0 0.50 perf-profile.self.cycles-pp.call_function_single_prep_ipi
0.84 +0.0 0.85 perf-profile.self.cycles-pp.mod_node_page_state
0.24 ± 2% +0.0 0.26 perf-profile.self.cycles-pp.remove_entity_load_avg
0.24 +0.0 0.26 ± 2% perf-profile.self.cycles-pp.update_min_vruntime
0.46 +0.0 0.48 perf-profile.self.cycles-pp.llist_reverse_order
0.55 +0.0 0.57 perf-profile.self.cycles-pp.enqueue_entity
0.76 +0.0 0.79 perf-profile.self.cycles-pp.__switch_to_asm
0.37 ± 3% +0.0 0.40 ± 2% perf-profile.self.cycles-pp.__update_idle_core
0.81 +0.0 0.85 perf-profile.self.cycles-pp.sched_mm_cid_migrate_to
1.12 +0.0 1.16 perf-profile.self.cycles-pp.select_idle_core
5.34 +0.0 5.39 perf-profile.self.cycles-pp._copy_from_iter
0.48 +0.1 0.53 perf-profile.self.cycles-pp.select_idle_cpu
0.00 +0.1 0.05 ± 5% perf-profile.self.cycles-pp.copy_page_from_iter_atomic
0.00 +0.1 0.06 ± 10% perf-profile.self.cycles-pp.ring_buffer_read_head
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hello there,
On 6/12/2024 5:24 PM, zhangwei123171@gmail.com wrote:
> From: zhangwei123171 <zhangwei123171@jd.com>
>
> When an idle core cannot be found, the first sched-idle CPU
> or the first available idle CPU will be used, if one exists.
>
> We can instead record the available idle CPU detected later
> in the scan, to ensure it is used if it exists.
Is there any particular advantage to this? Based on my understanding,
the check exists to prevent unnecessary calls to cpumask_test_cpu() once
an idle CPU has already been found. On a large core count system with a
large number of cores in the LLC domain, this may result in many more
calls to cpumask_test_cpu() if only one core is in fact idle and there
is a storm of wakeups.
For an SMT-2 system, I believe any idle thread on a busy core would be
the same (if we consider all tasks to have the same behavior). On a
larger SMT system, it takes more overhead to determine which core is the
most idle.
Consider the following case:
o CPUs of core: 0-7; Only CPU1 is busy (i is idle, b is busy)
+---+---+---+---+---+---+---+---+
| i | b | i | i | i | i | i | i |
+---+---+---+---+---+---+---+---+
^
select_idle_core() bails out at the first busy CPU, which is CPU1;
however, this core is only 1/8th busy.
o CPUs of core: 8-15; CPU10 to CPU15 are busy (i is idle, b is busy)
+---+---+---+---+---+---+---+---+
| i | i | b | b | b | b | b | b |
+---+---+---+---+---+---+---+---+
^
select_idle_core() bails out at the first busy CPU, which is CPU10;
however, this core is in fact 5/8th busy.
Technically, the core with CPU0 is better, but with your change we'll
select the core of CPU8. The bottom line is that there does not seem to
be a good case where selecting the last idle thread is better than
selecting the first one. The best the scheduler can do is reduce the
number of calls to cpumask_test_cpu() once an idle CPU is found, unless
it decides to scan all the CPUs of the core to find the idlest core,
and in a large, busy system, that is a big hammer.
Thoughts?
>
> Signed-off-by: zhangwei123171 <zhangwei123171@jd.com>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 41b58387023d..653ca3ea09b6 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7341,7 +7341,7 @@ static int select_idle_core(struct task_struct *p, int core, struct cpumask *cpu
> }
> break;
> }
> - if (*idle_cpu == -1 && cpumask_test_cpu(cpu, cpus))
> + if (cpumask_test_cpu(cpu, cpus))
> *idle_cpu = cpu;
> }
>
--
Thanks and Regards,
Prateek