[PATCH] sched/fair: Save cpu id locally to avoid repeated smp_processor_id() calls

Zhongqiu Han posted 1 patch 1 month, 3 weeks ago
Avoid repeated smp_processor_id() by saving cpu id in a local variable.

- find_new_ilb(): func called with interrupts disabled.
- sched_cfs_period_timer(): cpu id saved after raw_spin_lock_irqsave().

This improves clarity and reduces overhead without changing functionality.

Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
---
 kernel/sched/fair.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e256793b9a08..60a9830fb8a4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6401,6 +6401,8 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
 	int count = 0;
 
 	raw_spin_lock_irqsave(&cfs_b->lock, flags);
+	int cpu = smp_processor_id();
+
 	for (;;) {
 		overrun = hrtimer_forward_now(timer, cfs_b->period);
 		if (!overrun)
@@ -6424,13 +6426,13 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
 
 				pr_warn_ratelimited(
 	"cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us = %lld, cfs_quota_us = %lld)\n",
-					smp_processor_id(),
+					cpu,
 					div_u64(new, NSEC_PER_USEC),
 					div_u64(cfs_b->quota, NSEC_PER_USEC));
 			} else {
 				pr_warn_ratelimited(
 	"cfs_period_timer[cpu%d]: period too short, but cannot scale up without losing precision (cfs_period_us = %lld, cfs_quota_us = %lld)\n",
-					smp_processor_id(),
+					cpu,
 					div_u64(old, NSEC_PER_USEC),
 					div_u64(cfs_b->quota, NSEC_PER_USEC));
 			}
@@ -12195,12 +12197,13 @@ static inline int find_new_ilb(void)
 {
 	const struct cpumask *hk_mask;
 	int ilb_cpu;
+	int this_cpu = smp_processor_id();
 
 	hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
 
 	for_each_cpu_and(ilb_cpu, nohz.idle_cpus_mask, hk_mask) {
 
-		if (ilb_cpu == smp_processor_id())
+		if (ilb_cpu == this_cpu)
 			continue;
 
 		if (idle_cpu(ilb_cpu))
-- 
2.43.0
Re: [PATCH] sched/fair: Save cpu id locally to avoid repeated smp_processor_id() calls
Posted by Peter Zijlstra 1 month, 2 weeks ago
On Thu, Aug 14, 2025 at 10:01:41PM +0800, Zhongqiu Han wrote:
> Avoid repeated smp_processor_id() by saving cpu id in a local variable.
> 
> - find_new_ilb(): func called with interrupts disabled.
> - sched_cfs_period_timer(): cpu id saved after raw_spin_lock_irqsave().
> 
> This improves clarity and reduces overhead without changing functionality.

No, this makes things actively worse. It:

 - violates coding style by declaring a variable in the middle of code
 - fetches the CPU number even if it's never used (hopefully the compiler
   can optimize this for you)
 - moves error path logic into the normal code path
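
The second and third objections can be illustrated with a small userspace sketch. It is not kernel code: fake_cpu_id() is a hypothetical stand-in for smp_processor_id() that counts its calls, and timer_eager() / timer_lazy() are invented names that loosely mirror the shape of sched_cfs_period_timer() before and after the patch. The call counter assumes the compiler does not remove the unused lookup, which it cannot do here because the stand-in has a side effect.

```c
#include <stdio.h>

/* Hypothetical stand-in for smp_processor_id(); counts how often it runs. */
static int cpu_id_calls;

static int fake_cpu_id(void)
{
	cpu_id_calls++;
	return 3;
}

/* Eager variant: the id is fetched on every call, warning or not. */
static int timer_eager(int period_too_short)
{
	int cpu = fake_cpu_id();
	int warned = 0;

	if (period_too_short) {
		printf("timer[cpu%d]: period too short\n", cpu);
		warned = 1;
	}
	return warned;
}

/* Lazy variant: the id is fetched only inside the rare warning branch. */
static int timer_lazy(int period_too_short)
{
	int warned = 0;

	if (period_too_short) {
		printf("timer[cpu%d]: period too short\n", fake_cpu_id());
		warned = 1;
	}
	return warned;
}
```

Calling timer_eager(0) still performs one lookup, while timer_lazy(0) performs none. In the lazy form the lookup stays with the error path that needs it, which is how the code looked before the patch.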

> Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
> ---
>  kernel/sched/fair.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e256793b9a08..60a9830fb8a4 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6401,6 +6401,8 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
>  	int count = 0;
>  
>  	raw_spin_lock_irqsave(&cfs_b->lock, flags);
> +	int cpu = smp_processor_id();
> +
>  	for (;;) {
>  		overrun = hrtimer_forward_now(timer, cfs_b->period);
>  		if (!overrun)
> @@ -6424,13 +6426,13 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
>  
>  				pr_warn_ratelimited(
>  	"cfs_period_timer[cpu%d]: period too short, scaling up (new cfs_period_us = %lld, cfs_quota_us = %lld)\n",
> -					smp_processor_id(),
> +					cpu,
>  					div_u64(new, NSEC_PER_USEC),
>  					div_u64(cfs_b->quota, NSEC_PER_USEC));
>  			} else {
>  				pr_warn_ratelimited(
>  	"cfs_period_timer[cpu%d]: period too short, but cannot scale up without losing precision (cfs_period_us = %lld, cfs_quota_us = %lld)\n",
> -					smp_processor_id(),
> +					cpu,
>  					div_u64(old, NSEC_PER_USEC),
>  					div_u64(cfs_b->quota, NSEC_PER_USEC));
>  			}
> @@ -12195,12 +12197,13 @@ static inline int find_new_ilb(void)
>  {
>  	const struct cpumask *hk_mask;
>  	int ilb_cpu;
> +	int this_cpu = smp_processor_id();

This again violates coding style.

>  	hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
>  
>  	for_each_cpu_and(ilb_cpu, nohz.idle_cpus_mask, hk_mask) {
>  
> -		if (ilb_cpu == smp_processor_id())
> +		if (ilb_cpu == this_cpu)
>  			continue;

And have you checked if the compiler did this lift for you? It can
generally lift loads out of loops.
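
A minimal userspace sketch of the two forms being compared follows. The names are hypothetical: fake_cpu_id() stands in for smp_processor_id() as a plain load, and the loop is a simplified analogue of the one in find_new_ilb(), not the kernel code. Whether the compiler actually performs the lift depends on the optimization level and on whether it can prove the value is loop-invariant, so the practical check is to compare the generated assembly of both forms, for example with gcc -O2 -S.

```c
#include <stddef.h>

/* Hypothetical stand-in for the current CPU id: a plain, visible load. */
static int this_cpu_id = 2;

static inline int fake_cpu_id(void)
{
	return this_cpu_id;
}

/* In-loop form: the source re-reads the id on every iteration. */
static int first_other_in_loop(const int *cpus, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (cpus[i] == fake_cpu_id())
			continue;
		return cpus[i];
	}
	return -1;
}

/* Hand-hoisted form: the id is read once, before the loop. */
static int first_other_hoisted(const int *cpus, size_t n)
{
	int self = fake_cpu_id();
	size_t i;

	for (i = 0; i < n; i++) {
		if (cpus[i] == self)
			continue;
		return cpus[i];
	}
	return -1;
}
```

Both functions return the same result for any input. If the two compile to the same loop body, the manual hoist buys nothing.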

>  
>  		if (idle_cpu(ilb_cpu))
> -- 
> 2.43.0
>