[PATCH 2/2] sched/fair: get this cpu once in find_new_ilb

Shrikanth Hegde posted 2 patches 2 weeks, 4 days ago
[PATCH 2/2] sched/fair: get this cpu once in find_new_ilb
Posted by Shrikanth Hegde 2 weeks, 4 days ago
smp_processor_id() is pointless to fetch in the loop. Move it out.
No functional change.

Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 02cca2c7a98d..9e561c1234b2 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12635,11 +12635,11 @@ static inline int on_null_domain(struct rq *rq)
  */
 static inline int find_new_ilb(struct cpumask *cpus)
 {
+	int this_cpu = smp_processor_id();
 	int ilb_cpu;
 
 	for_each_cpu(ilb_cpu, cpus) {
-
-		if (ilb_cpu == smp_processor_id())
+		if (ilb_cpu == this_cpu)
 			continue;
 
 		if (idle_cpu(ilb_cpu))
-- 
2.43.0
Re: [PATCH 2/2] sched/fair: get this cpu once in find_new_ilb
Posted by K Prateek Nayak 2 weeks, 3 days ago
Hello Shrikanth,

On 3/19/2026 12:23 PM, Shrikanth Hegde wrote:
> smp_processor_id() is pointless to fetch in the loop. Move it out.
> No functional change.
> 
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>

Makes sense! Feel free to include:

Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>

-- 
Thanks and Regards,
Prateek
Re: [PATCH 2/2] sched/fair: get this cpu once in find_new_ilb
Posted by Peter Zijlstra 2 weeks, 4 days ago
On Thu, Mar 19, 2026 at 12:23:14PM +0530, Shrikanth Hegde wrote:
> smp_processor_id() is pointless to fetch in the loop. Move it out.
> No functional change.

Isn't this what compilers are for?
Re: [PATCH 2/2] sched/fair: get this cpu once in find_new_ilb
Posted by Shrikanth Hegde 2 weeks, 4 days ago
Hi Peter.

On 3/19/26 2:50 PM, Peter Zijlstra wrote:
> On Thu, Mar 19, 2026 at 12:23:14PM +0530, Shrikanth Hegde wrote:
>> smp_processor_id() is pointless to fetch in the loop. Move it out.
>> No functional change.
> 
> Isn't this what compilers are for?

This is what i see in pseries.

c00000000028cc7c:	1d 70 80 48 	bl      c000000000a93c98 <_find_next_and_bit>
c00000000028cc80:	00 00 00 60 	nop
c00000000028cc84:	00 00 bd 80 	lwz     r5,0(r29)
c00000000028cc88:	b4 07 7e 7c 	extsw   r30,r3
c00000000028cc8c:	78 1b 7f 7c 	mr      r31,r3                            << This is ilb_cpu
c00000000028cc90:	78 1b 7a 7c 	mr      r26,r3
c00000000028cc94:	40 18 05 7c 	cmplw   r5,r3
c00000000028cc98:	78 f3 c3 7f 	mr      r3,r30
c00000000028cc9c:	5c 00 81 40 	ble     c00000000028ccf8 <kick_ilb+0x10c>
c00000000028cca0:	08 00 2d a1 	lhz     r9,8(r13)                         << cpu comes from paca_index
c00000000028cca4:	00 f8 09 7c 	cmpw    r9,r31
c00000000028cca8:	18 00 82 41 	beq     c00000000028ccc0 <kick_ilb+0xd4>
c00000000028ccac:	8d 0c 04 48 	bl      c0000000002cd938 <idle_cpu+0x8>
c00000000028ccb0:	00 00 00 60 	nop
c00000000028ccb4:	00 00 03 2c 	cmpwi   r3,0
c00000000028ccb8:	78 00 82 40 	bne     c00000000028cd30 <kick_ilb+0x144>
c00000000028ccbc:	00 00 bd 80 	lwz     r5,0(r29)
c00000000028ccc0:	01 00 df 38 	addi    r6,r31,1
c00000000028ccc4:	20 00 a5 78 	clrldi  r5,r5,32
c00000000028ccc8:	78 e3 84 7f 	mr      r4,r28
c00000000028cccc:	78 db 63 7f 	mr      r3,r27
c00000000028ccd0:	b4 07 c6 7c 	extsw   r6,r6
c00000000028ccd4:	c5 6f 80 48 	bl      c000000000a93c98 <_find_next_and_bit>



So, yes it is done inside the loop. But accessing it should be cheap as it is r13 access
which is the paca pointer.

Since different archs implement it in their own ways
(many would have it cheap, but for some may be costly)
and there is CONFIG_DEBUG_PREEMPT too.
So it would be good to hoist it out of the loop IMHO.
Re: [PATCH 2/2] sched/fair: get this cpu once in find_new_ilb
Posted by Peter Zijlstra 2 weeks, 4 days ago
On Thu, Mar 19, 2026 at 06:33:19PM +0530, Shrikanth Hegde wrote:
> Hi Peter.
> 
> On 3/19/26 2:50 PM, Peter Zijlstra wrote:
> > On Thu, Mar 19, 2026 at 12:23:14PM +0530, Shrikanth Hegde wrote:
> > > smp_processor_id() is pointless to fetch in the loop. Move it out.
> > > No functional change.
> > 
> > Isn't this what compilers are for?
> 
> This is what i see in pseries.
> 
> c00000000028cc7c:	1d 70 80 48 	bl      c000000000a93c98 <_find_next_and_bit>
> c00000000028cc80:	00 00 00 60 	nop
> c00000000028cc84:	00 00 bd 80 	lwz     r5,0(r29)
> c00000000028cc88:	b4 07 7e 7c 	extsw   r30,r3
> c00000000028cc8c:	78 1b 7f 7c 	mr      r31,r3                            << This is ilb_cpu
> c00000000028cc90:	78 1b 7a 7c 	mr      r26,r3
> c00000000028cc94:	40 18 05 7c 	cmplw   r5,r3
> c00000000028cc98:	78 f3 c3 7f 	mr      r3,r30
> c00000000028cc9c:	5c 00 81 40 	ble     c00000000028ccf8 <kick_ilb+0x10c>
> c00000000028cca0:	08 00 2d a1 	lhz     r9,8(r13)                         << cpu comes from paca_index
> c00000000028cca4:	00 f8 09 7c 	cmpw    r9,r31
> c00000000028cca8:	18 00 82 41 	beq     c00000000028ccc0 <kick_ilb+0xd4>
> c00000000028ccac:	8d 0c 04 48 	bl      c0000000002cd938 <idle_cpu+0x8>
> c00000000028ccb0:	00 00 00 60 	nop
> c00000000028ccb4:	00 00 03 2c 	cmpwi   r3,0
> c00000000028ccb8:	78 00 82 40 	bne     c00000000028cd30 <kick_ilb+0x144>
> c00000000028ccbc:	00 00 bd 80 	lwz     r5,0(r29)
> c00000000028ccc0:	01 00 df 38 	addi    r6,r31,1
> c00000000028ccc4:	20 00 a5 78 	clrldi  r5,r5,32
> c00000000028ccc8:	78 e3 84 7f 	mr      r4,r28
> c00000000028cccc:	78 db 63 7f 	mr      r3,r27
> c00000000028ccd0:	b4 07 c6 7c 	extsw   r6,r6
> c00000000028ccd4:	c5 6f 80 48 	bl      c000000000a93c98 <_find_next_and_bit>
> 
> 
> 
> So, yes it is done inside the loop. But accessing it should be cheap as it is r13 access
> which is the paca pointer.
> 
> Since different archs implement it in their own ways
> (many would have it cheap, but for some may be costly)
> and there is CONFIG_DEBUG_PREEMPT too.
> So it would be good to hoist it out of the loop IMHO.

Ah, right. And I suppose it isn't generally sound to hoist
smp_processor_id() unless you have disabled preemption, and since we
can't tell the compiler this, it just never can.

Fair enough.
Re: [PATCH 2/2] sched/fair: get this cpu once in find_new_ilb
Posted by Mukesh Kumar Chaurasiya 2 weeks, 4 days ago
On Thu, Mar 19, 2026 at 12:23:14PM +0530, Shrikanth Hegde wrote:
> smp_processor_id() is pointless to fetch in the loop. Move it out.
> No functional change.
> 
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> ---
>  kernel/sched/fair.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 02cca2c7a98d..9e561c1234b2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -12635,11 +12635,11 @@ static inline int on_null_domain(struct rq *rq)
>   */
>  static inline int find_new_ilb(struct cpumask *cpus)
>  {
> +	int this_cpu = smp_processor_id();
>  	int ilb_cpu;
>  
>  	for_each_cpu(ilb_cpu, cpus) {
> -
> -		if (ilb_cpu == smp_processor_id())
> +		if (ilb_cpu == this_cpu)
>  			continue;
>  
>  		if (idle_cpu(ilb_cpu))
> -- 
> 2.43.0
> 
LGTM

Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>