[PATCH 0/3] sched: Simplify ifdeffery around CONFIG_SCHED_SMT

Shrikanth Hegde posted 3 patches 1 month, 1 week ago
There is a newer version of this series
include/linux/sched/smt.h |  4 ----
include/linux/topology.h  | 15 +++++++++++++-
kernel/sched/core.c       |  6 ------
kernel/sched/ext_idle.c   |  6 ------
kernel/sched/fair.c       | 41 +++++----------------------------------
kernel/sched/sched.h      |  6 ------
kernel/sched/topology.c   |  2 --
kernel/stop_machine.c     |  2 --
kernel/workqueue.c        |  4 ----
9 files changed, 19 insertions(+), 67 deletions(-)
[PATCH 0/3] sched: Simplify ifdeffery around CONFIG_SCHED_SMT
Posted by Shrikanth Hegde 1 month, 1 week ago
While working on paravirt series on maintaining preferred CPUs, I wanted
to add/remove a core. To do that, I had to access cpu_smt_mask. For that
had to use CONFIG_SCHED_SMT around the code. That made me think, for
CONFIG_SCHED_SMT=n, it effectively means to just add/remove a CPU. Thats when
I thought of this idea of using cpumask_of(cpu) when CONFIG_SCHED_SMT=n. 

Semantics
=========
- For CONFIG_SCHED_SMT=y:
    No functional change.
- For CONFIG_SCHED_SMT=n:
    - cpu_smt_mask(cpu) becomes cpumask_of(cpu), effectively making it
      per CPU with no siblings.
    - sched_smt_present remains defined, but never becomes active:
    	 Since cpumask_weight(cpumask_of(cpu)) == 1

Performance impact
==================
- CONFIG_SCHED_SMT=y:
    No change in generated code.
- CONFIG_SCHED_SMT=n:
    - Small increase in text size (~0.01%) due to removal of compile-time
      stubs. Most paths remain effectively dead due to static keys.
    - Fast paths are protected using IS_ENABLED(CONFIG_SCHED_SMT).

Testing
=======
- Did build/boot test on powerpc for CONFIG_SCHED_SMT=y/n.
- Ran hackbench on powerpc for CONFIG_SCHED_SMT=y/n. Didn't observe any
  major difference.
- Did build/boot test on x86 for CONFIG_SCHED_SMT=y/n. For x86 i had
  change the code for CONFIG_SCHED_SMT=n


Plus, Major distros make CONFIG_SCHED_SMT=y for all major
archs and few archs unconditionally make CONFIG_SCHED_SMT=y (x86,
s390), So CONFIG_SCHED_SMT=n is a rare case.

With that, cpu_smt_mask() to be used unconditionally and reduces
CONFIG_SCHED_SMT-specific code paths, improving readability and
maintainability.

Please review and consider if this simplification is a worthwhile cleanup.

Shrikanth Hegde (3):
  topology: Introduce cpu_smt_mask for CONFIG_SCHED_SMT=n
  sched: Simplify ifdeffery around cpu_smt_mask
  sched/fair: Add compile time check in fastpaths for CONFIG_SCHED_SMT=n

 include/linux/sched/smt.h |  4 ----
 include/linux/topology.h  | 15 +++++++++++++-
 kernel/sched/core.c       |  6 ------
 kernel/sched/ext_idle.c   |  6 ------
 kernel/sched/fair.c       | 41 +++++----------------------------------
 kernel/sched/sched.h      |  6 ------
 kernel/sched/topology.c   |  2 --
 kernel/stop_machine.c     |  2 --
 kernel/workqueue.c        |  4 ----
 9 files changed, 19 insertions(+), 67 deletions(-)

-- 
2.51.0
Re: [PATCH 0/3] sched: Simplify ifdeffery around CONFIG_SCHED_SMT
Posted by Valentin Schneider 1 month ago
On 06/05/26 16:30, Shrikanth Hegde wrote:
> Plus, Major distros make CONFIG_SCHED_SMT=y for all major
> archs and few archs unconditionally make CONFIG_SCHED_SMT=y (x86,
> s390), So CONFIG_SCHED_SMT=n is a rare case.
>

That leaves only a handful of #ifdefs:

  include/linux/sched/topology.h:35:2:#ifdef CONFIG_SCHED_SMT
  kernel/sched/topology.c:1757:2:#ifdef CONFIG_SCHED_SMT
  kernel/sched/topology.c:1802:2:#ifdef CONFIG_SCHED_SMT
  arch/powerpc/kernel/smp.c:986:2:#ifdef CONFIG_SCHED_SMT
  arch/powerpc/kernel/smp.c:1038:2:#ifdef CONFIG_SCHED_SMT
  arch/powerpc/kernel/smp.c:1720:2:#ifdef CONFIG_SCHED_SMT
  arch/powerpc/include/asm/smp.h:141:2:#ifdef CONFIG_SCHED_SMT

The topology code could deal with unconditionally building an SMT layer as
it'll just get degenerated. The powerpc usage is mostly topology and
cpu_smt_mask(). So if we want to push for it, we could even get rid of the
config entirely.
Re: [PATCH 0/3] sched: Simplify ifdeffery around CONFIG_SCHED_SMT
Posted by Shrikanth Hegde 1 month ago
Hi Valentin.

On 5/12/26 4:29 PM, Valentin Schneider wrote:
> On 06/05/26 16:30, Shrikanth Hegde wrote:
>> Plus, Major distros make CONFIG_SCHED_SMT=y for all major
>> archs and few archs unconditionally make CONFIG_SCHED_SMT=y (x86,
>> s390), So CONFIG_SCHED_SMT=n is a rare case.
>>
> 
> That leaves only a handful of #ifdefs:
> 
>    include/linux/sched/topology.h:35:2:#ifdef CONFIG_SCHED_SMT
>    kernel/sched/topology.c:1757:2:#ifdef CONFIG_SCHED_SMT
>    kernel/sched/topology.c:1802:2:#ifdef CONFIG_SCHED_SMT
>    arch/powerpc/kernel/smp.c:986:2:#ifdef CONFIG_SCHED_SMT
>    arch/powerpc/kernel/smp.c:1038:2:#ifdef CONFIG_SCHED_SMT
>    arch/powerpc/kernel/smp.c:1720:2:#ifdef CONFIG_SCHED_SMT
>    arch/powerpc/include/asm/smp.h:141:2:#ifdef CONFIG_SCHED_SMT
> 
> The topology code could deal with unconditionally building an SMT layer as
> it'll just get degenerated. The powerpc usage is mostly topology and
> cpu_smt_mask(). So if we want to push for it, we could even get rid of the
> config entirely.
> 

I think that's tricky. This is my take. Feel free to add/correct.

1 - That's policy decision making. It is only scheduler perspective.
     hardware topology mapping need to necessarily be on the same
     page. Hence we have the config today.


2 - Implementation is difficult as well.

Lets say we define it as:

#if !defined(cpu_smt_mask)
static inline const struct cpumask *cpu_smt_mask(int cpu)
{
         return topology_sibling_cpumask(cpu);
}
#endif

Then topology_sibling_cpumask could still point to actual HW siblings and SMT domain
may not degenerate as it could have more than 1 CPU. And Since powerpc for example
defined cpu_smt_mask, will have SMT domain always. That's not correct IMO.
Re: [PATCH 0/3] sched: Simplify ifdeffery around CONFIG_SCHED_SMT
Posted by Valentin Schneider 1 month ago
On 12/05/26 17:55, Shrikanth Hegde wrote:
> Hi Valentin.
>
> On 5/12/26 4:29 PM, Valentin Schneider wrote:
>> On 06/05/26 16:30, Shrikanth Hegde wrote:
>>> Plus, Major distros make CONFIG_SCHED_SMT=y for all major
>>> archs and few archs unconditionally make CONFIG_SCHED_SMT=y (x86,
>>> s390), So CONFIG_SCHED_SMT=n is a rare case.
>>>
>>
>> That leaves only a handful of #ifdefs:
>>
>>    include/linux/sched/topology.h:35:2:#ifdef CONFIG_SCHED_SMT
>>    kernel/sched/topology.c:1757:2:#ifdef CONFIG_SCHED_SMT
>>    kernel/sched/topology.c:1802:2:#ifdef CONFIG_SCHED_SMT
>>    arch/powerpc/kernel/smp.c:986:2:#ifdef CONFIG_SCHED_SMT
>>    arch/powerpc/kernel/smp.c:1038:2:#ifdef CONFIG_SCHED_SMT
>>    arch/powerpc/kernel/smp.c:1720:2:#ifdef CONFIG_SCHED_SMT
>>    arch/powerpc/include/asm/smp.h:141:2:#ifdef CONFIG_SCHED_SMT
>>
>> The topology code could deal with unconditionally building an SMT layer as
>> it'll just get degenerated. The powerpc usage is mostly topology and
>> cpu_smt_mask(). So if we want to push for it, we could even get rid of the
>> config entirely.
>>
>
> I think that's tricky. This is my take. Feel free to add/correct.
>
> 1 - That's policy decision making. It is only scheduler perspective.
>      hardware topology mapping need to necessarily be on the same
>      page. Hence we have the config today.
>
>
> 2 - Implementation is difficult as well.
>
> Lets say we define it as:
>
> #if !defined(cpu_smt_mask)
> static inline const struct cpumask *cpu_smt_mask(int cpu)
> {
>          return topology_sibling_cpumask(cpu);
> }
> #endif
>
> Then topology_sibling_cpumask could still point to actual HW siblings and SMT domain
> may not degenerate as it could have more than 1 CPU. And Since powerpc for example
> defined cpu_smt_mask, will have SMT domain always. That's not correct IMO.

Yeah you're right, let's forget about it for now :-)