The scheduler currently handles CPU performance asymmetry via either:

- SD_ASYM_PACKING: simple priority-based task placement (x86 ITMT)
- SD_ASYM_CPUCAPACITY: capacity-aware scheduling

On arm64, capacity-aware scheduling is used for any detected capacity
differences.

Some systems expose small per-CPU performance differences via CPPC
highest_perf (e.g. due to chip binning), resulting in slightly different
capacities (<~5%). These differences are sufficient to trigger
SD_ASYM_CPUCAPACITY, even though the system is otherwise effectively
symmetric.

For such small deltas, capacity-aware scheduling is unnecessarily
complex. A simpler priority-based approach, similar to x86 ITMT, is
sufficient.

This series introduces support for using asymmetric packing in that case:

- derive per-CPU priorities from CPPC highest_perf
- detect when CPUs differ, but not enough to form distinct capacity classes
- suppress SD_ASYM_CPUCAPACITY for such domains
- enable SD_ASYM_PACKING and use CPPC-based priority ordering instead

The asympacking flag is exposed at all topology levels; domains with
equal priorities are unaffected, while domains spanning CPUs with
different priorities can honor the ordering.

RFC:
I'm not entirely sure if this is the best way to implement this.
Currently this is baked into CPPC and arm64, though neither is strictly
necessary; we could also use cpu_capacity directly to derive the
ordering and enable this for non-CPPC and/or non-arm64 systems.

RFT:
Andrea, please give this a try. This should perform better in particular
for single-threaded workloads and workloads that do not utilize all
cores (all the time, anyway).
Capacity-aware scheduling handles wakeups very differently from the SMP
path used now; some workloads will benefit, some will regress, so it
would be nice to get some test results for these. As we already
discussed, DCPerf MediaWiki seems to benefit from the capacity-aware
wakeup behavior, but others (most?) should benefit from this series.

I don't know if we can also be clever about ordering amongst SMT
siblings. That would be dependent on the uarch, and I don't have a
platform to experiment with, so consider this series orthogonal to the
idle-core SMT considerations.
On platforms with SMT, though, asympacking makes a lot more sense than
capacity-aware scheduling, because arguing about capacity without
considering the utilization of the sibling(s) (and the resulting
potential 'stolen' capacity we perceive) isn't theoretically sound.

Christian Loehle (3):
  sched/topology: Introduce arch hooks for asympacking
  arch_topology: Export CPPC-based asympacking prios
  arm64/sched: Enable CPPC-based asympacking

 arch/arm64/include/asm/topology.h |  6 +++++
 arch/arm64/kernel/topology.c      | 34 ++++++++++++++++++++++++++
 drivers/base/arch_topology.c      | 40 +++++++++++++++++++++++++++++++
 include/linux/arch_topology.h     | 24 +++++++++++++++++++
 include/linux/sched/topology.h    |  9 +++++++
 kernel/sched/fair.c               | 16 -------------
 kernel/sched/topology.c           | 34 ++++++++++++++++++++------
 7 files changed, 140 insertions(+), 23 deletions(-)

--
2.34.1
Hi Christian,

On Wed, Mar 25, 2026 at 06:13:11PM +0000, Christian Loehle wrote:
...
> RFT:
> Andrea, please give this a try. This should perform better in particular
> for single-threaded workloads and workloads that do not utilize all
> cores (all the time anyway).
> Capacity-aware scheduling wakeup works very different to the SMP path
> used now, some workloads will benefit, some regress, it would be nice
> to get some test results for these.
> We already discussed DCPerf MediaWiki seems to benefit from
> capacity-aware scheduling wakeup behavior, but others (most?) should
> benefit from this series.
>
> I don't know if we can also be clever about ordering amongst SMT siblings.
> That would be dependent on the uarch and I don't have a platform to
> experiment with this though, so consider this series orthogonal to the
> idle-core SMT considerations.
> On platforms with SMT though asympacking makes a lot more sense than
> capacity-aware scheduling, because arguing about capacity without
> considering utilization of the sibling(s) (and the resulting potential
> 'stolen' capacity we perceive) isn't theoretically sound.

I did some early testing with this patch set. On Vera I'm getting much
better performance than SD_ASYM_CPUCAPACITY, of course (~1.5x avg
speedup), mostly because we avoid using both SMT siblings. It's still
not the same improvement that I get by equalizing the capacity using the
5% threshold (~1.8x speedup).

Of course I need to test with more workloads, and I haven't tested it on
Grace yet to check whether we're regressing something, but in general it
seems functional.

Now it depends on whether SD_ASYM_PACKING is the route we want to take
or whether we should start addressing SMT in SD_ASYM_CPUCAPACITY, as
pointed out by Vincent. In general I think I agree with Vincent:
independently of this particular case, it'd be nice to start improving
SD_ASYM_CPUCAPACITY to support SMT.

Thanks,
-Andrea
On Thu, 26 Mar 2026 at 09:12, Andrea Righi <arighi@nvidia.com> wrote:
>
> Hi Christian,
>
> On Wed, Mar 25, 2026 at 06:13:11PM +0000, Christian Loehle wrote:
> ...
> > RFT:
> > [snip]
>
> I did some early testing with this patch set. On Vera I'm getting much
> better performance than SD_ASYM_CPUCAPACITY of course (~1.5x avg speedup),
> mostly because we avoid using both SMT siblings. It's still not the same
> improvement that I get equalizing the capacity using the 5% threshold
> (~1.8x speedup).

IIRC from the tests that you shared in your patch, you get an additional
improvement when adding some SMT awareness to SD_ASYM_CPUCAPACITY
compared to equalizing the capacity.

> Of course I need to test with more workloads and I haven't tested it on
> Grace yet, to check if we're regressing something, but in general it seems
> functional.
>
> Now it depends if SD_ASYM_PACKING is the route we want to take or if we
> should start addressing SMT in SD_ASYM_CPUCAPACITY, as pointed by Vincent.
> In general I think I agree with Vincent, independently on this particular
> case, it'd be nice to start improving SD_ASYM_CPUCAPACITY to support SMT.
>
> Thanks,
> -Andrea
On Thu, Mar 26, 2026 at 09:20:45AM +0100, Vincent Guittot wrote:
> On Thu, 26 Mar 2026 at 09:12, Andrea Righi <arighi@nvidia.com> wrote:
> >
> > Hi Christian,
> >
> > On Wed, Mar 25, 2026 at 06:13:11PM +0000, Christian Loehle wrote:
> > ...
> > > [snip]
> >
> > I did some early testing with this patch set. On Vera I'm getting much
> > better performance than SD_ASYM_CPUCAPACITY of course (~1.5x avg speedup),
> > mostly because we avoid using both SMT siblings. It's still not the same
> > improvement that I get equalizing the capacity using the 5% threshold
> > (~1.8x speedup).
>
> IIRC from the tests that you shared in your patch, you get an additional
> improvement when adding some SMT awareness to SD_ASYM_CPUCAPACITY
> compared to equalizing the capacity.

Yes, adding SMT awareness to SD_ASYM_CPUCAPACITY is still the approach
that gives me the best performance so far on Vera (~1.9x avg speedup),
among all those that I've tested.

I'll post the updated patch set that I'm using, so we can elaborate more
on that approach as well.

Thanks,
-Andrea
On Wed, 25 Mar 2026 at 19:13, Christian Loehle <christian.loehle@arm.com> wrote:
>
> The scheduler currently handles CPU performance asymmetry via either:
>
> - SD_ASYM_PACKING: simple priority-based task placement (x86 ITMT)
> - SD_ASYM_CPUCAPACITY: capacity-aware scheduling
>
> On arm64, capacity-aware scheduling is used for any detected capacity
> differences.
>
> Some systems expose small per-CPU performance differences via CPPC
> highest_perf (e.g. due to chip binning), resulting in slightly different
> capacities (<~5%). These differences are sufficient to trigger
> SD_ASYM_CPUCAPACITY, even though the system is otherwise effectively
> symmetric.
>
> For such small deltas, capacity-aware scheduling is unnecessarily
> complex. A simpler priority-based approach, similar to x86 ITMT, is
> sufficient.

I'm not convinced that moving to SD_ASYM_PACKING is the right way to
move forward.

First of all, do you target all kinds of systems or only SMT? It's not
clear in your cover letter.

Moving to asym packing for !SMT doesn't make sense to me. If you don't
want EAS enabled, you can disable it with
/proc/sys/kernel/sched_energy_aware.

For SMT systems with a small capacity difference, I would prefer that we
look at supporting SMT in SD_ASYM_CPUCAPACITY, starting with
select_idle_capacity.

> [snip]
On 3/26/26 07:53, Vincent Guittot wrote:
> On Wed, 25 Mar 2026 at 19:13, Christian Loehle <christian.loehle@arm.com> wrote:
>>
>> [snip]
>
> I'm not convinced that moving to SD_ASYM_PACKING is the right way to
> move forward.
>
> 1st of all, do you target all kind of system or only SMT? It's not
> clear in your cover letter
>
> Moving on asym pack for !SMT doesn't make sense to me. If you don't
> want EAS enabled, you can disable it with
> /proc/sys/kernel/sched_energy_aware
>
> For SMT system and small capacity difference, I would prefer that we
> look at supporting SMT in SD_ASYM_CPUCAPACITY. Starting with
> select_idle_capacity

Quoting the cover letter below: I don't think SMT + SD_ASYM_CPUCAPACITY
can ever be theoretically sound, and the results will be so wildly
different on a per-platform/uarch + workload basis that I'm not
convinced something useful would come out of it, but I'd be keen to see
some experiments on this.
IME a busy sibling steals far more capacity than the difference I care
about here (<5%; a busy SMT sibling often costs 20-30%, sometimes up to
50%, but that is entirely dependent on workload and uarch, as I've
mentioned).
In any case, this series isn't (primarily) for SMT systems...

> [snip]
>> On platforms with SMT though asympacking makes a lot more sense than
>> capacity-aware scheduling, because arguing about capacity without
>> considering utilization of the sibling(s) (and the resulting potential
>> 'stolen' capacity we perceive) isn't theoretically sound.
>>
>> Christian Loehle (3):
>>   sched/topology: Introduce arch hooks for asympacking
>>   arch_topology: Export CPPC-based asympacking prios
>>   arm64/sched: Enable CPPC-based asympacking
>>
>> [snip]
On 3/26/26 07:53, Vincent Guittot wrote:
> On Wed, 25 Mar 2026 at 19:13, Christian Loehle <christian.loehle@arm.com> wrote:
>>
>> [snip]
>
> I'm not convinced that moving to SD_ASYM_PACKING is the right way to
> move forward.
>
> 1st of all, do you target all kind of system or only SMT? It's not
> clear in your cover letter

AFAIK only Andrea has access to an unreleased asymmetric SMT system; I
haven't done any tests on such a system (as the cover letter mentions
under the RFT section).

> Moving on asym pack for !SMT doesn't make sense to me. If you don't
> want EAS enabled, you can disable it with
> /proc/sys/kernel/sched_energy_aware

Sorry, what's EAS got to do with it? The system I care about here
(primarily NVIDIA Grace) has no EM.

> For SMT system and small capacity difference, I would prefer that we
> look at supporting SMT in SD_ASYM_CPUCAPACITY. Starting with
> select_idle_capacity

This series is actually targeted primarily at the !SMT case, although it
may or may not be useful for some of the SMT woes, too!
(Again, I wouldn't know, I don't have such a system to test with.)

> [snip]
On Thu, 26 Mar 2026 at 09:16, Christian Loehle <christian.loehle@arm.com> wrote:
>
> On 3/26/26 07:53, Vincent Guittot wrote:
> > On Wed, 25 Mar 2026 at 19:13, Christian Loehle <christian.loehle@arm.com> wrote:
> >>
> >> [snip]
> >
> > 1st of all, do you target all kind of system or only SMT? It's not
> > clear in your cover letter
>
> AFAIK only Andrea has access to an unreleased asymmetric SMT system,
> I haven't done any tests on such a system (as the cover-letter mentions
> under RFT section).
>
> > Moving on asym pack for !SMT doesn't make sense to me. If you don't
> > want EAS enabled, you can disable it with
> > /proc/sys/kernel/sched_energy_aware
>
> Sorry, what's EAS got to do with it? The system I care about here
> (primarily nvidia grace) has no EM.

I tried to understand the end goal of this patch.

SD_ASYM_CPUCAPACITY works fine with !SMT systems, so why enable
SD_ASYM_PACKING for a <5% diff?

That doesn't make sense to me.

> > For SMT system and small capacity difference, I would prefer that we
> > look at supporting SMT in SD_ASYM_CPUCAPACITY. Starting with
> > select_idle_capacity
>
> This series is actually targeted for primarily the !SMT case, although
> it may or may not be useful for some of the SMT woes, too!
> (Again, I wouldn't know, I don't have such a system to test with)
>
> [snip]
On 3/26/26 08:24, Vincent Guittot wrote:
> On Thu, 26 Mar 2026 at 09:16, Christian Loehle <christian.loehle@arm.com> wrote:
>>
>> [snip]
>>
>> Sorry, what's EAS got to do with it? The system I care about here
>> (primarily nvidia grace) has no EM.
>
> I tried to understand the end goal of this patch
>
> SD_ASYM_CPUCAPACITY works fine with !SMT system so why enabling
> SD_ASYM_PACKING for <5% diff ?
>
> That doesn't make sense to me

I don't know if "works fine" describes the situation accurately.
I guess I should've included the context in the cover letter, but you
are aware of these threads (you've replied to them anyway):
https://lore.kernel.org/lkml/20260324005509.1134981-1-arighi@nvidia.com/
https://lore.kernel.org/lkml/20260318092214.130908-1-arighi@nvidia.com/

Andrea sees an improvement even when force-equalizing CPUs to remove
SD_ASYM_CPUCAPACITY, so I'd argue it doesn't "work fine" on these
platforms.
To me it seems more reasonable to get these minor improvements for
minor asymmetries through asympacking, and to leave SD_ASYM_CPUCAPACITY
to actual 'true' asymmetry (e.g. different uarch or vastly different
performance levels).
SD_ASYM_CPUCAPACITY handling is also arguably broken if no CPU pair in
the system fulfills capacity_greater(); the call sites in fair.c give a
good overview.
Is $subject the right approach to deal with these platforms instead?
I don't know, that's why it's marked RFC and RFT.
On Thu, 26 Mar 2026 at 10:24, Christian Loehle <christian.loehle@arm.com> wrote:
>
> On 3/26/26 08:24, Vincent Guittot wrote:
> > On Thu, 26 Mar 2026 at 09:16, Christian Loehle <christian.loehle@arm.com> wrote:
> >>
> >> [snip]
> >
> > I tried to understand the end goal of this patch
> >
> > SD_ASYM_CPUCAPACITY works fine with !SMT system so why enabling
> > SD_ASYM_PACKING for <5% diff ?
> >
> > That doesn't make sense to me
>
> I don't know if "works fine" describes the situation accurately.
> I guess I should've included the context in the cover letter, but you
> are aware of them (you've replied to them anyway):
> https://lore.kernel.org/lkml/20260324005509.1134981-1-arighi@nvidia.com/
> https://lore.kernel.org/lkml/20260318092214.130908-1-arighi@nvidia.com/
>
> Andrea sees an improvement even when force-equalizing CPUs to remove
> SD_ASYM_CPUCAPACITY, so I'd argue it doesn't "work fine" on these platforms.

IIUC this was for SMT systems, not for !SMT ones, but I might have
missed some emails in the thread.

> To me it seems more reasonable to attempt to get these minor improvements
> of minor asymmetries through asympacking and leave SD_ASYM_CPUCAPACITY
> to the actual 'true' asymmetry (e.g. different uArch or vastly different
> performance levels).
> SD_ASYM_CPUCAPACITY handling is also arguably broken if no CPU pair in
> the system fulfills capacity_greater(), the call sites in fair.c give
> a good overview.
> Is $subject the right approach to deal with these platforms instead?
> I don't know, that's why it's marked RFC and RFT.
On Thu, Mar 26, 2026 at 02:04:42PM +0100, Vincent Guittot wrote:
> On Thu, 26 Mar 2026 at 10:24, Christian Loehle <christian.loehle@arm.com> wrote:
> >
> > [snip]
> >
> > Andrea sees an improvement even when force-equalizing CPUs to remove
> > SD_ASYM_CPUCAPACITY, so I'd argue it doesn't "work fine" on these platforms.
>
> IIUC this was for SMT systems not for !SMT ones but I might have
> missed some emails in the thread.

Right, the issue I'm trying to solve is SD_ASYM_CPUCAPACITY + SMT.
Removing SD_ASYM_CPUCAPACITY from the equation fixes my issue, because
we fall back to the regular idle CPU selection policy, which avoids
allocating both SMT siblings when possible.

Thanks,
-Andrea
On 3/26/26 13:45, Andrea Righi wrote:
> On Thu, Mar 26, 2026 at 02:04:42PM +0100, Vincent Guittot wrote:
>> On Thu, 26 Mar 2026 at 10:24, Christian Loehle <christian.loehle@arm.com> wrote:
>>>
>>> [snip]
>>
>> IIUC this was for SMT systems not for !SMT ones but I might have
>> missed some emails in the thread.
>
> Right, the issue I'm trying to solve is SD_ASYM_CPUCAPACITY + SMT. Removing
> SD_ASYM_CPUCAPACITY from the equation fixes my issue, because we fall back
> into the regular idle CPU selection policy, which avoids allocating both
> SMT siblings when possible.
>
> Thanks,
> -Andrea

Could you also report how Grace baseline vs. ASYM_PACKING works for your
benchmark? (Or Vera with nosmt.)
Hi Christian,

On Thu, Mar 26, 2026 at 03:55:54PM +0000, Christian Loehle wrote:
...
> > Right, the issue I'm trying to solve is SD_ASYM_CPUCAPACITY + SMT. Removing
> > SD_ASYM_CPUCAPACITY from the equation fixes my issue, because we fall back
> > into the regular idle CPU selection policy, which avoids allocating both
> > SMT siblings when possible.
> >
> > Thanks,
> > -Andrea
>
> Could you also report how Grace baseline vs ASYM_PACKING works for your
> benchmark? (or Vera nosmt)
>

I've done some tests with Vera nosmt. I don't see much difference with
ASYM_PACKING vs ASYM_CPUCAPACITY (baseline); it's pretty much within the
error range (I see around a 1-2% difference across runs, but there's no
clear bias between the two solutions).

I'll try to find a Grace system and repeat the tests there as well.

Thanks,
-Andrea
On Thu, Mar 26, 2026 at 03:55:54PM +0000, Christian Loehle wrote:
> On 3/26/26 13:45, Andrea Righi wrote:
> > On Thu, Mar 26, 2026 at 02:04:42PM +0100, Vincent Guittot wrote:
> >> On Thu, 26 Mar 2026 at 10:24, Christian Loehle <christian.loehle@arm.com> wrote:
> >>>
> >>> On 3/26/26 08:24, Vincent Guittot wrote:
> >>>> On Thu, 26 Mar 2026 at 09:16, Christian Loehle <christian.loehle@arm.com> wrote:
> >>>>>
> >>>>> On 3/26/26 07:53, Vincent Guittot wrote:
> >>>>>> On Wed, 25 Mar 2026 at 19:13, Christian Loehle <christian.loehle@arm.com> wrote:
> >>>>>>>
> >>>>>>> The scheduler currently handles CPU performance asymmetry via either:
> >>>>>>>
> >>>>>>> - SD_ASYM_PACKING: simple priority-based task placement (x86 ITMT)
> >>>>>>> - SD_ASYM_CPUCAPACITY: capacity-aware scheduling
> >>>>>>>
> >>>>>>> On arm64, capacity-aware scheduling is used for any detected capacity
> >>>>>>> differences.
> >>>>>>>
> >>>>>>> Some systems expose small per-CPU performance differences via CPPC
> >>>>>>> highest_perf (e.g. due to chip binning), resulting in slightly different
> >>>>>>> capacities (<~5%). These differences are sufficient to trigger
> >>>>>>> SD_ASYM_CPUCAPACITY, even though the system is otherwise effectively
> >>>>>>> symmetric.
> >>>>>>>
> >>>>>>> For such small deltas, capacity-aware scheduling is unnecessarily
> >>>>>>> complex. A simpler priority-based approach, similar to x86 ITMT, is
> >>>>>>> sufficient.
> >>>>>>
> >>>>>> I'm not convinced that moving to SD_ASYM_PACKING is the right way to
> >>>>>> move forward.
> >>>>>>
> >>>>>> 1st of all, do you target all kinds of systems or only SMT? It's not
> >>>>>> clear in your cover letter.
> >>>>>
> >>>>> AFAIK only Andrea has access to an unreleased asymmetric SMT system,
> >>>>> I haven't done any tests on such a system (as the cover letter mentions
> >>>>> under the RFT section).
> >>>>>
> >>>>>>
> >>>>>> Moving to asym packing for !SMT doesn't make sense to me. If you don't
> >>>>>> want EAS enabled, you can disable it with
> >>>>>> /proc/sys/kernel/sched_energy_aware
> >>>>>
> >>>>> Sorry, what's EAS got to do with it? The system I care about here
> >>>>> (primarily nvidia grace) has no EM.
> >>>>
> >>>> I tried to understand the end goal of this patch.
> >>>>
> >>>> SD_ASYM_CPUCAPACITY works fine with !SMT systems, so why enable
> >>>> SD_ASYM_PACKING for a <5% diff?
> >>>>
> >>>> That doesn't make sense to me.
> >>>
> >>> I don't know if "works fine" describes the situation accurately.
> >>> I guess I should've included the context in the cover letter, but you
> >>> are aware of them (you've replied to them anyway):
> >>> https://lore.kernel.org/lkml/20260324005509.1134981-1-arighi@nvidia.com/
> >>> https://lore.kernel.org/lkml/20260318092214.130908-1-arighi@nvidia.com/
> >>>
> >>> Andrea sees an improvement even when force-equalizing CPUs to remove
> >>> SD_ASYM_CPUCAPACITY, so I'd argue it doesn't "work fine" on these platforms.
> >>
> >> IIUC this was for SMT systems, not for !SMT ones, but I might have
> >> missed some emails in the thread.
> >
> > Right, the issue I'm trying to solve is SD_ASYM_CPUCAPACITY + SMT. Removing
> > SD_ASYM_CPUCAPACITY from the equation fixes my issue, because we fall back
> > into the regular idle CPU selection policy, which avoids allocating both
> > SMT siblings when possible.
> >
> > Thanks,
> > -Andrea
>
> Could you also report how Grace baseline vs ASYM_PACKING works for your
> benchmark? (or Vera nosmt)
>

Sure, I'll try testing both and report back.

-Andrea