[PATCH v3 1/8] sched/topology: Compute sd_weight considering cpuset partitions

K Prateek Nayak posted 8 patches 2 weeks, 5 days ago
Posted by K Prateek Nayak 2 weeks, 5 days ago
The "sd_weight" used for calculating the load balancing interval, and
its limits, considers the span weight of the entire topology level
without accounting for cpuset partitions.

Compute the "sd_weight" after computing the "sd_span" considering the
cpu_map covered by the partition, and set the load balancing interval,
and its limits accordingly.

Fixes: cb83b629bae03 ("sched/numa: Rewrite the CONFIG_NUMA sched domain support")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Changelog rfc v2..v3:

o New patch.
---
 kernel/sched/topology.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index cf643a5ddedd..649674bb6c3c 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1638,8 +1638,6 @@ sd_init(struct sched_domain_topology_level *tl,
 	int sd_id, sd_weight, sd_flags = 0;
 	struct cpumask *sd_span;
 
-	sd_weight = cpumask_weight(tl->mask(tl, cpu));
-
 	if (tl->sd_flags)
 		sd_flags = (*tl->sd_flags)();
 	if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS,
@@ -1647,8 +1645,6 @@ sd_init(struct sched_domain_topology_level *tl,
 		sd_flags &= TOPOLOGY_SD_FLAGS;
 
 	*sd = (struct sched_domain){
-		.min_interval		= sd_weight,
-		.max_interval		= 2*sd_weight,
 		.busy_factor		= 16,
 		.imbalance_pct		= 117,
 
@@ -1668,7 +1664,6 @@ sd_init(struct sched_domain_topology_level *tl,
 					,
 
 		.last_balance		= jiffies,
-		.balance_interval	= sd_weight,
 
 		/* 50% success rate */
 		.newidle_call		= 512,
@@ -1685,6 +1680,11 @@ sd_init(struct sched_domain_topology_level *tl,
 	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
 	sd_id = cpumask_first(sd_span);
 
+	sd_weight = cpumask_weight(sd_span);
+	sd->min_interval = sd_weight;
+	sd->max_interval = 2 * sd_weight;
+	sd->balance_interval = sd_weight;
+
 	sd->flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
 
 	WARN_ONCE((sd->flags & (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY)) ==
-- 
2.34.1
Re: [PATCH v3 1/8] sched/topology: Compute sd_weight considering cpuset partitions
Posted by Valentin Schneider 3 days, 14 hours ago
On 20/01/26 11:32, K Prateek Nayak wrote:
> The "sd_weight" used for calculating the load balancing interval, and
> its limits, considers the span weight of the entire topology level
> without accounting for cpuset partitions.
>
> Compute the "sd_weight" after computing the "sd_span" considering the
> cpu_map covered by the partition, and set the load balancing interval,
> and its limits accordingly.
>
> Fixes: cb83b629bae03 ("sched/numa: Rewrite the CONFIG_NUMA sched domain support")
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>

Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Re: [PATCH v3 1/8] sched/topology: Compute sd_weight considering cpuset partitions
Posted by Shrikanth Hegde 2 weeks, 4 days ago

On 1/20/26 5:02 PM, K Prateek Nayak wrote:
> The "sd_weight" used for calculating the load balancing interval, and
> its limits, considers the span weight of the entire topology level
> without accounting for cpuset partitions.
>

Please add one example showing the wrong sd_weights
while having cpuset partitions. That would be helpful.

> Compute the "sd_weight" after computing the "sd_span" considering the
> cpu_map covered by the partition, and set the load balancing interval,
> and its limits accordingly.
> 
> Fixes: cb83b629bae03 ("sched/numa: Rewrite the CONFIG_NUMA sched domain support")
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
> Changelog rfc v2..v3:
> 
> o New patch.
> ---

Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Re: [PATCH v3 1/8] sched/topology: Compute sd_weight considering cpuset partitions
Posted by K Prateek Nayak 2 weeks, 4 days ago
Hello Shrikanth,

On 1/21/2026 9:12 PM, Shrikanth Hegde wrote:
> 
> 
> On 1/20/26 5:02 PM, K Prateek Nayak wrote:
>> The "sd_weight" used for calculating the load balancing interval, and
>> its limits, considers the span weight of the entire topology level
>> without accounting for cpuset partitions.
>>
> 
> Please add one example showing the wrong sd_weights
> while having cpuset partitions. That would be helpful.

Ack! I'll update with an example in the next version.

> 
>> Compute the "sd_weight" after computing the "sd_span" considering the
>> cpu_map covered by the partition, and set the load balancing interval,
>> and its limits accordingly.
>>
>> Fixes: cb83b629bae03 ("sched/numa: Rewrite the CONFIG_NUMA sched domain support")
>> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
>> ---
>> Changelog rfc v2..v3:
>>
>> o New patch.
>> ---
> 
> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>

Thanks a ton for the review.

-- 
Thanks and Regards,
Prateek
Re: [PATCH v3 1/8] sched/topology: Compute sd_weight considering cpuset partitions
Posted by Chen, Yu C 2 weeks, 4 days ago
On 1/20/2026 7:32 PM, K Prateek Nayak wrote:
> The "sd_weight" used for calculating the load balancing interval, and
> its limits, considers the span weight of the entire topology level
> without accounting for cpuset partitions.
> 
> Compute the "sd_weight" after computing the "sd_span" considering the
> cpu_map covered by the partition, and set the load balancing interval,
> and its limits accordingly.
> 
> Fixes: cb83b629bae03 ("sched/numa: Rewrite the CONFIG_NUMA sched domain support")
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---

This fixes the issue not only for the periodic load balancer
but also for the nohz balancer, because it gives a correct value for
nr_busy_cpus. So, from my understanding,

Reviewed-by: Chen Yu <yu.c.chen@intel.com>

thanks,
Chenyu
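
Illustration (not part of the posted patch or any reply): the example requested in the review above can be sketched in userspace C. It models a CPU mask as a 64-bit integer and compares the interval values sd_init() would derive before and after the change. The 16-CPU topology level, the 4-CPU partition, and the helper names (mask_weight, intervals_before, intervals_after, struct sd_intervals) are assumptions made up for this sketch; only the min_interval, max_interval and balance_interval assignments mirror the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-in for cpumask_weight(): count the set bits of a mask. */
unsigned int mask_weight(uint64_t mask)
{
	unsigned int weight = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		weight++;
	}
	return weight;
}

/* Hypothetical container for the three fields the patch touches. */
struct sd_intervals {
	unsigned int min_interval;
	unsigned int max_interval;
	unsigned int balance_interval;
};

/* Same assignments as in the diff, driven by a given sd_weight. */
struct sd_intervals intervals_from_weight(unsigned int sd_weight)
{
	struct sd_intervals iv = {
		.min_interval = sd_weight,
		.max_interval = 2 * sd_weight,
		.balance_interval = sd_weight,
	};
	return iv;
}

/* Before the patch: weight of the whole topology level, cpu_map ignored. */
struct sd_intervals intervals_before(uint64_t tl_mask, uint64_t cpu_map)
{
	(void)cpu_map;
	return intervals_from_weight(mask_weight(tl_mask));
}

/* After the patch: weight of sd_span = cpu_map & topology level mask. */
struct sd_intervals intervals_after(uint64_t tl_mask, uint64_t cpu_map)
{
	return intervals_from_weight(mask_weight(tl_mask & cpu_map));
}

/* Assumed setup: the level spans CPUs 0-15, the partition covers CPUs 0-3. */
void print_example(void)
{
	uint64_t tl_mask = 0xFFFFULL;
	uint64_t cpu_map = 0x000FULL;
	struct sd_intervals before = intervals_before(tl_mask, cpu_map);
	struct sd_intervals after = intervals_after(tl_mask, cpu_map);

	/* The span is a subset of the level mask, so it can only shrink. */
	assert(after.min_interval <= before.min_interval);

	printf("before: min=%u max=%u balance=%u\n", before.min_interval,
	       before.max_interval, before.balance_interval);
	printf("after:  min=%u max=%u balance=%u\n", after.min_interval,
	       after.max_interval, after.balance_interval);
}
```

Under these assumed masks, the pre-patch computation yields min/max/balance intervals of 16/32/16, while the post-patch computation yields 4/8/4, so a domain confined to the 4-CPU partition is no longer balanced at the cadence of a 16-CPU domain.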