[PATCH] sched/topology: Initialize sd_span after assignment to *sd

K Prateek Nayak posted 1 patch 1 week, 6 days ago
kernel/sched/topology.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
Nathan reported a kernel panic on his ARM builds after commit
8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset
partitions") which was root caused to the compiler zeroing out the first
few bytes of sd->span.

During the debugging [1], it was discovered that, on some configs,
offsetof(struct sched_domain, span) at 292 was less than
sizeof(struct sched_domain) at 296, resulting in:

  *sd = { ... }

assignment clearing out the first 4 bytes of sd->span which had been
initialized before.

The GCC documentation for "Arrays of Length Zero" [2] says:

  Although the size of a zero-length array is zero, an array member of
  this kind may increase the size of the enclosing type as a result of
  tail padding.

which means the relative offset of the flexible array at the end of the
struct can indeed be less than sizeof() the struct as a result of tail
padding. Whole-struct initialization zeroes that tail padding, thus
overwriting any data of the flexible array that overlaps with it.

Partially revert commit 8e8e23dea43e ("sched/topology: Compute sd_weight
considering cpuset partitions") to initialize sd_span after the fixed
members of sd.

Use

  cpumask_weight_and(cpu_map, tl->mask(tl, cpu))

to calculate span_weight before initializing the sd_span.
cpumask_weight_and() has the same complexity as cpumask_and() and the
additional overhead is negligible.

While at it, also initialize sd->span_weight in sd_init() since
sd_weight now captures the cpu_map constraints. Fix up
sd->span_weight whenever sd_span is fixed up by the generic topology
layer.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://lore.kernel.org/all/20260320235824.GA1176840@ax162/
Fixes: 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset partitions")
Link: https://lore.kernel.org/all/a8c125fd-960d-4b35-b640-95a33584eb08@amd.com/ [1]
Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2]
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
Nathan, can you please check if this fixes the issue you are observing -
it at least fixed one that I'm observing ;-)

Peter, if you would like to keep revert and enhancements separate, let
me know and I'll spin a v2.
---
 kernel/sched/topology.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 43150591914b..721ed9b883b8 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1669,17 +1669,13 @@ sd_init(struct sched_domain_topology_level *tl,
 	struct cpumask *sd_span;
 	u64 now = sched_clock();
 
-	sd_span = sched_domain_span(sd);
-	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
-	sd_weight = cpumask_weight(sd_span);
-	sd_id = cpumask_first(sd_span);
+	sd_weight = cpumask_weight_and(cpu_map, tl->mask(tl, cpu));
 
 	if (tl->sd_flags)
 		sd_flags = (*tl->sd_flags)();
 	if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS,
 		      "wrong sd_flags in topology description\n"))
 		sd_flags &= TOPOLOGY_SD_FLAGS;
-	sd_flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
 
 	*sd = (struct sched_domain){
 		.min_interval		= sd_weight,
@@ -1715,8 +1711,15 @@ sd_init(struct sched_domain_topology_level *tl,
 		.last_decay_max_lb_cost	= jiffies,
 		.child			= child,
 		.name			= tl->name,
+		.span_weight		= sd_weight,
 	};
 
+	sd_span = sched_domain_span(sd);
+	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
+	sd_id = cpumask_first(sd_span);
+
+	sd->flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
+
 	WARN_ONCE((sd->flags & (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY)) ==
 		  (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY),
 		  "CPU capacity asymmetry not supported on SMT\n");
@@ -2518,6 +2521,8 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
 			cpumask_or(sched_domain_span(sd),
 				   sched_domain_span(sd),
 				   sched_domain_span(child));
+
+			sd->span_weight = cpumask_weight(sched_domain_span(sd));
 		}
 
 	}
@@ -2697,7 +2702,6 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	/* Build the groups for the domains */
 	for_each_cpu(i, cpu_map) {
 		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
-			sd->span_weight = cpumask_weight(sched_domain_span(sd));
 			if (sd->flags & SD_NUMA) {
 				if (build_overlap_sched_groups(sd, i))
 					goto error;

base-commit: fe7171d0d5dfbe189e41db99580ebacafc3c09ce
-- 
2.34.1
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by Peter Zijlstra 1 week, 4 days ago
On Sat, Mar 21, 2026 at 04:38:52PM +0000, K Prateek Nayak wrote:
> Nathan reported a kernel panic on his ARM builds after commit
> 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset
> partitions") which was root caused to the compiler zeroing out the first
> few bytes of sd->span.
> 
> During the debug [1], it was discovered that, on some configs,
> offsetof(struct sched_domain, span) at 292 was less than
> sizeof(struct sched_domain) at 296 resulting in:
> 
>   *sd = { ... }
> 
> assignment clearing out first 4 bytes of sd->span which was initialized
> before.
> 
> The official GCC specification for "Arrays of Length Zero" [2] says:
> 
>   Although the size of a zero-length array is zero, an array member of
>   this kind may increase the size of the enclosing type as a result of
>   tail padding.
> 
> which means the relative offset of the variable length array at the end
> of the sturct can indeed be less than sizeof() the struct as a result of
> tail padding thus overwriting that data of the flexible array that
> overlapped with the padding whenever the struct is initialized as whole.

WTF! that's terrible :(

Why is this allowed, this makes no bloody sense :/

However the way we allocate space for flex arrays is: sizeof(*obj) +
count * sizeof(*obj->member); this means that we do have sufficient
space, irrespective of this extra padding.


Does this work?

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 51c29581f15e..defa86ed9b06 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -153,7 +153,21 @@ struct sched_domain {
 
 static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
 {
-	return to_cpumask(sd->span);
+	/*
+	 * Because C is an absolutely broken piece of shit, it is allowed for
+	 * offsetof(*sd, span) < sizeof(*sd), this means that structure
+	 * initialzation *sd = { ... }; which will clear every unmentioned
+	 * member, can over-write the start of the flexible array member.
+	 *
+	 * Luckily, the way we allocate the flexible array is by:
+	 *
+	 *   sizeof(*sd) + count * sizeof(*sd->span)
+	 *
+	 * this means that we have sufficient space for the whole flex array
+	 * *outside* of sizeof(*sd). So use that, and avoid using sd->span.
+	 */
+	unsigned long *bitmap = (void *)sd + sizeof(*sd);
+	return to_cpumask(bitmap);
 }
 
 extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by Nathan Chancellor 1 week, 4 days ago
On Mon, Mar 23, 2026 at 10:36:27AM +0100, Peter Zijlstra wrote:
> Does this work?

Yes, that avoids the initial panic I reported.

Tested-by: Nathan Chancellor <nathan@kernel.org>

> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 51c29581f15e..defa86ed9b06 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -153,7 +153,21 @@ struct sched_domain {
>  
>  static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
>  {
> -	return to_cpumask(sd->span);
> +	/*
> +	 * Because C is an absolutely broken piece of shit, it is allowed for
> +	 * offsetof(*sd, span) < sizeof(*sd), this means that structure
> +	 * initialzation *sd = { ... }; which will clear every unmentioned
> +	 * member, can over-write the start of the flexible array member.
> +	 *
> +	 * Luckily, the way we allocate the flexible array is by:
> +	 *
> +	 *   sizeof(*sd) + count * sizeof(*sd->span)
> +	 *
> +	 * this means that we have sufficient space for the whole flex array
> +	 * *outside* of sizeof(*sd). So use that, and avoid using sd->span.
> +	 */
> +	unsigned long *bitmap = (void *)sd + sizeof(*sd);
> +	return to_cpumask(bitmap);
>  }
>  
>  extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by K Prateek Nayak 1 week, 4 days ago
Hello Peter,

On 3/23/2026 3:06 PM, Peter Zijlstra wrote:
> However the way we allocate space for flex arrays is: sizeof(*obj) +
> count * sizeof(*obj->member); this means that we do have sufficient
> space, irrespective of this extra padding.
> 
> 
> Does this work?

Solves the panic on the setup shared by Nathan and KASAN hasn't
noted anything in my baremetal testing so feel free to include:

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

-- 
Thanks and Regards,
Prateek
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by Chen, Yu C 1 week, 4 days ago
On 3/23/2026 5:36 PM, Peter Zijlstra wrote:
> On Sat, Mar 21, 2026 at 04:38:52PM +0000, K Prateek Nayak wrote:
>> Nathan reported a kernel panic on his ARM builds after commit
>> 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset
>> partitions") which was root caused to the compiler zeroing out the first
>> few bytes of sd->span.
>>
>> During the debug [1], it was discovered that, on some configs,
>> offsetof(struct sched_domain, span) at 292 was less than
>> sizeof(struct sched_domain) at 296 resulting in:
>>
>>    *sd = { ... }
>>
>> assignment clearing out first 4 bytes of sd->span which was initialized
>> before.
>>
>> The official GCC specification for "Arrays of Length Zero" [2] says:
>>
>>    Although the size of a zero-length array is zero, an array member of
>>    this kind may increase the size of the enclosing type as a result of
>>    tail padding.
>>
>> which means the relative offset of the variable length array at the end
>> of the sturct can indeed be less than sizeof() the struct as a result of
>> tail padding thus overwriting that data of the flexible array that
>> overlapped with the padding whenever the struct is initialized as whole.
> 
> WTF! that's terrible :(
> 
> Why is this allowed, this makes no bloody sense :/
> 
> However the way we allocate space for flex arrays is: sizeof(*obj) +
> count * sizeof(*obj->member); this means that we do have sufficient
> space, irrespective of this extra padding.
> 
> 
> Does this work?
> 
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 51c29581f15e..defa86ed9b06 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -153,7 +153,21 @@ struct sched_domain {
>   
>   static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
>   {
> -	return to_cpumask(sd->span);
> +	/*
> +	 * Because C is an absolutely broken piece of shit, it is allowed for
> +	 * offsetof(*sd, span) < sizeof(*sd), this means that structure
> +	 * initialzation *sd = { ... }; which will clear every unmentioned
> +	 * member, can over-write the start of the flexible array member.
> +	 *
> +	 * Luckily, the way we allocate the flexible array is by:
> +	 *
> +	 *   sizeof(*sd) + count * sizeof(*sd->span)
> +	 *
> +	 * this means that we have sufficient space for the whole flex array
> +	 * *outside* of sizeof(*sd). So use that, and avoid using sd->span.
> +	 */
> +	unsigned long *bitmap = (void *)sd + sizeof(*sd);
> +	return to_cpumask(bitmap);
>   }
>   
>   extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],

While I still wonder if it is risky to initialize the structure members
before *sd = { ... }, this patch keeps the current sd_init() flow
unchanged.
According to the tests on GNR, it works as expected with no regressions
noticed on top of sched/core commit 349edbba1125 ("sched/fair: Simplify
SIS_UTIL handling in select_idle_cpu()"),

Tested-by: Chen Yu <yu.c.chen@intel.com>

thanks,
Chenyu
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by Jon Hunter 1 week, 4 days ago
Hi Peter,

On 23/03/2026 09:36, Peter Zijlstra wrote:
> On Sat, Mar 21, 2026 at 04:38:52PM +0000, K Prateek Nayak wrote:
>> Nathan reported a kernel panic on his ARM builds after commit
>> 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset
>> partitions") which was root caused to the compiler zeroing out the first
>> few bytes of sd->span.
>>
>> During the debug [1], it was discovered that, on some configs,
>> offsetof(struct sched_domain, span) at 292 was less than
>> sizeof(struct sched_domain) at 296 resulting in:
>>
>>    *sd = { ... }
>>
>> assignment clearing out first 4 bytes of sd->span which was initialized
>> before.
>>
>> The official GCC specification for "Arrays of Length Zero" [2] says:
>>
>>    Although the size of a zero-length array is zero, an array member of
>>    this kind may increase the size of the enclosing type as a result of
>>    tail padding.
>>
>> which means the relative offset of the variable length array at the end
>> of the sturct can indeed be less than sizeof() the struct as a result of
>> tail padding thus overwriting that data of the flexible array that
>> overlapped with the padding whenever the struct is initialized as whole.
> 
> WTF! that's terrible :(
> 
> Why is this allowed, this makes no bloody sense :/
> 
> However the way we allocate space for flex arrays is: sizeof(*obj) +
> count * sizeof(*obj->member); this means that we do have sufficient
> space, irrespective of this extra padding.
> 
> 
> Does this work?
> 
> diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> index 51c29581f15e..defa86ed9b06 100644
> --- a/include/linux/sched/topology.h
> +++ b/include/linux/sched/topology.h
> @@ -153,7 +153,21 @@ struct sched_domain {
>   
>   static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
>   {
> -	return to_cpumask(sd->span);
> +	/*
> +	 * Because C is an absolutely broken piece of shit, it is allowed for
> +	 * offsetof(*sd, span) < sizeof(*sd), this means that structure
> +	 * initialzation *sd = { ... }; which will clear every unmentioned
> +	 * member, can over-write the start of the flexible array member.
> +	 *
> +	 * Luckily, the way we allocate the flexible array is by:
> +	 *
> +	 *   sizeof(*sd) + count * sizeof(*sd->span)
> +	 *
> +	 * this means that we have sufficient space for the whole flex array
> +	 * *outside* of sizeof(*sd). So use that, and avoid using sd->span.
> +	 */
> +	unsigned long *bitmap = (void *)sd + sizeof(*sd);
> +	return to_cpumask(bitmap);
>   }
>   
>   extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],


I noticed the same issue that Nathan reported on 32-bit Tegra and the 
above does fix it for me.

Tested-by: Jon Hunter <jonathanh@nvidia.com>

Thanks!
Jon

-- 
nvpublic
[tip: sched/core] sched/topology: Fix sched_domain_span()
Posted by tip-bot2 for Peter Zijlstra 1 week, 3 days ago
The following commit has been merged into the sched/core branch of tip:

Commit-ID:     e379dce8af11d8d6040b4348316a499bfd174bfb
Gitweb:        https://git.kernel.org/tip/e379dce8af11d8d6040b4348316a499bfd174bfb
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Mon, 23 Mar 2026 10:36:27 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 24 Mar 2026 10:07:04 +01:00

sched/topology: Fix sched_domain_span()

Commit 8e8e23dea43e ("sched/topology: Compute sd_weight considering
cpuset partitions") ends up relying on the fact that structure
initialization should not touch the flexible array.

However, the official GCC specification for "Arrays of Length Zero"
[*] says:

  Although the size of a zero-length array is zero, an array member of
  this kind may increase the size of the enclosing type as a result of
  tail padding.

Additionally, structure initialization will zero tail padding. With
the end result that since offsetof(*type, member) < sizeof(*type),
array initialization will clobber the flex array.

Luckily, the way flexible array sizes are calculated is:

  sizeof(*type) + count * sizeof(*type->member)

This means we have the complete size of the flex array *outside* of
sizeof(*type), so use that instead of relying on the broken flex array
definition.

[*] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

Fixes: 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset partitions")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Debugged-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260323093627.GY3738010@noisy.programming.kicks-ass.net
---
 include/linux/sched/topology.h | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 51c2958..36553e1 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -142,18 +142,30 @@ struct sched_domain {
 
 	unsigned int span_weight;
 	/*
-	 * Span of all CPUs in this domain.
+	 * See sched_domain_span(), on why flex arrays are broken.
 	 *
-	 * NOTE: this field is variable length. (Allocated dynamically
-	 * by attaching extra space to the end of the structure,
-	 * depending on how many CPUs the kernel has booted up with)
-	 */
 	unsigned long span[];
+	 */
 };
 
 static inline struct cpumask *sched_domain_span(struct sched_domain *sd)
 {
-	return to_cpumask(sd->span);
+	/*
+	 * Turns out that C flexible arrays are fundamentally broken since it
+	 * is allowed for offsetof(*sd, span) < sizeof(*sd), this means that
+	 * structure initialization *sd = { ... }; which writes every byte
+	 * inside sizeof(*type), will over-write the start of the flexible
+	 * array.
+	 *
+	 * Luckily, the way we allocate sched_domain is by:
+	 *
+	 *   sizeof(*sd) + cpumask_size()
+	 *
+	 * this means that we have sufficient space for the whole flex array
+	 * *outside* of sizeof(*sd). So use that, and avoid using sd->span.
+	 */
+	unsigned long *bitmap = (void *)sd + sizeof(*sd);
+	return to_cpumask(bitmap);
 }
 
 extern void partition_sched_domains(int ndoms_new, cpumask_var_t doms_new[],
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by Shrikanth Hegde 1 week, 4 days ago

On 3/21/26 10:08 PM, K Prateek Nayak wrote:
> Nathan reported a kernel panic on his ARM builds after commit
> 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset
> partitions") which was root caused to the compiler zeroing out the first
> few bytes of sd->span.
> 
> During the debug [1], it was discovered that, on some configs,
> offsetof(struct sched_domain, span) at 292 was less than
> sizeof(struct sched_domain) at 296 resulting in:
> 
>    *sd = { ... }
> 
> assignment clearing out first 4 bytes of sd->span which was initialized
> before.
> 
> The official GCC specification for "Arrays of Length Zero" [2] says:
> 
>    Although the size of a zero-length array is zero, an array member of
>    this kind may increase the size of the enclosing type as a result of
>    tail padding.
> 
> which means the relative offset of the variable length array at the end
> of the sturct can indeed be less than sizeof() the struct as a result of
> tail padding thus overwriting that data of the flexible array that
> overlapped with the padding whenever the struct is initialized as whole.
> 
> Partially revert commit 8e8e23dea43e ("sched/topology: Compute sd_weight
> considering cpuset partitions") to initialize sd_span after the fixed
> memebers of sd.
> 
> Use
> 
>    cpumask_weight_and(cpu_map, tl->mask(tl, cpu))
> 
> to calculate span_weight before initializing the sd_span.
> cpumask_and_weight() is of same complexity as cpumask_and() and the
> additional overhead is negligible.
> 
> While at it, also initialize sd->span_weight in sd_init() since
> sd_weight now captures the cpu_map constraints. Fixup the
> sd->span_weight whenever sd_span is fixed up by the generic topology
> layer.
> 


This description is a bit confusing. The fixup happens naturally since
cpu_map now reflects the changes, right?

Maybe mention about that removal in build_sched_domains?

> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Closes: https://lore.kernel.org/all/20260320235824.GA1176840@ax162/
> Fixes: 8e8e23dea43e ("sched/topology: Compute sd_weight considering cpuset partitions")
> Link: https://lore.kernel.org/all/a8c125fd-960d-4b35-b640-95a33584eb08@amd.com/ [1]
> Link: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2]
> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
> Nathan, can you please check if this fixes the issue you are observing -
> it at least fixed one that I'm observing ;-)
> 
> Peter, if you would like to keep revert and enhancements separate, let
> me know and I'll spin a v2.
> ---
>   kernel/sched/topology.c | 16 ++++++++++------
>   1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 43150591914b..721ed9b883b8 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1669,17 +1669,13 @@ sd_init(struct sched_domain_topology_level *tl,
>   	struct cpumask *sd_span;
>   	u64 now = sched_clock();
>   
> -	sd_span = sched_domain_span(sd);
> -	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
> -	sd_weight = cpumask_weight(sd_span);
> -	sd_id = cpumask_first(sd_span);
> +	sd_weight = cpumask_weight_and(cpu_map, tl->mask(tl, cpu));
>   
>   	if (tl->sd_flags)
>   		sd_flags = (*tl->sd_flags)();
>   	if (WARN_ONCE(sd_flags & ~TOPOLOGY_SD_FLAGS,
>   		      "wrong sd_flags in topology description\n"))
>   		sd_flags &= TOPOLOGY_SD_FLAGS;
> -	sd_flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
>   
>   	*sd = (struct sched_domain){
>   		.min_interval		= sd_weight,
> @@ -1715,8 +1711,15 @@ sd_init(struct sched_domain_topology_level *tl,
>   		.last_decay_max_lb_cost	= jiffies,
>   		.child			= child,
>   		.name			= tl->name,
> +		.span_weight		= sd_weight,
>   	};
>   
> +	sd_span = sched_domain_span(sd);
> +	cpumask_and(sd_span, cpu_map, tl->mask(tl, cpu));
> +	sd_id = cpumask_first(sd_span);
> +
> +	sd->flags |= asym_cpu_capacity_classify(sd_span, cpu_map);
> +
>   	WARN_ONCE((sd->flags & (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY)) ==
>   		  (SD_SHARE_CPUCAPACITY | SD_ASYM_CPUCAPACITY),
>   		  "CPU capacity asymmetry not supported on SMT\n");
> @@ -2518,6 +2521,8 @@ static struct sched_domain *build_sched_domain(struct sched_domain_topology_leve
>   			cpumask_or(sched_domain_span(sd),
>   				   sched_domain_span(sd),
>   				   sched_domain_span(child));
> +
> +			sd->span_weight = cpumask_weight(sched_domain_span(sd));
>   		}
>   
>   	}
> @@ -2697,7 +2702,6 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
>   	/* Build the groups for the domains */
>   	for_each_cpu(i, cpu_map) {
>   		for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) {
> -			sd->span_weight = cpumask_weight(sched_domain_span(sd));
>   			if (sd->flags & SD_NUMA) {
>   				if (build_overlap_sched_groups(sd, i))
>   					goto error;
> 
> base-commit: fe7171d0d5dfbe189e41db99580ebacafc3c09ce


Other than nits in changelog:
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>


PS: b4 am -Q was quite confused about which patch to pick for 0001,
maybe since it was a reply to the thread. Not sure. So I pulled
each patch separately and applied them.
Re: [PATCH] sched/topology: Initialize sd_span after assignment to *sd
Posted by K Prateek Nayak 1 week, 4 days ago
Hello Shrikanth,

On 3/23/2026 2:38 PM, Shrikanth Hegde wrote:
>> While at it, also initialize sd->span_weight in sd_init() since
>> sd_weight now captures the cpu_map constraints. Fixup the
>> sd->span_weight whenever sd_span is fixed up by the generic topology
>> layer.
>>
> 
> 
> This description is a bit confusing. Fixup happens naturally since
> cpu_map now reflects the changes right?

That was for the hunk in build_sched_domain() where the sd span
is fixed up if it is found that the child isn't a subset of the
parent, in which case span_weight needs to be calculated again after
the cpumask_or().

[..snip..]

> Other than nits in changelog:
> Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>

Thanks for the review but Peter has found an alternate approach to
work around this with the current flow of computing span first.

> PS: b4 am -Q was quite confused which patch to pick for 0001.
> may since it was a reply to the thread. Not sure. So i pulled
> each patch separate and applied.

Sorry for the inconvenience. For a single patch it should still be fine
to grab the raw patch, but for larger series I'll make sure to post them
separately for convenience. Will be mindful next time.

-- 
Thanks and Regards,
Prateek