[tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()

tip-bot2 for K Prateek Nayak posted 1 patch 3 months ago
There is a newer version of this series
kernel/sched/topology.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
[tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()
Posted by tip-bot2 for K Prateek Nayak 3 months ago
The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     02bb4259ca525efa39a2531cb630329fb87fc968
Gitweb:        https://git.kernel.org/tip/02bb4259ca525efa39a2531cb630329fb87fc968
Author:        K Prateek Nayak <kprateek.nayak@amd.com>
AuthorDate:    Mon, 30 Jun 2025 06:10:59 
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 04 Jul 2025 10:35:56 +02:00

sched/fair: Use sched_domain_span() for topology_span_sane()

Leon noted a topology_span_sane() warning in their guest deployment
starting from v6.16-rc1 [1]. Debug that followed pointed to the
tl->mask() for the NODE domain being incorrectly resolved to that of the
highest NUMA domain.

tl->mask() for NODE is set to the sd_numa_mask() which depends on the
global "sched_domains_curr_level" hack. "sched_domains_curr_level" is
set to the "tl->numa_level" during tl traversal in build_sched_domains()
calling sd_init() but was not reset before topology_span_sane().

Since "tl->numa_level" still reflected the old value from
build_sched_domains(), topology_span_sane() for the NODE domain trips
when the span of the last NUMA domain overlaps.
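
The failure mode described above can be illustrated outside the kernel. The
following is a minimal user-space sketch, not the kernel code: the names
numa_span, curr_level, numa_mask_cpu0() and build_all_levels() are
hypothetical stand-ins for a mask callback that depends on a global level
selector, and the span values are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's data structures. */
enum { NR_LEVELS = 3 };

/* Span seen by CPU 0 at each level, one bit per CPU (made-up values). */
static const uint64_t numa_span[NR_LEVELS] = { 0x003, 0x00f, 0x3ff };

/* Global selector, playing the role of "sched_domains_curr_level". */
static int curr_level;

/* Mask callback whose result silently depends on the global above. */
static uint64_t numa_mask_cpu0(void)
{
	assert(curr_level >= 0 && curr_level < NR_LEVELS);
	return numa_span[curr_level];
}

/* Walk every level as a build loop would; the global is left at the last level. */
static void build_all_levels(void)
{
	for (int level = 0; level < NR_LEVELS; level++) {
		curr_level = level;
		(void)numa_mask_cpu0();
	}
}

/* Returns 1 when a later query meant for level 0 sees the stale, widest span. */
static int stale_mask_demo(void)
{
	build_all_levels();
	/* The caller wants level 0 but never resets curr_level. */
	return numa_mask_cpu0() != numa_span[0];
}
```

In this sketch, resetting curr_level before the later query would avoid the
stale result; reading the span from an already-built object avoids the
dependency on the global altogether.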

Instead of replicating the "sched_domains_curr_level" hack, Valentin
suggested using the spans from the sched_domain objects constructed
during build_sched_domains() which can also catch overlaps when the
domain spans are fixed up by build_sched_domain().

The original warning was reproducible on the following NUMA topology
reported by Leon:

    $ sudo numactl -H
    available: 5 nodes (0-4)
    node 0 cpus: 0 1
    node 0 size: 2927 MB
    node 0 free: 1603 MB
    node 1 cpus: 2 3
    node 1 size: 3023 MB
    node 1 free: 3008 MB
    node 2 cpus: 4 5
    node 2 size: 3023 MB
    node 2 free: 3007 MB
    node 3 cpus: 6 7
    node 3 size: 3023 MB
    node 3 free: 3002 MB
    node 4 cpus: 8 9
    node 4 size: 3022 MB
    node 4 free: 2718 MB
    node distances:
    node   0   1   2   3   4
      0:  10  39  38  37  36
      1:  39  10  38  37  36
      2:  38  38  10  37  36
      3:  37  37  37  10  36
      4:  36  36  36  36  10

The above topology can be mimicked using the following QEMU cmd that was
used to reproduce the warning and test the fix:

     sudo qemu-system-x86_64 -enable-kvm -cpu host \
     -m 20G -smp cpus=10,sockets=10 -machine q35 \
     -object memory-backend-ram,size=4G,id=m0 \
     -object memory-backend-ram,size=4G,id=m1 \
     -object memory-backend-ram,size=4G,id=m2 \
     -object memory-backend-ram,size=4G,id=m3 \
     -object memory-backend-ram,size=4G,id=m4 \
     -numa node,cpus=0-1,memdev=m0,nodeid=0 \
     -numa node,cpus=2-3,memdev=m1,nodeid=1 \
     -numa node,cpus=4-5,memdev=m2,nodeid=2 \
     -numa node,cpus=6-7,memdev=m3,nodeid=3 \
     -numa node,cpus=8-9,memdev=m4,nodeid=4 \
     -numa dist,src=0,dst=1,val=39 \
     -numa dist,src=0,dst=2,val=38 \
     -numa dist,src=0,dst=3,val=37 \
     -numa dist,src=0,dst=4,val=36 \
     -numa dist,src=1,dst=0,val=39 \
     -numa dist,src=1,dst=2,val=38 \
     -numa dist,src=1,dst=3,val=37 \
     -numa dist,src=1,dst=4,val=36 \
     -numa dist,src=2,dst=0,val=38 \
     -numa dist,src=2,dst=1,val=38 \
     -numa dist,src=2,dst=3,val=37 \
     -numa dist,src=2,dst=4,val=36 \
     -numa dist,src=3,dst=0,val=37 \
     -numa dist,src=3,dst=1,val=37 \
     -numa dist,src=3,dst=2,val=37 \
     -numa dist,src=3,dst=4,val=36 \
     -numa dist,src=4,dst=0,val=36 \
     -numa dist,src=4,dst=1,val=36 \
     -numa dist,src=4,dst=2,val=36 \
     -numa dist,src=4,dst=3,val=36 \
     ...

Closes: https://lore.kernel.org/lkml/20250610110701.GA256154@unreal/ [1]
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap") # ce29a7da84cd, f55dac1dafb3
Reported-by: Leon Romanovsky <leon@kernel.org>
Suggested-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steve Wahl <steve.wahl@hpe.com>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Tested-by: Valentin Schneider <vschneid@redhat.com>
Link: https://lore.kernel.org/r/20250630061059.1547-1-kprateek.nayak@amd.com
---
 kernel/sched/topology.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index b958fe4..0e46068 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2403,6 +2403,7 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
 	id_seen = sched_domains_tmpmask2;
 
 	for_each_sd_topology(tl) {
+		struct sd_data *sdd = &tl->data;
 
 		/* NUMA levels are allowed to overlap */
 		if (tl->flags & SDTL_OVERLAP)
@@ -2418,22 +2419,24 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
 		 * breaks the linking done for an earlier span.
 		 */
 		for_each_cpu(cpu, cpu_map) {
-			const struct cpumask *tl_cpu_mask = tl->mask(cpu);
+			struct sched_domain *sd = *per_cpu_ptr(sdd->sd, cpu);
+			struct cpumask *sd_span = sched_domain_span(sd);
 			int id;
 
 			/* lowest bit set in this mask is used as a unique id */
-			id = cpumask_first(tl_cpu_mask);
+			id = cpumask_first(sd_span);
 
 			if (cpumask_test_cpu(id, id_seen)) {
-				/* First CPU has already been seen, ensure identical spans */
-				if (!cpumask_equal(tl->mask(id), tl_cpu_mask))
+				/* First CPU has already been seen, ensure identical sd spans */
+				sd = *per_cpu_ptr(sdd->sd, id);
+				if (!cpumask_equal(sched_domain_span(sd), sd_span))
 					return false;
 			} else {
 				/* First CPU hasn't been seen before, ensure it's a completely new span */
-				if (cpumask_intersects(tl_cpu_mask, covered))
+				if (cpumask_intersects(sd_span, covered))
 					return false;
 
-				cpumask_or(covered, covered, tl_cpu_mask);
+				cpumask_or(covered, covered, sd_span);
 				cpumask_set_cpu(id, id_seen);
 			}
 		}
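
The loop touched by the diff above can be approximated in user space. This is
a rough sketch under stated assumptions, not the kernel implementation: spans
are plain 64-bit masks instead of struct cpumask, span[cpu] stands in for the
span looked up for each CPU, every span is assumed to contain its own CPU, and
the names first_cpu() and spans_sane() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index of the lowest set bit; the mask must be non-zero. */
static int first_cpu(uint64_t mask)
{
	int cpu = 0;

	assert(mask != 0);
	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/*
 * span[cpu] is the span that CPU "cpu" belongs to at one topology level.
 * Assumes nr_cpus <= 64 and that each span contains its own CPU.
 */
static bool spans_sane(const uint64_t *span, int nr_cpus)
{
	uint64_t covered = 0, id_seen = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		int id;

		assert(span[cpu] & (UINT64_C(1) << cpu));

		/* lowest bit set in this span is used as a unique id */
		id = first_cpu(span[cpu]);

		if (id_seen & (UINT64_C(1) << id)) {
			/* id already seen: both CPUs must report identical spans */
			if (span[id] != span[cpu])
				return false;
		} else {
			/* new id: the span must not touch anything already covered */
			if (span[cpu] & covered)
				return false;

			covered |= span[cpu];
			id_seen |= UINT64_C(1) << id;
		}
	}
	return true;
}
```

Two disjoint spans such as {0,1} and {2,3} pass this check, while a CPU whose
span partially overlaps an earlier one makes it return false.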
Re: [tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()
Posted by Borislav Petkov 3 months ago
On Fri, Jul 04, 2025 at 09:13:16AM -0000, tip-bot2 for K Prateek Nayak wrote:
> The following commit has been merged into the sched/urgent branch of tip:
> 
> Commit-ID:     02bb4259ca525efa39a2531cb630329fb87fc968
> Gitweb:        https://git.kernel.org/tip/02bb4259ca525efa39a2531cb630329fb87fc968
> Author:        K Prateek Nayak <kprateek.nayak@amd.com>
> AuthorDate:    Mon, 30 Jun 2025 06:10:59 
> Committer:     Peter Zijlstra <peterz@infradead.org>
> CommitterDate: Fri, 04 Jul 2025 10:35:56 +02:00
> 
> sched/fair: Use sched_domain_span() for topology_span_sane()

My guest doesn't like this one and reverting it on top of the whole tip lineup
fixes it.

Holler for more data if needed.

[    0.280062] Timer migration: 2 hierarchy levels; 8 children per group; 2 crossnode level
[    0.282922] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
[    0.287572] smp: Bringing up secondary CPUs ...
[    0.288623] smpboot: x86: Booting SMP configuration:
[    0.289085] .... node  #0, CPUs:        #1  #2  #3  #4  #5  #6  #7  #8  #9 #10 #11 #12 #13 #14 #15
[    0.302358] smp: Brought up 1 node, 16 CPUs
[    0.304445] smpboot: Total of 16 processors activated (118401.12 BogoMIPS)
[    0.307884] BUG: unable to handle page fault for address: 0000000089c402fb
[    0.307884] #PF: supervisor read access in kernel mode
[    0.307884] #PF: error_code(0x0000) - not-present page
[    0.307884] PGD 0 P4D 0 
[    0.307950] Oops: Oops: 0000 [#1] SMP NOPTI
[    0.308344] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc4+ #1 PREEMPT(full) 
[    0.309115] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2023.11-8 02/21/2024
[    0.309934] RIP: 0010:build_sched_domains+0x627/0x1550
[    0.310086] Code: 84 75 06 00 00 f3 48 0f bc c0 48 63 f8 89 c0 48 0f a3 05 c4 cf 95 08 0f 83 6c 06 00 00 48 8b 3c fd c0 db 29 82 49 8b 44 24 18 <48> 8b 04 07 48 8b 80 90 00 00 00 48 33 86 90 00 00 00 66 85 c0 0f
[    0.310086] RSP: 0018:ffffc9000001fe60 EFLAGS: 00010247
[    0.310086] RAX: ffffffff89c402f8 RBX: ffff88800cea8e40 RCX: 0000000000000001
[    0.310086] RDX: ffffffffffffffff RSI: ffff88800ceaacc0 RDI: 0000000100000003
[    0.310086] RBP: ffff88800cc4e3e0 R08: 0000000000000000 R09: 0000000000000000
[    0.310086] R10: 00000000fffedb1d R11: 00000000fffedb1d R12: ffff88800ceda4c0
[    0.310086] R13: ffff88800cea9500 R14: 0000000000000010 R15: 000000000000000f
[    0.310086] FS:  0000000000000000(0000) GS:ffff8880f39f2000(0000) knlGS:0000000000000000
[    0.310086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.310086] CR2: 0000000089c402fb CR3: 0000000002c1a000 CR4: 00000000003506f0
[    0.310086] Call Trace:
[    0.310086]  <TASK>
[    0.310086]  ? sched_init_domains+0x58/0xa0
[    0.310086]  sched_init_smp+0x29/0x90
[    0.310086]  kernel_init_freeable+0xa3/0x290
[    0.310086]  ? __pfx_kernel_init+0x10/0x10
[    0.310086]  kernel_init+0x1a/0x1c0
[    0.310086]  ret_from_fork+0x85/0xf0
[    0.310086]  ? __pfx_kernel_init+0x10/0x10
[    0.310086]  ret_from_fork_asm+0x1a/0x30
[    0.310086]  </TASK>
[    0.310086] Modules linked in:
[    0.310086] CR2: 0000000089c402fb
[    0.310086] ---[ end trace 0000000000000000 ]---
[    0.310086] RIP: 0010:build_sched_domains+0x627/0x1550
[    0.310086] Code: 84 75 06 00 00 f3 48 0f bc c0 48 63 f8 89 c0 48 0f a3 05 c4 cf 95 08 0f 83 6c 06 00 00 48 8b 3c fd c0 db 29 82 49 8b 44 24 18 <48> 8b 04 07 48 8b 80 90 00 00 00 48 33 86 90 00 00 00 66 85 c0 0f
[    0.310086] RSP: 0018:ffffc9000001fe60 EFLAGS: 00010247
[    0.310086] RAX: ffffffff89c402f8 RBX: ffff88800cea8e40 RCX: 0000000000000001
[    0.310086] RDX: ffffffffffffffff RSI: ffff88800ceaacc0 RDI: 0000000100000003
[    0.310086] RBP: ffff88800cc4e3e0 R08: 0000000000000000 R09: 0000000000000000
[    0.310086] R10: 00000000fffedb1d R11: 00000000fffedb1d R12: ffff88800ceda4c0
[    0.310086] R13: ffff88800cea9500 R14: 0000000000000010 R15: 000000000000000f
[    0.310086] FS:  0000000000000000(0000) GS:ffff8880f39f2000(0000) knlGS:0000000000000000
[    0.310086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.310086] CR2: 0000000089c402fb CR3: 0000000002c1a000 CR4: 00000000003506f0
[    0.310086] note: swapper/0[1] exited with irqs disabled
[    0.310091] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    0.311130] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()
Posted by K Prateek Nayak 3 months ago
Hello Boris,

On 7/4/2025 3:51 PM, Borislav Petkov wrote:
> On Fri, Jul 04, 2025 at 09:13:16AM -0000, tip-bot2 for K Prateek Nayak wrote:
>> The following commit has been merged into the sched/urgent branch of tip:
>>
>> Commit-ID:     02bb4259ca525efa39a2531cb630329fb87fc968
>> Gitweb:        https://git.kernel.org/tip/02bb4259ca525efa39a2531cb630329fb87fc968
>> Author:        K Prateek Nayak <kprateek.nayak@amd.com>
>> AuthorDate:    Mon, 30 Jun 2025 06:10:59
>> Committer:     Peter Zijlstra <peterz@infradead.org>
>> CommitterDate: Fri, 04 Jul 2025 10:35:56 +02:00
>>
>> sched/fair: Use sched_domain_span() for topology_span_sane()
> 
> My guest doesn't like this one and reverting it on top of the whole tip lineup
> fixes it.
> 
> Holler for more data if needed.

In an attempt to solve a complicated case, I think I overlooked the
simplest one. In your case, the PKG and NODE domains should have the
same span (covering all the CPUs in the system), and the
build_sched_domain() loop skips building the NODE domain altogether
since PKG already has all the online CPUs.

Can you try the below incremental diff on top of this patch and
let me know if you still hit the error:

(Lightly tested on QEMU)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 0e46068acb0a..cce540fe36c6 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2423,6 +2423,14 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
  			struct cpumask *sd_span = sched_domain_span(sd);
  			int id;
  
+			/*
+			 * If the child already covers the cpumap, sd
+			 * remains un-initialized. Use sd->private to
+			 * detect uninitialized domains.
+			 */
+			if (!sd->private)
+				continue;
+
  			/* lowest bit set in this mask is used as a unique id */
  			id = cpumask_first(sd_span);
  
---
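
The idea behind the guard in the diff above can be sketched in user space as
follows. This is only an illustration under assumptions: struct toy_domain,
covered_span() and count_uninitialized() are hypothetical names, and an
untouched slot is assumed to be zero-filled so that its "private" pointer is
NULL and its span is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of a per-CPU domain slot; not struct sched_domain. */
struct toy_domain {
	void *private;	/* NULL while the slot was allocated but never initialized */
	uint64_t span;	/* one bit per CPU; all zeroes for an untouched slot */
};

/* OR together the spans of initialized slots only, skipping untouched ones. */
static uint64_t covered_span(const struct toy_domain *slots, int nr_cpus)
{
	uint64_t covered = 0;

	assert(slots != NULL);
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		const struct toy_domain *sd = &slots[cpu];

		/* Same idea as the "!sd->private" check: nothing to validate. */
		if (!sd->private)
			continue;

		covered |= sd->span;
	}
	return covered;
}

/* Count the slots that an unguarded checker would have read as empty spans. */
static int count_uninitialized(const struct toy_domain *slots, int nr_cpus)
{
	int n = 0;

	assert(slots != NULL);
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		if (!slots[cpu].private)
			n++;
	return n;
}
```

Without such a guard, a checker that takes the first set bit of an empty span
has no valid CPU id to work with, which is the kind of situation the
incremental diff is meant to avoid.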

Thank you for the report and sorry for the oversight. Hope I have not
disrupted your Feierabend.

P.S. I used the below cmdline to reproduce this:

   sudo ~/dev/qemu/build/qemu-system-x86_64 -enable-kvm -cpu host -m 20G \
   -smp cpus=10,sockets=1,threads=10 -machine q35 \
   -object memory-backend-ram,size=20G,id=m0 \
   -numa node,cpus=0-9,memdev=m0,nodeid=0 \
   ...

> 
> [    0.280062] Timer migration: 2 hierarchy levels; 8 children per group; 2 crossnode level
> [    0.282922] NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
> [    0.287572] smp: Bringing up secondary CPUs ...
> [    0.288623] smpboot: x86: Booting SMP configuration:
> [    0.289085] .... node  #0, CPUs:        #1  #2  #3  #4  #5  #6  #7  #8  #9 #10 #11 #12 #13 #14 #15
> [    0.302358] smp: Brought up 1 node, 16 CPUs
> [    0.304445] smpboot: Total of 16 processors activated (118401.12 BogoMIPS)
> [    0.307884] BUG: unable to handle page fault for address: 0000000089c402fb
> [    0.307884] #PF: supervisor read access in kernel mode
> [    0.307884] #PF: error_code(0x0000) - not-present page
> [    0.307884] PGD 0 P4D 0
> [    0.307950] Oops: Oops: 0000 [#1] SMP NOPTI
> [    0.308344] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc4+ #1 PREEMPT(full)
> [    0.309115] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2023.11-8 02/21/2024
> [    0.309934] RIP: 0010:build_sched_domains+0x627/0x1550
> [    0.310086] Code: 84 75 06 00 00 f3 48 0f bc c0 48 63 f8 89 c0 48 0f a3 05 c4 cf 95 08 0f 83 6c 06 00 00 48 8b 3c fd c0 db 29 82 49 8b 44 24 18 <48> 8b 04 07 48 8b 80 90 00 00 00 48 33 86 90 00 00 00 66 85 c0 0f
> [    0.310086] RSP: 0018:ffffc9000001fe60 EFLAGS: 00010247
> [    0.310086] RAX: ffffffff89c402f8 RBX: ffff88800cea8e40 RCX: 0000000000000001
> [    0.310086] RDX: ffffffffffffffff RSI: ffff88800ceaacc0 RDI: 0000000100000003
> [    0.310086] RBP: ffff88800cc4e3e0 R08: 0000000000000000 R09: 0000000000000000
> [    0.310086] R10: 00000000fffedb1d R11: 00000000fffedb1d R12: ffff88800ceda4c0
> [    0.310086] R13: ffff88800cea9500 R14: 0000000000000010 R15: 000000000000000f
> [    0.310086] FS:  0000000000000000(0000) GS:ffff8880f39f2000(0000) knlGS:0000000000000000
> [    0.310086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.310086] CR2: 0000000089c402fb CR3: 0000000002c1a000 CR4: 00000000003506f0
> [    0.310086] Call Trace:
> [    0.310086]  <TASK>
> [    0.310086]  ? sched_init_domains+0x58/0xa0
> [    0.310086]  sched_init_smp+0x29/0x90
> [    0.310086]  kernel_init_freeable+0xa3/0x290
> [    0.310086]  ? __pfx_kernel_init+0x10/0x10
> [    0.310086]  kernel_init+0x1a/0x1c0
> [    0.310086]  ret_from_fork+0x85/0xf0
> [    0.310086]  ? __pfx_kernel_init+0x10/0x10
> [    0.310086]  ret_from_fork_asm+0x1a/0x30
> [    0.310086]  </TASK>
> [    0.310086] Modules linked in:
> [    0.310086] CR2: 0000000089c402fb
> [    0.310086] ---[ end trace 0000000000000000 ]---
> [    0.310086] RIP: 0010:build_sched_domains+0x627/0x1550
> [    0.310086] Code: 84 75 06 00 00 f3 48 0f bc c0 48 63 f8 89 c0 48 0f a3 05 c4 cf 95 08 0f 83 6c 06 00 00 48 8b 3c fd c0 db 29 82 49 8b 44 24 18 <48> 8b 04 07 48 8b 80 90 00 00 00 48 33 86 90 00 00 00 66 85 c0 0f
> [    0.310086] RSP: 0018:ffffc9000001fe60 EFLAGS: 00010247
> [    0.310086] RAX: ffffffff89c402f8 RBX: ffff88800cea8e40 RCX: 0000000000000001
> [    0.310086] RDX: ffffffffffffffff RSI: ffff88800ceaacc0 RDI: 0000000100000003
> [    0.310086] RBP: ffff88800cc4e3e0 R08: 0000000000000000 R09: 0000000000000000
> [    0.310086] R10: 00000000fffedb1d R11: 00000000fffedb1d R12: ffff88800ceda4c0
> [    0.310086] R13: ffff88800cea9500 R14: 0000000000000010 R15: 000000000000000f
> [    0.310086] FS:  0000000000000000(0000) GS:ffff8880f39f2000(0000) knlGS:0000000000000000
> [    0.310086] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.310086] CR2: 0000000089c402fb CR3: 0000000002c1a000 CR4: 00000000003506f0
> [    0.310086] note: swapper/0[1] exited with irqs disabled
> [    0.310091] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
> [    0.311130] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009 ]---
> 

-- 
Thanks and Regards,
Prateek
Re: [tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()
Posted by Borislav Petkov 3 months ago
On Fri, Jul 04, 2025 at 05:16:44PM +0530, K Prateek Nayak wrote:
> In an attempt to solve a complicated case, I think I overlooked the
> simplest one.

It happens. No worries.

> Can you try the below incremental diff on top of this patch and

Yap, works. Thanks.

> let me know if you still hit the error:
> 
> (Lightly tested on QEMU)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 0e46068acb0a..cce540fe36c6 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -2423,6 +2423,14 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
>  			struct cpumask *sd_span = sched_domain_span(sd);
>  			int id;
> +			/*
> +			 * If the child already covers the cpumap, sd
> +			 * remains un-initialized. Use sd->private to
> +			 * detect uninitialized domains.
> +			 */
> +			if (!sd->private)
> +				continue;
> +
>  			/* lowest bit set in this mask is used as a unique id */
>  			id = cpumask_first(sd_span);
> ---

Yeah, when you send a hunk I should apply, no matter how easy it is, pls send
it from a mail client which doesn't mangle the diff otherwise I get:

$ test-apply.sh -n /tmp/diff
checking file kernel/sched/topology.c
Hunk #1 FAILED at 2423.
1 out of 1 hunk FAILED

Or you can attach it.

I've done it by hand now.

> Thank you for the report and sorry for the oversight.

No worries at all.

> Hope I have not disrupted your Feierabend.

Haha, I have Feierabend a lot later, if at all :-P

I hope you can enjoy the weekend a bit and not look at code too much.

:-)

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()
Posted by K Prateek Nayak 3 months ago
Hello Boris,

On 7/4/2025 6:03 PM, Borislav Petkov wrote:
>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>> index 0e46068acb0a..cce540fe36c6 100644
>> --- a/kernel/sched/topology.c
>> +++ b/kernel/sched/topology.c
>> @@ -2423,6 +2423,14 @@ static bool topology_span_sane(const struct cpumask *cpu_map)
>>   			struct cpumask *sd_span = sched_domain_span(sd);
>>   			int id;
>> +			/*
>> +			 * If the child already covers the cpumap, sd
>> +			 * remains un-initialized. Use sd->private to
>> +			 * detect uninitialized domains.
>> +			 */
>> +			if (!sd->private)
>> +				continue;
>> +
>>   			/* lowest bit set in this mask is used as a unique id */
>>   			id = cpumask_first(sd_span);
>> ---
> 
> Yeah, when you send a hunk I should apply, no matter how easy it is, pls send
> it from a mail client which doesn't mangle the diff otherwise I get:
> 
> $ test-apply.sh -n /tmp/diff
> checking file kernel/sched/topology.c
> Hunk #1 FAILED at 2423.
> 1 out of 1 hunk FAILED
> 
> Or you can attach it.
> 
> I've done it by hand now.

Thank you for the testing and sorry about the malformed diff! I'll
double (and triple) check next time before sending. Thanks a ton for
applying it manually.

-- 
Thanks and Regards,
Prateek
Re: [tip: sched/urgent] sched/fair: Use sched_domain_span() for topology_span_sane()
Posted by Peter Zijlstra 3 months ago
On Fri, Jul 04, 2025 at 12:21:03PM +0200, Borislav Petkov wrote:
> On Fri, Jul 04, 2025 at 09:13:16AM -0000, tip-bot2 for K Prateek Nayak wrote:
> > The following commit has been merged into the sched/urgent branch of tip:
> > 
> > Commit-ID:     02bb4259ca525efa39a2531cb630329fb87fc968
> > Gitweb:        https://git.kernel.org/tip/02bb4259ca525efa39a2531cb630329fb87fc968
> > Author:        K Prateek Nayak <kprateek.nayak@amd.com>
> > AuthorDate:    Mon, 30 Jun 2025 06:10:59 
> > Committer:     Peter Zijlstra <peterz@infradead.org>
> > CommitterDate: Fri, 04 Jul 2025 10:35:56 +02:00
> > 
> > sched/fair: Use sched_domain_span() for topology_span_sane()
> 
> My guest doesn't like this one and reverting it on top of the whole tip lineup
> fixes it.

OK, let me pull this patch real quick for now. Clearly it needs moar
work.