When sched_ext is disabled by an error, the per-CPU state dump in the
exit info can get truncated on systems with many CPUs. If the CPU that
triggered the exit happens to be in the middle or end of the CPU list,
its state may never appear in the output, making it difficult to
diagnose the failure.

This series addresses that by always dumping the exit CPU first and
surfacing the same CPU id to BPF schedulers and userspace tools.

Patch 1 is a preparatory refactor that extracts the per-CPU dump logic
into a scx_dump_cpu() helper.

Patch 2 adds an exit_cpu field to scx_exit_info and threads it through
the exit path. The scx_exit() wrapper is reworked into a macro that
captures the calling CPU automatically for all error paths, while the
watchdog stall site records cpu_of(rq) explicitly. scx_dump_state()
reports the CPU in the dump header and emits it before the rest of the
per-CPU loop so it survives any output truncation.

Patch 3 propagates exit_cpu to struct user_exit_info, the BPF /
userspace shared exit record. UEI_RECORD() defaults the field to -1
before its CO-RE-gated copy so older kernels remain distinguishable
from "exit happened on CPU 0", and UEI_REPORT() appends "on CPU N" to
the EXIT line so scheduler authors see the most diagnostically useful
piece of exit info without cracking open the debug dump.

Changes since v2:
- Use s32 (instead of int) for the new exit_cpu field and the
  __scx_exit() / scx_vexit() parameter, matching the convention for
  CPU ids in sched_ext.
- v2: https://lore.kernel.org/sched-ext/20260429060726.359024-1-changwoo@igalia.com/

Changes since v1:
- Generalized "stall CPU" to "exit CPU"; the scx_exit_info field is
  now exit_cpu and is populated for any path through scx_exit() /
  __scx_exit() / scx_vexit(), not just the watchdog stall path.
- Added patch 3 to expose exit_cpu via struct user_exit_info.
- SysRq-D initializes exit_cpu to -1 so debug dumps not tied to an
  exit don't arbitrarily promote CPU 0.
- Dump header now reports "on cpu N" alongside the exit kind.
- v1: https://lore.kernel.org/sched-ext/20260408031113.76005-1-changwoo@igalia.com/

Changwoo Min (3):
  sched_ext: Extract scx_dump_cpu() from scx_dump_state()
  sched_ext: Dump the exit CPU first
  sched_ext: Expose exit_cpu to BPF and userspace

 kernel/sched/ext.c                            | 221 ++++++++++-------
 kernel/sched/ext_internal.h                   |   6 +
 .../include/scx/user_exit_info.bpf.h          |   3 +
 tools/sched_ext/include/scx/user_exit_info.h  |   2 +
 .../include/scx/user_exit_info_common.h       |   5 +
 5 files changed, 142 insertions(+), 95 deletions(-)

--
2.54.0
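For readers who want to see why emitting the exit CPU first protects it
from truncation, here is a minimal userspace sketch in C. It is not the
kernel code: struct dump_buf, dump_line(), dump_cpu(), dump_state(), the
64-byte buffer and the 8-CPU count are all hypothetical stand-ins chosen
for illustration. A negative exit_cpu models a dump that is not tied to
an exit (the SysRq-D case), where no CPU is promoted.

```c
#include <stdio.h>
#include <string.h>

#define NR_CPUS 8	/* illustrative CPU count, not a kernel constant */

/* Hypothetical stand-in for a bounded dump buffer. */
struct dump_buf {
	char data[64];
	size_t len;
	int truncated;
};

/* Append one line, or mark the buffer truncated if it does not fit. */
static void dump_line(struct dump_buf *b, const char *line)
{
	size_t n = strlen(line);

	if (b->truncated || b->len + n >= sizeof(b->data)) {
		b->truncated = 1;
		return;
	}
	memcpy(b->data + b->len, line, n);
	b->len += n;
	b->data[b->len] = '\0';
}

/* Emit a one-line placeholder for a CPU's state. */
static void dump_cpu(struct dump_buf *b, int cpu)
{
	char line[32];

	snprintf(line, sizeof(line), "CPU %d state\n", cpu);
	dump_line(b, line);
}

/*
 * Dump the exit CPU first, then every other CPU in order.
 * exit_cpu < 0 means "no exit CPU", so nothing is promoted.
 */
static void dump_state(struct dump_buf *b, int exit_cpu)
{
	int cpu;

	if (exit_cpu >= 0 && exit_cpu < NR_CPUS)
		dump_cpu(b, exit_cpu);

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == exit_cpu)
			continue;	/* already emitted above */
		dump_cpu(b, cpu);
	}
}
```

Each line in this sketch is 12 bytes, so the 64-byte buffer holds five
of them. Calling dump_state() with exit_cpu = 7 yields CPU 7 followed by
CPUs 0 through 3 before truncation; with exit_cpu = -1 the buffer holds
CPUs 0 through 4 and CPU 7 never appears.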
Hello,

> Changwoo Min (3):
>   sched_ext: Extract scx_dump_cpu() from scx_dump_state()
>   sched_ext: Dump the exit CPU first
>   sched_ext: Expose exit_cpu to BPF and userspace

Applied 1-3 to sched_ext/for-7.2, thank you.

A few things I noticed that might be worth a follow-up:

1. scx_rcu_cpu_stall() takes no cpu, so the captured exit_cpu ends
   up being the detector rather than the stalled one. We could
   probably plumb it through from print_other_cpu_stall(), where
   the stalled cpu is known.

2. scx_hardlockup_irq_workfn() already has the hung cpu locally, so
   passing it via __scx_exit() might be a bit more robust than
   relying on irq_work routing.

3. Minor: "on cpu N" (kernel) vs "on CPU N" (UEI) - the casing
   could probably match.

Thanks.

--
tejun
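On the UEI side mentioned in point 3, the cover letter describes a
default of -1 that is set before a gated copy, so that "field not
provided" is never confused with "CPU 0". The following userspace C
sketch illustrates that pattern only. Every name in it
(kernel_exit_info, has_exit_cpu, user_exit_record, record_exit(),
format_exit_line()) is hypothetical, the has_exit_cpu flag merely models
a field-existence check, and both the exact wording of the EXIT line and
the choice to omit the suffix when the value is -1 are assumptions, not
the behavior of the real UEI_RECORD() / UEI_REPORT() macros.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for the kernel-side exit info. */
struct kernel_exit_info {
	int has_exit_cpu;	/* models "does this kernel have the field?" */
	int exit_cpu;
};

/* Hypothetical stand-in for the record shared with userspace. */
struct user_exit_record {
	int exit_cpu;
};

/* Default to -1 first, then copy only if the field is available. */
static void record_exit(struct user_exit_record *rec,
			const struct kernel_exit_info *ei)
{
	rec->exit_cpu = -1;
	if (ei->has_exit_cpu)
		rec->exit_cpu = ei->exit_cpu;
}

/* Append the CPU suffix only when a real CPU id was recorded. */
static int format_exit_line(char *buf, size_t len,
			    const struct user_exit_record *rec,
			    const char *reason)
{
	if (rec->exit_cpu >= 0)
		return snprintf(buf, len, "EXIT: %s on CPU %d",
				reason, rec->exit_cpu);
	return snprintf(buf, len, "EXIT: %s", reason);
}
```

With this shape, a kernel that lacks the field leaves the record at -1,
while a kernel that reports CPU 0 produces a line ending in "on CPU 0",
so the two cases stay distinguishable.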
Hi Tejun,

On Tue, Apr 28, 2026 at 10:57:27PM -1000, Tejun Heo wrote:
> A few things I noticed that might be worth a follow-up:
>
> 1. scx_rcu_cpu_stall() takes no cpu, so the captured exit_cpu ends
>    up being the detector rather than the stalled one. We could
>    probably plumb it through from print_other_cpu_stall(), where
>    the stalled cpu is known.

Do you mean we should change the function signatures to pass the stalled
CPU through, e.g. panic_on_rcu_stall(int stalled_cpu) and
scx_rcu_cpu_stall(int stalled_cpu)?

>
> 2. scx_hardlockup_irq_workfn() already has the hung cpu locally, so
>    passing it via __scx_exit() might be a bit more robust than
>    relying on irq_work routing.
>
> 3. Minor: "on cpu N" (kernel) vs "on CPU N" (UEI) - the casing
>    could probably match.
>

I have a draft patch and can send it out. If Changwoo or anyone else is
already working on this, pls let me know!

--
Cheers,
Cheng-Yang
On Wed, Apr 29, 2026 at 07:29:30PM +0800, Cheng-Yang Chou wrote:
> Hi Tejun,
>
> On Tue, Apr 28, 2026 at 10:57:27PM -1000, Tejun Heo wrote:
> > A few things I noticed that might be worth a follow-up:
> >
> > 1. scx_rcu_cpu_stall() takes no cpu, so the captured exit_cpu ends
> >    up being the detector rather than the stalled one. We could
> >    probably plumb it through from print_other_cpu_stall(), where
> >    the stalled cpu is known.
>
> Do you mean we should change the function signatures to pass the stalled
> CPU through, e.g. panic_on_rcu_stall(int stalled_cpu) and
> scx_rcu_cpu_stall(int stalled_cpu)?

Yeah.

Thanks.

--
tejun
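The detector-versus-stalled-CPU distinction discussed above can be
shown with a small userspace C sketch. It is only an illustration of the
general pattern: current_cpu, struct exit_info, exit_on_cpu(),
exit_here() and the two report_stall_*() helpers are hypothetical names,
not the sched_ext or RCU functions, and the actual follow-up patch may
look quite different.

```c
/* Hypothetical stand-in for "the CPU this code is running on". */
static int current_cpu;

struct exit_info {
	int exit_cpu;
	const char *reason;
};

/* Explicit form: the caller states which CPU the exit is about. */
static void exit_on_cpu(struct exit_info *ei, int cpu, const char *reason)
{
	ei->exit_cpu = cpu;
	ei->reason = reason;
}

/* Convenience form: record whichever CPU executes the call. */
#define exit_here(ei, reason) exit_on_cpu((ei), current_cpu, (reason))

/*
 * A stall is detected on current_cpu, but the stalled CPU is not
 * passed in, so the implicit capture records the detector.
 */
static void report_stall_implicit(struct exit_info *ei)
{
	exit_here(ei, "RCU stall");
}

/* Passing the stalled CPU through records the CPU that is stuck. */
static void report_stall_explicit(struct exit_info *ei, int stalled_cpu)
{
	exit_on_cpu(ei, stalled_cpu, "RCU stall");
}
```

If the detector runs on CPU 2 while CPU 5 is the one that stalled, the
implicit helper records 2 and the explicit helper records 5, which is
the difference the signature change is meant to close.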
Hi Cheng-Yang,

On 4/29/26 8:29 PM, Cheng-Yang Chou wrote:
>> 2. scx_hardlockup_irq_workfn() already has the hung cpu locally, so
>>    passing it via __scx_exit() might be a bit more robust than
>>    relying on irq_work routing.
>>
>> 3. Minor: "on cpu N" (kernel) vs "on CPU N" (UEI) - the casing
>>    could probably match.
>>
> I have a draft patch and can send it out. If Changwoo or anyone else is
> already working on this, pls let me know!

Feel free to go ahead. Thanks!

Regards,
Changwoo Min