When sched_ext is disabled by an error, the per-CPU state dump in the
exit info can get truncated on systems with many CPUs. If the CPU that
triggered the exit happens to be in the middle or end of the CPU list,
its state may never appear in the output, making it difficult to
diagnose the failure.

This series addresses that by always dumping the exit CPU first and
surfacing the same CPU id to BPF schedulers and userspace tools.

Patch 1 is a preparatory refactor that extracts the per-CPU dump logic
into a scx_dump_cpu() helper.

Patch 2 adds an exit_cpu field to scx_exit_info and threads it through
the exit path. The scx_exit() wrapper is reworked into a macro that
captures the calling CPU automatically for all error paths, while the
watchdog stall site records cpu_of(rq) explicitly. scx_dump_state()
reports the CPU in the dump header and dumps that CPU's state before
the rest of the per-CPU loop so it survives any output truncation.
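
The dump ordering described for patch 2 can be sketched in plain
userspace C as below. This is only an illustration, not the code in
kernel/sched/ext.c: dump_order() is a hypothetical helper, and skipping
the exit CPU on the second pass is an assumption made so that it is not
listed twice. A negative exit_cpu stands for "no exit CPU", as in the
SysRq-D case.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel implementation: fill @order with
 * the sequence in which CPUs would be dumped and return its length.
 * @exit_cpu < 0 means "no exit CPU", so the usual order is kept.
 */
static size_t dump_order(int exit_cpu, int nr_cpus, int *order)
{
	size_t n = 0;
	int cpu;

	assert(nr_cpus >= 0 && exit_cpu < nr_cpus);

	/* The exit CPU goes first so it survives truncation of the tail. */
	if (exit_cpu >= 0)
		order[n++] = exit_cpu;

	/* Remaining CPUs follow in ascending order (assumed: no repeat). */
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpu == exit_cpu)
			continue;
		order[n++] = cpu;
	}
	return n;
}
```

With four CPUs and exit_cpu == 2, the sketch yields the order 2, 0, 1,
3; with exit_cpu == -1 it yields 0, 1, 2, 3.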

Patch 3 propagates exit_cpu to struct user_exit_info, the exit record
shared between BPF and userspace. UEI_RECORD() defaults the field to -1
before its CO-RE-gated copy, so running on an older kernel that lacks
the field remains distinguishable from "exit happened on CPU 0".
UEI_REPORT() appends "on CPU N" to the EXIT line so scheduler authors
see the most diagnostically useful piece of exit info without cracking
open the debug dump.
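
The reason for defaulting to -1 can be shown with a small userspace
sketch. Everything here is hypothetical and simplified: struct
uei_sketch is not the real struct user_exit_info layout, the
kernel_has_field flag merely models the CO-RE field-existence check,
and the exact EXIT line layout and the choice to omit the suffix when
the CPU is unknown are assumptions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the exit record; not the real layout. */
struct uei_sketch {
	int kind;
	int exit_cpu;
};

/*
 * @kernel_has_field models the CO-RE existence check: when the running
 * kernel has no exit_cpu field, the copy is skipped and -1 is kept.
 */
static void uei_record_sketch(struct uei_sketch *uei, int kernel_has_field,
			      int kernel_exit_cpu)
{
	uei->exit_cpu = -1;	/* default means "unknown", not CPU 0 */
	if (kernel_has_field)
		uei->exit_cpu = kernel_exit_cpu;
}

/* Format an EXIT line, appending the CPU only when it is known. */
static int uei_format_exit(char *buf, size_t len, const char *reason,
			   const struct uei_sketch *uei)
{
	assert(buf && len > 0);

	if (uei->exit_cpu >= 0)
		return snprintf(buf, len, "EXIT: %s on CPU %d", reason,
				uei->exit_cpu);
	return snprintf(buf, len, "EXIT: %s", reason);
}
```

Had the field defaulted to 0 instead, a record taken on a kernel
without exit_cpu would be indistinguishable from a genuine exit on
CPU 0.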

Changes since v1:
- Generalized "stall CPU" to "exit CPU"; the scx_exit_info field is
  now exit_cpu and is populated for any path through scx_exit() /
  __scx_exit() / scx_vexit(), not just the watchdog stall path.
- Added patch 3 to expose exit_cpu via struct user_exit_info.
- SysRq-D initializes exit_cpu to -1 so debug dumps not tied to an
  exit don't arbitrarily promote CPU 0.
- Dump header now reports "on cpu N" alongside the exit kind.
- v1: https://lore.kernel.org/sched-ext/20260408031113.76005-1-changwoo@igalia.com/

Changwoo Min (3):
  sched_ext: Extract scx_dump_cpu() from scx_dump_state()
  sched_ext: Dump the exit CPU first
  sched_ext: Expose exit_cpu to BPF and userspace

 kernel/sched/ext.c                           | 221 ++++++++++--------
 kernel/sched/ext_internal.h                  |   6 +
 .../include/scx/user_exit_info.bpf.h         |   3 +
 tools/sched_ext/include/scx/user_exit_info.h |   2 +
 .../include/scx/user_exit_info_common.h      |   5 +
 5 files changed, 142 insertions(+), 95 deletions(-)

--
2.54.0