This gives sched_ext schedulers an inexpensive way to peek at the first
element in a queue (DSQ) without creating an iterator or acquiring the
lock on that queue.

Note that manual testing has thus far included a modified version of the
example qmap scheduler that exercises peek, as well as a modified LAVD
(from the SCX repo) that exercises peek. The attached test
passes >1000 stress tests when run in concurrent VMs, and when run
sequentially on the host kernel. Presently, tested on the below
workstation and server processors.
 - AMD Ryzen Threadripper PRO 7975WX 32-Cores
 - AMD EPYC 9D64 88-Core Processor

Initial experiments indicate a substantial speedup (on schbench) when
running an SCX scheduler with per-cpu DSQs and peeking each queue to
retrieve the task with the minimum vruntime across all the CPUs.

---
Changes in v3:
 - inline helpers and simplify
 - coding style tweaks

Changes in v2:
 - make peek() only work for user DSQs and error otherwise
 - added a stress test component to the selftest that performs many peeks
 - responded to review comments from tj@kernel.org and arighi@nvidia.com
 - link: https://lore.kernel.org/lkml/20251003195408.675527-1-rrnewton@gmail.com/

v1 link: https://lore.kernel.org/lkml/20251002025722.3420916-1-rrnewton@gmail.com/

Ryan Newton (2):
  sched_ext: Add lockless peek operation for DSQs
  sched_ext: Add a selftest for scx_bpf_dsq_peek

 include/linux/sched/ext.h                     |   1 +
 kernel/sched/ext.c                            |  56 +++-
 tools/sched_ext/include/scx/common.bpf.h      |   1 +
 tools/sched_ext/include/scx/compat.bpf.h      |  19 ++
 tools/testing/selftests/sched_ext/Makefile    |   1 +
 .../selftests/sched_ext/peek_dsq.bpf.c        | 265 ++++++++++++++++++
 tools/testing/selftests/sched_ext/peek_dsq.c  | 230 +++++++++++++++
 7 files changed, 571 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/sched_ext/peek_dsq.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/peek_dsq.c

-- 
2.51.0
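For readers who want the idea in the cover letter made concrete, here is a
minimal userspace sketch in C11. It is not the code from the patch: every
toy_* name is hypothetical, and the design is an assumption made for
illustration. Writers take the queue lock and republish the head pointer
with a release store; the peek is a single acquire load that never touches
the lock. The last function scans a set of per-CPU queues and picks the one
whose first task has the smallest vtime, as in the schbench experiment
described above. The sketch also assumes that a peeked task stays valid
while the reader looks at it; a kernel implementation would need something
like RCU for that, which the toy leaves to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a task and a DSQ; not the kernel's types. */
struct toy_task {
	uint64_t vtime;
	struct toy_task *next;
};

struct toy_dsq {
	atomic_flag lock;                  /* serializes writers only */
	struct toy_task *head;             /* vtime-ordered list, touched under lock */
	_Atomic(struct toy_task *) first;  /* published copy of head for lockless peek */
};

static void toy_dsq_init(struct toy_dsq *q)
{
	atomic_flag_clear(&q->lock);
	q->head = NULL;
	atomic_init(&q->first, NULL);
}

static void toy_dsq_lock(struct toy_dsq *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void toy_dsq_unlock(struct toy_dsq *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Insert in ascending vtime order under the lock, then republish the head. */
static void toy_dsq_insert(struct toy_dsq *q, struct toy_task *t)
{
	struct toy_task **pp;

	assert(t->next == NULL);
	toy_dsq_lock(q);
	for (pp = &q->head; *pp && (*pp)->vtime <= t->vtime; pp = &(*pp)->next)
		;
	t->next = *pp;
	*pp = t;
	atomic_store_explicit(&q->first, q->head, memory_order_release);
	toy_dsq_unlock(q);
}

/* Remove and return the first task (or NULL), republishing the new head. */
static struct toy_task *toy_dsq_pop(struct toy_dsq *q)
{
	struct toy_task *t;

	toy_dsq_lock(q);
	t = q->head;
	if (t) {
		q->head = t->next;
		t->next = NULL;
	}
	atomic_store_explicit(&q->first, q->head, memory_order_release);
	toy_dsq_unlock(q);
	return t;
}

/* Lockless peek: one acquire load. The result is a hint and may be stale. */
static struct toy_task *toy_dsq_peek(struct toy_dsq *q)
{
	return atomic_load_explicit(&q->first, memory_order_acquire);
}

/* Index of the queue whose first task has the smallest vtime, or -1 if all are empty. */
static int toy_pick_min_vtime(struct toy_dsq *qs, int nr)
{
	uint64_t best_vtime = 0;
	int best = -1;

	for (int i = 0; i < nr; i++) {
		struct toy_task *t = toy_dsq_peek(&qs[i]);

		if (t && (best < 0 || t->vtime < best_vtime)) {
			best = i;
			best_vtime = t->vtime;
		}
	}
	return best;
}
```

Because the peek result can be stale by the time it is used, a caller of
toy_pick_min_vtime() should treat the answer as a hint and still handle the
case where the later locked pop returns a different task or nothing.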
On 10/6/25 18:04, Ryan Newton wrote:
> [...]
> Note that manual testing has thus far included a modified version of the
> example qmap scheduler that exercises peek, as well as a modified LAVD
> (from the SCX repo) that exercises peek. The attached test
> passes >1000 stress tests when run in concurrent VMs, and when run
> sequentially on the host kernel. Presently, tested on the below
> workstation and server processors.
> - AMD Ryzen Threadripper PRO 7975WX 32-Cores
> - AMD EPYC 9D64 88-Core Processor

Are the adapted qmap and lavd available somewhere?

> [...]
Hello Christian,

Sure, the lavd with peek is here:

  https://github.com/sched-ext/scx/pull/2675

Beerland with peek is here:

  https://github.com/sched-ext/scx/commit/c2a0f185051c06cc1ebae1dc40e5fe2bd3022c1e

The qmap one was not a meaningful change to the scheduler but just extra
peeks thrown in with debug prints.

Cheers,
 -Ryan

P.S. Good catch on the two newlines being split into the wrong diff. Argh, I
was playing with Sapling's `sl absorb` and I swear it said it amended the
earlier commit. Apologies for not catching it. FWIW here's a tip with that
correction:

  https://github.com/rrnewton/linux/commit/db426f852813e2b6deeae0869d20df1bea647a07

On Mon, Oct 6, 2025 at 1:20 PM Christian Loehle <christian.loehle@arm.com> wrote:
>
> On 10/6/25 18:04, Ryan Newton wrote:
> > [...]
>
> Are the adapted qmap and lavd available somewhere?
>
> > [...]