Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
Furthermore they become meaningless without rq lock, too.
Only return scx_bpf_cpu_rq() if we hold rq lock of that rq.
All upstream scx schedulers can be converted into the new
scx_bpf_task_acquire_remote_curr() instead.
Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
kernel/sched/ext.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3e2fa0b1eb57..a66cf654f33e 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
*/
__bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
{
+ struct rq *rq;
+
if (!kf_cpu_valid(cpu, NULL))
return NULL;
- return cpu_rq(cpu);
+ preempt_disable();
+ rq = cpu_rq(cpu);
+ if (rq != scx_locked_rq()) {
+ scx_kf_error("Accessing not locked rq %d", cpu);
+ rq = NULL;
+ }
+ preempt_enable();
+
+ return rq;
}
/**
--
2.34.1
Hello, On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote: > @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p) > */ > __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu) > { > + struct rq *rq; > + > if (!kf_cpu_valid(cpu, NULL)) > return NULL; > > - return cpu_rq(cpu); > + preempt_disable(); > + rq = cpu_rq(cpu); > + if (rq != scx_locked_rq()) { > + scx_kf_error("Accessing not locked rq %d", cpu); > + rq = NULL; > + } > + preempt_enable(); So, this will break the existing scheduler binaries immediately, which I don't think is a good way to go about it. Can you add a pr_warn_once() to print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead? Thanks. -- tejun
On Sat, Aug 09, 2025 at 09:03:46AM -1000, Tejun Heo wrote: > Hello, > > On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote: > > @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p) > > */ > > __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu) > > { > > + struct rq *rq; > > + > > if (!kf_cpu_valid(cpu, NULL)) > > return NULL; > > > > - return cpu_rq(cpu); > > + preempt_disable(); > > + rq = cpu_rq(cpu); > > + if (rq != scx_locked_rq()) { > > + scx_kf_error("Accessing not locked rq %d", cpu); > > + rq = NULL; > > + } > > + preempt_enable(); > > So, this will break the existing scheduler binaries immediately, which I > don't think is a good way to go about it. Can you add a pr_warn_once() to > print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead? Yeah, this is not nice, but they would be still broken though, in PATCH 1/3 we force schedulers to check for NULL and, if they don't, the verifier won't be happy, so this already breaks existing binaries. Even if a scheduler performs the NULL check, this change might still cause incorrect behaviors, which can be worse than triggering an error. How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and scx_bpf_task_acquire_remote_curr()? Thanks, -Andrea
On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote: > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3 > we force schedulers to check for NULL and, if they don't, the verifier > won't be happy, so this already breaks existing binaries. I ran some testing on the sched_ext for-next branch, and scx_cosmos is breaking in cosmos_init including the latest changes. I believe it kicks off a timer in init, which indirectly calls `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL checked, but old binaries breaking is pretty inconvenient for new users. As Andrea says, this is the already merged patch triggering this. Thanks, Jake.
On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote: > On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote: > > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3 > > we force schedulers to check for NULL and, if they don't, the verifier > > won't be happy, so this already breaks existing binaries. > > I ran some testing on the sched_ext for-next branch, and scx_cosmos is > breaking in cosmos_init including the latest changes. I believe it kicks > off a timer in init, which indirectly calls > `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL > checked, but old binaries breaking is pretty inconvenient for new users. > > As Andrea says, this is the already merged patch triggering this. We should provide a compat helper in common.bpf.h and fix the schedulers to use this helper. Something like the following (untested): static inline struct task_struct * __COMPAT_scx_bpf_task_acquire_remote_curr(s32 cpu) { struct rq *rq; if (bpf_ksym_exists(scx_bpf_task_acquire_remote_curr) return scx_bpf_task_acquire_remote_curr(cpu); rq = scx_bpf_cpu_rq(cpu); return rq ? rq->curr : NULL; } Then we can drop this after a couple of kernel releases (like in v6.20). -Andrea
On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote: > On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote: > > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3 > > we force schedulers to check for NULL and, if they don't, the verifier > > won't be happy, so this already breaks existing binaries. > > I ran some testing on the sched_ext for-next branch, and scx_cosmos is > breaking in cosmos_init including the latest changes. I believe it kicks > off a timer in init, which indirectly calls > `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL > checked, but old binaries breaking is pretty inconvenient for new users. > > As Andrea says, this is the already merged patch triggering this. Lemme revert that. I don't think we should introduce breaking changes without grace period. Thanks. -- tejun
On 8/10/25 11:52, Andrea Righi wrote: > On Sat, Aug 09, 2025 at 09:03:46AM -1000, Tejun Heo wrote: >> Hello, >> >> On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote: >>> @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p) >>> */ >>> __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu) >>> { >>> + struct rq *rq; >>> + >>> if (!kf_cpu_valid(cpu, NULL)) >>> return NULL; >>> >>> - return cpu_rq(cpu); >>> + preempt_disable(); >>> + rq = cpu_rq(cpu); >>> + if (rq != scx_locked_rq()) { >>> + scx_kf_error("Accessing not locked rq %d", cpu); >>> + rq = NULL; >>> + } >>> + preempt_enable(); >> >> So, this will break the existing scheduler binaries immediately, which I >> don't think is a good way to go about it. Can you add a pr_warn_once() to >> print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead? > > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3 > we force schedulers to check for NULL and, if they don't, the verifier > won't be happy, so this already breaks existing binaries. > > Even if a scheduler performs the NULL check, this change might still cause > incorrect behaviors, which can be worse than triggering an error. > > How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an > error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and > scx_bpf_task_acquire_remote_curr()? If we still trigger an error in scx_bpf_cpu_rq() what's the difference between scx_bpf_cpu_rq() and scx_bpf_cpu_rq_locked() then? Just the error message?
Hello, On Sun, Aug 10, 2025 at 11:18:52PM +0100, Christian Loehle wrote: ... > > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3 > > we force schedulers to check for NULL and, if they don't, the verifier > > won't be happy, so this already breaks existing binaries. > > > > Even if a scheduler performs the NULL check, this change might still cause > > incorrect behaviors, which can be worse than triggering an error. I'll revert that change. We shouldn't be introducing breaking changes without grace period. > > How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an > > error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and > > scx_bpf_task_acquire_remote_curr()? > > If we still trigger an error in scx_bpf_cpu_rq() what's the difference > between scx_bpf_cpu_rq() and scx_bpf_cpu_rq_locked() then? > Just the error message? Let's add a warning to scx_bpf_cpu_rq() pointing to scx_bpf_cpu_rq_locked(), update the schedulers, and drop scx_bpf_cpu_rq() in a couple releases. Thanks. -- tejun
© 2016 - 2025 Red Hat, Inc.