[PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()

Christian Loehle posted 3 patches 2 months ago
There is a newer version of this series
[PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Christian Loehle 2 months ago
Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
Furthermore they become meaningless without rq lock, too.
Only return scx_bpf_cpu_rq() if we hold rq lock of that rq.

All upstream scx schedulers can be converted into the new
scx_bpf_task_acquire_remote_curr() instead.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 3e2fa0b1eb57..a66cf654f33e 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
  */
 __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 {
+	struct rq *rq;
+
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
 
-	return cpu_rq(cpu);
+	preempt_disable();
+	rq = cpu_rq(cpu);
+	if (rq != scx_locked_rq()) {
+		scx_kf_error("Accessing not locked rq %d", cpu);
+		rq = NULL;
+	}
+	preempt_enable();
+
+	return rq;
 }
 
 /**
-- 
2.34.1
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Tejun Heo 1 month, 3 weeks ago
Hello,

On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote:
> @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>   */
>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>  {
> +	struct rq *rq;
> +
>  	if (!kf_cpu_valid(cpu, NULL))
>  		return NULL;
>  
> -	return cpu_rq(cpu);
> +	preempt_disable();
> +	rq = cpu_rq(cpu);
> +	if (rq != scx_locked_rq()) {
> +		scx_kf_error("Accessing not locked rq %d", cpu);
> +		rq = NULL;
> +	}
> +	preempt_enable();

So, this will break the existing scheduler binaries immediately, which I
don't think is a good way to go about it. Can you add a pr_warn_once() to
print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead?

Thanks.

-- 
tejun
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Andrea Righi 1 month, 3 weeks ago
On Sat, Aug 09, 2025 at 09:03:46AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote:
> > @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
> >   */
> >  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> >  {
> > +	struct rq *rq;
> > +
> >  	if (!kf_cpu_valid(cpu, NULL))
> >  		return NULL;
> >  
> > -	return cpu_rq(cpu);
> > +	preempt_disable();
> > +	rq = cpu_rq(cpu);
> > +	if (rq != scx_locked_rq()) {
> > +		scx_kf_error("Accessing not locked rq %d", cpu);
> > +		rq = NULL;
> > +	}
> > +	preempt_enable();
> 
> So, this will break the existing scheduler binaries immediately, which I
> don't think is a good way to go about it. Can you add a pr_warn_once() to
> print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead?

Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
we force schedulers to check for NULL and, if they don't, the verifier
won't be happy, so this already breaks existing binaries.

Even if a scheduler performs the NULL check, this change might still cause
incorrect behaviors, which can be worse than triggering an error.

How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an
error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and
scx_bpf_task_acquire_remote_curr()?

Thanks,
-Andrea
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Jake Hillion 1 month, 3 weeks ago
On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> we force schedulers to check for NULL and, if they don't, the verifier
> won't be happy, so this already breaks existing binaries.

I ran some testing on the sched_ext for-next branch, and scx_cosmos is
breaking in cosmos_init including the latest changes. I believe it kicks
off a timer in init, which indirectly calls
`scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
checked, but old binaries breaking is pretty inconvenient for new users.

As Andrea says, this is the already merged patch triggering this.

Thanks,
Jake.
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Andrea Righi 1 month, 3 weeks ago
On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote:
> On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> 
> I ran some testing on the sched_ext for-next branch, and scx_cosmos is
> breaking in cosmos_init including the latest changes. I believe it kicks
> off a timer in init, which indirectly calls
> `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
> checked, but old binaries breaking is pretty inconvenient for new users.
> 
> As Andrea says, this is the already merged patch triggering this.

We should provide a compat helper in common.bpf.h and fix the schedulers to
use this helper. Something like the following (untested):

static inline struct task_struct *
__COMPAT_scx_bpf_task_acquire_remote_curr(s32 cpu)
{
	struct rq *rq;

	if (bpf_ksym_exists(scx_bpf_task_acquire_remote_curr)
		return scx_bpf_task_acquire_remote_curr(cpu);

	rq = scx_bpf_cpu_rq(cpu);

	return rq ? rq->curr : NULL;
}

Then we can drop this after a couple of kernel releases (like in v6.20).

-Andrea
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Tejun Heo 1 month, 3 weeks ago
On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote:
> On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> 
> I ran some testing on the sched_ext for-next branch, and scx_cosmos is
> breaking in cosmos_init including the latest changes. I believe it kicks
> off a timer in init, which indirectly calls
> `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
> checked, but old binaries breaking is pretty inconvenient for new users.
> 
> As Andrea says, this is the already merged patch triggering this.

Lemme revert that. I don't think we should introduce breaking changes
without grace period.

Thanks.

-- 
tejun
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Christian Loehle 1 month, 3 weeks ago
On 8/10/25 11:52, Andrea Righi wrote:
> On Sat, Aug 09, 2025 at 09:03:46AM -1000, Tejun Heo wrote:
>> Hello,
>>
>> On Tue, Aug 05, 2025 at 12:10:36PM +0100, Christian Loehle wrote:
>>> @@ -7420,10 +7420,20 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>>>   */
>>>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>>>  {
>>> +	struct rq *rq;
>>> +
>>>  	if (!kf_cpu_valid(cpu, NULL))
>>>  		return NULL;
>>>  
>>> -	return cpu_rq(cpu);
>>> +	preempt_disable();
>>> +	rq = cpu_rq(cpu);
>>> +	if (rq != scx_locked_rq()) {
>>> +		scx_kf_error("Accessing not locked rq %d", cpu);
>>> +		rq = NULL;
>>> +	}
>>> +	preempt_enable();
>>
>> So, this will break the existing scheduler binaries immediately, which I
>> don't think is a good way to go about it. Can you add a pr_warn_once() to
>> print deprecation warning and add e.g. scx_bpf_locked_cpu_rq() instead?
> 
> Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> we force schedulers to check for NULL and, if they don't, the verifier
> won't be happy, so this already breaks existing binaries.
> 
> Even if a scheduler performs the NULL check, this change might still cause
> incorrect behaviors, which can be worse than triggering an error.
> 
> How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an
> error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and
> scx_bpf_task_acquire_remote_curr()?

If we still trigger an error in scx_bpf_cpu_rq() what's the difference
between scx_bpf_cpu_rq() and scx_bpf_cpu_rq_locked() then?
Just the error message?
Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Tejun Heo 1 month, 3 weeks ago
Hello,

On Sun, Aug 10, 2025 at 11:18:52PM +0100, Christian Loehle wrote:
...
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> > 
> > Even if a scheduler performs the NULL check, this change might still cause
> > incorrect behaviors, which can be worse than triggering an error.

I'll revert that change. We shouldn't be introducing breaking changes
without grace period.

> > How about we introduce scx_bpf_locked_cpu_rq() and we still trigger an
> > error in scx_bpf_cpu_rq(), mentioning about the new locked kfunc and
> > scx_bpf_task_acquire_remote_curr()?
> 
> If we still trigger an error in scx_bpf_cpu_rq() what's the difference
> between scx_bpf_cpu_rq() and scx_bpf_cpu_rq_locked() then?
> Just the error message?

Let's add a warning to scx_bpf_cpu_rq() pointing to scx_bpf_cpu_rq_locked(),
update the schedulers, and drop scx_bpf_cpu_rq() in a couple releases.

Thanks.

-- 
tejun