[PATCH 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()

Christian Loehle posted 3 patches 2 months ago
[PATCH 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Christian Loehle 2 months ago
Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
Furthermore they become meaningless without rq lock, too.
Only return scx_bpf_cpu_rq() when we hold rq lock of that rq.

All upstream scx schedulers can be converted into the new
scx_bpf_remote_curr() instead.

Signed-off-by: Christian Loehle <christian.loehle@arm.com>
---
 kernel/sched/ext.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 92e66bb0b5f2..627df3088fd0 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -7425,7 +7425,7 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
 }
 
 /**
- * scx_bpf_cpu_rq - Fetch the rq of a CPU
+ * scx_bpf_cpu_rq - Fetch the rq of a CPU if its rq lock is currently held
  * @cpu: CPU of the rq
  */
 __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
@@ -7433,7 +7433,7 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
 	if (!kf_cpu_valid(cpu, NULL))
 		return NULL;
 
-	return cpu_rq(cpu);
+	return this_cpu_read(locked_rq) == cpu_rq(cpu) ? cpu_rq(cpu) : NULL;
 }
 
 /**
-- 
2.34.1
Re: [PATCH 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Andrea Righi 2 months ago
On Fri, Aug 01, 2025 at 03:17:41PM +0100, Christian Loehle wrote:
> Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
> Furthermore they become meaningless without rq lock, too.
> Only return scx_bpf_cpu_rq() when we hold rq lock of that rq.
> 
> All upstream scx schedulers can be converted into the new
> scx_bpf_remote_curr() instead.
> 
> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
> ---
>  kernel/sched/ext.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 92e66bb0b5f2..627df3088fd0 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -7425,7 +7425,7 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>  }
>  
>  /**
> - * scx_bpf_cpu_rq - Fetch the rq of a CPU
> + * scx_bpf_cpu_rq - Fetch the rq of a CPU if its rq lock is currently held
>   * @cpu: CPU of the rq
>   */
>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
> @@ -7433,7 +7433,7 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>  	if (!kf_cpu_valid(cpu, NULL))
>  		return NULL;
>  
> -	return cpu_rq(cpu);
> +	return this_cpu_read(locked_rq) == cpu_rq(cpu) ? cpu_rq(cpu) : NULL;

Maybe we should consider an access to an unlocked rq as invalid and trigger
scx_exit(), similar to what we do with the kf_cpu_valid() check?

Also heads up that locked_rq has been renamed scx_locked_rq_state in 6.17.

>  }
>  
>  /**
> -- 
> 2.34.1
> 

Thanks,
-Andrea
Re: [PATCH 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()
Posted by Christian Loehle 2 months ago
On 8/1/25 15:51, Andrea Righi wrote:
> On Fri, Aug 01, 2025 at 03:17:41PM +0100, Christian Loehle wrote:
>> Most fields in scx_bpf_cpu_rq() assume that its rq_lock is held.
>> Furthermore they become meaningless without rq lock, too.
>> Only return scx_bpf_cpu_rq() when we hold rq lock of that rq.
>>
>> All upstream scx schedulers can be converted into the new
>> scx_bpf_remote_curr() instead.
>>
>> Signed-off-by: Christian Loehle <christian.loehle@arm.com>
>> ---
>>  kernel/sched/ext.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
>> index 92e66bb0b5f2..627df3088fd0 100644
>> --- a/kernel/sched/ext.c
>> +++ b/kernel/sched/ext.c
>> @@ -7425,7 +7425,7 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
>>  }
>>  
>>  /**
>> - * scx_bpf_cpu_rq - Fetch the rq of a CPU
>> + * scx_bpf_cpu_rq - Fetch the rq of a CPU if its rq lock is currently held
>>   * @cpu: CPU of the rq
>>   */
>>  __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>> @@ -7433,7 +7433,7 @@ __bpf_kfunc struct rq *scx_bpf_cpu_rq(s32 cpu)
>>  	if (!kf_cpu_valid(cpu, NULL))
>>  		return NULL;
>>  
>> -	return cpu_rq(cpu);
>> +	return this_cpu_read(locked_rq) == cpu_rq(cpu) ? cpu_rq(cpu) : NULL;
> 
> Maybe we should consider an access to an unlocked rq as invalid and trigger
> scx_exit(), similar to what we do with the kf_cpu_valid() check?

Makes sense to me!

> 
> Also heads up that locked_rq has been renamed scx_locked_rq_state in 6.17.

Ah, thanks! I'll rebase!