[PATCH 2/2] sched/ext: Add BPF functions for uclamp inc and dec

Hongyan Xia posted 2 patches 1 year, 5 months ago
Posted by Hongyan Xia 1 year, 5 months ago
A sched_ext scheduler has several options for handling uclamp:

1. Re-use the current uclamp implementation
2. Ignore uclamp completely
3. Have its own custom uclamp implementation

Expose uclamp_rq_inc() and uclamp_rq_dec() as BPF kfuncs and let the
scheduler itself decide what to do.

Signed-off-by: Hongyan Xia <hongyan.xia2@arm.com>
---
 kernel/sched/ext.c                       | 12 ++++++++++++
 tools/sched_ext/include/scx/common.bpf.h |  2 ++
 2 files changed, 14 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 0b120104a7ce..48c553b6f0c3 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -6108,6 +6108,16 @@ __bpf_kfunc s32 scx_bpf_task_cpu(const struct task_struct *p)
 	return task_cpu(p);
 }
 
+__bpf_kfunc void scx_bpf_uclamp_rq_inc(s32 cpu, struct task_struct *p)
+{
+	uclamp_rq_inc(cpu_rq(cpu), p);
+}
+
+__bpf_kfunc void scx_bpf_uclamp_rq_dec(s32 cpu, struct task_struct *p)
+{
+	uclamp_rq_dec(cpu_rq(cpu), p);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(scx_kfunc_ids_any)
@@ -6132,6 +6142,8 @@ BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_running, KF_RCU)
 BTF_ID_FLAGS(func, scx_bpf_task_cpu, KF_RCU)
+BTF_ID_FLAGS(func, scx_bpf_uclamp_rq_inc)
+BTF_ID_FLAGS(func, scx_bpf_uclamp_rq_dec)
 BTF_KFUNCS_END(scx_kfunc_ids_any)
 
 static const struct btf_kfunc_id_set scx_kfunc_set_any = {
diff --git a/tools/sched_ext/include/scx/common.bpf.h b/tools/sched_ext/include/scx/common.bpf.h
index dbbda0e35c5d..85ddc94fb4c1 100644
--- a/tools/sched_ext/include/scx/common.bpf.h
+++ b/tools/sched_ext/include/scx/common.bpf.h
@@ -57,6 +57,8 @@ s32 scx_bpf_pick_idle_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
 s32 scx_bpf_pick_any_cpu(const cpumask_t *cpus_allowed, u64 flags) __ksym;
 bool scx_bpf_task_running(const struct task_struct *p) __ksym;
 s32 scx_bpf_task_cpu(const struct task_struct *p) __ksym;
+void scx_bpf_uclamp_rq_inc(s32 cpu, struct task_struct *p) __ksym;
+void scx_bpf_uclamp_rq_dec(s32 cpu, struct task_struct *p) __ksym;
 
 static inline __attribute__((format(printf, 1, 2)))
 void ___scx_bpf_bstr_format_checker(const char *fmt, ...) {}
-- 
2.34.1
Re: [PATCH 2/2] sched/ext: Add BPF functions for uclamp inc and dec
Posted by Tejun Heo 1 year, 5 months ago
Hello.

On Wed, Jul 03, 2024 at 11:07:48AM +0100, Hongyan Xia wrote:
> +__bpf_kfunc void scx_bpf_uclamp_rq_inc(s32 cpu, struct task_struct *p)
> +{
> +	uclamp_rq_inc(cpu_rq(cpu), p);
> +}
> +
> +__bpf_kfunc void scx_bpf_uclamp_rq_dec(s32 cpu, struct task_struct *p)
> +{
> +	uclamp_rq_dec(cpu_rq(cpu), p);
> +}

So, I don't think we can expose these functions directly to the BPF
scheduler. The BPF schedulers shouldn't be able to break system integrity no
matter what they do and with the above it'd be trivial to get the bucket
counters unbalanced, right?

Thanks.

-- 
tejun
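
The imbalance described above can be illustrated with a small user-space
model. This is only a sketch: struct toy_rq, struct toy_task, NR_BUCKETS
and every toy_* function are hypothetical stand-ins invented for this
example, not the kernel's uclamp data structures or code. It shows how
unguarded inc/dec calls, shaped like the proposed kfuncs, let a caller
leave a bucket counter permanently non-zero.

```c
#include <assert.h>

#define NR_BUCKETS 5	/* hypothetical bucket count for this toy model */

/* Toy stand-in for per-runqueue bucket bookkeeping (not kernel code). */
struct toy_rq {
	unsigned int tasks[NR_BUCKETS];	/* tasks accounted in each bucket */
};

/* Toy stand-in for a task; only the bucket it maps to matters here. */
struct toy_task {
	unsigned int bucket_id;
};

/* Unguarded increment: trusts the caller to pair it with one decrement. */
static void toy_uclamp_rq_inc(struct toy_rq *rq, const struct toy_task *p)
{
	rq->tasks[p->bucket_id]++;
}

/* Unguarded decrement: trusts the caller to have incremented first. */
static void toy_uclamp_rq_dec(struct toy_rq *rq, const struct toy_task *p)
{
	rq->tasks[p->bucket_id]--;
}

/* Well-behaved caller: one inc, one dec, counter returns to zero. */
static unsigned int toy_balanced(void)
{
	struct toy_rq rq = { { 0 } };
	struct toy_task p = { .bucket_id = 2 };

	toy_uclamp_rq_inc(&rq, &p);
	toy_uclamp_rq_dec(&rq, &p);
	return rq.tasks[p.bucket_id];
}

/* Misbehaving caller: a duplicate inc that nothing prevents. */
static unsigned int toy_double_inc(void)
{
	struct toy_rq rq = { { 0 } };
	struct toy_task p = { .bucket_id = 2 };

	toy_uclamp_rq_inc(&rq, &p);
	toy_uclamp_rq_inc(&rq, &p);	/* second call for the same task */
	toy_uclamp_rq_dec(&rq, &p);
	return rq.tasks[p.bucket_id];	/* stays at 1: bucket never empties */
}
```

In this model toy_balanced() returns 0 while toy_double_inc() returns 1,
so the bucket keeps looking occupied after the task is gone.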
Re: [PATCH 2/2] sched/ext: Add BPF functions for uclamp inc and dec
Posted by Hongyan Xia 1 year, 5 months ago
On 03/07/2024 19:15, Tejun Heo wrote:
> Hello.
> 
> On Wed, Jul 03, 2024 at 11:07:48AM +0100, Hongyan Xia wrote:
>> +__bpf_kfunc void scx_bpf_uclamp_rq_inc(s32 cpu, struct task_struct *p)
>> +{
>> +	uclamp_rq_inc(cpu_rq(cpu), p);
>> +}
>> +
>> +__bpf_kfunc void scx_bpf_uclamp_rq_dec(s32 cpu, struct task_struct *p)
>> +{
>> +	uclamp_rq_dec(cpu_rq(cpu), p);
>> +}
> 
> So, I don't think we can expose these functions directly to the BPF
> scheduler. The BPF schedulers shouldn't be able to break system integrity no
> matter what they do and with the above it'd be trivial to get the bucket
> counters unbalanced, right?

You are right.

Actually, avoiding double enqueue or dequeue is easy and might be just a
one-line change. The real concern is a BPF scheduler that somehow still
has tasks accounted in uclamp buckets when it is unloaded. In that case,
unloading the scheduler needs to do uclamp_rq_dec() for those tasks.

I'll see what I can do.
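
A minimal user-space sketch of the two ideas in this reply, under stated
assumptions: the on_bucket flag, struct toy_rq, struct toy_task and all
toy_* functions are hypothetical names made up for illustration, and the
thread does not say how the actual fix would be implemented. The guard
makes repeated inc or dec calls harmless, and the drain step drops
whatever is still accounted when the scheduler goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_BUCKETS 5	/* hypothetical bucket count for this toy model */

/* Toy stand-in for per-runqueue bucket bookkeeping (not kernel code). */
struct toy_rq {
	unsigned int tasks[NR_BUCKETS];
};

struct toy_task {
	unsigned int bucket_id;
	bool on_bucket;	/* hypothetical flag: task is currently accounted */
};

/* Guarded increment: a second call for the same task is a no-op. */
static void toy_guarded_inc(struct toy_rq *rq, struct toy_task *p)
{
	if (p->on_bucket)
		return;
	rq->tasks[p->bucket_id]++;
	p->on_bucket = true;
}

/* Guarded decrement: a call without a prior increment is a no-op. */
static void toy_guarded_dec(struct toy_rq *rq, struct toy_task *p)
{
	if (!p->on_bucket)
		return;
	rq->tasks[p->bucket_id]--;
	p->on_bucket = false;
}

/* On unload, drop every task that is still accounted in a bucket. */
static void toy_unload_drain(struct toy_rq *rq, struct toy_task *tasks,
			     size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_guarded_dec(rq, &tasks[i]);
}

/* Sum of all bucket counters; zero means nothing is left accounted. */
static unsigned int toy_total(const struct toy_rq *rq)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < NR_BUCKETS; i++)
		sum += rq->tasks[i];
	return sum;
}
```

Keeping the flag on the task rather than in the scheduler means the
drain step does not depend on the unloaded scheduler's own state, which
is the property the unload case needs in this model.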