sched_ext: Provide a per-scheduler unique sequence counter

[PATCH] sched_ext: Provide a per-scheduler unique sequence counter

Posted by Andrea Righi 2 months ago

In commit 431844b65f4c ("sched_ext: Provide a sysfs enable_seq counter")
we introduced a global sequence counter that is incremented every time a
BPF scheduler is loaded.

For now, this is enough to determine whether a scheduler has been
loaded or changed since boot. However, as we move towards supporting
stacked schedulers, a single global counter might not be sufficient,
since we may also need to track whether a specific scheduler, within a
particular hierarchy, has been changed or restarted.

To address this, also introduce a per-scheduler sequence counter, which
allows user space to monitor changes to an individual scheduler.

This counter is available in /sys/kernel/sched_ext/root/seq for now and
it just mirrors the value reported in /sys/kernel/sched_ext/enable_seq.
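
As an illustration of how user space might consume this counter, here is
a minimal sketch in C. It assumes the file holds a single decimal value,
as produced by the sysfs_emit() call in the patch below. The helper
names read_seq() and scheduler_changed() are hypothetical and exist only
for this example; they are not part of any sched_ext tooling.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Path taken from the patch description above. */
#define SCX_ROOT_SEQ_PATH "/sys/kernel/sched_ext/root/seq"

/*
 * Hypothetical helper: parse the single decimal value stored in a
 * sysfs counter file. Returns 0 on success, -1 on any failure.
 */
static int read_seq(const char *path, long long *seq)
{
	char buf[32];
	char *end;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	*seq = strtoll(buf, &end, 10);
	if (errno || end == buf)
		return -1;
	return 0;
}

/*
 * Hypothetical helper: compare the current counter with a previously
 * sampled value. Returns 1 if it differs (the scheduler was changed or
 * restarted), 0 if it is unchanged, -1 if the file cannot be read.
 */
static int scheduler_changed(const char *path, long long last)
{
	long long now;

	if (read_seq(path, &now))
		return -1;
	return now != last;
}
```

A monitoring tool could sample the value once with
read_seq(SCX_ROOT_SEQ_PATH, &seq) and later call
scheduler_changed(SCX_ROOT_SEQ_PATH, seq). A return value of -1 needs
separate handling by the caller, since it only says the file could not
be read or parsed.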

Signed-off-by: Andrea Righi <andrea.righi@linux.dev>
---
 kernel/sched/ext.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index d6f6bf6caecc..62782a31b316 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -662,6 +662,13 @@ struct sched_ext_ops {
 	 */
 	u64 hotplug_seq;
 
+	/*
+	 * enable_seq - unique per-scheduler counter that can be accessed from
+	 * user-space to determine if a scheduler (within a specific hierarchy)
+	 * has been restarted.
+	 */
+	s64 enable_seq;
+
 	/**
 	 * name - BPF scheduler's name
 	 *
@@ -4210,8 +4217,16 @@ static ssize_t scx_attr_ops_show(struct kobject *kobj,
 }
 SCX_ATTR(ops);
 
+static ssize_t scx_attr_seq_show(struct kobject *kobj,
+				 struct kobj_attribute *ka, char *buf)
+{
+	return sysfs_emit(buf, "%lld\n", scx_ops.enable_seq);
+}
+SCX_ATTR(seq);
+
 static struct attribute *scx_sched_attrs[] = {
 	&scx_attr_ops.attr,
+	&scx_attr_seq.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(scx_sched);
@@ -5237,7 +5252,11 @@ static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
 	kobject_uevent(scx_root_kobj, KOBJ_ADD);
 	mutex_unlock(&scx_ops_enable_mutex);
 
-	atomic_long_inc(&scx_enable_seq);
+	/*
+	 * Update scheduler's sequence counter (add 1 to keep it consistent
+	 * with the global scx_enable_seq counter).
+	 */
+	scx_ops.enable_seq = atomic_long_fetch_inc(&scx_enable_seq) + 1;
 
 	return 0;
 
-- 
2.46.2
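
The "+ 1" in the last hunk is there because a fetch-and-increment
returns the value from before the increment. The user-space sketch below
shows the same pattern with C11 atomics. Both global_enable_seq and
record_enable() are hypothetical stand-ins for scx_enable_seq and the
assignment in scx_ops_enable(); this is not kernel code.

```c
#include <stdatomic.h>

/* User-space stand-in for the kernel's global scx_enable_seq counter. */
static atomic_long global_enable_seq;

/*
 * Hypothetical stand-in for the assignment in scx_ops_enable().
 * atomic_fetch_add() returns the value held before the increment, so
 * adding 1 yields the value the global counter holds right after this
 * call's own increment.
 */
static long record_enable(void)
{
	return atomic_fetch_add(&global_enable_seq, 1) + 1;
}
```

With this pattern the per-scheduler value and the global counter agree
immediately after an enable, which matches the "mirrors" behaviour
described in the commit message. In the kernel, atomic_long_inc_return()
would express the same thing without the explicit "+ 1".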

Re: [PATCH] sched_ext: Provide a per-scheduler unique sequence counter

Posted by Tejun Heo 1 month, 4 weeks ago

Hello, Andrea.

On Fri, Sep 27, 2024 at 06:59:01PM +0200, Andrea Righi wrote:
...
> @@ -662,6 +662,13 @@ struct sched_ext_ops {
>  	 */
>  	u64 hotplug_seq;
>  
> +	/*
> +	 * enable_seq - unique per-scheduler counter that can be accessed from
> +	 * user-space to determine if a scheduler (within a specific hierarchy)
> +	 * has been restarted.
> +	 */
> +	s64 enable_seq;

Let's just make it a global variable for now. When we package up context for
each scheduler instance into a struct, it will get packaged up together.
It's a bit odd to add enable_seq to ops as userspace can't do anything with
it (note that hotplug_seq is different in that it's provided by the
userspace on load).

Thanks.

-- 
tejun

Re: [PATCH] sched_ext: Provide a per-scheduler unique sequence counter

Posted by Andrea Righi 1 month, 4 weeks ago

On Mon, Sep 30, 2024 at 08:33:17AM -1000, Tejun Heo wrote:
> Hello, Andrea.
> 
> On Fri, Sep 27, 2024 at 06:59:01PM +0200, Andrea Righi wrote:
> ...
> > @@ -662,6 +662,13 @@ struct sched_ext_ops {
> >  	 */
> >  	u64 hotplug_seq;
> >  
> > +	/*
> > +	 * enable_seq - unique per-scheduler counter that can be accessed from
> > +	 * user-space to determine if a scheduler (within a specific hierarchy)
> > +	 * has been restarted.
> > +	 */
> > +	s64 enable_seq;
> 
> Let's just make it a global variable for now. When we package up context for
> each scheduler instance into a struct, it will get packaged up together.
> It's a bit odd to add enable_seq to ops as userspace can't do anything with
> it (note that hotplug_seq is different in that it's provided by the
> userspace on load).

Yep, makes sense. I only sent it because it was mentioned in the other
thread about enable_seq, but we don't really need it right now, so we
can just ignore this patch for the time being.

Thanks,
-Andrea