[v3] Add latency priority for CFS class

[PATCH v3 6/8] sched/fair: Add sched group latency support

Posted by Vincent Guittot 3 years, 7 months ago

Task can set its latency priority, which is then used to decide to preempt
the current running entity of the cfs, but sched group entities still have
the default latency offset.

Add a latency field in task group to set the latency offset of the
sched_eneities of the group, which will be used against other entities in
the parent cfs when deciding which entity to schedule first.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/core.c  | 24 ++++++++++++++++++++++++
 kernel/sched/fair.c  | 33 +++++++++++++++++++++++++++++++++
 kernel/sched/sched.h |  4 ++++
 3 files changed, 61 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 13cf794708ee..bfea862a3588 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10860,6 +10860,19 @@ static int cpu_idle_write_s64(struct cgroup_subsys_state *css,
 {
 	return sched_group_set_idle(css_tg(css), idle);
 }
+
+static s64 cpu_latency_read_s64(struct cgroup_subsys_state *css,
+			       struct cftype *cft)
+{
+	return css_tg(css)->latency_offset;
+}
+
+static int cpu_latency_write_s64(struct cgroup_subsys_state *css,
+				struct cftype *cft, s64 latency)
+{
+	return sched_group_set_latency(css_tg(css), latency);
+}
+
 #endif
 
 static struct cftype cpu_legacy_files[] = {
@@ -10874,6 +10887,11 @@ static struct cftype cpu_legacy_files[] = {
 		.read_s64 = cpu_idle_read_s64,
 		.write_s64 = cpu_idle_write_s64,
 	},
+	{
+		.name = "latency",
+		.read_s64 = cpu_latency_read_s64,
+		.write_s64 = cpu_latency_write_s64,
+	},
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
 	{
@@ -11091,6 +11109,12 @@ static struct cftype cpu_files[] = {
 		.read_s64 = cpu_idle_read_s64,
 		.write_s64 = cpu_idle_write_s64,
 	},
+	{
+		.name = "latency",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.read_s64 = cpu_latency_read_s64,
+		.write_s64 = cpu_latency_write_s64,
+	},
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
 	{
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a20eadb0af97..6cc4f2a9725d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11769,6 +11769,7 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 		goto err;
 
 	tg->shares = NICE_0_LOAD;
+	tg->latency_offset = 0;
 
 	init_cfs_bandwidth(tg_cfs_bandwidth(tg));
 
@@ -11867,6 +11868,9 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
 	}
 
 	se->my_q = cfs_rq;
+
+	se->latency_offset = tg->latency_offset;
+
 	/* guarantee group entities always have weight */
 	update_load_set(&se->load, NICE_0_LOAD);
 	se->parent = parent;
@@ -11997,6 +12001,35 @@ int sched_group_set_idle(struct task_group *tg, long idle)
 	return 0;
 }
 
+int sched_group_set_latency(struct task_group *tg, long latency)
+{
+	int i;
+
+	if (tg == &root_task_group)
+		return -EINVAL;
+
+	if (abs(latency) > sysctl_sched_latency)
+		return -EINVAL;
+
+	mutex_lock(&shares_mutex);
+
+	if (tg->latency_offset == latency) {
+		mutex_unlock(&shares_mutex);
+		return 0;
+	}
+
+	tg->latency_offset = latency;
+
+	for_each_possible_cpu(i) {
+		struct sched_entity *se = tg->se[i];
+
+		WRITE_ONCE(se->latency_offset, latency);
+	}
+
+	mutex_unlock(&shares_mutex);
+	return 0;
+}
+
 #else /* CONFIG_FAIR_GROUP_SCHED */
 
 void free_fair_sched_group(struct task_group *tg) { }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c4beddb58ebd..117b23dfb912 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -406,6 +406,8 @@ struct task_group {
 
 	/* A positive value indicates that this is a SCHED_IDLE group. */
 	int			idle;
+	/* latency constraint of the group. */
+	int			latency_offset;
 
 #ifdef	CONFIG_SMP
 	/*
@@ -519,6 +521,8 @@ extern int sched_group_set_shares(struct task_group *tg, unsigned long shares);
 
 extern int sched_group_set_idle(struct task_group *tg, long idle);
 
+extern int sched_group_set_latency(struct task_group *tg, long latency);
+
 #ifdef CONFIG_SMP
 extern void set_task_rq_fair(struct sched_entity *se,
 			     struct cfs_rq *prev, struct cfs_rq *next);
-- 
2.17.1

Re: [PATCH v3 6/8] sched/fair: Add sched group latency support

Posted by Tejun Heo 3 years, 6 months ago

Hello,

On Fri, Sep 09, 2022 at 03:03:07PM +0200, Vincent Guittot wrote:
> +	{
> +		.name = "latency",
> +		.read_s64 = cpu_latency_read_s64,
> +		.write_s64 = cpu_latency_write_s64,
> +	},

You're still using the nice value here, right? If so, can you please use the
filename "latency.nice" so that it's consistent with "weight.nice"?

Thanks.

-- 
tejun

Re: [PATCH v3 6/8] sched/fair: Add sched group latency support

Posted by Vincent Guittot 3 years, 6 months ago

On Tue, 13 Sept 2022 at 00:18, Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Fri, Sep 09, 2022 at 03:03:07PM +0200, Vincent Guittot wrote:
> > +     {
> > +             .name = "latency",
> > +             .read_s64 = cpu_latency_read_s64,
> > +             .write_s64 = cpu_latency_write_s64,
> > +     },
>
> You're still using the nice value here, right? If so, can you please use the
> filename "latency.nice" so that it's consistent with "weight.nice"?

This cpu.latency is not a nice priority but the signed offset used by
the scheduler at wakeup. On previous version you raised concern about
having a nice value for cgroup so I removed it for cgroup and directly
exposed the latency offset with cpu.latency similarly to the weight
that is exposed with cpu.shares.

Vincent
>
> Thanks.
>
> --
> tejun