Run to parity ensures that current will get a chance to run its full
slice in one go, but this can create large latency for an entity with a
shorter slice that has already exhausted its slice and is waiting to run
the next one.
Clamp the run to parity duration to the shortest slice of all enqueued
entities.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
kernel/sched/fair.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 479b38dc307a..d8345219dfd4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -917,23 +917,32 @@ struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
}
/*
- * HACK, stash a copy of deadline at the point of pick in vlag,
- * which isn't used until dequeue.
+ * HACK, Set the vruntime, up to which the entity can run before picking
+ * another one, in vlag, which isn't used until dequeue.
+ * In case of run to parity, we use the shortest slice of the enqueued
+ * entities.
*/
static inline void set_protect_slice(struct sched_entity *se)
{
- se->vlag = se->deadline;
+ u64 min_slice;
+
+ min_slice = cfs_rq_min_slice(cfs_rq_of(se));
+
+ if (min_slice != se->slice)
+ se->vlag = min(se->deadline, se->vruntime + calc_delta_fair(min_slice, se));
+ else
+ se->vlag = se->deadline;
}
static inline bool protect_slice(struct sched_entity *se)
{
- return se->vlag == se->deadline;
+ return ((s64)(se->vlag - se->vruntime) > 0);
}
static inline void cancel_protect_slice(struct sched_entity *se)
{
if (protect_slice(se))
- se->vlag = se->deadline + 1;
+ se->vlag = se->vruntime;
}
/*
--
2.43.0
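[Editor's illustration, not part of the patch: a minimal userspace sketch of
the new set_protect_slice() arithmetic. The simplified calc_delta_fair()
(delta * NICE_0_WEIGHT / weight; the kernel uses a fixed-point inverse
weight) and the example values are assumptions for illustration only.]

#include <stdio.h>
#include <stdint.h>

#define NICE_0_WEIGHT 1024ULL

/*
 * Simplified stand-in for the kernel's calc_delta_fair(): scale a
 * wall-time delta into vruntime by NICE_0_WEIGHT / weight.
 */
static uint64_t calc_delta_fair_sim(uint64_t delta, uint64_t weight)
{
	return delta * NICE_0_WEIGHT / weight;
}

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

int main(void)
{
	/* current: 3ms slice at nice-0; another runnable entity has a 0.7ms slice */
	uint64_t vruntime  = 100000000ULL;            /* ns */
	uint64_t deadline  = 100000000ULL + 3000000ULL;
	uint64_t slice     = 3000000ULL;              /* current's slice */
	uint64_t min_slice = 700000ULL;               /* shortest enqueued slice */
	uint64_t weight    = NICE_0_WEIGHT;

	uint64_t vlag = (min_slice != slice)
		? min_u64(deadline, vruntime + calc_delta_fair_sim(min_slice, weight))
		: deadline;

	/* protection now ends 0.7ms of vruntime past the pick, not at the 3ms deadline */
	printf("protected window: %llu ns\n",
	       (unsigned long long)(vlag - vruntime));
	return 0;
}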
On Friday, June 13th, 2025 at 7:16 AM, Vincent Guittot <vincent.guittot@linaro.org> wrote:

> Run to parity ensures that current will get a chance to run its full
> slice in one go but this can create large latency for entity with shorter
> slice that has alreasy exausted its slice and wait to run the next one.

"already exhausted"

> Clamp the run to parity duration to the shortest slice of all enqueued
> entities.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
>
[...]
> + * HACK, Set the vruntime, up to which the entity can run before picking
> + * another one, in vlag, which isn't used until dequeue.
> + * In case of run to parity, we use the shortest slice of the enqueued
> + * entities.
>   */

I am going to admit - I don't have a good intuitive sense on how this
will affect the functionality. Maybe you can help me think of a test
case to explicitly write out this assumption in behavior?

Dhaval

[...]
On Sat, 14 Jun 2025 at 00:53, <dhaval@gianis.ca> wrote:
>
> On Friday, June 13th, 2025 at 7:16 AM, Vincent Guittot <vincent.guittot@linaro.org> wrote:
>
> > Run to parity ensures that current will get a chance to run its full
> > slice in one go but this can create large latency for entity with shorter
> > slice that has alreasy exausted its slice and wait to run the next one.
>
> "already exhausted"
>
[...]
> I am going to admit - I don't have a good intuitive sense on how this
> will affect the functionality. Maybe you can help me think of a test
> case to explicitly write out this assumption in behavior?

Run to parity minimizes the number of context switches to improve
throughput by letting an entity run its full slice before picking
another entity. When all entities have the same and default
sysctl_sched_base_slice, the latter can be assumed to also be the
quantum q (although this is not really true, as the entity can be
preempted during its quantum in our case). In that case, we still
comply with the theorem:

    -r_max < lag_k(d) < max(r_max, q)

r_max being the max slice request of task k.

When entities have different slice durations, we break this rule,
which becomes:

    -r_max < lag_k(d) < max(max of r, q)

'max of r' being the maximum slice of all entities.

In order to come back to the 1st version, we can't wait for the end of
the current task's slice but must align with the shorter slice.

When run to parity is disabled, we can face a similar problem because
we don't enforce a resched periodically. In this case (patch 5), we
use the 0.7ms value as the quantum q.

So I would say that checking

    -r_max < lag_k(d) < max(r_max, q)

when a task is dequeued should be a good test. We might need to use

    -(r_max + tick period) < lag_k(d) < max(r_max, q) + tick period

because of the way we trigger resched.

> Dhaval
[...]
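[Editor's illustration of the dequeue-time check Vincent suggests: a
minimal userspace sketch. The 1ms tick period, the harness supplying lag
and r_max, and the example values are assumptions, not from the thread.]

#include <stdio.h>
#include <stdint.h>

#define TICK_NSEC     1000000LL   /* assumed 1ms tick (HZ=1000) */
#define QUANTUM_NSEC   700000LL   /* the 0.7ms quantum q from patch 5 */

static int64_t max_s64(int64_t a, int64_t b)
{
	return a > b ? a : b;
}

/*
 * Check -(r_max + tick) < lag_k(d) < max(r_max, q) + tick for one entity
 * at dequeue time; lag and r_max (its maximum slice request) would come
 * from a tracing harness.
 */
static int lag_in_bounds(int64_t lag, int64_t r_max)
{
	return lag > -(r_max + TICK_NSEC) &&
	       lag < max_s64(r_max, QUANTUM_NSEC) + TICK_NSEC;
}

int main(void)
{
	int64_t r_max = 3000000;  /* 3ms max slice request */

	printf("lag = -2ms in bounds? %d\n", lag_in_bounds(-2000000, r_max));
	printf("lag = -5ms in bounds? %d\n", lag_in_bounds(-5000000, r_max));
	return 0;
}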