[PATCH] sched/fair: Make SCHED_IDLE se be preempted in strict hierarchy
Posted by Tianchen Ding 1 year, 7 months ago
Consider the following cgroup:
                       root
                        |
             ------------------------
             |                      |
       normal_cgroup            idle_cgroup
             |                      |
   SCHED_IDLE task_A           SCHED_NORMAL task_B

According to the cgroup hierarchy, A should preempt B. But the current
check_preempt_wakeup_fair() treats the cgroup se and the task separately,
so B preempts A unexpectedly.
Unify the wakeup preemption logic to check {c,p}se_is_idle only.
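
This works because find_matching_se() first walks both entities up to a
common cfs_rq, so the se_is_idle() checks compare entities at the same
level of the cgroup hierarchy. A simplified sketch of that walk
(paraphrasing the kernel code, not a verbatim quote):

static void find_matching_se(struct sched_entity **se, struct sched_entity **pse)
{
	int se_depth = (*se)->depth;
	int pse_depth = (*pse)->depth;

	/* Bring the deeper entity up until both depths match... */
	while (se_depth > pse_depth) {
		se_depth--;
		*se = parent_entity(*se);
	}
	while (pse_depth > se_depth) {
		pse_depth--;
		*pse = parent_entity(*pse);
	}

	/* ...then climb in lockstep until both share a cfs_rq. */
	while (!is_same_group(*se, *pse)) {
		*se = parent_entity(*se);
		*pse = parent_entity(*pse);
	}
}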

Also fix a bug in the definition of se_is_idle() when
!CONFIG_FAIR_GROUP_SCHED: it unconditionally returned 0, regardless of
the task's policy.
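
For completeness, a hypothetical userspace reproducer outline for the
scenario above (cgroup paths and setup are assumed, not from the
original report): run a CPU hog as task_B under idle_cgroup with
cpu.idle=1, then run this as task_A under normal_cgroup:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
	struct sched_param sp = { .sched_priority = 0 };

	/* task_A: SCHED_IDLE (priority must be 0 for this policy). */
	if (sched_setscheduler(0, SCHED_IDLE, &sp)) {
		perror("sched_setscheduler");
		return 1;
	}

	/*
	 * Busy loop. Before the patch, wakeups of task_B preempt this
	 * task even though normal_cgroup should win by hierarchy.
	 */
	for (;;)
		;
}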

Fixes: 304000390f88 ("sched: Cgroup SCHED_IDLE support")
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
---
 kernel/sched/fair.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 41b58387023d..c91cfaa7d9ee 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -511,7 +511,7 @@ static int cfs_rq_is_idle(struct cfs_rq *cfs_rq)
 
 static int se_is_idle(struct sched_entity *se)
 {
-	return 0;
+	return task_has_idle_policy(task_of(se));
 }
 
 #endif	/* CONFIG_FAIR_GROUP_SCHED */
@@ -8382,16 +8382,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 	if (test_tsk_need_resched(curr))
 		return;
 
-	/* Idle tasks are by definition preempted by non-idle tasks. */
-	if (unlikely(task_has_idle_policy(curr)) &&
-	    likely(!task_has_idle_policy(p)))
-		goto preempt;
-
-	/*
-	 * Batch and idle tasks do not preempt non-idle tasks (their preemption
-	 * is driven by the tick):
-	 */
-	if (unlikely(p->policy != SCHED_NORMAL) || !sched_feat(WAKEUP_PREEMPTION))
+	if (!sched_feat(WAKEUP_PREEMPTION))
 		return;
 
 	find_matching_se(&se, &pse);
@@ -8408,6 +8399,12 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 		goto preempt;
 	if (cse_is_idle != pse_is_idle)
 		return;
+	/*
+	 * Batch tasks do not preempt non-idle tasks (their preemption
+	 * is driven by the tick):
+	 */
+	if (unlikely(pse == &p->se && p->policy == SCHED_BATCH))
+		return;
 
 	cfs_rq = cfs_rq_of(se);
 	update_curr(cfs_rq);
-- 
2.39.3
Re: [PATCH] sched/fair: Make SCHED_IDLE se be preempted in strict hierarchy
Posted by Vincent Guittot 1 year, 7 months ago
On Mon, 24 Jun 2024 at 09:39, Tianchen Ding <dtcccc@linux.alibaba.com> wrote:
>
> Consider the following cgroup:
>                        root
>                         |
>              ------------------------
>              |                      |
>        normal_cgroup            idle_cgroup
>              |                      |
>    SCHED_IDLE task_A           SCHED_NORMAL task_B
>
> According to the cgroup hierarchy, A should preempt B. But the current
> check_preempt_wakeup_fair() treats the cgroup se and the task separately,
> so B preempts A unexpectedly.
> Unify the wakeup preemption logic to check {c,p}se_is_idle only.
>
> Also fix a bug in the definition of se_is_idle() when
> !CONFIG_FAIR_GROUP_SCHED: it unconditionally returned 0, regardless of
> the task's policy.
>
> Fixes: 304000390f88 ("sched: Cgroup SCHED_IDLE support")
> Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
> ---
>  kernel/sched/fair.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 41b58387023d..c91cfaa7d9ee 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -511,7 +511,7 @@ static int cfs_rq_is_idle(struct cfs_rq *cfs_rq)
>
>  static int se_is_idle(struct sched_entity *se)
>  {
> -       return 0;
> +       return task_has_idle_policy(task_of(se));
>  }
>
>  #endif /* CONFIG_FAIR_GROUP_SCHED */
> @@ -8382,16 +8382,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
>         if (test_tsk_need_resched(curr))
>                 return;
>
> -       /* Idle tasks are by definition preempted by non-idle tasks. */
> -       if (unlikely(task_has_idle_policy(curr)) &&
> -           likely(!task_has_idle_policy(p)))
> -               goto preempt;
> -
> -       /*
> -        * Batch and idle tasks do not preempt non-idle tasks (their preemption
> -        * is driven by the tick):
> -        */
> -       if (unlikely(p->policy != SCHED_NORMAL) || !sched_feat(WAKEUP_PREEMPTION))
> +       if (!sched_feat(WAKEUP_PREEMPTION))
>                 return;
>
>         find_matching_se(&se, &pse);
> @@ -8408,6 +8399,12 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int

Replace "group" by "entity" in the comment above as it is not only
group but also entity
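
For reference, the comment in question (quoting it from memory, so
double-check against the tree) reads roughly:

	/*
	 * Preempt an idle group in favor of a non-idle group (and don't
	 * preempt when going to idle).
	 */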

>                 goto preempt;
>         if (cse_is_idle != pse_is_idle)
>                 return;
> +       /*
> +        * Batch tasks do not preempt non-idle tasks (their preemption
> +        * is driven by the tick):
> +        */
> +       if (unlikely(pse == &p->se && p->policy == SCHED_BATCH))

I think I would prefer entity_is_task(), which makes it easier to
understand that the condition is about a task:

+       if (unlikely(entity_is_task(pse) && p->policy == SCHED_BATCH))
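
For context, entity_is_task() boils down to checking whether the se
owns a child runqueue; a sketch of its definition (from memory, so
verify against the tree):

#ifdef CONFIG_FAIR_GROUP_SCHED
/* An entity is a task iff it does not own a runqueue of its own. */
#define entity_is_task(se)	(!se->my_q)
#else
#define entity_is_task(se)	1
#endif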

Other than the two comments above:

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>


> +               return;
>
>         cfs_rq = cfs_rq_of(se);
>         update_curr(cfs_rq);
> --
> 2.39.3
>
Re: [PATCH] sched/fair: Make SCHED_IDLE se be preempted in strict hierarchy
Posted by Josh Don 1 year, 7 months ago
Hi Tianchen,

Thanks for these fixes.

>  #endif /* CONFIG_FAIR_GROUP_SCHED */
> @@ -8382,16 +8382,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
>         if (test_tsk_need_resched(curr))
>                 return;
>
> -       /* Idle tasks are by definition preempted by non-idle tasks. */
> -       if (unlikely(task_has_idle_policy(curr)) &&
> -           likely(!task_has_idle_policy(p)))
> -               goto preempt;
> -
> -       /*
> -        * Batch and idle tasks do not preempt non-idle tasks (their preemption
> -        * is driven by the tick):
> -        */
> -       if (unlikely(p->policy != SCHED_NORMAL) || !sched_feat(WAKEUP_PREEMPTION))
> +       if (!sched_feat(WAKEUP_PREEMPTION))
>                 return;
>
>         find_matching_se(&se, &pse);
> @@ -8408,6 +8399,12 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
>                 goto preempt;
>         if (cse_is_idle != pse_is_idle)
>                 return;

nit: add newline

> +       /*
> +        * Batch tasks do not preempt non-idle tasks (their preemption
> +        * is driven by the tick):
> +        */
> +       if (unlikely(pse == &p->se && p->policy == SCHED_BATCH))
> +               return;

I think it is worth extending that comment to explain why we don't
also check the policy for SCHED_IDLE here (since the cse_is_idle checks
above boil down to checking for task idle policy in the special case
where pse is a task entity). The comment above the pse_is_idle checks
only talks about idle group preemption, whereas here we're relying on
it for idle task preemption.
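
To make the "boils down" point concrete: with CONFIG_FAIR_GROUP_SCHED,
se_is_idle() looks roughly like this (quoting from memory, so verify
against the tree), and the first branch is exactly the task idle-policy
check once find_matching_se() has left pse as a task entity:

static int se_is_idle(struct sched_entity *se)
{
	if (entity_is_task(se))
		return task_has_idle_policy(task_of(se));

	return cfs_rq_is_idle(group_cfs_rq(se));
}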

>
>         cfs_rq = cfs_rq_of(se);
>         update_curr(cfs_rq);
> --
> 2.39.3
>

Reviewed-by: Josh Don <joshdon@google.com>