[PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case

Chengming Zhou posted 9 patches 3 years, 7 months ago
Posted by Chengming Zhou 3 years, 7 months ago
Commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
fixed two load tracking problems for new tasks, including the detach on
an unattached new task problem.

There is still another detach-on-unattached-task problem, for a task
that has been woken up by try_to_wake_up() and is waiting to actually
be woken up by sched_ttwu_pending().

try_to_wake_up(p)
  cpu = select_task_rq(p)
  if (task_cpu(p) != cpu)
    set_task_cpu(p, cpu)
      migrate_task_rq_fair()
        remove_entity_load_avg()       --> unattached
        se->avg.last_update_time = 0;
      __set_task_cpu()
  ttwu_queue(p, cpu)
    ttwu_queue_wakelist()
      __ttwu_queue_wakelist()

task_change_group_fair()
  detach_task_cfs_rq()
    detach_entity_cfs_rq()
      detach_entity_load_avg()   --> detach on unattached task
  set_task_rq()
  attach_task_cfs_rq()
    attach_entity_cfs_rq()
      attach_entity_load_avg()

The cause of this problem is similar: detach_entity_cfs_rq() should check
that se->avg.last_update_time != 0 before calling detach_entity_load_avg().
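
The two call paths above can be sketched as a small user-space model. This
is only an illustration under simplifying assumptions, not kernel code: the
toy_* types and functions are hypothetical stand-ins, the load values are
made up, and the real remove_entity_load_avg() defers the removal rather
than subtracting immediately as done here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct cfs_rq and struct sched_entity. */
struct toy_cfs_rq { long load_avg; };
struct toy_se { uint64_t last_update_time; long load_avg; };

/* Attach: add the entity's load to the runqueue and stamp the time. */
static void toy_attach(struct toy_cfs_rq *rq, struct toy_se *se, uint64_t now)
{
	rq->load_avg += se->load_avg;
	se->last_update_time = now;
}

/*
 * Models the migrate_task_rq_fair() step: the load leaves the old
 * runqueue and last_update_time is cleared, so the entity is unattached.
 */
static void toy_migrate_away(struct toy_cfs_rq *old_rq, struct toy_se *se)
{
	old_rq->load_avg -= se->load_avg;
	se->last_update_time = 0;
}

/* Detach, with or without the last_update_time check from this patch. */
static void toy_detach(struct toy_cfs_rq *rq, struct toy_se *se, int guarded)
{
	if (guarded && !se->last_update_time)
		return;	/* never attached here, nothing to remove */
	rq->load_avg -= se->load_avg;
}

/* Returns the new runqueue's load after a group change hits the window. */
long toy_scenario(int guarded)
{
	struct toy_cfs_rq old_rq = { .load_avg = 0 };
	struct toy_cfs_rq new_rq = { .load_avg = 100 };	/* other tasks' load */
	struct toy_se se = { .last_update_time = 0, .load_avg = 40 };

	toy_attach(&old_rq, &se, 1000);
	toy_migrate_away(&old_rq, &se);		/* wakeup picked another CPU */
	assert(old_rq.load_avg == 0);

	/* The entity already points at new_rq but was never attached to it. */
	toy_detach(&new_rq, &se, guarded);
	return new_rq.load_avg;
}
```

In this model toy_scenario(0) leaves the new runqueue with 60, because load
that was never added is subtracted, while toy_scenario(1) leaves it at 100.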

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 kernel/sched/fair.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1eb3fb3d95c3..eba8a64f905a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11721,6 +11721,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+#ifdef CONFIG_SMP
+	/*
+	 * In case the task sched_avg hasn't been attached:
+	 * - A forked task which hasn't been woken up by wake_up_new_task().
+	 * - A task which has been woken up by try_to_wake_up() but is
+	 *   waiting for actually being woken up by sched_ttwu_pending().
+	 */
+	if (!se->avg.last_update_time)
+		return;
+#endif
+
 	/* Catch up with the cfs_rq and remove our load when we leave */
 	update_load_avg(cfs_rq, se, 0);
 	detach_entity_load_avg(cfs_rq, se);
-- 
2.37.2
Re: [PATCH v6 6/9] sched/fair: fix another detach on unattached task corner case
Posted by Vincent Guittot 3 years, 7 months ago
On Thu, 18 Aug 2022 at 14:48, Chengming Zhou
<zhouchengming@bytedance.com> wrote:
>
> Commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
> fixed two load tracking problems for new tasks, including the detach on
> an unattached new task problem.
>
> There is still another detach-on-unattached-task problem, for a task
> that has been woken up by try_to_wake_up() and is waiting to actually
> be woken up by sched_ttwu_pending().
>
> try_to_wake_up(p)
>   cpu = select_task_rq(p)
>   if (task_cpu(p) != cpu)
>     set_task_cpu(p, cpu)
>       migrate_task_rq_fair()
>         remove_entity_load_avg()       --> unattached
>         se->avg.last_update_time = 0;
>       __set_task_cpu()
>   ttwu_queue(p, cpu)
>     ttwu_queue_wakelist()
>       __ttwu_queue_wakelist()
>
> task_change_group_fair()
>   detach_task_cfs_rq()
>     detach_entity_cfs_rq()
>       detach_entity_load_avg()   --> detach on unattached task
>   set_task_rq()
>   attach_task_cfs_rq()
>     attach_entity_cfs_rq()
>       attach_entity_load_avg()
>
> The cause of this problem is similar: detach_entity_cfs_rq() should check
> that se->avg.last_update_time != 0 before calling detach_entity_load_avg().
>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
>  kernel/sched/fair.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1eb3fb3d95c3..eba8a64f905a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11721,6 +11721,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
>  {
>         struct cfs_rq *cfs_rq = cfs_rq_of(se);
>
> +#ifdef CONFIG_SMP
> +       /*
> +        * In case the task sched_avg hasn't been attached:
> +        * - A forked task which hasn't been woken up by wake_up_new_task().
> +        * - A task which has been woken up by try_to_wake_up() but is
> +        *   waiting for actually being woken up by sched_ttwu_pending().
> +        */
> +       if (!se->avg.last_update_time)
> +               return;
> +#endif
> +
>         /* Catch up with the cfs_rq and remove our load when we leave */
>         update_load_avg(cfs_rq, se, 0);
>         detach_entity_load_avg(cfs_rq, se);
> --
> 2.37.2
>
[tip: sched/core] sched/fair: Fix another detach on unattached task corner case
Posted by tip-bot2 for Chengming Zhou 3 years, 7 months ago
The following commit has been merged into the sched/core branch of tip:

Commit-ID:     7e2edaf61814fb6aa363989d718950c023b882d4
Gitweb:        https://git.kernel.org/tip/7e2edaf61814fb6aa363989d718950c023b882d4
Author:        Chengming Zhou <zhouchengming@bytedance.com>
AuthorDate:    Thu, 18 Aug 2022 20:48:02 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Aug 2022 11:01:19 +02:00

sched/fair: Fix another detach on unattached task corner case

Commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
fixed two load tracking problems for new tasks, including the detach on
an unattached new task problem.

There is still another detach-on-unattached-task problem, for a task
that has been woken up by try_to_wake_up() and is waiting to actually
be woken up by sched_ttwu_pending().

try_to_wake_up(p)
  cpu = select_task_rq(p)
  if (task_cpu(p) != cpu)
    set_task_cpu(p, cpu)
      migrate_task_rq_fair()
        remove_entity_load_avg()       --> unattached
        se->avg.last_update_time = 0;
      __set_task_cpu()
  ttwu_queue(p, cpu)
    ttwu_queue_wakelist()
      __ttwu_queue_wakelist()

task_change_group_fair()
  detach_task_cfs_rq()
    detach_entity_cfs_rq()
      detach_entity_load_avg()   --> detach on unattached task
  set_task_rq()
  attach_task_cfs_rq()
    attach_entity_cfs_rq()
      attach_entity_load_avg()

The cause of this problem is similar: detach_entity_cfs_rq() should check
that se->avg.last_update_time != 0 before calling detach_entity_load_avg().

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20220818124805.601-7-zhouchengming@bytedance.com
---
 kernel/sched/fair.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f52e7dc..e92bc05 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11557,6 +11557,17 @@ static void detach_entity_cfs_rq(struct sched_entity *se)
 {
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
+#ifdef CONFIG_SMP
+	/*
+	 * In case the task sched_avg hasn't been attached:
+	 * - A forked task which hasn't been woken up by wake_up_new_task().
+	 * - A task which has been woken up by try_to_wake_up() but is
+	 *   waiting for actually being woken up by sched_ttwu_pending().
+	 */
+	if (!se->avg.last_update_time)
+		return;
+#endif
+
 	/* Catch up with the cfs_rq and remove our load when we leave */
 	update_load_avg(cfs_rq, se, 0);
 	detach_entity_load_avg(cfs_rq, se);