[RFC PATCH] sched/fair: Fixup wake_up_sync vs DELAYED_DEQUEUE

Xuewen Yan posted 1 patch 9 months, 3 weeks ago
Posted by Xuewen Yan 9 months, 3 weeks ago
Delayed dequeued feature keeps a sleeping task enqueued until its
lag has elapsed. As a result, it stays also visible in rq->nr_running.
So when in ake_affine_idle(), we should use the real running-tasks
in rq to check whether we should place the wake-up task to
current cpu.

Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/sched/fair.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1c0ef435a7aa..2d6d5582c3e9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7342,8 +7342,15 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
 	if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
 		return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
 
-	if (sync && cpu_rq(this_cpu)->nr_running == 1)
-		return this_cpu;
+	if (sync) {
+		struct rq *rq = cpu_rq(this_cpu);
+		int nr_delayed;
+
+		nr_delayed = rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable;
+
+		if ((rq->nr_running - nr_delayed) == 1)
+			return this_cpu;
+	}
 
 	if (available_idle_cpu(prev_cpu))
 		return prev_cpu;
-- 
2.25.1
Re: [RFC PATCH] sched/fair: Fixup wake_up_sync vs DELAYED_DEQUEUE
Posted by Vincent Guittot 9 months, 3 weeks ago
On Wed, 26 Feb 2025 at 03:51, Xuewen Yan <xuewen.yan@unisoc.com> wrote:
>
> Delayed dequeued feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it stays also visible in rq->nr_running.
> So when in ake_affine_idle(), we should use the real running-tasks

typo: wake_affine_idle

> in rq to check whether we should place the wake-up task to
> current cpu.

fair enough

>
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  kernel/sched/fair.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1c0ef435a7aa..2d6d5582c3e9 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7342,8 +7342,15 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
>         if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
>                 return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
>
> -       if (sync && cpu_rq(this_cpu)->nr_running == 1)
> -               return this_cpu;
> +       if (sync) {
> +               struct rq *rq = cpu_rq(this_cpu);
> +               int nr_delayed;
> +
> +               nr_delayed = rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable;

Could you encapsulate this in a helper function ? something like below

static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
{
        return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
}

> +
> +               if ((rq->nr_running - nr_delayed) == 1)
> +                       return this_cpu;
> +       }
>
>         if (available_idle_cpu(prev_cpu))
>                 return prev_cpu;
> --
> 2.25.1
>
>
Re: [RFC PATCH] sched/fair: Fixup wake_up_sync vs DELAYED_DEQUEUE
Posted by Xuewen Yan 9 months, 2 weeks ago
On Sat, Mar 1, 2025 at 12:39 AM Vincent Guittot
<vincent.guittot@linaro.org> wrote:
>
> On Wed, 26 Feb 2025 at 03:51, Xuewen Yan <xuewen.yan@unisoc.com> wrote:
> >
> > Delayed dequeued feature keeps a sleeping task enqueued until its
> > lag has elapsed. As a result, it stays also visible in rq->nr_running.
> > So when in ake_affine_idle(), we should use the real running-tasks
>
> typo: wake_affine_idle
>
> > in rq to check whether we should place the wake-up task to
> > current cpu.
>
> fair enough
>
> >
> > Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > ---
> >  kernel/sched/fair.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 1c0ef435a7aa..2d6d5582c3e9 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7342,8 +7342,15 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> >         if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
> >                 return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
> >
> > -       if (sync && cpu_rq(this_cpu)->nr_running == 1)
> > -               return this_cpu;
> > +       if (sync) {
> > +               struct rq *rq = cpu_rq(this_cpu);
> > +               int nr_delayed;
> > +
> > +               nr_delayed = rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable;
>
> Could you encapsulate this in a helper function ? something like below
>
> static inline unsigned int cfs_h_nr_delayed(struct rq *rq)
> {
>         return (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
> }
>

Okay, or we could add cfs_h_nr_delayed(struct cfs_rq *cfs_rq)?
Others can then call it as cfs_h_nr_delayed(cfs_rq) or cfs_h_nr_delayed(&rq->cfs).
This way, the helper function will be more versatile.
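
For illustration, here is a minimal userspace sketch of the cfs_rq-based
helper. The structs are mock-ups that carry only the fields named in this
thread, and sync_wake_stays_local() is a hypothetical name that models
only the sync check from the patch, not the whole of wake_affine_idle().

```c
/* Mock-ups: only the fields used in this thread, not the kernel structs. */
struct cfs_rq {
	unsigned int h_nr_queued;	/* queued tasks, delayed ones included */
	unsigned int h_nr_runnable;	/* queued tasks that are really runnable */
};

struct rq {
	unsigned int nr_running;
	struct cfs_rq cfs;
};

/* Number of tasks kept on the queue only because of delayed dequeue. */
static inline unsigned int cfs_h_nr_delayed(const struct cfs_rq *cfs_rq)
{
	return cfs_rq->h_nr_queued - cfs_rq->h_nr_runnable;
}

/*
 * Hypothetical stand-in for the sync check in the patch: the waker is
 * treated as alone on the CPU once delayed tasks are subtracted.
 */
static int sync_wake_stays_local(const struct rq *rq)
{
	return (rq->nr_running - cfs_h_nr_delayed(&rq->cfs)) == 1;
}
```

With nr_running = 2, h_nr_queued = 2 and h_nr_runnable = 1, one task is
delayed, so the check passes even though nr_running alone is not 1.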

On the other hand, I found that this is not the only nr_running check
that needs fixing; other users of nr_running need the same correction,
such as yield_task_fair() and sched_fair_update_stop_tick(). Perhaps we
need a series of patches to address these issues.
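
As a rough sketch of what a shared fix could look like: the checks in
those two functions are not shown in this thread, so the code below
assumes, purely for illustration, that each compares nr_running against
1. The structs are userspace mock-ups, and rq_nr_running_no_delayed()
and only_one_task_really_running() are hypothetical names.

```c
#include <stdbool.h>

/* Mock-ups: only the fields used in this thread, not the kernel structs. */
struct cfs_rq {
	unsigned int h_nr_queued;
	unsigned int h_nr_runnable;
};

struct rq {
	unsigned int nr_running;
	struct cfs_rq cfs;
};

/* Hypothetical helper: nr_running with delayed-dequeue tasks subtracted. */
static inline unsigned int rq_nr_running_no_delayed(const struct rq *rq)
{
	return rq->nr_running - (rq->cfs.h_nr_queued - rq->cfs.h_nr_runnable);
}

/*
 * Assumed shape of an "only one task on this rq" test. Any caller that
 * compares nr_running against 1 could use this predicate instead.
 */
static bool only_one_task_really_running(const struct rq *rq)
{
	return rq_nr_running_no_delayed(rq) == 1;
}
```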

BR