[RFC PATCH] sched/fair: Prevent cpufreq from not being updated when a delayed task is in iowait

Posted by Xuewen Yan 9 months, 3 weeks ago
Because a sched-delayed task may be in the io-wait state, we should
place requeue_delayed_entity() after the cpufreq_update_util() call;
otherwise we return early without applying the iowait cpufreq boost.
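
With the current ordering, the enqueue path for such a task is roughly:

	enqueue_task_fair()
		util_est_enqueue()
		requeue_delayed_entity()
		return;		/* the iowait cpufreq_update_util() below is never reached */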

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/sched/fair.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2d6d5582c3e9..040674734128 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6931,11 +6931,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE))))
 		util_est_enqueue(&rq->cfs, p);
 
-	if (flags & ENQUEUE_DELAYED) {
-		requeue_delayed_entity(se);
-		return;
-	}
-
 	/*
 	 * If in_iowait is set, the code below may not trigger any cpufreq
 	 * utilization updates, so do it here explicitly with the IOWAIT flag
@@ -6944,6 +6939,11 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	if (p->in_iowait)
 		cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
 
+	if (flags & ENQUEUE_DELAYED) {
+		requeue_delayed_entity(se);
+		return;
+	}
+
 	if (task_new && se->sched_delayed)
 		h_nr_runnable = 0;
 
-- 
2.25.1
Re: [RFC PATCH] sched/fair: Prevent cpufreq from not being updated when a delayed task is in iowait
Posted by Christian Loehle 9 months, 3 weeks ago
On 2/26/25 11:43, Xuewen Yan wrote:
> Because a sched-delayed task may be in the io-wait state, we should
> place requeue_delayed_entity() after the cpufreq_update_util() call;
> otherwise we return early without applying the iowait cpufreq boost.
> 
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  kernel/sched/fair.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2d6d5582c3e9..040674734128 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6931,11 +6931,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>  	if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE))))
>  		util_est_enqueue(&rq->cfs, p);
>  
> -	if (flags & ENQUEUE_DELAYED) {
> -		requeue_delayed_entity(se);
> -		return;
> -	}
> -
>  	/*
>  	 * If in_iowait is set, the code below may not trigger any cpufreq
>  	 * utilization updates, so do it here explicitly with the IOWAIT flag
> @@ -6944,6 +6939,11 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>  	if (p->in_iowait)
>  		cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
>  
> +	if (flags & ENQUEUE_DELAYED) {
> +		requeue_delayed_entity(se);
> +		return;
> +	}
> +
>  	if (task_new && se->sched_delayed)
>  		h_nr_runnable = 0;
>  

I understand that iowait cpufreq update isn't happening now (and that's a bug),
but if we reorder we may call cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT)
followed by the cpufreq_update_util() in update_load_avg() of 
requeue_delayed_entity()
	update_load_avg()
		cpufreq_update_util()

and the latter will likely be dropped by the governor, so the update
won't include util of the (re)-enqueuing task, right?
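
For reference, the drop would come from schedutil's rate limit; a
condensed sketch of the relevant check (simplified from
kernel/sched/cpufreq_schedutil.c, ignoring the limits_changed and
remote-update handling):

	static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
	{
		/*
		 * Calls landing within freq_update_delay_ns of the last
		 * frequency update are simply ignored.
		 */
		s64 delta_ns = time - sg_policy->last_freq_update_time;

		return delta_ns >= sg_policy->freq_update_delay_ns;
	}

so the second call, arriving right after the IOWAIT-flagged one, is
likely to fall inside that window.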

I'll give it some more thought.
Re: [RFC PATCH] sched/fair: Prevent cpufreq from not being updated when a delayed task is in iowait
Posted by Hongyan Xia 9 months, 3 weeks ago
On 26/02/2025 12:08, Christian Loehle wrote:
> On 2/26/25 11:43, Xuewen Yan wrote:
>> Because a sched-delayed task may be in the io-wait state, we should
>> place requeue_delayed_entity() after the cpufreq_update_util() call;
>> otherwise we return early without applying the iowait cpufreq boost.
>>
>> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
>> ---
>>   kernel/sched/fair.c | 10 +++++-----
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 2d6d5582c3e9..040674734128 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6931,11 +6931,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>>   	if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE))))
>>   		util_est_enqueue(&rq->cfs, p);
>>   
>> -	if (flags & ENQUEUE_DELAYED) {
>> -		requeue_delayed_entity(se);
>> -		return;
>> -	}
>> -
>>   	/*
>>   	 * If in_iowait is set, the code below may not trigger any cpufreq
>>   	 * utilization updates, so do it here explicitly with the IOWAIT flag
>> @@ -6944,6 +6939,11 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>>   	if (p->in_iowait)
>>   		cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
>>   
>> +	if (flags & ENQUEUE_DELAYED) {
>> +		requeue_delayed_entity(se);
>> +		return;
>> +	}
>> +
>>   	if (task_new && se->sched_delayed)
>>   		h_nr_runnable = 0;
>>   
> 
> I understand that iowait cpufreq update isn't happening now (and that's a bug),
> but if we reorder we may call cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT)
> followed by the cpufreq_update_util() in update_load_avg() of
> requeue_delayed_entity()
> 	update_load_avg()
> 		cpufreq_update_util()
> 
> and the latter will likely be dropped by the governor, so the update
> won't include util of the (re)-enqueuing task, right?
> 
> I'll give it some more thought.

True, but I think the code was like this before anyway: on the
non-delayed path the problem you mention already exists. I'm not saying
this is the right thing, just that this is how it has always been.
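
For illustration, the existing non-delayed enqueue flow for an iowait
task already looks roughly like this:

	enqueue_task_fair()
		cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT)	/* p->in_iowait */
		enqueue_entity()
			update_load_avg()
				cfs_rq_util_change()
					cpufreq_update_util(rq, 0)

so the patch would make the delayed path behave like the non-delayed one
always has.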
Re: [RFC PATCH] sched/fair: Prevent cpufreq from not being updated when a delayed task is in iowait
Posted by Xuewen Yan 9 months, 3 weeks ago
On Wed, Feb 26, 2025 at 8:08 PM Christian Loehle
<christian.loehle@arm.com> wrote:
>
> On 2/26/25 11:43, Xuewen Yan wrote:
> > Because a sched-delayed task may be in the io-wait state, we should
> > place requeue_delayed_entity() after the cpufreq_update_util() call;
> > otherwise we return early without applying the iowait cpufreq boost.
> >
> > Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> > ---
> >  kernel/sched/fair.c | 10 +++++-----
> >  1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 2d6d5582c3e9..040674734128 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6931,11 +6931,6 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >       if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & ENQUEUE_RESTORE))))
> >               util_est_enqueue(&rq->cfs, p);
> >
> > -     if (flags & ENQUEUE_DELAYED) {
> > -             requeue_delayed_entity(se);
> > -             return;
> > -     }
> > -
> >       /*
> >        * If in_iowait is set, the code below may not trigger any cpufreq
> >        * utilization updates, so do it here explicitly with the IOWAIT flag
> > @@ -6944,6 +6939,11 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> >       if (p->in_iowait)
> >               cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
> >
> > +     if (flags & ENQUEUE_DELAYED) {
> > +             requeue_delayed_entity(se);
> > +             return;
> > +     }
> > +
> >       if (task_new && se->sched_delayed)
> >               h_nr_runnable = 0;
> >
>
> I understand that iowait cpufreq update isn't happening now (and that's a bug),
> but if we reorder we may call cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT)
> followed by the cpufreq_update_util() in update_load_avg() of
> requeue_delayed_entity()
>         update_load_avg()
>                 cpufreq_update_util()
>
> and the latter will likely be dropped by the governor, so the update
> won't include util of the (re)-enqueuing task, right?

emmm, util_est already includes the util of the requeued task.
However, the cfs_rq->avg.runnable_avg would indeed be slightly different.
If we want the update to account for both iowait and runnable_avg,
perhaps we should add the iowait check to update_load_avg(), but that
would make the code more complex.
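
A rough sketch of that idea, purely as an illustration (the helper name
is made up and this is not a tested patch; the flag would still have to
be threaded through the existing update_load_avg() callers):

	/* Hypothetical: derive the cpufreq hint from the entity being enqueued. */
	static inline int enqueue_cpufreq_flags(struct sched_entity *se)
	{
		if (entity_is_task(se) && task_of(se)->in_iowait)
			return SCHED_CPUFREQ_IOWAIT;

		return 0;
	}

	/* ... and in update_load_avg(), pass that instead of 0: */
	cfs_rq_util_change(cfs_rq, enqueue_cpufreq_flags(se));

That would keep the IOWAIT hint and the freshly attached util in the
same governor callback, at the cost of the extra complexity mentioned
above.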

>
> I'll give it some more thought.