[RFC PATCH 5/7] sched/fair: Take care of group/affinity/sched_class change for throttled task

Posted by Aaron Lu 9 months, 1 week ago
On task group change, for a queued task, core will dequeue it and then
requeue it. A throttled task is still considered queued by core because
p->on_rq is still set, so core will dequeue it too; but since the task
was already dequeued on throttle, handle this case properly in fair
class code.

Affinity and sched class changes are handled similarly.

Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
---
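For reference, the core-side pattern the changelog refers to looks
roughly like this (a simplified sketch modeled on sched_move_task() in
kernel/sched/core.c; flag names and locking are approximate):

	queued = task_on_rq_queued(p);	/* still true for a throttled task */
	running = task_current(rq, p);

	if (queued)
		dequeue_task(rq, p, DEQUEUE_SAVE | DEQUEUE_MOVE);
	if (running)
		put_prev_task(rq, p);

	/* change task group / affinity / sched class here */

	if (queued)
		enqueue_task(rq, p, ENQUEUE_RESTORE | ENQUEUE_MOVE);
	if (running)
		set_next_task(rq, p);
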
 kernel/sched/fair.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9e036f18d73e6..f26d53ac143fe 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5876,8 +5876,8 @@ static void throttle_cfs_rq_work(struct callback_head *work)

 	update_rq_clock(rq);
 	WARN_ON_ONCE(!list_empty(&p->throttle_node));
-	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
 	dequeue_task_fair(rq, p, DEQUEUE_SLEEP | DEQUEUE_SPECIAL);
+	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
 	resched_curr(rq);

 out_unlock:
@@ -5920,10 +5920,6 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
 	/* Re-enqueue the tasks that have been throttled at this level. */
 	list_for_each_entry_safe(p, tmp, &cfs_rq->throttled_limbo_list,
 				 throttle_node) {
 		list_del_init(&p->throttle_node);
-		/*
-		 * FIXME: p may not be allowed to run on this rq anymore
-		 * due to affinity change while p is throttled.
-		 */
 		enqueue_task_fair(rq_of(cfs_rq), p, ENQUEUE_WAKEUP);
 	}

@@ -7194,6 +7190,16 @@ static int dequeue_entities(struct rq *rq, struct sched_entity *se, int flags)
  */
 static bool dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 {
+	if (task_is_throttled(p)) {
+		/* sched/core wants to dequeue this throttled task. */
+		SCHED_WARN_ON(p->se.on_rq);
+		SCHED_WARN_ON(flags & DEQUEUE_SLEEP);
+
+		list_del_init(&p->throttle_node);
+
+		return true;
+	}
+
 	if (!(p->se.sched_delayed && (task_on_rq_migrating(p) || (flags & DEQUEUE_SAVE))))
 		util_est_dequeue(&rq->cfs, p);

-- 
2.39.5
Re: [RFC PATCH 5/7] sched/fair: Take care of group/affinity/sched_class change for throttled task
Posted by K Prateek Nayak 9 months, 1 week ago
Hello Aaron,

On 3/13/2025 12:51 PM, Aaron Lu wrote:
[..snip..]

> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5876,8 +5876,8 @@ static void throttle_cfs_rq_work(struct callback_head *work)
> 
>   	update_rq_clock(rq);
>   	WARN_ON_ONCE(!list_empty(&p->throttle_node));
> -	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
>   	dequeue_task_fair(rq, p, DEQUEUE_SLEEP | DEQUEUE_SPECIAL);
> +	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
>   	resched_curr(rq);

nit. Perhaps this bit can be moved to Patch 2 to consolidate all
changes in throttle_cfs_rq_work()

> 
>   out_unlock:

[..snip..]

-- 
Thanks and Regards,
Prateek
Re: [External] Re: [RFC PATCH 5/7] sched/fair: Take care of group/affinity/sched_class change for throttled task
Posted by Aaron Lu 9 months, 1 week ago
On Fri, Mar 14, 2025 at 10:21:15AM +0530, K Prateek Nayak wrote:
> Hello Aaron,
> 
> On 3/13/2025 12:51 PM, Aaron Lu wrote:
> [..snip..]
> 
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5876,8 +5876,8 @@ static void throttle_cfs_rq_work(struct callback_head *work)
> > 
> >   	update_rq_clock(rq);
> >   	WARN_ON_ONCE(!list_empty(&p->throttle_node));
> > -	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
> >   	dequeue_task_fair(rq, p, DEQUEUE_SLEEP | DEQUEUE_SPECIAL);
> > +	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
> >   	resched_curr(rq);
> 
> nit. Perhaps this bit can be moved to Patch 2 to consolidate all
> changes in throttle_cfs_rq_work()

No problem.

I placed it here to better illustrate why list_add() has to be done
after dequeue_task_fair().
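
To spell out the hazard (a sketch; this assumes task_is_throttled() is
keyed off a non-empty p->throttle_node — the actual check in the series
may differ):

	/* wrong order: */
	list_add(&p->throttle_node, &cfs_rq->throttled_limbo_list);
	/*
	 * dequeue_task_fair() would now see task_is_throttled(p), take
	 * the new early-return path, list_del_init() the node we just
	 * added and skip the real dequeue entirely.
	 */
	dequeue_task_fair(rq, p, DEQUEUE_SLEEP | DEQUEUE_SPECIAL);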

Thanks,
Aaron

> > 
> >   out_unlock:
> 
> [..snip..]
> 
> -- 
> Thanks and Regards,
> Prateek
>