[RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path

Aaron Lu posted 7 patches 9 months ago
There is a newer version of this series
Posted by Aaron Lu 9 months, 1 week ago
unthrottle_cfs_rq() can be called with !runtime_remaining, for example
when the user changes the quota setting (see tg_set_cfs_bandwidth()), or
when an async unthrottle gave us a positive runtime_remaining but other
still-running entities consumed that runtime before we got here.

Either way, we can't unthrottle this cfs_rq without any runtime remaining,
because a task enqueue during unthrottle can immediately trigger a throttle
via check_enqueue_throttle(), which should never happen.

Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
---
 kernel/sched/fair.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index be96f7d32998c..d646451d617c1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6058,6 +6058,19 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
 	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
 	struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];

+	/*
+	 * It's possible we are called with !runtime_remaining due to things
+	 * like user changed quota setting(see tg_set_cfs_bandwidth()) or async
+	 * unthrottled us with a positive runtime_remaining but other still
+	 * running entities consumed those runtime before we reach here.
+	 *
+	 * Anyway, we can't unthrottle this cfs_rq without any runtime remaining
+	 * because any enqueue below will immediately trigger a throttle, which
+	 * is not supposed to happen on unthrottle path.
+	 */
+	if (cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining)
+		return;
+
 	cfs_rq->throttled = 0;

 	update_rq_clock(rq);
-- 
2.39.5
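
The failure mode described in the commit message can be sketched in a small userspace model. This is only an illustration, not kernel code: struct toy_cfs_rq and every toy_* function are hypothetical names, and the enqueue check is reduced to "throttle when no runtime is left". The model uses the guard exactly as posted in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the bandwidth fields of struct cfs_rq (hypothetical). */
struct toy_cfs_rq {
	bool runtime_enabled;
	int64_t runtime_remaining;	/* signed, like the kernel's s64 field */
	bool throttled;
	unsigned int nr_throttles;	/* throttles triggered in this model */
};

/* Simplified model of the enqueue-time check: throttle when out of runtime. */
static void toy_check_enqueue_throttle(struct toy_cfs_rq *rq)
{
	if (!rq->runtime_enabled || rq->throttled)
		return;
	if (rq->runtime_remaining <= 0) {
		rq->throttled = true;
		rq->nr_throttles++;
	}
}

/* Unthrottle, optionally applying the guard as posted in the patch. */
static void toy_unthrottle(struct toy_cfs_rq *rq, bool guard)
{
	if (guard && rq->runtime_enabled && !rq->runtime_remaining)
		return;			/* stay throttled, nothing is enqueued */

	rq->throttled = false;
	toy_check_enqueue_throttle(rq);	/* models an enqueue during unthrottle */
}

/* Returns how many throttles a single unthrottle attempt triggered. */
static unsigned int toy_rethrottles(int64_t runtime, bool guard)
{
	struct toy_cfs_rq rq = {
		.runtime_enabled = true,
		.runtime_remaining = runtime,
		.throttled = true,
	};

	toy_unthrottle(&rq, guard);
	return rq.nr_throttles;
}
```

In this model, toy_rethrottles(0, false) reports one throttle triggered from inside the unthrottle path, while toy_rethrottles(0, true) reports none because the cfs_rq simply stays throttled.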
Re: [RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path
Posted by K Prateek Nayak 9 months, 1 week ago
Hello Aaron,

On 3/13/2025 12:52 PM, Aaron Lu wrote:
> It's possible unthrottle_cfs_rq() is called with !runtime_remaining
> due to things like user changed quota setting(see tg_set_cfs_bandwidth())
> or async unthrottled us with a positive runtime_remaining but other still
> running entities consumed those runtime before we reach there.
> 
> Anyway, we can't unthrottle this cfs_rq without any runtime remaining
> because task enqueue during unthrottle can immediately trigger a throttle
> by check_enqueue_throttle(), which should never happen.
> 
> Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
> ---
>   kernel/sched/fair.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index be96f7d32998c..d646451d617c1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6058,6 +6058,19 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
>   	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
>   	struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
> 
> +	/*
> +	 * It's possible we are called with !runtime_remaining due to things
> +	 * like user changed quota setting(see tg_set_cfs_bandwidth()) or async
> +	 * unthrottled us with a positive runtime_remaining but other still
> +	 * running entities consumed those runtime before we reach here.
> +	 *
> +	 * Anyway, we can't unthrottle this cfs_rq without any runtime remaining
> +	 * because any enqueue below will immediately trigger a throttle, which
> +	 * is not supposed to happen on unthrottle path.
> +	 */
> +	if (cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining)

Should this be "cfs_rq->runtime_remaining <= 0" since slack could have
built up by the time we get here?

> +		return;
> +
>   	cfs_rq->throttled = 0;
> 
>   	update_rq_clock(rq);

-- 
Thanks and Regards,
Prateek
Re: [External] Re: [RFC PATCH 7/7] sched/fair: Make sure cfs_rq has enough runtime_remaining on unthrottle path
Posted by Aaron Lu 9 months, 1 week ago
On Fri, Mar 14, 2025 at 09:48:00AM +0530, K Prateek Nayak wrote:
> Hello Aaron,
> 
> On 3/13/2025 12:52 PM, Aaron Lu wrote:
> > It's possible unthrottle_cfs_rq() is called with !runtime_remaining
> > due to things like user changed quota setting(see tg_set_cfs_bandwidth())
> > or async unthrottled us with a positive runtime_remaining but other still
> > running entities consumed those runtime before we reach there.
> > 
> > Anyway, we can't unthrottle this cfs_rq without any runtime remaining
> > because task enqueue during unthrottle can immediately trigger a throttle
> > by check_enqueue_throttle(), which should never happen.
> > 
> > Signed-off-by: Aaron Lu <ziqianlu@bytedance.com>
> > ---
> >   kernel/sched/fair.c | 13 +++++++++++++
> >   1 file changed, 13 insertions(+)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index be96f7d32998c..d646451d617c1 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6058,6 +6058,19 @@ void unthrottle_cfs_rq(struct cfs_rq *cfs_rq)
> >   	struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
> >   	struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
> > 
> > +	/*
> > +	 * It's possible we are called with !runtime_remaining due to things
> > +	 * like user changed quota setting(see tg_set_cfs_bandwidth()) or async
> > +	 * unthrottled us with a positive runtime_remaining but other still
> > +	 * running entities consumed those runtime before we reach here.
> > +	 *
> > +	 * Anyway, we can't unthrottle this cfs_rq without any runtime remaining
> > +	 * because any enqueue below will immediately trigger a throttle, which
> > +	 * is not supposed to happen on unthrottle path.
> > +	 */
> > +	if (cfs_rq->runtime_enabled && !cfs_rq->runtime_remaining)
> 
> Should this be "cfs_rq->runtime_remaining <= 0" since slack could have
> built up by that time we come here?

Absolutely!
Thanks for pointing this out.

Best regards,
Aaron

> > +		return;
> > +
> >   	cfs_rq->throttled = 0;
> > 
> >   	update_rq_clock(rq);
> 
> -- 
> Thanks and Regards,
> Prateek
>
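
The difference between the posted check and the one suggested in review can be shown with two small predicates. This is a sketch under the assumption that only the sign of runtime_remaining matters here; toy_guard_posted() and toy_guard_suggested() are hypothetical helper names, not kernel functions. Because runtime_remaining is a signed value, a negative balance is non-zero and therefore slips past the "!runtime_remaining" test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Guard as posted: only an exactly-zero balance blocks the unthrottle. */
static bool toy_guard_posted(bool runtime_enabled, int64_t runtime_remaining)
{
	return runtime_enabled && !runtime_remaining;
}

/* Guard as suggested in review: a zero or negative balance blocks it. */
static bool toy_guard_suggested(bool runtime_enabled, int64_t runtime_remaining)
{
	return runtime_enabled && runtime_remaining <= 0;
}
```

With a balance of -1000, toy_guard_posted() returns false and the unthrottle would proceed, while toy_guard_suggested() returns true and the cfs_rq stays throttled. Both return false when runtime_enabled is false or the balance is positive.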