[PATCH v2 13/15] sched/deadline: Make start_dl_timer callers more robust

Joel Fernandes (Google) posted 15 patches 1 year, 11 months ago
Posted by Joel Fernandes (Google) 1 year, 11 months ago
If, for whatever reason, start_dl_timer() returns 0 during replenish (it
did not start a new timer), do not mark dl_defer_armed, because we never
really armed the timer.

Further, we need to cancel any old timer that is still queued.

This is similar to what dl_check_constrained_dl() does.

Add some guardrails for such situations.
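
The intended behaviour can be sketched in userspace roughly as below. This is
only a simplified model, not the kernel code: struct model_dl_entity, the
model_*() helpers and the start_succeeds knob are hypothetical names made up
for illustration, and the real start_dl_timer() and hrtimer_try_to_cancel()
have more cases than this model covers.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct sched_dl_entity. */
struct model_dl_entity {
	bool timer_queued;	/* an old timer is still pending */
	bool defer_armed;	/* models dl_defer_armed */
	bool throttled;		/* models dl_throttled */
	bool start_succeeds;	/* illustration knob: can a new timer be started? */
};

/* Models start_dl_timer(): returns 1 if a new timer was started, 0 otherwise. */
static int model_start_timer(struct model_dl_entity *e)
{
	if (!e->start_succeeds)
		return 0;
	e->timer_queued = true;
	return 1;
}

/* Models hrtimer_try_to_cancel() in the simple case: drop a queued timer. */
static void model_cancel_timer(struct model_dl_entity *e)
{
	e->timer_queued = false;
}

/*
 * Models the deferred-replenish path with the guardrail applied.
 * Returns 1 if the entity ends up armed and throttled, 0 otherwise.
 */
static int model_replenish_defer(struct model_dl_entity *e)
{
	e->defer_armed = true;
	e->throttled = true;

	if (!model_start_timer(e)) {
		/* Never armed: cancel any stale timer and undo both flags. */
		model_cancel_timer(e);
		e->defer_armed = false;
		e->throttled = false;
		return 0;
	}
	return 1;
}
```

In this model, calling model_replenish_defer() on an entity whose timer cannot
be started leaves it neither armed nor throttled and with no timer queued,
which is the state the guardrail is meant to guarantee.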

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
 kernel/sched/deadline.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index dbba95d364e2..e978e299381c 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -918,7 +918,16 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se)
 		if (!is_dl_boosted(dl_se)) {
 			dl_se->dl_defer_armed = 1;
 			dl_se->dl_throttled = 1;
-			start_dl_timer(dl_se);
+			if (!start_dl_timer(dl_se)) {
+				/*
+				 * If, for whatever reason (delays), a previous timer was
+				 * queued but not serviced, cancel it.
+				 */
+				hrtimer_try_to_cancel(&dl_se->dl_timer);
+				dl_se->dl_defer_armed = 0;
+				dl_se->dl_throttled = 0;
+				return;
+			}
 		}
 	}
 }
@@ -1465,7 +1474,14 @@ static void update_curr_dl_se(struct rq *rq, struct sched_dl_entity *dl_se, s64
 		hrtimer_try_to_cancel(&dl_se->dl_timer);
 
 		replenish_dl_new_period(dl_se, dl_se->rq);
-		start_dl_timer(dl_se);
+
+		/*
+		 * Not being able to start the timer seems problematic. If it could not
+		 * be started for whatever reason, we need to "unthrottle" the DL server
+		 * and queue right away. Otherwise nothing might queue it. That's similar
+		 * to what enqueue_dl_entity() does when start_dl_timer() returns 0. For now, just warn.
+		 */
+		WARN_ON_ONCE(!start_dl_timer(dl_se));
 
 		return;
 	}
-- 
2.34.1
Re: [PATCH v2 13/15] sched/deadline: Make start_dl_timer callers more robust
Posted by Daniel Bristot de Oliveira 1 year, 10 months ago
On 3/13/24 02:24, Joel Fernandes (Google) wrote:
> For whatever reason, if start_dl_timer() returned 0 during replenish (it
> did not start a new timer), then do not marked dl_defer_armed, because
> we never really armed.
> 
> Further, we need to cancel any old timers,
> 
> This is similar to what dl_check_constrained_dl() does.
> 
> Add some guardrails for such situations.
> 
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Makes sense, added as part of the defer patch.

-- Daniel