[PATCH v9 11/32] timers: Rework idle logic

Anna-Maria Behnsen posted 32 patches 2 years ago
There is a newer version of this series
Posted by Anna-Maria Behnsen 2 years ago
From: Thomas Gleixner <tglx@linutronix.de>

To improve readability of the code, split the base->is_idle calculation and
the expires calculation into separate parts. While at it, update the comment
about timer base idle marking.

This introduces a subtle change if the next event is just one jiffy ahead
and the tick was already stopped: previously base->is_idle remained true in
this situation, now base->is_idle becomes false. This may spare an IPI if a
timer is enqueued remotely to an idle CPU that is going to tick on the next
jiffy.
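
The behaviour change described above can be checked with a small userspace
model. This is only a sketch, not kernel code: old_is_idle(), new_is_idle()
and check_one_jiffy_case() are hypothetical helpers, the comparison macros
are simplified copies of the kernel ones (without the type checks), and
TICK_NSEC assumes HZ=1000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified, wraparound-safe jiffies comparisons (no typecheck()). */
#define time_after(a, b)	((long)((b) - (a)) < 0)
#define time_before_eq(a, b)	((long)((b) - (a)) >= 0)

#define TICK_NSEC	1000000ULL	/* assumption: HZ=1000 */
#define KTIME_MAX_MODEL	INT64_MAX	/* stands in for KTIME_MAX */

/* Model of the removed logic: is_idle is only changed in two branches. */
bool old_is_idle(bool was_idle, bool timers_pending,
		 unsigned long nextevt, unsigned long basej)
{
	uint64_t basem = 0;	/* arbitrary base time for the model */
	uint64_t expires = KTIME_MAX_MODEL;
	bool is_idle = was_idle;

	if (time_before_eq(nextevt, basej)) {
		is_idle = false;
	} else {
		if (timers_pending)
			expires = basem + (uint64_t)(nextevt - basej) * TICK_NSEC;
		/* Exactly one tick away: is_idle keeps its previous value. */
		if ((expires - basem) > TICK_NSEC)
			is_idle = true;
	}
	return is_idle;
}

/* Model of the new logic: is_idle is always recomputed. */
bool new_is_idle(unsigned long nextevt, unsigned long basej)
{
	return time_after(nextevt, basej + 1);
}

/* The one-jiffy-ahead case with the tick already stopped. */
void check_one_jiffy_case(void)
{
	unsigned long basej = 1000;

	assert(old_is_idle(true, true, basej + 1, basej) == true);
	assert(new_is_idle(basej + 1, basej) == false);
}
```

In this model both variants agree when the next event is due now or more
than one jiffy away; they differ only when the base was already idle and
the next event is exactly one jiffy ahead.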

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
---
v9: Re-ordering to not hurt the eyes and update comment
v4: Change condition to force 0 delta and update commit message (Frederic)
---
 kernel/time/timer.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index fee42dda8237..0826018d9873 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1943,22 +1943,23 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 	 */
 	__forward_timer_base(base, basej);
 
-	if (time_before_eq(nextevt, basej)) {
-		expires = basem;
-		base->is_idle = false;
-	} else {
-		if (base->timers_pending)
-			expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
-		/*
-		 * If we expect to sleep more than a tick, mark the base idle.
-		 * Also the tick is stopped so any added timer must forward
-		 * the base clk itself to keep granularity small. This idle
-		 * logic is only maintained for the BASE_STD base, deferrable
-		 * timers may still see large granularity skew (by design).
-		 */
-		if ((expires - basem) > TICK_NSEC)
-			base->is_idle = true;
+	if (base->timers_pending) {
+		/* If we missed a tick already, force 0 delta */
+		if (time_before(nextevt, basej))
+			nextevt = basej;
+		expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
 	}
+
+	/*
+	 * Base is idle if the next event is more than a tick away.
+	 *
+	 * If the base is marked idle then any timer add operation must forward
+	 * the base clk itself to keep granularity small. This idle logic is
+	 * only maintained for the BASE_STD base, deferrable timers may still
+	 * see large granularity skew (by design).
+	 */
+	base->is_idle = time_after(nextevt, basej + 1);
+
 	trace_timer_base_idle(base->is_idle, base->cpu);
 	raw_spin_unlock(&base->lock);
 
-- 
2.39.2
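
The "force 0 delta" clamp in the hunk above matters because the jiffies
delta is computed with unsigned arithmetic. A minimal userspace sketch of
that step follows; next_expiry_ns() is a hypothetical helper name, the
macros are simplified copies of the kernel ones, and TICK_NSEC assumes
HZ=1000.

```c
#include <stdint.h>

/* Simplified, wraparound-safe jiffies comparisons (no typecheck()). */
#define time_after(a, b)	((long)((b) - (a)) < 0)
#define time_before(a, b)	time_after(b, a)

#define TICK_NSEC	1000000ULL	/* assumption: HZ=1000 */

/*
 * Convert the next event (in jiffies) to an absolute expiry time in
 * nanoseconds, relative to the pair (basej, basem).
 */
uint64_t next_expiry_ns(unsigned long nextevt, unsigned long basej,
			uint64_t basem)
{
	/*
	 * An event in the past would make (nextevt - basej) wrap to a
	 * huge unsigned value, so clamp it to "now" first.
	 */
	if (time_before(nextevt, basej))
		nextevt = basej;

	return basem + (uint64_t)(nextevt - basej) * TICK_NSEC;
}
```

With the clamp, an already missed event yields basem itself, so the caller
sees an expiry of "now" rather than a value far in the future.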
Re: [PATCH v9 11/32] timers: Rework idle logic
Posted by Frederic Weisbecker 2 years ago
On Fri, Dec 01, 2023 at 10:26:33AM +0100, Anna-Maria Behnsen wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> To improve readability of the code, split the base->is_idle calculation and
> the expires calculation into separate parts. While at it, update the comment
> about timer base idle marking.
> 
> This introduces a subtle change if the next event is just one jiffy ahead
> and the tick was already stopped: previously base->is_idle remained true in
> this situation, now base->is_idle becomes false. This may spare an IPI if a
> timer is enqueued remotely to an idle CPU that is going to tick on the next
> jiffy.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> ---
> v9: Re-ordering to not hurt the eyes and update comment
> v4: Change condition to force 0 delta and update commit message (Frederic)
> ---
>  kernel/time/timer.c | 31 ++++++++++++++++---------------
>  1 file changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index fee42dda8237..0826018d9873 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1943,22 +1943,23 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
>  	 */
>  	__forward_timer_base(base, basej);
>  
> -	if (time_before_eq(nextevt, basej)) {
> -		expires = basem;
> -		base->is_idle = false;
> -	} else {
> -		if (base->timers_pending)
> -			expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
> -		/*
> -		 * If we expect to sleep more than a tick, mark the base idle.
> -		 * Also the tick is stopped so any added timer must forward
> -		 * the base clk itself to keep granularity small. This idle
> -		 * logic is only maintained for the BASE_STD base, deferrable
> -		 * timers may still see large granularity skew (by design).
> -		 */
> -		if ((expires - basem) > TICK_NSEC)
> -			base->is_idle = true;
> +	if (base->timers_pending) {
> +		/* If we missed a tick already, force 0 delta */
> +		if (time_before(nextevt, basej))
> +			nextevt = basej;
> +		expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
>  	}
> +
> +	/*
> +	 * Base is idle if the next event is more than a tick away.
> +	 *
> +	 * If the base is marked idle then any timer add operation must forward
> +	 * the base clk itself to keep granularity small. This idle logic is
> +	 * only maintained for the BASE_STD base, deferrable timers may still
> +	 * see large granularity skew (by design).
> +	 */
> +	base->is_idle = time_after(nextevt, basej + 1);
> +

Much better, thanks! :-)
[tip: timers/core] timers: Rework idle logic
Posted by tip-bot2 for Thomas Gleixner 2 years ago
The following commit has been merged into the timers/core branch of tip:

Commit-ID:     bb8caad5083f8fbba70faf41f1d3bab7cf09da6d
Gitweb:        https://git.kernel.org/tip/bb8caad5083f8fbba70faf41f1d3bab7cf09da6d
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Fri, 01 Dec 2023 10:26:33 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 20 Dec 2023 16:49:39 +01:00

timers: Rework idle logic

To improve readability of the code, split the base->is_idle calculation and
the expires calculation into separate parts. While at it, update the comment
about timer base idle marking.

This introduces a subtle change if the next event is just one jiffy ahead
and the tick was already stopped: previously base->is_idle remained true in
this situation, now base->is_idle becomes false. This may spare an IPI if a
timer is enqueued remotely to an idle CPU that is going to tick on the next
jiffy.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20231201092654.34614-12-anna-maria@linutronix.de

---
 kernel/time/timer.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 1a73d39..cf51655 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1924,6 +1924,7 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 	struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
 	u64 expires = KTIME_MAX;
 	unsigned long nextevt;
+	bool was_idle;
 
 	/*
 	 * Pretend that there is no timer pending if the cpu is offline.
@@ -1943,27 +1944,26 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
 	 */
 	__forward_timer_base(base, basej);
 
-	if (time_before_eq(nextevt, basej)) {
-		expires = basem;
-		if (base->is_idle) {
-			base->is_idle = false;
-			trace_timer_base_idle(false, base->cpu);
-		}
-	} else {
-		if (base->timers_pending)
-			expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
-		/*
-		 * If we expect to sleep more than a tick, mark the base idle.
-		 * Also the tick is stopped so any added timer must forward
-		 * the base clk itself to keep granularity small. This idle
-		 * logic is only maintained for the BASE_STD base, deferrable
-		 * timers may still see large granularity skew (by design).
-		 */
-		if ((expires - basem) > TICK_NSEC && !base->is_idle) {
-			base->is_idle = true;
-			trace_timer_base_idle(true, base->cpu);
-		}
+	if (base->timers_pending) {
+		/* If we missed a tick already, force 0 delta */
+		if (time_before(nextevt, basej))
+			nextevt = basej;
+		expires = basem + (u64)(nextevt - basej) * TICK_NSEC;
 	}
+
+	/*
+	 * Base is idle if the next event is more than a tick away.
+	 *
+	 * If the base is marked idle then any timer add operation must forward
+	 * the base clk itself to keep granularity small. This idle logic is
+	 * only maintained for the BASE_STD base, deferrable timers may still
+	 * see large granularity skew (by design).
+	 */
+	was_idle = base->is_idle;
+	base->is_idle = time_after(nextevt, basej + 1);
+	if (was_idle != base->is_idle)
+		trace_timer_base_idle(base->is_idle, base->cpu);
+
 	raw_spin_unlock(&base->lock);
 
 	return cmp_next_hrtimer_event(basem, expires);
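
In the merged hunk above, the tracepoint is emitted only when is_idle
actually changes, using the was_idle local. A minimal userspace sketch of
that pattern follows; struct base_model, trace_idle_model() and
update_idle() are hypothetical names for illustration, and a counter
stands in for trace_timer_base_idle().

```c
#include <stdbool.h>

/* Simplified, wraparound-safe jiffies comparison (no typecheck()). */
#define time_after(a, b)	((long)((b) - (a)) < 0)

/* Hypothetical stand-in for the relevant parts of struct timer_base. */
struct base_model {
	bool is_idle;
	unsigned int trace_events;	/* counts emitted trace events */
};

/* Stand-in for trace_timer_base_idle(): just count the event. */
static void trace_idle_model(struct base_model *b)
{
	b->trace_events++;
}

/* Recompute is_idle unconditionally, trace only on a transition. */
void update_idle(struct base_model *b, unsigned long nextevt,
		 unsigned long basej)
{
	bool was_idle = b->is_idle;

	b->is_idle = time_after(nextevt, basej + 1);
	if (was_idle != b->is_idle)
		trace_idle_model(b);
}
```

Calling update_idle() repeatedly with a far-away next event traces once on
the first call only; a later call with the next event at most one jiffy
ahead traces the transition back to non-idle.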