[PATCH v1] cpuidle: governors: menu: Always check timers with tick stopped

Rafael J. Wysocki posted 1 patch 2 weeks, 4 days ago
drivers/cpuidle/governors/menu.c |   22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
[PATCH v1] cpuidle: governors: menu: Always check timers with tick stopped
Posted by Rafael J. Wysocki 2 weeks, 4 days ago
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

After commit 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length()
call in some cases"), if the return value of get_typical_interval()
multiplied by NSEC_PER_USEC is not greater than RESIDENCY_THRESHOLD_NS,
the menu governor will skip computing the time till the closest timer.
If that happens when the tick has been stopped already, the selected
idle state may be too deep due to the subsequent check comparing
predicted_ns with TICK_NSEC and causing its value to be replaced with
the expected time till the closest timer, which is KTIME_MAX in that
case.  That will cause the deepest enabled idle state to be selected,
but the time till the closest timer very well may be shorter than the
target residency of that state, in which case a shallower state should
be used.

Address this by making menu_select() always compute the time till the
closest timer when the tick has been stopped.

Also move the predicted_ns check mentioned above into the branch in
which the time till the closest timer is determined because it only
needs to be done in that case.

Fixes: 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/governors/menu.c |   22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -239,7 +239,7 @@ static int menu_select(struct cpuidle_dr
 
 	/* Find the shortest expected idle interval. */
 	predicted_ns = get_typical_interval(data) * NSEC_PER_USEC;
-	if (predicted_ns > RESIDENCY_THRESHOLD_NS) {
+	if (predicted_ns > RESIDENCY_THRESHOLD_NS || tick_nohz_tick_stopped()) {
 		unsigned int timer_us;
 
 		/* Determine the time till the closest timer. */
@@ -259,6 +259,16 @@ static int menu_select(struct cpuidle_dr
 				   RESOLUTION * DECAY * NSEC_PER_USEC);
 		/* Use the lowest expected idle interval to pick the idle state. */
 		predicted_ns = min((u64)timer_us * NSEC_PER_USEC, predicted_ns);
+		/*
+		 * If the tick is already stopped, the cost of possible short
+		 * idle duration misprediction is much higher, because the CPU
+		 * may be stuck in a shallow idle state for a long time as a
+		 * result of it.  In that case, say we might mispredict and use
+		 * the known time till the closest timer event for the idle
+		 * state selection.
+		 */
+		if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
+			predicted_ns = data->next_timer_ns;
 	} else {
 		/*
 		 * Because the next timer event is not going to be determined
@@ -285,16 +295,6 @@ static int menu_select(struct cpuidle_dr
 	}
 
 	/*
-	 * If the tick is already stopped, the cost of possible short idle
-	 * duration misprediction is much higher, because the CPU may be stuck
-	 * in a shallow idle state for a long time as a result of it.  In that
-	 * case, say we might mispredict and use the known time till the closest
-	 * timer event for the idle state selection.
-	 */
-	if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
-		predicted_ns = data->next_timer_ns;
-
-	/*
 	 * Find the idle state with the lowest power while satisfying
 	 * our constraints.
 	 */
RE: [PATCH v1] cpuidle: governors: menu: Always check timers with tick stopped
Posted by Doug Smythies 1 week, 3 days ago
On 2026.01.20 07:26 Rafael J. Wysocki wrote:
>From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> After commit 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length()
> call in some cases"), if the return value of get_typical_interval()
> multiplied by NSEC_PER_USEC is not greater than RESIDENCY_THRESHOLD_NS,
> the menu governor will skip computing the time till the closest timer.
> If that happens when the tick has been stopped already, the selected
> idle state may be too deep due to the subsequent check comparing
> predicted_ns with TICK_NSEC and causing its value to be replaced with
> the expected time till the closest timer, which is KTIME_MAX in that
> case.  That will cause the deepest enabled idle state to be selected,
> but the time till the closest timer very well may be shorter than the
> target residency of that state, in which case a shallower state should
> be used.
>
> Address this by making menu_select() always compute the time till the
> closest timer when the tick has been stopped.
>
> Also move the predicted_ns check mentioned above into the branch in
> which the time till the closest timer is determined because it only
> needs to be done in that case.
>
> Fixes: 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

... snip ...

I have been testing with this patch, and have not observed any difference
in test results and energy use with and without this patch.
A couple of graphs are attached.
I did this work on top of the teo work, so those results are on the graphs also.

Legend:
rc5 = kernel 6.19-rc5
rjw = kernel 6.19-rc5 + original teo 5 patch set
rjw-1-1 = kernel 6.19-rc5 + current teo 5 patch set
(note: Not including the most recent V2 of patch 5 of 5,
now a newer 2 patch set.)
menu = kernel 6.19-rc5
menu-rjw = kernel 6.19-rc5 + this patch.
disable = all idle states except state 0 disabled, as a reference plot, at the cost of energy.

Re: [PATCH v1] cpuidle: governors: menu: Always check timers with tick stopped
Posted by Rafael J. Wysocki 1 week, 2 days ago
On Wed, Jan 28, 2026 at 6:19 AM Doug Smythies <dsmythies@telus.net> wrote:
>
> On 2026.01.20 07:26 Rafael J. Wysocki wrote:
> >From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > After commit 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length()
> > call in some cases"), if the return value of get_typical_interval()
> > multiplied by NSEC_PER_USEC is not greater than RESIDENCY_THRESHOLD_NS,
> > the menu governor will skip computing the time till the closest timer.
> > If that happens when the tick has been stopped already, the selected
> > idle state may be too deep due to the subsequent check comparing
> > predicted_ns with TICK_NSEC and causing its value to be replaced with
> > the expected time till the closest timer, which is KTIME_MAX in that
> > case.  That will cause the deepest enabled idle state to be selected,
> > but the time till the closest timer very well may be shorter than the
> > target residency of that state, in which case a shallower state should
> > be used.
> >
> > Address this by making menu_select() always compute the time till the
> > closest timer when the tick has been stopped.
> >
> > Also move the predicted_ns check mentioned above into the branch in
> > which the time till the closest timer is determined because it only
> > needs to be done in that case.
> >
> > Fixes: 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> ... snip ...
>
> I have been testing with this patch, and have not observed any difference
> in test results and energy use with and without this patch.
> A couple of graphs are attached.
> I did this work on top of the teo work, so those results are on the graphs also.
>
> Legend:
> rc5 = kernel 6.19-rc5
> rjw = kernel 6.19-rc5 + original teo 5 patch set
> rjw-1-1 = kernel 6.19-rc5 + current teo 5 patch set
> (note: Not including the most recent V2 of patch 5 of 5,
> now a newer 2 patch set.)
> menu = kernel 6.19-rc5
> menu-rjw = kernel 6.19-rc5 + this patch.
> disable = all idle states except state 0 disabled, as a reference plot, at the cost of energy.

Thanks for the data!

I think it is most likely to make a difference for the setups
involving CPU isolation.
Re: [PATCH v1] cpuidle: governors: menu: Always check timers with tick stopped
Posted by Christian Loehle 2 weeks, 3 days ago
On 1/20/26 15:26, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> After commit 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length()
> call in some cases"), if the return value of get_typical_interval()
> multiplied by NSEC_PER_USEC is not greater than RESIDENCY_THRESHOLD_NS,
> the menu governor will skip computing the time till the closest timer.
> If that happens when the tick has been stopped already, the selected
> idle state may be too deep due to the subsequent check comparing
> predicted_ns with TICK_NSEC and causing its value to be replaced with
> the expected time till the closest timer, which is KTIME_MAX in that
> case.  That will cause the deepest enabled idle state to be selected,
> but the time till the closest timer very well may be shorter than the
> target residency of that state, in which case a shallower state should
> be used.
> 
> Address this by making menu_select() always compute the time till the
> closest timer when the tick has been stopped.
> 
> Also move the predicted_ns check mentioned above into the branch in
> which the time till the closest timer is determined because it only
> needs to be done in that case.
> 
> Fixes: 5484e31bbbff ("cpuidle: menu: Skip tick_nohz_get_sleep_length() call in some cases")
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpuidle/governors/menu.c |   22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -239,7 +239,7 @@ static int menu_select(struct cpuidle_dr
>  
>  	/* Find the shortest expected idle interval. */
>  	predicted_ns = get_typical_interval(data) * NSEC_PER_USEC;
> -	if (predicted_ns > RESIDENCY_THRESHOLD_NS) {
> +	if (predicted_ns > RESIDENCY_THRESHOLD_NS || tick_nohz_tick_stopped()) {
>  		unsigned int timer_us;
>  
>  		/* Determine the time till the closest timer. */
> @@ -259,6 +259,16 @@ static int menu_select(struct cpuidle_dr
>  				   RESOLUTION * DECAY * NSEC_PER_USEC);
>  		/* Use the lowest expected idle interval to pick the idle state. */
>  		predicted_ns = min((u64)timer_us * NSEC_PER_USEC, predicted_ns);
> +		/*
> +		 * If the tick is already stopped, the cost of possible short
> +		 * idle duration misprediction is much higher, because the CPU
> +		 * may be stuck in a shallow idle state for a long time as a
> +		 * result of it.  In that case, say we might mispredict and use
> +		 * the known time till the closest timer event for the idle
> +		 * state selection.
> +		 */
> +		if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
> +			predicted_ns = data->next_timer_ns;
>  	} else {
>  		/*
>  		 * Because the next timer event is not going to be determined
> @@ -285,16 +295,6 @@ static int menu_select(struct cpuidle_dr
>  	}
>  
>  	/*
> -	 * If the tick is already stopped, the cost of possible short idle
> -	 * duration misprediction is much higher, because the CPU may be stuck
> -	 * in a shallow idle state for a long time as a result of it.  In that
> -	 * case, say we might mispredict and use the known time till the closest
> -	 * timer event for the idle state selection.
> -	 */
> -	if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
> -		predicted_ns = data->next_timer_ns;
> -
> -	/*
>  	 * Find the idle state with the lowest power while satisfying
>  	 * our constraints.
>  	 */
> 


Seems I forgot about this, sorry!

Reviewed-by: Christian Loehle <christian.loehle@arm.com>