[PATCH v1] cpuidle: Warn instead of bailing out if target residency check fails

Rafael J. Wysocki posted 1 patch 6 days, 7 hours ago
drivers/cpuidle/driver.c |   18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
[PATCH v1] cpuidle: Warn instead of bailing out if target residency check fails
Posted by Rafael J. Wysocki 6 days, 7 hours ago
On Friday, November 21, 2025 2:10:57 PM CET Rafael J. Wysocki wrote:
> On Fri, Nov 21, 2025 at 2:08 AM Val Packett <val@packett.cool> wrote:
> >
> > On Device Tree platforms, the latency and target residency values come
> > directly from device trees, which are numerous and weren't all written
> > with cpuidle invariants in mind. For example, qcom/hamoa.dtsi currently
> > trips this check: exit latency 680000 > residency 600000.
> 
> So this breaks cpuidle expectations and it doesn't work correctly on
> the affected platforms.
> 
> > Instead of harshly rejecting the entire cpuidle driver with a mysterious
> > error message, print a warning and set the target residency value to be
> > equal to the exit latency.
> 
> This generally doesn't work because the new target residency may be
> greater than the target residency of the next state.
> 
> > Fixes: 76934e495cdc ("cpuidle: Add sanity check for exit latency and target residency")
> > Signed-off-by: Val Packett <val@packett.cool>
> > ---
> >  drivers/cpuidle/driver.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
> > index 1c295a93d582..06aeb59c1017 100644
> > --- a/drivers/cpuidle/driver.c
> > +++ b/drivers/cpuidle/driver.c
> > @@ -199,8 +199,11 @@ static int __cpuidle_driver_init(struct cpuidle_driver *drv)
> >                  * exceed its target residency which is assumed in cpuidle in
> >                  * multiple places.
> >                  */
> > -               if (s->exit_latency_ns > s->target_residency_ns)
> > -                       return -EINVAL;
> > +               if (s->exit_latency_ns > s->target_residency_ns) {
> > +                       pr_warn("cpuidle: state %d: exit latency %lld > residency %lld (fixing)\n",
> > +                               i, s->exit_latency_ns, s->target_residency_ns);
> > +                       s->target_residency_ns = s->exit_latency_ns;
> 
> And you also need to update s->target_residency.
> 
> Moreover, that needs to be done when all of the target residency and
> exit latency values have been computed and full sanitization of all
> the states would need to be done (including the ordering checks), but
> the kernel has insufficient information to do that (for instance, if
> the ordering is not as expected, it is not clear how to fix it up).
> Even the above sanitization is unlikely to result in the intended
> behavior.
> 
> So if returning the error code doesn't work, printing a warning is as
> much as can be done, like in the attached patch.
> 
> If this works for you, I'll submit it properly later.
> 

No response, so I assume no objections.

---
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It turns out that the change in commit 76934e495cdc ("cpuidle: Add
sanity check for exit latency and target residency") goes too far
because there are systems in the field on which the check introduced
by that commit does not pass.

For this reason, change __cpuidle_driver_init() return type back to void
and make it print a warning when the check mentioned above does not
pass.

Fixes: 76934e495cdc ("cpuidle: Add sanity check for exit latency and target residency")
Reported-by: Val Packett <val@packett.cool>
Closes: https://lore.kernel.org/linux-pm/20251121010756.6687-1-val@packett.cool/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/driver.c |   18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -8,6 +8,8 @@
  * This code is licenced under the GPL.
  */
 
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
 #include <linux/mutex.h>
 #include <linux/module.h>
 #include <linux/sched.h>
@@ -152,7 +154,7 @@ static void cpuidle_setup_broadcast_time
  * __cpuidle_driver_init - initialize the driver's internal data
  * @drv: a valid pointer to a struct cpuidle_driver
  */
-static int __cpuidle_driver_init(struct cpuidle_driver *drv)
+static void __cpuidle_driver_init(struct cpuidle_driver *drv)
 {
 	int i;
 
@@ -195,15 +197,13 @@ static int __cpuidle_driver_init(struct
 			s->exit_latency = div_u64(s->exit_latency_ns, NSEC_PER_USEC);
 
 		/*
-		 * Ensure that the exit latency of a CPU idle state does not
-		 * exceed its target residency which is assumed in cpuidle in
-		 * multiple places.
+		 * Warn if the exit latency of a CPU idle state exceeds its
+		 * target residency which is assumed to never happen in cpuidle
+		 * in multiple places.
 		 */
 		if (s->exit_latency_ns > s->target_residency_ns)
-			return -EINVAL;
+			pr_warn("Idle state %d target residency too low\n", i);
 	}
-
-	return 0;
 }
 
 /**
@@ -233,9 +233,7 @@ static int __cpuidle_register_driver(str
 	if (cpuidle_disabled())
 		return -ENODEV;
 
-	ret = __cpuidle_driver_init(drv);
-	if (ret)
-		return ret;
+	__cpuidle_driver_init(drv);
 
 	ret = __cpuidle_set_driver(drv);
 	if (ret)
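
For readers unfamiliar with the pr_fmt() line added at the top of the file, the following is a minimal userspace sketch of how the prefixing works. It is an illustration only: the "example" value of KBUILD_MODNAME is a hypothetical stand-in (in a real build kbuild supplies it), and format_warning() and has_prefix() are helper names invented here, not kernel functions. In the kernel, pr_warn() passes its format string through pr_fmt(), so adjacent string literals are concatenated at compile time.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in: in a kernel build, kbuild defines KBUILD_MODNAME. */
#define KBUILD_MODNAME "example"

/* Same definition as in the patch: prepend "<modname>: " to every format. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/*
 * Format the warning text from the patch into buf, the way pr_warn() would
 * after applying pr_fmt(). Returns the value returned by snprintf().
 */
static int format_warning(char *buf, size_t len, int state)
{
	assert(buf != NULL);
	return snprintf(buf, len,
			pr_fmt("Idle state %d target residency too low\n"),
			state);
}

/* Return 1 if msg starts with the "<modname>: " prefix, 0 otherwise. */
static int has_prefix(const char *msg)
{
	const char *prefix = KBUILD_MODNAME ": ";

	return strncmp(msg, prefix, strlen(prefix)) == 0;
}
```

Because pr_fmt() has to be defined before the first include that pulls in the printk helpers, the patch places it above the #include block.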
Re: [PATCH v1] cpuidle: Warn instead of bailing out if target residency check fails
Posted by Val Packett 5 days, 18 hours ago
On 11/25/25 1:23 PM, Rafael J. Wysocki wrote:
> No response, so I assume no objections. [..]

Right, only printing a warning is fine of course.

~val

Re: [PATCH v1] cpuidle: Warn instead of bailing out if target residency check fails
Posted by Christian Loehle 6 days, 6 hours ago
On 11/25/25 16:23, Rafael J. Wysocki wrote:
> No response, so I assume no objections.

FWIW I also prefer this to a weird fixing-up-states logic that we would never test!
Reviewed-by: Christian Loehle <christian.loehle@arm.com>
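
To illustrate the objection raised earlier in the thread (raising a state's target residency to its exit latency can push it above the target residency of the next state), here is a small userspace sketch. It is not kernel code: struct idle_state is a simplified, hypothetical stand-in for struct cpuidle_state, the helper names are invented for this example, and only the 680000/600000 pair comes from the report; the second state's numbers are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for struct cpuidle_state. */
struct idle_state {
	uint64_t exit_latency_ns;
	uint64_t target_residency_ns;
};

/*
 * Warn-only check, in the spirit of the patch: report each state whose
 * exit latency exceeds its target residency, but modify nothing.
 * Returns the number of warnings printed.
 */
static int warn_on_low_residency(const struct idle_state *s, size_t n)
{
	int warnings = 0;

	assert(s != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (s[i].exit_latency_ns > s[i].target_residency_ns) {
			fprintf(stderr,
				"Idle state %zu target residency too low\n", i);
			warnings++;
		}
	}
	return warnings;
}

/* The rejected alternative: raise the residency to the exit latency. */
static void clamp_residency(struct idle_state *s, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (s[i].exit_latency_ns > s[i].target_residency_ns)
			s[i].target_residency_ns = s[i].exit_latency_ns;
	}
}

/* Return 1 if target residencies never decrease from one state to the next. */
static int residency_ordered(const struct idle_state *s, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		if (s[i].target_residency_ns < s[i - 1].target_residency_ns)
			return 0;
	}
	return 1;
}
```

With states { 680000, 600000 } and { 500000, 650000 } (exit latency, target residency), the warn-only check flags state 0 and leaves the table ordered by residency. Clamping silences the warning but raises state 0's residency to 680000, above state 1's 650000, so one inconsistency is traded for another.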