[PATCH] cpufreq: cap the default transition delay at 10 ms

Shawn Guo posted 1 patch 3 weeks, 1 day ago
[PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Shawn Guo 3 weeks, 1 day ago
From: Shawn Guo <shawnguo@kernel.org>

A regression is seen after the 6.6 -> 6.12 kernel upgrade on platforms where
the cpufreq-dt driver sets cpuinfo.transition_latency to CPUFREQ_ETERNAL (-1)
because the platform's DT doesn't provide the optional property
'clock-latency-ns'.  The dbs sampling_rate was 10000 us on 6.6 and
suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
platforms, because the 10 ms cap for transition_delay_us was
accidentally dropped by the commits below.

  commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
  commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
  commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
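
The 6442450 us figure can be reproduced with a small user-space model of
the uncapped computation.  This is only a sketch: the macros below are
local stand-ins for the kernel definitions, and the function name
transition_delay_us_uncapped() is made up for illustration.

```c
#include <stdint.h>

/* Local stand-ins for the kernel macros referenced in this thread. */
#define NSEC_PER_USEC   1000U
#define USEC_PER_MSEC   1000U
#define CPUFREQ_ETERNAL ((uint32_t)-1)	/* "latency unknown" marker */

/*
 * Hypothetical model of the 6.12 behaviour described above: convert the
 * latency to microseconds and add 50% headroom, with no upper bound.
 */
static uint32_t transition_delay_us_uncapped(uint32_t transition_latency_ns)
{
	uint32_t latency = transition_latency_ns / NSEC_PER_USEC;

	if (latency)
		return latency + (latency >> 1);

	/* A latency below 1 us falls back to 1 ms. */
	return USEC_PER_MSEC;
}
```

Feeding CPUFREQ_ETERNAL into this model gives 4294967 + 2147483 =
6442450 us, i.e. roughly 6.4 seconds between frequency re-evaluations.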

This dramatically slows down the dbs governor's reaction to CPU load
changes.  Also, as transition_delay_us is used by the schedutil governor
as rate_limit_us, it has a negative impact on device idle power
consumption, because the device spends slightly less time in the lowest OPP.

Fix the regressions by adding the 10 ms cap on transition delay back.

Cc: stable@vger.kernel.org
Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
---
 drivers/cpufreq/cpufreq.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fc7eace8b65b..36e0c85cb4e0 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -551,8 +551,13 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
 
 	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
 	if (latency)
-		/* Give a 50% breathing room between updates */
-		return latency + (latency >> 1);
+		/*
+		 * Give a 50% breathing room between updates.
+		 * And cap the transition delay to 10 ms for platforms
+		 * where the latency is too high to be reasonable for
+		 * reevaluating frequency.
+		 */
+		return min(latency + (latency >> 1), 10 * MSEC_PER_SEC);
 
 	return USEC_PER_MSEC;
 }
-- 
2.43.0
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Rafael J. Wysocki 2 weeks, 6 days ago
On Wed, Sep 10, 2025 at 8:53 AM Shawn Guo <shawnguo2@yeah.net> wrote:
>
> From: Shawn Guo <shawnguo@kernel.org>
>
> A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where
> cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1),
> due to that platform's DT doesn't provide the optional property
> 'clock-latency-ns'.  The dbs sampling_rate was 10000 us on 6.6 and
> suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
> platforms, because that the 10 ms cap for transition_delay_us was
> accidentally dropped by the commits below.

IIRC, this was not accidental.

Why do you want to address the issue in the cpufreq core instead of
doing that in the cpufreq-dt driver?

CPUFREQ_ETERNAL doesn't appear to be a reasonable default for
cpuinfo.transition_latency.  Maybe just change the default there to 10
ms?

>   commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
>   commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
>   commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
>
> It slows down dbs governor's reacting to CPU loading change
> dramatically.  Also, as transition_delay_us is used by schedutil governor
> as rate_limit_us, it shows a negative impact on device idle power
> consumption, because the device gets slightly less time in the lowest OPP.
>
> Fix the regressions by adding the 10 ms cap on transition delay back.
>
> Cc: stable@vger.kernel.org
> Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
> ---
>  drivers/cpufreq/cpufreq.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index fc7eace8b65b..36e0c85cb4e0 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -551,8 +551,13 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
>
>         latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
>         if (latency)
> -               /* Give a 50% breathing room between updates */
> -               return latency + (latency >> 1);
> +               /*
> +                * Give a 50% breathing room between updates.
> +                * And cap the transition delay to 10 ms for platforms
> +                * where the latency is too high to be reasonable for
> +                * reevaluating frequency.
> +                */
> +               return min(latency + (latency >> 1), 10 * MSEC_PER_SEC);
>
>         return USEC_PER_MSEC;
>  }
> --
> 2.43.0
>
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Shawn Guo 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 12:41:14PM +0200, Rafael J. Wysocki wrote:
> On Wed, Sep 10, 2025 at 8:53 AM Shawn Guo <shawnguo2@yeah.net> wrote:
> >
> > From: Shawn Guo <shawnguo@kernel.org>
> >
> > A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where
> > cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1),
> > due to that platform's DT doesn't provide the optional property
> > 'clock-latency-ns'.  The dbs sampling_rate was 10000 us on 6.6 and
> > suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
> > platforms, because that the 10 ms cap for transition_delay_us was
> > accidentally dropped by the commits below.
> 
> IIRC, this was not accidental.

I could be wrong, but my understanding is that the intention of Qais's
commits is to drop 10 ms (and LATENCY_MULTIPLIER) as the *minimum* limit
on transition_delay_us, so that it's possible to get a much smaller
transition_delay_us on platforms like the M1 Mac mini where the transition
latency is just tens of us.  But it breaks platforms where 10 ms used
to be the *maximum* limit.

Even if removing 10 ms as both the minimum and maximum limit was
intentional, breaking some platforms surely was not, I guess :)

> Why do you want to address the issue in the cpufreq core instead of
> doing that in the cpufreq-dt driver?

My intuition was to fix the regression where it was introduced, by
restoring the previous code behavior.

> CPUFREQ_ETERNAL doesn't appear to be a reasonable default for
> cpuinfo.transition_latency.  Maybe just change the default there to 10
> ms?

I think cpufreq-dt is doing what it's asked to do, no?

 /*
  * Maximum transition latency is in nanoseconds - if it's unknown,
  * CPUFREQ_ETERNAL shall be used.
  */

Also, 10 ms will then be turned into 15 ms by:

	/* Give a 50% breathing room between updates */
	return latency + (latency >> 1);

Shawn

Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Qais Yousef 2 weeks, 4 days ago
On 09/12/25 21:07, Shawn Guo wrote:
> On Fri, Sep 12, 2025 at 12:41:14PM +0200, Rafael J. Wysocki wrote:
> > On Wed, Sep 10, 2025 at 8:53 AM Shawn Guo <shawnguo2@yeah.net> wrote:
> > >
> > > From: Shawn Guo <shawnguo@kernel.org>
> > >
> > > A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where
> > > cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1),
> > > due to that platform's DT doesn't provide the optional property
> > > 'clock-latency-ns'.  The dbs sampling_rate was 10000 us on 6.6 and
> > > suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
> > > platforms, because that the 10 ms cap for transition_delay_us was
> > > accidentally dropped by the commits below.
> > 
> > IIRC, this was not accidental.
> 
> I could be wrong, but my understanding is that the intention of Qais's
> commits is to drop 10 ms (and LATENCY_MULTIPLIER) as the *minimal* limit
> on transition_delay_us, so that it's possible to get a much less
> transition_delay_us on platforms like M1 mac mini where the transition
> latency is just tens of us.  But it breaks platforms where 10 ms used
> to be the *maximum* limit.
> 
> Even if it's intentional to remove 10 ms as both the minimal and maximum
> limits, breaking some platforms must not be intentional, I guess :)

These limits were arbitrary. The limit was initially reduced to 2 ms but was
then dropped to avoid making assumptions, as they are all arbitrary.

> 
> > Why do you want to address the issue in the cpufreq core instead of
> > doing that in the cpufreq-dt driver?
> 
> My intuition was to fix the regression at where the regression was
> introduced by recovering the code behavior.

Isn't the right fix here still at the driver level? We can only give drivers
what they ask for. If they ask for something wrong and that results in
something wrong, it is still their fault, no?

Alternatively, maybe we can add special handling for the CPUFREQ_ETERNAL value,
though I'd suggest returning 1 ms (similar to the case of the value being 0).
Maybe we can redefine CPUFREQ_ETERNAL to be 0, but I'm not sure if this can
have side effects.


Thanks

--
Qais Yousef
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Shawn Guo 2 weeks, 3 days ago
On Sun, Sep 14, 2025 at 06:43:26PM +0100, Qais Yousef wrote:
> > > Why do you want to address the issue in the cpufreq core instead of
> > > doing that in the cpufreq-dt driver?
> > 
> > My intuition was to fix the regression at where the regression was
> > introduced by recovering the code behavior.
> 
> Isn't the right fix here is at the driver level still? We can only give drivers
> what they ask for. If they ask for something wrong and result in something
> wrong, it is still their fault, no?

I'm not sure.  The cpufreq-dt driver is following the suggestion to use
CPUFREQ_ETERNAL, which implies that the core will figure out
a reasonable default value for platforms where the latency is unknown.
And that was exactly the situation before the regression.  How does it
become the fault of the cpufreq-dt driver?

> Alternatively maybe we can add special handling for CPUFREQ_ETERNAL value,
> though I'd suggest to return 1ms (similar to the case of value being 0). Maybe
> we can redefine CPUFREQ_ETERNAL to be 0, but not sure if this can have side
> effects.

Changing CPUFREQ_ETERNAL to 0 looks quite risky to me.  What about adding
an explicit check for CPUFREQ_ETERNAL?

---8<---

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index fc7eace8b65b..053f3a0288bc 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -549,11 +549,15 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
        if (policy->transition_delay_us)
                return policy->transition_delay_us;
 
+       if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
+               goto default_delay;
+
        latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
        if (latency)
                /* Give a 50% breathing room between updates */
                return latency + (latency >> 1);
 
+default_delay:
        return USEC_PER_MSEC;
 }
 EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us);

--->8---

Shawn
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Rafael J. Wysocki 2 weeks, 3 days ago
On Mon, Sep 15, 2025 at 9:29 AM Shawn Guo <shawnguo2@yeah.net> wrote:
>
> On Sun, Sep 14, 2025 at 06:43:26PM +0100, Qais Yousef wrote:
> > > > Why do you want to address the issue in the cpufreq core instead of
> > > > doing that in the cpufreq-dt driver?
> > >
> > > My intuition was to fix the regression at where the regression was
> > > introduced by recovering the code behavior.
> >
> > Isn't the right fix here is at the driver level still? We can only give drivers
> > what they ask for. If they ask for something wrong and result in something
> > wrong, it is still their fault, no?
>
> I'm not sure.  The cpufreq-dt driver is following suggestion to use
> CPUFREQ_ETERNAL,

Fair enough.

Actually, there are a few other drivers that fall back to
CPUFREQ_ETERNAL if they cannot determine transition_latency.

> which has the implication that core will figure out a reasonable default value for
> platforms where the latency is unknown.

Is this expectation realistic, though?  I'm not sure.

The core can only use a hard-coded default fallback number, but would
that number be really suitable for all of the platforms in question?

> And that was exactly the situation before the regression.  How does it
> become the fault of cpufreq-dt driver?

The question is not about whose fault it is, but what's the best place
to address this issue.

I think that addressing it in cpufreq_policy_transition_delay_us() is
a bit confusing because it is related to initialization and the new
branch becomes pure overhead for the drivers that don't set
cpuinfo.transition_latency to CPUFREQ_ETERNAL.

However, addressing it at the initialization time would effectively
mean that the core would do something like:

if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
        policy->cpuinfo.transition_latency =
CPUFREQ_DEFAULT_TANSITION_LATENCY_NS;

but then it would be kind of more straightforward to update everybody
using CPUFREQ_ETERNAL to set cpuinfo.transition_latency to
CPUFREQ_DEFAULT_TANSITION_LATENCY_NS directly (and then get rid of
CPUFREQ_ETERNAL entirely).

> > Alternatively maybe we can add special handling for CPUFREQ_ETERNAL value,
> > though I'd suggest to return 1ms (similar to the case of value being 0). Maybe
> > we can redefine CPUFREQ_ETERNAL to be 0, but not sure if this can have side
> > effects.
>
> Changing CPUFREQ_ETERNAL to 0 looks so risky to me.  What about adding
> an explicit check for CPUFREQ_ETERNAL?
>
> ---8<---
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index fc7eace8b65b..053f3a0288bc 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -549,11 +549,15 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
>         if (policy->transition_delay_us)
>                 return policy->transition_delay_us;
>
> +       if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
> +               goto default_delay;

Can't USEC_PER_MSEC be just returned directly from here?

> +
>         latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
>         if (latency)
>                 /* Give a 50% breathing room between updates */
>                 return latency + (latency >> 1);

Side note for self: The computation above can be done once at the
policy initialization time and transition_latency can be stored in us
(and only converted to ns when the corresponding sysfs attribute is
read).  It can be even set to USEC_PER_MSEC if zero.

> +default_delay:
>         return USEC_PER_MSEC;
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us);
>
> --->8---
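
The direct return asked about above can be sketched in user space as
follows.  This is a model under stated assumptions, not the kernel code:
struct policy_model and its two fields are hypothetical stand-ins for the
relevant members of struct cpufreq_policy, and the macros are local copies.

```c
#include <stdint.h>

/* Local stand-ins for the kernel macros referenced in this thread. */
#define NSEC_PER_USEC   1000U
#define USEC_PER_MSEC   1000U
#define CPUFREQ_ETERNAL ((uint32_t)-1)	/* "latency unknown" marker */

/* Hypothetical, reduced stand-in for struct cpufreq_policy. */
struct policy_model {
	uint32_t transition_delay_us;	/* explicit delay set by the driver, 0 if unset */
	uint32_t transition_latency_ns;	/* models cpuinfo.transition_latency */
};

static uint32_t policy_transition_delay_us(const struct policy_model *policy)
{
	uint32_t latency;

	/* An explicit driver-provided delay always wins. */
	if (policy->transition_delay_us)
		return policy->transition_delay_us;

	/* Unknown latency: return the 1 ms default directly, no goto needed. */
	if (policy->transition_latency_ns == CPUFREQ_ETERNAL)
		return USEC_PER_MSEC;

	latency = policy->transition_latency_ns / NSEC_PER_USEC;
	if (latency)
		/* Give a 50% breathing room between updates */
		return latency + (latency >> 1);

	return USEC_PER_MSEC;
}
```

With this shape a policy reporting CPUFREQ_ETERNAL gets 1000 us, the same
result as a policy reporting a latency of 0.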
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Shawn Guo 2 weeks, 2 days ago
On Mon, Sep 15, 2025 at 03:18:44PM +0200, Rafael J. Wysocki wrote:
> The question is not about whose fault it is, but what's the best place
> to address this issue.
> 
> I think that addressing it in cpufreq_policy_transition_delay_us() is
> a bit confusing because it is related to initialization and the new
> branch becomes pure overhead for the drivers that don't set
> cpuinfo.transition_latency to CPUFREQ_ETERNAL.
> 
> However, addressing it at the initialization time would effectively
> mean that the core would do something like:
> 
> if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
>         policy->cpuinfo.transition_latency =
> CPUFREQ_DEFAULT_TANSITION_LATENCY_NS;
> 
> but then it would be kind of more straightforward to update everybody
> using CPUFREQ_ETERNAL to set cpuinfo.transition_latency to
> CPUFREQ_DEFAULT_TANSITION_LATENCY_NS directly (and then get rid of
> CPUFREQ_ETERNAL entirely).

So we fix the regression with an immediate change like below, and then
plan to remove CPUFREQ_ETERNAL entirely with another development series.
Do I get you right?

---8<---

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index d873ff9add49..e37722ce7aec 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -574,6 +574,10 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
        if (policy->transition_delay_us)
                return policy->transition_delay_us;
 
+       if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
+               policy->cpuinfo.transition_latency =
+                       CPUFREQ_DEFAULT_TANSITION_LATENCY_NS;
+
        latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
        if (latency)
                /*
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 7fe0981a7e46..7331bc06f161 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -36,6 +36,8 @@
 /* Print length for names. Extra 1 space for accommodating '\n' in prints */
 #define CPUFREQ_NAME_PLEN              (CPUFREQ_NAME_LEN + 1)
 
+#define CPUFREQ_DEFAULT_TANSITION_LATENCY_NS   NSEC_PER_MSEC
+
 struct cpufreq_governor;
 
 enum cpufreq_table_sorting {

--->8---
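
For reference, the effect of the fallback in the diff above can be
modelled in user space.  This is a sketch only: the macros are local
stand-ins, the constant keeps the spelling used in this thread, and the
helper delay_us_with_default_latency() is a made-up name.  Unlike the diff,
the model does not write the substituted value back into the policy.

```c
#include <stdint.h>

/* Local stand-ins for the kernel macros referenced in this thread. */
#define NSEC_PER_USEC   1000U
#define USEC_PER_MSEC   1000U
#define NSEC_PER_MSEC   1000000U
#define CPUFREQ_ETERNAL ((uint32_t)-1)	/* "latency unknown" marker */

/* Default proposed in the diff above (spelling as posted). */
#define CPUFREQ_DEFAULT_TANSITION_LATENCY_NS	NSEC_PER_MSEC

/* Hypothetical helper: substitute the default latency, then add headroom. */
static uint32_t delay_us_with_default_latency(uint32_t transition_latency_ns)
{
	uint32_t latency;

	if (transition_latency_ns == CPUFREQ_ETERNAL)
		transition_latency_ns = CPUFREQ_DEFAULT_TANSITION_LATENCY_NS;

	latency = transition_latency_ns / NSEC_PER_USEC;
	if (latency)
		/* Give a 50% breathing room between updates */
		return latency + (latency >> 1);

	return USEC_PER_MSEC;
}
```

One consequence worth noting: because the 50% headroom is applied after
the substitution, a 1 ms default latency yields a 1500 us delay for
CPUFREQ_ETERNAL, whereas a latency of 0 still yields 1000 us.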

Shawn
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Qais Yousef 2 weeks, 3 days ago
On 09/15/25 15:29, Shawn Guo wrote:
> On Sun, Sep 14, 2025 at 06:43:26PM +0100, Qais Yousef wrote:
> > > > Why do you want to address the issue in the cpufreq core instead of
> > > > doing that in the cpufreq-dt driver?
> > > 
> > > My intuition was to fix the regression at where the regression was
> > > introduced by recovering the code behavior.
> > 
> > Isn't the right fix here is at the driver level still? We can only give drivers
> > what they ask for. If they ask for something wrong and result in something
> > wrong, it is still their fault, no?
> 
> I'm not sure.  The cpufreq-dt driver is following suggestion to use
> CPUFREQ_ETERNAL, which has the implication that core will figure out
> a reasonable default value for platforms where the latency is unknown.
> And that was exactly the situation before the regression.  How does it
> become the fault of cpufreq-dt driver?

Rafael and Viresh would know better, but amd-pstate chooses to fall back to
specific values if CPPC returns CPUFREQ_ETERNAL.

Have you tried to look at why dev_pm_opp_get_max_transition_latency() returns
0 for your platform? I think this is the problem that was being masked before.

> 
> > Alternatively maybe we can add special handling for CPUFREQ_ETERNAL value,
> > though I'd suggest to return 1ms (similar to the case of value being 0). Maybe
> > we can redefine CPUFREQ_ETERNAL to be 0, but not sure if this can have side
> > effects.
> 
> Changing CPUFREQ_ETERNAL to 0 looks so risky to me.  What about adding
> an explicit check for CPUFREQ_ETERNAL?

Yeah, this is what I had in mind. I think treating CPUFREQ_ETERNAL like 0,
where we don't know the right value and end up with a sensible default, makes
sense to me.

I think printing an info/warn message that the driver is not specifying the
actual hardware transition delay would be helpful for admins. A driver or DT
file likely needs to be updated.
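
A rough user-space sketch of that idea follows.  Everything here is an
assumption for illustration: the function name delay_us_or_warn(), the
message text, and the use of fprintf() where kernel code would presumably
use something like pr_warn_once().

```c
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for the kernel macros referenced in this thread. */
#define NSEC_PER_USEC   1000U
#define USEC_PER_MSEC   1000U
#define CPUFREQ_ETERNAL ((uint32_t)-1)	/* "latency unknown" marker */

/*
 * Hypothetical helper: treat CPUFREQ_ETERNAL like 0 (fall back to 1 ms)
 * and tell the admin that the driver or DT lacks a real latency value.
 */
static uint32_t delay_us_or_warn(uint32_t transition_latency_ns,
				 const char *driver_name)
{
	uint32_t latency;

	if (transition_latency_ns == CPUFREQ_ETERNAL) {
		fprintf(stderr,
			"cpufreq: %s reports no transition latency, using %u us\n",
			driver_name, USEC_PER_MSEC);
		return USEC_PER_MSEC;
	}

	latency = transition_latency_ns / NSEC_PER_USEC;
	if (latency)
		/* Give a 50% breathing room between updates */
		return latency + (latency >> 1);

	return USEC_PER_MSEC;
}
```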

Better hear from Rafael first to make sure it makes sense to him too.

> 
> ---8<---
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index fc7eace8b65b..053f3a0288bc 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -549,11 +549,15 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
>         if (policy->transition_delay_us)
>                 return policy->transition_delay_us;
>  
> +       if (policy->cpuinfo.transition_latency == CPUFREQ_ETERNAL)
> +               goto default_delay;
> +
>         latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
>         if (latency)
>                 /* Give a 50% breathing room between updates */
>                 return latency + (latency >> 1);
>  
> +default_delay:
>         return USEC_PER_MSEC;
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_policy_transition_delay_us);
> 
> --->8---
> 
> Shawn
>
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Shawn Guo 2 weeks, 3 days ago
On Mon, Sep 15, 2025 at 11:02:07AM +0100, Qais Yousef wrote:
> On 09/15/25 15:29, Shawn Guo wrote:
> > On Sun, Sep 14, 2025 at 06:43:26PM +0100, Qais Yousef wrote:
> > > > > Why do you want to address the issue in the cpufreq core instead of
> > > > > doing that in the cpufreq-dt driver?
> > > > 
> > > > My intuition was to fix the regression at where the regression was
> > > > introduced by recovering the code behavior.
> > > 
> > > Isn't the right fix here is at the driver level still? We can only give drivers
> > > what they ask for. If they ask for something wrong and result in something
> > > wrong, it is still their fault, no?
> > 
> > I'm not sure.  The cpufreq-dt driver is following suggestion to use
> > CPUFREQ_ETERNAL, which has the implication that core will figure out
> > a reasonable default value for platforms where the latency is unknown.
> > And that was exactly the situation before the regression.  How does it
> > become the fault of cpufreq-dt driver?
> 
> Rafael and Viresh would know better, but amd-pstate chooses to fallback to
> specific values if cppc returned CPUFREQ_ETERNAL.
> 
> Have you tried to look why dev_pm_opp_get_max_transition_latency() returns
> 0 for your platform? I think this is the problem that was being masked before.

My platform doesn't scale voltage along with frequency, and the platform
DT doesn't specify 'clock-latency-ns' which is an optional property
after all.

Shawn
Re: [PATCH] cpufreq: cap the default transition delay at 10 ms
Posted by Shawn Guo 3 weeks, 1 day ago
On Wed, Sep 10, 2025 at 02:53:12PM +0800, Shawn Guo wrote:
> From: Shawn Guo <shawnguo@kernel.org>
> 
> A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where
> cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1),
> due to that platform's DT doesn't provide the optional property
> 'clock-latency-ns'.  The dbs sampling_rate was 10000 us on 6.6 and
> suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these
> platforms, because that the 10 ms cap for transition_delay_us was
> accidentally dropped by the commits below.
> 
>   commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
>   commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us")
>   commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
> 
> It slows down dbs governor's reacting to CPU loading change
> dramatically.  Also, as transition_delay_us is used by schedutil governor
> as rate_limit_us, it shows a negative impact on device idle power
> consumption, because the device gets slightly less time in the lowest OPP.
> 
> Fix the regressions by adding the 10 ms cap on transition delay back.
> 
> Cc: stable@vger.kernel.org
> Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER")
> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
> ---
>  drivers/cpufreq/cpufreq.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index fc7eace8b65b..36e0c85cb4e0 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -551,8 +551,13 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
>  
>  	latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
>  	if (latency)
> -		/* Give a 50% breathing room between updates */
> -		return latency + (latency >> 1);
> +		/*
> +		 * Give a 50% breathing room between updates.
> +		 * And cap the transition delay to 10 ms for platforms
> +		 * where the latency is too high to be reasonable for
> +		 * reevaluating frequency.
> +		 */
> +		return min(latency + (latency >> 1), 10 * MSEC_PER_SEC);

I guess it's more correct to use USEC_PER_MSEC instead, even if both
have the value 1000.  Will fix in v2.
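
A user-space sketch of the cap with the unit spelled as microseconds per
millisecond is shown here.  The macros are local stand-ins for the kernel
ones, and min_u32() and capped_transition_delay_us() are made-up names;
this is not the v2 patch itself.

```c
#include <stdint.h>

/* Local stand-ins for the kernel macros referenced in this thread. */
#define NSEC_PER_USEC   1000U
#define USEC_PER_MSEC   1000U
#define MSEC_PER_SEC    1000U
#define CPUFREQ_ETERNAL ((uint32_t)-1)	/* "latency unknown" marker */

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Hypothetical model of the capped computation from the patch. */
static uint32_t capped_transition_delay_us(uint32_t transition_latency_ns)
{
	uint32_t latency = transition_latency_ns / NSEC_PER_USEC;

	if (latency)
		/*
		 * 50% headroom, capped at 10 ms.  The result is in
		 * microseconds, so 10 * USEC_PER_MSEC states the unit
		 * correctly; 10 * MSEC_PER_SEC has the same numeric value.
		 */
		return min_u32(latency + (latency >> 1), 10 * USEC_PER_MSEC);

	return USEC_PER_MSEC;
}
```

In this model CPUFREQ_ETERNAL is clamped to 10000 us, matching the 6.6
sampling_rate quoted in the commit message.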

Shawn

>  
>  	return USEC_PER_MSEC;
>  }
> -- 
> 2.43.0
>