[PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT

Krzysztof Kozlowski posted 1 patch 2 years, 7 months ago
[PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Krzysztof Kozlowski 2 years, 7 months ago
The runtime Power Management of CPU topology is not compatible with
PREEMPT_RT:
1. Core cpuidle path disables IRQs.
2. Core cpuidle calls cpuidle-psci.
3. cpuidle-psci in __psci_enter_domain_idle_state() calls
   pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
   spinlocks (which are sleeping on PREEMPT_RT).

Deep sleep modes are not a priority of Realtime kernels because the
latencies might become unpredictable.  On the other hand the PSCI CPU
idle power domain is a parent of other devices and power domain
controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).

Disable the idle callbacks in cpuidle-psci and mark the domain as
always on.  This is a trade-off between making PREEMPT_RT work and
still having a proper power domain hierarchy in the system.

Cc: Adrien Thierry <athierry@redhat.com>
Cc: Brian Masney <bmasney@redhat.com>
Cc: linux-rt-users@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

---

Changes since v3:
1. Rework - disable idle states, mark as always on (Ulf).
2. Extend Kconfig warning (Ulf).

Changes since v1:
1. Re-work commit msg.
2. Add note to Kconfig.

Several other patches were dropped, as this is the only one actually
needed.  It effectively stops PSCI cpuidle power domains from suspending
thus solving all other issues I experienced.
---
 drivers/cpuidle/Kconfig.arm           | 8 ++++++++
 drivers/cpuidle/cpuidle-psci-domain.c | 7 +++++--
 drivers/cpuidle/cpuidle-psci.c        | 3 +++
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm
index 747aa537389b..8deaa2e05206 100644
--- a/drivers/cpuidle/Kconfig.arm
+++ b/drivers/cpuidle/Kconfig.arm
@@ -24,6 +24,14 @@ config ARM_PSCI_CPUIDLE
 	  It provides an idle driver that is capable of detecting and
 	  managing idle states through the PSCI firmware interface.
 
+	  The driver has limitations when used with PREEMPT_RT:
+	  - If the idle states are described with the non-hierarchical layout,
+	    all idle states are still available.
+
+	  - If the idle states are described with the hierarchical layout,
+	    only the idle states defined per CPU are available, but not the ones
+	    being shared among a group of CPUs (aka cluster idle states).
+
 config ARM_PSCI_CPUIDLE_DOMAIN
 	bool "PSCI CPU idle Domain"
 	depends on ARM_PSCI_CPUIDLE
diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index c80cf9ddabd8..6ad2954948a5 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -64,8 +64,11 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
 
 	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
 
-	/* Allow power off when OSI has been successfully enabled. */
-	if (use_osi)
+	/*
+	 * Allow power off when OSI has been successfully enabled.
+	 * PREEMPT_RT is not yet ready to enter domain idle states.
+	 */
+	if (use_osi && !IS_ENABLED(CONFIG_PREEMPT_RT))
 		pd->power_off = psci_pd_power_off;
 	else
 		pd->flags |= GENPD_FLAG_ALWAYS_ON;
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 312a34ef28dc..6de027f9f6f5 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -222,6 +222,9 @@ static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
 	if (!psci_has_osi_support())
 		return 0;
 
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		return 0;
+
 	data->dev = psci_dt_attach_cpu(cpu);
 	if (IS_ERR_OR_NULL(data->dev))
 		return PTR_ERR_OR_ZERO(data->dev);
-- 
2.34.1
Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Adrien Thierry 2 years, 7 months ago
Hi Krzysztof,

I tested your patch on the Qdrive3/sa8540p-ride on 6.2.0-rc3-rt1, and it
fixes the issue I encountered in [1].

Tested-by: Adrien Thierry <athierry@redhat.com>

Thank you,

Adrien

[1] https://lore.kernel.org/all/20220615203605.1068453-1-athierry@redhat.com/
Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Adrien Thierry 2 years, 7 months ago
Is there still something preventing this patch from being picked up?

Best,

Adrien
Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Rafael J. Wysocki 2 years, 7 months ago
On Tue, Feb 7, 2023 at 2:47 PM Adrien Thierry <athierry@redhat.com> wrote:
>
> Is there still something preventing this patch from being picked up?

Well, I've been waiting for Daniel to do that.  Or should I pick it up
directly?  Daniel?
Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Rafael J. Wysocki 2 years, 7 months ago
On Thu, Feb 9, 2023 at 6:08 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 7, 2023 at 2:47 PM Adrien Thierry <athierry@redhat.com> wrote:
> >
> > Is there still something preventing this patch from being picked up?
>
> Well, I've been waiting for Daniel to do that.  Or should I pick it up
> directly?  Daniel?

All right, applied as 6.3 material now, thanks!
Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Ulf Hansson 2 years, 7 months ago
On Wed, 25 Jan 2023 at 12:34, Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> The runtime Power Management of CPU topology is not compatible with
> PREEMPT_RT:
> 1. Core cpuidle path disables IRQs.
> 2. Core cpuidle calls cpuidle-psci.
> 3. cpuidle-psci in __psci_enter_domain_idle_state() calls
>    pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
>    spinlocks (which are sleeping on PREEMPT_RT).
>
> Deep sleep modes are not a priority of Realtime kernels because the
> latencies might become unpredictable.  On the other hand the PSCI CPU
> idle power domain is a parent of other devices and power domain
> controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).
>
> Disable the idle callbacks in cpuidle-psci and mark the domain as
> always on.  This is a trade-off between making PREEMPT_RT work and
> still having a proper power domain hierarchy in the system.
>
> Cc: Adrien Thierry <athierry@redhat.com>
> Cc: Brian Masney <bmasney@redhat.com>
> Cc: linux-rt-users@vger.kernel.org
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

>
> ---
>
> Changes since v3:
> 1. Rework - disable idle states, mark as always on (Ulf).
> 2. Extend Kconfig warning (Ulf).
>
> Changes since v1:
> 1. Re-work commit msg.
> 2. Add note to Kconfig.
>
> Several other patches were dropped, as this is the only one actually
> needed.  It effectively stops PSCI cpuidle power domains from suspending
> thus solving all other issues I experienced.
> ---
>  drivers/cpuidle/Kconfig.arm           | 8 ++++++++
>  drivers/cpuidle/cpuidle-psci-domain.c | 7 +++++--
>  drivers/cpuidle/cpuidle-psci.c        | 3 +++
>  3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm
> index 747aa537389b..8deaa2e05206 100644
> --- a/drivers/cpuidle/Kconfig.arm
> +++ b/drivers/cpuidle/Kconfig.arm
> @@ -24,6 +24,14 @@ config ARM_PSCI_CPUIDLE
>           It provides an idle driver that is capable of detecting and
>           managing idle states through the PSCI firmware interface.
>
> +         The driver has limitations when used with PREEMPT_RT:
> +         - If the idle states are described with the non-hierarchical layout,
> +           all idle states are still available.
> +
> +         - If the idle states are described with the hierarchical layout,
> +           only the idle states defined per CPU are available, but not the ones
> +           being shared among a group of CPUs (aka cluster idle states).
> +
>  config ARM_PSCI_CPUIDLE_DOMAIN
>         bool "PSCI CPU idle Domain"
>         depends on ARM_PSCI_CPUIDLE
> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
> index c80cf9ddabd8..6ad2954948a5 100644
> --- a/drivers/cpuidle/cpuidle-psci-domain.c
> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
> @@ -64,8 +64,11 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
>
>         pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
>
> -       /* Allow power off when OSI has been successfully enabled. */
> -       if (use_osi)
> +       /*
> +        * Allow power off when OSI has been successfully enabled.
> +        * PREEMPT_RT is not yet ready to enter domain idle states.
> +        */
> +       if (use_osi && !IS_ENABLED(CONFIG_PREEMPT_RT))
>                 pd->power_off = psci_pd_power_off;
>         else
>                 pd->flags |= GENPD_FLAG_ALWAYS_ON;
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 312a34ef28dc..6de027f9f6f5 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -222,6 +222,9 @@ static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
>         if (!psci_has_osi_support())
>                 return 0;
>
> +       if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +               return 0;
> +
>         data->dev = psci_dt_attach_cpu(cpu);
>         if (IS_ERR_OR_NULL(data->dev))
>                 return PTR_ERR_OR_ZERO(data->dev);
> --
> 2.34.1
>
Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Daniel Lezcano 2 years, 7 months ago
Hi Krzysztof,


On 25/01/2023 12:34, Krzysztof Kozlowski wrote:
> The runtime Power Management of CPU topology is not compatible with
> PREEMPT_RT:
> 1. Core cpuidle path disables IRQs.
> 2. Core cpuidle calls cpuidle-psci.
> 3. cpuidle-psci in __psci_enter_domain_idle_state() calls
>     pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
>     spinlocks (which are sleeping on PREEMPT_RT).
> 
> Deep sleep modes are not a priority of Realtime kernels because the
> latencies might become unpredictable.  On the other hand the PSCI CPU
> idle power domain is a parent of other devices and power domain
> controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).
> 
> Disable the idle callbacks in cpuidle-psci and mark the domain as
> always on.  This is a trade-off between making PREEMPT_RT work and
> still having a proper power domain hierarchy in the system.

Wouldn't it make sense to rely on the latency constraint framework?


> Cc: Adrien Thierry <athierry@redhat.com>
> Cc: Brian Masney <bmasney@redhat.com>
> Cc: linux-rt-users@vger.kernel.org
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> 
> ---
> 
> Changes since v3:
> 1. Rework - disable idle states, mark as always on (Ulf).
> 2. Extend Kconfig warning (Ulf).
> 
> Changes since v1:
> 1. Re-work commit msg.
> 2. Add note to Kconfig.
> 
> Several other patches were dropped, as this is the only one actually
> needed.  It effectively stops PSCI cpuidle power domains from suspending
> thus solving all other issues I experienced.
> ---
>   drivers/cpuidle/Kconfig.arm           | 8 ++++++++
>   drivers/cpuidle/cpuidle-psci-domain.c | 7 +++++--
>   drivers/cpuidle/cpuidle-psci.c        | 3 +++
>   3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm
> index 747aa537389b..8deaa2e05206 100644
> --- a/drivers/cpuidle/Kconfig.arm
> +++ b/drivers/cpuidle/Kconfig.arm
> @@ -24,6 +24,14 @@ config ARM_PSCI_CPUIDLE
>   	  It provides an idle driver that is capable of detecting and
>   	  managing idle states through the PSCI firmware interface.
>   
> +	  The driver has limitations when used with PREEMPT_RT:
> +	  - If the idle states are described with the non-hierarchical layout,
> +	    all idle states are still available.
> +
> +	  - If the idle states are described with the hierarchical layout,
> +	    only the idle states defined per CPU are available, but not the ones
> +	    being shared among a group of CPUs (aka cluster idle states).
> +
>   config ARM_PSCI_CPUIDLE_DOMAIN
>   	bool "PSCI CPU idle Domain"
>   	depends on ARM_PSCI_CPUIDLE
> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
> index c80cf9ddabd8..6ad2954948a5 100644
> --- a/drivers/cpuidle/cpuidle-psci-domain.c
> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
> @@ -64,8 +64,11 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
>   
>   	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
>   
> -	/* Allow power off when OSI has been successfully enabled. */
> -	if (use_osi)
> +	/*
> +	 * Allow power off when OSI has been successfully enabled.
> +	 * PREEMPT_RT is not yet ready to enter domain idle states.
> +	 */
> +	if (use_osi && !IS_ENABLED(CONFIG_PREEMPT_RT))
>   		pd->power_off = psci_pd_power_off;
>   	else
>   		pd->flags |= GENPD_FLAG_ALWAYS_ON;
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 312a34ef28dc..6de027f9f6f5 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -222,6 +222,9 @@ static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
>   	if (!psci_has_osi_support())
>   		return 0;
>   
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +		return 0;
> +
>   	data->dev = psci_dt_attach_cpu(cpu);
>   	if (IS_ERR_OR_NULL(data->dev))
>   		return PTR_ERR_OR_ZERO(data->dev);


Re: [PATCH v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
Posted by Ulf Hansson 2 years, 7 months ago
On Wed, 25 Jan 2023 at 17:46, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
>
> Hi Krzysztof,
>
>
> On 25/01/2023 12:34, Krzysztof Kozlowski wrote:
> > The runtime Power Management of CPU topology is not compatible with
> > PREEMPT_RT:
> > 1. Core cpuidle path disables IRQs.
> > 2. Core cpuidle calls cpuidle-psci.
> > 3. cpuidle-psci in __psci_enter_domain_idle_state() calls
> >     pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
> >     spinlocks (which are sleeping on PREEMPT_RT).
> >
> > Deep sleep modes are not a priority of Realtime kernels because the
> > latencies might become unpredictable.  On the other hand the PSCI CPU
> > idle power domain is a parent of other devices and power domain
> > controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).
> >
> > Disable the idle callbacks in cpuidle-psci and mark the domain as
> > always on.  This is a trade-off between making PREEMPT_RT work and
> > still having a proper power domain hierarchy in the system.
>
> Wouldn't it make sense to rely on the latency constraint framework?

The main problem is that for runtime PM there is a per device spinlock
being used, which becomes a sleepable lock on PREEMPT_RT.

In other words, the only simple solution is to avoid the calls to
runtime PM in the idle path.

[...]

Kind regards
Uffe