[v4] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT

Message ID 20230125113418.455089-1-krzysztof.kozlowski@linaro.org
State New

Commit Message

Krzysztof Kozlowski Jan. 25, 2023, 11:34 a.m. UTC
The runtime Power Management of CPU topology is not compatible with
PREEMPT_RT:
1. Core cpuidle path disables IRQs.
2. Core cpuidle calls cpuidle-psci.
3. cpuidle-psci in __psci_enter_domain_idle_state() calls
   pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
   spinlocks (which are sleeping on PREEMPT_RT).

Deep sleep modes are not a priority of Realtime kernels because the
latencies might become unpredictable.  On the other hand the PSCI CPU
idle power domain is a parent of other devices and power domain
controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).

Disable the idle callbacks in cpuidle-psci and mark the domain as
always on.  This is a trade-off between making PREEMPT_RT work and
still having a proper power domain hierarchy in the system.

Cc: Adrien Thierry <athierry@redhat.com>
Cc: Brian Masney <bmasney@redhat.com>
Cc: linux-rt-users@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

---

Changes since v3:
1. Rework - disable idle states, mark as always on (Ulf).
2. Extend Kconfig warning (Ulf).

Changes since v1:
1. Re-work commit msg.
2. Add note to Kconfig.

Several other patches were dropped, as this is the only one actually
needed.  It effectively stops PSCI cpuidle power domains from suspending
thus solving all other issues I experienced.
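
For illustration only, the decision this patch makes in psci_pd_init()
can be modelled in plain userspace C.  This is a sketch, not the kernel
code: struct model_pd, the MODEL_FLAG_* values and the preempt_rt
parameter (standing in for IS_ENABLED(CONFIG_PREEMPT_RT)) are
hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the genpd flags; the values are illustrative. */
#define MODEL_FLAG_IRQ_SAFE   (1u << 0)
#define MODEL_FLAG_CPU_DOMAIN (1u << 1)
#define MODEL_FLAG_ALWAYS_ON  (1u << 2)

/* Hypothetical, heavily reduced model of a power domain. */
struct model_pd {
	unsigned int flags;
	int (*power_off)(struct model_pd *pd);
};

/* Placeholder for the real power-off callback; it does nothing here. */
static int model_pd_power_off(struct model_pd *pd)
{
	(void)pd;
	return 0;
}

/*
 * Mirror of the patched condition: the domain may power off only when
 * OSI is enabled and the kernel is not PREEMPT_RT.  In every other case
 * the domain stays registered (so child domains keep a parent) but is
 * marked always-on.
 */
static void model_pd_init(struct model_pd *pd, bool use_osi, bool preempt_rt)
{
	pd->flags = MODEL_FLAG_IRQ_SAFE | MODEL_FLAG_CPU_DOMAIN;
	pd->power_off = NULL;

	if (use_osi && !preempt_rt)
		pd->power_off = model_pd_power_off;
	else
		pd->flags |= MODEL_FLAG_ALWAYS_ON;
}
```

In this model the domain still exists in both cases, which is the
point of the trade-off: only the ability to power it off goes away on
PREEMPT_RT.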
---
 drivers/cpuidle/Kconfig.arm           | 8 ++++++++
 drivers/cpuidle/cpuidle-psci-domain.c | 7 +++++--
 drivers/cpuidle/cpuidle-psci.c        | 3 +++
 3 files changed, 16 insertions(+), 2 deletions(-)

Comments

Daniel Lezcano Jan. 25, 2023, 4:46 p.m. UTC | #1
Hi Krzysztof,


On 25/01/2023 12:34, Krzysztof Kozlowski wrote:
> The runtime Power Management of CPU topology is not compatible with
> PREEMPT_RT:
> 1. Core cpuidle path disables IRQs.
> 2. Core cpuidle calls cpuidle-psci.
> 3. cpuidle-psci in __psci_enter_domain_idle_state() calls
>     pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
>     spinlocks (which are sleeping on PREEMPT_RT).
> 
> Deep sleep modes are not a priority of Realtime kernels because the
> latencies might become unpredictable.  On the other hand the PSCI CPU
> idle power domain is a parent of other devices and power domain
> controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).
> 
> Disable the idle callbacks in cpuidle-psci and mark the domain as
> always on.  This is a trade-off between making PREEMPT_RT work and
> still having a proper power domain hierarchy in the system.

Wouldn't it make sense to rely on the latency constraint framework?


> Cc: Adrien Thierry <athierry@redhat.com>
> Cc: Brian Masney <bmasney@redhat.com>
> Cc: linux-rt-users@vger.kernel.org
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> 
> ---
> 
> Changes since v3:
> 1. Rework - disable idle states, mark as always on (Ulf).
> 2. Extend Kconfig warning (Ulf).
> 
> Changes since v1:
> 1. Re-work commit msg.
> 2. Add note to Kconfig.
> 
> Several other patches were dropped, as this is the only one actually
> needed.  It effectively stops PSCI cpuidle power domains from suspending
> thus solving all other issues I experienced.
> ---
>   drivers/cpuidle/Kconfig.arm           | 8 ++++++++
>   drivers/cpuidle/cpuidle-psci-domain.c | 7 +++++--
>   drivers/cpuidle/cpuidle-psci.c        | 3 +++
>   3 files changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm
> index 747aa537389b..8deaa2e05206 100644
> --- a/drivers/cpuidle/Kconfig.arm
> +++ b/drivers/cpuidle/Kconfig.arm
> @@ -24,6 +24,14 @@ config ARM_PSCI_CPUIDLE
>   	  It provides an idle driver that is capable of detecting and
>   	  managing idle states through the PSCI firmware interface.
>   
> +	  The driver has limitations when used with PREEMPT_RT:
> +	  - If the idle states are described with the non-hierarchical layout,
> +	    all idle states are still available.
> +
> +	  - If the idle states are described with the hierarchical layout,
> +	    only the idle states defined per CPU are available, but not the ones
> +	    being shared among a group of CPUs (aka cluster idle states).
> +
>   config ARM_PSCI_CPUIDLE_DOMAIN
>   	bool "PSCI CPU idle Domain"
>   	depends on ARM_PSCI_CPUIDLE
> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
> index c80cf9ddabd8..6ad2954948a5 100644
> --- a/drivers/cpuidle/cpuidle-psci-domain.c
> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
> @@ -64,8 +64,11 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
>   
>   	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
>   
> -	/* Allow power off when OSI has been successfully enabled. */
> -	if (use_osi)
> +	/*
> +	 * Allow power off when OSI has been successfully enabled.
> +	 * PREEMPT_RT is not yet ready to enter domain idle states.
> +	 */
> +	if (use_osi && !IS_ENABLED(CONFIG_PREEMPT_RT))
>   		pd->power_off = psci_pd_power_off;
>   	else
>   		pd->flags |= GENPD_FLAG_ALWAYS_ON;
> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
> index 312a34ef28dc..6de027f9f6f5 100644
> --- a/drivers/cpuidle/cpuidle-psci.c
> +++ b/drivers/cpuidle/cpuidle-psci.c
> @@ -222,6 +222,9 @@ static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
>   	if (!psci_has_osi_support())
>   		return 0;
>   
> +	if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +		return 0;
> +
>   	data->dev = psci_dt_attach_cpu(cpu);
>   	if (IS_ERR_OR_NULL(data->dev))
>   		return PTR_ERR_OR_ZERO(data->dev);
Ulf Hansson Jan. 26, 2023, 12:45 p.m. UTC | #2
On Wed, 25 Jan 2023 at 17:46, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
>
> Hi Krzysztof,
>
>
> On 25/01/2023 12:34, Krzysztof Kozlowski wrote:
> > The runtime Power Management of CPU topology is not compatible with
> > PREEMPT_RT:
> > 1. Core cpuidle path disables IRQs.
> > 2. Core cpuidle calls cpuidle-psci.
> > 3. cpuidle-psci in __psci_enter_domain_idle_state() calls
> >     pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use
> >     spinlocks (which are sleeping on PREEMPT_RT).
> >
> > Deep sleep modes are not a priority of Realtime kernels because the
> > latencies might become unpredictable.  On the other hand the PSCI CPU
> > idle power domain is a parent of other devices and power domain
> > controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250).
> >
> > Disable the idle callbacks in cpuidle-psci and mark the domain as
> > always on.  This is a trade-off between making PREEMPT_RT work and
> > still having a proper power domain hierarchy in the system.
>
> Wouldn't it make sense to rely on the latency constraint framework?

The main problem is that runtime PM uses a per-device spinlock, which
becomes a sleepable lock on PREEMPT_RT.

In other words, the only simple solution is to avoid the calls to
runtime PM in the idle path.
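
A minimal userspace sketch of that idea, assuming the deepest idle
state's enter callback is what would reach runtime PM: if the CPU is
never attached to the domain, the plain callback stays in place.  All
names here (struct model_cpu, model_enter_plain, model_enter_domain,
model_init_topology) are hypothetical and only model the early return
added to psci_dt_cpu_init_topology().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type for entering an idle state. */
typedef int (*model_enter_fn)(int state);

/* Plain per-CPU entry: makes no runtime PM calls. */
static int model_enter_plain(int state)
{
	return state;
}

/* Domain-aware entry: the real one would call into runtime PM. */
static int model_enter_domain(int state)
{
	return state;
}

/* Hypothetical, reduced per-CPU idle data. */
struct model_cpu {
	bool attached_to_domain;
	model_enter_fn deepest_enter;
};

/*
 * Model of the topology init: without OSI support, or on PREEMPT_RT,
 * return early so the CPU is never attached to the domain and the
 * domain-aware callback is never installed.
 */
static int model_init_topology(struct model_cpu *c, bool has_osi,
			       bool preempt_rt)
{
	c->attached_to_domain = false;
	c->deepest_enter = model_enter_plain;

	if (!has_osi)
		return 0;

	if (preempt_rt)
		return 0;

	c->attached_to_domain = true;
	c->deepest_enter = model_enter_domain;
	return 0;
}
```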

[...]

Kind regards
Uffe
Adrien Thierry Jan. 26, 2023, 10:06 p.m. UTC | #3
Hi Krzysztof,

I tested your patch on the Qdrive3/sa8540p-ride on 6.2.0-rc3-rt1, and it
fixes the issue I encountered in [1].

Tested-by: Adrien Thierry <athierry@redhat.com>

Thank you,

Adrien

[1] https://lore.kernel.org/all/20220615203605.1068453-1-athierry@redhat.com/
Adrien Thierry Feb. 7, 2023, 1:47 p.m. UTC | #4
Is there still something preventing this patch from being picked up?

Best,

Adrien
Rafael J. Wysocki Feb. 9, 2023, 5:08 p.m. UTC | #5
On Tue, Feb 7, 2023 at 2:47 PM Adrien Thierry <athierry@redhat.com> wrote:
>
> Is there still something preventing this patch from being picked up?

Well, I've been waiting for Daniel to do that.  Or should I pick it up
directly?  Daniel?
Rafael J. Wysocki Feb. 13, 2023, 4:17 p.m. UTC | #6
On Thu, Feb 9, 2023 at 6:08 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Tue, Feb 7, 2023 at 2:47 PM Adrien Thierry <athierry@redhat.com> wrote:
> >
> > Is there still something preventing this patch from being picked up?
>
> Well, I've been waiting for Daniel to do that.  Or should I pick it up
> directly?  Daniel?

All right, applied as 6.3 material now, thanks!

Patch

diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm
index 747aa537389b..8deaa2e05206 100644
--- a/drivers/cpuidle/Kconfig.arm
+++ b/drivers/cpuidle/Kconfig.arm
@@ -24,6 +24,14 @@  config ARM_PSCI_CPUIDLE
 	  It provides an idle driver that is capable of detecting and
 	  managing idle states through the PSCI firmware interface.
 
+	  The driver has limitations when used with PREEMPT_RT:
+	  - If the idle states are described with the non-hierarchical layout,
+	    all idle states are still available.
+
+	  - If the idle states are described with the hierarchical layout,
+	    only the idle states defined per CPU are available, but not the ones
+	    being shared among a group of CPUs (aka cluster idle states).
+
 config ARM_PSCI_CPUIDLE_DOMAIN
 	bool "PSCI CPU idle Domain"
 	depends on ARM_PSCI_CPUIDLE
diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index c80cf9ddabd8..6ad2954948a5 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -64,8 +64,11 @@  static int psci_pd_init(struct device_node *np, bool use_osi)
 
 	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
 
-	/* Allow power off when OSI has been successfully enabled. */
-	if (use_osi)
+	/*
+	 * Allow power off when OSI has been successfully enabled.
+	 * PREEMPT_RT is not yet ready to enter domain idle states.
+	 */
+	if (use_osi && !IS_ENABLED(CONFIG_PREEMPT_RT))
 		pd->power_off = psci_pd_power_off;
 	else
 		pd->flags |= GENPD_FLAG_ALWAYS_ON;
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c
index 312a34ef28dc..6de027f9f6f5 100644
--- a/drivers/cpuidle/cpuidle-psci.c
+++ b/drivers/cpuidle/cpuidle-psci.c
@@ -222,6 +222,9 @@  static int psci_dt_cpu_init_topology(struct cpuidle_driver *drv,
 	if (!psci_has_osi_support())
 		return 0;
 
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		return 0;
+
 	data->dev = psci_dt_attach_cpu(cpu);
 	if (IS_ERR_OR_NULL(data->dev))
 		return PTR_ERR_OR_ZERO(data->dev);