Message ID | 20230119184228.683892-1-krzysztof.kozlowski@linaro.org |
---|---|
State | New |
Headers | show |
Series | [v3] cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT | expand |
On Thu, 19 Jan 2023 at 19:42, Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> wrote: > > The runtime Power Management of CPU topology is not compatible with > PREEMPT_RT: > 1. Core cpuidle path disables IRQs. > 2. Core cpuidle calls cpuidle-psci. > 3. cpuidle-psci in __psci_enter_domain_idle_state() calls > pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use > spinlocks (which are sleeping on PREEMPT_RT). > > Deep sleep modes are not a priority of Realtime kernels because the > latencies might become unpredictable. On the other hand the PSCI CPU > idle power domain is a parent of other devices and power domain > controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). > > Disable the runtime PM calls from cpuidle-psci, which effectively stops > suspending the cpuidle PSCI domain. This is a trade-off between making > PREEMPT_RT working and still having a proper power domain hierarchy in > the system. I think this sounds like a reasonable compromise, at least at this point. > > Cc: Adrien Thierry <athierry@redhat.com> > Cc: Brian Masney <bmasney@redhat.com> > Cc: linux-rt-users@vger.kernel.org > Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> > > --- > > Changes since v1: > 1. Re-work commit msg. > 2. Add note to Kconfig. > > Several other patches were dropped, as this is the only one actually > needed. It effectively stops PSCI cpuidle power domains from suspending > thus solving all other issues I experienced. I like this approach better, thanks! > --- > drivers/cpuidle/Kconfig.arm | 3 +++ > drivers/cpuidle/cpuidle-psci.c | 4 ++-- > 2 files changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm > index 747aa537389b..24429b5bfd1c 100644 > --- a/drivers/cpuidle/Kconfig.arm > +++ b/drivers/cpuidle/Kconfig.arm > @@ -24,6 +24,9 @@ config ARM_PSCI_CPUIDLE > It provides an idle driver that is capable of detecting and > managing idle states through the PSCI firmware interface. > > + The driver is not yet compatible with PREEMPT_RT: no idle states will > + be entered by CPUs on such kernel. This isn't entirely correct. In principle your suggested change ends up providing the below updated behaviour for PREEMPT_RT. *) If the idle states are described with the non-hierarchical layout, all idle states are still available. **) If the idle states are described with the hierarchical layout, only the idle states defined per CPU are available, but not the ones being shared among a group of CPUs (aka cluster idle states). Perhaps there is an easier way to summarize what I stated above? > + > config ARM_PSCI_CPUIDLE_DOMAIN > bool "PSCI CPU idle Domain" > depends on ARM_PSCI_CPUIDLE > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c > index 312a34ef28dc..c25592718984 100644 > --- a/drivers/cpuidle/cpuidle-psci.c > +++ b/drivers/cpuidle/cpuidle-psci.c > @@ -66,7 +66,7 @@ static __cpuidle int __psci_enter_domain_idle_state(struct cpuidle_device *dev, > /* Do runtime PM to manage a hierarchical CPU toplogy. */ > if (s2idle) > dev_pm_genpd_suspend(pd_dev); > - else > + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) Rather than doing this (and the below) in __psci_enter_domain_idle_state(), I suggest replacing this with a bailout point in psci_dt_cpu_init_topology(). That would prevent the __psci_enter_domain_idle_state() from being called altogether, which is really what we need. Moreover, I think it would make sense to set the GENPD_FLAG_ALWAYS_ON for the corresponding genpd, when CONFIG_PREEMPT_RT is set. See psci_pd_init(). > pm_runtime_put_sync_suspend(pd_dev); > > state = psci_get_domain_state(); > @@ -77,7 +77,7 @@ static __cpuidle int __psci_enter_domain_idle_state(struct cpuidle_device *dev, > > if (s2idle) > dev_pm_genpd_resume(pd_dev); > - else > + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) > pm_runtime_get_sync(pd_dev); > > cpu_pm_exit(); > -- Kind regards Uffe
On Thu, Jan 19, 2023 at 07:42:28PM +0100, Krzysztof Kozlowski wrote: > The runtime Power Management of CPU topology is not compatible with > PREEMPT_RT: > 1. Core cpuidle path disables IRQs. > 2. Core cpuidle calls cpuidle-psci. > 3. cpuidle-psci in __psci_enter_domain_idle_state() calls > pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use > spinlocks (which are sleeping on PREEMPT_RT). > > Deep sleep modes are not a priority of Realtime kernels because the > latencies might become unpredictable. On the other hand the PSCI CPU > idle power domain is a parent of other devices and power domain > controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). > > Disable the runtime PM calls from cpuidle-psci, which effectively stops > suspending the cpuidle PSCI domain. This is a trade-off between making > PREEMPT_RT working and still having a proper power domain hierarchy in > the system. > > Cc: Adrien Thierry <athierry@redhat.com> > Cc: Brian Masney <bmasney@redhat.com> > Cc: linux-rt-users@vger.kernel.org > Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> > > --- > > Changes since v1: > 1. Re-work commit msg. > 2. Add note to Kconfig. > > Several other patches were dropped, as this is the only one actually > needed. It effectively stops PSCI cpuidle power domains from suspending > thus solving all other issues I experienced. > --- > drivers/cpuidle/Kconfig.arm | 3 +++ > drivers/cpuidle/cpuidle-psci.c | 4 ++-- > 2 files changed, 5 insertions(+), 2 deletions(-) > > diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm > index 747aa537389b..24429b5bfd1c 100644 > --- a/drivers/cpuidle/Kconfig.arm > +++ b/drivers/cpuidle/Kconfig.arm > @@ -24,6 +24,9 @@ config ARM_PSCI_CPUIDLE > It provides an idle driver that is capable of detecting and > managing idle states through the PSCI firmware interface. > > + The driver is not yet compatible with PREEMPT_RT: no idle states will > + be entered by CPUs on such kernel. > + Any particular reason for even compiling this file in or allowing the ARM_PSCI_CPUIDLE when PREEMPT_RT=y ? If we can't enter idle states, we can as well compile this file out ?
On 24/01/2023 16:34, Sudeep Holla wrote: > On Thu, Jan 19, 2023 at 07:42:28PM +0100, Krzysztof Kozlowski wrote: >> The runtime Power Management of CPU topology is not compatible with >> PREEMPT_RT: >> 1. Core cpuidle path disables IRQs. >> 2. Core cpuidle calls cpuidle-psci. >> 3. cpuidle-psci in __psci_enter_domain_idle_state() calls >> pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use >> spinlocks (which are sleeping on PREEMPT_RT). >> >> Deep sleep modes are not a priority of Realtime kernels because the >> latencies might become unpredictable. On the other hand the PSCI CPU >> idle power domain is a parent of other devices and power domain >> controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). >> >> Disable the runtime PM calls from cpuidle-psci, which effectively stops >> suspending the cpuidle PSCI domain. This is a trade-off between making >> PREEMPT_RT working and still having a proper power domain hierarchy in >> the system. >> >> Cc: Adrien Thierry <athierry@redhat.com> >> Cc: Brian Masney <bmasney@redhat.com> >> Cc: linux-rt-users@vger.kernel.org >> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> >> >> --- >> >> Changes since v1: >> 1. Re-work commit msg. >> 2. Add note to Kconfig. >> >> Several other patches were dropped, as this is the only one actually >> needed. It effectively stops PSCI cpuidle power domains from suspending >> thus solving all other issues I experienced. >> --- >> drivers/cpuidle/Kconfig.arm | 3 +++ >> drivers/cpuidle/cpuidle-psci.c | 4 ++-- >> 2 files changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm >> index 747aa537389b..24429b5bfd1c 100644 >> --- a/drivers/cpuidle/Kconfig.arm >> +++ b/drivers/cpuidle/Kconfig.arm >> @@ -24,6 +24,9 @@ config ARM_PSCI_CPUIDLE >> It provides an idle driver that is capable of detecting and >> managing idle states through the PSCI firmware interface. >> >> + The driver is not yet compatible with PREEMPT_RT: no idle states will >> + be entered by CPUs on such kernel. >> + > > Any particular reason for even compiling this file in or allowing the > ARM_PSCI_CPUIDLE when PREEMPT_RT=y ? If we can't enter idle states, we > can as well compile this file out ? It's the power domain sued for other devices, so we need it. Otherwise other devices will keep waiting for this missing power domain provider. Best regards, Krzysztof
On Wed, 25 Jan 2023 at 08:43, Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> wrote: > > On 24/01/2023 16:34, Sudeep Holla wrote: > > On Thu, Jan 19, 2023 at 07:42:28PM +0100, Krzysztof Kozlowski wrote: > >> The runtime Power Management of CPU topology is not compatible with > >> PREEMPT_RT: > >> 1. Core cpuidle path disables IRQs. > >> 2. Core cpuidle calls cpuidle-psci. > >> 3. cpuidle-psci in __psci_enter_domain_idle_state() calls > >> pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use > >> spinlocks (which are sleeping on PREEMPT_RT). > >> > >> Deep sleep modes are not a priority of Realtime kernels because the > >> latencies might become unpredictable. On the other hand the PSCI CPU > >> idle power domain is a parent of other devices and power domain > >> controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). > >> > >> Disable the runtime PM calls from cpuidle-psci, which effectively stops > >> suspending the cpuidle PSCI domain. This is a trade-off between making > >> PREEMPT_RT working and still having a proper power domain hierarchy in > >> the system. > >> > >> Cc: Adrien Thierry <athierry@redhat.com> > >> Cc: Brian Masney <bmasney@redhat.com> > >> Cc: linux-rt-users@vger.kernel.org > >> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> > >> > >> --- > >> > >> Changes since v1: > >> 1. Re-work commit msg. > >> 2. Add note to Kconfig. > >> > >> Several other patches were dropped, as this is the only one actually > >> needed. It effectively stops PSCI cpuidle power domains from suspending > >> thus solving all other issues I experienced. > >> --- > >> drivers/cpuidle/Kconfig.arm | 3 +++ > >> drivers/cpuidle/cpuidle-psci.c | 4 ++-- > >> 2 files changed, 5 insertions(+), 2 deletions(-) > >> > >> diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm > >> index 747aa537389b..24429b5bfd1c 100644 > >> --- a/drivers/cpuidle/Kconfig.arm > >> +++ b/drivers/cpuidle/Kconfig.arm > >> @@ -24,6 +24,9 @@ config ARM_PSCI_CPUIDLE > >> It provides an idle driver that is capable of detecting and > >> managing idle states through the PSCI firmware interface. > >> > >> + The driver is not yet compatible with PREEMPT_RT: no idle states will > >> + be entered by CPUs on such kernel. > >> + > > > > Any particular reason for even compiling this file in or allowing the > > ARM_PSCI_CPUIDLE when PREEMPT_RT=y ? If we can't enter idle states, we > > can as well compile this file out ? > > It's the power domain sued for other devices, so we need it. Otherwise > other devices will keep waiting for this missing power domain provider. Yes. And we are still able to use those idle states that are solely per CPU, which is probably nice to have. No? Kind regards Uffe
On 24/01/2023 11:33, Ulf Hansson wrote: > On Thu, 19 Jan 2023 at 19:42, Krzysztof Kozlowski > <krzysztof.kozlowski@linaro.org> wrote: >> >> The runtime Power Management of CPU topology is not compatible with >> PREEMPT_RT: >> 1. Core cpuidle path disables IRQs. >> 2. Core cpuidle calls cpuidle-psci. >> 3. cpuidle-psci in __psci_enter_domain_idle_state() calls >> pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use >> spinlocks (which are sleeping on PREEMPT_RT). >> >> Deep sleep modes are not a priority of Realtime kernels because the >> latencies might become unpredictable. On the other hand the PSCI CPU >> idle power domain is a parent of other devices and power domain >> controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). >> >> Disable the runtime PM calls from cpuidle-psci, which effectively stops >> suspending the cpuidle PSCI domain. This is a trade-off between making >> PREEMPT_RT working and still having a proper power domain hierarchy in >> the system. > > I think this sounds like a reasonable compromise, at least at this point. > >> >> Cc: Adrien Thierry <athierry@redhat.com> >> Cc: Brian Masney <bmasney@redhat.com> >> Cc: linux-rt-users@vger.kernel.org >> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> >> >> --- >> >> Changes since v1: >> 1. Re-work commit msg. >> 2. Add note to Kconfig. >> >> Several other patches were dropped, as this is the only one actually >> needed. It effectively stops PSCI cpuidle power domains from suspending >> thus solving all other issues I experienced. > > I like this approach better, thanks! > >> --- >> drivers/cpuidle/Kconfig.arm | 3 +++ >> drivers/cpuidle/cpuidle-psci.c | 4 ++-- >> 2 files changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm >> index 747aa537389b..24429b5bfd1c 100644 >> --- a/drivers/cpuidle/Kconfig.arm >> +++ b/drivers/cpuidle/Kconfig.arm >> @@ -24,6 +24,9 @@ config ARM_PSCI_CPUIDLE >> It provides an idle driver that is capable of detecting and >> managing idle states through the PSCI firmware interface. >> >> + The driver is not yet compatible with PREEMPT_RT: no idle states will >> + be entered by CPUs on such kernel. > > This isn't entirely correct. In principle your suggested change ends > up providing the below updated behaviour for PREEMPT_RT. > > *) If the idle states are described with the non-hierarchical layout, > all idle states are still available. > **) If the idle states are described with the hierarchical layout, > only the idle states defined per CPU are available, but not the ones > being shared among a group of CPUs (aka cluster idle states). > > Perhaps there is an easier way to summarize what I stated above? Yes, I'll correct the message. > >> + >> config ARM_PSCI_CPUIDLE_DOMAIN >> bool "PSCI CPU idle Domain" >> depends on ARM_PSCI_CPUIDLE >> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c >> index 312a34ef28dc..c25592718984 100644 >> --- a/drivers/cpuidle/cpuidle-psci.c >> +++ b/drivers/cpuidle/cpuidle-psci.c >> @@ -66,7 +66,7 @@ static __cpuidle int __psci_enter_domain_idle_state(struct cpuidle_device *dev, >> /* Do runtime PM to manage a hierarchical CPU toplogy. */ >> if (s2idle) >> dev_pm_genpd_suspend(pd_dev); >> - else >> + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) > > Rather than doing this (and the below) in > __psci_enter_domain_idle_state(), I suggest replacing this with a > bailout point in psci_dt_cpu_init_topology(). That would prevent the > __psci_enter_domain_idle_state() from being called altogether, which > is really what we need. Ack > > Moreover, I think it would make sense to set the GENPD_FLAG_ALWAYS_ON > for the corresponding genpd, when CONFIG_PREEMPT_RT is set. See > psci_pd_init(). Makes sense. Best regards, Krzysztof
On Wed, Jan 25, 2023 at 11:08:04AM +0100, Ulf Hansson wrote: > On Wed, 25 Jan 2023 at 08:43, Krzysztof Kozlowski > <krzysztof.kozlowski@linaro.org> wrote: > > > > On 24/01/2023 16:34, Sudeep Holla wrote: [...] > > > Any particular reason for even compiling this file in or allowing the > > > ARM_PSCI_CPUIDLE when PREEMPT_RT=y ? If we can't enter idle states, we > > > can as well compile this file out ? > > > > It's the power domain sued for other devices, so we need it. Otherwise > > other devices will keep waiting for this missing power domain provider. > > Yes. > > And we are still able to use those idle states that are solely per > CPU, which is probably nice to have. No? > Makes sense, thanks for the explanation. Also the other discussions clears probably questions I had.
On 2023-01-25 11:08:04 [+0100], Ulf Hansson wrote: > > It's the power domain sued for other devices, so we need it. Otherwise > > other devices will keep waiting for this missing power domain provider. > Ach this explains my other question I had. > > And we are still able to use those idle states that are solely per > CPU, which is probably nice to have. No? If the entry/ exit latency is not known, it can be dangerous. > Kind regards > Uffe Sebastian
diff --git a/drivers/cpuidle/Kconfig.arm b/drivers/cpuidle/Kconfig.arm index 747aa537389b..24429b5bfd1c 100644 --- a/drivers/cpuidle/Kconfig.arm +++ b/drivers/cpuidle/Kconfig.arm @@ -24,6 +24,9 @@ config ARM_PSCI_CPUIDLE It provides an idle driver that is capable of detecting and managing idle states through the PSCI firmware interface. + The driver is not yet compatible with PREEMPT_RT: no idle states will + be entered by CPUs on such kernel. + config ARM_PSCI_CPUIDLE_DOMAIN bool "PSCI CPU idle Domain" depends on ARM_PSCI_CPUIDLE diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c index 312a34ef28dc..c25592718984 100644 --- a/drivers/cpuidle/cpuidle-psci.c +++ b/drivers/cpuidle/cpuidle-psci.c @@ -66,7 +66,7 @@ static __cpuidle int __psci_enter_domain_idle_state(struct cpuidle_device *dev, /* Do runtime PM to manage a hierarchical CPU toplogy. */ if (s2idle) dev_pm_genpd_suspend(pd_dev); - else + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) pm_runtime_put_sync_suspend(pd_dev); state = psci_get_domain_state(); @@ -77,7 +77,7 @@ static __cpuidle int __psci_enter_domain_idle_state(struct cpuidle_device *dev, if (s2idle) dev_pm_genpd_resume(pd_dev); - else + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) pm_runtime_get_sync(pd_dev); cpu_pm_exit();
The runtime Power Management of CPU topology is not compatible with PREEMPT_RT: 1. Core cpuidle path disables IRQs. 2. Core cpuidle calls cpuidle-psci. 3. cpuidle-psci in __psci_enter_domain_idle_state() calls pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use spinlocks (which are sleeping on PREEMPT_RT). Deep sleep modes are not a priority of Realtime kernels because the latencies might become unpredictable. On the other hand the PSCI CPU idle power domain is a parent of other devices and power domain controllers, thus it cannot be simply skipped (e.g. on Qualcomm SM8250). Disable the runtime PM calls from cpuidle-psci, which effectively stops suspending the cpuidle PSCI domain. This is a trade-off between making PREEMPT_RT working and still having a proper power domain hierarchy in the system. Cc: Adrien Thierry <athierry@redhat.com> Cc: Brian Masney <bmasney@redhat.com> Cc: linux-rt-users@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> --- Changes since v1: 1. Re-work commit msg. 2. Add note to Kconfig. Several other patches were dropped, as this is the only one actually needed. It effectively stops PSCI cpuidle power domains from suspending thus solving all other issues I experienced. --- drivers/cpuidle/Kconfig.arm | 3 +++ drivers/cpuidle/cpuidle-psci.c | 4 ++-- 2 files changed, 5 insertions(+), 2 deletions(-)