Message ID | 20221219151503.385816-4-krzysztof.kozlowski@linaro.org |
---|---|
State | New |
Headers | show |
Series | PM: Fixes for Realtime systems | expand |
On 2022-12-19 16:15:01 [+0100], Krzysztof Kozlowski wrote: > diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c > index 57bc3e3ae391..9d971cc4b12b 100644 > --- a/drivers/cpuidle/cpuidle-psci.c > +++ b/drivers/cpuidle/cpuidle-psci.c > @@ -72,7 +72,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev, > ct_irq_enter_irqson(); > if (s2idle) > dev_pm_genpd_suspend(pd_dev); > - else > + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) > pm_runtime_put_sync_suspend(pd_dev); So based on the commit description you run into a sleeping lock in pm_runtime_put_sync_suspend() while the CPU is in an IRQ-off region. Why is it okay to skip it on PREEMPT_RT? > ct_irq_exit_irqson(); > Sebastian
On 12/01/2023 12:09, Sebastian Andrzej Siewior wrote: > On 2022-12-19 16:15:01 [+0100], Krzysztof Kozlowski wrote: >> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c >> index 57bc3e3ae391..9d971cc4b12b 100644 >> --- a/drivers/cpuidle/cpuidle-psci.c >> +++ b/drivers/cpuidle/cpuidle-psci.c >> @@ -72,7 +72,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev, >> ct_irq_enter_irqson(); >> if (s2idle) >> dev_pm_genpd_suspend(pd_dev); >> - else >> + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) >> pm_runtime_put_sync_suspend(pd_dev); > > So based on the commit description you run into a sleeping lock in > pm_runtime_put_sync_suspend() while the CPU is in an IRQ-off region. > Why is it okay to skip it on PREEMPT_RT? It is okay to skip it everywhere, you just don't get a suspended CPU. Why PREEMPT_RT is different here - having suspended CPU is a great way to have longer or even unpredictable (as it goes to firmware which is out of control of Linux) latencies. Best regards, Krzysztof
On 2023-01-12 12:34:35 [+0100], Krzysztof Kozlowski wrote: > On 12/01/2023 12:09, Sebastian Andrzej Siewior wrote: > > On 2022-12-19 16:15:01 [+0100], Krzysztof Kozlowski wrote: > >> diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c > >> index 57bc3e3ae391..9d971cc4b12b 100644 > >> --- a/drivers/cpuidle/cpuidle-psci.c > >> +++ b/drivers/cpuidle/cpuidle-psci.c > >> @@ -72,7 +72,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev, > >> ct_irq_enter_irqson(); > >> if (s2idle) > >> dev_pm_genpd_suspend(pd_dev); > >> - else > >> + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) > >> pm_runtime_put_sync_suspend(pd_dev); > > > > So based on the commit description you run into a sleeping lock in > > pm_runtime_put_sync_suspend() while the CPU is in an IRQ-off region. > > Why is it okay to skip it on PREEMPT_RT? > > It is okay to skip it everywhere, you just don't get a suspended CPU. > Why PREEMPT_RT is different here - having suspended CPU is a great way > to have longer or even unpredictable (as it goes to firmware which is > out of control of Linux) latencies. On X86 C1 has a latency of less than 5us and this is exposed by the firmware. C1E and everything that follows has a much higher entry/ exit latency which makes not usable. The entry/exit latency seems not to be exposed by PSCI. My understanding is that the driver is now enabled but not doing any suspending, right? If so, why isn't it completely disabled? > Best regards, > Krzysztof Sebastian
diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c index 57bc3e3ae391..9d971cc4b12b 100644 --- a/drivers/cpuidle/cpuidle-psci.c +++ b/drivers/cpuidle/cpuidle-psci.c @@ -72,7 +72,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev, ct_irq_enter_irqson(); if (s2idle) dev_pm_genpd_suspend(pd_dev); - else + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) pm_runtime_put_sync_suspend(pd_dev); ct_irq_exit_irqson(); @@ -85,7 +85,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_device *dev, ct_irq_enter_irqson(); if (s2idle) dev_pm_genpd_resume(pd_dev); - else + else if (!IS_ENABLED(CONFIG_PREEMPT_RT)) pm_runtime_get_sync(pd_dev); ct_irq_exit_irqson();
Realtime kernels are not supposed to go into deep sleep modes (neither system nor CPUs) because the latencies might become unpredictable. Therefore actual power suspending of CPU topology is not that important. On the other hand, this runtime PM of CPU topology is not compatible with PREEMPT_RT: 1. Core cpuidle path disables IRQs. 2. Core cpuidle calls cpuidle-psci. 3. cpuidle-psci in __psci_enter_domain_idle_state() calls pm_runtime_put_sync_suspend() and pm_runtime_get_sync() which use spinlocks (which are sleeping on PREEMPT_RT): BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/0 preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 1 lock held by swapper/0/0: #0: ffff20524613c1a0 (&dev->power.lock){+.+.}-{3:3}, at: __pm_runtime_suspend+0x30/0xac irq event stamp: 18776 hardirqs last enabled at (18775): [<ffffa4853720ac80>] tick_nohz_idle_enter+0x7c/0x17c hardirqs last disabled at (18776): [<ffffa4853717ae00>] do_idle+0xe0/0x300 softirqs last enabled at (4310): [<ffffa48537126398>] __local_bh_enable_ip+0x98/0x280 softirqs last disabled at (4288): [<ffffa4853721c250>] cgroup_idr_alloc.constprop.0+0x0/0xd0 Preemption disabled at: [<ffffa485381e55a0>] schedule_preempt_disabled+0x20/0x30 CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00322-gb49b67f1d8dc #1 Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT) Call trace: dump_backtrace.part.0+0xe0/0xf0 show_stack+0x18/0x40 dump_stack_lvl+0x8c/0xb8 dump_stack+0x18/0x34 __might_resched+0x17c/0x214 rt_spin_lock+0x5c/0x100 __pm_runtime_suspend+0x30/0xac __psci_enter_domain_idle_state.constprop.0+0x60/0x104 psci_enter_domain_idle_state+0x18/0x2c cpuidle_enter_state+0x220/0x37c cpuidle_enter+0x38/0x50 do_idle+0x258/0x300 cpu_startup_entry+0x28/0x30 rest_init+0x104/0x180 arch_post_acpi_subsys_init+0x0/0x18 start_kernel+0x6fc/0x73c __primary_switched+0xbc/0xc4 Cc: Adrien Thierry <athierry@redhat.com> Cc: Brian Masney <bmasney@redhat.com> Cc: linux-rt-users@vger.kernel.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> --- drivers/cpuidle/cpuidle-psci.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)