diff mbox series

[v2,2/5] cpuidle: psci: Mark as PREEMPT_RT safe

Message ID 20221219151503.385816-3-krzysztof.kozlowski@linaro.org
State New
Headers show
Series PM: Fixes for Realtime systems | expand

Commit Message

Krzysztof Kozlowski Dec. 19, 2022, 3:15 p.m. UTC
The PSCI cpuidle power domain in power_off callback uses
__this_cpu_write() so it is PREEMPT_RT safe.  This allows to use it in
Realtime kernels and solves errors like:

  BUG: scheduling while atomic: swapper/2/0/0x00000002
  Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
  Call trace:
   dump_backtrace.part.0+0xe0/0xf0
   show_stack+0x18/0x40
   dump_stack_lvl+0x68/0x84
   dump_stack+0x18/0x34
   __schedule_bug+0x60/0x80
   __schedule+0x628/0x800
   schedule_rtlock+0x28/0x5c
   rtlock_slowlock_locked+0x360/0xd30
   rt_spin_lock+0x88/0xb0
   genpd_lock_nested_spin+0x1c/0x30
   genpd_power_off.part.0.isra.0+0x20c/0x2a0
   genpd_runtime_suspend+0x150/0x2bc
   __rpm_callback+0x48/0x170
   rpm_callback+0x6c/0x7c
   rpm_suspend+0x108/0x660
   __pm_runtime_suspend+0x4c/0x8c
   __psci_enter_domain_idle_state.constprop.0+0x54/0xe0
   psci_enter_domain_idle_state+0x18/0x2c
   cpuidle_enter_state+0x8c/0x4e0
   cpuidle_enter+0x38/0x50
   do_idle+0x248/0x2f0
   cpu_startup_entry+0x24/0x30
   secondary_start_kernel+0x130/0x154
   __secondary_switched+0xb0/0xb4

Cc: Adrien Thierry <athierry@redhat.com>
Cc: Brian Masney <bmasney@redhat.com>
Cc: linux-rt-users@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
 drivers/cpuidle/cpuidle-psci-domain.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Sebastian Andrzej Siewior Jan. 12, 2023, 11 a.m. UTC | #1
On 2022-12-19 16:15:00 [+0100], Krzysztof Kozlowski wrote:
> The PSCI cpuidle power domain in power_off callback uses
> __this_cpu_write() so it is PREEMPT_RT safe.  This allows to use it in

Why does __this_cpu_write() matter here?

> Realtime kernels and solves errors like:
> 
>   BUG: scheduling while atomic: swapper/2/0/0x00000002
>   Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>   Call trace:
>    dump_backtrace.part.0+0xe0/0xf0
>    show_stack+0x18/0x40
>    dump_stack_lvl+0x68/0x84
>    dump_stack+0x18/0x34
>    __schedule_bug+0x60/0x80
>    __schedule+0x628/0x800
>    schedule_rtlock+0x28/0x5c
>    rtlock_slowlock_locked+0x360/0xd30
>    rt_spin_lock+0x88/0xb0
>    genpd_lock_nested_spin+0x1c/0x30
>    genpd_power_off.part.0.isra.0+0x20c/0x2a0
>    genpd_runtime_suspend+0x150/0x2bc
>    __rpm_callback+0x48/0x170
>    rpm_callback+0x6c/0x7c
>    rpm_suspend+0x108/0x660
>    __pm_runtime_suspend+0x4c/0x8c
>    __psci_enter_domain_idle_state.constprop.0+0x54/0xe0
>    psci_enter_domain_idle_state+0x18/0x2c
>    cpuidle_enter_state+0x8c/0x4e0
>    cpuidle_enter+0x38/0x50
>    do_idle+0x248/0x2f0
>    cpu_startup_entry+0x24/0x30
>    secondary_start_kernel+0x130/0x154
>    __secondary_switched+0xb0/0xb4

This is to a sleeping lock (spinlock_t) in an IRQ-off region which
starts in do_idle(). You could describe the problem and to solution you
aim for instead pasting a backtrace into the commit description and
adding a flow in the code.

I don't see how your commit description matches your change in code. One
might think "Oh. If I see this pattern, I need an irqsafe lock to fix
it".

Sebastian
Krzysztof Kozlowski Jan. 12, 2023, 11:32 a.m. UTC | #2
On 12/01/2023 12:00, Sebastian Andrzej Siewior wrote:
> On 2022-12-19 16:15:00 [+0100], Krzysztof Kozlowski wrote:
>> The PSCI cpuidle power domain in power_off callback uses
>> __this_cpu_write() so it is PREEMPT_RT safe.  This allows to use it in
> 
> Why does __this_cpu_write() matter here?

I'll reword to "not using sleeping primitives nor spinlock_t"

> 
>> Realtime kernels and solves errors like:
>>
>>   BUG: scheduling while atomic: swapper/2/0/0x00000002
>>   Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>>   Call trace:
>>    dump_backtrace.part.0+0xe0/0xf0
>>    show_stack+0x18/0x40
>>    dump_stack_lvl+0x68/0x84
>>    dump_stack+0x18/0x34
>>    __schedule_bug+0x60/0x80
>>    __schedule+0x628/0x800
>>    schedule_rtlock+0x28/0x5c
>>    rtlock_slowlock_locked+0x360/0xd30
>>    rt_spin_lock+0x88/0xb0
>>    genpd_lock_nested_spin+0x1c/0x30
>>    genpd_power_off.part.0.isra.0+0x20c/0x2a0
>>    genpd_runtime_suspend+0x150/0x2bc
>>    __rpm_callback+0x48/0x170
>>    rpm_callback+0x6c/0x7c
>>    rpm_suspend+0x108/0x660
>>    __pm_runtime_suspend+0x4c/0x8c
>>    __psci_enter_domain_idle_state.constprop.0+0x54/0xe0
>>    psci_enter_domain_idle_state+0x18/0x2c
>>    cpuidle_enter_state+0x8c/0x4e0
>>    cpuidle_enter+0x38/0x50
>>    do_idle+0x248/0x2f0
>>    cpu_startup_entry+0x24/0x30
>>    secondary_start_kernel+0x130/0x154
>>    __secondary_switched+0xb0/0xb4
> 
> This is to a sleeping lock (spinlock_t) in an IRQ-off region which
> starts in do_idle(). You could describe the problem and to solution you
> aim for instead pasting a backtrace into the commit description and
> adding a flow in the code.

I'll extend the description.

> 
> I don't see how your commit description matches your change in code. One
> might think "Oh. If I see this pattern, I need an irqsafe lock to fix
> it".


Best regards,
Krzysztof
Ulf Hansson Jan. 17, 2023, 3:27 p.m. UTC | #3
On Mon, 19 Dec 2022 at 16:15, Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> The PSCI cpuidle power domain in power_off callback uses
> __this_cpu_write() so it is PREEMPT_RT safe.  This allows to use it in
> Realtime kernels and solves errors like:
>
>   BUG: scheduling while atomic: swapper/2/0/0x00000002
>   Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>   Call trace:
>    dump_backtrace.part.0+0xe0/0xf0
>    show_stack+0x18/0x40
>    dump_stack_lvl+0x68/0x84
>    dump_stack+0x18/0x34
>    __schedule_bug+0x60/0x80
>    __schedule+0x628/0x800
>    schedule_rtlock+0x28/0x5c
>    rtlock_slowlock_locked+0x360/0xd30
>    rt_spin_lock+0x88/0xb0
>    genpd_lock_nested_spin+0x1c/0x30
>    genpd_power_off.part.0.isra.0+0x20c/0x2a0
>    genpd_runtime_suspend+0x150/0x2bc
>    __rpm_callback+0x48/0x170
>    rpm_callback+0x6c/0x7c
>    rpm_suspend+0x108/0x660
>    __pm_runtime_suspend+0x4c/0x8c
>    __psci_enter_domain_idle_state.constprop.0+0x54/0xe0
>    psci_enter_domain_idle_state+0x18/0x2c
>    cpuidle_enter_state+0x8c/0x4e0
>    cpuidle_enter+0x38/0x50
>    do_idle+0x248/0x2f0
>    cpu_startup_entry+0x24/0x30
>    secondary_start_kernel+0x130/0x154
>    __secondary_switched+0xb0/0xb4
>
> Cc: Adrien Thierry <athierry@redhat.com>
> Cc: Brian Masney <bmasney@redhat.com>
> Cc: linux-rt-users@vger.kernel.org
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> ---
>  drivers/cpuidle/cpuidle-psci-domain.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
> index c80cf9ddabd8..d15a91fb7048 100644
> --- a/drivers/cpuidle/cpuidle-psci-domain.c
> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
> @@ -62,7 +62,8 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
>         if (!pd_provider)
>                 goto free_pd;
>
> -       pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
> +       pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_RT_SAFE | \
> +                    GENPD_FLAG_CPU_DOMAIN;

My main concern with this, is that it will affect the parent domains
too. Whether those would be able to use the GENPD_FLAG_RT_SAFE or not,
is a different story.

In one way or the other, I think it would be better to limit the
GENPD_FLAG_RT_SAFE to be used only for PREEMPT_RT kernels.

>
>         /* Allow power off when OSI has been successfully enabled. */
>         if (use_osi)
> --
> 2.34.1
>

Kind regards
Uffe
Krzysztof Kozlowski Jan. 19, 2023, 3:40 p.m. UTC | #4
On 17/01/2023 16:27, Ulf Hansson wrote:
> On Mon, 19 Dec 2022 at 16:15, Krzysztof Kozlowski
> <krzysztof.kozlowski@linaro.org> wrote:
>>
>> The PSCI cpuidle power domain in power_off callback uses
>> __this_cpu_write() so it is PREEMPT_RT safe.  This allows to use it in
>> Realtime kernels and solves errors like:
>>
>>   BUG: scheduling while atomic: swapper/2/0/0x00000002
>>   Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>>   Call trace:
>>    dump_backtrace.part.0+0xe0/0xf0
>>    show_stack+0x18/0x40
>>    dump_stack_lvl+0x68/0x84
>>    dump_stack+0x18/0x34
>>    __schedule_bug+0x60/0x80
>>    __schedule+0x628/0x800
>>    schedule_rtlock+0x28/0x5c
>>    rtlock_slowlock_locked+0x360/0xd30
>>    rt_spin_lock+0x88/0xb0
>>    genpd_lock_nested_spin+0x1c/0x30
>>    genpd_power_off.part.0.isra.0+0x20c/0x2a0
>>    genpd_runtime_suspend+0x150/0x2bc
>>    __rpm_callback+0x48/0x170
>>    rpm_callback+0x6c/0x7c
>>    rpm_suspend+0x108/0x660
>>    __pm_runtime_suspend+0x4c/0x8c
>>    __psci_enter_domain_idle_state.constprop.0+0x54/0xe0
>>    psci_enter_domain_idle_state+0x18/0x2c
>>    cpuidle_enter_state+0x8c/0x4e0
>>    cpuidle_enter+0x38/0x50
>>    do_idle+0x248/0x2f0
>>    cpu_startup_entry+0x24/0x30
>>    secondary_start_kernel+0x130/0x154
>>    __secondary_switched+0xb0/0xb4
>>
>> Cc: Adrien Thierry <athierry@redhat.com>
>> Cc: Brian Masney <bmasney@redhat.com>
>> Cc: linux-rt-users@vger.kernel.org
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>> ---
>>  drivers/cpuidle/cpuidle-psci-domain.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
>> index c80cf9ddabd8..d15a91fb7048 100644
>> --- a/drivers/cpuidle/cpuidle-psci-domain.c
>> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
>> @@ -62,7 +62,8 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
>>         if (!pd_provider)
>>                 goto free_pd;
>>
>> -       pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
>> +       pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_RT_SAFE | \
>> +                    GENPD_FLAG_CPU_DOMAIN;
> 
> My main concern with this, is that it will affect the parent domains
> too. Whether those would be able to use the GENPD_FLAG_RT_SAFE or not,
> is a different story.
> 
> In one way or the other, I think it would be better to limit the
> GENPD_FLAG_RT_SAFE to be used only for PREEMPT_RT kernels.

I can do it... or maybe we should just drop the flags (RT and IRQ safe)
when parent domain does not have it?

Best regards,
Krzysztof
Krzysztof Kozlowski Jan. 19, 2023, 5:06 p.m. UTC | #5
On 19/01/2023 16:40, Krzysztof Kozlowski wrote:
> On 17/01/2023 16:27, Ulf Hansson wrote:
>> On Mon, 19 Dec 2022 at 16:15, Krzysztof Kozlowski
>> <krzysztof.kozlowski@linaro.org> wrote:
>>>
>>> The PSCI cpuidle power domain in power_off callback uses
>>> __this_cpu_write() so it is PREEMPT_RT safe.  This allows to use it in
>>> Realtime kernels and solves errors like:
>>>
>>>   BUG: scheduling while atomic: swapper/2/0/0x00000002
>>>   Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>>>   Call trace:
>>>    dump_backtrace.part.0+0xe0/0xf0
>>>    show_stack+0x18/0x40
>>>    dump_stack_lvl+0x68/0x84
>>>    dump_stack+0x18/0x34
>>>    __schedule_bug+0x60/0x80
>>>    __schedule+0x628/0x800
>>>    schedule_rtlock+0x28/0x5c
>>>    rtlock_slowlock_locked+0x360/0xd30
>>>    rt_spin_lock+0x88/0xb0
>>>    genpd_lock_nested_spin+0x1c/0x30
>>>    genpd_power_off.part.0.isra.0+0x20c/0x2a0
>>>    genpd_runtime_suspend+0x150/0x2bc
>>>    __rpm_callback+0x48/0x170
>>>    rpm_callback+0x6c/0x7c
>>>    rpm_suspend+0x108/0x660
>>>    __pm_runtime_suspend+0x4c/0x8c
>>>    __psci_enter_domain_idle_state.constprop.0+0x54/0xe0
>>>    psci_enter_domain_idle_state+0x18/0x2c
>>>    cpuidle_enter_state+0x8c/0x4e0
>>>    cpuidle_enter+0x38/0x50
>>>    do_idle+0x248/0x2f0
>>>    cpu_startup_entry+0x24/0x30
>>>    secondary_start_kernel+0x130/0x154
>>>    __secondary_switched+0xb0/0xb4
>>>
>>> Cc: Adrien Thierry <athierry@redhat.com>
>>> Cc: Brian Masney <bmasney@redhat.com>
>>> Cc: linux-rt-users@vger.kernel.org
>>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>>> ---
>>>  drivers/cpuidle/cpuidle-psci-domain.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
>>> index c80cf9ddabd8..d15a91fb7048 100644
>>> --- a/drivers/cpuidle/cpuidle-psci-domain.c
>>> +++ b/drivers/cpuidle/cpuidle-psci-domain.c
>>> @@ -62,7 +62,8 @@ static int psci_pd_init(struct device_node *np, bool use_osi)
>>>         if (!pd_provider)
>>>                 goto free_pd;
>>>
>>> -       pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
>>> +       pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_RT_SAFE | \
>>> +                    GENPD_FLAG_CPU_DOMAIN;
>>
>> My main concern with this, is that it will affect the parent domains
>> too. Whether those would be able to use the GENPD_FLAG_RT_SAFE or not,
>> is a different story.
>>
>> In one way or the other, I think it would be better to limit the
>> GENPD_FLAG_RT_SAFE to be used only for PREEMPT_RT kernels.
> 
> I can do it... or maybe we should just drop the flags (RT and IRQ safe)
> when parent domain does not have it?

Actually, with next patch, I can skip this one entirely. This is needed
if PSCI cpuidle driver invokes runtime PM functions which eventually
puts PSCI cpuidle power domain into suspend/resume. If the former does
not happen, the domain driver won't be even called so my problem disappears.

Since I need patch 3/5 - effectively disabling PSCI cpuidle runtime PM -
we can drop this one, till we find a real user needing it.

Best regards,
Krzysztof
diff mbox series

Patch

diff --git a/drivers/cpuidle/cpuidle-psci-domain.c b/drivers/cpuidle/cpuidle-psci-domain.c
index c80cf9ddabd8..d15a91fb7048 100644
--- a/drivers/cpuidle/cpuidle-psci-domain.c
+++ b/drivers/cpuidle/cpuidle-psci-domain.c
@@ -62,7 +62,8 @@  static int psci_pd_init(struct device_node *np, bool use_osi)
 	if (!pd_provider)
 		goto free_pd;
 
-	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
+	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_RT_SAFE | \
+		     GENPD_FLAG_CPU_DOMAIN;
 
 	/* Allow power off when OSI has been successfully enabled. */
 	if (use_osi)