Message ID | 20221219151503.385816-6-krzysztof.kozlowski@linaro.org |
---|---|
State | New |
Series | PM: Fixes for Realtime systems |
On 2022-12-19 16:15:03 [+0100], Krzysztof Kozlowski wrote:
> If PM domain on PREEMPT_RT is marked as GENPD_FLAG_RT_SAFE(), the
> genpd_lock() will be a raw spin lock, thus device_pm_check_callbacks()

a raw_spinlock_t

> must be called outside of the domain lock.

Right. First the sleeping lock, followed by the spinning locks. This is
covered in
	Documentation/locking/locktypes.rst

at the end.

> This solves on PREEMPT_RT:

Yes but

> [ BUG: Invalid wait context ]

This "Invalid wait context" should also trigger on !PREEMPT_RT with
CONFIG_PROVE_RAW_LOCK_NESTING.

> 6.1.0-rt5-00325-g8a5f56bcfcca #8 Tainted: G W
> -----------------------------
> swapper/0/1 is trying to lock:
> ffff76e045dec9a0 (&dev->power.lock){+.+.}-{3:3}, at: device_pm_check_callbacks+0x20/0xf0
> other info that might help us debug this:
> context-{5:5}
> 3 locks held by swapper/0/1:
> #0: ffff76e045deb8e8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1c0
> #1: ffffa92b81f825e0 (gpd_list_lock){+.+.}-{4:4}, at: __genpd_dev_pm_attach+0x7c/0x250
> #2: ffff76e04105c7a0 (&genpd->rslock){....}-{2:2}, at: genpd_lock_rawspin+0x1c/0x30
> stack backtrace:
> CPU: 5 PID: 1 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8
> Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
> Call trace:
> dump_backtrace.part.0+0xe0/0xf0
> show_stack+0x18/0x40
> dump_stack_lvl+0x8c/0xb8
> dump_stack+0x18/0x34
> __lock_acquire+0x938/0x2100
> lock_acquire.part.0+0x104/0x28c
> lock_acquire+0x68/0x84
> rt_spin_lock+0x40/0x100
> device_pm_check_callbacks+0x20/0xf0
> dev_pm_domain_set+0x54/0x64
> genpd_add_device+0x258/0x340
> __genpd_dev_pm_attach+0xa8/0x250
> genpd_dev_pm_attach_by_id+0xc4/0x190
> genpd_dev_pm_attach_by_name+0x3c/0x60
> dev_pm_domain_attach_by_name+0x20/0x30
> dt_idle_attach_cpu+0x24/0x90
> psci_cpuidle_probe+0x300/0x4b0
> platform_probe+0x68/0xe0
> really_probe+0xbc/0x2dc
> __driver_probe_device+0x78/0xe0
> driver_probe_device+0x3c/0x160
> __device_attach_driver+0xb8/0x140
> bus_for_each_drv+0x78/0xd0
> __device_attach+0xa8/0x1c0
> device_initial_probe+0x14/0x20
> bus_probe_device+0x9c/0xa4
> device_add+0x3b4/0x8dc
> platform_device_add+0x114/0x234
> platform_device_register_full+0x108/0x1a4
> psci_idle_init+0x6c/0xb0
> do_one_initcall+0x74/0x450
> kernel_init_freeable+0x2e0/0x350
> kernel_init+0x24/0x130
> ret_from_fork+0x10/0x20

I would prefer a description of the issue instead of having this
backtrace.

> Cc: Adrien Thierry <athierry@redhat.com>
> Cc: Brian Masney <bmasney@redhat.com>
> Cc: linux-rt-users@vger.kernel.org
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> ---
> drivers/base/power/domain.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 4dfce1d476f4..db499ba40497 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1666,10 +1666,14 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
>  	if (ret)
>  		goto out;
>
> +
> +	/* PREEMPT_RT: Must be outside of genpd_lock */

Could this comment be rewritten if needed?
The callback, which acquires sleeping locks on PREEMPT_RT, can be moved
before genpd_lock() because it does not rely on this lock. It must be
moved because the latter may acquire spinning locks.
It might also be part of the commit message...

> +	device_pm_check_callbacks(dev);
> +
>  	genpd_lock(genpd);
>
>  	genpd_set_cpumask(genpd, gpd_data->cpu);
> -	dev_pm_domain_set(dev, &genpd->domain);
> +	dev_pm_domain_set_no_cb(dev, &genpd->domain);
>
>  	genpd->device_count++;
>  	if (gd)

Sebastian
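
The nesting rule behind the "Invalid wait context" report can be modelled in a few lines of userspace C. This is only a sketch, not kernel code: the enum values mirror the wait-type numbers printed in the report ({2:2} for genpd->rslock, {3:3} for dev->power.lock, {4:4} for the mutexes), while the names `wait_type` and `nesting_ok()` are made up for illustration and the check is a simplification of what lockdep does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Wait types as they appear in the report: {2:2}, {3:3}, {4:4}. */
enum wait_type {
	WAIT_SPIN   = 2,	/* raw_spinlock_t: always spins */
	WAIT_CONFIG = 3,	/* spinlock_t: becomes a sleeping lock on PREEMPT_RT */
	WAIT_SLEEP  = 4,	/* mutex: may always sleep */
};

/*
 * Hypothetical helper: a lock may be acquired only if its wait type does
 * not exceed the wait type of any lock that is already held. Taking a
 * (possibly) sleeping lock under a raw spinlock therefore fails.
 */
static bool nesting_ok(const enum wait_type *held, size_t n,
		       enum wait_type next)
{
	for (size_t i = 0; i < n; i++) {
		if (next > held[i])
			return false;
	}
	return true;
}
```

With the three locks from the report held (two mutexes and the raw genpd->rslock), acquiring dev->power.lock is rejected by this model. With only the two mutexes held, which is the situation once device_pm_check_callbacks() runs before genpd_lock(), it is accepted, and the raw spinlock can still be taken afterwards.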
On 12/01/2023 12:31, Sebastian Andrzej Siewior wrote:
> On 2022-12-19 16:15:03 [+0100], Krzysztof Kozlowski wrote:
>> If PM domain on PREEMPT_RT is marked as GENPD_FLAG_RT_SAFE(), the
>> genpd_lock() will be a raw spin lock, thus device_pm_check_callbacks()
>
> a raw_spinlock_t
>
>> must be called outside of the domain lock.
>
> Right. First the sleeping lock, followed by the spinning locks. This is
> covered in
> 	Documentation/locking/locktypes.rst
>
> at the end.

I don't understand your comment. Do you expect me to change something?

>
>> This solves on PREEMPT_RT:
> Yes but
>> [ BUG: Invalid wait context ]
>
> This "Invalid wait context" should also trigger on !PREEMPT_RT with
> CONFIG_PROVE_RAW_LOCK_NESTING.

Could be, I just did not hit it.

>
>> 6.1.0-rt5-00325-g8a5f56bcfcca #8 Tainted: G W
>> -----------------------------
>> swapper/0/1 is trying to lock:
>> ffff76e045dec9a0 (&dev->power.lock){+.+.}-{3:3}, at: device_pm_check_callbacks+0x20/0xf0
>> other info that might help us debug this:
>> context-{5:5}
>> 3 locks held by swapper/0/1:
>> #0: ffff76e045deb8e8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1c0
>> #1: ffffa92b81f825e0 (gpd_list_lock){+.+.}-{4:4}, at: __genpd_dev_pm_attach+0x7c/0x250
>> #2: ffff76e04105c7a0 (&genpd->rslock){....}-{2:2}, at: genpd_lock_rawspin+0x1c/0x30
>> stack backtrace:
>> CPU: 5 PID: 1 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8
>> Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>> Call trace:
>> dump_backtrace.part.0+0xe0/0xf0
>> show_stack+0x18/0x40
>> dump_stack_lvl+0x8c/0xb8
>> dump_stack+0x18/0x34
>> __lock_acquire+0x938/0x2100
>> lock_acquire.part.0+0x104/0x28c
>> lock_acquire+0x68/0x84
>> rt_spin_lock+0x40/0x100
>> device_pm_check_callbacks+0x20/0xf0
>> dev_pm_domain_set+0x54/0x64
>> genpd_add_device+0x258/0x340
>> __genpd_dev_pm_attach+0xa8/0x250
>> genpd_dev_pm_attach_by_id+0xc4/0x190
>> genpd_dev_pm_attach_by_name+0x3c/0x60
>> dev_pm_domain_attach_by_name+0x20/0x30
>> dt_idle_attach_cpu+0x24/0x90
>> psci_cpuidle_probe+0x300/0x4b0
>> platform_probe+0x68/0xe0
>> really_probe+0xbc/0x2dc
>> __driver_probe_device+0x78/0xe0
>> driver_probe_device+0x3c/0x160
>> __device_attach_driver+0xb8/0x140
>> bus_for_each_drv+0x78/0xd0
>> __device_attach+0xa8/0x1c0
>> device_initial_probe+0x14/0x20
>> bus_probe_device+0x9c/0xa4
>> device_add+0x3b4/0x8dc
>> platform_device_add+0x114/0x234
>> platform_device_register_full+0x108/0x1a4
>> psci_idle_init+0x6c/0xb0
>> do_one_initcall+0x74/0x450
>> kernel_init_freeable+0x2e0/0x350
>> kernel_init+0x24/0x130
>> ret_from_fork+0x10/0x20
>
> I would prefer a description of the issue instead of having this
> backtrace.

I'll extend the commit msg.

>
>> Cc: Adrien Thierry <athierry@redhat.com>
>> Cc: Brian Masney <bmasney@redhat.com>
>> Cc: linux-rt-users@vger.kernel.org
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>> ---
>> drivers/base/power/domain.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 4dfce1d476f4..db499ba40497 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -1666,10 +1666,14 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
>>  	if (ret)
>>  		goto out;
>>
>> +
>> +	/* PREEMPT_RT: Must be outside of genpd_lock */
>
> Could this comment be rewritten if needed?
> The callback, which acquires sleeping locks on PREEMPT_RT, can be moved
> before genpd_lock() because it does not rely on this lock. It must be
> moved because the latter may acquire spinning locks.

Sure

Best regards,
Krzysztof
On Mon, 19 Dec 2022 at 16:15, Krzysztof Kozlowski
<krzysztof.kozlowski@linaro.org> wrote:
>
> If PM domain on PREEMPT_RT is marked as GENPD_FLAG_RT_SAFE(), the
> genpd_lock() will be a raw spin lock, thus device_pm_check_callbacks()
> must be called outside of the domain lock.
>
> This solves on PREEMPT_RT:
>
> [ BUG: Invalid wait context ]
> 6.1.0-rt5-00325-g8a5f56bcfcca #8 Tainted: G W
> -----------------------------
> swapper/0/1 is trying to lock:
> ffff76e045dec9a0 (&dev->power.lock){+.+.}-{3:3}, at: device_pm_check_callbacks+0x20/0xf0
> other info that might help us debug this:
> context-{5:5}
> 3 locks held by swapper/0/1:
> #0: ffff76e045deb8e8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1c0
> #1: ffffa92b81f825e0 (gpd_list_lock){+.+.}-{4:4}, at: __genpd_dev_pm_attach+0x7c/0x250
> #2: ffff76e04105c7a0 (&genpd->rslock){....}-{2:2}, at: genpd_lock_rawspin+0x1c/0x30
> stack backtrace:
> CPU: 5 PID: 1 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8
> Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
> Call trace:
> dump_backtrace.part.0+0xe0/0xf0
> show_stack+0x18/0x40
> dump_stack_lvl+0x8c/0xb8
> dump_stack+0x18/0x34
> __lock_acquire+0x938/0x2100
> lock_acquire.part.0+0x104/0x28c
> lock_acquire+0x68/0x84
> rt_spin_lock+0x40/0x100
> device_pm_check_callbacks+0x20/0xf0
> dev_pm_domain_set+0x54/0x64
> genpd_add_device+0x258/0x340
> __genpd_dev_pm_attach+0xa8/0x250
> genpd_dev_pm_attach_by_id+0xc4/0x190
> genpd_dev_pm_attach_by_name+0x3c/0x60
> dev_pm_domain_attach_by_name+0x20/0x30
> dt_idle_attach_cpu+0x24/0x90
> psci_cpuidle_probe+0x300/0x4b0
> platform_probe+0x68/0xe0
> really_probe+0xbc/0x2dc
> __driver_probe_device+0x78/0xe0
> driver_probe_device+0x3c/0x160
> __device_attach_driver+0xb8/0x140
> bus_for_each_drv+0x78/0xd0
> __device_attach+0xa8/0x1c0
> device_initial_probe+0x14/0x20
> bus_probe_device+0x9c/0xa4
> device_add+0x3b4/0x8dc
> platform_device_add+0x114/0x234
> platform_device_register_full+0x108/0x1a4
> psci_idle_init+0x6c/0xb0
> do_one_initcall+0x74/0x450
> kernel_init_freeable+0x2e0/0x350
> kernel_init+0x24/0x130
> ret_from_fork+0x10/0x20
>
> Cc: Adrien Thierry <athierry@redhat.com>
> Cc: Brian Masney <bmasney@redhat.com>
> Cc: linux-rt-users@vger.kernel.org
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> ---
> drivers/base/power/domain.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index 4dfce1d476f4..db499ba40497 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -1666,10 +1666,14 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
>  	if (ret)
>  		goto out;
>
> +
> +	/* PREEMPT_RT: Must be outside of genpd_lock */
> +	device_pm_check_callbacks(dev);
> +
>  	genpd_lock(genpd);
>
>  	genpd_set_cpumask(genpd, gpd_data->cpu);
> -	dev_pm_domain_set(dev, &genpd->domain);
> +	dev_pm_domain_set_no_cb(dev, &genpd->domain);
>
>  	genpd->device_count++;
>  	if (gd)

Rather than splitting up the assignment in two steps, I think it
should be perfectly fine to move the call to dev_pm_domain_set()
outside the genpd lock.

Note that, genpd_add_device() is always being called with
gpd_list_lock mutex being held. This prevents the genpd from being
removed, while we use it here.

Moreover, we need a similar change for the call to dev_pm_domain_set()
in genpd_remove_device().

> --
> 2.34.1
>

Kind regards
Uffe
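
The alternative suggested above can be sketched as a userspace model in C. This is not the kernel implementation: every type and function below (`dev_model`, `genpd_model`, `add_device_model()` and so on) is a made-up stand-in. The review only says the call should move "outside the genpd lock", so placing it before the lock on add and after the unlock on remove is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct generic_pm_domain. */
struct dev_model {
	const void *pm_domain;
};

struct genpd_model {
	int domain;		/* placeholder for the embedded dev_pm_domain */
	bool locked;		/* models the raw spinlock being held */
	bool set_under_lock;	/* set if the domain was assigned under the lock */
	unsigned int device_count;
};

static void genpd_lock_model(struct genpd_model *g)   { g->locked = true; }
static void genpd_unlock_model(struct genpd_model *g) { g->locked = false; }

/*
 * Models dev_pm_domain_set(): the real function ends up taking
 * dev->power.lock, so the model records whether it ran under the lock.
 */
static void dev_pm_domain_set_model(struct dev_model *dev,
				    struct genpd_model *g, const void *pd)
{
	if (g->locked)
		g->set_under_lock = true;
	dev->pm_domain = pd;
}

/* Assign the domain first, then take the domain lock for the bookkeeping. */
static void add_device_model(struct genpd_model *g, struct dev_model *dev)
{
	dev_pm_domain_set_model(dev, g, &g->domain);

	genpd_lock_model(g);
	g->device_count++;
	genpd_unlock_model(g);
}

/* Mirror image for removal: bookkeeping under the lock, then clear. */
static void remove_device_model(struct genpd_model *g, struct dev_model *dev)
{
	genpd_lock_model(g);
	g->device_count--;
	genpd_unlock_model(g);

	dev_pm_domain_set_model(dev, g, NULL);
}
```

In the real code, what makes this safe is the point made in the review: gpd_list_lock is held by the caller, so the genpd cannot disappear while the device is being attached.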
On 17/01/2023 16:11, Ulf Hansson wrote:
> On Mon, 19 Dec 2022 at 16:15, Krzysztof Kozlowski
> <krzysztof.kozlowski@linaro.org> wrote:
>>
>> If PM domain on PREEMPT_RT is marked as GENPD_FLAG_RT_SAFE(), the
>> genpd_lock() will be a raw spin lock, thus device_pm_check_callbacks()
>> must be called outside of the domain lock.
>>
>> This solves on PREEMPT_RT:
>>
>> [ BUG: Invalid wait context ]
>> 6.1.0-rt5-00325-g8a5f56bcfcca #8 Tainted: G W
>> -----------------------------
>> swapper/0/1 is trying to lock:
>> ffff76e045dec9a0 (&dev->power.lock){+.+.}-{3:3}, at: device_pm_check_callbacks+0x20/0xf0
>> other info that might help us debug this:
>> context-{5:5}
>> 3 locks held by swapper/0/1:
>> #0: ffff76e045deb8e8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1c0
>> #1: ffffa92b81f825e0 (gpd_list_lock){+.+.}-{4:4}, at: __genpd_dev_pm_attach+0x7c/0x250
>> #2: ffff76e04105c7a0 (&genpd->rslock){....}-{2:2}, at: genpd_lock_rawspin+0x1c/0x30
>> stack backtrace:
>> CPU: 5 PID: 1 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8
>> Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
>> Call trace:
>> dump_backtrace.part.0+0xe0/0xf0
>> show_stack+0x18/0x40
>> dump_stack_lvl+0x8c/0xb8
>> dump_stack+0x18/0x34
>> __lock_acquire+0x938/0x2100
>> lock_acquire.part.0+0x104/0x28c
>> lock_acquire+0x68/0x84
>> rt_spin_lock+0x40/0x100
>> device_pm_check_callbacks+0x20/0xf0
>> dev_pm_domain_set+0x54/0x64
>> genpd_add_device+0x258/0x340
>> __genpd_dev_pm_attach+0xa8/0x250
>> genpd_dev_pm_attach_by_id+0xc4/0x190
>> genpd_dev_pm_attach_by_name+0x3c/0x60
>> dev_pm_domain_attach_by_name+0x20/0x30
>> dt_idle_attach_cpu+0x24/0x90
>> psci_cpuidle_probe+0x300/0x4b0
>> platform_probe+0x68/0xe0
>> really_probe+0xbc/0x2dc
>> __driver_probe_device+0x78/0xe0
>> driver_probe_device+0x3c/0x160
>> __device_attach_driver+0xb8/0x140
>> bus_for_each_drv+0x78/0xd0
>> __device_attach+0xa8/0x1c0
>> device_initial_probe+0x14/0x20
>> bus_probe_device+0x9c/0xa4
>> device_add+0x3b4/0x8dc
>> platform_device_add+0x114/0x234
>> platform_device_register_full+0x108/0x1a4
>> psci_idle_init+0x6c/0xb0
>> do_one_initcall+0x74/0x450
>> kernel_init_freeable+0x2e0/0x350
>> kernel_init+0x24/0x130
>> ret_from_fork+0x10/0x20
>>
>> Cc: Adrien Thierry <athierry@redhat.com>
>> Cc: Brian Masney <bmasney@redhat.com>
>> Cc: linux-rt-users@vger.kernel.org
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>> ---
>> drivers/base/power/domain.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
>> index 4dfce1d476f4..db499ba40497 100644
>> --- a/drivers/base/power/domain.c
>> +++ b/drivers/base/power/domain.c
>> @@ -1666,10 +1666,14 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
>>  	if (ret)
>>  		goto out;
>>
>> +
>> +	/* PREEMPT_RT: Must be outside of genpd_lock */
>> +	device_pm_check_callbacks(dev);
>> +
>>  	genpd_lock(genpd);
>>
>>  	genpd_set_cpumask(genpd, gpd_data->cpu);
>> -	dev_pm_domain_set(dev, &genpd->domain);
>> +	dev_pm_domain_set_no_cb(dev, &genpd->domain);
>>
>>  	genpd->device_count++;
>>  	if (gd)
>
> Rather than splitting up the assignment in two steps, I think it
> should be perfectly fine to move the call to dev_pm_domain_set()
> outside the genpd lock.
>
> Note that, genpd_add_device() is always being called with
> gpd_list_lock mutex being held. This prevents the genpd from being
> removed, while we use it here.

Hm, indeed, should be fine.

>
> Moreover, we need a similar change for the call to dev_pm_domain_set()
> in genpd_remove_device().

Right.

Best regards,
Krzysztof
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 4dfce1d476f4..db499ba40497 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1666,10 +1666,14 @@ static int genpd_add_device(struct generic_pm_domain *genpd, struct device *dev,
 	if (ret)
 		goto out;
 
+
+	/* PREEMPT_RT: Must be outside of genpd_lock */
+	device_pm_check_callbacks(dev);
+
 	genpd_lock(genpd);
 
 	genpd_set_cpumask(genpd, gpd_data->cpu);
-	dev_pm_domain_set(dev, &genpd->domain);
+	dev_pm_domain_set_no_cb(dev, &genpd->domain);
 
 	genpd->device_count++;
 	if (gd)
If PM domain on PREEMPT_RT is marked as GENPD_FLAG_RT_SAFE(), the
genpd_lock() will be a raw spin lock, thus device_pm_check_callbacks()
must be called outside of the domain lock.

This solves on PREEMPT_RT:

[ BUG: Invalid wait context ]
6.1.0-rt5-00325-g8a5f56bcfcca #8 Tainted: G W
-----------------------------
swapper/0/1 is trying to lock:
ffff76e045dec9a0 (&dev->power.lock){+.+.}-{3:3}, at: device_pm_check_callbacks+0x20/0xf0
other info that might help us debug this:
context-{5:5}
3 locks held by swapper/0/1:
#0: ffff76e045deb8e8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x1c0
#1: ffffa92b81f825e0 (gpd_list_lock){+.+.}-{4:4}, at: __genpd_dev_pm_attach+0x7c/0x250
#2: ffff76e04105c7a0 (&genpd->rslock){....}-{2:2}, at: genpd_lock_rawspin+0x1c/0x30
stack backtrace:
CPU: 5 PID: 1 Comm: swapper/0 Tainted: G W 6.1.0-rt5-00325-g8a5f56bcfcca #8
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Call trace:
dump_backtrace.part.0+0xe0/0xf0
show_stack+0x18/0x40
dump_stack_lvl+0x8c/0xb8
dump_stack+0x18/0x34
__lock_acquire+0x938/0x2100
lock_acquire.part.0+0x104/0x28c
lock_acquire+0x68/0x84
rt_spin_lock+0x40/0x100
device_pm_check_callbacks+0x20/0xf0
dev_pm_domain_set+0x54/0x64
genpd_add_device+0x258/0x340
__genpd_dev_pm_attach+0xa8/0x250
genpd_dev_pm_attach_by_id+0xc4/0x190
genpd_dev_pm_attach_by_name+0x3c/0x60
dev_pm_domain_attach_by_name+0x20/0x30
dt_idle_attach_cpu+0x24/0x90
psci_cpuidle_probe+0x300/0x4b0
platform_probe+0x68/0xe0
really_probe+0xbc/0x2dc
__driver_probe_device+0x78/0xe0
driver_probe_device+0x3c/0x160
__device_attach_driver+0xb8/0x140
bus_for_each_drv+0x78/0xd0
__device_attach+0xa8/0x1c0
device_initial_probe+0x14/0x20
bus_probe_device+0x9c/0xa4
device_add+0x3b4/0x8dc
platform_device_add+0x114/0x234
platform_device_register_full+0x108/0x1a4
psci_idle_init+0x6c/0xb0
do_one_initcall+0x74/0x450
kernel_init_freeable+0x2e0/0x350
kernel_init+0x24/0x130
ret_from_fork+0x10/0x20

Cc: Adrien Thierry <athierry@redhat.com>
Cc: Brian Masney <bmasney@redhat.com>
Cc: linux-rt-users@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
---
drivers/base/power/domain.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)