Message ID | 1412174494-15346-2-git-send-email-ulf.hansson@linaro.org |
---|---|
State | New |
Headers | show |
On 01/10/14 16:41, Ulf Hansson wrote:
> There are currently no users of this API, let's remove it.
The sad fact is that removal of pm_genpd_dev_need_restore() calls
from arch/arm/mach-exynos/pm_domains.c introduces regressions in
multiple exynos drivers (I'm sure it breaks media drivers).
I think before doing such changes all relevant drivers should be
updated first. I need to take a closer look again, but it seems
after dropping the pm_genpd_dev_need_restore() calls, client
driver's runtime_resume() callback is not being called in response
to first pm_runtime_get(_sync) call, even if a device is runtime
pm active.
More details can be found in commit ebc35c726298ba3fdebba316a
'ARM: EXYNOS: register devices in 'need_restore' state for pm_domains'.
The above only happens when devices are added to an inactive power
domain, then I guess patch 2/4 is also an attempt to address this
issue ?
--
Thanks,
Sylwester
--
To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 1 October 2014 18:36, Sylwester Nawrocki <s.nawrocki@samsung.com> wrote: > On 01/10/14 16:41, Ulf Hansson wrote: >> There are currently no users of this API, let's remove it. Hi Sylwester, Thanks for your reply! > > The sad fact is that removal of pm_genpd_dev_need_restore() calls > from arch/arm/mach-exynos/pm_domains.c introduces regressions in > multiple exynos drivers (I'm sure it breaks media drivers). The fact that you need pm_genpd_dev_need_restore() is really worrying. That indicates that exynos are suffering from this race condition I am trying to fix in this patchset. > I think before doing such changes all relevant drivers should be > updated first. I need to take a closer look again, but it seems > after dropping the pm_genpd_dev_need_restore() calls, client > driver's runtime_resume() callback is not being called in response > to first pm_runtime_get(_sync) call, even if a device is runtime > pm active. Why would runtime PM callbacks be invoked when the device are runtime PM active? That's prevented by the runtime PM core and is the expected behaviour. pm_genpd_dev_need_restore() is not the solution, besides that it gives you the option to lie about device's runtime PM state to genpd. Thus, if you are really lucky, that might workaround your issues. :-) I will happily help out in fixing drivers for exynos. Would be nice if you could provide me with a list of which driver/subsystems that seems broken. HW, wise I have a exynos 5250 and exynos 5422 on my desk, so those I can test. > > More details can be found in commit ebc35c726298ba3fdebba316a > 'ARM: EXYNOS: register devices in 'need_restore' state for pm_domains'. > > The above only happens when devices are added to an inactive power > domain, then I guess patch 2/4 is also an attempt to address this > issue ? Yes. I would really appreciate if you could run a test with the complete patchset and see if that resolves the issues. Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello Ulf, On 02/10/14 11:09, Ulf Hansson wrote: > On 1 October 2014 18:36, Sylwester Nawrocki <s.nawrocki@samsung.com> wrote: >> On 01/10/14 16:41, Ulf Hansson wrote: >>> There are currently no users of this API, let's remove it. > > Hi Sylwester, > > Thanks for your reply! > >> The sad fact is that removal of pm_genpd_dev_need_restore() calls >> from arch/arm/mach-exynos/pm_domains.c introduces regressions in >> multiple exynos drivers (I'm sure it breaks media drivers). > > The fact that you need pm_genpd_dev_need_restore() is really worrying. > That indicates that exynos are suffering from this race condition I am > trying to fix in this patchset. > >> I think before doing such changes all relevant drivers should be >> updated first. I need to take a closer look again, but it seems >> after dropping the pm_genpd_dev_need_restore() calls, client >> driver's runtime_resume() callback is not being called in response >> to first pm_runtime_get(_sync) call, even if a device is runtime >> pm active. Sorry, I meant to say "inactive" here. > Why would runtime PM callbacks be invoked when the device are runtime > PM active? That's prevented by the runtime PM core and is the expected > behaviour. > > pm_genpd_dev_need_restore() is not the solution, besides that it gives > you the option to lie about device's runtime PM state to genpd. Thus, > if you are really lucky, that might workaround your issues. :-) Might be, nevertheless so far it more or less worked for us. At least I'm sure without it everything breaks right away. > I will happily help out in fixing drivers for exynos. Would be nice if > you could provide me with a list of which driver/subsystems that seems > broken. HW, wise I have a exynos 5250 and exynos 5422 on my desk, so > those I can test. For exynos5 I would check exynos-gsc and s5p-mfc (see kernel logs below). I could take care of other, exynos4 drivers. But I believe the fix belongs in the PM core, I doubt we can solve in the driver a problem which only occurs to one instance (first probed) of a device from a set of devices in the power domain. >> More details can be found in commit ebc35c726298ba3fdebba316a >> 'ARM: EXYNOS: register devices in 'need_restore' state for pm_domains'. >> >> The above only happens when devices are added to an inactive power >> domain, then I guess patch 2/4 is also an attempt to address this >> issue ? > > Yes. > > I would really appreciate if you could run a test with the complete > patchset and see if that resolves the issues. Sure, is there a git tree I could fetch all the relevant patches from ? I'm not sure what the base tree for this series, I could only apply first 2 patches from this thread, on top of the generic OF PM domain patches. And that didn't solve issues which are seen in the logs below. Nevertheless, as Rafael pointed out, enabling all power domains unconditionally seems a step backwards. It would be nice to have resolved the race in a away the power domains remain in state left by the firmware and are being enabled only before any client device actually needs it. ===================================================================== [ 2.301092] ------------[ cut here ]------------ [ 2.305581] WARNING: CPU: 1 PID: 25 at drivers/clk/clk.c:889 clk_disable+0x24/0x30() [ 2.313288] Modules linked in:[ 2.315985] s5p-fimc-md soc:camera: clk provider not registered [ 2.322055] [ 2.323711] CPU: 1 PID: 25 Comm: kworker/1:0 Tainted: G W 3.16.1-00612-gb8fe420-dirty #11731 [ 2.333087] Workqueue: pm pm_runtime_work [ 2.337098] [<c0016548>] (unwind_backtrace) from [<c00129c4>] (show_stack+0x10/0x14) [ 2.344810] [<c00129c4>] (show_stack) from [<c0651264>] (dump_stack+0x80/0xcc) [ 2.352013] [<c0651264>] (dump_stack) from [<c00269e0>] (warn_slowpath_common+0x68/0x8c) [ 2.360082] [<c00269e0>] (warn_slowpath_common) from [<c0026a20>] (warn_slowpath_null+0x1c/0x24) [ 2.368849] [<c0026a20>] (warn_slowpath_null) from [<c0481190>] (clk_disable+0x24/0x30) [ 2.376838] [<c0481190>] (clk_disable) from [<c0415298>] (fimc_runtime_suspend+0x60/0x98) [ 2.384998] [<c0415298>] (fimc_runtime_suspend) from [<c0308edc>] (pm_generic_runtime_suspend+0x2c/0x38) [ 2.394459] [<c0308edc>] (pm_generic_runtime_suspend) from [<c03125a0>] (pm_genpd_default_save_state+0x50/0xb0) [ 2.404524] [<c03125a0>] (pm_genpd_default_save_state) from [<c0311ab8>] (pm_genpd_poweroff+0x22c/0x448) [ 2.413984] [<c0311ab8>] (pm_genpd_poweroff) from [<c0311d9c>] (pm_genpd_runtime_suspend+0xa4/0xf0) [ 2.423012] [<c0311d9c>] (pm_genpd_runtime_suspend) from [<c030a5b8>] (__rpm_callback+0x2c/0x60) [ 2.431779] [<c030a5b8>] (__rpm_callback) from [<c030a60c>] (rpm_callback+0x20/0x74) [ 2.439503] [<c030a60c>] (rpm_callback) from [<c030a9f4>] (rpm_suspend+0xd8/0x538) [ 2.447053] [<c030a9f4>] (rpm_suspend) from [<c030c084>] (pm_runtime_work+0x80/0x90) [ 2.454783] [<c030c084>] (pm_runtime_work) from [<c003dfec>] (process_one_work+0x124/0x420) [ 2.463113] [<c003dfec>] (process_one_work) from [<c003e354>] (worker_thread+0x38/0x568) [ 2.471185] [<c003e354>] (worker_thread) from [<c00442f8>] (kthread+0xcc/0xe8) [ 2.478390] [<c00442f8>] (kthread) from [<c000f208>] (ret_from_fork+0x14/0x2c) [ 2.485588] ---[ end trace 436d73983b61c827 ]--- [ 2.490701] Unable to handle kernel NULL pointer dereference at virtual address 0000011c [ 2.498288] pgd = c0004000 [ 2.500983] [0000011c] *pgd=00000000 [ 2.504514] Internal error: Oops: 805 [#1] PREEMPT SMP ARM [ 2.509978] Modules linked in: [ 2.513022] CPU: 3 PID: 763 Comm: kworker/3:1 Tainted: G W 3.16.1-00612-gb8fe420-dirty #11731 [ 2.522482] Workqueue: pm pm_runtime_work [ 2.526471] task: ee19b840 ti: ed918000 task.ti: ed918000 [ 2.531860] PC is at s5p_mfc_runtime_suspend+0xc/0x14 [ 2.536892] LR is at pm_generic_runtime_suspend+0x2c/0x38 [ 2.542270] pc : [<c04229a4>] lr : [<c0308edc>] psr: a0000113 [ 2.542270] sp : ed919e20 ip : 00000000 fp : ed918000 [ 2.553725] r10: 00000000 r9 : ee31988c r8 : ee319884 [ 2.558933] r7 : ee1686c0 r6 : ee319810 r5 : 00000001 r4 : ee3d3a10 [ 2.565443] r3 : 00000000 r2 : 00000000 r1 : 0000030a r0 : 00000000 [ 2.571955] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel [ 2.579245] Control: 10c53c7d Table: 4000404a DAC: 00000015 [ 2.584974] Process kworker/3:1 (pid: 763, stack limit = 0xed918240) [ 2.591310] Stack: (0xed919e20 to 0xed91a000) [ 2.595654] 9e20: ee3d3a10 c03125a0 80231d55 c0311ab8 c0a140c0 0000000a 80231d55 00000000 [ 2.603813] 9e40: 80231d55 00000000 80230b6d ee319810 ee31988c 00000000 c0a140c0 00000000 [ 2.611971] 9e60: c0a140c0 0000000a ee3d3c10 c0311d9c ee3d3c10 ee3d3c74 c0311cf8 c030a5b8 [ 2.620130] 9e80: ee3d3c10 00000000 00000008 c030a60c ee3d3cb8 00000000 00000008 c030a9f4 [ 2.628289] 9ea0: 00000000 ed918000 00000000 c0a10840 ee3d3c10 ee3d3c74 00000000 ed918000 [ 2.636449] 9ec0: 00000002 ee3d3cb8 ee3d3c74 eeb493d4 ed918000 eeb493c0 00000000 eeb4d100 [ 2.644608] 9ee0: 00000000 c030c084 c030c004 ee2bb280 ee3d3cb8 c003dfec c0a0e2c0 ee1abecc [ 2.652767] 9f00: 00000001 ee1abed8 00000000 ee2bb280 eeb493c0 eeb493d4 ed918000 ed918000 [ 2.660926] 9f20: 00000008 ee2bb298 eeb493c0 c003e354 c003e31c ed918000 00000000 00000000 [ 2.669085] 9f40: 00000000 edb41180 00000000 ee2bb280 c003e31c 00000000 00000000 00000000 [ 2.677245] 9f60: 00000000 c00442f8 00000000 00000000 ed919f9c ee2bb280 00000000 00000000 [ 2.685404] 9f80: ed919f80 ed919f80 00000000 00000000 ed919f90 ed919f90 ed919fac edb41180 [ 2.693563] 9fa0: c004422c 00000000 00000000 c000f208 00000000 00000000 00000000 00000000 [ 2.701722] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 2.709882] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff [ 2.718050] [<c04229a4>] (s5p_mfc_runtime_suspend) from [<c0308edc>] (pm_generic_runtime_suspend+0x2c/0x38) [ 2.727770] [<c0308edc>] (pm_generic_runtime_suspend) from [<c03125a0>] (pm_genpd_default_save_state+0x50/0xb0) [ 2.737836] [<c03125a0>] (pm_genpd_default_save_state) from [<c0311ab8>] (pm_genpd_poweroff+0x22c/0x448) [ 2.747295] [<c0311ab8>] (pm_genpd_poweroff) from [<c0311d9c>] (pm_genpd_runtime_suspend+0xa4/0xf0) [ 2.756325] [<c0311d9c>] (pm_genpd_runtime_suspend) from [<c030a5b8>] (__rpm_callback+0x2c/0x60) [ 2.765090] [<c030a5b8>] (__rpm_callback) from [<c030a60c>] (rpm_callback+0x20/0x74) [ 2.772815] [<c030a60c>] (rpm_callback) from [<c030a9f4>] (rpm_suspend+0xd8/0x538) [ 2.780365] [<c030a9f4>] (rpm_suspend) from [<c030c084>] (pm_runtime_work+0x80/0x90) [ 2.788092] [<c030c084>] (pm_runtime_work) from [<c003dfec>] (process_one_work+0x124/0x420) [ 2.796424] [<c003dfec>] (process_one_work) from [<c003e354>] (worker_thread+0x38/0x568) [ 2.804496] [<c003e354>] (worker_thread) from [<c00442f8>] (kthread+0xcc/0xe8) [ 2.811702] [<c00442f8>] (kthread) from [<c000f208>] (ret_from_fork+0x14/0x2c) [ 2.818903] Code: e12fff1e e5902058 e3a03000 e1a00003 (e582311c) [ 2.825018] ---[ end trace 436d73983b61c828 ]--- And here with printks in fimc-core.c and fimc-capture.c (in probe(), /dev/video open() and release() callbacks) to track device PM state: [ 2.263449] cam-power-domain: Power-on latency exceeded, new value 141375 ns [ 2.269257] exynos4-fimc 11800000.fimc: fimc_probe():1028: active: 0, suspended: 1 [ 2.276933] exynos4-fimc 11810000.fimc: fimc_probe():1028: active: 0, suspended: 1 [ 2.290555] s5p-fimc-md: Registered fimc.0.m2m as /dev/video0 [ 2.296284] s5p-fimc-md: Registered fimc.0.capture as /dev/video1 [ 2.302320] s5p-fimc-md: Registered fimc.1.m2m as /dev/video2 [ 2.308042] s5p-fimc-md: Registered fimc.1.capture as /dev/video3 [ 2.313970] exynos4-fimc 11800000.fimc: fimc_runtime_resume():1053: active: 0, suspended: 0 [ 2.322320] exynos4-fimc 11810000.fimc: fimc_runtime_suspend():1073: active: 0, suspended: 1 [ 2.330674] ------------[ cut here ]------------ [ 2.341165] WARNING: CPU: 0 PID: 450 at drivers/clk/clk.c:889 clk_disable+0x24/0x30() [ 2.348963] Modules linked in: [ 2.352008] CPU: 0 PID: 450 Comm: kworker/0:1 Tainted: G W 3.16.1-00612-gb8fe420-dirty #11735 [ 2.361469] Workqueue: pm pm_runtime_work [ 2.365482] [<c0016548>] (unwind_backtrace) from [<c00129c4>] (show_stack+0x10/0x14) [ 2.373192] [<c00129c4>] (show_stack) from [<c06516b0>] (dump_stack+0x80/0xcc) [ 2.380395] [<c06516b0>] (dump_stack) from [<c00269e0>] (warn_slowpath_common+0x68/0x8c) [ 2.388464] [<c00269e0>] (warn_slowpath_common) from [<c0026a20>] (warn_slowpath_null+0x1c/0x24) [ 2.397232] [<c0026a20>] (warn_slowpath_null) from [<c04815dc>] (clk_disable+0x24/0x30) [ 2.405222] [<c04815dc>] (clk_disable) from [<c0415560>] (fimc_runtime_suspend+0xa0/0xdc) [ 2.413382] [<c0415560>] (fimc_runtime_suspend) from [<c0308edc>] (pm_generic_runtime_suspend+0x2c/0x38) [ 2.422843] [<c0308edc>] (pm_generic_runtime_suspend) from [<c03125a0>] (pm_genpd_default_save_state+0x50/0xb0) [ 2.432907] [<c03125a0>] (pm_genpd_default_save_state) from [<c0311ab8>] (pm_genpd_poweroff+0x22c/0x448) [ 2.442367] [<c0311ab8>] (pm_genpd_poweroff) from [<c0311d9c>] (pm_genpd_runtime_suspend+0xa4/0xf0) [ 2.451395] [<c0311d9c>] (pm_genpd_runtime_suspend) from [<c030a5b8>] (__rpm_callback+0x2c/0x60) [ 2.460162] [<c030a5b8>] (__rpm_callback) from [<c030a60c>] (rpm_callback+0x20/0x74) [ 2.467886] [<c030a60c>] (rpm_callback) from [<c030a9f4>] (rpm_suspend+0xd8/0x538) [ 2.475437] [<c030a9f4>] (rpm_suspend) from [<c030c084>] (pm_runtime_work+0x80/0x90) [ 2.483166] [<c030c084>] (pm_runtime_work) from [<c003dfec>] (process_one_work+0x124/0x420) [ 2.491496] [<c003dfec>] (process_one_work) from [<c003e354>] (worker_thread+0x38/0x568) [ 2.499568] [<c003e354>] (worker_thread) from [<c00442f8>] (kthread+0xcc/0xe8) [ 2.506774] [<c00442f8>] (kthread) from [<c000f208>] (ret_from_fork+0x14/0x2c) [ 2.513972] ---[ end trace 52eeee3f62bd143d ]--- [ 2.518632] exynos4-fimc 11800000.fimc: fimc_runtime_suspend():1073: active: 0, suspended: 0 [ 2.518724] s5p-mfc 13400000.codec: registered sysmmu controller for left subdevice [ 2.519335] exynos-sysmmu 13620000.sysmmu: Enabled [ 2.519378] Unable to handle kernel NULL pointer dereference at virtual address 0000011c [ 2.519382] pgd = c0004000 [ 2.519387] [0000011c] *pgd=00000000 [ 2.519394] Internal error: Oops: 805 [#1] PREEMPT SMP ARM [ 2.519400] Modules linked in: [ 2.519408] CPU: 1 PID: 25 Comm: kworker/1:0 Tainted: G W 3.16.1-00612-gb8fe420-dirty #11735 [ 2.519417] Workqueue: pm pm_runtime_work [ 2.519422] task: ee0dabc0 ti: ee0ee000 task.ti: ee0ee000 [ 2.519434] PC is at s5p_mfc_runtime_suspend+0xc/0x14 [ 2.519442] LR is at pm_generic_runtime_suspend+0x2c/0x38 [ 2.519449] pc : [<c0422df0>] lr : [<c0308edc>] psr: a0000113 [ 2.519449] sp : ee0efe20 ip : 00000000 fp : ee0ee000 [ 2.519452] r10: 00000000 r9 : ee17d88c r8 : ee17d884 [ 2.519457] r7 : ed876cc0 r6 : ee17d810 r5 : 00000001 r4 : ed80ba10 [ 2.519460] r3 : 00000000 r2 : 00000000 r1 : 0000035a r0 : 00000000 [ 2.519466] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel [ 2.519471] Control: 10c53c7d Table: 4000404a DAC: 00000015 [ 2.519475] Process kworker/1:0 (pid: 25, stack limit = 0xee0ee240) [ 2.519480] Stack: (0xee0efe20 to 0xee0f0000) [ 2.519488] fe20: ed80ba10 c03125a0 8304b081 c0311ab8 c0a140c0 0000000a 8304b081 00000000 [ 2.519496] fe40: 8304b081 00000000 83049f6a ee17d810 ee17d88c 00000000 c0a140c0 00000000 [ 2.519504] fe60: c0a140c0 0000000a ed80bc10 c0311d9c ed80bc10 ed80bc74 c0311cf8 c030a5b8 [ 2.519511] fe80: ed80bc10 00000000 00000008 c030a60c ed80bcb8 00000000 00000008 c030a9f4 [ 2.519518] fea0: 00000000 ee0ee000 00000000 c0a10840 ed80bc10 ed80bc74 00000000 ee0ee000 [ 2.519525] fec0: 00000002 ed80bcb8 ed80bc74 eeb393d4 ee0ee000 eeb393c0 00000000 eeb3d100 [ 2.519532] fee0: 00000000 c030c084 c030c004 ee05ae00 ed80bcb8 c003dfec ee0eff2c c0052e6c [ 2.519540] ff00: 00000001 eeb39840 eeb393d4 ee05ae00 eeb393c0 eeb393d4 ee0ee000 ee0ee000 [ 2.519547] ff20: 00000008 ee05ae18 eeb393c0 c003e354 eeb393c0 ee0ee000 eeb39524 eeb39408 [ 2.519554] ff40: c003e31c ee06a540 00000000 ee05ae00 c003e31c 00000000 00000000 00000000 [ 2.519561] ff60: 00000000 c00442f8 00000000 00000000 ee0eff9c ee05ae00 00000000 00000000 [ 2.519568] ff80: ee0eff80 ee0eff80 00000000 00000000 ee0eff90 ee0eff90 ee0effac ee06a540 [ 2.519575] ffa0: c004422c 00000000 00000000 c000f208 00000000 00000000 00000000 00000000 [ 2.519581] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 2.519588] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 ffffffff ffffffff [ 2.519603] [<c0422df0>] (s5p_mfc_runtime_suspend) from [<c0308edc>] (pm_generic_runtime_suspend+0x2c/0x38) [ 2.519617] [<c0308edc>] (pm_generic_runtime_suspend) from [<c03125a0>] (pm_genpd_default_save_state+0x50/0xb0) [ 2.519629] [<c03125a0>] (pm_genpd_default_save_state) from [<c0311ab8>] (pm_genpd_poweroff+0x22c/0x448) [ 2.519639] [<c0311ab8>] (pm_genpd_poweroff) from [<c0311d9c>] (pm_genpd_runtime_suspend+0xa4/0xf0) [ 2.519651] [<c0311d9c>] (pm_genpd_runtime_suspend) from [<c030a5b8>] (__rpm_callback+0x2c/0x60) [ 2.519662] [<c030a5b8>] (__rpm_callback) from [<c030a60c>] (rpm_callback+0x20/0x74) [ 2.519672] [<c030a60c>] (rpm_callback) from [<c030a9f4>] (rpm_suspend+0xd8/0x538) [ 2.519681] [<c030a9f4>] (rpm_suspend) from [<c030c084>] (pm_runtime_work+0x80/0x90) [ 2.519692] [<c030c084>] (pm_runtime_work) from [<c003dfec>] (process_one_work+0x124/0x420) [ 2.519701] [<c003dfec>] (process_one_work) from [<c003e354>] (worker_thread+0x38/0x568) [ 2.519710] [<c003e354>] (worker_thread) from [<c00442f8>] (kthread+0xcc/0xe8) [ 2.519720] [<c00442f8>] (kthread) from [<c000f208>] (ret_from_fork+0x14/0x2c) [ 2.519729] Code: e12fff1e e5902058 e3a03000 e1a00003 (e582311c) [ 2.519735] ---[ end trace 52eeee3f62bd143e ]--- Note there is only one call to fimc_runtime_resume() for 11800000.fimc device and fimc_runtime_resume() call for both 11800000.fimc and 11810000.fimc. -- Thanks, Sylwester -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2 October 2014 14:00, Sylwester Nawrocki <s.nawrocki@samsung.com> wrote: > Hello Ulf, > > On 02/10/14 11:09, Ulf Hansson wrote: >> On 1 October 2014 18:36, Sylwester Nawrocki <s.nawrocki@samsung.com> wrote: >>> On 01/10/14 16:41, Ulf Hansson wrote: >>>> There are currently no users of this API, let's remove it. >> >> Hi Sylwester, >> >> Thanks for your reply! >> >>> The sad fact is that removal of pm_genpd_dev_need_restore() calls >>> from arch/arm/mach-exynos/pm_domains.c introduces regressions in >>> multiple exynos drivers (I'm sure it breaks media drivers). >> >> The fact that you need pm_genpd_dev_need_restore() is really worrying. >> That indicates that exynos are suffering from this race condition I am >> trying to fix in this patchset. >> >>> I think before doing such changes all relevant drivers should be >>> updated first. I need to take a closer look again, but it seems >>> after dropping the pm_genpd_dev_need_restore() calls, client >>> driver's runtime_resume() callback is not being called in response >>> to first pm_runtime_get(_sync) call, even if a device is runtime >>> pm active. > > Sorry, I meant to say "inactive" here. Ahh, now I see the approach. Correct me if I am wrong, but I think in principle these exynos drivers don't use pm_runtime_set_active() during ->probe() and are instead relying on CONFIG_PM_RUNTIME to be enabled. That's not a good behaviour. If these drivers are build without CONFIG_PM_RUNTIME - they won't work. I guess I have pointed out this in earlier discussions around runtime PM, I have seen quite some driver's getting their support for runtime PM wrong. > >> Why would runtime PM callbacks be invoked when the device are runtime >> PM active? That's prevented by the runtime PM core and is the expected >> behaviour. >> >> pm_genpd_dev_need_restore() is not the solution, besides that it gives >> you the option to lie about device's runtime PM state to genpd. Thus, >> if you are really lucky, that might workaround your issues. :-) > > Might be, nevertheless so far it more or less worked for us. At least > I'm sure without it everything breaks right away. I see. Obviously we must not break exynos, let's try to figure out the best way forward. > >> I will happily help out in fixing drivers for exynos. Would be nice if >> you could provide me with a list of which driver/subsystems that seems >> broken. HW, wise I have a exynos 5250 and exynos 5422 on my desk, so >> those I can test. > > For exynos5 I would check exynos-gsc and s5p-mfc (see kernel logs below). Thanks, I will have look and run some tests on exynos 5. Let you know soon. > I could take care of other, exynos4 drivers. But I believe the fix > belongs in the PM core, I doubt we can solve in the driver a problem > which only occurs to one instance (first probed) of a device from a set > of devices in the power domain. > >>> More details can be found in commit ebc35c726298ba3fdebba316a >>> 'ARM: EXYNOS: register devices in 'need_restore' state for pm_domains'. >>> >>> The above only happens when devices are added to an inactive power >>> domain, then I guess patch 2/4 is also an attempt to address this >>> issue ? >> >> Yes. >> >> I would really appreciate if you could run a test with the complete >> patchset and see if that resolves the issues. > > Sure, is there a git tree I could fetch all the relevant patches from ? > I'm not sure what the base tree for this series, I could only apply > first 2 patches from this thread, on top of the generic OF PM domain > patches. And that didn't solve issues which are seen in the logs below. You have two options: 1) Use/build Rafaels tree. Apply my patches on the bleeding edge branch.That's also what I use a work base. http://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git 2) Use/build Stephen Rothwell's linux-next tree. I can confirm the patches applies on the master branch as of today. http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git I really appreciate your help here, thanks! > > Nevertheless, as Rafael pointed out, enabling all power domains > unconditionally seems a step backwards. It would be nice to have > resolved the race in a away the power domains remain in state left > by the firmware and are being enabled only before any client device > actually needs it. You are right, but I am not sure we can do this in one go - since it depends on if we can fix this on PM core of if we need need adaptations in drivers/subsystems. Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/10/14 15:30, Ulf Hansson wrote: [...] > Correct me if I am wrong, but I think in principle these exynos > drivers don't use pm_runtime_set_active() during ->probe() and are > instead relying on CONFIG_PM_RUNTIME to be enabled. Yes, pm_runtime_set_active() is not used in probe(), I believe this is not required. In case of those IP blocks there is no use of activating them during probe(). Instead we check if PM_RUNTIME is enabled through pm_runtime_enabled() helper and enable the device clock(s) if not. I agree it all doesn't quite work in current mainline for !PM_RUNTIME, since there is nothing ensuring that the power domains are enabled in such kernel configuration. > That's not a good behaviour. If these drivers are build without > CONFIG_PM_RUNTIME - they won't work. They wouldn't similarly work with pm_runtime_set_active() call in probe() with CONFIG_PM_RUNTIME disabled, would they ? > I guess I have pointed out this in earlier discussions around runtime > PM, I have seen quite some driver's getting their support for runtime > PM wrong. > >>> Why would runtime PM callbacks be invoked when the device are runtime >>> PM active? That's prevented by the runtime PM core and is the expected >>> behaviour. >>> >>> pm_genpd_dev_need_restore() is not the solution, besides that it gives >>> you the option to lie about device's runtime PM state to genpd. Thus, >>> if you are really lucky, that might workaround your issues. :-) >> >> Might be, nevertheless so far it more or less worked for us. At least >> I'm sure without it everything breaks right away. > > I see. Obviously we must not break exynos, let's try to figure out the > best way forward. > >> >>> I will happily help out in fixing drivers for exynos. Would be nice if >>> you could provide me with a list of which driver/subsystems that seems >>> broken. HW, wise I have a exynos 5250 and exynos 5422 on my desk, so >>> those I can test. >> >> For exynos5 I would check exynos-gsc and s5p-mfc (see kernel logs below). > > Thanks, I will have look and run some tests on exynos 5. Let you know soon. Especially it might be interesting look at gscaler, where there are multiple instances of same device handled by one driver. [...] >> Sure, is there a git tree I could fetch all the relevant patches from ? >> I'm not sure what the base tree for this series, I could only apply >> first 2 patches from this thread, on top of the generic OF PM domain >> patches. And that didn't solve issues which are seen in the logs below. > > You have two options: > > 1) Use/build Rafaels tree. Apply my patches on the bleeding edge > branch.That's also what I use a work base. > http://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git > > 2) Use/build Stephen Rothwell's linux-next tree. I can confirm the > patches applies on the master branch as of today. > http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git Thanks, I will be able to take a look at this the earliest on Monday. > I really appreciate your help here, thanks! > >> Nevertheless, as Rafael pointed out, enabling all power domains >> unconditionally seems a step backwards. It would be nice to have >> resolved the race in a away the power domains remain in state left >> by the firmware and are being enabled only before any client device >> actually needs it. > > You are right, but I am not sure we can do this in one go - since it > depends on if we can fix this on PM core of if we need need > adaptations in drivers/subsystems. I see, fortunately the genpd API appears not to be wildly used yet.
On 2 October 2014 17:54, Sylwester Nawrocki <s.nawrocki@samsung.com> wrote: > On 02/10/14 15:30, Ulf Hansson wrote: > [...] >> Correct me if I am wrong, but I think in principle these exynos >> drivers don't use pm_runtime_set_active() during ->probe() and are >> instead relying on CONFIG_PM_RUNTIME to be enabled. > > Yes, pm_runtime_set_active() is not used in probe(), I believe this > is not required. In case of those IP blocks there is no use of activating > them during probe(). Instead we check if PM_RUNTIME is enabled through > pm_runtime_enabled() helper and enable the device clock(s) if not. > I agree it all doesn't quite work in current mainline for !PM_RUNTIME, > since there is nothing ensuring that the power domains are enabled > in such kernel configuration. > >> That's not a good behaviour. If these drivers are build without >> CONFIG_PM_RUNTIME - they won't work. > > They wouldn't similarly work with pm_runtime_set_active() call in probe() > with CONFIG_PM_RUNTIME disabled, would they ? Yes they would, although they require some minor additional adaptations. Those resources that are enabled from the driver's runtime PM resume callback, should also be enabled during ->probe(). The pm_runtime_set_active() will then update the state to reflect this. Then, if CONFIG_PM_RUNTIME is enabled - the device will be scheduled to go inactive from driver core (pm_request_idle()), after ->probe() has completed. Thus saving power if it's unused. If CONFIG_PM_RUNTIME isn't enabled - the driver will still be functional, since all resources are enabled during ->probe(). Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[...] >> >> Yes they would, although they require some minor additional adaptations. >> >> Those resources that are enabled from the driver's runtime PM resume >> callback, should also be enabled during ->probe(). The >> pm_runtime_set_active() will then update the state to reflect this. >> >> Then, if CONFIG_PM_RUNTIME is enabled - the device will be scheduled >> to go inactive from driver core (pm_request_idle()), after ->probe() >> has completed. Thus saving power if it's unused. >> If CONFIG_PM_RUNTIME isn't enabled - the driver will still be >> functional, since all resources are enabled during ->probe(). > > OK, I suspected you also assumed enabling relevant resources, so the PM > state matches the hardware state. > > Sorry for getting back late to this, now there is a regression on > Exynos seen in multiple drivers, i.e. affecting all the media and > video devices. Is this patch series going to be merged for v3.18 as > a regression fix ? If so I would need to update remaining drivers to > enable clocks and use pm_runtime_set_active() in probe(). Urgh!!! Let's see how we can work this out. I will be helping out and give this the highest prio! I did post a patchset for exynos 5 media gsc driver, I guess you have seen it!? Now, that doesn't help us much but still. https://www.mail-archive.com/linux-media@vger.kernel.org/msg80592.html > > I can see two options to fix bugs which appeared in Exynos after > merging the patch series switching to the OF generic power domain API: > > 1. merge this series and update the affected drivers for v3.18, This version is superseded by a v3. That takes a more solid approach on how to power on the PM domain from the bus' ->probe(). It's being discussed currently. [PATCH v3 0/9] PM / Domains: Fix race conditions during boot You should be on cc-list of that. > > 2. revert for now to the previous behaviour, doing something as > the patch below. > > 1. seems only a partial solution, since the regression remains for the > loadable modules which are loaded after late_initcall(). At that point > power domain may be disabled and the driver attempting to access > the hardware will hand the system. That was a limitation which is fixed in v3. I have tested loading modules and it works nicely. > > It's also a bit not clear to me why there is an assumption that when > a power domain is initially enabled all its corresponding devices are > already also fully activated ? That's a very good question! I think the assumption is wrong! Somehow we need to decouple that dependency. To me, typically the "need_restore" flag should reflect the current runtime PM status of the device, which isn't the case right now. In the v3 version of this patchset, the PM domain will be powered on from the bus's ->probe(), which also means the "need_restore" flag will initially always be set to "false" once the device are being probed by the driver. That means, those drivers not invoking pm_runtime_set_active() and manually enable clk/resources during ->probe(), but instead relies on pm_runtime_get_sync() - will _not_ work. As you stated above for some of the Exynos drivers. Even if it's likely that most of these Exynos drivers should be using pm_runtime_set_active() during ->probe(), the key-problem lies in genpd's wrong assumption about the device's runtime PM status, which is stored in the "need restore" flag. If we would fix the issue in genpd for the need_restore flag, that should solve the regression for Exynos, don't you think? I will immediately start working on a patch on genpd, which tries this approach, I will keep you posted. > > int __pm_genpd_add_device(struct generic_pm_domain *genpd, struct device *dev, > struct gpd_timing_data *td) > { > ... > gpd_data->need_restore = genpd->status == GPD_STATE_POWER_OFF; > ... > } > > It seems correct to me to have initially the power domain enabled and some > devices inactive, e.g. if device's driver manages its clocks and didn't > turn them on yet. As stated, I fully agree! [...] Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 18cc68d..36871b3 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -1533,26 +1533,6 @@ int pm_genpd_remove_device(struct generic_pm_domain *genpd, } /** - * pm_genpd_dev_need_restore - Set/unset the device's "need restore" flag. - * @dev: Device to set/unset the flag for. - * @val: The new value of the device's "need restore" flag. - */ -void pm_genpd_dev_need_restore(struct device *dev, bool val) -{ - struct pm_subsys_data *psd; - unsigned long flags; - - spin_lock_irqsave(&dev->power.lock, flags); - - psd = dev_to_psd(dev); - if (psd && psd->domain_data) - to_gpd_data(psd->domain_data)->need_restore = val; - - spin_unlock_irqrestore(&dev->power.lock, flags); -} -EXPORT_SYMBOL_GPL(pm_genpd_dev_need_restore); - -/** * pm_genpd_add_subdomain - Add a subdomain to an I/O PM domain. * @genpd: Master PM domain to add the subdomain to. * @subdomain: Subdomain to be added. diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index 9004743..a21dfa9 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -129,7 +129,6 @@ extern int __pm_genpd_name_add_device(const char *domain_name, extern int pm_genpd_remove_device(struct generic_pm_domain *genpd, struct device *dev); -extern void pm_genpd_dev_need_restore(struct device *dev, bool val); extern int pm_genpd_add_subdomain(struct generic_pm_domain *genpd, struct generic_pm_domain *new_subdomain); extern int pm_genpd_add_subdomain_names(const char *master_name, @@ -175,7 +174,6 @@ static inline int pm_genpd_remove_device(struct generic_pm_domain *genpd, { return -ENOSYS; } -static inline void pm_genpd_dev_need_restore(struct device *dev, bool val) {} static inline int pm_genpd_add_subdomain(struct generic_pm_domain *genpd, struct generic_pm_domain *new_sd) {