Message ID | CAPDyKFras1o13WBZzaZ_Snnj5TBQQFWKr=PtCjUvwe+yGPQn9w@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 2 February 2016 at 17:35, Tony Lindgren <tony@atomide.com> wrote: > Hi, > > * Ulf Hansson <ulf.hansson@linaro.org> [160202 02:43]: >> >> For the omap_hsmmc and likely also other omap drivers, which needs more >> than one attempt to ->probe() (returning -EPROBE_DEFER), this commit >> causes a regression at the PM domain level (omap hwmod). >> >> The reason is that the drivers don't put back the device into low power >> state while bailing out in ->probe to return -EPROBE_DEFER. This leads to >> that pm_runtime_reinit() in driver core, is re-initializing the runtime PM >> status from RPM_ACTIVE to RPM_SUSPENDED. > > Yup, that's the bug here. It seems that we never call the runtime_suspend > callback at the end of a first failed device driver probe if the driver > has set pm_runtime_use_autosuspend. Only rpm_idle runtime_idle callback > gets called. So the device stays on. > > This does not happen if pm_runtime_dont_use_autosuspend() is added to > the end of the device driver probe before pm_runtime_put_sync(). Thanks! It then confirms the second option I proposed. > >> The next ->probe() attempt then triggers the ->runtime_resume() callback >> to be invoked, which means this happens two times in a row. At the PM >> domain level (omap hwmod) this is being treated as an error and thus the >> runtime PM status of the device isn't correctly synchronized with the >> runtime PM core. > > That's a valid error though, let's not remove it. The reason why we > call runtime_resume() twice is because runtime_suspend callback never > gets called like I explain above. This isn't an error, it's just a hickup in the synchronization of the runtime PM status. Very similar what happens at the first probe, when the driver core has initialized the runtime PM status to RPM_SUSPENDED at the device registration. To me, it's the responsible of the PM domain to *help* with the synchronization, not prevent it as it currently does. > >> In the end, ->probe() anyway succeeds (as the driver don't checks the >> error code from the runtime PM APIs), but results in that the PM domain >> always stays powered on. This because of the runtime PM core believes the >> device is RPM_SUSPENDED. > > FYI, the following allows runtime_suspend callback to get called at the > end of a failed driver probe so the hardware state matches the PM runtime > state. Need to debug more. > > Regards, > > Tony > > 8< ------------ > --- a/drivers/mmc/host/omap_hsmmc.c > +++ b/drivers/mmc/host/omap_hsmmc.c > @@ -2232,6 +2232,7 @@ err_irq: > dma_release_channel(host->tx_chan); > if (host->rx_chan) > dma_release_channel(host->rx_chan); > + pm_runtime_dont_use_autosuspend(host->dev); It's good know this works, although do you intend to fix this sequence for all omap drivers/devices that's part of the hwmod PM domain? I haven't checked the number of drivers this would affect, but I imagine there could be quite many with similar behaviour and thus may suffer from the same issue. Of course the regression is only noticed for those returning -EPROBE_DEFER, which might not be that many, but it seems fragile to rely on this when going forward. All related drivers then needs to be fixed. > pm_runtime_put_sync(host->dev); > pm_runtime_disable(host->dev); > if (host->dbclk) Could you please test my version 2 of the patch I attached earlier. I still believe it's the best way to solve the regression, if it works of course. :-) Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3 February 2016 at 13:18, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Wed, Feb 3, 2016 at 11:25 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >>> >>> I don't see the problem, but of course you know omap for better than I do. >>> >>> So if you are concerned about this, perhaps adding a dev_dbg print >>> when the omap hwmod's ->runtime_suspend() callback returns zero could >> >> /s/runtime_suspend/runtime_resume > > OK > > Let me summarize my understanding of this thread so far. > > It looks like the omap3 code initializes hardware in ->probe() and > then it may return -EPROBE_DEFER due to some unmet dependencies. In > that case the hardware is not reset to the previous state and the > runtime PM framework is left in the state that corresponds to the > current hardware state. Before we had pm_runtime_reinit(), everything > worked as expected on the second ->probe() call, because things were > in sync then. Correct! Before pm_runtime_reinit(), the failing probe case (-EPROBE_DEFER) worked fine because the HW state and the runtime PM status was in sync. The device was powered and the runtime PM status was RPM_ACTIVE. > > The introduction of pm_runtime_reinit() changed the situation and now > effectively the hardware is required to be reset to the initial state > if -EPROBE_DEFER is to be returned too. Not correct. The hardware doesn't need a reset as it stays powered after a failed probe. Instead, only the runtime PM status needs to be synchronized with the HW state the next probe attempt. In other words, when the PM domain's ->runtime_resume() callbacks gets called due to a pm_runtime_get_sync() in the second probe attempt, it may find that the HW is already powered and can return zero instead of resetting it. Kind regards Uffe -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/arm/mach-omap2/omap_device.c b/arch/arm/mach-omap2/omap_device.c index 0437537..1ad390b 100644 --- a/arch/arm/mach-omap2/omap_device.c +++ b/arch/arm/mach-omap2/omap_device.c @@ -599,8 +599,17 @@ static int _od_runtime_suspend(struct device *dev) static int _od_runtime_resume(struct device *dev) { struct platform_device *pdev = to_platform_device(dev); + struct omap_device *od = to_omap_device(pdev); int ret; + /* + * From the PM domain perspective the device may already be enabled. + * In such case, let's return zero to synchronize the state with the + * runtime PM core. + */ + if (od->_state == OMAP_DEVICE_STATE_ENABLED) + return 0; + ret = omap_device_enable(pdev); if (ret) return ret;