Message ID | 20210429203723.1177082-1-martin.blumenstingl@googlemail.com |
---|---|
State | New |
Headers | show |
Series | [RFC] soc: amlogic: meson-ee-pwrc: Drop the .shutdown callback from the driver | expand |
On 2021-04-29 22:37, Martin Blumenstingl wrote: > Stefan reports that rebooting his ODROID-N2+ (using a G12B SoC) results > in the board hanging. His kernel config uses: > CONFIG_MESON_EE_PM_DOMAINS=y > CONFIG_DRM_MESON=m > > He reports that his kernel config results in the DRM driver's .shutdown > callback to be executed after the power domain driver's .shutdown > callback. That's problematic because meson_ee_pwrc_shutdown disables the > clock which are used by the VPU IP. This causes the board to hang. > > Further he reports that changing from CONFIG_DRM_MESON=m to > CONFIG_DRM_MESON=y reverses the order in which the DRM and power domain > driver's shutdown functions are executed, making the reboot successful. > > The reason why we use meson_ee_pwrc_shutdown is because of the VPU power > domain (which is causing the problem described above). It can be left > enabled by u-boot. According to the original TOFIX comment in > meson_ee_pwrc_init_domain we need to be careful because disabling the > power domain could "cause system errors". As a workaround the clocks > are manually enabled in meson_ee_pwrc_init_domain and the power domain > is marked as GENPD_FLAG_ALWAYS_ON (so it can never be turned off). > > Experimenting has shown that the power domain itself can be disabled as > long as we keep the clocks enabled if u-boot enabled the power domain > but we don't have any driver enabled for the VPU (CONFIG_DRM_MESON=n). > > Keeping the clocks enabled is the responsibility of the CCF drivers, not > the power domain driver. Even better: this is already covered as all > gates in the VPU and VAPB tree on GX an G12 SoCs have the > CLK_IGNORE_UNUSED flag set, meaning: if the bootloader has previously > enabled the clock we're not touching it until a driver explicitly asks > to enable (and then disable) it. In case of CONFIG_DRM_MESON=n we're > never calling meson_ee_pwrc_on, meaning that we always keep the state of > the clocks as set by u-boot. > > The original TOFIX comment also mentioned that we need to make sure not > to mess up the clock's prepare/enable ref-counters. This is the only > requirement that's left for the meson-ee-pwrc driver that needs to be > managed for the VPU power domain. > > Three steps can improve this situation: > - Don't prepare and enable the clocks (just to fix the ref-counting) in > meson_ee_pwrc_init_domain if u-boot left that power domain enabled. > Instead, remember if the clocks are enabled in meson_ee_pwrc_{on,off} > and only disable them if we have previously turned them on ourselves. > - Drop GENPD_FLAG_ALWAYS_ON as we can always manage the state of the VPU > power domain if both the power domain controller and DRM driver are > enabled (=m or =y). If the power domain driver is enabled but the DRM > driver is disabled we can still use meson_ee_pwrc_off because it's not > trying to disable the clocks anymore > - Drop meson_ee_pwrc_shutdown as it's the responsibility of the genpd > framework to call meson_ee_pwrc_off when needed (either when a power > domain is being disabled - regardless of whether it's was used by a > driver before or not). Now there's also no more shutdown callback > ordering dependency between the power domain driver and other drivers > anymore. > > Fixes: eef3c2ba0a42a6 ("soc: amlogic: Add support for Everything-Else > power domains controller") > Reported-by: Stefan Agner <stefan@agner.ch> > Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> FWIW, I successfully tested this change with CONFIG_DRM_MESON=m and CONFIG_DRM_MESON=y each with a couple of reboots without any hang. Tested-by: Stefan Agner <stefan@agner.ch> Thanks Martin for help debugging and working on this patch! -- Stefan > --- > Neil, I would like you to specifically review (and re-test) the original > TOFIX part. I *believe* that I understand the original problem and hope > that my approach also works in that scenario, but I am not 100% sure. > > Kevin, as you're my go-to genpd expert I am also asking you to review > this to point out any issues you see. > > > drivers/soc/amlogic/meson-ee-pwrc.c | 76 ++++++++++++----------------- > 1 file changed, 31 insertions(+), 45 deletions(-) > > diff --git a/drivers/soc/amlogic/meson-ee-pwrc.c > b/drivers/soc/amlogic/meson-ee-pwrc.c > index 50bf5d2b828b..534c33ba31ce 100644 > --- a/drivers/soc/amlogic/meson-ee-pwrc.c > +++ b/drivers/soc/amlogic/meson-ee-pwrc.c > @@ -353,10 +353,26 @@ static int meson_ee_pwrc_off(struct > generic_pm_domain *domain) > > if (pwrc_domain->num_clks) { > msleep(20); > - clk_bulk_disable_unprepare(pwrc_domain->num_clks, > - pwrc_domain->clks); > + > + /* > + * We are only allowed to turn off the clocks here if we > + * have previously enabled them ourselves. In other words: > + * for an "unused" power domain (which is not used by any > + * power domain consumer) we have not enabled the clocks > + * previously so we keep them in the state they are. > + * This is relevant for the VPU power domain which may > + * have been enabled by u-boot. If the display driver in > + * Linux is disabled then we need to keep the clocks in > + * the state as u-boot set them, otherwise the board will > + * hang. > + */ > + if (pwrc_domain->enabled) > + clk_bulk_disable_unprepare(pwrc_domain->num_clks, > + pwrc_domain->clks); > } > > + pwrc_domain->enabled = false; > + > return 0; > } > > @@ -392,8 +408,14 @@ static int meson_ee_pwrc_on(struct > generic_pm_domain *domain) > if (ret) > return ret; > > - return clk_bulk_prepare_enable(pwrc_domain->num_clks, > - pwrc_domain->clks); > + ret = clk_bulk_prepare_enable(pwrc_domain->num_clks, > + pwrc_domain->clks); > + if (ret) > + return ret; > + > + pwrc_domain->enabled = true; > + > + return 0; > } > > static int meson_ee_pwrc_init_domain(struct platform_device *pdev, > @@ -434,33 +456,11 @@ static int meson_ee_pwrc_init_domain(struct > platform_device *pdev, > dom->base.power_on = meson_ee_pwrc_on; > dom->base.power_off = meson_ee_pwrc_off; > > - /* > - * TOFIX: This is a special case for the VPU power domain, which can > - * be enabled previously by the bootloader. In this case the VPU > - * pipeline may be functional but no driver maybe never attach > - * to this power domain, and if the domain is disabled it could > - * cause system errors. This is why the pm_domain_always_on_gov > - * is used here. > - * For the same reason, the clocks should be enabled in case > - * we need to power the domain off, otherwise the internal clocks > - * prepare/enable counters won't be in sync. > - */ > - if (dom->num_clks && dom->desc.get_power && !dom->desc.get_power(dom)) { > - ret = clk_bulk_prepare_enable(dom->num_clks, dom->clks); > - if (ret) > - return ret; > - > - dom->base.flags = GENPD_FLAG_ALWAYS_ON; > - ret = pm_genpd_init(&dom->base, NULL, false); > - if (ret) > - return ret; > - } else { > - ret = pm_genpd_init(&dom->base, NULL, > - (dom->desc.get_power ? > - dom->desc.get_power(dom) : true)); > - if (ret) > - return ret; > - } > + ret = pm_genpd_init(&dom->base, NULL, > + (dom->desc.get_power ? > + dom->desc.get_power(dom) : true)); > + if (ret) > + return ret; > > return 0; > } > @@ -528,19 +528,6 @@ static int meson_ee_pwrc_probe(struct > platform_device *pdev) > return of_genpd_add_provider_onecell(pdev->dev.of_node, &pwrc->xlate); > } > > -static void meson_ee_pwrc_shutdown(struct platform_device *pdev) > -{ > - struct meson_ee_pwrc *pwrc = platform_get_drvdata(pdev); > - int i; > - > - for (i = 0 ; i < pwrc->xlate.num_domains ; ++i) { > - struct meson_ee_pwrc_domain *dom = &pwrc->domains[i]; > - > - if (dom->desc.get_power && !dom->desc.get_power(dom)) > - meson_ee_pwrc_off(&dom->base); > - } > -} > - > static struct meson_ee_pwrc_domain_data meson_ee_g12a_pwrc_data = { > .count = ARRAY_SIZE(g12a_pwrc_domains), > .domains = g12a_pwrc_domains, > @@ -606,7 +593,6 @@ MODULE_DEVICE_TABLE(of, meson_ee_pwrc_match_table); > > static struct platform_driver meson_ee_pwrc_driver = { > .probe = meson_ee_pwrc_probe, > - .shutdown = meson_ee_pwrc_shutdown, > .driver = { > .name = "meson_ee_pwrc", > .of_match_table = meson_ee_pwrc_match_table,
On Thu, Apr 29, 2021 at 10:37 PM Martin Blumenstingl <martin.blumenstingl@googlemail.com> wrote: > > Stefan reports that rebooting his ODROID-N2+ (using a G12B SoC) results > in the board hanging. His kernel config uses: > CONFIG_MESON_EE_PM_DOMAINS=y > CONFIG_DRM_MESON=m > > He reports that his kernel config results in the DRM driver's .shutdown > callback to be executed after the power domain driver's .shutdown > callback. That's problematic because meson_ee_pwrc_shutdown disables the > clock which are used by the VPU IP. This causes the board to hang. > > Further he reports that changing from CONFIG_DRM_MESON=m to > CONFIG_DRM_MESON=y reverses the order in which the DRM and power domain > driver's shutdown functions are executed, making the reboot successful. > > The reason why we use meson_ee_pwrc_shutdown is because of the VPU power > domain (which is causing the problem described above). It can be left > enabled by u-boot. According to the original TOFIX comment in > meson_ee_pwrc_init_domain we need to be careful because disabling the > power domain could "cause system errors". As a workaround the clocks > are manually enabled in meson_ee_pwrc_init_domain and the power domain > is marked as GENPD_FLAG_ALWAYS_ON (so it can never be turned off). > > Experimenting has shown that the power domain itself can be disabled as > long as we keep the clocks enabled if u-boot enabled the power domain > but we don't have any driver enabled for the VPU (CONFIG_DRM_MESON=n). > > Keeping the clocks enabled is the responsibility of the CCF drivers, not > the power domain driver. Even better: this is already covered as all > gates in the VPU and VAPB tree on GX an G12 SoCs have the > CLK_IGNORE_UNUSED flag set, meaning: if the bootloader has previously > enabled the clock we're not touching it until a driver explicitly asks > to enable (and then disable) it. In case of CONFIG_DRM_MESON=n we're > never calling meson_ee_pwrc_on, meaning that we always keep the state of > the clocks as set by u-boot. > > The original TOFIX comment also mentioned that we need to make sure not > to mess up the clock's prepare/enable ref-counters. This is the only > requirement that's left for the meson-ee-pwrc driver that needs to be > managed for the VPU power domain. > > Three steps can improve this situation: > - Don't prepare and enable the clocks (just to fix the ref-counting) in > meson_ee_pwrc_init_domain if u-boot left that power domain enabled. > Instead, remember if the clocks are enabled in meson_ee_pwrc_{on,off} > and only disable them if we have previously turned them on ourselves. > - Drop GENPD_FLAG_ALWAYS_ON as we can always manage the state of the VPU > power domain if both the power domain controller and DRM driver are > enabled (=m or =y). If the power domain driver is enabled but the DRM > driver is disabled we can still use meson_ee_pwrc_off because it's not > trying to disable the clocks anymore > - Drop meson_ee_pwrc_shutdown as it's the responsibility of the genpd > framework to call meson_ee_pwrc_off when needed (either when a power > domain is being disabled - regardless of whether it's was used by a > driver before or not). Now there's also no more shutdown callback > ordering dependency between the power domain driver and other drivers > anymore. > > Fixes: eef3c2ba0a42a6 ("soc: amlogic: Add support for Everything-Else power domains controller") > Reported-by: Stefan Agner <stefan@agner.ch> > Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> unfortunately I need to add: Nacked-by: myself it turns out that the genpd framework does not call .power_on when the power domain is already powered on during initialization (pm_genpd_init) I wonder if we need to extend the genpd to handle this use-case. CCF (common clock framework) for example has CLK_IGNORE_UNUSED and a .disable_unused callback which can be used to specifically manage the "unused" use-case my idea #1: - add a GENPD_FLAG_IGNORE_UNUSED flag - set it for the VPU power domain - skip "unused" power domains which have the GENPD_FLAG_IGNORE_UNUSED flag set - drop the GENPD_FLAG_ALWAYS_ON flag from the VPU power domain my idea #2: - a power_off_unused callback (with the same arguments as power_off) could be introduced - in pm_genpd_init we check if that callback is initialized, if not we assign the power_off callback - instead of using the power_off callback when disabling an unused power domain the new power_off_unused callback is used - for the meson-ee-pwrc we implement a special meson_ee_pwrc_power_off function that is a no-op Best regards, Martin
Martin Blumenstingl <martin.blumenstingl@googlemail.com> writes: > Stefan reports that rebooting his ODROID-N2+ (using a G12B SoC) results > in the board hanging. His kernel config uses: > CONFIG_MESON_EE_PM_DOMAINS=y > CONFIG_DRM_MESON=m > > He reports that his kernel config results in the DRM driver's .shutdown > callback to be executed after the power domain driver's .shutdown > callback. That's problematic because meson_ee_pwrc_shutdown disables the > clock which are used by the VPU IP. This causes the board to hang. I didn't dig deeply on this yet because this smells very much like an issue Art reported[1] and fixed. What kernel version are you using, and does it contain commit fa0c16caf3d7 (drm: meson_drv add shutdown function) Kevin [1] https://lore.kernel.org/dri-devel/20210302042202.3728113-1-art@khadas.com/
diff --git a/drivers/soc/amlogic/meson-ee-pwrc.c b/drivers/soc/amlogic/meson-ee-pwrc.c index 50bf5d2b828b..534c33ba31ce 100644 --- a/drivers/soc/amlogic/meson-ee-pwrc.c +++ b/drivers/soc/amlogic/meson-ee-pwrc.c @@ -353,10 +353,26 @@ static int meson_ee_pwrc_off(struct generic_pm_domain *domain) if (pwrc_domain->num_clks) { msleep(20); - clk_bulk_disable_unprepare(pwrc_domain->num_clks, - pwrc_domain->clks); + + /* + * We are only allowed to turn off the clocks here if we + * have previously enabled them ourselves. In other words: + * for an "unused" power domain (which is not used by any + * power domain consumer) we have not enabled the clocks + * previously so we keep them in the state they are. + * This is relevant for the VPU power domain which may + * have been enabled by u-boot. If the display driver in + * Linux is disabled then we need to keep the clocks in + * the state as u-boot set them, otherwise the board will + * hang. + */ + if (pwrc_domain->enabled) + clk_bulk_disable_unprepare(pwrc_domain->num_clks, + pwrc_domain->clks); } + pwrc_domain->enabled = false; + return 0; } @@ -392,8 +408,14 @@ static int meson_ee_pwrc_on(struct generic_pm_domain *domain) if (ret) return ret; - return clk_bulk_prepare_enable(pwrc_domain->num_clks, - pwrc_domain->clks); + ret = clk_bulk_prepare_enable(pwrc_domain->num_clks, + pwrc_domain->clks); + if (ret) + return ret; + + pwrc_domain->enabled = true; + + return 0; } static int meson_ee_pwrc_init_domain(struct platform_device *pdev, @@ -434,33 +456,11 @@ static int meson_ee_pwrc_init_domain(struct platform_device *pdev, dom->base.power_on = meson_ee_pwrc_on; dom->base.power_off = meson_ee_pwrc_off; - /* - * TOFIX: This is a special case for the VPU power domain, which can - * be enabled previously by the bootloader. In this case the VPU - * pipeline may be functional but no driver maybe never attach - * to this power domain, and if the domain is disabled it could - * cause system errors. This is why the pm_domain_always_on_gov - * is used here. - * For the same reason, the clocks should be enabled in case - * we need to power the domain off, otherwise the internal clocks - * prepare/enable counters won't be in sync. - */ - if (dom->num_clks && dom->desc.get_power && !dom->desc.get_power(dom)) { - ret = clk_bulk_prepare_enable(dom->num_clks, dom->clks); - if (ret) - return ret; - - dom->base.flags = GENPD_FLAG_ALWAYS_ON; - ret = pm_genpd_init(&dom->base, NULL, false); - if (ret) - return ret; - } else { - ret = pm_genpd_init(&dom->base, NULL, - (dom->desc.get_power ? - dom->desc.get_power(dom) : true)); - if (ret) - return ret; - } + ret = pm_genpd_init(&dom->base, NULL, + (dom->desc.get_power ? + dom->desc.get_power(dom) : true)); + if (ret) + return ret; return 0; } @@ -528,19 +528,6 @@ static int meson_ee_pwrc_probe(struct platform_device *pdev) return of_genpd_add_provider_onecell(pdev->dev.of_node, &pwrc->xlate); } -static void meson_ee_pwrc_shutdown(struct platform_device *pdev) -{ - struct meson_ee_pwrc *pwrc = platform_get_drvdata(pdev); - int i; - - for (i = 0 ; i < pwrc->xlate.num_domains ; ++i) { - struct meson_ee_pwrc_domain *dom = &pwrc->domains[i]; - - if (dom->desc.get_power && !dom->desc.get_power(dom)) - meson_ee_pwrc_off(&dom->base); - } -} - static struct meson_ee_pwrc_domain_data meson_ee_g12a_pwrc_data = { .count = ARRAY_SIZE(g12a_pwrc_domains), .domains = g12a_pwrc_domains, @@ -606,7 +593,6 @@ MODULE_DEVICE_TABLE(of, meson_ee_pwrc_match_table); static struct platform_driver meson_ee_pwrc_driver = { .probe = meson_ee_pwrc_probe, - .shutdown = meson_ee_pwrc_shutdown, .driver = { .name = "meson_ee_pwrc", .of_match_table = meson_ee_pwrc_match_table,
Stefan reports that rebooting his ODROID-N2+ (using a G12B SoC) results in the board hanging. His kernel config uses: CONFIG_MESON_EE_PM_DOMAINS=y CONFIG_DRM_MESON=m He reports that his kernel config results in the DRM driver's .shutdown callback to be executed after the power domain driver's .shutdown callback. That's problematic because meson_ee_pwrc_shutdown disables the clock which are used by the VPU IP. This causes the board to hang. Further he reports that changing from CONFIG_DRM_MESON=m to CONFIG_DRM_MESON=y reverses the order in which the DRM and power domain driver's shutdown functions are executed, making the reboot successful. The reason why we use meson_ee_pwrc_shutdown is because of the VPU power domain (which is causing the problem described above). It can be left enabled by u-boot. According to the original TOFIX comment in meson_ee_pwrc_init_domain we need to be careful because disabling the power domain could "cause system errors". As a workaround the clocks are manually enabled in meson_ee_pwrc_init_domain and the power domain is marked as GENPD_FLAG_ALWAYS_ON (so it can never be turned off). Experimenting has shown that the power domain itself can be disabled as long as we keep the clocks enabled if u-boot enabled the power domain but we don't have any driver enabled for the VPU (CONFIG_DRM_MESON=n). Keeping the clocks enabled is the responsibility of the CCF drivers, not the power domain driver. Even better: this is already covered as all gates in the VPU and VAPB tree on GX an G12 SoCs have the CLK_IGNORE_UNUSED flag set, meaning: if the bootloader has previously enabled the clock we're not touching it until a driver explicitly asks to enable (and then disable) it. In case of CONFIG_DRM_MESON=n we're never calling meson_ee_pwrc_on, meaning that we always keep the state of the clocks as set by u-boot. The original TOFIX comment also mentioned that we need to make sure not to mess up the clock's prepare/enable ref-counters. This is the only requirement that's left for the meson-ee-pwrc driver that needs to be managed for the VPU power domain. Three steps can improve this situation: - Don't prepare and enable the clocks (just to fix the ref-counting) in meson_ee_pwrc_init_domain if u-boot left that power domain enabled. Instead, remember if the clocks are enabled in meson_ee_pwrc_{on,off} and only disable them if we have previously turned them on ourselves. - Drop GENPD_FLAG_ALWAYS_ON as we can always manage the state of the VPU power domain if both the power domain controller and DRM driver are enabled (=m or =y). If the power domain driver is enabled but the DRM driver is disabled we can still use meson_ee_pwrc_off because it's not trying to disable the clocks anymore - Drop meson_ee_pwrc_shutdown as it's the responsibility of the genpd framework to call meson_ee_pwrc_off when needed (either when a power domain is being disabled - regardless of whether it's was used by a driver before or not). Now there's also no more shutdown callback ordering dependency between the power domain driver and other drivers anymore. Fixes: eef3c2ba0a42a6 ("soc: amlogic: Add support for Everything-Else power domains controller") Reported-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> --- Neil, I would like you to specifically review (and re-test) the original TOFIX part. I *believe* that I understand the original problem and hope that my approach also works in that scenario, but I am not 100% sure. Kevin, as you're my go-to genpd expert I am also asking you to review this to point out any issues you see. drivers/soc/amlogic/meson-ee-pwrc.c | 76 ++++++++++++----------------- 1 file changed, 31 insertions(+), 45 deletions(-)