Message ID | 20240919091834.83572-1-sebastian.reichel@collabora.com |
---|---|
Headers | show |
Series | Fix RK3588 GPU domain | expand |
On Thu, Sep 19, 2024 at 11:27 AM Sebastian Reichel <sebastian.reichel@collabora.com> wrote: > > Some power domains require extra voltages to be applied. For example > trying to enable the GPU domain on RK3588 fails when the SoC does not > have VDD GPU enabled. > > The solution to temporarily change the device's device tree node has > been taken over from the Mediatek power domain driver. > > The regulator is not acquired at probe time, since that creates circular > dependencies. The power domain driver must be probed early, since SoC > peripherals need it. Regulators on the other hand depend on SoC > peripherals like SPI, I2C or GPIO. > > Reviewed-by: Heiko Stuebner <heiko@sntech.de> > Tested-by: Adrian Larumbe <adrian.larumbe@collabora.com> # On Rock 5B > Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> > --- > drivers/pmdomain/rockchip/pm-domains.c | 56 +++++++++++++++++++++++++- > 1 file changed, 54 insertions(+), 2 deletions(-) > > diff --git a/drivers/pmdomain/rockchip/pm-domains.c b/drivers/pmdomain/rockchip/pm-domains.c > index 663d390faaeb..4bc17b588419 100644 > --- a/drivers/pmdomain/rockchip/pm-domains.c > +++ b/drivers/pmdomain/rockchip/pm-domains.c > @@ -18,6 +18,7 @@ > #include <linux/of_clk.h> > #include <linux/clk.h> > #include <linux/regmap.h> > +#include <linux/regulator/consumer.h> > #include <linux/mfd/syscon.h> > #include <soc/rockchip/pm_domains.h> > #include <dt-bindings/power/px30-power.h> > @@ -89,6 +90,8 @@ struct rockchip_pm_domain { > u32 *qos_save_regs[MAX_QOS_REGS_NUM]; > int num_clks; > struct clk_bulk_data *clks; > + struct device_node *node; > + struct regulator *supply; > }; > > struct rockchip_pmu { > @@ -571,18 +574,66 @@ static int rockchip_pd_power(struct rockchip_pm_domain *pd, bool power_on) > return 0; > } > > +static int rockchip_pd_regulator_disable(struct rockchip_pm_domain *pd) > +{ > + return pd->supply ? regulator_disable(pd->supply) : 0; > +} > + > +static int rockchip_pd_regulator_enable(struct rockchip_pm_domain *pd) > +{ > + struct rockchip_pmu *pmu = pd->pmu; > + struct device_node *main_node; > + > + if (!pd->supply) { > + /* > + * Find regulator in current power domain node. > + * devm_regulator_get() finds regulator in a node and its child > + * node, so set of_node to current power domain node then change > + * back to original node after regulator is found for current > + * power domain node. > + */ > + main_node = pmu->dev->of_node; > + pmu->dev->of_node = pd->node; > + pd->supply = devm_regulator_get(pmu->dev, "domain"); How do you differentiate between a power domain not needing a supply, vs a power domain that is missing its supply in DT? You're using the normal "devm_regulator_get()" here (no "_optional), which gives the dummy supply if no supply is specified in the DT, and I think that wouldn't work properly. I'm trying to work out if having "devm_of_regulator_get()" is needed or not. Also, this could return -EPROBE_DEFER. I guess it works with the initial pm_domain_attach() from the driver core? Thanks ChenYu > + pmu->dev->of_node = main_node; > + if (IS_ERR(pd->supply)) { > + pd->supply = NULL; > + return 0; > + } > + } > + > + return regulator_enable(pd->supply); > +} > + > static int rockchip_pd_power_on(struct generic_pm_domain *domain) > { > struct rockchip_pm_domain *pd = to_rockchip_pd(domain); > + int ret; > + > + ret = rockchip_pd_regulator_enable(pd); > + if (ret) { > + dev_err(pd->pmu->dev, "Failed to enable supply: %d\n", ret); > + return ret; > + } > > - return rockchip_pd_power(pd, true); > + ret = rockchip_pd_power(pd, true); > + if (ret) > + rockchip_pd_regulator_disable(pd); > + > + return ret; > } > > static int rockchip_pd_power_off(struct generic_pm_domain *domain) > { > struct rockchip_pm_domain *pd = to_rockchip_pd(domain); > + int ret; > > - return rockchip_pd_power(pd, false); > + ret = rockchip_pd_power(pd, false); > + if (ret) > + return ret; > + > + rockchip_pd_regulator_disable(pd); > + return ret; > } > > static int rockchip_pd_attach_dev(struct generic_pm_domain *genpd, > @@ -663,6 +714,7 @@ static int rockchip_pm_add_one_domain(struct rockchip_pmu *pmu, > > pd->info = pd_info; > pd->pmu = pmu; > + pd->node = node; > > pd->num_clks = of_clk_get_parent_count(node); > if (pd->num_clks > 0) { > -- > 2.45.2 >
On Thu, 19 Sept 2024 at 11:18, Sebastian Reichel <sebastian.reichel@collabora.com> wrote: > > Hi, > > I got a report, that the Linux kernel crashes on Rock 5B when the panthor > driver is loaded late after booting. The crash starts with the following > shortened error print: > > rockchip-pm-domain fd8d8000.power-management:power-controller: failed to set domain 'gpu', val=0 > rockchip-pm-domain fd8d8000.power-management:power-controller: failed to get ack on domain 'gpu', val=0xa9fff > SError Interrupt on CPU4, code 0x00000000be000411 -- SError > > This series first does some cleanups in the Rockchip power domain > driver and changes the driver, so that it no longer tries to continue > when it fails to enable a domain. This gets rid of the SError interrupt > and long backtraces. But the kernel still hangs when it fails to enable > a power domain. I have not done further analysis to check if that can > be avoided. > > Last but not least this provides a fix for the GPU power domain failing > to get enabled - after some testing from my side it seems to require the > GPU voltage supply to be enabled. > > I'm not really happy about the hack to get a regulator for a sub-node, > which I took over from the Mediatek driver. I discussed this with > Chen-Yu Tsai and Heiko Stübner at OSS EU and the plan is: > > 1. Merge Rockchip PM domain driver with this hack for now, since DRM CI > people need it > 2. Chen-Yu will work on a series, which fixes the hack in Mediatek by > introducing a new devm_regulator_get function taking an DT node as > additional argument > 3. Rockchip PM domain later will switch to that once it has landed I have just queued up 2) on my next branch. My suggestion is to skip the intermediate step in 1) and go directly for 3) instead, unless you think there is a problem with that, of course? [...] Kind regards Uffe