Message ID | 20230127104054.895129-1-abel.vesa@linaro.org |
---|---|
State | New |
Headers | show |
Series | [RFC,v2,1/2] PM: domains: Skip disabling unused domains if provider has sync_state | expand |
Hi Abel, On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > Currently, there are cases when a domain needs to remain enabled until > the consumer driver probes. Sometimes such consumer drivers may be built > as modules. Since the genpd_power_off_unused is called too early for > such consumer driver modules to get a chance to probe, the domain, since > it is unused, will get disabled. On the other hand, the best time for > an unused domain to be disabled is on the provider's sync_state > callback. So, if the provider has registered a sync_state callback, > assume the unused domains for that provider will be disabled on its > sync_state callback. Also provide a generic sync_state callback which > disables all the domains unused for the provider that registers it. > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > --- > > This approach has been applied for unused clocks as well. > With this patch merged in, all the providers that have sync_state > callback registered will leave the domains enabled unless the provider's > sync_state callback explicitly disables them. So those providers will > need to add the disabling part to their sync_state callback. On the > other hand, the platforms that have cases where domains need to remain > enabled (even if unused) until the consumer driver probes, will be able, > with this patch in, to run without the pd_ignore_unused kernel argument, > which seems to be the case for most Qualcomm platforms, at this moment. I recently encountered a related issue on a Qualcomm platform with a v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use highest corner until sync_state"). The issue involves a DT node with a rpmhpd, the DT node is enabled, however the corresponding device driver is not enabled in the kernel. In such a scenario the sync_state callback is never called, because the genpd consumer never probes. As a result the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during system suspend, which results in a substantially higher power consumption in S3. I wonder if genpd (and some other frameworks) needs something like regulator_init_complete(), which turns off unused regulators 30s after system boot. That's conceptually similar to the current genpd_power_off_unused(), but would provide time for modules being loaded. > The v1 is here: > https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ > > Changes since v1: > * added a generic sync state callback to be registered by providers in > order to disable the unused domains on their sync state. Also > mentioned this in the commit message. > > drivers/base/power/domain.c | 17 ++++++++++++++++- > include/linux/pm_domain.h | 3 +++ > 2 files changed, 19 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > index 84662d338188..c2a5f77c01f3 100644 > --- a/drivers/base/power/domain.c > +++ b/drivers/base/power/domain.c > @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) > mutex_lock(&gpd_list_lock); > > list_for_each_entry(genpd, &gpd_list, gpd_list_node) > - genpd_queue_power_off_work(genpd); > + if (!dev_has_sync_state(genpd->provider->dev)) > + genpd_queue_power_off_work(genpd); > > mutex_unlock(&gpd_list_lock); > > @@ -1107,6 +1108,20 @@ static int __init genpd_power_off_unused(void) > } > late_initcall(genpd_power_off_unused); > > +void genpd_power_off_unused_sync_state(struct device *dev) > +{ > + struct generic_pm_domain *genpd; > + > + mutex_lock(&gpd_list_lock); > + > + list_for_each_entry(genpd, &gpd_list, gpd_list_node) > + if (genpd->provider->dev == dev) > + genpd_queue_power_off_work(genpd); > + > + mutex_unlock(&gpd_list_lock); > +} > +EXPORT_SYMBOL_GPL(genpd_power_off_unused_sync_state); > + > #ifdef CONFIG_PM_SLEEP > > /** > diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h > index f776fb93eaa0..1fd5aa500c81 100644 > --- a/include/linux/pm_domain.h > +++ b/include/linux/pm_domain.h > @@ -351,6 +351,7 @@ struct device *genpd_dev_pm_attach_by_id(struct device *dev, > unsigned int index); > struct device *genpd_dev_pm_attach_by_name(struct device *dev, > const char *name); > +void genpd_power_off_unused_sync_state(struct device *dev); > #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */ > static inline int of_genpd_add_provider_simple(struct device_node *np, > struct generic_pm_domain *genpd) > @@ -419,6 +420,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) > { > return ERR_PTR(-EOPNOTSUPP); > } > + > +static inline genpd_power_off_unused_sync_state(struct device *dev) {} > #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */ > > #ifdef CONFIG_PM > -- > 2.34.1 >
Hi, On Thu, Feb 2, 2023 at 10:24 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > Hi Abel, > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > Currently, there are cases when a domain needs to remain enabled until > > the consumer driver probes. Sometimes such consumer drivers may be built > > as modules. Since the genpd_power_off_unused is called too early for > > such consumer driver modules to get a chance to probe, the domain, since > > it is unused, will get disabled. On the other hand, the best time for > > an unused domain to be disabled is on the provider's sync_state > > callback. So, if the provider has registered a sync_state callback, > > assume the unused domains for that provider will be disabled on its > > sync_state callback. Also provide a generic sync_state callback which > > disables all the domains unused for the provider that registers it. > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > --- > > > > This approach has been applied for unused clocks as well. > > With this patch merged in, all the providers that have sync_state > > callback registered will leave the domains enabled unless the provider's > > sync_state callback explicitly disables them. So those providers will > > need to add the disabling part to their sync_state callback. On the > > other hand, the platforms that have cases where domains need to remain > > enabled (even if unused) until the consumer driver probes, will be able, > > with this patch in, to run without the pd_ignore_unused kernel argument, > > which seems to be the case for most Qualcomm platforms, at this moment. > > I recently encountered a related issue on a Qualcomm platform with a > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > highest corner until sync_state"). The issue involves a DT node with a > rpmhpd, the DT node is enabled, however the corresponding device driver > is not enabled in the kernel. In such a scenario the sync_state callback > is never called, because the genpd consumer never probes. As a result > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > system suspend, which results in a substantially higher power consumption > in S3. > > I wonder if genpd (and some other frameworks) needs something like > regulator_init_complete(), which turns off unused regulators 30s after > system boot. That's conceptually similar to the current > genpd_power_off_unused(), but would provide time for modules being loaded. Just for completeness, there are at least a few other similar concepts in the kernel where the kernel needs to decide that it's going to stop waiting for modules to show up and it just shuts off anything that's unused. The other one that jumps to the top of my head is related to "driver_deferred_probe_timeout". There we give 10 seconds (by default) for userspace to load modules. After that point in time we start returning errors instead of waiting longer. You can even see that the default depends on whether "CONFIG_MODULES" is set. -Doug
On 02/02/2023 20:24, Matthias Kaehlcke wrote: > Hi Abel, > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: >> Currently, there are cases when a domain needs to remain enabled until >> the consumer driver probes. Sometimes such consumer drivers may be built >> as modules. Since the genpd_power_off_unused is called too early for >> such consumer driver modules to get a chance to probe, the domain, since >> it is unused, will get disabled. On the other hand, the best time for >> an unused domain to be disabled is on the provider's sync_state >> callback. So, if the provider has registered a sync_state callback, >> assume the unused domains for that provider will be disabled on its >> sync_state callback. Also provide a generic sync_state callback which >> disables all the domains unused for the provider that registers it. >> >> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> >> --- >> >> This approach has been applied for unused clocks as well. >> With this patch merged in, all the providers that have sync_state >> callback registered will leave the domains enabled unless the provider's >> sync_state callback explicitly disables them. So those providers will >> need to add the disabling part to their sync_state callback. On the >> other hand, the platforms that have cases where domains need to remain >> enabled (even if unused) until the consumer driver probes, will be able, >> with this patch in, to run without the pd_ignore_unused kernel argument, >> which seems to be the case for most Qualcomm platforms, at this moment. > > I recently encountered a related issue on a Qualcomm platform with a > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > highest corner until sync_state"). The issue involves a DT node with a > rpmhpd, the DT node is enabled, however the corresponding device driver > is not enabled in the kernel. In such a scenario the sync_state callback > is never called, because the genpd consumer never probes. As a result > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > system suspend, which results in a substantially higher power consumption > in S3. > > I wonder if genpd (and some other frameworks) needs something like > regulator_init_complete(), which turns off unused regulators 30s after > system boot. That's conceptually similar to the current > genpd_power_off_unused(), but would provide time for modules being loaded. I think the overall goal is to move away from ad-hoc implementations like clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards the sync_state. So inherently one either has to provide drivers for all devices in question or disable unused devices in DT. > >> The v1 is here: >> https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ >> >> Changes since v1: >> * added a generic sync state callback to be registered by providers in >> order to disable the unused domains on their sync state. Also >> mentioned this in the commit message. >> >> drivers/base/power/domain.c | 17 ++++++++++++++++- >> include/linux/pm_domain.h | 3 +++ >> 2 files changed, 19 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c >> index 84662d338188..c2a5f77c01f3 100644 >> --- a/drivers/base/power/domain.c >> +++ b/drivers/base/power/domain.c >> @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) >> mutex_lock(&gpd_list_lock); >> >> list_for_each_entry(genpd, &gpd_list, gpd_list_node) >> - genpd_queue_power_off_work(genpd); >> + if (!dev_has_sync_state(genpd->provider->dev)) >> + genpd_queue_power_off_work(genpd); >> >> mutex_unlock(&gpd_list_lock); >> >> @@ -1107,6 +1108,20 @@ static int __init genpd_power_off_unused(void) >> } >> late_initcall(genpd_power_off_unused); >> >> +void genpd_power_off_unused_sync_state(struct device *dev) >> +{ >> + struct generic_pm_domain *genpd; >> + >> + mutex_lock(&gpd_list_lock); >> + >> + list_for_each_entry(genpd, &gpd_list, gpd_list_node) >> + if (genpd->provider->dev == dev) >> + genpd_queue_power_off_work(genpd); >> + >> + mutex_unlock(&gpd_list_lock); >> +} >> +EXPORT_SYMBOL_GPL(genpd_power_off_unused_sync_state); >> + >> #ifdef CONFIG_PM_SLEEP >> >> /** >> diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h >> index f776fb93eaa0..1fd5aa500c81 100644 >> --- a/include/linux/pm_domain.h >> +++ b/include/linux/pm_domain.h >> @@ -351,6 +351,7 @@ struct device *genpd_dev_pm_attach_by_id(struct device *dev, >> unsigned int index); >> struct device *genpd_dev_pm_attach_by_name(struct device *dev, >> const char *name); >> +void genpd_power_off_unused_sync_state(struct device *dev); >> #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */ >> static inline int of_genpd_add_provider_simple(struct device_node *np, >> struct generic_pm_domain *genpd) >> @@ -419,6 +420,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) >> { >> return ERR_PTR(-EOPNOTSUPP); >> } >> + >> +static inline genpd_power_off_unused_sync_state(struct device *dev) {} >> #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */ >> >> #ifdef CONFIG_PM >> -- >> 2.34.1 >>
Hi, On Thu, Feb 2, 2023 at 11:53 AM Dmitry Baryshkov <dmitry.baryshkov@linaro.org> wrote: > > On 02/02/2023 20:24, Matthias Kaehlcke wrote: > > Hi Abel, > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > >> Currently, there are cases when a domain needs to remain enabled until > >> the consumer driver probes. Sometimes such consumer drivers may be built > >> as modules. Since the genpd_power_off_unused is called too early for > >> such consumer driver modules to get a chance to probe, the domain, since > >> it is unused, will get disabled. On the other hand, the best time for > >> an unused domain to be disabled is on the provider's sync_state > >> callback. So, if the provider has registered a sync_state callback, > >> assume the unused domains for that provider will be disabled on its > >> sync_state callback. Also provide a generic sync_state callback which > >> disables all the domains unused for the provider that registers it. > >> > >> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > >> --- > >> > >> This approach has been applied for unused clocks as well. > >> With this patch merged in, all the providers that have sync_state > >> callback registered will leave the domains enabled unless the provider's > >> sync_state callback explicitly disables them. So those providers will > >> need to add the disabling part to their sync_state callback. On the > >> other hand, the platforms that have cases where domains need to remain > >> enabled (even if unused) until the consumer driver probes, will be able, > >> with this patch in, to run without the pd_ignore_unused kernel argument, > >> which seems to be the case for most Qualcomm platforms, at this moment. > > > > I recently encountered a related issue on a Qualcomm platform with a > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > highest corner until sync_state"). The issue involves a DT node with a > > rpmhpd, the DT node is enabled, however the corresponding device driver > > is not enabled in the kernel. In such a scenario the sync_state callback > > is never called, because the genpd consumer never probes. As a result > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > system suspend, which results in a substantially higher power consumption > > in S3. > > > > I wonder if genpd (and some other frameworks) needs something like > > regulator_init_complete(), which turns off unused regulators 30s after > > system boot. That's conceptually similar to the current > > genpd_power_off_unused(), but would provide time for modules being loaded. > > I think the overall goal is to move away from ad-hoc implementations > like clk_disable_unused/genpd_power_off_unused/regulator_init_complete > towards the sync_state. > > So inherently one either has to provide drivers for all devices in > question or disable unused devices in DT. Hmm. I guess I haven't been involved too much in those discussions, but overall I thought: 1. The device tree should ideally be describing the hardware. Thus if the hardware is there / available to use on a given board then the device should be marked enabled. 2. Users are not actually required to enable drivers for all hardware on their board. Things should still function OK even if a driver is disabled. For instance, if the SoC had a crypto accelerator you'd describe it in the device tree but it would be OK for someone to build a kernel that didn't enable the crypto accelerator driver. Am I mistaken? Which point did I get wrong, #1, or #2? -Doug
Hi Dmitry, On Thu, Feb 02, 2023 at 09:53:41PM +0200, Dmitry Baryshkov wrote: > On 02/02/2023 20:24, Matthias Kaehlcke wrote: > > Hi Abel, > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > Currently, there are cases when a domain needs to remain enabled until > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > as modules. Since the genpd_power_off_unused is called too early for > > > such consumer driver modules to get a chance to probe, the domain, since > > > it is unused, will get disabled. On the other hand, the best time for > > > an unused domain to be disabled is on the provider's sync_state > > > callback. So, if the provider has registered a sync_state callback, > > > assume the unused domains for that provider will be disabled on its > > > sync_state callback. Also provide a generic sync_state callback which > > > disables all the domains unused for the provider that registers it. > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > --- > > > > > > This approach has been applied for unused clocks as well. > > > With this patch merged in, all the providers that have sync_state > > > callback registered will leave the domains enabled unless the provider's > > > sync_state callback explicitly disables them. So those providers will > > > need to add the disabling part to their sync_state callback. On the > > > other hand, the platforms that have cases where domains need to remain > > > enabled (even if unused) until the consumer driver probes, will be able, > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > I recently encountered a related issue on a Qualcomm platform with a > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > highest corner until sync_state"). The issue involves a DT node with a > > rpmhpd, the DT node is enabled, however the corresponding device driver > > is not enabled in the kernel. In such a scenario the sync_state callback > > is never called, because the genpd consumer never probes. As a result > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > system suspend, which results in a substantially higher power consumption > > in S3. > > > > I wonder if genpd (and some other frameworks) needs something like > > regulator_init_complete(), which turns off unused regulators 30s after > > system boot. That's conceptually similar to the current > > genpd_power_off_unused(), but would provide time for modules being loaded. > > I think the overall goal is to move away from ad-hoc implementations like > clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards > the sync_state. I generally agree with the goal of using common mechanisms whenever possible. > So inherently one either has to provide drivers for all devices in question > or disable unused devices in DT. I don't think that's a great solution, it essentially hands the issue down to the users or downstream maintainers of the kernel, who might not be aware that there is an issue, nor know about the specifics of genpd (or interconnects and clocks which have similar problems). In general symptoms are probably subtle, like a (potentially substantially) increased power consumption during system suspend. The issue might have been introduced by an update to a newer kernel, which now includes a DT node for a new SoC feature which wasn't supported by the 'old' kernel. It's common practice to use the 'old' .config, at least as a starting point, which obviously doesn't enable the new driver. That happend to me with [1] when testing v6.1. It took me quite some time to track the 'culprit' commit down and then some debugging to understand what's going on. Shortly after that I ran into a related issue involving genpds when testing v6.2-rc, which again took a non-trivial amount of time to track down (and I'm familiar with the SoC platform and the general nature of the issue). I don't think it's reasonable to expect every user/downstream maintainer of an impacted system to go through this, one person at a time. Maybe there could be a generic solution for drivers with a 'sync_state' callback, e.g. a the driver (or framework) could have a 'sync_state_timeout' callback (or similar), which is called by the driver framework if 'sync_state' wasn't called (for example) 30s after the device was probed. Then the provider can power off or throttle unclaimed resources. m. [1] https://lore.kernel.org/lkml/20220902043511.17130-5-quic_rjendra@quicinc.com/ > > > The v1 is here: > > > https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ > > > > > > Changes since v1: > > > * added a generic sync state callback to be registered by providers in > > > order to disable the unused domains on their sync state. Also > > > mentioned this in the commit message. > > > > > > drivers/base/power/domain.c | 17 ++++++++++++++++- > > > include/linux/pm_domain.h | 3 +++ > > > 2 files changed, 19 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > > > index 84662d338188..c2a5f77c01f3 100644 > > > --- a/drivers/base/power/domain.c > > > +++ b/drivers/base/power/domain.c > > > @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) > > > mutex_lock(&gpd_list_lock); > > > list_for_each_entry(genpd, &gpd_list, gpd_list_node) > > > - genpd_queue_power_off_work(genpd); > > > + if (!dev_has_sync_state(genpd->provider->dev)) > > > + genpd_queue_power_off_work(genpd); > > > mutex_unlock(&gpd_list_lock); > > > @@ -1107,6 +1108,20 @@ static int __init genpd_power_off_unused(void) > > > } > > > late_initcall(genpd_power_off_unused); > > > +void genpd_power_off_unused_sync_state(struct device *dev) > > > +{ > > > + struct generic_pm_domain *genpd; > > > + > > > + mutex_lock(&gpd_list_lock); > > > + > > > + list_for_each_entry(genpd, &gpd_list, gpd_list_node) > > > + if (genpd->provider->dev == dev) > > > + genpd_queue_power_off_work(genpd); > > > + > > > + mutex_unlock(&gpd_list_lock); > > > +} > > > +EXPORT_SYMBOL_GPL(genpd_power_off_unused_sync_state); > > > + > > > #ifdef CONFIG_PM_SLEEP > > > /** > > > diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h > > > index f776fb93eaa0..1fd5aa500c81 100644 > > > --- a/include/linux/pm_domain.h > > > +++ b/include/linux/pm_domain.h > > > @@ -351,6 +351,7 @@ struct device *genpd_dev_pm_attach_by_id(struct device *dev, > > > unsigned int index); > > > struct device *genpd_dev_pm_attach_by_name(struct device *dev, > > > const char *name); > > > +void genpd_power_off_unused_sync_state(struct device *dev); > > > #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */ > > > static inline int of_genpd_add_provider_simple(struct device_node *np, > > > struct generic_pm_domain *genpd) > > > @@ -419,6 +420,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) > > > { > > > return ERR_PTR(-EOPNOTSUPP); > > > } > > > + > > > +static inline genpd_power_off_unused_sync_state(struct device *dev) {} > > > #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */ > > > #ifdef CONFIG_PM > > > -- > > > 2.34.1 > > > > > -- > With best wishes > Dmitry >
On 03/02/2023 03:20, Matthias Kaehlcke wrote: > Hi Dmitry, > > On Thu, Feb 02, 2023 at 09:53:41PM +0200, Dmitry Baryshkov wrote: >> On 02/02/2023 20:24, Matthias Kaehlcke wrote: >>> Hi Abel, >>> >>> On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: >>>> Currently, there are cases when a domain needs to remain enabled until >>>> the consumer driver probes. Sometimes such consumer drivers may be built >>>> as modules. Since the genpd_power_off_unused is called too early for >>>> such consumer driver modules to get a chance to probe, the domain, since >>>> it is unused, will get disabled. On the other hand, the best time for >>>> an unused domain to be disabled is on the provider's sync_state >>>> callback. So, if the provider has registered a sync_state callback, >>>> assume the unused domains for that provider will be disabled on its >>>> sync_state callback. Also provide a generic sync_state callback which >>>> disables all the domains unused for the provider that registers it. >>>> >>>> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> >>>> --- >>>> >>>> This approach has been applied for unused clocks as well. >>>> With this patch merged in, all the providers that have sync_state >>>> callback registered will leave the domains enabled unless the provider's >>>> sync_state callback explicitly disables them. So those providers will >>>> need to add the disabling part to their sync_state callback. On the >>>> other hand, the platforms that have cases where domains need to remain >>>> enabled (even if unused) until the consumer driver probes, will be able, >>>> with this patch in, to run without the pd_ignore_unused kernel argument, >>>> which seems to be the case for most Qualcomm platforms, at this moment. >>> >>> I recently encountered a related issue on a Qualcomm platform with a >>> v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use >>> highest corner until sync_state"). The issue involves a DT node with a >>> rpmhpd, the DT node is enabled, however the corresponding device driver >>> is not enabled in the kernel. In such a scenario the sync_state callback >>> is never called, because the genpd consumer never probes. As a result >>> the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during >>> system suspend, which results in a substantially higher power consumption >>> in S3. >>> >>> I wonder if genpd (and some other frameworks) needs something like >>> regulator_init_complete(), which turns off unused regulators 30s after >>> system boot. That's conceptually similar to the current >>> genpd_power_off_unused(), but would provide time for modules being loaded. >> >> I think the overall goal is to move away from ad-hoc implementations like >> clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards >> the sync_state. > > I generally agree with the goal of using common mechanisms whenever possible. > >> So inherently one either has to provide drivers for all devices in question >> or disable unused devices in DT. > > I don't think that's a great solution, it essentially hands the issue down to > the users or downstream maintainers of the kernel, who might not be aware that > there is an issue, nor know about the specifics of genpd (or interconnects and > clocks which have similar problems). The goal is to move the control down to individual drivers. Previously we had issues with clk_disable_unused() disabling mdss/mdp clocks incorrectly, which frequently led to broken display output. Other clock/genpd/regulator drivers might have other internal dependencies. Thus it is not really possible to handle resource shutdown in the common (framework) code. > > In general symptoms are probably subtle, like a (potentially substantially) > increased power consumption during system suspend. The issue might have been > introduced by an update to a newer kernel, which now includes a DT node for a > new SoC feature which wasn't supported by the 'old' kernel. It's common > practice to use the 'old' .config, at least as a starting point, which > obviously doesn't enable the new driver. That happend to me with [1] when > testing v6.1. It took me quite some time to track the 'culprit' commit down > and then some debugging to understand what's going on. Shortly after that I > ran into a related issue involving genpds when testing v6.2-rc, which again > took a non-trivial amount of time to track down (and I'm familiar with the SoC > platform and the general nature of the issue). I don't think it's reasonable > to expect every user/downstream maintainer of an impacted system to go through > this, one person at a time. I think it would be nice to have some way of 'sync_pending' debug available (compare this to debugfs/devices_deferred). Note, we are trying to make sure that all supported drivers are enabled at least as modules (if possible). If we fail, please send a patch fixing the defconfig. > Maybe there could be a generic solution for drivers with a 'sync_state' > callback, e.g. a the driver (or framework) could have a 'sync_state_timeout' > callback (or similar), which is called by the driver framework if 'sync_state' > wasn't called (for example) 30s after the device was probed. Then the provider > can power off or throttle unclaimed resources. I might be missing a point somewhere, but for me it looks like a logical solution. Please send a proposal.
On Fri, Feb 03, 2023 at 10:00:27PM +0200, Dmitry Baryshkov wrote: > On 03/02/2023 03:20, Matthias Kaehlcke wrote: > > Hi Dmitry, > > > > On Thu, Feb 02, 2023 at 09:53:41PM +0200, Dmitry Baryshkov wrote: > > > On 02/02/2023 20:24, Matthias Kaehlcke wrote: > > > > Hi Abel, > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > > > Currently, there are cases when a domain needs to remain enabled until > > > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > > > as modules. Since the genpd_power_off_unused is called too early for > > > > > such consumer driver modules to get a chance to probe, the domain, since > > > > > it is unused, will get disabled. On the other hand, the best time for > > > > > an unused domain to be disabled is on the provider's sync_state > > > > > callback. So, if the provider has registered a sync_state callback, > > > > > assume the unused domains for that provider will be disabled on its > > > > > sync_state callback. Also provide a generic sync_state callback which > > > > > disables all the domains unused for the provider that registers it. > > > > > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > > > --- > > > > > > > > > > This approach has been applied for unused clocks as well. > > > > > With this patch merged in, all the providers that have sync_state > > > > > callback registered will leave the domains enabled unless the provider's > > > > > sync_state callback explicitly disables them. So those providers will > > > > > need to add the disabling part to their sync_state callback. On the > > > > > other hand, the platforms that have cases where domains need to remain > > > > > enabled (even if unused) until the consumer driver probes, will be able, > > > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > is never called, because the genpd consumer never probes. As a result > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > system suspend, which results in a substantially higher power consumption > > > > in S3. > > > > > > > > I wonder if genpd (and some other frameworks) needs something like > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > system boot. That's conceptually similar to the current > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > I think the overall goal is to move away from ad-hoc implementations like > > > clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards > > > the sync_state. > > > > I generally agree with the goal of using common mechanisms whenever possible. > > > > > So inherently one either has to provide drivers for all devices in question > > > or disable unused devices in DT. > > > > I don't think that's a great solution, it essentially hands the issue down to > > the users or downstream maintainers of the kernel, who might not be aware that > > there is an issue, nor know about the specifics of genpd (or interconnects and > > clocks which have similar problems). > > The goal is to move the control down to individual drivers. Previously we > had issues with clk_disable_unused() disabling mdss/mdp clocks incorrectly, > which frequently led to broken display output. Other clock/genpd/regulator > drivers might have other internal dependencies. Thus it is not really > possible to handle resource shutdown in the common (framework) code. > > > > > In general symptoms are probably subtle, like a (potentially substantially) > > increased power consumption during system suspend. The issue might have been > > introduced by an update to a newer kernel, which now includes a DT node for a > > new SoC feature which wasn't supported by the 'old' kernel. It's common > > practice to use the 'old' .config, at least as a starting point, which > > obviously doesn't enable the new driver. That happend to me with [1] when > > testing v6.1. It took me quite some time to track the 'culprit' commit down > > and then some debugging to understand what's going on. Shortly after that I > > ran into a related issue involving genpds when testing v6.2-rc, which again > > took a non-trivial amount of time to track down (and I'm familiar with the SoC > > platform and the general nature of the issue). I don't think it's reasonable > > to expect every user/downstream maintainer of an impacted system to go through > > this, one person at a time. > > I think it would be nice to have some way of 'sync_pending' debug available > (compare this to debugfs/devices_deferred). Most folks are probably not even aware that they have a 'sync_state' issue and wouldn't look in debugfs, so I think this would have to be something proactive, like a warning log that is enabled by default (possibly with the option to disable it). Something in debugfs could be a nice complement. > Note, we are trying to make sure that all supported drivers are enabled at > least as modules (if possible). If we fail, please send a patch fixing the > defconfig. That's great, however not everybody uses the defconfig, it's just a default. > > Maybe there could be a generic solution for drivers with a 'sync_state' > > callback, e.g. a the driver (or framework) could have a 'sync_state_timeout' > > callback (or similar), which is called by the driver framework if 'sync_state' > > wasn't called (for example) 30s after the device was probed. Then the provider > > can power off or throttle unclaimed resources. > > I might be missing a point somewhere, but for me it looks like a logical > solution. Please send a proposal. I started working on a patch, I'll probably send it out next week if I don't encounter any evident major issues.
On 23-02-03 22:00:27, Dmitry Baryshkov wrote: > On 03/02/2023 03:20, Matthias Kaehlcke wrote: > > Hi Dmitry, > > > > On Thu, Feb 02, 2023 at 09:53:41PM +0200, Dmitry Baryshkov wrote: > > > On 02/02/2023 20:24, Matthias Kaehlcke wrote: > > > > Hi Abel, > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > > > Currently, there are cases when a domain needs to remain enabled until > > > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > > > as modules. Since the genpd_power_off_unused is called too early for > > > > > such consumer driver modules to get a chance to probe, the domain, since > > > > > it is unused, will get disabled. On the other hand, the best time for > > > > > an unused domain to be disabled is on the provider's sync_state > > > > > callback. So, if the provider has registered a sync_state callback, > > > > > assume the unused domains for that provider will be disabled on its > > > > > sync_state callback. Also provide a generic sync_state callback which > > > > > disables all the domains unused for the provider that registers it. > > > > > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > > > --- > > > > > > > > > > This approach has been applied for unused clocks as well. > > > > > With this patch merged in, all the providers that have sync_state > > > > > callback registered will leave the domains enabled unless the provider's > > > > > sync_state callback explicitly disables them. So those providers will > > > > > need to add the disabling part to their sync_state callback. On the > > > > > other hand, the platforms that have cases where domains need to remain > > > > > enabled (even if unused) until the consumer driver probes, will be able, > > > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > is never called, because the genpd consumer never probes. As a result > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > system suspend, which results in a substantially higher power consumption > > > > in S3. > > > > > > > > I wonder if genpd (and some other frameworks) needs something like > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > system boot. That's conceptually similar to the current > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > I think the overall goal is to move away from ad-hoc implementations like > > > clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards > > > the sync_state. > > > > I generally agree with the goal of using common mechanisms whenever possible. > > > > > So inherently one either has to provide drivers for all devices in question > > > or disable unused devices in DT. > > > > I don't think that's a great solution, it essentially hands the issue down to > > the users or downstream maintainers of the kernel, who might not be aware that > > there is an issue, nor know about the specifics of genpd (or interconnects and > > clocks which have similar problems). > > The goal is to move the control down to individual drivers. Previously we > had issues with clk_disable_unused() disabling mdss/mdp clocks incorrectly, > which frequently led to broken display output. Other clock/genpd/regulator > drivers might have other internal dependencies. Thus it is not really > possible to handle resource shutdown in the common (framework) code. > > > > > In general symptoms are probably subtle, like a (potentially substantially) > > increased power consumption during system suspend. The issue might have been > > introduced by an update to a newer kernel, which now includes a DT node for a > > new SoC feature which wasn't supported by the 'old' kernel. It's common > > practice to use the 'old' .config, at least as a starting point, which > > obviously doesn't enable the new driver. That happend to me with [1] when > > testing v6.1. It took me quite some time to track the 'culprit' commit down > > and then some debugging to understand what's going on. Shortly after that I > > ran into a related issue involving genpds when testing v6.2-rc, which again > > took a non-trivial amount of time to track down (and I'm familiar with the SoC > > platform and the general nature of the issue). I don't think it's reasonable > > to expect every user/downstream maintainer of an impacted system to go through > > this, one person at a time. > > I think it would be nice to have some way of 'sync_pending' debug available > (compare this to debugfs/devices_deferred). There is actually a 'state_synced' sysfs interface (per device) that either shows 0, meaning it hasn't reach sync_state yet, or the file is not available at all, meaning it has reached sync_state. > > Note, we are trying to make sure that all supported drivers are enabled at > least as modules (if possible). If we fail, please send a patch fixing the > defconfig. > > > Maybe there could be a generic solution for drivers with a 'sync_state' > > callback, e.g. a the driver (or framework) could have a 'sync_state_timeout' > > callback (or similar), which is called by the driver framework if 'sync_state' > > wasn't called (for example) 30s after the device was probed. Then the provider > > can power off or throttle unclaimed resources. > > I might be missing a point somewhere, but for me it looks like a logical > solution. Please send a proposal. > > -- > With best wishes > Dmitry >
On 23-02-06 18:21:20, Abel Vesa wrote: > On 23-02-03 22:00:27, Dmitry Baryshkov wrote: > > On 03/02/2023 03:20, Matthias Kaehlcke wrote: > > > Hi Dmitry, > > > > > > On Thu, Feb 02, 2023 at 09:53:41PM +0200, Dmitry Baryshkov wrote: > > > > On 02/02/2023 20:24, Matthias Kaehlcke wrote: > > > > > Hi Abel, > > > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > > > > Currently, there are cases when a domain needs to remain enabled until > > > > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > > > > as modules. Since the genpd_power_off_unused is called too early for > > > > > > such consumer driver modules to get a chance to probe, the domain, since > > > > > > it is unused, will get disabled. On the other hand, the best time for > > > > > > an unused domain to be disabled is on the provider's sync_state > > > > > > callback. So, if the provider has registered a sync_state callback, > > > > > > assume the unused domains for that provider will be disabled on its > > > > > > sync_state callback. Also provide a generic sync_state callback which > > > > > > disables all the domains unused for the provider that registers it. > > > > > > > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > > > > --- > > > > > > > > > > > > This approach has been applied for unused clocks as well. > > > > > > With this patch merged in, all the providers that have sync_state > > > > > > callback registered will leave the domains enabled unless the provider's > > > > > > sync_state callback explicitly disables them. So those providers will > > > > > > need to add the disabling part to their sync_state callback. On the > > > > > > other hand, the platforms that have cases where domains need to remain > > > > > > enabled (even if unused) until the consumer driver probes, will be able, > > > > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > > is never called, because the genpd consumer never probes. As a result > > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > > system suspend, which results in a substantially higher power consumption > > > > > in S3. > > > > > > > > > > I wonder if genpd (and some other frameworks) needs something like > > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > > system boot. That's conceptually similar to the current > > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > > > I think the overall goal is to move away from ad-hoc implementations like > > > > clk_disable_unused/genpd_power_off_unused/regulator_init_complete towards > > > > the sync_state. > > > > > > I generally agree with the goal of using common mechanisms whenever possible. > > > > > > > So inherently one either has to provide drivers for all devices in question > > > > or disable unused devices in DT. > > > > > > I don't think that's a great solution, it essentially hands the issue down to > > > the users or downstream maintainers of the kernel, who might not be aware that > > > there is an issue, nor know about the specifics of genpd (or interconnects and > > > clocks which have similar problems). > > > > The goal is to move the control down to individual drivers. Previously we > > had issues with clk_disable_unused() disabling mdss/mdp clocks incorrectly, > > which frequently led to broken display output. Other clock/genpd/regulator > > drivers might have other internal dependencies. Thus it is not really > > possible to handle resource shutdown in the common (framework) code. > > > > > > > > In general symptoms are probably subtle, like a (potentially substantially) > > > increased power consumption during system suspend. The issue might have been > > > introduced by an update to a newer kernel, which now includes a DT node for a > > > new SoC feature which wasn't supported by the 'old' kernel. It's common > > > practice to use the 'old' .config, at least as a starting point, which > > > obviously doesn't enable the new driver. That happend to me with [1] when > > > testing v6.1. It took me quite some time to track the 'culprit' commit down > > > and then some debugging to understand what's going on. Shortly after that I > > > ran into a related issue involving genpds when testing v6.2-rc, which again > > > took a non-trivial amount of time to track down (and I'm familiar with the SoC > > > platform and the general nature of the issue). I don't think it's reasonable > > > to expect every user/downstream maintainer of an impacted system to go through > > > this, one person at a time. > > > > I think it would be nice to have some way of 'sync_pending' debug available > > (compare this to debugfs/devices_deferred). > > There is actually a 'state_synced' sysfs interface (per device) that > either shows 0, meaning it hasn't reach sync_state yet, or the file is > not available at all, meaning it has reached sync_state. For the sake of correctness, drivers that have sync_state callback registered have the state_synced attribute. 0 means not state synced yet, 1 means already states synced. According to: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-devices-state_synced > > > > > Note, we are trying to make sure that all supported drivers are enabled at > > least as modules (if possible). If we fail, please send a patch fixing the > > defconfig. > > > > > Maybe there could be a generic solution for drivers with a 'sync_state' > > > callback, e.g. a the driver (or framework) could have a 'sync_state_timeout' > > > callback (or similar), which is called by the driver framework if 'sync_state' > > > wasn't called (for example) 30s after the device was probed. Then the provider > > > can power off or throttle unclaimed resources. > > > > I might be missing a point somewhere, but for me it looks like a logical > > solution. Please send a proposal. > > > > -- > > With best wishes > > Dmitry > >
On Mon, Feb 06, 2023 at 06:31:21PM +0200, Abel Vesa wrote: > On 23-02-02 18:24:15, Matthias Kaehlcke wrote: > > Hi Abel, > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > Currently, there are cases when a domain needs to remain enabled until > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > as modules. Since the genpd_power_off_unused is called too early for > > > such consumer driver modules to get a chance to probe, the domain, since > > > it is unused, will get disabled. On the other hand, the best time for > > > an unused domain to be disabled is on the provider's sync_state > > > callback. So, if the provider has registered a sync_state callback, > > > assume the unused domains for that provider will be disabled on its > > > sync_state callback. Also provide a generic sync_state callback which > > > disables all the domains unused for the provider that registers it. > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > --- > > > > > > This approach has been applied for unused clocks as well. > > > With this patch merged in, all the providers that have sync_state > > > callback registered will leave the domains enabled unless the provider's > > > sync_state callback explicitly disables them. So those providers will > > > need to add the disabling part to their sync_state callback. On the > > > other hand, the platforms that have cases where domains need to remain > > > enabled (even if unused) until the consumer driver probes, will be able, > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > I recently encountered a related issue on a Qualcomm platform with a > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > highest corner until sync_state"). The issue involves a DT node with a > > rpmhpd, the DT node is enabled, however the corresponding device driver > > is not enabled in the kernel. In such a scenario the sync_state callback > > is never called, because the genpd consumer never probes. As a result > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > system suspend, which results in a substantially higher power consumption > > in S3. > > If I get this correctly, one of the providers is missing (doesn't matter > the reason), in which case, your kernel needs that driver, period. There > is no reason why you would expect the consumer to work without the > provider. Or, you could just remove the property in the devicetree node, > the property that makes the consumer wait for that provider. Anyway, you > should never end up with a consumer provider relationship in devicetree > without providing the provider driver. I would agree if it was actually a provider that's missing, however it's a 'missing' consumer that prevents the sync_state() call. > > I wonder if genpd (and some other frameworks) needs something like > > regulator_init_complete(), which turns off unused regulators 30s after > > system boot. That's conceptually similar to the current > > genpd_power_off_unused(), but would provide time for modules being loaded. > > NACK, timeouts are just another hack in this case, specially when we > have a pretty reliable mechanism like sync_state. It does not work properly unless all consumers are probed successfully. It makes sense to wait some time for the consumers to probe, but not eternally, it's perfectly valid that a driver for a (potential) consumer is not enabled.
On Mon, Feb 06, 2023 at 06:24:30PM +0200, Abel Vesa wrote: > On 23-02-02 14:20:56, Doug Anderson wrote: > > Hi, > > > > On Thu, Feb 2, 2023 at 11:53 AM Dmitry Baryshkov > > <dmitry.baryshkov@linaro.org> wrote: > > > > > > On 02/02/2023 20:24, Matthias Kaehlcke wrote: > > > > Hi Abel, > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > >> Currently, there are cases when a domain needs to remain enabled until > > > >> the consumer driver probes. Sometimes such consumer drivers may be built > > > >> as modules. Since the genpd_power_off_unused is called too early for > > > >> such consumer driver modules to get a chance to probe, the domain, since > > > >> it is unused, will get disabled. On the other hand, the best time for > > > >> an unused domain to be disabled is on the provider's sync_state > > > >> callback. So, if the provider has registered a sync_state callback, > > > >> assume the unused domains for that provider will be disabled on its > > > >> sync_state callback. Also provide a generic sync_state callback which > > > >> disables all the domains unused for the provider that registers it. > > > >> > > > >> Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > >> --- > > > >> > > > >> This approach has been applied for unused clocks as well. > > > >> With this patch merged in, all the providers that have sync_state > > > >> callback registered will leave the domains enabled unless the provider's > > > >> sync_state callback explicitly disables them. So those providers will > > > >> need to add the disabling part to their sync_state callback. On the > > > >> other hand, the platforms that have cases where domains need to remain > > > >> enabled (even if unused) until the consumer driver probes, will be able, > > > >> with this patch in, to run without the pd_ignore_unused kernel argument, > > > >> which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > is never called, because the genpd consumer never probes. As a result > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > system suspend, which results in a substantially higher power consumption > > > > in S3. > > > > > > > > I wonder if genpd (and some other frameworks) needs something like > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > system boot. That's conceptually similar to the current > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > I think the overall goal is to move away from ad-hoc implementations > > > like clk_disable_unused/genpd_power_off_unused/regulator_init_complete > > > towards the sync_state. > > > > > > So inherently one either has to provide drivers for all devices in > > > question or disable unused devices in DT. > > > > Hmm. I guess I haven't been involved too much in those discussions, > > but overall I thought: > > > > 1. The device tree should ideally be describing the hardware. Thus if > > the hardware is there / available to use on a given board then the > > device should be marked enabled. > > That is correct. > > > > > 2. Users are not actually required to enable drivers for all hardware > > on their board. Things should still function OK even if a driver is > > disabled. For instance, if the SoC had a crypto accelerator you'd > > describe it in the device tree but it would be OK for someone to build > > a kernel that didn't enable the crypto accelerator driver. > > Right, but sync state is relying on fw_devlinks to decide if there are > any consumers left that still need to probe. So if one of the consumer > devicetree nodes needs some provider, the consumer will simply not work > without the provider. In theory at least. It is correct that a device which actually depends on a provider won't work without that provider, however that isn't the case here. In the scenario above the consumer is the crypto accelerator. It doesn't probe because it's driver is not enabled, not because it's waiting for a provider (clk, interconnect, ...).
On Mon, Feb 06, 2023 at 07:48:23PM +0200, Abel Vesa wrote: > > CC'ed Saravana > > On 23-02-06 17:22:23, Matthias Kaehlcke wrote: > > On Mon, Feb 06, 2023 at 06:31:21PM +0200, Abel Vesa wrote: > > > On 23-02-02 18:24:15, Matthias Kaehlcke wrote: > > > > Hi Abel, > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > > > Currently, there are cases when a domain needs to remain enabled until > > > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > > > as modules. Since the genpd_power_off_unused is called too early for > > > > > such consumer driver modules to get a chance to probe, the domain, since > > > > > it is unused, will get disabled. On the other hand, the best time for > > > > > an unused domain to be disabled is on the provider's sync_state > > > > > callback. So, if the provider has registered a sync_state callback, > > > > > assume the unused domains for that provider will be disabled on its > > > > > sync_state callback. Also provide a generic sync_state callback which > > > > > disables all the domains unused for the provider that registers it. > > > > > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > > > --- > > > > > > > > > > This approach has been applied for unused clocks as well. > > > > > With this patch merged in, all the providers that have sync_state > > > > > callback registered will leave the domains enabled unless the provider's > > > > > sync_state callback explicitly disables them. So those providers will > > > > > need to add the disabling part to their sync_state callback. On the > > > > > other hand, the platforms that have cases where domains need to remain > > > > > enabled (even if unused) until the consumer driver probes, will be able, > > > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > is never called, because the genpd consumer never probes. As a result > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > system suspend, which results in a substantially higher power consumption > > > > in S3. > > > > > > If I get this correctly, one of the providers is missing (doesn't matter > > > the reason), in which case, your kernel needs that driver, period. There > > > is no reason why you would expect the consumer to work without the > > > provider. Or, you could just remove the property in the devicetree node, > > > the property that makes the consumer wait for that provider. Anyway, you > > > should never end up with a consumer provider relationship in devicetree > > > without providing the provider driver. > > > > I would agree if it was actually a provider that's missing, however it's a > > 'missing' consumer that prevents the sync_state() call. > > Oh, my bad. > > Still, why would you keep the consumer node enabled in devicetree if you don't > intend to allow its driver to ever probe? As Doug pointed out, the device tree is supposed to describe the hardware, but that shouldn't impose a user/admin/downstream maintainer to use every single existing piece of hardware with the potential power implications on battery powererd devices. I someone uses an off the shelf board like a Raspberry Pi for a project in which they only use a subset of the functionality, they would be forced to use a downstream device tree if they can't just disable the drivers they are not interested in. Supposedly we want people/companies to use upstream kernels as much as possible, however you suggest to adapt the device tree in a way that does not describe the hardware, which effectively forces folks to user downstream kernels (or at least device trees). > > > > I wonder if genpd (and some other frameworks) needs something like > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > system boot. That's conceptually similar to the current > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > NACK, timeouts are just another hack in this case, specially when we > > > have a pretty reliable mechanism like sync_state. > > > > It does not work properly unless all consumers are probed successfully. It > > makes sense to wait some time for the consumers to probe, but not eternally, > > it's perfectly valid that a driver for a (potential) consumer is not enabled. > > Usually, if you have a consumer devicetree node that you consider it > should not probe, you should consider disabling that node in your board > dts, specially if you don't intend to provide its driver. Nope, the device tree is supposed to describe the hardware and the hardware doesn't change because a particular use case doesn't require/want the use of all existing parts. > Again, timeouts are bad all-around. What happens if rootfs doesn't get > mounted in time? Will 30 seconds be enough for every scenario? What > happens if I want to load the driver (module) for a consumer a day after boot? I am not sure if I have a complete/correct picture here. My understanding is that sync_state is above all used for handing over critical hardware from the bootloader to the kernel. For example the CPUs may require clocks and interconnects to run at a certain speed during boot, before they are 'probed' and can ask the provider to run the resource (at least) at a certain speed. I would expect that drivers on the roots aren't critical for the system to enter the rootfs, all these should be either built-in or as modules on an initramfs. Let's say we have an audio driver that uses a clock (provider) and this audio driver lives as a modules on the rootfs. For some reason our rootfs takes a long time to mount and sync_state() of the clock provider is called, due to the timeout we introduced. The provider determines that it hasn't received any requests for the audio clock and disables it. Now finally the rootfs is mounted and out audio module is loaded. The audio device probes, gets the clock and asks the provider to run it at certain speed. The clock provider puts the clock at the requested speed and audio works as if the sync_state timeout never happened. Am I missing/misunderstanding anything important? > IMHO, I think even the regulator_init_complete should be switched to some sync > state approach. Maybe, with a timeout :)
On Mon, Feb 6, 2023 at 9:48 AM Abel Vesa <abel.vesa@linaro.org> wrote: > > > CC'ed Saravana Thanks. Please do cc me for stuff like this from the start. I skimmed the series and I think it's doing one of my TODO items. So, thanks for the patch! I'll take a closer look within a few days -- trying to get through some existing fw_devlink stuff. But long story short, it is the right thing to keep a supplier on indefinitely if there's a consumer device (that's not disabled in DT) that never gets probed. It's a pretty common scenario -- for example, say a display backlight. The default case should be functional correctness. And then we can add stuff that allows changing this behavior with command line args or something else that can be done from userspace. +1 to what Doug said elsewhere in this thread too. I'm trying to consolidate the "when do we give up" decision at the driver core level independent of what framework is being used. -Saravana > > On 23-02-06 17:22:23, Matthias Kaehlcke wrote: > > On Mon, Feb 06, 2023 at 06:31:21PM +0200, Abel Vesa wrote: > > > On 23-02-02 18:24:15, Matthias Kaehlcke wrote: > > > > Hi Abel, > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > > > Currently, there are cases when a domain needs to remain enabled until > > > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > > > as modules. Since the genpd_power_off_unused is called too early for > > > > > such consumer driver modules to get a chance to probe, the domain, since > > > > > it is unused, will get disabled. On the other hand, the best time for > > > > > an unused domain to be disabled is on the provider's sync_state > > > > > callback. So, if the provider has registered a sync_state callback, > > > > > assume the unused domains for that provider will be disabled on its > > > > > sync_state callback. Also provide a generic sync_state callback which > > > > > disables all the domains unused for the provider that registers it. > > > > > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > > > --- > > > > > > > > > > This approach has been applied for unused clocks as well. > > > > > With this patch merged in, all the providers that have sync_state > > > > > callback registered will leave the domains enabled unless the provider's > > > > > sync_state callback explicitly disables them. So those providers will > > > > > need to add the disabling part to their sync_state callback. On the > > > > > other hand, the platforms that have cases where domains need to remain > > > > > enabled (even if unused) until the consumer driver probes, will be able, > > > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > is never called, because the genpd consumer never probes. As a result > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > system suspend, which results in a substantially higher power consumption > > > > in S3. > > > > > > If I get this correctly, one of the providers is missing (doesn't matter > > > the reason), in which case, your kernel needs that driver, period. There > > > is no reason why you would expect the consumer to work without the > > > provider. Or, you could just remove the property in the devicetree node, > > > the property that makes the consumer wait for that provider. Anyway, you > > > should never end up with a consumer provider relationship in devicetree > > > without providing the provider driver. > > > > I would agree if it was actually a provider that's missing, however it's a > > 'missing' consumer that prevents the sync_state() call. > > Oh, my bad. > > Still, why would you keep the consumer node enabled in devicetree if you don't > intend to allow its driver to ever probe? > > > > > > > I wonder if genpd (and some other frameworks) needs something like > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > system boot. That's conceptually similar to the current > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > NACK, timeouts are just another hack in this case, specially when we > > > have a pretty reliable mechanism like sync_state. > > > > It does not work properly unless all consumers are probed successfully. It > > makes sense to wait some time for the consumers to probe, but not eternally, > > it's perfectly valid that a driver for a (potential) consumer is not enabled. > > Usually, if you have a consumer devicetree node that you consider it > should not probe, you should consider disabling that node in your board > dts, specially if you don't intend to provide its driver. > > Again, timeouts are bad all-around. What happens if rootfs doesn't get > mounted in time? Will 30 seconds be enough for every scenario? What > happens if I want to load the driver (module) for a consumer a day after boot? > > IMHO, I think even the regulator_init_complete should be switched to some sync > state approach.
Hi, On Mon, Feb 6, 2023 at 11:33 AM Saravana Kannan <saravanak@google.com> wrote: > > On Mon, Feb 6, 2023 at 9:48 AM Abel Vesa <abel.vesa@linaro.org> wrote: > > > > > > CC'ed Saravana > > Thanks. Please do cc me for stuff like this from the start. I skimmed > the series and I think it's doing one of my TODO items. So, thanks for > the patch! > > I'll take a closer look within a few days -- trying to get through > some existing fw_devlink stuff. > > But long story short, it is the right thing to keep a supplier on > indefinitely if there's a consumer device (that's not disabled in DT) > that never gets probed. It's a pretty common scenario -- for example, > say a display backlight. The default case should be functional > correctness. And then we can add stuff that allows changing this > behavior with command line args or something else that can be done > from userspace. > > +1 to what Doug said elsewhere in this thread too. I'm trying to > consolidate the "when do we give up" decision at the driver core level > independent of what framework is being used. I'm not really sure I agree with the above, at least not without lots of discussion in the community. It really goes against what the kernel has been doing for years and years in the regulator and clock frameworks. Those frameworks both eventually give up and power down resources that no active drivers are using. Either changing the regulator/clock frameworks or saying that other frameworks should work in an opposite way seems like a recipe for confusion. Now, certainly I won't say that the way that the regulator and clock frameworks function is perfect nor will I say that they don't cause any problems. However, going the opposite way where resources are kept at full power indefinitely will _also_ cause problems. Specifically, let's look at the case you mentioned of a display backlight. I think you're saying that if there is no backlight driver enabled in the kernel that you'd expect the backlight to just be on at full brightness. Would you expect this even if the firmware didn't leave the backlight on? In any case, why do you say it's more correct? I suppose you'd say that the screen is at least usable like this. ...except that you've broken a different feature: suspend/resume. Without being able to turn the backlight off at suspend time the device would drain tons of power. It could also overheat when you stuffed it in your backpack and damage the battery or start a fire. Even if you argue that in the case of the display backlight you're better off, what about a keyboard backlight? It's pretty easy to use a laptop without the keyboard backlight and if you didn't have a driver for it you'd be in better shape leaving it off instead of leaving it on 100% of the time, even when the device is suspended. Overall: if a kernel isn't configured for a given driver we shouldn't be expecting the hardware controlled by that driver to work. The best we can hope for is that it's at least in a low power state. In general I think that having a well-defined way to know it's time to give up and power off anything for which a driver didn't probe needs to be an important part of any designs here. > -Saravana > > > > > On 23-02-06 17:22:23, Matthias Kaehlcke wrote: > > > On Mon, Feb 06, 2023 at 06:31:21PM +0200, Abel Vesa wrote: > > > > On 23-02-02 18:24:15, Matthias Kaehlcke wrote: > > > > > Hi Abel, > > > > > > > > > > On Fri, Jan 27, 2023 at 12:40:53PM +0200, Abel Vesa wrote: > > > > > > Currently, there are cases when a domain needs to remain enabled until > > > > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > > > > as modules. Since the genpd_power_off_unused is called too early for > > > > > > such consumer driver modules to get a chance to probe, the domain, since > > > > > > it is unused, will get disabled. On the other hand, the best time for > > > > > > an unused domain to be disabled is on the provider's sync_state > > > > > > callback. So, if the provider has registered a sync_state callback, > > > > > > assume the unused domains for that provider will be disabled on its > > > > > > sync_state callback. Also provide a generic sync_state callback which > > > > > > disables all the domains unused for the provider that registers it. > > > > > > > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > > > > --- > > > > > > > > > > > > This approach has been applied for unused clocks as well. > > > > > > With this patch merged in, all the providers that have sync_state > > > > > > callback registered will leave the domains enabled unless the provider's > > > > > > sync_state callback explicitly disables them. So those providers will > > > > > > need to add the disabling part to their sync_state callback. On the > > > > > > other hand, the platforms that have cases where domains need to remain > > > > > > enabled (even if unused) until the consumer driver probes, will be able, > > > > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > > > > > > > I recently encountered a related issue on a Qualcomm platform with a > > > > > v6.2-rc kernel, which includes 3a39049f88e4 ("soc: qcom: rpmhpd: Use > > > > > highest corner until sync_state"). The issue involves a DT node with a > > > > > rpmhpd, the DT node is enabled, however the corresponding device driver > > > > > is not enabled in the kernel. In such a scenario the sync_state callback > > > > > is never called, because the genpd consumer never probes. As a result > > > > > the Always-on subsystem (AOSS) of the SoC doesn't enter sleep mode during > > > > > system suspend, which results in a substantially higher power consumption > > > > > in S3. > > > > > > > > If I get this correctly, one of the providers is missing (doesn't matter > > > > the reason), in which case, your kernel needs that driver, period. There > > > > is no reason why you would expect the consumer to work without the > > > > provider. Or, you could just remove the property in the devicetree node, > > > > the property that makes the consumer wait for that provider. Anyway, you > > > > should never end up with a consumer provider relationship in devicetree > > > > without providing the provider driver. > > > > > > I would agree if it was actually a provider that's missing, however it's a > > > 'missing' consumer that prevents the sync_state() call. > > > > Oh, my bad. > > > > Still, why would you keep the consumer node enabled in devicetree if you don't > > intend to allow its driver to ever probe? > > > > > > > > > > I wonder if genpd (and some other frameworks) needs something like > > > > > regulator_init_complete(), which turns off unused regulators 30s after > > > > > system boot. That's conceptually similar to the current > > > > > genpd_power_off_unused(), but would provide time for modules being loaded. > > > > > > > > NACK, timeouts are just another hack in this case, specially when we > > > > have a pretty reliable mechanism like sync_state. > > > > > > It does not work properly unless all consumers are probed successfully. It > > > makes sense to wait some time for the consumers to probe, but not eternally, > > > it's perfectly valid that a driver for a (potential) consumer is not enabled. > > > > Usually, if you have a consumer devicetree node that you consider it > > should not probe, you should consider disabling that node in your board > > dts, specially if you don't intend to provide its driver. > > > > Again, timeouts are bad all-around. What happens if rootfs doesn't get > > mounted in time? Will 30 seconds be enough for every scenario? What > > happens if I want to load the driver (module) for a consumer a day after boot? > > > > IMHO, I think even the regulator_init_complete should be switched to some sync > > state approach.
On Fri, 27 Jan 2023 at 11:40, Abel Vesa <abel.vesa@linaro.org> wrote: > > Currently, there are cases when a domain needs to remain enabled until > the consumer driver probes. Sometimes such consumer drivers may be built > as modules. Since the genpd_power_off_unused is called too early for > such consumer driver modules to get a chance to probe, the domain, since > it is unused, will get disabled. On the other hand, the best time for > an unused domain to be disabled is on the provider's sync_state > callback. So, if the provider has registered a sync_state callback, > assume the unused domains for that provider will be disabled on its > sync_state callback. Also provide a generic sync_state callback which > disables all the domains unused for the provider that registers it. > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > --- > > This approach has been applied for unused clocks as well. > With this patch merged in, all the providers that have sync_state > callback registered will leave the domains enabled unless the provider's > sync_state callback explicitly disables them. So those providers will > need to add the disabling part to their sync_state callback. On the > other hand, the platforms that have cases where domains need to remain > enabled (even if unused) until the consumer driver probes, will be able, > with this patch in, to run without the pd_ignore_unused kernel argument, > which seems to be the case for most Qualcomm platforms, at this moment. My apologies for the somewhat late reply. Please see my comments below. > > The v1 is here: > https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ > > Changes since v1: > * added a generic sync state callback to be registered by providers in > order to disable the unused domains on their sync state. Also > mentioned this in the commit message. > > drivers/base/power/domain.c | 17 ++++++++++++++++- > include/linux/pm_domain.h | 3 +++ > 2 files changed, 19 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > index 84662d338188..c2a5f77c01f3 100644 > --- a/drivers/base/power/domain.c > +++ b/drivers/base/power/domain.c > @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) > mutex_lock(&gpd_list_lock); > > list_for_each_entry(genpd, &gpd_list, gpd_list_node) > - genpd_queue_power_off_work(genpd); > + if (!dev_has_sync_state(genpd->provider->dev)) Unfortunately, this doesn't really help, due to the fact that a genpd's ->power_off() callback may get called anyway. At power off, the genpd core only cares about those consumers that are currently attached, not those that might get attached at some point later in time. In other words, it's the responsibility for each specific genpd provider to cope with the condition that its ->sync_state() callback may *not* have been called, while its ->power_off() callback is being called. In these cases, the genpd provider should probably make the ->power_off() callback to return -EBUSY. This is what we do in psci_pd_power_off(), for example. > + genpd_queue_power_off_work(genpd); > > mutex_unlock(&gpd_list_lock); > > @@ -1107,6 +1108,20 @@ static int __init genpd_power_off_unused(void) > } > late_initcall(genpd_power_off_unused); > > +void genpd_power_off_unused_sync_state(struct device *dev) > +{ > + struct generic_pm_domain *genpd; > + > + mutex_lock(&gpd_list_lock); > + > + list_for_each_entry(genpd, &gpd_list, gpd_list_node) > + if (genpd->provider->dev == dev) > + genpd_queue_power_off_work(genpd); > + > + mutex_unlock(&gpd_list_lock); > +} > +EXPORT_SYMBOL_GPL(genpd_power_off_unused_sync_state); I don't think this function is needed at all. In fact, this part of the problem that you are trying to solve should already be managed by the driver core, as it calls dev->pm_domain->sync() (which is assigned to genpd_dev_pm_sync()) , in really_probe(). Or isn't that taking care of the problem for you? > + > #ifdef CONFIG_PM_SLEEP > > /** > diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h > index f776fb93eaa0..1fd5aa500c81 100644 > --- a/include/linux/pm_domain.h > +++ b/include/linux/pm_domain.h > @@ -351,6 +351,7 @@ struct device *genpd_dev_pm_attach_by_id(struct device *dev, > unsigned int index); > struct device *genpd_dev_pm_attach_by_name(struct device *dev, > const char *name); > +void genpd_power_off_unused_sync_state(struct device *dev); > #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */ > static inline int of_genpd_add_provider_simple(struct device_node *np, > struct generic_pm_domain *genpd) > @@ -419,6 +420,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) > { > return ERR_PTR(-EOPNOTSUPP); > } > + > +static inline genpd_power_off_unused_sync_state(struct device *dev) {} > #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */ > > #ifdef CONFIG_PM > -- > 2.34.1 > Kind regards Uffe
On 23-02-15 12:57:54, Ulf Hansson wrote: > On Fri, 27 Jan 2023 at 11:40, Abel Vesa <abel.vesa@linaro.org> wrote: > > > > Currently, there are cases when a domain needs to remain enabled until > > the consumer driver probes. Sometimes such consumer drivers may be built > > as modules. Since the genpd_power_off_unused is called too early for > > such consumer driver modules to get a chance to probe, the domain, since > > it is unused, will get disabled. On the other hand, the best time for > > an unused domain to be disabled is on the provider's sync_state > > callback. So, if the provider has registered a sync_state callback, > > assume the unused domains for that provider will be disabled on its > > sync_state callback. Also provide a generic sync_state callback which > > disables all the domains unused for the provider that registers it. > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > --- > > > > This approach has been applied for unused clocks as well. > > With this patch merged in, all the providers that have sync_state > > callback registered will leave the domains enabled unless the provider's > > sync_state callback explicitly disables them. So those providers will > > need to add the disabling part to their sync_state callback. On the > > other hand, the platforms that have cases where domains need to remain > > enabled (even if unused) until the consumer driver probes, will be able, > > with this patch in, to run without the pd_ignore_unused kernel argument, > > which seems to be the case for most Qualcomm platforms, at this moment. > > My apologies for the somewhat late reply. Please see my comments below. > > > > > The v1 is here: > > https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ > > > > Changes since v1: > > * added a generic sync state callback to be registered by providers in > > order to disable the unused domains on their sync state. Also > > mentioned this in the commit message. > > > > drivers/base/power/domain.c | 17 ++++++++++++++++- > > include/linux/pm_domain.h | 3 +++ > > 2 files changed, 19 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > > index 84662d338188..c2a5f77c01f3 100644 > > --- a/drivers/base/power/domain.c > > +++ b/drivers/base/power/domain.c > > @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) > > mutex_lock(&gpd_list_lock); > > > > list_for_each_entry(genpd, &gpd_list, gpd_list_node) > > - genpd_queue_power_off_work(genpd); > > + if (!dev_has_sync_state(genpd->provider->dev)) > > Unfortunately, this doesn't really help, due to the fact that a > genpd's ->power_off() callback may get called anyway. At power off, > the genpd core only cares about those consumers that are currently > attached, not those that might get attached at some point later in > time. > > In other words, it's the responsibility for each specific genpd > provider to cope with the condition that its ->sync_state() callback > may *not* have been called, while its ->power_off() callback is being > called. > > In these cases, the genpd provider should probably make the > ->power_off() callback to return -EBUSY. This is what we do in > psci_pd_power_off(), for example. > Hmm, this might actually be a better idea. Bjorn, do you agree? > > + genpd_queue_power_off_work(genpd); > > > > mutex_unlock(&gpd_list_lock); > > > > @@ -1107,6 +1108,20 @@ static int __init genpd_power_off_unused(void) > > } > > late_initcall(genpd_power_off_unused); > > > > +void genpd_power_off_unused_sync_state(struct device *dev) > > +{ > > + struct generic_pm_domain *genpd; > > + > > + mutex_lock(&gpd_list_lock); > > + > > + list_for_each_entry(genpd, &gpd_list, gpd_list_node) > > + if (genpd->provider->dev == dev) > > + genpd_queue_power_off_work(genpd); > > + > > + mutex_unlock(&gpd_list_lock); > > +} > > +EXPORT_SYMBOL_GPL(genpd_power_off_unused_sync_state); > > I don't think this function is needed at all. > > In fact, this part of the problem that you are trying to solve should > already be managed by the driver core, as it calls > dev->pm_domain->sync() (which is assigned to genpd_dev_pm_sync()) , in > really_probe(). Or isn't that taking care of the problem for you? Hmm, I missed the genpd_dev_pm_sync scenario entirely. Yes, that is actually what is needed, and yes, this function I added here is useless in this case. > > > + > > #ifdef CONFIG_PM_SLEEP > > > > /** > > diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h > > index f776fb93eaa0..1fd5aa500c81 100644 > > --- a/include/linux/pm_domain.h > > +++ b/include/linux/pm_domain.h > > @@ -351,6 +351,7 @@ struct device *genpd_dev_pm_attach_by_id(struct device *dev, > > unsigned int index); > > struct device *genpd_dev_pm_attach_by_name(struct device *dev, > > const char *name); > > +void genpd_power_off_unused_sync_state(struct device *dev); > > #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */ > > static inline int of_genpd_add_provider_simple(struct device_node *np, > > struct generic_pm_domain *genpd) > > @@ -419,6 +420,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) > > { > > return ERR_PTR(-EOPNOTSUPP); > > } > > + > > +static inline genpd_power_off_unused_sync_state(struct device *dev) {} > > #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */ > > > > #ifdef CONFIG_PM > > -- > > 2.34.1 > > > > Kind regards > Uffe
On Wed, Feb 15, 2023 at 02:21:53PM +0200, Abel Vesa wrote: > On 23-02-15 12:57:54, Ulf Hansson wrote: > > On Fri, 27 Jan 2023 at 11:40, Abel Vesa <abel.vesa@linaro.org> wrote: > > > > > > Currently, there are cases when a domain needs to remain enabled until > > > the consumer driver probes. Sometimes such consumer drivers may be built > > > as modules. Since the genpd_power_off_unused is called too early for > > > such consumer driver modules to get a chance to probe, the domain, since > > > it is unused, will get disabled. On the other hand, the best time for > > > an unused domain to be disabled is on the provider's sync_state > > > callback. So, if the provider has registered a sync_state callback, > > > assume the unused domains for that provider will be disabled on its > > > sync_state callback. Also provide a generic sync_state callback which > > > disables all the domains unused for the provider that registers it. > > > > > > Signed-off-by: Abel Vesa <abel.vesa@linaro.org> > > > --- > > > > > > This approach has been applied for unused clocks as well. > > > With this patch merged in, all the providers that have sync_state > > > callback registered will leave the domains enabled unless the provider's > > > sync_state callback explicitly disables them. So those providers will > > > need to add the disabling part to their sync_state callback. On the > > > other hand, the platforms that have cases where domains need to remain > > > enabled (even if unused) until the consumer driver probes, will be able, > > > with this patch in, to run without the pd_ignore_unused kernel argument, > > > which seems to be the case for most Qualcomm platforms, at this moment. > > > > My apologies for the somewhat late reply. Please see my comments below. > > > > > > > > The v1 is here: > > > https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ > > > > > > Changes since v1: > > > * added a generic sync state callback to be registered by providers in > > > order to disable the unused domains on their sync state. Also > > > mentioned this in the commit message. > > > > > > drivers/base/power/domain.c | 17 ++++++++++++++++- > > > include/linux/pm_domain.h | 3 +++ > > > 2 files changed, 19 insertions(+), 1 deletion(-) > > > > > > diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c > > > index 84662d338188..c2a5f77c01f3 100644 > > > --- a/drivers/base/power/domain.c > > > +++ b/drivers/base/power/domain.c > > > @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) > > > mutex_lock(&gpd_list_lock); > > > > > > list_for_each_entry(genpd, &gpd_list, gpd_list_node) > > > - genpd_queue_power_off_work(genpd); > > > + if (!dev_has_sync_state(genpd->provider->dev)) > > > > Unfortunately, this doesn't really help, due to the fact that a > > genpd's ->power_off() callback may get called anyway. At power off, > > the genpd core only cares about those consumers that are currently > > attached, not those that might get attached at some point later in > > time. > > > > In other words, it's the responsibility for each specific genpd > > provider to cope with the condition that its ->sync_state() callback > > may *not* have been called, while its ->power_off() callback is being > > called. > > > > In these cases, the genpd provider should probably make the > > ->power_off() callback to return -EBUSY. This is what we do in > > psci_pd_power_off(), for example. > > > > Hmm, this might actually be a better idea. Bjorn, do you agree? > Yes, I agree. Regards, Bjorn
On Tue, Feb 07, 2023 at 03:45:35PM -0800, Doug Anderson wrote: > Hi, > > On Mon, Feb 6, 2023 at 1:35 PM Saravana Kannan <saravanak@google.com> wrote: > > > > On Mon, Feb 6, 2023 at 1:10 PM Doug Anderson <dianders@chromium.org> wrote: > > > > > > Hi, > > > > > > On Mon, Feb 6, 2023 at 11:33 AM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > On Mon, Feb 6, 2023 at 9:48 AM Abel Vesa <abel.vesa@linaro.org> wrote: > > > > > > > > > > > > > > > CC'ed Saravana > > > > > > > > Thanks. Please do cc me for stuff like this from the start. I skimmed > > > > the series and I think it's doing one of my TODO items. So, thanks for > > > > the patch! > > > > > > > > I'll take a closer look within a few days -- trying to get through > > > > some existing fw_devlink stuff. > > > > > > > > But long story short, it is the right thing to keep a supplier on > > > > indefinitely if there's a consumer device (that's not disabled in DT) > > > > that never gets probed. It's a pretty common scenario -- for example, > > > > say a display backlight. The default case should be functional > > > > correctness. And then we can add stuff that allows changing this > > > > behavior with command line args or something else that can be done > > > > from userspace. > > > > > > > > +1 to what Doug said elsewhere in this thread too. I'm trying to > > > > consolidate the "when do we give up" decision at the driver core level > > > > independent of what framework is being used. > > > > > > I'm not really sure I agree with the above, at least not without lots > > > of discussion in the community. It really goes against what the kernel > > > has been doing for years and years in the regulator and clock > > > frameworks. Those frameworks both eventually give up and power down > > > resources that no active drivers are using. Either changing the > > > regulator/clock frameworks or saying that other frameworks should work > > > in an opposite way seems like a recipe for confusion. > > > > > > Now, certainly I won't say that the way that the regulator and clock > > > frameworks function is perfect nor will I say that they don't cause > > > any problems. However, going the opposite way where resources are kept > > > at full power indefinitely will _also_ cause problems. > > > > > > Specifically, let's look at the case you mentioned of a display > > > backlight. I think you're saying that if there is no backlight driver > > > enabled in the kernel that you'd expect the backlight to just be on at > > > full brightness. > > > > No, I'm not saying that. > > > > > Would you expect this even if the firmware didn't > > > leave the backlight on? > > > > sync_state() never turns on anything that wasn't already on at boot. > > So in your example, if the firmware didn't turn on the backlight, then > > it'll remain off. > > As per offline discussion, part of the problems are that today this > _isn't_ true for a few Qualcomm things (like interconnect). The > interconnect frameway specifically maxes things out for early boot. > The problem being solved here is that the bootloader leaves some vote at 1GB/s, as needed by hardware related to driver B. Driver A is loaded first and votes for 1kb/s; what should the kernel do now, without knowledge of the needs from the hardware associated with B, or the ability to read back the bootloader's votes. This was the behavior of the initial implementation, and the practical implications was seen as the UART would typically come along really early, cast a low vote on the various buses and it would take forever to get to the probing of the drivers that actually gave us reasonable votes. Also consider the case where driver A probes, votes for bandwidth, does it's initialization and then votes for 0. Without making assumptions about the needs of B (or a potential B even), we'd turn off critical resources - possible preventing us from ever attempting to probe B. As such, the only safe solution is to assume that there might be a later loaded/probed client that has a large vote and preemptively vote for some higher bandwidth until then. > > > > In any case, why do you say it's more correct? > > > > Because if you turn off the display, the device is unusable. In other > > circumstances, it can crash a device because the firmware powered it > > on left it in a "good enough" state, but we'd go turn it off and crash > > the system. > > > > > I suppose you'd say that the screen is at least usable like this. > > > ...except that you've broken a different feature: suspend/resume. > > > > If the display is off and the laptop is unusable, then we have bigger > > problems than suspend/resume? > > I suspect that here we'll have to agree to disagree. IMO it's a > non-goal to expect hardware to work for which there is no driver. So > making the backlight work without a backlight driver isn't really > something we should strive for. > Without trying to make you agree ;) How can you differentiate between "the driver wasn't built" and "the driver isn't yet available"? Consider the case where I boot my laptop, I have some set of builtin drivers, some set of drivers in the ramdisk and some set of drivers in the root filesystem. In the event that something goes wrong mounting the rootfs, I will now be in the ramdisk console. Given the current timer-based disabling of regulators, I have ~25 seconds to solve my problem before the backlight goes blank. Obviously this isn't a typical scenario in a consumer device, but it seems conceivable that your ramdisk would run fsck for some amount of time before mounting the rootfs and picking up the last tier of drivers. Regards, Bjorn
On Mon, Feb 20, 2023 at 09:15:50AM -0800, Bjorn Andersson wrote: > On Tue, Feb 07, 2023 at 03:45:35PM -0800, Doug Anderson wrote: > > Hi, > > > > On Mon, Feb 6, 2023 at 1:35 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > On Mon, Feb 6, 2023 at 1:10 PM Doug Anderson <dianders@chromium.org> wrote: > > > > > > > > Hi, > > > > > > > > On Mon, Feb 6, 2023 at 11:33 AM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > > > On Mon, Feb 6, 2023 at 9:48 AM Abel Vesa <abel.vesa@linaro.org> wrote: > > > > > > > > > > > > > > > > > > CC'ed Saravana > > > > > > > > > > Thanks. Please do cc me for stuff like this from the start. I skimmed > > > > > the series and I think it's doing one of my TODO items. So, thanks for > > > > > the patch! > > > > > > > > > > I'll take a closer look within a few days -- trying to get through > > > > > some existing fw_devlink stuff. > > > > > > > > > > But long story short, it is the right thing to keep a supplier on > > > > > indefinitely if there's a consumer device (that's not disabled in DT) > > > > > that never gets probed. It's a pretty common scenario -- for example, > > > > > say a display backlight. The default case should be functional > > > > > correctness. And then we can add stuff that allows changing this > > > > > behavior with command line args or something else that can be done > > > > > from userspace. > > > > > > > > > > +1 to what Doug said elsewhere in this thread too. I'm trying to > > > > > consolidate the "when do we give up" decision at the driver core level > > > > > independent of what framework is being used. > > > > > > > > I'm not really sure I agree with the above, at least not without lots > > > > of discussion in the community. It really goes against what the kernel > > > > has been doing for years and years in the regulator and clock > > > > frameworks. Those frameworks both eventually give up and power down > > > > resources that no active drivers are using. Either changing the > > > > regulator/clock frameworks or saying that other frameworks should work > > > > in an opposite way seems like a recipe for confusion. > > > > > > > > Now, certainly I won't say that the way that the regulator and clock > > > > frameworks function is perfect nor will I say that they don't cause > > > > any problems. However, going the opposite way where resources are kept > > > > at full power indefinitely will _also_ cause problems. > > > > > > > > Specifically, let's look at the case you mentioned of a display > > > > backlight. I think you're saying that if there is no backlight driver > > > > enabled in the kernel that you'd expect the backlight to just be on at > > > > full brightness. > > > > > > No, I'm not saying that. > > > > > > > Would you expect this even if the firmware didn't > > > > leave the backlight on? > > > > > > sync_state() never turns on anything that wasn't already on at boot. > > > So in your example, if the firmware didn't turn on the backlight, then > > > it'll remain off. > > > > As per offline discussion, part of the problems are that today this > > _isn't_ true for a few Qualcomm things (like interconnect). The > > interconnect frameway specifically maxes things out for early boot. > > > > The problem being solved here is that the bootloader leaves some vote at > 1GB/s, as needed by hardware related to driver B. > > Driver A is loaded first and votes for 1kb/s; what should the kernel do > now, without knowledge of the needs from the hardware associated with B, > or the ability to read back the bootloader's votes. > > This was the behavior of the initial implementation, and the practical > implications was seen as the UART would typically come along really > early, cast a low vote on the various buses and it would take forever to > get to the probing of the drivers that actually gave us reasonable > votes. I generally understand this problem and agree that it makes sense to bump the resources *initially*. Doug and I primarily question the 'wait forever' part of it. > Also consider the case where driver A probes, votes for bandwidth, does > it's initialization and then votes for 0. Without making assumptions > about the needs of B (or a potential B even), we'd turn off critical > resources - possible preventing us from ever attempting to probe B. For the most critical devices that are probed during early boot this would still work if the resources are initially bumped and then turned off after some timeout. Could you provide an example for some other type of device that is/would be probed later? Except for auto-probing buses like USB or PCI the device should probe regardless of the resources being enabled and then vote during probe for the required bandwidth, voltage, etc., which should put the resources into the required state. Am I missing something here? > As such, the only safe solution is to assume that there might be a later > loaded/probed client that has a large vote and preemptively vote for > some higher bandwidth until then. > > > > In any case, why do you say it's more correct? > > > > > > Because if you turn off the display, the device is unusable. In other > > > circumstances, it can crash a device because the firmware powered it > > > on left it in a "good enough" state, but we'd go turn it off and crash > > > the system. > > > > > > > I suppose you'd say that the screen is at least usable like this. > > > > ...except that you've broken a different feature: suspend/resume. > > > > > > If the display is off and the laptop is unusable, then we have bigger > > > problems than suspend/resume? > > > > I suspect that here we'll have to agree to disagree. IMO it's a > > non-goal to expect hardware to work for which there is no driver. So > > making the backlight work without a backlight driver isn't really > > something we should strive for. > > > > Without trying to make you agree ;) > > How can you differentiate between "the driver wasn't built" and "the > driver isn't yet available"? Unfortunately you can't AFAIK. > Consider the case where I boot my laptop, I have some set of builtin > drivers, some set of drivers in the ramdisk and some set of drivers in > the root filesystem. > > In the event that something goes wrong mounting the rootfs, I will now > be in the ramdisk console. Given the current timer-based disabling of > regulators, I have ~25 seconds to solve my problem before the backlight > goes blank. > > > Obviously this isn't a typical scenario in a consumer device, but it > seems conceivable that your ramdisk would run fsck for some amount of > time before mounting the rootfs and picking up the last tier of drivers. If the laptop is running a kernel that is tailored for this device I'd say the most practial solution would be to either build the backlight driver into the kernel or have it on the ramdisk as a module. However the laptop might be running a distribution like Debian or Red Hat with (I assume) a single kernel+ramdisk for all systems of a given architecture (e.g. aarch64). In that case it might not be desirable to have all possible backlight drivers in the kernel image or ramdisk. For this kind of system 'wait forever' might be a suitable solution. I have the impression we might want both options, a timeout after which resources are turned off unless they have been voted for, and 'wait forever', with a Kconfig option to select the desired behavior. For most tailored systems the timout is probably a more suitable solution. The maintainer of the kernel/system can decide to not enable certain drivers because they aren't needed and include essential drivers into the kernel image or ramdisk. The timeout ensures that the system doesn't burn extra power for reasons that aren't evident for the maintainer (who might not even be aware of the whole sync_state story). On the other hand general purpose distributions might want to wait forever, since they have to support a wide range of hardware and enable most available drivers anyway. If we end up with such an option I think the timeout should be the default. Why? There are hundreds of maintainers of tailored kernels who might run into hard to detect/debug power issues with 'wait forever'. On the other hand there is a limited number of general purpose distributions, with kernel teams that probably already know about 'sync_state'. They only have to enable 'wait forever' once and then carry it forward to future versions.
On Tue, Feb 21, 2023 at 06:32:06PM +0000, Matthias Kaehlcke wrote: > On Mon, Feb 20, 2023 at 09:15:50AM -0800, Bjorn Andersson wrote: > > On Tue, Feb 07, 2023 at 03:45:35PM -0800, Doug Anderson wrote: > > > Hi, > > > > > > On Mon, Feb 6, 2023 at 1:35 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > On Mon, Feb 6, 2023 at 1:10 PM Doug Anderson <dianders@chromium.org> wrote: > > > > > > > > > > Hi, > > > > > > > > > > On Mon, Feb 6, 2023 at 11:33 AM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > > > > > On Mon, Feb 6, 2023 at 9:48 AM Abel Vesa <abel.vesa@linaro.org> wrote: > > > > > > > > > > > > > > > > > > > > > CC'ed Saravana > > > > > > > > > > > > Thanks. Please do cc me for stuff like this from the start. I skimmed > > > > > > the series and I think it's doing one of my TODO items. So, thanks for > > > > > > the patch! > > > > > > > > > > > > I'll take a closer look within a few days -- trying to get through > > > > > > some existing fw_devlink stuff. > > > > > > > > > > > > But long story short, it is the right thing to keep a supplier on > > > > > > indefinitely if there's a consumer device (that's not disabled in DT) > > > > > > that never gets probed. It's a pretty common scenario -- for example, > > > > > > say a display backlight. The default case should be functional > > > > > > correctness. And then we can add stuff that allows changing this > > > > > > behavior with command line args or something else that can be done > > > > > > from userspace. > > > > > > > > > > > > +1 to what Doug said elsewhere in this thread too. I'm trying to > > > > > > consolidate the "when do we give up" decision at the driver core level > > > > > > independent of what framework is being used. > > > > > > > > > > I'm not really sure I agree with the above, at least not without lots > > > > > of discussion in the community. It really goes against what the kernel > > > > > has been doing for years and years in the regulator and clock > > > > > frameworks. Those frameworks both eventually give up and power down > > > > > resources that no active drivers are using. Either changing the > > > > > regulator/clock frameworks or saying that other frameworks should work > > > > > in an opposite way seems like a recipe for confusion. > > > > > > > > > > Now, certainly I won't say that the way that the regulator and clock > > > > > frameworks function is perfect nor will I say that they don't cause > > > > > any problems. However, going the opposite way where resources are kept > > > > > at full power indefinitely will _also_ cause problems. > > > > > > > > > > Specifically, let's look at the case you mentioned of a display > > > > > backlight. I think you're saying that if there is no backlight driver > > > > > enabled in the kernel that you'd expect the backlight to just be on at > > > > > full brightness. > > > > > > > > No, I'm not saying that. > > > > > > > > > Would you expect this even if the firmware didn't > > > > > leave the backlight on? > > > > > > > > sync_state() never turns on anything that wasn't already on at boot. > > > > So in your example, if the firmware didn't turn on the backlight, then > > > > it'll remain off. > > > > > > As per offline discussion, part of the problems are that today this > > > _isn't_ true for a few Qualcomm things (like interconnect). The > > > interconnect frameway specifically maxes things out for early boot. > > > > > > > The problem being solved here is that the bootloader leaves some vote at > > 1GB/s, as needed by hardware related to driver B. > > > > Driver A is loaded first and votes for 1kb/s; what should the kernel do > > now, without knowledge of the needs from the hardware associated with B, > > or the ability to read back the bootloader's votes. > > > > This was the behavior of the initial implementation, and the practical > > implications was seen as the UART would typically come along really > > early, cast a low vote on the various buses and it would take forever to > > get to the probing of the drivers that actually gave us reasonable > > votes. > > I generally understand this problem and agree that it makes sense to bump > the resources *initially*. Doug and I primarily question the 'wait forever' > part of it. > The question is when "initially" ends. > > Also consider the case where driver A probes, votes for bandwidth, does > > it's initialization and then votes for 0. Without making assumptions > > about the needs of B (or a potential B even), we'd turn off critical > > resources - possible preventing us from ever attempting to probe B. > > For the most critical devices that are probed during early boot this > would still work if the resources are initially bumped and then turned > off after some timeout. > The resources that needs to be kept on are those which we rely on for the system to reach said driver 'B'. The obvious ones are those allowing us to execute code (e.g. some form of DDR vote) and things such as earlycon - until the console driver is probed properly and can cast a interconnect vote. But the typical example here is display, which depending on system configuration might rely on the hardware being able to fetch data from DDR until all the relevant display driver components is starting to take care of the voting. > Could you provide an example for some other type of device that is/would > be probed later? Except for auto-probing buses like USB or PCI the device > should probe regardless of the resources being enabled and then vote > during probe for the required bandwidth, voltage, etc., which should put > the resources into the required state. Am I missing something here? > What do you mean with "later"? We have one consistent definition of "later", and that's late_initcall(). By that definition it's pretty much everything beyond gcc, interconnect and UART is or may probe "later" (because starting userspace without a valid /dev/console is suboptimal).. Looking at something specific, if you where to try to boot sc7280 using the upstream defconfig I don't think you have anyone ensuring that the CPU is allowed to reach DDR at this point. > > As such, the only safe solution is to assume that there might be a later > > loaded/probed client that has a large vote and preemptively vote for > > some higher bandwidth until then. > > > > > > In any case, why do you say it's more correct? > > > > > > > > Because if you turn off the display, the device is unusable. In other > > > > circumstances, it can crash a device because the firmware powered it > > > > on left it in a "good enough" state, but we'd go turn it off and crash > > > > the system. > > > > > > > > > I suppose you'd say that the screen is at least usable like this. > > > > > ...except that you've broken a different feature: suspend/resume. > > > > > > > > If the display is off and the laptop is unusable, then we have bigger > > > > problems than suspend/resume? > > > > > > I suspect that here we'll have to agree to disagree. IMO it's a > > > non-goal to expect hardware to work for which there is no driver. So > > > making the backlight work without a backlight driver isn't really > > > something we should strive for. > > > > > > > Without trying to make you agree ;) > > > > How can you differentiate between "the driver wasn't built" and "the > > driver isn't yet available"? > > Unfortunately you can't AFAIK. > > > Consider the case where I boot my laptop, I have some set of builtin > > drivers, some set of drivers in the ramdisk and some set of drivers in > > the root filesystem. > > > > In the event that something goes wrong mounting the rootfs, I will now > > be in the ramdisk console. Given the current timer-based disabling of > > regulators, I have ~25 seconds to solve my problem before the backlight > > goes blank. > > > > > > Obviously this isn't a typical scenario in a consumer device, but it > > seems conceivable that your ramdisk would run fsck for some amount of > > time before mounting the rootfs and picking up the last tier of drivers. > > If the laptop is running a kernel that is tailored for this device I'd > say the most practial solution would be to either build the backlight > driver into the kernel or have it on the ramdisk as a module. > > However the laptop might be running a distribution like Debian or Red Hat > with (I assume) a single kernel+ramdisk for all systems of a given > architecture (e.g. aarch64). In that case it might not be desirable to > have all possible backlight drivers in the kernel image or ramdisk. For > this kind of system 'wait forever' might be a suitable solution. > For most users, most of the time, I don't think "wait forever" is very good either; e.g. if a distro is lacking one .ko the user would see significant worse battery performance... But at the same time, when your distro asks for your crypto key for your rootfs and the kernel decides that times up, that's pretty bad user experience as well.. > I have the impression we might want both options, a timeout after which > resources are turned off unless they have been voted for, and 'wait > forever', with a Kconfig option to select the desired behavior. > > For most tailored systems the timout is probably a more suitable solution. > The maintainer of the kernel/system can decide to not enable certain > drivers because they aren't needed and include essential drivers into > the kernel image or ramdisk. The timeout ensures that the system doesn't > burn extra power for reasons that aren't evident for the maintainer (who > might not even be aware of the whole sync_state story). > > On the other hand general purpose distributions might want to wait > forever, since they have to support a wide range of hardware and enable > most available drivers anyway. > > If we end up with such an option I think the timeout should be the > default. Why? There are hundreds of maintainers of tailored kernels > who might run into hard to detect/debug power issues with 'wait > forever'. On the other hand there is a limited number of general purpose > distributions, with kernel teams that probably already know about > 'sync_state'. They only have to enable 'wait forever' once and then > carry it forward to future versions. In your tailored system you test every hardware and driver combinations, so the wait forever seems like a very easy thing to handle. In a general purpose distribution, the distribution doesn't have a way to test every single system. So it's quite likely that they won't enable some optional set of drivers (e.g. camera, venus) and wait forever would be the wrong thing to do. Except when the user needs that backlight, or the framebuffer etc. I don't know how to solve this, the global timeout is likely the best way we have for now - at least it works, for all cases except when someone needs to stay in their initramfs for more than 30 seconds... Regards, Bjorn
On Wed, Feb 22, 2023 at 10:51:52AM -0800, Bjorn Andersson wrote: > On Tue, Feb 21, 2023 at 06:32:06PM +0000, Matthias Kaehlcke wrote: > > On Mon, Feb 20, 2023 at 09:15:50AM -0800, Bjorn Andersson wrote: > > > On Tue, Feb 07, 2023 at 03:45:35PM -0800, Doug Anderson wrote: > > > > Hi, > > > > > > > > On Mon, Feb 6, 2023 at 1:35 PM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > > > On Mon, Feb 6, 2023 at 1:10 PM Doug Anderson <dianders@chromium.org> wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > On Mon, Feb 6, 2023 at 11:33 AM Saravana Kannan <saravanak@google.com> wrote: > > > > > > > > > > > > > > On Mon, Feb 6, 2023 at 9:48 AM Abel Vesa <abel.vesa@linaro.org> wrote: > > > > > > > > > > > > > > > > > > > > > > > > CC'ed Saravana > > > > > > > > > > > > > > Thanks. Please do cc me for stuff like this from the start. I skimmed > > > > > > > the series and I think it's doing one of my TODO items. So, thanks for > > > > > > > the patch! > > > > > > > > > > > > > > I'll take a closer look within a few days -- trying to get through > > > > > > > some existing fw_devlink stuff. > > > > > > > > > > > > > > But long story short, it is the right thing to keep a supplier on > > > > > > > indefinitely if there's a consumer device (that's not disabled in DT) > > > > > > > that never gets probed. It's a pretty common scenario -- for example, > > > > > > > say a display backlight. The default case should be functional > > > > > > > correctness. And then we can add stuff that allows changing this > > > > > > > behavior with command line args or something else that can be done > > > > > > > from userspace. > > > > > > > > > > > > > > +1 to what Doug said elsewhere in this thread too. I'm trying to > > > > > > > consolidate the "when do we give up" decision at the driver core level > > > > > > > independent of what framework is being used. > > > > > > > > > > > > I'm not really sure I agree with the above, at least not without lots > > > > > > of discussion in the community. It really goes against what the kernel > > > > > > has been doing for years and years in the regulator and clock > > > > > > frameworks. Those frameworks both eventually give up and power down > > > > > > resources that no active drivers are using. Either changing the > > > > > > regulator/clock frameworks or saying that other frameworks should work > > > > > > in an opposite way seems like a recipe for confusion. > > > > > > > > > > > > Now, certainly I won't say that the way that the regulator and clock > > > > > > frameworks function is perfect nor will I say that they don't cause > > > > > > any problems. However, going the opposite way where resources are kept > > > > > > at full power indefinitely will _also_ cause problems. > > > > > > > > > > > > Specifically, let's look at the case you mentioned of a display > > > > > > backlight. I think you're saying that if there is no backlight driver > > > > > > enabled in the kernel that you'd expect the backlight to just be on at > > > > > > full brightness. > > > > > > > > > > No, I'm not saying that. > > > > > > > > > > > Would you expect this even if the firmware didn't > > > > > > leave the backlight on? > > > > > > > > > > sync_state() never turns on anything that wasn't already on at boot. > > > > > So in your example, if the firmware didn't turn on the backlight, then > > > > > it'll remain off. > > > > > > > > As per offline discussion, part of the problems are that today this > > > > _isn't_ true for a few Qualcomm things (like interconnect). The > > > > interconnect frameway specifically maxes things out for early boot. > > > > > > > > > > The problem being solved here is that the bootloader leaves some vote at > > > 1GB/s, as needed by hardware related to driver B. > > > > > > Driver A is loaded first and votes for 1kb/s; what should the kernel do > > > now, without knowledge of the needs from the hardware associated with B, > > > or the ability to read back the bootloader's votes. > > > > > > This was the behavior of the initial implementation, and the practical > > > implications was seen as the UART would typically come along really > > > early, cast a low vote on the various buses and it would take forever to > > > get to the probing of the drivers that actually gave us reasonable > > > votes. > > > > I generally understand this problem and agree that it makes sense to bump > > the resources *initially*. Doug and I primarily question the 'wait forever' > > part of it. > > > > The question is when "initially" ends. ack > > > Also consider the case where driver A probes, votes for bandwidth, does > > > it's initialization and then votes for 0. Without making assumptions > > > about the needs of B (or a potential B even), we'd turn off critical > > > resources - possible preventing us from ever attempting to probe B. > > > > For the most critical devices that are probed during early boot this > > would still work if the resources are initially bumped and then turned > > off after some timeout. > > > > The resources that needs to be kept on are those which we rely on for > the system to reach said driver 'B'. > > The obvious ones are those allowing us to execute code (e.g. some form > of DDR vote) and things such as earlycon - until the console driver is > probed properly and can cast a interconnect vote. Yes, these were the things that came to my mind and I'd expect the corresponding drivers to be either builtin or on the ramdisk, so they wouldn't be impacted if the rootfs takes longer to mount. > But the typical example here is display, which depending on system > configuration might rely on the hardware being able to fetch data from > DDR until all the relevant display driver components is starting to take > care of the voting. Thanks for the example, it's certainly desirable to keep the display working even when the rootf can't be mounted (in time). > > Could you provide an example for some other type of device that is/would > > be probed later? Except for auto-probing buses like USB or PCI the device > > should probe regardless of the resources being enabled and then vote > > during probe for the required bandwidth, voltage, etc., which should put > > the resources into the required state. Am I missing something here? > > > > What do you mean with "later"? I was thinking beyond 'early boot', you are right, there is no clear definition. > We have one consistent definition of "later", and that's > late_initcall(). By that definition it's pretty much everything beyond > gcc, interconnect and UART is or may probe "later" (because starting > userspace without a valid /dev/console is suboptimal).. > > > Looking at something specific, if you where to try to boot sc7280 using > the upstream defconfig I don't think you have anyone ensuring that the > CPU is allowed to reach DDR at this point. Why is that the case? > > > As such, the only safe solution is to assume that there might be a later > > > loaded/probed client that has a large vote and preemptively vote for > > > some higher bandwidth until then. > > > > > > > > In any case, why do you say it's more correct? > > > > > > > > > > Because if you turn off the display, the device is unusable. In other > > > > > circumstances, it can crash a device because the firmware powered it > > > > > on left it in a "good enough" state, but we'd go turn it off and crash > > > > > the system. > > > > > > > > > > > I suppose you'd say that the screen is at least usable like this. > > > > > > ...except that you've broken a different feature: suspend/resume. > > > > > > > > > > If the display is off and the laptop is unusable, then we have bigger > > > > > problems than suspend/resume? > > > > > > > > I suspect that here we'll have to agree to disagree. IMO it's a > > > > non-goal to expect hardware to work for which there is no driver. So > > > > making the backlight work without a backlight driver isn't really > > > > something we should strive for. > > > > > > > > > > Without trying to make you agree ;) > > > > > > How can you differentiate between "the driver wasn't built" and "the > > > driver isn't yet available"? > > > > Unfortunately you can't AFAIK. > > > > > Consider the case where I boot my laptop, I have some set of builtin > > > drivers, some set of drivers in the ramdisk and some set of drivers in > > > the root filesystem. > > > > > > In the event that something goes wrong mounting the rootfs, I will now > > > be in the ramdisk console. Given the current timer-based disabling of > > > regulators, I have ~25 seconds to solve my problem before the backlight > > > goes blank. > > > > > > > > > Obviously this isn't a typical scenario in a consumer device, but it > > > seems conceivable that your ramdisk would run fsck for some amount of > > > time before mounting the rootfs and picking up the last tier of drivers. > > > > If the laptop is running a kernel that is tailored for this device I'd > > say the most practial solution would be to either build the backlight > > driver into the kernel or have it on the ramdisk as a module. > > > > However the laptop might be running a distribution like Debian or Red Hat > > with (I assume) a single kernel+ramdisk for all systems of a given > > architecture (e.g. aarch64). In that case it might not be desirable to > > have all possible backlight drivers in the kernel image or ramdisk. For > > this kind of system 'wait forever' might be a suitable solution. > > > > For most users, most of the time, I don't think "wait forever" is very > good either; e.g. if a distro is lacking one .ko the user would see > significant worse battery performance... > > But at the same time, when your distro asks for your crypto key for your > rootfs and the kernel decides that times up, that's pretty bad user > experience as well.. Indeed, both options have their significant caveats :( > > I have the impression we might want both options, a timeout after which > > resources are turned off unless they have been voted for, and 'wait > > forever', with a Kconfig option to select the desired behavior. > > > > For most tailored systems the timout is probably a more suitable solution. > > The maintainer of the kernel/system can decide to not enable certain > > drivers because they aren't needed and include essential drivers into > > the kernel image or ramdisk. The timeout ensures that the system doesn't > > burn extra power for reasons that aren't evident for the maintainer (who > > might not even be aware of the whole sync_state story). > > > > On the other hand general purpose distributions might want to wait > > forever, since they have to support a wide range of hardware and enable > > most available drivers anyway. > > > > If we end up with such an option I think the timeout should be the > > default. Why? There are hundreds of maintainers of tailored kernels > > who might run into hard to detect/debug power issues with 'wait > > forever'. On the other hand there is a limited number of general purpose > > distributions, with kernel teams that probably already know about > > 'sync_state'. They only have to enable 'wait forever' once and then > > carry it forward to future versions. > > In your tailored system you test every hardware and driver combinations, > so the wait forever seems like a very easy thing to handle. I disagree For one you might not want to enable certain drivers, because the system doesn't make use of the underlying hardware. In theory (probably not as much in practice) the device tree could describe existing hardware which isn't supported by the kernel (yet). More importantly it might not be evident to the maintaner(s) of the system that they have an issue impacting power. Think of small companies with a single person or very small kernel team and no vendor support. How do they even know their system is consuming more power than it would if only all these drivers they don't really care about were enabled? Some of them will be aware of the problematic and pro-actively look out for possible issues, but I expect many aren't. It wouldn't be scalable and 'user'-friendly to put the burden on them and expect them to sort out the issues individually. > In a general purpose distribution, the distribution doesn't have a way > to test every single system. So it's quite likely that they won't enable > some optional set of drivers (e.g. camera, venus) and wait forever would > be the wrong thing to do. Except when the user needs that backlight, or > the framebuffer etc. True, even a general purpose distribution doesn't necessarily enable everything. > I don't know how to solve this, the global timeout is likely the best > way we have for now - at least it works, for all cases except when > someone needs to stay in their initramfs for more than 30 seconds... There is indeed (at least for now) no solution that addresses all the issues. A timeout sounds good to me for the time being.
Hi, On Mon, Feb 20, 2023 at 9:12 AM Bjorn Andersson <andersson@kernel.org> wrote: > > > I suspect that here we'll have to agree to disagree. IMO it's a > > non-goal to expect hardware to work for which there is no driver. So > > making the backlight work without a backlight driver isn't really > > something we should strive for. > > > > Without trying to make you agree ;) > > How can you differentiate between "the driver wasn't built" and "the > driver isn't yet available"? BTW, when I was responding to Saravana's series [1], I realized that you _can_ differentiate between these two cases, at least from a practical point of view. Specifically, when the "deferred_probe_timeout" expires then you should assume that "the driver wasn't built". Said another way, once the "deferred_probe_timeout" expires then you should assume that the driver won't be available in the future. While you still could try loading it, in general once that timeout has expired the kernel has made decisions (like making -EPROBE_DEFER non-retriable) that make it very awkward to load new drivers. Of course, one could say "hey, let's get rid of the deferred_probe_timeout". That might be a tough sell unless you can come up with an equivalent solution for those currently using this feature. [1] https://lore.kernel.org/r/CAD=FV=XQnLpD1P8sRBcizTMjCQyHTjaiNvjcPdgyZc5JCzvOtw@mail.gmail.com > Consider the case where I boot my laptop, I have some set of builtin > drivers, some set of drivers in the ramdisk and some set of drivers in > the root filesystem. > > In the event that something goes wrong mounting the rootfs, I will now > be in the ramdisk console. Given the current timer-based disabling of > regulators, I have ~25 seconds to solve my problem before the backlight > goes blank. I personally don't love the timeout. It feels like the kind of thing that userspace knows and should be able to tell the kernel. I know we don't like to put impositions on userspace, but userspace is pretty definitely involved in things like loading modules. It just makes sense (in my mind) for userspace to say when it's done and all modules for cold-plugged devices have been loaded... -Doug
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c index 84662d338188..c2a5f77c01f3 100644 --- a/drivers/base/power/domain.c +++ b/drivers/base/power/domain.c @@ -1099,7 +1099,8 @@ static int __init genpd_power_off_unused(void) mutex_lock(&gpd_list_lock); list_for_each_entry(genpd, &gpd_list, gpd_list_node) - genpd_queue_power_off_work(genpd); + if (!dev_has_sync_state(genpd->provider->dev)) + genpd_queue_power_off_work(genpd); mutex_unlock(&gpd_list_lock); @@ -1107,6 +1108,20 @@ static int __init genpd_power_off_unused(void) } late_initcall(genpd_power_off_unused); +void genpd_power_off_unused_sync_state(struct device *dev) +{ + struct generic_pm_domain *genpd; + + mutex_lock(&gpd_list_lock); + + list_for_each_entry(genpd, &gpd_list, gpd_list_node) + if (genpd->provider->dev == dev) + genpd_queue_power_off_work(genpd); + + mutex_unlock(&gpd_list_lock); +} +EXPORT_SYMBOL_GPL(genpd_power_off_unused_sync_state); + #ifdef CONFIG_PM_SLEEP /** diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h index f776fb93eaa0..1fd5aa500c81 100644 --- a/include/linux/pm_domain.h +++ b/include/linux/pm_domain.h @@ -351,6 +351,7 @@ struct device *genpd_dev_pm_attach_by_id(struct device *dev, unsigned int index); struct device *genpd_dev_pm_attach_by_name(struct device *dev, const char *name); +void genpd_power_off_unused_sync_state(struct device *dev); #else /* !CONFIG_PM_GENERIC_DOMAINS_OF */ static inline int of_genpd_add_provider_simple(struct device_node *np, struct generic_pm_domain *genpd) @@ -419,6 +420,8 @@ struct generic_pm_domain *of_genpd_remove_last(struct device_node *np) { return ERR_PTR(-EOPNOTSUPP); } + +static inline genpd_power_off_unused_sync_state(struct device *dev) {} #endif /* CONFIG_PM_GENERIC_DOMAINS_OF */ #ifdef CONFIG_PM
Currently, there are cases when a domain needs to remain enabled until the consumer driver probes. Sometimes such consumer drivers may be built as modules. Since the genpd_power_off_unused is called too early for such consumer driver modules to get a chance to probe, the domain, since it is unused, will get disabled. On the other hand, the best time for an unused domain to be disabled is on the provider's sync_state callback. So, if the provider has registered a sync_state callback, assume the unused domains for that provider will be disabled on its sync_state callback. Also provide a generic sync_state callback which disables all the domains unused for the provider that registers it. Signed-off-by: Abel Vesa <abel.vesa@linaro.org> --- This approach has been applied for unused clocks as well. With this patch merged in, all the providers that have sync_state callback registered will leave the domains enabled unless the provider's sync_state callback explicitly disables them. So those providers will need to add the disabling part to their sync_state callback. On the other hand, the platforms that have cases where domains need to remain enabled (even if unused) until the consumer driver probes, will be able, with this patch in, to run without the pd_ignore_unused kernel argument, which seems to be the case for most Qualcomm platforms, at this moment. The v1 is here: https://lore.kernel.org/all/20230126234013.3638425-1-abel.vesa@linaro.org/ Changes since v1: * added a generic sync state callback to be registered by providers in order to disable the unused domains on their sync state. Also mentioned this in the commit message. drivers/base/power/domain.c | 17 ++++++++++++++++- include/linux/pm_domain.h | 3 +++ 2 files changed, 19 insertions(+), 1 deletion(-)