Message ID | 1936685.PYKUYFuaPT@kreacher |
---|---|
State | Superseded |
Headers | show |
Series | thermal: core/ACPI: Fix processor cooling device regression | expand |
On Tue, Mar 7, 2023 at 5:47 PM Zhang, Rui <rui.zhang@intel.com> wrote: > > On Fri, 2023-03-03 at 20:23 +0100, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > When a cpufreq policy appears or goes away, the CPU cooling devices > > for > > the CPUs covered by that policy need to be updated so that the new > > processor_get_max_state() value is stored as max_state and the > > statistics in sysfs are rearranged for each of them. > > > > Do that accordingly in acpi_thermal_cpufreq_init() and > > acpi_thermal_cpufreq_exit(). > > > > Fixes: a365105c685c("thermal: sysfs: Reuse cdev->max_state") > > Reported-by: Wang, Quanxian <quanxian.wang@intel.com> > > Link: > > https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/ > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > --- > > drivers/acpi/processor_thermal.c | 16 +++++++++++++--- > > 1 file changed, 13 insertions(+), 3 deletions(-) > > > > Index: linux-pm/drivers/acpi/processor_thermal.c > > =================================================================== > > --- linux-pm.orig/drivers/acpi/processor_thermal.c > > +++ linux-pm/drivers/acpi/processor_thermal.c > > @@ -140,9 +140,14 @@ void acpi_thermal_cpufreq_init(struct cp > > ret = freq_qos_add_request(&policy->constraints, > > &pr->thermal_req, > > FREQ_QOS_MAX, INT_MAX); > > - if (ret < 0) > > + if (ret < 0) { > > pr_err("Failed to add freq constraint for CPU%d > > (%d)\n", > > cpu, ret); > > + continue; > > + } > > + > > + if (!IS_ERR(pr->cdev)) > > + thermal_cooling_device_update(pr->cdev); > > Although thermal_cooling_device_update() handles "pr->cdev == NULL" > case, I think it is better to use !IS_ERR_OR_NULL() here. Why is it? I was thinking about doing that, but then I realized that the NULL case had been covered and that's why I went for the change above. If there is a particular reason to check for NULL here, I can do that, but I'm not sure what it is.
On Fri, 2023-03-10 at 19:29 +0100, Rafael J. Wysocki wrote: > On Tue, Mar 7, 2023 at 5:47 PM Zhang, Rui <rui.zhang@intel.com> > wrote: > > On Fri, 2023-03-03 at 20:23 +0100, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > When a cpufreq policy appears or goes away, the CPU cooling > > > devices > > > for > > > the CPUs covered by that policy need to be updated so that the > > > new > > > processor_get_max_state() value is stored as max_state and the > > > statistics in sysfs are rearranged for each of them. > > > > > > Do that accordingly in acpi_thermal_cpufreq_init() and > > > acpi_thermal_cpufreq_exit(). > > > > > > Fixes: a365105c685c("thermal: sysfs: Reuse cdev->max_state") > > > Reported-by: Wang, Quanxian <quanxian.wang@intel.com> > > > Link: > > > https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/ > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > --- > > > drivers/acpi/processor_thermal.c | 16 +++++++++++++--- > > > 1 file changed, 13 insertions(+), 3 deletions(-) > > > > > > Index: linux-pm/drivers/acpi/processor_thermal.c > > > ================================================================= > > > == > > > --- linux-pm.orig/drivers/acpi/processor_thermal.c > > > +++ linux-pm/drivers/acpi/processor_thermal.c > > > @@ -140,9 +140,14 @@ void acpi_thermal_cpufreq_init(struct cp > > > ret = freq_qos_add_request(&policy->constraints, > > > &pr->thermal_req, > > > FREQ_QOS_MAX, INT_MAX); > > > - if (ret < 0) > > > + if (ret < 0) { > > > pr_err("Failed to add freq constraint for > > > CPU%d > > > (%d)\n", > > > cpu, ret); > > > + continue; > > > + } > > > + > > > + if (!IS_ERR(pr->cdev)) > > > + thermal_cooling_device_update(pr->cdev); > > > > Although thermal_cooling_device_update() handles "pr->cdev == NULL" > > case, I think it is better to use !IS_ERR_OR_NULL() here. > > Why is it? > > I was thinking about doing that, but then I realized that the NULL > case had been covered and that's why I went for the change above. If > there is a particular reason to check for NULL here, I can do that, > but I'm not sure what it is. I don't have a strong objection here. I thought this was a code bug at first glance, until I double checked t hermal_cooling_device_update(). So I think the latter would be more straight forward without introducing code complexity. thanks, rui
On Sun, Mar 12, 2023 at 3:44 PM Zhang, Rui <rui.zhang@intel.com> wrote: > > On Fri, 2023-03-10 at 19:29 +0100, Rafael J. Wysocki wrote: > > On Tue, Mar 7, 2023 at 5:47 PM Zhang, Rui <rui.zhang@intel.com> > > wrote: > > > On Fri, 2023-03-03 at 20:23 +0100, Rafael J. Wysocki wrote: > > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > > > When a cpufreq policy appears or goes away, the CPU cooling > > > > devices > > > > for > > > > the CPUs covered by that policy need to be updated so that the > > > > new > > > > processor_get_max_state() value is stored as max_state and the > > > > statistics in sysfs are rearranged for each of them. > > > > > > > > Do that accordingly in acpi_thermal_cpufreq_init() and > > > > acpi_thermal_cpufreq_exit(). > > > > > > > > Fixes: a365105c685c("thermal: sysfs: Reuse cdev->max_state") > > > > Reported-by: Wang, Quanxian <quanxian.wang@intel.com> > > > > Link: > > > > https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/ > > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > --- > > > > drivers/acpi/processor_thermal.c | 16 +++++++++++++--- > > > > 1 file changed, 13 insertions(+), 3 deletions(-) > > > > > > > > Index: linux-pm/drivers/acpi/processor_thermal.c > > > > ================================================================= > > > > == > > > > --- linux-pm.orig/drivers/acpi/processor_thermal.c > > > > +++ linux-pm/drivers/acpi/processor_thermal.c > > > > @@ -140,9 +140,14 @@ void acpi_thermal_cpufreq_init(struct cp > > > > ret = freq_qos_add_request(&policy->constraints, > > > > &pr->thermal_req, > > > > FREQ_QOS_MAX, INT_MAX); > > > > - if (ret < 0) > > > > + if (ret < 0) { > > > > pr_err("Failed to add freq constraint for > > > > CPU%d > > > > (%d)\n", > > > > cpu, ret); > > > > + continue; > > > > + } > > > > + > > > > + if (!IS_ERR(pr->cdev)) > > > > + thermal_cooling_device_update(pr->cdev); > > > > > > Although thermal_cooling_device_update() handles "pr->cdev == NULL" > > > case, I think it is better to use !IS_ERR_OR_NULL() here. > > > > Why is it? > > > > I was thinking about doing that, but then I realized that the NULL > > case had been covered and that's why I went for the change above. If > > there is a particular reason to check for NULL here, I can do that, > > but I'm not sure what it is. > > I don't have a strong objection here. > > I thought this was a code bug at first glance, until I double checked t > hermal_cooling_device_update(). > > So I think the latter would be more straight forward without > introducing code complexity. Alternatively, thermal_cooling_device_update() can be made to do the full check instead.
Index: linux-pm/drivers/acpi/processor_thermal.c =================================================================== --- linux-pm.orig/drivers/acpi/processor_thermal.c +++ linux-pm/drivers/acpi/processor_thermal.c @@ -140,9 +140,14 @@ void acpi_thermal_cpufreq_init(struct cp ret = freq_qos_add_request(&policy->constraints, &pr->thermal_req, FREQ_QOS_MAX, INT_MAX); - if (ret < 0) + if (ret < 0) { pr_err("Failed to add freq constraint for CPU%d (%d)\n", cpu, ret); + continue; + } + + if (!IS_ERR(pr->cdev)) + thermal_cooling_device_update(pr->cdev); } } @@ -153,8 +158,13 @@ void acpi_thermal_cpufreq_exit(struct cp for_each_cpu(cpu, policy->related_cpus) { struct acpi_processor *pr = per_cpu(processors, cpu); - if (pr) - freq_qos_remove_request(&pr->thermal_req); + if (!pr) + continue; + + freq_qos_remove_request(&pr->thermal_req); + + if (!IS_ERR(pr->cdev)) + thermal_cooling_device_update(pr->cdev); } } #else /* ! CONFIG_CPU_FREQ */