Message ID | 1506575625-20388-1-git-send-email-daniel.lezcano@linaro.org |
---|---|
State | New |
Headers | show |
Series | [V2] thermal/drivers/hisi: Switch to interrupt mode | expand |
On Thu, Sep 28, 2017 at 02:57:52PM +0800, Leo Yan wrote: > Hi Daniel, > > On Thu, Sep 28, 2017 at 07:13:44AM +0200, Daniel Lezcano wrote: > > At this moment, we have both the interrupt setup and the polling enabled. The > > interrupt does nothing more than forcing an update while the temperature is > > polled every second. > > > > We can do much better than that, threshold is set to 65C in the DT and the > > passive cooling device enters in the dance when 75C is reached. We need to > > sample the temperature at 65C in order to let the IPA gather enough values for > > the PID computation. If the SoC is running at a temperature below 65C, we will > > be constantly polling for nothing. > > > > This patch disables the sensor when the temperature is below 65C and enables it > > when passing the threshold. It results the thermal sensor driver will have no > > activity most of the time. > > > > Cc: Keerthy <j-keerthy@ti.com> > > Cc: Leo Yang <leo.yan@linaro.org> > > s/Yang/Yan :) Have tested this patch on Hikey at my side: Oops sorry :) > Reviewed-by: Leo Yan <leo.yan@linaro.org> > Tested-by: Leo Yan <leo.yan@linaro.org> Great! Thanks for testing. -- Daniel
Hi, On 09/28/2017 06:13 AM, Daniel Lezcano wrote: > At this moment, we have both the interrupt setup and the polling enabled. The > interrupt does nothing more than forcing an update while the temperature is > polled every second. > > We can do much better than that, threshold is set to 65C in the DT and the > passive cooling device enters in the dance when 75C is reached. We need to > sample the temperature at 65C in order to let the IPA gather enough values for > the PID computation. Sample collection is a valid point, but passive cooling should become active as soon as 65C is reached (at least that's the case with IPA). Furthermore, IPA's PID controller is reset when the temperature drops below the first trip-point (threshold) - as such, I believe the PID's behavior should be the same now as it with polling. I think the main selling point of interrupt-based updates is a faster reaction time. On the HiKey960, we can go from below the threshold temperature to over the control temperature in less than a second (default polling rate is 1s). In this situation, IPA's PID starts accumulating error while already overshooting, which isn't optimal. On top of that, the passive cooling reacts too slowly. > If the SoC is running at a temperature below 65C, we will > be constantly polling for nothing. > > This patch disables the sensor when the temperature is below 65C and enables it > when passing the threshold. It results the thermal sensor driver will have no > activity most of the time. > > Cc: Keerthy <j-keerthy@ti.com> > Cc: Leo Yang <leo.yan@linaro.org> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> I've tested this on HiKey960 (Android 4.9 + upstream patches to apply your thermal/drivers/hisi series + Kevin's hi3600 support). I ran a video workload, and noticed I get several interrupts while passive cooling is already in effect (I might move part of this discussion to Kevin's posting, but I think it's still relevant to be here): [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > 65000 [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > 65000 [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > 65000 This isn't optimal for IPA, as the PID is supposed to use a specific sampling rate, but those interrupts forced a re-trigger of power_allocator_throttle which changes the PID's actual sampling rate. IPA isn't expecting this kind of scenarios, as I can see a tz->passive_delay in the computation of the derivative term (although the derivative coefficient defaults to 0...). In a perfect world I would see those interrupts being toggled by the thermal governor, as that is where we know what to do with each trip point - we could still want the interrupts, but in the case of IPA we'd like to disable them while the PID controller is active, and we would know when to re-enable them (as soon as IPA is toggled off). > --- > drivers/thermal/hisi_thermal.c | 24 ++++++++++++++---------- > 1 file changed, 14 insertions(+), 10 deletions(-) > > diff --git a/drivers/thermal/hisi_thermal.c b/drivers/thermal/hisi_thermal.c > index 39f4627..74ea70d 100644 > --- a/drivers/thermal/hisi_thermal.c > +++ b/drivers/thermal/hisi_thermal.c > @@ -218,6 +218,15 @@ static int hisi_thermal_get_temp(void *__data, int *temp) > return 0; > } > > +static void hisi_thermal_toggle_sensor(struct hisi_thermal_sensor *sensor, > + bool on) > +{ > + struct thermal_zone_device *tzd = sensor->tzd; > + > + tzd->ops->set_mode(tzd, > + on ? THERMAL_DEVICE_ENABLED : THERMAL_DEVICE_DISABLED); > +} > + > static const struct thermal_zone_of_device_ops hisi_of_thermal_ops = { > .get_temp = hisi_thermal_get_temp, > }; > @@ -236,12 +245,16 @@ static irqreturn_t hisi_thermal_alarm_irq_thread(int irq, void *dev) > dev_crit(&data->pdev->dev, "THERMAL ALARM: %d > %d\n", > temp, sensor->thres_temp); > > + hisi_thermal_toggle_sensor(&data->sensor, true); > + > thermal_zone_device_update(data->sensor.tzd, > THERMAL_EVENT_UNSPECIFIED); > > } else if (temp < sensor->thres_temp) { > dev_crit(&data->pdev->dev, "THERMAL ALARM stopped: %d < %d\n", > temp, sensor->thres_temp); > + > + hisi_thermal_toggle_sensor(&data->sensor, false); > } > > return IRQ_HANDLED; > @@ -286,15 +299,6 @@ static const struct of_device_id of_hisi_thermal_match[] = { > }; > MODULE_DEVICE_TABLE(of, of_hisi_thermal_match); > > -static void hisi_thermal_toggle_sensor(struct hisi_thermal_sensor *sensor, > - bool on) > -{ > - struct thermal_zone_device *tzd = sensor->tzd; > - > - tzd->ops->set_mode(tzd, > - on ? THERMAL_DEVICE_ENABLED : THERMAL_DEVICE_DISABLED); > -} > - > static int hisi_thermal_setup(struct hisi_thermal_data *data) > { > struct hisi_thermal_sensor *sensor; > @@ -393,7 +397,7 @@ static int hisi_thermal_probe(struct platform_device *pdev) > return ret; > } > > - hisi_thermal_toggle_sensor(&data->sensor, true); > + hisi_thermal_toggle_sensor(&data->sensor, false); > > return 0; > }
Hi Valentin, On 29/09/2017 13:07, Valentin Schneider wrote: [ ... ] > I've tested this on HiKey960 (Android 4.9 + upstream patches to apply > your thermal/drivers/hisi series + Kevin's hi3600 support). > I ran a video workload, and noticed I get several interrupts while > passive cooling is already in effect > (I might move part of this discussion to Kevin's posting, but I think > it's still relevant to be here): > > [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 > [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > 65000 > [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 > [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > 65000 > [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > 65000 > > This isn't optimal for IPA, as the PID is supposed to use a specific > sampling rate, but those interrupts > forced a re-trigger of power_allocator_throttle which changes the PID's > actual sampling rate. > IPA isn't expecting this kind of scenarios, as I can see a > tz->passive_delay in the computation > of the derivative term (although the derivative coefficient defaults to > 0...). Interesting. This patch actually is for the hikey, not for the hikey960. I didn't tested with Kevin's patches. The thermal characteristics are very different on hi960. For example, the temperature increase is must faster on hikey960. So I guess, the driver itself should be fine tuned to prevent this kind of scenario. > In a perfect world I would see those interrupts being toggled by the > thermal governor, as that is > where we know what to do with each trip point - we could still want the > interrupts, but in the case > of IPA we'd like to disable them while the PID controller is active, and > we would know when to re-enable them > (as soon as IPA is toggled off). Yes, that is true the thermal framework could be enhanced to support this. Do you have an hikey (not 960) to test this patch? -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
On 09/29/2017 05:46 PM, Daniel Lezcano wrote: > Hi Valentin, > > On 29/09/2017 13:07, Valentin Schneider wrote: > > [ ... ] > >> I've tested this on HiKey960 (Android 4.9 + upstream patches to apply >> your thermal/drivers/hisi series + Kevin's hi3600 support). >> I ran a video workload, and noticed I get several interrupts while >> passive cooling is already in effect >> (I might move part of this discussion to Kevin's posting, but I think >> it's still relevant to be here): >> >> [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 >> [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > 65000 >> [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 >> [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > 65000 >> [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > 65000 >> >> This isn't optimal for IPA, as the PID is supposed to use a specific >> sampling rate, but those interrupts >> forced a re-trigger of power_allocator_throttle which changes the PID's >> actual sampling rate. >> IPA isn't expecting this kind of scenarios, as I can see a >> tz->passive_delay in the computation >> of the derivative term (although the derivative coefficient defaults to >> 0...). > Interesting. This patch actually is for the hikey, not for the hikey960. > I didn't tested with Kevin's patches. > > The thermal characteristics are very different on hi960. For example, > the temperature increase is must faster on hikey960. So I guess, the > driver itself should be fine tuned to prevent this kind of scenario. True >> In a perfect world I would see those interrupts being toggled by the >> thermal governor, as that is >> where we know what to do with each trip point - we could still want the >> interrupts, but in the case >> of IPA we'd like to disable them while the PID controller is active, and >> we would know when to re-enable them >> (as soon as IPA is toggled off). > Yes, that is true the thermal framework could be enhanced to support this. If you're up for it, it could be an interesting discussion. In any case, I might spend a little bit of time looking into making such an interrupt-toggling work (at least as a prototype just on the HiKey960). > > Do you have an hikey (not 960) to test this patch? I do have one but I'm currently lending it to someone else in my team, I'll have to track it down (probably next week). I'm actually quite interested in having IPA triggered by interrupt because of the HiKey960's tendency to heat up insanely fast, which leads to IPA starting when the temperature is already way too high. I started toying around with Kevin's patches but saw your own patch and thought it was related, so I figured I would share some of my thoughts before starting to work on something someone else might already be working on. I should perhaps move some of those comments to Kevin's posting - sorry for the noise. > >
Oh, I noticed Kevin is not in the loop, Cc'ing him. Sorry Kevin. On 29/09/2017 19:26, Valentin Schneider wrote: > > > On 09/29/2017 05:46 PM, Daniel Lezcano wrote: >> Hi Valentin, >> >> On 29/09/2017 13:07, Valentin Schneider wrote: >> >> [ ... ] >> >>> I've tested this on HiKey960 (Android 4.9 + upstream patches to apply >>> your thermal/drivers/hisi series + Kevin's hi3600 support). >>> I ran a video workload, and noticed I get several interrupts while >>> passive cooling is already in effect >>> (I might move part of this discussion to Kevin's posting, but I think >>> it's still relevant to be here): >>> >>> [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > >>> 65000 >>> [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > >>> 65000 >>> [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > >>> 65000 >>> [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > >>> 65000 >>> [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > >>> 65000 >>> >>> This isn't optimal for IPA, as the PID is supposed to use a specific >>> sampling rate, but those interrupts >>> forced a re-trigger of power_allocator_throttle which changes the PID's >>> actual sampling rate. >>> IPA isn't expecting this kind of scenarios, as I can see a >>> tz->passive_delay in the computation >>> of the derivative term (although the derivative coefficient defaults to >>> 0...). >> Interesting. This patch actually is for the hikey, not for the hikey960. >> I didn't tested with Kevin's patches. >> >> The thermal characteristics are very different on hi960. For example, >> the temperature increase is must faster on hikey960. So I guess, the >> driver itself should be fine tuned to prevent this kind of scenario. > > True > >>> In a perfect world I would see those interrupts being toggled by the >>> thermal governor, as that is >>> where we know what to do with each trip point - we could still want the >>> interrupts, but in the case >>> of IPA we'd like to disable them while the PID controller is active, and >>> we would know when to re-enable them >>> (as soon as IPA is toggled off). >> Yes, that is true the thermal framework could be enhanced to support >> this. > > If you're up for it, it could be an interesting discussion. In any case, > I might spend a little bit of time looking into making such an > interrupt-toggling work (at least as a prototype just on the HiKey960). > >> >> Do you have an hikey (not 960) to test this patch? > > I do have one but I'm currently lending it to someone else in my team, > I'll have to track it down (probably next week). > I'm actually quite interested in having IPA triggered by interrupt > because of the HiKey960's tendency to heat up insanely fast, which leads > to IPA starting when the temperature is already way too high. I started > toying around with Kevin's patches but saw your own patch and thought it > was related, so I figured I would share some of my thoughts before > starting to work on something someone else might already be working on. > > I should perhaps move some of those comments to Kevin's posting - sorry > for the noise. > >> >> > -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
On 29/09/2017 13:07, Valentin Schneider wrote: > Hi, > > > On 09/28/2017 06:13 AM, Daniel Lezcano wrote: >> At this moment, we have both the interrupt setup and the polling >> enabled. The >> interrupt does nothing more than forcing an update while the >> temperature is >> polled every second. >> >> We can do much better than that, threshold is set to 65C in the DT and >> the >> passive cooling device enters in the dance when 75C is reached. We >> need to >> sample the temperature at 65C in order to let the IPA gather enough >> values for >> the PID computation. > Sample collection is a valid point, but passive cooling should become > active as soon > as 65C is reached (at least that's the case with IPA). Furthermore, > IPA's PID controller > is reset when the temperature drops below the first trip-point > (threshold) - as such, > I believe the PID's behavior should be the same now as it with polling. > > I think the main selling point of interrupt-based updates is a faster > reaction time. > On the HiKey960, we can go from below the threshold temperature to over > the control temperature > in less than a second (default polling rate is 1s). In this situation, > IPA's PID starts accumulating error > while already overshooting, which isn't optimal. On top of that, the > passive cooling reacts > too slowly. >> If the SoC is running at a temperature below 65C, we will >> be constantly polling for nothing. >> >> This patch disables the sensor when the temperature is below 65C and >> enables it >> when passing the threshold. It results the thermal sensor driver will >> have no >> activity most of the time. >> >> Cc: Keerthy <j-keerthy@ti.com> >> Cc: Leo Yang <leo.yan@linaro.org> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > I've tested this on HiKey960 (Android 4.9 + upstream patches to apply > your thermal/drivers/hisi series + Kevin's hi3600 support). > I ran a video workload, and noticed I get several interrupts while > passive cooling is already in effect > (I might move part of this discussion to Kevin's posting, but I think > it's still relevant to be here): > > [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 > [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > 65000 > [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 > [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > 65000 > [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > 65000 Ok, so apparently there are multiple alarms level in the driver for the hikey960 [1]. So I prefer to drop this patch for now and take the hikey960 thermal support first and we can sort out the issue later. For my information, can you show me the DT snippet you have for the thermal zones? -- Daniel [1] https://patchwork.kernel.org/patch/9965743/ -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
On 10/10/2017 05:51 PM, Daniel Lezcano wrote: > Ok, so apparently there are multiple alarms level in the driver for the > hikey960 [1]. So I prefer to drop this patch for now and take the > hikey960 thermal support first and we can sort out the issue later. > > For my information, can you show me the DT snippet you have for the > thermal zones? Sure thing: thermal-zones { cls0: cls0 { polling-delay = <1000>; polling-delay-passive = <100>; sustainable-power = <4500>; /* sensor ID */ thermal-sensors = <&tsensor 1>; trips { threshold: trip-point@0 { temperature = <65000>; hysteresis = <1000>; type = "passive"; }; target: trip-point@1 { temperature = <75000>; hysteresis = <1000>; type = "passive"; }; }; cooling-maps { map0 { trip = <&target>; contribution = <1024>; cooling-device = <&cpu0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; map1 { trip = <&target>; contribution = <512>; cooling-device = <&cpu4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; map2 { trip = <&target>; contribution = <1024>; cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>; }; }; }; }; It's exactly what is on the Android 4.9 kernel (https://android.googlesource.com/kernel/hikey-linaro/+/android-hikey-linaro-4.9/arch/arm64/boot/dts/hisilicon/hi3660.dtsi#1272), with the tsensor index changed from 4 to 1 to work with Kevin's patches. > > -- Daniel > > [1] https://patchwork.kernel.org/patch/9965743/ > >
On 10/10/2017 19:01, Valentin Schneider wrote: > On 10/10/2017 05:51 PM, Daniel Lezcano wrote: >> Ok, so apparently there are multiple alarms level in the driver for the >> hikey960 [1]. So I prefer to drop this patch for now and take the >> hikey960 thermal support first and we can sort out the issue later. >> >> For my information, can you show me the DT snippet you have for the >> thermal zones? > > Sure thing: > > thermal-zones { > > cls0: cls0 { > polling-delay = <1000>; > polling-delay-passive = <100>; > sustainable-power = <4500>; > > /* sensor ID */ > thermal-sensors = <&tsensor 1>; > > trips { > threshold: trip-point@0 { > temperature = <65000>; > hysteresis = <1000>; > type = "passive"; > }; > > target: trip-point@1 { > temperature = <75000>; > hysteresis = <1000>; > type = "passive"; > }; > }; That's strange, regarding your traces: " [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > 65000 [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > 65000 [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > 65000 " I was expecting to see more trip points. Did you test the driver with a 70000 trip point? [ ... ] -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
On 10/10/2017 06:13 PM, Daniel Lezcano wrote: > On 10/10/2017 19:01, Valentin Schneider wrote: >> On 10/10/2017 05:51 PM, Daniel Lezcano wrote: >>> Ok, so apparently there are multiple alarms level in the driver for the >>> hikey960 [1]. So I prefer to drop this patch for now and take the >>> hikey960 thermal support first and we can sort out the issue later. >>> >>> For my information, can you show me the DT snippet you have for the >>> thermal zones? >> Sure thing: >> >> thermal-zones { >> >> cls0: cls0 { >> polling-delay = <1000>; >> polling-delay-passive = <100>; >> sustainable-power = <4500>; >> >> /* sensor ID */ >> thermal-sensors = <&tsensor 1>; >> >> trips { >> threshold: trip-point@0 { >> temperature = <65000>; >> hysteresis = <1000>; >> type = "passive"; >> }; >> >> target: trip-point@1 { >> temperature = <75000>; >> hysteresis = <1000>; >> type = "passive"; >> }; >> }; > That's strange, regarding your traces: > > " > [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 > [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > 65000 > [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > 65000 > [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > 65000 > [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > 65000 > " > > I was expecting to see more trip points. Did you test the driver with a > 70000 trip point? No, I didn't change any setting other than the tsensor index to make things work. Mind you, in Kevin's patch series the thermal alarm is setup with a 4 degrees "lag", i.e. alarms will only be re-triggered if temperature increases/decreases of at least 4 degrees (which explains the traces). > > > [ ... ] > >
On 10/10/2017 19:19, Valentin Schneider wrote: > > > On 10/10/2017 06:13 PM, Daniel Lezcano wrote: >> On 10/10/2017 19:01, Valentin Schneider wrote: >>> On 10/10/2017 05:51 PM, Daniel Lezcano wrote: >>>> Ok, so apparently there are multiple alarms level in the driver for the >>>> hikey960 [1]. So I prefer to drop this patch for now and take the >>>> hikey960 thermal support first and we can sort out the issue later. >>>> >>>> For my information, can you show me the DT snippet you have for the >>>> thermal zones? >>> Sure thing: >>> >>> thermal-zones { >>> >>> cls0: cls0 { >>> polling-delay = <1000>; >>> polling-delay-passive = <100>; >>> sustainable-power = <4500>; >>> >>> /* sensor ID */ >>> thermal-sensors = <&tsensor 1>; >>> >>> trips { >>> threshold: trip-point@0 { >>> temperature = <65000>; >>> hysteresis = <1000>; >>> type = "passive"; >>> }; >>> >>> target: trip-point@1 { >>> temperature = <75000>; >>> hysteresis = <1000>; >>> type = "passive"; >>> }; >>> }; >> That's strange, regarding your traces: >> >> " >> [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > >> 65000 >> [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > >> 65000 >> [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > >> 65000 >> [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > >> 65000 >> [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > >> 65000 >> " >> >> I was expecting to see more trip points. Did you test the driver with a >> 70000 trip point? > > No, I didn't change any setting other than the tsensor index to make > things work. Mind you, in Kevin's patch series the thermal alarm is > setup with a 4 degrees "lag", i.e. alarms will only be re-triggered if > temperature increases/decreases of at least 4 degrees (which explains > the traces). Mmh, the behavior regarding the interrupt is slightly different with the hi960, perhaps a bit fuzzy regarding how it is handled now. Anyway, we can live with that now and go further to fix that later, the result is the same. -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
On 2017/10/11 1:28, Daniel Lezcano wrote: > On 10/10/2017 19:19, Valentin Schneider wrote: >> >> >> On 10/10/2017 06:13 PM, Daniel Lezcano wrote: >>> On 10/10/2017 19:01, Valentin Schneider wrote: >>>> On 10/10/2017 05:51 PM, Daniel Lezcano wrote: >>>>> Ok, so apparently there are multiple alarms level in the driver for the >>>>> hikey960 [1]. So I prefer to drop this patch for now and take the >>>>> hikey960 thermal support first and we can sort out the issue later. >>>>> >>>>> For my information, can you show me the DT snippet you have for the >>>>> thermal zones? >>>> Sure thing: >>>> >>>> thermal-zones { >>>> >>>> cls0: cls0 { >>>> polling-delay = <1000>; >>>> polling-delay-passive = <100>; >>>> sustainable-power = <4500>; >>>> >>>> /* sensor ID */ >>>> thermal-sensors = <&tsensor 1>; >>>> >>>> trips { >>>> threshold: trip-point@0 { >>>> temperature = <65000>; >>>> hysteresis = <1000>; >>>> type = "passive"; >>>> }; >>>> >>>> target: trip-point@1 { >>>> temperature = <75000>; >>>> hysteresis = <1000>; >>>> type = "passive"; >>>> }; >>>> }; >>> That's strange, regarding your traces: >>> >>> " >>> [ 118.107357] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > >>> 65000 >>> [ 119.182531] hisi_thermal fff30000.tsensor: THERMAL ALARM: 76235 > >>> 65000 >>> [ 119.361964] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70495 > >>> 65000 >>> [ 119.907865] hisi_thermal fff30000.tsensor: THERMAL ALARM: 75620 > >>> 65000 >>> [ 119.959076] hisi_thermal fff30000.tsensor: THERMAL ALARM: 70700 > >>> 65000 >>> " >>> >>> I was expecting to see more trip points. Did you test the driver with a >>> 70000 trip point? >> >> No, I didn't change any setting other than the tsensor index to make >> things work. Mind you, in Kevin's patch series the thermal alarm is >> setup with a 4 degrees "lag", i.e. alarms will only be re-triggered if >> temperature increases/decreases of at least 4 degrees (which explains >> the traces). This should be a trip point 75000, cross 75000 triger an interrupt and drop below 71000 triger another interrupt, the multi alarm interrupt is not suitable for ipa as you discussed before, we should drop the patch of multi alarm support. > > Mmh, the behavior regarding the interrupt is slightly different with the > hi960, perhaps a bit fuzzy regarding how it is handled now. Anyway, we > can live with that now and go further to fix that later, the result is > the same. > >
Hello, On Thu, Sep 28, 2017 at 09:32:20AM +0200, Daniel Lezcano wrote: > On Thu, Sep 28, 2017 at 02:57:52PM +0800, Leo Yan wrote: > > Hi Daniel, > > > > On Thu, Sep 28, 2017 at 07:13:44AM +0200, Daniel Lezcano wrote: > > > At this moment, we have both the interrupt setup and the polling enabled. The > > > interrupt does nothing more than forcing an update while the temperature is > > > polled every second. > > > > > > We can do much better than that, threshold is set to 65C in the DT and the > > > passive cooling device enters in the dance when 75C is reached. We need to > > > sample the temperature at 65C in order to let the IPA gather enough values for > > > the PID computation. If the SoC is running at a temperature below 65C, we will > > > be constantly polling for nothing. > > > > > > This patch disables the sensor when the temperature is below 65C and enables it > > > when passing the threshold. It results the thermal sensor driver will have no > > > activity most of the time. > > > > > > Cc: Keerthy <j-keerthy@ti.com> > > > Cc: Leo Yang <leo.yan@linaro.org> > > > > s/Yang/Yan :) Have tested this patch on Hikey at my side: > > Oops sorry :) > > > Reviewed-by: Leo Yan <leo.yan@linaro.org> > > Tested-by: Leo Yan <leo.yan@linaro.org> > Is this still needed after the latest rework done? > > Great! Thanks for testing. > > -- Daniel
On 05/12/2017 03:00, Eduardo Valentin wrote: > Hello, > > On Thu, Sep 28, 2017 at 09:32:20AM +0200, Daniel Lezcano wrote: >> On Thu, Sep 28, 2017 at 02:57:52PM +0800, Leo Yan wrote: >>> Hi Daniel, >>> >>> On Thu, Sep 28, 2017 at 07:13:44AM +0200, Daniel Lezcano wrote: >>>> At this moment, we have both the interrupt setup and the polling enabled. The >>>> interrupt does nothing more than forcing an update while the temperature is >>>> polled every second. >>>> >>>> We can do much better than that, threshold is set to 65C in the DT and the >>>> passive cooling device enters in the dance when 75C is reached. We need to >>>> sample the temperature at 65C in order to let the IPA gather enough values for >>>> the PID computation. If the SoC is running at a temperature below 65C, we will >>>> be constantly polling for nothing. >>>> >>>> This patch disables the sensor when the temperature is below 65C and enables it >>>> when passing the threshold. It results the thermal sensor driver will have no >>>> activity most of the time. >>>> >>>> Cc: Keerthy <j-keerthy@ti.com> >>>> Cc: Leo Yang <leo.yan@linaro.org> >>> >>> s/Yang/Yan :) Have tested this patch on Hikey at my side: >> >> Oops sorry :) >> >>> Reviewed-by: Leo Yan <leo.yan@linaro.org> >>> Tested-by: Leo Yan <leo.yan@linaro.org> >> > > Is this still needed after the latest rework done? No longer needed. -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
diff --git a/drivers/thermal/hisi_thermal.c b/drivers/thermal/hisi_thermal.c index 39f4627..74ea70d 100644 --- a/drivers/thermal/hisi_thermal.c +++ b/drivers/thermal/hisi_thermal.c @@ -218,6 +218,15 @@ static int hisi_thermal_get_temp(void *__data, int *temp) return 0; } +static void hisi_thermal_toggle_sensor(struct hisi_thermal_sensor *sensor, + bool on) +{ + struct thermal_zone_device *tzd = sensor->tzd; + + tzd->ops->set_mode(tzd, + on ? THERMAL_DEVICE_ENABLED : THERMAL_DEVICE_DISABLED); +} + static const struct thermal_zone_of_device_ops hisi_of_thermal_ops = { .get_temp = hisi_thermal_get_temp, }; @@ -236,12 +245,16 @@ static irqreturn_t hisi_thermal_alarm_irq_thread(int irq, void *dev) dev_crit(&data->pdev->dev, "THERMAL ALARM: %d > %d\n", temp, sensor->thres_temp); + hisi_thermal_toggle_sensor(&data->sensor, true); + thermal_zone_device_update(data->sensor.tzd, THERMAL_EVENT_UNSPECIFIED); } else if (temp < sensor->thres_temp) { dev_crit(&data->pdev->dev, "THERMAL ALARM stopped: %d < %d\n", temp, sensor->thres_temp); + + hisi_thermal_toggle_sensor(&data->sensor, false); } return IRQ_HANDLED; @@ -286,15 +299,6 @@ static const struct of_device_id of_hisi_thermal_match[] = { }; MODULE_DEVICE_TABLE(of, of_hisi_thermal_match); -static void hisi_thermal_toggle_sensor(struct hisi_thermal_sensor *sensor, - bool on) -{ - struct thermal_zone_device *tzd = sensor->tzd; - - tzd->ops->set_mode(tzd, - on ? THERMAL_DEVICE_ENABLED : THERMAL_DEVICE_DISABLED); -} - static int hisi_thermal_setup(struct hisi_thermal_data *data) { struct hisi_thermal_sensor *sensor; @@ -393,7 +397,7 @@ static int hisi_thermal_probe(struct platform_device *pdev) return ret; } - hisi_thermal_toggle_sensor(&data->sensor, true); + hisi_thermal_toggle_sensor(&data->sensor, false); return 0; }
At this moment, we have both the interrupt setup and the polling enabled. The interrupt does nothing more than forcing an update while the temperature is polled every second. We can do much better than that, threshold is set to 65C in the DT and the passive cooling device enters in the dance when 75C is reached. We need to sample the temperature at 65C in order to let the IPA gather enough values for the PID computation. If the SoC is running at a temperature below 65C, we will be constantly polling for nothing. This patch disables the sensor when the temperature is below 65C and enables it when passing the threshold. It results the thermal sensor driver will have no activity most of the time. Cc: Keerthy <j-keerthy@ti.com> Cc: Leo Yang <leo.yan@linaro.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> --- drivers/thermal/hisi_thermal.c | 24 ++++++++++++++---------- 1 file changed, 14 insertions(+), 10 deletions(-) -- 2.7.4