Message ID | 7f94696460848a6bcfe5aee5ffda7fe556240736.1547458732.git.amit.kucheria@linaro.org |
---|---|
State | New |
Series | None |
On Tue, Jan 15, 2019 at 3:31 AM Matthias Kaehlcke <mka@chromium.org> wrote:
>
> On Mon, Jan 14, 2019 at 03:51:10PM +0530, Amit Kucheria wrote:
> > Since all cpus in the big and little clusters, respectively, are in the
> > same frequency domain, use all of them for mitigation in the
> > cooling-map. We end up with two cooling devices - one each for the big
> > and little clusters.
> >
> > At the lower trip points we restrict ourselves to throttling only a few
> > OPPs. At higher trip temperatures, allow ourselves to be throttled to
> > any extent.
> >
> > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > ---
> >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 177 ++++++++++++++++++++++++---
> >  1 file changed, 161 insertions(+), 16 deletions(-)
> >
> > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > index fb7da678b116..7973e88bdf94 100644
> > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> >
> > ...
> >
> > @@ -1719,18 +1728,35 @@
> >  			thermal-sensors = <&tsens0 1>;
> >
> >  			trips {
> > -				cpu_alert0: trip0 {
> > +				cpu0_alert0: trip-point@0 {
>
> Thanks for adapting the trip point names and labels in anticipation of
> further additions!
>
> Seems you aren't overly convinced about the 'target/threshold'
> terminology used by some other arm64 platforms ;-)

target and threshold have an air of finality to them and doesn't lend
itself to having a few trip points on the way to the critical trip,
IMO.

Let me know if you feel otherwise.

> >  					temperature = <95000>;
> >  					hysteresis = <2000>;
> >  					type = "passive";
> >  				};
>
> I realized that we still have the potential problem of a name change
> in the trip point node name if a 'threshold' node for IPA is added,
> since this node will have a lower temperature than 95°. If this is
> something to be concerned about it might be worth to add that extra
> trip point already to avoid headaches or funky trip point enumeration,
> even if we know that the value might not be the final one.

I will squash both the DT changes in to a single change introducing 2
passive trips and 1 critical trip to avoid the churn.

See if you like it better.

> (I'm aware that we are also changing the node names and labels right
> now, it seems less problematic at this point since the SDM845 thermal
> zones are a fairly recent addition)
>
> > -				cpu_crit0: trip1 {
> > +				cpu0_crit: cpu_crit@0 {
>
> nit: does the @0 add any value here? IIUC there can be only one
> critical trip point, hence there will never be a cpu_crit@1 or
> higher.

Agreed. Will remove.

> >  					temperature = <110000>;
> >  					hysteresis = <1000>;
> >  					type = "critical";
> >  				};
> >  			};
> > +
> > +			cooling-maps {
> > +				map0 {
> > +					trip = <&cpu0_alert0>;
> > +					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> > +							 <&CPU1 THERMAL_NO_LIMIT 4>,
> > +							 <&CPU2 THERMAL_NO_LIMIT 4>,
> > +							 <&CPU3 THERMAL_NO_LIMIT 4>;
> > +				};
>
> Out of curiosity: how did you determing the max cooling state of 4?

Just some basic testing by pinning a dhrystone benchmark to each of
the cores along with some stress-ng threads. Lopping off the top 4
OPPs seemed to mitigate anything I could throw at the board.

I'm unable to do the "device in a closed car on a hot summer day" type
of tests on the dev board. Nevertheless, I've changed the patch now to
only remove the boost frequency at 75 degrees and then full throttling
at 95 degrees.

I'd appreciate more "real world" testing to validate these.

Thanks for the review.

Regards,
Amit
On Mon, Jan 21, 2019 at 11:40:45PM +0530, Amit Kucheria wrote:
> On Tue, Jan 15, 2019 at 3:31 AM Matthias Kaehlcke <mka@chromium.org> wrote:
> >
> > On Mon, Jan 14, 2019 at 03:51:10PM +0530, Amit Kucheria wrote:
> > > Since all cpus in the big and little clusters, respectively, are in the
> > > same frequency domain, use all of them for mitigation in the
> > > cooling-map. We end up with two cooling devices - one each for the big
> > > and little clusters.
> > >
> > > At the lower trip points we restrict ourselves to throttling only a few
> > > OPPs. At higher trip temperatures, allow ourselves to be throttled to
> > > any extent.
> > >
> > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sdm845.dtsi | 177 ++++++++++++++++++++++++---
> > >  1 file changed, 161 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > index fb7da678b116..7973e88bdf94 100644
> > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
> > >
> > > ...
> > >
> > > @@ -1719,18 +1728,35 @@
> > >  			thermal-sensors = <&tsens0 1>;
> > >
> > >  			trips {
> > > -				cpu_alert0: trip0 {
> > > +				cpu0_alert0: trip-point@0 {
> >
> > Thanks for adapting the trip point names and labels in anticipation of
> > further additions!
> >
> > Seems you aren't overly convinced about the 'target/threshold'
> > terminology used by some other arm64 platforms ;-)
>
> target and threshold have an air of finality to them and doesn't lend
> itself to having a few trip points on the way to the critical trip,
> IMO.
>
> Let me know if you feel otherwise.

I can see your point, and it's also true that target/threshold seem to
imply the use of power_allocator, which may not always be the case.

> > >  					temperature = <95000>;
> > >  					hysteresis = <2000>;
> > >  					type = "passive";
> > >  				};
> >
> > I realized that we still have the potential problem of a name change
> > in the trip point node name if a 'threshold' node for IPA is added,
> > since this node will have a lower temperature than 95°. If this is
> > something to be concerned about it might be worth to add that extra
> > trip point already to avoid headaches or funky trip point enumeration,
> > even if we know that the value might not be the final one.
>
> I will squash both the DT changes in to a single change introducing 2
> passive trips and 1 critical trip to avoid the churn.

Sounds good, thanks!

> See if you like it better.

I didn't really dislike it, was just wondering if renaming nodes could
break existing users. I imagine it's not a huge problem after all,
since users with an older kernel version won't see the DT change and
probably should use the phandle anyway.

> > (I'm aware that we are also changing the node names and labels right
> > now, it seems less problematic at this point since the SDM845 thermal
> > zones are a fairly recent addition)
> >
> > > -				cpu_crit0: trip1 {
> > > +				cpu0_crit: cpu_crit@0 {
> >
> > nit: does the @0 add any value here? IIUC there can be only one
> > critical trip point, hence there will never be a cpu_crit@1 or
> > higher.
>
> Agreed. Will remove.
>
> > >  					temperature = <110000>;
> > >  					hysteresis = <1000>;
> > >  					type = "critical";
> > >  				};
> > >  			};
> > > +
> > > +			cooling-maps {
> > > +				map0 {
> > > +					trip = <&cpu0_alert0>;
> > > +					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
> > > +							 <&CPU1 THERMAL_NO_LIMIT 4>,
> > > +							 <&CPU2 THERMAL_NO_LIMIT 4>,
> > > +							 <&CPU3 THERMAL_NO_LIMIT 4>;
> > > +				};
> >
> > Out of curiosity: how did you determing the max cooling state of 4?
>
> Just some basic testing by pinning a dhrystone benchmark to each of
> the cores along with some stress-ng threads. Lopping off the top 4
> OPPs seemed to mitigate anything I could throw at the board.

Thanks for sharing your approach!

> I'm unable to do the "device in a closed car on a hot summer day" type
> of tests on the dev board. Nevertheless, I've changed the patch now to
> only remove the boost frequency at 75 degrees and then full throttling
> at 95 degrees.
>
> I'd appreciate more "real world" testing to validate these.

Sure, we'll run some tests with the new configuration on our site.

Thanks

Matthias
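For illustration only, the layout agreed on in the thread (two passive trips followed by a single critical trip, with the "@0" dropped from the critical node) might look roughly like this for one zone. This is a sketch under stated assumptions, not the revised patch: the 75 and 95 degree values are the ones mentioned in the thread, while the `cpu0_alert1` label, the `trip-point@1` node name, and the use of cooling state 1 to mean "drop only the boost OPP" are hypothetical.

```dts
/* Hypothetical sketch of one zone; other zone properties omitted. */
cpu0-thermal {
	thermal-sensors = <&tsens0 1>;

	trips {
		cpu0_alert0: trip-point@0 {
			temperature = <75000>;	/* millidegrees Celsius */
			hysteresis = <2000>;
			type = "passive";
		};

		/* Label and node name are assumptions. */
		cpu0_alert1: trip-point@1 {
			temperature = <95000>;
			hysteresis = <2000>;
			type = "passive";
		};

		cpu0_crit: cpu_crit {
			temperature = <110000>;
			hysteresis = <1000>;
			type = "critical";
		};
	};

	cooling-maps {
		map0 {
			trip = <&cpu0_alert0>;
			/* Assumption: state 1 removes only the top (boost) OPP. */
			cooling-device = <&CPU0 THERMAL_NO_LIMIT 1>,
					 <&CPU1 THERMAL_NO_LIMIT 1>,
					 <&CPU2 THERMAL_NO_LIMIT 1>,
					 <&CPU3 THERMAL_NO_LIMIT 1>;
		};
		map1 {
			trip = <&cpu0_alert1>;
			/* No upper bound: throttle to any extent. */
			cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
					 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
					 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
					 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
		};
	};
};
```

Whether state 1 actually corresponds to the boost frequency depends on the OPP table of the cluster, so that value would need checking on real hardware.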
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index fb7da678b116..7973e88bdf94 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -13,6 +13,7 @@
 #include <dt-bindings/reset/qcom,sdm845-aoss.h>
 #include <dt-bindings/soc/qcom,rpmh-rsc.h>
 #include <dt-bindings/clock/qcom,gcc-sdm845.h>
+#include <dt-bindings/thermal/thermal.h>
 
 / {
 	interrupt-parent = <&intc>;
@@ -99,6 +100,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x0>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_0>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 
@@ -116,6 +118,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x100>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_100>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 
@@ -130,6 +133,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x200>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_200>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 
@@ -144,6 +148,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x300>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_300>;
 			qcom,freq-domain = <&cpufreq_hw 0>;
 
@@ -158,6 +163,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x400>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_400>;
 			qcom,freq-domain = <&cpufreq_hw 1>;
 
@@ -172,6 +178,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x500>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_500>;
 			qcom,freq-domain = <&cpufreq_hw 1>;
 
@@ -186,6 +193,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x600>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_600>;
 			qcom,freq-domain = <&cpufreq_hw 1>;
 
@@ -200,6 +208,7 @@
 			compatible = "qcom,kryo385";
 			reg = <0x0 0x700>;
 			enable-method = "psci";
+			#cooling-cells = <2>;
 			next-level-cache = <&L2_700>;
 			qcom,freq-domain = <&cpufreq_hw 1>;
 
@@ -1719,18 +1728,35 @@
 			thermal-sensors = <&tsens0 1>;
 
 			trips {
-				cpu_alert0: trip0 {
+				cpu0_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit0: trip1 {
+				cpu0_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu0_alert0>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu0_crit>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu1-thermal {
@@ -1740,18 +1766,35 @@
 			thermal-sensors = <&tsens0 2>;
 
 			trips {
-				cpu_alert1: trip0 {
+				cpu1_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit1: trip1 {
+				cpu1_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu1_alert0>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu1_crit>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu2-thermal {
@@ -1761,18 +1804,35 @@
 			thermal-sensors = <&tsens0 3>;
 
 			trips {
-				cpu_alert2: trip0 {
+				cpu2_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit2: trip1 {
+				cpu2_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu2_alert0>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu2_crit>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu3-thermal {
@@ -1782,18 +1842,35 @@
 			thermal-sensors = <&tsens0 4>;
 
 			trips {
-				cpu_alert3: trip0 {
+				cpu3_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit3: trip1 {
+				cpu3_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu3_alert0>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>,
+							 <&CPU1 THERMAL_NO_LIMIT 4>,
+							 <&CPU2 THERMAL_NO_LIMIT 4>,
+							 <&CPU3 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu3_crit>;
+					cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu4-thermal {
@@ -1803,18 +1880,35 @@
 			thermal-sensors = <&tsens0 7>;
 
 			trips {
-				cpu_alert4: trip0 {
+				cpu4_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit4: trip1 {
+				cpu4_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu4_alert0>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu4_crit>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu5-thermal {
@@ -1824,18 +1918,35 @@
 			thermal-sensors = <&tsens0 8>;
 
 			trips {
-				cpu_alert5: trip0 {
+				cpu5_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit5: trip1 {
+				cpu5_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu5_alert0>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu5_crit>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu6-thermal {
@@ -1845,18 +1956,35 @@
 			thermal-sensors = <&tsens0 9>;
 
 			trips {
-				cpu_alert6: trip0 {
+				cpu6_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit6: trip1 {
+				cpu6_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu6_alert0>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu6_crit>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 
 		cpu7-thermal {
@@ -1866,18 +1994,35 @@
 			thermal-sensors = <&tsens0 10>;
 
 			trips {
-				cpu_alert7: trip0 {
+				cpu7_alert0: trip-point@0 {
 					temperature = <95000>;
 					hysteresis = <2000>;
 					type = "passive";
 				};
 
-				cpu_crit7: trip1 {
+				cpu7_crit: cpu_crit@0 {
 					temperature = <110000>;
 					hysteresis = <1000>;
 					type = "critical";
 				};
 			};
+
+			cooling-maps {
+				map0 {
+					trip = <&cpu7_alert0>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT 4>,
+							 <&CPU5 THERMAL_NO_LIMIT 4>,
+							 <&CPU6 THERMAL_NO_LIMIT 4>,
+							 <&CPU7 THERMAL_NO_LIMIT 4>;
+				};
+				map1 {
+					trip = <&cpu7_crit>;
+					cooling-device = <&CPU4 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU5 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU6 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+							 <&CPU7 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+				};
+			};
 		};
 	};
 };
Since all cpus in the big and little clusters, respectively, are in the
same frequency domain, use all of them for mitigation in the
cooling-map. We end up with two cooling devices - one each for the big
and little clusters.

At the lower trip points we restrict ourselves to throttling only a few
OPPs. At higher trip temperatures, allow ourselves to be throttled to
any extent.

Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org>
---
 arch/arm64/boot/dts/qcom/sdm845.dtsi | 177 ++++++++++++++++++++++++---
 1 file changed, 161 insertions(+), 16 deletions(-)

--
2.17.1
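
As a reading aid, the fragment below annotates one cooling-map entry and the matching CPU property from the patch. The nodes are copied from the diff; the comments are an interpretation, assuming the usual cpufreq cooling behaviour where state 0 is the highest OPP and each higher state steps one OPP down. `THERMAL_NO_LIMIT` is defined as `(~0)` in `dt-bindings/thermal/thermal.h`, and in this fragment "..." stands for properties not shown.

```dts
#include <dt-bindings/thermal/thermal.h>

CPU0: cpu@0 {
	/* ... */
	/*
	 * Two cells follow the phandle in every cooling-device entry:
	 * <&CPU0 min-state max-state>.
	 */
	#cooling-cells = <2>;
};

map0 {
	trip = <&cpu0_alert0>;
	/*
	 * min = THERMAL_NO_LIMIT: no lower bound on the cooling state.
	 * max = 4: the governor may go up to state 4, which on this
	 * reading caps the CPU below the top four OPPs at most.
	 */
	cooling-device = <&CPU0 THERMAL_NO_LIMIT 4>;
};

map1 {
	trip = <&cpu0_crit>;
	/* Both bounds open: throttle to any extent. */
	cooling-device = <&CPU0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
```

Listing all four CPUs of a cluster in one entry, as the patch does, reflects that they share a frequency domain and so are throttled together.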