diff mbox series

[V3,2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation

Message ID 20191203093704.7037-2-daniel.lezcano@linaro.org
State New
Series [V3,1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice

Commit Message

Daniel Lezcano Dec. 3, 2019, 9:37 a.m. UTC
Provide some documentation for the idle injection cooling effect to
help people understand the rationale behind the idle injection CPU
cooling device.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 .../driver-api/thermal/cpu-idle-cooling.rst   | 166 ++++++++++++++++++
 1 file changed, 166 insertions(+)
 create mode 100644 Documentation/driver-api/thermal/cpu-idle-cooling.rst

-- 
2.17.1

Comments

Amit Kucheria Dec. 4, 2019, 4:24 a.m. UTC | #1
On Tue, Dec 3, 2019 at 3:07 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>

> Provide some documentation for the idle injection cooling effect in

> order to let people to understand the rational of the approach for the


s/rational/rationale

> idle injection CPU cooling device.

>

> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

> ---

>  .../driver-api/thermal/cpu-idle-cooling.rst   | 166 ++++++++++++++++++

>  1 file changed, 166 insertions(+)

>  create mode 100644 Documentation/driver-api/thermal/cpu-idle-cooling.rst

>

> diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst

> new file mode 100644

> index 000000000000..457cd9979ddb

> --- /dev/null

> +++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst

> @@ -0,0 +1,166 @@

> +

> +Situation:

> +----------

> +

> +Under certain circumstances a SoC can reach the maximum temperature

> +limit or is unable to stabilize the temperature around a temperature


s/the maximum/a critical/

s/or/and/

> +control. When the SoC has to stabilize the temperature, the kernel can

> +act on a cooling device to mitigate the dissipated power. When the

> +maximum temperature is reached and to prevent a reboot or a shutdown,

> +a decision must be taken to reduce the temperature under the critical

> +threshold, that impacts the performance.


Consider replacing above paragraph with:

When the critical temperature is reached, a decision must be taken to
reduce the temperature, that, in turn impacts performance.

> +

> +Another situation is when the silicon reaches a certain temperature

> +which continues to increase even if the dynamic leakage is reduced to

> +its minimum by clock gating the component. The runaway phenomena will


s/phenomena/phenomenon/

> +continue with the static leakage and only powering down the component,

> +thus dropping the dynamic and static leakage will allow the component

> +to cool down.

> +


Consider rephrasing as,

Another situation is when the silicon temperature continues to
increase even after the dynamic leakage is reduced to its minimum by
clock gating the component. This runaway phenomenon can continue due
to the static leakage. The only solution is to power down the
component, thus dropping the dynamic and static leakage that will
allow the component to cool down.


> +Last but not least, the system can ask for a specific power budget but

> +because of the OPP density, we can only choose an OPP with a power

> +budget lower than the requested one and underuse the CPU, thus losing

> +performances. In other words, one OPP under uses the CPU with a power


s/performances/performance.

s/underuse/under-utilize/
s/under use/under-utilizes/

> +lesser than the power budget and the next OPP exceed the power budget,


s/lesser than the/less than the requested/
s/exceed/exceeds/

> +an intermediate OPP could have been used if it were present.


Make this a new sentence.

> +

> +Solutions:

> +----------

> +

> +If we can remove the static and the dynamic leakage for a specific

> +duration in a controlled period, the SoC temperature will

> +decrease. Acting at the idle state duration or the idle cycle


s/at/for/ ?

> +injection period, we can mitigate the temperature by modulating the

> +power budget.

> +

> +The Operating Performance Point (OPP) density has a great influence on

> +the control precision of cpufreq, however different vendors have a

> +plethora of OPP density, and some have large power gap between OPPs,

> +that will result in loss of performance during thermal control and

> +loss of power in other scenes.


s/scenes/scenarios/

> +

> +At a specific OPP, we can assume injecting idle cycle on all CPUs,

> +belonging to the same cluster, with a duration greater than the


Change to "we can assume that injecting idle cycles on all CPUs belonging
to the same cluster"

> +cluster idle state target residency, we drop the static and the


s/we drop/will lead to dropping/

> +dynamic leakage for this period (modulo the energy needed to enter

> +this state). So the sustainable power with idle cycles has a linear

> +relation with the OPP’s sustainable power and can be computed with a

> +coefficient similar to:

> +

> +           Power(IdleCycle) = Coef x Power(OPP)

> +

> +Idle Injection:

> +---------------

> +

> +The base concept of the idle injection is to force the CPU to go to an

> +idle state for a specified time each control cycle, it provides

> +another way to control CPU power and heat in addition to

> +cpufreq. Ideally, if all CPUs belonging to the same cluster, inject

> +their idle cycle synchronously, the cluster can reach its power down


cycles

> +state with a minimum power consumption and static leakage

> +drop. However, these idle cycles injection will add extra latencies as


s/static leakage drop/reduce static leakage to (almost) zero/

> +the CPUs will have to wakeup from a deep sleep state.

> +

> +     ^

> +     |

> +     |

> +     |-------       -------       -------

> +     |_______|_____|_______|_____|_______|___________

> +

> +      <----->

> +       idle  <---->

> +              running

> +

> +With the fixed idle injection duration, we can give a value which is

> +an acceptable performance drop off or latency when we reach a specific

> +temperature and we begin to mitigate by varying the Idle injection

> +period.

> +


I'm not sure what is the purpose of this statement. You've described
how the period value starts at a maximum and is adjusted dynamically
below.

> +The mitigation begins with a maximum period value which decrease when


Shouldn't the idle injection period increase to get more cooling effect?

> +more cooling effect is requested. When the period duration is equal to

> +the idle duration, then we are in a situation the platform can’t

> +dissipate the heat enough and the mitigation fails. In this case the

> +situation is considered critical and there is nothing to do. The idle

> +injection duration must be changed by configuration and until we reach

> +the cooling effect, otherwise an additionnal cooling device must be


typo: additional

> +used or ultimately decrease the SoC performance by dropping the

> +highest OPP point of the SoC.

> +

> +The idle injection duration value must comply with the constraints:

> +

> +- It is lesser or equal to the latency we tolerate when the mitigation


s/lesser/less than/

> +  begins. It is platform dependent and will depend on the user

> +  experience, reactivity vs performance trade off we want. This value

> +  should be specified.

> +

> +- It is greater than the idle state’s target residency we want to go

> +  for thermal mitigation, otherwise we end up consuming more energy.

> +

> +Minimum period

> +--------------

> +

> +The idle injection duration being fixed, it is obvious the minimum


Change to:
When the idle injection duration is fixed,

> +period can’t be lesser than that, otherwise we will be scheduling the

> +idle injection task right before the idle injection duration is

> +complete, so waking up the CPU to put it asleep again.

> +

> +Maximum period

> +--------------

> +

> +The maximum period is the initial period when the mitigation

> +begins. Theoretically when we reach the thermal trip point, we have to

> +sustain a specified power for specific temperature but at this time we

> +consume:

> +

> + Power = Capacitance x Voltage^2 x Frequency x Utilisation

> +

> +... which is more than the sustainable power (or there is something

> +wrong on the system setup). The ‘Capacitance’ and ‘Utilisation’ are a


s/on/in/

> +fixed value, ‘Voltage’ and the ‘Frequency’ are fixed artificially

> +because we don’t want to change the OPP. We can group the

> +‘Capacitance’ and the ‘Utilisation’ into a single term which is the

> +‘Dynamic Power Coefficient (Cdyn)’ Simplifying the above, we have:

> +

> + Pdyn = Cdyn x Voltage^2 x Frequency

> +

> +The IPA will ask us somehow to reduce our power in order to target the


s/IPA/power allocator governor/

> +sustainable power defined in the device tree. So with the idle

> +injection mechanism, we want an average power (Ptarget) resulting on


s/on/in

> +an amount of time running at full power on a specific OPP and idle

> +another amount of time. That could be put in a equation:

> +

> + P(opp)target = ((trunning x (P(opp)running) + (tidle P(opp)idle)) /


missed a 'x' after tidle.

Suggest using capital T for time everywhere to make it easier to read.

> +                       (trunning + tidle)

> +  ...

> +

> + tidle = trunning x ((P(opp)running / P(opp)target) - 1)

> +

> +At this point if we know the running period for the CPU, that gives us

> +the idle injection, we need. Alternatively if we have the idle


Lose the comma.

> +injection duration, we can compute the running duration with:

> +

> + trunning = tidle / ((P(opp)running / P(opp)target) - 1)

> +

> +Practically, if the running power is lesses than the targeted power,


s/lesses/less/

> +we end up with a negative time value, so obviously the equation usage

> +is bound to a power reduction, hence a higher OPP is needed to have

> +the running power greater than the targeted power.

> +

> +However, in this demonstration we ignore three aspects:

> +

> + * The static leakage is not defined here, we can introduce it in the

> +   equation but assuming it will be zero most of the time as it is

> +   difficult to get the values from the SoC vendors

> +

> + * The idle state wake up latency (or entry + exit latency) is not

> +   taken into account, it must be added in the equation in order to

> +   rigorously compute the idle injection

> +

> + * The injected idle duration must be greater than the idle state

> +   target residency, otherwise we end up consuming more energy and

> +   potentially invert the mitigation effect

> +

> +So the final equation is:

> +

> + trunning = (tidle - twakeup ) x

> +               (((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target )

> --

> 2.17.1

>
Daniel Lezcano Dec. 4, 2019, 6:50 a.m. UTC | #2
Hi Amit,

thanks for the review.


On 04/12/2019 05:24, Amit Kucheria wrote:

[ ... ]

>> +the CPUs will have to wakeup from a deep sleep state.

>> +

>> +     ^

>> +     |

>> +     |

>> +     |-------       -------       -------

>> +     |_______|_____|_______|_____|_______|___________

>> +

>> +      <----->

>> +       idle  <---->

>> +              running

>> +

>> +With the fixed idle injection duration, we can give a value which is

>> +an acceptable performance drop off or latency when we reach a specific

>> +temperature and we begin to mitigate by varying the Idle injection

>> +period.

>> +

> 

> I'm not sure what it the purpose of this statement. You've described

> how the period value starts at a maximum and is adjusted dynamically

> below.


There are different ways to inject idle periods. We can increase the
idle duration, or keep the duration constant and vary the period. This
statement clarifies that the method used is the latter, because a
constant latency per period is easier to deal with.

>> +The mitigation begins with a maximum period value which decrease when

> 

> Shouldn't the idle injection period increase to get more cooling effect?


The period is the inverse of the frequency. The higher the period, the
lower the frequency, thus fewer idle cycles and less cooling effect.
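
As a rough illustration of that relation, here is a minimal sketch. It
assumes the period covers one idle phase plus one running phase, so the
idle fraction is the fixed idle duration divided by the period. The
function name and the numbers are hypothetical and are not taken from
the driver.

```python
def idle_fraction(idle_duration_us: int, period_us: int) -> float:
    """Fraction of each period spent in injected idle (0 < result <= 1)."""
    if idle_duration_us <= 0:
        raise ValueError("idle duration must be positive")
    if period_us < idle_duration_us:
        # The minimum period is the idle duration itself.
        raise ValueError("period cannot be shorter than the idle duration")
    return idle_duration_us / period_us


if __name__ == "__main__":
    IDLE_US = 10_000  # hypothetical fixed idle injection duration
    # Shrinking the period with a fixed idle duration raises the idle share.
    for period_us in (100_000, 50_000, 20_000, 10_000):
        share = idle_fraction(IDLE_US, period_us)
        print(f"period={period_us:>7} us  idle fraction={share:.0%}")
```

With the idle duration held constant, shortening the period is what
raises the share of time spent idle, and a period equal to the idle
duration leaves no running time at all.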

>> +more cooling effect is requested. When the period duration is equal to

>> +the idle duration, then we are in a situation the platform can’t

>> +dissipate the heat enough and the mitigation fails. In this case the

>> +situation is considered critical and there is nothing to do. The idle

>> +injection duration must be changed by configuration and until we reach

>> +the cooling effect, otherwise an additionnal cooling device must be

> 

> typo: additional

> 

>> +used or ultimately decrease the SoC performance by dropping the

>> +highest OPP point of the SoC.

>> +

>> +The idle injection duration value must comply with the constraints:

>> +

>> +- It is lesser or equal to the latency we tolerate when the mitigation

> 

> s/lesser/less than/

> 

>> +  begins. It is platform dependent and will depend on the user

>> +  experience, reactivity vs performance trade off we want. This value

>> +  should be specified.

>> +

>> +- It is greater than the idle state’s target residency we want to go

>> +  for thermal mitigation, otherwise we end up consuming more energy.

>> +

>> +Minimum period

>> +--------------

>> +

>> +The idle injection duration being fixed, it is obvious the minimum

> 

> Change to:

> When the idle injection duration is fixed,

> 


The idle duration is always fixed in the cpuidle cooling device, why do
you want to add the sentence above?


Amit Kucheria Dec. 4, 2019, 7:10 a.m. UTC | #3
On Wed, Dec 4, 2019 at 12:20 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>

>

> Hi Amit,

>

> thanks for the review.

>

>

> On 04/12/2019 05:24, Amit Kucheria wrote:

>

> [ ... ]

>

> >> +the CPUs will have to wakeup from a deep sleep state.

> >> +

> >> +     ^

> >> +     |

> >> +     |

> >> +     |-------       -------       -------

> >> +     |_______|_____|_______|_____|_______|___________

> >> +

> >> +      <----->

> >> +       idle  <---->

> >> +              running

> >> +

> >> +With the fixed idle injection duration, we can give a value which is

> >> +an acceptable performance drop off or latency when we reach a specific

> >> +temperature and we begin to mitigate by varying the Idle injection

> >> +period.

> >> +

> >

> > I'm not sure what it the purpose of this statement. You've described

> > how the period value starts at a maximum and is adjusted dynamically

> > below.

>

> We can have different way to inject idle periods. We can increase the

> idle duration and/or keep this duration constant but make a variation of

> the period. This statement clarify the method which is the latter

> because we want to have a constant latency per period easier to deal with.


I think I read period as duration leading to confusion. I suggest
using duty-cycle instead of period throughout this series. I think it
will improve the explanation.

The above paragraph could be rewritten as:

"We use a fixed duration of idle injection that gives an acceptable
performance penalty and a fixed latency. Mitigation can be increased
or decreased by modulating the duty cycle of the idle injection."

Perhaps you could also enhance your ascii art above to show fixed
duration idles and different duty cycles to drive home the point.

> >> +The mitigation begins with a maximum period value which decrease when

> >

> > Shouldn't the idle injection period increase to get more cooling effect?

>

> The period is the opposite of the frequency. The highest the period, the

> lowest the frequency, thus less idle cycles and lesser cooling effect.


Yeah, I definitely confused period with duration :-)

> >> +more cooling effect is requested. When the period duration is equal to

> >> +the idle duration, then we are in a situation the platform can’t

> >> +dissipate the heat enough and the mitigation fails. In this case the

> >> +situation is considered critical and there is nothing to do. The idle

> >> +injection duration must be changed by configuration and until we reach

> >> +the cooling effect, otherwise an additionnal cooling device must be

> >

> > typo: additional

> >

> >> +used or ultimately decrease the SoC performance by dropping the

> >> +highest OPP point of the SoC.

> >> +

> >> +The idle injection duration value must comply with the constraints:

> >> +

> >> +- It is lesser or equal to the latency we tolerate when the mitigation

> >

> > s/lesser/less than/

> >

> >> +  begins. It is platform dependent and will depend on the user

> >> +  experience, reactivity vs performance trade off we want. This value

> >> +  should be specified.

> >> +

> >> +- It is greater than the idle state’s target residency we want to go

> >> +  for thermal mitigation, otherwise we end up consuming more energy.

> >> +

> >> +Minimum period

> >> +--------------

> >> +

> >> +The idle injection duration being fixed, it is obvious the minimum

> >

> > Change to:

> > When the idle injection duration is fixed,

> >

>

> The idle duration is always fixed in the cpuidle cooling device, why do

> you want to add the sentence above?


Ignore for now.

Regards,
Amit
Patch

diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
new file mode 100644
index 000000000000..457cd9979ddb
--- /dev/null
+++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
@@ -0,0 +1,166 @@ 
+
+Situation:
+----------
+
+Under certain circumstances a SoC can reach a critical temperature
+limit and be unable to stabilize the temperature around a temperature
+control. When the SoC has to stabilize the temperature, the kernel can
+act on a cooling device to mitigate the dissipated power. When the
+critical temperature is reached, a decision must be taken to reduce
+the temperature below the critical threshold and prevent a reboot or
+a shutdown. This, in turn, impacts performance.
+
+Another situation is when the silicon temperature continues to
+increase even after the dynamic leakage is reduced to its minimum by
+clock gating the component. This runaway phenomenon can continue due
+to the static leakage. The only solution is to power down the
+component, thus dropping both the dynamic and the static leakage,
+which allows the component to cool down.
+
+Last but not least, the system can ask for a specific power budget but
+because of the OPP density, we can only choose an OPP with a power
+budget lower than the requested one and under-utilize the CPU, thus
+losing performance. In other words, one OPP under-utilizes the CPU
+with a power less than the requested power budget and the next OPP
+exceeds it. An intermediate OPP could have been used if it were present.
+
+Solutions:
+----------
+
+If we can remove the static and the dynamic leakage for a specific
+duration in a controlled period, the SoC temperature will
+decrease. By acting on the idle state duration or the idle cycle
+injection period, we can mitigate the temperature by modulating the
+power budget.
+
+The Operating Performance Point (OPP) density has a great influence on
+the control precision of cpufreq. However, OPP density varies widely
+between vendors, and some have large power gaps between OPPs. This
+results in a loss of performance during thermal control and a loss of
+power in other scenarios.
+
+At a specific OPP, we can assume that injecting idle cycles on all
+CPUs belonging to the same cluster, with a duration greater than the
+cluster idle state target residency, drops the static and the
+dynamic leakage for this period (modulo the energy needed to enter
+this state). So the sustainable power with idle cycles has a linear
+relation with the OPP’s sustainable power and can be computed with a
+coefficient similar to:
+
+	    Power(IdleCycle) = Coef x Power(OPP)
+
+Idle Injection:
+---------------
+
+The basic concept of idle injection is to force the CPU to go to an
+idle state for a specified time each control cycle. It provides
+another way to control CPU power and heat in addition to
+cpufreq. Ideally, if all CPUs belonging to the same cluster inject
+their idle cycles synchronously, the cluster can reach its power down
+state with minimum power consumption and static leakage reduced to
+almost zero. However, injecting these idle cycles adds extra latency
+as the CPUs have to wake up from a deep sleep state.
+
+     ^
+     |
+     |
+     |-------       -------       -------
+     |_______|_____|_______|_____|_______|___________
+
+      <----->
+       idle  <---->
+              running
+
+The idle injection duration is fixed to a value that gives an
+acceptable performance drop-off or latency. When we reach a specific
+temperature, we begin to mitigate by varying the idle injection
+period.
+
+The mitigation begins with a maximum period value which decreases when
+more cooling effect is requested. When the period duration is equal to
+the idle duration, we are in a situation where the platform can’t
+dissipate the heat fast enough and the mitigation fails. In this case
+the situation is considered critical and there is nothing more to do.
+The idle injection duration must be changed by configuration until we
+reach the cooling effect, otherwise an additional cooling device must
+be used, or ultimately the SoC performance must be decreased by
+dropping the highest OPP of the SoC.
+
+The idle injection duration value must comply with the constraints:
+
+- It is less than or equal to the latency we tolerate when the
+  mitigation begins. It is platform dependent and will depend on the
+  user experience, reactivity vs performance trade off we want. This
+  value should be specified.
+
+- It is greater than the target residency of the idle state we want to
+  enter for thermal mitigation, otherwise we consume more energy.
+
+Minimum period
+--------------
+
+The idle injection duration being fixed, the minimum period obviously
+can’t be less than that, otherwise we would be scheduling the
+idle injection task right before the idle injection duration is
+complete, thus waking up the CPU only to put it to sleep again.
+
+Maximum period
+--------------
+
+The maximum period is the initial period when the mitigation
+begins. Theoretically when we reach the thermal trip point, we have to
+sustain a specified power for a specific temperature but at this time
+we consume:
+
+ Power = Capacitance x Voltage^2 x Frequency x Utilisation
+
+... which is more than the sustainable power (or there is something
+wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are
+fixed values, ‘Voltage’ and ‘Frequency’ are fixed artificially
+because we don’t want to change the OPP. We can group the
+‘Capacitance’ and the ‘Utilisation’ into a single term which is the
+‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have:
+
+ Pdyn = Cdyn x Voltage^2 x Frequency
+
+The power allocator governor will ask us to reduce our power in order
+to target the sustainable power defined in the device tree. So with the
+idle injection mechanism, we want an average power (Ptarget) resulting
+from an amount of time running at full power on a specific OPP and
+another amount of time idle. That can be put in an equation:
+
+ P(opp)target = ((trunning x P(opp)running) + (tidle x P(opp)idle)) /
+			(trunning + tidle)
+  ...
+
+ tidle = trunning x ((P(opp)running / P(opp)target) - 1)
+
+At this point if we know the running period for the CPU, that gives us
+the idle injection we need. Alternatively if we have the idle
+injection duration, we can compute the running duration with:
+
+ trunning = tidle / ((P(opp)running / P(opp)target) - 1)
+
+Practically, if the running power is less than the targeted power,
+we end up with a negative time value, so obviously the equation usage
+is bound to a power reduction, hence a higher OPP is needed to have
+the running power greater than the targeted power.
+
+However, in this demonstration we ignore three aspects:
+
+ * The static leakage is not defined here. We can introduce it in the
+   equation but assume it will be zero most of the time, as it is
+   difficult to get the values from the SoC vendors
+
+ * The idle state wake up latency (or entry + exit latency) is not
+   taken into account. It must be added to the equation in order to
+   rigorously compute the idle injection
+
+ * The injected idle duration must be greater than the idle state
+   target residency, otherwise we end up consuming more energy and
+   potentially invert the mitigation effect
+
+So the final equation is:
+
+ trunning = (tidle - twakeup) /
+		(((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target )
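
The relation between the running and idle durations in the document can
be checked numerically. The sketch below is only an illustration under
stated assumptions: P(opp)idle is taken as zero, the wake-up latency and
static leakage are ignored, and the function names, units and sample
values are hypothetical. It uses the simplified form
trunning = tidle / ((P(opp)running / P(opp)target) - 1) and then feeds
the result back into the average power equation.

```python
def average_power(t_running: float, t_idle: float,
                  p_running: float, p_idle: float = 0.0) -> float:
    """Time-weighted average power over one running + idle period."""
    return (t_running * p_running + t_idle * p_idle) / (t_running + t_idle)


def running_duration(t_idle: float, p_running: float, p_target: float) -> float:
    """Running time per period that averages out to p_target.

    Assumes the idle power is zero (an assumption of this sketch).
    """
    if p_running <= p_target:
        # No power reduction is needed, the equation would give a
        # negative or undefined time.
        raise ValueError("running power must exceed the target power")
    return t_idle / (p_running / p_target - 1.0)


if __name__ == "__main__":
    T_IDLE_MS = 10.0       # hypothetical fixed idle injection duration
    P_RUNNING_MW = 1500.0  # hypothetical power at the current OPP
    P_TARGET_MW = 1000.0   # hypothetical sustainable power

    t_run = running_duration(T_IDLE_MS, P_RUNNING_MW, P_TARGET_MW)
    avg = average_power(t_run, T_IDLE_MS, P_RUNNING_MW)
    print(f"running {t_run} ms per {t_run + T_IDLE_MS} ms period, "
          f"average power {avg} mW")
```

With these sample values the running duration comes out at twice the
idle duration, and the resulting average power matches the target, which
is the property the equations are meant to guarantee.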