diff mbox series

[v3,4/4] thermal/core: Fix thermal trip cross point

Message ID 20220715210911.714479-4-daniel.lezcano@linaro.org
State Superseded
Headers show
Series [v3,1/4] thermal/core: Encapsulate the trip point crossed function | expand

Commit Message

Daniel Lezcano July 15, 2022, 9:09 p.m. UTC
The routine doing trip point crossing the way up or down is actually
wrong.

A trip point is composed with a trip temperature and a hysteresis.

The trip temperature is used to detect when the trip point is crossed
the way up.

The trip temperature minus the hysteresis is used to detect when the
trip point is crossed the way down.

|-----------low--------high------------|
             |<--------->|
             |    hyst   |
             |           |
             |          -|--> crossed the way up
             |
         <---|-- crossed the way down

For that, there is a two point comparison: the current temperature and
the previous temperature.

The actual code assumes if the current temperature is greater than the
trip temperature and the previous temperature was lesser, then the
trip point is crossed the way up. That is true only if we crossed the
way down the low temperature boundary from the previous temperature or
if the hysteresis is zero. The temperature can decrease between the
low and high, so the trip point is not crossed the way down and then
increase again and cross the high temperature raising a new trip point
crossed detection which is incorrect. The same scenario happens when
crossing the way down.

The trip point crossing the way up and down must act as parenthesis, a
trip point down must close a trip point up. Today we have multiple
trip point up without the corresponding trip point down.

In order to fix that, we store the previous trip point which gives the
information about the previous trip and we change the trip point
browsing order depending on the temperature trend: in the ascending
order when the temperature trend is raising, otherwise in the
descending order.

As a sidenote, the thermal_zone_device structure has already the
prev_trip_low and prev_trip_high information which are used by the
thermal_zone_set_trips() function. This one can be changed to be
triggered by the trip temperature crossing function, which makes more
sense, and the two fields will disappear.

Tested on a rk3399-rock960 with thermal stress and 4 trip points. Also
tested with temperature emulation to create a temperature jump
directly to the second trip point.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
V3:

  - Use the ordered indexes introduced in the previous patch as the
    trip could be not ordered

V2:
  - As spotted by Zhang Rui, the trip cross notification does not
  work if the temperature drops and crosses two trip points in the
  same update interval. In order to fix that, we browse the trip point
  in the ascending order when the temperature trend is raising,
  otherwise in the descending order.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_core.c | 54 ++++++++++++++++++++++++----------
 include/linux/thermal.h        |  2 ++
 2 files changed, 41 insertions(+), 15 deletions(-)

Comments

Zhang, Rui July 18, 2022, 5:30 a.m. UTC | #1
On Fri, 2022-07-15 at 23:09 +0200, Daniel Lezcano wrote:
> The routine doing trip point crossing the way up or down is actually
> wrong.
> 
> A trip point is composed with a trip temperature and a hysteresis.
> 
> The trip temperature is used to detect when the trip point is crossed
> the way up.
> 
> The trip temperature minus the hysteresis is used to detect when the
> trip point is crossed the way down.
> 
> > -----------low--------high------------|
>              |<--------->|
>              |    hyst   |
>              |           |
>              |          -|--> crossed the way up
>              |
>          <---|-- crossed the way down
> 
> For that, there is a two point comparison: the current temperature
> and
> the previous temperature.
> 
> The actual code assumes if the current temperature is greater than
> the
> trip temperature and the previous temperature was lesser, then the
> trip point is crossed the way up. That is true only if we crossed the
> way down the low temperature boundary from the previous temperature
> or
> if the hysteresis is zero. The temperature can decrease between the
> low and high, so the trip point is not crossed the way down and then
> increase again and cross the high temperature raising a new trip
> point
> crossed detection which is incorrect. The same scenario happens when
> crossing the way down.
> 
> The trip point crossing the way up and down must act as parenthesis,
> a
> trip point down must close a trip point up. Today we have multiple
> trip point up without the corresponding trip point down.
> 
> In order to fix that, we store the previous trip point which gives
> the
> information about the previous trip and we change the trip point
> browsing order depending on the temperature trend: in the ascending
> order when the temperature trend is raising, otherwise in the
> descending order.
> 
> As a sidenote, the thermal_zone_device structure has already the
> prev_trip_low and prev_trip_high information which are used by the
> thermal_zone_set_trips() function. This one can be changed to be
> triggered by the trip temperature crossing function, which makes more
> sense, and the two fields will disappear.
> 
> Tested on a rk3399-rock960 with thermal stress and 4 trip points.
> Also
> tested with temperature emulation to create a temperature jump
> directly to the second trip point.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
> V3:
> 
>   - Use the ordered indexes introduced in the previous patch as the
>     trip could be not ordered
> 
> V2:
>   - As spotted by Zhang Rui, the trip cross notification does not
>   work if the temperature drops and crosses two trip points in the
>   same update interval. In order to fix that, we browse the trip
> point
>   in the ascending order when the temperature trend is raising,
>   otherwise in the descending order.
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Reviewed-by: Zhang Rui <rui.zhang@intel.com>

> ---
>  drivers/thermal/thermal_core.c | 54 ++++++++++++++++++++++++--------
> --
>  include/linux/thermal.h        |  2 ++
>  2 files changed, 41 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_core.c
> b/drivers/thermal/thermal_core.c
> index f02f38b66445..a5c5f6f4e42b 100644
> --- a/drivers/thermal/thermal_core.c
> +++ b/drivers/thermal/thermal_core.c
> @@ -354,30 +354,48 @@ static void handle_critical_trips(struct
> thermal_zone_device *tz,
>                 tz->ops->critical(tz);
>  }
>  
> -static void handle_thermal_trip_crossed(struct thermal_zone_device
> *tz, int trip,
> +static void handle_thermal_trip_crossed(struct thermal_zone_device
> *tz, int index,
>                                         int trip_temp, int trip_hyst,
>                                         enum thermal_trip_type
> trip_type)
>  {
> +       int trip_low_temp = trip_temp - trip_hyst;
> +       int trip = tz->trips_indexes[index];
> +       
>         if (tz->last_temperature == THERMAL_TEMP_INVALID)
>                 return;
>  
> -       if (tz->last_temperature < trip_temp &&
> -           tz->temperature >= trip_temp) {
> -               thermal_notify_tz_trip_up(tz->id, trip,
> -                                         tz->temperature);
> -       }
> -
> -       if (tz->last_temperature >= trip_temp &&
> -           tz->temperature < (trip_temp - trip_hyst)) {
> -               thermal_notify_tz_trip_down(tz->id, trip,
> -                                           tz->temperature);
> +       /*
> +        * Due to the hysteresis, a third information is needed to
> +        * detect when the temperature is wavering between the
> +        * trip_low_temp and the trip_temp. A trip point is crossed
> +        * the way up only if the temperature is above it while the
> +        * previous temperature was below *and* we crossed the
> +        * trip_temp_low before. The previous trip point give us the
> +        * previous trip point transition. The similar problem exists
> +        * when crossing the way down.
> +        *
> +        * Note the mechanism works only if the caller of the
> function
> +        * invoke the function with the trip point ascending or
> +        * descending regarding the temperature trend. A temperature
> +        * drop trend will browse the trip point in the descending
> +        * order
> +        */
> +       if (tz->last_temperature < trip_temp && tz->temperature >=
> trip_temp &&
> +           index != tz->prev_index) {
> +               thermal_notify_tz_trip_up(tz->id, trip, tz-
> >temperature);
> +               tz->prev_index = index;
> +       } else if (tz->last_temperature >= trip_low_temp && tz-
> >temperature < trip_low_temp &&
> +                  index == tz->prev_index) {
> +               thermal_notify_tz_trip_down(tz->id, trip, tz-
> >temperature);
> +               tz->prev_index--;
>         }
>  }
>  
> -static void handle_thermal_trip(struct thermal_zone_device *tz, int
> trip)
> +static void handle_thermal_trip(struct thermal_zone_device *tz, int
> index)
>  {
>         enum thermal_trip_type type;
>         int trip_temp, hyst = 0;
> +       int trip = tz->trips_indexes[index];
>  
>         /* Ignore disabled trip points */
>         if (test_bit(trip, &tz->trips_disabled))
> @@ -388,7 +406,7 @@ static void handle_thermal_trip(struct
> thermal_zone_device *tz, int trip)
>         if (tz->ops->get_trip_hyst)
>                 tz->ops->get_trip_hyst(tz, trip, &hyst);
>  
> -       handle_thermal_trip_crossed(tz, trip, trip_temp, hyst, type);
> +       handle_thermal_trip_crossed(tz, index, trip_temp, hyst,
> type);
>  
>         if (type == THERMAL_TRIP_CRITICAL || type ==
> THERMAL_TRIP_HOT)
>                 handle_critical_trips(tz, trip, trip_temp, type);
> @@ -428,6 +446,7 @@ static void thermal_zone_device_init(struct
> thermal_zone_device *tz)
>  {
>         struct thermal_instance *pos;
>         tz->temperature = THERMAL_TEMP_INVALID;
> +       tz->prev_index = -1;
>         tz->prev_low_trip = -INT_MAX;
>         tz->prev_high_trip = INT_MAX;
>         list_for_each_entry(pos, &tz->thermal_instances, tz_node)
> @@ -512,8 +531,13 @@ void thermal_zone_device_update(struct
> thermal_zone_device *tz,
>  
>         tz->notify_event = event;
>  
> -       for (count = 0; count < tz->trips; count++)
> -               handle_thermal_trip(tz, count);
> +       if (tz->last_temperature <=  tz->temperature) {
> +               for (count = 0; count < tz->trips; count++)
> +                       handle_thermal_trip(tz, count);
> +       } else {
> +               for (count = tz->trips; count >= 0; count--)
> +                       handle_thermal_trip(tz, count);
> +       }
>  }
>  EXPORT_SYMBOL_GPL(thermal_zone_device_update);
>  
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index 4c3b72536772..d512f21561f1 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -125,6 +125,7 @@ struct thermal_cooling_device {
>   * @last_temperature:  previous temperature read
>   * @emul_temperature:  emulated temperature when using
> CONFIG_THERMAL_EMULATION
>   * @passive:           1 if you've crossed a passive trip point, 0
> otherwise.
> + * @prev_index:                previous index pointing to the trip
> point the thermal zone was
>   * @prev_low_trip:     the low current temperature if you've crossed
> a passive
>                         trip point.
>   * @prev_high_trip:    the above current temperature if you've
> crossed a
> @@ -161,6 +162,7 @@ struct thermal_zone_device {
>         int last_temperature;
>         int emul_temperature;
>         int passive;
> +       int prev_index;
>         int prev_low_trip;
>         int prev_high_trip;
>         atomic_t need_update;
Rafael J. Wysocki Oct. 26, 2023, 6:37 p.m. UTC | #2
On Fri, Jul 15, 2022 at 11:09 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> The routine doing trip point crossing the way up or down is actually
> wrong.
>
> A trip point is composed with a trip temperature and a hysteresis.
>
> The trip temperature is used to detect when the trip point is crossed
> the way up.
>
> The trip temperature minus the hysteresis is used to detect when the
> trip point is crossed the way down.
>
> |-----------low--------high------------|
>              |<--------->|
>              |    hyst   |
>              |           |
>              |          -|--> crossed the way up
>              |
>          <---|-- crossed the way down
>
> For that, there is a two point comparison: the current temperature and
> the previous temperature.
>
> The actual code assumes if the current temperature is greater than the
> trip temperature and the previous temperature was lesser, then the
> trip point is crossed the way up. That is true only if we crossed the
> way down the low temperature boundary from the previous temperature or
> if the hysteresis is zero. The temperature can decrease between the
> low and high, so the trip point is not crossed the way down and then
> increase again and cross the high temperature raising a new trip point
> crossed detection which is incorrect. The same scenario happens when
> crossing the way down.
>
> The trip point crossing the way up and down must act as parenthesis, a
> trip point down must close a trip point up. Today we have multiple
> trip point up without the corresponding trip point down.
>
> In order to fix that, we store the previous trip point which gives the
> information about the previous trip and we change the trip point
> browsing order depending on the temperature trend: in the ascending
> order when the temperature trend is raising, otherwise in the
> descending order.

There is an alternative way of addressing this problem which doesn't
require using information regarding the previous trip point and so it
would work even if the trips were not sorted.

Namely, for each trip there can be an effective threshold equal to
either its temperature. or its temperature minus its hysteresis (low
temperature).  If the initial zone temperature is below the trip's
temperature, the initial value of its threshold is equal to its
temperature.  Otherwise, the initial value of the trip's threshold is
its low temperature.

Then, if the zone temperature crosses the threshold (either up or
down), the trip crossing triggers and the threshold value is flipped
(that is, if it was equal to the trip's temperature, it becomes its
low temperature or the other way around).  [Note that if the threshold
value is equal to the trip temperature, it can only be crossed on the
way up, because it means that the zone temperature was below it at one
point and has not grown above it since then.  Conversely, if the
threshold value is equal to the low temperature of the trip, it can
only be crossed on the way down, because it means that the zone
temperature was above the trip temperature at one point and it has not
fallen below the trip's low temperature since then.]

This should not be hard to implement AFAICS and it should also work in
the cases when one trip is located in the hysteresis range of another
one.
diff mbox series

Patch

diff --git a/drivers/thermal/thermal_core.c b/drivers/thermal/thermal_core.c
index f02f38b66445..a5c5f6f4e42b 100644
--- a/drivers/thermal/thermal_core.c
+++ b/drivers/thermal/thermal_core.c
@@ -354,30 +354,48 @@  static void handle_critical_trips(struct thermal_zone_device *tz,
 		tz->ops->critical(tz);
 }
 
-static void handle_thermal_trip_crossed(struct thermal_zone_device *tz, int trip,
+static void handle_thermal_trip_crossed(struct thermal_zone_device *tz, int index,
 					int trip_temp, int trip_hyst,
 					enum thermal_trip_type trip_type)
 {
+	int trip_low_temp = trip_temp - trip_hyst;
+	int trip = tz->trips_indexes[index];
+	
 	if (tz->last_temperature == THERMAL_TEMP_INVALID)
 		return;
 
-	if (tz->last_temperature < trip_temp &&
-	    tz->temperature >= trip_temp) {
-		thermal_notify_tz_trip_up(tz->id, trip,
-					  tz->temperature);
-	}
-
-	if (tz->last_temperature >= trip_temp &&
-	    tz->temperature < (trip_temp - trip_hyst)) {
-		thermal_notify_tz_trip_down(tz->id, trip,
-					    tz->temperature);
+	/*
+	 * Due to the hysteresis, a third information is needed to
+	 * detect when the temperature is wavering between the
+	 * trip_low_temp and the trip_temp. A trip point is crossed
+	 * the way up only if the temperature is above it while the
+	 * previous temperature was below *and* we crossed the
+	 * trip_temp_low before. The previous trip point give us the
+	 * previous trip point transition. The similar problem exists
+	 * when crossing the way down.
+	 *
+	 * Note the mechanism works only if the caller of the function
+	 * invoke the function with the trip point ascending or
+	 * descending regarding the temperature trend. A temperature
+	 * drop trend will browse the trip point in the descending
+	 * order
+	 */
+	if (tz->last_temperature < trip_temp && tz->temperature >= trip_temp &&
+	    index != tz->prev_index) {
+		thermal_notify_tz_trip_up(tz->id, trip, tz->temperature);
+		tz->prev_index = index;
+	} else if (tz->last_temperature >= trip_low_temp && tz->temperature < trip_low_temp &&
+		   index == tz->prev_index) {
+		thermal_notify_tz_trip_down(tz->id, trip, tz->temperature);
+		tz->prev_index--;
 	}
 }
 
-static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
+static void handle_thermal_trip(struct thermal_zone_device *tz, int index)
 {
 	enum thermal_trip_type type;
 	int trip_temp, hyst = 0;
+	int trip = tz->trips_indexes[index];
 
 	/* Ignore disabled trip points */
 	if (test_bit(trip, &tz->trips_disabled))
@@ -388,7 +406,7 @@  static void handle_thermal_trip(struct thermal_zone_device *tz, int trip)
 	if (tz->ops->get_trip_hyst)
 		tz->ops->get_trip_hyst(tz, trip, &hyst);
 
-	handle_thermal_trip_crossed(tz, trip, trip_temp, hyst, type);
+	handle_thermal_trip_crossed(tz, index, trip_temp, hyst, type);
 
 	if (type == THERMAL_TRIP_CRITICAL || type == THERMAL_TRIP_HOT)
 		handle_critical_trips(tz, trip, trip_temp, type);
@@ -428,6 +446,7 @@  static void thermal_zone_device_init(struct thermal_zone_device *tz)
 {
 	struct thermal_instance *pos;
 	tz->temperature = THERMAL_TEMP_INVALID;
+	tz->prev_index = -1;
 	tz->prev_low_trip = -INT_MAX;
 	tz->prev_high_trip = INT_MAX;
 	list_for_each_entry(pos, &tz->thermal_instances, tz_node)
@@ -512,8 +531,13 @@  void thermal_zone_device_update(struct thermal_zone_device *tz,
 
 	tz->notify_event = event;
 
-	for (count = 0; count < tz->trips; count++)
-		handle_thermal_trip(tz, count);
+	if (tz->last_temperature <=  tz->temperature) {
+		for (count = 0; count < tz->trips; count++)
+			handle_thermal_trip(tz, count);
+	} else {
+		for (count = tz->trips; count >= 0; count--)
+			handle_thermal_trip(tz, count);
+	}
 }
 EXPORT_SYMBOL_GPL(thermal_zone_device_update);
 
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 4c3b72536772..d512f21561f1 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -125,6 +125,7 @@  struct thermal_cooling_device {
  * @last_temperature:	previous temperature read
  * @emul_temperature:	emulated temperature when using CONFIG_THERMAL_EMULATION
  * @passive:		1 if you've crossed a passive trip point, 0 otherwise.
+ * @prev_index:		previous index pointing to the trip point the thermal zone was
  * @prev_low_trip:	the low current temperature if you've crossed a passive
 			trip point.
  * @prev_high_trip:	the above current temperature if you've crossed a
@@ -161,6 +162,7 @@  struct thermal_zone_device {
 	int last_temperature;
 	int emul_temperature;
 	int passive;
+	int prev_index;
 	int prev_low_trip;
 	int prev_high_trip;
 	atomic_t need_update;