[v3,0/6] Add thermal user thresholds support

Message ID 20240906073631.1655552-1-daniel.lezcano@linaro.org

Daniel Lezcano Sept. 6, 2024, 7:36 a.m. UTC
Trip points are a firmware description of the temperature limits of a
specific thermal zone, each associated with an action performed by the
kernel. The time resolution is low.

Userspace has to deal with more complex thermal management, based on
heuristics that combine information coming from different places. The
logic is much more complex but works on a coarser time scale, usually
around one second.

The purpose of userspace is to monitor the temperatures from different
places and take actions. However, it cannot constantly read the
temperature to detect when a temperature threshold has been reached.
This is especially bad for mobile or embedded systems, as it leads to
an unacceptable number of wakeups to check the temperature with
nothing to do.

On the other hand, sensors are nowadays mostly interrupt driven. That
means the thermal framework uses the temperature trip points to
program the sensor to trigger an interrupt when a temperature limit is
crossed.
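
To illustrate the idea of programming an interrupt window around the
current temperature, here is a minimal sketch. It is not the thermal
framework code: struct temp_window and interrupt_window() are
hypothetical names, and temperatures are assumed to be in millidegrees
Celsius. It picks the closest limit below and the closest limit above
the current temperature.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical window a sensor would be programmed with. */
struct temp_window {
	int low;	/* interrupt when the temperature drops below this */
	int high;	/* interrupt when the temperature rises above this */
};

/*
 * Pick the nearest limit strictly below and strictly above 'temp'.
 * INT_MIN / INT_MAX mean "no limit on that side".
 */
static struct temp_window interrupt_window(const int *limits, size_t n,
					   int temp)
{
	struct temp_window w = { INT_MIN, INT_MAX };

	assert(limits || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (limits[i] > temp && limits[i] < w.high)
			w.high = limits[i];
		if (limits[i] < temp && limits[i] > w.low)
			w.low = limits[i];
	}

	return w;
}
```

With such a window in place, nothing needs to wake up until the sensor
reports that the temperature left the window.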

Unfortunately, userspace cannot benefit from this feature, and the
current solutions found here and there, in other words out-of-tree,
are to add fake trip points in the firmware and enable the writable
trip points.

This is bad for several reasons. Trip points are meant for in-kernel
actions, and the semantics of their types are used by the thermal
framework. Adding trip points in the device tree works around the
current limitation, but it tampers with how the thermal framework is
supposed to work. Writable trip points are a way to adjust a
temperature limit for a specific platform when the firmware is not
accurate enough, and TBH it is more of a debug feature from my POV.

The user thresholds mechanism is a way for userspace to tell the
thermal framework to send a notification when a temperature limit is
crossed. There is no id and no hysteresis, just the temperature and
the direction of the limit crossing. That means we can be notified
when a temperature threshold is crossed on the way up only, on the way
down only, or both ways. That allows hysteresis to be built on top if
needed.
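
As a rough illustration of the "temperature plus direction" model, the
crossing check can be sketched as below. This is not the code from the
series: the DIR_UP / DIR_DOWN flags, struct user_threshold and
threshold_crossed() are hypothetical names, the unit is assumed to be
millidegrees Celsius, and the handling of the exact-equality case is an
arbitrary choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical direction flags, not the UAPI names from the series. */
enum {
	DIR_UP   = 1 << 0,	/* notify when rising across the limit */
	DIR_DOWN = 1 << 1,	/* notify when falling across the limit */
};

/* A threshold is only a temperature and a direction. */
struct user_threshold {
	int temperature;	/* millidegrees Celsius (assumed unit) */
	int direction;		/* DIR_UP, DIR_DOWN or both */
};

/*
 * Return true if going from last_temp to temp crosses the threshold
 * in one of the requested directions.
 */
static bool threshold_crossed(const struct user_threshold *t,
			      int last_temp, int temp)
{
	assert(t->direction & (DIR_UP | DIR_DOWN));

	if ((t->direction & DIR_UP) &&
	    last_temp < t->temperature && temp >= t->temperature)
		return true;

	if ((t->direction & DIR_DOWN) &&
	    last_temp >= t->temperature && temp < t->temperature)
		return true;

	return false;
}
```

Under these assumptions, a { 45000, DIR_UP } threshold paired with a
{ 40000, DIR_DOWN } one gives userspace a 5°C hysteresis band: it is
told when the zone gets hot, then again only once it has cooled down
below the lower limit.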

Those thresholds are referred to as user thresholds in order to
distinguish them from the trip points, which are similar.

A user threshold can be added, deleted or flushed. The latter means
all user thresholds belonging to a thermal zone will be deleted.
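
The three operations can be pictured with a small userspace list. This
is only a sketch under assumptions: struct threshold_node and the
thresholds_add(), thresholds_delete() and thresholds_flush() helpers
here are hypothetical, and since a threshold has no id, the sketch
assumes deletion is keyed on the temperature value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-zone list node, for illustration only. */
struct threshold_node {
	struct threshold_node *next;
	int temperature;	/* millidegrees Celsius (assumed unit) */
	int direction;		/* opaque direction flags */
};

/* Add a threshold at the head; 0 on success, -1 if out of memory. */
static int thresholds_add(struct threshold_node **head,
			  int temperature, int direction)
{
	struct threshold_node *n;

	assert(head);

	n = malloc(sizeof(*n));
	if (!n)
		return -1;

	n->temperature = temperature;
	n->direction = direction;
	n->next = *head;
	*head = n;

	return 0;
}

/* Delete the first threshold with this temperature; -1 if none. */
static int thresholds_delete(struct threshold_node **head, int temperature)
{
	assert(head);

	for (struct threshold_node **p = head; *p; p = &(*p)->next) {
		if ((*p)->temperature == temperature) {
			struct threshold_node *victim = *p;

			*p = victim->next;
			free(victim);
			return 0;
		}
	}

	return -1;
}

/* Flush: delete every threshold of the zone. */
static void thresholds_flush(struct threshold_node **head)
{
	assert(head);

	while (*head) {
		struct threshold_node *victim = *head;

		*head = victim->next;
		free(victim);
	}
}
```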
    
When one or several user thresholds are crossed, an event is sent to
the userspace.

All aforementioned actions and events lead to a notification to
userspace. A user threshold change (add, delete or flush) is notified
to userspace along with the id of the process responsible for the
action.

Along with the kernel changes, the thermal library has been extended
to provide the APIs to deal with the new user threshold netlink events
and commands.

In addition, the thermal-engine skeleton uses these new APIs by
flushing and adding user thresholds as well as getting the
notifications about these actions.

Overall the series has been tested with the thermal-engine skeleton
and some selftests which are not part of this series.

Changelog:
  V3:
    - the first patch of the v2 series has been merged

    - Modified the description to split the information between the
      cover letter and the patch 1 description (Rafael)

    - Made the thresholds code part of the core (Rafael)

    - Converted the thresholds into a list and directly declared in
      the thermal zone device structure (Rafael)

    - Changed the name of the field in the thermal zone device
      structure to user_thresholds (Rafael)

    - Added #include "thermal_thresholds.h" (Rafael)

    - Combined the conditions in the function
      __thermal_threshold_is_crossed (Rafael)

    - Moved the function thermal_thresholds_flush() before
      thermal_thresholds_exit() (Rafael)

    - Changed thermal_thresholds_handle() to return void (Rafael)

    - Moved the list field to the top of the threshold structure and
      renamed it list_node (Rafael)

    - Changed THERMAL_THRESHOLD_* notifications to
      THERMAL_TZ_THRESHOLD_* (Rafael)

Daniel Lezcano (6):
  thermal/core: Add user thresholds support
  thermal/core: Connect the threshold with the core
  thermal/netlink: Add the commands and the events for the thresholds
  tools/lib/thermal: Make more generic the command encoding function
  tools/lib/thermal: Add the threshold netlink ABI
  tools/thermal/thermal-engine: Take into account the thresholds API

 drivers/thermal/Makefile                      |   1 +
 drivers/thermal/thermal_core.c                |   9 +
 drivers/thermal/thermal_core.h                |   2 +
 drivers/thermal/thermal_netlink.c             | 239 +++++++++++++++++-
 drivers/thermal/thermal_netlink.h             |   7 +
 drivers/thermal/thermal_thresholds.c          | 235 +++++++++++++++++
 drivers/thermal/thermal_thresholds.h          |  19 ++
 include/linux/thermal.h                       |   3 +
 include/uapi/linux/thermal.h                  |  30 ++-
 tools/lib/thermal/commands.c                  | 167 +++++++++++-
 tools/lib/thermal/events.c                    |  58 ++++-
 tools/lib/thermal/include/thermal.h           |  40 +++
 tools/lib/thermal/libthermal.map              |   5 +
 tools/lib/thermal/thermal.c                   |  17 ++
 tools/thermal/lib/Makefile                    |   2 +-
 tools/thermal/thermal-engine/thermal-engine.c | 109 +++++++-
 16 files changed, 894 insertions(+), 49 deletions(-)
 create mode 100644 drivers/thermal/thermal_thresholds.c
 create mode 100644 drivers/thermal/thermal_thresholds.h