Message ID | 20230519032719.2581689-6-evalenti@kernel.org |
---|---|
State | New |
Headers | show |
Series | thermal: enhancements on thermal stats | expand |
On Fri, May 19, 2023 at 5:27 AM Eduardo Valentin <evalenti@kernel.org> wrote: > > From: Eduardo Valentin <eduval@amazon.com> > > This patch adds a statistic to report how long > the thermal zone spent on temperature intervals > created by each trip point. The first interval > is the range below the first trip point. All > subsequent intervals are accounted when temperature > is above the trip point temperature value. > > Samples: > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > trip-1 0 0 The above line is confusing. > trip0 -10000 35188 > trip1 25000 0 And the format violates the "one value per attribute" sysfs rule. > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > trip-1 0 0 > trip0 -10000 36901 > trip1 25000 0 > $ echo 25001 > /sys//class/thermal/thermal_zone0/emul_temp > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > trip-1 0 0 > trip0 -10000 47810 > trip1 25000 2259 > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > trip-1 0 0 > trip0 -10000 47810 > trip1 25000 3224 > $ echo 24001 > /sys//class/thermal/thermal_zone0/emul_temp > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > trip-1 0 0 > trip0 -10000 48960 > trip1 25000 10080 > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > trip-1 0 0 > trip0 -10000 49844 > trip1 25000 10080 > > Cc: "Rafael J. Wysocki" <rafael@kernel.org> (supporter:THERMAL) > Cc: Daniel Lezcano <daniel.lezcano@linaro.org> (supporter:THERMAL) > Cc: Amit Kucheria <amitk@kernel.org> (reviewer:THERMAL) > Cc: Zhang Rui <rui.zhang@intel.com> (reviewer:THERMAL) > Cc: Jonathan Corbet <corbet@lwn.net> (maintainer:DOCUMENTATION) > Cc: linux-pm@vger.kernel.org (open list:THERMAL) > Cc: linux-doc@vger.kernel.org (open list:DOCUMENTATION) > Cc: linux-kernel@vger.kernel.org (open list) > > Signed-off-by: Eduardo Valentin <eduval@amazon.com> > --- > .../driver-api/thermal/sysfs-api.rst | 2 + > drivers/thermal/thermal_sysfs.c | 86 +++++++++++++++++++ > 2 files changed, 88 insertions(+) > > diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst > index ed5e6ba4e0d7..4a2b92a7488c 100644 > --- a/Documentation/driver-api/thermal/sysfs-api.rst > +++ b/Documentation/driver-api/thermal/sysfs-api.rst > @@ -359,6 +359,8 @@ Thermal zone device sys I/F, created once it's registered:: > |---stats/reset_tz_stats: Writes to this file resets the statistics. > |---stats/max_gradient: The maximum recorded dT/dt in uC/ms. > |---stats/min_gradient: The minimum recorded dT/dt in uC/ms. > + |---stats/time_in_trip_ms: Time spent on each temperature interval of > + trip points. I would write "in each temperature interval between consecutive trip points". Doesn't this assume a specific temperature ordering of trip points? And so what if they are not ordered?
On Tue, Jun 20, 2023 at 07:27:57PM +0200, Rafael J. Wysocki wrote: > > > > On Fri, May 19, 2023 at 5:27 AM Eduardo Valentin <evalenti@kernel.org> wrote: > > > > From: Eduardo Valentin <eduval@amazon.com> > > > > This patch adds a statistic to report how long > > the thermal zone spent on temperature intervals > > created by each trip point. The first interval > > is the range below the first trip point. All > > subsequent intervals are accounted when temperature > > is above the trip point temperature value. > > > > Samples: > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > trip-1 0 0 > > The above line is confusing. > > > trip0 -10000 35188 > > trip1 25000 0 > > And the format violates the "one value per attribute" sysfs rule. > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > trip-1 0 0 > > trip0 -10000 36901 > > trip1 25000 0 > > $ echo 25001 > /sys//class/thermal/thermal_zone0/emul_temp > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > trip-1 0 0 > > trip0 -10000 47810 > > trip1 25000 2259 > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > trip-1 0 0 > > trip0 -10000 47810 > > trip1 25000 3224 > > $ echo 24001 > /sys//class/thermal/thermal_zone0/emul_temp > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > trip-1 0 0 > > trip0 -10000 48960 > > trip1 25000 10080 > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > trip-1 0 0 > > trip0 -10000 49844 > > trip1 25000 10080 > > > > Cc: "Rafael J. Wysocki" <rafael@kernel.org> (supporter:THERMAL) > > Cc: Daniel Lezcano <daniel.lezcano@linaro.org> (supporter:THERMAL) > > Cc: Amit Kucheria <amitk@kernel.org> (reviewer:THERMAL) > > Cc: Zhang Rui <rui.zhang@intel.com> (reviewer:THERMAL) > > Cc: Jonathan Corbet <corbet@lwn.net> (maintainer:DOCUMENTATION) > > Cc: linux-pm@vger.kernel.org (open list:THERMAL) > > Cc: linux-doc@vger.kernel.org (open list:DOCUMENTATION) > > Cc: linux-kernel@vger.kernel.org (open list) > > > > Signed-off-by: Eduardo Valentin <eduval@amazon.com> > > --- > > .../driver-api/thermal/sysfs-api.rst | 2 + > > drivers/thermal/thermal_sysfs.c | 86 +++++++++++++++++++ > > 2 files changed, 88 insertions(+) > > > > diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst > > index ed5e6ba4e0d7..4a2b92a7488c 100644 > > --- a/Documentation/driver-api/thermal/sysfs-api.rst > > +++ b/Documentation/driver-api/thermal/sysfs-api.rst > > @@ -359,6 +359,8 @@ Thermal zone device sys I/F, created once it's registered:: > > |---stats/reset_tz_stats: Writes to this file resets the statistics. > > |---stats/max_gradient: The maximum recorded dT/dt in uC/ms. > > |---stats/min_gradient: The minimum recorded dT/dt in uC/ms. > > + |---stats/time_in_trip_ms: Time spent on each temperature interval of > > + trip points. > > I would write "in each temperature interval between consecutive trip points". Ok > > Doesn't this assume a specific temperature ordering of trip points? > And so what if they are not ordered? It does. I believe other things will break if they are not ordered. Like the temperature update against the governor throttle callback invocation in the thermal core.
On Wed, Jun 21, 2023 at 6:45 AM Eduardo Valentin <evalenti@kernel.org> wrote: > > On Tue, Jun 20, 2023 at 07:27:57PM +0200, Rafael J. Wysocki wrote: > > > > > > > > On Fri, May 19, 2023 at 5:27 AM Eduardo Valentin <evalenti@kernel.org> wrote: > > > > > > From: Eduardo Valentin <eduval@amazon.com> > > > > > > This patch adds a statistic to report how long > > > the thermal zone spent on temperature intervals > > > created by each trip point. The first interval > > > is the range below the first trip point. All > > > subsequent intervals are accounted when temperature > > > is above the trip point temperature value. > > > > > > Samples: > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > > trip-1 0 0 > > > > The above line is confusing. > > > > > trip0 -10000 35188 > > > trip1 25000 0 > > > > And the format violates the "one value per attribute" sysfs rule. > > > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > > trip-1 0 0 > > > trip0 -10000 36901 > > > trip1 25000 0 > > > $ echo 25001 > /sys//class/thermal/thermal_zone0/emul_temp > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > > trip-1 0 0 > > > trip0 -10000 47810 > > > trip1 25000 2259 > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > > trip-1 0 0 > > > trip0 -10000 47810 > > > trip1 25000 3224 > > > $ echo 24001 > /sys//class/thermal/thermal_zone0/emul_temp > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > > trip-1 0 0 > > > trip0 -10000 48960 > > > trip1 25000 10080 > > > $ cat /sys//class/thermal/thermal_zone0/stats/time_in_trip_ms > > > trip-1 0 0 > > > trip0 -10000 49844 > > > trip1 25000 10080 > > > > > > Cc: "Rafael J. Wysocki" <rafael@kernel.org> (supporter:THERMAL) > > > Cc: Daniel Lezcano <daniel.lezcano@linaro.org> (supporter:THERMAL) > > > Cc: Amit Kucheria <amitk@kernel.org> (reviewer:THERMAL) > > > Cc: Zhang Rui <rui.zhang@intel.com> (reviewer:THERMAL) > > > Cc: Jonathan Corbet <corbet@lwn.net> (maintainer:DOCUMENTATION) > > > Cc: linux-pm@vger.kernel.org (open list:THERMAL) > > > Cc: linux-doc@vger.kernel.org (open list:DOCUMENTATION) > > > Cc: linux-kernel@vger.kernel.org (open list) > > > > > > Signed-off-by: Eduardo Valentin <eduval@amazon.com> > > > --- > > > .../driver-api/thermal/sysfs-api.rst | 2 + > > > drivers/thermal/thermal_sysfs.c | 86 +++++++++++++++++++ > > > 2 files changed, 88 insertions(+) > > > > > > diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst > > > index ed5e6ba4e0d7..4a2b92a7488c 100644 > > > --- a/Documentation/driver-api/thermal/sysfs-api.rst > > > +++ b/Documentation/driver-api/thermal/sysfs-api.rst > > > @@ -359,6 +359,8 @@ Thermal zone device sys I/F, created once it's registered:: > > > |---stats/reset_tz_stats: Writes to this file resets the statistics. > > > |---stats/max_gradient: The maximum recorded dT/dt in uC/ms. > > > |---stats/min_gradient: The minimum recorded dT/dt in uC/ms. > > > + |---stats/time_in_trip_ms: Time spent on each temperature interval of > > > + trip points. > > > > I would write "in each temperature interval between consecutive trip points". > > Ok > > > > > Doesn't this assume a specific temperature ordering of trip points? > > And so what if they are not ordered? > > It does. I believe other things will break if they are not ordered. But there's no guarantee that they will be ordered, so it looks like those other things are already broken.
diff --git a/Documentation/driver-api/thermal/sysfs-api.rst b/Documentation/driver-api/thermal/sysfs-api.rst index ed5e6ba4e0d7..4a2b92a7488c 100644 --- a/Documentation/driver-api/thermal/sysfs-api.rst +++ b/Documentation/driver-api/thermal/sysfs-api.rst @@ -359,6 +359,8 @@ Thermal zone device sys I/F, created once it's registered:: |---stats/reset_tz_stats: Writes to this file resets the statistics. |---stats/max_gradient: The maximum recorded dT/dt in uC/ms. |---stats/min_gradient: The minimum recorded dT/dt in uC/ms. + |---stats/time_in_trip_ms: Time spent on each temperature interval of + trip points. Thermal cooling device sys I/F, created once it's registered:: diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c index f89ec9a7e8c8..25851fe073c3 100644 --- a/drivers/thermal/thermal_sysfs.c +++ b/drivers/thermal/thermal_sysfs.c @@ -544,6 +544,7 @@ struct thermal_zone_device_stats { s64 max_gradient; s64 min_gradient; ktime_t last_time; + ktime_t *time_in_trip; }; #define DELTA_MILLI_C_TO_MICRO_C(t0, t1) (((t0) - (t1)) * 1000) @@ -552,6 +553,7 @@ static void temperature_stats_update(struct thermal_zone_device *tz) struct thermal_zone_device_stats *stats = tz->stats; ktime_t now = ktime_get(), delta; s64 cur_gradient, delta_temp; + int i, trip_id = -1; delta = ktime_sub(now, stats->last_time); stats->last_time = now; @@ -567,6 +569,29 @@ static void temperature_stats_update(struct thermal_zone_device *tz) if (tz->last_temperature == 0) cur_gradient = 0; + for (i = 0; i < tz->num_trips; i++) { + struct thermal_trip trip; + int ret; + + ret = __thermal_zone_get_trip(tz, i, &trip); + if (ret) + continue; + + if (tz->temperature > trip.temperature) + trip_id = i; + } + + /* + * Update how long we spend in each temperature interval. + * This array is like as follows: + * time_in_trip[0] == time spent with temperature <= trip[0] + * time_in_trip[1] == time spent with temperature > trip[0] + * time_in_trip[2] == time spent with temperature > trip[1] + * ... + */ + stats->time_in_trip[trip_id + 1] = + ktime_add(stats->time_in_trip[trip_id + 1], delta); + /* update fastest temperature rise from our perspective */ if (cur_gradient > stats->max_gradient) stats->max_gradient = cur_gradient; @@ -615,12 +640,66 @@ static ssize_t min_gradient_show(struct device *dev, return ret; } +static ssize_t +time_in_trip_ms_show(struct device *dev, struct device_attribute *attr, + char *buf) +{ + struct thermal_zone_device *tz = to_thermal_zone(dev); + struct thermal_zone_device_stats *stats = tz->stats; + ssize_t len = 0, ret = 0; + int i; + + spin_lock(&stats->lock); + temperature_stats_update(tz); + + /* + * This should look like this: + * trip-1 X + * trip0 Y + * trip1 Z + * ... + * and the way to read is. This thermal zone temperature was seen + * X ms with tz->temperature <= trip0, Y ms with tz->temperature > trip0 + * and Z ms with tz->temperature > trip1. + */ + for (i = 0; i <= tz->num_trips; i++) { + int trip_temp = 0, r = 0; + + if (i > 0) { + struct thermal_trip trip; + + r = __thermal_zone_get_trip(tz, i - 1, &trip); + if (r < 0) { + ret = r; + break; + } + trip_temp = trip.temperature; + } + + ret = snprintf(buf + len, PAGE_SIZE - len, "trip%d\t%d\t%llu\n", + i - 1, trip_temp, + ktime_to_ms(stats->time_in_trip[i])); + + if (ret == 0) + ret = -EOVERFLOW; + + if (ret < 0) + break; + + len += ret; + } + spin_unlock(&stats->lock); + + return ret < 0 ? ret : len; +} + static ssize_t reset_tz_stats_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) { struct thermal_zone_device *tz = to_thermal_zone(dev); struct thermal_zone_device_stats *stats = tz->stats; + int i; spin_lock(&stats->lock); @@ -628,6 +707,9 @@ reset_tz_stats_store(struct device *dev, struct device_attribute *attr, stats->max_gradient = 0; stats->last_time = ktime_get(); + for (i = 0; i <= tz->num_trips; i++) + stats->time_in_trip[i] = ktime_set(0, 0); + spin_unlock(&stats->lock); return count; @@ -635,11 +717,13 @@ reset_tz_stats_store(struct device *dev, struct device_attribute *attr, static DEVICE_ATTR_RO(min_gradient); static DEVICE_ATTR_RO(max_gradient); +static DEVICE_ATTR_RO(time_in_trip_ms); static DEVICE_ATTR_WO(reset_tz_stats); static struct attribute *thermal_zone_device_stats_attrs[] = { &dev_attr_min_gradient.attr, &dev_attr_max_gradient.attr, + &dev_attr_time_in_trip_ms.attr, &dev_attr_reset_tz_stats.attr, NULL }; @@ -657,11 +741,13 @@ thermal_zone_device_stats_setup(struct thermal_zone_device *tz, int var; var = sizeof(*stats); + var += sizeof(*stats->time_in_trip) * (tz->num_trips + 1); stats = kzalloc(var, GFP_KERNEL); if (!stats) return; + stats->time_in_trip = (ktime_t *)(stats + 1); stats->last_time = ktime_get(); spin_lock_init(&stats->lock); tz->stats = stats;