diff mbox series

[v1,2/3] thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()

Message ID 2181352.irdbgypaU6@kreacher
State Superseded
Headers show
Series thermal/debugfs: Fix handling of cdev states and mitigation episodes in progress | expand

Commit Message

Rafael J. Wysocki April 23, 2024, 6 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If cdev_dt_seq_show() runs before the first state transition of a cooling
device, it will not print any state residency information for it, even
though it might be reasonably expected to print residency information for
the initial state of the cooling device.

For this reason, rearrange the code to get the initial state of a cooling
device at the registration time and pass it to thermal_debug_cdev_add(),
so that the latter can create a duration record for that state which will
allow cdev_dt_seq_show() to print its residency information.

Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
Reported-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/thermal/thermal_core.c    |    9 +++++++--
 drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
 drivers/thermal/thermal_debugfs.h |    4 ++--
 3 files changed, 19 insertions(+), 6 deletions(-)

Comments

Lukasz Luba April 25, 2024, 10:02 a.m. UTC | #1
Hi Rafael,

On 4/23/24 19:00, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If cdev_dt_seq_show() runs before the first state transition of a cooling
> device, it will not print any state residency information for it, even
> though it might be reasonably expected to print residency information for
> the initial state of the cooling device.
> 
> For this reason, rearrange the code to get the initial state of a cooling
> device at the registration time and pass it to thermal_debug_cdev_add(),
> so that the latter can create a duration record for that state which will
> allow cdev_dt_seq_show() to print its residency information.
> 
> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> Reported-by: Lukasz Luba <lukasz.luba@arm.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>   drivers/thermal/thermal_core.c    |    9 +++++++--
>   drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
>   drivers/thermal/thermal_debugfs.h |    4 ++--
>   3 files changed, 19 insertions(+), 6 deletions(-)
> 

I'm trying to test it on my board and do the review, but faced issue.
For some reason I couldn't apply that patch on the latest bleeding-edge
branch.
Could you help me in this please? Is there something missing in the
patch set (like one more fist patch)?

Regards,
Lukasz
Rafael J. Wysocki April 25, 2024, 12:36 p.m. UTC | #2
Hi Lukasz,

On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Hi Rafael,
>
> On 4/23/24 19:00, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > If cdev_dt_seq_show() runs before the first state transition of a cooling
> > device, it will not print any state residency information for it, even
> > though it might be reasonably expected to print residency information for
> > the initial state of the cooling device.
> >
> > For this reason, rearrange the code to get the initial state of a cooling
> > device at the registration time and pass it to thermal_debug_cdev_add(),
> > so that the latter can create a duration record for that state which will
> > allow cdev_dt_seq_show() to print its residency information.
> >
> > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> > Reported-by: Lukasz Luba <lukasz.luba@arm.com>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >   drivers/thermal/thermal_core.c    |    9 +++++++--
> >   drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
> >   drivers/thermal/thermal_debugfs.h |    4 ++--
> >   3 files changed, 19 insertions(+), 6 deletions(-)
> >
>
> I'm trying to test it on my board and do the review, but faced issue.
> For some reason I couldn't apply that patch on the latest bleeding-edge
> branch.
> Could you help me in this please? Is there something missing in the
> patch set (like one more fist patch)?

I messed up the ordering of patches (patch [2/3] should be the last
one in the series) and on top of that, you'll need a small rebase on
that patch.

Sorry about this, I'll send a v2.

Thank you!
Rafael J. Wysocki April 25, 2024, 1 p.m. UTC | #3
On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> Hi Lukasz,
>
> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >
> > Hi Rafael,
> >
> > On 4/23/24 19:00, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > If cdev_dt_seq_show() runs before the first state transition of a cooling
> > > device, it will not print any state residency information for it, even
> > > though it might be reasonably expected to print residency information for
> > > the initial state of the cooling device.
> > >
> > > For this reason, rearrange the code to get the initial state of a cooling
> > > device at the registration time and pass it to thermal_debug_cdev_add(),
> > > so that the latter can create a duration record for that state which will
> > > allow cdev_dt_seq_show() to print its residency information.
> > >
> > > Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> > > Reported-by: Lukasz Luba <lukasz.luba@arm.com>
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >   drivers/thermal/thermal_core.c    |    9 +++++++--
> > >   drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
> > >   drivers/thermal/thermal_debugfs.h |    4 ++--
> > >   3 files changed, 19 insertions(+), 6 deletions(-)
> > >
> >
> > I'm trying to test it on my board and do the review, but faced issue.
> > For some reason I couldn't apply that patch on the latest bleeding-edge
> > branch.
> > Could you help me in this please? Is there something missing in the
> > patch set (like one more fist patch)?
>
> I messed up the ordering of patches (patch [2/3] should be the last
> one in the series) and on top of that, you'll need a small rebase on
> that patch.
>
> Sorry about this, I'll send a v2.

Actually, the ordering was OK, but the rebase of the second patch is
still needed.  I'll send a v2.
Lukasz Luba April 25, 2024, 6:42 p.m. UTC | #4
On 4/25/24 14:00, Rafael J. Wysocki wrote:
> On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> Hi Lukasz,
>>
>> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>>
>>> Hi Rafael,
>>>
>>> On 4/23/24 19:00, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> If cdev_dt_seq_show() runs before the first state transition of a cooling
>>>> device, it will not print any state residency information for it, even
>>>> though it might be reasonably expected to print residency information for
>>>> the initial state of the cooling device.
>>>>
>>>> For this reason, rearrange the code to get the initial state of a cooling
>>>> device at the registration time and pass it to thermal_debug_cdev_add(),
>>>> so that the latter can create a duration record for that state which will
>>>> allow cdev_dt_seq_show() to print its residency information.
>>>>
>>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
>>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com>
>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>> ---
>>>>    drivers/thermal/thermal_core.c    |    9 +++++++--
>>>>    drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
>>>>    drivers/thermal/thermal_debugfs.h |    4 ++--
>>>>    3 files changed, 19 insertions(+), 6 deletions(-)
>>>>
>>>
>>> I'm trying to test it on my board and do the review, but faced issue.
>>> For some reason I couldn't apply that patch on the latest bleeding-edge
>>> branch.
>>> Could you help me in this please? Is there something missing in the
>>> patch set (like one more fist patch)?
>>
>> I messed up the ordering of patches (patch [2/3] should be the last
>> one in the series) and on top of that, you'll need a small rebase on
>> that patch.
>>
>> Sorry about this, I'll send a v2.
> 
> Actually, the ordering was OK, but the rebase of the second patch is
> still needed.  I'll send a v2.

Thanks, I've seen it. That v2 applies smoothly and runs on the board.
I'll test it and review.
Rafael J. Wysocki April 25, 2024, 6:53 p.m. UTC | #5
On Thu, Apr 25, 2024 at 8:42 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 4/25/24 14:00, Rafael J. Wysocki wrote:
> > On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>
> >> Hi Lukasz,
> >>
> >> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >>>
> >>> Hi Rafael,
> >>>
> >>> On 4/23/24 19:00, Rafael J. Wysocki wrote:
> >>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>
> >>>> If cdev_dt_seq_show() runs before the first state transition of a cooling
> >>>> device, it will not print any state residency information for it, even
> >>>> though it might be reasonably expected to print residency information for
> >>>> the initial state of the cooling device.
> >>>>
> >>>> For this reason, rearrange the code to get the initial state of a cooling
> >>>> device at the registration time and pass it to thermal_debug_cdev_add(),
> >>>> so that the latter can create a duration record for that state which will
> >>>> allow cdev_dt_seq_show() to print its residency information.
> >>>>
> >>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> >>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com>
> >>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>> ---
> >>>>    drivers/thermal/thermal_core.c    |    9 +++++++--
> >>>>    drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
> >>>>    drivers/thermal/thermal_debugfs.h |    4 ++--
> >>>>    3 files changed, 19 insertions(+), 6 deletions(-)
> >>>>
> >>>
> >>> I'm trying to test it on my board and do the review, but faced issue.
> >>> For some reason I couldn't apply that patch on the latest bleeding-edge
> >>> branch.
> >>> Could you help me in this please? Is there something missing in the
> >>> patch set (like one more fist patch)?
> >>
> >> I messed up the ordering of patches (patch [2/3] should be the last
> >> one in the series) and on top of that, you'll need a small rebase on
> >> that patch.
> >>
> >> Sorry about this, I'll send a v2.
> >
> > Actually, the ordering was OK, but the rebase of the second patch is
> > still needed.  I'll send a v2.
>
> Thanks, I've seen it. That v2 applies smoothly and runs on the board.
> I'll test it and review.

Thank you!

Please also see

https://lore.kernel.org/linux-pm/12427744.O9o76ZdvQC@kreacher/

which is actually more urgent because the fixes there address more
serious issues (I would even consider them as 6.9-rc material).

Thanks!
Lukasz Luba April 25, 2024, 6:55 p.m. UTC | #6
On 4/25/24 19:53, Rafael J. Wysocki wrote:
> On Thu, Apr 25, 2024 at 8:42 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>>
>>
>> On 4/25/24 14:00, Rafael J. Wysocki wrote:
>>> On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>>>
>>>> Hi Lukasz,
>>>>
>>>> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>>>>>
>>>>> Hi Rafael,
>>>>>
>>>>> On 4/23/24 19:00, Rafael J. Wysocki wrote:
>>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>>
>>>>>> If cdev_dt_seq_show() runs before the first state transition of a cooling
>>>>>> device, it will not print any state residency information for it, even
>>>>>> though it might be reasonably expected to print residency information for
>>>>>> the initial state of the cooling device.
>>>>>>
>>>>>> For this reason, rearrange the code to get the initial state of a cooling
>>>>>> device at the registration time and pass it to thermal_debug_cdev_add(),
>>>>>> so that the latter can create a duration record for that state which will
>>>>>> allow cdev_dt_seq_show() to print its residency information.
>>>>>>
>>>>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
>>>>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com>
>>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>>> ---
>>>>>>     drivers/thermal/thermal_core.c    |    9 +++++++--
>>>>>>     drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
>>>>>>     drivers/thermal/thermal_debugfs.h |    4 ++--
>>>>>>     3 files changed, 19 insertions(+), 6 deletions(-)
>>>>>>
>>>>>
>>>>> I'm trying to test it on my board and do the review, but faced issue.
>>>>> For some reason I couldn't apply that patch on the latest bleeding-edge
>>>>> branch.
>>>>> Could you help me in this please? Is there something missing in the
>>>>> patch set (like one more fist patch)?
>>>>
>>>> I messed up the ordering of patches (patch [2/3] should be the last
>>>> one in the series) and on top of that, you'll need a small rebase on
>>>> that patch.
>>>>
>>>> Sorry about this, I'll send a v2.
>>>
>>> Actually, the ordering was OK, but the rebase of the second patch is
>>> still needed.  I'll send a v2.
>>
>> Thanks, I've seen it. That v2 applies smoothly and runs on the board.
>> I'll test it and review.
> 
> Thank you!
> 
> Please also see
> 
> https://lore.kernel.org/linux-pm/12427744.O9o76ZdvQC@kreacher/
> 
> which is actually more urgent because the fixes there address more
> serious issues (I would even consider them as 6.9-rc material).

Yes, I've read that patches' headers and glanced through the code.
I'll do the review & testing on them as well today.
Rafael J. Wysocki April 25, 2024, 7:10 p.m. UTC | #7
On Thu, Apr 25, 2024 at 8:55 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
>
>
>
> On 4/25/24 19:53, Rafael J. Wysocki wrote:
> > On Thu, Apr 25, 2024 at 8:42 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >>
> >>
> >>
> >> On 4/25/24 14:00, Rafael J. Wysocki wrote:
> >>> On Thu, Apr 25, 2024 at 2:36 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >>>>
> >>>> Hi Lukasz,
> >>>>
> >>>> On Thu, Apr 25, 2024 at 12:02 PM Lukasz Luba <lukasz.luba@arm.com> wrote:
> >>>>>
> >>>>> Hi Rafael,
> >>>>>
> >>>>> On 4/23/24 19:00, Rafael J. Wysocki wrote:
> >>>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>>>
> >>>>>> If cdev_dt_seq_show() runs before the first state transition of a cooling
> >>>>>> device, it will not print any state residency information for it, even
> >>>>>> though it might be reasonably expected to print residency information for
> >>>>>> the initial state of the cooling device.
> >>>>>>
> >>>>>> For this reason, rearrange the code to get the initial state of a cooling
> >>>>>> device at the registration time and pass it to thermal_debug_cdev_add(),
> >>>>>> so that the latter can create a duration record for that state which will
> >>>>>> allow cdev_dt_seq_show() to print its residency information.
> >>>>>>
> >>>>>> Fixes: 755113d76786 ("thermal/debugfs: Add thermal cooling device debugfs information")
> >>>>>> Reported-by: Lukasz Luba <lukasz.luba@arm.com>
> >>>>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>>>>> ---
> >>>>>>     drivers/thermal/thermal_core.c    |    9 +++++++--
> >>>>>>     drivers/thermal/thermal_debugfs.c |   12 ++++++++++--
> >>>>>>     drivers/thermal/thermal_debugfs.h |    4 ++--
> >>>>>>     3 files changed, 19 insertions(+), 6 deletions(-)
> >>>>>>
> >>>>>
> >>>>> I'm trying to test it on my board and do the review, but faced issue.
> >>>>> For some reason I couldn't apply that patch on the latest bleeding-edge
> >>>>> branch.
> >>>>> Could you help me in this please? Is there something missing in the
> >>>>> patch set (like one more fist patch)?
> >>>>
> >>>> I messed up the ordering of patches (patch [2/3] should be the last
> >>>> one in the series) and on top of that, you'll need a small rebase on
> >>>> that patch.
> >>>>
> >>>> Sorry about this, I'll send a v2.
> >>>
> >>> Actually, the ordering was OK, but the rebase of the second patch is
> >>> still needed.  I'll send a v2.
> >>
> >> Thanks, I've seen it. That v2 applies smoothly and runs on the board.
> >> I'll test it and review.
> >
> > Thank you!
> >
> > Please also see
> >
> > https://lore.kernel.org/linux-pm/12427744.O9o76ZdvQC@kreacher/
> >
> > which is actually more urgent because the fixes there address more
> > serious issues (I would even consider them as 6.9-rc material).
>
> Yes, I've read that patches' headers and glanced through the code.
> I'll do the review & testing on them as well today.

Thanks, much appreciated!
diff mbox series

Patch

Index: linux-pm/drivers/thermal/thermal_core.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_core.c
+++ linux-pm/drivers/thermal/thermal_core.c
@@ -928,6 +928,7 @@  __thermal_cooling_device_register(struct
 {
 	struct thermal_cooling_device *cdev;
 	struct thermal_zone_device *pos = NULL;
+	unsigned long current_state;
 	int id, ret;
 
 	if (!ops || !ops->get_max_state || !ops->get_cur_state ||
@@ -965,6 +966,10 @@  __thermal_cooling_device_register(struct
 	if (ret)
 		goto out_cdev_type;
 
+	ret = cdev->ops->get_cur_state(cdev, &current_state);
+	if (ret)
+		goto out_cdev_type;
+
 	thermal_cooling_device_setup_sysfs(cdev);
 
 	ret = dev_set_name(&cdev->device, "cooling_device%d", cdev->id);
@@ -978,6 +983,8 @@  __thermal_cooling_device_register(struct
 		return ERR_PTR(ret);
 	}
 
+	thermal_debug_cdev_add(cdev, current_state);
+
 	/* Add 'this' new cdev to the global cdev list */
 	mutex_lock(&thermal_list_lock);
 
@@ -993,8 +1000,6 @@  __thermal_cooling_device_register(struct
 
 	mutex_unlock(&thermal_list_lock);
 
-	thermal_debug_cdev_add(cdev);
-
 	return cdev;
 
 out_cooling_dev:
Index: linux-pm/drivers/thermal/thermal_debugfs.c
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.c
+++ linux-pm/drivers/thermal/thermal_debugfs.c
@@ -466,8 +466,9 @@  void thermal_debug_cdev_state_update(con
  * Allocates a cooling device object for debug, initializes the
  * statistics and create the entries in sysfs.
  * @cdev: a pointer to a cooling device
+ * @state: current state of the cooling device
  */
-void thermal_debug_cdev_add(struct thermal_cooling_device *cdev)
+void thermal_debug_cdev_add(struct thermal_cooling_device *cdev, int state)
 {
 	struct thermal_debugfs *thermal_dbg;
 	struct cdev_debugfs *cdev_dbg;
@@ -484,9 +485,16 @@  void thermal_debug_cdev_add(struct therm
 		INIT_LIST_HEAD(&cdev_dbg->durations[i]);
 	}
 
-	cdev_dbg->current_state = 0;
+	cdev_dbg->current_state = state;
 	cdev_dbg->timestamp = ktime_get();
 
+	/*
+	 * Create a record for the initial cooling device state, so its
+	 * duration will be printed by cdev_dt_seq_show() as expected if it
+	 * runs before the first state transition.
+	 */
+	thermal_debugfs_cdev_record_get(thermal_dbg, cdev_dbg->durations, state);
+
 	debugfs_create_file("trans_table", 0400, thermal_dbg->d_top,
 			    thermal_dbg, &tt_fops);
 
Index: linux-pm/drivers/thermal/thermal_debugfs.h
===================================================================
--- linux-pm.orig/drivers/thermal/thermal_debugfs.h
+++ linux-pm/drivers/thermal/thermal_debugfs.h
@@ -2,7 +2,7 @@ 
 
 #ifdef CONFIG_THERMAL_DEBUGFS
 void thermal_debug_init(void);
-void thermal_debug_cdev_add(struct thermal_cooling_device *cdev);
+void thermal_debug_cdev_add(struct thermal_cooling_device *cdev, int state);
 void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev);
 void thermal_debug_cdev_state_update(const struct thermal_cooling_device *cdev, int state);
 void thermal_debug_tz_add(struct thermal_zone_device *tz);
@@ -14,7 +14,7 @@  void thermal_debug_tz_trip_down(struct t
 void thermal_debug_update_stats(struct thermal_zone_device *tz);
 #else
 static inline void thermal_debug_init(void) {}
-static inline void thermal_debug_cdev_add(struct thermal_cooling_device *cdev) {}
+static inline void thermal_debug_cdev_add(struct thermal_cooling_device *cdev, int state) {}
 static inline void thermal_debug_cdev_remove(struct thermal_cooling_device *cdev) {}
 static inline void thermal_debug_cdev_state_update(const struct thermal_cooling_device *cdev,
 						   int state) {}