
thermal: devfreq_cooling: use local ops instead of global ops

Message ID 20220325094436.101419-1-kant@allwinnertech.com
State Accepted
Commit b947769b8f778db130aad834257fcaca25df2edc
Series thermal: devfreq_cooling: use local ops instead of global ops

Commit Message

Kant Fan March 25, 2022, 9:44 a.m. UTC
commit 7b62935828266658714f81d4e9176edad808dc70 upstream.

Fix an illegal address access that occurs under the following condition:
There are multiple devfreq cooling devices in the system, some of which
register with dfc_power while others do not. When a cooling device with
dfc_power registers, power model ops such as state2power are appended to
the global devfreq_cooling_ops. A cooling device without dfc_power that
registers later through of_devfreq_cooling_register_power() or
of_devfreq_cooling_register() then also uses the modified
devfreq_cooling_ops.

The IPA governor regards the cooling devices without dfc_power as power
actors because they also have power model ops, and it accesses an illegal
address at dfc->power_ops when executing cdev->ops->get_requested_power
or cdev->ops->power2state, as the call trace below shows:

Unable to handle kernel NULL pointer dereference at virtual address 00000008
...
calltrace:
[<c06e5488>] devfreq_cooling_power2state+0x24/0x184
[<c06df420>] power_actor_set_power+0x54/0xa8
[<c06e3774>] power_allocator_throttle+0x770/0x97c
[<c06dd120>] handle_thermal_trip+0x1b4/0x26c
[<c06ddb48>] thermal_zone_device_update+0x154/0x208
[<c014159c>] process_one_work+0x1ec/0x36c
[<c0141c58>] worker_thread+0x204/0x2ec
[<c0146788>] kthread+0x140/0x154
[<c01010e8>] ret_from_fork+0x14/0x2c

Fixes: a76caf55e5b35 ("thermal: Add devfreq cooling")
Cc: stable@vger.kernel.org # 4.4+
Signed-off-by: Kant Fan <kant@allwinnertech.com>
---
 drivers/thermal/devfreq_cooling.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)
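
The failure mode described in the commit message can be reproduced in miniature in userspace. The following is only a sketch under stated assumptions: `struct cooling_ops`, `template_ops`, `register_global()` and `register_local()` are hypothetical stand-ins for the kernel structures and functions, and `malloc()` plus `memcpy()` stands in for `kmemdup()`. It shows that patching a shared ops table makes every later device look like a power actor, whereas a private copy per device does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct thermal_cooling_device_ops. */
struct cooling_ops {
	int (*get_max_state)(void);
	int (*state2power)(int state);	/* only meaningful for power actors */
};

static int demo_get_max_state(void) { return 3; }
static int demo_state2power(int state) { return state * 100; }

/* Shared template, playing the role of the file-scope devfreq_cooling_ops. */
static struct cooling_ops template_ops = {
	.get_max_state = demo_get_max_state,
};

/* Old behaviour: patch the shared table and hand it to every device. */
struct cooling_ops *register_global(int has_power_model)
{
	if (has_power_model)
		template_ops.state2power = demo_state2power;
	return &template_ops;
}

/* Fixed behaviour: every device gets its own copy of the template. */
struct cooling_ops *register_local(int has_power_model)
{
	struct cooling_ops *ops = malloc(sizeof(*ops));

	if (!ops)
		return NULL;
	memcpy(ops, &template_ops, sizeof(*ops));
	if (has_power_model)
		ops->state2power = demo_state2power;
	return ops;	/* caller owns the copy and must free() it */
}
```

With `register_global()`, a device registered without a power model after one with a power model still sees a non-NULL `state2power`, which is the condition that leads the governor to dereference a missing `dfc->power_ops`. With `register_local()`, the second device's `state2power` stays NULL and the template is never modified.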

Comments

Kant Fan April 23, 2022, 10:49 a.m. UTC | #1
On 20/04/2022 18:32, Lukasz Luba wrote:
> Hi Kant,
> 
> On 4/19/22 16:49, Kant Fan wrote:
>> On 29/03/2022 14:59, Lukasz Luba wrote:
>>> Looks good. So this patch should be applied for all stable
>>> kernels starting from v4.4 to v5.12 (the v5.13 and later need
>>> other patch).
>>>
>>> Next time you might use in the subject something like:
>>> [PATCH 4.4] thermal: devfreq_cooling: use local ops instead of global ops
>>> It would be better distinguished from your other patch with the
>>> same subject, which was for mainline and v5.13+
>>
>> Hi Lukasz,
>> Thank you for the guidance. I want to check that I understand you
>> correctly. Could you confirm the following information?
>>
>> 1. The stable patches
>> After the patch is merged into mainline later, I'll submit the
>> following patches individually for v4.4 ~ v5.12:
> 
> Correct, after it gets into mainline you can point to that commit hash
> and proceed with those patches. I don't know which of those older
> stable kernels are still maintained, since some of them have longer
> support while the rest had shorter support that might have already
> ended. You can check the end of life for those 'Longterm' kernels
> here [1]. AFAICS 4.4 is not in that table, so you can start from 4.9,
> which should be OK.
> So the list of needed patches would be for these stable kernels:
> 4.9, 4.14, 4.19, 5.4, 5.10
> I can see that the last release for 5.11.x was in May 2021, so it has
> probably ended; similarly for 5.12.x (Jul 2021). That's why I suggested
> that list of long-support kernels.
> 

Hi Lukasz,
Thanks for figuring it out. I'll check the stable versions carefully.

>>
>> [PATCH 4.4] thermal: devfreq_cooling: use local ops instead of global ops
>> [PATCH 4.5] thermal: devfreq_cooling: use local ops instead of global ops
>> ...
>> [PATCH 5.12] thermal: devfreq_cooling: use local ops instead of global ops
>>
>> And also the following patches individually for v5.13+ :
> 
> For this, you probably don't have to. You have added 'v5.13+' in the
> original patch v2, so it will be picked up correctly. It should apply
> on those stable kernels w/o issues. If there are any, the stable kernel
> engineers will ping us.
> 
>> [PATCH 5.13] thermal: devfreq_cooling: use local ops instead of global ops
>> [PATCH 5.14] thermal: devfreq_cooling: use local ops instead of global ops
>> ...
>> [PATCH 5.17] thermal: devfreq_cooling: use local ops instead of global ops
>>
>> 2. The mainline patch
>> I saw your mail with Rafael, seems there are conflicts... I wonder if 
>> there's anything wrong with my patch, or anything I can help?
>>
> 
> Thank you for offering help. Rafael solved that correctly, so it doesn't
> need any more work.
> 
> Thank you for doing that work!
> 
> Regards,
> Lukasz
> 
> [1] https://www.kernel.org/category/releases.html

No problem. I'll submit the stable patches after the mainline patch is 
merged.

Patch

diff --git a/drivers/thermal/devfreq_cooling.c b/drivers/thermal/devfreq_cooling.c
index dfab49a67252..d36f70513e6a 100644
--- a/drivers/thermal/devfreq_cooling.c
+++ b/drivers/thermal/devfreq_cooling.c
@@ -462,22 +462,29 @@  of_devfreq_cooling_register_power(struct device_node *np, struct devfreq *df,
 {
 	struct thermal_cooling_device *cdev;
 	struct devfreq_cooling_device *dfc;
+	struct thermal_cooling_device_ops *ops;
 	char dev_name[THERMAL_NAME_LENGTH];
 	int err;
 
-	dfc = kzalloc(sizeof(*dfc), GFP_KERNEL);
-	if (!dfc)
+	ops = kmemdup(&devfreq_cooling_ops, sizeof(*ops), GFP_KERNEL);
+	if (!ops)
 		return ERR_PTR(-ENOMEM);
 
+	dfc = kzalloc(sizeof(*dfc), GFP_KERNEL);
+	if (!dfc) {
+		err = -ENOMEM;
+		goto free_ops;
+	}
+
 	dfc->devfreq = df;
 
 	if (dfc_power) {
 		dfc->power_ops = dfc_power;
 
-		devfreq_cooling_ops.get_requested_power =
+		ops->get_requested_power =
 			devfreq_cooling_get_requested_power;
-		devfreq_cooling_ops.state2power = devfreq_cooling_state2power;
-		devfreq_cooling_ops.power2state = devfreq_cooling_power2state;
+		ops->state2power = devfreq_cooling_state2power;
+		ops->power2state = devfreq_cooling_power2state;
 	}
 
 	err = devfreq_cooling_gen_tables(dfc);
@@ -497,8 +504,7 @@  of_devfreq_cooling_register_power(struct device_node *np, struct devfreq *df,
 
 	snprintf(dev_name, sizeof(dev_name), "thermal-devfreq-%d", dfc->id);
 
-	cdev = thermal_of_cooling_device_register(np, dev_name, dfc,
-						  &devfreq_cooling_ops);
+	cdev = thermal_of_cooling_device_register(np, dev_name, dfc, ops);
 	if (IS_ERR(cdev)) {
 		err = PTR_ERR(cdev);
 		dev_err(df->dev.parent,
@@ -522,6 +528,8 @@  of_devfreq_cooling_register_power(struct device_node *np, struct devfreq *df,
 	kfree(dfc->freq_table);
 free_dfc:
 	kfree(dfc);
+free_ops:
+	kfree(ops);
 
 	return ERR_PTR(err);
 }
@@ -557,10 +565,12 @@  EXPORT_SYMBOL_GPL(devfreq_cooling_register);
 void devfreq_cooling_unregister(struct thermal_cooling_device *cdev)
 {
 	struct devfreq_cooling_device *dfc;
+	const struct thermal_cooling_device_ops *ops;
 
 	if (!cdev)
 		return;
 
+	ops = cdev->ops;
 	dfc = cdev->devdata;
 
 	thermal_cooling_device_unregister(dfc->cdev);
@@ -570,5 +580,6 @@  void devfreq_cooling_unregister(struct thermal_cooling_device *cdev)
 	kfree(dfc->freq_table);
 
 	kfree(dfc);
+	kfree(ops);
 }
 EXPORT_SYMBOL_GPL(devfreq_cooling_unregister);
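
The ownership rules that the patch introduces (allocate the ops copy first, unwind in reverse order on failure, free the copy at unregister time) can be sketched in userspace as follows. This is an illustration only: `struct demo_ops`, `struct demo_dfc`, `demo_register()` and `demo_unregister()` are hypothetical names, `calloc()`/`free()` stand in for `kzalloc()`/`kfree()`, and the `fail_tables` flag merely simulates a failing table-generation step.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_ops { int placeholder; };

struct demo_dfc {
	struct demo_ops *ops;	/* private ops copy owned by this device */
	int *freq_table;
};

/* Returns 0 and stores the device in *out, or a negative errno value. */
int demo_register(int fail_tables, struct demo_dfc **out)
{
	struct demo_ops *ops;
	struct demo_dfc *dfc;
	int err;

	*out = NULL;

	ops = calloc(1, sizeof(*ops));
	if (!ops)
		return -ENOMEM;

	dfc = calloc(1, sizeof(*dfc));
	if (!dfc) {
		err = -ENOMEM;
		goto free_ops;
	}

	if (fail_tables) {	/* simulated failure of a later setup step */
		err = -EINVAL;
		goto free_dfc;
	}

	dfc->freq_table = calloc(4, sizeof(*dfc->freq_table));
	if (!dfc->freq_table) {
		err = -ENOMEM;
		goto free_dfc;
	}

	dfc->ops = ops;
	*out = dfc;
	return 0;

	/* Labels release resources in reverse order of acquisition. */
free_dfc:
	free(dfc);
free_ops:
	free(ops);
	return err;
}

void demo_unregister(struct demo_dfc *dfc)
{
	struct demo_ops *ops;

	if (!dfc)
		return;

	ops = dfc->ops;	/* save the pointer before its owner is freed */
	free(dfc->freq_table);
	free(dfc);
	free(ops);
}
```

In the patch itself, `devfreq_cooling_unregister()` reads `cdev->ops` before calling `thermal_cooling_device_unregister()`, presumably because `cdev` may no longer be valid afterwards. The sketch mirrors that by saving `ops` before freeing `dfc`.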