Message ID | 1631041289-11804-1-git-send-email-quic_subbaram@quicinc.com |
---|---|
State | New |
Headers | show |
Series | [v2] thermal: Fix a NULL pointer dereference | expand |
On 9/17/21 1:06 PM, Subbaraman Narayanamurthy wrote: > On 9/17/21 2:31 AM, Daniel Lezcano wrote: >> On 07/09/2021 21:01, Subbaraman Narayanamurthy wrote: >>> of_parse_thermal_zones() parses the thermal-zones node and registers a >>> thermal_zone device for each subnode. However, if a thermal zone is >>> consuming a thermal sensor and that thermal sensor device hasn't probed >>> yet, an attempt to set trip_point_*_temp for that thermal zone device >>> can cause a NULL pointer dereference. Fix it. >>> >>> console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp >>> ... >>> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 >>> ... >>> Call trace: >>> of_thermal_set_trip_temp+0x40/0xc4 >>> trip_point_temp_store+0xc0/0x1dc >>> dev_attr_store+0x38/0x88 >>> sysfs_kf_write+0x64/0xc0 >>> kernfs_fop_write_iter+0x108/0x1d0 >>> vfs_write+0x2f4/0x368 >>> ksys_write+0x7c/0xec >>> __arm64_sys_write+0x20/0x30 >>> el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc >>> do_el0_svc+0x28/0xa0 >>> el0_svc+0x14/0x24 >>> el0_sync_handler+0x88/0xec >>> el0_sync+0x1c0/0x200 >>> >>> While at it, fix the possible NULL pointer dereference in other >>> functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(), >>> of_thermal_get_trend(). >>> >>> Cc: stable@vger.kernel.org >>> Suggested-by: David Collins <quic_collinsd@quicinc.com> >>> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com> >>> --- >>> Changes for v2: >>> - Added checks in of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend(). >>> >>> drivers/thermal/thermal_of.c | 9 ++++++--- >>> 1 file changed, 6 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c >>> index 6379f26..9233f7e 100644 >>> --- a/drivers/thermal/thermal_of.c >>> +++ b/drivers/thermal/thermal_of.c >>> @@ -89,7 +89,7 @@ static int of_thermal_get_temp(struct thermal_zone_device *tz, >>> { >>> struct __thermal_zone *data = tz->devdata; >>> >>> - if (!data->ops->get_temp) >>> + if (!data->ops || !data->ops->get_temp) >> comment (1) >> >> AFAICT, if data->ops != NULL then data->ops->get_temp is also != NULL >> because of the code allocating/freeing the ops structure. >> >> The tests can be replaced by (!data->ops), no ? > Thanks Daniel for reviewing the patch. > > I agree that even if a sensor module is unregistered, that would call > "thermal_zone_of_sensor_unregister" which would eventually set NULL on > get_temp() and get_trend() and "tzd->ops" as well. > > However, of_thermal_get_temp() is trying to call "data->ops->get_temp" > which comes from a sensor driver when it registers. There is no > guarantee that it would be non-NULL right? > > Thinking of which, I think having both checks would be valid. Hi Daniel, Do you still have any concerns with this change? Thanks, Subbaraman >>> return -EINVAL; >>> >>> return data->ops->get_temp(data->sensor_data, temp); >>> @@ -186,6 +186,9 @@ static int of_thermal_set_emul_temp(struct thermal_zone_device *tz, >>> { >>> struct __thermal_zone *data = tz->devdata; >>> >>> + if (!data->ops || !data->ops->set_emul_temp) >>> + return -EINVAL; >>> + >> comment (2) >> >> The test looks pointless here (I mean both of them). >> >> If of_thermal_set_emul_temp() is called it is because the callback was >> set in thermal_zone_of_add_sensor(). >> >> This one does: >> >> tz->ops = ops; >> >> and >> if (ops->set_emul_temp) >> tzd->ops->set_emul_temp = of_thermal_set_emul_temp; >> >> If I'm not wrong if we are called, then data->ops && >> data->ops->set_emul_temp is always true, right ? >> > I've not exercised this condition yet. However, the original problem > we've observed was when thermal HAL was trying to set trip thresholds > on a thermal zone device for which the sensor device is not probed yet. > This had happened randomly because of vendor modules taking time to be > loaded and probed. I don't know if there would be any userspace entity > that can try to set emulated temperature for a thermal zone even before > a sensor device is probed. > > Without a sensor driver probed, "tz->ops" would not have a valid pointer > right? So, I think checking for "data->ops" should be good. > > Another possibility is, a sensor might not have "set_emul_temp" callback. > So checking for "ops->set_emul_temp" should be still valid. > >>> return data->ops->set_emul_temp(data->sensor_data, temp); >>> } >>> >>> @@ -194,7 +197,7 @@ static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip, >>> { >>> struct __thermal_zone *data = tz->devdata; >>> >>> - if (!data->ops->get_trend) >>> + if (!data->ops || !data->ops->get_trend) >>> return -EINVAL; >> Same comment as (1) > of_thermal_get_trend() is trying to call "data->ops->get_trend" which > comes from a sensor driver when it registers. From what I can see, > there are lot of drivers which don't pass "get_trend" in their ops. > So there is no guarantee that it would be non-NULL right? > > Thinking of which, I think having both checks would be valid. > >>> return data->ops->get_trend(data->sensor_data, trip, trend); >>> @@ -301,7 +304,7 @@ static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, >>> if (trip >= data->ntrips || trip < 0) >>> return -EDOM; >>> >>> - if (data->ops->set_trip_temp) { >>> + if (data->ops && data->ops->set_trip_temp) { >> Same comment as (2) > Without a sensor driver probed, "tz->ops" would not have a valid pointer right? > So, I think checking for "data->ops" should be good. Another possibility is, a > sensor device might not have "set_trip_temp" callback but just "set_trips". > > So checking for "data->ops->set_trip_temp" might be still valid. > >>> int ret; >>> >>> ret = data->ops->set_trip_temp(data->sensor_data, trip, temp); >>>
On 06/10/2021 00:09, Subbaraman Narayanamurthy wrote: [ ... ] >>> The tests can be replaced by (!data->ops), no ? >> Thanks Daniel for reviewing the patch. >> >> I agree that even if a sensor module is unregistered, that would call >> "thermal_zone_of_sensor_unregister" which would eventually set NULL on >> get_temp() and get_trend() and "tzd->ops" as well. >> >> However, of_thermal_get_temp() is trying to call "data->ops->get_temp" >> which comes from a sensor driver when it registers. There is no >> guarantee that it would be non-NULL right? >> >> Thinking of which, I think having both checks would be valid. > > Hi Daniel, > Do you still have any concerns with this change? Yes, let me answer to the initial patch. -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c index 6379f26..9233f7e 100644 --- a/drivers/thermal/thermal_of.c +++ b/drivers/thermal/thermal_of.c @@ -89,7 +89,7 @@ static int of_thermal_get_temp(struct thermal_zone_device *tz, { struct __thermal_zone *data = tz->devdata; - if (!data->ops->get_temp) + if (!data->ops || !data->ops->get_temp) return -EINVAL; return data->ops->get_temp(data->sensor_data, temp); @@ -186,6 +186,9 @@ static int of_thermal_set_emul_temp(struct thermal_zone_device *tz, { struct __thermal_zone *data = tz->devdata; + if (!data->ops || !data->ops->set_emul_temp) + return -EINVAL; + return data->ops->set_emul_temp(data->sensor_data, temp); } @@ -194,7 +197,7 @@ static int of_thermal_get_trend(struct thermal_zone_device *tz, int trip, { struct __thermal_zone *data = tz->devdata; - if (!data->ops->get_trend) + if (!data->ops || !data->ops->get_trend) return -EINVAL; return data->ops->get_trend(data->sensor_data, trip, trend); @@ -301,7 +304,7 @@ static int of_thermal_set_trip_temp(struct thermal_zone_device *tz, int trip, if (trip >= data->ntrips || trip < 0) return -EDOM; - if (data->ops->set_trip_temp) { + if (data->ops && data->ops->set_trip_temp) { int ret; ret = data->ops->set_trip_temp(data->sensor_data, trip, temp);
of_parse_thermal_zones() parses the thermal-zones node and registers a thermal_zone device for each subnode. However, if a thermal zone is consuming a thermal sensor and that thermal sensor device hasn't probed yet, an attempt to set trip_point_*_temp for that thermal zone device can cause a NULL pointer dereference. Fix it. console:/sys/class/thermal/thermal_zone87 # echo 120000 > trip_point_0_temp ... Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 ... Call trace: of_thermal_set_trip_temp+0x40/0xc4 trip_point_temp_store+0xc0/0x1dc dev_attr_store+0x38/0x88 sysfs_kf_write+0x64/0xc0 kernfs_fop_write_iter+0x108/0x1d0 vfs_write+0x2f4/0x368 ksys_write+0x7c/0xec __arm64_sys_write+0x20/0x30 el0_svc_common.llvm.7279915941325364641+0xbc/0x1bc do_el0_svc+0x28/0xa0 el0_svc+0x14/0x24 el0_sync_handler+0x88/0xec el0_sync+0x1c0/0x200 While at it, fix the possible NULL pointer dereference in other functions as well: of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend(). Cc: stable@vger.kernel.org Suggested-by: David Collins <quic_collinsd@quicinc.com> Signed-off-by: Subbaraman Narayanamurthy <quic_subbaram@quicinc.com> --- Changes for v2: - Added checks in of_thermal_get_temp(), of_thermal_set_emul_temp(), of_thermal_get_trend(). drivers/thermal/thermal_of.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)