Message ID | 20250113-mt8192-lvts-filtered-suspend-fix-v2-0-07a25200c7c6@collabora.com |
---|---|
Headers | show |
Series | thermal/drivers/mediatek/lvts: Fixes for suspend and IRQ storm, and cleanups | expand |
On 13/01/2025 14:27, Nícolas F. R. A. Prado wrote: > In order to get working interrupts, a low offset value needs to be > configured. The minimum value for it is 20 Celsius, which is what is > configured when there's no lower thermal trip (ie the thermal core > passes -INT_MAX as low trip temperature). However, when the temperature > gets that low and fluctuates around that value it causes an interrupt > storm. Is it really about an irq storm or about having a temperature threshold set close to the ambiant temperature. So leading to unnecessary wakeups as there is need for mitigation ? > Prevent that interrupt storm by not enabling the low offset interrupt if > the low threshold is the minimum one. The case where the high threshold is the INT_MAX should be handled too. The system may have configured a thermal zone without critical trip points, so setting the next upper threshold will program the register with INT_MAX. I guess it is an undefined behavior in this case, right ? > Cc: stable@vger.kernel.org [ ... ]
Patches 1 and 2 of this series fix the issue reported by Hsin-Te Yuan [1] where MT8192-based Chromebooks are not able to suspend/resume 10 times in a row. Either one of those patches on its own is enough to fix the issue, but I believe both are desirable, so I've included them both here. Patches 3-5 fix unrelated issues that I've noticed while debugging. Patch 3 fixes IRQ storms when the temperature sensors drop to 20 Celsius. Patches 4 and 5 are cleanups to prevent future issues. To test this series, I've run 'rtcwake -m mem -d 60' 10 times in a row on a MT8192-Asurada-Spherion-rev3 Chromebook and checked that the wakeup happened 60 seconds later (+-5 seconds). I've repeated that test on 10 separate runs. Not once did the chromebook wake up early with the series applied. I've also checked that during those runs, the LVTS interrupt didn't trigger even once, while before the series it would trigger a few times per run, generally during boot or resume. Finally, as a sanity check I've verified that the interrupts still work by lowering the thermal trip point to 45 Celsius and running 'stress -c 8'. Indeed they still do, and the temperature showed by the thermal_temperature ftrace event matched the expected value. [1] https://lore.kernel.org/all/20241108-lvts-v1-1-eee339c6ca20@chromium.org/ Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> --- Changes in v2: - Renamed bitmasks for interrupt enable (added "INTEN" to the name) - Made read-only arrays static const - Changed sensor_filt_bitmap array from u32 to u8 to save memory - Rebased on next-20241209 - Link to v1: https://lore.kernel.org/r/20241125-mt8192-lvts-filtered-suspend-fix-v1-0-42e3c0528c6c@collabora.com --- Nícolas F. R. A. Prado (5): thermal/drivers/mediatek/lvts: Disable monitor mode during suspend thermal/drivers/mediatek/lvts: Disable Stage 3 thermal threshold thermal/drivers/mediatek/lvts: Disable low offset IRQ for minimum threshold thermal/drivers/mediatek/lvts: Start sensor interrupts disabled thermal/drivers/mediatek/lvts: Only update IRQ enable for valid sensors drivers/thermal/mediatek/lvts_thermal.c | 103 ++++++++++++++++++++++---------- 1 file changed, 72 insertions(+), 31 deletions(-) --- base-commit: d1486dca38afd08ca279ae94eb3a397f10737824 change-id: 20241121-mt8192-lvts-filtered-suspend-fix-a5032ca8eceb Best regards,