Message ID | 20230811-topic-7280_lmhirq-v1-1-c262b6a25c8f@linaro.org |
---|---|
State | Accepted |
Commit | 3f93d119c9d6e1744d55cd48af764160a1a3aca3 |
Series | [RFT] arm64: dts: qcom: sc7280: Add missing LMH interrupts |
Hi,

On Fri, Aug 11, 2023 at 1:58 PM Konrad Dybcio <konrad.dybcio@linaro.org> wrote:
>
> Hook up the interrupts that signal the Limits Management Hardware has
> started some sort of throttling action.
>
> Fixes: 7dbd121a2c58 ("arm64: dts: qcom: sc7280: Add cpufreq hw node")
> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
> ---
> test case:
>
> - hammer the CPUs (like compile the Linux kernel)
> - watch -n1 "cat /proc/interrupts | grep dcvsh"
> - the numbers go up up up up -> good

I'm not doing much on sc7280 these days, but I did try putting your
patch on a sc7280-hoglin (AKA a CRD). I tried to stress the system out
a bunch (ran 8 instances of "while true; do true; done" and opened
something to activate the GPU). I didn't see any LMH interrupts fire.
Of course, with ChromeOS firmware LMH is _supposed_ to be mostly
disabled, so maybe that's right? Our policy was always to have Linux
do as much of the throttling as possible and only use LMH as a last
resort.

I assume I don't need any specific config option turned on?

I know that on other Qualcomm boards I see LMH nodes in the device
tree, which we don't have in sc7280. Like "qcom,sdm845-lmh". Is that
important? I haven't been following what's been going on with LMH in
Linux since we try not to use it.

For giggles, I also tried putting the patch on a sc7280-villager
device to see if it had different thermals. I even put my jacket over
it to try to keep it warm. I saw the sensors go up to 109C on the
medium cores and still no LMH interrupts. Oh, and then the device shut
itself down. I guess something about thermal throttling in Linux must
be disabled but then it still handles the critical state? :( That's
concerning...

I put the same kernel on a trogdor device and that did normal Linux
throttling OK. So something is definitely wonky with sc7280... I dug
enough to find that if I used "step_wise" instead of "power_allocator"
that it works OK, so I guess something is wonky about the config of
power_allocator on sc7280. In any case, it's not affected by your
patch and I've already probably spent too much time on it. :-P

-Doug
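The "step_wise" versus "power_allocator" comparison above can be reproduced through the thermal sysfs interface. What follows is only a minimal sketch, not something taken from the thread: it assumes the standard `/sys/class/thermal/thermal_zone*/` attributes (`type`, `policy`, `available_policies`, `temp`), and the helper names `read_thermal_zones` and `set_policy` are hypothetical. Writing `policy` needs root, and which zones exist depends on the board.

```python
from pathlib import Path


def _read(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def read_thermal_zones(root="/sys/class/thermal"):
    """Map each thermal zone directory name to (type, policy, temp in millidegrees C)."""
    zones = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        temp = _read(zone / "temp")
        zones[zone.name] = (
            _read(zone / "type"),
            _read(zone / "policy"),
            int(temp) if temp else None,
        )
    return zones


def set_policy(zone_name, governor, root="/sys/class/thermal"):
    """Switch one zone's governor after checking it is listed as available."""
    zone = Path(root) / zone_name
    available = (zone / "available_policies").read_text().split()
    if governor not in available:
        raise ValueError(f"{governor!r} not in {available} for {zone_name}")
    (zone / "policy").write_text(governor)
```

Printing `read_thermal_zones()` before and after a stress run shows which governor each zone uses and how hot it got; `set_policy("thermal_zone0", "step_wise")` (zone name chosen only for illustration) switches a single zone for an A/B comparison.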
On 25.08.2023 22:17, Doug Anderson wrote:
> Hi,
>
> On Fri, Aug 11, 2023 at 1:58 PM Konrad Dybcio <konrad.dybcio@linaro.org> wrote:
>>
>> Hook up the interrupts that signal the Limits Management Hardware has
>> started some sort of throttling action.
>>
>> Fixes: 7dbd121a2c58 ("arm64: dts: qcom: sc7280: Add cpufreq hw node")
>> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
>> ---
>> test case:
>>
>> - hammer the CPUs (like compile the Linux kernel)
>> - watch -n1 "cat /proc/interrupts | grep dcvsh"
>> - the numbers go up up up up -> good
>
> I'm not doing much on sc7280 these days,

:( I'm really sad it got the boot

> but I did try putting your
> patch on a sc7280-hoglin (AKA a CRD). I tried to stress the system out
> a bunch (ran 8 instances of "while true; do true; done" and opened
> something to activate the GPU). I didn't see any LMH interrupts fire.
> Of course, with ChromeOS firmware LMH is _supposed_ to be mostly
> disabled, so maybe that's right? Our policy was always to have Linux
> do as much of the throttling as possible and only use LMH as a last
> resort.
>
> I assume I don't need any specific config option turned on?
>
> I know that on other Qualcomm boards I see LMH nodes in the device
> tree, which we don't have in sc7280. Like "qcom,sdm845-lmh". Is that
> important? I haven't been following what's been going on with LMH in
> Linux since we try not to use it.

It used to be important, but on newer socs it's preconfigured in fw

>
> For giggles, I also tried putting the patch on a sc7280-villager
> device to see if it had different thermals. I even put my jacket over
> it to try to keep it warm. I saw the sensors go up to 109C on the
> medium cores and still no LMH interrupts. Oh, and then the device shut
> itself down. I guess something about thermal throttling in Linux must
> be disabled but then it still handles the critical state? :( That's
> concerning...
>
> I put the same kernel on a trogdor device and that did normal Linux
> throttling OK. So something is definitely wonky with sc7280... I dug
> enough to find that if I used "step_wise" instead of "power_allocator"
> that it works OK, so I guess something is wonky about the config of
> power_allocator on sc7280. In any case, it's not affected by your
> patch and I've already probably spent too much time on it. :-P

Hm, perhaps it would be worth to try this patch on a non-chrome 7280
device.. Would you guys have standard android-y or windows-y firmware
that you could flash on these to try out, or should I try poking
somebody else?

Konrad
Hi,

On Fri, Aug 25, 2023 at 2:07 PM Konrad Dybcio <konrad.dybcio@linaro.org> wrote:
>
> > I put the same kernel on a trogdor device and that did normal Linux
> > throttling OK. So something is definitely wonky with sc7280... I dug
> > enough to find that if I used "step_wise" instead of "power_allocator"
> > that it works OK, so I guess something is wonky about the config of
> > power_allocator on sc7280. In any case, it's not affected by your
> > patch and I've already probably spent too much time on it. :-P
> Hm, perhaps it would be worth to try this patch on a non-chrome 7280
> device.. Would you guys have standard android-y or windows-y firmware
> that you could flash on these to try out, or should I try poking
> somebody else?

I don't have hardware that runs anything other than the standard
ChromeOS bootloader, sorry!

-Doug
diff --git a/arch/arm64/boot/dts/qcom/sc7280.dtsi b/arch/arm64/boot/dts/qcom/sc7280.dtsi
index 925428a5f6ae..76ed32c8d6f7 100644
--- a/arch/arm64/boot/dts/qcom/sc7280.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc7280.dtsi
@@ -5363,6 +5363,14 @@ cpufreq_hw: cpufreq@18591000 {
 			reg = <0 0x18591000 0 0x1000>,
 			      <0 0x18592000 0 0x1000>,
 			      <0 0x18593000 0 0x1000>;
+
+			interrupts = <GIC_SPI 30 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 31 IRQ_TYPE_LEVEL_HIGH>,
+				     <GIC_SPI 19 IRQ_TYPE_LEVEL_HIGH>;
+			interrupt-names = "dcvsh-irq-0",
+					  "dcvsh-irq-1",
+					  "dcvsh-irq-2";
+
 			clocks = <&rpmhcc RPMH_CXO_CLK>, <&gcc GCC_GPLL0>;
 			clock-names = "xo", "alternate";
 			#freq-domain-cells = <1>;
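When matching the `interrupts` entries in the hunk above against `/proc/interrupts`, it may help to remember that the Arm GIC numbers shared peripheral interrupts (SPIs) starting at interrupt ID 32, so a `GIC_SPI n` specifier corresponds to hardware interrupt `n + 32`. The short sketch below only does that arithmetic for the three SPIs in the patch; the dictionary and function names are hypothetical, and the label the kernel prints for each line is chosen by the driver, so it is not guaranteed to equal the `interrupt-names` strings.

```python
# SPIs begin at interrupt ID 32 in the Arm GIC architecture
# (IDs 0-15 are SGIs, 16-31 are PPIs).
GIC_SPI_BASE = 32

# SPI numbers copied from the "interrupts" property in the patch,
# keyed by the matching "interrupt-names" entry.
DCVSH_SPIS = {
    "dcvsh-irq-0": 30,
    "dcvsh-irq-1": 31,
    "dcvsh-irq-2": 19,
}


def spi_to_hwirq(spi):
    """Convert a GIC_SPI specifier number to the GIC hardware interrupt ID."""
    return spi + GIC_SPI_BASE


def dcvsh_hwirqs():
    """Return the hardware interrupt ID for each LMH interrupt in the patch."""
    return {name: spi_to_hwirq(spi) for name, spi in DCVSH_SPIS.items()}
```

Under that mapping the three lines would be expected at hardware interrupt IDs 62, 63 and 51.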
Hook up the interrupts that signal the Limits Management Hardware has
started some sort of throttling action.

Fixes: 7dbd121a2c58 ("arm64: dts: qcom: sc7280: Add cpufreq hw node")
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
---
test case:

- hammer the CPUs (like compile the Linux kernel)
- watch -n1 "cat /proc/interrupts | grep dcvsh"
- the numbers go up up up up -> good
---
 arch/arm64/boot/dts/qcom/sc7280.dtsi | 8 ++++++++
 1 file changed, 8 insertions(+)

---
base-commit: 21ef7b1e17d039053edaeaf41142423810572741
change-id: 20230811-topic-7280_lmhirq-e07ad2530387

Best regards,
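The test case in the patch notes (watch the `dcvsh` rows of `/proc/interrupts` while loading the CPUs) can also be scripted, which makes "the numbers go up" easy to check automatically. This is a rough sketch rather than part of the patch: the function names are hypothetical, and it assumes the usual `/proc/interrupts` layout of a `CPUn` header row followed by rows of per-CPU counts that end in the interrupt's label.

```python
def dcvsh_counts(interrupts_text, needle="dcvsh"):
    """Sum the per-CPU counts of every /proc/interrupts row containing needle.

    Returns {label: total}, where label is the last field of the row.
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row lists CPU0, CPU1, ...
    totals = {}
    for line in lines[1:]:
        if needle not in line:
            continue
        _, _, rest = line.partition(":")
        fields = rest.split()
        if not fields:
            continue
        counts = [int(f) for f in fields[:ncpus] if f.isdigit()]
        totals[fields[-1]] = sum(counts)
    return totals


def increased(before, after):
    """Return the labels whose total went up between two snapshots."""
    return sorted(k for k, v in after.items() if v > before.get(k, 0))


def snapshot(path="/proc/interrupts"):
    """Read the live interrupt table and return the dcvsh totals."""
    with open(path) as f:
        return dcvsh_counts(f.read())
```

Taking `snapshot()` before and after a stress run and passing both to `increased()` lists the LMH interrupts that fired. An empty list matches what Doug reported on ChromeOS firmware.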