mbox series

[V2,00/13] OPP: Add support for multiple clocks*

Message ID cover.1657003420.git.viresh.kumar@linaro.org
Headers show
Series OPP: Add support for multiple clocks* | expand

Message

Viresh Kumar July 5, 2022, 7 a.m. UTC
Hello,

This patchset adds support for devices with multiple clocks. None of the clocks
is considered primary in this case and all are handled equally.

The drivers, for multiple clock case, are expected to call dev_pm_opp_set_opp()
to set the specific OPP. Though how they find the target OPP is left for the
users to handle. For some, we may have another unique OPP property, like level,
which can be used to find the OPP. While in case of others, we may want to
implement freq-based OPP finder APIs for multiple clock rates. I have decided
not to implement them in advance, and add them only someone wants to use them.

This is rebased over a lot of other OPP changes and is pushed here:

git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git opp/linux-next

V1->V2:

- Fix broken git bisect for:
  OPP: Reuse _opp_compare_key() in _opp_add_static_v2()

- Include binding changes written by Krzysztof earlier.

- Check config_clks before calling it, it isn't always set.

- Add config_clks for Tegra30's devfreq to handle its corner case.

- _opp_compare_key() supports multi-clk case now, earlier it skipped freq
  comparison for such a case.

- New patch to compare all bandwidth values as well in _opp_compare_key().

- New patch to remove *_noclk() interface.

- Various other minor fixes.

--
Viresh

Krzysztof Kozlowski (1):
  dt-bindings: opp: accept array of frequencies

Viresh Kumar (12):
  OPP: Use consistent names for OPP table instances
  OPP: Remove rate_not_available parameter to _opp_add()
  OPP: Reuse _opp_compare_key() in _opp_add_static_v2()
  OPP: Make dev_pm_opp_set_opp() independent of frequency
  OPP: Allow multiple clocks for a device
  OPP: Compare bandwidths for all paths in _opp_compare_key()
  OPP: Add key specific assert() method to key finding helpers
  OPP: Assert clk_count == 1 for single clk helpers
  OPP: Provide a simple implementation to configure multiple clocks
  OPP: Allow config_clks helper for single clk case
  PM / devfreq: tegra30: Register config_clks helper
  OPP: Remove dev{m}_pm_opp_of_add_table_noclk()

 .../devicetree/bindings/opp/opp-v2-base.yaml  |  10 +
 drivers/devfreq/tegra30-devfreq.c             |  22 +-
 drivers/opp/core.c                            | 404 +++++++++++++-----
 drivers/opp/cpu.c                             |  12 +-
 drivers/opp/debugfs.c                         |  27 +-
 drivers/opp/of.c                              | 139 +++---
 drivers/opp/opp.h                             |  24 +-
 include/linux/pm_opp.h                        |  29 +-
 8 files changed, 466 insertions(+), 201 deletions(-)

Comments

Viresh Kumar July 6, 2022, 6:39 a.m. UTC | #1
On 05-07-22, 19:21, Krzysztof Kozlowski wrote:
> Where is the safety problem with multiple-clocks case?

All these APIs, which I have added the assert to now, are designed
with a single clk/freq in mind. They simply take a clock frequency
value as input and do something based on it. It only works fine with
single clk case, as more information is required for multiple clock
case, a freq value isn't sufficient here. In order to avoid abuse or
accidental use of the wrong API, these WARNs were required.

> And how users of PM OPP API are supposed to iterate/find OPPs they
> want if the API now throws WARN?

We need to provide new APIs for that, as I mentioned in the cover
letter and the other mail I sent you yesterday.

Specifically, if we want to have OPP finder API based on freq value,
then it needs to either have index + freq as input, or an array of
frequencies (one for each clk) as input.

Though I am not sure what you would need at the moment, as you can
also use opp-level for OPPs as they match UFS gears, that was my
understanding from earlier discussion.
Dmitry Osipenko July 7, 2022, 7:43 p.m. UTC | #2
On 7/5/22 10:00, Viresh Kumar wrote:
> Hello,
> 
> This patchset adds support for devices with multiple clocks. None of the clocks
> is considered primary in this case and all are handled equally.
> 
> The drivers, for multiple clock case, are expected to call dev_pm_opp_set_opp()
> to set the specific OPP. Though how they find the target OPP is left for the
> users to handle. For some, we may have another unique OPP property, like level,
> which can be used to find the OPP. While in case of others, we may want to
> implement freq-based OPP finder APIs for multiple clock rates. I have decided
> not to implement them in advance, and add them only someone wants to use them.
> 
> This is rebased over a lot of other OPP changes and is pushed here:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git opp/linux-next
> 
> V1->V2:
> 
> - Fix broken git bisect for:
>   OPP: Reuse _opp_compare_key() in _opp_add_static_v2()
> 
> - Include binding changes written by Krzysztof earlier.
> 
> - Check config_clks before calling it, it isn't always set.
> 
> - Add config_clks for Tegra30's devfreq to handle its corner case.
> 
> - _opp_compare_key() supports multi-clk case now, earlier it skipped freq
>   comparison for such a case.
> 
> - New patch to compare all bandwidth values as well in _opp_compare_key().
> 
> - New patch to remove *_noclk() interface.
> 
> - Various other minor fixes.
> 
> --
> Viresh
> 
> Krzysztof Kozlowski (1):
>   dt-bindings: opp: accept array of frequencies
> 
> Viresh Kumar (12):
>   OPP: Use consistent names for OPP table instances
>   OPP: Remove rate_not_available parameter to _opp_add()
>   OPP: Reuse _opp_compare_key() in _opp_add_static_v2()
>   OPP: Make dev_pm_opp_set_opp() independent of frequency
>   OPP: Allow multiple clocks for a device
>   OPP: Compare bandwidths for all paths in _opp_compare_key()
>   OPP: Add key specific assert() method to key finding helpers
>   OPP: Assert clk_count == 1 for single clk helpers
>   OPP: Provide a simple implementation to configure multiple clocks
>   OPP: Allow config_clks helper for single clk case
>   PM / devfreq: tegra30: Register config_clks helper

Hello Viresh,

This patch breaks Tegra again, please take a look:

   OPP: Remove dev{m}_pm_opp_of_add_table_noclk()

 8<--- cut here ---
 Unable to handle kernel paging request at virtual address ffffffff
 [ffffffff] *pgd=9effd861, *pte=00000000, *ppte=00000000
 Internal error: Oops: 37 [#1] PREEMPT SMP ARM
 Modules linked in:
 CPU: 3 PID: 8 Comm: kworker/u8:0 Not tainted
5.19.0-rc1-00040-g30b62d123f4f #82
 Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
 Workqueue: events_unbound deferred_probe_work_func
 PC is at _opp_compare_key+0x40/0xc4
 LR is at 0xfffffffb
 pc : [<c0b91b54>]    lr : [<fffffffb>]    psr: 20000113
 sp : df831b08  ip : c33cd4d0  fp : df831b24
 r10: c2586078  r9 : c258606c  r8 : 00000000
 r7 : 00000000  r6 : 00000001  r5 : c33cd480  r4 : c2586000
 r3 : 00000000  r2 : c33cd480  r1 : c258606c  r0 : c2586000
 Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
 Control: 10c5387d  Table: 8000404a  DAC: 00000051
...
 Backtrace:
  _opp_compare_key from _set_opp+0x80/0x408
  r7:00000000 r6:c27c0010 r5:c33cd480 r4:c2586000
  _set_opp from dev_pm_opp_set_opp+0x74/0xdc
  r10:00000001 r9:000f4240 r8:c33b6840 r7:c33b6840 r6:c33cd480 r5:c27c0010
  r4:c2586000
  dev_pm_opp_set_opp from tegra_pmc_core_pd_set_performance_state+0x54/0xbc
  r6:c33b6840 r5:c1a21760 r4:c33cd480
  tegra_pmc_core_pd_set_performance_state from
_genpd_set_performance_state+0x1f0/0x280
  r6:c33b6b58 r5:c33b6b48 r4:000f4240
  _genpd_set_performance_state from _genpd_set_performance_state+0xb4/0x280
  r10:00000001 r9:000f4240 r8:c33b9800 r7:c33b6840 r6:c33b9b18 r5:c33d0040
  r4:000f4240
  _genpd_set_performance_state from genpd_set_performance_state+0xb8/0xd4
  r10:c33b9a98 r9:c33d0400 r8:00000000 r7:00000000 r6:c33b9800 r5:00000000
  r4:c33d0400
  genpd_set_performance_state from genpd_runtime_resume+0x22c/0x240
  r5:00000000 r4:c27c0810
  genpd_runtime_resume from __rpm_callback+0x4c/0x1ac
  r10:c27ba8bc r9:00000000 r8:c27c0960 r7:c27c08cc r6:c27ba800 r5:c09611b8
  r4:c27c0810
  __rpm_callback from rpm_callback+0x60/0x64
  r9:df831ce4 r8:c27c0960 r7:00000004 r6:c27ba800 r5:c09611b8 r4:c27c0810
  rpm_callback from rpm_resume+0x480/0x7e0
  r7:00000004 r6:c27ba800 r5:c09611b8 r4:c27c0810
  rpm_resume from __pm_runtime_resume+0x58/0xb0
  r10:00000000 r9:c2587194 r8:c2587210 r7:c27c0810 r6:c27c08cc r5:60000113
  r4:c27c0810
  __pm_runtime_resume from host1x_probe+0x3d4/0x6d4
  r7:c27c0810 r6:c27c0800 r5:00000000 r4:c2587040
  host1x_probe from platform_probe+0x6c/0xc0
  r10:c191438c r9:c1aa8e20 r8:0000000d r7:c27c0810 r6:c1a2d354 r5:c27c0810
  r4:00000000
  platform_probe from really_probe.part.0+0xac/0x2c0
  r7:c27c0810 r6:c1a2d354 r5:c27c0810 r4:00000000
  really_probe.part.0 from __driver_probe_device+0xb8/0x14c
  r7:c27c0810 r6:c1a2d354 r5:00000000 r4:c27c0810
  __driver_probe_device from driver_probe_device+0x44/0x11c
  r7:c27c0810 r6:c27c0810 r5:c2137c58 r4:c2137c54
  driver_probe_device from __device_attach_driver+0xc8/0x10c
  r9:c1aa8e20 r8:c242c000 r7:00000000 r6:c27c0810 r5:df831e6c r4:c1a2d354
  __device_attach_driver from bus_for_each_drv+0x90/0xdc
  r7:00000000 r6:c09440f8 r5:df831e6c r4:00000000
  bus_for_each_drv from __device_attach+0xbc/0x1d4
  r6:c27c0854 r5:00000001 r4:c27c0810
  __device_attach from device_initial_probe+0x1c/0x20
  r6:c1a30df8 r5:c27c0810 r4:c27c0810
  device_initial_probe from bus_probe_device+0x98/0xa0
  bus_probe_device from deferred_probe_work_func+0x8c/0xbc
  r7:00000000 r6:c1a309e8 r5:c1a3099c r4:c27c0810
  deferred_probe_work_func from process_one_work+0x2b8/0x774
  r7:c25c8000 r6:c2407000 r5:c2557480 r4:c1a309f8
  process_one_work from worker_thread+0x17c/0x56c
  r10:00000088 r9:c25c8000 r8:c1905d40 r7:c240703c r6:c2557498 r5:c2407000
  r4:c2557480
  worker_thread from kthread+0x108/0x13c
  r10:00000000 r9:df815e2c r8:c2557500 r7:c2557480 r6:c0151924 r5:c25c8000
  r4:c2554540
  kthread from ret_from_fork+0x14/0x28
 Exception stack(0xdf831fb0 to 0xdf831ff8)
 1fa0:                                     00000000 00000000 00000000
00000000
 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
  r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c015ba68 r4:c2554540
 Code: e24cc004 ea000001 e1530006 0a000007 (e5be5004)
 ---[ end trace 0000000000000000 ]---
Viresh Kumar July 8, 2022, 7:19 a.m. UTC | #3
On 07-07-22, 22:43, Dmitry Osipenko wrote:
> This patch breaks Tegra again, please take a look:

Damn, not again :(

>    OPP: Remove dev{m}_pm_opp_of_add_table_noclk()

Why did you mention this patch ? This just removed an unused API,
Tegra should have broke because of something else, isn't it ?

>  8<--- cut here ---
>  Unable to handle kernel paging request at virtual address ffffffff
>  [ffffffff] *pgd=9effd861, *pte=00000000, *ppte=00000000
>  Internal error: Oops: 37 [#1] PREEMPT SMP ARM
>  Modules linked in:
>  CPU: 3 PID: 8 Comm: kworker/u8:0 Not tainted
> 5.19.0-rc1-00040-g30b62d123f4f #82
>  Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>  Workqueue: events_unbound deferred_probe_work_func
>  PC is at _opp_compare_key+0x40/0xc4
>  LR is at 0xfffffffb

How is LR set to such an address ?

>  pc : [<c0b91b54>]    lr : [<fffffffb>]    psr: 20000113
>  sp : df831b08  ip : c33cd4d0  fp : df831b24
>  r10: c2586078  r9 : c258606c  r8 : 00000000
>  r7 : 00000000  r6 : 00000001  r5 : c33cd480  r4 : c2586000
>  r3 : 00000000  r2 : c33cd480  r1 : c258606c  r0 : c2586000
>  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>  Control: 10c5387d  Table: 8000404a  DAC: 00000051
> ...
>  Backtrace:
>   _opp_compare_key from _set_opp+0x80/0x408

Whatever happened, happened from _opp_compare_key() and I tried to
look at it many times, couldn't figure out what's wrong there.

For the device in question, pmc I think, we don't have any "opp-hz"
property in the DT, but still the OPP core will fetch its clock and
set clk_count to 1. But this was working earlier too, we were
comparing the rate anyways. I think one of _opp_compare_rate() or
_opp_compare_bw() is broken here, but I just couldn't figure out. The
rate one should run one loop and bw one should just return. I don't
see why a crash should come out eventually.

Can you help debug this a bit ? Also see what are the values of
opp_table->path_count and opp_table->clk_count, should be 0 and 1
AFAICT.

Sorry about this Dmitry, I think we are all settled and again went
into crap.
Dmitry Osipenko July 8, 2022, 7:26 a.m. UTC | #4
On 7/8/22 10:19, Viresh Kumar wrote:
> On 07-07-22, 22:43, Dmitry Osipenko wrote:
>> This patch breaks Tegra again, please take a look:
> 
> Damn, not again :(
> 
>>    OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
> 
> Why did you mention this patch ? This just removed an unused API,
> Tegra should have broke because of something else, isn't it ?

This patch is the cause.

>>  8<--- cut here ---
>>  Unable to handle kernel paging request at virtual address ffffffff
>>  [ffffffff] *pgd=9effd861, *pte=00000000, *ppte=00000000
>>  Internal error: Oops: 37 [#1] PREEMPT SMP ARM
>>  Modules linked in:
>>  CPU: 3 PID: 8 Comm: kworker/u8:0 Not tainted
>> 5.19.0-rc1-00040-g30b62d123f4f #82
>>  Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>>  Workqueue: events_unbound deferred_probe_work_func
>>  PC is at _opp_compare_key+0x40/0xc4
>>  LR is at 0xfffffffb
> 
> How is LR set to such an address ?
> 
>>  pc : [<c0b91b54>]    lr : [<fffffffb>]    psr: 20000113
>>  sp : df831b08  ip : c33cd4d0  fp : df831b24
>>  r10: c2586078  r9 : c258606c  r8 : 00000000
>>  r7 : 00000000  r6 : 00000001  r5 : c33cd480  r4 : c2586000
>>  r3 : 00000000  r2 : c33cd480  r1 : c258606c  r0 : c2586000
>>  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>>  Control: 10c5387d  Table: 8000404a  DAC: 00000051
>> ...
>>  Backtrace:
>>   _opp_compare_key from _set_opp+0x80/0x408
> 
> Whatever happened, happened from _opp_compare_key() and I tried to
> look at it many times, couldn't figure out what's wrong there.
> 
> For the device in question, pmc I think, we don't have any "opp-hz"
> property in the DT, but still the OPP core will fetch its clock and
> set clk_count to 1. But this was working earlier too, we were
> comparing the rate anyways. I think one of _opp_compare_rate() or
> _opp_compare_bw() is broken here, but I just couldn't figure out. The
> rate one should run one loop and bw one should just return. I don't
> see why a crash should come out eventually.
> 
> Can you help debug this a bit ? Also see what are the values of
> opp_table->path_count and opp_table->clk_count, should be 0 and 1
> AFAICT.

I see that previously dev_pm_opp_set_config() had "_add_opp_table(dev,
false)", now it's "_add_opp_table(dev, true)".

Will take a closer look later on.

> Sorry about this Dmitry, I think we are all settled and again went
> into crap.

No problems :)
Dmitry Osipenko July 8, 2022, 7:30 a.m. UTC | #5
On 7/8/22 10:26, Dmitry Osipenko wrote:
> On 7/8/22 10:19, Viresh Kumar wrote:
>> On 07-07-22, 22:43, Dmitry Osipenko wrote:
>>> This patch breaks Tegra again, please take a look:
>>
>> Damn, not again :(
>>
>>>    OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
>>
>> Why did you mention this patch ? This just removed an unused API,
>> Tegra should have broke because of something else, isn't it ?
> 
> This patch is the cause.
> 
>>>  8<--- cut here ---
>>>  Unable to handle kernel paging request at virtual address ffffffff
>>>  [ffffffff] *pgd=9effd861, *pte=00000000, *ppte=00000000
>>>  Internal error: Oops: 37 [#1] PREEMPT SMP ARM
>>>  Modules linked in:
>>>  CPU: 3 PID: 8 Comm: kworker/u8:0 Not tainted
>>> 5.19.0-rc1-00040-g30b62d123f4f #82
>>>  Hardware name: NVIDIA Tegra SoC (Flattened Device Tree)
>>>  Workqueue: events_unbound deferred_probe_work_func
>>>  PC is at _opp_compare_key+0x40/0xc4
>>>  LR is at 0xfffffffb
>>
>> How is LR set to such an address ?
>>
>>>  pc : [<c0b91b54>]    lr : [<fffffffb>]    psr: 20000113
>>>  sp : df831b08  ip : c33cd4d0  fp : df831b24
>>>  r10: c2586078  r9 : c258606c  r8 : 00000000
>>>  r7 : 00000000  r6 : 00000001  r5 : c33cd480  r4 : c2586000
>>>  r3 : 00000000  r2 : c33cd480  r1 : c258606c  r0 : c2586000
>>>  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
>>>  Control: 10c5387d  Table: 8000404a  DAC: 00000051
>>> ...
>>>  Backtrace:
>>>   _opp_compare_key from _set_opp+0x80/0x408
>>
>> Whatever happened, happened from _opp_compare_key() and I tried to
>> look at it many times, couldn't figure out what's wrong there.
>>
>> For the device in question, pmc I think, we don't have any "opp-hz"
>> property in the DT, but still the OPP core will fetch its clock and
>> set clk_count to 1. But this was working earlier too, we were
>> comparing the rate anyways. I think one of _opp_compare_rate() or
>> _opp_compare_bw() is broken here, but I just couldn't figure out. The
>> rate one should run one loop and bw one should just return. I don't
>> see why a crash should come out eventually.
>>
>> Can you help debug this a bit ? Also see what are the values of
>> opp_table->path_count and opp_table->clk_count, should be 0 and 1
>> AFAICT.
> 
> I see that previously dev_pm_opp_set_config() had "_add_opp_table(dev,
> false)", now it's "_add_opp_table(dev, true)".
> 
> Will take a closer look later on.
> 
>> Sorry about this Dmitry, I think we are all settled and again went
>> into crap.
> 
> No problems :)

BTW, maybe we should consider to start adding kselftests for OPP, like
clk framework did. That will be handy to have given that it's not easy
to test the whole OPP core without having specific devices.
Viresh Kumar July 8, 2022, 8:12 a.m. UTC | #6
On 08-07-22, 10:26, Dmitry Osipenko wrote:
> On 7/8/22 10:19, Viresh Kumar wrote:
> > On 07-07-22, 22:43, Dmitry Osipenko wrote:
> >> This patch breaks Tegra again, please take a look:
> > 
> > Damn, not again :(
> > 
> >>    OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
> > 
> > Why did you mention this patch ? This just removed an unused API,
> > Tegra should have broke because of something else, isn't it ?
> 
> This patch is the cause.

I was tracking the crash too closely it seems. :(

> I see that previously dev_pm_opp_set_config() had "_add_opp_table(dev,
> false)", now it's "_add_opp_table(dev, true)".

That's definitely a mistake, I still don't understand though how it
can lead to the crash we got.

I have fixed this in my tree now, can you check again please.
Viresh Kumar July 8, 2022, 8:13 a.m. UTC | #7
On 08-07-22, 10:30, Dmitry Osipenko wrote:
> BTW, maybe we should consider to start adding kselftests for OPP, like
> clk framework did. That will be handy to have given that it's not easy
> to test the whole OPP core without having specific devices.

After being regularly bitten by such issues, I added some for cpufreq
earlier. Its time that I invest some time for OPP core too I think :)

I don't know though when I will be able to find time for that :(
Dmitry Osipenko July 8, 2022, 4:15 p.m. UTC | #8
On 7/8/22 11:12, Viresh Kumar wrote:
> On 08-07-22, 10:26, Dmitry Osipenko wrote:
>> On 7/8/22 10:19, Viresh Kumar wrote:
>>> On 07-07-22, 22:43, Dmitry Osipenko wrote:
>>>> This patch breaks Tegra again, please take a look:
>>>
>>> Damn, not again :(
>>>
>>>>    OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
>>>
>>> Why did you mention this patch ? This just removed an unused API,
>>> Tegra should have broke because of something else, isn't it ?
>>
>> This patch is the cause.
> 
> I was tracking the crash too closely it seems. :(
> 
>> I see that previously dev_pm_opp_set_config() had "_add_opp_table(dev,
>> false)", now it's "_add_opp_table(dev, true)".
> 
> That's definitely a mistake, I still don't understand though how it
> can lead to the crash we got.

I'll investigate it.

> I have fixed this in my tree now, can you check again please.
> 

Yours tree works, thank you.
Johan Hovold July 11, 2022, 4:40 p.m. UTC | #9
On Tue, Jul 05, 2022 at 12:30:03PM +0530, Viresh Kumar wrote:
> Hello,
> 
> This patchset adds support for devices with multiple clocks. None of the clocks
> is considered primary in this case and all are handled equally.
> 
> The drivers, for multiple clock case, are expected to call dev_pm_opp_set_opp()
> to set the specific OPP. Though how they find the target OPP is left for the
> users to handle. For some, we may have another unique OPP property, like level,
> which can be used to find the OPP. While in case of others, we may want to
> implement freq-based OPP finder APIs for multiple clock rates. I have decided
> not to implement them in advance, and add them only someone wants to use them.

This break OPP parsing on SC8280XP and hence cpufreq and other things:

[  +0.010890] cpu cpu0: _opp_add_static_v2: opp key field not found
[  +0.000019] cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -19
[  +0.000060] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 403200000, volt: 576000, enabled: 1
[  +0.000030] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 499200000, volt: 576000, enabled: 1
...

I just did a rebase on next-20220708 and hit this.

I've narrowed it down to _read_rate() now returning -ENODEV since
opp_table->clk_count is zero.

Similar to what was reported for tegra for v1:

	https://lore.kernel.org/all/58cc8e3c-74d4-e432-8502-299312a1f15e@collabora.com/

I don't have time to look at this any more today, but it would we nice
if you could unbreak linux-next.

Perhaps Bjorn or Mani can help with further details, but this doesn't
look like something that is specific to SC8280XP.

Johan
Viresh Kumar July 12, 2022, 7:52 a.m. UTC | #10
On 11-07-22, 18:40, Johan Hovold wrote:
> This break OPP parsing on SC8280XP and hence cpufreq and other things:
> 
> [  +0.010890] cpu cpu0: _opp_add_static_v2: opp key field not found
> [  +0.000019] cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -19
> [  +0.000060] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 403200000, volt: 576000, enabled: 1
> [  +0.000030] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 499200000, volt: 576000, enabled: 1
> ...
> 
> I just did a rebase on next-20220708 and hit this.
> 
> I've narrowed it down to _read_rate() now returning -ENODEV since
> opp_table->clk_count is zero.
> 
> Similar to what was reported for tegra for v1:
> 
> 	https://lore.kernel.org/all/58cc8e3c-74d4-e432-8502-299312a1f15e@collabora.com/
> 
> I don't have time to look at this any more today, but it would we nice
> if you could unbreak linux-next.
> 
> Perhaps Bjorn or Mani can help with further details, but this doesn't
> look like something that is specific to SC8280XP.

It is actually. This is yet another corner case, Tegra had one as
well.

I have tried to understand the Qcom code / setup to best of my
abilities, and the problem as per me is that qcom-cpufreq-hw doesn't
provide a clk to the OPP core, which breaks it after the new updates
to the OPP core. I believe following will solve it. Can someone please
try this ? I will then merge it with the right commit.

diff --git a/drivers/opp/core.c b/drivers/opp/core.c
index 666e1ebf91d1..4f4a285886fa 100644
--- a/drivers/opp/core.c
+++ b/drivers/opp/core.c
@@ -1384,6 +1384,20 @@ static struct opp_table *_update_opp_table_clk(struct device *dev,
        }

        if (ret == -ENOENT) {
+               /*
+                * There are few platforms which don't want the OPP core to
+                * manage device's clock settings. In such cases neither the
+                * platform provides the clks explicitly to us, nor the DT
+                * contains a valid clk entry. The OPP nodes in DT may still
+                * contain "opp-hz" property though, which we need to parse and
+                * allow the platform to find an OPP based on freq later on.
+                *
+                * This is a simple solution to take care of such corner cases,
+                * i.e. make the clk_count 1, which lets us allocate space for
+                * frequency in opp->rates and also parse the entries in DT.
+                */
+               opp_table->clk_count = 1;
+
                dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__, ret);
                return opp_table;
        }
Manivannan Sadhasivam July 12, 2022, 12:25 p.m. UTC | #11
On Tue, Jul 12, 2022 at 01:22:40PM +0530, Viresh Kumar wrote:
> On 11-07-22, 18:40, Johan Hovold wrote:
> > This break OPP parsing on SC8280XP and hence cpufreq and other things:
> > 
> > [  +0.010890] cpu cpu0: _opp_add_static_v2: opp key field not found
> > [  +0.000019] cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -19
> > [  +0.000060] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 403200000, volt: 576000, enabled: 1
> > [  +0.000030] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 499200000, volt: 576000, enabled: 1
> > ...
> > 
> > I just did a rebase on next-20220708 and hit this.
> > 
> > I've narrowed it down to _read_rate() now returning -ENODEV since
> > opp_table->clk_count is zero.
> > 
> > Similar to what was reported for tegra for v1:
> > 
> > 	https://lore.kernel.org/all/58cc8e3c-74d4-e432-8502-299312a1f15e@collabora.com/
> > 
> > I don't have time to look at this any more today, but it would we nice
> > if you could unbreak linux-next.
> > 
> > Perhaps Bjorn or Mani can help with further details, but this doesn't
> > look like something that is specific to SC8280XP.
> 
> It is actually. This is yet another corner case, Tegra had one as
> well.
> 
> I have tried to understand the Qcom code / setup to best of my
> abilities, and the problem as per me is that qcom-cpufreq-hw doesn't
> provide a clk to the OPP core, which breaks it after the new updates
> to the OPP core. I believe following will solve it. Can someone please
> try this ? I will then merge it with the right commit.
> 

I gave it a shot on Lenovo X13s based on SC8280 SoC and it fixes the OPP
issue. So you can add,

Tested-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>

Thanks,
Mani

> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 666e1ebf91d1..4f4a285886fa 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -1384,6 +1384,20 @@ static struct opp_table *_update_opp_table_clk(struct device *dev,
>         }
> 
>         if (ret == -ENOENT) {
> +               /*
> +                * There are few platforms which don't want the OPP core to
> +                * manage device's clock settings. In such cases neither the
> +                * platform provides the clks explicitly to us, nor the DT
> +                * contains a valid clk entry. The OPP nodes in DT may still
> +                * contain "opp-hz" property though, which we need to parse and
> +                * allow the platform to find an OPP based on freq later on.
> +                *
> +                * This is a simple solution to take care of such corner cases,
> +                * i.e. make the clk_count 1, which lets us allocate space for
> +                * frequency in opp->rates and also parse the entries in DT.
> +                */
> +               opp_table->clk_count = 1;
> +
>                 dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__, ret);
>                 return opp_table;
>         }
> 
> -- 
> viresh
Johan Hovold July 12, 2022, 2:29 p.m. UTC | #12
On Tue, Jul 12, 2022 at 01:22:40PM +0530, Viresh Kumar wrote:
> On 11-07-22, 18:40, Johan Hovold wrote:
> > This break OPP parsing on SC8280XP and hence cpufreq and other things:
> > 
> > [  +0.010890] cpu cpu0: _opp_add_static_v2: opp key field not found
> > [  +0.000019] cpu cpu0: _of_add_opp_table_v2: Failed to add OPP, -19
> > [  +0.000060] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 403200000, volt: 576000, enabled: 1
> > [  +0.000030] cpu cpu0: _opp_is_duplicate: duplicate OPPs detected. Existing: freq: 300000000, volt: 576000, enabled: 1. New: freq: 499200000, volt: 576000, enabled: 1
> > ...
> > 
> > I just did a rebase on next-20220708 and hit this.
> > 
> > I've narrowed it down to _read_rate() now returning -ENODEV since
> > opp_table->clk_count is zero.
> > 
> > Similar to what was reported for tegra for v1:
> > 
> > 	https://lore.kernel.org/all/58cc8e3c-74d4-e432-8502-299312a1f15e@collabora.com/
> > 
> > I don't have time to look at this any more today, but it would we nice
> > if you could unbreak linux-next.
> > 
> > Perhaps Bjorn or Mani can help with further details, but this doesn't
> > look like something that is specific to SC8280XP.
> 
> It is actually. This is yet another corner case, Tegra had one as
> well.

I literally meant that it does not appear to be SC8280XP specific. Bjorn
reported seeing similar problems on multiple Qualcomm SoCs.

> I have tried to understand the Qcom code / setup to best of my
> abilities, and the problem as per me is that qcom-cpufreq-hw doesn't
> provide a clk to the OPP core, which breaks it after the new updates
> to the OPP core. I believe following will solve it. Can someone please
> try this ? I will then merge it with the right commit.
> 
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 666e1ebf91d1..4f4a285886fa 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -1384,6 +1384,20 @@ static struct opp_table *_update_opp_table_clk(struct device *dev,
>         }
> 
>         if (ret == -ENOENT) {
> +               /*
> +                * There are few platforms which don't want the OPP core to
> +                * manage device's clock settings. In such cases neither the
> +                * platform provides the clks explicitly to us, nor the DT
> +                * contains a valid clk entry. The OPP nodes in DT may still
> +                * contain "opp-hz" property though, which we need to parse and
> +                * allow the platform to find an OPP based on freq later on.
> +                *
> +                * This is a simple solution to take care of such corner cases,
> +                * i.e. make the clk_count 1, which lets us allocate space for
> +                * frequency in opp->rates and also parse the entries in DT.
> +                */
> +               opp_table->clk_count = 1;
> +
>                 dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__, ret);
>                 return opp_table;
>         }

This looks like a hack. And it also triggers a bunch of new warning when
opp is trying to create debugfs entries for an entirely different table
which now gets clk_count set to 1:

[  +0.000979]  cx: _update_opp_table_clk: Couldn't find clock: -2
[  +0.000022] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000004] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000004] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
[  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!

This is for the rpmhpd whose opp table does not have either opp-hz or
clocks (just opp-level).

The above unbreaks cpufreq though.

Johan
Viresh Kumar July 12, 2022, 3:10 p.m. UTC | #13
On 12-07-22, 16:29, Johan Hovold wrote:
> On Tue, Jul 12, 2022 at 01:22:40PM +0530, Viresh Kumar wrote:
> > diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> > index 666e1ebf91d1..4f4a285886fa 100644
> > --- a/drivers/opp/core.c
> > +++ b/drivers/opp/core.c
> > @@ -1384,6 +1384,20 @@ static struct opp_table *_update_opp_table_clk(struct device *dev,
> >         }
> > 
> >         if (ret == -ENOENT) {
> > +               /*
> > +                * There are few platforms which don't want the OPP core to
> > +                * manage device's clock settings. In such cases neither the
> > +                * platform provides the clks explicitly to us, nor the DT
> > +                * contains a valid clk entry. The OPP nodes in DT may still
> > +                * contain "opp-hz" property though, which we need to parse and
> > +                * allow the platform to find an OPP based on freq later on.
> > +                *
> > +                * This is a simple solution to take care of such corner cases,
> > +                * i.e. make the clk_count 1, which lets us allocate space for
> > +                * frequency in opp->rates and also parse the entries in DT.
> > +                */
> > +               opp_table->clk_count = 1;
> > +
> >                 dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__, ret);
> >                 return opp_table;
> >         }
> 
> This looks like a hack.

Yeah, a bit. Initially I wanted to solve it in a cleaner way, like it
is done for Tegra, where you will pass the right clock name to the OPP
core, so it can verify that the clk is there and parse the table. And
then tell the OPP core not to configure the clk from
dev_pm_opp_set_opp(), which is possible now. This would have done the
things in the right way.

The problem with Qcom's DT is that the CPU node have the OPP table but
doesn't contain the clocks, which are available with the
qcom,cpufreq-hw node instead. Because of this, I couldn't pass the
real clocks name to the OPP core, "xo", for the CPU device.

I really tried to avoid adding the above code for Tegra and found a
better and cleaner way out. But I couldn't do the same here and
figured it may be more generic of a problem, which is fine as well.

The OPP core does two things currently:

1) Parse the DT and provide helpers to find the right OPP, etc.

2) Provide generic helper to configure all resources related to the
   OPP.

It is fine if some platforms only want to have the first and not the
second. To have the second though, you need to have the first as well.

The clk is required only for the second case, and the OPP core should
parse the DT anyways, irrespective of the availability of the clock.
Because of this reason, making the above change looked reasonable
(this is what was happening before my new patches came in anyway). The
clock isn't there, but there is "opp-hz" present in the DT, which
needs to be parsed.

> And it also triggers a bunch of new warning when
> opp is trying to create debugfs entries for an entirely different table
> which now gets clk_count set to 1:
> 
> [  +0.000979]  cx: _update_opp_table_clk: Couldn't find clock: -2
> [  +0.000022] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000004] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000004] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
> [  +0.000003] debugfs: Directory 'opp:0' with parent 'cx' already present!
> 
> This is for the rpmhpd whose opp table does not have either opp-hz or
> clocks (just opp-level).

Ahh, I missed switching back to the earlier code here. i.e. not use
the frequency for OPP directory's name, when it is 0.

This will fix it.

diff --git a/drivers/opp/debugfs.c b/drivers/opp/debugfs.c
index 402c507edac7..96a30a032c5f 100644
--- a/drivers/opp/debugfs.c
+++ b/drivers/opp/debugfs.c
@@ -138,7 +138,7 @@ void opp_debug_create_one(struct dev_pm_opp *opp, struct opp_table *opp_table)
         * - For some devices rate isn't available or there are multiple, use
         *   index instead for them.
         */
-       if (likely(opp_table->clk_count == 1))
+       if (likely(opp_table->clk_count == 1 && opp->rates[0]))
                id = opp->rates[0];
        else
                id = _get_opp_count(opp_table);

I have merged this into:

commit 341df9889277 ("OPP: Allow multiple clocks for a device")

and pushed out for linux-next.


Bjorn, Mani,

It would be really good if we can find a way to make following work on
Qcom:

        clk_get(cpu_dev, NULL or "xo")

If that happens, we can handle the special case just at the consumer
driver (qcom-cpufreq-hw) and not in the core.
Johan Hovold July 12, 2022, 3:55 p.m. UTC | #14
On Tue, Jul 12, 2022 at 08:40:45PM +0530, Viresh Kumar wrote:
> On 12-07-22, 16:29, Johan Hovold wrote:
> > On Tue, Jul 12, 2022 at 01:22:40PM +0530, Viresh Kumar wrote:
> > > diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> > > index 666e1ebf91d1..4f4a285886fa 100644
> > > --- a/drivers/opp/core.c
> > > +++ b/drivers/opp/core.c
> > > @@ -1384,6 +1384,20 @@ static struct opp_table *_update_opp_table_clk(struct device *dev,
> > >         }
> > > 
> > >         if (ret == -ENOENT) {
> > > +               /*
> > > +                * There are few platforms which don't want the OPP core to
> > > +                * manage device's clock settings. In such cases neither the
> > > +                * platform provides the clks explicitly to us, nor the DT
> > > +                * contains a valid clk entry. The OPP nodes in DT may still
> > > +                * contain "opp-hz" property though, which we need to parse and
> > > +                * allow the platform to find an OPP based on freq later on.
> > > +                *
> > > +                * This is a simple solution to take care of such corner cases,
> > > +                * i.e. make the clk_count 1, which lets us allocate space for
> > > +                * frequency in opp->rates and also parse the entries in DT.
> > > +                */
> > > +               opp_table->clk_count = 1;
> > > +
> > >                 dev_dbg(dev, "%s: Couldn't find clock: %d\n", __func__, ret);
> > >                 return opp_table;
> > >         }
> > 
> > This looks like a hack.
> 
> Yeah, a bit. Initially I wanted to solve it in a cleaner way, like it
> is done for Tegra, where you will pass the right clock name to the OPP
> core, so it can verify that the clk is there and parse the table. And
> then tell the OPP core not to configure the clk from
> dev_pm_opp_set_opp(), which is possible now. This would have done the
> things in the right way.
> 
> The problem with Qcom's DT is that the CPU node have the OPP table but
> doesn't contain the clocks, which are available with the
> qcom,cpufreq-hw node instead. Because of this, I couldn't pass the
> real clocks name to the OPP core, "xo", for the CPU device.
> 
> I really tried to avoid adding the above code for Tegra and found a
> better and cleaner way out. But I couldn't do the same here and
> figured it may be more generic of a problem, which is fine as well.
> 
> The OPP core does two things currently:
> 
> 1) Parse the DT and provide helpers to find the right OPP, etc.
> 
> 2) Provide generic helper to configure all resources related to the
>    OPP.
> 
> It is fine if some platforms only want to have the first and not the
> second. To have the second though, you need to have the first as well.
> 
> The clk is required only for the second case, and the OPP core should
> parse the DT anyways, irrespective of the availability of the clock.
> Because of this reason, making the above change looked reasonable
> (this is what was happening before my new patches came in anyway). The
> clock isn't there, but there is "opp-hz" present in the DT, which
> needs to be parsed.

Ok, thanks for the details. I'd still look into if there's some way to
avoid setting clk_count when there are no clocks as it sounds like an
anti-pattern that will just make the code harder to understand and
maintain.

> > And it also triggers a bunch of new warning when
> > opp is trying to create debugfs entries for an entirely different table
> > which now gets clk_count set to 1:
> > 
> > [  +0.000979]  cx: _update_opp_table_clk: Couldn't find clock: -2
> > [  +0.000022] debugfs: Directory 'opp:0' with parent 'cx' already present!

> > This is for the rpmhpd whose opp table does not have either opp-hz or
> > clocks (just opp-level).
> 
> Ahh, I missed switching back to the earlier code here. i.e. not use
> the frequency for OPP directory's name, when it is 0.
> 
> This will fix it.
> 
> diff --git a/drivers/opp/debugfs.c b/drivers/opp/debugfs.c
> index 402c507edac7..96a30a032c5f 100644
> --- a/drivers/opp/debugfs.c
> +++ b/drivers/opp/debugfs.c
> @@ -138,7 +138,7 @@ void opp_debug_create_one(struct dev_pm_opp *opp, struct opp_table *opp_table)
>          * - For some devices rate isn't available or there are multiple, use
>          *   index instead for them.
>          */
> -       if (likely(opp_table->clk_count == 1))
> +       if (likely(opp_table->clk_count == 1 && opp->rates[0]))
>                 id = opp->rates[0];
>         else
>                 id = _get_opp_count(opp_table);

Indeed it does, thanks.

> I have merged this into:
> 
> commit 341df9889277 ("OPP: Allow multiple clocks for a device")
> 
> and pushed out for linux-next.

Thanks for addressing this quickly. With the two patches above applied,
the issues I noticed are gone.

Johan
Viresh Kumar July 13, 2022, 6:55 a.m. UTC | #15
On 12-07-22, 17:55, Johan Hovold wrote:
> Ok, thanks for the details. I'd still look into if there's some way to
> avoid setting clk_count when there are no clocks as it sounds like an
> anti-pattern that will just make the code harder to understand and
> maintain.

Here is an attempt from me :)

https://lore.kernel.org/lkml/cover.1657695140.git.viresh.kumar@linaro.org/