mbox series

[00/13] clk: qcom: cpu-8996: stability fixes

Message ID 20230111192004.2509750-1-dmitry.baryshkov@linaro.org
Headers show
Series clk: qcom: cpu-8996: stability fixes | expand

Message

Dmitry Baryshkov Jan. 11, 2023, 7:19 p.m. UTC
This series provides stability fixes for the MSM8996 boot process. It
changes the order of calls during the CPU PLL setup, makes it use GPLL0
(through sys_apcs_aux) during PLL init, finetunes the ACD, etc.

Dependencies: [1], [2]

[1] https://lore.kernel.org/linux-arm-msm/20230111191453.2509468-1-dmitry.baryshkov@linaro.org/
[2] https://lore.kernel.org/linux-arm-msm/20230111191634.2509616-1-dmitry.baryshkov@linaro.org/


Dmitry Baryshkov (13):
  clk: qcom: clk-alpha-pll: program PLL_TEST/PLL_TEST_U if required
  clk: qcom: cpu-8996: correct PLL programming
  clk: qcom: cpu-8996: fix the init clock rate
  clk: qcom: cpu-8996: support using GPLL0 as SMUX input
  clk: qcom: cpu-8996: skip ACD init if the setup is valid
  clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb
  clk: qcom: cpu-8996: setup PLLs before registering clocks
  clk: qcom: cpu-8996: move qcom_cpu_clk_msm8996_acd_init call
  clk: qcom: cpu-8996: fix PLL configuration sequence
  clk: qcom: cpu-8996: fix ACD initialization
  clk: qcom: cpu-8996: fix PLL clock ops
  clk: qcom: cpu-8996: change setup sequence to follow vendor kernel
  arm64: dts: qcom: msm8996: support using GPLL0 as kryocc input

 arch/arm64/boot/dts/qcom/msm8996.dtsi |   4 +-
 drivers/clk/qcom/clk-alpha-pll.c      |   5 +
 drivers/clk/qcom/clk-cpu-8996.c       | 145 ++++++++++++++++++--------
 3 files changed, 111 insertions(+), 43 deletions(-)

Comments

Dmitry Baryshkov Jan. 13, 2023, 10:44 a.m. UTC | #1
On 12/01/2023 16:35, Konrad Dybcio wrote:
> 
> 
> On 11.01.2023 20:20, Dmitry Baryshkov wrote:
>> The vendor kernel applies different order while programming SSSCTL and
>> L2ACDCR registers on power and performance clusters. However it was
>> demonstrated that doing this upstream results in the board reset. Make
>> both clusters use the same sequence, which fixes the reset.
>>
>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
> I think we should look for the source of why this doesn't work,
> e.g. does downstream program it earlier somewhere? Are we
> missing something else that may bite later?

I'm not sure what is the reason for downstream doing init in such 
sequence. Right now I'm sure that doing ACD init with the provided 
sequence fails the boot in some conditions. There might be the 
difference in the CPU init order. Or any other ordering issue. Or the 
lack of the CPR. Or Kryo LDO programming. There is a huge difference 
between vendor's 3.18 and the current 6.x.

I propose to take the patch in, as it fixes the boot and runtime issue 
and revisit it later if any of the problems occur. I don't fancy such 
approach usually, but without the documentation I don't see a way to 
find any particular reason for programming pwr and perf using the 
different order of operations.

> 
> Konrad
>>   drivers/clk/qcom/clk-cpu-8996.c | 20 ++++++++------------
>>   1 file changed, 8 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
>> index 47c58bb5f21a..1c00eb629b61 100644
>> --- a/drivers/clk/qcom/clk-cpu-8996.c
>> +++ b/drivers/clk/qcom/clk-cpu-8996.c
>> @@ -475,9 +475,9 @@ static int qcom_cpu_clk_msm8996_register_clks(struct device *dev,
>>   	return ret;
>>   }
>>   
>> -#define CPU_AFINITY_MASK 0xFFF
>> -#define PWRCL_CPU_REG_MASK 0x3
>> -#define PERFCL_CPU_REG_MASK 0x103
>> +#define CPU_CLUSTER_AFFINITY_MASK 0xf00
>> +#define PWRCL_AFFINITY_MASK 0x000
>> +#define PERFCL_AFFINITY_MASK 0x100
>>   
>>   #define L2ACDCR_REG 0x580ULL
>>   #define L2ACDTD_REG 0x581ULL
>> @@ -498,21 +498,17 @@ static void qcom_cpu_clk_msm8996_acd_init(struct regmap *regmap)
>>   	if (val == 0x00006a11)
>>   		goto out;
>>   
>> -	hwid = read_cpuid_mpidr() & CPU_AFINITY_MASK;
>> -
>>   	kryo_l2_set_indirect_reg(L2ACDTD_REG, 0x00006a11);
>>   	kryo_l2_set_indirect_reg(L2ACDDVMRC_REG, 0x000e0f0f);
>>   	kryo_l2_set_indirect_reg(L2ACDSSCR_REG, 0x00000601);
>>   
>> -	if (PWRCL_CPU_REG_MASK == (hwid | PWRCL_CPU_REG_MASK)) {
>> -		regmap_write(regmap, PWRCL_REG_OFFSET + SSSCTL_OFFSET, 0xf);
>> -		kryo_l2_set_indirect_reg(L2ACDCR_REG, 0x002c5ffd);
>> -	}
>> +	kryo_l2_set_indirect_reg(L2ACDCR_REG, 0x002c5ffd);
>>   
>> -	if (PERFCL_CPU_REG_MASK == (hwid | PERFCL_CPU_REG_MASK)) {
>> -		kryo_l2_set_indirect_reg(L2ACDCR_REG, 0x002c5ffd);
>> +	hwid = read_cpuid_mpidr();
>> +	if ((hwid & CPU_CLUSTER_AFFINITY_MASK) == PWRCL_AFFINITY_MASK)
>> +		regmap_write(regmap, PWRCL_REG_OFFSET + SSSCTL_OFFSET, 0xf);
>> +	else
>>   		regmap_write(regmap, PERFCL_REG_OFFSET + SSSCTL_OFFSET, 0xf);
>> -	}
>>   
>>   out:
>>   	spin_unlock_irqrestore(&qcom_clk_acd_lock, flags);
Dmitry Baryshkov Jan. 13, 2023, 11:35 a.m. UTC | #2
On 12/01/2023 18:10, Konrad Dybcio wrote:
> 
> 
> On 11.01.2023 20:20, Dmitry Baryshkov wrote:
>> Switch CPU PLLs to use clk_alpha_pll_hwfsm_ops, it seems to suit
>> better.
>>
>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>> ---
> I *think* SUPPORTS_DYNAMIC_UPDATE should also be kicked from
> non-alt PLLs.. Otherwise we might have been kicking ourselves
> in the face all along, changing the frequency of a running
> PLL that doesn't support it if we were using the main PLL
> and not the altPLL/ACD..
> 
> Downstream sets it only for clk_ops_alpha_pll_hwfsm which is
> used on alt PLLs only
> 
> This change seems sound, as Huayra supports dynamic update
> even without setting any flags.

I don't know where Huayra came from. Downstream uses plain hwfsm pll. 
Huayra uses different alpha register settings.

> 
> Konrad
>>   drivers/clk/qcom/clk-cpu-8996.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
>> index 1c00eb629b61..b53cddc4bca3 100644
>> --- a/drivers/clk/qcom/clk-cpu-8996.c
>> +++ b/drivers/clk/qcom/clk-cpu-8996.c
>> @@ -128,7 +128,7 @@ static struct clk_alpha_pll pwrcl_pll = {
>>   		.name = "pwrcl_pll",
>>   		.parent_data = pll_parent,
>>   		.num_parents = ARRAY_SIZE(pll_parent),
>> -		.ops = &clk_alpha_pll_huayra_ops,
>> +		.ops = &clk_alpha_pll_hwfsm_ops,
>>   	},
>>   };
>>   
>> @@ -140,7 +140,7 @@ static struct clk_alpha_pll perfcl_pll = {
>>   		.name = "perfcl_pll",
>>   		.parent_data = pll_parent,
>>   		.num_parents = ARRAY_SIZE(pll_parent),
>> -		.ops = &clk_alpha_pll_huayra_ops,
>> +		.ops = &clk_alpha_pll_hwfsm_ops,
>>   	},
>>   };
>>
Konrad Dybcio Jan. 13, 2023, 2 p.m. UTC | #3
On 13.01.2023 11:44, Dmitry Baryshkov wrote:
> On 12/01/2023 16:35, Konrad Dybcio wrote:
>>
>>
>> On 11.01.2023 20:20, Dmitry Baryshkov wrote:
>>> The vendor kernel applies different order while programming SSSCTL and
>>> L2ACDCR registers on power and performance clusters. However it was
>>> demonstrated that doing this upstream results in the board reset. Make
>>> both clusters use the same sequence, which fixes the reset.
>>>
>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>> ---
>> I think we should look for the source of why this doesn't work,
>> e.g. does downstream program it earlier somewhere? Are we
>> missing something else that may bite later?
> 
> I'm not sure what is the reason for downstream doing init in such sequence. Right now I'm sure that doing ACD init with the provided sequence fails the boot in some conditions. There might be the difference in the CPU init order. Or any other ordering issue. Or the lack of the CPR. Or Kryo LDO programming. There is a huge difference between vendor's 3.18 and the current 6.x.
> 
> I propose to take the patch in, as it fixes the boot and runtime issue and revisit it later if any of the problems occur. I don't fancy such approach usually, but without the documentation I don't see a way to find any particular reason for programming pwr and perf using the different order of operations.
Ack, let's do that.

Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>

One more thing, I noticed that downstream calls ACD init on
`CPU_STARTING` event instead of `PRE_RATE_CHANGE`, see [1].
Are we "over-programming" ACD too much, or is it intended/fine?

Konrad

[1] https://git.codelinaro.org/clo/la/kernel/msm-3.18/-/blob/LA.UM.7.5.r1-05300-8x96.0/drivers/clk/msm/clock-cpu-8996.c#L1522-1540
> 
>>
>> Konrad
>>>   drivers/clk/qcom/clk-cpu-8996.c | 20 ++++++++------------
>>>   1 file changed, 8 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
>>> index 47c58bb5f21a..1c00eb629b61 100644
>>> --- a/drivers/clk/qcom/clk-cpu-8996.c
>>> +++ b/drivers/clk/qcom/clk-cpu-8996.c
>>> @@ -475,9 +475,9 @@ static int qcom_cpu_clk_msm8996_register_clks(struct device *dev,
>>>       return ret;
>>>   }
>>>   -#define CPU_AFINITY_MASK 0xFFF
>>> -#define PWRCL_CPU_REG_MASK 0x3
>>> -#define PERFCL_CPU_REG_MASK 0x103
>>> +#define CPU_CLUSTER_AFFINITY_MASK 0xf00
>>> +#define PWRCL_AFFINITY_MASK 0x000
>>> +#define PERFCL_AFFINITY_MASK 0x100
>>>     #define L2ACDCR_REG 0x580ULL
>>>   #define L2ACDTD_REG 0x581ULL
>>> @@ -498,21 +498,17 @@ static void qcom_cpu_clk_msm8996_acd_init(struct regmap *regmap)
>>>       if (val == 0x00006a11)
>>>           goto out;
>>>   -    hwid = read_cpuid_mpidr() & CPU_AFINITY_MASK;
>>> -
>>>       kryo_l2_set_indirect_reg(L2ACDTD_REG, 0x00006a11);
>>>       kryo_l2_set_indirect_reg(L2ACDDVMRC_REG, 0x000e0f0f);
>>>       kryo_l2_set_indirect_reg(L2ACDSSCR_REG, 0x00000601);
>>>   -    if (PWRCL_CPU_REG_MASK == (hwid | PWRCL_CPU_REG_MASK)) {
>>> -        regmap_write(regmap, PWRCL_REG_OFFSET + SSSCTL_OFFSET, 0xf);
>>> -        kryo_l2_set_indirect_reg(L2ACDCR_REG, 0x002c5ffd);
>>> -    }
>>> +    kryo_l2_set_indirect_reg(L2ACDCR_REG, 0x002c5ffd);
>>>   -    if (PERFCL_CPU_REG_MASK == (hwid | PERFCL_CPU_REG_MASK)) {
>>> -        kryo_l2_set_indirect_reg(L2ACDCR_REG, 0x002c5ffd);
>>> +    hwid = read_cpuid_mpidr();
>>> +    if ((hwid & CPU_CLUSTER_AFFINITY_MASK) == PWRCL_AFFINITY_MASK)
>>> +        regmap_write(regmap, PWRCL_REG_OFFSET + SSSCTL_OFFSET, 0xf);
>>> +    else
>>>           regmap_write(regmap, PERFCL_REG_OFFSET + SSSCTL_OFFSET, 0xf);
>>> -    }
>>>     out:
>>>       spin_unlock_irqrestore(&qcom_clk_acd_lock, flags);
>
Konrad Dybcio Jan. 13, 2023, 2:02 p.m. UTC | #4
On 13.01.2023 12:35, Dmitry Baryshkov wrote:
> On 12/01/2023 18:10, Konrad Dybcio wrote:
>>
>>
>> On 11.01.2023 20:20, Dmitry Baryshkov wrote:
>>> Switch CPU PLLs to use clk_alpha_pll_hwfsm_ops, it seems to suit
>>> better.
>>>
>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
>>> ---
>> I *think* SUPPORTS_DYNAMIC_UPDATE should also be kicked from
>> non-alt PLLs.. Otherwise we might have been kicking ourselves
>> in the face all along, changing the frequency of a running
>> PLL that doesn't support it if we were using the main PLL
>> and not the altPLL/ACD..
>>
>> Downstream sets it only for clk_ops_alpha_pll_hwfsm which is
>> used on alt PLLs only
>>
>> This change seems sound, as Huayra supports dynamic update
>> even without setting any flags.
> 
> I don't know where Huayra came from. Downstream uses plain hwfsm pll. Huayra uses different alpha register settings.
Right, that too.. somewhat of a miracle things worked at all..

Konrad

P.S please revisit that SUPPORTS_DYNAMIC_UPDATE flag for main PLLs
> 
>>
>> Konrad
>>>   drivers/clk/qcom/clk-cpu-8996.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
>>> index 1c00eb629b61..b53cddc4bca3 100644
>>> --- a/drivers/clk/qcom/clk-cpu-8996.c
>>> +++ b/drivers/clk/qcom/clk-cpu-8996.c
>>> @@ -128,7 +128,7 @@ static struct clk_alpha_pll pwrcl_pll = {
>>>           .name = "pwrcl_pll",
>>>           .parent_data = pll_parent,
>>>           .num_parents = ARRAY_SIZE(pll_parent),
>>> -        .ops = &clk_alpha_pll_huayra_ops,
>>> +        .ops = &clk_alpha_pll_hwfsm_ops,
>>>       },
>>>   };
>>>   @@ -140,7 +140,7 @@ static struct clk_alpha_pll perfcl_pll = {
>>>           .name = "perfcl_pll",
>>>           .parent_data = pll_parent,
>>>           .num_parents = ARRAY_SIZE(pll_parent),
>>> -        .ops = &clk_alpha_pll_huayra_ops,
>>> +        .ops = &clk_alpha_pll_hwfsm_ops,
>>>       },
>>>   };
>>>   
>