mbox series

[v2,00/10] AMD Pstate Driver Fixes and Improvements

Message ID cover.1715356532.git.perry.yuan@amd.com
Headers show
Series AMD Pstate Driver Fixes and Improvements | expand

Message

Yuan, Perry May 13, 2024, 2:07 a.m. UTC
Hello everyone,

This patchset addresses critical issues and enhances performance settings for CPUs
with heterogeneous core types in the AMD pstate driver. 
Specifically, it resolves problems related to calculating the highest performance
and frequency on the latest CPUs with preferred cores. 
Additionally, the patchset includes documentation improvements in amd-pstate.rst,
offering a comprehensive guide covering topics such as recommended reboot requirements
during driver switching, debugging procedures for driver loading failures.

Your feedback and suggestions for improvement are highly appreciated. 
Please review the patches and provide your valuable input.

Thank you.

Best regards,
Perry.


Changes from V1:
 * drop patch 11 which has been merged in a separate patch. (Mario)
 * fix some typos in commit log and tile (Mario)
 * fix the patch 11 regression issue of kernel command line (Oleksandr Natalenko)
 * pick ack flag for patch 7 (Mario)
 * drop patch 4 which is not recommended for user(Mario)
 * rebase to linux-pm/bleeding-edge branch
 * fix some build warning
 * rework the patch 3 for CPU ID matching(Mario)
 * address feedback for patch 5 (Mario)
 * move the acpi pm profile after got default mode(Mario)


Perry Yuan (10):
  cpufreq: amd-pstate: optimize the initial frequency values
    verification
  cpufreq: amd-pstate: remove unused  variable nominal_freq
  cpufreq: amd-pstate: show CPPC debug message if CPPC is not supported
  cpufreq: amd-pstate: add debug message while CPPC is supported and
    disabled by SBIOS
  Documentation: PM: amd-pstate: add debugging section for driver
    loading failure
  Documentation: PM: amd-pstate: add guided mode to the Operation mode
  cpufreq: amd-pstate: switch boot_cpu_has() to cpu_feature_enabled()
  x86/cpufeatures: Add feature bits for AMD heterogeneous processor
  cpufreq: amd-pstate: implement heterogeneous core topology for highest
    performance initialization
  cpufreq: amd-pstate: automatically load pstate driver by default

 Documentation/admin-guide/pm/amd-pstate.rst |  15 +-
 arch/x86/include/asm/cpufeatures.h          |   1 +
 arch/x86/include/asm/processor.h            |   2 +
 arch/x86/kernel/cpu/amd.c                   |  19 ++
 arch/x86/kernel/cpu/scattered.c             |   1 +
 drivers/cpufreq/amd-pstate.c                | 209 +++++++++++++-------
 include/linux/amd-pstate.h                  |   8 +
 7 files changed, 181 insertions(+), 74 deletions(-)

Comments

Dhananjay Ugwekar May 14, 2024, 4:50 a.m. UTC | #1
Hello Perry,

On 5/13/2024 7:37 AM, Perry Yuan wrote:
> To enhance the debugging capability of the driver loading failure for
> broken CPPC ACPI tables, it can optimize the expression by moving the
> verification of `min_freq`, `nominal_freq`, and other dependency values
> to the `amd_pstate_init_freq()` function where they are initialized.
> If any of these values are incorrect, the `amd-pstate` driver will not be registered.
> 
> By ensuring that these values are correct before they are used, it will facilitate
> the debugging process when encountering driver loading failures due to faulty CPPC
> ACPI tables from BIOS
> 
> Signed-off-by: Perry Yuan <perry.yuan@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 35 ++++++++++++++++++-----------------
>  1 file changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 6a342b0c0140..614f6fac0764 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -889,6 +889,24 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>  	WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
>  	WRITE_ONCE(cpudata->max_freq, max_freq);
>  
> +	/**
> +	 * Below values need to be initialized correctly, otherwise driver will fail to load
> +	 * max_freq is calculated according to (nominal_freq * highest_perf)/nominal_perf
> +	 * lowest_nonlinear_freq is a value between [min_freq, nominal_freq]
> +	 * Check _CPC in ACPI table objects if any values are incorrect
> +	 */
> +	if (min_freq <= 0 || max_freq <= 0 || nominal_freq <= 0 || min_freq > max_freq) {
> +		pr_err("min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
> +			min_freq, max_freq, nominal_freq);
> +		return -EINVAL;
> +	}
> +
> +	if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq * 1000) {
> +		pr_err("lowest_nonlinear_freq(%d) value is out of range [min_freq(%d), nominal_freq(%d)]\n",
> +			lowest_nonlinear_freq, min_freq, nominal_freq * 1000);

A reminder, we should fix the below code section (due to only nominal freq being in MHz),

685 static int amd_pstate_set_boost(struct cpufreq_policy *policy, int state)                                                        
686 {                                                                                                                                
687         struct amd_cpudata *cpudata = policy->driver_data;                                                                       
688         int ret;                                                                                                                 
689                                                                                                                                  
690         if (!cpudata->boost_supported) {                                                                                         
691                 pr_err("Boost mode is not supported by this processor or SBIOS\n");                                              
692                 return -EINVAL;                                                                                                  
693         }                                                                                                                        
694                                                                                                                                  
695         if (state)                                                                                                               
696                 policy->cpuinfo.max_freq = cpudata->max_freq;                                                                    
697         else                                                                                                                     
698                 policy->cpuinfo.max_freq = cpudata->nominal_freq;      <--- mismatch in left and right hand side units  


To avoid below situation(from a Zen4 AMD EPYC system with boost disabled),

/sys/devices/system/cpu/cpu0/cpufreq# cat scaling_max_freq                                                             
2151                                                             	<--- MHz                                                                    
/sys/devices/system/cpu/cpu0/cpufreq# cat amd_pstate_max_freq                                                          
2287000									<--- KHz

Thanks,
Dhananjay


> +		return -EINVAL;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -927,15 +945,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>  	max_freq = READ_ONCE(cpudata->max_freq);
>  	nominal_freq = READ_ONCE(cpudata->nominal_freq);
>  
> -	if (min_freq <= 0 || max_freq <= 0 ||
> -	    nominal_freq <= 0 || min_freq > max_freq) {
> -		dev_err(dev,
> -			"min_freq(%d) or max_freq(%d) or nominal_freq (%d) value is incorrect, check _CPC in ACPI tables\n",
> -			min_freq, max_freq, nominal_freq);
> -		ret = -EINVAL;
> -		goto free_cpudata1;
> -	}
> -
>  	policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
>  	policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>  
> @@ -1388,14 +1397,6 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>  	min_freq = READ_ONCE(cpudata->min_freq);
>  	max_freq = READ_ONCE(cpudata->max_freq);
>  	nominal_freq = READ_ONCE(cpudata->nominal_freq);
> -	if (min_freq <= 0 || max_freq <= 0 ||
> -	    nominal_freq <= 0 || min_freq > max_freq) {
> -		dev_err(dev,
> -			"min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect, check _CPC in ACPI tables\n",
> -			min_freq, max_freq, nominal_freq);
> -		ret = -EINVAL;
> -		goto free_cpudata1;
> -	}
>  
>  	policy->cpuinfo.min_freq = min_freq;
>  	policy->cpuinfo.max_freq = max_freq;