mbox series

[0/8] Adjustments for preferred core detection

Message ID 20240826211358.2694603-1-superm1@kernel.org
Headers show
Series Adjustments for preferred core detection | expand

Message

Mario Limonciello Aug. 26, 2024, 9:13 p.m. UTC
From: Mario Limonciello <mario.limonciello@amd.com>

Preferred core detection is fragile in that any CPU that reports
less than 255 for any core is assumed to be a preferred core design.
This might not always be true, so it's better to check all CPUs and
see that varying values are actually reported.

Furthermore, preferred core detection isn't used by acpi-cpufreq. So
incorrect frequencies are used unless amd-pstate is active.

This series moves preferred core detection out of amd-pstate in a more
robust fashion.  It also removes some tech debt of hardcoded values for
platforms that are actually preferred core platforms.

This branch is based off v6.11-rc5.

Mario Limonciello (8):
  x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
  x86/amd: Rename amd_get_highest_perf() to
    amd_get_boost_ratio_numerator()
  ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
  x86/amd: Move amd_get_highest_perf() out of amd-pstate
  x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
  cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into
    amd_get_boost_ratio_numerator()
  cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
  cpufreq: amd-pstate: Drop some uses of cpudata->hw_prefcore

 arch/x86/include/asm/processor.h |  14 +--
 arch/x86/kernel/acpi/cppc.c      | 159 ++++++++++++++++++++++++++++++-
 arch/x86/kernel/cpu/amd.c        |  16 ----
 drivers/cpufreq/acpi-cpufreq.c   |  12 ++-
 drivers/cpufreq/amd-pstate.c     | 132 ++++++-------------------
 include/acpi/cppc_acpi.h         |  10 ++
 6 files changed, 203 insertions(+), 140 deletions(-)

Comments

Yuan, Perry Aug. 27, 2024, 6:29 a.m. UTC | #1
[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Mario Limonciello <superm1@kernel.org>
> Sent: Tuesday, August 27, 2024 5:14 AM
> To: Borislav Petkov <bp@alien8.de>; Shenoy, Gautham Ranjal
> <gautham.shenoy@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>
> Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT) <x86@kernel.org>;
> Rafael J . Wysocki <rafael@kernel.org>; open list:X86 ARCHITECTURE (32-BIT
> AND 64-BIT) <linux-kernel@vger.kernel.org>; open list:ACPI <linux-
> acpi@vger.kernel.org>; open list:CPU FREQUENCY SCALING FRAMEWORK
> <linux-pm@vger.kernel.org>; Limonciello, Mario
> <Mario.Limonciello@amd.com>
> Subject: [PATCH 1/8] x86/amd: Move amd_get_highest_perf() from amd.c to
> cppc.c
>
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> To prepare to let amd_get_highest_perf() detect preferred cores it will require
> CPPC functions. Move amd_get_highest_perf() to cppc.c to prepare for
> 'preferred core detection' rework.
>
> No functional changes intended.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  arch/x86/kernel/acpi/cppc.c | 16 ++++++++++++++++
>  arch/x86/kernel/cpu/amd.c   | 16 ----------------
>  2 files changed, 16 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c index
> ff8f25faca3dd..7ec8f2ce859c8 100644
> --- a/arch/x86/kernel/acpi/cppc.c
> +++ b/arch/x86/kernel/acpi/cppc.c
> @@ -116,3 +116,19 @@ void init_freq_invariance_cppc(void)
>       init_done = true;
>       mutex_unlock(&freq_invariance_lock);
>  }
> +
> +u32 amd_get_highest_perf(void)
> +{
> +     struct cpuinfo_x86 *c = &boot_cpu_data;
> +
> +     if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model <
> 0x40) ||
> +                            (c->x86_model >= 0x70 && c->x86_model <
> 0x80)))
> +             return 166;
> +
> +     if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model <
> 0x30) ||
> +                            (c->x86_model >= 0x40 && c->x86_model <
> 0x70)))
> +             return 166;
> +
> +     return 255;
> +}
> +EXPORT_SYMBOL_GPL(amd_get_highest_perf);
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index
> 1e0fe5f8ab84e..015971adadfc7 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -1190,22 +1190,6 @@ unsigned long amd_get_dr_addr_mask(unsigned
> int dr)  }  EXPORT_SYMBOL_GPL(amd_get_dr_addr_mask);
>
> -u32 amd_get_highest_perf(void)
> -{
> -     struct cpuinfo_x86 *c = &boot_cpu_data;
> -
> -     if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model <
> 0x40) ||
> -                            (c->x86_model >= 0x70 && c->x86_model <
> 0x80)))
> -             return 166;
> -
> -     if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model <
> 0x30) ||
> -                            (c->x86_model >= 0x40 && c->x86_model <
> 0x70)))
> -             return 166;
> -
> -     return 255;
> -}
> -EXPORT_SYMBOL_GPL(amd_get_highest_perf);
> -
>  static void zenbleed_check_cpu(void *unused)  {
>       struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> --
> 2.43.0
>

LGTM, thanks!

Reviewed-by: Perry Yuan <perry.yuan@amd.com>
Yuan, Perry Aug. 27, 2024, 6:37 a.m. UTC | #2
[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Mario Limonciello <superm1@kernel.org>
> Sent: Tuesday, August 27, 2024 5:14 AM
> To: Borislav Petkov <bp@alien8.de>; Shenoy, Gautham Ranjal
> <gautham.shenoy@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>
> Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT) <x86@kernel.org>;
> Rafael J . Wysocki <rafael@kernel.org>; open list:X86 ARCHITECTURE (32-BIT
> AND 64-BIT) <linux-kernel@vger.kernel.org>; open list:ACPI <linux-
> acpi@vger.kernel.org>; open list:CPU FREQUENCY SCALING FRAMEWORK
> <linux-pm@vger.kernel.org>; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>
> Subject: [PATCH 3/8] ACPI: CPPC: Adjust debug messages in
> amd_set_max_freq_ratio() to warn
>
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> If the boost ratio isn't calculated properly for the system for any reason this
> can cause other problems that are non-obvious.
>
> Raise all messages to warn instead.
>
> Suggested-by: Perry Yuan <Perry.Yuan@amd.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  arch/x86/kernel/acpi/cppc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c index
> 1d631ac5ec328..e94507110ca24 100644
> --- a/arch/x86/kernel/acpi/cppc.c
> +++ b/arch/x86/kernel/acpi/cppc.c
> @@ -75,17 +75,17 @@ static void amd_set_max_freq_ratio(void)
>
>       rc = cppc_get_perf_caps(0, &perf_caps);
>       if (rc) {
> -             pr_debug("Could not retrieve perf counters (%d)\n", rc);
> +             pr_warn("Could not retrieve perf counters (%d)\n", rc);
>               return;
>       }
>
>       rc = amd_get_boost_ratio_numerator(0, &highest_perf);
>       if (rc)
> -             pr_debug("Could not retrieve highest performance\n");
> +             pr_warn("Could not retrieve highest performance\n");
>       nominal_perf = perf_caps.nominal_perf;
>
>       if (!nominal_perf) {
> -             pr_debug("Could not retrieve nominal performance\n");
> +             pr_warn("Could not retrieve nominal performance\n");
>               return;
>       }
>
> @@ -93,7 +93,7 @@ static void amd_set_max_freq_ratio(void)
>       /* midpoint between max_boost and max_P */
>       perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
>       if (!perf_ratio) {
> -             pr_debug("Non-zero highest/nominal perf values led to a 0
> ratio\n");
> +             pr_warn("Non-zero highest/nominal perf values led to a 0
> ratio\n");
>               return;
>       }
>
> --
> 2.43.0
>

LGTM, it is good to show the warning message once there are some perf values are invalid.
That will help to debug the issue from customer report log.

Reviewed-by: Perry Yuan <perry.yuan@amd.com>


Best Regards.

Perry.
Yuan, Perry Aug. 27, 2024, 6:48 a.m. UTC | #3
[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Mario Limonciello <superm1@kernel.org>
> Sent: Tuesday, August 27, 2024 5:14 AM
> To: Borislav Petkov <bp@alien8.de>; Shenoy, Gautham Ranjal
> <gautham.shenoy@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>
> Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT) <x86@kernel.org>;
> Rafael J . Wysocki <rafael@kernel.org>; open list:X86 ARCHITECTURE (32-BIT
> AND 64-BIT) <linux-kernel@vger.kernel.org>; open list:ACPI <linux-
> acpi@vger.kernel.org>; open list:CPU FREQUENCY SCALING FRAMEWORK
> <linux-pm@vger.kernel.org>; Limonciello, Mario
> <Mario.Limonciello@amd.com>
> Subject: [PATCH 7/8] cpufreq: amd-pstate: Optimize
> amd_pstate_update_limits()
>
> From: Mario Limonciello <mario.limonciello@amd.com>
>
> Don't take and release the mutex when prefcore isn't present and avoid
> initialization of variables that will be initially set in the function.
>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 75568d0f84623..ed05d7a0add10 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -798,17 +798,17 @@ static void amd_pstate_update_limits(unsigned int
> cpu)
>       int ret;
>       bool highest_perf_changed = false;
>
> -     mutex_lock(&amd_pstate_driver_lock);
> -     if ((!amd_pstate_prefcore) || (!cpudata->hw_prefcore))
> -             goto free_cpufreq_put;
> +     if (!amd_pstate_prefcore)
> +             return;
>
> +     mutex_lock(&amd_pstate_driver_lock);
>       ret = amd_get_highest_perf(cpu, &cur_high);
>       if (ret)
>               goto free_cpufreq_put;
>
>       prev_high = READ_ONCE(cpudata->prefcore_ranking);
> -     if (prev_high != cur_high) {
> -             highest_perf_changed = true;
> +     highest_perf_changed = (prev_high != cur_high);
> +     if (highest_perf_changed) {
>               WRITE_ONCE(cpudata->prefcore_ranking, cur_high);
>
>               if (cur_high < CPPC_MAX_PERF)
> --
> 2.43.0
>

Reviewed-by: Perry Yuan <perry.yuan@amd.com>


Best Regards.

Perry.
Gautham R. Shenoy Aug. 27, 2024, 2:08 p.m. UTC | #4
On Mon, Aug 26, 2024 at 04:13:51PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> To prepare to let amd_get_highest_perf() detect preferred cores
> it will require CPPC functions. Move amd_get_highest_perf() to
> cppc.c to prepare for 'preferred core detection' rework.
> 
> No functional changes intended.
> 
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>

--
Thanks and Regards
gautham.

> ---
>  arch/x86/kernel/acpi/cppc.c | 16 ++++++++++++++++
>  arch/x86/kernel/cpu/amd.c   | 16 ----------------
>  2 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
> index ff8f25faca3dd..7ec8f2ce859c8 100644
> --- a/arch/x86/kernel/acpi/cppc.c
> +++ b/arch/x86/kernel/acpi/cppc.c
> @@ -116,3 +116,19 @@ void init_freq_invariance_cppc(void)
>  	init_done = true;
>  	mutex_unlock(&freq_invariance_lock);
>  }
> +
> +u32 amd_get_highest_perf(void)
> +{
> +	struct cpuinfo_x86 *c = &boot_cpu_data;
> +
> +	if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model < 0x40) ||
> +			       (c->x86_model >= 0x70 && c->x86_model < 0x80)))
> +		return 166;
> +
> +	if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model < 0x30) ||
> +			       (c->x86_model >= 0x40 && c->x86_model < 0x70)))
> +		return 166;
> +
> +	return 255;
> +}
> +EXPORT_SYMBOL_GPL(amd_get_highest_perf);
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index 1e0fe5f8ab84e..015971adadfc7 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -1190,22 +1190,6 @@ unsigned long amd_get_dr_addr_mask(unsigned int dr)
>  }
>  EXPORT_SYMBOL_GPL(amd_get_dr_addr_mask);
>  
> -u32 amd_get_highest_perf(void)
> -{
> -	struct cpuinfo_x86 *c = &boot_cpu_data;
> -
> -	if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model < 0x40) ||
> -			       (c->x86_model >= 0x70 && c->x86_model < 0x80)))
> -		return 166;
> -
> -	if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model < 0x30) ||
> -			       (c->x86_model >= 0x40 && c->x86_model < 0x70)))
> -		return 166;
> -
> -	return 255;
> -}
> -EXPORT_SYMBOL_GPL(amd_get_highest_perf);
> -
>  static void zenbleed_check_cpu(void *unused)
>  {
>  	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
> -- 
> 2.43.0
>
Gautham R. Shenoy Aug. 27, 2024, 2:50 p.m. UTC | #5
On Mon, Aug 26, 2024 at 04:13:53PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> If the boost ratio isn't calculated properly for the system for any
> reason this can cause other problems that are non-obvious.
> 
> Raise all messages to warn instead.
> 
> Suggested-by: Perry Yuan <Perry.Yuan@amd.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  arch/x86/kernel/acpi/cppc.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
> index 1d631ac5ec328..e94507110ca24 100644
> --- a/arch/x86/kernel/acpi/cppc.c
> +++ b/arch/x86/kernel/acpi/cppc.c
> @@ -75,17 +75,17 @@ static void amd_set_max_freq_ratio(void)
>  
>  	rc = cppc_get_perf_caps(0, &perf_caps);
>  	if (rc) {
> -		pr_debug("Could not retrieve perf counters (%d)\n", rc);
> +		pr_warn("Could not retrieve perf counters (%d)\n", rc);
>  		return;
>  	}
>  
>  	rc = amd_get_boost_ratio_numerator(0, &highest_perf);
>  	if (rc)
> -		pr_debug("Could not retrieve highest performance\n");
> +		pr_warn("Could not retrieve highest performance\n");
>  	nominal_perf = perf_caps.nominal_perf;
>  
>  	if (!nominal_perf) {
> -		pr_debug("Could not retrieve nominal performance\n");
> +		pr_warn("Could not retrieve nominal performance\n");
>  		return;
>  	}
>  
> @@ -93,7 +93,7 @@ static void amd_set_max_freq_ratio(void)
>  	/* midpoint between max_boost and max_P */
>  	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
>  	if (!perf_ratio) {
> -		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
> +		pr_warn("Non-zero highest/nominal perf values led to a 0 ratio\n");
>  		return;

Aside:
perf_ratio is a u64, and SCHED_CAPACITY_SCALE is (1L << 10). Thus, is
it even possible to have !perf_ratio?

Otherwise, I am ok with this promotion of pr_debug to pr_warn.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>

--
Thanks and Regards
gautham.
Gautham R. Shenoy Aug. 27, 2024, 3:43 p.m. UTC | #6
On Mon, Aug 26, 2024 at 04:13:55PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> AMD systems that support preferred cores will use "166" as their
> numerator for max frequency calculations instead of "255".
> 
> Add a function for detecting preferred cores by looking at the
> highest perf value on all cores.
> 
> If preferred cores are enabled return 166 and if disabled the
> value in the highest perf register. As the function will be called
> multiple times, cache the values for the boost numerator and if
> preferred cores will be enabled in global variables.
> 
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---

[..snip..]

>  /**
>   * amd_get_boost_ratio_numerator: Get the numerator to use for boost ratio calculation
>   * @cpu: CPU to get numerator for.
> @@ -162,20 +232,19 @@ EXPORT_SYMBOL_GPL(amd_get_highest_perf);
>   */
>  int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator)
>  {
> -	struct cpuinfo_x86 *c = &boot_cpu_data;
> +	bool prefcore;
> +	int ret;
>  
> -	if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model < 0x40) ||
> -			       (c->x86_model >= 0x70 && c->x86_model < 0x80))) {
> -		*numerator = 166;
> -		return 0;
> -	}
> +	ret = amd_detect_prefcore(&prefcore);
> +	if (ret)
> +		return ret;
>  
> -	if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model < 0x30) ||
> -			       (c->x86_model >= 0x40 && c->x86_model < 0x70))) {
> -		*numerator = 166;
> +	/* without preferred cores, return the highest perf register value */
> +	if (!prefcore) {
> +		*numerator = boost_numerator;
>  		return 0;
>  	}
> -	*numerator = 255;
> +	*numerator = CPPC_HIGHEST_PERF_PREFCORE;


Interesting. So even when the user boots a system that supports
preferred-cores with "amd_preferred=disable",
amd_get_boost_ratio_numerator() will return CPPC_HIGHEST_PERF_PREFCORE
as the call prefcore == true here.

I suppose that is as intended, since even though the user may not want
to use the preferred core logic for task-scheduling/load-balancing,
the numerator for the boost-ratio is purely dependent on the h/w
capability.

Is this understanding correct? If so, can this be included as a
comment in the code ?

The rest of the patch looks good to me.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>

--
Thanks and Regards
gautham.



>  
>  	return 0;
>  }
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index f470b5700db58..ec32c830abc1d 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -807,32 +807,18 @@ static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
>  
>  static void amd_pstate_init_prefcore(struct amd_cpudata *cpudata)
>  {
> -	int ret, prio;
> -	u32 highest_perf;
> -
> -	ret = amd_get_highest_perf(cpudata->cpu, &highest_perf);
> -	if (ret)
> +	/* user disabled or not detected */
> +	if (!amd_pstate_prefcore)
>  		return;
>  
>  	cpudata->hw_prefcore = true;
> -	/* check if CPPC preferred core feature is enabled*/
> -	if (highest_perf < CPPC_MAX_PERF)
> -		prio = (int)highest_perf;
> -	else {
> -		pr_debug("AMD CPPC preferred core is unsupported!\n");
> -		cpudata->hw_prefcore = false;
> -		return;
> -	}
> -
> -	if (!amd_pstate_prefcore)
> -		return;
>  
>  	/*
>  	 * The priorities can be set regardless of whether or not
>  	 * sched_set_itmt_support(true) has been called and it is valid to
>  	 * update them at any time after it has been called.
>  	 */
> -	sched_set_itmt_core_prio(prio, cpudata->cpu);
> +	sched_set_itmt_core_prio((int)READ_ONCE(cpudata->highest_perf), cpudata->cpu);
>  
>  	schedule_work(&sched_prefcore_work);
>  }
> @@ -998,12 +984,12 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>  
>  	cpudata->cpu = policy->cpu;
>  
> -	amd_pstate_init_prefcore(cpudata);
> -
>  	ret = amd_pstate_init_perf(cpudata);
>  	if (ret)
>  		goto free_cpudata1;
>  
> +	amd_pstate_init_prefcore(cpudata);
> +
>  	ret = amd_pstate_init_freq(cpudata);
>  	if (ret)
>  		goto free_cpudata1;
> @@ -1453,12 +1439,12 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>  	cpudata->cpu = policy->cpu;
>  	cpudata->epp_policy = 0;
>  
> -	amd_pstate_init_prefcore(cpudata);
> -
>  	ret = amd_pstate_init_perf(cpudata);
>  	if (ret)
>  		goto free_cpudata1;
>  
> +	amd_pstate_init_prefcore(cpudata);
> +
>  	ret = amd_pstate_init_freq(cpudata);
>  	if (ret)
>  		goto free_cpudata1;
> @@ -1903,6 +1889,12 @@ static int __init amd_pstate_init(void)
>  		static_call_update(amd_pstate_update_perf, cppc_update_perf);
>  	}
>  
> +	if (amd_pstate_prefcore) {
> +		ret = amd_detect_prefcore(&amd_pstate_prefcore);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	/* enable amd pstate feature */
>  	ret = amd_pstate_enable(true);
>  	if (ret) {
> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
> index 2246ce0630362..1d79320a23490 100644
> --- a/include/acpi/cppc_acpi.h
> +++ b/include/acpi/cppc_acpi.h
> @@ -137,10 +137,12 @@ struct cppc_cpudata {
>  };
>  
>  #ifdef CONFIG_CPU_SUP_AMD
> +extern int amd_detect_prefcore(bool *detected);
>  extern int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf);
>  extern int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator);
>  #else /* !CONFIG_CPU_SUP_AMD */
>  static inline int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf) { return -ENODEV; }
> +static inline int amd_detect_prefcore(bool *detected) { return -ENODEV; }
>  static inline int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator) { return -ENODEV; }
>  #endif /* !CONFIG_CPU_SUP_AMD */
>  
> -- 
> 2.43.0
>
Gautham R. Shenoy Aug. 27, 2024, 4:57 p.m. UTC | #7
On Mon, Aug 26, 2024 at 04:13:57PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> Don't take and release the mutex when prefcore isn't present and
> avoid initialization of variables that will be initially set
> in the function.
> 
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

> ---
>  drivers/cpufreq/amd-pstate.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index 75568d0f84623..ed05d7a0add10 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -798,17 +798,17 @@ static void amd_pstate_update_limits(unsigned int cpu)
>  	int ret;
>  	bool highest_perf_changed = false;
>  
> -	mutex_lock(&amd_pstate_driver_lock);
> -	if ((!amd_pstate_prefcore) || (!cpudata->hw_prefcore))
> -		goto free_cpufreq_put;
> +	if (!amd_pstate_prefcore)
> +		return;

Looks good to me.

Wondering if it is worth maintaining a static key for
amd_pstate_prefcore. Anyway it doesn't change after boot.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>


--
Thanks and Regards
gautham.
Mario Limonciello Aug. 27, 2024, 6:48 p.m. UTC | #8
On 8/27/2024 09:50, Gautham R. Shenoy wrote:
> On Mon, Aug 26, 2024 at 04:13:53PM -0500, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> If the boost ratio isn't calculated properly for the system for any
>> reason this can cause other problems that are non-obvious.
>>
>> Raise all messages to warn instead.
>>
>> Suggested-by: Perry Yuan <Perry.Yuan@amd.com>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>   arch/x86/kernel/acpi/cppc.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kernel/acpi/cppc.c b/arch/x86/kernel/acpi/cppc.c
>> index 1d631ac5ec328..e94507110ca24 100644
>> --- a/arch/x86/kernel/acpi/cppc.c
>> +++ b/arch/x86/kernel/acpi/cppc.c
>> @@ -75,17 +75,17 @@ static void amd_set_max_freq_ratio(void)
>>   
>>   	rc = cppc_get_perf_caps(0, &perf_caps);
>>   	if (rc) {
>> -		pr_debug("Could not retrieve perf counters (%d)\n", rc);
>> +		pr_warn("Could not retrieve perf counters (%d)\n", rc);
>>   		return;
>>   	}
>>   
>>   	rc = amd_get_boost_ratio_numerator(0, &highest_perf);
>>   	if (rc)
>> -		pr_debug("Could not retrieve highest performance\n");
>> +		pr_warn("Could not retrieve highest performance\n");
>>   	nominal_perf = perf_caps.nominal_perf;
>>   
>>   	if (!nominal_perf) {
>> -		pr_debug("Could not retrieve nominal performance\n");
>> +		pr_warn("Could not retrieve nominal performance\n");
>>   		return;
>>   	}
>>   
>> @@ -93,7 +93,7 @@ static void amd_set_max_freq_ratio(void)
>>   	/* midpoint between max_boost and max_P */
>>   	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
>>   	if (!perf_ratio) {
>> -		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
>> +		pr_warn("Non-zero highest/nominal perf values led to a 0 ratio\n");
>>   		return;
> 
> Aside:
> perf_ratio is a u64, and SCHED_CAPACITY_SCALE is (1L << 10). Thus, is
> it even possible to have !perf_ratio?
> 
> Otherwise, I am ok with this promotion of pr_debug to pr_warn.

You're right; I don't see this is possible.  I'll tear it out in a 
prerequisite patch in v2.

> 
> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
> 
> --
> Thanks and Regards
> gautham.
Mario Limonciello Aug. 27, 2024, 7 p.m. UTC | #9
On 8/27/2024 10:43, Gautham R. Shenoy wrote:
> 
> On Mon, Aug 26, 2024 at 04:13:55PM -0500, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> AMD systems that support preferred cores will use "166" as their
>> numerator for max frequency calculations instead of "255".
>>
>> Add a function for detecting preferred cores by looking at the
>> highest perf value on all cores.
>>
>> If preferred cores are enabled return 166 and if disabled the
>> value in the highest perf register. As the function will be called
>> multiple times, cache the values for the boost numerator and if
>> preferred cores will be enabled in global variables.
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
> 
> [..snip..]
> 
>>   /**
>>    * amd_get_boost_ratio_numerator: Get the numerator to use for boost ratio calculation
>>    * @cpu: CPU to get numerator for.
>> @@ -162,20 +232,19 @@ EXPORT_SYMBOL_GPL(amd_get_highest_perf);
>>    */
>>   int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator)
>>   {
>> -	struct cpuinfo_x86 *c = &boot_cpu_data;
>> +	bool prefcore;
>> +	int ret;
>>   
>> -	if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model < 0x40) ||
>> -			       (c->x86_model >= 0x70 && c->x86_model < 0x80))) {
>> -		*numerator = 166;
>> -		return 0;
>> -	}
>> +	ret = amd_detect_prefcore(&prefcore);
>> +	if (ret)
>> +		return ret;
>>   
>> -	if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model < 0x30) ||
>> -			       (c->x86_model >= 0x40 && c->x86_model < 0x70))) {
>> -		*numerator = 166;
>> +	/* without preferred cores, return the highest perf register value */
>> +	if (!prefcore) {
>> +		*numerator = boost_numerator;
>>   		return 0;
>>   	}
>> -	*numerator = 255;
>> +	*numerator = CPPC_HIGHEST_PERF_PREFCORE;
> 
> 
> Interesting. So even when the user boots a system that supports
> preferred-cores with "amd_preferred=disable",
> amd_get_boost_ratio_numerator() will return CPPC_HIGHEST_PERF_PREFCORE
> as the call prefcore == true here.
> 
> I suppose that is as intended, since even though the user may not want
> to use the preferred core logic for task-scheduling/load-balancing,
> the numerator for the boost-ratio is purely dependent on the h/w
> capability.
> 
> Is this understanding correct? If so, can this be included as a
> comment in the code ?
> 

Yup, you got it all right.  I'll fold some of this into the function 
comment for v2.

> The rest of the patch looks good to me.
> 
> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
> 
> --
> Thanks and Regards
> gautham.
> 
> 
> 
>>   
>>   	return 0;
>>   }
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index f470b5700db58..ec32c830abc1d 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -807,32 +807,18 @@ static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
>>   
>>   static void amd_pstate_init_prefcore(struct amd_cpudata *cpudata)
>>   {
>> -	int ret, prio;
>> -	u32 highest_perf;
>> -
>> -	ret = amd_get_highest_perf(cpudata->cpu, &highest_perf);
>> -	if (ret)
>> +	/* user disabled or not detected */
>> +	if (!amd_pstate_prefcore)
>>   		return;
>>   
>>   	cpudata->hw_prefcore = true;
>> -	/* check if CPPC preferred core feature is enabled*/
>> -	if (highest_perf < CPPC_MAX_PERF)
>> -		prio = (int)highest_perf;
>> -	else {
>> -		pr_debug("AMD CPPC preferred core is unsupported!\n");
>> -		cpudata->hw_prefcore = false;
>> -		return;
>> -	}
>> -
>> -	if (!amd_pstate_prefcore)
>> -		return;
>>   
>>   	/*
>>   	 * The priorities can be set regardless of whether or not
>>   	 * sched_set_itmt_support(true) has been called and it is valid to
>>   	 * update them at any time after it has been called.
>>   	 */
>> -	sched_set_itmt_core_prio(prio, cpudata->cpu);
>> +	sched_set_itmt_core_prio((int)READ_ONCE(cpudata->highest_perf), cpudata->cpu);
>>   
>>   	schedule_work(&sched_prefcore_work);
>>   }
>> @@ -998,12 +984,12 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>   
>>   	cpudata->cpu = policy->cpu;
>>   
>> -	amd_pstate_init_prefcore(cpudata);
>> -
>>   	ret = amd_pstate_init_perf(cpudata);
>>   	if (ret)
>>   		goto free_cpudata1;
>>   
>> +	amd_pstate_init_prefcore(cpudata);
>> +
>>   	ret = amd_pstate_init_freq(cpudata);
>>   	if (ret)
>>   		goto free_cpudata1;
>> @@ -1453,12 +1439,12 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>   	cpudata->cpu = policy->cpu;
>>   	cpudata->epp_policy = 0;
>>   
>> -	amd_pstate_init_prefcore(cpudata);
>> -
>>   	ret = amd_pstate_init_perf(cpudata);
>>   	if (ret)
>>   		goto free_cpudata1;
>>   
>> +	amd_pstate_init_prefcore(cpudata);
>> +
>>   	ret = amd_pstate_init_freq(cpudata);
>>   	if (ret)
>>   		goto free_cpudata1;
>> @@ -1903,6 +1889,12 @@ static int __init amd_pstate_init(void)
>>   		static_call_update(amd_pstate_update_perf, cppc_update_perf);
>>   	}
>>   
>> +	if (amd_pstate_prefcore) {
>> +		ret = amd_detect_prefcore(&amd_pstate_prefcore);
>> +		if (ret)
>> +			return ret;
>> +	}
>> +
>>   	/* enable amd pstate feature */
>>   	ret = amd_pstate_enable(true);
>>   	if (ret) {
>> diff --git a/include/acpi/cppc_acpi.h b/include/acpi/cppc_acpi.h
>> index 2246ce0630362..1d79320a23490 100644
>> --- a/include/acpi/cppc_acpi.h
>> +++ b/include/acpi/cppc_acpi.h
>> @@ -137,10 +137,12 @@ struct cppc_cpudata {
>>   };
>>   
>>   #ifdef CONFIG_CPU_SUP_AMD
>> +extern int amd_detect_prefcore(bool *detected);
>>   extern int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf);
>>   extern int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator);
>>   #else /* !CONFIG_CPU_SUP_AMD */
>>   static inline int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf) { return -ENODEV; }
>> +static inline int amd_detect_prefcore(bool *detected) { return -ENODEV; }
>>   static inline int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator) { return -ENODEV; }
>>   #endif /* !CONFIG_CPU_SUP_AMD */
>>   
>> -- 
>> 2.43.0
>>
Mario Limonciello Aug. 27, 2024, 7:02 p.m. UTC | #10
On 8/27/2024 11:57, Gautham R. Shenoy wrote:
> On Mon, Aug 26, 2024 at 04:13:57PM -0500, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> Don't take and release the mutex when prefcore isn't present and
>> avoid initialization of variables that will be initially set
>> in the function.
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> 
>> ---
>>   drivers/cpufreq/amd-pstate.c | 10 +++++-----
>>   1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 75568d0f84623..ed05d7a0add10 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -798,17 +798,17 @@ static void amd_pstate_update_limits(unsigned int cpu)
>>   	int ret;
>>   	bool highest_perf_changed = false;
>>   
>> -	mutex_lock(&amd_pstate_driver_lock);
>> -	if ((!amd_pstate_prefcore) || (!cpudata->hw_prefcore))
>> -		goto free_cpufreq_put;
>> +	if (!amd_pstate_prefcore)
>> +		return;
> 
> Looks good to me.
> 
> Wondering if it is worth maintaining a static key for
> amd_pstate_prefcore. Anyway it doesn't change after boot.

As there is a kernel command line option how would you pass the early 
param parsing result over without a static variable?

> 
> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
> 
Thanks!

> 
> --
> Thanks and Regards
> gautham.