
[Regression] 6.11.0-rc1: AMD CPU boot with error when CPPC feature disabled by BIOS

Message ID 20240730140111.4491-1-00107082@163.com
State New

Commit Message

David Wang July 30, 2024, 2:01 p.m. UTC
Hi,

I noticed some kernel warnings and errors after updating to 6.11.0-rc1:

 kernel: [    1.022739] amd_pstate: The CPPC feature is supported but currently disabled by the BIOS.
 kernel: [    1.022739] Please enable it if your BIOS has the CPPC option.
 kernel: [    1.098054] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.110058] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.122057] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.134062] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.134641] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.135128] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.135693] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.136371] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
 kernel: [    1.136390] amd_pstate: failed to register with return -19
 kernel: [    1.138410] ledtrig-cpu: registered to indicate activity on CPUs


The warning messages were introduced by commit:
 bff7d13c190ad98cf4f877189b022c75df4cb383 ("cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS")
, which makes sense.

The error messages were introduced by commit:
 8f8b42c1fcc939a73b547b172a9ffcb65ef4bf47 ("cpufreq: amd-pstate: optimize the initial frequency values verification")
When CPPC is disabled by the BIOS, this error message does not make sense, and the error return code aborts the driver registration.
This case could be handled earlier, when the CPPC feature is detected.

I feel the following change would make a clean fix: do not register the amd_pstate driver when CPPC is disabled by the BIOS.



Thanks
David

Comments

Gautham R. Shenoy Aug. 2, 2024, 5:02 a.m. UTC | #1
Hello David,

"David Wang" <00107082@163.com> writes:

> Hi,
>
> At 2024-07-31 18:12:12, "Gautham R.Shenoy" <gautham.shenoy@amd.com> wrote:
>>Hello David,
>>
>>David Wang <00107082@163.com> writes:
>>
>>> Hi,
>>>
>>> I noticed some kernel warnings and errors after updating to 6.11.0-rc1:
>>>
>>>  kernel: [    1.022739] amd_pstate: The CPPC feature is supported but currently disabled by the BIOS.
>>>  kernel: [    1.022739] Please enable it if your BIOS has the CPPC option.
>>>  kernel: [    1.098054] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.110058] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.122057] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.134062] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.134641] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.135128] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.135693] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.136371] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.136390] amd_pstate: failed to register with return -19
>>>  kernel: [    1.138410] ledtrig-cpu: registered to indicate activity on CPUs
>>>
>>>
>>> The warning messages were introduced by commit:
>>>  bff7d13c190ad98cf4f877189b022c75df4cb383 ("cpufreq: amd-pstate: add debug message while CPPC is supported and disabled by SBIOS")
>>> , which makes sense.
>>
>>
>>If CPPC is disabled in the BIOS, then the _CPC objects shouldn't have
>>been created. And the error message that you should have seen is
>>"the _CPC object is not present in SBIOS or ACPI disabled".
>>
>>
>>Could you please share the family and model number of the platform where
>>you are observing this ?
>
> My `cat /proc/cpuinfo` shows something as following:
> processor	: 0
> vendor_id	: AuthenticAMD
> cpu family	: 23
> model		: 113


This is Family 0x17 (Zen2), Model 0x71. AFAIK, this processor supports
CPPC but does not have the support for the CPPC MSRs. Hence the CPPC
communication occurs via shared-memory.

Hence the warning introduced by the commit bff7d13c190a ("cpufreq:
amd-pstate: add debug message while CPPC is supported and disabled by
SBIOS") is not applicable on your platform. I will send a patch to
rectify this which avoids the warning for Zen2 Models 0x70-0x7F.
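
For reference, the decimal values printed by /proc/cpuinfo (family 23, model 113) correspond to the hexadecimal Family 0x17, Model 0x71 named above. The following is a minimal user-space sketch of such a range check. The helper name is_zen2_model_7x() is hypothetical and is not a kernel function; the 0x70-0x7F range is simply the one mentioned in the preceding paragraph.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only: returns true for
 * Family 0x17 (decimal 23) with a model in the range 0x70-0x7F
 * (decimal 112-127).
 */
bool is_zen2_model_7x(unsigned int family, unsigned int model)
{
	return family == 0x17 && model >= 0x70 && model <= 0x7f;
}
```

Called with the values from the report, is_zen2_model_7x(23, 113) returns true, since 23 == 0x17 and 113 == 0x71.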

Regarding the following errors that you are observing 

>>>  kernel: [    1.098054] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.110058] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.122057] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.134062] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.134641] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.135128] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.135693] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.136371] amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
>>>  kernel: [    1.136390] amd_pstate: failed to register with return -19


it appears that the CPPC version on your platform is v2 which does not
advertise the nominal_freq and the lowest_freq. In the absence of these,
it is not possible for the amd-pstate driver to infer the
min/max_freq, which is why the driver bails out at this later stage.
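
To illustrate why missing nominal_freq and lowest_freq values end in a registration failure, here is a simplified user-space sketch. It is not the driver's code: the struct, the function name and the exact scaling are assumptions made for illustration. It only models the idea that the frequencies are scaled from the reported values and that any zero result is rejected with -ENODEV, which on Linux is the -19 seen in the log.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for the reported capabilities; not the kernel's struct. */
struct cppc_caps_sketch {
	uint32_t highest_perf;
	uint32_t nominal_perf;
	uint32_t lowest_freq_mhz;	/* 0 when the firmware does not report it */
	uint32_t nominal_freq_mhz;	/* 0 when the firmware does not report it */
};

/*
 * Derive min/max frequency in kHz from the reported values.
 * Returns 0 on success, or -ENODEV if any resulting value is 0.
 */
int derive_freqs_sketch(const struct cppc_caps_sketch *caps,
			uint32_t *min_khz, uint32_t *max_khz)
{
	uint64_t max_mhz = 0;

	/* Scale the nominal frequency by highest_perf / nominal_perf. */
	if (caps->nominal_perf)
		max_mhz = (uint64_t)caps->nominal_freq_mhz *
			  caps->highest_perf / caps->nominal_perf;

	*min_khz = caps->lowest_freq_mhz * 1000;
	*max_khz = (uint32_t)(max_mhz * 1000);

	/* With nominal_freq and lowest_freq missing, everything is 0 here. */
	if (!*min_khz || !*max_khz || !caps->nominal_freq_mhz)
		return -ENODEV;

	return 0;
}
```

With lowest_freq_mhz and nominal_freq_mhz both 0, the sketch returns -ENODEV no matter what the perf values are, which matches the min_freq(0)/max_freq(0)/nominal_freq(0) pattern in the messages.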

The way around it is to add a quirk for your BIOS as done in this commit
from Perry:
eb8b6c368202 ("cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing")


--
Thanks and Regards
gautham.

Luna Nova Sept. 26, 2024, 8:56 p.m. UTC | #2
Hi Gautham,

I'm seeing the same message on a server board with an EPYC Rome 7K62 CPU.
CPPC is set to enabled in the UEFI firmware settings.

Kernel: 6.11.0 (6.11.0 #1-NixOS SMP PREEMPT_DYNAMIC Sun Sep 15 14:57:56 UTC 2024 x86_64 GNU/Linux)
Board: Gigabyte MZ22-G20-00 Rev 1.0 (in a G292-Z20 Rev 100)
UEFI Firmware: R23_F01 (2021-09-06, latest available version at time of this message)
AGESA PI Version 1.0.0.C.

CONFIG_ACPI_CPPC_LIB=y
CONFIG_X86_AMD_PSTATE=y
CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=3
CONFIG_X86_AMD_PSTATE_UT=m

$ cat /proc/cmdline
initrd=\EFI\nixos\z16gakzlwypxbjzm5y93x10cjmxjvial-initrd-linux-6.11-initrd.efi init=/nix/store/cqhw9x7w7dc3avwri4i2lk0mgc31arll-nixos-system-tsukiakari-nixos-24.11/init sysrq_always_enabled fsck.mode=force loglevel=4 audit=0 amd_pstate=guided amd_pstate.shared_mem=1 amdgpu.lockup_timeout=10000,10000,10000,10000
$ sudo dmesg | grep pstate
amd_pstate: min_freq(0) or max_freq(0) or nominal_freq(0) value is incorrect
(Repeats for each core)
amd_pstate: failed to register with return -19
stage-1-init: [Thu Sep 26 20:04:53 UTC 2024] loading module amd_pstate_ut...
amd_pstate_ut: 1    amd_pstate_ut_acpi_cpc_valid  success!
amd_pstate_ut: 2    amd_pstate_ut_check_enabled   success!
amd_pstate_ut: 3    amd_pstate_ut_check_perf      success!
amd_pstate_ut: 4    amd_pstate_ut_check_freq      success!

It seems odd that amd_pstate fails to load but amd_pstate_ut reports success for all checks.

> it appears that the CPPC version on your platform is v2 which does not
> advertise the nominal_freq and the lowest_freq. In the absence of these,
> it is not possible for the amd-pstate driver to infer the
> min/max_freq. Which is why the driver bails at this later stage.

> The way around it is to add a quirk for your BIOS as done in this commit
> from Perry:
> eb8b6c368202 ("cpufreq: amd-pstate: Add quirk for the pstate CPPC capabilities missing")

Perry's patch you referenced as an example above targets the same 7K62 CPU but requires one specific BIOS version.
Should I submit a patch adding the version on this system to that quirk?

I'm confused by the quirk code: it's called "AMD EPYC 7K62" but it matches by BIOS revision and doesn't check the CPU model.
An earlier version of the quirk included `boot_cpu_data.x86 == 0x17 && boot_cpu_data.x86_model == 0x31` to check the model; it now uses the nominal frequencies for a 7K62 regardless of the CPU model if the BIOS revision matches.
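
As a sketch of the alternative being described, a quirk lookup could require both the BIOS version and the CPU family/model to match. Everything below is hypothetical: the struct, the function name, the BIOS string and the frequency values are placeholders for illustration and are not taken from the kernel source or from the referenced commit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical quirk entry; not the kernel's data structure. */
struct quirk_sketch {
	const char *bios_version;
	unsigned int family;
	unsigned int model;
	unsigned int nominal_freq_mhz;	/* placeholder value */
	unsigned int lowest_freq_mhz;	/* placeholder value */
};

static const struct quirk_sketch quirks[] = {
	{ "EXAMPLE_BIOS_1", 0x17, 0x31, 2600, 550 },
};

/*
 * Return the matching quirk only if the BIOS version AND the CPU
 * family/model all match; otherwise return NULL.
 */
const struct quirk_sketch *find_quirk_sketch(const char *bios_version,
					     unsigned int family,
					     unsigned int model)
{
	for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
		const struct quirk_sketch *q = &quirks[i];

		if (strcmp(q->bios_version, bios_version) == 0 &&
		    q->family == family && q->model == model)
			return q;
	}
	return NULL;
}
```

With a check like this, a different CPU model running the same BIOS revision would not pick up the hard-coded frequencies, at the cost of needing one entry per BIOS version and model combination.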

Best,
Luna

Patch

diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 68c616b572f2..b06faea58fd4 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -1837,8 +1837,6 @@  static bool amd_cppc_supported(void)
         * If the CPPC feature is disabled in the BIOS for processors that support MSR-based CPPC,
         * the AMD Pstate driver may not function correctly.
         * Check the CPPC flag and display a warning message if the platform supports CPPC.
-        * Note: below checking code will not abort the driver registeration process because of
-        * the code is added for debugging purposes.
         */
        if (!cpu_feature_enabled(X86_FEATURE_CPPC)) {
                if (cpu_feature_enabled(X86_FEATURE_ZEN1) || cpu_feature_enabled(X86_FEATURE_ZEN2)) {
@@ -1856,6 +1854,8 @@  static bool amd_cppc_supported(void)
-       if (warn)
+       if (warn) {
                pr_warn_once("The CPPC feature is supported but currently disabled by the BIOS.\n"
                                        "Please enable it if your BIOS has the CPPC option.\n");
+               return false;
+       }
        return true;
 }