Message ID | 20230905015116.2268926-4-li.meng@amd.com |
---|---|
State | Superseded |
Series | [V5,1/7] x86: Drop CPU_SUP_INTEL from SCHED_MC_PRIO for the expansion. |
Hi Meng,
kernel test robot noticed the following build warnings:
[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on linus/master v6.5 next-20230905]
[cannot apply to tip/x86/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Meng-Li/x86-Drop-CPU_SUP_INTEL-from-SCHED_MC_PRIO-for-the-expansion/20230906-003754
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230905015116.2268926-4-li.meng%40amd.com
patch subject: [PATCH V5 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
config: x86_64-randconfig-r022-20230906 (https://download.01.org/0day-ci/archive/20230906/202309061049.2ag7qkvI-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230906/202309061049.2ag7qkvI-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309061049.2ag7qkvI-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/cpufreq/amd-pstate.c:1131:8: warning: unused variable 'dev_attr_hw_prefcore' [-Wunused-variable]
static DEVICE_ATTR_RO(hw_prefcore);
^
include/linux/device.h:198:26: note: expanded from macro 'DEVICE_ATTR_RO'
struct device_attribute dev_attr_##_name = __ATTR_RO(_name)
^
<scratch space>:91:1: note: expanded from here
dev_attr_hw_prefcore
^
1 warning generated.
vim +/dev_attr_hw_prefcore +1131 drivers/cpufreq/amd-pstate.c
1126
1127 cpufreq_freq_attr_ro(amd_pstate_highest_perf);
1128 cpufreq_freq_attr_rw(energy_performance_preference);
1129 cpufreq_freq_attr_ro(energy_performance_available_preferences);
1130 static DEVICE_ATTR_RW(status);
> 1131 static DEVICE_ATTR_RO(hw_prefcore);
1132 static DEVICE_ATTR_RO(prefcore);
1133
Hi Meng,
kernel test robot noticed the following build warnings:
[auto build test WARNING on rafael-pm/linux-next]
[also build test WARNING on linus/master v6.5 next-20230906]
[cannot apply to tip/x86/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Meng-Li/x86-Drop-CPU_SUP_INTEL-from-SCHED_MC_PRIO-for-the-expansion/20230906-003754
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20230905015116.2268926-4-li.meng%40amd.com
patch subject: [PATCH V5 3/7] cpufreq: amd-pstate: Enable amd-pstate preferred core supporting.
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20230906/202309061958.4wimkcbo-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230906/202309061958.4wimkcbo-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309061958.4wimkcbo-lkp@intel.com/
All warnings (new ones prefixed by >>):
In file included from include/linux/node.h:18,
from include/linux/cpu.h:17,
from include/linux/cpufreq.h:12,
from drivers/cpufreq/amd-pstate.c:30:
>> include/linux/device.h:198:33: warning: 'dev_attr_hw_prefcore' defined but not used [-Wunused-variable]
198 | struct device_attribute dev_attr_##_name = __ATTR_RO(_name)
| ^~~~~~~~~
drivers/cpufreq/amd-pstate.c:1131:8: note: in expansion of macro 'DEVICE_ATTR_RO'
1131 | static DEVICE_ATTR_RO(hw_prefcore);
| ^~~~~~~~~~~~~~
vim +/dev_attr_hw_prefcore +198 include/linux/device.h
ca22e56debc57b4 Kay Sievers 2011-12-14 123
ca22e56debc57b4 Kay Sievers 2011-12-14 124 ssize_t device_show_ulong(struct device *dev, struct device_attribute *attr,
ca22e56debc57b4 Kay Sievers 2011-12-14 125 char *buf);
ca22e56debc57b4 Kay Sievers 2011-12-14 126 ssize_t device_store_ulong(struct device *dev, struct device_attribute *attr,
ca22e56debc57b4 Kay Sievers 2011-12-14 127 const char *buf, size_t count);
ca22e56debc57b4 Kay Sievers 2011-12-14 128 ssize_t device_show_int(struct device *dev, struct device_attribute *attr,
ca22e56debc57b4 Kay Sievers 2011-12-14 129 char *buf);
ca22e56debc57b4 Kay Sievers 2011-12-14 130 ssize_t device_store_int(struct device *dev, struct device_attribute *attr,
ca22e56debc57b4 Kay Sievers 2011-12-14 131 const char *buf, size_t count);
91872392f08486f Borislav Petkov 2012-10-09 132 ssize_t device_show_bool(struct device *dev, struct device_attribute *attr,
91872392f08486f Borislav Petkov 2012-10-09 133 char *buf);
91872392f08486f Borislav Petkov 2012-10-09 134 ssize_t device_store_bool(struct device *dev, struct device_attribute *attr,
91872392f08486f Borislav Petkov 2012-10-09 135 const char *buf, size_t count);
ca22e56debc57b4 Kay Sievers 2011-12-14 136
cd00bc2ca42705b James Seo 2023-05-08 137 /**
cd00bc2ca42705b James Seo 2023-05-08 138 * DEVICE_ATTR - Define a device attribute.
cd00bc2ca42705b James Seo 2023-05-08 139 * @_name: Attribute name.
cd00bc2ca42705b James Seo 2023-05-08 140 * @_mode: File mode.
cd00bc2ca42705b James Seo 2023-05-08 141 * @_show: Show handler. Optional, but mandatory if attribute is readable.
cd00bc2ca42705b James Seo 2023-05-08 142 * @_store: Store handler. Optional, but mandatory if attribute is writable.
cd00bc2ca42705b James Seo 2023-05-08 143 *
cd00bc2ca42705b James Seo 2023-05-08 144 * Convenience macro for defining a struct device_attribute.
cd00bc2ca42705b James Seo 2023-05-08 145 *
cd00bc2ca42705b James Seo 2023-05-08 146 * For example, ``DEVICE_ATTR(foo, 0644, foo_show, foo_store);`` expands to:
cd00bc2ca42705b James Seo 2023-05-08 147 *
cd00bc2ca42705b James Seo 2023-05-08 148 * .. code-block:: c
cd00bc2ca42705b James Seo 2023-05-08 149 *
cd00bc2ca42705b James Seo 2023-05-08 150 * struct device_attribute dev_attr_foo = {
cd00bc2ca42705b James Seo 2023-05-08 151 * .attr = { .name = "foo", .mode = 0644 },
cd00bc2ca42705b James Seo 2023-05-08 152 * .show = foo_show,
cd00bc2ca42705b James Seo 2023-05-08 153 * .store = foo_store,
cd00bc2ca42705b James Seo 2023-05-08 154 * };
cd00bc2ca42705b James Seo 2023-05-08 155 */
a7fd67062efc5b0 Kay Sievers 2005-10-01 156 #define DEVICE_ATTR(_name, _mode, _show, _store) \
a7fd67062efc5b0 Kay Sievers 2005-10-01 157 struct device_attribute dev_attr_##_name = __ATTR(_name, _mode, _show, _store)
cd00bc2ca42705b James Seo 2023-05-08 158
cd00bc2ca42705b James Seo 2023-05-08 159 /**
cd00bc2ca42705b James Seo 2023-05-08 160 * DEVICE_ATTR_PREALLOC - Define a preallocated device attribute.
cd00bc2ca42705b James Seo 2023-05-08 161 * @_name: Attribute name.
cd00bc2ca42705b James Seo 2023-05-08 162 * @_mode: File mode.
cd00bc2ca42705b James Seo 2023-05-08 163 * @_show: Show handler. Optional, but mandatory if attribute is readable.
cd00bc2ca42705b James Seo 2023-05-08 164 * @_store: Store handler. Optional, but mandatory if attribute is writable.
cd00bc2ca42705b James Seo 2023-05-08 165 *
cd00bc2ca42705b James Seo 2023-05-08 166 * Like DEVICE_ATTR(), but ``SYSFS_PREALLOC`` is set on @_mode.
cd00bc2ca42705b James Seo 2023-05-08 167 */
7fda9100bb8258b Christophe Leroy 2017-12-18 168 #define DEVICE_ATTR_PREALLOC(_name, _mode, _show, _store) \
7fda9100bb8258b Christophe Leroy 2017-12-18 169 struct device_attribute dev_attr_##_name = \
7fda9100bb8258b Christophe Leroy 2017-12-18 170 __ATTR_PREALLOC(_name, _mode, _show, _store)
cd00bc2ca42705b James Seo 2023-05-08 171
cd00bc2ca42705b James Seo 2023-05-08 172 /**
cd00bc2ca42705b James Seo 2023-05-08 173 * DEVICE_ATTR_RW - Define a read-write device attribute.
cd00bc2ca42705b James Seo 2023-05-08 174 * @_name: Attribute name.
cd00bc2ca42705b James Seo 2023-05-08 175 *
cd00bc2ca42705b James Seo 2023-05-08 176 * Like DEVICE_ATTR(), but @_mode is 0644, @_show is <_name>_show,
cd00bc2ca42705b James Seo 2023-05-08 177 * and @_store is <_name>_store.
cd00bc2ca42705b James Seo 2023-05-08 178 */
ced321bf9151535 Greg Kroah-Hartman 2013-07-14 179 #define DEVICE_ATTR_RW(_name) \
ced321bf9151535 Greg Kroah-Hartman 2013-07-14 180 struct device_attribute dev_attr_##_name = __ATTR_RW(_name)
cd00bc2ca42705b James Seo 2023-05-08 181
cd00bc2ca42705b James Seo 2023-05-08 182 /**
cd00bc2ca42705b James Seo 2023-05-08 183 * DEVICE_ATTR_ADMIN_RW - Define an admin-only read-write device attribute.
cd00bc2ca42705b James Seo 2023-05-08 184 * @_name: Attribute name.
cd00bc2ca42705b James Seo 2023-05-08 185 *
cd00bc2ca42705b James Seo 2023-05-08 186 * Like DEVICE_ATTR_RW(), but @_mode is 0600.
cd00bc2ca42705b James Seo 2023-05-08 187 */
3022c6a1b4b76c4 Dan Williams 2020-06-25 188 #define DEVICE_ATTR_ADMIN_RW(_name) \
3022c6a1b4b76c4 Dan Williams 2020-06-25 189 struct device_attribute dev_attr_##_name = __ATTR_RW_MODE(_name, 0600)
cd00bc2ca42705b James Seo 2023-05-08 190
cd00bc2ca42705b James Seo 2023-05-08 191 /**
cd00bc2ca42705b James Seo 2023-05-08 192 * DEVICE_ATTR_RO - Define a readable device attribute.
cd00bc2ca42705b James Seo 2023-05-08 193 * @_name: Attribute name.
cd00bc2ca42705b James Seo 2023-05-08 194 *
cd00bc2ca42705b James Seo 2023-05-08 195 * Like DEVICE_ATTR(), but @_mode is 0444 and @_show is <_name>_show.
cd00bc2ca42705b James Seo 2023-05-08 196 */
ced321bf9151535 Greg Kroah-Hartman 2013-07-14 197 #define DEVICE_ATTR_RO(_name) \
ced321bf9151535 Greg Kroah-Hartman 2013-07-14 @198 struct device_attribute dev_attr_##_name = __ATTR_RO(_name)
cd00bc2ca42705b James Seo 2023-05-08 199
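Both robot reports point at the same cause: DEVICE_ATTR_RO(hw_prefcore) defines a static dev_attr_hw_prefcore, but the patch only lists &dev_attr_prefcore.attr in pstate_global_attributes[], so nothing references the new object. The following is a userspace sketch of that token-pasting pattern, not kernel code. The struct layouts, the SKETCH_ATTR_RO macro, the simplified show() signature and the sketch_* helpers are all assumptions made for illustration. Referencing the attribute from the array is one way to silence the warning; dropping the attribute entirely is the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
struct attribute {
	const char *name;
	unsigned short mode;
};

struct device_attribute {
	struct attribute attr;
	int (*show)(char *buf, size_t len);	/* simplified signature */
};

/* Same token-pasting shape as DEVICE_ATTR_RO: defines dev_attr_<name>. */
#define SKETCH_ATTR_RO(_name)					\
	struct device_attribute dev_attr_##_name = {		\
		.attr = { .name = #_name, .mode = 0444 },	\
		.show = _name##_show,				\
	}

static int hw_prefcore_show(char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", "supported");
}

static int prefcore_show(char *buf, size_t len)
{
	return snprintf(buf, len, "%s\n", "enabled");
}

static SKETCH_ATTR_RO(hw_prefcore);
static SKETCH_ATTR_RO(prefcore);

/*
 * Listing both objects here is what makes them "used". With the
 * hw_prefcore entry removed, a W=1 style build would warn again.
 */
static struct attribute *sketch_global_attributes[] = {
	&dev_attr_prefcore.attr,
	&dev_attr_hw_prefcore.attr,
	NULL
};

/* Count entries up to the NULL terminator. */
size_t sketch_count_attributes(void)
{
	size_t n = 0;

	while (sketch_global_attributes[n])
		n++;
	return n;
}

/* Return the name of the i-th registered attribute. */
const char *sketch_attr_name(size_t i)
{
	assert(i < sketch_count_attributes());
	return sketch_global_attributes[i]->name;
}
```

Whether the attribute should be exported at all is a separate question that the review below takes up.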
[AMD Official Use Only - General]

Hi Ray:

> -----Original Message-----
> From: Huang, Ray <Ray.Huang@amd.com>
> Sent: Wednesday, September 6, 2023 9:53 PM
> To: Meng, Li (Jassmine) <Li.Meng@amd.com>
> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>;
> linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org; x86@kernel.org;
> linux-acpi@vger.kernel.org; Shuah Khan <skhan@linuxfoundation.org>;
> linux-kselftest@vger.kernel.org; Fontenot, Nathan
> <Nathan.Fontenot@amd.com>; Sharma, Deepak
> <Deepak.Sharma@amd.com>; Deucher, Alexander
> <Alexander.Deucher@amd.com>; Limonciello, Mario
> <Mario.Limonciello@amd.com>; Huang, Shimmer
> <Shimmer.Huang@amd.com>; Yuan, Perry <Perry.Yuan@amd.com>; Du,
> Xiaojian <Xiaojian.Du@amd.com>; Viresh Kumar <viresh.kumar@linaro.org>;
> Borislav Petkov <bp@alien8.de>
> Subject: Re: [PATCH V5 3/7] cpufreq: amd-pstate: Enable amd-pstate
> preferred core supporting.
>
> On Tue, Sep 05, 2023 at 09:51:12AM +0800, Meng, Li (Jassmine) wrote:
> > amd-pstate driver utilizes the functions and data structures provided
> > by the ITMT architecture to enable the scheduler to favor scheduling
> > on cores which can be get a higher frequency with lower voltage. We
> > call it amd-pstate preferrred core.
> >
> > Here sched_set_itmt_core_prio() is called to set priorities and
> > sched_set_itmt_support() is called to enable ITMT feature.
> > amd-pstate driver uses the highest performance value to indicate the
> > priority of CPU. The higher value has a higher priority.
> >
> > The initial core rankings are set up by amd-pstate when the system
> > boots.
> >
> > Add device attribute for hardware preferred core. It will check if the
> > processor and power firmware support preferred core feature.
> >
> > Add device attribute for preferred core. Only when hardware supports
> > preferred core and user set `enabled` in early parameter, it can be
> > set to enabled.
> >
> > Add one new early parameter `disable` to allow user to disable the
> > preferred core.
> >
> > Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
> > Co-developed-by: Perry Yuan <Perry.Yuan@amd.com>
> > Signed-off-by: Meng Li <li.meng@amd.com>
> > Co-developed-by: Meng Li <li.meng@amd.com>
> > Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
> > ---
> >  drivers/cpufreq/amd-pstate.c | 131 ++++++++++++++++++++++++++++++-----
> >  1 file changed, 115 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> > index 9a1e194d5cf8..454eb6e789e7 100644
> > --- a/drivers/cpufreq/amd-pstate.c
> > +++ b/drivers/cpufreq/amd-pstate.c
> > @@ -37,6 +37,7 @@
> >  #include <linux/uaccess.h>
> >  #include <linux/static_call.h>
> >  #include <linux/amd-pstate.h>
> > +#include <linux/topology.h>
> >
> >  #include <acpi/processor.h>
> >  #include <acpi/cppc_acpi.h>
> > @@ -49,6 +50,8 @@
> >
> >  #define AMD_PSTATE_TRANSITION_LATENCY	20000
> >  #define AMD_PSTATE_TRANSITION_DELAY	1000
> > +#define AMD_PSTATE_PREFCORE_THRESHOLD	166
> > +#define AMD_PSTATE_MAX_CPPC_PERF	255
> >
> >  /*
> >   * TODO: We need more time to fine tune processors with shared memory solution
> > @@ -65,6 +68,12 @@ static struct cpufreq_driver amd_pstate_epp_driver;
> >  static int cppc_state = AMD_PSTATE_UNDEFINED;
> >  static bool cppc_enabled;
> >
> > +/*HW preferred Core featue is supported*/
> > +static bool hw_prefcore = true;
> > +
> > +/*Preferred Core featue is supported*/
> > +static bool prefcore = true;
> > +
> >  /*
> >   * AMD Energy Preference Performance (EPP)
> >   * The EPP is used in the CCLK DPM controller to drive
> > @@ -290,23 +299,21 @@ static inline int amd_pstate_enable(bool enable)
> >  static int pstate_init_perf(struct amd_cpudata *cpudata)
> >  {
> >  	u64 cap1;
> > -	u32 highest_perf;
> >
> >  	int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
> >  				     &cap1);
> >  	if (ret)
> >  		return ret;
> >
> > -	/*
> > -	 * TODO: Introduce AMD specific power feature.
> > -	 *
> > -	 * CPPC entry doesn't indicate the highest performance in some ASICs.
> > +	/* For platforms that do not support the preferred core feature, the
> > +	 * highest_pef may be configured with 166 or 255, to avoid max frequency
> > +	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
> > +	 * the default max perf.
> >  	 */
> > -	highest_perf = amd_get_highest_perf();
> > -	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
> > -		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
> > -
> > -	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> > +	if (prefcore)
> > +		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> > +	else
> > +		WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
> >
> >  	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
> >  	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
> > @@ -318,17 +325,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
> >  static int cppc_init_perf(struct amd_cpudata *cpudata)
> >  {
> >  	struct cppc_perf_caps cppc_perf;
> > -	u32 highest_perf;
> >
> >  	int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
> >  	if (ret)
> >  		return ret;
> >
> > -	highest_perf = amd_get_highest_perf();
> > -	if (highest_perf > cppc_perf.highest_perf)
> > -		highest_perf = cppc_perf.highest_perf;
> > -
> > -	WRITE_ONCE(cpudata->highest_perf, highest_perf);
> > +	if (prefcore)
> > +		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
> > +	else
> > +		WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);
> >
> >  	WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
> >  	WRITE_ONCE(cpudata->lowest_nonlinear_perf,
> > @@ -676,6 +681,73 @@ static void amd_perf_ctl_reset(unsigned int cpu)
> >  	wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
> >  }
> >
> > +/*
> > + * Set amd-pstate preferred core enable can't be done directly from cpufreq callbacks
> > + * due to locking, so queue the work for later.
> > + */
> > +static void amd_pstste_sched_prefcore_workfn(struct work_struct *work)
> > +{
> > +	sched_set_itmt_support();
> > +}
> > +static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
> > +
> > +/*
> > + * Get the highest performance register value.
> > + * @cpu: CPU from which to get highest performance.
> > + * @highest_perf: Return address.
> > + *
> > + * Return: 0 for success, -EIO otherwise.
> > + */
> > +static int amd_pstate_get_highest_perf(int cpu, u64 *highest_perf)
> > +{
> > +	int ret;
> > +
> > +	if (boot_cpu_has(X86_FEATURE_CPPC)) {
> > +		u64 cap1;
> > +
> > +		ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
> > +		if (ret)
> > +			return ret;
> > +		WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
> > +	} else {
> > +		ret = cppc_get_highest_perf(cpu, highest_perf);
> > +	}
> > +
> > +	return (ret);
> > +}
> > +
> > +static void amd_pstate_init_prefcore(void)
> > +{
> > +	int cpu, ret;
> > +	u64 highest_perf;
> > +
> > +	if (!prefcore)
> > +		return;
> > +
> > +	for_each_online_cpu(cpu) {
> > +		ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
> > +		if (ret)
> > +			break;
> > +
> > +		sched_set_itmt_core_prio(highest_perf, cpu);
> > +
> > +		/* check if CPPC preferred core feature is enabled*/
> > +		if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
> > +			hw_prefcore = false;
> > +			prefcore = false;
>
> I think you should use prefcore which embeds into cpudata structure instead
> of global variable. Here, actually, you walked through all online cpus, the last
> cpu's status will overwrite the previous one.
>
[Meng, Li (Jassmine)]
The variable "prefcore" is an early kernel param. The user can set its status to enabled or disabled. I think it cannot be embedded into the "cpudata" structure.

> > +			return;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * This code can be run during CPU online under the
> > +	 * CPU hotplug locks, so sched_set_amd_prefcore_support()
> > +	 * cannot be called from here. Queue up a work item
> > +	 * to invoke it.
> > +	 */
> > +	schedule_work(&sched_prefcore_work);
> > +}
> > +
> >  static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
> >  {
> >  	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
> > @@ -1037,6 +1109,18 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
> >  	return ret < 0 ? ret : count;
> >  }
> >
> > +static ssize_t hw_prefcore_show(struct device *dev,
> > +				struct device_attribute *attr, char *buf)
> > +{
> > +	return sysfs_emit(buf, "%s\n", hw_prefcore ? "supported" : "unsupported");
> > +}
>
> Is there any requirement from user space (cpupower or other tool) to query
> the capacity at runtime? In fact, we can simplify the codes that use a print in
> the kernel to let user know whether current cpu supports prefcore in
> hardware side.
>
> Thanks,
> Ray
[Meng, Li (Jassmine)]
I will change it to a pr_debug() message.

> > +
> > +static ssize_t prefcore_show(struct device *dev,
> > +			     struct device_attribute *attr, char *buf)
> > +{
> > +	return sysfs_emit(buf, "%s\n", prefcore ? "enabled" : "disabled");
> > +}
> > +
> >  cpufreq_freq_attr_ro(amd_pstate_max_freq);
> >  cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
> >
> > @@ -1044,6 +1128,8 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
> >  cpufreq_freq_attr_rw(energy_performance_preference);
> >  cpufreq_freq_attr_ro(energy_performance_available_preferences);
> >  static DEVICE_ATTR_RW(status);
> > +static DEVICE_ATTR_RO(hw_prefcore);
> > +static DEVICE_ATTR_RO(prefcore);
> >
> >  static struct freq_attr *amd_pstate_attr[] = {
> >  	&amd_pstate_max_freq,
> > @@ -1063,6 +1149,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {
> >
> >  static struct attribute *pstate_global_attributes[] = {
> >  	&dev_attr_status.attr,
> > +	&dev_attr_prefcore.attr,
> >  	NULL
> >  };
> >
> > @@ -1506,6 +1593,8 @@ static int __init amd_pstate_init(void)
> >  		}
> >  	}
> >
> > +	amd_pstate_init_prefcore();
> > +
> >  	return ret;
> >
> >  global_attr_free:
> > @@ -1527,7 +1616,17 @@ static int __init amd_pstate_param(char *str)
> >
> >  	return amd_pstate_set_driver(mode_idx);
> >  }
> > +
> > +static int __init amd_prefcore_param(char *str)
> > +{
> > +	if (!strcmp(str, "disable"))
> > +		prefcore = false;
> > +
> > +	return 0;
> > +}
> > +
> >  early_param("amd_pstate", amd_pstate_param);
> > +early_param("amd_prefcore", amd_prefcore_param);
> >
> >  MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
> >  MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");
> > --
> > 2.34.1
> >
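The exchange above mixes two things in one flag: what the user asked for on the command line, and what each CPU's hardware reports. One possible way to reconcile the two positions, offered only as a userspace sketch and not as what either participant proposed in detail, is to keep the user request global and store the hardware status per CPU. Every sketch_* name, the struct layout and the per-CPU hw_prefcore field are assumptions; only the "disable" string test and the 255 sentinel are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_NR_CPUS		4
#define SKETCH_MAX_CPPC_PERF	255	/* mirrors AMD_PSTATE_MAX_CPPC_PERF */

/* User policy: a single global, set once from the boot parameter. */
static bool sketch_user_prefcore = true;

/* Hardware status: one flag per CPU (hypothetical field, not in the patch). */
struct sketch_cpudata {
	unsigned int highest_perf;
	bool hw_prefcore;
};

static struct sketch_cpudata sketch_cpus[SKETCH_NR_CPUS];

/* Same string test as amd_prefcore_param() in the patch. */
int sketch_prefcore_param(const char *str)
{
	if (!strcmp(str, "disable"))
		sketch_user_prefcore = false;
	return 0;
}

/* Record what one CPU reports; other CPUs' flags are left untouched. */
void sketch_record_cpu(int cpu, unsigned int highest_perf)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_cpus[cpu].highest_perf = highest_perf;
	/* The patch treats a highest_perf of 255 as "no hardware support". */
	sketch_cpus[cpu].hw_prefcore = (highest_perf != SKETCH_MAX_CPPC_PERF);
}

/* Preferred core applies to a CPU only if user and hardware both allow it. */
bool sketch_prefcore_active(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_user_prefcore && sketch_cpus[cpu].hw_prefcore;
}
```

With this split, probing one CPU never changes the answer for another, and the boot parameter still acts as a global switch.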
On Tue, Sep 05, 2023 at 09:51:12AM +0800, Meng Li wrote:
> +	/*
> +	 * This code can be run during CPU online under the
> +	 * CPU hotplug locks, so sched_set_amd_prefcore_support()

There is no such function... ?

> +	 * cannot be called from here. Queue up a work item
> +	 * to invoke it.
> +	 */
> +	schedule_work(&sched_prefcore_work);
On Tue, Sep 05, 2023 at 09:51:12AM +0800, Meng Li wrote:
> +static void amd_pstate_init_prefcore(void)
> +{
> +	int cpu, ret;
> +	u64 highest_perf;
> +
> +	if (!prefcore)
> +		return;
> +
> +	for_each_online_cpu(cpu) {
> +		ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
> +		if (ret)
> +			break;
> +
> +		sched_set_itmt_core_prio(highest_perf, cpu);
> +
> +		/* check if CPPC preferred core feature is enabled*/
> +		if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
> +			hw_prefcore = false;
> +			prefcore = false;
> +			return;
> +		}
> +	}
> +
> +	/*
> +	 * This code can be run during CPU online under the
> +	 * CPU hotplug locks, so sched_set_amd_prefcore_support()
> +	 * cannot be called from here. Queue up a work item
> +	 * to invoke it.
> +	 */
> +	schedule_work(&sched_prefcore_work);
> +}
> @@ -1506,6 +1593,8 @@ static int __init amd_pstate_init(void)
>  		}
>  	}
>
> +	amd_pstate_init_prefcore();
> +
>  	return ret;
>
>  global_attr_free:

I'm confused,... you call amd_pstate_init_prefcore() at device_initcall().
Once per boot. Then it iterates all online CPUs.. But what if you boot
with some CPUs offline and bring them online later?
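The concern can be illustrated with a small userspace simulation. It is a sketch under stated assumptions, not kernel code: the CPU count, the perf values and every sketch_* name are invented, and sketch_cpu_online() stands in for whatever per-CPU online path a driver might use. It shows that a ranking pass run once over the CPUs online at init leaves a later-onlined CPU unranked unless the online path ranks it too.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 8

/* Made-up per-CPU highest-perf values standing in for the CPPC reads. */
static const int sketch_hw_perf[SKETCH_NR_CPUS] = {
	166, 161, 156, 151, 146, 141, 136, 131
};

static bool sketch_online[SKETCH_NR_CPUS];
static int sketch_prio[SKETCH_NR_CPUS];	/* 0 means "never ranked" */

/* Stand-in for setting a scheduler priority for one CPU. */
static void sketch_rank_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_prio[cpu] = sketch_hw_perf[cpu];
}

/* Boot with the first n CPUs online, then rank once over online CPUs. */
void sketch_boot_and_init(int n)
{
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		sketch_online[cpu] = cpu < n;
		sketch_prio[cpu] = 0;
	}
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		if (sketch_online[cpu])
			sketch_rank_cpu(cpu);
}

/* Hypothetical online hook; rank_here selects whether it ranks the CPU. */
void sketch_cpu_online(int cpu, bool rank_here)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_online[cpu] = true;
	if (rank_here)
		sketch_rank_cpu(cpu);
}

/* Read back the priority recorded for a CPU. */
int sketch_get_prio(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_prio[cpu];
}
```

For example, after sketch_boot_and_init(4), calling sketch_cpu_online(5, false) leaves CPU 5 with priority 0, which is the situation the question describes, while sketch_cpu_online(6, true) gives CPU 6 its own value.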
diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
index 9a1e194d5cf8..454eb6e789e7 100644
--- a/drivers/cpufreq/amd-pstate.c
+++ b/drivers/cpufreq/amd-pstate.c
@@ -37,6 +37,7 @@
 #include <linux/uaccess.h>
 #include <linux/static_call.h>
 #include <linux/amd-pstate.h>
+#include <linux/topology.h>

 #include <acpi/processor.h>
 #include <acpi/cppc_acpi.h>
@@ -49,6 +50,8 @@

 #define AMD_PSTATE_TRANSITION_LATENCY	20000
 #define AMD_PSTATE_TRANSITION_DELAY	1000
+#define AMD_PSTATE_PREFCORE_THRESHOLD	166
+#define AMD_PSTATE_MAX_CPPC_PERF	255

 /*
  * TODO: We need more time to fine tune processors with shared memory solution
@@ -65,6 +68,12 @@ static struct cpufreq_driver amd_pstate_epp_driver;
 static int cppc_state = AMD_PSTATE_UNDEFINED;
 static bool cppc_enabled;

+/*HW preferred Core featue is supported*/
+static bool hw_prefcore = true;
+
+/*Preferred Core featue is supported*/
+static bool prefcore = true;
+
 /*
  * AMD Energy Preference Performance (EPP)
  * The EPP is used in the CCLK DPM controller to drive
@@ -290,23 +299,21 @@ static inline int amd_pstate_enable(bool enable)
 static int pstate_init_perf(struct amd_cpudata *cpudata)
 {
 	u64 cap1;
-	u32 highest_perf;

 	int ret = rdmsrl_safe_on_cpu(cpudata->cpu, MSR_AMD_CPPC_CAP1,
 				     &cap1);
 	if (ret)
 		return ret;

-	/*
-	 * TODO: Introduce AMD specific power feature.
-	 *
-	 * CPPC entry doesn't indicate the highest performance in some ASICs.
+	/* For platforms that do not support the preferred core feature, the
+	 * highest_pef may be configured with 166 or 255, to avoid max frequency
+	 * calculated wrongly. we take the AMD_CPPC_HIGHEST_PERF(cap1) value as
+	 * the default max perf.
 	 */
-	highest_perf = amd_get_highest_perf();
-	if (highest_perf > AMD_CPPC_HIGHEST_PERF(cap1))
-		highest_perf = AMD_CPPC_HIGHEST_PERF(cap1);
-
-	WRITE_ONCE(cpudata->highest_perf, highest_perf);
+	if (prefcore)
+		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
+	else
+		WRITE_ONCE(cpudata->highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));

 	WRITE_ONCE(cpudata->nominal_perf, AMD_CPPC_NOMINAL_PERF(cap1));
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf, AMD_CPPC_LOWNONLIN_PERF(cap1));
@@ -318,17 +325,15 @@ static int pstate_init_perf(struct amd_cpudata *cpudata)
 static int cppc_init_perf(struct amd_cpudata *cpudata)
 {
 	struct cppc_perf_caps cppc_perf;
-	u32 highest_perf;

 	int ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
 	if (ret)
 		return ret;

-	highest_perf = amd_get_highest_perf();
-	if (highest_perf > cppc_perf.highest_perf)
-		highest_perf = cppc_perf.highest_perf;
-
-	WRITE_ONCE(cpudata->highest_perf, highest_perf);
+	if (prefcore)
+		WRITE_ONCE(cpudata->highest_perf, AMD_PSTATE_PREFCORE_THRESHOLD);
+	else
+		WRITE_ONCE(cpudata->highest_perf, cppc_perf.highest_perf);

 	WRITE_ONCE(cpudata->nominal_perf, cppc_perf.nominal_perf);
 	WRITE_ONCE(cpudata->lowest_nonlinear_perf,
@@ -676,6 +681,73 @@ static void amd_perf_ctl_reset(unsigned int cpu)
 	wrmsrl_on_cpu(cpu, MSR_AMD_PERF_CTL, 0);
 }

+/*
+ * Set amd-pstate preferred core enable can't be done directly from cpufreq callbacks
+ * due to locking, so queue the work for later.
+ */
+static void amd_pstste_sched_prefcore_workfn(struct work_struct *work)
+{
+	sched_set_itmt_support();
+}
+static DECLARE_WORK(sched_prefcore_work, amd_pstste_sched_prefcore_workfn);
+
+/*
+ * Get the highest performance register value.
+ * @cpu: CPU from which to get highest performance.
+ * @highest_perf: Return address.
+ *
+ * Return: 0 for success, -EIO otherwise.
+ */
+static int amd_pstate_get_highest_perf(int cpu, u64 *highest_perf)
+{
+	int ret;
+
+	if (boot_cpu_has(X86_FEATURE_CPPC)) {
+		u64 cap1;
+
+		ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &cap1);
+		if (ret)
+			return ret;
+		WRITE_ONCE(*highest_perf, AMD_CPPC_HIGHEST_PERF(cap1));
+	} else {
+		ret = cppc_get_highest_perf(cpu, highest_perf);
+	}
+
+	return (ret);
+}
+
+static void amd_pstate_init_prefcore(void)
+{
+	int cpu, ret;
+	u64 highest_perf;
+
+	if (!prefcore)
+		return;
+
+	for_each_online_cpu(cpu) {
+		ret = amd_pstate_get_highest_perf(cpu, &highest_perf);
+		if (ret)
+			break;
+
+		sched_set_itmt_core_prio(highest_perf, cpu);
+
+		/* check if CPPC preferred core feature is enabled*/
+		if (highest_perf == AMD_PSTATE_MAX_CPPC_PERF) {
+			hw_prefcore = false;
+			prefcore = false;
+			return;
+		}
+	}
+
+	/*
+	 * This code can be run during CPU online under the
+	 * CPU hotplug locks, so sched_set_amd_prefcore_support()
+	 * cannot be called from here. Queue up a work item
+	 * to invoke it.
+	 */
+	schedule_work(&sched_prefcore_work);
+}
+
 static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
 {
 	int min_freq, max_freq, nominal_freq, lowest_nonlinear_freq, ret;
@@ -1037,6 +1109,18 @@ static ssize_t status_store(struct device *a, struct device_attribute *b,
 	return ret < 0 ? ret : count;
 }

+static ssize_t hw_prefcore_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n", hw_prefcore ? "supported" : "unsupported");
+}
+
+static ssize_t prefcore_show(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%s\n", prefcore ? "enabled" : "disabled");
+}
+
 cpufreq_freq_attr_ro(amd_pstate_max_freq);
 cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);

@@ -1044,6 +1128,8 @@ cpufreq_freq_attr_ro(amd_pstate_highest_perf);
 cpufreq_freq_attr_rw(energy_performance_preference);
 cpufreq_freq_attr_ro(energy_performance_available_preferences);
 static DEVICE_ATTR_RW(status);
+static DEVICE_ATTR_RO(hw_prefcore);
+static DEVICE_ATTR_RO(prefcore);

 static struct freq_attr *amd_pstate_attr[] = {
 	&amd_pstate_max_freq,
@@ -1063,6 +1149,7 @@ static struct freq_attr *amd_pstate_epp_attr[] = {

 static struct attribute *pstate_global_attributes[] = {
 	&dev_attr_status.attr,
+	&dev_attr_prefcore.attr,
 	NULL
 };

@@ -1506,6 +1593,8 @@ static int __init amd_pstate_init(void)
 		}
 	}

+	amd_pstate_init_prefcore();
+
 	return ret;

 global_attr_free:
@@ -1527,7 +1616,17 @@ static int __init amd_pstate_param(char *str)

 	return amd_pstate_set_driver(mode_idx);
 }
+
+static int __init amd_prefcore_param(char *str)
+{
+	if (!strcmp(str, "disable"))
+		prefcore = false;
+
+	return 0;
+}
+
 early_param("amd_pstate", amd_pstate_param);
+early_param("amd_prefcore", amd_prefcore_param);

 MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
 MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");