Message ID | 47159248.fMDQidcC6G@rjwysocki.net |
---|---|
State | New |
Series | cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT |
On 4/16/25 19:12, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> On some hybrid platforms some efficient CPUs (E-cores) are not connected
> to the L3 cache, but there are no other differences between them and the
> other E-cores that use L3. In that case, it is generally more efficient
> to run "light" workloads on the E-cores that do not use L3 and allow all
> of the cores using L3, including P-cores, to go into idle states.
>
> For this reason, slightly increase the cost for all CPUs sharing the L3
> cache to make EAS prefer CPUs that do not use it to the other CPUs with
> the same perf-to-frequency scaling factor (if any).
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpufreq/intel_pstate.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -979,6 +979,7 @@
>                                unsigned long *cost)
>  {
>          struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
> +        struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
>
>          /*
>           * The smaller the perf-to-frequency scaling factor, the larger the IPC
> @@ -991,6 +992,13 @@
>           * of the same type in different "utilization bins" is different.
>           */
>          *cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
> +        /*
> +         * Inrease the cost slightly for CPUs able to access L3 to avoid litting

s/Inrease/Increase
and I guess s/litting/littering

> +         * it up too eagerly in case some other CPUs of the same type cannot
> +         * access it.
> +         */
> +        if (cacheinfo->num_levels >= 3)
> +                (*cost)++;

This makes cost(OPP1) of the SoC Tile e-core as expensive as cost(OPP0) of a
normal e-core.
Is that the intended behaviour?

On Fri, Apr 25, 2025 at 11:32 PM Christian Loehle <christian.loehle@arm.com> wrote:
>
> On 4/16/25 19:12, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > On some hybrid platforms some efficient CPUs (E-cores) are not connected
> > to the L3 cache, but there are no other differences between them and the
> > other E-cores that use L3. In that case, it is generally more efficient
> > to run "light" workloads on the E-cores that do not use L3 and allow all
> > of the cores using L3, including P-cores, to go into idle states.
> >
> > For this reason, slightly increase the cost for all CPUs sharing the L3
> > cache to make EAS prefer CPUs that do not use it to the other CPUs with
> > the same perf-to-frequency scaling factor (if any).
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/cpufreq/intel_pstate.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -979,6 +979,7 @@
> >                                unsigned long *cost)
> >  {
> >          struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
> > +        struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
> >
> >          /*
> >           * The smaller the perf-to-frequency scaling factor, the larger the IPC
> > @@ -991,6 +992,13 @@
> >           * of the same type in different "utilization bins" is different.
> >           */
> >          *cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
> > +        /*
> > +         * Inrease the cost slightly for CPUs able to access L3 to avoid litting
>
> s/Inrease/Increase
> and I guess s/litting/littering
>
> > +         * it up too eagerly in case some other CPUs of the same type cannot
> > +         * access it.
> > +         */
> > +        if (cacheinfo->num_levels >= 3)

This check actually doesn't work on Intel processors, I have a
replacement patch for this one.

> > +                (*cost)++;
>
> This makes cost(OPP1) of the SoC Tile e-core as expensive as cost(OPP0) of a
> normal e-core.

If "a normal Ecore" is one using L3, then yes.

> Is that the intended behaviour?

Yes, it is. I wanted the Ecores on L3 to appear somewhat more
expensive, but not too much. It looks like *cost += 2 would work
better, though.

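The arithmetic behind the exchange above can be checked with a small userspace model. This is only a sketch, not the driver code: `model_cost()`, its `uses_l3` flag, and its `bump` parameter are hypothetical names introduced here, `CORE_SCALING` is an assumed stand-in for `INTEL_PSTATE_CORE_SCALING`, and `freq` is treated as a small OPP index, following the OPP0/OPP1 notation used in the review.

```c
#include <stdint.h>

/* Assumed stand-in for the kernel's INTEL_PSTATE_CORE_SCALING. */
#define CORE_SCALING 100000ULL

/*
 * Hypothetical userspace model of the cost computation in the patch.
 * "uses_l3" replaces the cacheinfo check and "bump" is the extra cost
 * added for CPUs that can access L3 (1 in the patch as posted, 2 as
 * suggested later in the thread).
 */
unsigned long model_cost(unsigned long scaling, unsigned long freq,
                         int uses_l3, unsigned long bump)
{
    /* Same shape as: div_u64(100ULL * CORE_SCALING, scaling) + freq */
    unsigned long cost = (unsigned long)(100ULL * CORE_SCALING / scaling) + freq;

    if (uses_l3)
        cost += bump;

    return cost;
}
```

Under these assumptions, with a scaling factor of 100000 for both kinds of E-core, `model_cost(100000, 1, 0, 1)` (E-core without L3 at OPP1) and `model_cost(100000, 0, 1, 1)` (E-core on L3 at OPP0) return the same value, which is the tie pointed out in the review. With `bump` set to 2, the E-core on L3 at OPP0 instead ties with the E-core without L3 at OPP2.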
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -979,6 +979,7 @@
                               unsigned long *cost)
 {
         struct pstate_data *pstate = &all_cpu_data[dev->id]->pstate;
+        struct cpu_cacheinfo *cacheinfo = get_cpu_cacheinfo(dev->id);
 
         /*
          * The smaller the perf-to-frequency scaling factor, the larger the IPC
@@ -991,6 +992,13 @@
          * of the same type in different "utilization bins" is different.
          */
         *cost = div_u64(100ULL * INTEL_PSTATE_CORE_SCALING, pstate->scaling) + freq;
+        /*
+         * Inrease the cost slightly for CPUs able to access L3 to avoid litting
+         * it up too eagerly in case some other CPUs of the same type cannot
+         * access it.
+         */
+        if (cacheinfo->num_levels >= 3)
+                (*cost)++;
 
         return 0;
 }