diff mbox series

[2/2] cpufreq: scmi: Update Energy Model with allowed performance limits

Message ID 20240403162315.1458337-3-lukasz.luba@arm.com
State New
Headers show
Series Update Energy Model with perfromance limits | expand

Commit Message

Lukasz Luba April 3, 2024, 4:23 p.m. UTC
The Energy Model (EM) supports performance limits updates. Use the SCMI
notifications to get information from FW about allowed frequency scope for
the CPUs.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 drivers/cpufreq/scmi-cpufreq.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Comments

Dietmar Eggemann April 22, 2024, 1:11 p.m. UTC | #1
On 03/04/2024 18:23, Lukasz Luba wrote:
> The Energy Model (EM) supports performance limits updates. Use the SCMI
> notifications to get information from FW about allowed frequency scope for
> the CPUs.

I'm slightly confused here. IMHO this doesn't seem to be related to the
"HW dependency between 'little CPUs & L3 $ in DSU' or similar" usecase.

I assumed that this usecase is rather handled via an additional
out-of-tree driver, potentially the same which updates the EM because of
temperature change (em_dev_compute_costs(), em_dev_update_perf_domain())
or chip binning (em_dev_update_chip_binning()).

What about other CPUFreq drivers registering an EM via
em_dev_register_perf_domain() or 'cpufreq_register_em_with_opp() ->
dev_pm_opp_of_register_em()'? Or is this 'limit notification' an SCMI FW
only thing?

[...]
Lukasz Luba April 22, 2024, 1:55 p.m. UTC | #2
On 4/22/24 14:11, Dietmar Eggemann wrote:
> On 03/04/2024 18:23, Lukasz Luba wrote:
>> The Energy Model (EM) supports performance limits updates. Use the SCMI
>> notifications to get information from FW about allowed frequency scope for
>> the CPUs.
> 
> I'm slightly confused here. IMHO this doesn't seem to be related to the
> "HW dependency between 'little CPUs & L3 $ in DSU' or similar" usecase.
> 
> I assumed that this usecase is rather handled via an additional
> out-of-tree driver, potentially the same which updates the EM because of
> temperature change (em_dev_compute_costs(), em_dev_update_perf_domain())
> or chip binning (em_dev_update_chip_binning()).

This patch allows to handle relatively simple and straight forward use
case for updating the perf limits in the EM. The one that you mention,
which would probably always live out-of-tree, is more complex and
focused on leakage estimation in different conditions.

I see those two drivers separate. We have the DSU+Littles dependency
always, because it's HW dependency, while the leakage issue can happen
in some scenarios like gaming and needs more dedicated driver to handle
it (or rely on FW, but that's another story, orthogonal as well) and
more information to do it properly.

> 
> What about other CPUFreq drivers registering an EM via
> em_dev_register_perf_domain() or 'cpufreq_register_em_with_opp() ->
> dev_pm_opp_of_register_em()'? Or is this 'limit notification' an SCMI FW
> only thing?

Other platforms which use different drivers for CPUfreq will have to
develop their own code. Although, when we merge this upstream, they
could follow this pattern as a reference design.

In our SCMI cpufreq and our SCP firmware we have this situation:
1. Sending CPUfreq request from Big CPU to SCP e.g. via fast-channel &
    it's done from sched-util w/o sugov kthread &
    it has to be super fast, we don't check any other dependency CPU
    for Littles or something like that.
2. The SCP firmware receives the frequency request for Big CPU &
    it checks internal dependencies for other components e.g.
    L3 cache min speed (DSU+Littles domian frequency)
3. The SCP changes the Big CPU frequency & changes the DSU+Littles
    frequency as the depended device
4. The SCP sends updated performance limits for the depended DSU+Littles
    domian to the SCMI cpufreq kernel driver, for proper Littles domain
    cpufreq device
5. SCMI cpufreq kernel driver gets the SCP notification about updated
    perf limits & translates the perf limits min value as the lowest
    currently available frequency for the Littles
6. The SCMI cpufreq driver updates the EM perf limits for Littles
    as the currently minimum available frequency.
    This allows to properly simulate the energy impact when the EAS
    tries to put a task on that domain, even when that PD's util signals
    might show lower frequency potentially being used.

That's why I see this as part of the CPUfreq driver feature.

The leakage driver might be better suited for the thermal framework,
since there is more information available there.
Cristian Marussi May 1, 2024, 9:26 a.m. UTC | #3
On Wed, Apr 03, 2024 at 05:23:15PM +0100, Lukasz Luba wrote:
> The Energy Model (EM) supports performance limits updates. Use the SCMI
> notifications to get information from FW about allowed frequency scope for
> the CPUs.
> 
> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
> ---
>  drivers/cpufreq/scmi-cpufreq.c | 19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
> index d946b7a082584..90c8448578cb1 100644
> --- a/drivers/cpufreq/scmi-cpufreq.c
> +++ b/drivers/cpufreq/scmi-cpufreq.c
> @@ -185,12 +185,25 @@ static int scmi_limit_notify_cb(struct notifier_block *nb, unsigned long event,
>  {
>  	struct scmi_data *priv = container_of(nb, struct scmi_data, limit_notify_nb);
>  	struct scmi_perf_limits_report *limit_notify = data;
> +	unsigned int limit_freq_max_khz, limit_freq_min_khz;
>  	struct cpufreq_policy *policy = priv->policy;
> -	unsigned int limit_freq_khz;
> +	struct em_perf_domain *pd;
> +	int ret;
> +
> +	limit_freq_max_khz = limit_notify->range_max_freq / HZ_PER_KHZ;
> +	limit_freq_min_khz = limit_notify->range_min_freq / HZ_PER_KHZ;

Note that these values could be zeroed if the notification is good but
the range_min/range_max values could NOT be mapped to a frequency
equivalent (due to some FW errors).

I would probably have to add a warn about this error in the core SCMI
notification path (or drop the notif as a whole); if not here you could
end-up just setting max/min to 0 if the fw has messed up the
notification range_min/range_max.

Or is it just that, especially max_feq = 0 is NOT plausible value and you
will need anyway to check it here ?

>  
> -	limit_freq_khz = limit_notify->range_max_freq / HZ_PER_KHZ;
> +	pd = em_cpu_get(policy->cpu);
> +	if (pd) {
> +		ret = em_update_performance_limits(pd, limit_freq_min_khz,
> +						   limit_freq_max_khz);
> +		if (ret)
> +			dev_warn(priv->cpu_dev,
> +				 "EM perf limits update failed\n");
> +	}
>  
> -	policy->max = clamp(limit_freq_khz, policy->cpuinfo.min_freq, policy->cpuinfo.max_freq);
> +	policy->max = clamp(limit_freq_max_khz, policy->cpuinfo.min_freq,
> +			    policy->cpuinfo.max_freq);

FWIW, regarding the SCMI bits.

LGTM.
Reviewed-by: Cristian Marussi <cristian.marussi@arm.com>

Thanks,
Cristian
diff mbox series

Patch

diff --git a/drivers/cpufreq/scmi-cpufreq.c b/drivers/cpufreq/scmi-cpufreq.c
index d946b7a082584..90c8448578cb1 100644
--- a/drivers/cpufreq/scmi-cpufreq.c
+++ b/drivers/cpufreq/scmi-cpufreq.c
@@ -185,12 +185,25 @@  static int scmi_limit_notify_cb(struct notifier_block *nb, unsigned long event,
 {
 	struct scmi_data *priv = container_of(nb, struct scmi_data, limit_notify_nb);
 	struct scmi_perf_limits_report *limit_notify = data;
+	unsigned int limit_freq_max_khz, limit_freq_min_khz;
 	struct cpufreq_policy *policy = priv->policy;
-	unsigned int limit_freq_khz;
+	struct em_perf_domain *pd;
+	int ret;
+
+	limit_freq_max_khz = limit_notify->range_max_freq / HZ_PER_KHZ;
+	limit_freq_min_khz = limit_notify->range_min_freq / HZ_PER_KHZ;
 
-	limit_freq_khz = limit_notify->range_max_freq / HZ_PER_KHZ;
+	pd = em_cpu_get(policy->cpu);
+	if (pd) {
+		ret = em_update_performance_limits(pd, limit_freq_min_khz,
+						   limit_freq_max_khz);
+		if (ret)
+			dev_warn(priv->cpu_dev,
+				 "EM perf limits update failed\n");
+	}
 
-	policy->max = clamp(limit_freq_khz, policy->cpuinfo.min_freq, policy->cpuinfo.max_freq);
+	policy->max = clamp(limit_freq_max_khz, policy->cpuinfo.min_freq,
+			    policy->cpuinfo.max_freq);
 
 	cpufreq_update_pressure(policy);