Message ID: 20231212142730.998913-1-vincent.guittot@linaro.org
Series: Rework system pressure interface to the scheduler
On 12-12-23, 15:27, Vincent Guittot wrote:
> Provide to the scheduler a feedback about the temporary max available
> capacity. Unlike arch_update_thermal_pressure, this doesn't need to be
> filtered as the pressure will happen for dozens ms or more.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> ---
>  drivers/cpufreq/cpufreq.c | 48 +++++++++++++++++++++++++++++++++++++++
>  include/linux/cpufreq.h   | 10 ++++++++
>  2 files changed, 58 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 44db4f59c4cc..7d5f71be8d29 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -2563,6 +2563,50 @@ int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu)
>  }
>  EXPORT_SYMBOL(cpufreq_get_policy);
>
> +DEFINE_PER_CPU(unsigned long, cpufreq_pressure);
> +EXPORT_PER_CPU_SYMBOL_GPL(cpufreq_pressure);
> +
> +/**
> + * cpufreq_update_pressure() - Update cpufreq pressure for CPUs
> + * @cpus        : The related CPUs for which max capacity has been reduced
> + * @capped_freq : The maximum allowed frequency that CPUs can run at
> + *
> + * Update the value of cpufreq pressure for all @cpus in the mask. The
> + * cpumask should include all (online+offline) affected CPUs, to avoid
> + * operating on stale data when hot-plug is used for some CPUs. The
> + * @capped_freq reflects the currently allowed max CPUs frequency due to
> + * freq_qos capping. It might be also a boost frequency value, which is bigger
> + * than the internal 'capacity_freq_ref' max frequency. In such case the
> + * pressure value should simply be removed, since this is an indication that
> + * there is no capping. The @capped_freq must be provided in kHz.
> + */
> +static void cpufreq_update_pressure(const struct cpumask *cpus,

Since this is defined as 'static', why not just pass policy here ?

> +                                    unsigned long capped_freq)
> +{
> +        unsigned long max_capacity, capacity, pressure;
> +        u32 max_freq;
> +        int cpu;
> +
> +        cpu = cpumask_first(cpus);
> +        max_capacity = arch_scale_cpu_capacity(cpu);

This anyway expects all of them to be from the same policy ..

> +        max_freq = arch_scale_freq_ref(cpu);
> +
> +        /*
> +         * Handle properly the boost frequencies, which should simply clean
> +         * the thermal pressure value.
> +         */
> +        if (max_freq <= capped_freq)
> +                capacity = max_capacity;
> +        else
> +                capacity = mult_frac(max_capacity, capped_freq, max_freq);
> +
> +        pressure = max_capacity - capacity;
> +

Extra blank line here.

> +
> +        for_each_cpu(cpu, cpus)
> +                WRITE_ONCE(per_cpu(cpufreq_pressure, cpu), pressure);
> +}
> +
>  /**
>   * cpufreq_set_policy - Modify cpufreq policy parameters.
>   * @policy: Policy object to modify.
> @@ -2584,6 +2628,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>  {
>          struct cpufreq_policy_data new_data;
>          struct cpufreq_governor *old_gov;
> +        struct cpumask *cpus;
>          int ret;
>
>          memcpy(&new_data.cpuinfo, &policy->cpuinfo, sizeof(policy->cpuinfo));
> @@ -2618,6 +2663,9 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
>          policy->max = __resolve_freq(policy, policy->max, CPUFREQ_RELATION_H);
>          trace_cpu_frequency_limits(policy);
>
> +        cpus = policy->related_cpus;

You don't need the extra variable anyway, but lets just pass policy
instead to the routine.

> +        cpufreq_update_pressure(cpus, policy->max);
> +
>          policy->cached_target_freq = UINT_MAX;
>
>          pr_debug("new min and max freqs are %u - %u kHz\n",
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index afda5f24d3dd..b1d97edd3253 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -241,6 +241,12 @@ struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
>  void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
>  void cpufreq_disable_fast_switch(struct cpufreq_policy *policy);
>  bool has_target_index(void);
> +
> +DECLARE_PER_CPU(unsigned long, cpufreq_pressure);
> +static inline unsigned long cpufreq_get_pressure(int cpu)
> +{
> +        return per_cpu(cpufreq_pressure, cpu);
> +}
>  #else
>  static inline unsigned int cpufreq_get(unsigned int cpu)
>  {
> @@ -263,6 +269,10 @@ static inline bool cpufreq_supports_freq_invariance(void)
>          return false;
>  }
>  static inline void disable_cpufreq(void) { }
> +static inline unsigned long cpufreq_get_pressure(int cpu)
> +{
> +        return 0;
> +}
>  #endif
>
>  #ifdef CONFIG_CPU_FREQ_STAT
> --
> 2.34.1
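Patch 2/4 of the series ("sched: Take cpufreq feedback into account") is not quoted in this thread, so the consumer side of the new per-CPU value is not visible here. To make the arithmetic above concrete: with max_capacity = 1024 and arch_scale_freq_ref() returning 2000000 kHz, a freq_qos cap at 1000000 kHz gives capacity = mult_frac(1024, 1000000, 2000000) = 512, so a pressure of 512 is written for every CPU of the policy, while a capped frequency at or above the reference frequency writes 0. Purely as an illustration, not code from the posted series (effective_cpu_capacity() is a hypothetical helper name), a scheduler-side reader of the value could look roughly like this:

/* Illustrative sketch only: fold the cpufreq pressure into the capacity
 * seen by the scheduler. Not part of the posted patches. */
static unsigned long effective_cpu_capacity(int cpu)
{
        unsigned long max_cap = arch_scale_cpu_capacity(cpu);
        unsigned long pressure = cpufreq_get_pressure(cpu);

        /* The pressure can never exceed the CPU's original capacity. */
        return max_cap - min(pressure, max_cap);
}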
On Wed, 13 Dec 2023 at 08:17, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 12-12-23, 15:27, Vincent Guittot wrote:
> > Provide to the scheduler a feedback about the temporary max available
> > capacity. Unlike arch_update_thermal_pressure, this doesn't need to be
> > filtered as the pressure will happen for dozens ms or more.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > ---
> >  drivers/cpufreq/cpufreq.c | 48 +++++++++++++++++++++++++++++++++++++++
> >  include/linux/cpufreq.h   | 10 ++++++++
> >  2 files changed, 58 insertions(+)
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 44db4f59c4cc..7d5f71be8d29 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -2563,6 +2563,50 @@ int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu)
> >  }
> >  EXPORT_SYMBOL(cpufreq_get_policy);
> >
> > +DEFINE_PER_CPU(unsigned long, cpufreq_pressure);
> > +EXPORT_PER_CPU_SYMBOL_GPL(cpufreq_pressure);
> > +
> > +/**
> > + * cpufreq_update_pressure() - Update cpufreq pressure for CPUs
> > + * @cpus        : The related CPUs for which max capacity has been reduced
> > + * @capped_freq : The maximum allowed frequency that CPUs can run at
> > + *
> > + * Update the value of cpufreq pressure for all @cpus in the mask. The
> > + * cpumask should include all (online+offline) affected CPUs, to avoid
> > + * operating on stale data when hot-plug is used for some CPUs. The
> > + * @capped_freq reflects the currently allowed max CPUs frequency due to
> > + * freq_qos capping. It might be also a boost frequency value, which is bigger
> > + * than the internal 'capacity_freq_ref' max frequency. In such case the
> > + * pressure value should simply be removed, since this is an indication that
> > + * there is no capping. The @capped_freq must be provided in kHz.
> > + */
> > +static void cpufreq_update_pressure(const struct cpumask *cpus,
>
> Since this is defined as 'static', why not just pass policy here ?

Mainly because we only need the cpumask and also because this follows
the same pattern as other places like arch_topology.c

>
> > +                                    unsigned long capped_freq)
> > +{
> > +        unsigned long max_capacity, capacity, pressure;
> > +        u32 max_freq;
> > +        int cpu;
> > +
> > +        cpu = cpumask_first(cpus);
> > +        max_capacity = arch_scale_cpu_capacity(cpu);
>
> This anyway expects all of them to be from the same policy ..
>
> > +        max_freq = arch_scale_freq_ref(cpu);
> > +
> > +        /*
> > +         * Handle properly the boost frequencies, which should simply clean
> > +         * the thermal pressure value.
> > +         */
> > +        if (max_freq <= capped_freq)
> > +                capacity = max_capacity;
> > +        else
> > +                capacity = mult_frac(max_capacity, capped_freq, max_freq);
> > +
> > +        pressure = max_capacity - capacity;
> > +
>
> Extra blank line here.
>
> > +
> > +        for_each_cpu(cpu, cpus)
> > +                WRITE_ONCE(per_cpu(cpufreq_pressure, cpu), pressure);
> > +}
> > +
> >  /**
> >   * cpufreq_set_policy - Modify cpufreq policy parameters.
> >   * @policy: Policy object to modify.
> > @@ -2584,6 +2628,7 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> >  {
> >          struct cpufreq_policy_data new_data;
> >          struct cpufreq_governor *old_gov;
> > +        struct cpumask *cpus;
> >          int ret;
> >
> >          memcpy(&new_data.cpuinfo, &policy->cpuinfo, sizeof(policy->cpuinfo));
> > @@ -2618,6 +2663,9 @@ static int cpufreq_set_policy(struct cpufreq_policy *policy,
> >          policy->max = __resolve_freq(policy, policy->max, CPUFREQ_RELATION_H);
> >          trace_cpu_frequency_limits(policy);
> >
> > +        cpus = policy->related_cpus;
>
> You don't need the extra variable anyway, but lets just pass policy
> instead to the routine.

In fact I have followed what was done in cpufreq_cooling.c with
arch_update_thermal_pressure(). Will remove it.

>
> > +        cpufreq_update_pressure(cpus, policy->max);
> > +
> >          policy->cached_target_freq = UINT_MAX;
> >
> >          pr_debug("new min and max freqs are %u - %u kHz\n",
> > diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> > index afda5f24d3dd..b1d97edd3253 100644
> > --- a/include/linux/cpufreq.h
> > +++ b/include/linux/cpufreq.h
> > @@ -241,6 +241,12 @@ struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
> >  void cpufreq_enable_fast_switch(struct cpufreq_policy *policy);
> >  void cpufreq_disable_fast_switch(struct cpufreq_policy *policy);
> >  bool has_target_index(void);
> > +
> > +DECLARE_PER_CPU(unsigned long, cpufreq_pressure);
> > +static inline unsigned long cpufreq_get_pressure(int cpu)
> > +{
> > +        return per_cpu(cpufreq_pressure, cpu);
> > +}
> >  #else
> >  static inline unsigned int cpufreq_get(unsigned int cpu)
> >  {
> > @@ -263,6 +269,10 @@ static inline bool cpufreq_supports_freq_invariance(void)
> >          return false;
> >  }
> >  static inline void disable_cpufreq(void) { }
> > +static inline unsigned long cpufreq_get_pressure(int cpu)
> > +{
> > +        return 0;
> > +}
> >  #endif
> >
> >  #ifdef CONFIG_CPU_FREQ_STAT
> > --
> > 2.34.1
>
> --
> viresh
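Following the agreement above to drop the local 'cpus' variable and hand the policy itself to the helper, the reworked function in a future v2 might look roughly like the sketch below. This is illustrative only, not the actual v2 patch:

/* Sketch of the helper taking the policy directly, as suggested in the
 * review above. Not the posted code. */
static void cpufreq_update_pressure(struct cpufreq_policy *policy)
{
        unsigned long max_capacity, capacity, pressure;
        unsigned long capped_freq = policy->max;
        u32 max_freq;
        int cpu;

        cpu = cpumask_first(policy->related_cpus);
        max_capacity = arch_scale_cpu_capacity(cpu);
        max_freq = arch_scale_freq_ref(cpu);

        /* A capped frequency at or above the reference clears the pressure. */
        if (max_freq <= capped_freq)
                capacity = max_capacity;
        else
                capacity = mult_frac(max_capacity, capped_freq, max_freq);

        pressure = max_capacity - capacity;

        for_each_cpu(cpu, policy->related_cpus)
                WRITE_ONCE(per_cpu(cpufreq_pressure, cpu), pressure);
}

The call site in cpufreq_set_policy() would then reduce to cpufreq_update_pressure(policy), with no extra cpumask variable.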
Hi Vincent,

I've been waiting for this feature, thanks!


On 12/12/23 14:27, Vincent Guittot wrote:
> Following the consolidation and cleanup of CPU capacity in [1], this serie
> reworks how the scheduler gets the pressures on CPUs. We need to take into
> account all pressures applied by cpufreq on the compute capacity of a CPU
> for dozens of ms or more and not only cpufreq cooling device or HW
> mitigiations. we split the pressure applied on CPU's capacity in 2 parts:
> - one from cpufreq and freq_qos
> - one from HW high freq mitigiation.
>
> The next step will be to add a dedicated interface for long standing
> capping of the CPU capacity (i.e. for seconds or more) like the
> scaling_max_freq of cpufreq sysfs. The latter is already taken into
> account by this serie but as a temporary pressure which is not always the
> best choice when we know that it will happen for seconds or more.
>
> [1] https://lore.kernel.org/lkml/20231211104855.558096-1-vincent.guittot@linaro.org/
>
> Vincent Guittot (4):
>   cpufreq: Add a cpufreq pressure feedback for the scheduler
>   sched: Take cpufreq feedback into account
>   thermal/cpufreq: Remove arch_update_thermal_pressure()
>   sched: Rename arch_update_thermal_pressure into
>     arch_update_hw_pressure
>
>  arch/arm/include/asm/topology.h               |  6 +--
>  arch/arm64/include/asm/topology.h             |  6 +--
>  drivers/base/arch_topology.c                  | 26 ++++-----
>  drivers/cpufreq/cpufreq.c                     | 48 +++++++++++++++++
>  drivers/cpufreq/qcom-cpufreq-hw.c             |  4 +-
>  drivers/thermal/cpufreq_cooling.c             |  3 --
>  include/linux/arch_topology.h                 |  8 +--
>  include/linux/cpufreq.h                       | 10 ++++
>  include/linux/sched/topology.h                |  8 +--
>  .../{thermal_pressure.h => hw_pressure.h}     | 14 ++---
>  include/trace/events/sched.h                  |  2 +-
>  init/Kconfig                                  | 12 ++---
>  kernel/sched/core.c                           |  8 +--
>  kernel/sched/fair.c                           | 53 ++++++++++---------
>  kernel/sched/pelt.c                           | 18 +++----
>  kernel/sched/pelt.h                           | 16 +++---
>  kernel/sched/sched.h                          |  4 +-
>  17 files changed, 152 insertions(+), 94 deletions(-)
>  rename include/trace/events/{thermal_pressure.h => hw_pressure.h} (55%)
>

I would like to test it, but something worries me. Why there is 0/5 in
this subject and only 4 patches?

Could you tell me your base branch that I can apply this, please?

Regards,
Lukasz
On Thu, 14 Dec 2023 at 09:21, Lukasz Luba <lukasz.luba@arm.com> wrote:
>
> Hi Vincent,
>
> I've been waiting for this feature, thanks!
>
>
> On 12/12/23 14:27, Vincent Guittot wrote:
> > Following the consolidation and cleanup of CPU capacity in [1], this serie
> > reworks how the scheduler gets the pressures on CPUs. We need to take into
> > account all pressures applied by cpufreq on the compute capacity of a CPU
> > for dozens of ms or more and not only cpufreq cooling device or HW
> > mitigiations. we split the pressure applied on CPU's capacity in 2 parts:
> > - one from cpufreq and freq_qos
> > - one from HW high freq mitigiation.
> >
> > The next step will be to add a dedicated interface for long standing
> > capping of the CPU capacity (i.e. for seconds or more) like the
> > scaling_max_freq of cpufreq sysfs. The latter is already taken into
> > account by this serie but as a temporary pressure which is not always the
> > best choice when we know that it will happen for seconds or more.
> >
> > [1] https://lore.kernel.org/lkml/20231211104855.558096-1-vincent.guittot@linaro.org/
> >
> > Vincent Guittot (4):
> >   cpufreq: Add a cpufreq pressure feedback for the scheduler
> >   sched: Take cpufreq feedback into account
> >   thermal/cpufreq: Remove arch_update_thermal_pressure()
> >   sched: Rename arch_update_thermal_pressure into
> >     arch_update_hw_pressure
> >
> >  arch/arm/include/asm/topology.h               |  6 +--
> >  arch/arm64/include/asm/topology.h             |  6 +--
> >  drivers/base/arch_topology.c                  | 26 ++++-----
> >  drivers/cpufreq/cpufreq.c                     | 48 +++++++++++++++++
> >  drivers/cpufreq/qcom-cpufreq-hw.c             |  4 +-
> >  drivers/thermal/cpufreq_cooling.c             |  3 --
> >  include/linux/arch_topology.h                 |  8 +--
> >  include/linux/cpufreq.h                       | 10 ++++
> >  include/linux/sched/topology.h                |  8 +--
> >  .../{thermal_pressure.h => hw_pressure.h}     | 14 ++---
> >  include/trace/events/sched.h                  |  2 +-
> >  init/Kconfig                                  | 12 ++---
> >  kernel/sched/core.c                           |  8 +--
> >  kernel/sched/fair.c                           | 53 ++++++++++---------
> >  kernel/sched/pelt.c                           | 18 +++----
> >  kernel/sched/pelt.h                           | 16 +++---
> >  kernel/sched/sched.h                          |  4 +-
> >  17 files changed, 152 insertions(+), 94 deletions(-)
> >  rename include/trace/events/{thermal_pressure.h => hw_pressure.h} (55%)
> >
>
> I would like to test it, but something worries me. Why there is 0/5 in
> this subject and only 4 patches?

I removed a patch from the series but copied/pasted the cover letter
subject without noticing the /5 instead of /4

>
> Could you tell me your base branch that I can apply this, please?

It applies on top of tip/sched/core + [1]
and you can find it here:
https://git.linaro.org/people/vincent.guittot/kernel.git/log/?h=sched/system-pressure

>
> Regards,
> Lukasz
On 12/14/23 08:29, Vincent Guittot wrote:
> On Thu, 14 Dec 2023 at 09:21, Lukasz Luba <lukasz.luba@arm.com> wrote:
>>
>> Hi Vincent,
>>
>> I've been waiting for this feature, thanks!
>>
>>
>> On 12/12/23 14:27, Vincent Guittot wrote:
>>> Following the consolidation and cleanup of CPU capacity in [1], this serie
>>> reworks how the scheduler gets the pressures on CPUs. We need to take into
>>> account all pressures applied by cpufreq on the compute capacity of a CPU
>>> for dozens of ms or more and not only cpufreq cooling device or HW
>>> mitigiations. we split the pressure applied on CPU's capacity in 2 parts:
>>> - one from cpufreq and freq_qos
>>> - one from HW high freq mitigiation.
>>>
>>> The next step will be to add a dedicated interface for long standing
>>> capping of the CPU capacity (i.e. for seconds or more) like the
>>> scaling_max_freq of cpufreq sysfs. The latter is already taken into
>>> account by this serie but as a temporary pressure which is not always the
>>> best choice when we know that it will happen for seconds or more.
>>>
>>> [1] https://lore.kernel.org/lkml/20231211104855.558096-1-vincent.guittot@linaro.org/
>>>
>>> Vincent Guittot (4):
>>>    cpufreq: Add a cpufreq pressure feedback for the scheduler
>>>    sched: Take cpufreq feedback into account
>>>    thermal/cpufreq: Remove arch_update_thermal_pressure()
>>>    sched: Rename arch_update_thermal_pressure into
>>>      arch_update_hw_pressure
>>>
>>>   arch/arm/include/asm/topology.h               |  6 +--
>>>   arch/arm64/include/asm/topology.h             |  6 +--
>>>   drivers/base/arch_topology.c                  | 26 ++++-----
>>>   drivers/cpufreq/cpufreq.c                     | 48 +++++++++++++++++
>>>   drivers/cpufreq/qcom-cpufreq-hw.c             |  4 +-
>>>   drivers/thermal/cpufreq_cooling.c             |  3 --
>>>   include/linux/arch_topology.h                 |  8 +--
>>>   include/linux/cpufreq.h                       | 10 ++++
>>>   include/linux/sched/topology.h                |  8 +--
>>>   .../{thermal_pressure.h => hw_pressure.h}     | 14 ++---
>>>   include/trace/events/sched.h                  |  2 +-
>>>   init/Kconfig                                  | 12 ++---
>>>   kernel/sched/core.c                           |  8 +--
>>>   kernel/sched/fair.c                           | 53 ++++++++++---------
>>>   kernel/sched/pelt.c                           | 18 +++----
>>>   kernel/sched/pelt.h                           | 16 +++---
>>>   kernel/sched/sched.h                          |  4 +-
>>>   17 files changed, 152 insertions(+), 94 deletions(-)
>>>   rename include/trace/events/{thermal_pressure.h => hw_pressure.h} (55%)
>>>
>>
>> I would like to test it, but something worries me. Why there is 0/5 in
>> this subject and only 4 patches?
>
> I removed a patch from the series but copied/pasted the cover letter
> subject without noticing the /5 instead of /4

OK

>
>>
>> Could you tell me your base branch that I can apply this, please?
>
> It applies on top of tip/sched/core + [1]
> and you can find it here:
> https://git.linaro.org/people/vincent.guittot/kernel.git/log/?h=sched/system-pressure

Thanks for the info and handy link.
On 12/14/23 08:32, Lukasz Luba wrote:
>
>
> On 12/14/23 08:29, Vincent Guittot wrote:
>> On Thu, 14 Dec 2023 at 09:21, Lukasz Luba <lukasz.luba@arm.com> wrote:
>>>
>>> Hi Vincent,
>>>
>>> I've been waiting for this feature, thanks!
>>>
>>>
>>> On 12/12/23 14:27, Vincent Guittot wrote:
>>>> Following the consolidation and cleanup of CPU capacity in [1], this serie
>>>> reworks how the scheduler gets the pressures on CPUs. We need to take into
>>>> account all pressures applied by cpufreq on the compute capacity of a CPU
>>>> for dozens of ms or more and not only cpufreq cooling device or HW
>>>> mitigiations. we split the pressure applied on CPU's capacity in 2 parts:
>>>> - one from cpufreq and freq_qos
>>>> - one from HW high freq mitigiation.
>>>>
>>>> The next step will be to add a dedicated interface for long standing
>>>> capping of the CPU capacity (i.e. for seconds or more) like the
>>>> scaling_max_freq of cpufreq sysfs. The latter is already taken into
>>>> account by this serie but as a temporary pressure which is not always the
>>>> best choice when we know that it will happen for seconds or more.
>>>>
>>>> [1] https://lore.kernel.org/lkml/20231211104855.558096-1-vincent.guittot@linaro.org/
>>>>
>>>> Vincent Guittot (4):
>>>>    cpufreq: Add a cpufreq pressure feedback for the scheduler
>>>>    sched: Take cpufreq feedback into account
>>>>    thermal/cpufreq: Remove arch_update_thermal_pressure()
>>>>    sched: Rename arch_update_thermal_pressure into
>>>>      arch_update_hw_pressure
>>>>
>>>>   arch/arm/include/asm/topology.h               |  6 +--
>>>>   arch/arm64/include/asm/topology.h             |  6 +--
>>>>   drivers/base/arch_topology.c                  | 26 ++++-----
>>>>   drivers/cpufreq/cpufreq.c                     | 48 +++++++++++++++++
>>>>   drivers/cpufreq/qcom-cpufreq-hw.c             |  4 +-
>>>>   drivers/thermal/cpufreq_cooling.c             |  3 --
>>>>   include/linux/arch_topology.h                 |  8 +--
>>>>   include/linux/cpufreq.h                       | 10 ++++
>>>>   include/linux/sched/topology.h                |  8 +--
>>>>   .../{thermal_pressure.h => hw_pressure.h}     | 14 ++---
>>>>   include/trace/events/sched.h                  |  2 +-
>>>>   init/Kconfig                                  | 12 ++---
>>>>   kernel/sched/core.c                           |  8 +--
>>>>   kernel/sched/fair.c                           | 53 ++++++++++---------
>>>>   kernel/sched/pelt.c                           | 18 +++----
>>>>   kernel/sched/pelt.h                           | 16 +++---
>>>>   kernel/sched/sched.h                          |  4 +-
>>>>   17 files changed, 152 insertions(+), 94 deletions(-)
>>>>   rename include/trace/events/{thermal_pressure.h => hw_pressure.h} (55%)
>>>>
>>>
>>> I would like to test it, but something worries me. Why there is 0/5 in
>>> this subject and only 4 patches?
>>
>> I removed a patch from the series but copied/pasted the cover letter
>> subject without noticing the /5 instead of /4
>
> OK
>
>>
>>>
>>> Could you tell me your base branch that I can apply this, please?
>>
>> It applies on top of tip/sched/core + [1]
>> and you can find it here:
>> https://git.linaro.org/people/vincent.guittot/kernel.git/log/?h=sched/system-pressure
>
> Thanks for the info and handy link.
>

I've tested your patches with: DTPM/PowerCap + thermal gov + cpufreq
sysfs scaling_max_freq. It works fine in all my cases (couldn't cause
any issues).

If you'd like to test DTPM you will need 2 fixes pending in Rafael's tree.

So, I'm looking forward to your v2 to continue reviewing it.