[v4,0/3] Add allowed CPU capacity knowledge to EAS

Message ID 20210614185815.15136-1-lukasz.luba@arm.com

Lukasz Luba June 14, 2021, 6:58 p.m. UTC
Hi all,

This is v4 of the patch set which adds knowledge about reduced CPU
capacity to the Energy Model (EM) and the Energy Aware Scheduler (EAS).
Currently, when a CPU is thermally capped, the frequency picked by
SchedUtil and the frequency assumed by the EM are not aligned, which
causes an error in the energy estimation. This patch set passes the
allowed CPU capacity (derived from the thermal pressure information) to
the EM, which improves the energy estimation. More information about the
mechanism can be found in the patch descriptions.
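
To illustrate the idea, below is a minimal user-space sketch, not the
kernel code from the patches. All function names are hypothetical. It
assumes that SchedUtil adds roughly 25% headroom to the utilization, that
the frequency is derived as max_freq * util / cpu_capacity, and that the
allowed capacity is the CPU capacity minus the thermal pressure.

```c
#include <assert.h>

/* Hypothetical stand-in: add ~25% headroom to the utilization. */
unsigned long util_with_headroom(unsigned long util)
{
	return util + (util >> 2);
}

/* Hypothetical stand-in: map utilization to a frequency (kHz). */
unsigned long util_to_freq(unsigned long util, unsigned long max_freq,
			   unsigned long cpu_cap)
{
	assert(cpu_cap > 0);
	return max_freq * util / cpu_cap;
}

/*
 * Frequency assumed by the energy estimate. The utilization is clamped
 * to the allowed capacity, so the estimate cannot assume a frequency
 * above the thermal cap.
 */
unsigned long estimate_freq(unsigned long max_util, unsigned long max_freq,
			    unsigned long cpu_cap,
			    unsigned long thermal_pressure)
{
	unsigned long allowed_cap, util;

	assert(thermal_pressure <= cpu_cap);
	allowed_cap = cpu_cap - thermal_pressure;

	util = util_with_headroom(max_util);
	if (util > allowed_cap)
		util = allowed_cap;

	return util_to_freq(util, max_freq, cpu_cap);
}
```

With made-up numbers (capacity 1024, max frequency 2000000 kHz,
utilization 800), estimate_freq() returns 1953125 kHz without thermal
pressure and 1500000 kHz with a thermal pressure of 256. The gap between
the two is the kind of mismatch the cover letter describes.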

Changelog:
v4:
- removed local variable and improved description in patch 2/3
- added Reviewed-by from Vincent for patch 2/3
- added Acked-by from Viresh for patch 1/3
v3 [3]:
- switched to 'raw' per-cpu thermal pressure instead of the thermal
  pressure geometric series signal, since it is better suited to this
  use case: predicting SchedUtil frequency (Vincent, Dietmar)
- added more comments in the patch 2/3 header for the use case where
  thermal capping might be applied even when the CPUs are not
  over-utilized (Dietmar)
- added ACK tag from Rafael for SchedUtil part
- added a patch fixing the missing per-cpu thermal_pressure update for
  offline CPUs in cpufreq_cooling
v2 [2]:
- clamp the returned value from effective_cpu_util() and avoid irq
  util scaling issues (Quentin)
v1 is available at [1]

Regards,
Lukasz

[1] https://lore.kernel.org/linux-pm/20210602135609.10867-1-lukasz.luba@arm.com/
[2] https://lore.kernel.org/lkml/20210604080954.13915-1-lukasz.luba@arm.com/
[3] https://lore.kernel.org/lkml/20210610150324.22919-1-lukasz.luba@arm.com/

Lukasz Luba (3):
  thermal: cpufreq_cooling: Update also offline CPUs per-cpu
    thermal_pressure
  sched/fair: Take thermal pressure into account while estimating energy
  sched/cpufreq: Consider reduced CPU capacity in energy calculation

 drivers/thermal/cpufreq_cooling.c |  2 +-
 include/linux/energy_model.h      | 16 +++++++++++++---
 include/linux/sched/cpufreq.h     |  2 +-
 kernel/sched/cpufreq_schedutil.c  |  1 +
 kernel/sched/fair.c               | 13 +++++++++----
 5 files changed, 25 insertions(+), 9 deletions(-)

Comments

Lukasz Luba June 16, 2021, 1:33 p.m. UTC | #1
Hi Peter,


On 6/14/21 7:58 PM, Lukasz Luba wrote:
> Hi all,
>
> The patch set v4 aims to add knowledge about reduced CPU capacity
> into the Energy Model (EM) and Energy Aware Scheduler (EAS). Currently the
> issue is that SchedUtil CPU frequency and EM frequency are not aligned,
> when there is a CPU thermal capping. This causes an estimation error.
> This patch set provides the information about allowed CPU capacity
> into the EM (thanks to thermal pressure information). This improves the
> energy estimation. More info about this mechanism can be found in the
> patches description.
>
> Changelog:
> v4:
> - removed local variable and improved description in patch 2/3
> - added Reviewed-by from Vincent for patch 2/3
> - added Acked-by from Viresh for patch 1/3
> v3 [3]:
> - switched to 'raw' per-cpu thermal pressure instead of thermal pressure
>    geometric series signal, since it more suited for purpose of
>    this use case: predicting SchedUtil frequency (Vincent, Dietmar)
> - added more comment in the patch 2/3 header for use case when thermal
>    capping might be applied even the CPUs are not over-utilized
>    (Dietmar)
> - added ACK tag from Rafael for SchedUtil part
> - added a fix patch for offline CPUs in cpufreq_cooling and per-cpu
>    thermal_pressure missing update
> v2 [2]:
> - clamp the returned value from effective_cpu_util() and avoid irq
>    util scaling issues (Quentin)
> v1 is available at [1]
>
> Regards,
> Lukasz
>
> [1] https://lore.kernel.org/linux-pm/20210602135609.10867-1-lukasz.luba@arm.com/
> [2] https://lore.kernel.org/lkml/20210604080954.13915-1-lukasz.luba@arm.com/
> [3] https://lore.kernel.org/lkml/20210610150324.22919-1-lukasz.luba@arm.com/
>
> Lukasz Luba (3):
>    thermal: cpufreq_cooling: Update also offline CPUs per-cpu
>      thermal_pressure
>    sched/fair: Take thermal pressure into account while estimating energy
>    sched/cpufreq: Consider reduced CPU capacity in energy calculation
>
>   drivers/thermal/cpufreq_cooling.c |  2 +-
>   include/linux/energy_model.h      | 16 +++++++++++++---
>   include/linux/sched/cpufreq.h     |  2 +-
>   kernel/sched/cpufreq_schedutil.c  |  1 +
>   kernel/sched/fair.c               | 13 +++++++++----
>   5 files changed, 25 insertions(+), 9 deletions(-)
>

Could you take these 3 patches via your tree, please?
I'm asking you because fair.c has the most changes
(apart from energy_model.h) and the patches got
ACKs from Rafael and Viresh. The patch which touches
fair.c got a Reviewed-by from Vincent Guittot. I have
addressed all the comments, so IMHO it could fly now.

Please let me know if you would like me to rebase on top
of one of your branches.

Regards,
Lukasz