Message ID | 1528459794-13066-7-git-send-email-vincent.guittot@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | track CPU utilization | expand |
Hi Vincent, On 08/06/18 14:09, Vincent Guittot wrote: > Now that we have both the dl class bandwidth requirement and the dl class > utilization, we can detect when CPU is fully used so we should run at max. > Otherwise, we keep using the dl bandwidth requirement to define the > utilization of the CPU > > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> > --- [...] > @@ -190,20 +192,24 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) > if (rq->rt.rt_nr_running) > return sg_cpu->max; > > - util = sg_cpu->util_dl; > - util += sg_cpu->util_cfs; > + util = sg_cpu->util_cfs; > util += sg_cpu->util_rt; > > + if ((util + sg_cpu->util_dl) >= sg_cpu->max) > + return sg_cpu->max; > + Mmm, won't we run at max (or reach max) with a, say, 100ms/500ms DL task running alone? Best, - Juri
On 8 June 2018 at 14:39, Juri Lelli <juri.lelli@redhat.com> wrote: > Hi Vincent, > > On 08/06/18 14:09, Vincent Guittot wrote: >> Now that we have both the dl class bandwidth requirement and the dl class >> utilization, we can detect when CPU is fully used so we should run at max. >> Otherwise, we keep using the dl bandwidth requirement to define the >> utilization of the CPU >> >> Cc: Ingo Molnar <mingo@redhat.com> >> Cc: Peter Zijlstra <peterz@infradead.org> >> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> >> --- > > [...] > >> @@ -190,20 +192,24 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) >> if (rq->rt.rt_nr_running) >> return sg_cpu->max; >> >> - util = sg_cpu->util_dl; >> - util += sg_cpu->util_cfs; >> + util = sg_cpu->util_cfs; >> util += sg_cpu->util_rt; >> >> + if ((util + sg_cpu->util_dl) >= sg_cpu->max) >> + return sg_cpu->max; >> + > > Mmm, won't we run at max (or reach max) with a, say, 100ms/500ms DL task > running alone? not for a 100ms running task. You have to run more than 320ms to reach max value 100ms/500ms will vary between 0 and 907 Vincent > > Best, > > - Juri
On 08/06/18 14:48, Vincent Guittot wrote: > On 8 June 2018 at 14:39, Juri Lelli <juri.lelli@redhat.com> wrote: > > Hi Vincent, > > > > On 08/06/18 14:09, Vincent Guittot wrote: > >> Now that we have both the dl class bandwidth requirement and the dl class > >> utilization, we can detect when CPU is fully used so we should run at max. > >> Otherwise, we keep using the dl bandwidth requirement to define the > >> utilization of the CPU > >> > >> Cc: Ingo Molnar <mingo@redhat.com> > >> Cc: Peter Zijlstra <peterz@infradead.org> > >> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> > >> --- > > > > [...] > > > >> @@ -190,20 +192,24 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) > >> if (rq->rt.rt_nr_running) > >> return sg_cpu->max; > >> > >> - util = sg_cpu->util_dl; > >> - util += sg_cpu->util_cfs; > >> + util = sg_cpu->util_cfs; > >> util += sg_cpu->util_rt; > >> > >> + if ((util + sg_cpu->util_dl) >= sg_cpu->max) > >> + return sg_cpu->max; > >> + > > > > Mmm, won't we run at max (or reach max) with a, say, 100ms/500ms DL task > > running alone? > > not for a 100ms running task. You have to run more than 320ms to reach max value > > 100ms/500ms will vary between 0 and 907 OK, right, my point I guess is still that such a task will run fine at ~250 and it might be save more energy by doing so? Also, less freq switches (consider for example a few background CFS tasks waking up from time to time).
On 08/06/18 14:54, Juri Lelli wrote: > On 08/06/18 14:48, Vincent Guittot wrote: > > On 8 June 2018 at 14:39, Juri Lelli <juri.lelli@redhat.com> wrote: > > > Hi Vincent, > > > > > > On 08/06/18 14:09, Vincent Guittot wrote: > > >> Now that we have both the dl class bandwidth requirement and the dl class > > >> utilization, we can detect when CPU is fully used so we should run at max. > > >> Otherwise, we keep using the dl bandwidth requirement to define the > > >> utilization of the CPU > > >> > > >> Cc: Ingo Molnar <mingo@redhat.com> > > >> Cc: Peter Zijlstra <peterz@infradead.org> > > >> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> > > >> --- > > > > > > [...] > > > > > >> @@ -190,20 +192,24 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) > > >> if (rq->rt.rt_nr_running) > > >> return sg_cpu->max; > > >> > > >> - util = sg_cpu->util_dl; > > >> - util += sg_cpu->util_cfs; > > >> + util = sg_cpu->util_cfs; > > >> util += sg_cpu->util_rt; > > >> > > >> + if ((util + sg_cpu->util_dl) >= sg_cpu->max) > > >> + return sg_cpu->max; > > >> + > > > > > > Mmm, won't we run at max (or reach max) with a, say, 100ms/500ms DL task > > > running alone? > > > > not for a 100ms running task. You have to run more than 320ms to reach max value > > > > 100ms/500ms will vary between 0 and 907 > > OK, right, my point I guess is still that such a task will run fine at > ~250 and it might be save more energy by doing so? As discussed on IRC, we still endup selecting 1/5 of max freq because util_dl is below max. So, turning point is at ~320ms/[something_bigger], which looks a pretty big runtime, but I'm not sure if having that is OK. Also, it becomes smaller with CFS/RT background "perturbations". Mmm. BTW, adding Luca and Claudio. :)
On 8 June 2018 at 15:36, Juri Lelli <juri.lelli@redhat.com> wrote: > On 08/06/18 14:54, Juri Lelli wrote: >> On 08/06/18 14:48, Vincent Guittot wrote: >> > On 8 June 2018 at 14:39, Juri Lelli <juri.lelli@redhat.com> wrote: >> > > Hi Vincent, >> > > >> > > On 08/06/18 14:09, Vincent Guittot wrote: >> > >> Now that we have both the dl class bandwidth requirement and the dl class >> > >> utilization, we can detect when CPU is fully used so we should run at max. >> > >> Otherwise, we keep using the dl bandwidth requirement to define the >> > >> utilization of the CPU >> > >> >> > >> Cc: Ingo Molnar <mingo@redhat.com> >> > >> Cc: Peter Zijlstra <peterz@infradead.org> >> > >> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> >> > >> --- >> > > >> > > [...] >> > > >> > >> @@ -190,20 +192,24 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) >> > >> if (rq->rt.rt_nr_running) >> > >> return sg_cpu->max; >> > >> >> > >> - util = sg_cpu->util_dl; >> > >> - util += sg_cpu->util_cfs; >> > >> + util = sg_cpu->util_cfs; >> > >> util += sg_cpu->util_rt; >> > >> >> > >> + if ((util + sg_cpu->util_dl) >= sg_cpu->max) >> > >> + return sg_cpu->max; >> > >> + >> > > >> > > Mmm, won't we run at max (or reach max) with a, say, 100ms/500ms DL task >> > > running alone? >> > >> > not for a 100ms running task. You have to run more than 320ms to reach max value >> > >> > 100ms/500ms will vary between 0 and 907 >> >> OK, right, my point I guess is still that such a task will run fine at >> ~250 and it might be save more energy by doing so? > > As discussed on IRC, we still endup selecting 1/5 of max freq because > util_dl is below max. > > So, turning point is at ~320ms/[something_bigger], which looks a pretty > big runtime, but I'm not sure if having that is OK. Also, it becomes > smaller with CFS/RT background "perturbations". Mmm. > > BTW, adding Luca and Claudio. :) Argh... I have added few more but forgot Luca and Claudio. I'm very sorry
On Fri, Jun 08, 2018 at 02:09:49PM +0200, Vincent Guittot wrote: > - * Ideally we would like to set util_dl as min/guaranteed freq and > - * util_cfs + util_dl as requested freq. However, cpufreq is not yet > - * ready for such an interface. So, we only do the latter for now. Please don't delete that comment. It is not less relevant. > -static inline unsigned long cpu_util_dl(struct rq *rq) > +static inline unsigned long cpu_bw_dl(struct rq *rq) I think you forgot to fix-up ignore_dl_rate_limit().
On Fri, 22 Jun 2018 at 17:24, Peter Zijlstra <peterz@infradead.org> wrote: > > On Fri, Jun 08, 2018 at 02:09:49PM +0200, Vincent Guittot wrote: > > - * Ideally we would like to set util_dl as min/guaranteed freq and > > - * util_cfs + util_dl as requested freq. However, cpufreq is not yet > > - * ready for such an interface. So, we only do the latter for now. > > Please don't delete that comment. It is not less relevant. ok i will keep it in next version > > > -static inline unsigned long cpu_util_dl(struct rq *rq) > > +static inline unsigned long cpu_bw_dl(struct rq *rq) > > I think you forgot to fix-up ignore_dl_rate_limit(). yes you're right
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 32f97fb..25cee59 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -56,6 +56,7 @@ struct sugov_cpu { /* The fields below are only needed when sharing a policy: */ unsigned long util_cfs; unsigned long util_dl; + unsigned long bw_dl; unsigned long util_rt; unsigned long max; @@ -179,6 +180,7 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu) sg_cpu->max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu); sg_cpu->util_cfs = cpu_util_cfs(rq); sg_cpu->util_dl = cpu_util_dl(rq); + sg_cpu->bw_dl = cpu_bw_dl(rq); sg_cpu->util_rt = cpu_util_rt(rq); } @@ -190,20 +192,24 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu) if (rq->rt.rt_nr_running) return sg_cpu->max; - util = sg_cpu->util_dl; - util += sg_cpu->util_cfs; + util = sg_cpu->util_cfs; util += sg_cpu->util_rt; + if ((util + sg_cpu->util_dl) >= sg_cpu->max) + return sg_cpu->max; + /* - * Utilization required by DEADLINE must always be granted while, for - * FAIR, we use blocked utilization of IDLE CPUs as a mechanism to - * gracefully reduce the frequency when no tasks show up for longer + * As there is still idle time on the CPU, we need to compute the + * utilization level of the CPU. + * Bandwidth required by DEADLINE must always be granted while, for + * FAIR and RT, we use blocked utilization of IDLE CPUs as a mechanism + * to gracefully reduce the frequency when no tasks show up for longer * periods of time. - * - * Ideally we would like to set util_dl as min/guaranteed freq and - * util_cfs + util_dl as requested freq. However, cpufreq is not yet - * ready for such an interface. So, we only do the latter for now. */ + + /* Add DL bandwidth requirement */ + util += sg_cpu->bw_dl; + return min(sg_cpu->max, util); } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 4526ba6..bc4305f 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2192,11 +2192,16 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {} #endif #ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL -static inline unsigned long cpu_util_dl(struct rq *rq) +static inline unsigned long cpu_bw_dl(struct rq *rq) { return (rq->dl.running_bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT; } +static inline unsigned long cpu_util_dl(struct rq *rq) +{ + return READ_ONCE(rq->avg_dl.util_avg); +} + static inline unsigned long cpu_util_cfs(struct rq *rq) { unsigned long util = READ_ONCE(rq->cfs.avg.util_avg);
Now that we have both the dl class bandwidth requirement and the dl class utilization, we can detect when CPU is fully used so we should run at max. Otherwise, we keep using the dl bandwidth requirement to define the utilization of the CPU Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/cpufreq_schedutil.c | 24 +++++++++++++++--------- kernel/sched/sched.h | 7 ++++++- 2 files changed, 21 insertions(+), 10 deletions(-) -- 2.7.4