Message ID | 1530200714-4504-1-git-send-email-vincent.guittot@linaro.org |
---|---|
Series | track CPU utilization |
On Thu, Jun 28, 2018 at 05:45:03PM +0200, Vincent Guittot wrote:
> Vincent Guittot (11):
>   sched/pelt: Move pelt related code in a dedicated file
>   sched/rt: add rt_rq utilization tracking
>   cpufreq/schedutil: use rt utilization tracking
>   sched/dl: add dl_rq utilization tracking
>   cpufreq/schedutil: use dl utilization tracking
>   sched/irq: add irq utilization tracking
>   cpufreq/schedutil: take into account interrupt
>   sched: schedutil: remove sugov_aggregate_util()
>   sched: use pelt for scale_rt_capacity()
>   sched: remove rt_avg code
>   proc/sched: remove unused sched_time_avg_ms
>
>  include/linux/sched/sysctl.h     |   1 -
>  kernel/sched/Makefile            |   2 +-
>  kernel/sched/core.c              |  38 +---
>  kernel/sched/cpufreq_schedutil.c |  65 ++++---
>  kernel/sched/deadline.c          |   8 +-
>  kernel/sched/fair.c              | 403 +++++---------------------------------
>  kernel/sched/pelt.c              | 399 ++++++++++++++++++++++++++++++++++++++
>  kernel/sched/pelt.h              |  72 +++++++
>  kernel/sched/rt.c                |  15 +-
>  kernel/sched/sched.h             |  68 +++++--
>  kernel/sysctl.c                  |   8 -
>  11 files changed, 632 insertions(+), 447 deletions(-)
>  create mode 100644 kernel/sched/pelt.c
>  create mode 100644 kernel/sched/pelt.h

OK, this looks good I suppose. Rafael, are you OK with me taking these?

I have the below on top because I once again forgot how it all worked;
does this work for you Vincent?

---
Subject: sched/cpufreq: Clarify sugov_get_util()

Add a few comments (hopefully) clarifying some of the magic in
sugov_get_util().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 cpufreq_schedutil.c |   69 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 51 insertions(+), 18 deletions(-)

--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -177,6 +177,26 @@ static unsigned int get_next_freq(struct
 	return cpufreq_driver_resolve_freq(policy, freq);
 }
 
+/*
+ * This function computes an effective utilization for the given CPU, to be
+ * used for frequency selection given the linear relation: f = u * f_max.
+ *
+ * The scheduler tracks the following metrics:
+ *
+ *   cpu_util_{cfs,rt,dl,irq}()
+ *   cpu_bw_dl()
+ *
+ * Where the cfs,rt and dl util numbers are tracked with the same metric and
+ * synchronized windows and are thus directly comparable.
+ *
+ * The cfs,rt,dl utilization are the running times measured with rq->clock_task
+ * which excludes things like IRQ and steal-time. These latter are then accrued in
+ * the irq utilization.
+ *
+ * The DL bandwidth number otoh is not a measured meric but a value computed
+ * based on the task model parameters and gives the minimal u required to meet
+ * deadlines.
+ */
 static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu)
 {
 	struct rq *rq = cpu_rq(sg_cpu->cpu);
@@ -188,26 +208,50 @@ static unsigned long sugov_get_util(stru
 	if (rt_rq_is_runnable(&rq->rt))
 		return max;
 
+	/*
+	 * Early check to see if IRQ/steal time saturates the CPU, can be
+	 * because of inaccuracies in how we track these -- see
+	 * update_irq_load_avg().
+	 */
 	irq = cpu_util_irq(rq);
-
 	if (unlikely(irq >= max))
 		return max;
 
-	/* Sum rq utilization */
+	/*
+	 * Because the time spend on RT/DL tasks is visible as 'lost' time to
+	 * CFS tasks and we use the same metric to track the effective
+	 * utilization (PELT windows are synchronized) we can directly add them
+	 * to obtain the CPU's actual utilization.
+	 */
 	util = cpu_util_cfs(rq);
 	util += cpu_util_rt(rq);
 
 	/*
-	 * Interrupt time is not seen by rqs utilization nso we can compare
-	 * them with the CPU capacity
+	 * We do not make cpu_util_dl() a permanent part of this sum because we
+	 * want to use cpu_bw_dl() later on, but we need to check if the
+	 * CFS+RT+DL sum is saturated (ie. no idle time) such that we select
+	 * f_max when there is no idle time.
+	 *
+	 * NOTE: numerical errors or stop class might cause us to not quite hit
+	 * saturation when we should -- something for later.
 	 */
 	if ((util + cpu_util_dl(rq)) >= max)
 		return max;
 
 	/*
-	 * As there is still idle time on the CPU, we need to compute the
-	 * utilization level of the CPU.
+	 * There is still idle time; further improve the number by using the
+	 * irq metric. Because IRQ/steal time is hidden from the task clock we
+	 * need to scale the task numbers:
 	 *
+	 *              1 - irq
+	 *   U' = irq + ------- * U
+	 *                max
+	 */
+	util *= (max - irq);
+	util /= max;
+	util += irq;
+
+	/*
 	 * Bandwidth required by DEADLINE must always be granted while, for
 	 * FAIR and RT, we use blocked utilization of IDLE CPUs as a mechanism
 	 * to gracefully reduce the frequency when no tasks show up for longer
@@ -217,18 +261,7 @@ static unsigned long sugov_get_util(stru
 	 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 	 * ready for such an interface. So, we only do the latter for now.
 	 */
-
-	/* Weight rqs utilization to normal context window */
-	util *= (max - irq);
-	util /= max;
-
-	/* Add interrupt utilization */
-	util += irq;
-
-	/* Add DL bandwidth requirement */
-	util += sg_cpu->bw_dl;
-
-	return min(max, util);
+	return min(max, util + sg_cpu->bw_dl);
 }
 
 /**
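
As a rough illustration of the arithmetic in sugov_get_util() after this patch, here is a user-space sketch of the same sequence of checks. It is not kernel code: struct cpu_metrics, its field names and effective_util() are hypothetical stand-ins for the rq accessors (cpu_util_cfs(), cpu_util_rt(), cpu_util_dl(), cpu_util_irq()) and for sg_cpu->bw_dl. The scaling step follows the code in the patch, which multiplies the task utilization by (max - irq) / max.

```c
/* Hypothetical snapshot of the per-CPU numbers the patch reads from the rq. */
struct cpu_metrics {
	unsigned long max;	/* CPU capacity, e.g. 1024 */
	unsigned long cfs;	/* stand-in for cpu_util_cfs(rq) */
	unsigned long rt;	/* stand-in for cpu_util_rt(rq) */
	unsigned long dl;	/* stand-in for cpu_util_dl(rq) */
	unsigned long irq;	/* stand-in for cpu_util_irq(rq) */
	unsigned long bw_dl;	/* stand-in for sg_cpu->bw_dl */
	int rt_runnable;	/* stand-in for rt_rq_is_runnable(&rq->rt) */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Mirrors the order of checks in the patched sugov_get_util(). */
unsigned long effective_util(const struct cpu_metrics *m)
{
	unsigned long util;

	/* A runnable RT task requests full capacity. */
	if (m->rt_runnable)
		return m->max;

	/* IRQ/steal time alone saturates the CPU. */
	if (m->irq >= m->max)
		return m->max;

	/* CFS and RT use the same metric, so they can be added directly. */
	util = m->cfs + m->rt;

	/* No idle time left once measured DL utilization is included. */
	if (util + m->dl >= m->max)
		return m->max;

	/* Scale the task numbers into the time not consumed by IRQ. */
	util *= (m->max - m->irq);
	util /= m->max;
	util += m->irq;

	/* DL is accounted by its required bandwidth, not its measured util. */
	return min_ul(m->max, util + m->bw_dl);
}
```

With max = 1024, cfs = 300, rt = 100, dl = 50, irq = 128 and bw_dl = 60, the task sum of 400 scales to 400 * 896 / 1024 = 350, adding irq gives 478, and adding the DL bandwidth gives 538.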
Hi Peter

On Thu, 5 Jul 2018 at 14:36, Peter Zijlstra <peterz@infradead.org> wrote:
>
>
> OK, this looks good I suppose. Rafael, are you OK with me taking these?
>
> I have the below on top because I once again forgot how it all worked;
> does this work for you Vincent?

Yes looks good to me

Thanks

>
> ---
> Subject: sched/cpufreq: Clarify sugov_get_util()
>
> Add a few comments (hopefully) clarifying some of the magic in
> sugov_get_util().
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  cpufreq_schedutil.c |   69 ++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 51 insertions(+), 18 deletions(-)
>
On 05-07-18, 14:36, Peter Zijlstra wrote:
> Subject: sched/cpufreq: Clarify sugov_get_util()
>
> Add a few comments (hopefully) clarifying some of the magic in
> sugov_get_util().
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  cpufreq_schedutil.c |   69 ++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 51 insertions(+), 18 deletions(-)
>
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -177,6 +177,26 @@ static unsigned int get_next_freq(struct
>  	return cpufreq_driver_resolve_freq(policy, freq);
>  }
>
> +/*
> + * This function computes an effective utilization for the given CPU, to be
> + * used for frequency selection given the linear relation: f = u * f_max.
> + *
> + * The scheduler tracks the following metrics:
> + *
> + *   cpu_util_{cfs,rt,dl,irq}()
> + *   cpu_bw_dl()
> + *
> + * Where the cfs,rt and dl util numbers are tracked with the same metric and
> + * synchronized windows and are thus directly comparable.
> + *
> + * The cfs,rt,dl utilization are the running times measured with rq->clock_task
> + * which excludes things like IRQ and steal-time. These latter are then accrued in
> + * the irq utilization.
> + *
> + * The DL bandwidth number otoh is not a measured meric but a value computed

metric

> + * based on the task model parameters and gives the minimal u required to meet

u ?

> + * deadlines.
> + */
>  static unsigned long sugov_get_util(struct sugov_cpu *sg_cpu)
>  {
>  	struct rq *rq = cpu_rq(sg_cpu->cpu);
> @@ -188,26 +208,50 @@ static unsigned long sugov_get_util(stru
>  	if (rt_rq_is_runnable(&rq->rt))
>  		return max;
>
> +	/*
> +	 * Early check to see if IRQ/steal time saturates the CPU, can be
> +	 * because of inaccuracies in how we track these -- see
> +	 * update_irq_load_avg().
> +	 */
>  	irq = cpu_util_irq(rq);
> -
>  	if (unlikely(irq >= max))
>  		return max;
>
> -	/* Sum rq utilization */
> +	/*
> +	 * Because the time spend on RT/DL tasks is visible as 'lost' time to
> +	 * CFS tasks and we use the same metric to track the effective
> +	 * utilization (PELT windows are synchronized) we can directly add them
> +	 * to obtain the CPU's actual utilization.
> +	 */
>  	util = cpu_util_cfs(rq);
>  	util += cpu_util_rt(rq);
>
>  	/*
> -	 * Interrupt time is not seen by rqs utilization nso we can compare
> -	 * them with the CPU capacity
> +	 * We do not make cpu_util_dl() a permanent part of this sum because we
> +	 * want to use cpu_bw_dl() later on, but we need to check if the
> +	 * CFS+RT+DL sum is saturated (ie. no idle time) such that we select
> +	 * f_max when there is no idle time.
> +	 *
> +	 * NOTE: numerical errors or stop class might cause us to not quite hit
> +	 * saturation when we should -- something for later.
>  	 */
>  	if ((util + cpu_util_dl(rq)) >= max)
>  		return max;
>
>  	/*
> -	 * As there is still idle time on the CPU, we need to compute the
> -	 * utilization level of the CPU.
> +	 * There is still idle time; further improve the number by using the
> +	 * irq metric. Because IRQ/steal time is hidden from the task clock we
> +	 * need to scale the task numbers:
>  	 *
> +	 *              1 - irq
> +	 *   U' = irq + ------- * U
> +	 *                max
> +	 */
> +	util *= (max - irq);
> +	util /= max;
> +	util += irq;
> +
> +	/*
>  	 * Bandwidth required by DEADLINE must always be granted while, for
>  	 * FAIR and RT, we use blocked utilization of IDLE CPUs as a mechanism
>  	 * to gracefully reduce the frequency when no tasks show up for longer
> @@ -217,18 +261,7 @@ static unsigned long sugov_get_util(stru
>  	 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
>  	 * ready for such an interface. So, we only do the latter for now.
>  	 */
> -
> -	/* Weight rqs utilization to normal context window */
> -	util *= (max - irq);
> -	util /= max;
> -
> -	/* Add interrupt utilization */
> -	util += irq;
> -
> -	/* Add DL bandwidth requirement */
> -	util += sg_cpu->bw_dl;
> -
> -	return min(max, util);
> +	return min(max, util + sg_cpu->bw_dl);
>  }
>

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

--
viresh
On Fri, Jul 06, 2018 at 11:35:22AM +0530, Viresh Kumar wrote:
> On 05-07-18, 14:36, Peter Zijlstra wrote:
> > +/*
> > + * This function computes an effective utilization for the given CPU, to be
> > + * used for frequency selection given the linear relation: f = u * f_max.
> > + *
> > + * The scheduler tracks the following metrics:
> > + *
> > + *   cpu_util_{cfs,rt,dl,irq}()
> > + *   cpu_bw_dl()
> > + *
> > + * Where the cfs,rt and dl util numbers are tracked with the same metric and
> > + * synchronized windows and are thus directly comparable.
> > + *
> > + * The cfs,rt,dl utilization are the running times measured with rq->clock_task
> > + * which excludes things like IRQ and steal-time. These latter are then accrued in
> > + * the irq utilization.
> > + *
> > + * The DL bandwidth number otoh is not a measured meric but a value computed
>
> metric

Indeed, fixed.

> > + * based on the task model parameters and gives the minimal u required to meet
>
> u ?

utilization, but for lazy people :-) I'll use the whole word.

> > + * deadlines.
> > + */
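
To make the "u" in the comment concrete: the linear relation f = u * f_max, with u expressed as utilization relative to the CPU capacity max, can be sketched as below. This is only an illustration of the relation stated in the comment. freq_for_util() and its parameters are hypothetical names, and the governor's real get_next_freq() is only partly visible in this thread (it ends by calling cpufreq_driver_resolve_freq()), so it may well differ from this.

```c
/*
 * Hypothetical helper: map a utilization value in [0, max] to a frequency
 * using the linear relation f = (util / max) * f_max. Frequencies are in
 * kHz here by assumption.
 */
unsigned long freq_for_util(unsigned long util, unsigned long max,
			    unsigned long f_max)
{
	/* Saturated (or over-estimated) utilization selects f_max. */
	if (max == 0 || util >= max)
		return f_max;

	/* Widen before multiplying so f_max * util cannot overflow on 32-bit. */
	return (unsigned long)((unsigned long long)f_max * util / max);
}
```

For instance, a utilization of 538 out of 1024 on a CPU whose f_max is 2000000 kHz maps to 2000000 * 538 / 1024 = 1050781 kHz with integer division.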