| Message ID | 1528795016-18208-1-git-send-email-vincent.guittot@linaro.org |
|---|---|
| State | Superseded |
| Series | None |
On Tuesday 12 Jun 2018 at 11:16:56 (+0200), Vincent Guittot wrote:
> The time spent under interrupt can be significant but it is not reflected
> in the utilization of CPU when deciding to choose an OPP. Now that we have
> access to this metric, schedutil can take it into account when selecting
> the OPP for a CPU.
> rqs utilization don't see the time spend under interrupt context and report
> their value in the normal context time window. We need to compensate this when
> adding interrupt utilization
>
> The CPU utilization is :
>   irq util_avg + (1 - irq util_avg / max capacity ) * /Sum rq util_avg
>
> A test with iperf on hikey (octo arm64) gives:
> iperf -c server_address -r -t 5
>
>      w/o patch      w/ patch
> Tx   276 Mbits/sec  304 Mbits/sec  +10%
> Rx   299 Mbits/sec  328 Mbits/sec  +09%
>
> 8 iterations
> stdev is lower than 1%
> Only WFI idle state is enable (shallowest diel state)
                                            ^^^^
nit: s/diel/idle

And, out of curiosity, what happens if you leave the idle states
untouched ? Do you still see an improvement ? Or is it lost in the
noise ?

Thanks,
Quentin
On 12 June 2018 at 11:20, Quentin Perret <quentin.perret@arm.com> wrote:
> On Tuesday 12 Jun 2018 at 11:16:56 (+0200), Vincent Guittot wrote:
>> The time spent under interrupt can be significant but it is not reflected
>> in the utilization of CPU when deciding to choose an OPP. Now that we have
>> access to this metric, schedutil can take it into account when selecting
>> the OPP for a CPU.
>> rqs utilization don't see the time spend under interrupt context and report
>> their value in the normal context time window. We need to compensate this when
>> adding interrupt utilization
>>
>> The CPU utilization is :
>>   irq util_avg + (1 - irq util_avg / max capacity ) * /Sum rq util_avg
>>
>> A test with iperf on hikey (octo arm64) gives:
>> iperf -c server_address -r -t 5
>>
>>      w/o patch      w/ patch
>> Tx   276 Mbits/sec  304 Mbits/sec  +10%
>> Rx   299 Mbits/sec  328 Mbits/sec  +09%
>>
>> 8 iterations
>> stdev is lower than 1%
>> Only WFI idle state is enable (shallowest diel state)
>                                             ^^^^
> nit: s/diel/idle
>
> And, out of curiosity, what happens if you leave the idle states
> untouched ? Do you still see an improvement ? Or is it lost in the
> noise ?

The results are less stable, because C-state wakeup time impacts
performance and cpuidle is not good at selecting the right idle state
in such cases. Normally, an app should use the PM QoS cpu_dma_latency
interface, or a driver should set a per-device resume latency.

>
> Thanks,
> Quentin
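[For context, the PM QoS interface referred to above is /dev/cpu_dma_latency: a process writes its worst-case tolerable wakeup latency as a 32-bit value in microseconds, and the kernel avoids idle states deeper than that bound for as long as the file descriptor stays open. A minimal sketch of such an application follows; the 20 us bound and the sleep() are arbitrary placeholders, not values from this thread.]

/*
 * Hold a PM QoS CPU latency constraint while doing latency-sensitive
 * work. The constraint is active only while the fd remains open.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int32_t latency_us = 20;	/* arbitrary example bound, in microseconds */
	int fd = open("/dev/cpu_dma_latency", O_WRONLY);

	if (fd < 0) {
		perror("open /dev/cpu_dma_latency");
		return 1;
	}
	if (write(fd, &latency_us, sizeof(latency_us)) != sizeof(latency_us)) {
		perror("write");
		close(fd);
		return 1;
	}

	/* Idle states with exit latency above the bound are avoided here. */
	sleep(5);	/* stand-in for the real latency-sensitive workload */

	close(fd);	/* the constraint is dropped when the fd is closed */
	return 0;
}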
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 25cee59..092c310 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -58,6 +58,7 @@ struct sugov_cpu {
 	unsigned long util_dl;
 	unsigned long bw_dl;
 	unsigned long util_rt;
+	unsigned long util_irq;
 	unsigned long max;
 
 	/* The field below is for single-CPU policies only: */
@@ -182,21 +183,30 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
 	sg_cpu->util_dl = cpu_util_dl(rq);
 	sg_cpu->bw_dl = cpu_bw_dl(rq);
 	sg_cpu->util_rt = cpu_util_rt(rq);
+	sg_cpu->util_irq = cpu_util_irq(rq);
 }
 
 static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 {
 	struct rq *rq = cpu_rq(sg_cpu->cpu);
-	unsigned long util;
+	unsigned long util, max = sg_cpu->max;
 
 	if (rq->rt.rt_nr_running)
 		return sg_cpu->max;
 
+	if (unlikely(sg_cpu->util_irq >= max))
+		return max;
+
+	/* Sum rq utilization */
 	util = sg_cpu->util_cfs;
 	util += sg_cpu->util_rt;
 
-	if ((util + sg_cpu->util_dl) >= sg_cpu->max)
-		return sg_cpu->max;
+	/*
+	 * Interrupt time is not seen by rqs utilization so we can compare
+	 * them with the CPU capacity
+	 */
+	if ((util + sg_cpu->util_dl) >= max)
+		return max;
 
 	/*
 	 * As there is still idle time on the CPU, we need to compute the
@@ -207,10 +217,17 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 	 * periods of time.
 	 */
 
+	/* Weight rqs utilization to normal context window */
+	util *= (max - sg_cpu->util_irq);
+	util /= max;
+
+	/* Add interrupt utilization */
+	util += sg_cpu->util_irq;
+
 	/* Add DL bandwidth requirement */
 	util += sg_cpu->bw_dl;
 
-	return min(sg_cpu->max, util);
+	return min(max, util);
 }
 
 static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b534a43..873b567 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2221,4 +2221,17 @@ static inline unsigned long cpu_util_rt(struct rq *rq)
 {
 	return rq->avg_rt.util_avg;
 }
+
+#if defined(CONFIG_IRQ_TIME_ACCOUNTING) || defined(CONFIG_PARAVIRT_TIME_ACCOUNTING)
+static inline unsigned long cpu_util_irq(struct rq *rq)
+{
+	return rq->avg_irq.util_avg;
+}
+#else
+static inline unsigned long cpu_util_irq(struct rq *rq)
+{
+	return 0;
+}
+
+#endif
 #endif
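[To make the arithmetic in sugov_aggregate_util() concrete, here is a standalone userspace sketch of the same aggregation; the function name and the sample numbers are illustrative only, and the DL terms are omitted. With irq util_avg = 128, rq utilization (cfs + rt) = 512 and max capacity = 1024, the rq part is first scaled to the normal-context window: 512 * (1024 - 128) / 1024 = 448, then the irq part is added back, giving 576.]

/*
 * Illustrative-only rework of the aggregation above:
 *   util = irq + (max - irq) / max * \Sum rq util   (DL terms omitted)
 * Names and sample values are made up for the example.
 */
#include <stdio.h>

static unsigned long aggregate_util(unsigned long cfs, unsigned long rt,
				    unsigned long irq, unsigned long max)
{
	unsigned long util;

	if (irq >= max)
		return max;

	/* Sum rq utilization, tracked in the normal context time window */
	util = cfs + rt;
	if (util >= max)
		return max;

	/* Weight rq utilization to the normal context window */
	util *= (max - irq);
	util /= max;

	/* Add interrupt utilization, already in CPU capacity scale */
	util += irq;

	return util < max ? util : max;
}

int main(void)
{
	/* 128/1024 of capacity spent in IRQ, 512/1024 seen by the rqs */
	printf("%lu\n", aggregate_util(384, 128, 128, 1024));	/* prints 576 */
	return 0;
}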
The time spent under interrupt can be significant but it is not reflected
in the utilization of CPU when deciding to choose an OPP. Now that we have
access to this metric, schedutil can take it into account when selecting
the OPP for a CPU.
rqs utilization don't see the time spend under interrupt context and report
their value in the normal context time window. We need to compensate this when
adding interrupt utilization

The CPU utilization is :
  irq util_avg + (1 - irq util_avg / max capacity ) * /Sum rq util_avg

A test with iperf on hikey (octo arm64) gives:
iperf -c server_address -r -t 5

     w/o patch      w/ patch
Tx   276 Mbits/sec  304 Mbits/sec  +10%
Rx   299 Mbits/sec  328 Mbits/sec  +09%

8 iterations
stdev is lower than 1%
Only WFI idle state is enable (shallowest diel state)

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 25 +++++++++++++++++++++----
 kernel/sched/sched.h             | 13 +++++++++++++
 2 files changed, 34 insertions(+), 4 deletions(-)

-- 
2.7.4