From patchwork Fri Mar 16 11:25:38 2018
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, rjw@rjwysocki.net
Cc: juri.lelli@redhat.com, dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com, viresh.kumar@linaro.org, valentin.schneider@arm.com, Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH 1/4 v4] sched/pelt: Move pelt related code in a dedicated file
Date: Fri, 16 Mar 2018 12:25:38 +0100
Message-Id: <1521199541-15308-2-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>
References: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>

We want to track rt_rq's utilization as a part of the estimation of the
whole rq's utilization. This is necessary because rt tasks can steal
utilization from cfs tasks and make them lighter than they are.
As we want to use the same load tracking mechanism for both and to avoid
a useless dependency between the cfs and rt code, the pelt code is moved
into a dedicated file.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/Makefile |   2 +-
 kernel/sched/fair.c   | 306 +-------------------------------------------------
 kernel/sched/pelt.c   | 308 ++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/sched/pelt.h   |  17 +++
 kernel/sched/sched.h  |  19 ++++
 5 files changed, 346 insertions(+), 306 deletions(-)
 create mode 100644 kernel/sched/pelt.c
 create mode 100644 kernel/sched/pelt.h

-- 
2.7.4

diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index d9a02b3..7fe1834 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -20,7 +20,7 @@ obj-y += core.o loadavg.o clock.o cputime.o
 obj-y += idle.o fair.o rt.o deadline.o
 obj-y += wait.o wait_bit.o swait.o completion.o
-obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o
+obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o pelt.o
 obj-$(CONFIG_SCHED_AUTOGROUP) += autogroup.o
 obj-$(CONFIG_SCHEDSTATS) += stats.o
 obj-$(CONFIG_SCHED_DEBUG) += debug.o
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3582117..bfd56bc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -255,9 +255,6 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
 	return cfs_rq->rq;
 }
 
-/* An entity is a task if it doesn't "own" a runqueue */
-#define entity_is_task(se)	(!se->my_q)
-
 static inline struct task_struct *task_of(struct sched_entity *se)
 {
 	SCHED_WARN_ON(!entity_is_task(se));
@@ -419,7 +416,6 @@ static inline struct rq *rq_of(struct cfs_rq *cfs_rq)
 	return container_of(cfs_rq, struct rq, cfs);
 }
 
-#define entity_is_task(se)	1
 
 #define for_each_sched_entity(se) \
 	for (; se; se = NULL)
@@ -692,7 +688,7 @@ static u64 sched_vslice(struct cfs_rq *cfs_rq, struct sched_entity *se)
 }
 
 #ifdef CONFIG_SMP
-
+#include "pelt.h"
 #include "sched-pelt.h"
 
 static int select_idle_sibling(struct task_struct *p, int prev_cpu, int cpu);
@@ -2720,19 +2716,6 @@ account_entity_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se)
 } while (0)
 
 #ifdef CONFIG_SMP
-/*
- * XXX we want to get rid of these helpers and use the full load resolution.
- */
-static inline long se_weight(struct sched_entity *se)
-{
-	return scale_load_down(se->load.weight);
-}
-
-static inline long se_runnable(struct sched_entity *se)
-{
-	return scale_load_down(se->runnable_weight);
-}
-
 static inline void
 enqueue_runnable_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
@@ -3033,287 +3016,6 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
 }
 
 #ifdef CONFIG_SMP
-/*
- * Approximate:
- *   val * y^n,    where y^32 ~= 0.5 (~1 scheduling period)
- */
-static u64 decay_load(u64 val, u64 n)
-{
-	unsigned int local_n;
-
-	if (unlikely(n > LOAD_AVG_PERIOD * 63))
-		return 0;
-
-	/* after bounds checking we can collapse to 32-bit */
-	local_n = n;
-
-	/*
-	 * As y^PERIOD = 1/2, we can combine
-	 *    y^n = 1/2^(n/PERIOD) * y^(n%PERIOD)
-	 * With a look-up table which covers y^n (n<PERIOD)
-	 *
-	 * To achieve constant time decay_load.
-	 */
-	if (unlikely(local_n >= LOAD_AVG_PERIOD)) {
-		val >>= local_n / LOAD_AVG_PERIOD;
-		local_n %= LOAD_AVG_PERIOD;
-	}
-
-	val = mul_u64_u32_shr(val, runnable_avg_yN_inv[local_n], 32);
-	return val;
-}
-
-static u32 __accumulate_pelt_segments(u64 periods, u32 d1, u32 d3)
-{
-	u32 c1, c2, c3 = d3; /* y^0 == 1 */
-
-	/*
-	 * c1 = d1 y^p
-	 */
-	c1 = decay_load((u64)d1, periods);
-
-	/*
-	 *            p-1
-	 * c2 = 1024 \Sum y^n
-	 *            n=1
-	 *
-	 *              inf        inf
-	 *    = 1024 ( \Sum y^n - \Sum y^n - y^0 )
-	 *              n=0        n=p
-	 */
-	c2 = LOAD_AVG_MAX - decay_load(LOAD_AVG_MAX, periods) - 1024;
-
-	return c1 + c2 + c3;
-}
-
-/*
- * Accumulate the three separate parts of the sum; d1 the remainder
- * of the last (incomplete) period, d2 the span of full periods and d3
- * the remainder of the (incomplete) current period.
- *
- *           d1          d2           d3
- *           ^           ^            ^
- *           |           |            |
- *         |<->|<----------------->|<--->|
- * ... |---x---|------| ... |------|-----x (now)
- *
- *                           p-1
- * u' = (u + d1) y^p + 1024 \Sum y^n + d3 y^0
- *                           n=1
- *
- *    = u y^p +					(Step 1)
- *
- *                     p-1
- *      d1 y^p + 1024 \Sum y^n + d3 y^0		(Step 2)
- *                     n=1
- */
-static __always_inline u32
-accumulate_sum(u64 delta, int cpu, struct sched_avg *sa,
-	       unsigned long load, unsigned long runnable, int running)
-{
-	unsigned long scale_freq, scale_cpu;
-	u32 contrib = (u32)delta; /* p == 0 -> delta < 1024 */
-	u64 periods;
-
-	scale_freq = arch_scale_freq_capacity(cpu);
-	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
-
-	delta += sa->period_contrib;
-	periods = delta / 1024; /* A period is 1024us (~1ms) */
-
-	/*
-	 * Step 1: decay old *_sum if we crossed period boundaries.
-	 */
-	if (periods) {
-		sa->load_sum = decay_load(sa->load_sum, periods);
-		sa->runnable_load_sum =
-			decay_load(sa->runnable_load_sum, periods);
-		sa->util_sum = decay_load((u64)(sa->util_sum), periods);
-
-		/*
-		 * Step 2
-		 */
-		delta %= 1024;
-		contrib = __accumulate_pelt_segments(periods,
-				1024 - sa->period_contrib, delta);
-	}
-	sa->period_contrib = delta;
-
-	contrib = cap_scale(contrib, scale_freq);
-	if (load)
-		sa->load_sum += load * contrib;
-	if (runnable)
-		sa->runnable_load_sum += runnable * contrib;
-	if (running)
-		sa->util_sum += contrib * scale_cpu;
-
-	return periods;
-}
-
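For readers following the math: decay_load() above reduces y^n to one right
shift per 32 full periods plus a single residual multiply, because y^32 = 0.5.
A standalone userspace sketch of the same trick (not part of the patch;
floating point stands in for the kernel's runnable_avg_yN_inv[] fixed-point
table, and the names are illustrative only):

#include <stdio.h>
#include <stdint.h>
#include <math.h>

#define PERIOD 32	/* chosen so that y^PERIOD == 0.5 */

static uint64_t decay_demo(uint64_t val, unsigned int n)
{
	/* 1/2^(n/PERIOD): one halving per full span of 32 periods */
	val >>= n / PERIOD;
	/* residual y^(n%PERIOD) factor, done in floating point here */
	return (uint64_t)(val * pow(pow(0.5, 1.0 / PERIOD), n % PERIOD));
}

int main(void)
{
	/* 47742 is LOAD_AVG_MAX as generated for sched-pelt.h */
	printf("%llu\n", (unsigned long long)decay_demo(47742, 32));	/* halved: 23871 */
	printf("%llu\n", (unsigned long long)decay_demo(47742, 100));	/* ~5471 */
	return 0;
}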
-/*
- * We can represent the historical contribution to runnable average as the
- * coefficients of a geometric series.  To do this we sub-divide our runnable
- * history into segments of approximately 1ms (1024us); label the segment that
- * occurred N-ms ago p_N, with p_0 corresponding to the current period, e.g.
- *
- * [<- 1024us ->|<- 1024us ->|<- 1024us ->| ...
- *       p0            p1           p2
- *      (now)       (~1ms ago)  (~2ms ago)
- *
- * Let u_i denote the fraction of p_i that the entity was runnable.
- *
- * We then designate the fractions u_i as our co-efficients, yielding the
- * following representation of historical load:
- *   u_0 + u_1*y + u_2*y^2 + u_3*y^3 + ...
- *
- * We choose y based on the width of a reasonable scheduling period, fixing:
- *   y^32 = 0.5
- *
- * This means that the contribution to load ~32ms ago (u_32) will be weighted
- * approximately half as much as the contribution to load within the last ms
- * (u_0).
- *
- * When a period "rolls over" and we have new u_0`, multiplying the previous
- * sum again by y is sufficient to update:
- *   load_avg = u_0` + y*(u_0 + u_1*y + u_2*y^2 + ... )
- *            = u_0 + u_1*y + u_2*y^2 + ...     [re-labeling u_i --> u_{i+1}]
- */
-static __always_inline int
-___update_load_sum(u64 now, int cpu, struct sched_avg *sa,
-		  unsigned long load, unsigned long runnable, int running)
-{
-	u64 delta;
-
-	delta = now - sa->last_update_time;
-	/*
-	 * This should only happen when time goes backwards, which it
-	 * unfortunately does during sched clock init when we swap over to TSC.
-	 */
-	if ((s64)delta < 0) {
-		sa->last_update_time = now;
-		return 0;
-	}
-
-	/*
-	 * Use 1024ns as the unit of measurement since it's a reasonable
-	 * approximation of 1us and fast to compute.
-	 */
-	delta >>= 10;
-	if (!delta)
-		return 0;
-
-	sa->last_update_time += delta << 10;
-
-	/*
-	 * running is a subset of runnable (weight) so running can't be set if
-	 * runnable is clear. But there are some corner cases where the current
-	 * se has been already dequeued but cfs_rq->curr still points to it.
-	 * This means that weight will be 0 but not running for a sched_entity
-	 * but also for a cfs_rq if the latter becomes idle. As an example,
-	 * this happens during idle_balance() which calls
-	 * update_blocked_averages()
-	 */
-	if (!load)
-		runnable = running = 0;
-
-	/*
-	 * Now we know we crossed measurement unit boundaries. The *_avg
-	 * accrues by two steps:
-	 *
-	 * Step 1: accumulate *_sum since last_update_time. If we haven't
-	 * crossed period boundaries, finish.
-	 */
-	if (!accumulate_sum(delta, cpu, sa, load, runnable, running))
-		return 0;
-
-	return 1;
-}
-
-static __always_inline void
-___update_load_avg(struct sched_avg *sa, unsigned long load, unsigned long runnable)
-{
-	u32 divider = LOAD_AVG_MAX - 1024 + sa->period_contrib;
-
-	/*
-	 * Step 2: update *_avg.
-	 */
-	sa->load_avg = div_u64(load * sa->load_sum, divider);
-	sa->runnable_load_avg = div_u64(runnable * sa->runnable_load_sum, divider);
-	sa->util_avg = sa->util_sum / divider;
-}
-
-/*
- * sched_entity:
- *
- *   task:
- *     se_runnable() == se_weight()
- *
- *   group: [ see update_cfs_group() ]
- *     se_weight()   = tg->weight * grq->load_avg / tg->load_avg
- *     se_runnable() = se_weight(se) * grq->runnable_load_avg / grq->load_avg
- *
- *   load_sum := runnable_sum
- *   load_avg = se_weight(se) * runnable_avg
- *
- *   runnable_load_sum := runnable_sum
- *   runnable_load_avg = se_runnable(se) * runnable_avg
- *
- *   XXX collapse load_sum and runnable_load_sum
- *
- * cfs_rq:
- *
- *   load_sum = \Sum se_weight(se) * se->avg.load_sum
- *   load_avg = \Sum se->avg.load_avg
- *
- *   runnable_load_sum = \Sum se_runnable(se) * se->avg.runnable_load_sum
- *   runnable_load_avg = \Sum se->avg.runnable_load_avg
- */
-
-static int
-__update_load_avg_blocked_se(u64 now, int cpu, struct sched_entity *se)
-{
-	if (entity_is_task(se))
-		se->runnable_weight = se->load.weight;
-
-	if (___update_load_sum(now, cpu, &se->avg, 0, 0, 0)) {
-		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
-		return 1;
-	}
-
-	return 0;
-}
-
-static int
-__update_load_avg_se(u64 now, int cpu, struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-	if (entity_is_task(se))
-		se->runnable_weight = se->load.weight;
-
-	if (___update_load_sum(now, cpu, &se->avg, !!se->on_rq, !!se->on_rq,
-				cfs_rq->curr == se)) {
-
-		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
-		return 1;
-	}
-
-	return 0;
-}
-
-static int
-__update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq)
-{
-	if (___update_load_sum(now, cpu, &cfs_rq->avg,
-				scale_load_down(cfs_rq->load.weight),
-				scale_load_down(cfs_rq->runnable_weight),
-				cfs_rq->curr != NULL)) {
-
-		___update_load_avg(&cfs_rq->avg, 1, 1);
-		return 1;
-	}
-
-	return 0;
-}
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
 /**
  * update_tg_load_avg - update the tg's load avg
@@ -3875,12 +3577,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf);
 
 #else /* CONFIG_SMP */
 
-static inline int
-update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
-{
-	return 0;
-}
-
 #define UPDATE_TG	0x0
 #define SKIP_AGE_LOAD	0x0
 #define DO_ATTACH	0x0
diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
new file mode 100644
index 0000000..d693e5e
--- /dev/null
+++ b/kernel/sched/pelt.c
@@ -0,0 +1,308 @@
+/*
+ * Per Entity Load Tracking
+ *
+ *  Copyright (C) 2007 Red Hat, Inc., Ingo Molnar
+ *
+ *  Interactivity improvements by Mike Galbraith
+ *  (C) 2007 Mike Galbraith
+ *
+ *  Various enhancements by Dmitry Adamushko.
+ *  (C) 2007 Dmitry Adamushko
+ *
+ *  Group scheduling enhancements by Srivatsa Vaddagiri
+ *  Copyright IBM Corporation, 2007
+ *  Author: Srivatsa Vaddagiri
+ *
+ *  Scaled math optimizations by Thomas Gleixner
+ *  Copyright (C) 2007, Thomas Gleixner
+ *
+ *  Adaptive scheduling granularity, math enhancements by Peter Zijlstra
+ *  Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra
+ *
+ *  Move PELT related code from fair.c into this pelt.c file
+ *  Author: Vincent Guittot
+ */
+
+#include <linux/sched.h>
+#include "sched.h"
+#include "sched-pelt.h"
+
+/*
+ * Approximate:
+ *   val * y^n,    where y^32 ~= 0.5 (~1 scheduling period)
+ */
+static u64 decay_load(u64 val, u64 n)
+{
+	unsigned int local_n;
+
+	if (unlikely(n > LOAD_AVG_PERIOD * 63))
+		return 0;
+
+	/* after bounds checking we can collapse to 32-bit */
+	local_n = n;
+
+	/*
+	 * As y^PERIOD = 1/2, we can combine
+	 *    y^n = 1/2^(n/PERIOD) * y^(n%PERIOD)
+	 * With a look-up table which covers y^n (n<PERIOD)
+	 *
+	 * To achieve constant time decay_load.
+	 */
+	if (unlikely(local_n >= LOAD_AVG_PERIOD)) {
+		val >>= local_n / LOAD_AVG_PERIOD;
+		local_n %= LOAD_AVG_PERIOD;
+	}
+
+	val = mul_u64_u32_shr(val, runnable_avg_yN_inv[local_n], 32);
+	return val;
+}
+
+static u32 __accumulate_pelt_segments(u64 periods, u32 d1, u32 d3)
+{
+	u32 c1, c2, c3 = d3; /* y^0 == 1 */
+
+	/*
+	 * c1 = d1 y^p
+	 */
+	c1 = decay_load((u64)d1, periods);
+
+	/*
+	 *            p-1
+	 * c2 = 1024 \Sum y^n
+	 *            n=1
+	 *
+	 *              inf        inf
+	 *    = 1024 ( \Sum y^n - \Sum y^n - y^0 )
+	 *              n=0        n=p
+	 */
+	c2 = LOAD_AVG_MAX - decay_load(LOAD_AVG_MAX, periods) - 1024;
+
+	return c1 + c2 + c3;
+}
+
+#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
+
+/*
+ * Accumulate the three separate parts of the sum; d1 the remainder
+ * of the last (incomplete) period, d2 the span of full periods and d3
+ * the remainder of the (incomplete) current period.
+ *
+ *           d1          d2           d3
+ *           ^           ^            ^
+ *           |           |            |
+ *         |<->|<----------------->|<--->|
+ * ... |---x---|------| ... |------|-----x (now)
+ *
+ *                           p-1
+ * u' = (u + d1) y^p + 1024 \Sum y^n + d3 y^0
+ *                           n=1
+ *
+ *    = u y^p +					(Step 1)
+ *
+ *                     p-1
+ *      d1 y^p + 1024 \Sum y^n + d3 y^0		(Step 2)
+ *                     n=1
+ */
+static __always_inline u32
+accumulate_sum(u64 delta, int cpu, struct sched_avg *sa,
+	       unsigned long load, unsigned long runnable, int running)
+{
+	unsigned long scale_freq, scale_cpu;
+	u32 contrib = (u32)delta; /* p == 0 -> delta < 1024 */
+	u64 periods;
+
+	scale_freq = arch_scale_freq_capacity(cpu);
+	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+
+	delta += sa->period_contrib;
+	periods = delta / 1024; /* A period is 1024us (~1ms) */
+
+	/*
+	 * Step 1: decay old *_sum if we crossed period boundaries.
+	 */
+	if (periods) {
+		sa->load_sum = decay_load(sa->load_sum, periods);
+		sa->runnable_load_sum =
+			decay_load(sa->runnable_load_sum, periods);
+		sa->util_sum = decay_load((u64)(sa->util_sum), periods);
+
+		/*
+		 * Step 2
+		 */
+		delta %= 1024;
+		contrib = __accumulate_pelt_segments(periods,
+				1024 - sa->period_contrib, delta);
+	}
+	sa->period_contrib = delta;
+
+	contrib = cap_scale(contrib, scale_freq);
+	if (load)
+		sa->load_sum += load * contrib;
+	if (runnable)
+		sa->runnable_load_sum += runnable * contrib;
+	if (running)
+		sa->util_sum += contrib * scale_cpu;
+
+	return periods;
+}
+
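The closed form used for c2 in __accumulate_pelt_segments() above can be
checked numerically against the naive sum it replaces. A standalone sketch
(not part of the patch; LOAD_AVG_MAX is modeled as the limit of
1024 * \Sum y^n, and floating point replaces the fixed-point table):

#include <stdio.h>
#include <math.h>

int main(void)
{
	const double y = pow(0.5, 1.0 / 32.0);	/* y^32 == 0.5 */
	const double max = 1024.0 / (1.0 - y);	/* 1024 * \Sum_{n>=0} y^n */
	unsigned int p = 10, n;
	double naive = 0.0, closed;

	/* naive:  1024 * \Sum_{n=1}^{p-1} y^n */
	for (n = 1; n <= p - 1; n++)
		naive += 1024.0 * pow(y, n);

	/* closed: c2 = LOAD_AVG_MAX - decay_load(LOAD_AVG_MAX, p) - 1024 */
	closed = max - max * pow(y, p) - 1024.0;

	printf("naive=%.2f closed=%.2f\n", naive, closed);	/* identical */
	return 0;
}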
+/*
+ * We can represent the historical contribution to runnable average as the
+ * coefficients of a geometric series.  To do this we sub-divide our runnable
+ * history into segments of approximately 1ms (1024us); label the segment that
+ * occurred N-ms ago p_N, with p_0 corresponding to the current period, e.g.
+ *
+ * [<- 1024us ->|<- 1024us ->|<- 1024us ->| ...
+ *       p0            p1           p2
+ *      (now)       (~1ms ago)  (~2ms ago)
+ *
+ * Let u_i denote the fraction of p_i that the entity was runnable.
+ *
+ * We then designate the fractions u_i as our co-efficients, yielding the
+ * following representation of historical load:
+ *   u_0 + u_1*y + u_2*y^2 + u_3*y^3 + ...
+ *
+ * We choose y based on the width of a reasonable scheduling period, fixing:
+ *   y^32 = 0.5
+ *
+ * This means that the contribution to load ~32ms ago (u_32) will be weighted
+ * approximately half as much as the contribution to load within the last ms
+ * (u_0).
+ *
+ * When a period "rolls over" and we have new u_0`, multiplying the previous
+ * sum again by y is sufficient to update:
+ *   load_avg = u_0` + y*(u_0 + u_1*y + u_2*y^2 + ... )
+ *            = u_0 + u_1*y + u_2*y^2 + ...     [re-labeling u_i --> u_{i+1}]
+ */
+static __always_inline int
+___update_load_sum(u64 now, int cpu, struct sched_avg *sa,
+		  unsigned long load, unsigned long runnable, int running)
+{
+	u64 delta;
+
+	delta = now - sa->last_update_time;
+	/*
+	 * This should only happen when time goes backwards, which it
+	 * unfortunately does during sched clock init when we swap over to TSC.
+	 */
+	if ((s64)delta < 0) {
+		sa->last_update_time = now;
+		return 0;
+	}
+
+	/*
+	 * Use 1024ns as the unit of measurement since it's a reasonable
+	 * approximation of 1us and fast to compute.
+	 */
+	delta >>= 10;
+	if (!delta)
+		return 0;
+
+	sa->last_update_time += delta << 10;
+
+	/*
+	 * running is a subset of runnable (weight) so running can't be set if
+	 * runnable is clear. But there are some corner cases where the current
+	 * se has been already dequeued but cfs_rq->curr still points to it.
+	 * This means that weight will be 0 but not running for a sched_entity
+	 * but also for a cfs_rq if the latter becomes idle. As an example,
+	 * this happens during idle_balance() which calls
+	 * update_blocked_averages()
+	 */
+	if (!load)
+		runnable = running = 0;
+
+	/*
+	 * Now we know we crossed measurement unit boundaries. The *_avg
+	 * accrues by two steps:
+	 *
+	 * Step 1: accumulate *_sum since last_update_time. If we haven't
+	 * crossed period boundaries, finish.
+	 */
+	if (!accumulate_sum(delta, cpu, sa, load, runnable, running))
+		return 0;
+
+	return 1;
+}
+
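A note on the divider used by ___update_load_avg() below: an always-running
entity can have accumulated at most LOAD_AVG_MAX - 1024 + period_contrib
worth of contribution, because the current period has only contributed
period_contrib of its 1024 so far; dividing by that bound therefore caps
util_avg at the capacity scale. A back-of-the-envelope check, assuming
LOAD_AVG_MAX = 47742 from the generated sched-pelt.h (standalone sketch,
not kernel code):

#include <stdio.h>

#define LOAD_AVG_MAX	47742	/* assumed: generated constant for y^32 = 0.5 */

int main(void)
{
	unsigned int period_contrib = 512;	/* halfway into the current period */
	unsigned int divider = LOAD_AVG_MAX - 1024 + period_contrib;
	/* an always-running CPU saturates util_sum near divider * 1024 */
	unsigned long long util_sum = (unsigned long long)divider * 1024;

	printf("util_avg = %llu\n", util_sum / divider);	/* prints 1024 */
	return 0;
}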
+static __always_inline void
+___update_load_avg(struct sched_avg *sa, unsigned long load, unsigned long runnable)
+{
+	u32 divider = LOAD_AVG_MAX - 1024 + sa->period_contrib;
+
+	/*
+	 * Step 2: update *_avg.
+	 */
+	sa->load_avg = div_u64(load * sa->load_sum, divider);
+	sa->runnable_load_avg = div_u64(runnable * sa->runnable_load_sum, divider);
+	sa->util_avg = sa->util_sum / divider;
+}
+
+/*
+ * sched_entity:
+ *
+ *   task:
+ *     se_runnable() == se_weight()
+ *
+ *   group: [ see update_cfs_group() ]
+ *     se_weight()   = tg->weight * grq->load_avg / tg->load_avg
+ *     se_runnable() = se_weight(se) * grq->runnable_load_avg / grq->load_avg
+ *
+ *   load_sum := runnable_sum
+ *   load_avg = se_weight(se) * runnable_avg
+ *
+ *   runnable_load_sum := runnable_sum
+ *   runnable_load_avg = se_runnable(se) * runnable_avg
+ *
+ *   XXX collapse load_sum and runnable_load_sum
+ *
+ * cfs_rq:
+ *
+ *   load_sum = \Sum se_weight(se) * se->avg.load_sum
+ *   load_avg = \Sum se->avg.load_avg
+ *
+ *   runnable_load_sum = \Sum se_runnable(se) * se->avg.runnable_load_sum
+ *   runnable_load_avg = \Sum se->avg.runnable_load_avg
+ */
+
+int __update_load_avg_blocked_se(u64 now, int cpu, struct sched_entity *se)
+{
+	if (entity_is_task(se))
+		se->runnable_weight = se->load.weight;
+
+	if (___update_load_sum(now, cpu, &se->avg, 0, 0, 0)) {
+		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
+		return 1;
+	}
+
+	return 0;
+}
+
+int __update_load_avg_se(u64 now, int cpu, struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	if (entity_is_task(se))
+		se->runnable_weight = se->load.weight;
+
+	if (___update_load_sum(now, cpu, &se->avg, !!se->on_rq, !!se->on_rq,
+				cfs_rq->curr == se)) {
+
+		___update_load_avg(&se->avg, se_weight(se), se_runnable(se));
+		return 1;
+	}
+
+	return 0;
+}
+
+int __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq)
+{
+	if (___update_load_sum(now, cpu, &cfs_rq->avg,
+				scale_load_down(cfs_rq->load.weight),
+				scale_load_down(cfs_rq->runnable_weight),
+				cfs_rq->curr != NULL)) {
+
+		___update_load_avg(&cfs_rq->avg, 1, 1);
+		return 1;
+	}
+
+	return 0;
+}
diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
new file mode 100644
index 0000000..c312d8c
--- /dev/null
+++ b/kernel/sched/pelt.h
@@ -0,0 +1,17 @@
+#ifdef CONFIG_SMP
+
+int __update_load_avg_blocked_se(u64 now, int cpu, struct sched_entity *se);
+int __update_load_avg_se(u64 now, int cpu, struct cfs_rq *cfs_rq, struct sched_entity *se);
+int __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq);
+
+#else
+
+static inline int
+update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
+{
+	return 0;
+}
+
+#endif
+
+
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 22909ff..783eacf 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -666,7 +666,26 @@ struct dl_rq {
 	u64 bw_ratio;
 };
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+/* An entity is a task if it doesn't "own" a runqueue */
+#define entity_is_task(se)	(!se->my_q)
+#else
+#define entity_is_task(se)	1
+#endif
+
 #ifdef CONFIG_SMP
+/*
+ * XXX we want to get rid of these helpers and use the full load resolution.
+ */
+static inline long se_weight(struct sched_entity *se)
+{
+	return scale_load_down(se->load.weight);
+}
+
+static inline long se_runnable(struct sched_entity *se)
+{
+	return scale_load_down(se->runnable_weight);
+}
 
 static inline bool sched_asym_prefer(int a, int b)
 {

From patchwork Fri Mar 16 11:25:39 2018
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, rjw@rjwysocki.net
Cc: juri.lelli@redhat.com, dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com, viresh.kumar@linaro.org, valentin.schneider@arm.com, Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH 2/4 v4] sched/rt: add rt_rq utilization tracking
Date: Fri, 16 Mar 2018 12:25:39 +0100
Message-Id: <1521199541-15308-3-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>
References: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>

The schedutil governor relies on cfs_rq's util_avg to choose the OPP
when cfs tasks are running. When the CPU is overloaded by cfs and rt
tasks, cfs tasks are preempted by rt tasks, and in this case util_avg
reflects only the remaining capacity that cfs tasks actually use, not
what they want to use.
In such a case, schedutil can select a lower OPP even though the CPU is
overloaded. In order to have a more accurate view of the utilization of
the CPU, we also track the utilization that is "stolen" by rt tasks.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c  |  2 ++
 kernel/sched/pelt.c  | 23 +++++++++++++++++++++++
 kernel/sched/pelt.h  |  7 +++++++
 kernel/sched/rt.c    |  8 ++++++++
 kernel/sched/sched.h |  2 ++
 5 files changed, 42 insertions(+)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bfd56bc..60e3c4b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7190,6 +7190,7 @@ static void update_blocked_averages(int cpu)
 		if (cfs_rq_has_blocked(cfs_rq))
 			done = false;
 	}
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu, &rq->rt, 0);
 
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
@@ -7255,6 +7256,7 @@ static inline void update_blocked_averages(int cpu)
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu, &rq->rt, 0);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
 	if (!cfs_rq_has_blocked(cfs_rq))
diff --git a/kernel/sched/pelt.c b/kernel/sched/pelt.c
index d693e5e..cd51576 100644
--- a/kernel/sched/pelt.c
+++ b/kernel/sched/pelt.c
@@ -306,3 +306,26 @@ int __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq)
 
 	return 0;
 }
+
+/*
+ * rt_rq:
+ *
+ *   util_sum = \Sum se->avg.util_sum but se->avg.util_sum is not tracked
+ *   util_sum = cpu_scale * load_sum
+ *   runnable_load_sum = load_sum
+ *
+ */
+
+int update_rt_rq_load_avg(u64 now, int cpu, struct rt_rq *rt_rq, int running)
+{
+	if (___update_load_sum(now, cpu, &rt_rq->avg,
+				running,
+				running,
+				running)) {
+
+		___update_load_avg(&rt_rq->avg, 1, 1);
+		return 1;
+	}
+
+	return 0;
+}
diff --git a/kernel/sched/pelt.h b/kernel/sched/pelt.h
index c312d8c..78a2107 100644
--- a/kernel/sched/pelt.h
+++ b/kernel/sched/pelt.h
@@ -3,6 +3,7 @@
 int __update_load_avg_blocked_se(u64 now, int cpu, struct sched_entity *se);
 int __update_load_avg_se(u64 now, int cpu, struct cfs_rq *cfs_rq, struct sched_entity *se);
 int __update_load_avg_cfs_rq(u64 now, int cpu, struct cfs_rq *cfs_rq);
+int update_rt_rq_load_avg(u64 now, int cpu, struct rt_rq *rt_rq, int running);
 
 #else
 
@@ -12,6 +13,12 @@ update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
 	return 0;
 }
 
+static inline int
+update_rt_rq_load_avg(u64 now, int cpu, struct rt_rq *rt_rq, int running)
+{
+	return 0;
+}
+
 #endif
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 86b7798..c48078e 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -5,6 +5,8 @@
  */
 #include "sched.h"
 
+#include "pelt.h"
+
 int sched_rr_timeslice = RR_TIMESLICE;
 int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
 
@@ -1570,6 +1572,9 @@ pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 
 	rt_queue_push_tasks(rq);
 
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu_of(rq), rt_rq,
+		rq->curr->sched_class == &rt_sched_class);
+
 	return p;
 }
 
@@ -1577,6 +1582,8 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 {
 	update_curr_rt(rq);
 
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu_of(rq), &rq->rt, 1);
+
 	/*
 	 * The previous task needs to be made eligible for pushing
 	 * if it is still active
@@ -2306,6 +2313,7 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
 	struct sched_rt_entity *rt_se = &p->rt;
 
 	update_curr_rt(rq);
+	update_rt_rq_load_avg(rq_clock_task(rq), cpu_of(rq), &rq->rt, 1);
 
 	watchdog(rq, p);
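To illustrate what this tracking buys: with the hooks above, an rt task that
runs 50% of the time leaves a steady rt_rq utilization near half the capacity
scale, which patch 3 then feeds back into OPP selection. A toy per-millisecond
model (standalone sketch, not kernel code; it collapses PELT to
u' = u*y + running*1024*(1-y)):

#include <stdio.h>
#include <math.h>

int main(void)
{
	const double y = pow(0.5, 1.0 / 32.0);	/* 32ms half-life */
	double util = 0.0;
	int ms;

	for (ms = 0; ms < 1000; ms++) {
		int rt_running = (ms % 2) == 0;	/* rt runs every other ms */
		util = util * y + (rt_running ? 1024.0 * (1.0 - y) : 0.0);
	}

	printf("rt util_avg ~= %.0f\n", util);	/* settles near 512 */
	return 0;
}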
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 783eacf..a8003a9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -592,6 +592,8 @@ struct rt_rq {
 	unsigned long rt_nr_total;
 	int overloaded;
 	struct plist_head pushable_tasks;
+
+	struct sched_avg avg;
 #endif /* CONFIG_SMP */
 	int rt_queued;
 
From patchwork Fri Mar 16 11:25:40 2018
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, rjw@rjwysocki.net
Cc: juri.lelli@redhat.com, dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com, viresh.kumar@linaro.org, valentin.schneider@arm.com, Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH 3/4 v4] cpufreq/schedutil: add rt utilization tracking
Date: Fri, 16 Mar 2018 12:25:40 +0100
Message-Id: <1521199541-15308-4-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>
References: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>

Add both cfs and rt utilization when selecting an OPP, as rt tasks can
preempt cfs tasks and steal their running time.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/cpufreq_schedutil.c | 4 +++-
 kernel/sched/sched.h             | 7 +++++++
 2 files changed, 10 insertions(+), 1 deletion(-)

-- 
2.7.4
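The aggregation rule modified below boils down to the following simplified
model (a sketch, not the actual kernel function: the early return for
runnable rt tasks and the final clamp to max live outside the hunk and are
assumed here from the surrounding schedutil code):

#include <stdio.h>

static unsigned long aggregate_util(unsigned long util_dl,
				    unsigned long util_cfs,
				    unsigned long util_rt,
				    unsigned int cfs_nr_running,
				    unsigned long max)
{
	unsigned long util = util_dl;

	/* count back in the time rt "stole" from cfs */
	if (cfs_nr_running)
		util += util_cfs + util_rt;

	return util < max ? util : max;
}

int main(void)
{
	/* cfs alone looks light (300), but rt stole 400 of capacity 1024 */
	printf("%lu\n", aggregate_util(0, 300, 400, 1, 1024));	/* 700 */
	return 0;
}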
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 89fe78e..7ce0643 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -56,6 +56,7 @@ struct sugov_cpu {
 	/* The fields below are only needed when sharing a policy: */
 	unsigned long util_cfs;
 	unsigned long util_dl;
+	unsigned long util_rt;
 	unsigned long max;
 
 	/* The field below is for single-CPU policies only: */
@@ -178,6 +179,7 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
 	sg_cpu->max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu);
 	sg_cpu->util_cfs = cpu_util_cfs(rq);
 	sg_cpu->util_dl = cpu_util_dl(rq);
+	sg_cpu->util_rt = cpu_util_rt(rq);
 }
 
 static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
@@ -190,7 +192,7 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 	} else {
 		util = sg_cpu->util_dl;
 		if (rq->cfs.h_nr_running)
-			util += sg_cpu->util_cfs;
+			util += sg_cpu->util_cfs + sg_cpu->util_rt;
 	}
 
 	/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a8003a9..b8784be 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2186,4 +2186,11 @@ static inline unsigned long cpu_util_cfs(struct rq *rq)
 {
 	return rq->cfs.avg.util_avg;
 }
+
+static inline unsigned long cpu_util_rt(struct rq *rq)
+{
+	return rq->rt.avg.util_avg;
+}
+
+
 #endif

From patchwork Fri Mar 16 11:25:41 2018
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org, rjw@rjwysocki.net
Cc: juri.lelli@redhat.com, dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com, viresh.kumar@linaro.org, valentin.schneider@arm.com, Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH 4/4 v4] sched/nohz: monitor rt utilization
Date: Fri, 16 Mar 2018 12:25:41 +0100
Message-Id: <1521199541-15308-5-git-send-email-vincent.guittot@linaro.org>
In-Reply-To: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>
References: <1521199541-15308-1-git-send-email-vincent.guittot@linaro.org>

Take into account rt's utilization when deciding to stop the periodic
update of blocked load.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/sched/fair.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

-- 
2.7.4
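Why the periodic update must keep running while rt's util_avg is non-zero: a
blocked rt_rq only loses its contribution through repeated decay, roughly
300ms from half capacity with the standard 32ms half-life. A standalone
sketch of rt_rq_has_blocked()'s role (not kernel code):

#include <stdio.h>
#include <math.h>

int main(void)
{
	const double y = pow(0.5, 1.0 / 32.0);	/* 32ms half-life */
	double util = 512.0;			/* rt task just went to sleep */
	int ms = 0;

	while (util >= 1.0) {			/* rt_rq_has_blocked() analogue */
		util *= y;			/* one periodic decay pass */
		ms++;
	}

	printf("util_avg reached ~0 after %d ms\n", ms);	/* 32 * log2(512) = 288 */
	return 0;
}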
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 60e3c4b..3f00e03 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7131,6 +7131,14 @@ static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq)
 	return false;
 }
 
+static inline bool rt_rq_has_blocked(struct rt_rq *rt_rq)
+{
+	if (rt_rq->avg.util_avg)
+		return true;
+
+	return false;
+}
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
@@ -7191,6 +7199,9 @@ static void update_blocked_averages(int cpu)
 			done = false;
 	}
 	update_rt_rq_load_avg(rq_clock_task(rq), cpu, &rq->rt, 0);
+	/* Don't need periodic decay once load/util_avg are null */
+	if (rt_rq_has_blocked(&rq->rt))
+		done = false;
 
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
@@ -7259,7 +7270,7 @@ static inline void update_blocked_averages(int cpu)
 	update_rt_rq_load_avg(rq_clock_task(rq), cpu, &rq->rt, 0);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
-	if (!cfs_rq_has_blocked(cfs_rq))
+	if (!cfs_rq_has_blocked(cfs_rq) && !rt_rq_has_blocked(&rq->rt))
 		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);