From patchwork Thu Jul 3 16:26:10 2014
X-Patchwork-Submitter: Morten Rasmussen
X-Patchwork-Id: 33037
From: Morten Rasmussen <morten.rasmussen@arm.com>
To: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	peterz@infradead.org, mingo@kernel.org
Cc: rjw@rjwysocki.net, vincent.guittot@linaro.org, daniel.lezcano@linaro.org,
	preeti@linux.vnet.ibm.com, Dietmar.Eggemann@arm.com, pjt@google.com
Subject: [RFCv2 PATCH 23/23] sched: Use energy model in load balance path
Date: Thu, 3 Jul 2014 17:26:10 +0100
Message-Id: <1404404770-323-24-git-send-email-morten.rasmussen@arm.com>
In-Reply-To: <1404404770-323-1-git-send-email-morten.rasmussen@arm.com>
References: <1404404770-323-1-git-send-email-morten.rasmussen@arm.com>

From: Dietmar Eggemann

Attempt to pick the source cpu which potentially offers the maximum
energy savings when utilization is moved from it to the destination
cpu. The amount moved is the minimum of the additional utilization the
destination cpu is able to handle and the current utilization of the
source cpu.

Finding the optimal source cpu would require an exhaustive search
across all cpus in all groups. Instead, the source group is determined
based on utilization, probing the energy cost on only a single cpu in
each group.

This implementation does not yet provide actual energy-aware load
balancing. It only showcases how to find the most suitable source
queue (cpu) based on the energy-aware data. The actual load balancing
is still done based on the calculated load-based imbalance.
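As a concrete illustration of the selection rule (the numbers here are
invented, not measurements): if the destination cpu has a current
capacity of 800 and a utilization of 300, it can additionally handle
500. For a source cpu with a utilization of 400 the amount considered
for the move is min(800 - 300, 400) = 400, and the candidate is scored
as energy_diff_util(dst_cpu, 400, 0) + energy_diff_util(src_cpu, -400,
0). The group whose probe cpu yields the smallest sum is preferred as
the source.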
Signed-off-by: Dietmar Eggemann
---
 kernel/sched/fair.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 83 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2acd45a..1ce3a89 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4549,6 +4549,42 @@ static int energy_diff_task(int cpu, struct task_struct *p)
 			p->se.avg.wakeup_avg_sum);
 }
 
+static int energy_diff_cpu(int dst_cpu, int src_cpu)
+{
+	int util_diff, dst_nrg_diff, src_nrg_diff;
+	unsigned long src_curr_cap, src_util;
+	unsigned long dst_curr_cap = get_curr_capacity(dst_cpu);
+	unsigned long dst_util = cpu_load(dst_cpu, 1);
+
+	/*
+	 * If the destination cpu is already fully or even over-utilized
+	 * return error.
+	 */
+	if (dst_curr_cap <= dst_util)
+		return INT_MAX;
+
+	src_curr_cap = get_curr_capacity(src_cpu);
+	src_util = cpu_load(src_cpu, 1);
+
+	/*
+	 * If the source cpu is over-utilized return the minimum value
+	 * to indicate maximum potential energy savings. Performance
+	 * is still given priority over pure energy efficiency here.
+	 */
+	if (src_curr_cap < src_util)
+		return INT_MIN;
+
+	util_diff = min(dst_curr_cap - dst_util, src_util);
+
+	dst_nrg_diff = energy_diff_util(dst_cpu, util_diff, 0);
+	src_nrg_diff = energy_diff_util(src_cpu, -util_diff, 0);
+
+	if (dst_nrg_diff == INT_MAX || src_nrg_diff == INT_MAX)
+		return INT_MAX;
+
+	return dst_nrg_diff + src_nrg_diff;
+}
+
 static int wake_wide(struct task_struct *p)
 {
 	int factor = this_cpu_read(sd_llc_size);
@@ -5488,6 +5524,9 @@ struct lb_env {
 	unsigned int		loop_max;
 
 	enum fbq_type		fbq_type;
+
+	unsigned int		use_ea; /* Use energy aware lb */
+
 };
 
 /*
@@ -5957,6 +5996,7 @@ struct sg_lb_stats {
 	unsigned int nr_numa_running;
 	unsigned int nr_preferred_running;
 #endif
+	int nrg_diff; /* Maximum energy difference btwn dst_cpu and probe_cpu */
 };
 
 /*
@@ -5969,9 +6009,11 @@ struct sd_lb_stats {
 	unsigned long total_load;	/* Total load of all groups in sd */
 	unsigned long total_capacity;	/* Total capacity of all groups in sd */
 	unsigned long avg_load;	/* Average load across all groups in sd */
+	unsigned int use_ea;	/* Use energy aware lb */
 
 	struct sg_lb_stats busiest_stat;/* Statistics of the busiest group */
 	struct sg_lb_stats local_stat;	/* Statistics of the local group */
+
 };
 
 static inline void init_sd_lb_stats(struct sd_lb_stats *sds)
@@ -5987,8 +6029,10 @@ static inline void init_sd_lb_stats(struct sd_lb_stats *sds)
 		.local = NULL,
 		.total_load = 0UL,
 		.total_capacity = 0UL,
+		.use_ea = 0,
 		.busiest_stat = {
 			.avg_load = 0UL,
+			.nrg_diff = INT_MAX,
 		},
 	};
 }
@@ -6282,20 +6326,32 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 			struct sched_group *group, int load_idx,
 			int local_group, struct sg_lb_stats *sgs)
 {
-	unsigned long load;
-	int i;
+	unsigned long load, probe_util = 0;
+	int i, probe_cpu = cpumask_first(sched_group_cpus(group));
 
 	memset(sgs, 0, sizeof(*sgs));
 
+	sgs->nrg_diff = INT_MAX;
+
 	for_each_cpu_and(i, sched_group_cpus(group), env->cpus) {
 		struct rq *rq = cpu_rq(i);
 
 		/* Bias balancing toward cpus of our domain */
 		if (local_group)
 			load = target_load(i, load_idx, 0);
-		else
+		else {
 			load = source_load(i, load_idx, 0);
 
+			if (energy_aware()) {
+				unsigned long util = source_load(i, load_idx, 1);
+
+				if (probe_util < util) {
+					probe_util = util;
+					probe_cpu = i;
+				}
+			}
+		}
+
 		sgs->group_load += load;
 		sgs->sum_nr_running += rq->nr_running;
 #ifdef CONFIG_NUMA_BALANCING
@@ -6321,6 +6377,9 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 
 	if (sgs->group_capacity_factor > sgs->sum_nr_running)
 		sgs->group_has_free_capacity = 1;
+
+	if (energy_aware() && !local_group)
+		sgs->nrg_diff = energy_diff_cpu(env->dst_cpu, probe_cpu);
 }
 
 /**
@@ -6341,6 +6400,14 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 				   struct sched_group *sg,
 				   struct sg_lb_stats *sgs)
 {
+	if (energy_aware()) {
+		if (sgs->nrg_diff < sds->busiest_stat.nrg_diff) {
+			sds->use_ea = 1;
+			return true;
+		}
+		sds->use_ea = 0;
+	}
+
 	if (sgs->avg_load <= sds->busiest_stat.avg_load)
 		return false;
 
@@ -6450,6 +6517,8 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 		if (update_sd_pick_busiest(env, sds, sg, sgs)) {
 			sds->busiest = sg;
 			sds->busiest_stat = *sgs;
+			if (energy_aware())
+				env->use_ea = sds->use_ea;
 		}
 
 next_group:
@@ -6761,7 +6830,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 {
 	struct rq *busiest = NULL, *rq;
 	unsigned long busiest_load = 0, busiest_capacity = 1;
-	int i;
+	int i, min_nrg = INT_MAX;
 
 	for_each_cpu_and(i, sched_group_cpus(group), env->cpus) {
 		unsigned long capacity, capacity_factor, load;
@@ -6807,6 +6876,14 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		    load > env->imbalance)
 			continue;
 
+		if (energy_aware() && env->use_ea) {
+			int nrg = energy_diff_cpu(env->dst_cpu, i);
+
+			if (nrg < min_nrg) {
+				min_nrg = nrg;
+				busiest = rq;
+			}
+		}
 		/*
 		 * For the load comparisons with the other cpu's, consider
 		 * the cpu_load() scaled with the cpu capacity, so
@@ -6818,7 +6895,7 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		 * to: load_i * capacity_j > load_j * capacity_i; where j is
 		 * our previous maximum.
 		 */
-		if (load * busiest_capacity > busiest_load * capacity) {
+		else if (load * busiest_capacity > busiest_load * capacity) {
 			busiest_load = load;
 			busiest_capacity = capacity;
 			busiest = rq;
@@ -6915,6 +6992,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 		.loop_break	= sched_nr_migrate_break,
 		.cpus		= cpus,
 		.fbq_type	= all,
+		.use_ea		= 0,
 	};
 
 	/*
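For readers who want to experiment with the selection logic outside the
kernel, below is a self-contained user-space sketch of the same
decision structure. It mirrors energy_diff_cpu() and the min-nrg
candidate loop from find_busiest_queue(); the per-cpu
capacity/utilization numbers and the linear cost in the toy
energy_diff_util() are invented placeholders for the platform energy
model, and get_curr_capacity()/cpu_load() are replaced by static
arrays. It is a sketch, not the patch's implementation.

/*
 * Standalone user-space sketch, NOT kernel code. It mirrors the
 * decision structure of energy_diff_cpu() and the min-nrg candidate
 * selection from find_busiest_queue(). The per-cpu data and the
 * linear energy model below are invented for illustration only.
 */
#include <limits.h>
#include <stdio.h>

#define NR_CPUS 4

/* Invented per-cpu state: cpus 0,1 are "big", cpus 2,3 are "little". */
static const unsigned long curr_cap[NR_CPUS] = { 800, 800, 400, 400 };
static const unsigned long util[NR_CPUS]     = { 300, 100, 350, 380 };

/* Toy stand-in for energy_diff_util(cpu, util_diff, 0). */
static int energy_diff_util(int cpu, int util_diff)
{
	/* Pretend a big cpu costs 2 energy units per unit of util, a little 1. */
	return util_diff * (cpu < 2 ? 2 : 1);
}

/* Same shape as energy_diff_cpu() in the patch. */
static int energy_diff_cpu(int dst_cpu, int src_cpu)
{
	unsigned long spare;
	int util_diff;

	/* Destination already fully or even over-utilized: no candidate. */
	if (curr_cap[dst_cpu] <= util[dst_cpu])
		return INT_MAX;

	/* Over-utilized source: maximum potential savings (performance first). */
	if (curr_cap[src_cpu] < util[src_cpu])
		return INT_MIN;

	spare = curr_cap[dst_cpu] - util[dst_cpu];
	util_diff = (int)(spare < util[src_cpu] ? spare : util[src_cpu]);

	return energy_diff_util(dst_cpu, util_diff) +
	       energy_diff_util(src_cpu, -util_diff);
}

int main(void)
{
	int dst = 1, src, best = -1, min_nrg = INT_MAX;

	for (src = 0; src < NR_CPUS; src++) {
		int nrg;

		if (src == dst)
			continue;

		nrg = energy_diff_cpu(dst, src);
		printf("src cpu%d: nrg_diff = %d\n", src, nrg);
		if (nrg < min_nrg) {
			min_nrg = nrg;
			best = src;
		}
	}
	printf("preferred source: cpu%d (nrg_diff = %d)\n", best, min_nrg);
	return 0;
}

With this toy linear model, pulling utilization from the other big cpu
is energy-neutral and pulling from a little cpu costs energy; only a
real platform energy model (e.g. one where vacating a cpu saves idle
power) produces the negative sums that represent actual savings.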