From patchwork Mon Feb  8 12:45:30 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Juri Lelli <juri.lelli@arm.com>
X-Patchwork-Id: 61409
Delivered-To: patch@linaro.org
Received: by 10.112.43.199 with SMTP id y7csp1414870lbl;
 Mon, 8 Feb 2016 04:45:48 -0800 (PST)
X-Received: by 10.66.234.137 with SMTP id ue9mr42387041pac.134.1454935548333; 
 Mon, 08 Feb 2016 04:45:48 -0800 (PST)
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
 by mx.google.com with ESMTP id
 la16si46403987pab.64.2016.02.08.04.45.47; 
 Mon, 08 Feb 2016 04:45:48 -0800 (PST)
Received-SPF: pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67
 as permitted sender) client-ip=209.132.180.67; 
Authentication-Results: mx.google.com;
 spf=pass (google.com: best guess record for domain of
 linux-kernel-owner@vger.kernel.org designates 209.132.180.67
 as permitted sender)
 smtp.mailfrom=linux-kernel-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1752671AbcBHMpq (ORCPT <rfc822;bjorn.andersson@linaro.org>
 + 30 others); Mon, 8 Feb 2016 07:45:46 -0500
Received: from foss.arm.com ([217.140.101.70]:60453 "EHLO foss.arm.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1752074AbcBHMpI (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
 Mon, 8 Feb 2016 07:45:08 -0500
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C5C8E3A8;
 Mon,  8 Feb 2016 04:44:21 -0800 (PST)
Received: from e106622-lin.cambridge.arm.com (e106622-lin.cambridge.arm.com
 [10.1.208.152])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id
 B6FE13F25F; Mon,  8 Feb 2016 04:45:06 -0800 (PST)
From: Juri Lelli <juri.lelli@arm.com>
To: rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
 mingo@redhat.com, luca.abeni@unitn.it, vincent.guittot@linaro.org,
 wanpeng.li@hotmail.com, juri.lelli@arm.com
Subject: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth
Date: Mon,  8 Feb 2016 12:45:30 +0000
Message-Id: <1454935531-7541-2-git-send-email-juri.lelli@arm.com>
X-Mailer: git-send-email 2.7.0
In-Reply-To: <1454935531-7541-1-git-send-email-juri.lelli@arm.com>
References: <1454935531-7541-1-git-send-email-juri.lelli@arm.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Currently SCHED_DEADLINE scheduling policy tracks bandwidth of tasks
that passed admission control at root_domain level only. This creates
problems when such data structure(s) are destroyed, when we reconfigure
scheduling domains for example.

This is part one of two changes required to fix the problem. In this
patch we add per-rq tracking of admitted bandwidth. Tasks bring with
them their bandwidth contribution when they enter the system and are
enqueued for the first time. Contributions are then moved around when
migrations happen and removed when tasks die.

Per-rq admitted bandwidth information will be leveraged in the next
commit to save/restore per-rq bandwidth contribution towards the root
domain (using the rq_{online,offline} mechanism).

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Wanpeng Li <wanpeng.li@hotmail.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
[ original patch by ]
Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
---
 kernel/sched/core.c     |  2 ++
 kernel/sched/deadline.c | 18 ++++++++++++++++++
 kernel/sched/sched.h    | 22 ++++++++++++++++++++++
 3 files changed, 42 insertions(+)

-- 
2.7.0

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 24fcdbf..706ca23 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2449,7 +2449,9 @@ static int dl_overflow(struct task_struct *p, int policy,
 	} else if (dl_policy(policy) && task_has_dl_policy(p) &&
 		   !__dl_overflow(dl_b, cpus, p->dl.dl_bw, new_bw)) {
 		__dl_clear(dl_b, p->dl.dl_bw);
+		__dl_sub_ac(task_rq(p), p->dl.dl_bw);
 		__dl_add(dl_b, new_bw);
+		__dl_add_ac(task_rq(p), new_bw);
 		err = 0;
 	} else if (!dl_policy(policy) && task_has_dl_policy(p)) {
 		__dl_clear(dl_b, p->dl.dl_bw);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cd64c97..2480cab 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -83,6 +83,7 @@ void init_dl_rq(struct dl_rq *dl_rq)
 #else
 	init_dl_bw(&dl_rq->dl_bw);
 #endif
+	dl_rq->ac_bw = 0;
 }
 
 #ifdef CONFIG_SMP
@@ -278,8 +279,10 @@ static struct rq *dl_task_offline_migration(struct rq *rq, struct task_struct *p
 	 * By now the task is replenished and enqueued; migrate it.
 	 */
 	deactivate_task(rq, p, 0);
+	__dl_sub_ac(rq, p->dl.dl_bw);
 	set_task_cpu(p, later_rq->cpu);
 	activate_task(later_rq, p, 0);
+	__dl_add_ac(later_rq, p->dl.dl_bw);
 
 	if (!fallback)
 		resched_curr(later_rq);
@@ -506,6 +509,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
 	 */
 	if (dl_se->dl_new) {
 		setup_new_dl_entity(dl_se, pi_se);
+		__dl_add_ac(rq, dl_se->dl_bw);
 		return;
 	}
 
@@ -955,6 +959,9 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		return;
 	}
 
+	if (p->on_rq == TASK_ON_RQ_MIGRATING)
+		__dl_add_ac(rq, p->dl.dl_bw);
+
 	/*
 	 * If p is throttled, we do nothing. In fact, if it exhausted
 	 * its budget it needs a replenishment and, since it now is on
@@ -980,6 +987,8 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
 	update_curr_dl(rq);
 	__dequeue_task_dl(rq, p, flags);
+	if (p->on_rq == TASK_ON_RQ_MIGRATING)
+		__dl_sub_ac(rq, p->dl.dl_bw);
 }
 
 /*
@@ -1219,6 +1228,8 @@ static void task_dead_dl(struct task_struct *p)
 {
 	struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
 
+	__dl_sub_ac(task_rq(p), p->dl.dl_bw);
+
 	/*
 	 * Since we are TASK_DEAD we won't slip out of the domain!
 	 */
@@ -1511,8 +1522,10 @@ retry:
 	}
 
 	deactivate_task(rq, next_task, 0);
+	__dl_sub_ac(rq, next_task->dl.dl_bw);
 	set_task_cpu(next_task, later_rq->cpu);
 	activate_task(later_rq, next_task, 0);
+	__dl_add_ac(later_rq, next_task->dl.dl_bw);
 	ret = 1;
 
 	resched_curr(later_rq);
@@ -1599,8 +1612,10 @@ static void pull_dl_task(struct rq *this_rq)
 			resched = true;
 
 			deactivate_task(src_rq, p, 0);
+			__dl_sub_ac(src_rq, p->dl.dl_bw);
 			set_task_cpu(p, this_cpu);
 			activate_task(this_rq, p, 0);
+			__dl_add_ac(this_rq, p->dl.dl_bw);
 			dmin = p->dl.deadline;
 
 			/* Is there any other task even earlier? */
@@ -1705,6 +1720,9 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
 	if (!start_dl_timer(p))
 		__dl_clear_params(p);
 
+	if (dl_prio(p->normal_prio))
+		__dl_sub_ac(rq, p->dl.dl_bw);
+
 	/*
 	 * Since this might be the only -deadline task on the rq,
 	 * this is the right place to try to pull some other one
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10f1637..e754680 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -519,6 +519,14 @@ struct dl_rq {
 #else
 	struct dl_bw dl_bw;
 #endif
+
+	/*
+	 * ac_bw keeps track of per rq admitted bandwidth. It only changes
+	 * when a new task is admitted, it dies, it changes scheduling policy
+	 * or is migrated to another rq. It is used to correctly save/resore
+	 * total_bw on root_domain changes.
+	 */
+	u64 ac_bw;
 };
 
 #ifdef CONFIG_SMP
@@ -720,6 +728,20 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 #define raw_rq()		raw_cpu_ptr(&runqueues)
 
+static inline
+void __dl_sub_ac(struct rq *rq, u64 tsk_bw)
+{
+	WARN_ON(rq->dl.ac_bw == 0);
+
+	rq->dl.ac_bw -= tsk_bw;
+}
+
+static inline
+void __dl_add_ac(struct rq *rq, u64 tsk_bw)
+{
+	rq->dl.ac_bw += tsk_bw;
+}
+
 static inline u64 __rq_clock_broken(struct rq *rq)
 {
 	return READ_ONCE(rq->clock);