From patchwork Wed Feb 10 11:32:58 2016
X-Patchwork-Submitter: Juri Lelli
X-Patchwork-Id: 61638
Date: Wed, 10 Feb 2016 11:32:58 +0000
From: Juri Lelli
To: rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
    luca.abeni@unitn.it, vincent.guittot@linaro.org, wanpeng.li@hotmail.com,
    Juri Lelli
Subject: Re: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth
Message-ID: <20160210113258.GX11415@e106622-lin>
References: <1454935531-7541-1-git-send-email-juri.lelli@arm.com>
 <1454935531-7541-2-git-send-email-juri.lelli@arm.com>
In-Reply-To: <1454935531-7541-2-git-send-email-juri.lelli@arm.com>

Hi,

I've updated this patch because, with a bit more testing and after talking
with Luca in private, I realized that the previous version didn't handle
switching back and forth from SCHED_DEADLINE correctly. Thanks a lot, Luca,
for your feedback (even if not visible on the list).

I've updated the testing branch accordingly and added a test that stresses
switch-in/switch-out to my test suite.

I don't particularly like the fact that we break the scheduling classes
abstraction in __dl_overflow(), so I think a little bit of refactoring is
still needed. But that can also happen afterwards, once we fix the problem
with root domains.

Best,

- Juri

--->8---
From 62f70ca3051672dce209e8355cf5eddc9d825c2a Mon Sep 17 00:00:00 2001
From: Juri Lelli
Date: Sat, 6 Feb 2016 12:41:09 +0000
Subject: [PATCH 1/2] sched/deadline: add per rq tracking of admitted bandwidth

Currently, the SCHED_DEADLINE scheduling policy tracks the bandwidth of
tasks that passed admission control at root_domain level only. This creates
problems when such data structures are destroyed, for example when
scheduling domains are reconfigured.
This is part one of two changes required to fix the problem. In this patch
we add per-rq tracking of admitted bandwidth. Tasks bring with them their
bandwidth contribution when they enter the system and are enqueued for the
first time. Contributions are then moved around when migrations happen and
removed when tasks die.

Per-rq admitted bandwidth information will be leveraged in the next commit
to save/restore per-rq bandwidth contribution towards the root domain
(using the rq_{online,offline} mechanism).

Cc: Ingo Molnar
Cc: Peter Zijlstra
Reported-by: Wanpeng Li
Reported-by: Steven Rostedt
[ original patch by ]
Signed-off-by: Luca Abeni
Signed-off-by: Juri Lelli
---
 kernel/sched/core.c     |  6 +++++-
 kernel/sched/deadline.c | 16 +++++++++++++++-
 kernel/sched/sched.h    | 22 ++++++++++++++++++++++
 3 files changed, 42 insertions(+), 2 deletions(-)

--
2.7.0

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9503d59..0ee0ec2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2432,7 +2432,7 @@ static int dl_overflow(struct task_struct *p, int policy,
 	u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
 	int cpus, err = -1;
 
-	if (new_bw == p->dl.dl_bw)
+	if (task_has_dl_policy(p) && new_bw == p->dl.dl_bw)
 		return 0;
 
 	/*
@@ -2445,14 +2445,18 @@ static int dl_overflow(struct task_struct *p, int policy,
 	if (dl_policy(policy) && !task_has_dl_policy(p) &&
 	    !__dl_overflow(dl_b, cpus, 0, new_bw)) {
 		__dl_add(dl_b, new_bw);
+		__dl_add_ac(task_rq(p), new_bw);
 		err = 0;
 	} else if (dl_policy(policy) && task_has_dl_policy(p) &&
 		   !__dl_overflow(dl_b, cpus, p->dl.dl_bw, new_bw)) {
 		__dl_clear(dl_b, p->dl.dl_bw);
+		__dl_sub_ac(task_rq(p), p->dl.dl_bw);
 		__dl_add(dl_b, new_bw);
+		__dl_add_ac(task_rq(p), new_bw);
 		err = 0;
 	} else if (!dl_policy(policy) && task_has_dl_policy(p)) {
 		__dl_clear(dl_b, p->dl.dl_bw);
+		__dl_sub_ac(task_rq(p), p->dl.dl_bw);
 		err = 0;
 	}
 	raw_spin_unlock(&dl_b->lock);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cd64c97..8ac0fee 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -83,6 +83,7 @@ void init_dl_rq(struct dl_rq *dl_rq)
 #else
 	init_dl_bw(&dl_rq->dl_bw);
 #endif
+	dl_rq->ac_bw = 0;
 }
 
 #ifdef CONFIG_SMP
@@ -278,8 +279,10 @@ static struct rq *dl_task_offline_migration(struct rq *rq, struct task_struct *p
 	 * By now the task is replenished and enqueued; migrate it.
 	 */
 	deactivate_task(rq, p, 0);
+	__dl_sub_ac(rq, p->dl.dl_bw);
 	set_task_cpu(p, later_rq->cpu);
 	activate_task(later_rq, p, 0);
+	__dl_add_ac(later_rq, p->dl.dl_bw);
 
 	if (!fallback)
 		resched_curr(later_rq);
@@ -597,7 +600,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 
 	/*
 	 * The task might have changed its scheduling policy to something
-	 * different than SCHED_DEADLINE (through switched_fromd_dl()).
+	 * different than SCHED_DEADLINE (through switched_from_dl()).
 	 */
 	if (!dl_task(p)) {
 		__dl_clear_params(p);
@@ -955,6 +958,9 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		return;
 	}
 
+	if (p->on_rq == TASK_ON_RQ_MIGRATING)
+		__dl_add_ac(rq, p->dl.dl_bw);
+
 	/*
 	 * If p is throttled, we do nothing. In fact, if it exhausted
 	 * its budget it needs a replenishment and, since it now is on
@@ -980,6 +986,8 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
 	update_curr_dl(rq);
 	__dequeue_task_dl(rq, p, flags);
+	if (p->on_rq == TASK_ON_RQ_MIGRATING)
+		__dl_sub_ac(rq, p->dl.dl_bw);
 }
 
 /*
@@ -1219,6 +1227,8 @@ static void task_dead_dl(struct task_struct *p)
 {
 	struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
 
+	__dl_sub_ac(task_rq(p), p->dl.dl_bw);
+
 	/*
 	 * Since we are TASK_DEAD we won't slip out of the domain!
 	 */
@@ -1511,8 +1521,10 @@ retry:
 	}
 
 	deactivate_task(rq, next_task, 0);
+	__dl_sub_ac(rq, next_task->dl.dl_bw);
 	set_task_cpu(next_task, later_rq->cpu);
 	activate_task(later_rq, next_task, 0);
+	__dl_add_ac(later_rq, next_task->dl.dl_bw);
 	ret = 1;
 
 	resched_curr(later_rq);
@@ -1599,8 +1611,10 @@ static void pull_dl_task(struct rq *this_rq)
 			resched = true;
 
 			deactivate_task(src_rq, p, 0);
+			__dl_sub_ac(src_rq, p->dl.dl_bw);
 			set_task_cpu(p, this_cpu);
 			activate_task(this_rq, p, 0);
+			__dl_add_ac(this_rq, p->dl.dl_bw);
 			dmin = p->dl.deadline;
 
 			/* Is there any other task even earlier? */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 10f1637..242907f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -519,6 +519,14 @@ struct dl_rq {
 #else
 	struct dl_bw dl_bw;
 #endif
+
+	/*
+	 * ac_bw keeps track of per-rq admitted bandwidth. It only changes
+	 * when a new task is admitted, it dies, it changes scheduling policy
+	 * or is migrated to another rq. It is used to correctly save/restore
+	 * total_bw on root_domain changes.
+	 */
+	u64 ac_bw;
 };
 
 #ifdef CONFIG_SMP
@@ -720,6 +728,20 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 #define raw_rq()		raw_cpu_ptr(&runqueues)
 
+static inline
+void __dl_sub_ac(struct rq *rq, u64 tsk_bw)
+{
+	WARN_ON(rq->dl.ac_bw < tsk_bw);
+
+	rq->dl.ac_bw -= tsk_bw;
+}
+
+static inline
+void __dl_add_ac(struct rq *rq, u64 tsk_bw)
+{
+	rq->dl.ac_bw += tsk_bw;
+}
+
 static inline u64 __rq_clock_broken(struct rq *rq)
 {
 	return READ_ONCE(rq->clock);
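
For readers following along, below is a small user-space sketch of the
bookkeeping the patch describes: a task's bandwidth contribution is added to
a runqueue when it is admitted or migrated in, and removed when it migrates
away, dies, or leaves SCHED_DEADLINE. This is not kernel code: struct toy_rq,
toy_to_ratio(), toy_add_ac() and toy_sub_ac() are made-up stand-ins for
dl_rq::ac_bw, to_ratio(), __dl_add_ac() and __dl_sub_ac(); only the 20-bit
fixed-point runtime/period encoding is taken from the kernel.

/*
 * Toy user-space model of the per-rq admitted-bandwidth accounting.
 * Illustrative only; names mimic the patch but nothing here is kernel code.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct toy_rq {
	uint64_t ac_bw;		/* admitted bandwidth, like dl_rq::ac_bw */
};

/* Bandwidth as a 20-bit fixed-point fraction, like the kernel's to_ratio(). */
static uint64_t toy_to_ratio(uint64_t period, uint64_t runtime)
{
	return (runtime << 20) / period;
}

static void toy_add_ac(struct toy_rq *rq, uint64_t tsk_bw)
{
	rq->ac_bw += tsk_bw;		/* task admitted here, or migrated in */
}

static void toy_sub_ac(struct toy_rq *rq, uint64_t tsk_bw)
{
	assert(rq->ac_bw >= tsk_bw);	/* mirrors the WARN_ON() in __dl_sub_ac() */
	rq->ac_bw -= tsk_bw;		/* task migrated away, died or left -deadline */
}

int main(void)
{
	struct toy_rq rq0 = { 0 }, rq1 = { 0 };
	/* 10ms runtime every 100ms period -> roughly 10% of one CPU. */
	uint64_t bw = toy_to_ratio(100 * 1000 * 1000ULL, 10 * 1000 * 1000ULL);

	toy_add_ac(&rq0, bw);		/* admission control passes, task starts on rq0 */
	toy_sub_ac(&rq0, bw);		/* push/pull or offline migration... */
	toy_add_ac(&rq1, bw);		/* ...moves the contribution to rq1 */
	toy_sub_ac(&rq1, bw);		/* task dies or switches away from SCHED_DEADLINE */

	printf("rq0.ac_bw=%llu rq1.ac_bw=%llu (both back to zero)\n",
	       (unsigned long long)rq0.ac_bw, (unsigned long long)rq1.ac_bw);
	return 0;
}

If the accounting balances, every add is eventually matched by a sub on the
rq the task last contributed to, so both counters return to zero once the
task is gone; that is the invariant the WARN_ON() in __dl_sub_ac() is meant
to catch when it is violated.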