From patchwork Fri Apr 15 07:07:09 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Juri Lelli X-Patchwork-Id: 65871 Delivered-To: patch@linaro.org Received: by 10.140.93.198 with SMTP id d64csp1021329qge; Fri, 15 Apr 2016 00:07:28 -0700 (PDT) X-Received: by 10.66.241.106 with SMTP id wh10mr27636325pac.130.1460704048819; Fri, 15 Apr 2016 00:07:28 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id vq9si852602pab.209.2016.04.15.00.07.28; Fri, 15 Apr 2016 00:07:28 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753062AbcDOHH1 (ORCPT + 29 others); Fri, 15 Apr 2016 03:07:27 -0400 Received: from foss.arm.com ([217.140.101.70]:47439 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751754AbcDOHH0 (ORCPT ); Fri, 15 Apr 2016 03:07:26 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 0F05C3C; Fri, 15 Apr 2016 00:06:11 -0700 (PDT) Received: from pablo (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 51B3F3F25F; Fri, 15 Apr 2016 00:07:22 -0700 (PDT) Date: Fri, 15 Apr 2016 08:07:09 +0100 From: Juri Lelli To: Xunlei Pang Cc: linux-kernel@vger.kernel.org, Peter Zijlstra , Ingo Molnar , Steven Rostedt , luca abeni Subject: Re: [PATCH] sched/deadline: Fix a bug in dl_overflow() Message-ID: <20160415070709.GA3908@pablo> References: <1460636368-1993-1-git-send-email-xlpang@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1460636368-1993-1-git-send-email-xlpang@redhat.com> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [+Luca] Hi, On 14/04/16 20:19, Xunlei Pang wrote: > I got a minus(very big) dl_b->total_bw during my deadline tests. > > # grep dl /proc/sched_debug > dl_rq[0]: > .dl_nr_running : 0 > .dl_bw->bw : 996147 > .dl_bw->total_bw : -222297900 > > Something unusual must have happened. > > After some digging, I finally noticed that when changing a deadline > task to normal(cfs), and changing it back to deadline immediately, > after it died, we will got the wrong dl_bw->total_bw. > > The root cause is in dl_overflow(), it has: > if (new_bw == p->dl.dl_bw) > return 0; > > 1) When a deadline task is changed to !deadline task, it will start > dl timer in switched_from_dl(), and retain previous deadline parameter > till the timer expires. > 2) If we change it back to deadline with the same bandwidth parameter > before the timer expires, as it keeps the old bandwidth although it > is not a deadline task. dl_overflow() simply returns success without > updating the right data, and got the wrong dl_bw->total_bw. > > The solution is simple, if @p is not deadline, don't return. > > Signed-off-by: Xunlei Pang > --- > kernel/sched/core.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 4a2c79d..5988fee 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2378,7 +2378,8 @@ static int dl_overflow(struct task_struct *p, int policy, > u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0; > int cpus, err = -1; > > - if (new_bw == p->dl.dl_bw) > + /* !deadline task may carry old deadline bandwidth */ > + if (new_bw == p->dl.dl_bw && task_has_dl_policy(p)) Right. I got the same patch that I believe Luca is be already using for his tests (and he also put together the changelog). I never managed to send it out, sorry about that. We can take yours, mine follows just in case we want to take something from the changelog or we want to reverse the if condition. Thanks, - Juri --->8--- >From 4bf38111bd9383035e03d3dc3d42011aaa9e26e7 Mon Sep 17 00:00:00 2001 From: Juri Lelli Date: Thu, 25 Feb 2016 15:50:42 +0100 Subject: [PATCH 2/3] fix a bug in the -deadline utilization tracking mechanism Currently, a task doing while(1) { switch to SCHED_DEADLINE switch to SCHED_OTHER } brings dl_b->total_bw below 0. This happens because when the task switches back from SCHED_DEADLINE to SCHED_OTHER, switched_from_dl() does not clear its deadline parameters (they will be cleared by the deadline timer when it fires). But dl_overflow() removes its utilization from dl_b->total_bw. When the task switches back to SCHED_DEADLINE, the if (new_bw == p->dl.dl_bw) check in dl_overflow() prevents __dl_add() from being called, and so when the task switches back to SCHED_OTHER dl_b->total_bw becomes negative. This patch changes the check so that if the task is switching from SCHED_OTHER to SCHED_DEADLINE __dl_add() is correctly invoked. --- kernel/sched/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 2.5.0 diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9503d59..d59fa20 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2432,7 +2432,7 @@ static int dl_overflow(struct task_struct *p, int policy, u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0; int cpus, err = -1; - if (new_bw == p->dl.dl_bw) + if (task_has_dl_policy(p) && new_bw == p->dl.dl_bw) return 0; /*