From patchwork Wed Apr 26 06:27:56 2017
X-Patchwork-Submitter: Vincent Guittot <vincent.guittot@linaro.org>
X-Patchwork-Id: 98232
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org
Cc: dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com, yuyang.du@intel.com,
    pjt@google.com, bsegall@google.com,
    Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH v3] sched/cfs: make util/load_avg more stable
Date: Wed, 26 Apr 2017 08:27:56 +0200
Message-Id: <1493188076-2767-1-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4

In the current implementation of load/util_avg, we assume that the
ongoing time
segment has fully elapsed, and util/load_sum is divided by LOAD_AVG_MAX,
even if part of the time segment still remains to run. As a consequence,
this remaining part is treated as idle time and generates unexpected
variations of the util_avg of a busy CPU in the range [1002..1024[,
whereas util_avg should stay at 1023.

In order to keep the metric stable, we should not consider the ongoing
time segment when computing load/util_avg, but only the segments that
have already fully elapsed. Ignoring the current time segment, however,
adds unwanted latency to the load/util_avg responsiveness, especially
when the time is scaled instead of the contribution.

Instead of waiting for the current time segment to have fully elapsed
before accounting for it in load/util_avg, we can account for the
elapsed part immediately and change the range used to compute
load/util_avg accordingly.

At the very beginning of a new time segment, the past segments have been
decayed and the max value is LOAD_AVG_MAX*y. At the very end of the
current time segment, the max value becomes 1024 (us) + LOAD_AVG_MAX*y,
which is equal to LOAD_AVG_MAX. In fact, the max value is
sa->period_contrib + LOAD_AVG_MAX*y at any point in the time segment.
Taking advantage of the fact that LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024,
the range becomes [0..LOAD_AVG_MAX-1024+sa->period_contrib]. As the
elapsed part is already accounted for in load/util_sum, we update the
max value according to the current position in the time segment instead
of removing its contribution. (A standalone numerical sketch of this
arithmetic follows the patch below.)

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
Changes:
- Correct typos in commit message: s/MAX_LOAD_AVG/LOAD_AVG_MAX/ and
  square bracket

 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a903276..3531fa1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	/*
 	 * Step 2: update *_avg.
 	 */
-	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	if (cfs_rq) {
 		cfs_rq->runnable_load_avg =
-			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	}
-	sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+	sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
 
 	return 1;
 }
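
The arithmetic above can be checked with a minimal standalone user-space
sketch, not kernel code: the PELT constants below are the standard ones
from kernel/sched/fair.c of this era, but the steady-state value of
util_sum for a fully busy CPU is modeled with idealized real-valued
arithmetic (an assumption made for this sketch), so the new formula
settles at exactly 1024 where the kernel's integer truncation settles
at 1023.

/* pelt_sketch.c: user-space model of the divisor change above.
 * Build: gcc -O2 pelt_sketch.c -o pelt_sketch -lm
 */
#include <stdio.h>
#include <math.h>

#define LOAD_AVG_PERIOD	32	/* decay half-life, in 1024us segments */
#define LOAD_AVG_MAX	47742	/* maximum possible load/util sum */

int main(void)
{
	/* Per-segment decay factor y, defined by y^32 == 0.5. */
	double y = pow(0.5, 1.0 / LOAD_AVG_PERIOD);
	unsigned int contrib;

	/*
	 * LOAD_AVG_MAX is the fixed point of sum' = sum*y + 1024, hence
	 * LOAD_AVG_MAX*y == LOAD_AVG_MAX - 1024 up to integer rounding:
	 */
	printf("y             = %f\n", y);			/* ~0.978572 */
	printf("1 - 1024/MAX  = %f\n", 1.0 - 1024.0 / LOAD_AVG_MAX);	/* ~0.978552 */

	/*
	 * A fully busy CPU in steady state, sampled with 'contrib' us
	 * already accrued in the current, not yet elapsed, segment:
	 * util_sum == (LOAD_AVG_MAX - 1024 + contrib) * 1024.
	 */
	for (contrib = 0; contrib <= 1024; contrib += 256) {
		double util_sum = (double)(LOAD_AVG_MAX - 1024 + contrib) * 1024.0;

		/* old: divide by the full-segment max even mid-segment */
		double old_avg = util_sum / LOAD_AVG_MAX;
		/* new: divide by the max reachable at this point in time */
		double new_avg = util_sum / (LOAD_AVG_MAX - 1024 + contrib);

		printf("contrib=%4u  old util_avg=%6.1f  new util_avg=%6.1f\n",
		       contrib, old_avg, new_avg);
	}
	return 0;
}

With the old divisor, util_avg of the busy CPU swings from ~1002 at the
start of a segment back up to 1024 at its end, which is exactly the
[1002..1024[ variation described above; with the position-dependent
divisor it stays flat.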