From patchwork Wed Apr 26 06:27:56 2017
X-Patchwork-Submitter: Vincent Guittot <vincent.guittot@linaro.org>
X-Patchwork-Id: 98232
From: Vincent Guittot <vincent.guittot@linaro.org>
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org
Cc: dietmar.eggemann@arm.com, Morten.Rasmussen@arm.com, yuyang.du@intel.com,
    pjt@google.com, bsegall@google.com,
    Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH v3] sched/cfs: make util/load_avg more stable
Date: Wed, 26 Apr 2017 08:27:56 +0200
Message-Id: <1493188076-2767-1-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4

In the current implementation of load/util_avg, we assume that the
ongoing time
segment has fully elapsed, and util/load_sum is divided by LOAD_AVG_MAX,
even if part of the time segment still remains to run. As a consequence,
this remaining part is treated as idle time and generates unexpected
variations of the util_avg of a busy CPU in the range [1002..1024[,
whereas util_avg should stay at 1023.

In order to keep the metric stable, we should not consider the ongoing
time segment when computing load/util_avg, but only the segments that
have already fully elapsed. Ignoring the current time segment, however,
adds unwanted latency to the load/util_avg responsiveness, especially
when the time is scaled instead of the contribution.

Instead of waiting for the current time segment to have fully elapsed
before accounting for it in load/util_avg, we can account for the
elapsed part immediately and change the range used to compute
load/util_avg accordingly.

At the very beginning of a new time segment, the past segments have been
decayed and the max value is LOAD_AVG_MAX*y. At the very end of the
current time segment, the max value becomes 1024 (us) + LOAD_AVG_MAX*y,
which is equal to LOAD_AVG_MAX. In fact, the max value is
sa->period_contrib + LOAD_AVG_MAX*y at any point in the time segment.
Taking advantage of the fact that LOAD_AVG_MAX*y == LOAD_AVG_MAX-1024,
the range becomes [0..LOAD_AVG_MAX-1024+sa->period_contrib]. As the
elapsed part is already accounted for in load/util_sum, we update the
max value according to the current position in the time segment instead
of removing its contribution. (A standalone numerical sketch of this
arithmetic follows the patch below.)

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
Changes:
- Correct typos in commit message: s/MAX_LOAD_AVG/LOAD_AVG_MAX/ and
  square bracket

 kernel/sched/fair.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a903276..3531fa1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2916,12 +2916,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	/*
 	 * Step 2: update *_avg.
 	 */
-	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX);
+	sa->load_avg = div_u64(sa->load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	if (cfs_rq) {
 		cfs_rq->runnable_load_avg =
-			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX);
+			div_u64(cfs_rq->runnable_load_sum, LOAD_AVG_MAX - 1024 + sa->period_contrib);
 	}
-	sa->util_avg = sa->util_sum / LOAD_AVG_MAX;
+	sa->util_avg = sa->util_sum / (LOAD_AVG_MAX - 1024 + sa->period_contrib);
 
 	return 1;
 }
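
The arithmetic above can be checked with a minimal standalone user-space
sketch, not kernel code: the PELT constants below are the standard ones
from kernel/sched/fair.c of this era, but the steady-state value of
util_sum for a fully busy CPU is modeled with idealized real-valued
arithmetic (an assumption made for this sketch), so the new formula
settles at exactly 1024 where the kernel's integer truncation settles
at 1023.

/* pelt_sketch.c: user-space model of the divisor change above.
 * Build: gcc -O2 pelt_sketch.c -o pelt_sketch -lm
 */
#include <stdio.h>
#include <math.h>

#define LOAD_AVG_PERIOD	32	/* decay half-life, in 1024us segments */
#define LOAD_AVG_MAX	47742	/* maximum possible load/util sum */

int main(void)
{
	/* Per-segment decay factor y, defined by y^32 == 0.5. */
	double y = pow(0.5, 1.0 / LOAD_AVG_PERIOD);
	unsigned int contrib;

	/*
	 * LOAD_AVG_MAX is the fixed point of sum' = sum*y + 1024, hence
	 * LOAD_AVG_MAX*y == LOAD_AVG_MAX - 1024 up to integer rounding:
	 */
	printf("y             = %f\n", y);			/* ~0.978572 */
	printf("1 - 1024/MAX  = %f\n", 1.0 - 1024.0 / LOAD_AVG_MAX);	/* ~0.978552 */

	/*
	 * A fully busy CPU in steady state, sampled with 'contrib' us
	 * already accrued in the current, not yet elapsed, segment:
	 * util_sum == (LOAD_AVG_MAX - 1024 + contrib) * 1024.
	 */
	for (contrib = 0; contrib <= 1024; contrib += 256) {
		double util_sum = (double)(LOAD_AVG_MAX - 1024 + contrib) * 1024.0;

		/* old: divide by the full-segment max even mid-segment */
		double old_avg = util_sum / LOAD_AVG_MAX;
		/* new: divide by the max reachable at this point in time */
		double new_avg = util_sum / (LOAD_AVG_MAX - 1024 + contrib);

		printf("contrib=%4u  old util_avg=%6.1f  new util_avg=%6.1f\n",
		       contrib, old_avg, new_avg);
	}
	return 0;
}

With the old divisor, util_avg of the busy CPU swings from ~1002 at the
start of a segment back up to 1024 at its end, which is exactly the
[1002..1024[ variation described above; with the position-dependent
divisor it stays flat.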