From patchwork Wed Apr 30 09:24:43 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi <alex.shi@linaro.org>
X-Patchwork-Id: 29389
Message-ID: <5360C15B.9060608@linaro.org>
Date: Wed, 30 Apr 2014 17:24:43 +0800
From: Alex Shi <alex.shi@linaro.org>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:24.0) Gecko/20100101 Thunderbird/24.4.0
MIME-Version: 1.0
To: Morten Rasmussen <morten.rasmussen@arm.com>
CC: "mingo@redhat.com" <mingo@redhat.com>,
 "peterz@infradead.org" <peterz@infradead.org>,
 "vincent.guittot@linaro.org" <vincent.guittot@linaro.org>,
 "daniel.lezcano@linaro.org" <daniel.lezcano@linaro.org>,
 "efault@gmx.de" <efault@gmx.de>,
 "wangyun@linux.vnet.ibm.com" <wangyun@linux.vnet.ibm.com>,
 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
 "mgorman@suse.de" <mgorman@suse.de>
Subject: Re: [RESEND PATCH V5 0/8] remove cpu_load idx
References: <1397616209-27275-1-git-send-email-alex.shi@linaro.org>
 <20140429145221.GH2639@e103034-lin>
In-Reply-To: <20140429145221.GH2639@e103034-lin>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: list
List-ID: <patchwork-forward.linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/29/2014 10:52 PM, Morten Rasmussen wrote:
> On Wed, Apr 16, 2014 at 03:43:21AM +0100, Alex Shi wrote:
>> In the cpu_load decay usage, we mixed the long term, short term load with
>> balance bias, randomly pick a big or small value according to balance 
>> destination or source.
> 
> I disagree that it is random. min()/max() in {source,target}_load()
> provides a conservative bias to the load estimate that should prevent us
> from trying to pull tasks from the source cpu if its current load is
> just a temporary spike. Likewise, we don't try to pull tasks to the
> target cpu if the load is just a temporary drop.

Thanks a lot for the review, Morten!

A temporary load spike should look very small once it is measured by
runnable load, so it cannot have much impact.
Here "random" refers to the choice between short-term and long-term load.
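
For reference, the selection being discussed can be sketched in plain C
as below. This is a simplified userspace model, not the kernel source:
struct rq_model, model_source_load() and model_target_load() are
hypothetical names, the lb_bias argument stands in for the LB_BIAS
scheduler feature, and the now field stands in for the cpu's current load.

```c
#include <stdbool.h>

#define CPU_LOAD_IDX_MAX 5

/* Hypothetical stand-in for the per-cpu run queue fields involved. */
struct rq_model {
	unsigned long cpu_load[CPU_LOAD_IDX_MAX]; /* decayed history, per idx */
	unsigned long now;                        /* current load of the cpu */
};

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Low guess for the cpu we might pull from: smaller of history and now. */
unsigned long model_source_load(const struct rq_model *rq, int type,
				bool lb_bias)
{
	if (type == 0 || !lb_bias)
		return rq->now;
	return min_ul(rq->cpu_load[type - 1], rq->now);
}

/* High guess for the cpu we might push to: larger of history and now. */
unsigned long model_target_load(const struct rq_model *rq, int type,
				bool lb_bias)
{
	if (type == 0 || !lb_bias)
		return rq->now;
	return max_ul(rq->cpu_load[type - 1], rq->now);
}
```

In this model, whether the history value or the current value is
returned depends only on which one happens to be smaller or larger at
that moment. With lb_bias false, both helpers return the current load.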

> 
>> This mix is wrong, the balance bias should be based
>> on task moving cost between cpu groups, not on random history or instant load.
> 
> Your patch set actually changes everything to be based on the instant
> load alone. rq->cfs.runnable_load_avg is updated instantaneously when
> tasks are enqueued and dequeued, so this load expression is quite volatile.

It seems we are back to the prediction-correctness argument. :)

The runnable load is not instantaneous, because it already takes runnable
history into account. And no testing shows that weighting history load
more heavily leads to a better balance. The current cpu_load[] entries
are decayed at different rates, and there is no good reason to keep the
different decay levels now that runnable load is involved.
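
The different decay levels mentioned above can be sketched as follows.
This is a simplified userspace model of the per-tick cpu_load[] update,
assuming exactly one update per tick and ignoring missed ticks;
model_update_cpu_load() is a hypothetical name.

```c
#define CPU_LOAD_IDX_MAX 5

/*
 * Fold the current load into each history slot. Slot i keeps
 * (2^i - 1) / 2^i of its old value, so higher indexes react more slowly.
 */
void model_update_cpu_load(unsigned long cpu_load[CPU_LOAD_IDX_MAX],
			   unsigned long this_load)
{
	unsigned long scale = 2;
	int i;

	cpu_load[0] = this_load; /* idx 0 carries no history at all */

	for (i = 1; i < CPU_LOAD_IDX_MAX; i++, scale += scale) {
		unsigned long old_load = cpu_load[i];
		unsigned long new_load = this_load;

		/* Round up when load is rising so the average can reach it. */
		if (new_load > old_load)
			new_load += scale - 1;

		cpu_load[i] = (old_load * (scale - 1) + new_load) >> i;
	}
}
```

Starting from an idle history and a single tick at load 1024, the slots
in this model become 1024, 512, 256, 128 and 64, which shows how far the
higher indexes lag behind the current load.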

> 
> What do you mean by "task moving cost"?

I mean the possible LL (last-level) cache misses, and the memory access
latency across NUMA nodes, after a task moves to another cpu.

> 
>> History load may diverge a lot from real load, and that leads to incorrect bias.
>>
>> like on busy_idx,
>> We mix history load decay and bias together. The ridiculous thing is, when 
>> all cpu load are continuous stable, long/short term load is same. then we 
>> lose the bias meaning, so any minimum imbalance may cause unnecessary task
>> moving. To prevent this funny thing happen, we have to reuse the 
>> imbalance_pct again in find_busiest_group().  But that clearly causes over
>> bias in normal time. If there are some burst load in system, it is more worse.
> 
> Isn't imbalance_pct only used once in the periodic load-balance path?

Yes, but we have already applied the source/target_load bias. So we have
biased twice, and that is over-bias.

> 
> It is not clear to me what the over bias problem is. If you have a
> stable situation, I would expect the long and short term load to be the
> same?

Yes, the long-term and short-term loads are the same, so source_load and
target_load are the same, and then any minimal temporary load change can
cause a rebalance. That is bad considering the relatively high cost of
moving a task. So the current code still needs imbalance_pct to prevent
such things from happening. Using both the source/target load bias and
the imbalance_pct bias is over-bias.
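
To make the fixed-bias side of this concrete, here is a small sketch of a
percentage check of that kind. It is an illustration only:
model_needs_balance() is a hypothetical helper, and 125 is an assumed
example value for imbalance_pct.

```c
#include <stdbool.h>

/*
 * Treat the busiest group as imbalanced only if its load exceeds the
 * local load by more than (imbalance_pct - 100) percent.
 */
bool model_needs_balance(unsigned long busiest_load, unsigned long local_load,
			 unsigned int imbalance_pct)
{
	return 100 * busiest_load > (unsigned long)imbalance_pct * local_load;
}
```

With imbalance_pct at 125, a busiest load of 1050 against a local load of
1000 is left alone, while 1300 against 1000 triggers balancing. With no
margin at all (100), the 1050 case already triggers, which is the
"any minimal load change" problem described above.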

> 
>> As to idle_idx:
>> Though I have some concern about usage correctness, 
>> https://lkml.org/lkml/2014/3/12/247 but since we are working on cpu
>> idle migration into scheduler. The problem will be reconsidered. We don't
>> need to care it too much now.
>>
>> In fact, the cpu_load decays can be replaced by the sched_avg decay, that 
>> also decays load on time. The balance bias part can fully use fixed bias --
>> imbalance_pct, which is already used in newly idle, wake, forkexec balancing
>> and numa balancing scenarios.
> 
> As I have said previously, I agree that cpu_load[] is somewhat broken in
> its current form, but I don't see how removing it and replacing it with
> the instantaneous cpu load solves the problems you point out.

I am glad that we agree that cpu_load[] is inappropriate. :)

Actually, this patchset just removes it and uses the cpu load that takes
the runnable info into account, which is not 'instantaneous'.

> 
> The current cpu_load[] averages the cpu_load over time, while
> rq->cfs.runnable_load_avg is the sum of the currently runnable tasks'
> load_avg_contrib. The former provides a long term view of the cpu_load,

It doesn't. It may or may not use the long-term load, depending only on
which load is bigger or smaller. It just pretends to consider the
long-term load status, but may drop it.

> the latter does not. It can change radically in an instant. I'm
> therefore a bit concerned about the stability of the load-balance
> decisions. However, since most decisions are based on cpu_load[0]
> anyway, we could try setting LB_BIAS to false as Peter suggests and see
> what happens.


Maybe the prediction is reasonable for per-task history. But for cpu
load history, with many tasks being rebalanced, no testing shows that the
current method is helpful.

For a task load change, the scheduler has no idea of the future except to
guess from the task's history. But for a cpu load change, the scheduler
knows about it from task wakeup and balancing, both of which are under
its control and serve its aim.


I think the first patch of this series has the same effect as disabling
LB_BIAS, and the previous results show that performance is good.

Anyway, I just pushed the following patch to github; maybe Fengguang's
testing system will pick it up.

    Alex
---

diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 5716929..0bf649f 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -43,7 +43,7 @@ SCHED_FEAT(ARCH_POWER, true)

 SCHED_FEAT(HRTICK, false)
 SCHED_FEAT(DOUBLE_TICK, false)
-SCHED_FEAT(LB_BIAS, true)
+SCHED_FEAT(LB_BIAS, false)


-- 
Thanks