From patchwork Wed Apr 30 09:24:43 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alex Shi
X-Patchwork-Id: 29389
Message-ID:
 <5360C15B.9060608@linaro.org>
Date: Wed, 30 Apr 2014 17:24:43 +0800
From: Alex Shi
To: Morten Rasmussen
CC: "mingo@redhat.com", "peterz@infradead.org", "vincent.guittot@linaro.org",
 "daniel.lezcano@linaro.org", "efault@gmx.de", "wangyun@linux.vnet.ibm.com",
 "linux-kernel@vger.kernel.org", "mgorman@suse.de"
Subject: Re: [RESEND PATCH V5 0/8] remove cpu_load idx
References: <1397616209-27275-1-git-send-email-alex.shi@linaro.org>
 <20140429145221.GH2639@e103034-lin>
In-Reply-To: <20140429145221.GH2639@e103034-lin>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/29/2014 10:52 PM, Morten Rasmussen wrote:
> On Wed, Apr 16, 2014 at 03:43:21AM +0100, Alex Shi wrote:
>> In the cpu_load decay usage, we mixed the long term, short term load with
>> balance bias, randomly pick a big or small value according to balance
>> destination or source.
>
> I disagree that it is random. min()/max() in {source,target}_load()
> provides a conservative bias to the load estimate that should prevent us
> from trying to pull tasks from the source cpu if its current load is
> just a temporary spike. Likewise, we don't try to pull tasks to the
> target cpu if the load is just a temporary drop.

Thanks a lot for the review, Morten!

Judged by its runnable load, a temporary load spike should be very small,
so it cannot have much impact.
Here, "random" refers to the choice between the short-term and the
long-term load.

>
>> This mix is wrong, the balance bias should be based
>> on task moving cost between cpu groups, not on random history or instant load.
>
> Your patch set actually changes everything to be based on the instant
> load alone. rq->cfs.runnable_load_avg is updated instantaneously when
> tasks are enqueued and dequeued, so this load expression is quite volatile.

It seems we are back to the prediction-correctness argument. :)

The runnable load is not an instantaneous value, because it takes the
runnable history into account. And no testing shows that weighting in more
history load leads to a better balance. The current cpu_load[] entries are
decayed at different rates, and once runnable load is involved there is no
good reason to keep the different decay settings.

>
> What do you mean by "task moving cost"?

I mean the possible LL cache misses, and the memory access latency across
NUMA nodes, after a task moves to another cpu.

>
>> History load maybe diverge a lot from real load, that lead to incorrect bias.
>>
>> like on busy_idx,
>> We mix history load decay and bias together. The ridiculous thing is, when
>> all cpu load are continuous stable, long/short term load is same. then we
>> lose the bias meaning, so any minimum imbalance may cause unnecessary task
>> moving. To prevent this funny thing happen, we have to reuse the
>> imbalance_pct again in find_busiest_group(). But that clearly causes over
>> bias in normal time. If there are some burst load in system, it is more worse.
>
> Isn't imbalance_pct only used once in the periodic load-balance path?

Yes, but we have already applied the source/target_load bias there, so the
load is biased twice. That is over-bias.

>
> It is not clear to me what the over bias problem is. If you have a
> stable situation, I would expect the long and short term load to be the
> same?
Yes. When the long-term and short-term loads are the same, source_load and
target_load are the same as well, so any small temporary load change can
trigger a rebalance. That is bad, given the relatively high cost of moving
a task. So the current code still has to add imbalance_pct to prevent this.
Applying both the source/target_load bias and the imbalance_pct bias is
over-bias.

>
>> As to idle_idx:
>> Though I have some concern of usage correction,
>> https://lkml.org/lkml/2014/3/12/247 but since we are working on cpu
>> idle migration into scheduler. The problem will be reconsidered. We don't
>> need to care it too much now.
>>
>> In fact, the cpu_load decays can be replaced by the sched_avg decay, that
>> also decays load on time. The balance bias part can fully use fixed bias --
>> imbalance_pct, which is already used in newly idle, wake, forkexec balancing
>> and numa balancing scenarios.
>
> As I have said previously, I agree that cpu_load[] is somewhat broken in
> its current form, but I don't see how removing it and replacing it with
> the instantaneous cpu load solves the problems you point out.

I am glad we agree that cpu_load[] is inappropriate. :)

Actually, this patchset just removes it and uses the cpu load that takes
the runnable information into account, which is not 'instantaneous'.

>
> The current cpu_load[] averages the cpu_load over time, while
> rq->cfs.runnable_load_avg is the sum of the currently runnable tasks'
> load_avg_contrib. The former provides a long term view of the cpu_load,

It doesn't. It may or may not use the long-term load, depending only on
which load is bigger or smaller. It just pretends to consider the long-term
load status, but may drop it.

> the latter does not. It can change radically in an instant. I'm
> therefore a bit concerned about the stability of the load-balance
> decisions. However, since most decisions are based on cpu_load[0]
> anyway, we could try setting LB_BIAS to false as Peter suggests and see
> what happens.
Maybe the prediction is reasonable for a single task's history. But for a
cpu's load history, with many tasks being rebalanced, no testing shows that
the current method helps.

For a task's load change, the scheduler has no idea of the future other
than guessing from the task's history. But for a cpu's load change, the
scheduler knows it from task wakeups and balancing, both of which are under
its control and serve its goal.

I think the first patch of this series has the same effect as disabling
LB_BIAS, and the previous results show that performance is good.

Anyway, I just pushed the following patch to github; maybe Fengguang's
testing system will pick it up.

Alex

---
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 5716929..0bf649f 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -43,7 +43,7 @@ SCHED_FEAT(ARCH_POWER, true)
 SCHED_FEAT(HRTICK, false)
 SCHED_FEAT(DOUBLE_TICK, false)
-SCHED_FEAT(LB_BIAS, true)
+SCHED_FEAT(LB_BIAS, false)
-- 
Thanks