From patchwork Tue Feb 13 10:31:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 128212
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
	valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
	dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH v4 1/3] sched: Stop nohz stats when decayed
Date: Tue, 13 Feb 2018 11:31:17 +0100
Message-Id: <1518517879-2280-2-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518517879-2280-1-git-send-email-vincent.guittot@linaro.org>
References: <1518517879-2280-1-git-send-email-vincent.guittot@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Stop the periodic update of blocked load once the blocked load of all
idle CPUs has fully decayed. We introduce a new nohz.has_blocked flag
that reflects whether some idle CPUs have blocked load that has to be
periodically updated. nohz.has_blocked is set every time an idle CPU can
have blocked load, and it is cleared when no more blocked load is
detected during an update. We don't need atomic operations, only to make
sure of the right ordering when updating nohz.idle_cpus_mask and
nohz.has_blocked.

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c  | 122 ++++++++++++++++++++++++++++++++++++++++++---------
 kernel/sched/sched.h |   1 +
 2 files changed, 102 insertions(+), 21 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7af1fa9..5a6835e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5383,8 +5383,9 @@ decay_load_missed(unsigned long load, unsigned long missed_updates, int idx)
 static struct {
 	cpumask_var_t idle_cpus_mask;
 	atomic_t nr_cpus;
+	int has_blocked;		/* Idle CPUs have blocked load */
 	unsigned long next_balance;     /* in jiffy units */
-	unsigned long next_stats;
+	unsigned long next_blocked;	/* Next update of blocked load in jiffies */
 } nohz ____cacheline_aligned;
 
 #endif /* CONFIG_NO_HZ_COMMON */
@@ -6951,6 +6952,7 @@ enum fbq_type { regular, remote, all };
 #define LBF_DST_PINNED  0x04
 #define LBF_SOME_PINNED	0x08
 #define LBF_NOHZ_STATS	0x10
+#define LBF_NOHZ_AGAIN	0x20
 
 struct lb_env {
 	struct sched_domain	*sd;
@@ -7335,8 +7337,6 @@ static void attach_tasks(struct lb_env *env)
 	rq_unlock(env->dst_rq, &rf);
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
 	if (cfs_rq->load.weight)
@@ -7354,11 +7354,14 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	return true;
 }
 
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq, *pos;
 	struct rq_flags rf;
+	bool done = true;
 
 	rq_lock_irqsave(rq, &rf);
 	update_rq_clock(rq);
@@ -7388,10 +7391,14 @@ static void update_blocked_averages(int cpu)
 		 */
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
+		else
+			done = false;
 	}
 
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
+	if (done)
+		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);
 }
@@ -7454,6 +7461,8 @@ static inline void update_blocked_averages(int cpu)
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
+	if (cfs_rq_is_decayed(cfs_rq))
+		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);
 }
@@ -7789,18 +7798,25 @@ group_type group_classify(struct sched_group *group,
 	return group_other;
 }
 
-static void update_nohz_stats(struct rq *rq)
+static bool update_nohz_stats(struct rq *rq)
 {
 #ifdef CONFIG_NO_HZ_COMMON
 	unsigned int cpu = rq->cpu;
 
+	if (!rq->has_blocked_load)
+		return false;
+
 	if (!cpumask_test_cpu(cpu, nohz.idle_cpus_mask))
-		return;
+		return false;
 
 	if (!time_after(jiffies, rq->last_blocked_load_update_tick))
-		return;
+		return true;
 
 	update_blocked_averages(cpu);
+
+	return rq->has_blocked_load;
+#else
+	return false;
 #endif
 }
 
@@ -7826,8 +7842,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
 		struct rq *rq = cpu_rq(i);
 
-		if (env->flags & LBF_NOHZ_STATS)
-			update_nohz_stats(rq);
+		if ((env->flags & LBF_NOHZ_STATS) && update_nohz_stats(rq))
+			env->flags |= LBF_NOHZ_AGAIN;
 
 		/* Bias balancing toward cpus of our domain */
 		if (local_group)
@@ -7979,18 +7995,17 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 	struct sg_lb_stats *local = &sds->local_stat;
 	struct sg_lb_stats tmp_sgs;
 	int load_idx, prefer_sibling = 0;
+#ifdef CONFIG_NO_HZ_COMMON
+	int has_blocked = READ_ONCE(nohz.has_blocked);
+#endif
 	bool overload = false;
 
 	if (child && child->flags & SD_PREFER_SIBLING)
 		prefer_sibling = 1;
 
 #ifdef CONFIG_NO_HZ_COMMON
-	if (env->idle == CPU_NEWLY_IDLE) {
+	if (env->idle == CPU_NEWLY_IDLE && has_blocked)
 		env->flags |= LBF_NOHZ_STATS;
-
-		if (cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd)))
-			nohz.next_stats = jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD);
-	}
 #endif
 
 	load_idx = get_sd_load_idx(env->sd, env->idle);
@@ -8046,6 +8061,15 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 		sg = sg->next;
 	} while (sg != env->sd->groups);
 
+#ifdef CONFIG_NO_HZ_COMMON
+	if ((env->flags & LBF_NOHZ_AGAIN) &&
+	    cpumask_subset(nohz.idle_cpus_mask, sched_domain_span(env->sd))) {
+
+		WRITE_ONCE(nohz.next_blocked,
+			   jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD));
+	}
+#endif
+
 	if (env->sd->flags & SD_NUMA)
 		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
 
@@ -9069,6 +9093,8 @@ static void nohz_balancer_kick(struct rq *rq)
 	struct sched_domain *sd;
 	int nr_busy, i, cpu = rq->cpu;
 	unsigned int flags = 0;
+	unsigned long has_blocked = READ_ONCE(nohz.has_blocked);
+	unsigned long next_blocked = READ_ONCE(nohz.next_blocked);
 
 	if (unlikely(rq->idle_balance))
 		return;
@@ -9086,7 +9112,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	if (likely(!atomic_read(&nohz.nr_cpus)))
 		return;
 
-	if (time_after(now, nohz.next_stats))
+	if (time_after(now, next_blocked) && has_blocked)
 		flags = NOHZ_STATS_KICK;
 
 	if (time_before(now, nohz.next_balance))
@@ -9207,13 +9233,26 @@ void nohz_balance_enter_idle(int cpu)
 	if (!housekeeping_cpu(cpu, HK_FLAG_SCHED))
 		return;
 
+	/*
+	 * Can be set safely without rq->lock held
+	 * If a clear happens, it will have evaluated last additions because
+	 * rq->lock is held during the check and the clear
+	 */
+	rq->has_blocked_load = 1;
+
+	/*
+	 * The tick is still stopped but load could have been added in the
+	 * meantime. We set the nohz.has_blocked flag to trigger a check of the
+	 * *_avg. The CPU is already part of nohz.idle_cpus_mask so the clear
+	 * of nohz.has_blocked can only happen after checking the new load
+	 */
 	if (rq->nohz_tick_stopped)
-		return;
+		goto out;
 
 	/*
 	 * If we're a completely isolated CPU, we don't play.
 	 */
-	if (on_null_domain(cpu_rq(cpu)))
+	if (on_null_domain(rq))
 		return;
 
 	rq->nohz_tick_stopped = 1;
@@ -9221,7 +9260,21 @@ void nohz_balance_enter_idle(int cpu)
 	cpumask_set_cpu(cpu, nohz.idle_cpus_mask);
 	atomic_inc(&nohz.nr_cpus);
 
+	/*
+	 * Ensures that if nohz_idle_balance() fails to observe our
+	 * @idle_cpus_mask store, it must observe the @has_blocked
+	 * store.
+	 */
+	smp_mb__after_atomic();
+
 	set_cpu_sd_state_idle(cpu);
+
+out:
+	/*
+	 * Each time a cpu enters idle, we assume that it has blocked load and
+	 * enable the periodic update of the load of idle cpus
+	 */
+	WRITE_ONCE(nohz.has_blocked, 1);
 }
 #else
 static inline void nohz_balancer_kick(struct rq *rq) { }
@@ -9355,7 +9408,7 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	/* Earliest time when we have to do rebalance again */
 	unsigned long now = jiffies;
 	unsigned long next_balance = now + 60*HZ;
-	unsigned long next_stats = now + msecs_to_jiffies(LOAD_AVG_PERIOD);
+	bool has_blocked_load = false;
 	int update_next_balance = 0;
 	int this_cpu = this_rq->cpu;
 	unsigned int flags;
@@ -9374,6 +9427,22 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
+	/*
+	 * We assume there will be no idle load after this update and clear
+	 * the has_blocked flag. If a cpu enters idle in the mean time, it will
+	 * set the has_blocked flag and trigger another update of idle load.
+	 * Because a cpu that becomes idle, is added to idle_cpus_mask before
+	 * setting the flag, we are sure to not clear the state and not
+	 * check the load of an idle cpu.
+	 */
+	WRITE_ONCE(nohz.has_blocked, 0);
+
+	/*
+	 * Ensures that if we miss the CPU, we must see the has_blocked
+	 * store from nohz_balance_enter_idle().
+	 */
+	smp_mb();
+
 	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
 		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
 			continue;
@@ -9383,11 +9452,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 		 * work being done for other cpus. Next load
 		 * balancing owner will pick it up.
 		 */
-		if (need_resched())
-			break;
+		if (need_resched()) {
+			has_blocked_load = true;
+			goto abort;
+		}
 
 		rq = cpu_rq(balance_cpu);
 
+		update_blocked_averages(rq->cpu);
+		has_blocked_load |= rq->has_blocked_load;
+
 		/*
 		 * If time for next balance is due,
 		 * do the balance.
@@ -9400,7 +9474,6 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			cpu_load_update_idle(rq);
 			rq_unlock_irq(rq, &rf);
 
-			update_blocked_averages(rq->cpu);
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
 		}
@@ -9415,7 +9488,13 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
-	nohz.next_stats = next_stats;
+	WRITE_ONCE(nohz.next_blocked,
+		now + msecs_to_jiffies(LOAD_AVG_PERIOD));
+
+abort:
+	/* There is still blocked load, enable periodic update */
+	if (has_blocked_load)
+		WRITE_ONCE(nohz.has_blocked, 1);
 
 	/*
 	 * next_balance will be updated only when there is a need.
@@ -10046,6 +10125,7 @@ __init void init_sched_fair_class(void)
 
 #ifdef CONFIG_NO_HZ_COMMON
 	nohz.next_balance = jiffies;
+	nohz.next_blocked = jiffies;
 	zalloc_cpumask_var(&nohz.idle_cpus_mask, GFP_NOWAIT);
 #endif
 #endif /* SMP */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e200045..ad9b929 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -723,6 +723,7 @@ struct rq {
 #ifdef CONFIG_SMP
 	unsigned long last_load_update_tick;
 	unsigned long last_blocked_load_update_tick;
+	unsigned int has_blocked_load;
 #endif /* CONFIG_SMP */
 	unsigned int nohz_tick_stopped;
 	atomic_t nohz_flags;

From patchwork Tue Feb 13 10:31:18 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 128211
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
	valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
	dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH v4 2/3] sched: reduce the periodic update duration
Date: Tue, 13 Feb 2018 11:31:18 +0100
Message-Id: <1518517879-2280-3-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518517879-2280-1-git-send-email-vincent.guittot@linaro.org>
References: <1518517879-2280-1-git-send-email-vincent.guittot@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Instead of using cfs_rq_is_decayed(), which monitors all *_avg and
*_sum, we create cfs_rq_has_blocked(), which only takes care of util_avg
and load_avg. We are only interested in these 2 values, which decay
faster than the *_sum, so we can stop the periodic update earlier.

Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

-- 
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5a6835e..9183fee 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,6 +7337,19 @@ static void attach_tasks(struct lb_env *env)
 	rq_unlock(env->dst_rq, &rf);
 }
 
+static inline bool cfs_rq_has_blocked(struct cfs_rq *cfs_rq)
+{
+	if (cfs_rq->avg.load_avg)
+		return true;
+
+	if (cfs_rq->avg.util_avg)
+		return true;
+
+	return false;
+}
+
+#ifdef CONFIG_FAIR_GROUP_SCHED
+
 static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 {
 	if (cfs_rq->load.weight)
@@ -7354,8 +7367,6 @@ static inline bool cfs_rq_is_decayed(struct cfs_rq *cfs_rq)
 	return true;
 }
 
-#ifdef CONFIG_FAIR_GROUP_SCHED
-
 static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -7391,7 +7402,9 @@ static void update_blocked_averages(int cpu)
 		 */
 		if (cfs_rq_is_decayed(cfs_rq))
 			list_del_leaf_cfs_rq(cfs_rq);
-		else
+
+		/* Don't need periodic decay once load/util_avg are null */
+		if (cfs_rq_has_blocked(cfs_rq))
 			done = false;
 	}
 
@@ -7461,7 +7474,7 @@ static inline void update_blocked_averages(int cpu)
 	update_cfs_rq_load_avg(cfs_rq_clock_task(cfs_rq), cfs_rq);
 #ifdef CONFIG_NO_HZ_COMMON
 	rq->last_blocked_load_update_tick = jiffies;
-	if (cfs_rq_is_decayed(cfs_rq))
+	if (!cfs_rq_has_blocked(cfs_rq))
 		rq->has_blocked_load = 0;
 #endif
 	rq_unlock_irqrestore(rq, &rf);

From patchwork Tue Feb 13 10:31:19 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 128213
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
	valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
	dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH 3/3] sched: update blocked load when newly idle
Date: Tue, 13 Feb 2018 11:31:19 +0100
Message-Id: <1518517879-2280-4-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518517879-2280-1-git-send-email-vincent.guittot@linaro.org>
References: <1518517879-2280-1-git-send-email-vincent.guittot@linaro.org>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

When NEWLY_IDLE load balance is not triggered, we might need to update
the blocked load anyway. We can kick an ilb so that an idle CPU takes
care of updating the blocked load, or we can try to update it locally
before entering idle. In the latter case, we reuse part of
nohz_idle_balance().
Signed-off-by: Vincent Guittot --- kernel/sched/fair.c | 324 +++++++++++++++++++++++++++++++--------------------- 1 file changed, 193 insertions(+), 131 deletions(-) -- 2.7.4 diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9183fee..cb1ab5c 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8832,120 +8832,6 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance) } /* - * idle_balance is called by schedule() if this_cpu is about to become - * idle. Attempts to pull tasks from other CPUs. - */ -static int idle_balance(struct rq *this_rq, struct rq_flags *rf) -{ - unsigned long next_balance = jiffies + HZ; - int this_cpu = this_rq->cpu; - struct sched_domain *sd; - int pulled_task = 0; - u64 curr_cost = 0; - - /* - * We must set idle_stamp _before_ calling idle_balance(), such that we - * measure the duration of idle_balance() as idle time. - */ - this_rq->idle_stamp = rq_clock(this_rq); - - /* - * Do not pull tasks towards !active CPUs... - */ - if (!cpu_active(this_cpu)) - return 0; - - /* - * This is OK, because current is on_cpu, which avoids it being picked - * for load-balance and preemption/IRQs are still disabled avoiding - * further scheduler activity on it and we're being very careful to - * re-start the picking loop. 
- */ - rq_unpin_lock(this_rq, rf); - - if (this_rq->avg_idle < sysctl_sched_migration_cost || - !this_rq->rd->overload) { - rcu_read_lock(); - sd = rcu_dereference_check_sched_domain(this_rq->sd); - if (sd) - update_next_balance(sd, &next_balance); - rcu_read_unlock(); - - goto out; - } - - raw_spin_unlock(&this_rq->lock); - - update_blocked_averages(this_cpu); - rcu_read_lock(); - for_each_domain(this_cpu, sd) { - int continue_balancing = 1; - u64 t0, domain_cost; - - if (!(sd->flags & SD_LOAD_BALANCE)) - continue; - - if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) { - update_next_balance(sd, &next_balance); - break; - } - - if (sd->flags & SD_BALANCE_NEWIDLE) { - t0 = sched_clock_cpu(this_cpu); - - pulled_task = load_balance(this_cpu, this_rq, - sd, CPU_NEWLY_IDLE, - &continue_balancing); - - domain_cost = sched_clock_cpu(this_cpu) - t0; - if (domain_cost > sd->max_newidle_lb_cost) - sd->max_newidle_lb_cost = domain_cost; - - curr_cost += domain_cost; - } - - update_next_balance(sd, &next_balance); - - /* - * Stop searching for tasks to pull if there are - * now runnable tasks on this rq. - */ - if (pulled_task || this_rq->nr_running > 0) - break; - } - rcu_read_unlock(); - - raw_spin_lock(&this_rq->lock); - - if (curr_cost > this_rq->max_idle_balance_cost) - this_rq->max_idle_balance_cost = curr_cost; - - /* - * While browsing the domains, we released the rq lock, a task could - * have been enqueued in the meantime. Since we're not going idle, - * pretend we pulled a task. - */ - if (this_rq->cfs.h_nr_running && !pulled_task) - pulled_task = 1; - -out: - /* Move the next balance forward */ - if (time_after(this_rq->next_balance, next_balance)) - this_rq->next_balance = next_balance; - - /* Is there a task of a high priority class? 
*/ - if (this_rq->nr_running != this_rq->cfs.h_nr_running) - pulled_task = -1; - - if (pulled_task) - this_rq->idle_stamp = 0; - - rq_repin_lock(this_rq, rf); - - return pulled_task; -} - -/* * active_load_balance_cpu_stop is run by cpu stopper. It pushes * running tasks off the busiest CPU onto idle CPUs. It requires at * least 1 task to be running on each physical CPU where possible, and @@ -9413,10 +9299,14 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle) #ifdef CONFIG_NO_HZ_COMMON /* - * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the - * rebalancing for all the cpus for whom scheduler ticks are stopped. + * Internal function that runs load balance for all idle cpus. The load balance + * can be a simple update of blocked load or a complete load balance with + * tasks movement depending of flags. + * The function returns false if the loop has stopped before running + * through all idle CPUs. */ -static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) +static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, + enum cpu_idle_type idle) { /* Earliest time when we have to do rebalance again */ unsigned long now = jiffies; @@ -9424,20 +9314,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) bool has_blocked_load = false; int update_next_balance = 0; int this_cpu = this_rq->cpu; - unsigned int flags; int balance_cpu; + int ret = false; struct rq *rq; - if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK)) - return false; - - if (idle != CPU_IDLE) { - atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu)); - return false; - } - - flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu)); - SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK); /* @@ -9482,10 +9362,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) if (time_after_eq(jiffies, rq->next_balance)) { struct rq_flags rf; - rq_lock_irq(rq, &rf); + 
rq_lock_irqsave(rq, &rf); update_rq_clock(rq); cpu_load_update_idle(rq); - rq_unlock_irq(rq, &rf); + rq_unlock_irqrestore(rq, &rf); if (flags & NOHZ_BALANCE_KICK) rebalance_domains(rq, CPU_IDLE); @@ -9497,13 +9377,21 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) } } - update_blocked_averages(this_cpu); + /* Newly idle CPU doesn't need an update */ + if (idle != CPU_NEWLY_IDLE) { + update_blocked_averages(this_cpu); + has_blocked_load |= this_rq->has_blocked_load; + } + if (flags & NOHZ_BALANCE_KICK) rebalance_domains(this_rq, CPU_IDLE); WRITE_ONCE(nohz.next_blocked, now + msecs_to_jiffies(LOAD_AVG_PERIOD)); + /* The full idle balance loop has been done */ + ret = true; + abort: /* There is still blocked load, enable periodic update */ if (has_blocked_load) @@ -9517,6 +9405,35 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) if (likely(update_next_balance)) nohz.next_balance = next_balance; + return ret; +} + +/* + * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the + * rebalancing for all the cpus for whom scheduler ticks are stopped. + */ +static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) +{ + int this_cpu = this_rq->cpu; + unsigned int flags; + + if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK)) + return false; + + if (idle != CPU_IDLE) { + atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu)); + return false; + } + + /* + * barrier, pairs with nohz_balance_enter_idle(), ensures ... + */ + flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu)); + if (!(flags & NOHZ_KICK_MASK)) + return false; + + _nohz_idle_balance(this_rq, flags, idle); + return true; } #else @@ -9527,6 +9444,151 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle) #endif /* + * idle_balance is called by schedule() if this_cpu is about to become + * idle. Attempts to pull tasks from other CPUs. 
+ */ +static int idle_balance(struct rq *this_rq, struct rq_flags *rf) +{ + unsigned long next_balance = jiffies + HZ; + int this_cpu = this_rq->cpu; + struct sched_domain *sd; + int pulled_task = 0; + u64 curr_cost = 0; + + /* + * We must set idle_stamp _before_ calling idle_balance(), such that we + * measure the duration of idle_balance() as idle time. + */ + this_rq->idle_stamp = rq_clock(this_rq); + + /* + * Do not pull tasks towards !active CPUs... + */ + if (!cpu_active(this_cpu)) + return 0; + + /* + * This is OK, because current is on_cpu, which avoids it being picked + * for load-balance and preemption/IRQs are still disabled avoiding + * further scheduler activity on it and we're being very careful to + * re-start the picking loop. + */ + rq_unpin_lock(this_rq, rf); + + if (this_rq->avg_idle < sysctl_sched_migration_cost || + !this_rq->rd->overload) { +#ifdef CONFIG_NO_HZ_COMMON + unsigned long has_blocked = READ_ONCE(nohz.has_blocked); + unsigned long next_blocked = READ_ONCE(nohz.next_blocked); +#endif + rcu_read_lock(); + sd = rcu_dereference_check_sched_domain(this_rq->sd); + if (sd) + update_next_balance(sd, &next_balance); + rcu_read_unlock(); + +#ifdef CONFIG_NO_HZ_COMMON + /* + * This CPU doesn't want to be disturbed by scheduler + * housekeeping + */ + if (!housekeeping_cpu(this_cpu, HK_FLAG_SCHED)) + goto out; + + /* Will wake up very soon. No time for doing anything else */ + if (this_rq->avg_idle < sysctl_sched_migration_cost) + goto out; + + /* Don't need to update blocked load of idle CPUs */ + if (!has_blocked || time_after_eq(jiffies, next_blocked)) + goto out; + + raw_spin_unlock(&this_rq->lock); + /* + * This CPU is going to be idle and blocked load of idle CPUs + * needs to be updated. Run the ilb locally as it is a good + * candidate for ilb instead of waking up another idle CPU. + * Kick a normal ilb if we failed to do the update. 
+ */ + if (!_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE)) + kick_ilb(NOHZ_STATS_KICK); + raw_spin_lock(&this_rq->lock); +#endif + goto out; + } + + raw_spin_unlock(&this_rq->lock); + + update_blocked_averages(this_cpu); + rcu_read_lock(); + for_each_domain(this_cpu, sd) { + int continue_balancing = 1; + u64 t0, domain_cost; + + if (!(sd->flags & SD_LOAD_BALANCE)) + continue; + + if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) { + update_next_balance(sd, &next_balance); + break; + } + + if (sd->flags & SD_BALANCE_NEWIDLE) { + t0 = sched_clock_cpu(this_cpu); + + pulled_task = load_balance(this_cpu, this_rq, + sd, CPU_NEWLY_IDLE, + &continue_balancing); + + domain_cost = sched_clock_cpu(this_cpu) - t0; + if (domain_cost > sd->max_newidle_lb_cost) + sd->max_newidle_lb_cost = domain_cost; + + curr_cost += domain_cost; + } + + update_next_balance(sd, &next_balance); + + /* + * Stop searching for tasks to pull if there are + * now runnable tasks on this rq. + */ + if (pulled_task || this_rq->nr_running > 0) + break; + } + rcu_read_unlock(); + + raw_spin_lock(&this_rq->lock); + + if (curr_cost > this_rq->max_idle_balance_cost) + this_rq->max_idle_balance_cost = curr_cost; + + /* + * While browsing the domains, we released the rq lock, a task could + * have been enqueued in the meantime. Since we're not going idle, + * pretend we pulled a task. + */ + if (this_rq->cfs.h_nr_running && !pulled_task) + pulled_task = 1; + +out: + /* Move the next balance forward */ + if (time_after(this_rq->next_balance, next_balance)) + this_rq->next_balance = next_balance; + + /* Is there a task of a high priority class? */ + if (this_rq->nr_running != this_rq->cfs.h_nr_running) + pulled_task = -1; + + if (pulled_task) + this_rq->idle_stamp = 0; + + rq_repin_lock(this_rq, rf); + + return pulled_task; +} + +/* * run_rebalance_domains is triggered when needed from the scheduler tick. * Also triggered for nohz idle balancing (with nohz_balancing_kick set). */