From patchwork Mon Feb 12 08:07:54 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 127946
From: Vincent Guittot
To: peterz@infradead.org, mingo@kernel.org, linux-kernel@vger.kernel.org,
 valentin.schneider@arm.com
Cc: morten.rasmussen@foss.arm.com, brendan.jackman@arm.com,
 dietmar.eggemann@arm.com, Vincent Guittot
Subject: [PATCH v3 3/3] sched: update blocked load when newly idle
Date: Mon, 12 Feb 2018 09:07:54 +0100
Message-Id: <1518422874-13216-4-git-send-email-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
References: <1518422874-13216-1-git-send-email-vincent.guittot@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

When NEWLY_IDLE load balance is not triggered, we might need to update the
blocked load anyway. We can kick an ilb so that an idle CPU takes care of
updating the blocked load, or we can try to update it locally before
entering idle. In the latter case, we reuse part of nohz_idle_balance().

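For illustration only, the choice between the local update and the ilb kick
described above can be modelled in userspace C roughly as follows. This is a
sketch under stated assumptions, not the kernel code: struct blocked_state,
enum blocked_action and every model_* name are hypothetical, times are plain
integers, and the per-CPU update costs are passed in as an array instead of
being measured with sched_clock_cpu().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the nohz.has_blocked / nohz.next_blocked pair. */
struct blocked_state {
	bool has_blocked;		/* some idle CPU still has blocked load */
	unsigned long next_blocked;	/* earliest tick for the next update */
};

/* Outcome of the newly idle path in this model. */
enum blocked_action {
	BLOCKED_NONE,		/* nothing to update yet */
	BLOCKED_LOCAL,		/* the local update walked every idle CPU */
	BLOCKED_KICK_ILB,	/* hand the update over to an idle CPU */
};

/* Wraparound-safe "a >= b", in the spirit of time_after_eq(). */
static bool model_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Walk the idle CPUs and accumulate the (assumed) cost of each update.
 * Give up as soon as the accumulated cost exceeds the average idle time,
 * so that a wake up on this CPU is not delayed.
 * Returns true only if the whole walk completed.
 */
static bool model_local_update(const uint64_t *cost_ns, int ncpus,
			       uint64_t avg_idle_ns)
{
	uint64_t curr_cost = 0;
	int i;

	for (i = 0; i < ncpus; i++) {
		if (avg_idle_ns < curr_cost)
			return false;
		curr_cost += cost_ns[i];
	}
	return true;
}

/* Decide what a CPU that skips the newly idle balance should do. */
static enum blocked_action
model_newly_idle(const struct blocked_state *st, unsigned long now,
		 uint64_t avg_idle_ns, uint64_t migration_cost_ns,
		 const uint64_t *cost_ns, int ncpus)
{
	if (!st->has_blocked || !model_time_after_eq(now, st->next_blocked))
		return BLOCKED_NONE;

	/* Expected idle time too short to even try locally. */
	if (avg_idle_ns < migration_cost_ns)
		return BLOCKED_KICK_ILB;

	/* Try locally; fall back to a kick if the walk was aborted. */
	if (!model_local_update(cost_ns, ncpus, avg_idle_ns))
		return BLOCKED_KICK_ILB;

	return BLOCKED_LOCAL;
}
```

In this model a kick is requested in two cases: when the expected idle time
is below the migration cost, and when the local walk is aborted because its
accumulated cost exceeds the average idle time.
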
Signed-off-by: Vincent Guittot
---
 kernel/sched/fair.c | 102 ++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 84 insertions(+), 18 deletions(-)

--
2.7.4

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7566424..3c11ef6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8829,6 +8829,9 @@ update_next_balance(struct sched_domain *sd, unsigned long *next_balance)
 		*next_balance = next;
 }
 
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle);
+static void kick_ilb(unsigned int flags);
+
 /*
  * idle_balance is called by schedule() if this_cpu is about to become
  * idle. Attempts to pull tasks from other CPUs.
@@ -8863,12 +8866,26 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
 
 	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
 	    !this_rq->rd->overload) {
+		unsigned long has_blocked = READ_ONCE(nohz.has_blocked);
+		unsigned long next_blocked = READ_ONCE(nohz.next_blocked);
+
 		rcu_read_lock();
 		sd = rcu_dereference_check_sched_domain(this_rq->sd);
 		if (sd)
 			update_next_balance(sd, &next_balance);
 		rcu_read_unlock();
 
+		/*
+		 * Update blocked idle load if it has not been done for a
+		 * while. Try to do it locally before entering idle but kick an
+		 * ilb if it takes too much time and/or might delay next local
+		 * wake up
+		 */
+		if (has_blocked && time_after_eq(jiffies, next_blocked) &&
+		    (this_rq->avg_idle < sysctl_sched_migration_cost ||
+		    !_nohz_idle_balance(this_rq, NOHZ_STATS_KICK, CPU_NEWLY_IDLE)))
+			kick_ilb(NOHZ_STATS_KICK);
+
 		goto out;
 	}
 
@@ -9411,30 +9428,24 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
 
 #ifdef CONFIG_NO_HZ_COMMON
 /*
- * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
- * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ * Internal function that runs load balance for all idle cpus. The load balance
+ * can be a simple update of blocked load or a complete load balance with
+ * tasks movement depending on flags.
+ * For newly idle mode, we abort the loop if it takes too much time and return
+ * false to signal that the loop has not completed and an ilb should be kicked.
  */
-static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+static bool _nohz_idle_balance(struct rq *this_rq, unsigned int flags, enum cpu_idle_type idle)
 {
 	/* Earliest time when we have to do rebalance again */
 	unsigned long now = jiffies;
 	unsigned long next_balance = now + 60*HZ;
-	unsigned long next_stats = now + msecs_to_jiffies(LOAD_AVG_PERIOD);
+	bool has_blocked_load = false;
 	int update_next_balance = 0;
 	int this_cpu = this_rq->cpu;
-	unsigned int flags;
 	int balance_cpu;
+	int ret = false;
 	struct rq *rq;
-
-	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
-		return false;
-
-	if (idle != CPU_IDLE) {
-		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
-		return false;
-	}
-
-	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	u64 curr_cost = 0;
 
 	SCHED_WARN_ON((flags & NOHZ_KICK_MASK) == NOHZ_BALANCE_KICK);
 
@@ -9455,6 +9466,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	smp_mb();
 
 	for_each_cpu(balance_cpu, nohz.idle_cpus_mask) {
+		u64 t0, domain_cost;
+
+		t0 = sched_clock_cpu(this_cpu);
+
 		if (balance_cpu == this_cpu || !idle_cpu(balance_cpu))
 			continue;
 
@@ -9468,6 +9483,16 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			goto abort;
 		}
 
+		/*
+		 * If the update is done while CPU becomes idle, we abort
+		 * the update when its cost is higher than the average idle
+		 * time in order to not delay a possible wake up.
+		 */
+		if (idle == CPU_NEWLY_IDLE && this_rq->avg_idle < curr_cost) {
+			has_blocked_load = true;
+			goto abort;
+		}
+
 		rq = cpu_rq(balance_cpu);
 
 		update_blocked_averages(rq->cpu);
@@ -9480,10 +9505,10 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 		if (time_after_eq(jiffies, rq->next_balance)) {
 			struct rq_flags rf;
 
-			rq_lock_irq(rq, &rf);
+			rq_lock_irqsave(rq, &rf);
 			update_rq_clock(rq);
 			cpu_load_update_idle(rq);
-			rq_unlock_irq(rq, &rf);
+			rq_unlock_irqrestore(rq, &rf);
 
 			if (flags & NOHZ_BALANCE_KICK)
 				rebalance_domains(rq, CPU_IDLE);
@@ -9493,15 +9518,27 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 			next_balance = rq->next_balance;
 			update_next_balance = 1;
 		}
+
+		domain_cost = sched_clock_cpu(this_cpu) - t0;
+		curr_cost += domain_cost;
+
+	}
+
+	/* Newly idle CPU doesn't need an update */
+	if (idle != CPU_NEWLY_IDLE) {
+		update_blocked_averages(this_cpu);
+		has_blocked_load |= this_rq->has_blocked_load;
 	}
 
-	update_blocked_averages(this_cpu);
 	if (flags & NOHZ_BALANCE_KICK)
 		rebalance_domains(this_rq, CPU_IDLE);
 
 	WRITE_ONCE(nohz.next_blocked,
 		now + msecs_to_jiffies(LOAD_AVG_PERIOD));
 
+	/* The full idle balance loop has been done */
+	ret = true;
+
 abort:
 	/* There is still blocked load, enable periodic update */
 	if (has_blocked_load)
@@ -9515,6 +9552,35 @@ static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
 	if (likely(update_next_balance))
 		nohz.next_balance = next_balance;
 
+	return ret;
+}
+
+/*
+ * In CONFIG_NO_HZ_COMMON case, the idle balance kickee will do the
+ * rebalancing for all the cpus for whom scheduler ticks are stopped.
+ */
+static bool nohz_idle_balance(struct rq *this_rq, enum cpu_idle_type idle)
+{
+	int this_cpu = this_rq->cpu;
+	unsigned int flags;
+
+	if (!(atomic_read(nohz_flags(this_cpu)) & NOHZ_KICK_MASK))
+		return false;
+
+	if (idle != CPU_IDLE) {
+		atomic_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+		return false;
+	}
+
+	/*
+	 * barrier, pairs with nohz_balance_enter_idle(), ensures ...
+	 */
+	flags = atomic_fetch_andnot(NOHZ_KICK_MASK, nohz_flags(this_cpu));
+	if (!(flags & NOHZ_KICK_MASK))
+		return false;
+
+	_nohz_idle_balance(this_rq, flags, idle);
+
 	return true;
 }
 #else