From patchwork Tue Dec 24 05:13:30 2019
X-Patchwork-Submitter: Viresh Kumar
X-Patchwork-Id: 182430
From: Viresh Kumar <viresh.kumar@linaro.org>
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman
Cc: Viresh Kumar, linux-kernel@vger.kernel.org
Subject: [PATCH] sched/fair: Load balance aggressively for SCHED_IDLE CPUs
Date: Tue, 24 Dec 2019 10:43:30 +0530
Message-Id: <885b1be9af68d124f44a863f54e337f8eb6c4917.1577090998.git.viresh.kumar@linaro.org>
X-Mailer: git-send-email 2.21.0.rc0.269.g1a574e7a288b
X-Mailing-List: linux-kernel@vger.kernel.org

The fair scheduler performs periodic load balance on every CPU to check
if it can pull some tasks from other busy CPUs.
The duration of this periodic load balance is set to sd->balance_interval
for idle CPUs and is calculated by multiplying sd->balance_interval with
sd->busy_factor (set to 32 by default) for busy CPUs. The multiplication
is done for busy CPUs to avoid load balancing too often and to instead
spend more time executing actual tasks. While that is the right thing to
do for CPUs busy with SCHED_OTHER or SCHED_BATCH tasks, it may not be
optimal for CPUs running only SCHED_IDLE tasks.

With the recent enhancements in the fair scheduler around SCHED_IDLE
CPUs, we now prefer to enqueue a newly-woken task on a SCHED_IDLE CPU
instead of other busy or idle CPUs. The same reasoning should be applied
to the load balancer as well, to make it migrate tasks more aggressively
to a SCHED_IDLE CPU, as that will reduce the scheduling latency of the
migrated (SCHED_OTHER) tasks.

This patch makes minimal changes to the fair scheduler to perform the
next load balance soon after the last non-SCHED_IDLE task is dequeued
from a runqueue, i.e. as soon as the CPU becomes SCHED_IDLE. Also,
sd->busy_factor is ignored while calculating the balance_interval for
such CPUs, to avoid delaying the periodic load balance by a few hundred
milliseconds.

This was tested on an ARM64 HiKey620 platform (octa-core) with the help
of rt-app, and it was verified, using kernel traces, that the newly
SCHED_IDLE CPU performs load balancing shortly after becoming SCHED_IDLE
and pulls tasks from other busy CPUs.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 kernel/sched/fair.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

-- 
2.21.0.rc0.269.g1a574e7a288b
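
Note (illustration only, not part of the patch): the standalone userspace
sketch below shows the interval arithmetic described above, i.e. how the
effective rebalance interval changes once a CPU running only SCHED_IDLE
tasks is no longer treated as busy. The 8 ms base interval and the
cpu_busy()/effective_interval_ms() helper names are made-up example
values; in the kernel the real computation is done by
get_sd_balance_interval() and the busy decision in rebalance_domains()
in the diff below.

#include <stdbool.h>
#include <stdio.h>

#define BUSY_FACTOR 32  /* default sd->busy_factor */

/* Mirrors the "busy" decision made in rebalance_domains() below. */
static bool cpu_busy(bool cpu_idle, bool only_sched_idle_tasks)
{
        /* After this patch, a CPU with only SCHED_IDLE tasks is not busy. */
        return !cpu_idle && !only_sched_idle_tasks;
}

static unsigned int effective_interval_ms(unsigned int balance_interval_ms,
                                          bool busy)
{
        /* Busy CPUs balance less often: the interval is scaled by busy_factor. */
        return busy ? balance_interval_ms * BUSY_FACTOR : balance_interval_ms;
}

int main(void)
{
        unsigned int base = 8; /* hypothetical sd->balance_interval, in ms */

        printf("busy CPU:       %u ms\n", effective_interval_ms(base, cpu_busy(false, false)));
        printf("SCHED_IDLE CPU: %u ms\n", effective_interval_ms(base, cpu_busy(false, true)));
        printf("idle CPU:       %u ms\n", effective_interval_ms(base, cpu_busy(true, false)));
        return 0;
}

With these example values a busy CPU is rebalanced every 256 ms, while a
CPU left with only SCHED_IDLE tasks falls back to the 8 ms base interval
(and, with this patch, additionally triggers an immediate rebalance by
setting rq->next_balance = jiffies at dequeue time).
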
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 83500b5b93dc..645eb248a2d0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5210,6 +5210,18 @@ static inline void update_overutilized_status(struct rq *rq)
 static inline void update_overutilized_status(struct rq *rq) { }
 #endif
 
+/* Runqueue only has SCHED_IDLE tasks enqueued */
+static int sched_idle_rq(struct rq *rq)
+{
+        return unlikely(rq->nr_running == rq->cfs.idle_h_nr_running &&
+                        rq->nr_running);
+}
+
+static int sched_idle_cpu(int cpu)
+{
+        return sched_idle_rq(cpu_rq(cpu));
+}
+
 /*
  * The enqueue_task method is called before nr_running is
  * increased. Here we update the fair scheduling stats and
@@ -5324,6 +5336,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
         struct sched_entity *se = &p->se;
         int task_sleep = flags & DEQUEUE_SLEEP;
         int idle_h_nr_running = task_has_idle_policy(p);
+        bool was_sched_idle = sched_idle_rq(rq);
 
         for_each_sched_entity(se) {
                 cfs_rq = cfs_rq_of(se);
@@ -5370,6 +5383,10 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
         if (!se)
                 sub_nr_running(rq, 1);
 
+        /* balance early to pull high priority tasks */
+        if (unlikely(!was_sched_idle && sched_idle_rq(rq)))
+                rq->next_balance = jiffies;
+
         util_est_dequeue(&rq->cfs, p, task_sleep);
         hrtick_update(rq);
 }
@@ -5392,15 +5409,6 @@ static struct {
 
 #endif /* CONFIG_NO_HZ_COMMON */
 
-/* CPU only has SCHED_IDLE tasks enqueued */
-static int sched_idle_cpu(int cpu)
-{
-        struct rq *rq = cpu_rq(cpu);
-
-        return unlikely(rq->nr_running == rq->cfs.idle_h_nr_running &&
-                        rq->nr_running);
-}
-
 static unsigned long cpu_load(struct rq *rq)
 {
         return cfs_rq_load_avg(&rq->cfs);
@@ -9531,6 +9539,7 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
 {
         int continue_balancing = 1;
         int cpu = rq->cpu;
+        int busy = idle != CPU_IDLE && !sched_idle_cpu(cpu);
         unsigned long interval;
         struct sched_domain *sd;
         /* Earliest time when we have to do rebalance again */
@@ -9567,7 +9576,7 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
                         break;
                 }
 
-                interval = get_sd_balance_interval(sd, idle != CPU_IDLE);
+                interval = get_sd_balance_interval(sd, busy);
 
                 need_serialize = sd->flags & SD_SERIALIZE;
                 if (need_serialize) {
@@ -9582,10 +9591,16 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
                                  * env->dst_cpu, so we can't know our idle
                                  * state even if we migrated tasks. Update it.
                                  */
-                                idle = idle_cpu(cpu) ? CPU_IDLE : CPU_NOT_IDLE;
+                                if (idle_cpu(cpu)) {
+                                        idle = CPU_IDLE;
+                                        busy = 0;
+                                } else {
+                                        idle = CPU_NOT_IDLE;
+                                        busy = !sched_idle_cpu(cpu);
+                                }
                         }
                         sd->last_balance = jiffies;
-                        interval = get_sd_balance_interval(sd, idle != CPU_IDLE);
+                        interval = get_sd_balance_interval(sd, busy);
                 }
                 if (need_serialize)
                         spin_unlock(&balancing);