From patchwork Wed Sep 21 17:25:38 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dietmar Eggemann X-Patchwork-Id: 76707 Delivered-To: patch@linaro.org Received: by 10.140.106.72 with SMTP id d66csp2168343qgf; Wed, 21 Sep 2016 10:25:50 -0700 (PDT) X-Received: by 10.66.161.195 with SMTP id xu3mr66813154pab.68.1474478750747; Wed, 21 Sep 2016 10:25:50 -0700 (PDT) Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o16si17928889pfj.222.2016.09.21.10.25.50; Wed, 21 Sep 2016 10:25:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758141AbcIURZs (ORCPT + 27 others); Wed, 21 Sep 2016 13:25:48 -0400 Received: from foss.arm.com ([217.140.101.70]:51998 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757999AbcIURZq (ORCPT ); Wed, 21 Sep 2016 13:25:46 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 216E917C; Wed, 21 Sep 2016 10:25:44 -0700 (PDT) Received: from [10.1.210.41] (e107985-lin.cambridge.arm.com [10.1.210.41]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id ACBC83F251; Wed, 21 Sep 2016 10:25:42 -0700 (PDT) Subject: Re: [PATCH 2/7 v3] sched: fix hierarchical order in rq->leaf_cfs_rq_list To: Vincent Guittot References: <1473666472-13749-1-git-send-email-vincent.guittot@linaro.org> <1473666472-13749-3-git-send-email-vincent.guittot@linaro.org> Cc: Peter Zijlstra , Ingo Molnar , linux-kernel , Yuyang Du , Morten Rasmussen , Linaro Kernel Mailman List , Paul Turner , Benjamin Segall From: Dietmar Eggemann Message-ID: Date: Wed, 21 Sep 2016 18:25:38 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0 MIME-Version: 1.0 In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 21/09/16 13:34, Vincent Guittot wrote: > Hi Dietmar, > > On 21 September 2016 at 12:14, Dietmar Eggemann > wrote: >> Hi Vincent, >> >> On 12/09/16 08:47, Vincent Guittot wrote: [...] >> I guess in the meantime we lost the functionality to remove a cfs_rq from the >> leaf_cfs_rq list once there are no se's enqueued on it anymore. If e.g. t migrates >> away from Cpu1, all the cfs_rq's of the task hierarchy (for tg_css_id=2,4,6) >> owned by Cpu1 stay in the leaf_cfs_rq list. >> >> Shouldn't we reintegrate it? Following patch goes on top of this patch: > > I see one problem: Once a cfs_rq has been removed , it will not be > periodically updated anymore until the next enqueue. A sleeping task > is only attached but not enqueued when it moves into a cfs_rq so its > load is added to cfs_rq's load which have to be periodically > updated/decayed. So adding a cfs_rq to the list only during an enqueue > is not enough in this case. There was more in the original patch (commit 82958366cfea), the removal of the cfs_rq from the list was only done in case the se->avg.runnable_avg_sum had decayed to 0. Couldn't we use something similar with se->avg.load_avg instead to make sure that these blocked contributions have been decayed before we do the removal? > Then, only the highest child level of task group will be removed > because cfs_rq->nr_running will be > 0 as soon as a child task group > is created and enqueued into a task group. Or you should use > cfs.h_nr_running instead i don't know all implications In my tests all cfs_rq's (tg_id=6,4,2) on the source CPU (CPU1) get removed? Do I miss something? migration/1-16 [001] 67.235292: sched_migrate_task: comm=task0 pid=2210 prio=120 orig_cpu=1 dest_cpu=2 migration/1-16 [001] 67.235295: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff80097550d700 tg_id=6 on_list=0 migration/1-16 [001] 67.235298: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff800975903700 tg_id=4 on_list=0 migration/1-16 [001] 67.235300: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff800974e0fe00 tg_id=2 on_list=0 migration/1-16 [001] 67.235302: bprint: enqueue_task_fair: list_add_leaf_cfs_rq: cpu=2 cfs_rq=0xffff80097fecb6e0 tg_id=1 on_list=1 migration/1-16 [001] 67.235309: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff80097550d800 tg_id=6 on_list=1 -> migration/1-16 [001] 67.235312: bprint: update_blocked_averages: list_del_leaf_cfs_rq: cpu=1 cfs_rq=0xffff80097550d800 tg_id=6 on_list=1 migration/1-16 [001] 67.235314: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff800975903600 tg_id=4 on_list=1 -> migration/1-16 [001] 67.235315: bprint: update_blocked_averages: list_del_leaf_cfs_rq: cpu=1 cfs_rq=0xffff800975903600 tg_id=4 on_list=1 migration/1-16 [001] 67.235316: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff800974e0f600 tg_id=2 on_list=1 -> migration/1-16 [001] 67.235318: bprint: update_blocked_averages: list_del_leaf_cfs_rq: cpu=1 cfs_rq=0xffff800974e0f600 tg_id=2 on_list=1 migration/1-16 [001] 67.235319: bprint: update_blocked_averages: update_blocked_averages: cpu=1 cfs_rq=0xffff80097feb56e0 tg_id=1 on_list=1 If we don't remove these cfs_rq's, IMHO they stay in the list as long as the tg exists. [...] diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 4ac1718560e2..3595b0623b37 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -6566,6 +6566,8 @@ static void update_blocked_averages(int cpu) * list_add_leaf_cfs_rq() for details. */ for_each_leaf_cfs_rq(rq, cfs_rq) { + struct sched_entity *se = cfs_rq->tg->se[cpu]; + /* throttled entities do not contribute to load */ if (throttled_hierarchy(cfs_rq)) continue; @@ -6574,8 +6576,12 @@ static void update_blocked_averages(int cpu) update_tg_load_avg(cfs_rq, 0); /* Propagate pending load changes to the parent */ - if (cfs_rq->tg->se[cpu]) - update_load_avg(cfs_rq->tg->se[cpu], 0, 0); + if (se) { + update_load_avg(se, 0, 0); + + if (!se->avg.load_avg && !cfs_rq->nr_running) + list_del_leaf_cfs_rq(cfs_rq); + } } raw_spin_unlock_irqrestore(&rq->lock, flags); }