From patchwork Mon Apr 27 13:22:51 2015
X-Patchwork-Submitter: Viresh Kumar
X-Patchwork-Id: 47614
From: Viresh Kumar <viresh.kumar@linaro.org>
To: Thomas Gleixner
Cc: linaro-kernel@lists.linaro.org, linux-kernel@vger.kernel.org,
    Peter Zijlstra, Ingo Molnar, Viresh Kumar
Subject: [PATCH V2] timer: Migrate running timer to avoid waking up an idle-core
Date: Mon, 27 Apr 2015 18:52:51 +0530
Message-Id: <78143cce1e94d35e6fe4e36b5e44cd2d4433c4ee.1430140376.git.viresh.kumar@linaro.org>
X-Mailer: git-send-email 2.3.0.rc0.44.ga94655d
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, when adding new timers, we try to migrate them to a non-idle core
if the local core is idle. However, we don't do this for timers that are
re-armed from their own handlers. Such a running timer can wake up an idle
core even though it could have been serviced by any non-idle core, hurting
power savings by preventing cores from staying in deep idle states.

Migrating a running timer is race-prone in two ways:

1. Serialization of the timer with itself: if migrated, the timer may fire on
   the new base before its handler has finished executing on the old base.

2. Deletion of the timer with del_timer_sync(): del_timer_sync() deletes a
   timer only if its handler isn't currently running, which it checks by
   comparing the timer against the running_timer of its base. If we migrate
   the timer to a new base, del_timer_sync() might delete it while its
   handler is still running on the old base, because it checks against the
   running_timer of the new base and doesn't find the timer running there.

We can fix both problems by deferring the re-queueing of the timer until its
handler has finished: move such timers to a special migration list, which is
processed only after all expired timers have been serviced. To make it cheap
to find a suitable new base for the timers on the migration list, have
__mod_timer() record the preferred target base in preferred_target.

Suggested-by: Thomas Gleixner
Signed-off-by: Viresh Kumar
---
V1->V2:
- Completely new approach, and the correct one.
- Re-queue the timer only after its handler has finished.

Tested on an Exynos (dual ARM Cortex-A9) board running Ubuntu. The system was
fairly idle and 'dmesg > /dev/null' was run in an infinite loop on one of the
CPUs to keep it out of idle. Migration was observed to happen multiple times,
and sometimes, while processing the migration list, the local CPU was found to
be no longer idle; in those cases the timer ends up being added back on the
local CPU.
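To illustrate the ordering independent of the timer-wheel details, here is a
minimal, self-contained C model of the deferred re-queue flow. This is a
sketch under simplifying assumptions, not the kernel code: the singly linked
lists, the model_* names, and the idle flag are invented for illustration,
and locking plus the real expiry bookkeeping are left out.

#include <stdbool.h>

struct model_timer {
        struct model_timer *next;
        unsigned long expires;
        void (*fn)(struct model_timer *);
};

struct model_base {
        struct model_timer *running_timer;      /* handler currently executing */
        struct model_timer *pending;            /* timers queued on this base */
        struct model_timer *migration_list;     /* re-armed while running */
        struct model_base *preferred_target;    /* target picked at mod time */
        bool idle;
};

void model_enqueue(struct model_timer **list, struct model_timer *t)
{
        t->next = *list;
        *list = t;
}

/* mod_timer(): a timer whose own handler is running is only parked. */
void model_mod_timer(struct model_base *base, struct model_base *target,
                     struct model_timer *timer, unsigned long expires)
{
        timer->expires = expires;
        if (base->running_timer == timer) {
                /* Can't migrate or re-queue it yet; remember the target. */
                base->preferred_target = target;
                model_enqueue(&base->migration_list, timer);
                return;
        }
        model_enqueue(&target->pending, timer);
}

/* __run_timers(): fire expired timers, then flush the parked ones. */
void model_run_timers(struct model_base *base)
{
        while (base->pending) {
                struct model_timer *timer = base->pending;

                base->pending = timer->next;
                base->running_timer = timer;
                timer->fn(timer);               /* may call model_mod_timer() */
                base->running_timer = NULL;
        }

        /* Handlers have finished, so re-queueing/migrating is now safe. */
        while (base->migration_list) {
                struct model_timer *timer = base->migration_list;
                struct model_base *target =
                        base->idle ? base->preferred_target : base;

                base->migration_list = timer->next;
                model_enqueue(&target->pending, timer);
        }
}

The patch below implements the same ordering inside the timer wheel:
__mod_timer() parks a running timer on base->migration_list and records
base->preferred_target, and __run_timers() drains that list under the proper
base locks once the expired timers have been handled, re-evaluating the
target with idle_cpu()/get_nohz_timer_target().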
 kernel/time/timer.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 2ece3aa5069c..de5aa69ab958 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -78,6 +78,8 @@ struct tvec_root {
 struct tvec_base {
         spinlock_t lock;
         struct timer_list *running_timer;
+        struct list_head migration_list;
+        struct tvec_base *preferred_target;
         unsigned long timer_jiffies;
         unsigned long next_timer;
         unsigned long active_timers;
@@ -795,10 +797,18 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
          * We are trying to schedule the timer on the local CPU.
          * However we can't change timer's base while it is running,
          * otherwise del_timer_sync() can't detect that the timer's
-         * handler yet has not finished. This also guarantees that
-         * the timer is serialized wrt itself.
+         * handler yet has not finished.
+         *
+         * Move timer to migration_list which can be processed after all
+         * expired timers are serviced. This also guarantees that the
+         * timer is serialized wrt itself.
          */
-        if (likely(base->running_timer != timer)) {
+        if (unlikely(base->running_timer == timer)) {
+                timer->expires = expires;
+                base->preferred_target = new_base;
+                list_add_tail(&timer->entry, &base->migration_list);
+                goto out_unlock;
+        } else {
                 /* See the comment in lock_timer_base() */
                 timer_set_base(timer, NULL);
                 spin_unlock(&base->lock);
@@ -1228,6 +1238,41 @@ static inline void __run_timers(struct tvec_base *base)
                 }
         }
         base->running_timer = NULL;
+
+        /*
+         * Process timers from migration list, as their handlers have finished
+         * now.
+         */
+        if (unlikely(!list_empty(&base->migration_list))) {
+                struct tvec_base *new_base = base->preferred_target;
+
+                if (!idle_cpu(base->cpu)) {
+                        /* Local CPU isn't idle anymore */
+                        new_base = base;
+                } else if (idle_cpu(new_base->cpu)) {
+                        /* Re-evaluate base, target CPU has gone idle */
+                        new_base = per_cpu(tvec_bases, get_nohz_timer_target(false));
+                }
+
+                do {
+                        timer = list_first_entry(&base->migration_list,
+                                                 struct timer_list, entry);
+
+                        __list_del(timer->entry.prev, timer->entry.next);
+
+                        /* See the comment in lock_timer_base() */
+                        timer_set_base(timer, NULL);
+                        spin_unlock(&base->lock);
+
+                        spin_lock(&new_base->lock);
+                        timer_set_base(timer, new_base);
+                        internal_add_timer(new_base, timer);
+                        spin_unlock(&new_base->lock);
+
+                        spin_lock(&base->lock);
+                } while (!list_empty(&base->migration_list));
+        }
+
         spin_unlock_irq(&base->lock);
 }
 
@@ -1635,6 +1680,7 @@ static void __init init_timer_cpu(struct tvec_base *base, int cpu)
         for (j = 0; j < TVR_SIZE; j++)
                 INIT_LIST_HEAD(base->tv1.vec + j);
 
+        INIT_LIST_HEAD(&base->migration_list);
         base->timer_jiffies = jiffies;
         base->next_timer = base->timer_jiffies;
 }