From patchwork Fri Apr 17 08:12:54 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Viresh Kumar X-Patchwork-Id: 47258 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-wi0-f199.google.com (mail-wi0-f199.google.com [209.85.212.199]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id E9B712121F for ; Fri, 17 Apr 2015 08:13:21 +0000 (UTC) Received: by wiun10 with SMTP id n10sf2421955wiu.1 for ; Fri, 17 Apr 2015 01:13:21 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:message-id:date:from:user-agent :mime-version:to:cc:subject:references:in-reply-to:content-type :content-transfer-encoding:sender:precedence:list-id :x-original-sender:x-original-authentication-results:mailing-list :list-post:list-help:list-archive:list-unsubscribe; bh=PzOrGaRH8Ofa1XZqPRt2QOAfXgdafl7J6V4tcUhvQW4=; b=FubtZQqpMXQ9cQX8BBPxcisFUNPh9jh1eB3niV5AN1j2I15Z7JxDbOlC5KbjziwWsc FoUv49LrEKLYdbBPIxGaGNtR6Qywdx4V7/dg0Dodxr3L3Zr1CXWUWGNEzH+DqlYwcIzW kVK6O+0QkBOnMTBihRY9xUDF7Sv+ab30/lZTb2rRrQ+VvG+XuoW4agmVKYb/f7sGBjRc 4BmJHzeWdCesqAAMLtStgUK64bFvDgopJ00ksY8bVr47CNbGJHKyq1za05KX0BRs5bMw AINrryj0syR85pb2eH/B0dMlRMJhdeqjPZMqPNB3kueA25Abw9kueXh7xUqp7oNuxYvH y6QA== X-Gm-Message-State: ALoCoQmwFK6p2yaQemOHV4FydaSmKb+PliWG4XFkVmYD/UPyXWQAC3XvCjSPwCgSlYwYFhIePSOQ X-Received: by 10.152.116.115 with SMTP id jv19mr817907lab.9.1429258401203; Fri, 17 Apr 2015 01:13:21 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.2.70 with SMTP id 6ls405516las.14.gmail; Fri, 17 Apr 2015 01:13:21 -0700 (PDT) X-Received: by 10.152.26.34 with SMTP id i2mr1476233lag.117.1429258400990; Fri, 17 Apr 2015 01:13:20 -0700 (PDT) Received: from mail-la0-f53.google.com (mail-la0-f53.google.com. [209.85.215.53]) by mx.google.com with ESMTPS id re5si8208876lbb.150.2015.04.17.01.13.20 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 17 Apr 2015 01:13:20 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.53 as permitted sender) client-ip=209.85.215.53; Received: by labbd9 with SMTP id bd9so74512256lab.2 for ; Fri, 17 Apr 2015 01:13:20 -0700 (PDT) X-Received: by 10.112.163.168 with SMTP id yj8mr1538816lbb.36.1429258400883; Fri, 17 Apr 2015 01:13:20 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.67.65 with SMTP id l1csp3585967lbt; Fri, 17 Apr 2015 01:13:19 -0700 (PDT) X-Received: by 10.68.241.9 with SMTP id we9mr3303594pbc.59.1429258398953; Fri, 17 Apr 2015 01:13:18 -0700 (PDT) Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id gz6si2591pbc.252.2015.04.17.01.13.17; Fri, 17 Apr 2015 01:13:18 -0700 (PDT) Received-SPF: none (google.com: linux-kernel-owner@vger.kernel.org does not designate permitted sender hosts) client-ip=209.132.180.67; Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932143AbbDQINM (ORCPT + 27 others); Fri, 17 Apr 2015 04:13:12 -0400 Received: from mail-pd0-f180.google.com ([209.85.192.180]:32971 "EHLO mail-pd0-f180.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753182AbbDQINI (ORCPT ); Fri, 17 Apr 2015 04:13:08 -0400 Received: by pdbnk13 with SMTP id nk13so120800234pdb.0 for ; Fri, 17 Apr 2015 01:13:07 -0700 (PDT) X-Received: by 10.68.115.43 with SMTP id jl11mr3337441pbb.139.1429258387565; Fri, 17 Apr 2015 01:13:07 -0700 (PDT) Received: from [192.168.0.26] ([122.167.118.120]) by mx.google.com with ESMTPSA id m8sm9329681pdn.5.2015.04.17.01.13.04 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 17 Apr 2015 01:13:06 -0700 (PDT) Message-ID: <5530C086.2020700@linaro.org> Date: Fri, 17 Apr 2015 13:42:54 +0530 From: viresh kumar User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 MIME-Version: 1.0 To: Thomas Gleixner CC: Ingo Molnar , Peter Zijlstra , linaro-kernel@lists.linaro.org, linux-kernel@vger.kernel.org, Steven Miao , shashim@codeaurora.org Subject: Re: [PATCH 1/2] timer: Avoid waking up an idle-core by migrate running timer References: <80182e47a7103608d2ddab7f62c0c3dffc99fdcc.1427782893.git.viresh.kumar@linaro.org> In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org Precedence: list List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: viresh.kumar@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.53 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 List-Post: , List-Help: , List-Archive: List-Unsubscribe: , Hi Thomas, On Wednesday 15 April 2015 04:43 AM, Thomas Gleixner wrote: > No new flags in the timer base, no concurrent expiry, no extra > conditionals in the expiry path. Just a single extra check at the end > of the softirq handler for this rare condition instead of imposing all > that nonsense to the hotpath. Here is what I could get to based on your suggestions: --- viresh -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ diff --git a/kernel/time/timer.c b/kernel/time/timer.c index 2d3f5c504939..c412511d34b8 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -78,6 +78,8 @@ struct tvec_root { struct tvec_base { spinlock_t lock; struct timer_list *running_timer; + struct list_head migration_list; + struct tvec_base *preferred_target; unsigned long timer_jiffies; unsigned long next_timer; unsigned long active_timers; @@ -785,10 +787,18 @@ __mod_timer(struct timer_list *timer, unsigned long expires, * We are trying to schedule the timer on the local CPU. * However we can't change timer's base while it is running, * otherwise del_timer_sync() can't detect that the timer's - * handler yet has not finished. This also guarantees that - * the timer is serialized wrt itself. + * handler yet has not finished. + * + * Move timer to migration_list which can be processed after all + * timers in current cycle are serviced. This also guarantees + * that the timer is serialized wrt itself. */ - if (likely(base->running_timer != timer)) { + if (unlikely(base->running_timer == timer)) { + timer->expires = expires; + base->preferred_target = new_base; + list_add_tail(&timer->entry, &base->migration_list); + goto out_unlock; + } else { /* See the comment in lock_timer_base() */ timer_set_base(timer, NULL); spin_unlock(&base->lock); @@ -1214,6 +1224,33 @@ static inline void __run_timers(struct tvec_base *base) spin_lock_irq(&base->lock); } } + + /* + * Timer handler re-armed timer and we didn't wanted to add that + * on a idle local CPU. Its handler has finished now, lets + * enqueue it again. + */ + if (unlikely(!list_empty(&base->migration_list))) { + struct tvec_base *new_base = base->preferred_target; + + do { + timer = list_first_entry(&base->migration_list, + struct timer_list, entry); + + __list_del(timer->entry.prev, timer->entry.next); + + /* See the comment in lock_timer_base() */ + timer_set_base(timer, NULL); + spin_unlock(&base->lock); + + spin_lock(&new_base->lock); + timer_set_base(timer, new_base); + internal_add_timer(new_base, timer); + spin_unlock(&new_base->lock); + + spin_lock(&base->lock); + } while (!list_empty(&base->migration_list)); + } } base->running_timer = NULL; spin_unlock_irq(&base->lock); @@ -1583,6 +1620,7 @@ static int init_timers_cpu(int cpu) for (j = 0; j < TVR_SIZE; j++) INIT_LIST_HEAD(base->tv1.vec + j); + INIT_LIST_HEAD(&base->migration_list); base->timer_jiffies = jiffies; base->next_timer = base->timer_jiffies; base->active_timers = 0; Does this look any better ? I tested this on Exynos (Dual ARM Cortex-A9) board, with ubuntu over it. System was fairly idle and I did 'dmesg > /dev/null' on one of the CPUs in an infinite loop, to get CPUs out of idle. I have got few more concerns/diffs over this to further optimize things, but kept them separate so that I can drop them if they aren't correct. 1.) Should the above list_empty(migration_list) block be added out of the while (time_after_eq(jiffies, base->timer_jiffies)) block ? So that we check it only once per timer interrupt. Also ->running_timer is set to the last serviced timer and a del_timer_sync() might be waiting to remove it. But we continue with the migration list first, without clearing it first. Not sure if this is important at all.. 2.) By the time we finish serving all pending timers, local CPU might not be idle anymore OR the target CPU may become idle. diff --git a/kernel/time/timer.c b/kernel/time/timer.c index c412511d34b8..d75d31e9dc49 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1233,6 +1233,14 @@ static inline void __run_timers(struct tvec_base *base) if (unlikely(!list_empty(&base->migration_list))) { struct tvec_base *new_base = base->preferred_target; + if (!idle_cpu(base->cpu)) { + /* Local CPU not idle anymore */ + new_base = base; + } else if (idle_cpu(new_base->cpu)) { + /* Re-evaluate base, target CPU has gone idle */ + new_base = per_cpu(tvec_bases, get_nohz_timer_target(false)); + } + do { timer = list_first_entry(&base->migration_list, struct timer_list, entry); The first if block is getting hit a lot of times in my tests. But the second one has never been hit (Probably because of only two CPUs, not sure). 2.) It might be better to update preferred_target every time we choose a difference base. This may help us avoid calling get_nohz_timer_target() in the second if block above. diff --git a/kernel/time/timer.c b/kernel/time/timer.c index d75d31e9dc49..558cd98abd87 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -783,6 +783,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires, new_base = per_cpu(tvec_bases, cpu); if (base != new_base) { + base->preferred_target = new_base; + /* * We are trying to schedule the timer on the local CPU. * However we can't change timer's base while it is running, @@ -795,7 +797,6 @@ __mod_timer(struct timer_list *timer, unsigned long expires, */ if (unlikely(base->running_timer == timer)) { timer->expires = expires; - base->preferred_target = new_base; list_add_tail(&timer->entry, &base->migration_list); goto out_unlock; } else {