From patchwork Tue Feb 28 06:29:37 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dmitry Antipov
X-Patchwork-Id: 6963
From: Dmitry Antipov
To: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra, Russell King, Rusty Russell, Ingo Molnar,
 Yong Zhang, Venki Pallipadi, x86@kernel.org,
 linux-arm-kernel@lists.infradead.org, linaro-dev@lists.linaro.org,
 patches@linaro.org, Dmitry Antipov
Subject: [PATCH] sched: generalize CONFIG_IRQ_TIME_ACCOUNTING for X86 and ARM
Date: Tue, 28 Feb 2012 10:29:37 +0400
Message-Id: <1330410577-22739-1-git-send-email-dmitry.antipov@linaro.org>
X-Mailer: git-send-email 1.7.7.6

Generalize CONFIG_IRQ_TIME_ACCOUNTING between X86 and ARM, and move the
"noirqtime" option to common debugging code. For backward compatibility,
the X86-specific "tsc=noirqtime" option is preserved, but it now issues
a warning.

Suggested-by: Yong Zhang
Suggested-by: Russell King
Suggested-by: Ingo Molnar
Suggested-by: Peter Zijlstra
Acked-by: Venkatesh Pallipadi
Signed-off-by: Dmitry Antipov
---
 Documentation/kernel-parameters.txt |    9 +++++----
 arch/arm/kernel/sched_clock.c       |    2 ++
 arch/x86/Kconfig                    |   11 -----------
 arch/x86/kernel/tsc.c               |   12 ++++++------
 include/linux/sched.h               |    6 +-----
 kernel/sched/core.c                 |   24 +++++++++++++++++++-----
 lib/Kconfig.debug                   |   12 ++++++++++++
 7 files changed, 45 insertions(+), 31 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 033d4e6..a5da255 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1790,6 +1790,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 	noirqdebug	[X86-32] Disables the code which attempts to detect and
 			disable unhandled interrupt sources.
 
+	noirqtime	[X86,ARM] Run time disables IRQ_TIME_ACCOUNTING and
+			eliminates the timestamping on irq/softirq entry/exit.
+
 	no_timer_check	[X86,APIC] Disables the code which tests for
 			broken timer IRQ sources.
 
@@ -2636,10 +2639,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			as the stability checks done at bootup.	Used to enable
 			high-resolution timer mode on older hardware, and in
 			virtualized environment.
-			[x86] noirqtime: Do not use TSC to do irq accounting.
-			Used to run time disable IRQ_TIME_ACCOUNTING on any
-			platforms where RDTSC is slow and this accounting
-			can add overhead.
+			[x86] noirqtime: obsoleted by "noirqtime" generic option,
+			see its documentation for details.
 
 	turbografx.map[2|3]=
 			[HW,JOY] TurboGraFX parallel port interface
diff --git a/arch/arm/kernel/sched_clock.c b/arch/arm/kernel/sched_clock.c
index 5416c7c..30b5f89 100644
--- a/arch/arm/kernel/sched_clock.c
+++ b/arch/arm/kernel/sched_clock.c
@@ -144,6 +144,8 @@ void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
 	 */
 	cd.epoch_ns = 0;
 
+	enable_sched_clock_irqtime();
+
 	pr_debug("Registered %pF as sched_clock source\n", read);
 }
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5bed94e..4759676 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -805,17 +805,6 @@ config SCHED_MC
 	  making when dealing with multi-core CPU chips at a cost of slightly
 	  increased overhead in some places. If unsure say N here.
 
-config IRQ_TIME_ACCOUNTING
-	bool "Fine granularity task level IRQ time accounting"
-	default n
-	---help---
-	  Select this option to enable fine granularity task irq time
-	  accounting. This is done by reading a timestamp on each
-	  transitions between softirq and hardirq state, so there can be a
-	  small performance impact.
-
-	  If in doubt, say N here.
-
 source "kernel/Kconfig.preempt"
 
 config X86_UP_APIC
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index a62c201..f1b2b63 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -103,14 +103,15 @@ int __init notsc_setup(char *str)
 
 __setup("notsc", notsc_setup);
 
-static int no_sched_irq_time;
-
 static int __init tsc_setup(char *str)
 {
 	if (!strcmp(str, "reliable"))
 		tsc_clocksource_reliable = 1;
-	if (!strncmp(str, "noirqtime", 9))
-		no_sched_irq_time = 1;
+	if (!strncmp(str, "noirqtime", 9)) {
+		printk(KERN_WARNING "tsc: tsc=noirqtime is "
+		       "obsolete, use noirqtime instead\n");
+		disable_sched_clock_irqtime();
+	}
 	return 1;
 }
 
@@ -978,8 +979,7 @@ void __init tsc_init(void)
 	/* now allow native_sched_clock() to use rdtsc */
 	tsc_disabled = 0;
 
-	if (!no_sched_irq_time)
-		enable_sched_clock_irqtime();
+	enable_sched_clock_irqtime();
 
 	lpj = ((u64)tsc_khz * 1000);
 	do_div(lpj, HZ);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7d379a6..ea4019c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1961,11 +1961,7 @@ extern void sched_clock_idle_wakeup_event(u64 delta_ns);
 #endif
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
-/*
- * An i/f to runtime opt-in for irq time accounting based off of sched_clock.
- * The reason for this explicit opt-in is not to have perf penalty with
- * slow sched_clocks.
- */
+extern int sched_clock_irqtime;
 extern void enable_sched_clock_irqtime(void);
 extern void disable_sched_clock_irqtime(void);
 #else
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b342f57..4693509 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -757,11 +757,17 @@ static DEFINE_PER_CPU(u64, cpu_hardirq_time);
 static DEFINE_PER_CPU(u64, cpu_softirq_time);
 
 static DEFINE_PER_CPU(u64, irq_start_time);
-static int sched_clock_irqtime;
+
+/*
+ * -1 if not initialized, 0 if disabled with "noirqtime" kernel option
+ * or after unstable clock was detected, 1 if enabled and active.
+ */
+__read_mostly int sched_clock_irqtime = -1;
 
 void enable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 1;
+	if (sched_clock_irqtime == -1)
+		sched_clock_irqtime = 1;
 }
 
 void disable_sched_clock_irqtime(void)
@@ -769,6 +775,14 @@ void disable_sched_clock_irqtime(void)
 	sched_clock_irqtime = 0;
 }
 
+static int __init irqtime_setup(char *str)
+{
+	sched_clock_irqtime = 0;
+	return 1;
+}
+
+__setup("noirqtime", irqtime_setup);
+
 #ifndef CONFIG_64BIT
 static DEFINE_PER_CPU(seqcount_t, irq_time_seq);
 
@@ -822,7 +836,7 @@ void account_system_vtime(struct task_struct *curr)
 	s64 delta;
 	int cpu;
 
-	if (!sched_clock_irqtime)
+	if (sched_clock_irqtime < 1)
 		return;
 
 	local_irq_save(flags);
@@ -2852,7 +2866,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
 	cputime_t one_jiffy_scaled = cputime_to_scaled(cputime_one_jiffy);
 	struct rq *rq = this_rq();
 
-	if (sched_clock_irqtime) {
+	if (sched_clock_irqtime > 0) {
 		irqtime_account_process_tick(p, user_tick, rq);
 		return;
 	}
@@ -2886,7 +2900,7 @@ void account_steal_ticks(unsigned long ticks)
 
 void account_idle_ticks(unsigned long ticks)
 {
-	if (sched_clock_irqtime) {
+	if (sched_clock_irqtime > 0) {
 		irqtime_account_idle_ticks(ticks);
 		return;
 	}
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 8745ac7..236e814 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -299,6 +299,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config IRQ_TIME_ACCOUNTING
+	bool "Fine granularity task level IRQ time accounting"
+	depends on X86 || ARM
+	default n
+	---help---
+	  Select this option to enable fine granularity task irq time
+	  accounting. This is done by reading a timestamp on each
+	  transition between softirq and hardirq state, so there can be a
+	  small performance impact.
+
+	  If in doubt, say N here.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
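
Note on the tri-state sched_clock_irqtime flag: the interaction between
the "noirqtime" option, enable_sched_clock_irqtime() and
disable_sched_clock_irqtime() can be sketched in user space as below.
This is only an illustration under stated assumptions, not kernel code.
The names irqtime_state, irqtime_parse_option(), irqtime_active() and
irqtime_reset() are hypothetical, and the string matching is a
simplified stand-in for the two __setup() handlers in the patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Mirrors the values documented in the patch:
 * -1 = not initialized, 0 = disabled, 1 = enabled and active.
 */
static int irqtime_state = -1;

/* Arch code asks for accounting; honoured only if nothing decided yet. */
static void irqtime_enable(void)
{
	if (irqtime_state == -1)
		irqtime_state = 1;
}

/* Disabling is unconditional and sticky: a later enable is ignored. */
static void irqtime_disable(void)
{
	irqtime_state = 0;
}

/*
 * Hypothetical stand-in for early command line parsing. Both the
 * generic and the obsolete X86 spelling end up disabling accounting.
 */
static void irqtime_parse_option(const char *opt)
{
	assert(opt != NULL);
	if (!strcmp(opt, "noirqtime") || !strcmp(opt, "tsc=noirqtime"))
		irqtime_disable();
}

/* Same test as the "sched_clock_irqtime > 0" checks in the patch. */
static int irqtime_active(void)
{
	return irqtime_state > 0;
}

/* Illustration helper only: return to the boot-time state. */
static void irqtime_reset(void)
{
	irqtime_state = -1;
}
```

With this model, calling irqtime_parse_option("noirqtime") before
irqtime_enable() leaves accounting off, which is why the patch can call
enable_sched_clock_irqtime() unconditionally from tsc_init() and
setup_sched_clock(). One consequence worth noting: once the state is 0
(for example after an unstable clock was detected), a later enable call
has no effect.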