From patchwork Fri Nov 21 14:53:23 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Thompson <daniel.thompson@linaro.org>
X-Patchwork-Id: 41325
From: Daniel Thompson <daniel.thompson@linaro.org>
To: Russell King <linux@arm.linux.org.uk>, Will Deacon <will.deacon@arm.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 Shawn Guo <shawn.guo@linaro.org>, Sascha Hauer <kernel@pengutronix.de>,
 Peter Zijlstra <a.p.zijlstra@chello.nl>,
 Paul Mackerras <paulus@samba.org>, Ingo Molnar <mingo@redhat.com>,
 Arnaldo Carvalho de Melo <acme@kernel.org>,
 Thomas Gleixner <tglx@linutronix.de>, Lucas Stach <l.stach@pengutronix.de>,
 Linus Walleij <linus.walleij@linaro.org>, patches@linaro.org,
 linaro-kernel@lists.linaro.org, John Stultz <john.stultz@linaro.org>,
 Sumit Semwal <sumit.semwal@linaro.org>
Subject: [PATCH] arm: perf: Directly handle SMP platforms with one SPI
Date: Fri, 21 Nov 2014 14:53:23 +0000
Message-Id: <1416581603-30557-1-git-send-email-daniel.thompson@linaro.org>
X-Mailer: git-send-email 1.9.3

Some ARM platforms mux the PMU interrupt of every core into a single
SPI. On such platforms if the PMU of any core except 0 raises an interrupt
then it cannot be serviced and eventually, if you are lucky, the spurious
irq detection might forcefully disable the interrupt.

On these SoCs it is not possible to determine which core raised the
interrupt, so work around this issue by queuing irq_work on the other
cores whenever the primary interrupt handler is unable to service the
interrupt.

The u8500 platform has an alternative workaround that dynamically alters
the affinity of the PMU interrupt. That workaround is no longer required,
so the original code is removed, as is the hook it relied upon.

Tested on imx6q (which has four cores/PMUs all muxed to a single SPI).

Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
---

Notes:
    Thanks to Lucas Stach, Russell King and Thomas Gleixner for critiquing
    an older, completely different way to tackle the same problem.
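
For reviewers who want to reason about the remaining_work countdown
without booting a board, the protocol described in the commit message
can be modelled in userspace. This is only a sketch, not kernel code:
every model_* name and MODEL_* constant is hypothetical, the "cores" are
run one after another on a single thread, and queued[] stands in for an
irq_work item that is assumed to refuse a second queue request while it
is still pending.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_MAX_CPUS		8
#define MODEL_IRQ_NONE		0
#define MODEL_IRQ_HANDLED	1

/* Hypothetical stand-in for the fields this patch adds to struct arm_pmu. */
struct model_pmu {
	atomic_int remaining_work;	/* queued work items, plus one for the sender */
	bool irq_enabled;		/* models enable_irq()/disable_irq_nosync() */
	int num_cpus;
	bool pending[MODEL_MAX_CPUS];	/* core whose PMU asserts the shared line */
	bool queued[MODEL_MAX_CPUS];	/* work item currently queued on this core */
	int last_ret[MODEL_MAX_CPUS];	/* models arm_pmu_work.ret */
};

static void model_init(struct model_pmu *pmu, int num_cpus)
{
	assert(num_cpus >= 1 && num_cpus <= MODEL_MAX_CPUS);
	atomic_init(&pmu->remaining_work, 0);
	pmu->irq_enabled = true;
	pmu->num_cpus = num_cpus;
	for (int i = 0; i < MODEL_MAX_CPUS; i++) {
		pmu->pending[i] = false;
		pmu->queued[i] = false;
		pmu->last_ret[i] = MODEL_IRQ_NONE;
	}
}

/* Models armpmu->handle_irq(): a core can only service its own PMU. */
static int model_handle_irq(struct model_pmu *pmu, int cpu)
{
	if (!pmu->pending[cpu])
		return MODEL_IRQ_NONE;
	pmu->pending[cpu] = false;
	return MODEL_IRQ_HANDLED;
}

/* Models cpu_pmu_do_percpu_work(): try locally, last one out re-enables. */
static void model_do_percpu_work(struct model_pmu *pmu, int cpu)
{
	pmu->queued[cpu] = false;
	pmu->last_ret[cpu] = model_handle_irq(pmu, cpu);
	if (atomic_fetch_sub(&pmu->remaining_work, 1) == 1)
		pmu->irq_enabled = true;
}

/* Models cpu_pmu_handle_irq_none() running on core 'self'. */
static int model_handle_irq_none(struct model_pmu *pmu, int self)
{
	int ret = MODEL_IRQ_NONE;

	if (pmu->num_cpus <= 1)
		return MODEL_IRQ_NONE;

	pmu->irq_enabled = false;
	atomic_fetch_add(&pmu->remaining_work, pmu->num_cpus);

	for (int cpu = 0; cpu < pmu->num_cpus; cpu++) {
		if (cpu == self)
			continue;
		/* Report what this core achieved in the previous round. */
		if (pmu->last_ret[cpu] != MODEL_IRQ_NONE)
			ret = pmu->last_ret[cpu];
		if (pmu->queued[cpu])	/* assumed: queueing a pending item fails */
			atomic_fetch_sub(&pmu->remaining_work, 1);
		else
			pmu->queued[cpu] = true;
	}

	/* Drop the sender's own reference. */
	if (atomic_fetch_sub(&pmu->remaining_work, 1) == 1)
		pmu->irq_enabled = true;
	return ret;
}

/* Run every queued work item, as the other cores eventually would. */
static void model_run_queued(struct model_pmu *pmu)
{
	for (int cpu = 0; cpu < pmu->num_cpus; cpu++)
		if (pmu->queued[cpu])
			model_do_percpu_work(pmu, cpu);
}
```

In this model, calling model_handle_irq_none() on core 0 while another
core's PMU is pending leaves the interrupt masked until
model_run_queued() has run the last outstanding item. The next call then
returns the IRQ_HANDLED equivalent recorded by the core that cleared the
source, which is the one-round lag the in-code comment about memory
ordering refers to.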
    

 arch/arm/include/asm/pmu.h       | 10 +++++
 arch/arm/kernel/perf_event.c     | 11 ++---
 arch/arm/kernel/perf_event_cpu.c | 94 ++++++++++++++++++++++++++++++++++++++++
 arch/arm/mach-ux500/cpu-db8500.c | 29 -------------
 4 files changed, 107 insertions(+), 37 deletions(-)

--
1.9.3
diff --git a/arch/arm/include/asm/pmu.h b/arch/arm/include/asm/pmu.h
index 0b648c541293..36472c3cc283 100644
--- a/arch/arm/include/asm/pmu.h
+++ b/arch/arm/include/asm/pmu.h
@@ -81,6 +81,12 @@ struct pmu_hw_events {
 	raw_spinlock_t		pmu_lock;
 };

+struct arm_pmu_work {
+	struct irq_work work;
+	struct arm_pmu *arm_pmu;
+	atomic_t ret;
+};
+
 struct arm_pmu {
 	struct pmu	pmu;
 	cpumask_t	active_irqs;
@@ -101,6 +107,7 @@ struct arm_pmu {
 	void		(*reset)(void *);
 	int		(*request_irq)(struct arm_pmu *, irq_handler_t handler);
 	void		(*free_irq)(struct arm_pmu *);
+	irqreturn_t	(*handle_irq_none)(struct arm_pmu *);
 	int		(*map_event)(struct perf_event *event);
 	int		num_events;
 	atomic_t	active_events;
@@ -108,6 +115,9 @@ struct arm_pmu {
 	u64		max_period;
 	struct platform_device	*plat_device;
 	struct pmu_hw_events	*(*get_hw_events)(void);
+	int		single_irq;
+	struct arm_pmu_work __percpu *work;
+	atomic_t	remaining_work;
 };

 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index b50a770f8c99..0792c913b9bb 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -306,22 +306,17 @@ validate_group(struct perf_event *event)
 static irqreturn_t armpmu_dispatch_irq(int irq, void *dev)
 {
 	struct arm_pmu *armpmu;
-	struct platform_device *plat_device;
-	struct arm_pmu_platdata *plat;
 	int ret;
 	u64 start_clock, finish_clock;

 	if (irq_is_percpu(irq))
 		dev = *(void **)dev;
 	armpmu = dev;
-	plat_device = armpmu->plat_device;
-	plat = dev_get_platdata(&plat_device->dev);

 	start_clock = sched_clock();
-	if (plat && plat->handle_irq)
-		ret = plat->handle_irq(irq, dev, armpmu->handle_irq);
-	else
-		ret = armpmu->handle_irq(irq, dev);
+	ret = armpmu->handle_irq(irq, dev);
+	if (ret == IRQ_NONE && armpmu->handle_irq_none)
+		ret = armpmu->handle_irq_none(dev);
 	finish_clock = sched_clock();

 	perf_sample_event_took(finish_clock - start_clock);
diff --git a/arch/arm/kernel/perf_event_cpu.c b/arch/arm/kernel/perf_event_cpu.c
index eb2c4d55666b..e7153dc3b489 100644
--- a/arch/arm/kernel/perf_event_cpu.c
+++ b/arch/arm/kernel/perf_event_cpu.c
@@ -88,6 +88,75 @@ static void cpu_pmu_disable_percpu_irq(void *data)
 	disable_percpu_irq(irq);
 }

+/*
+ * Workaround logic that is distributed to all cores if the PMU has only
+ * a single IRQ and the CPU receiving that IRQ cannot handle it. Its
+ * job is to try to service the interrupt on the current CPU. It will
+ * also enable the IRQ again if all the other CPUs have already tried to
+ * service it.
+ */
+static void cpu_pmu_do_percpu_work(struct irq_work *w)
+{
+	struct arm_pmu_work *work = container_of(w, struct arm_pmu_work, work);
+	struct arm_pmu *cpu_pmu = work->arm_pmu;
+
+	atomic_set(&work->ret,
+		   cpu_pmu->handle_irq(cpu_pmu->single_irq, cpu_pmu));
+
+	if (atomic_dec_and_test(&cpu_pmu->remaining_work))
+		enable_irq(cpu_pmu->single_irq);
+}
+
+/*
+ * This callback, which is enabled only on SMP platforms that are
+ * running with a single IRQ, is called when the PMU handler running in
+ * the current CPU cannot service the interrupt.
+ *
+ * It will disable the interrupt and distribute irqwork to all other
+ * processors in the system. Hopefully one of them will clear the
+ * interrupt...
+ */
+static irqreturn_t cpu_pmu_handle_irq_none(struct arm_pmu *cpu_pmu)
+{
+	int num_online = num_online_cpus();
+	irqreturn_t ret = IRQ_NONE;
+	int cpu, cret;
+
+	if (num_online <= 1)
+		return IRQ_NONE;
+
+	disable_irq_nosync(cpu_pmu->single_irq);
+	atomic_add(num_online, &cpu_pmu->remaining_work);
+	smp_mb__after_atomic();
+
+	for_each_online_cpu(cpu) {
+		struct arm_pmu_work *work = per_cpu_ptr(cpu_pmu->work, cpu);
+
+		if (cpu == smp_processor_id())
+			continue;
+
+		/*
+		 * We can be extremely relaxed about memory ordering
+		 * here. All we are doing is gathering information
+		 * about the past to help us give a return value that
+		 * will keep the spurious interrupt detector both happy
+		 * *and* functional. We are not shared so we can
+		 * tolerate the occasional spurious IRQ_HANDLED.
+		 */
+		cret = atomic_read(&work->ret);
+		if (cret != IRQ_NONE)
+			ret = cret;
+
+		if (!irq_work_queue_on(&work->work, cpu))
+			atomic_dec(&cpu_pmu->remaining_work);
+	}
+
+	if (atomic_dec_and_test(&cpu_pmu->remaining_work))
+		enable_irq(cpu_pmu->single_irq);
+
+	return ret;
+}
+
 static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)
 {
 	int i, irq, irqs;
@@ -107,6 +176,9 @@ static void cpu_pmu_free_irq(struct arm_pmu *cpu_pmu)
 			if (irq >= 0)
 				free_irq(irq, cpu_pmu);
 		}
+
+		cpu_pmu->handle_irq_none = NULL;
+		free_percpu(cpu_pmu->work);
 	}
 }

@@ -162,6 +234,28 @@ static int cpu_pmu_request_irq(struct arm_pmu *cpu_pmu, irq_handler_t handler)

 			cpumask_set_cpu(i, &cpu_pmu->active_irqs);
 		}
+
+		/*
+		 * If we are running SMP and have only one interrupt source
+		 * then get ready to share that single irq among the cores.
+		 */
+		if (nr_cpu_ids > 1 && irqs == 1) {
+			cpu_pmu->single_irq = platform_get_irq(pmu_device, 0);
+			cpu_pmu->work = alloc_percpu(struct arm_pmu_work);
+			if (!cpu_pmu->work) {
+				pr_err("no memory for shared IRQ workaround\n");
+				return -ENOMEM;
+			}
+
+			for_each_possible_cpu(i) {
+				struct arm_pmu_work *w =
+				    per_cpu_ptr(cpu_pmu->work, i);
+				init_irq_work(&w->work, cpu_pmu_do_percpu_work);
+				w->arm_pmu = cpu_pmu;
+			}
+
+			cpu_pmu->handle_irq_none = cpu_pmu_handle_irq_none;
+		}
 	}

 	return 0;
diff --git a/arch/arm/mach-ux500/cpu-db8500.c b/arch/arm/mach-ux500/cpu-db8500.c
index 6f63954c8bde..917774999c5c 100644
--- a/arch/arm/mach-ux500/cpu-db8500.c
+++ b/arch/arm/mach-ux500/cpu-db8500.c
@@ -12,8 +12,6 @@
 #include <linux/init.h>
 #include <linux/device.h>
 #include <linux/amba/bus.h>
-#include <linux/interrupt.h>
-#include <linux/irq.h>
 #include <linux/platform_device.h>
 #include <linux/io.h>
 #include <linux/mfd/abx500/ab8500.h>
@@ -23,7 +21,6 @@
 #include <linux/regulator/machine.h>
 #include <linux/random.h>

-#include <asm/pmu.h>
 #include <asm/mach/map.h>

 #include "setup.h"
@@ -99,30 +96,6 @@ static void __init u8500_map_io(void)
 		iotable_init(u8500_io_desc, ARRAY_SIZE(u8500_io_desc));
 }

-/*
- * The PMU IRQ lines of two cores are wired together into a single interrupt.
- * Bounce the interrupt to the other core if it's not ours.
- */
-static irqreturn_t db8500_pmu_handler(int irq, void *dev, irq_handler_t handler)
-{
-	irqreturn_t ret = handler(irq, dev);
-	int other = !smp_processor_id();
-
-	if (ret == IRQ_NONE && cpu_online(other))
-		irq_set_affinity(irq, cpumask_of(other));
-
-	/*
-	 * We should be able to get away with the amount of IRQ_NONEs we give,
-	 * while still having the spurious IRQ detection code kick in if the
-	 * interrupt really starts hitting spuriously.
-	 */
-	return ret;
-}
-
-static struct arm_pmu_platdata db8500_pmu_platdata = {
-	.handle_irq		= db8500_pmu_handler,
-};
-
 static const char *db8500_read_soc_id(void)
 {
 	void __iomem *uid = __io_address(U8500_BB_UID_BASE);
@@ -143,8 +116,6 @@ static struct device * __init db8500_soc_device_init(void)
 }

 static struct of_dev_auxdata u8500_auxdata_lookup[] __initdata = {
-	/* Requires call-back bindings. */
-	OF_DEV_AUXDATA("arm,cortex-a9-pmu", 0, "arm-pmu", &db8500_pmu_platdata),
 	/* Requires DMA bindings. */
 	OF_DEV_AUXDATA("stericsson,ux500-msp-i2s", 0x80123000,
 		       "ux500-msp-i2s.0", &msp0_platform_data),