From patchwork Thu Sep 20 18:48:09 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 11604 Return-Path: X-Original-To: patchwork@peony.canonical.com Delivered-To: patchwork@peony.canonical.com Received: from fiordland.canonical.com (fiordland.canonical.com [91.189.94.145]) by peony.canonical.com (Postfix) with ESMTP id C274C23E54 for ; Thu, 20 Sep 2012 18:49:12 +0000 (UTC) Received: from mail-iy0-f180.google.com (mail-iy0-f180.google.com [209.85.210.180]) by fiordland.canonical.com (Postfix) with ESMTP id E0BF9A1823C for ; Thu, 20 Sep 2012 18:49:11 +0000 (UTC) Received: by mail-iy0-f180.google.com with SMTP id j25so1863989iaf.11 for ; Thu, 20 Sep 2012 11:49:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=x-forwarded-to:x-forwarded-for:delivered-to:received-spf:from:to:cc :subject:date:message-id:x-mailer:in-reply-to:references :x-content-scanned:x-cbid:x-gm-message-state; bh=7b0ovu5VRu/2/bTdvEK/PX03oAZF717rotovAh6VuXo=; b=Bs5zhquZ2RHFzRKsNjKuBKaN3Kr0btsMMzFjr36VeT0CVKlestKbziXO9rBSBmoHua hEBdRGxfXuSPdfZqFVf6tvVU+oZ4XsrORxGok1i+pfG7nCaLERqpnl9LegvDaE+3tGz6 k8HRuGvVPLZEPEnI+sXK9HP3uS23axaeqFegXZ7FQryWCWnAo9C0ND3dD3c/2iT4UN5a mQlP3yMct0bfERnhCjI2FhRSXopHR5ifo0u9iXqOZpsKSQSqOohZTat8EixR/Jboa62R dZDOeoBGy7JLp0Ey1tWHo8NQdI/h8T437BU6L9z60nIROp//JJavou1plRTbYxqnODku SxKg== Received: by 10.50.195.134 with SMTP id ie6mr2603121igc.28.1348166951707; Thu, 20 Sep 2012 11:49:11 -0700 (PDT) X-Forwarded-To: linaro-patchwork@canonical.com X-Forwarded-For: patch@linaro.org linaro-patchwork@canonical.com Delivered-To: patches@linaro.org Received: by 10.50.184.232 with SMTP id ex8csp92247igc; Thu, 20 Sep 2012 11:49:11 -0700 (PDT) Received: by 10.60.31.165 with SMTP id b5mr2015305oei.58.1348166951041; Thu, 20 Sep 2012 11:49:11 -0700 (PDT) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com. [32.97.110.152]) by mx.google.com with ESMTPS id ue6si5667931obb.178.2012.09.20.11.49.10 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 20 Sep 2012 11:49:10 -0700 (PDT) Received-SPF: pass (google.com: domain of paulmck@linux.vnet.ibm.com designates 32.97.110.152 as permitted sender) client-ip=32.97.110.152; Authentication-Results: mx.google.com; spf=pass (google.com: domain of paulmck@linux.vnet.ibm.com designates 32.97.110.152 as permitted sender) smtp.mail=paulmck@linux.vnet.ibm.com Received: from /spool/local by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 20 Sep 2012 12:49:10 -0600 Received: from d03dlp01.boulder.ibm.com (9.17.202.177) by e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Thu, 20 Sep 2012 12:49:07 -0600 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id D12741FF003C for ; Thu, 20 Sep 2012 12:49:01 -0600 (MDT) Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id q8KImjNn146124 for ; Thu, 20 Sep 2012 12:48:45 -0600 Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id q8KImQC5021250 for ; Thu, 20 Sep 2012 12:48:40 -0600 Received: from paulmck-ThinkPad-W500 ([9.47.24.72]) by d03av01.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id q8KImNfY020800; Thu, 20 Sep 2012 12:48:24 -0600 Received: by paulmck-ThinkPad-W500 (Postfix, from userid 1000) id 5A516EC52F; Thu, 20 Sep 2012 11:48:22 -0700 (PDT) From: "Paul E. McKenney" To: linux-kernel@vger.kernel.org Cc: mingo@elte.hu, laijs@cn.fujitsu.com, dipankar@in.ibm.com, akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca, josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org, Valdis.Kletnieks@vt.edu, dhowells@redhat.com, eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com, sbw@mit.edu, patches@linaro.org, "Paul E. McKenney" Subject: [PATCH tip/core/rcu 13/23] rcu: Prevent force_quiescent_state() memory contention Date: Thu, 20 Sep 2012 11:48:09 -0700 Message-Id: <1348166900-18716-13-git-send-email-paulmck@linux.vnet.ibm.com> X-Mailer: git-send-email 1.7.8 In-Reply-To: <1348166900-18716-1-git-send-email-paulmck@linux.vnet.ibm.com> References: <20120920184751.GA18657@linux.vnet.ibm.com> <1348166900-18716-1-git-send-email-paulmck@linux.vnet.ibm.com> X-Content-Scanned: Fidelis XPS MAILER x-cbid: 12092018-2876-0000-0000-00000058A197 X-Gm-Message-State: ALoCoQlox0/YMR2sqmlSVKdkr7FZOwxJhttuaL6bX1066LE4sXY0wo9XvCTG311MCfGT3Mb+dN6a From: "Paul E. McKenney" Large systems running RCU_FAST_NO_HZ kernels see extreme memory contention on the rcu_state structure's ->fqslock field. This can be avoided by disabling RCU_FAST_NO_HZ, either at compile time or at boot time (via the nohz kernel boot parameter), but large systems will no doubt become sensitive to energy consumption. This commit therefore uses a combining-tree approach to spread the memory contention across new cache lines in the leaf rcu_node structures. This can be thought of as a tournament lock that has only a try-lock acquisition primitive. The effect on small systems is minimal, because such systems have an rcu_node "tree" consisting of a single node. In addition, this functionality is not used on fastpaths. Signed-off-by: Paul E. McKenney Reviewed-by: Josh Triplett --- kernel/rcutree.c | 47 +++++++++++++++++++++++++++++++++++++---------- kernel/rcutree.h | 1 + 2 files changed, 38 insertions(+), 10 deletions(-) diff --git a/kernel/rcutree.c b/kernel/rcutree.c index b353d32..5edbdf8 100644 --- a/kernel/rcutree.c +++ b/kernel/rcutree.c @@ -61,6 +61,7 @@ /* Data structures. */ static struct lock_class_key rcu_node_class[RCU_NUM_LVLS]; +static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS]; #define RCU_STATE_INITIALIZER(sname, cr) { \ .level = { &sname##_state.node[0] }, \ @@ -1805,16 +1806,35 @@ static void force_qs_rnp(struct rcu_state *rsp, int (*f)(struct rcu_data *)) static void force_quiescent_state(struct rcu_state *rsp) { unsigned long flags; - struct rcu_node *rnp = rcu_get_root(rsp); + bool ret; + struct rcu_node *rnp; + struct rcu_node *rnp_old = NULL; + + /* Funnel through hierarchy to reduce memory contention. */ + rnp = per_cpu_ptr(rsp->rda, raw_smp_processor_id())->mynode; + for (; rnp != NULL; rnp = rnp->parent) { + ret = (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) || + !raw_spin_trylock(&rnp->fqslock); + if (rnp_old != NULL) + raw_spin_unlock(&rnp_old->fqslock); + if (ret) { + rsp->n_force_qs_lh++; + return; + } + rnp_old = rnp; + } + /* rnp_old == rcu_get_root(rsp), rnp == NULL. */ - if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) + /* Reached the root of the rcu_node tree, acquire lock. */ + raw_spin_lock_irqsave(&rnp_old->lock, flags); + raw_spin_unlock(&rnp_old->fqslock); + if (ACCESS_ONCE(rsp->gp_flags) & RCU_GP_FLAG_FQS) { + rsp->n_force_qs_lh++; + raw_spin_unlock_irqrestore(&rnp_old->lock, flags); return; /* Someone beat us to it. */ - if (!raw_spin_trylock_irqsave(&rnp->lock, flags)) { - rsp->n_force_qs_lh++; /* Inexact, can lose counts. Tough! */ - return; } rsp->gp_flags |= RCU_GP_FLAG_FQS; - raw_spin_unlock_irqrestore(&rnp->lock, flags); + raw_spin_unlock_irqrestore(&rnp_old->lock, flags); wake_up(&rsp->gp_wq); /* Memory barrier implied by wake_up() path. */ } @@ -2702,10 +2722,14 @@ static void __init rcu_init_levelspread(struct rcu_state *rsp) static void __init rcu_init_one(struct rcu_state *rsp, struct rcu_data __percpu *rda) { - static char *buf[] = { "rcu_node_level_0", - "rcu_node_level_1", - "rcu_node_level_2", - "rcu_node_level_3" }; /* Match MAX_RCU_LVLS */ + static char *buf[] = { "rcu_node_0", + "rcu_node_1", + "rcu_node_2", + "rcu_node_3" }; /* Match MAX_RCU_LVLS */ + static char *fqs[] = { "rcu_node_fqs_0", + "rcu_node_fqs_1", + "rcu_node_fqs_2", + "rcu_node_fqs_3" }; /* Match MAX_RCU_LVLS */ int cpustride = 1; int i; int j; @@ -2730,6 +2754,9 @@ static void __init rcu_init_one(struct rcu_state *rsp, raw_spin_lock_init(&rnp->lock); lockdep_set_class_and_name(&rnp->lock, &rcu_node_class[i], buf[i]); + raw_spin_lock_init(&rnp->fqslock); + lockdep_set_class_and_name(&rnp->fqslock, + &rcu_fqs_class[i], fqs[i]); rnp->gpnum = 0; rnp->qsmask = 0; rnp->qsmaskinit = 0; diff --git a/kernel/rcutree.h b/kernel/rcutree.h index 7fb93ce..8f0293c 100644 --- a/kernel/rcutree.h +++ b/kernel/rcutree.h @@ -202,6 +202,7 @@ struct rcu_node { /* per-CPU kthreads as needed. */ unsigned int node_kthread_status; /* State of node_kthread_task for tracing. */ + raw_spinlock_t fqslock ____cacheline_internodealigned_in_smp; } ____cacheline_internodealigned_in_smp; /*