From patchwork Thu Apr 5 16:59:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 132869
From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org, peterz@infradead.org,
 mingo@kernel.org, boqun.feng@gmail.com, paulmck@linux.vnet.ibm.com,
 catalin.marinas@arm.com, Will Deacon
Subject: [PATCH 10/10] locking/qspinlock: Elide back-to-back RELEASE
 operations with smp_wmb()
Date: Thu, 5 Apr 2018 17:59:07 +0100
Message-Id: <1522947547-24081-11-git-send-email-will.deacon@arm.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1522947547-24081-1-git-send-email-will.deacon@arm.com>
References: <1522947547-24081-1-git-send-email-will.deacon@arm.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The qspinlock slowpath must ensure that the MCS node is fully initialised
before it can be reached by another CPU. This is currently enforced by
using a RELEASE operation when updating the tail and also when linking
the node into the waitqueue (since the control dependency off xchg_tail
is not enough to enforce the required ordering -- see 95bcade33a8a
("locking/qspinlock: Ensure node is initialised before updating
prev->next")).

Back-to-back RELEASE operations may be expensive on some architectures,
particularly those that implement them using fences under the hood. We
can replace the two RELEASE operations with a single smp_wmb() fence and
use RELAXED operations for the subsequent publishing of the node.

Cc: Peter Zijlstra
Cc: Ingo Molnar
Signed-off-by: Will Deacon
---
 kernel/locking/qspinlock.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

-- 
2.1.4

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 3ad8786a47e2..42c61f7b37c5 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -141,10 +141,10 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
 static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 {
 	/*
-	 * Use release semantics to make sure that the MCS node is properly
-	 * initialized before changing the tail code.
+	 * We can use relaxed semantics since the caller ensures that the
+	 * MCS node is properly initialized before updating the tail.
 	 */
-	return (u32)xchg_release(&lock->tail,
+	return (u32)xchg_relaxed(&lock->tail,
 				 tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
 }
 
@@ -178,10 +178,11 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
 	for (;;) {
 		new = (val & _Q_LOCKED_PENDING_MASK) | tail;
 		/*
-		 * Use release semantics to make sure that the MCS node is
-		 * properly initialized before changing the tail code.
+		 * We can use relaxed semantics since the caller ensures that
+		 * the MCS node is properly initialized before updating the
+		 * tail.
 		 */
-		old = atomic_cmpxchg_release(&lock->val, val, new);
+		old = atomic_cmpxchg_relaxed(&lock->val, val, new);
 		if (old == val)
 			break;
 
@@ -340,12 +341,17 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 		goto release;
 
 	/*
+	 * Ensure that the initialisation of @node is complete before we
+	 * publish the updated tail and potentially link @node into the
+	 * waitqueue.
+	 */
+	smp_wmb();
+
+	/*
 	 * We have already touched the queueing cacheline; don't bother with
 	 * pending stuff.
 	 *
 	 * p,*,* -> n,*,*
-	 *
-	 * RELEASE, such that the stores to @node must be complete.
 	 */
 	old = xchg_tail(lock, tail);
 	next = NULL;
@@ -356,15 +362,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 */
 	if (old & _Q_TAIL_MASK) {
 		prev = decode_tail(old);
-
-		/*
-		 * We must ensure that the stores to @node are observed before
-		 * the write to prev->next. The address dependency from
-		 * xchg_tail is not sufficient to ensure this because the read
-		 * component of xchg_tail is unordered with respect to the
-		 * initialisation of @node.
-		 */
-		smp_store_release(&prev->next, node);
+		WRITE_ONCE(prev->next, node);
 
 		pv_wait_node(node, prev);
 		arch_mcs_spin_lock_contended(&node->locked);
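
For readers who want to experiment with the ordering pattern outside the
kernel, here is a rough userspace sketch in C11 atomics. It is not kernel
code and not part of the patch. All names (demo_node, demo_queue,
demo_enqueue, demo_next, demo_usage) are hypothetical. It assumes that
atomic_thread_fence(memory_order_release) is an acceptable stand-in for
smp_wmb() (the C11 fence is stronger, since it also orders earlier loads),
and that an acquire load on the reader side is an acceptable stand-in for
whatever ordering the kernel reader relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's MCS node; not the real struct. */
struct demo_node {
	_Atomic(struct demo_node *) next;
	atomic_int locked;
};

/* Hypothetical stand-in for the tail field of the lock word. */
struct demo_queue {
	_Atomic(struct demo_node *) tail;
};

static inline void demo_queue_init(struct demo_queue *q)
{
	atomic_init(&q->tail, NULL);
}

/* Enqueue @node and return the previous tail (NULL if the queue was empty). */
static inline struct demo_node *demo_enqueue(struct demo_queue *q,
					     struct demo_node *node)
{
	struct demo_node *prev;

	/* Initialise the node while it is still private to this thread. */
	atomic_init(&node->next, NULL);
	atomic_init(&node->locked, 0);

	/*
	 * A single fence in the role of smp_wmb() (assumption: a C11
	 * release fence is used as an approximation). It orders the
	 * initialisation above before both publishing stores below.
	 */
	atomic_thread_fence(memory_order_release);

	/* Relaxed tail update, in the spirit of xchg_relaxed(). */
	prev = atomic_exchange_explicit(&q->tail, node, memory_order_relaxed);

	/* Relaxed link into the queue, in the spirit of WRITE_ONCE(). */
	if (prev)
		atomic_store_explicit(&prev->next, node, memory_order_relaxed);

	return prev;
}

/* Reader side: fetch the successor; acquire pairs with the fence above. */
static inline struct demo_node *demo_next(struct demo_node *node)
{
	return atomic_load_explicit(&node->next, memory_order_acquire);
}

/* Single-threaded usage example; returns 0 if the queue links up. */
int demo_usage(void)
{
	struct demo_queue q;
	struct demo_node a, b;

	demo_queue_init(&q);
	assert(demo_enqueue(&q, &a) == NULL);	/* queue was empty */
	assert(demo_enqueue(&q, &b) == &a);	/* b queues behind a */
	assert(demo_next(&a) == &b);		/* a now points at b */
	assert(demo_next(&b) == NULL);		/* b is the new tail */
	return 0;
}
```

The point of the sketch is only the shape: one fence placed after node
initialisation covers both the tail exchange and the prev->next store, so
neither needs RELEASE semantics of its own. The usage function exercises
the linking logic on a single thread and says nothing about performance
or about behaviour on any particular architecture.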