From patchwork Wed Dec 2 15:35:09 2020
X-Patchwork-Id: 336701
From: Andy Lutomirski
To: x86@kernel.org, Mathieu Desnoyers
Cc: LKML, Nicholas Piggin, Arnd Bergmann, Anton Blanchard,
    Andy Lutomirski, stable@vger.kernel.org
Subject: [PATCH v2 1/4] x86/membarrier: Get rid of a dubious optimization
Date: Wed, 2 Dec 2020 07:35:09 -0800
Message-Id: <5afc7632be1422f91eaf7611aaaa1b5b8580a086.1606923183.git.luto@kernel.org>

sync_core_before_usermode() had an incorrect optimization.  If we're in
an IRQ, we can get to usermode without IRET -- we just have to schedule
to a different task in the same mm and do SYSRET.  Fortunately, there
were no callers of sync_core_before_usermode() that could have had
in_irq() or in_nmi() equal to true, because it's only ever called from
the scheduler.

While we're at it, clarify a related comment.

Cc: stable@vger.kernel.org
Reviewed-by: Mathieu Desnoyers
Signed-off-by: Andy Lutomirski
---
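Note for context: the code being fixed backs membarrier()'s
MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE command.  A rough userspace
sketch of that usage -- a JIT that rewrites code and then wants every
thread to execute a core-serializing instruction before running the new
code -- might look like the snippet below.  This is an illustration
only: the membarrier() function is a local helper (glibc has no
wrapper), and error handling is minimal.

#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>

/* Local helper; membarrier(2) has no glibc wrapper. */
static int membarrier(int cmd, unsigned int flags)
{
	return syscall(__NR_membarrier, cmd, flags, 0);
}

int main(void)
{
	/* Opt this process in to the sync_core command, once. */
	if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE, 0)) {
		perror("membarrier register");
		return 1;
	}

	/* ... rewrite JIT'ed code in place here ... */

	/*
	 * After this call, every thread of the process executes a
	 * core-serializing instruction before it next runs user code,
	 * so no thread keeps executing stale instruction bytes.
	 */
	if (membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, 0)) {
		perror("membarrier sync_core");
		return 1;
	}

	return 0;
}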
 arch/x86/include/asm/sync_core.h |  9 +++++----
 arch/x86/mm/tlb.c                | 10 ++++++++--
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/sync_core.h b/arch/x86/include/asm/sync_core.h
index 0fd4a9dfb29c..ab7382f92aff 100644
--- a/arch/x86/include/asm/sync_core.h
+++ b/arch/x86/include/asm/sync_core.h
@@ -98,12 +98,13 @@ static inline void sync_core_before_usermode(void)
 	/* With PTI, we unconditionally serialize before running user code. */
 	if (static_cpu_has(X86_FEATURE_PTI))
 		return;
+
 	/*
-	 * Return from interrupt and NMI is done through iret, which is core
-	 * serializing.
+	 * Even if we're in an interrupt, we might reschedule before returning,
+	 * in which case we could switch to a different thread in the same mm
+	 * and return using SYSRET or SYSEXIT.  Instead of trying to keep
+	 * track of our need to sync the core, just sync right away.
 	 */
-	if (in_irq() || in_nmi())
-		return;
 	sync_core();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 11666ba19b62..569ac1d57f55 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -474,8 +474,14 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 	/*
 	 * The membarrier system call requires a full memory barrier and
 	 * core serialization before returning to user-space, after
-	 * storing to rq->curr. Writing to CR3 provides that full
-	 * memory barrier and core serializing instruction.
+	 * storing to rq->curr, when changing mm. This is because
+	 * membarrier() sends IPIs to all CPUs that are in the target mm
+	 * to make them issue memory barriers. However, if another CPU
+	 * switches to/from the target mm concurrently with
+	 * membarrier(), it can cause that CPU not to receive an IPI
+	 * when it really should issue a memory barrier. Writing to CR3
+	 * provides that full memory barrier and core serializing
+	 * instruction.
 	 */
 	if (real_prev == next) {
 		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=

From patchwork Wed Dec 2 15:35:10 2020
X-Patchwork-Id: 336700
From: Andy Lutomirski
To: x86@kernel.org, Mathieu Desnoyers
Cc: LKML, Nicholas Piggin, Arnd Bergmann, Anton Blanchard,
    Andy Lutomirski, stable@vger.kernel.org
Subject: [PATCH v2 2/4] membarrier: Add an actual barrier before rseq_preempt()
Date: Wed, 2 Dec 2020 07:35:10 -0800
Message-Id: <510cb07946be056d2da7dda721bbf444c288751b.1606923183.git.luto@kernel.org>

It seems to me that most RSEQ membarrier users will expect any stores
done before the membarrier() syscall to be visible to the target
task(s).  While this is extremely likely to be true in practice,
nothing actually guarantees it by a strict reading of the x86 manuals.
Rather than providing this guarantee by accident and potentially
causing a problem down the road, just add an explicit barrier.
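For illustration, the pattern such users are likely to rely on looks
roughly like the sketch below.  The names 'published' and
publish_and_kick() are made up, and the sketch assumes the process has
already registered with MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ
and calls the syscall directly, since glibc has no wrapper:

#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

static int published;	/* data handed off to the other threads */

static void publish_and_kick(void)
{
	published = 1;	/* store done before the membarrier() syscall */
	syscall(__NR_membarrier,
		MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, 0, 0);
	/*
	 * The expectation: any thread whose rseq critical section is
	 * restarted by the resulting IPI also observes published == 1.
	 * The smp_mb() added below makes that explicit rather than
	 * relying on it happening by accident.
	 */
}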
Cc: stable@vger.kernel.org
Signed-off-by: Andy Lutomirski
---
 kernel/sched/membarrier.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 5a40b3828ff2..6251d3d12abe 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -168,6 +168,14 @@ static void ipi_mb(void *info)
 
 static void ipi_rseq(void *info)
 {
+	/*
+	 * Ensure that all stores done by the calling thread are visible
+	 * to the current task before the current task resumes. We could
+	 * probably optimize this away on most architectures, but by the
+	 * time we've already sent an IPI, the cost of the extra smp_mb()
+	 * is negligible.
+	 */
+	smp_mb();
 	rseq_preempt(current);
 }
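As a usage note (a sketch, not part of this series): userspace that
wants to rely on the guarantee above should feature-detect and register
the RSEQ membarrier command first, since older kernels do not implement
it.  setup_membarrier_rseq() below is a made-up helper name, and the
raw syscall is used because glibc provides no wrapper:

#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>

/* Hypothetical setup helper, not part of any existing library. */
static int setup_membarrier_rseq(void)
{
	long mask = syscall(__NR_membarrier, MEMBARRIER_CMD_QUERY, 0, 0);

	if (mask < 0 || !(mask & MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ)) {
		fprintf(stderr, "membarrier RSEQ command not supported\n");
		return -1;
	}

	/* Registration is required once per process before first use. */
	return syscall(__NR_membarrier,
		       MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ, 0, 0);
}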