From patchwork Thu Nov 10 12:44:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mel Gorman
X-Patchwork-Id: 624158
Date: Thu, 10 Nov 2022 12:44:00 +0000
From: Mel Gorman
To: Thomas Gleixner
Cc: "Chang S. Bae", Borislav Petkov, Ingo Molnar, Dave Hansen,
 Mike Galbraith, LKML, Linux-X86, Linux-RT
Subject: [PATCH] x86: Drop fpregs lock before inheriting FPU permissions
Message-ID: <20221110124400.zgymc2lnwqjukgfh@techsingularity.net>
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-rt-users@vger.kernel.org

Mike Galbraith reported the following against an old fork of preempt-rt
but the same issue also applies to the current preempt-rt tree.

  BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: systemd
  preempt_count: 1, expected: 0
  RCU nest depth: 0, expected: 0
  Preemption disabled at:
  fpu_clone+0xfa/0x480
  CPU: 6 PID: 1 Comm: systemd Tainted: G E (unreleased)
  Call Trace:
   dump_stack_lvl+0x45/0x5b
   ? fpu_clone+0xfa/0x480
   __might_resched+0x165/0x200
   rt_spin_lock+0x2d/0x70
   fpu_clone+0x32a/0x480
   ? copy_thread+0xef/0x270
   ? copy_process+0xd2c/0x1c00
   ? shmem_alloc_inode+0x16/0x30
   ? kmem_cache_alloc+0x120/0x2a0
   ? kernel_clone+0x9b/0x460
   ? __do_sys_clone+0x72/0xa0
   ? do_syscall_64+0x58/0x80
   ? __x64_sys_rt_sigprocmask+0x93/0xd0
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x67/0x80
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x67/0x80
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x67/0x80
   ? exc_page_fault+0x6a/0x190
   ? entry_SYSCALL_64_after_hwframe+0x61/0xcb

  The splat comes from fpu_inherit_perms() being called under
  fpregs_lock(), and us reaching the spin_lock_irq() therein due to
  fpu_state_size_dynamic() returning true despite static key
  __fpu_state_size_dynamic having never been enabled.

Mike's assessment looks correct. fpregs_lock on a PREEMPT_RT kernel
disables preemption, so calling spin_lock_irq() in fpu_inherit_perms()
is unsafe. This problem has existed since commit 9e798e9aa14c
("x86/fpu: Prepare fpu_clone() for dynamically enabled features").

Even though the original bug report should not have enabled the paths at
all, the bug still exists. fpregs_lock is necessary when editing the FPU
registers or a task's FP state, but it is not necessary for
fpu_inherit_perms(). The only write of any FP state in
fpu_inherit_perms() is to the new child, which is not yet running and
cannot context switch or be borrowed by a kernel thread. Hence,
fpregs_lock is not protecting anything in the new child until clone()
completes and can be dropped earlier. The siglock still needs to be
acquired by fpu_inherit_perms() as the read of the parent's permissions
has to be serialised.

Reported-by: Mike Galbraith
Signed-off-by: Mel Gorman
Reviewed-by: Thomas Gleixner
---
 arch/x86/kernel/fpu/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3b28c5b25e12..d00db56a8868 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -605,9 +605,9 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal)
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		fpregs_restore_userregs();
 	save_fpregs_to_fpstate(dst_fpu);
+	fpregs_unlock();
 	if (!(clone_flags & CLONE_THREAD))
 		fpu_inherit_perms(dst_fpu);
-	fpregs_unlock();
 
 	/*
 	 * Children never inherit PASID state.