Message ID | 20221109113044.7ncdw6263o3msycl@techsingularity.net |
---|---|
State | Superseded |
Series | [RFC] x86: Drop fpregs lock before inheriting FPU permissions during clone |
On Wed, Nov 09 2022 at 11:30, Mel Gorman wrote:
> BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
...
> The splat comes from fpu_inherit_perms() being called under fpregs_lock(),
> and us reaching the spin_lock_irq() therein due to fpu_state_size_dynamic()
> returning true despite static key __fpu_state_size_dynamic having never
> been enabled.
>
> Mike's assessment looks correct. fpregs_lock on PREEMPT_RT disables
> preemption only so the spin_lock_irq() in fpu_inherit_perms is unsafe
> and converting siglock to raw spinlock would be an unwelcome change.
> This problem exists since commit 9e798e9aa14c ("x86/fpu: Prepare fpu_clone()
> for dynamically enabled features"). While the bug triggering is probably a
> mistake for the affected machine and due to a bug that is not in mainline,
> spin_lock_irq within a preempt_disable section on PREEMPT_RT is problematic.
>
> In this specific context, it may not be necessary to hold fpregs_lock at
> all. The lock is necessary when editing the FPU registers or a tasks fpstate
> but in this case, the only write of any FP state in fpu_inherit_perms is
> for the new child which is not running yet so it cannot context switch or
> be borrowed by a kernel thread yet. Hence, fpregs_lock is not protecting
> anything in the new child until clone() completes. The siglock still needs
> to be acquired by fpu_inherit_perms as the read of the parents permissions
> has to be serialised.

That's correct and siglock is the real protection for the permissions.

> This is not tested as I did not access to a machine with Intel's
> eXtended Feature Disable (XFD) feature that enables the relevant path
> in fpu_inherit_perms and the bug is against a non-mainline kernel.

It's still entirely correct on mainline as there is no requirement to
hold fpregs_lock in this case

> Reported-by: Mike Galbraith <efault@gmx.de>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

On Wed, Nov 09, 2022 at 05:25:47PM +0100, Thomas Gleixner wrote:
> On Wed, Nov 09 2022 at 11:30, Mel Gorman wrote:
> > BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
> ...
> > The splat comes from fpu_inherit_perms() being called under fpregs_lock(),
> > and us reaching the spin_lock_irq() therein due to fpu_state_size_dynamic()
> > returning true despite static key __fpu_state_size_dynamic having never
> > been enabled.
> >
> > Mike's assessment looks correct. fpregs_lock on PREEMPT_RT disables
> > preemption only so the spin_lock_irq() in fpu_inherit_perms is unsafe
> > and converting siglock to raw spinlock would be an unwelcome change.
> > This problem exists since commit 9e798e9aa14c ("x86/fpu: Prepare fpu_clone()
> > for dynamically enabled features"). While the bug triggering is probably a
> > mistake for the affected machine and due to a bug that is not in mainline,
> > spin_lock_irq within a preempt_disable section on PREEMPT_RT is problematic.
> >
> > In this specific context, it may not be necessary to hold fpregs_lock at
> > all. The lock is necessary when editing the FPU registers or a tasks fpstate
> > but in this case, the only write of any FP state in fpu_inherit_perms is
> > for the new child which is not running yet so it cannot context switch or
> > be borrowed by a kernel thread yet. Hence, fpregs_lock is not protecting
> > anything in the new child until clone() completes. The siglock still needs
> > to be acquired by fpu_inherit_perms as the read of the parents permissions
> > has to be serialised.
>
> That's correct and siglock is the real protection for the permissions.
>
> > This is not tested as I did not access to a machine with Intel's
> > eXtended Feature Disable (XFD) feature that enables the relevant path
> > in fpu_inherit_perms and the bug is against a non-mainline kernel.
>
> It's still entirely correct on mainline as there is no requirement to
> hold fpregs_lock in this case
>
> > Reported-by: Mike Galbraith <efault@gmx.de>
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

Perfect, I'll rephrase the changelog slightly and resend without the RFC
tag and with all the x86 maintainers cc'd.

Thanks Thomas!

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3b28c5b25e12..d00db56a8868 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -605,9 +605,9 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags, bool minimal)
 	if (test_thread_flag(TIF_NEED_FPU_LOAD))
 		fpregs_restore_userregs();
 	save_fpregs_to_fpstate(dst_fpu);
+	fpregs_unlock();
 	if (!(clone_flags & CLONE_THREAD))
 		fpu_inherit_perms(dst_fpu);
-	fpregs_unlock();
 
 	/*
 	 * Children never inherit PASID state.

Mike Galbraith reported the following off-list against an old fork of
preempt-rt, but the same issue likely also applies to current preempt-rt:

  BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
  in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: systemd
  preempt_count: 1, expected: 0
  RCU nest depth: 0, expected: 0
  Preemption disabled at:
  fpu_clone+0xfa/0x480
  CPU: 6 PID: 1 Comm: systemd Tainted: G E (unreleased)
  Call Trace:
   <TASK>
   dump_stack_lvl+0x45/0x5b
   ? fpu_clone+0xfa/0x480
   __might_resched+0x165/0x200
   rt_spin_lock+0x2d/0x70
   fpu_clone+0x32a/0x480
   ? copy_thread+0xef/0x270
   ? copy_process+0xd2c/0x1c00
   ? shmem_alloc_inode+0x16/0x30
   ? kmem_cache_alloc+0x120/0x2a0
   ? kernel_clone+0x9b/0x460
   ? __do_sys_clone+0x72/0xa0
   ? do_syscall_64+0x58/0x80
   ? __x64_sys_rt_sigprocmask+0x93/0xd0
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x67/0x80
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x67/0x80
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x67/0x80
   ? exc_page_fault+0x6a/0x190
   ? entry_SYSCALL_64_after_hwframe+0x61/0xcb
   </TASK>

The splat comes from fpu_inherit_perms() being called under fpregs_lock(),
and us reaching the spin_lock_irq() therein due to fpu_state_size_dynamic()
returning true despite static key __fpu_state_size_dynamic having never
been enabled.

Mike's assessment looks correct. fpregs_lock on PREEMPT_RT disables
preemption only, so the spin_lock_irq() in fpu_inherit_perms is unsafe,
and converting siglock to a raw spinlock would be an unwelcome change.
This problem exists since commit 9e798e9aa14c ("x86/fpu: Prepare fpu_clone()
for dynamically enabled features"). While the bug triggering is probably a
mistake for the affected machine and due to a bug that is not in mainline,
spin_lock_irq within a preempt_disable section on PREEMPT_RT is problematic.

In this specific context, it may not be necessary to hold fpregs_lock at
all. The lock is necessary when editing the FPU registers or a task's fpstate
but in this case, the only write of any FP state in fpu_inherit_perms is
for the new child, which is not running yet, so it cannot context switch or
be borrowed by a kernel thread yet. Hence, fpregs_lock is not protecting
anything in the new child until clone() completes. The siglock still needs
to be acquired by fpu_inherit_perms as the read of the parent's permissions
has to be serialised.

This is not tested as I did not have access to a machine with Intel's
eXtended Feature Disable (XFD) feature that enables the relevant path
in fpu_inherit_perms, and the bug is against a non-mainline kernel.

Reported-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 arch/x86/kernel/fpu/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
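
The ordering problem described in the changelog can be modelled in userspace. The following is a minimal sketch, not kernel code: every `toy_*` name is hypothetical, `toy_preempt_count` merely stands in for the preempt counter, and `toy_siglock_lock()` stands in for a sleeping lock (what spin_lock_irq() becomes on PREEMPT_RT) that complains when taken with preemption disabled. It assumes, as the changelog states, that fpregs_lock on PREEMPT_RT only disables preemption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; this is a userspace model only. */
static int toy_preempt_count;	/* models the preempt counter */
static int toy_splats;		/* counts "sleeping function" style warnings */

/* Models fpregs_lock()/fpregs_unlock() on PREEMPT_RT: preemption only. */
static void toy_fpregs_lock(void)
{
	toy_preempt_count++;
}

static void toy_fpregs_unlock(void)
{
	assert(toy_preempt_count > 0);
	toy_preempt_count--;
}

/* Models a sleeping lock: taking it with preemption disabled is a bug. */
static void toy_siglock_lock(void)
{
	if (toy_preempt_count != 0) {
		toy_splats++;
		fprintf(stderr,
			"toy splat: sleeping lock taken, preempt_count: %d, expected: 0\n",
			toy_preempt_count);
	}
}

static void toy_siglock_unlock(void)
{
}

/* Models fpu_inherit_perms(): reads parent permissions under siglock. */
static void toy_inherit_perms(void)
{
	toy_siglock_lock();
	/* ... copy the parent's permissions into the not-yet-running child ... */
	toy_siglock_unlock();
}

/* Ordering before the patch: permissions inherited under fpregs_lock. */
int toy_clone_old_order(void)
{
	toy_splats = 0;
	toy_fpregs_lock();
	/* ... save the current registers into the child's fpstate ... */
	toy_inherit_perms();
	toy_fpregs_unlock();
	return toy_splats;
}

/* Ordering after the patch: fpregs_lock dropped before inheriting. */
int toy_clone_new_order(void)
{
	toy_splats = 0;
	toy_fpregs_lock();
	/* ... save the current registers into the child's fpstate ... */
	toy_fpregs_unlock();
	toy_inherit_perms();
	return toy_splats;
}
```

Calling `toy_clone_old_order()` from a test harness reports one warning, while `toy_clone_new_order()` reports none. The model only illustrates the lock ordering; it says nothing about whether the real child fpstate needs fpregs_lock, which is the argument the changelog makes separately.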