Message ID | 87k0b0apne.fsf_-_@email.froward.int.ebiederm.org |
---|---|
Headers | show |
Series | ptrace: cleaning up ptrace_stop | expand |
On 2022-05-04 17:40:56 [-0500], Eric W. Biederman wrote: > Stop playing with tsk->__state to remove TASK_WAKEKILL while a ptrace > command is executing. > > Instead remove TASK_WAKEKILL from the definition of TASK_TRACED, and > implemention a new jobctl flag TASK_PTRACE_FROZEN. This new flag is implement ? > set in jobctl_freeze_task and cleared when ptrace_stop is awoken or in > jobctl_unfreeze_task (when ptrace_stop remains asleep). Sebastian
On 05/04, Eric W. Biederman wrote: > > -static int ptrace_stop(int exit_code, int why, int clear_code, > - unsigned long message, kernel_siginfo_t *info) > +static int ptrace_stop(int exit_code, int why, unsigned long message, > + kernel_siginfo_t *info) > __releases(¤t->sighand->siglock) > __acquires(¤t->sighand->siglock) > { > @@ -2259,54 +2259,33 @@ static int ptrace_stop(int exit_code, int why, int clear_code, > > spin_unlock_irq(¤t->sighand->siglock); > read_lock(&tasklist_lock); > - if (likely(current->ptrace)) { > - /* > - * Notify parents of the stop. > - * > - * While ptraced, there are two parents - the ptracer and > - * the real_parent of the group_leader. The ptracer should > - * know about every stop while the real parent is only > - * interested in the completion of group stop. The states > - * for the two don't interact with each other. Notify > - * separately unless they're gonna be duplicates. > - */ > + /* > + * Notify parents of the stop. > + * > + * While ptraced, there are two parents - the ptracer and > + * the real_parent of the group_leader. The ptracer should > + * know about every stop while the real parent is only > + * interested in the completion of group stop. The states > + * for the two don't interact with each other. Notify > + * separately unless they're gonna be duplicates. > + */ > + if (current->ptrace) > do_notify_parent_cldstop(current, true, why); > - if (gstop_done && ptrace_reparented(current)) > - do_notify_parent_cldstop(current, false, why); > + if (gstop_done && (!current->ptrace || ptrace_reparented(current))) > + do_notify_parent_cldstop(current, false, why); > > - /* > - * Don't want to allow preemption here, because > - * sys_ptrace() needs this task to be inactive. > - * > - * XXX: implement read_unlock_no_resched(). > - */ > - preempt_disable(); > - read_unlock(&tasklist_lock); > - cgroup_enter_frozen(); > - preempt_enable_no_resched(); > - freezable_schedule(); > - cgroup_leave_frozen(true); > - } else { > - /* > - * By the time we got the lock, our tracer went away. > - * Don't drop the lock yet, another tracer may come. > - * > - * If @gstop_done, the ptracer went away between group stop > - * completion and here. During detach, it would have set > - * JOBCTL_STOP_PENDING on us and we'll re-enter > - * TASK_STOPPED in do_signal_stop() on return, so notifying > - * the real parent of the group stop completion is enough. > - */ > - if (gstop_done) > - do_notify_parent_cldstop(current, false, why); > - > - /* tasklist protects us from ptrace_freeze_traced() */ > - __set_current_state(TASK_RUNNING); > - read_code = false; > - if (clear_code) > - exit_code = 0; > - read_unlock(&tasklist_lock); > - } > + /* > + * Don't want to allow preemption here, because > + * sys_ptrace() needs this task to be inactive. > + * > + * XXX: implement read_unlock_no_resched(). > + */ > + preempt_disable(); > + read_unlock(&tasklist_lock); > + cgroup_enter_frozen(); > + preempt_enable_no_resched(); > + freezable_schedule(); I must have missed something. So the tracee calls ptrace_notify() but debugger goes away before the ptrace_notify() takes siglock. After that the no longer traced task will sleep in TASK_TRACED ? Looks like ptrace_stop() needs to check current->ptrace before it does set_special_state(TASK_TRACED) with siglock held? Then we can rely on ptrace_unlink() which will wake the tracee up even if debugger exits. No? Oleg.
On 05/04, Eric W. Biederman wrote: > > With the removal of the incomplete detection of the tracer going away > in ptrace_stop, ptrace_stop always sleeps in schedule after > ptrace_freeze_traced succeeds. Modify ptrace_check_attach to > warn if wait_task_inactive fails. Oh. Again, I don't understand the changelog. If we forget about RT, ptrace_stop() will always sleep if ptrace_freeze_traced() succeeds. may_ptrace_stop() has gone. IOW. Lets forget about RT > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -266,17 +266,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state) > } > read_unlock(&tasklist_lock); > > - if (!ret && !ignore_state) { > - if (!wait_task_inactive(child, __TASK_TRACED)) { > - /* > - * This can only happen if may_ptrace_stop() fails and > - * ptrace_stop() changes ->state back to TASK_RUNNING, > - * so we should not worry about leaking __TASK_TRACED. > - */ > - WARN_ON(READ_ONCE(child->__state) == __TASK_TRACED); > - ret = -ESRCH; > - } > - } > + if (!ret && !ignore_state && > + WARN_ON_ONCE(!wait_task_inactive(child, __TASK_TRACED))) > + ret = -ESRCH; > > return ret; > } Why do you think this change would be wrong without any other changes? Oleg.
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes: > On 2022-05-04 17:40:56 [-0500], Eric W. Biederman wrote: >> Stop playing with tsk->__state to remove TASK_WAKEKILL while a ptrace >> command is executing. >> >> Instead remove TASK_WAKEKILL from the definition of TASK_TRACED, and >> implemention a new jobctl flag TASK_PTRACE_FROZEN. This new flag is > implement ? Yes. Thank you. Eric
Oleg Nesterov <oleg@redhat.com> writes: > On 05/04, Eric W. Biederman wrote: >> >> -static int ptrace_stop(int exit_code, int why, int clear_code, >> - unsigned long message, kernel_siginfo_t *info) >> +static int ptrace_stop(int exit_code, int why, unsigned long message, >> + kernel_siginfo_t *info) >> __releases(¤t->sighand->siglock) >> __acquires(¤t->sighand->siglock) >> { >> @@ -2259,54 +2259,33 @@ static int ptrace_stop(int exit_code, int why, int clear_code, >> >> spin_unlock_irq(¤t->sighand->siglock); >> read_lock(&tasklist_lock); >> - if (likely(current->ptrace)) { >> - /* >> - * Notify parents of the stop. >> - * >> - * While ptraced, there are two parents - the ptracer and >> - * the real_parent of the group_leader. The ptracer should >> - * know about every stop while the real parent is only >> - * interested in the completion of group stop. The states >> - * for the two don't interact with each other. Notify >> - * separately unless they're gonna be duplicates. >> - */ >> + /* >> + * Notify parents of the stop. >> + * >> + * While ptraced, there are two parents - the ptracer and >> + * the real_parent of the group_leader. The ptracer should >> + * know about every stop while the real parent is only >> + * interested in the completion of group stop. The states >> + * for the two don't interact with each other. Notify >> + * separately unless they're gonna be duplicates. >> + */ >> + if (current->ptrace) >> do_notify_parent_cldstop(current, true, why); >> - if (gstop_done && ptrace_reparented(current)) >> - do_notify_parent_cldstop(current, false, why); >> + if (gstop_done && (!current->ptrace || ptrace_reparented(current))) >> + do_notify_parent_cldstop(current, false, why); >> >> - /* >> - * Don't want to allow preemption here, because >> - * sys_ptrace() needs this task to be inactive. >> - * >> - * XXX: implement read_unlock_no_resched(). >> - */ >> - preempt_disable(); >> - read_unlock(&tasklist_lock); >> - cgroup_enter_frozen(); >> - preempt_enable_no_resched(); >> - freezable_schedule(); >> - cgroup_leave_frozen(true); >> - } else { >> - /* >> - * By the time we got the lock, our tracer went away. >> - * Don't drop the lock yet, another tracer may come. >> - * >> - * If @gstop_done, the ptracer went away between group stop >> - * completion and here. During detach, it would have set >> - * JOBCTL_STOP_PENDING on us and we'll re-enter >> - * TASK_STOPPED in do_signal_stop() on return, so notifying >> - * the real parent of the group stop completion is enough. >> - */ >> - if (gstop_done) >> - do_notify_parent_cldstop(current, false, why); >> - >> - /* tasklist protects us from ptrace_freeze_traced() */ >> - __set_current_state(TASK_RUNNING); >> - read_code = false; >> - if (clear_code) >> - exit_code = 0; >> - read_unlock(&tasklist_lock); >> - } >> + /* >> + * Don't want to allow preemption here, because >> + * sys_ptrace() needs this task to be inactive. >> + * >> + * XXX: implement read_unlock_no_resched(). >> + */ >> + preempt_disable(); >> + read_unlock(&tasklist_lock); >> + cgroup_enter_frozen(); >> + preempt_enable_no_resched(); >> + freezable_schedule(); > > I must have missed something. > > So the tracee calls ptrace_notify() but debugger goes away before the > ptrace_notify() takes siglock. After that the no longer traced task > will sleep in TASK_TRACED ? > > Looks like ptrace_stop() needs to check current->ptrace before it does > set_special_state(TASK_TRACED) with siglock held? Then we can rely on > ptrace_unlink() which will wake the tracee up even if debugger exits. > > No? Hmm. If the debugger goes away when siglock is dropped and reaquired at the top of ptrace_stop, that would appear to set the debugged process up to sleep indefinitely. I was thinking of the SIGKILL case which is handled. Thank you for catching that. Eric
Oleg Nesterov <oleg@redhat.com> writes: > On 05/04, Eric W. Biederman wrote: >> >> With the removal of the incomplete detection of the tracer going away >> in ptrace_stop, ptrace_stop always sleeps in schedule after >> ptrace_freeze_traced succeeds. Modify ptrace_check_attach to >> warn if wait_task_inactive fails. > > Oh. Again, I don't understand the changelog. If we forget about RT, > ptrace_stop() will always sleep if ptrace_freeze_traced() succeeds. > may_ptrace_stop() has gone. > > IOW. Lets forget about RT > >> --- a/kernel/ptrace.c >> +++ b/kernel/ptrace.c >> @@ -266,17 +266,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state) >> } >> read_unlock(&tasklist_lock); >> >> - if (!ret && !ignore_state) { >> - if (!wait_task_inactive(child, __TASK_TRACED)) { >> - /* >> - * This can only happen if may_ptrace_stop() fails and >> - * ptrace_stop() changes ->state back to TASK_RUNNING, >> - * so we should not worry about leaking __TASK_TRACED. >> - */ >> - WARN_ON(READ_ONCE(child->__state) == __TASK_TRACED); >> - ret = -ESRCH; >> - } >> - } >> + if (!ret && !ignore_state && >> + WARN_ON_ONCE(!wait_task_inactive(child, __TASK_TRACED))) >> + ret = -ESRCH; >> >> return ret; >> } > > Why do you think this change would be wrong without any other changes? For purposes of this analysis ptrace_detach and ptrace_exit (when the tracer exits) can't happen. So the bug you spotted in ptrace_stop does not apply. I was thinking that the test against !current->ptrace that replaced the old may_ptrace_stop could trigger a failure here. If the ptrace_freeze_traced happens before that test that branch clearly can not happen. *Looks twice* Both ptrace_check_attach and ptrace_stop taking a read_lock on tasklist_lock does not protect against concurrency by each other, but the write_lock on tasklist_lock in ptrace_attach does protect against a ptrace_attach coming in after the test and before __set_current_state(TASK_RUNNING). So yes. I should really split that part out into it's own patch. And yes that WARN_ON_ONCE can trigger on PREEMPT_RT but that is just because PREMPT_RT is currently broken with respect to ptrace. Which makes a WARN_ON_ONCE appropriate. I will see how much of this analysis I can put in the changelog. Thank you, Eric
On 05/05, Eric W. Biederman wrote: > > And yes that WARN_ON_ONCE can trigger on PREEMPT_RT but that is just > because PREMPT_RT is currently broken with respect to ptrace. Which > makes a WARN_ON_ONCE appropriate. Yes agreed. In this case WARN_ON_ONCE() can help a user to understand that a failure was caused by the kernel problem which we need to fix anyway. Oleg.
Oleg Nesterov <oleg@redhat.com> writes: > On 05/04, Eric W. Biederman wrote: >> >> -static int ptrace_stop(int exit_code, int why, int clear_code, >> - unsigned long message, kernel_siginfo_t *info) >> +static int ptrace_stop(int exit_code, int why, unsigned long message, >> + kernel_siginfo_t *info) > > Forgot to mention... but in general I like this change. > > In particular, I like the fact it kills the ugly "int clear_code" arg > which looks as if it solves the problems with the exiting tracer, but > actually it doesn't. And we do not really care, imo. Further either this change is necessary or we need to take siglock in the !current->ptrace path in "ptrace: Don't change __state" so that JOBCTL_TRACED can be cleared. So I vote for deleting code, and making ptrace_stop easier to reason about. Eric
On 05/05, Eric W. Biederman wrote: > > So I vote for deleting code, and making ptrace_stop easier to reason > about. Yes, yes, agreed. Oleg.