Message ID | 20220426225211.308418-7-ebiederm@xmission.com |
---|---|
State | New |
Headers | show |
Series | ptrace: cleaning up ptrace_stop | expand |
"Eric W. Biederman" <ebiederm@xmission.com> writes: > Asking wait_task_inactive to verify that tsk->__state == __TASK_TRACED > was needed to detect the when ptrace_stop would decide not to stop > after calling "set_special_state(TASK_TRACED)". With the recent > cleanups ptrace_stop will always stop after calling set_special_state. > > Take advatnage of this by no longer asking wait_task_inactive to > verify the state. If a bug is hit and wait_task_inactive does not > succeed warn and return -ESRCH. As Oleg noticed upthread there are more reasons than simply !current->ptrace for wait_task_inactive to fail. In particular a fatal signal can be received any time before JOBCTL_DELAY_SIGKILL. So this change is not safe. I will respin this one. Eric > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> > --- > kernel/ptrace.c | 14 +++----------- > 1 file changed, 3 insertions(+), 11 deletions(-) > > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index 16d1a84a2cae..0634da7ac685 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -265,17 +265,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state) > } > read_unlock(&tasklist_lock); > > - if (!ret && !ignore_state) { > - if (!wait_task_inactive(child, __TASK_TRACED)) { > - /* > - * This can only happen if may_ptrace_stop() fails and > - * ptrace_stop() changes ->state back to TASK_RUNNING, > - * so we should not worry about leaking __TASK_TRACED. > - */ > - WARN_ON(READ_ONCE(child->__state) == __TASK_TRACED); > - ret = -ESRCH; > - } > - } > + if (!ret && !ignore_state && > + WARN_ON_ONCE(!wait_task_inactive(child, 0))) > + ret = -ESRCH; > > return ret; > } Eric
"Eric W. Biederman" <ebiederm@xmission.com> writes: > "Eric W. Biederman" <ebiederm@xmission.com> writes: > >> Asking wait_task_inactive to verify that tsk->__state == __TASK_TRACED >> was needed to detect the when ptrace_stop would decide not to stop >> after calling "set_special_state(TASK_TRACED)". With the recent >> cleanups ptrace_stop will always stop after calling set_special_state. >> >> Take advatnage of this by no longer asking wait_task_inactive to >> verify the state. If a bug is hit and wait_task_inactive does not >> succeed warn and return -ESRCH. > > As Oleg noticed upthread there are more reasons than simply > !current->ptrace for wait_task_inactive to fail. In particular a fatal > signal can be received any time before JOBCTL_DELAY_SIGKILL. > > So this change is not safe. I will respin this one. Bah. I definitely need to update the description so there is going to be a v2. I confused myself. This change is safe because ptrace_freeze_traced fails if there is a pending fatal signal, and arranges that no new fatal signals will wake up the task. Eric >> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> >> --- >> kernel/ptrace.c | 14 +++----------- >> 1 file changed, 3 insertions(+), 11 deletions(-) >> >> diff --git a/kernel/ptrace.c b/kernel/ptrace.c >> index 16d1a84a2cae..0634da7ac685 100644 >> --- a/kernel/ptrace.c >> +++ b/kernel/ptrace.c >> @@ -265,17 +265,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state) >> } >> read_unlock(&tasklist_lock); >> >> - if (!ret && !ignore_state) { >> - if (!wait_task_inactive(child, __TASK_TRACED)) { >> - /* >> - * This can only happen if may_ptrace_stop() fails and >> - * ptrace_stop() changes ->state back to TASK_RUNNING, >> - * so we should not worry about leaking __TASK_TRACED. >> - */ >> - WARN_ON(READ_ONCE(child->__state) == __TASK_TRACED); >> - ret = -ESRCH; >> - } >> - } >> + if (!ret && !ignore_state && >> + WARN_ON_ONCE(!wait_task_inactive(child, 0))) >> + ret = -ESRCH; >> >> return ret; >> } > > Eric
On 04/26, Eric W. Biederman wrote: > > Asking wait_task_inactive to verify that tsk->__state == __TASK_TRACED > was needed to detect the when ptrace_stop would decide not to stop > after calling "set_special_state(TASK_TRACED)". With the recent > cleanups ptrace_stop will always stop after calling set_special_state. > > Take advatnage of this by no longer asking wait_task_inactive to > verify the state. If a bug is hit and wait_task_inactive does not > succeed warn and return -ESRCH. ACK, but I think that the changelog is wrong. We could do this right after may_ptrace_stop() has gone. This doesn't depend on the previous changes in this series. Oleg.
On Wed, Apr 27, 2022 at 05:14:57PM +0200, Oleg Nesterov wrote: > On 04/26, Eric W. Biederman wrote: > > > > Asking wait_task_inactive to verify that tsk->__state == __TASK_TRACED > > was needed to detect the when ptrace_stop would decide not to stop > > after calling "set_special_state(TASK_TRACED)". With the recent > > cleanups ptrace_stop will always stop after calling set_special_state. > > > > Take advatnage of this by no longer asking wait_task_inactive to > > verify the state. If a bug is hit and wait_task_inactive does not > > succeed warn and return -ESRCH. > > ACK, but I think that the changelog is wrong. > > We could do this right after may_ptrace_stop() has gone. This doesn't > depend on the previous changes in this series. It very much does rely on there not being any blocking between set_special_state() and schedule() tho. So all those PREEMPT_RT spinlock->rt_mutex things need to be gone. That is also the reason I couldn't do wait_task_inactive(task, 0) in the other patch, I had to really match 'TASK_TRACED or TASK_FROZEN' any other state must fail (specifically TASK_RTLOCK_WAIT must not match).
On 04/28, Peter Zijlstra wrote: > > On Wed, Apr 27, 2022 at 05:14:57PM +0200, Oleg Nesterov wrote: > > On 04/26, Eric W. Biederman wrote: > > > > > > Asking wait_task_inactive to verify that tsk->__state == __TASK_TRACED > > > was needed to detect the when ptrace_stop would decide not to stop > > > after calling "set_special_state(TASK_TRACED)". With the recent > > > cleanups ptrace_stop will always stop after calling set_special_state. > > > > > > Take advatnage of this by no longer asking wait_task_inactive to > > > verify the state. If a bug is hit and wait_task_inactive does not > > > succeed warn and return -ESRCH. > > > > ACK, but I think that the changelog is wrong. > > > > We could do this right after may_ptrace_stop() has gone. This doesn't > > depend on the previous changes in this series. > > It very much does rely on there not being any blocking between > set_special_state() and schedule() tho. So all those PREEMPT_RT > spinlock->rt_mutex things need to be gone. Yes sure. But this patch doesn't add the new problems, imo. Yes we can hit the WARN_ON_ONCE(!wait_task_inactive()), but this is correct in that it should not fail, and this is what we need to fix. > That is also the reason I couldn't do wait_task_inactive(task, 0) Ah, I din't notice this patch uses wait_task_inactive(child, 0), I think it should do wait_task_inactive(child, __TASK_TRACED). Oleg.
On Thu, Apr 28, 2022 at 01:19:11PM +0200, Oleg Nesterov wrote: > > That is also the reason I couldn't do wait_task_inactive(task, 0) > > Ah, I din't notice this patch uses wait_task_inactive(child, 0), > I think it should do wait_task_inactive(child, __TASK_TRACED). Shouldn't we then switch wait_task_inactive() so have & matching instead of the current ==.
On 04/28, Peter Zijlstra wrote: > > On Thu, Apr 28, 2022 at 01:19:11PM +0200, Oleg Nesterov wrote: > > > That is also the reason I couldn't do wait_task_inactive(task, 0) > > > > Ah, I din't notice this patch uses wait_task_inactive(child, 0), > > I think it should do wait_task_inactive(child, __TASK_TRACED). > > Shouldn't we then switch wait_task_inactive() so have & matching instead > of the current ==. Sorry, I don't understand the context... As long as ptrace_freeze_traced() sets __state == __TASK_TRACED (as it currently does) wait_task_inactive(__TASK_TRACED) is what we need ? After we change it to use JOBCTL_DELAY_WAKEKILL and not abuse __state, ptrace_attach() should use wait_task_inactive(TASK_TRACED), but this depends on what exactly we are going to do... Oleg.
On Thu, Apr 28, 2022 at 04:57:50PM +0200, Oleg Nesterov wrote: > > Shouldn't we then switch wait_task_inactive() so have & matching instead > > of the current ==. > > Sorry, I don't understand the context... This.. I've always found it strange to have wti use a different matching scheme from ttwu. diff --git a/kernel/sched/core.c b/kernel/sched/core.c index f259621f4c93..c039aef4c8fe 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3304,7 +3304,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state * is actually now running somewhere else! */ while (task_running(rq, p)) { - if (match_state && unlikely(READ_ONCE(p->__state) != match_state)) + if (match_state && unlikely(!(READ_ONCE(p->__state) & match_state))) return 0; cpu_relax(); } @@ -3319,7 +3319,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state running = task_running(rq, p); queued = task_on_rq_queued(p); ncsw = 0; - if (!match_state || READ_ONCE(p->__state) == match_state) + if (!match_state || (READ_ONCE(p->__state) & match_state)) ncsw = p->nvcsw | LONG_MIN; /* sets MSB */ task_rq_unlock(rq, p, &rf);
On 04/28, Peter Zijlstra wrote: > > On Thu, Apr 28, 2022 at 04:57:50PM +0200, Oleg Nesterov wrote: > > > > Shouldn't we then switch wait_task_inactive() so have & matching instead > > > of the current ==. > > > > Sorry, I don't understand the context... > > This.. I've always found it strange to have wti use a different matching > scheme from ttwu. Ah. This is what I understood (and I too thought about this), just I meant that this patch from Eric (assuming wait_task_inactive() still uses __TASK_TRACED) is fine without your change below. Oleg. > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index f259621f4c93..c039aef4c8fe 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -3304,7 +3304,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state > * is actually now running somewhere else! > */ > while (task_running(rq, p)) { > - if (match_state && unlikely(READ_ONCE(p->__state) != match_state)) > + if (match_state && unlikely(!(READ_ONCE(p->__state) & match_state))) > return 0; > cpu_relax(); > } > @@ -3319,7 +3319,7 @@ unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state > running = task_running(rq, p); > queued = task_on_rq_queued(p); > ncsw = 0; > - if (!match_state || READ_ONCE(p->__state) == match_state) > + if (!match_state || (READ_ONCE(p->__state) & match_state)) > ncsw = p->nvcsw | LONG_MIN; /* sets MSB */ > task_rq_unlock(rq, p, &rf);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 16d1a84a2cae..0634da7ac685 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -265,17 +265,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state) } read_unlock(&tasklist_lock); - if (!ret && !ignore_state) { - if (!wait_task_inactive(child, __TASK_TRACED)) { - /* - * This can only happen if may_ptrace_stop() fails and - * ptrace_stop() changes ->state back to TASK_RUNNING, - * so we should not worry about leaking __TASK_TRACED. - */ - WARN_ON(READ_ONCE(child->__state) == __TASK_TRACED); - ret = -ESRCH; - } - } + if (!ret && !ignore_state && + WARN_ON_ONCE(!wait_task_inactive(child, 0))) + ret = -ESRCH; return ret; }
Asking wait_task_inactive to verify that tsk->__state == __TASK_TRACED was needed to detect the when ptrace_stop would decide not to stop after calling "set_special_state(TASK_TRACED)". With the recent cleanups ptrace_stop will always stop after calling set_special_state. Take advatnage of this by no longer asking wait_task_inactive to verify the state. If a bug is hit and wait_task_inactive does not succeed warn and return -ESRCH. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- kernel/ptrace.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)