mbox series

[v4,0/12] ptrace: cleaning up ptrace_stop

Message ID 87a6bv6dl6.fsf_-_@email.froward.int.ebiederm.org
Headers show
Series ptrace: cleaning up ptrace_stop | expand

Message

Eric W. Biederman May 5, 2022, 6:25 p.m. UTC
The states TASK_STOPPED and TASK_TRACE are special in they can not
handle spurious wake-ups.  This plus actively depending upon and
changing the value of tsk->__state causes problems for PREEMPT_RT and
Peter's freezer rewrite.

There are a lot of details we have to get right to sort out the
technical challenges and this is my parred back version of the changes
that contains just those problems I see good solutions to that I believe
are ready.

A couple of issues have been pointed but I think this parred back set of
changes is still on the right track.  The biggest change in v4 is the
split of "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs" into
two patches because the dependency I thought exited between two
different changes did not exist.  The rest of the changes are minor
tweaks to "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs";
removing an always true branch, and adding an early  test to see if the
ptracer had gone, before TASK_TRAPPING was set.

This set of changes should support Peter's freezer rewrite, and with the
addition of changing wait_task_inactive(TASK_TRACED) to be
wait_task_inactive(0) in ptrace_check_attach I don't think there are any
races or issues to be concerned about from the ptrace side.

More work is needed to support PREEMPT_RT, but these changes get things
closer.

This set of changes continues to look like it will provide a firm
foundation for solving the PREEMPT_RT and freezer challenges.

Eric W. Biederman (11):
      signal: Rename send_signal send_signal_locked
      signal: Replace __group_send_sig_info with send_signal_locked
      ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
      ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
      ptrace: Remove arch_ptrace_attach
      signal: Use lockdep_assert_held instead of assert_spin_locked
      ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
      ptrace: Document that wait_task_inactive can't fail
      ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
      ptrace: Don't change __state
      ptrace: Always take siglock in ptrace_resume

Peter Zijlstra (1):
      sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

 arch/ia64/include/asm/ptrace.h    |   4 --
 arch/ia64/kernel/ptrace.c         |  57 ----------------
 arch/um/include/asm/thread_info.h |   2 +
 arch/um/kernel/exec.c             |   2 +-
 arch/um/kernel/process.c          |   2 +-
 arch/um/kernel/ptrace.c           |   8 +--
 arch/um/kernel/signal.c           |   4 +-
 arch/x86/kernel/step.c            |   3 +-
 arch/xtensa/kernel/ptrace.c       |   4 +-
 arch/xtensa/kernel/signal.c       |   4 +-
 drivers/tty/tty_jobctrl.c         |   4 +-
 include/linux/ptrace.h            |   7 --
 include/linux/sched.h             |  10 ++-
 include/linux/sched/jobctl.h      |   8 +++
 include/linux/sched/signal.h      |  20 ++++--
 include/linux/signal.h            |   3 +-
 kernel/ptrace.c                   |  87 ++++++++---------------
 kernel/sched/core.c               |   5 +-
 kernel/signal.c                   | 140 +++++++++++++++++---------------------
 kernel/time/posix-cpu-timers.c    |   6 +-
 20 files changed, 140 insertions(+), 240 deletions(-)

Eric

Comments

Oleg Nesterov May 6, 2022, 2:14 p.m. UTC | #1
On 05/05, Eric W. Biederman wrote:
>
> Eric W. Biederman (11):
>       signal: Rename send_signal send_signal_locked
>       signal: Replace __group_send_sig_info with send_signal_locked
>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>       ptrace: Remove arch_ptrace_attach
>       signal: Use lockdep_assert_held instead of assert_spin_locked
>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>       ptrace: Document that wait_task_inactive can't fail
>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>       ptrace: Don't change __state
>       ptrace: Always take siglock in ptrace_resume
>
> Peter Zijlstra (1):
>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

I can't comment 5/12. to be honest I didn't even try to look into
arch/ia64/.

But other than that I see no problems in this version. However, I'd
like to actually apply the whole series and read the changed code
carefully, but sorry, I don't think I can do this before Monday.

Oleg.
Eric W. Biederman May 6, 2022, 2:38 p.m. UTC | #2
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/05, Eric W. Biederman wrote:
>>
>> Eric W. Biederman (11): signal: Rename send_signal send_signal_locked
>> signal: Replace __group_send_sig_info with send_signal_locked
>> ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP ptrace/xtensa:
>> Replace PT_SINGLESTEP with TIF_SINGLESTEP ptrace: Remove
>> arch_ptrace_attach signal: Use lockdep_assert_held instead of
>> assert_spin_locked ptrace: Reimplement PTRACE_KILL by always sending
>> SIGKILL ptrace: Document that wait_task_inactive can't fail ptrace:
>> Admit ptrace_stop can generate spuriuos SIGTRAPs ptrace: Don't change
>> __state ptrace: Always take siglock in ptrace_resume
>>
>> Peter Zijlstra (1):
>>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
>
> I can't comment 5/12. to be honest I didn't even try to look into
> arch/ia64/.

I just looked at arch_ptrace_attach again and I spotted what looks like
a fairly easy analysis that is mostly arch-generic code that shows this
is dead code on ia64.

On ia64 arch_ptrace_attach is ptrace_attach_sync_user_rbs, and does
nothing if __state is not TASK_STOPPED.

When arch_ptrace_attach is called after ptrace_traceme __state is
TASK_RUNNING pretty much by definition as we are running in the
child.  Therefore ptrace_attach_sync_user_rbs does nothing in that case.

When arch_ptrace_attach is called after ptrace_attach __state there
are two possibilities.  If the tracee was already in TASK_STOPPED
before the ptrace_attach, the tracee will be in TASK_TRACED.
Otherwise the tracee will be in TASK_TRACED or on it's way to stopping
in TASK_TRACED.

Unless I totally misread ptrace_attach.  There is no way that after
a successful ptrace_attach for the tracee to be in TASK_STOPPED.
This makes ptrace_attach_sync_user_rbs a big noop, AKA dead code.
So it can be removed.

> But other than that I see no problems in this version. However, I'd
> like to actually apply the whole series and read the changed code
> carefully, but sorry, I don't think I can do this before Monday.

No rush.  I don't expect the merge window will open for a while yet.

Eric
Kees Cook May 6, 2022, 9:26 p.m. UTC | #3
On Thu, May 05, 2022 at 01:25:57PM -0500, Eric W. Biederman wrote:
> The states TASK_STOPPED and TASK_TRACE are special in they can not
> handle spurious wake-ups.  This plus actively depending upon and
> changing the value of tsk->__state causes problems for PREEMPT_RT and
> Peter's freezer rewrite.
> 
> There are a lot of details we have to get right to sort out the
> technical challenges and this is my parred back version of the changes
> that contains just those problems I see good solutions to that I believe
> are ready.
> 
> A couple of issues have been pointed but I think this parred back set of
> changes is still on the right track.  The biggest change in v4 is the
> split of "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs" into
> two patches because the dependency I thought exited between two
> different changes did not exist.  The rest of the changes are minor
> tweaks to "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs";
> removing an always true branch, and adding an early  test to see if the
> ptracer had gone, before TASK_TRAPPING was set.
> 
> This set of changes should support Peter's freezer rewrite, and with the
> addition of changing wait_task_inactive(TASK_TRACED) to be
> wait_task_inactive(0) in ptrace_check_attach I don't think there are any
> races or issues to be concerned about from the ptrace side.
> 
> More work is needed to support PREEMPT_RT, but these changes get things
> closer.
> 
> This set of changes continues to look like it will provide a firm
> foundation for solving the PREEMPT_RT and freezer challenges.

One of the more sensitive projects to changes around ptrace is rr
(Robert and Kyle added to CC). I ran rr's selftests before/after this
series and saw no changes. My failures remained the same; I assume
they're due to missing CPU features (pkeys) or build configs (bpf), etc:

99% tests passed, 19 tests failed out of 2777

Total Test time (real) = 773.40 sec

The following tests FAILED:
         42 - bpf_map (Failed)
         43 - bpf_map-no-syscallbuf (Failed)
        414 - netfilter (Failed)
        415 - netfilter-no-syscallbuf (Failed)
        454 - x86/pkeys (Failed)
        455 - x86/pkeys-no-syscallbuf (Failed)
        1152 - ttyname (Failed)
        1153 - ttyname-no-syscallbuf (Failed)
        1430 - bpf_map-32 (Failed)
        1431 - bpf_map-32-no-syscallbuf (Failed)
        1502 - detach_sigkill-32 (Failed)
        1802 - netfilter-32 (Failed)
        1803 - netfilter-32-no-syscallbuf (Failed)
        1842 - x86/pkeys-32 (Failed)
        1843 - x86/pkeys-32-no-syscallbuf (Failed)
        2316 - crash_in_function-32 (Failed)
        2317 - crash_in_function-32-no-syscallbuf (Failed)
        2540 - ttyname-32 (Failed)
        2541 - ttyname-32-no-syscallbuf (Failed)

So, I guess:

Tested-by: Kees Cook <keescook@chromium.org>

:)
Eric W. Biederman May 6, 2022, 9:59 p.m. UTC | #4
Kees Cook <keescook@chromium.org> writes:

> On Thu, May 05, 2022 at 01:25:57PM -0500, Eric W. Biederman wrote:
>> The states TASK_STOPPED and TASK_TRACE are special in they can not
>> handle spurious wake-ups.  This plus actively depending upon and
>> changing the value of tsk->__state causes problems for PREEMPT_RT and
>> Peter's freezer rewrite.
>> 
>> There are a lot of details we have to get right to sort out the
>> technical challenges and this is my parred back version of the changes
>> that contains just those problems I see good solutions to that I believe
>> are ready.
>> 
>> A couple of issues have been pointed but I think this parred back set of
>> changes is still on the right track.  The biggest change in v4 is the
>> split of "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs" into
>> two patches because the dependency I thought exited between two
>> different changes did not exist.  The rest of the changes are minor
>> tweaks to "ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs";
>> removing an always true branch, and adding an early  test to see if the
>> ptracer had gone, before TASK_TRAPPING was set.
>> 
>> This set of changes should support Peter's freezer rewrite, and with the
>> addition of changing wait_task_inactive(TASK_TRACED) to be
>> wait_task_inactive(0) in ptrace_check_attach I don't think there are any
>> races or issues to be concerned about from the ptrace side.
>> 
>> More work is needed to support PREEMPT_RT, but these changes get things
>> closer.
>> 
>> This set of changes continues to look like it will provide a firm
>> foundation for solving the PREEMPT_RT and freezer challenges.
>
> One of the more sensitive projects to changes around ptrace is rr
> (Robert and Kyle added to CC). I ran rr's selftests before/after this
> series and saw no changes. My failures remained the same; I assume
> they're due to missing CPU features (pkeys) or build configs (bpf), etc:
>
> 99% tests passed, 19 tests failed out of 2777
>
> Total Test time (real) = 773.40 sec
>
> The following tests FAILED:
>          42 - bpf_map (Failed)
>          43 - bpf_map-no-syscallbuf (Failed)
>         414 - netfilter (Failed)
>         415 - netfilter-no-syscallbuf (Failed)
>         454 - x86/pkeys (Failed)
>         455 - x86/pkeys-no-syscallbuf (Failed)
>         1152 - ttyname (Failed)
>         1153 - ttyname-no-syscallbuf (Failed)
>         1430 - bpf_map-32 (Failed)
>         1431 - bpf_map-32-no-syscallbuf (Failed)
>         1502 - detach_sigkill-32 (Failed)
>         1802 - netfilter-32 (Failed)
>         1803 - netfilter-32-no-syscallbuf (Failed)
>         1842 - x86/pkeys-32 (Failed)
>         1843 - x86/pkeys-32-no-syscallbuf (Failed)
>         2316 - crash_in_function-32 (Failed)
>         2317 - crash_in_function-32-no-syscallbuf (Failed)
>         2540 - ttyname-32 (Failed)
>         2541 - ttyname-32-no-syscallbuf (Failed)
>
> So, I guess:
>
> Tested-by: Kees Cook <keescook@chromium.org>
>
> :)

Thank you.  I was thinking it would be good to add the rr folks to the
discussion.

Eric
Oleg Nesterov May 10, 2022, 2:11 p.m. UTC | #5
On 05/05, Eric W. Biederman wrote:
>
> Eric W. Biederman (11):
>       signal: Rename send_signal send_signal_locked
>       signal: Replace __group_send_sig_info with send_signal_locked
>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>       ptrace: Remove arch_ptrace_attach
>       signal: Use lockdep_assert_held instead of assert_spin_locked
>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>       ptrace: Document that wait_task_inactive can't fail
>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>       ptrace: Don't change __state
>       ptrace: Always take siglock in ptrace_resume
>
> Peter Zijlstra (1):
>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state

OK, lgtm.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>


I still dislike you removed TASK_WAKEKILL from TASK_TRACED, but I can't
find a good argument against it ;) and yes, this is subjective.

Oleg.
Eric W. Biederman May 10, 2022, 2:26 p.m. UTC | #6
Oleg Nesterov <oleg@redhat.com> writes:

> On 05/05, Eric W. Biederman wrote:
>>
>> Eric W. Biederman (11):
>>       signal: Rename send_signal send_signal_locked
>>       signal: Replace __group_send_sig_info with send_signal_locked
>>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>>       ptrace: Remove arch_ptrace_attach
>>       signal: Use lockdep_assert_held instead of assert_spin_locked
>>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>>       ptrace: Document that wait_task_inactive can't fail
>>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>>       ptrace: Don't change __state
>>       ptrace: Always take siglock in ptrace_resume
>>
>> Peter Zijlstra (1):
>>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
>
> OK, lgtm.
>
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
>
>
> I still dislike you removed TASK_WAKEKILL from TASK_TRACED, but I can't
> find a good argument against it ;) and yes, this is subjective.

Does anyone else have any comments on this patchset?

If not I am going to apply this to a branch and get it into linux-next.

Eric
Sebastian Andrzej Siewior May 10, 2022, 2:45 p.m. UTC | #7
On 2022-05-10 09:26:36 [-0500], Eric W. Biederman wrote:
> Does anyone else have any comments on this patchset?
> 
> If not I am going to apply this to a branch and get it into linux-next.

Looks good I guess.
Be aware that there will be clash due to
   https://lore.kernel.org/all/1649240981-11024-3-git-send-email-yangtiezhu@loongson.cn/

which sits currently in -akpm.

> Eric

Sebastian
Eric W. Biederman May 10, 2022, 3:18 p.m. UTC | #8
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:

> On 2022-05-10 09:26:36 [-0500], Eric W. Biederman wrote:
>> Does anyone else have any comments on this patchset?
>> 
>> If not I am going to apply this to a branch and get it into linux-next.
>
> Looks good I guess.
> Be aware that there will be clash due to
>    https://lore.kernel.org/all/1649240981-11024-3-git-send-email-yangtiezhu@loongson.cn/
>
> which sits currently in -akpm.

Thanks for the heads up.  That looks like the best kind of conflict.
One where code just disappears.

Eric
Eric W. Biederman May 11, 2022, 8 p.m. UTC | #9
"Eric W. Biederman" <ebiederm@xmission.com> writes:

> Oleg Nesterov <oleg@redhat.com> writes:
>
>> On 05/05, Eric W. Biederman wrote:
>>>
>>> Eric W. Biederman (11):
>>>       signal: Rename send_signal send_signal_locked
>>>       signal: Replace __group_send_sig_info with send_signal_locked
>>>       ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
>>>       ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
>>>       ptrace: Remove arch_ptrace_attach
>>>       signal: Use lockdep_assert_held instead of assert_spin_locked
>>>       ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
>>>       ptrace: Document that wait_task_inactive can't fail
>>>       ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
>>>       ptrace: Don't change __state
>>>       ptrace: Always take siglock in ptrace_resume
>>>
>>> Peter Zijlstra (1):
>>>       sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
>>
>> OK, lgtm.
>>
>> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
>>
>>
>> I still dislike you removed TASK_WAKEKILL from TASK_TRACED, but I can't
>> find a good argument against it ;) and yes, this is subjective.
>
> Does anyone else have any comments on this patchset?
>
> If not I am going to apply this to a branch and get it into linux-next.

Thank you all.

I have pushed this to my ptrace_stop-cleanup-for-v5.19 branch
and placed the branch in linux-next.

Eric