[v2,1/2] exec: add a flag for "reasonable" execveat() comm

Message ID 20240927151746.391931-1-tycho@tycho.pizza
State New

Commit Message

Tycho Andersen Sept. 27, 2024, 3:17 p.m. UTC
From: Tycho Andersen <tandersen@netflix.com>

Zbigniew mentioned at Linux Plumbers that systemd is interested in
switching to execveat() for service execution, but can't, because the
contents of /proc/pid/comm are the file descriptor which was used,
instead of the path to the binary. This makes the output of tools like
top and ps useless, especially in a world where most fds are opened
CLOEXEC so the number is truly meaningless.

Change exec path to fix up /proc/pid/comm in the case where we have
allocated one of these synthetic paths in alloc_bprm(). This way the actual
exec machinery is unchanged, but cosmetically the comm looks reasonable to
admins investigating things.

Signed-off-by: Tycho Andersen <tandersen@netflix.com>
Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
CC: Aleksa Sarai <cyphar@cyphar.com>
Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
---
v2: * drop the flag, everyone :)
    * change the rendered value to f_path.dentry->d_name.name instead of
      argv[0], Eric
---
 fs/exec.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)


base-commit: baeb9a7d8b60b021d907127509c44507539c15e5
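
For context, a minimal userspace sketch of why the old comm ends up as a bare number. It assumes, going by alloc_bprm() in fs/exec.c at the time of this series, that fd-based execveat() gets a synthetic filename of "/dev/fd/<fd>" (or "/dev/fd/<fd>/<pathname>") and that comm is the basename of that string. model_fd_filename() and model_old_comm() are hypothetical helper names, not kernel functions.

```c
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16 /* same value as the kernel's TASK_COMM_LEN */

/*
 * Model of the synthetic filename used for fd-based execveat():
 * "/dev/fd/<fd>" for an empty pathname, "/dev/fd/<fd>/<pathname>" otherwise.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int model_fd_filename(int fd, const char *pathname, char *out, size_t len)
{
	int n;

	if (pathname[0] == '\0')
		n = snprintf(out, len, "/dev/fd/%d", fd);
	else
		n = snprintf(out, len, "/dev/fd/%d/%s", fd, pathname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Model of the pre-patch comm: basename of filename, cut to fit comm. */
static void model_old_comm(const char *filename, char comm[TASK_COMM_LEN])
{
	const char *slash = strrchr(filename, '/');

	snprintf(comm, TASK_COMM_LEN, "%s", slash ? slash + 1 : filename);
}
```

With fd 3 and an empty pathname the model produces "/dev/fd/3" and a comm of "3", which is what ps and top would then display.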

Comments

Eric W. Biederman Sept. 27, 2024, 3:45 p.m. UTC | #1
Tycho Andersen <tycho@tycho.pizza> writes:

> From: Tycho Andersen <tandersen@netflix.com>
>
> Zbigniew mentioned at Linux Plumbers that systemd is interested in
> switching to execveat() for service execution, but can't, because the
> contents of /proc/pid/comm are the file descriptor which was used,
> instead of the path to the binary. This makes the output of tools like
> top and ps useless, especially in a world where most fds are opened
> CLOEXEC so the number is truly meaningless.
>
> Change exec path to fix up /proc/pid/comm in the case where we have
> allocated one of these synthetic paths in alloc_bprm(). This way the actual
> exec machinery is unchanged, but cosmetically the comm looks reasonable to
> admins investigating things.

Perhaps change the subject to match the code.

> Signed-off-by: Tycho Andersen <tandersen@netflix.com>
> Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
> CC: Aleksa Sarai <cyphar@cyphar.com>
> Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
> ---
> v2: * drop the flag, everyone :)
>     * change the rendered value to f_path.dentry->d_name.name instead of
>       argv[0], Eric
> ---
>  fs/exec.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index dad402d55681..9520359a8dcc 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
>  		set_dumpable(current->mm, SUID_DUMP_USER);
>  
>  	perf_event_exec();
> -	__set_task_comm(me, kbasename(bprm->filename), true);
> +
> +	/*
> +	 * If fdpath was set, execveat() made up a path that will
> +	 * probably not be useful to admins running ps or similar.
> +	 * Let's fix it up to be something reasonable.
> +	 */
> +	if (bprm->fdpath) {
> +		BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
> +		__set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);

We can just do this regardless of bprm->fdpath.

It will be a change of behavior when executing symlinks and possibly
mount points, but I don't think we care.  If we do, then we can make
it conditional with "if (bprm->fdpath)".

At the very least using the above version unconditionally ought to flush
out any bugs.

It should be 99% invisible to applications, as all an application can see
is argv[0].  So it is only in ps and friends that the comm value is visible.

Eric
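
The comm value discussed here is separate from argv[0]. A thread can read its own comm with prctl(PR_GET_NAME), which for the main thread is the same string that /proc/<pid>/comm reports. A minimal userspace sketch; get_own_comm() and set_own_comm() are hypothetical wrapper names.

```c
#include <sys/prctl.h>

#define COMM_BUF_LEN 16 /* PR_GET_NAME needs a buffer of at least 16 bytes */

/* Copy the calling thread's comm into buf; returns 0 on success. */
static int get_own_comm(char buf[COMM_BUF_LEN])
{
	return prctl(PR_GET_NAME, (unsigned long)buf, 0, 0, 0);
}

/* Set the calling thread's comm; longer names are silently truncated. */
static int set_own_comm(const char *name)
{
	return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}
```

Because the buffer is 16 bytes including the terminating NUL, at most 15 bytes of any name survive, which is also why the patch can compare TASK_COMM_LEN against DNAME_INLINE_LEN at build time.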

Kees Cook Sept. 28, 2024, 9:56 p.m. UTC | #2
On Fri, Sep 27, 2024 at 10:45:58AM -0500, Eric W. Biederman wrote:
> Tycho Andersen <tycho@tycho.pizza> writes:
> 
> > From: Tycho Andersen <tandersen@netflix.com>
> >
> > Zbigniew mentioned at Linux Plumbers that systemd is interested in
> > switching to execveat() for service execution, but can't, because the
> > contents of /proc/pid/comm are the file descriptor which was used,
> > instead of the path to the binary. This makes the output of tools like
> > top and ps useless, especially in a world where most fds are opened
> > CLOEXEC so the number is truly meaningless.
> >
> > Change exec path to fix up /proc/pid/comm in the case where we have
> > allocated one of these synthetic paths in alloc_bprm(). This way the actual
> > exec machinery is unchanged, but cosmetically the comm looks reasonable to
> > admins investigating things.
> 
> Perhaps change the subject to match the code.
> 
> > Signed-off-by: Tycho Andersen <tandersen@netflix.com>
> > Suggested-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl>
> > CC: Aleksa Sarai <cyphar@cyphar.com>
> > Link: https://github.com/uapi-group/kernel-features#set-comm-field-before-exec
> > ---
> > v2: * drop the flag, everyone :)
> >     * change the rendered value to f_path.dentry->d_name.name instead of
> >       argv[0], Eric
> > ---
> >  fs/exec.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/exec.c b/fs/exec.c
> > index dad402d55681..9520359a8dcc 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -1416,7 +1416,18 @@ int begin_new_exec(struct linux_binprm * bprm)
> >  		set_dumpable(current->mm, SUID_DUMP_USER);
> >  
> >  	perf_event_exec();
> > -	__set_task_comm(me, kbasename(bprm->filename), true);
> > +
> > +	/*
> > +	 * If fdpath was set, execveat() made up a path that will
> > +	 * probably not be useful to admins running ps or similar.
> > +	 * Let's fix it up to be something reasonable.
> > +	 */
> > +	if (bprm->fdpath) {
> > +		BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
> > +		__set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
> 
> We can just do this regardless of bprm->fdpath.
> 
> It will be a change of behavior when executing symlinks and possibly
> mount points, but I don't think we care.  If we do, then we can make
> it conditional with "if (bprm->fdpath)".
> 
> At the very least using the above version unconditionally ought to flush
> out any bugs.

I'm not super comfortable doing this regardless of bprm->fdpath; that
seems like too many cases getting changed. Can we just leave it as
depending on bprm->fdpath?

Also, is d_name.name always going to be set? e.g. what about memfd, etc?

Eric W. Biederman Sept. 30, 2024, 8:10 p.m. UTC | #3
"Eric W. Biederman" <ebiederm@xmission.com> writes:

> Kees Cook <kees@kernel.org> writes:

>> I'm not super comfortable doing this regardless of bprm->fdpath; that
>> seems like too many cases getting changed. Can we just leave it as
>> depending on bprm->fdpath?

I was recommending that because I did not expect that there was any
widespread usage of aliasing of binary names using symlinks.

I realized today that on Debian there are many aliases
of binaries created with the /etc/alternatives mechanism.
So there is much wider exposure to problems than I would have
supposed.

So I withdraw any objection to making the new code conditional on bprm->fdpath.

Eric

Tycho Andersen Oct. 1, 2024, 1:43 p.m. UTC | #4
On Mon, Sep 30, 2024 at 03:10:29PM -0500, Eric W. Biederman wrote:
> "Eric W. Biederman" <ebiederm@xmission.com> writes:
> 
> > Kees Cook <kees@kernel.org> writes:
> 
> >> I'm not super comfortable doing this regardless of bprm->fdpath; that
> >> seems like too many cases getting changed. Can we just leave it as
> >> depending on bprm->fdpath?
> 
> I was recommending that because I did not expect that there was any
> widespread usage of aliasing of binary names using symlinks.
> 
> I realized today that on Debian there are many aliases
> of binaries created with the /etc/alternatives mechanism.
> So there is much wider exposure to problems than I would have
> supposed.
> 
> So I withdraw any objection to making the new code conditional on bprm->fdpath.

Yep, and it looks like Alpine distributes busybox with symlinks
instead of hard links. I will respin with a fixed subject line shortly.

Thanks,

Tycho

Patch

diff --git a/fs/exec.c b/fs/exec.c
index dad402d55681..9520359a8dcc 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1416,7 +1416,18 @@  int begin_new_exec(struct linux_binprm * bprm)
 		set_dumpable(current->mm, SUID_DUMP_USER);
 
 	perf_event_exec();
-	__set_task_comm(me, kbasename(bprm->filename), true);
+
+	/*
+	 * If fdpath was set, execveat() made up a path that will
+	 * probably not be useful to admins running ps or similar.
+	 * Let's fix it up to be something reasonable.
+	 */
+	if (bprm->fdpath) {
+		BUILD_BUG_ON(TASK_COMM_LEN > DNAME_INLINE_LEN);
+		__set_task_comm(me, bprm->file->f_path.dentry->d_name.name, true);
+	} else {
+		__set_task_comm(me, kbasename(bprm->filename), true);
+	}
 
 	/* An exec changes our domain. We are no longer part of the thread
 	   group */
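
A userspace model of the selection logic in the hunk above may help when reasoning about the cases raised in the thread. The struct and function names (model_bprm, model_pick_comm) are hypothetical stand-ins, not kernel code, and the truncation to TASK_COMM_LEN - 1 bytes is an assumption that mirrors how the kernel stores comm.

```c
#include <stdio.h>
#include <string.h>

#define TASK_COMM_LEN 16 /* same value as the kernel's TASK_COMM_LEN */

/* Hypothetical stand-in for the linux_binprm fields the hunk reads. */
struct model_bprm {
	const char *filename;    /* path exec was asked to run */
	const char *fdpath;      /* non-NULL only for fd-based execveat() */
	const char *dentry_name; /* stands in for file->f_path.dentry->d_name.name */
};

/* Pick the comm source the way the patched begin_new_exec() does. */
static void model_pick_comm(const struct model_bprm *bprm,
			    char comm[TASK_COMM_LEN])
{
	const char *name;

	if (bprm->fdpath) {
		/* Synthetic /dev/fd path: use the name of the opened file. */
		name = bprm->dentry_name;
	} else {
		/* Ordinary exec: keep the basename of the requested path. */
		const char *slash = strrchr(bprm->filename, '/');

		name = slash ? slash + 1 : bprm->filename;
	}
	snprintf(comm, TASK_COMM_LEN, "%s", name);
}
```

In this model an fd-based exec of a file named "sleep" yields "sleep" rather than the fd number, while a path-based exec through a symlink keeps the symlink's name, which is the behavior the reviewers agreed to preserve.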