Message ID | 20180915151622.17789-2-adhemerval.zanella@linaro.org |
---|---|
State | Superseded |
Series | [1/3] posix: Add internal symbols for posix_spawn interface |
It seems to me that there are still reasonable questions about whether to use posix_spawn or vfork ("posix_spawn is a badly designed API"). For (well over) 30 years, I've understood that vfork was the go-to call for a fork/exec scenario, so, what is the technical problem with using it for popen and system? (I'm not asking about vfork's overall technical merits, I'm asking exclusively about using it for popen and system.)
On Sun, Sep 16, 2018 at 02:43:02PM +0930, David Newall wrote:
> It seems to me that there are still reasonable questions about
> whether to use posix_spawn or vfork ("posix_spawn is a badly
> designed API"). For (well over) 30 years, I've understood that
> vfork was the go-to call for a fork/exec scenario, so, what is the
> technical problem with using it for popen and system? (I'm not
> asking about vfork's overall technical merits, I'm asking
> exclusively about using it for popen and system.)

The historical contract of vfork is that you can basically do nothing
after it returns in the child except for exec or _exit, and there are
good reasons for this; sharing memory and stack with the parent has
lots of subtle issues, especially in the presence of a non-dead-stupid
compiler.

One thing to note is that vfork is completely unsafe to use as
documented if any signal handlers are installed, unless you block all
signals before calling vfork, in which case the exec'd process will
inherit a fully-blocked signal mask which is probably not what you
want. Otherwise signal handlers may wrongly run in the child that's
sharing memory with the parent.

The posix_spawn implementation already takes care of these issues by
not sharing the stack and uninstalling any signal handlers before
unmasking signals.

Rich
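For readers following along, here is a minimal caller-side sketch of the signal handling described above, assuming a POSIX system with /bin/sh. It is not the glibc-internal logic. It only shows the public posix_spawn attributes that let a caller start the child with an empty signal mask and default dispositions, without the "fully-blocked mask" problem of the vfork recipe. The helper name spawn_with_clean_signals is hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run `sh -c command` with an empty signal mask and
   default signal dispositions.  Returns the exit status, or -1 on failure. */
int spawn_with_clean_signals(const char *command)
{
    assert(command != NULL);

    posix_spawnattr_t attr;
    if (posix_spawnattr_init(&attr) != 0)
        return -1;

    sigset_t none, all;
    sigemptyset(&none);
    sigfillset(&all);
    /* SIGKILL and SIGSTOP cannot have their disposition changed. */
    sigdelset(&all, SIGKILL);
    sigdelset(&all, SIGSTOP);

    /* Child starts with nothing blocked and everything at SIG_DFL
       (exec resets caught signals anyway; this also resets ignored ones). */
    posix_spawnattr_setsigmask(&attr, &none);
    posix_spawnattr_setsigdefault(&attr, &all);
    posix_spawnattr_setflags(&attr,
                             POSIX_SPAWN_SETSIGMASK | POSIX_SPAWN_SETSIGDEF);

    char *const argv[] = { "sh", "-c", (char *) command, NULL };
    pid_t pid;
    int err = posix_spawn(&pid, "/bin/sh", NULL, &attr, argv, environ);
    posix_spawnattr_destroy(&attr);
    if (err != 0)
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent never has to block signals itself here, so its own mask and handlers stay untouched across the call.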
On 17/09/2018 07:50, Rich Felker wrote:
> On Sun, Sep 16, 2018 at 02:43:02PM +0930, David Newall wrote:
>> It seems to me that there are still reasonable questions about
>> whether to use posix_spawn or vfork ("posix_spawn is a badly
>> designed API"). For (well over) 30 years, I've understood that
>> vfork was the go-to call for a fork/exec scenario, so, what is the
>> technical problem with using it for popen and system? (I'm not
>> asking about vfork's overall technical merits, I'm asking
>> exclusively about using it for popen and system.)
>
> The historical contract of vfork is that you can basically do nothing
> after it returns in the child except for exec or _exit, and there are
> good reasons for this; sharing memory and stack with the parent has
> lots of subtle issues, especially in the presence of a non-dead-stupid
> compiler.
>
> One thing to note is that vfork is completely unsafe to use as
> documented if any signal handlers are installed, unless you block all
> signals before calling vfork, in which case the exec'd process will
> inherit a fully-blocked signal mask which is probably not what you
> want. Otherwise signal handlers may wrongly run in the child that's
> sharing memory with the parent.
>
> The posix_spawn implementation already takes care of these issues by
> not sharing the stack and uninstalling any signal handlers before
> unmasking signals.
>
> Rich
>

And the posix_spawn implementation on Linux already provides the same
performance improvements that vfork aims to provide.
On 18/09/18 00:20, Rich Felker wrote:
> The historical contract of vfork is that you can basically do nothing
> after it returns in the child except for exec or _exit

Yes, that is true. My understanding is that vfork was intended only as
a fast way of doing a fork/exec sequence.

> One thing to note is that vfork is completely unsafe to use as
> documented if any signal handlers are installed, unless you block all
> signals before calling vfork, in which case the exec'd process will
> inherit a fully-blocked signal mask which is probably not what you
> want. Otherwise signal handlers may wrongly run in the child that's
> sharing memory with the parent.

You're saying that the kernel will deliver a signal to the child pid
when it was the parent pid that was signalled. Can that really happen?
On Tue, Sep 18, 2018 at 11:00:48AM +0930, David Newall wrote:
> On 18/09/18 00:20, Rich Felker wrote:
> >The historical contract of vfork is that you can basically do nothing
> >after it returns in the child except for exec or _exit
> Yes, that is true. My understanding is that vfork was intended
> only as a fast way of doing a fork/exec sequence.
>
> >One thing to note is that vfork is completely unsafe to use as
> >documented if any signal handlers are installed, unless you block all
> >signals before calling vfork, in which case the exec'd process will
> >inherit a fully-blocked signal mask which is probably not what you
> >want. Otherwise signal handlers may wrongly run in the child that's
> >sharing memory with the parent.
>
> You're saying that the kernel will deliver a signal to the child pid
> when it was the parent pid that was signalled. Can that really happen?

There are various conditions under which signals are delivered to an
entire process group; the most well-known is tty signals from ^C, ^\,
^Z, SIGWINCH, etc. to the tty's foreground process group. After vfork
these would be delivered to both the parent and child while they share
memory. The parent is suspended and won't act until the child execs or
exits, but mere execution of the signal handler in the child is
observably and dangerously wrong behavior.

Rich
On Sun, Sep 16, 2018 at 1:13 AM David Newall <glibc@davidnewall.com> wrote:
> It seems to me that there are still reasonable questions about whether
> to use posix_spawn or vfork ("posix_spawn is a badly designed API").

When I said to Sergey that I would rather see the problem they
reported addressed using vfork instead of posix_spawn, I was giving
advice to a new contributor. I really _would_ rather see it addressed
that way, and I also thought that they were more likely to succeed in
writing those patches.

Adhemerval is not a new contributor and they deeply understand the
problems in this area. Their patches strike me as a step generally in
the right direction. I don't have time to review them in detail, but
I don't object to them. However, I do think some of the fine details
demonstrate why an API that allows for arbitrary computation and
system calls before exec would be preferable, such as there being "no
safe way to clear close-on-exec in the child" (because, IIUC, there's
no posix_spawn action to do that).

zw
On Tue, Sep 18, 2018 at 02:01:29PM -0400, Zack Weinberg wrote:
> On Sun, Sep 16, 2018 at 1:13 AM David Newall <glibc@davidnewall.com> wrote:
> > It seems to me that there are still reasonable questions about whether
> > to use posix_spawn or vfork ("posix_spawn is a badly designed API").
>
> When I said to Sergey that I would rather see the problem they
> reported addressed using vfork instead of posix_spawn, I was giving
> advice to a new contributor. I really _would_ rather see it addressed
> that way, and I also thought that they were more likely to succeed in
> writing those patches.
>
> Adhemerval is not a new contributor and they deeply understand the
> problems in this area. Their patches strike me as a step generally in
> the right direction. I don't have time to review them in detail, but
> I don't object to them. However, I do think some of the fine details
> demonstrate why an API that allows for arbitrary computation and
> system calls before exec would be preferable, such as there being "no
> safe way to clear close-on-exec in the child" (because, IIUC, there's
> no posix_spawn action to do that).

The resolution to Austin Group issue #411 made it so adddup2(n,n) does
what you want:

http://austingroupbugs.net/view.php?id=411

Rich
On 18/09/2018 22:17, Rich Felker wrote:
> On Tue, Sep 18, 2018 at 02:01:29PM -0400, Zack Weinberg wrote:
>> On Sun, Sep 16, 2018 at 1:13 AM David Newall <glibc@davidnewall.com> wrote:
>>> It seems to me that there are still reasonable questions about whether
>>> to use posix_spawn or vfork ("posix_spawn is a badly designed API").
>>
>> When I said to Sergey that I would rather see the problem they
>> reported addressed using vfork instead of posix_spawn, I was giving
>> advice to a new contributor. I really _would_ rather see it addressed
>> that way, and I also thought that they were more likely to succeed in
>> writing those patches.
>>
>> Adhemerval is not a new contributor and they deeply understand the
>> problems in this area. Their patches strike me as a step generally in
>> the right direction. I don't have time to review them in detail, but
>> I don't object to them. However, I do think some of the fine details
>> demonstrate why an API that allows for arbitrary computation and
>> system calls before exec would be preferable, such as there being "no
>> safe way to clear close-on-exec in the child" (because, IIUC, there's
>> no posix_spawn action to do that).
>
> The resolution to Austin Group issue #411 made it so adddup2(n,n) does
> what you want:
>
> http://austingroupbugs.net/view.php?id=411

It has been tracked at https://sourceware.org/bugzilla/show_bug.cgi?id=23640
as well. I am not fond of adddup2(n,n) having different semantics from
dup2(n,n), but I don't have a better straightforward solution either.
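To make the adddup2(n,n) point concrete, here is a small probe, offered as a sketch under assumptions rather than a statement of what any given libc does. It marks a pipe close-on-exec, registers adddup2(fd, fd), and reports whether the child could still write to that descriptor. Whether it returns 1 depends on the libc implementing the issue #411 resolution. The helper name is hypothetical, and the shell redirection assumes the pipe descriptor is a single digit.

```c
#include <fcntl.h>
#include <spawn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical probe: returns 1 if the child could write to a descriptor
   that is close-on-exec in the parent, 0 if it could not, -1 on setup
   failure. */
int child_inherits_via_adddup2(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    /* Both ends are close-on-exec in the parent, as with pipe2(O_CLOEXEC). */
    if (fcntl(fds[0], F_SETFD, FD_CLOEXEC) != 0
        || fcntl(fds[1], F_SETFD, FD_CLOEXEC) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    posix_spawn_file_actions_t fa;
    posix_spawn_file_actions_init(&fa);
    /* Same source and target: under the issue #411 resolution this clears
       FD_CLOEXEC in the child only, leaving the parent's flag alone. */
    posix_spawn_file_actions_adddup2(&fa, fds[1], fds[1]);

    char cmd[64];
    snprintf(cmd, sizeof cmd, "echo ok >&%d", fds[1]);
    char *const argv[] = { "sh", "-c", cmd, NULL };

    pid_t pid;
    int err = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    close(fds[1]);              /* parent keeps only the read end */
    if (err != 0) {
        close(fds[0]);
        return -1;
    }

    char buf[8] = { 0 };
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (n == 3 && buf[0] == 'o' && buf[1] == 'k') ? 1 : 0;
}
```

A result of 0 would indicate a libc where adddup2(n,n) behaves like plain dup2(n,n), a no-op that leaves the flag set.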
Ping.

On 15/09/2018 12:16, Adhemerval Zanella wrote:
> This patch uses posix_spawn on popen instead of fork and execl. On Linux
> this has the advantage of much lower memory consumption (usually 32 Kb
> minimum for the mmap stack area).
>
> Checked on x86_64-linux-gnu and i686-linux-gnu.
>
>   * libio/iopopen.c (_IO_new_proc_open): use posix_spawn instead of
>   fork and execl.
> ---
>  ChangeLog       |  3 ++
>  libio/iopopen.c | 97 +++++++++++++++++++++++++++++--------------------
>  2 files changed, 61 insertions(+), 39 deletions(-)
I also fixed BZ#17490. Although the POSIX pthread_atfork [1] description
lists only 'fork' as the function that should run the atfork handlers,
and the popen description [2] states that:

  '[...] shall be *as if* a child process were created within the popen()
   call using the fork() function [...]'

other libcs/systems seem to follow the idea that atfork handlers should
not be run for popen:

  libc/system     | run atfork handlers  | notes
  ----------------|----------------------|----------------------------------------
  freebsd master  | no                   | uses vfork
  solaris 11      | no                   |
  MacOSX (11.13)  | no                   | implemented through posix_spawn syscall
  ----------------|----------------------|----------------------------------------

I also agree that, as with posix_spawn and system, the point of popen is
to spawn a different binary, so the POSIX rationale for running the
atfork handlers (avoiding inconsistent internal process state) does not
really apply, and running them might even be unsafe in some cases.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/
[2] http://pubs.opengroup.org/onlinepubs/9699919799/

On 17/10/2018 14:11, Adhemerval Zanella wrote:
> Ping.
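As an illustration of the atfork question, the following sketch counts how many times a pthread_atfork prepare handler runs around fork() and around posix_spawn(). Only the fork() count is something POSIX pins down. The posix_spawn() count is implementation-dependent and the code reports it without assuming a value. The function names are hypothetical, and older glibc versions may need -pthread at link time.

```c
#include <pthread.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

static int prepare_calls;       /* bumped by the prepare handler */
static int registered;

static void on_prepare(void) { prepare_calls++; }

static void register_once(void)
{
    if (!registered) {
        pthread_atfork(on_prepare, NULL, NULL);
        registered = 1;
    }
}

/* Number of prepare-handler runs caused by one fork(), or -1 on failure. */
int atfork_prepare_calls_for_fork(void)
{
    register_once();
    int before = prepare_calls;
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);               /* child: leave immediately */
    if (pid < 0)
        return -1;
    waitpid(pid, NULL, 0);
    return prepare_calls - before;
}

/* Number of prepare-handler runs caused by one posix_spawn(), or -1 on
   failure.  The result depends on the libc. */
int atfork_prepare_calls_for_spawn(void)
{
    register_once();
    int before = prepare_calls;
    char *const argv[] = { "sh", "-c", "exit 0", NULL };
    pid_t pid;
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;
    waitpid(pid, NULL, 0);
    return prepare_calls - before;
}
```

Comparing the two return values on a given system shows whether a posix_spawn-based popen would change atfork behaviour there.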
diff --git a/libio/iopopen.c b/libio/iopopen.c
index 2eff45b4c8..3cce2e5596 100644
--- a/libio/iopopen.c
+++ b/libio/iopopen.c
@@ -34,7 +34,8 @@
 #include <not-cancel.h>
 #include <sys/types.h>
 #include <sys/wait.h>
-#include <kernel-features.h>
+#include <spawn.h>
+#include <paths.h>
 
 struct _IO_proc_file
 {
@@ -63,9 +64,8 @@ FILE *
 _IO_new_proc_open (FILE *fp, const char *command, const char *mode)
 {
   int read_or_write;
-  int parent_end, child_end;
   int pipe_fds[2];
-  pid_t child_pid;
+  int op;
 
   int do_read = 0;
   int do_write = 0;
@@ -108,59 +108,78 @@ _IO_new_proc_open (FILE *fp, const char *command, const char *mode)
 
   if (do_read)
     {
-      parent_end = pipe_fds[0];
-      child_end = pipe_fds[1];
+      op = 0;
       read_or_write = _IO_NO_WRITES;
     }
   else
     {
-      parent_end = pipe_fds[1];
-      child_end = pipe_fds[0];
+      op = 1;
       read_or_write = _IO_NO_READS;
     }
 
-  ((_IO_proc_file *) fp)->pid = child_pid = __fork ();
-  if (child_pid == 0)
-    {
-      int child_std_end = do_read ? 1 : 0;
-      struct _IO_proc_file *p;
-
-      if (child_end != child_std_end)
-        __dup2 (child_end, child_std_end);
-      else
-        /* The descriptor is already the one we will use. But it must
-           not be marked close-on-exec. Undo the effects. */
-        __fcntl (child_end, F_SETFD, 0);
-      /* POSIX.2: "popen() shall ensure that any streams from previous
-         popen() calls that remain open in the parent process are closed
-         in the new child process." */
-      for (p = proc_file_chain; p; p = p->next)
-        {
-          int fd = _IO_fileno ((FILE *) p);
+  {
+    posix_spawn_file_actions_t fa;
+    /* posix_spawn_file_actions_init does not fail. */
+    __posix_spawn_file_actions_init (&fa);
 
-          /* If any stream from previous popen() calls has fileno
-             child_std_end, it has been already closed by the dup2 syscall
-             above. */
-          if (fd != child_std_end)
-            __close_nocancel (fd);
-        }
+    /* The descriptor is already in the one the child will use. In this case
+       it must be moved to another one, otherwise there is no safe way to
+       remove the close-on-exec flag in the child without creating a FD leak
+       race in the parent. */
+    if (pipe_fds[1 - op] == 1 - op)
+      {
+        int tmp = __fcntl (1 - op, F_DUPFD_CLOEXEC, 0);
+        if (tmp < 0)
+          goto spawn_failure;
+        __close_nocancel (pipe_fds[1 - op]);
+        pipe_fds[1 - op] = tmp;
+      }
 
-      execl ("/bin/sh", "sh", "-c", command, (char *) 0);
-      _exit (127);
-    }
-  __close_nocancel (child_end);
-  if (child_pid < 0)
+    if (__posix_spawn_file_actions_adddup2 (&fa, pipe_fds[1 - op], 1 - op)
+        != 0)
+      goto spawn_failure;
+
+    /* POSIX.2: "popen() shall ensure that any streams from previous popen()
+       calls that remain open in the parent process are closed in the new
+       child process." */
+    for (struct _IO_proc_file *p = proc_file_chain; p; p = p->next)
+      {
+        int fd = _IO_fileno ((FILE *) p);
+
+        /* If any stream from previous popen() calls has fileno
+           child_send, it has been already closed by the dup2 syscall
+           above. */
+        if (fd != 1 - op
+            && __posix_spawn_file_actions_addclose (&fa, fd) != 0)
+          goto spawn_failure;
+      }
+
+    if (__posix_spawn (&((_IO_proc_file *) fp)->pid, _PATH_BSHELL, &fa, 0,
+                       (char *const[]){ (char*) "sh", (char*) "-c",
+                       (char *) command, NULL }, __environ) != 0)
+      {
+      spawn_failure:
+        __posix_spawn_file_actions_destroy (&fa);
+        __close_nocancel (pipe_fds[1 - op]);
+        __set_errno (ENOMEM);
+        return NULL;
+      }
+
+    __posix_spawn_file_actions_destroy (&fa);
+  }
+  __close_nocancel (pipe_fds[1 - op]);
+  if (((_IO_proc_file *) fp)->pid < 0)
     {
-      __close_nocancel (parent_end);
+      __close_nocancel (pipe_fds[op]);
       return NULL;
     }
 
   if (!do_cloexec)
     /* Undo the effects of the pipe2 call which set the
        close-on-exec flag. */
-    __fcntl (parent_end, F_SETFD, 0);
+    __fcntl (pipe_fds[op], F_SETFD, 0);
 
-  _IO_fileno (fp) = parent_end;
+  _IO_fileno (fp) = pipe_fds[op];
 
   /* Link into proc_file_chain. */
 #ifdef _IO_MTSAFE_IO
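For readers who want to try the approach outside of libc, here is a user-level sketch of the read ("r") direction of the patch, using only public interfaces. It is an illustration under assumptions, not the patched _IO_new_proc_open. The helper name spawn_read is hypothetical, it uses plain pipe() with explicit close actions instead of pipe2 with O_CLOEXEC, and it does not retry reads interrupted by signals.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run `sh -c command`, copy up to cap-1 bytes of its
   standard output into out (NUL-terminated), and return the exit status,
   or -1 on failure. */
int spawn_read(const char *command, char *out, size_t cap)
{
    assert(command != NULL && out != NULL && cap > 0);

    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    posix_spawn_file_actions_t fa;
    if (posix_spawn_file_actions_init(&fa) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Child: the write end becomes stdout; drop the other pipe descriptors. */
    int err = posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO);
    if (err == 0 && fds[0] != STDOUT_FILENO)
        err = posix_spawn_file_actions_addclose(&fa, fds[0]);
    if (err == 0 && fds[1] != STDOUT_FILENO)
        err = posix_spawn_file_actions_addclose(&fa, fds[1]);

    pid_t pid = -1;
    if (err == 0) {
        char *const argv[] = { "sh", "-c", (char *) command, NULL };
        err = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
    }
    posix_spawn_file_actions_destroy(&fa);
    close(fds[1]);              /* parent only reads */
    if (err != 0) {
        close(fds[0]);
        return -1;
    }

    /* Read until EOF or until the buffer is full. */
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1
           && (n = read(fds[0], out + used, cap - 1 - used)) > 0)
        used += (size_t) n;
    out[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike the patch, this sketch propagates no errno value and does not track other open streams, since it has no equivalent of proc_file_chain.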