mbox series

[RFC,0/8] signals: Support more than 64 signals

Message ID 20220103181956.983342-1-walt@drummond.us
Headers show
Series signals: Support more than 64 signals | expand

Message

Walt Drummond Jan. 3, 2022, 6:19 p.m. UTC
This patch set expands the number of signals in Linux beyond the
current cap of 64.  It sets a new cap at the somewhat arbitrary limit
of 1024 signals, both because it’s what GLibc and MUSL support and
because many architectures pad sigset_t or ucontext_t in the kernel to
this cap.  This limit is not fixed and can be further expanded within
reason.

Despite best efforts, there is some non-zero potential that this could
break user space; I'd appreciate any comments, review and/or pointers
to areas of concern.

Basically, these changes entail:

 - Make all system calls that accept sigset_t honor the existing
   sigsetsize parameter for values between 8 and 128, and to return
   sigsetsize bytes to user space.

 - Add AT_SIGSET_SZ to the aux vector to signal to user space the
   maximum size sigset_t the kernel can accept.

 - Remove the sigmask() macro except in compatibility cases, change
   the sigaddset()/sigdelset()/etc. to accept a comma separated list
   of signal numbers.

 - Change the _NSIG_WORDS calculation to round up when needed on
   generic and x86.

 - Place the complete sigmask in the real time signal frame (x86_64,
   x32 and ia32).

 - Various fixes where sigset_t size is assumed.

 - Add BSD SIGINFO (and VSTATUS) as a test.

The changes that have the most risk of breaking user space are the
ones that put more than 8 bytes of sigset_t in the real time signal
stack frame (Patches 2 & 6), and I should note that an earlier and
incomplete version of patch 2 was NAK’ed by Al in
https://lore.kernel.org/lkml/20201119221132.1515696-1-walt@drummond.us/.

As far as I have been able to determine this patchset, and
specifically changing the size of sigset_t, does not break user space.

The two uses of sigset_t that pose the most user space risk are 1) as
a member of ucontext_t passed as a parameter to the signal handler and
2) when user space performs manual inspection of the real-time signal
stack frame.

In case (1), user space has definitions of both siget_t and ucontext_t
that are independent of, and may differ from, the kernel (eg, sigset_t
in uclibc-ng is 16 bytes, musl is 128 bytes, glibc is 128 bytes on all
architectures except Arc, etc.).  User space will interpret the data
on the signal stack through these definitions, and extensions to
sigset_t will be opaque.  Other non-C runtimes are similarly
independent from kernel sigset_t and ucontext_t and derive their
definition of sigset_t from libc either directly or indirectly, and do
not manually inspect the signal stack (specifically OpenJDK, Golang,
Python3, Rust and Perl).

The only instances I found of case (2), manually inspecting the signal
stack frame, are in stack unwinders/backtracers (GDB, GCC, libunwind)
and in GDB when recording program execution, and only on the i386,
x86_64, s390 and powerpc architectures.  The GDB, GCC and libunwind
behave consistently with and without this patchset.

GDB's execution recording is somewhat more complicated.  It uses
internally defined architecture specific constants to represent the
total size of the signal frame, and will save that entire frame for
later use.  I cannot confirm that the values for powerpc and s390 are
correct, but for this purpose it doesn't matter as these architectures
explicitly pad for an expanded uc_sigmask.  I can, however, confirm
that the values for i386 and x86_64 are not correct, and that GDB is
recording an incorrect amount of stack data.  This doesn’t appear to
be an issue; while I cannot build a test case on x86_64 due to a known
bug[1], a basic test on i386 shows that the stack is correctly being
recorded, and forward and reverse replay seems to work just fine
across signal handlers.

There are other cases to consider if the number of signals and
therefore the size of sigset_t changes:

Impact on struct rt_sigframe member elements

  The placement of ucontext_t in struct rt_sigframe has the potential
  to move following member elements in ways that could break user
  space if user space relied on the offsets of these elements.
  However a review shows that any elements in rt_sigframe after
  ucontext_t.uc_sigmask are either (1) unused or only used by the
  kernel or (2) fall into the x86_64/i386 floating point state case
  above.

Kernel has new signals, user space does not

  Any new bits in ucontext.uc_sigmask placed on the signal stack are
  opaque to user space (except in cases where user space already has a
  larger sigset_t, as in glibc).

  There are no changes to the real-time signals system call semantics,
  as the kernel will honor the hard-coded sigsetsize value of 8 in
  libc and behave as it has before these changes.

  Signal numbers larger than 64 cannot be blocked or caught until user
  space is updated, however their default action will work as
  expected.  This can cause one problem: a parent process that uses
  the signal number a child exited with as an index into an array
  without bounds checking can cause a crash.  I’ve seen exactly one
  instance of this in tcsh, and is, I think, a bug in tcsh.

User space has new signals, kernel does not

  User space attempting to use a signal number not supported by the
  kernel in system calls (eg, sigaction()) or other libc functions (eg,
  sigaddset()) will result in EINVAL, as expected.

  User space needs to know how to set the sigsetsize parameter to the
  real time signal system calls and it can use getauxval(AT_SIGSET_SZ)
  to determine this.  If it returns zero the sigsetsize must be 8,
  otherwise the kernel will accept sigsetsize between 8 and the return
  value.

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=23188

Walt Drummond (8):
  signals: Make the real-time signal system calls accept different sized
    sigset_t from user space.
  signals: Put the full signal mask on the signal stack for x86_64, X32
    and ia32 compatibility mode
  signals: Use a helper function to test if a signal is a real-time
    signal.
  signals: Remove sigmask() macro
  signals: Better support cases where _NSIG_WORDS is greater than 2
  signals: Round up _NSIG_WORDS
  signals: Add signal debugging
  signals: Support BSD VSTATUS, KERNINFO and SIGINFO

 arch/alpha/kernel/signal.c          |   4 +-
 arch/m68k/include/asm/signal.h      |   6 +-
 arch/nios2/kernel/signal.c          |   2 -
 arch/x86/ia32/ia32_signal.c         |   5 +-
 arch/x86/include/asm/sighandling.h  |  34 +++
 arch/x86/include/asm/signal.h       |  10 +-
 arch/x86/include/uapi/asm/signal.h  |   4 +-
 arch/x86/kernel/signal.c            |  11 +-
 drivers/scsi/dpti.h                 |   2 -
 drivers/tty/Makefile                |   2 +-
 drivers/tty/n_tty.c                 |  21 ++
 drivers/tty/tty_io.c                |  10 +-
 drivers/tty/tty_ioctl.c             |   4 +
 drivers/tty/tty_status.c            | 135 ++++++++++
 fs/binfmt_elf.c                     |   1 +
 fs/binfmt_elf_fdpic.c               |   1 +
 fs/ceph/addr.c                      |   2 +-
 fs/jffs2/background.c               |   2 +-
 fs/lockd/svc.c                      |   1 -
 fs/proc/array.c                     |  32 +--
 fs/proc/base.c                      |  48 ++++
 fs/signalfd.c                       |  26 +-
 include/asm-generic/termios.h       |   4 +-
 include/linux/compat.h              |  98 ++++++-
 include/linux/sched.h               |  52 +++-
 include/linux/signal.h              | 389 ++++++++++++++++++++--------
 include/linux/tty.h                 |   8 +
 include/uapi/asm-generic/ioctls.h   |   2 +
 include/uapi/asm-generic/signal.h   |   8 +-
 include/uapi/asm-generic/termbits.h |  34 +--
 include/uapi/linux/auxvec.h         |   1 +
 kernel/compat.c                     |  30 +--
 kernel/fork.c                       |   2 +-
 kernel/ptrace.c                     |  18 +-
 kernel/signal.c                     | 288 ++++++++++----------
 kernel/sysctl.c                     |  41 +++
 kernel/time/posix-timers.c          |   3 +-
 lib/Kconfig.debug                   |  10 +
 security/apparmor/ipc.c             |   4 +-
 virt/kvm/kvm_main.c                 |  18 +-
 40 files changed, 974 insertions(+), 399 deletions(-)
 create mode 100644 drivers/tty/tty_status.c

Comments

Al Viro Jan. 3, 2022, 6:48 p.m. UTC | #1
On Mon, Jan 03, 2022 at 10:19:48AM -0800, Walt Drummond wrote:
> This patch set expands the number of signals in Linux beyond the
> current cap of 64.  It sets a new cap at the somewhat arbitrary limit
> of 1024 signals, both because it’s what GLibc and MUSL support and
> because many architectures pad sigset_t or ucontext_t in the kernel to
> this cap.  This limit is not fixed and can be further expanded within
> reason.

Could you explain the point of the entire exercise?  Why do we need more
rt signals in the first place?

glibc has quite a bit of utterly pointless future-proofing.  So "they
allow more" is not a good reason - not without a plausible use-case,
at least.
Walt Drummond Jan. 4, 2022, 1 a.m. UTC | #2
I simply wanted SIGINFO and VSTATUS, and that necessitated this. If
the limit of 1024 rt signals is an issue, that's an extremely simple
change to make.



On Mon, Jan 3, 2022 at 10:48 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Mon, Jan 03, 2022 at 10:19:48AM -0800, Walt Drummond wrote:
> > This patch set expands the number of signals in Linux beyond the
> > current cap of 64.  It sets a new cap at the somewhat arbitrary limit
> > of 1024 signals, both because it’s what GLibc and MUSL support and
> > because many architectures pad sigset_t or ucontext_t in the kernel to
> > this cap.  This limit is not fixed and can be further expanded within
> > reason.
>
> Could you explain the point of the entire exercise?  Why do we need more
> rt signals in the first place?
>
> glibc has quite a bit of utterly pointless future-proofing.  So "they
> allow more" is not a good reason - not without a plausible use-case,
> at least.
Al Viro Jan. 4, 2022, 1:16 a.m. UTC | #3
On Mon, Jan 03, 2022 at 05:00:58PM -0800, Walt Drummond wrote:
> I simply wanted SIGINFO and VSTATUS, and that necessitated this.

Elaborate, please.  What exactly requires more than 32 rt signals?
Al Viro Jan. 4, 2022, 1:49 a.m. UTC | #4
On Tue, Jan 04, 2022 at 01:16:17AM +0000, Al Viro wrote:
> On Mon, Jan 03, 2022 at 05:00:58PM -0800, Walt Drummond wrote:
> > I simply wanted SIGINFO and VSTATUS, and that necessitated this.
> 
> Elaborate, please.  What exactly requires more than 32 rt signals?

More to the point, which system had SIGINFO >= SIGRTMIN?  Or signals
with numbers greater than SIGRTMAX, for that matter?

I really don't get it...
Eric W. Biederman Jan. 4, 2022, 6 p.m. UTC | #5
Walt Drummond <walt@drummond.us> writes:

> This patch set expands the number of signals in Linux beyond the
> current cap of 64.  It sets a new cap at the somewhat arbitrary limit
> of 1024 signals, both because it’s what GLibc and MUSL support and
> because many architectures pad sigset_t or ucontext_t in the kernel to
> this cap.  This limit is not fixed and can be further expanded within
> reason.

Ahhhh!!

Please let's not expand the number of signals supported if there is any
alternative.  Signals only really make sense for supporting existing
interfaces.  For new applications there is almost always something
better.

In the last discussion of adding SIGINFO
https://lore.kernel.org/lkml/20190625161153.29811-1-ar@cs.msu.ru/ the
approach examined was to fix SIGPWR to be ignored by default and to
define SIGINFO as SIGPWR.

I dug through the previous conversations and there is a little debate
about what makes sense for SIGPWR to do by default.  Alan Cox remembered
SIGPWR was sent when the power was restored, so ignoring SIGPWR by
default made sense.  Ted Tso pointed out a different scenario where it
was reasonable for SIGPWR to be a terminating signal.

So far no one has actually found any applications that will regress if
SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
defined to be sent to init, and init ignores all signals by default so
in practice SIGPWR is ignored by the only process that receives it
currently.

I am persuaded at least enough that I could see adding a patch to
linux-next and them sending to Linus that could be reverted if anything
broke.

Where I saw the last conversation falter was in making a persuasive
case of why SIGINFO was interesting to add.  Given a world of ssh
connections I expect a persuasive case can be made.  Especially if there
are a handful of utilities where it is already implemented that just
need to be built with SIGINFO defined.

>  - Add BSD SIGINFO (and VSTATUS) as a test.

If your actual point is not to implement SIGINFO and you really have
another use case for expanding sigset_t please make it clear.

Without seeing the persuasive case for more signals I have to say that
adding more signals to the kernel sounds like a bad idea.

Eric
Theodore Ts'o Jan. 4, 2022, 8:52 p.m. UTC | #6
On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> I dug through the previous conversations and there is a little debate
> about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> default made sense.  Ted Tso pointed out a different scenario where it
> was reasonable for SIGPWR to be a terminating signal.
> 
> So far no one has actually found any applications that will regress if
> SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
> defined to be sent to init, and init ignores all signals by default so
> in practice SIGPWR is ignored by the only process that receives it
> currently.

As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
initiate the sigpwr target.  From the systemd.special man page:

       sigpwr.target
           A special target that is started when systemd receives the
           SIGPWR process signal, which is normally sent by the kernel
           or UPS daemons when power fails.

And child processes of systemd are not ignoring SIGPWR.  Instead, they
are getting terminated.

<tytso@cwcc>
41% /bin/sleep 50 &
[1] 180671
<tytso@cwcc>
42% kill -PWR 180671
[1]+  Power failure           /bin/sleep 50

> Where I saw the last conversation falter was in making a persuasive
> case of why SIGINFO was interesting to add.  Given a world of ssh
> connections I expect a persuasive case can be made.  Especially if there
> are a handful of utilities where it is already implemented that just
> need to be built with SIGINFO defined.

One thing that's perhaps worth disentangling is the value of
supporting VSTATUS --- which is a control character much like VINTR
(^C) or VQUIT (control backslash) which is set via the c_cc[] array in
termios structure.  Quoting from the termios man page:

       VSTATUS
              (not in POSIX; not supported under Linux; status
              request: 024, DC4, Ctrl-T).  Status character (STATUS).
              Display status information at terminal, including state
              of foreground process and amount of CPU time it has
              consumed.  Also sends a SIGINFO signal (not supported on
              Linux) to the foreground process group.

The basic idea is that when you type C-t, you can find out information
about the currently running process.  This is a feature that
originally comes from TOPS-10's TENEX operating system, and it is
supported today on FreeBSD and Mac OS.  For example, it might display
something like this:

load: 2.39  cmd: ping 5374 running 0.00u 0.00s

The reason why SIGINFO is sent to the foreground process group is that
it gives the process an opportunity print application specific
information about currently running process.  For example, maybe the C
compiler could print something like "parsing 2042 of 5000 header
files", or some such.  :-)

There are people who wish that Linux supported Control-T / VSTATUS,
for example, just last week, on TUHS, the Unix greybeards list, there
were two such heartfelt wishes for Control-T support from two such
greybeards:

    "It's my biggest annoyance with Linux that it doesn't [support
    control-t]
    - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html

    "I personally can't stand using Linux, even casually for a very
     short sys-admin task, because of this missing feature"
    - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html

I claim, though, that we could implement VSTATUS without implenting
the SIGINFO part of the feature.  Previous few applications *ever*
implemented SIGINFO signal handlers so they could give status
information, it's the hard one, since we don't have any spare signals
left.  If we were to repurpose some lesser used signal, whether it be
SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be some
userspace program (such as a UPS monitoring program which wants to
trigger power fail handling, or a userspace NFSv4 process that wants
to signal that it was unable to recover a file's file lock after a
server reboot), and if we try to take over the signal assignment, it's
possible that we might get surprised.  Furthermore, all of the
possibly unused signals that we might try to reclaim terminate the
process by default, and SIGINFO *has* to have a default signal
handling action of Ignore, since otherwise typing Control-T will end
up killing the current foreground application.

Personally, I don't care all that much about VSTATUS support --- I
used it when I was in university, but honestly, I've never missed it.
But if there is someone who wants to try to implement VSTATUS, and
make some Unix greybeards happy, and maybe even switch from FreeBSD to
Linux as a result, go wild.  I'm not convinced, though, that adding
the SIGINFO part of the support is worth the effort.

Not only do almost no programs implement SIGINFO support, a lot of CPU
bound programs where this might be actually useful, end up running a
large number of processes in parallel.  Take the "parsing 2042 of 5000
header files" example I gave above.  Consider what would happen if gcc
implemented support for SIGINFO, but the user was running a "make -j
16" and typed Control-T.   The result would be chaos!

So if you really miss Control-T, and it's the only thing holding back
a few FreeBSD users from Linux, I don't see the problem with
implementing that part of the feature.  Why not just do the easy part
of the feature which is perhaps 5% of the work, and might provide 99%
of the benefit (at least for those people who care).

> Without seeing the persuasive case for more signals I have to say that
> adding more signals to the kernel sounds like a bad idea.

Concur, 100%.

						- Ted
Walt Drummond Jan. 4, 2022, 9:33 p.m. UTC | #7
Fair enough.  I'll abandon the signals part of this and just send out
the VSTATUS/Control-T part, after I address some comments from Greg.

Thanks.

On Tue, Jan 4, 2022 at 12:52 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> > I dug through the previous conversations and there is a little debate
> > about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> > SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> > default made sense.  Ted Tso pointed out a different scenario where it
> > was reasonable for SIGPWR to be a terminating signal.
> >
> > So far no one has actually found any applications that will regress if
> > SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
> > defined to be sent to init, and init ignores all signals by default so
> > in practice SIGPWR is ignored by the only process that receives it
> > currently.
>
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
>
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.
>
> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
>
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50
>
> > Where I saw the last conversation falter was in making a persuasive
> > case of why SIGINFO was interesting to add.  Given a world of ssh
> > connections I expect a persuasive case can be made.  Especially if there
> > are a handful of utilities where it is already implemented that just
> > need to be built with SIGINFO defined.
>
> One thing that's perhaps worth disentangling is the value of
> supporting VSTATUS --- which is a control character much like VINTR
> (^C) or VQUIT (control backslash) which is set via the c_cc[] array in
> termios structure.  Quoting from the termios man page:
>
>        VSTATUS
>               (not in POSIX; not supported under Linux; status
>               request: 024, DC4, Ctrl-T).  Status character (STATUS).
>               Display status information at terminal, including state
>               of foreground process and amount of CPU time it has
>               consumed.  Also sends a SIGINFO signal (not supported on
>               Linux) to the foreground process group.
>
> The basic idea is that when you type C-t, you can find out information
> about the currently running process.  This is a feature that
> originally comes from TOPS-10's TENEX operating system, and it is
> supported today on FreeBSD and Mac OS.  For example, it might display
> something like this:
>
> load: 2.39  cmd: ping 5374 running 0.00u 0.00s
>
> The reason why SIGINFO is sent to the foreground process group is that
> it gives the process an opportunity print application specific
> information about currently running process.  For example, maybe the C
> compiler could print something like "parsing 2042 of 5000 header
> files", or some such.  :-)
>
> There are people who wish that Linux supported Control-T / VSTATUS,
> for example, just last week, on TUHS, the Unix greybeards list, there
> were two such heartfelt wishes for Control-T support from two such
> greybeards:
>
>     "It's my biggest annoyance with Linux that it doesn't [support
>     control-t]
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html
>
>     "I personally can't stand using Linux, even casually for a very
>      short sys-admin task, because of this missing feature"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html
>
> I claim, though, that we could implement VSTATUS without implenting
> the SIGINFO part of the feature.  Previous few applications *ever*
> implemented SIGINFO signal handlers so they could give status
> information, it's the hard one, since we don't have any spare signals
> left.  If we were to repurpose some lesser used signal, whether it be
> SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be some
> userspace program (such as a UPS monitoring program which wants to
> trigger power fail handling, or a userspace NFSv4 process that wants
> to signal that it was unable to recover a file's file lock after a
> server reboot), and if we try to take over the signal assignment, it's
> possible that we might get surprised.  Furthermore, all of the
> possibly unused signals that we might try to reclaim terminate the
> process by default, and SIGINFO *has* to have a default signal
> handling action of Ignore, since otherwise typing Control-T will end
> up killing the current foreground application.
>
> Personally, I don't care all that much about VSTATUS support --- I
> used it when I was in university, but honestly, I've never missed it.
> But if there is someone who wants to try to implement VSTATUS, and
> make some Unix greybeards happy, and maybe even switch from FreeBSD to
> Linux as a result, go wild.  I'm not convinced, though, that adding
> the SIGINFO part of the support is worth the effort.
>
> Not only do almost no programs implement SIGINFO support, a lot of CPU
> bound programs where this might be actually useful, end up running a
> large number of processes in parallel.  Take the "parsing 2042 of 5000
> header files" example I gave above.  Consider what would happen if gcc
> implemented support for SIGINFO, but the user was running a "make -j
> 16" and typed Control-T.   The result would be chaos!
>
> So if you really miss Control-T, and it's the only thing holding back
> a few FreeBSD users from Linux, I don't see the problem with
> implementing that part of the feature.  Why not just do the easy part
> of the feature which is perhaps 5% of the work, and might provide 99%
> of the benefit (at least for those people who care).
>
> > Without seeing the persuasive case for more signals I have to say that
> > adding more signals to the kernel sounds like a bad idea.
>
> Concur, 100%.
>
>                                                 - Ted
Eric W. Biederman Jan. 4, 2022, 10:05 p.m. UTC | #8
"Theodore Ts'o" <tytso@mit.edu> writes:

> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
>> I dug through the previous conversations and there is a little debate
>> about what makes sense for SIGPWR to do by default.  Alan Cox remembered
>> SIGPWR was sent when the power was restored, so ignoring SIGPWR by
>> default made sense.  Ted Tso pointed out a different scenario where it
>> was reasonable for SIGPWR to be a terminating signal.
>> 
>> So far no one has actually found any applications that will regress if
>> SIGPWR becomes ignored by default.  Furthermore on linux SIGPWR is only
>> defined to be sent to init, and init ignores all signals by default so
>> in practice SIGPWR is ignored by the only process that receives it
>> currently.
>
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
>
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.
>
> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
>
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50


That is all as expected, and does not demonstrate a regression would
happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT,
SIGCHLD, SIGURG do.  It does show there is the possibility of problems.

The practical question is does anything send SIGPWR to anything besides
init, and expect the process to handle SIGPWR or terminate?

Possibly easier to implement (if people desire) is to simply send
SIGCONT with an si_code that indicates someone pressed the VSTATUS
key.  We have a per signal 32bit si_code space so that should
be comparatively easy.

> I claim, though, that we could implement VSTATUS without implenting
> the SIGINFO part of the feature.

I agree that is the place to start.  And if we aren't going to use
SIGINFO perhaps we could have an equally good notification method
if anyone wants one.  Say call an ioctl and get an fd that can
be read when a VSTATUS request comes in.

SIGINFO vs SIGCONT vs a fd vs something else is something we can sort
out when people get interested in modifying userspace.

Eric
Theodore Ts'o Jan. 4, 2022, 10:23 p.m. UTC | #9
On Tue, Jan 04, 2022 at 04:05:26PM -0600, Eric W. Biederman wrote:
> 
> That is all as expected, and does not demonstrate a regression would
> happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT,
> SIGCHLD, SIGURG do.  It does show there is the possibility of problems.
> 
> The practical question is does anything send SIGPWR to anything besides
> init, and expect the process to handle SIGPWR or terminate?

So if I *cared* about SIGINFO, what I'd do is ask the systemd
developers and users list if there are any users of the sigpwr.target
feature that they know of.  And I'd also download all of the open
source UPS monitoring applications (and perhaps documentation of
closed-source UPS applications, such as for example APC's program) and
see if any of them are trying to send the SIGPWR signal.

I don't personally think it's worth the effort to do that research,
but maybe other people care enough to do the work.

> > I claim, though, that we could implement VSTATUS without implenting
> > the SIGINFO part of the feature.
> 
> I agree that is the place to start.  And if we aren't going to use
> SIGINFO perhaps we could have an equally good notification method
> if anyone wants one.  Say call an ioctl and get an fd that can
> be read when a VSTATUS request comes in.
> 
> SIGINFO vs SIGCONT vs a fd vs something else is something we can sort
> out when people get interested in modifying userspace.


Once VSTATUS support lands in the kernel, we can wait and see if there
is anyone who shows up wanting the SIGINFO functionality.  Certainly
we have no shortage of userspace notification interfaces in Linux.  :-)

   	   	       		 	      - Ted
Walt Drummond Jan. 4, 2022, 10:31 p.m. UTC | #10
The only standard tools that support SIGINFO are sleep, dd and ping,
(and kill, for obvious reasons) so it's not like there's a vast hole
in the tooling or something, nor is there a large legacy software base
just waiting for SIGINFO to appear.   So while I very much enjoyed
figuring out how to make SIGINFO work ...

I'll have the VSTATUS patch out in a little bit.

I also think there might be some merit in consolidating the 10
'sigsetsize != sizeof(sigset_t)' checks in a macro and adding comments
that wave people off on trying to do what I did.  If that would be
useful, happy to provide the patch.

On Tue, Jan 4, 2022 at 2:23 PM Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Tue, Jan 04, 2022 at 04:05:26PM -0600, Eric W. Biederman wrote:
> >
> > That is all as expected, and does not demonstrate a regression would
> > happen if SIGPWR were to treat SIG_DFL as SIG_IGN, as SIGWINCH, SIGCONT,
> > SIGCHLD, SIGURG do.  It does show there is the possibility of problems.
> >
> > The practical question is does anything send SIGPWR to anything besides
> > init, and expect the process to handle SIGPWR or terminate?
>
> So if I *cared* about SIGINFO, what I'd do is ask the systemd
> developers and users list if there are any users of the sigpwr.target
> feature that they know of.  And I'd also download all of the open
> source UPS monitoring applications (and perhaps documentation of
> closed-source UPS applications, such as for example APC's program) and
> see if any of them are trying to send the SIGPWR signal.
>
> I don't personally think it's worth the effort to do that research,
> but maybe other people care enough to do the work.
>
> > > I claim, though, that we could implement VSTATUS without implenting
> > > the SIGINFO part of the feature.
> >
> > I agree that is the place to start.  And if we aren't going to use
> > SIGINFO perhaps we could have an equally good notification method
> > if anyone wants one.  Say call an ioctl and get an fd that can
> > be read when a VSTATUS request comes in.
> >
> > SIGINFO vs SIGCONT vs a fd vs something else is something we can sort
> > out when people get interested in modifying userspace.
>
>
> Once VSTATUS support lands in the kernel, we can wait and see if there
> is anyone who shows up wanting the SIGINFO functionality.  Certainly
> we have no shortage of userspace notification interfaces in Linux.  :-)
>
>                                               - Ted
Arseny Maslennikov Jan. 7, 2022, 7:19 p.m. UTC | #11
I generally agree with Ted's suggestion that we could merge the
easy-to-design part — the VSTATUS+kerninfo — first and deal with the
SIGINFO part later. The only concern I have here is that the "later"
part might never practically arrive... :)

Still, some notes on the SIGINFO/userspace-status
part:

On Tue, Jan 04, 2022 at 03:52:28PM -0500, Theodore Ts'o wrote:
> On Tue, Jan 04, 2022 at 12:00:34PM -0600, Eric W. Biederman wrote:
> > I dug through the previous conversations and there is a little debate
> > about what makes sense for SIGPWR to do by default.  Alan Cox remembered
> > SIGPWR was sent when the power was restored, so ignoring SIGPWR by
> > default made sense.  Ted Tso pointed out a different scenario where it
> > was reasonable for SIGPWR to be a terminating signal.
> > 
> > So far no one has actually found any applications that will regress if
> > SIGPWR becomes ignored by default.

Some folks from linux-api@ claimed otherwise, but unfortunately didn't elaborate.

> > Furthermore on linux SIGPWR is only
> > defined to be sent to init, and init ignores all signals by default so
> > in practice SIGPWR is ignored by the only process that receives it
> > currently.
> 
> As it turns out, systemd does *not* ignore SIGPWR.  Instead, it will
> initiate the sigpwr target.  From the systemd.special man page:
> 
>        sigpwr.target
>            A special target that is started when systemd receives the
>            SIGPWR process signal, which is normally sent by the kernel
>            or UPS daemons when power fails.

Not sure what you had in mind; in case you're suggesting that systemd has
to drop the sigpwr.target semantics — it doesn't.
We don't need to ask systemd to drop sigpwr.target semantics.

To introduce SIGINFO == SIGPWR to the kernel, the only "breaking" change
we have to do is to change the default disposition for SIGPWR, i. e. the
behaviour if the signal is set to SIG_DFL. If a process (including PID
1) installs its own signal handler for SIGPWR to do something when PWR
is received (or blocks the signal and handles it via signalfd
notifications), then the default disposition does not matter at all, as
Eric notes further in this thread.

From a quick glance at systemd code, pid1's main() function calls
manager_new() calls manager_setup_signals(); this function, in turn,
blocks a set of signals, including PWR, and sets up a signalfd(2) on
that set. No changes have to be made in systemd, no need to remove the
sigpwr.target semantics.

The target activation does not send SIGPWR to anyone, it results in
systemd services being started and possibly stopped; the exact
consequences are out of scope for systemd.

There could be another concern: a VSTATUS keypress could result in
SIGINFO == SIGPWR being sent to pid1. In a correct implementation this
will not ever happen, because a sane PID 1 does not have (and never
acquires) a controlling terminal.

> And child processes of systemd are not ignoring SIGPWR.  Instead, they
> are getting terminated.
> 
> <tytso@cwcc>
> 41% /bin/sleep 50 &
> [1] 180671
> <tytso@cwcc>
> 42% kill -PWR 180671
> [1]+  Power failure           /bin/sleep 50

All the possible surprises with the SIGINFO == SIGPWR approach we might
get stem from here, not from the sigpwr.target.

> > Where I saw the last conversation falter was in making a persuasive
> > case of why SIGINFO was interesting to add.  Given a world of ssh
> > connections I expect a persuasive case can be made.  Especially if there
> > are a handful of utilities where it is already implemented that just
> > need to be built with SIGINFO defined.
> 
> One thing that's perhaps worth disentangling is the value of
> supporting VSTATUS --- which is a control character much like VINTR
> (^C) or VQUIT (control backslash) which is set via the c_cc[] array in
> termios structure.  Quoting from the termios man page:
> 
>        VSTATUS
>               (not in POSIX; not supported under Linux; status
>               request: 024, DC4, Ctrl-T).  Status character (STATUS).
>               Display status information at terminal, including state
>               of foreground process and amount of CPU time it has
>               consumed.  Also sends a SIGINFO signal (not supported on
>               Linux) to the foreground process group.
> 
> The basic idea is that when you type C-t, you can find out information
> about the currently running process.  This is a feature that
> originally comes from TOPS-10's TENEX operating system, and it is
> supported today on FreeBSD and Mac OS.  For example, it might display
> something like this:
> 
> load: 2.39  cmd: ping 5374 running 0.00u 0.00s
> 
> The reason why SIGINFO is sent to the foreground process group is that
> it gives the process an opportunity print application specific
> information about currently running process.  For example, maybe the C
> compiler could print something like "parsing 2042 of 5000 header
> files", or some such.  :-)
> 
> There are people who wish that Linux supported Control-T / VSTATUS,
> for example, just last week, on TUHS, the Unix greybeards list, there
> were two such heartfelt wishes for Control-T support from two such
> greybeards:
> 
>     "It's my biggest annoyance with Linux that it doesn't [support
>     control-t]
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024849.html
> 
>     "I personally can't stand using Linux, even casually for a very
>      short sys-admin task, because of this missing feature"
>     - https://minnie.tuhs.org/pipermail/tuhs/2021-December/024898.html
> 
> I claim, though, that we could implement VSTATUS without implenting
> the SIGINFO part of the feature.  Previous few applications *ever*
> implemented SIGINFO signal handlers so they could give status
> information, it's the hard one, since we don't have any spare signals
> left.  If we were to repurpose some lesser used signal, whether it be
> SIGPWR, SIGLOST, or SIGSTKFLT, the danger is that there might be some
> userspace program (such as a UPS monitoring program which wants to
> trigger power fail handling, or a userspace NFSv4 process that wants
> to signal that it was unable to recover a file's file lock after a
> server reboot), and if we try to take over the signal assignment, it's
> possible that we might get surprised.  Furthermore, all of the
> possibly unused signals that we might try to reclaim terminate the
> process by default, and SIGINFO *has* to have a default signal
> handling action of Ignore, since otherwise typing Control-T will end
> up killing the current foreground application.
> 
> Personally, I don't care all that much about VSTATUS support --- I
> used it when I was in university, but honestly, I've never missed it.
> But if there is someone who wants to try to implement VSTATUS, and
> make some Unix greybeards happy, and maybe even switch from FreeBSD to
> Linux as a result, go wild.  I'm not convinced, though, that adding
> the SIGINFO part of the support is worth the effort.
> 
> Not only do almost no programs implement SIGINFO support, a lot of CPU

To be fair, many programs are a lot younger than 4.3BSD, and with the
current ubiquity of Linux without VSTATUS, it's kind of a chicken-egg
problem. :)

> bound programs where this might be actually useful, end up running a
> large number of processes in parallel.  Take the "parsing 2042 of 5000
> header files" example I gave above.  Consider what would happen if gcc
> implemented support for SIGINFO, but the user was running a "make -j
> 16" and typed Control-T.   The result would be chaos!
> 
> So if you really miss Control-T, and it's the only thing holding back
> a few FreeBSD users from Linux, I don't see the problem with
> implementing that part of the feature.  Why not just do the easy part
> of the feature which is perhaps 5% of the work, and might provide 99%
> of the benefit (at least for those people who care).
> 
> > Without seeing the persuasive case for more signals I have to say that
> > adding more signals to the kernel sounds like a bad idea.
> 
> Concur, 100%.
> 
> 						- Ted
Arseny Maslennikov Jan. 7, 2022, 7:29 p.m. UTC | #12
On Tue, Jan 04, 2022 at 02:31:44PM -0800, Walt Drummond wrote:
> The only standard tools that support SIGINFO are sleep, dd and ping,
> (and kill, for obvious reasons) so it's not like there's a vast hole
> in the tooling or something, nor is there a large legacy software base
> just waiting for SIGINFO to appear.   So while I very much enjoyed
> figuring out how to make SIGINFO work ...

As far as I recall, GNU make on *BSD does support SIGINFO (Not a
standard tool, but obviously an established one).

The developers of strace have expressed interest in SIGINFO support
to print tracer status messages (unfortunately, not on a public list).
Computational software can use this instead of stderr progress spam, if
run in an interactive fashion on a terminal, as it frequently is. There
is a user base, it's just not very vocal on kernel lists. :)
Pavel Machek May 19, 2022, 12:27 p.m. UTC | #13
Hi!

> > The only standard tools that support SIGINFO are sleep, dd and ping,
> > (and kill, for obvious reasons) so it's not like there's a vast hole
> > in the tooling or something, nor is there a large legacy software base
> > just waiting for SIGINFO to appear.   So while I very much enjoyed
> > figuring out how to make SIGINFO work ...
> 
> As far as I recall, GNU make on *BSD does support SIGINFO (Not a
> standard tool, but obviously an established one).
> 
> The developers of strace have expressed interest in SIGINFO support
> to print tracer status messages (unfortunately, not on a public list).
> Computational software can use this instead of stderr progress spam, if
> run in an interactive fashion on a terminal, as it frequently is. There
> is a user base, it's just not very vocal on kernel lists. :)

And often it would be useful if cp supported this. Yes, this
is feature I'd like to see.

BR,							Pavel

--