Message ID | 20220209112207.3368139-1-alex.bennee@linaro.org |
---|---|
State | New |
Headers | show |
Series | [RFC] linux-user: trap internal SIGABRT's | expand |
On Wed, 9 Feb 2022 at 11:35, Alex Bennée <alex.bennee@linaro.org> wrote: > linux-user wants to trap all signals in case they are related to the > guest. This however results in less than helpful core dumps when the > error is internal to QEMU. We can detect when an assert failure is in > progress by examining __glib_assert_msg and fall through to > cpu_abort() which will pretty print something before restoring the > default SIGABRT behaviour and dumping core. There is definitely a problem here that it would be nice to fix, but __glib_assert_msg is as far as I can tell not a documented public-facing glib API, and in any case it won't catch assertions via plain old assert() or abort() or for that matter SIGSEGVs and other kinds of crash in QEMU's own code. thanks -- PMM
Peter Maydell <peter.maydell@linaro.org> writes: > On Wed, 9 Feb 2022 at 11:35, Alex Bennée <alex.bennee@linaro.org> wrote: >> linux-user wants to trap all signals in case they are related to the >> guest. This however results in less than helpful core dumps when the >> error is internal to QEMU. We can detect when an assert failure is in >> progress by examining __glib_assert_msg and fall through to >> cpu_abort() which will pretty print something before restoring the >> default SIGABRT behaviour and dumping core. > > There is definitely a problem here that it would be nice to > fix, but __glib_assert_msg is as far as I can tell not a > documented public-facing glib API, Yeah it's in an odd position - it is explicitly exported but not documented as an API but for use by crash tools: https://gitlab.gnome.org/GNOME/glib/-/issues/712 > and in any case it won't > catch assertions via plain old assert() or abort() or for libc does provide an a private __abort_msg but that is explicitly private and I guess would break against a non-gnu libc (do we support that?). Explicit aborts() in linux-user code should probably be converted to cpu_abort as it does the right thing. asserts() can be converted to g_assert() given as glib is a absolute requirement for building. > that matter SIGSEGVs and other kinds of crash in QEMU's own code. There is some checking in the host_signal_handler that could be a bit cleverer. We currently check for h2g_valid(host_addr) but we could expand that to cover QEMU's own address space and behave appropriately. > > thanks > -- PMM
On Wed, 9 Feb 2022 at 13:23, Alex Bennée <alex.bennee@linaro.org> wrote: > > > Peter Maydell <peter.maydell@linaro.org> writes: > > > On Wed, 9 Feb 2022 at 11:35, Alex Bennée <alex.bennee@linaro.org> wrote: > >> linux-user wants to trap all signals in case they are related to the > >> guest. This however results in less than helpful core dumps when the > >> error is internal to QEMU. We can detect when an assert failure is in > >> progress by examining __glib_assert_msg and fall through to > >> cpu_abort() which will pretty print something before restoring the > >> default SIGABRT behaviour and dumping core. > > > > There is definitely a problem here that it would be nice to > > fix, but __glib_assert_msg is as far as I can tell not a > > documented public-facing glib API, > > Yeah it's in an odd position - it is explicitly exported but not > documented as an API but for use by crash tools: > > https://gitlab.gnome.org/GNOME/glib/-/issues/712 Mmm. I think if glib specifically mark it as "not part of our API" then we should not be touching it. -- PMM
On 2/9/22 22:22, Alex Bennée wrote: > linux-user wants to trap all signals in case they are related to the > guest. This however results in less than helpful core dumps when the > error is internal to QEMU. We can detect when an assert failure is in > progress by examining __glib_assert_msg and fall through to > cpu_abort() which will pretty print something before restoring the > default SIGABRT behaviour and dumping core. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > --- > linux-user/signal.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/linux-user/signal.c b/linux-user/signal.c > index 32854bb375..8ecc1215f7 100644 > --- a/linux-user/signal.c > +++ b/linux-user/signal.c > @@ -809,6 +809,8 @@ static inline void rewind_if_in_safe_syscall(void *puc) > } > } > > +GLIB_VAR char *__glib_assert_msg; > + > static void host_signal_handler(int host_sig, siginfo_t *info, void *puc) > { > CPUArchState *env = thread_cpu->env_ptr; > @@ -821,6 +823,10 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc) > uintptr_t pc = 0; > bool sync_sig = false; > > + if (__glib_assert_msg) { > + cpu_abort(cpu, "internal QEMU error, aborting..."); > + } I think we should not be trapping SIGABRT. I think we can preserve all guest behaviour wrt SIGABRT by stealing another SIGRTMIN value, and remapping the guest signal number. We can produce the correct result for the system by mapping it back to host SIGABRT in core_dump_and_abort(). r~
On Wed, 9 Feb 2022 at 22:12, Richard Henderson <richard.henderson@linaro.org> wrote: > > On 2/9/22 22:22, Alex Bennée wrote: > > linux-user wants to trap all signals in case they are related to the > > guest. This however results in less than helpful core dumps when the > > error is internal to QEMU. We can detect when an assert failure is in > > progress by examining __glib_assert_msg and fall through to > > cpu_abort() which will pretty print something before restoring the > > default SIGABRT behaviour and dumping core. > > > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > > --- > > linux-user/signal.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/linux-user/signal.c b/linux-user/signal.c > > index 32854bb375..8ecc1215f7 100644 > > --- a/linux-user/signal.c > > +++ b/linux-user/signal.c > > @@ -809,6 +809,8 @@ static inline void rewind_if_in_safe_syscall(void *puc) > > } > > } > > > > +GLIB_VAR char *__glib_assert_msg; > > + > > static void host_signal_handler(int host_sig, siginfo_t *info, void *puc) > > { > > CPUArchState *env = thread_cpu->env_ptr; > > @@ -821,6 +823,10 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc) > > uintptr_t pc = 0; > > bool sync_sig = false; > > > > + if (__glib_assert_msg) { > > + cpu_abort(cpu, "internal QEMU error, aborting..."); > > + } > > I think we should not be trapping SIGABRT. I think we can preserve all guest behaviour > wrt SIGABRT by stealing another SIGRTMIN value, and remapping the guest signal number. We > can produce the correct result for the system by mapping it back to host SIGABRT in > core_dump_and_abort(). Stealing signal values is awkward, because you don't know what the guest (either application or libc) might be trying to do with them. I think we should prefer not to steal them, especially in this case where the only reason for taking a signal value for our own use is to deal with a "should never happen anyway" case. We've already had a few bugs/awkwardnesses with the current level of signal stealing/rearranging we have to do. thanks -- PMM
diff --git a/linux-user/signal.c b/linux-user/signal.c index 32854bb375..8ecc1215f7 100644 --- a/linux-user/signal.c +++ b/linux-user/signal.c @@ -809,6 +809,8 @@ static inline void rewind_if_in_safe_syscall(void *puc) } } +GLIB_VAR char *__glib_assert_msg; + static void host_signal_handler(int host_sig, siginfo_t *info, void *puc) { CPUArchState *env = thread_cpu->env_ptr; @@ -821,6 +823,10 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc) uintptr_t pc = 0; bool sync_sig = false; + if (__glib_assert_msg) { + cpu_abort(cpu, "internal QEMU error, aborting..."); + } + /* * Non-spoofed SIGSEGV and SIGBUS are synchronous, and need special * handling wrt signal blocking and unwinding.
linux-user wants to trap all signals in case they are related to the guest. This however results in less than helpful core dumps when the error is internal to QEMU. We can detect when an assert failure is in progress by examining __glib_assert_msg and fall through to cpu_abort() which will pretty print something before restoring the default SIGABRT behaviour and dumping core. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> --- linux-user/signal.c | 6 ++++++ 1 file changed, 6 insertions(+)