Add crash handler for qemu-linux-user

[PATCH][RFC] Add crash handler for qemu-linux-user

Posted by Helge Deller 2 years, 6 months ago

If an internal error in QEMU's own source code triggers a SIGSEGV,
QEMU currently assumes that the SIGSEGV came from the target
and prints:

(hppa-chroot)root@p100:/# cat /proc/self/maps
**
ERROR:../../home/cvs/qemu/qemu/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu)
Bail out! ERROR:../../home/cvs/qemu/qemu/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu)
**

This error message is very misleading for developers and end-users.

The attached patch will print instead:

(hppa-chroot)root@p100:/# cat /proc/self/maps
QEMU host internal error: signal=11, errno=0, code=1, addr=(nil)
QEMU host backtrace:
[0x7fd285445039]
[0x7fd28561af80]
[0x7fd28544eba9]
[0x7fd28544ebcb]
[0x7fd285459631]
[0x7fd28545d833]
[0x7fd285463e78]
[0x7fd2853fbd2d]
[0x7fd2853f57f9]
[0x7fd2856056a8]
[0x7fd285606d8f]
[0x7fd2853f60a5]

Note that glibc's backtrace() does not resolve the addresses to function
names, which is why only addresses are shown above.

Is such a patch useful?

Any ideas on how to get the function names resolved, e.g. by including a
static build of libunwind?
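
As a point of comparison for the name-resolution question, here is a minimal sketch that uses glibc's backtrace_symbols_fd() instead of backtrace_symbols(). It is not part of the patch. The helper names crash_backtrace_warm_up() and crash_backtrace_dump() and the MAX_FRAMES limit are hypothetical. Per the glibc manual page, backtrace_symbols_fd() writes straight to a file descriptor without calling malloc(), which makes it a better fit for a signal handler. It does not solve the naming problem by itself: names are still only available for symbols in the dynamic symbol table (hence -rdynamic), and for a static executable the raw addresses would still need to be resolved offline, for example with addr2line.

```c
#include <execinfo.h>
#include <unistd.h>

#define MAX_FRAMES 64   /* hypothetical limit, chosen for illustration */

/*
 * Call once during normal startup. The first backtrace() call may load
 * the unwinder library, which can allocate memory, so it is safer to
 * trigger that outside of a signal handler.
 */
void crash_backtrace_warm_up(void)
{
    void *frames[1];

    (void)backtrace(frames, 1);
}

/*
 * Write the current call chain to fd, one frame per line.
 * Returns the number of frames that were captured.
 */
int crash_backtrace_dump(int fd)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    /* Unlike backtrace_symbols(), this variant does not call malloc(). */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}
```

A handler could call crash_backtrace_dump(STDERR_FILENO) and then restore the default disposition and re-raise the signal, so that a core dump is still produced.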

Note that the patch below includes a deliberate SIGSEGV trigger for
testing purposes (just run "cat /proc/self/maps" in the chroot).

Opinions?

Helge


Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/linux-user/signal.c b/linux-user/signal.c
index 748a98f3e5..5d01ab61b8 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -23,6 +23,7 @@

 #include <sys/ucontext.h>
 #include <sys/resource.h>
+#include <execinfo.h>

 #include "qemu.h"
 #include "user-internals.h"
@@ -781,6 +782,27 @@ static inline void rewind_if_in_safe_syscall(void *puc)
     }
 }

+static void qemu_show_backtrace(siginfo_t *info)
+{
+    void *array[20];
+    char **strings;
+    int size, i;
+
+    fprintf(stderr, "QEMU host internal error: signal=%d, errno=%d, code=%d, addr=%p\n",
+        info->si_signo, info->si_errno, info->si_code, info->si_addr);
+    size = backtrace (array, 20);
+    strings = backtrace_symbols (array, size);
+    if (strings)
+    {
+        fprintf(stderr, "QEMU host backtrace:\n");
+        for (i = 0; i < size; i++)
+            fprintf(stderr, "%s\n", strings[i]);
+    }
+    free (strings);
+    exit(info->si_code);
+}
+
+
 static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
 {
     CPUArchState *env = thread_cpu->env_ptr;
@@ -819,6 +841,11 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
         if (host_sig == SIGSEGV) {
             bool maperr = true;

+            /* Did QEMU's own code crash? */
+            if (unlikely(!h2g_valid(host_addr))) {
+                qemu_show_backtrace(info);
+            }
+
             if (info->si_code == SEGV_ACCERR && h2g_valid(host_addr)) {
                 /* If this was a write to a TB protected page, restart. */
                 if (is_write &&
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index a15bce2be2..0267ff7649 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8074,6 +8074,7 @@ static int open_self_maps_1(CPUArchState *cpu_env, int fd, bool smaps)
     IntervalTreeNode *s;
     int count;

+*(int*)NULL = 1;
     for (s = interval_tree_iter_first(map_info, 0, -1); s;
          s = interval_tree_iter_next(s, 0, -1)) {
         MapInfo *e = container_of(s, MapInfo, itree);
diff --git a/meson.build b/meson.build
index 98e68ef0b1..6db4e029a0 100644
--- a/meson.build
+++ b/meson.build
@@ -4068,6 +4068,8 @@ if targetos == 'darwin'
   summary_info += {'Objective-C compiler': ' '.join(meson.get_compiler('objc').cmd_array())}
 endif
 option_cflags = (get_option('debug') ? ['-g'] : [])
+# add symbol table for backtrace(), not sufficient for qemu-linux-user built as static executable
+option_cflags += ['-rdynamic']
 if get_option('optimization') != 'plain'
   option_cflags += ['-O' + get_option('optimization')]
 endif

Re: [PATCH][RFC] Add crash handler for qemu-linux-user

Posted by Richard Henderson 2 years, 6 months ago

On 8/9/23 16:07, Helge Deller wrote:
> +            /* did qemu source code crashed? */
> +            if (unlikely(!h2g_valid(host_addr))) {
> +                qemu_show_backtrace(info);
> +            }

This won't do anything at all when reserved_va == 0,
i.e. 64-bit guest on 64-bit host, or any 32-bit host.

The idea of having a backtrace is nice, I suppose, we just need a better detector. Between 
host_signal_pc() and adjust_signal_pc(), we should be able to determine if the access is 
being done on behalf of the guest.  If it isn't, it's a qemu bug.

r~
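
The detector idea above can be illustrated with a small standalone sketch. This is not QEMU code and does not use host_signal_pc() or adjust_signal_pc(). The FaultContext structure, its fields, and both functions are hypothetical stand-ins. The assumption is that a fault counts as "on behalf of the guest" when the host program counter lies inside the generated-code buffer, or when a helper was explicitly flagged as accessing guest memory. Anything else would be treated as an internal bug.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the state a signal handler could consult. */
typedef struct FaultContext {
    uintptr_t pc;               /* host program counter at the fault */
    uintptr_t jit_start;        /* start of the generated-code buffer */
    size_t    jit_size;         /* size of that buffer in bytes */
    bool      in_guest_helper;  /* set while a helper touches guest memory */
} FaultContext;

/* True if the faulting host PC lies inside the generated-code buffer. */
static bool pc_in_jit_buffer(const FaultContext *ctx)
{
    /*
     * Unsigned subtraction wraps around when pc < jit_start, giving a
     * huge value, so a single comparison covers both bounds.
     */
    return ctx->pc - ctx->jit_start < ctx->jit_size;
}

/*
 * True if the fault was raised while working on behalf of the guest.
 * False means the emulator's own code crashed.
 */
bool fault_is_on_behalf_of_guest(const FaultContext *ctx)
{
    return pc_in_jit_buffer(ctx) || ctx->in_guest_helper;
}
```

With a classifier of this shape, the handler would print a host backtrace only when fault_is_on_behalf_of_guest() returns false, independent of whether reserved_va is set.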

Re: [PATCH][RFC] Add crash handler for qemu-linux-user

Posted by Peter Maydell 2 years, 6 months ago

On Thu, 10 Aug 2023 at 02:28, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 8/9/23 16:07, Helge Deller wrote:
> > +            /* did qemu source code crashed? */
> > +            if (unlikely(!h2g_valid(host_addr))) {
> > +                qemu_show_backtrace(info);
> > +            }
>
> This won't do anything at all when reserved_va == 0,
> i.e. 64-bit guest on 64-bit host, or any 32-bit host.
>
> The idea of having a backtrace is nice, I suppose, we just need
> a better detector.

I think Dan also had a look at one point at doing
backtraces for crashes in system emulation mode?
Certainly this would be useful for test crashes in CI.

thanks
-- PMM

Re: [PATCH][RFC] Add crash handler for qemu-linux-user

Posted by Daniel P. Berrangé 2 years, 5 months ago

On Thu, Aug 10, 2023 at 10:14:06AM +0100, Peter Maydell wrote:
> On Thu, 10 Aug 2023 at 02:28, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > On 8/9/23 16:07, Helge Deller wrote:
> > > +            /* did qemu source code crashed? */
> > > +            if (unlikely(!h2g_valid(host_addr))) {
> > > +                qemu_show_backtrace(info);
> > > +            }
> >
> > This won't do anything at all when reserved_va == 0,
> > i.e. 64-bit guest on 64-bit host, or any 32-bit host.
> >
> > The idea of having a backtrace is nice, I suppose, we just need
> > a better detector.
> 
> I think Dan also had a look at one point at doing
> backtraces for crashes in system emulation mode?
> Certainly this would be useful for test crashes in CI.

I stopped that work as I couldn't figure out a way to get a backtrace
across all the threads, which severely limited its usefulness in the
QEMU system emulators.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|