Message ID | 20230901204251.137307-6-richard.henderson@linaro.org |
---|---|
State | New |
Headers | show |
Series | [PULL,01/13] linux-user: Split out cpu/target_proc.h | expand |
Hi, I've run into a problem with this change. We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we started seeing errors cross compiling webkitgtk on x86_64 for x86_64 during the introspection code which runs under user mode qemu. The error we see is: qemu-x86_64: QEMU internal SIGSEGV {code=MAPERR, addr=0x20} Segmentation fault e.g. here: https://autobuilder.yoctoproject.org/typhoon/#/builders/40/builds/8488/steps/11/logs/stdio This usually seems to happen on our debian 11 based autobuilder machines. I took one of the broken builds and bisected it to this change (commit 7b7a3366e142d3baeb3fd1d3660a50e7956c19eb). There was a change in output from commit 7dfd3ca8d95f9962cdd2ebdfcdd699279b98fa18, before that it was: ERROR:../git/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu) Bail out! ERROR:../git/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion failed: (cpu == current_cpu) After digging into the code and trying to work out what is going on, I realised that n is NULL when it fails so this makes the problem "go away": diff --git a/linux-user/syscall.c b/linux-user/syscall.c index e384e14248..2577fb770d 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8085,6 +8085,9 @@ static int open_self_maps_2(void *opaque, target_ulong guest_start, while (1) { IntervalTreeNode *n = interval_tree_iter_first(d->host_maps, host_start, host_start); + if (!n) { + return 0; + } MapInfo *mi = container_of(n, MapInfo, itree); uintptr_t this_hlast = MIN(host_last, n->last); target_ulong this_gend = h2g(this_hlast) + 1; I'm hoping that might be enough to give you an idea of what is going on and what the correct fix may be? I haven't managed to make an isolated test to reproduce the issue yet. Cheers, Richard
26.01.2024 16:03, Richard Purdie wrote: > Hi, > > I've run into a problem with this change. > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we > started seeing errors cross compiling webkitgtk on x86_64 for x86_64 > during the introspection code which runs under user mode qemu. Hi Richard! Besides your observations, please be aware there's quite a few issues in 8.2.0. Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/ (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is updated less often) for fixes already queued up, if you haven't looked there already. 8.2.1 stable/bugfix release is scheduled for the beginning of the next week. Thanks, /mjt
Hi Michael, On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote: > 26.01.2024 16:03, Richard Purdie wrote: > > I've run into a problem with this change. > > > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we > > started seeing errors cross compiling webkitgtk on x86_64 for x86_64 > > during the introspection code which runs under user mode qemu. > > Besides your observations, please be aware there's quite a few issues in 8.2.0. > Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/ > (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is updated > less often) for fixes already queued up, if you haven't looked there already. > 8.2.1 stable/bugfix release is scheduled for the beginning of the next week. Thanks. I should note that I did test the staging-8.2 branch and nothing there helped. The issue was also present with master as of yesterday. https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto Projects tracking of the issue which has the commits for master and staging-8.2 that I tested. Cheers, Richard
On 1/26/24 23:52, Richard Purdie wrote: > Hi Michael, > > On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote: >> 26.01.2024 16:03, Richard Purdie wrote: >>> I've run into a problem with this change. >>> >>> We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we >>> started seeing errors cross compiling webkitgtk on x86_64 for x86_64 >>> during the introspection code which runs under user mode qemu. >> >> Besides your observations, please be aware there's quite a few issues in 8.2.0. >> Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/ >> (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is updated >> less often) for fixes already queued up, if you haven't looked there already. >> 8.2.1 stable/bugfix release is scheduled for the beginning of the next week. > > Thanks. > > I should note that I did test the staging-8.2 branch and nothing there > helped. The issue was also present with master as of yesterday. > > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto > Projects tracking of the issue which has the commits for master and > staging-8.2 that I tested. The yocto logs referenced here are not helpful for reproducing the problem. Please extract a binary to run, inputs, and command-line. r~
On Mon, 2024-02-05 at 13:05 +1000, Richard Henderson wrote: > On 1/26/24 23:52, Richard Purdie wrote: > > On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote: > > > 26.01.2024 16:03, Richard Purdie wrote: > > > > I've run into a problem with this change. > > > > > > > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we > > > > started seeing errors cross compiling webkitgtk on x86_64 for x86_64 > > > > during the introspection code which runs under user mode qemu. > > > > > > Besides your observations, please be aware there's quite a few issues in 8.2.0. > > > Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/ > > > (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is updated > > > less often) for fixes already queued up, if you haven't looked there already. > > > 8.2.1 stable/bugfix release is scheduled for the beginning of the next week. > > > > Thanks. > > > > I should note that I did test the staging-8.2 branch and nothing there > > helped. The issue was also present with master as of yesterday. > > > > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto > > Projects tracking of the issue which has the commits for master and > > staging-8.2 that I tested. > > The yocto logs referenced here are not helpful for reproducing the problem. It took me a couple of days I didn't have to workout which commit caused it, which versions showed the issue and how to work around it. It looks host kernel specific since it doesn't happen on some systems so even with the binaries/command/environment vars, it may not be enough. I was hoping the indication of the cause might help point to the fix as there is quite a bit of work in trying to extract this into a reproducer. The failure is 20 mins into a webkitgtk compile on a remote CI system which no longer has the context on it. > Please extract a binary to run, inputs, and command-line. I wish I could say that to the bug reports I get! :) I'll do my best but finding the time is going to be a challenge. Cheers, Richard
On Mon, Feb 05, 2024 at 11:11:06AM +0000, Richard Purdie wrote: > On Mon, 2024-02-05 at 13:05 +1000, Richard Henderson wrote: > > On 1/26/24 23:52, Richard Purdie wrote: > > > On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote: > > > > 26.01.2024 16:03, Richard Purdie wrote: > > > > > I've run into a problem with this change. > > > > > > > > > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we > > > > > started seeing errors cross compiling webkitgtk on x86_64 for x86_64 > > > > > during the introspection code which runs under user mode qemu. > > > > > > > > Besides your observations, please be aware there's quite a few issues in 8.2.0. > > > > Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/ > > > > (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is updated > > > > less often) for fixes already queued up, if you haven't looked there already. > > > > 8.2.1 stable/bugfix release is scheduled for the beginning of the next week. > > > > > > Thanks. > > > > > > I should note that I did test the staging-8.2 branch and nothing there > > > helped. The issue was also present with master as of yesterday. > > > > > > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto > > > Projects tracking of the issue which has the commits for master and > > > staging-8.2 that I tested. > > > > The yocto logs referenced here are not helpful for reproducing the problem. > > It took me a couple of days I didn't have to workout which commit > caused it, which versions showed the issue and how to work around it. > > It looks host kernel specific since it doesn't happen on some systems > so even with the binaries/command/environment vars, it may not be > enough. > > I was hoping the indication of the cause might help point to the fix as > there is quite a bit of work in trying to extract this into a > reproducer. The failure is 20 mins into a webkitgtk compile on a remote > CI system which no longer has the context on it. > > > Please extract a binary to run, inputs, and command-line. > > I wish I could say that to the bug reports I get! :) > > I'll do my best but finding the time is going to be a challenge. > > Cheers, > > Richard I just ran into a similar crash and could reproduce it with 5005aed8a7e7 alpha-linux-user as follows: #include <fcntl.h> #include <sys/shm.h> int main(void) { shmat(shmget(IPC_PRIVATE, 1836016, IPC_CREAT | 0600), (void *)0x20000804000, 0); open("/proc/self/maps", O_RDONLY); } Apparently an mmap() is missing for shmat() when g>h and shmaddr is specified. The mismatch between the host's and the guest's view of the mapping's tail appears to be causing the SEGV.
On 2/12/24 10:43, Ilya Leoshkevich wrote: > int main(void) > { > shmat(shmget(IPC_PRIVATE, 1836016, IPC_CREAT | 0600), (void *)0x20000804000, 0); > open("/proc/self/maps", O_RDONLY); > } > > Apparently an mmap() is missing for shmat() when g>h and shmaddr is > specified. The mismatch between the host's and the guest's view of the > mapping's tail appears to be causing the SEGV. Yes, shmat() is handling none of the h != g page size issues; it is in fact fairly broken. r~
diff --git a/linux-user/syscall.c b/linux-user/syscall.c index a562920a84..0b91f996b7 100644 --- a/linux-user/syscall.c +++ b/linux-user/syscall.c @@ -8095,12 +8095,66 @@ static int open_self_cmdline(CPUArchState *cpu_env, int fd) return 0; } -static void show_smaps(int fd, unsigned long size) -{ - unsigned long page_size_kb = TARGET_PAGE_SIZE >> 10; - unsigned long size_kb = size >> 10; +struct open_self_maps_data { + TaskState *ts; + IntervalTreeRoot *host_maps; + int fd; + bool smaps; +}; - dprintf(fd, "Size: %lu kB\n" +/* + * Subroutine to output one line of /proc/self/maps, + * or one region of /proc/self/smaps. + */ + +#ifdef TARGET_HPPA +# define test_stack(S, E, L) (E == L) +#else +# define test_stack(S, E, L) (S == L) +#endif + +static void open_self_maps_4(const struct open_self_maps_data *d, + const MapInfo *mi, abi_ptr start, + abi_ptr end, unsigned flags) +{ + const struct image_info *info = d->ts->info; + const char *path = mi->path; + uint64_t offset; + int fd = d->fd; + int count; + + if (test_stack(start, end, info->stack_limit)) { + path = "[stack]"; + } + + /* Except null device (MAP_ANON), adjust offset for this fragment. */ + offset = mi->offset; + if (mi->dev) { + uintptr_t hstart = (uintptr_t)g2h_untagged(start); + offset += hstart - mi->itree.start; + } + + count = dprintf(fd, TARGET_ABI_FMT_ptr "-" TARGET_ABI_FMT_ptr + " %c%c%c%c %08" PRIx64 " %02x:%02x %"PRId64, + start, end, + (flags & PAGE_READ) ? 'r' : '-', + (flags & PAGE_WRITE_ORG) ? 'w' : '-', + (flags & PAGE_EXEC) ? 'x' : '-', + mi->is_priv ? 'p' : 's', + offset, major(mi->dev), minor(mi->dev), + (uint64_t)mi->inode); + if (path) { + dprintf(fd, "%*s%s\n", 73 - count, "", path); + } else { + dprintf(fd, "\n"); + } + + if (d->smaps) { + unsigned long size = end - start; + unsigned long page_size_kb = TARGET_PAGE_SIZE >> 10; + unsigned long size_kb = size >> 10; + + dprintf(fd, "Size: %lu kB\n" "KernelPageSize: %lu kB\n" "MMUPageSize: %lu kB\n" "Rss: 0 kB\n" @@ -8121,91 +8175,75 @@ static void show_smaps(int fd, unsigned long size) "Swap: 0 kB\n" "SwapPss: 0 kB\n" "Locked: 0 kB\n" - "THPeligible: 0\n", size_kb, page_size_kb, page_size_kb); + "THPeligible: 0\n" + "VmFlags:%s%s%s%s%s%s%s%s\n", + size_kb, page_size_kb, page_size_kb, + (flags & PAGE_READ) ? " rd" : "", + (flags & PAGE_WRITE_ORG) ? " wr" : "", + (flags & PAGE_EXEC) ? " ex" : "", + mi->is_priv ? "" : " sh", + (flags & PAGE_READ) ? " mr" : "", + (flags & PAGE_WRITE_ORG) ? " mw" : "", + (flags & PAGE_EXEC) ? " me" : "", + mi->is_priv ? "" : " ms"); + } } -static int open_self_maps_1(CPUArchState *cpu_env, int fd, bool smaps) +/* + * Callback for walk_memory_regions, when read_self_maps() fails. + * Proceed without the benefit of host /proc/self/maps cross-check. + */ +static int open_self_maps_3(void *opaque, target_ulong guest_start, + target_ulong guest_end, unsigned long flags) { - CPUState *cpu = env_cpu(cpu_env); - TaskState *ts = cpu->opaque; - IntervalTreeRoot *map_info = read_self_maps(); - IntervalTreeNode *s; - int count; + static const MapInfo mi = { .is_priv = true }; - for (s = interval_tree_iter_first(map_info, 0, -1); s; - s = interval_tree_iter_next(s, 0, -1)) { - MapInfo *e = container_of(s, MapInfo, itree); + open_self_maps_4(opaque, &mi, guest_start, guest_end, flags); + return 0; +} - if (h2g_valid(e->itree.start)) { - unsigned long min = e->itree.start; - unsigned long max = e->itree.last + 1; - int flags = page_get_flags(h2g(min)); - const char *path; +/* + * Callback for walk_memory_regions, when read_self_maps() succeeds. + */ +static int open_self_maps_2(void *opaque, target_ulong guest_start, + target_ulong guest_end, unsigned long flags) +{ + const struct open_self_maps_data *d = opaque; + uintptr_t host_start = (uintptr_t)g2h_untagged(guest_start); + uintptr_t host_last = (uintptr_t)g2h_untagged(guest_end - 1); - max = h2g_valid(max - 1) ? - max : (uintptr_t) g2h_untagged(GUEST_ADDR_MAX) + 1; + while (1) { + IntervalTreeNode *n = + interval_tree_iter_first(d->host_maps, host_start, host_start); + MapInfo *mi = container_of(n, MapInfo, itree); + uintptr_t this_hlast = MIN(host_last, n->last); + target_ulong this_gend = h2g(this_hlast) + 1; - if (!page_check_range(h2g(min), max - min, flags)) { - continue; - } + open_self_maps_4(d, mi, guest_start, this_gend, flags); -#ifdef TARGET_HPPA - if (h2g(max) == ts->info->stack_limit) { -#else - if (h2g(min) == ts->info->stack_limit) { -#endif - path = "[stack]"; - } else { - path = e->path; - } - - count = dprintf(fd, TARGET_ABI_FMT_ptr "-" TARGET_ABI_FMT_ptr - " %c%c%c%c %08" PRIx64 " %02x:%02x %"PRId64, - h2g(min), h2g(max - 1) + 1, - (flags & PAGE_READ) ? 'r' : '-', - (flags & PAGE_WRITE_ORG) ? 'w' : '-', - (flags & PAGE_EXEC) ? 'x' : '-', - e->is_priv ? 'p' : 's', - (uint64_t)e->offset, - major(e->dev), minor(e->dev), - (uint64_t)e->inode); - if (path) { - dprintf(fd, "%*s%s\n", 73 - count, "", path); - } else { - dprintf(fd, "\n"); - } - if (smaps) { - show_smaps(fd, max - min); - dprintf(fd, "VmFlags:%s%s%s%s%s%s%s%s\n", - (flags & PAGE_READ) ? " rd" : "", - (flags & PAGE_WRITE_ORG) ? " wr" : "", - (flags & PAGE_EXEC) ? " ex" : "", - e->is_priv ? "" : " sh", - (flags & PAGE_READ) ? " mr" : "", - (flags & PAGE_WRITE_ORG) ? " mw" : "", - (flags & PAGE_EXEC) ? " me" : "", - e->is_priv ? "" : " ms"); - } + if (this_hlast == host_last) { + return 0; } + host_start = this_hlast + 1; + guest_start = h2g(host_start); } +} - free_self_maps(map_info); +static int open_self_maps_1(CPUArchState *env, int fd, bool smaps) +{ + struct open_self_maps_data d = { + .ts = env_cpu(env)->opaque, + .host_maps = read_self_maps(), + .fd = fd, + .smaps = smaps + }; -#ifdef TARGET_VSYSCALL_PAGE - /* - * We only support execution from the vsyscall page. - * This is as if CONFIG_LEGACY_VSYSCALL_XONLY=y from v5.3. - */ - count = dprintf(fd, TARGET_FMT_lx "-" TARGET_FMT_lx - " --xp 00000000 00:00 0", - TARGET_VSYSCALL_PAGE, TARGET_VSYSCALL_PAGE + TARGET_PAGE_SIZE); - dprintf(fd, "%*s%s\n", 73 - count, "", "[vsyscall]"); - if (smaps) { - show_smaps(fd, TARGET_PAGE_SIZE); - dprintf(fd, "VmFlags: ex\n"); + if (d.host_maps) { + walk_memory_regions(&d, open_self_maps_2); + free_self_maps(d.host_maps); + } else { + walk_memory_regions(&d, open_self_maps_3); } -#endif - return 0; }