The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:

  Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)

are available in the Git repository at:

  https://github.com/elmarco/qemu.git tags/chardev-pull-request

for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:

  tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)

----------------------------------------------------------------
Various chardev fixes

----------------------------------------------------------------
Artem Pisarenko (2):
      chardev: fix mess in OPENED/CLOSED events when muxed
      tests/test-char: add muxed chardev testing for open/close

Daniel P. Berrangé (16):
      io: store reference to thread information in the QIOTask struct
      io: add qio_task_wait_thread to join with a background thread
      chardev: fix validation of options for QMP created chardevs
      chardev: forbid 'reconnect' option with server sockets
      chardev: forbid 'wait' option with client sockets
      chardev: remove many local variables in qemu_chr_parse_socket
      chardev: ensure qemu_chr_parse_compat reports missing driver error
      chardev: remove unused 'sioc' variable & cleanup paths
      chardev: split tcp_chr_wait_connected into two methods
      chardev: split up qmp_chardev_open_socket connection code
      chardev: use a state machine for socket connection state
      chardev: honour the reconnect setting in tcp_chr_wait_connected
      chardev: disallow TLS/telnet/websocket with tcp_chr_wait_connected
      chardev: fix race with client connections in tcp_chr_wait_connected
      tests: expand coverage of socket chardev test
      chardev: ensure termios is fully initialized

 include/chardev/char-fe.h      |  18 +-
 include/io/task.h              |  29 +-
 chardev/char-fe.c              |  33 +-
 chardev/char-mux.c             |  16 +-
 chardev/char-serial.c          |   2 +-
 chardev/char-socket.c          | 487 ++++++++++++++++------
 chardev/char.c                 |   2 +
 io/task.c                      |  98 +++--
 tests/ivshmem-test.c           |   2 +-
 tests/libqtest.c               |   4 +-
 tests/test-char.c              | 723 +++++++++++++++++++++++++--------
 tests/test-filter-redirector.c |   4 +-
 io/trace-events                |   2 +
 13 files changed, 1061 insertions(+), 359 deletions(-)

--
2.20.1.519.g8feddda32c
On Thu, 7 Feb 2019 at 16:06, Marc-André Lureau <marcandre.lureau@redhat.com> wrote:
>
> The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
>
>   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/elmarco/qemu.git tags/chardev-pull-request
>
> for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
>
>   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
>
> ----------------------------------------------------------------
> Various chardev fixes
>
> ----------------------------------------------------------------

This seems to result in 'make check' failures on some platforms.
I saw this on s390 and aarch32, I think.

MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
tests/test-char -m=quick -k --tap < /dev/null |
./scripts/tap-driver.pl --test-name="test-char
"
PASS 1 test-char /char/null
PASS 2 test-char /char/invalid
PASS 3 test-char /char/ringbuf
PASS 4 test-char /char/mux
PASS 5 test-char /char/stdio
PASS 6 test-char /char/pipe
PASS 7 test-char /char/file
PASS 8 test-char /char/file-fifo
PASS 9 test-char /char/udp
PASS 10 test-char /char/serial
PASS 11 test-char /char/hotswap
PASS 12 test-char /char/websocket
PASS 13 test-char /char/socket/server/mainloop/tcp
PASS 14 test-char /char/socket/server/mainloop/unix
PASS 15 test-char /char/socket/server/wait-conn/tcp
PASS 16 test-char /char/socket/server/wait-conn/unix
PASS 17 test-char /char/socket/server/mainloop-fdpass/tcp
PASS 18 test-char /char/socket/server/mainloop-fdpass/unix
PASS 19 test-char /char/socket/server/wait-conn-fdpass/tcp
PASS 20 test-char /char/socket/server/wait-conn-fdpass/unix
PASS 21 test-char /char/socket/client/mainloop/tcp
PASS 22 test-char /char/socket/client/mainloop/unix
qemu: qemu_mutex_destroy: Device or resource busy
PASS 23 test-char /char/socket/client/wait-conn/tcp
PASS 24 test-char /char/socket/client/wait-conn/unix
Aborted (core dumped)
ERROR - too few tests run (expected 32, got 24)

Here's a backtrace from running tests/test-char under gdb.
Looks like a race condition between a thread trying to
destroy a mutex and a different thread that is still
using it.

qemu: qemu_mutex_destroy: Device or resource busy
test-char: /home/linux1/qemu/util/qemu-thread-posix.c:92: qemu_mutex_unlock_impl: Assertion `mutex->initialized' failed.

(gdb) thread apply all bt

Thread 17 (Thread 0x3fff77ff910 (LWP 35364)):
#0  0x000003fffd7381b8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x000003fffd739726 in __GI_abort () at abort.c:89
#2  0x000003fffd7300d6 in __assert_fail_base (fmt=0x3fffd84d18c "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x1000918d2 "mutex->initialized", file=0x1000918a6 "/home/linux1/qemu/util/qemu-thread-posix.c", line=<optimized out>, function=0x100091bb2 <__PRETTY_FUNCTION__.18115> "qemu_mutex_unlock_impl") at assert.c:92
#3  0x000003fffd730164 in __GI___assert_fail (assertion=0x1000918d2 "mutex->initialized", file=0x1000918a6 "/home/linux1/qemu/util/qemu-thread-posix.c", line=<optimized out>, function=0x100091bb2 <__PRETTY_FUNCTION__.18115> "qemu_mutex_unlock_impl") at assert.c:101
#4  0x000000010005db16 in qemu_mutex_unlock_impl (mutex=<optimized out>, file=<optimized out>, line=<optimized out>) at /home/linux1/qemu/util/qemu-thread-posix.c:92
#5  0x00000001000257cc in qio_task_thread_worker (opaque=opaque@entry=0x1000f1370) at /home/linux1/qemu/io/task.c:141
#6  0x000000010005d640 in qemu_thread_start (args=<optimized out>) at /home/linux1/qemu/util/qemu-thread-posix.c:502
#7  0x000003fffd907934 in start_thread (arg=0x3fff77ff910) at pthread_create.c:335
#8  0x000003fffd7edce2 in thread_start () at ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74

Thread 16 (Thread 0x3fff7fff910 (LWP 35363)):
#0  0x000003fffd911774 in __libc_recvmsg (fd=<optimized out>, msg=0x3fff7ffe7d0, flags=<optimized out>) at ../sysdeps/unix/sysv/linux/recvmsg.c:33
#1  0x000000010001fab6 in qio_channel_socket_readv (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0, errp=0x1000c0320 <error_abort>) at /home/linux1/qemu/io/channel-socket.c:484
#2  0x000000010001ca04 in qio_channel_readv_full (ioc=0x3fff00008c0, iov=0x3fff00012b0, niov=1, fds=0x0, nfds=0x0, errp=0x1000c0320 <error_abort>) at /home/linux1/qemu/io/channel.c:65
#3  0x000000010001d478 in qio_channel_readv (errp=0x1000c0320 <error_abort>, niov=<optimized out>, iov=<optimized out>, ioc=0x3fff00008c0) at /home/linux1/qemu/io/channel.c:197
#4  qio_channel_readv_all_eof (ioc=0x3fff00008c0, iov=<optimized out>, niov=<optimized out>, errp=errp@entry=0x1000c0320 <error_abort>) at /home/linux1/qemu/io/channel.c:106
#5  0x000000010001d576 in qio_channel_readv_all (ioc=<optimized out>, iov=<optimized out>, niov=<optimized out>, errp=0x1000c0320 <error_abort>) at /home/linux1/qemu/io/channel.c:142
#6  0x000000010001d602 in qio_channel_read_all (ioc=<optimized out>, buf=<optimized out>, buflen=<optimized out>, errp=<optimized out>) at /home/linux1/qemu/io/channel.c:246
#7  0x000000010001c3e0 in char_socket_ping_pong (ioc=0x3fff00008c0) at /home/linux1/qemu/tests/test-char.c:706
#8  0x000000010001c4a8 in char_socket_client_server_thread (data=data@entry=0x1000f2730) at /home/linux1/qemu/tests/test-char.c:859
#9  0x000000010005d640 in qemu_thread_start (args=<optimized out>) at /home/linux1/qemu/util/qemu-thread-posix.c:502
#10 0x000003fffd907934 in start_thread (arg=0x3fff7fff910) at pthread_create.c:335
#11 0x000003fffd7edce2 in thread_start () at ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74

Thread 3 (Thread 0x3fffc9ff910 (LWP 35350)):
#0  0x000003fffd7e3e54 in ?? () at ../sysdeps/unix/syscall-template.S:84 from /lib/s390x-linux-gnu/libc.so.6
#1  0x000003fffddd06ee in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#2  0x000003fffddd087c in g_main_context_iteration () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#3  0x000003fffddd08cc in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#4  0x000003fffddfaba4 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#5  0x000003fffd907934 in start_thread (arg=0x3fffc9ff910) at pthread_create.c:335
#6  0x000003fffd7edce2 in thread_start () at ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74

Thread 2 (Thread 0x3fffd1ff910 (LWP 35346)):
#0  syscall () at ../sysdeps/unix/sysv/linux/s390/s390-64/syscall.S:58
#1  0x000000010005e3ca in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /home/linux1/qemu/include/qemu/futex.h:29
#2  qemu_event_wait (ev=0x1000c17f0 <rcu_call_ready_event>) at /home/linux1/qemu/util/qemu-thread-posix.c:442
#3  0x000000010007d524 in call_rcu_thread (opaque=opaque@entry=0x0) at /home/linux1/qemu/util/rcu.c:261
#4  0x000000010005d640 in qemu_thread_start (args=<optimized out>) at /home/linux1/qemu/util/qemu-thread-posix.c:502
#5  0x000003fffd907934 in start_thread (arg=0x3fffd1ff910) at pthread_create.c:335
#6  0x000003fffd7edce2 in thread_start () at ../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74

Thread 1 (Thread 0x3fffdff3920 (LWP 35343)):
#0  0x000003fffd7381b8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:54
#1  0x000003fffd739726 in __GI_abort () at abort.c:89
#2  0x0000000100017444 in error_exit (err=<optimized out>, msg=msg@entry=0x100091c26 <__func__.18092> "qemu_mutex_destroy") at /home/linux1/qemu/util/qemu-thread-posix.c:36
#3  0x000000010005d772 in qemu_mutex_destroy (mutex=<optimized out>) at /home/linux1/qemu/util/qemu-thread-posix.c:57
#4  0x0000000100025c36 in qio_task_free (task=0x1000f1370) at /home/linux1/qemu/io/task.c:97
#5  qio_task_complete (task=task@entry=0x1000f1370) at /home/linux1/qemu/io/task.c:196
#6  0x0000000100025d0e in qio_task_thread_result (opaque=0x1000f1370) at /home/linux1/qemu/io/task.c:110
#7  0x000003fffddd03ce in g_main_context_dispatch () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#8  0x000000010005a16a in glib_pollfds_poll () at /home/linux1/qemu/util/main-loop.c:215
#9  os_host_main_loop_wait (timeout=<optimized out>) at /home/linux1/qemu/util/main-loop.c:238
#10 main_loop_wait (nonblocking=<optimized out>) at /home/linux1/qemu/util/main-loop.c:514
#11 0x00000001000190c2 in char_socket_client_test (opaque=<optimized out>) at /home/linux1/qemu/tests/test-char.c:962
#12 0x000003fffddf9756 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#13 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#14 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#15 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#16 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#17 0x000003fffddf9b5e in g_test_run_suite () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#18 0x000003fffddf9b80 in g_test_run () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#19 0x0000000100017a28 in main (argc=1, argv=0x3fffffff578) at /home/linux1/qemu/tests/test-char.c:1358

On some other hosts I saw a similar
"qemu: qemu_mutex_destroy: Device or resource busy" and core dump in the
migration tests, I think, which is probably the same underlying bug.

thanks
-- PMM
On Fri, Feb 08, 2019 at 11:44:42AM +0000, Peter Maydell wrote:
> On Thu, 7 Feb 2019 at 16:06, Marc-André Lureau
> <marcandre.lureau@redhat.com> wrote:
> >
> > The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
> >
> >   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
> >
> > are available in the Git repository at:
> >
> >   https://github.com/elmarco/qemu.git tags/chardev-pull-request
> >
> > for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
> >
> >   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
> >
> > ----------------------------------------------------------------
> > Various chardev fixes
> >
> > ----------------------------------------------------------------
>
> This seems to result in 'make check' failures on some platforms.
> I saw this on s390 and aarch32, I think.
>
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> tests/test-char -m=quick -k --tap < /dev/null |
> ./scripts/tap-driver.pl --test-name="test-char
> "
> PASS 1 test-char /char/null
> PASS 2 test-char /char/invalid
> PASS 3 test-char /char/ringbuf
> PASS 4 test-char /char/mux
> PASS 5 test-char /char/stdio
> PASS 6 test-char /char/pipe
> PASS 7 test-char /char/file
> PASS 8 test-char /char/file-fifo
> PASS 9 test-char /char/udp
> PASS 10 test-char /char/serial
> PASS 11 test-char /char/hotswap
> PASS 12 test-char /char/websocket
> PASS 13 test-char /char/socket/server/mainloop/tcp
> PASS 14 test-char /char/socket/server/mainloop/unix
> PASS 15 test-char /char/socket/server/wait-conn/tcp
> PASS 16 test-char /char/socket/server/wait-conn/unix
> PASS 17 test-char /char/socket/server/mainloop-fdpass/tcp
> PASS 18 test-char /char/socket/server/mainloop-fdpass/unix
> PASS 19 test-char /char/socket/server/wait-conn-fdpass/tcp
> PASS 20 test-char /char/socket/server/wait-conn-fdpass/unix
> PASS 21 test-char /char/socket/client/mainloop/tcp
> PASS 22 test-char /char/socket/client/mainloop/unix
> qemu: qemu_mutex_destroy: Device or resource busy
> PASS 23 test-char /char/socket/client/wait-conn/tcp
> PASS 24 test-char /char/socket/client/wait-conn/unix
> Aborted (core dumped)
> ERROR - too few tests run (expected 32, got 24)
>
> Here's a backtrace from running tests/test-char under gdb.
> Looks like a race condition between a thread trying to
> destroy a mutex and a different thread that is still
> using it.

Thanks, that is very useful. I can see the race condition here
now between qio_task_thread_worker and qio_task_thread_result.
I need to acquire the mutex in qio_task_thread_result in order
to synchronize with completion of qio_task_thread_worker.

>
> On some other hosts I saw a similar
> "qemu: qemu_mutex_destroy: Device or resource busy" and core dump in the
> migration tests, I think, which is probably the same underlying bug.

Yes, I expect it is the same problem.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
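The synchronization described in the reply above can be sketched in plain pthreads. This is a minimal illustration under stated assumptions, not the code from io/task.c: `ToyTask`, `toy_task_worker`, `toy_task_result` and `toy_task_run` are hypothetical names, and a spin on an atomic flag stands in for the main loop dispatching the completion callback. The point it shows is that the result handler takes and releases the same mutex the worker holds before it destroys that mutex, so the destroy cannot overlap the worker's critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a background task; this is not QEMU's QIOTask. */
typedef struct {
    pthread_mutex_t lock;
    atomic_int completion_scheduled; /* set by the worker while it holds lock */
    int result;
} ToyTask;

/* Background thread: does the work, then publishes completion under the lock. */
static void *toy_task_worker(void *opaque)
{
    ToyTask *task = opaque;

    task->result = 42; /* pretend this is the real work */

    pthread_mutex_lock(&task->lock);
    atomic_store(&task->completion_scheduled, 1);
    /* Until the unlock below returns, the result handler may already be
     * running in the other thread while this thread still owns the mutex. */
    pthread_mutex_unlock(&task->lock);
    return NULL; /* the task is never touched after the unlock */
}

/* Main-thread handler: reports the result and frees the task. */
static int toy_task_result(ToyTask *task)
{
    int result;
    int rc;

    /* Taking the lock blocks until the worker has left its critical
     * section, so the destroy below cannot race with the worker. */
    pthread_mutex_lock(&task->lock);
    result = task->result;
    pthread_mutex_unlock(&task->lock);

    rc = pthread_mutex_destroy(&task->lock);
    assert(rc == 0);
    (void)rc;
    free(task);
    return result;
}

/* Runs one task end to end and returns the worker's result. */
int toy_task_run(void)
{
    ToyTask *task = calloc(1, sizeof(*task));
    pthread_t tid;
    int rc;

    assert(task != NULL);
    rc = pthread_mutex_init(&task->lock, NULL);
    assert(rc == 0);
    atomic_init(&task->completion_scheduled, 0);

    rc = pthread_create(&tid, NULL, toy_task_worker, task);
    assert(rc == 0);
    rc = pthread_detach(tid);
    assert(rc == 0);
    (void)rc;

    /* Stand-in for the main loop noticing the scheduled completion. */
    while (!atomic_load(&task->completion_scheduled)) {
        sched_yield();
    }
    return toy_task_result(task);
}
```

Removing the lock/unlock pair from `toy_task_result` would reopen the window reported in the backtrace: the destroy could run while the worker is still between its lock and unlock calls.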
On Mon, Feb 11, 2019 at 05:03:13PM +0000, Daniel P. Berrangé wrote:
> On Fri, Feb 08, 2019 at 11:44:42AM +0000, Peter Maydell wrote:
> > On Thu, 7 Feb 2019 at 16:06, Marc-André Lureau
> > <marcandre.lureau@redhat.com> wrote:
> > >
> > > The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
> > >
> > >   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
> > >
> > > are available in the Git repository at:
> > >
> > >   https://github.com/elmarco/qemu.git tags/chardev-pull-request
> > >
> > > for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
> > >
> > >   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
> > >
> > > ----------------------------------------------------------------
> > > Various chardev fixes
> > >
> > > ----------------------------------------------------------------
> >
> > This seems to result in 'make check' failures on some platforms.
> > I saw this on s390 and aarch32, I think.
> >
> > MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> > tests/test-char -m=quick -k --tap < /dev/null |
> > ./scripts/tap-driver.pl --test-name="test-char
> > "
> > PASS 1 test-char /char/null
> > PASS 2 test-char /char/invalid
> > PASS 3 test-char /char/ringbuf
> > PASS 4 test-char /char/mux
> > PASS 5 test-char /char/stdio
> > PASS 6 test-char /char/pipe
> > PASS 7 test-char /char/file
> > PASS 8 test-char /char/file-fifo
> > PASS 9 test-char /char/udp
> > PASS 10 test-char /char/serial
> > PASS 11 test-char /char/hotswap
> > PASS 12 test-char /char/websocket
> > PASS 13 test-char /char/socket/server/mainloop/tcp
> > PASS 14 test-char /char/socket/server/mainloop/unix
> > PASS 15 test-char /char/socket/server/wait-conn/tcp
> > PASS 16 test-char /char/socket/server/wait-conn/unix
> > PASS 17 test-char /char/socket/server/mainloop-fdpass/tcp
> > PASS 18 test-char /char/socket/server/mainloop-fdpass/unix
> > PASS 19 test-char /char/socket/server/wait-conn-fdpass/tcp
> > PASS 20 test-char /char/socket/server/wait-conn-fdpass/unix
> > PASS 21 test-char /char/socket/client/mainloop/tcp
> > PASS 22 test-char /char/socket/client/mainloop/unix
> > qemu: qemu_mutex_destroy: Device or resource busy
> > PASS 23 test-char /char/socket/client/wait-conn/tcp
> > PASS 24 test-char /char/socket/client/wait-conn/unix
> > Aborted (core dumped)
> > ERROR - too few tests run (expected 32, got 24)
> >
> > Here's a backtrace from running tests/test-char under gdb.
> > Looks like a race condition between a thread trying to
> > destroy a mutex and a different thread that is still
> > using it.
>
> Thanks, that is very useful. I can see the race condition here
> now between qio_task_thread_worker and qio_task_thread_result.
> I need to acquire the mutex in qio_task_thread_result in order
> to synchronize with completion of qio_task_thread_worker.

While testing this first bug, I found a second bug hiding:
tcp_chr_wait_connected forgot to de-register the pending reconnect
timer GSource, leading to a later crash. We would not have seen this
in the test suite except for me adding a sleep(1) in the right place
by chance :-)

I've sent a v3 series for Marc-André to queue for a new PULL.

> > On some other hosts I saw a similar
> > "qemu: qemu_mutex_destroy: Device or resource busy" and core dump in the
> > migration tests, I think, which is probably the same underlying bug.
>
> Yes, I expect it is the same problem

I'm now confident this is the first problem I mentioned.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
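The second bug mentioned above, a pending reconnect timer that is left registered after a synchronous connect, can be illustrated with a toy model. This sketch deliberately does not use GLib: `ToyTimer`, `ToySocket` and every function below are hypothetical, and the thread does not spell out the exact crash path in QEMU, so the "callback fires on state that no longer expects it" scenario is an assumption. It only shows the general rule that whoever supersedes a pending timer must remove it before the object it points at goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*ToyTimerFn)(void *opaque);

/* Hypothetical one-shot timer slot in a toy event loop (not a GSource). */
typedef struct {
    bool armed;
    ToyTimerFn fn;
    void *opaque;
} ToyTimer;

#define TOY_MAX_TIMERS 8
static ToyTimer toy_timers[TOY_MAX_TIMERS];

/* Registers a timer and returns its slot id, or -1 if the table is full. */
static int toy_timer_add(ToyTimerFn fn, void *opaque)
{
    for (int i = 0; i < TOY_MAX_TIMERS; i++) {
        if (!toy_timers[i].armed) {
            toy_timers[i] = (ToyTimer){ .armed = true, .fn = fn, .opaque = opaque };
            return i;
        }
    }
    return -1;
}

/* De-registers a timer so its callback can never run. */
static void toy_timer_remove(int id)
{
    toy_timers[id].armed = false;
}

/* Fires every armed timer once and returns how many callbacks ran. */
static int toy_loop_run_pending(void)
{
    int fired = 0;

    for (int i = 0; i < TOY_MAX_TIMERS; i++) {
        if (toy_timers[i].armed) {
            toy_timers[i].armed = false;
            toy_timers[i].fn(toy_timers[i].opaque);
            fired++;
        }
    }
    return fired;
}

/* Hypothetical client socket with an optional pending reconnect timer. */
typedef struct {
    int reconnect_timer; /* slot id, or -1 when no timer is pending */
    bool connected;
    int reconnect_attempts;
} ToySocket;

static void toy_reconnect_cb(void *opaque)
{
    ToySocket *s = opaque;

    s->reconnect_attempts++; /* would touch freed memory if s were gone */
}

/* Synchronous connect: the pending asynchronous reconnect is now stale. */
static void toy_socket_wait_connected(ToySocket *s)
{
    if (s->reconnect_timer >= 0) {
        toy_timer_remove(s->reconnect_timer);
        s->reconnect_timer = -1;
    }
    s->connected = true;
}

/* Returns the number of stale callbacks that fired after the socket died. */
int toy_reconnect_demo(void)
{
    ToySocket *s = calloc(1, sizeof(*s));

    assert(s != NULL);
    s->reconnect_timer = toy_timer_add(toy_reconnect_cb, s);
    assert(s->reconnect_timer >= 0);

    toy_socket_wait_connected(s);
    free(s);

    return toy_loop_run_pending();
}
```

In this model, skipping the `toy_timer_remove` call would leave the slot armed, and the next loop iteration would invoke `toy_reconnect_cb` on a freed `ToySocket`. A delay before that iteration, like the sleep(1) mentioned above, is the kind of timing that makes such a stale callback visible.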
On Thu, Feb 07, 2019 at 05:05:59PM +0100, Marc-André Lureau wrote:
> The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
>
>   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/elmarco/qemu.git tags/chardev-pull-request
>
> for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
>
>   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
>
> ----------------------------------------------------------------
> Various chardev fixes
>
> ----------------------------------------------------------------

BTW, I think as the maintainer sending the PULL request you are expected
to have added your own S-o-B to every patch, rather than just a R-B.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, 11 Feb 2019 at 16:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, Feb 07, 2019 at 05:05:59PM +0100, Marc-André Lureau wrote:
> > The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
> >
> >   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
> >
> > are available in the Git repository at:
> >
> >   https://github.com/elmarco/qemu.git tags/chardev-pull-request
> >
> > for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
> >
> >   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
> >
> > ----------------------------------------------------------------
> > Various chardev fixes
> >
> > ----------------------------------------------------------------
>
> BTW, I think as the maintainer sending the PULL request you are expected
> to have added your own S-o-B to every patch, rather than just a R-B.

Yes, indeed. Thanks for catching that.

-- PMM