The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:

  Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)

are available in the Git repository at:

  https://github.com/elmarco/qemu.git tags/chardev-pull-request

for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:

  tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)

----------------------------------------------------------------
Various chardev fixes

----------------------------------------------------------------

Artem Pisarenko (2):
  chardev: fix mess in OPENED/CLOSED events when muxed
  tests/test-char: add muxed chardev testing for open/close

Daniel P. Berrangé (16):
  io: store reference to thread information in the QIOTask struct
  io: add qio_task_wait_thread to join with a background thread
  chardev: fix validation of options for QMP created chardevs
  chardev: forbid 'reconnect' option with server sockets
  chardev: forbid 'wait' option with client sockets
  chardev: remove many local variables in qemu_chr_parse_socket
  chardev: ensure qemu_chr_parse_compat reports missing driver error
  chardev: remove unused 'sioc' variable & cleanup paths
  chardev: split tcp_chr_wait_connected into two methods
  chardev: split up qmp_chardev_open_socket connection code
  chardev: use a state machine for socket connection state
  chardev: honour the reconnect setting in tcp_chr_wait_connected
  chardev: disallow TLS/telnet/websocket with tcp_chr_wait_connected
  chardev: fix race with client connections in tcp_chr_wait_connected
  tests: expand coverage of socket chardev test
  chardev: ensure termios is fully initialized

 include/chardev/char-fe.h      |  18 +-
 include/io/task.h              |  29 +-
 chardev/char-fe.c              |  33 +-
 chardev/char-mux.c             |  16 +-
 chardev/char-serial.c          |   2 +-
 chardev/char-socket.c          | 487 ++++++++++++++++------
 chardev/char.c                 |   2 +
 io/task.c                      |  98 +++--
 tests/ivshmem-test.c           |   2 +-
 tests/libqtest.c               |   4 +-
 tests/test-char.c              | 723 +++++++++++++++++++++++++--------
 tests/test-filter-redirector.c |   4 +-
 io/trace-events                |   2 +
 13 files changed, 1061 insertions(+), 359 deletions(-)

--
2.20.1.519.g8feddda32c
On Thu, 7 Feb 2019 at 16:06, Marc-André Lureau
<marcandre.lureau@redhat.com> wrote:
>
> The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
>
> Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
>
> are available in the Git repository at:
>
> https://github.com/elmarco/qemu.git tags/chardev-pull-request
>
> for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
>
> tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
>
> ----------------------------------------------------------------
> Various chardev fixes
>
> ----------------------------------------------------------------
This seems to result in 'make check' failures on some platforms.
I saw this on s390 and aarch32, I think.
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
tests/test-char -m=quick -k --tap < /dev/null |
./scripts/tap-driver.pl --test-name="test-char
"
PASS 1 test-char /char/null
PASS 2 test-char /char/invalid
PASS 3 test-char /char/ringbuf
PASS 4 test-char /char/mux
PASS 5 test-char /char/stdio
PASS 6 test-char /char/pipe
PASS 7 test-char /char/file
PASS 8 test-char /char/file-fifo
PASS 9 test-char /char/udp
PASS 10 test-char /char/serial
PASS 11 test-char /char/hotswap
PASS 12 test-char /char/websocket
PASS 13 test-char /char/socket/server/mainloop/tcp
PASS 14 test-char /char/socket/server/mainloop/unix
PASS 15 test-char /char/socket/server/wait-conn/tcp
PASS 16 test-char /char/socket/server/wait-conn/unix
PASS 17 test-char /char/socket/server/mainloop-fdpass/tcp
PASS 18 test-char /char/socket/server/mainloop-fdpass/unix
PASS 19 test-char /char/socket/server/wait-conn-fdpass/tcp
PASS 20 test-char /char/socket/server/wait-conn-fdpass/unix
PASS 21 test-char /char/socket/client/mainloop/tcp
PASS 22 test-char /char/socket/client/mainloop/unix
qemu: qemu_mutex_destroy: Device or resource busy
PASS 23 test-char /char/socket/client/wait-conn/tcp
PASS 24 test-char /char/socket/client/wait-conn/unix
Aborted (core dumped)
ERROR - too few tests run (expected 32, got 24)
Here's a backtrace from running tests/test-char under gdb.
Looks like a race condition between a thread trying to
destroy a mutex and a different thread that is still
using it.
qemu: qemu_mutex_destroy: Device or resource busy
test-char: /home/linux1/qemu/util/qemu-thread-posix.c:92:
qemu_mutex_unlock_impl: Assertion `mutex->initialized' failed.
(gdb) thread apply all bt
Thread 17 (Thread 0x3fff77ff910 (LWP 35364)):
#0 0x000003fffd7381b8 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1 0x000003fffd739726 in __GI_abort () at abort.c:89
#2 0x000003fffd7300d6 in __assert_fail_base (fmt=0x3fffd84d18c
"%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x1000918d2
"mutex->initialized",
file=0x1000918a6 "/home/linux1/qemu/util/qemu-thread-posix.c",
line=<optimized out>,
function=0x100091bb2 <__PRETTY_FUNCTION__.18115>
"qemu_mutex_unlock_impl") at assert.c:92
#3 0x000003fffd730164 in __GI___assert_fail (assertion=0x1000918d2
"mutex->initialized", file=0x1000918a6
"/home/linux1/qemu/util/qemu-thread-posix.c",
line=<optimized out>, function=0x100091bb2
<__PRETTY_FUNCTION__.18115> "qemu_mutex_unlock_impl") at assert.c:101
#4 0x000000010005db16 in qemu_mutex_unlock_impl (mutex=<optimized
out>, file=<optimized out>, line=<optimized out>)
at /home/linux1/qemu/util/qemu-thread-posix.c:92
#5 0x00000001000257cc in qio_task_thread_worker
(opaque=opaque@entry=0x1000f1370) at /home/linux1/qemu/io/task.c:141
#6 0x000000010005d640 in qemu_thread_start (args=<optimized out>) at
/home/linux1/qemu/util/qemu-thread-posix.c:502
#7 0x000003fffd907934 in start_thread (arg=0x3fff77ff910) at
pthread_create.c:335
#8 0x000003fffd7edce2 in thread_start () at
../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74
Thread 16 (Thread 0x3fff7fff910 (LWP 35363)):
#0 0x000003fffd911774 in __libc_recvmsg (fd=<optimized out>,
msg=0x3fff7ffe7d0, flags=<optimized out>) at
../sysdeps/unix/sysv/linux/recvmsg.c:33
#1 0x000000010001fab6 in qio_channel_socket_readv (ioc=<optimized
out>, iov=<optimized out>, niov=<optimized out>, fds=0x0, nfds=0x0,
errp=0x1000c0320 <error_abort>) at /home/linux1/qemu/io/channel-socket.c:484
#2 0x000000010001ca04 in qio_channel_readv_full (ioc=0x3fff00008c0,
iov=0x3fff00012b0, niov=1, fds=0x0, nfds=0x0, errp=0x1000c0320
<error_abort>)
at /home/linux1/qemu/io/channel.c:65
#3 0x000000010001d478 in qio_channel_readv (errp=0x1000c0320
<error_abort>, niov=<optimized out>, iov=<optimized out>,
ioc=0x3fff00008c0)
at /home/linux1/qemu/io/channel.c:197
#4 qio_channel_readv_all_eof (ioc=0x3fff00008c0, iov=<optimized out>,
niov=<optimized out>, errp=errp@entry=0x1000c0320 <error_abort>)
at /home/linux1/qemu/io/channel.c:106
#5 0x000000010001d576 in qio_channel_readv_all (ioc=<optimized out>,
iov=<optimized out>, niov=<optimized out>, errp=0x1000c0320
<error_abort>)
at /home/linux1/qemu/io/channel.c:142
#6 0x000000010001d602 in qio_channel_read_all (ioc=<optimized out>,
buf=<optimized out>, buflen=<optimized out>, errp=<optimized out>)
at /home/linux1/qemu/io/channel.c:246
#7 0x000000010001c3e0 in char_socket_ping_pong (ioc=0x3fff00008c0) at
/home/linux1/qemu/tests/test-char.c:706
#8 0x000000010001c4a8 in char_socket_client_server_thread
(data=data@entry=0x1000f2730) at
/home/linux1/qemu/tests/test-char.c:859
#9 0x000000010005d640 in qemu_thread_start (args=<optimized out>) at
/home/linux1/qemu/util/qemu-thread-posix.c:502
#10 0x000003fffd907934 in start_thread (arg=0x3fff7fff910) at
pthread_create.c:335
#11 0x000003fffd7edce2 in thread_start () at
../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74
Thread 3 (Thread 0x3fffc9ff910 (LWP 35350)):
#0 0x000003fffd7e3e54 in ?? () at
../sysdeps/unix/syscall-template.S:84 from
/lib/s390x-linux-gnu/libc.so.6
#1 0x000003fffddd06ee in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#2 0x000003fffddd087c in g_main_context_iteration () from
/lib/s390x-linux-gnu/libglib-2.0.so.0
#3 0x000003fffddd08cc in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#4 0x000003fffddfaba4 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#5 0x000003fffd907934 in start_thread (arg=0x3fffc9ff910) at
pthread_create.c:335
#6 0x000003fffd7edce2 in thread_start () at
../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74
Thread 2 (Thread 0x3fffd1ff910 (LWP 35346)):
#0 syscall () at ../sysdeps/unix/sysv/linux/s390/s390-64/syscall.S:58
#1 0x000000010005e3ca in qemu_futex_wait (val=<optimized out>,
f=<optimized out>) at /home/linux1/qemu/include/qemu/futex.h:29
#2 qemu_event_wait (ev=0x1000c17f0 <rcu_call_ready_event>) at
/home/linux1/qemu/util/qemu-thread-posix.c:442
#3 0x000000010007d524 in call_rcu_thread (opaque=opaque@entry=0x0) at
/home/linux1/qemu/util/rcu.c:261
#4 0x000000010005d640 in qemu_thread_start (args=<optimized out>) at
/home/linux1/qemu/util/qemu-thread-posix.c:502
#5 0x000003fffd907934 in start_thread (arg=0x3fffd1ff910) at
pthread_create.c:335
#6 0x000003fffd7edce2 in thread_start () at
../sysdeps/unix/sysv/linux/s390/s390-64/clone.S:74
Thread 1 (Thread 0x3fffdff3920 (LWP 35343)):
#0 0x000003fffd7381b8 in __GI_raise (sig=sig@entry=6) at
../sysdeps/unix/sysv/linux/raise.c:54
#1 0x000003fffd739726 in __GI_abort () at abort.c:89
#2 0x0000000100017444 in error_exit (err=<optimized out>,
msg=msg@entry=0x100091c26 <__func__.18092> "qemu_mutex_destroy")
at /home/linux1/qemu/util/qemu-thread-posix.c:36
#3 0x000000010005d772 in qemu_mutex_destroy (mutex=<optimized out>)
at /home/linux1/qemu/util/qemu-thread-posix.c:57
#4 0x0000000100025c36 in qio_task_free (task=0x1000f1370) at
/home/linux1/qemu/io/task.c:97
#5 qio_task_complete (task=task@entry=0x1000f1370) at
/home/linux1/qemu/io/task.c:196
#6 0x0000000100025d0e in qio_task_thread_result (opaque=0x1000f1370)
at /home/linux1/qemu/io/task.c:110
#7 0x000003fffddd03ce in g_main_context_dispatch () from
/lib/s390x-linux-gnu/libglib-2.0.so.0
#8 0x000000010005a16a in glib_pollfds_poll () at
/home/linux1/qemu/util/main-loop.c:215
#9 os_host_main_loop_wait (timeout=<optimized out>) at
/home/linux1/qemu/util/main-loop.c:238
#10 main_loop_wait (nonblocking=<optimized out>) at
/home/linux1/qemu/util/main-loop.c:514
#11 0x00000001000190c2 in char_socket_client_test (opaque=<optimized
out>) at /home/linux1/qemu/tests/test-char.c:962
#12 0x000003fffddf9756 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#13 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#14 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#15 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#16 0x000003fffddf9934 in ?? () from /lib/s390x-linux-gnu/libglib-2.0.so.0
#17 0x000003fffddf9b5e in g_test_run_suite () from
/lib/s390x-linux-gnu/libglib-2.0.so.0
#18 0x000003fffddf9b80 in g_test_run () from
/lib/s390x-linux-gnu/libglib-2.0.so.0
#19 0x0000000100017a28 in main (argc=1, argv=0x3fffffff578) at
/home/linux1/qemu/tests/test-char.c:1358
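
The pattern visible in this backtrace — one thread destroying a mutex that another thread is still in the middle of unlocking — can be reproduced outside QEMU with a few lines of pthreads. The following is purely an illustrative sketch of that race, not QEMU code; it assumes a GCC/Clang toolchain for the __atomic builtins and must be built with -pthread.

/* Worker signals completion and then unlocks a mutex; the main thread,
 * having already seen the completion flag, destroys that same mutex.
 * pthread_mutex_destroy() then reports EBUSY, or worse. */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int done;

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    __atomic_store_n(&done, 1, __ATOMIC_RELEASE);
    pthread_mutex_unlock(&lock);   /* still touching the mutex here... */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pthread_create(&tid, NULL, worker, NULL);

    while (!__atomic_load_n(&done, __ATOMIC_ACQUIRE)) {
        /* spin until the worker claims to be finished */
    }

    /* ...so destroying it here races with the worker's unlock */
    int ret = pthread_mutex_destroy(&lock);
    printf("pthread_mutex_destroy: %d\n", ret);

    pthread_join(tid, NULL);
    return 0;
}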
On some other hosts I saw a similar
"qemu: qemu_mutex_destroy: Device or resource busy" and core dump in the
migration tests, I think, which is probably the same underlying bug.
thanks
-- PMM
On Fri, Feb 08, 2019 at 11:44:42AM +0000, Peter Maydell wrote:
> On Thu, 7 Feb 2019 at 16:06, Marc-André Lureau
> <marcandre.lureau@redhat.com> wrote:
> >
> > The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
> >
> > Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
> >
> > are available in the Git repository at:
> >
> > https://github.com/elmarco/qemu.git tags/chardev-pull-request
> >
> > for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
> >
> > tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
> >
> > ----------------------------------------------------------------
> > Various chardev fixes
> >
> > ----------------------------------------------------------------
>
> This seems to result in 'make check' failures on some platforms.
> I saw this on s390 and aarch32, I think.
>
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> tests/test-char -m=quick -k --tap < /dev/null |
> ./scripts/tap-driver.pl --test-name="test-char
> "
> PASS 1 test-char /char/null
> PASS 2 test-char /char/invalid
> PASS 3 test-char /char/ringbuf
> PASS 4 test-char /char/mux
> PASS 5 test-char /char/stdio
> PASS 6 test-char /char/pipe
> PASS 7 test-char /char/file
> PASS 8 test-char /char/file-fifo
> PASS 9 test-char /char/udp
> PASS 10 test-char /char/serial
> PASS 11 test-char /char/hotswap
> PASS 12 test-char /char/websocket
> PASS 13 test-char /char/socket/server/mainloop/tcp
> PASS 14 test-char /char/socket/server/mainloop/unix
> PASS 15 test-char /char/socket/server/wait-conn/tcp
> PASS 16 test-char /char/socket/server/wait-conn/unix
> PASS 17 test-char /char/socket/server/mainloop-fdpass/tcp
> PASS 18 test-char /char/socket/server/mainloop-fdpass/unix
> PASS 19 test-char /char/socket/server/wait-conn-fdpass/tcp
> PASS 20 test-char /char/socket/server/wait-conn-fdpass/unix
> PASS 21 test-char /char/socket/client/mainloop/tcp
> PASS 22 test-char /char/socket/client/mainloop/unix
> qemu: qemu_mutex_destroy: Device or resource busy
> PASS 23 test-char /char/socket/client/wait-conn/tcp
> PASS 24 test-char /char/socket/client/wait-conn/unix
> Aborted (core dumped)
> ERROR - too few tests run (expected 32, got 24)
>
> Here's a backtrace from running tests/test-char under gdb.
> Looks like a race condition between a thread trying to
> destroy a mutex and a different thread that is still
> using it.
Thanks, that is very useful. I can see the race condition here
now between qio_task_thread_worker and qio_task_thread_result.
I need to acquire the mutex in qio_task_thread_result in order
to synchronize with completion of qio_task_thread_worker.
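
As a rough sketch of what that synchronization could look like inside io/task.c (an illustration of the description above, not the actual patch; the "thread_lock" field name is an assumption):

/* Main-loop callback scheduled by the worker thread.  Take the task's
 * thread mutex before completing (and thus freeing) the task, so that
 * qio_task_free() cannot destroy a mutex which qio_task_thread_worker()
 * is still holding. */
static gboolean qio_task_thread_result(gpointer opaque)
{
    QIOTask *task = opaque;

    /* Wait until the worker thread has dropped the lock for good. */
    qemu_mutex_lock(&task->thread_lock);
    qemu_mutex_unlock(&task->thread_lock);

    qio_task_complete(task);  /* ends up in qio_task_free() */

    return FALSE;
}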
>
> On some other hosts I saw a similar
> "qemu: qemu_mutex_destroy: Device or resource busy" and core dump in the
> migration tests, I think, which is probably the same underlying bug.
Yes, I expect it is the same problem
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, Feb 11, 2019 at 05:03:13PM +0000, Daniel P. Berrangé wrote:
> On Fri, Feb 08, 2019 at 11:44:42AM +0000, Peter Maydell wrote:
> > On Thu, 7 Feb 2019 at 16:06, Marc-André Lureau
> > <marcandre.lureau@redhat.com> wrote:
> > >
> > > The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
> > >
> > > Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
> > >
> > > are available in the Git repository at:
> > >
> > > https://github.com/elmarco/qemu.git tags/chardev-pull-request
> > >
> > > for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
> > >
> > > tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
> > >
> > > ----------------------------------------------------------------
> > > Various chardev fixes
> > >
> > > ----------------------------------------------------------------
> >
> > This seems to result in 'make check' failures on some platforms.
> > I saw this on s390 and aarch32, I think.
> >
> > MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> > tests/test-char -m=quick -k --tap < /dev/null |
> > ./scripts/tap-driver.pl --test-name="test-char
> > "
> > PASS 1 test-char /char/null
> > PASS 2 test-char /char/invalid
> > PASS 3 test-char /char/ringbuf
> > PASS 4 test-char /char/mux
> > PASS 5 test-char /char/stdio
> > PASS 6 test-char /char/pipe
> > PASS 7 test-char /char/file
> > PASS 8 test-char /char/file-fifo
> > PASS 9 test-char /char/udp
> > PASS 10 test-char /char/serial
> > PASS 11 test-char /char/hotswap
> > PASS 12 test-char /char/websocket
> > PASS 13 test-char /char/socket/server/mainloop/tcp
> > PASS 14 test-char /char/socket/server/mainloop/unix
> > PASS 15 test-char /char/socket/server/wait-conn/tcp
> > PASS 16 test-char /char/socket/server/wait-conn/unix
> > PASS 17 test-char /char/socket/server/mainloop-fdpass/tcp
> > PASS 18 test-char /char/socket/server/mainloop-fdpass/unix
> > PASS 19 test-char /char/socket/server/wait-conn-fdpass/tcp
> > PASS 20 test-char /char/socket/server/wait-conn-fdpass/unix
> > PASS 21 test-char /char/socket/client/mainloop/tcp
> > PASS 22 test-char /char/socket/client/mainloop/unix
> > qemu: qemu_mutex_destroy: Device or resource busy
> > PASS 23 test-char /char/socket/client/wait-conn/tcp
> > PASS 24 test-char /char/socket/client/wait-conn/unix
> > Aborted (core dumped)
> > ERROR - too few tests run (expected 32, got 24)
> >
> > Here's a backtrace from running tests/test-char under gdb.
> > Looks like a race condition between a thread trying to
> > destroy a mutex and a different thread that is still
> > using it.
>
> Thanks, that is very useful. I can see the race condition here
> now between qio_task_thread_worker and qio_task_thread_result.
> I need to acquire the mutex in qio_task_thread_result in order
> to synchronize with completion of qio_task_thread_worker.
In testing this first bug, I found a second bug hiding where
tcp_chr_wait_connected forgot to de-register the pending
reconnect timer GSource, leading to a later crash. We would
not have seen this in the test suite except that I added
a sleep(1) in the right place by chance :-)

I've sent a v3 series for Marc-André to queue for a new PULL.
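
For illustration only, the de-registration described above could look roughly like the following inside chardev/char-socket.c; the SocketChardev layout and "reconnect_timer" field name are assumptions for this sketch, not the actual v3 patch:

/* Drop a pending reconnect timer, e.g. when tcp_chr_wait_connected()
 * has established the connection synchronously, so the timeout source
 * can never fire later against state that has gone away. */
static void tcp_chr_remove_reconnect_timer(SocketChardev *s)
{
    if (s->reconnect_timer) {
        /* Detach the timeout source from the main context. */
        g_source_remove(s->reconnect_timer);
        s->reconnect_timer = 0;
    }
}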
> > On some other hosts I saw a similar
> > "qemu: qemu_mutex_destroy: Device or resource busy" and core dump in the
> > migration tests, I think, which is probably the same underlying bug.
>
> Yes, I expect it is the same problem
I'm now confident this is the first problem I mentioned above.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, Feb 07, 2019 at 05:05:59PM +0100, Marc-André Lureau wrote:
> The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
>
>   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/elmarco/qemu.git tags/chardev-pull-request
>
> for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
>
>   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
>
> ----------------------------------------------------------------
> Various chardev fixes
>
> ----------------------------------------------------------------

BTW, I think as the maintainer sending the PULL request you are expected
to have added your own S-o-B to every patch, rather than just a R-B.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, 11 Feb 2019 at 16:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Feb 07, 2019 at 05:05:59PM +0100, Marc-André Lureau wrote:
> > The following changes since commit 632351e0e1a861f2eaf709b053c53f96a1225825:
> >
> >   Merge remote-tracking branch 'remotes/elmarco/tags/dump-pull-request' into staging (2019-02-07 14:20:46 +0000)
> >
> > are available in the Git repository at:
> >
> >   https://github.com/elmarco/qemu.git tags/chardev-pull-request
> >
> > for you to fetch changes up to df3afdedd23ade0c9de55cadeb1d85055689023f:
> >
> >   tests/test-char: add muxed chardev testing for open/close (2019-02-07 16:18:25 +0100)
> >
> > ----------------------------------------------------------------
> > Various chardev fixes
> >
> > ----------------------------------------------------------------
>
> BTW, I think as the maintainer sending the PULL request you are expected
> to have added your own S-o-B to every patch, rather than just a R-B.

Yes, indeed. Thanks for catching that.

-- PMM