[Qemu-devel] [PATCH v2 6/6] monitor: avoid potential dead-lock when cleaning up

Marc-André Lureau posted 6 patches 7 years ago
There is a newer version of this series
[Qemu-devel] [PATCH v2 6/6] monitor: avoid potential dead-lock when cleaning up
Posted by Marc-André Lureau 7 years ago
When a monitor is connected to a Spice chardev, the monitor cleanup
can dead-lock:

 #0  0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0
 #1  0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
 #2  0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
 #3  0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645
 #4  0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149
 #5  0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235
 #6  0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316
 #7  0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197
 #8  0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197
 #9  0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414
 #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388
 #11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347
 #12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
 #13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341
 #14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241
 #15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340
 #16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280
 #17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233
 #18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786
 #19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683
 #20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660

Because spice code tries to emit a "disconnected" signal on the
monitors. Fix this dead-lock by releasing the monitor lock for
flush/destroy.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 monitor.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/monitor.c b/monitor.c
index 7fe89daa87..b55e604a98 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4643,8 +4643,10 @@ void monitor_cleanup(void)
     monitor_destroyed = true;
     QTAILQ_FOREACH_SAFE(mon, &mon_list, entry, next) {
         QTAILQ_REMOVE(&mon_list, mon, entry);
+        qemu_mutex_unlock(&monitor_lock);
         monitor_flush(mon);
         monitor_data_destroy(mon);
+        qemu_mutex_lock(&monitor_lock);
         g_free(mon);
     }
     qemu_mutex_unlock(&monitor_lock);
-- 
2.19.0.271.gfe8321ec05


Re: [Qemu-devel] [PATCH v2 6/6] monitor: avoid potential dead-lock when cleaning up
Posted by Markus Armbruster 6 years, 11 months ago
Marc-André Lureau <marcandre.lureau@redhat.com> writes:

> When a monitor is connected to a Spice chardev, the monitor cleanup
> can dead-lock:
>
>  #0  0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0
>  #1  0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
>  #2  0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
>  #3  0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645
>  #4  0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149
>  #5  0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235
>  #6  0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316
>  #7  0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197
>  #8  0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197
>  #9  0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414
>  #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388
>  #11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347
>  #12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
>  #13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341
>  #14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241
>  #15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340
>  #16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280
>  #17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233
>  #18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786
>  #19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683
>  #20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660
>
> Because spice code tries to emit a "disconnected" signal on the
> monitors. Fix this dead-lock by releasing the monitor lock for
> flush/destroy.
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  monitor.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/monitor.c b/monitor.c
> index 7fe89daa87..b55e604a98 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4643,8 +4643,10 @@ void monitor_cleanup(void)
>      monitor_destroyed = true;
>      QTAILQ_FOREACH_SAFE(mon, &mon_list, entry, next) {
>          QTAILQ_REMOVE(&mon_list, mon, entry);
> +        qemu_mutex_unlock(&monitor_lock);
>          monitor_flush(mon);
>          monitor_data_destroy(mon);
> +        qemu_mutex_lock(&monitor_lock);
>          g_free(mon);
>      }
>      qemu_mutex_unlock(&monitor_lock);

I think a comment hinting at the reason for relinquishing the lock would
be in order.  Perhaps

           /* Permit QAPI event emission from character frontend release */
           qemu_mutex_unlock(&monitor_lock);

We need to demonstrate calling monitor_flush() and
monitor_data_destroy() without holding @monitor_lock is safe.

@monitor_lock's comment states it "protects mon_list,
monitor_qapi_event_state."  Looks plausible from how it's used.

As far as I can tell, monitor_flush() and monitor_data_destroy() don't
access mon_list and monitor_qapi_event_state.

monitor_cleanup()'s loop itself is safe because it uses
QTAILQ_FOREACH_SAFE(), unlike similar loops elsewhere.

I think we're good.  I'd like you to work this argument into the commit
message.

Re: [Qemu-devel] [PATCH v2 6/6] monitor: avoid potential dead-lock when cleaning up
Posted by Marc-André Lureau 6 years, 11 months ago
On Mon, Dec 3, 2018 at 1:26 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
>
> > When a monitor is connected to a Spice chardev, the monitor cleanup
> > can dead-lock:
> >
> >  #0  0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0
> >  #1  0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
> >  #2  0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
> >  #3  0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645
> >  #4  0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149
> >  #5  0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235
> >  #6  0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316
> >  #7  0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197
> >  #8  0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197
> >  #9  0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414
> >  #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388
> >  #11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347
> >  #12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
> >  #13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341
> >  #14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241
> >  #15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340
> >  #16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280
> >  #17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233
> >  #18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786
> >  #19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683
> >  #20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660
> >
> > Because spice code tries to emit a "disconnected" signal on the
> > monitors. Fix this dead-lock by releasing the monitor lock for
> > flush/destroy.
> >
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >  monitor.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/monitor.c b/monitor.c
> > index 7fe89daa87..b55e604a98 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -4643,8 +4643,10 @@ void monitor_cleanup(void)
> >      monitor_destroyed = true;
> >      QTAILQ_FOREACH_SAFE(mon, &mon_list, entry, next) {
> >          QTAILQ_REMOVE(&mon_list, mon, entry);
> > +        qemu_mutex_unlock(&monitor_lock);
> >          monitor_flush(mon);
> >          monitor_data_destroy(mon);
> > +        qemu_mutex_lock(&monitor_lock);
> >          g_free(mon);
> >      }
> >      qemu_mutex_unlock(&monitor_lock);
>
> I think a comment hinting at the reason for relinquishing the lock would
> be in order.  Perhaps
>
>            /* Permit QAPI event emission from character frontend release */
>            qemu_mutex_unlock(&monitor_lock);
>
> We need to demonstrate calling monitor_flush() and
> monitor_data_destroy() without holding @monitor_lock is safe.
>
> @monitor_lock's comment states it "protects mon_list,
> monitor_qapi_event_state."  Looks plausible from how it's used.
>
> As far as I can tell, monitor_flush() and monitor_data_destroy() don't
> access mon_list and monitor_qapi_event_state.
>
> monitor_cleanup()'s loop itself is safe because it uses
> QTAILQ_FOREACH_SAFE(), unlike similar loops elsewhere.
>
> I think we're good.  I'd like you to work this argument into the commit
> message.

What about adding:

    monitor_lock protects mon_list, monitor_qapi_event_state and
    monitor_destroyed. monitor_flush() and monitor_data_destroy() don't
    access any of those variables.

    monitor_cleanup()'s loop is safe because it uses
    QTAILQ_FOREACH_SAFE(), and no further monitor can be added after
    calling monitor_cleanup() thanks to monitor_destroyed check in
    monitor_list_append().

Re: [Qemu-devel] [PATCH v2 6/6] monitor: avoid potential dead-lock when cleaning up
Posted by Markus Armbruster 6 years, 11 months ago
Marc-André Lureau <marcandre.lureau@redhat.com> writes:

> On Mon, Dec 3, 2018 at 1:26 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
>>
>> > When a monitor is connected to a Spice chardev, the monitor cleanup
>> > can dead-lock:
>> >
>> >  #0  0x00007f43446637fd in __lll_lock_wait () at /lib64/libpthread.so.0
>> >  #1  0x00007f434465ccf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
>> >  #2  0x0000556dd79f22ba in qemu_mutex_lock_impl (mutex=0x556dd81c9220 <monitor_lock>, file=0x556dd7ae3648 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
>> >  #3  0x0000556dd7431bd5 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x556dd9abc850, errp=0x7fffb7bbddd8) at /home/elmarco/src/qq/monitor.c:645
>> >  #4  0x0000556dd79d476b in qapi_event_send_spice_disconnected (server=0x556dd98ee760, client=0x556ddaaa8560, errp=0x556dd82180d0 <error_abort>) at qapi/qapi-events-ui.c:149
>> >  #5  0x0000556dd7870fc1 in channel_event (event=3, info=0x556ddad1b590) at /home/elmarco/src/qq/ui/spice-core.c:235
>> >  #6  0x00007f434560a6bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x556ddad1b590) at reds.c:316
>> >  #7  0x00007f43455f393b in main_dispatcher_self_handle_channel_event (info=0x556ddad1b590, event=3, self=0x556dd9a7d8c0) at main-dispatcher.c:197
>> >  #8  0x00007f43455f393b in main_dispatcher_channel_event (self=0x556dd9a7d8c0, event=event@entry=3, info=0x556ddad1b590) at main-dispatcher.c:197
>> >  #9  0x00007f4345612833 in red_stream_push_channel_event (s=s@entry=0x556ddae2ef40, event=event@entry=3) at red-stream.c:414
>> >  #10 0x00007f434561286b in red_stream_free (s=0x556ddae2ef40) at red-stream.c:388
>> >  #11 0x00007f43455f9ddc in red_channel_client_finalize (object=0x556dd9bb21a0) at red-channel-client.c:347
>> >  #12 0x00007f434b5f9fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
>> >  #13 0x00007f43455fc212 in red_channel_client_push (rcc=0x556dd9bb21a0) at red-channel-client.c:1341
>> >  #14 0x0000556dd76081ba in spice_port_set_fe_open (chr=0x556dd9925e20, fe_open=0) at /home/elmarco/src/qq/chardev/spice.c:241
>> >  #15 0x0000556dd796d74a in qemu_chr_fe_set_open (be=0x556dd9a37c00, fe_open=0) at /home/elmarco/src/qq/chardev/char-fe.c:340
>> >  #16 0x0000556dd796d4d9 in qemu_chr_fe_set_handlers (b=0x556dd9a37c00, fd_can_read=0x0, fd_read=0x0, fd_event=0x0, be_change=0x0, opaque=0x0, context=0x0, set_open=true) at /home/elmarco/src/qq/chardev/char-fe.c:280
>> >  #17 0x0000556dd796d359 in qemu_chr_fe_deinit (b=0x556dd9a37c00, del=false) at /home/elmarco/src/qq/chardev/char-fe.c:233
>> >  #18 0x0000556dd7432240 in monitor_data_destroy (mon=0x556dd9a37c00) at /home/elmarco/src/qq/monitor.c:786
>> >  #19 0x0000556dd743b968 in monitor_cleanup () at /home/elmarco/src/qq/monitor.c:4683
>> >  #20 0x0000556dd75ce776 in main (argc=3, argv=0x7fffb7bbe458, envp=0x7fffb7bbe478) at /home/elmarco/src/qq/vl.c:4660
>> >
>> > Because spice code tries to emit a "disconnected" signal on the
>> > monitors. Fix this dead-lock by releasing the monitor lock for
>> > flush/destroy.
>> >
>> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> > ---
>> >  monitor.c | 2 ++
>> >  1 file changed, 2 insertions(+)
>> >
>> > diff --git a/monitor.c b/monitor.c
>> > index 7fe89daa87..b55e604a98 100644
>> > --- a/monitor.c
>> > +++ b/monitor.c
>> > @@ -4643,8 +4643,10 @@ void monitor_cleanup(void)
>> >      monitor_destroyed = true;
>> >      QTAILQ_FOREACH_SAFE(mon, &mon_list, entry, next) {
>> >          QTAILQ_REMOVE(&mon_list, mon, entry);
>> > +        qemu_mutex_unlock(&monitor_lock);
>> >          monitor_flush(mon);
>> >          monitor_data_destroy(mon);
>> > +        qemu_mutex_lock(&monitor_lock);
>> >          g_free(mon);
>> >      }
>> >      qemu_mutex_unlock(&monitor_lock);
>>
>> I think a comment hinting at the reason for relinquishing the lock would
>> be in order.  Perhaps
>>
>>            /* Permit QAPI event emission from character frontend release */
>>            qemu_mutex_unlock(&monitor_lock);
>>
>> We need to demonstrate calling monitor_flush() and
>> monitor_data_destroy() without holding @monitor_lock is safe.
>>
>> @monitor_lock's comment states it "protects mon_list,
>> monitor_qapi_event_state."  Looks plausible from how it's used.
>>
>> As far as I can tell, monitor_flush() and monitor_data_destroy() don't
>> access mon_list and monitor_qapi_event_state.
>>
>> monitor_cleanup()'s loop itself is safe because it uses
>> QTAILQ_FOREACH_SAFE(), unlike similar loops elsewhere.
>>
>> I think we're good.  I'd like you to work this argument into the commit
>> message.
>
> What about adding:
>
>     monitor_lock protects mon_list, monitor_qapi_event_state and
>     monitor_destroyed. monitor_flush() and monitor_data_destroy() don't
>     access any of those variables.
>
>     monitor_cleanup()'s loop is safe because it uses
>     QTAILQ_FOREACH_SAFE(), and no further monitor can be added after
>     calling monitor_cleanup() thanks to monitor_destroyed check in
>     monitor_list_append().

Works for me.  Thanks!