[Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion

Marc-André Lureau posted 1 patch 7 years, 3 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20180731150144.14022-1-marcandre.lureau@redhat.com
[Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion
Posted by Marc-André Lureau 7 years, 3 months ago
With a Spice port chardev, it is possible to reenter
monitor_qapi_event_queue() (when the client disconnects for
example). This will dead-lock on monitor_lock.

Instead, use some TLS variables to check for recursion and queue the
events.

Fixes:
 (gdb) bt
 #0  0x00007fa69e7217fd in __lll_lock_wait () at /lib64/libpthread.so.0
 #1  0x00007fa69e71acf4 in pthread_mutex_lock () at /lib64/libpthread.so.0
 #2  0x0000563303567619 in qemu_mutex_lock_impl (mutex=0x563303d3e220 <monitor_lock>, file=0x5633036589a8 "/home/elmarco/src/qq/monitor.c", line=645) at /home/elmarco/src/qq/util/qemu-thread-posix.c:66
 #3  0x0000563302fa6c25 in monitor_qapi_event_queue (event=QAPI_EVENT_SPICE_DISCONNECTED, qdict=0x56330602bde0, errp=0x7ffc6ab5e728) at /home/elmarco/src/qq/monitor.c:645
 #4  0x0000563303549aca in qapi_event_send_spice_disconnected (server=0x563305afd630, client=0x563305745360, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-ui.c:149
 #5  0x00005633033e600f in channel_event (event=3, info=0x5633061b0050) at /home/elmarco/src/qq/ui/spice-core.c:235
 #6  0x00007fa69f6c86bb in reds_handle_channel_event (reds=<optimized out>, event=3, info=0x5633061b0050) at reds.c:316
 #7  0x00007fa69f6b193b in main_dispatcher_self_handle_channel_event (info=0x5633061b0050, event=3, self=0x563304e088c0) at main-dispatcher.c:197
 #8  0x00007fa69f6b193b in main_dispatcher_channel_event (self=0x563304e088c0, event=event@entry=3, info=0x5633061b0050) at main-dispatcher.c:197
 #9  0x00007fa69f6d0833 in red_stream_push_channel_event (s=s@entry=0x563305ad8f50, event=event@entry=3) at red-stream.c:414
 #10 0x00007fa69f6d086b in red_stream_free (s=0x563305ad8f50) at red-stream.c:388
 #11 0x00007fa69f6b7ddc in red_channel_client_finalize (object=0x563304df2360) at red-channel-client.c:347
 #12 0x00007fa6a56b7fb9 in g_object_unref () at /lib64/libgobject-2.0.so.0
 #13 0x00007fa69f6ba212 in red_channel_client_push (rcc=0x563304df2360) at red-channel-client.c:1341
 #14 0x00007fa69f68b259 in red_char_device_send_msg_to_client (client=<optimized out>, msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 #15 0x00007fa69f68b259 in red_char_device_send_msg_to_clients (msg=0x5633059b6310, dev=0x563304e08bc0) at char-device.c:305
 #16 0x00007fa69f68b259 in red_char_device_read_from_device (dev=0x563304e08bc0) at char-device.c:353
 #17 0x000056330317d01d in spice_chr_write (chr=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/spice.c:199
 #18 0x00005633034deee7 in qemu_chr_write_buffer (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, offset=0x7ffc6ab5ea70, write_all=false) at /home/elmarco/src/qq/chardev/char.c:112
 #19 0x00005633034df054 in qemu_chr_write (s=0x563304cafe20, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111, write_all=false) at /home/elmarco/src/qq/chardev/char.c:147
 #20 0x00005633034e1e13 in qemu_chr_fe_write (be=0x563304dbb800, buf=0x563304cc50b0 "{\"timestamp\": {\"seconds\": 1532944763, \"microseconds\": 326636}, \"event\": \"SHUTDOWN\", \"data\": {\"guest\": false}}\r\n", len=111) at /home/elmarco/src/qq/chardev/char-fe.c:42
 #21 0x0000563302fa6334 in monitor_flush_locked (mon=0x563304dbb800) at /home/elmarco/src/qq/monitor.c:425
 #22 0x0000563302fa6520 in monitor_puts (mon=0x563304dbb800, str=0x563305de7e9e "") at /home/elmarco/src/qq/monitor.c:468
 #23 0x0000563302fa680c in qmp_send_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:517
 #24 0x0000563302fa6905 in qmp_queue_response (mon=0x563304dbb800, rsp=0x563304df5730) at /home/elmarco/src/qq/monitor.c:538
 #25 0x0000563302fa6b5b in monitor_qapi_event_emit (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730) at /home/elmarco/src/qq/monitor.c:624
 #26 0x0000563302fa6c4b in monitor_qapi_event_queue (event=QAPI_EVENT_SHUTDOWN, qdict=0x563304df5730, errp=0x7ffc6ab5ed00) at /home/elmarco/src/qq/monitor.c:649
 #27 0x0000563303548cce in qapi_event_send_shutdown (guest=false, errp=0x563303d8d0f0 <error_abort>) at qapi/qapi-events-run-state.c:58
 #28 0x000056330313bcd7 in main_loop_should_exit () at /home/elmarco/src/qq/vl.c:1822
 #29 0x000056330313bde3 in main_loop () at /home/elmarco/src/qq/vl.c:1862
 #30 0x0000563303143781 in main (argc=3, argv=0x7ffc6ab5f068, envp=0x7ffc6ab5f088) at /home/elmarco/src/qq/vl.c:4644

Note that error reporting now moves to the first caller, which may
receive an error for a recursed event. This is probably fine (95% of
callers use &error_abort, the rest pass a NULL errp and ignore errors).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 monitor.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index d8d8211ae4..4d9c1873bc 100644
--- a/monitor.c
+++ b/monitor.c
@@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
  * applying any rate limiting if required.
  */
 static void
-monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
+monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict)
 {
     MonitorQAPIEventConf *evconf;
     MonitorQAPIEventState *evstate;
@@ -688,6 +688,48 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
     qemu_mutex_unlock(&monitor_lock);
 }
 
+static void
+monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
+{
+    /*
+     * monitor_qapi_event_queue_no_recurse() is not reentrant: it
+     * would deadlock on monitor_lock.  Work around by queueing
+     * events in thread-local storage.
+     * TODO: remove this, make it re-enter safe.
+     */
+    static __thread bool reentered;
+    typedef struct MonitorQapiEvent {
+        QAPIEvent event;
+        QDict *qdict;
+        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
+    } MonitorQapiEvent;
+    MonitorQapiEvent *ev;
+    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
+
+    if (!reentered) {
+        QSIMPLEQ_INIT(&event_queue);
+    }
+
+    ev = g_new(MonitorQapiEvent, 1);
+    ev->qdict = qobject_ref(qdict);
+    ev->event = event;
+    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
+    if (reentered) {
+        return;
+    }
+
+    reentered = true;
+
+    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
+        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);
+        monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict);
+        qobject_unref(ev->qdict);
+        g_free(ev);
+    }
+
+    reentered = false;
+}
+
 /*
  * This function runs evconf->rate ns after sending a throttled
  * event.
-- 
2.18.0.321.gffc6fa0e39
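
The queue-and-drain approach in the patch can be tried outside QEMU. The following is only a minimal sketch of the same idea, not the monitor code: it uses a hand-rolled singly linked list instead of QSIMPLEQ, and every name in it (PendingEvent, emit_event, deliver_no_reenter, the busy flag that models monitor_lock being held) is hypothetical. Delivering event 1 emits event 2 from inside the worker, roughly as the Spice disconnect does while SHUTDOWN is being written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queued QAPI event. */
typedef struct PendingEvent {
    int id;
    struct PendingEvent *next;
} PendingEvent;

/* Delivery log so the ordering can be inspected afterwards. */
static int delivered[8];
static int n_delivered;

static void emit_event(int id);

/*
 * Stand-in for the non-reentrant worker.  Delivering event 1 emits
 * event 2 from inside the worker, mimicking an event raised while
 * another one is still being written out.
 */
static void deliver_no_reenter(int id)
{
    static bool busy;           /* models "the lock is held" */

    assert(!busy);              /* a nested call here would be the deadlock */
    busy = true;
    delivered[n_delivered++] = id;
    if (id == 1) {
        emit_event(2);          /* re-entry: queued, not delivered yet */
    }
    busy = false;
}

/* Public entry point: always queue, let only the outermost call drain. */
static void emit_event(int id)
{
    static __thread PendingEvent *head;
    static __thread PendingEvent **tail;
    static __thread bool draining;
    PendingEvent *ev;

    if (!tail) {
        tail = &head;           /* lazy per-thread initialisation */
    }

    ev = malloc(sizeof(*ev));
    assert(ev);
    ev->id = id;
    ev->next = NULL;
    *tail = ev;
    tail = &ev->next;

    if (draining) {
        return;                 /* the outermost call will pick it up */
    }

    draining = true;
    while ((ev = head) != NULL) {
        head = ev->next;
        if (!head) {
            tail = &head;       /* queue is empty again */
        }
        deliver_no_reenter(ev->id);
        free(ev);
    }
    draining = false;
}
```

Calling emit_event(1) should deliver event 1 and then event 2 without the worker ever being entered twice. The trade-off is that a nested event is only delivered after the outer one has finished, so its caller returns before the event has actually gone out.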


Re: [Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion
Posted by Markus Armbruster 7 years, 3 months ago
Marc-André Lureau <marcandre.lureau@redhat.com> writes:

> With a Spice port chardev, it is possible to reenter
> monitor_qapi_event_queue() (when the client disconnects for
> example). This will dead-lock on monitor_lock.
>
> Instead, use some TLS variables to check for recursion and queue the
> events.
>
> Fixes:
>  (gdb) bt
>  [...]
>
> Note that error report is now moved to the first caller, which may
> receive an error for a recursed event. This is probably fine (95% of
> callers use &error_abort, the rest have NULL error and ignore it)
>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  monitor.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 43 insertions(+), 1 deletion(-)
>
> diff --git a/monitor.c b/monitor.c
> index d8d8211ae4..4d9c1873bc 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
>   * applying any rate limiting if required.
>   */
>  static void
> -monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
> +monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict)

Let's name it _no_reenter.  Can do in my tree.

>  {
>      MonitorQAPIEventConf *evconf;
>      MonitorQAPIEventState *evstate;
> @@ -688,6 +688,48 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>      qemu_mutex_unlock(&monitor_lock);
>  }
>  
> +static void
> +monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
> +{
> +    /*
> +     * monitor_qapi_event_queue_no_recurse() is not reentrant: it
> +     * would deadlock on monitor_lock.  Work around by queueing
> +     * events in thread-local storage.
> +     * TODO: remove this, make it re-enter safe.
> +     */
> +    static __thread bool reentered;
> +    typedef struct MonitorQapiEvent {
> +        QAPIEvent event;
> +        QDict *qdict;
> +        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
> +    } MonitorQapiEvent;
> +    MonitorQapiEvent *ev;
> +    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;

I'd prefer this order:

       typedef struct MonitorQapiEvent {
           QAPIEvent event;
           QDict *qdict;
           QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
       } MonitorQapiEvent;
       static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
       static __thread bool reentered;
       MonitorQapiEvent *ev;

Can do in my tree.

> +
> +    if (!reentered) {
> +        QSIMPLEQ_INIT(&event_queue);
> +    }
> +
> +    ev = g_new(MonitorQapiEvent, 1);
> +    ev->qdict = qobject_ref(qdict);
> +    ev->event = event;
> +    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
> +    if (reentered) {
> +        return;
> +    }
> +
> +    reentered = true;
> +
> +    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
> +        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);
> +        monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict);
> +        qobject_unref(ev->qdict);
> +        g_free(ev);
> +    }
> +
> +    reentered = false;
> +}
> +
>  /*
>   * This function runs evconf->rate ns after sending a throttled
>   * event.

Preferably with these touch-ups.
Reviewed-by: Markus Armbruster <armbru@redhat.com>

Re: [Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion
Posted by Marc-André Lureau 7 years, 3 months ago
On Tue, Jul 31, 2018 at 5:25 PM, Markus Armbruster <armbru@redhat.com> wrote:
> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
>
>> With a Spice port chardev, it is possible to reenter
>> monitor_qapi_event_queue() (when the client disconnects for
>> example). This will dead-lock on monitor_lock.
>>
>> Instead, use some TLS variables to check for recursion and queue the
>> events.
>>
>> Fixes:
>>  (gdb) bt
>>  [...]
>>
>> Note that error report is now moved to the first caller, which may
>> receive an error for a recursed event. This is probably fine (95% of
>> callers use &error_abort, the rest have NULL error and ignore it)
>>
>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> ---
>>  monitor.c | 44 +++++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 43 insertions(+), 1 deletion(-)
>>
>> diff --git a/monitor.c b/monitor.c
>> index d8d8211ae4..4d9c1873bc 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
>>   * applying any rate limiting if required.
>>   */
>>  static void
>> -monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>> +monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict)
>
> Let's name it _no_reenter.  Can do in my tree.
>
>>  {
>>      MonitorQAPIEventConf *evconf;
>>      MonitorQAPIEventState *evstate;
>> @@ -688,6 +688,48 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>>      qemu_mutex_unlock(&monitor_lock);
>>  }
>>
>> +static void
>> +monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>> +{
>> +    /*
>> +     * monitor_qapi_event_queue_no_recurse() is not reentrant: it
>> +     * would deadlock on monitor_lock.  Work around by queueing
>> +     * events in thread-local storage.
>> +     * TODO: remove this, make it re-enter safe.
>> +     */
>> +    static __thread bool reentered;
>> +    typedef struct MonitorQapiEvent {
>> +        QAPIEvent event;
>> +        QDict *qdict;
>> +        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
>> +    } MonitorQapiEvent;
>> +    MonitorQapiEvent *ev;
>> +    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
>
> I'd prefer this order:
>
>        typedef struct MonitorQapiEvent {
>            QAPIEvent event;
>            QDict *qdict;
>            QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
>        } MonitorQapiEvent;
>        static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
>        static __thread bool reentered;
>        MonitorQapiEvent *ev;
>
> Can do in my tree.
>
>> +
>> +    if (!reentered) {
>> +        QSIMPLEQ_INIT(&event_queue);
>> +    }
>> +
>> +    ev = g_new(MonitorQapiEvent, 1);
>> +    ev->qdict = qobject_ref(qdict);
>> +    ev->event = event;
>> +    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
>> +    if (reentered) {
>> +        return;
>> +    }
>> +
>> +    reentered = true;
>> +
>> +    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
>> +        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);
>> +        monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict);
>> +        qobject_unref(ev->qdict);
>> +        g_free(ev);
>> +    }
>> +
>> +    reentered = false;
>> +}
>> +
>>  /*
>>   * This function runs evconf->rate ns after sending a throttled
>>   * event.
>
> Preferably with these touch-ups.

np, ack

> Reviewed-by: Markus Armbruster <armbru@redhat.com>
>

thanks

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion
Posted by Eric Blake 7 years, 3 months ago
On 07/31/2018 10:01 AM, Marc-André Lureau wrote:
> With a Spice port chardev, it is possible to reenter
> monitor_qapi_event_queue() (when the client disconnects for
> example). This will dead-lock on monitor_lock.
> 
> Instead, use some TLS variables to check for recursion and queue the
> events.
> 

> 
> diff --git a/monitor.c b/monitor.c
> index d8d8211ae4..4d9c1873bc 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
>    * applying any rate limiting if required.
>    */
>   static void
> -monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
> +monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict)
>   {
>       MonitorQAPIEventConf *evconf;
>       MonitorQAPIEventState *evstate;
> @@ -688,6 +688,48 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>       qemu_mutex_unlock(&monitor_lock);
>   }
>   
> +static void
> +monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
> +{
> +    /*
> +     * monitor_qapi_event_queue_no_recurse() is not reentrant: it
> +     * would deadlock on monitor_lock.  Work around by queueing
> +     * events in thread-local storage.
> +     * TODO: remove this, make it re-enter safe.
> +     */
> +    static __thread bool reentered;
> +    typedef struct MonitorQapiEvent {
> +        QAPIEvent event;
> +        QDict *qdict;
> +        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
> +    } MonitorQapiEvent;
> +    MonitorQapiEvent *ev;
> +    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
> +
> +    if (!reentered) {
> +        QSIMPLEQ_INIT(&event_queue);
> +    }

Is it safe to call QSIMPLEQ_INIT() on an already-initialized but empty 
queue?

> +
> +    ev = g_new(MonitorQapiEvent, 1);
> +    ev->qdict = qobject_ref(qdict);
> +    ev->event = event;
> +    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
> +    if (reentered) {
> +        return;
> +    }
> +
> +    reentered = true;
> +
> +    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
> +        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);
> +        monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict);
> +        qobject_unref(ev->qdict);
> +        g_free(ev);
> +    }
> +
> +    reentered = false;
> +}
> +
>   /*
>    * This function runs evconf->rate ns after sending a throttled
>    * event.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion
Posted by Marc-André Lureau 7 years, 3 months ago
Hi

On Tue, Jul 31, 2018 at 6:03 PM, Eric Blake <eblake@redhat.com> wrote:
> On 07/31/2018 10:01 AM, Marc-André Lureau wrote:
>>
>> With a Spice port chardev, it is possible to reenter
>> monitor_qapi_event_queue() (when the client disconnects for
>> example). This will dead-lock on monitor_lock.
>>
>> Instead, use some TLS variables to check for recursion and queue the
>> events.
>>
>
>>
>> diff --git a/monitor.c b/monitor.c
>> index d8d8211ae4..4d9c1873bc 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
>>    * applying any rate limiting if required.
>>    */
>>   static void
>> -monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>> +monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict)
>>   {
>>       MonitorQAPIEventConf *evconf;
>>       MonitorQAPIEventState *evstate;
>> @@ -688,6 +688,48 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>>       qemu_mutex_unlock(&monitor_lock);
>>   }
>>
>> +static void
>> +monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>> +{
>> +    /*
>> +     * monitor_qapi_event_queue_no_recurse() is not reentrant: it
>> +     * would deadlock on monitor_lock.  Work around by queueing
>> +     * events in thread-local storage.
>> +     * TODO: remove this, make it re-enter safe.
>> +     */
>> +    static __thread bool reentered;
>> +    typedef struct MonitorQapiEvent {
>> +        QAPIEvent event;
>> +        QDict *qdict;
>> +        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
>> +    } MonitorQapiEvent;
>> +    MonitorQapiEvent *ev;
>> +    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
>> +
>> +    if (!reentered) {
>> +        QSIMPLEQ_INIT(&event_queue);
>> +    }
>
>
> Is it safe to call QSIMPLEQ_INIT() on an already-initialized but empty
> queue?

#define QSIMPLEQ_INIT(head) do {                                        \
    (head)->sqh_first = NULL;                                           \
    (head)->sqh_last = &(head)->sqh_first;                              \
} while (/*CONSTCOND*/0)


It looks safe to me. The QSIMPLEQ macros do not allocate anything, and
some of them already call QSIMPLEQ_INIT on initialized queues as part
of other operations.

Note that I would rather use QSIMPLEQ_HEAD_INITIALIZER, but there is an error:


/home/elmarco/src/qemu/monitor.c: In function ‘monitor_qapi_event_queue’:
/home/elmarco/src/qemu/include/qemu/queue.h:247:13: error: initializer element is not constant
     { NULL, &(head).sqh_first }

Tbh, I don't really understand the problem! :)
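
A likely explanation, offered as a sketch rather than a definitive diagnosis: the initializer takes the address of the head itself, and an object with static storage duration needs a constant initializer in C. For an ordinary static that address is an address constant, but a __thread object has a different address in every thread, so the compiler has nothing to fold in. The struct and function names below (struct head, get_tls_queue, tls_queue_ready) are made up for illustration. Only the head layout follows the QSIMPLEQ one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node;

/* Same shape as a QSIMPLEQ head: first element, address of last link. */
struct head {
    struct node *first;
    struct node **last;
};

/* Accepted: the address of an ordinary static is an address constant. */
static struct head plain_queue = { NULL, &plain_queue.first };

/*
 * Rejected by GCC with "initializer element is not constant": each
 * thread gets its own copy, so there is no single address to use.
 *
 * static __thread struct head tls_queue = { NULL, &tls_queue.first };
 */
static __thread struct head tls_queue;
static __thread bool tls_queue_ready;

/* Hypothetical accessor: set up the per-thread head on first use. */
static struct head *get_tls_queue(void)
{
    if (!tls_queue_ready) {
        tls_queue.first = NULL;
        tls_queue.last = &tls_queue.first;
        tls_queue_ready = true;
    }
    return &tls_queue;
}
```

Initialising at run time, as the patch does with QSIMPLEQ_INIT(), sidesteps the restriction. A separate "ready" flag is one alternative to re-initialising on every outermost call.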

>
>
>> +
>> +    ev = g_new(MonitorQapiEvent, 1);
>> +    ev->qdict = qobject_ref(qdict);
>> +    ev->event = event;
>> +    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
>> +    if (reentered) {
>> +        return;
>> +    }
>> +
>> +    reentered = true;
>> +
>> +    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
>> +        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);
>> +        monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict);
>> +        qobject_unref(ev->qdict);
>> +        g_free(ev);
>> +    }
>> +
>> +    reentered = false;
>> +}
>> +
>>   /*
>>    * This function runs evconf->rate ns after sending a throttled
>>    * event.
>>
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>



-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v2] monitor: temporary fix for dead-lock on event recursion
Posted by Markus Armbruster 7 years, 3 months ago
Eric Blake <eblake@redhat.com> writes:

> On 07/31/2018 10:01 AM, Marc-André Lureau wrote:
>> With a Spice port chardev, it is possible to reenter
>> monitor_qapi_event_queue() (when the client disconnects for
>> example). This will dead-lock on monitor_lock.
>>
>> Instead, use some TLS variables to check for recursion and queue the
>> events.
>>
>
>>
>> diff --git a/monitor.c b/monitor.c
>> index d8d8211ae4..4d9c1873bc 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -633,7 +633,7 @@ static void monitor_qapi_event_handler(void *opaque);
>>    * applying any rate limiting if required.
>>    */
>>   static void
>> -monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>> +monitor_qapi_event_queue_no_recurse(QAPIEvent event, QDict *qdict)
>>   {
>>       MonitorQAPIEventConf *evconf;
>>       MonitorQAPIEventState *evstate;
>> @@ -688,6 +688,48 @@ monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>>       qemu_mutex_unlock(&monitor_lock);
>>   }
>>
>> +static void
>> +monitor_qapi_event_queue(QAPIEvent event, QDict *qdict, Error **errp)
>> +{
>> +    /*
>> +     * monitor_qapi_event_queue_no_recurse() is not reentrant: it
>> +     * would deadlock on monitor_lock.  Work around by queueing
>> +     * events in thread-local storage.
>> +     * TODO: remove this, make it re-enter safe.
>> +     */
>> +    static __thread bool reentered;
>> +    typedef struct MonitorQapiEvent {
>> +        QAPIEvent event;
>> +        QDict *qdict;
>> +        QSIMPLEQ_ENTRY(MonitorQapiEvent) entry;
>> +    } MonitorQapiEvent;
>> +    MonitorQapiEvent *ev;
>> +    static __thread QSIMPLEQ_HEAD(, MonitorQapiEvent) event_queue;
>> +
>> +    if (!reentered) {
>> +        QSIMPLEQ_INIT(&event_queue);
>> +    }
>
> Is it safe to call QSIMPLEQ_INIT() on an already-initialized but empty
> queue?

Yes.

    #define QSIMPLEQ_HEAD(name, type)                                       \
    struct name {                                                           \
        struct type *sqh_first;    /* first element */                      \
        struct type **sqh_last;    /* addr of last next element */          \
    }

    #define QSIMPLEQ_INIT(head) do {                                        \
        (head)->sqh_first = NULL;                                           \
        (head)->sqh_last = &(head)->sqh_first;                              \
    } while (/*CONSTCOND*/0)


Removing the last member results in the same state as QSIMPLEQ_INIT():

    #define QSIMPLEQ_REMOVE_HEAD(head, field) do {                          \
        if (((head)->sqh_first = (head)->sqh_first->field.sqe_next) == NULL)\
            (head)->sqh_last = &(head)->sqh_first;                          \
    } while (/*CONSTCOND*/0)

Calling QSIMPLEQ_INIT() again then is a no-op.
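
The argument above can be checked with a small standalone program. This is a sketch under assumptions: the HEAD, INIT and REMOVE_HEAD macros are the ones quoted in this thread, while QSIMPLEQ_ENTRY and QSIMPLEQ_INSERT_TAIL are written from the usual BSD-style definitions and may differ in detail from QEMU's queue.h. struct item, is_pristine() and drain_then_reinit() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Quoted in the thread. */
#define QSIMPLEQ_HEAD(name, type)                                       \
struct name {                                                           \
    struct type *sqh_first;    /* first element */                      \
    struct type **sqh_last;    /* addr of last next element */          \
}

#define QSIMPLEQ_INIT(head) do {                                        \
    (head)->sqh_first = NULL;                                           \
    (head)->sqh_last = &(head)->sqh_first;                              \
} while (/*CONSTCOND*/0)

#define QSIMPLEQ_REMOVE_HEAD(head, field) do {                          \
    if (((head)->sqh_first = (head)->sqh_first->field.sqe_next) == NULL)\
        (head)->sqh_last = &(head)->sqh_first;                          \
} while (/*CONSTCOND*/0)

/* Assumed BSD-style definitions, not quoted in the thread. */
#define QSIMPLEQ_ENTRY(type)                                            \
struct {                                                                \
    struct type *sqe_next;     /* next element */                       \
}

#define QSIMPLEQ_INSERT_TAIL(head, elm, field) do {                     \
    (elm)->field.sqe_next = NULL;                                       \
    *(head)->sqh_last = (elm);                                          \
    (head)->sqh_last = &(elm)->field.sqe_next;                          \
} while (/*CONSTCOND*/0)

struct item {
    int value;
    QSIMPLEQ_ENTRY(item) entry;
};

QSIMPLEQ_HEAD(item_queue, item);

/* True when the head is in exactly the state QSIMPLEQ_INIT() produces. */
static bool is_pristine(struct item_queue *q)
{
    return q->sqh_first == NULL && q->sqh_last == &q->sqh_first;
}

/* Insert one element, remove it, re-init; check the head at each step. */
static bool drain_then_reinit(void)
{
    struct item_queue q;
    struct item a = { .value = 1 };
    bool ok;

    QSIMPLEQ_INIT(&q);
    ok = is_pristine(&q);

    QSIMPLEQ_INSERT_TAIL(&q, &a, entry);
    ok = ok && !is_pristine(&q);

    QSIMPLEQ_REMOVE_HEAD(&q, entry);    /* removes the last member */
    ok = ok && is_pristine(&q);

    QSIMPLEQ_INIT(&q);                  /* second init changes nothing */
    ok = ok && is_pristine(&q);

    return ok;
}
```

drain_then_reinit() should return true, which matches the reasoning that the drain loop in the patch always leaves event_queue in the freshly initialised state before the next outermost call re-initialises it.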

>
>> +
>> +    ev = g_new(MonitorQapiEvent, 1);
>> +    ev->qdict = qobject_ref(qdict);
>> +    ev->event = event;
>> +    QSIMPLEQ_INSERT_TAIL(&event_queue, ev, entry);
>> +    if (reentered) {
>> +        return;
>> +    }
>> +
>> +    reentered = true;
>> +
>> +    while ((ev = QSIMPLEQ_FIRST(&event_queue)) != NULL) {
>> +        QSIMPLEQ_REMOVE_HEAD(&event_queue, entry);
>> +        monitor_qapi_event_queue_no_recurse(ev->event, ev->qdict);
>> +        qobject_unref(ev->qdict);
>> +        g_free(ev);
>> +    }
>> +
>> +    reentered = false;
>> +}
>> +
>>   /*
>>    * This function runs evconf->rate ns after sending a throttled
>>    * event.
>>