[PATCH v4 16/23] util: fix interleaving of error & trace output

Daniel P. Berrangé posted 23 patches 3 days, 5 hours ago
Maintainers: Gerd Hoffmann <kraxel@redhat.com>, Christian Schoenebeck <qemu_oss@crudebyte.com>, Markus Armbruster <armbru@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, "Dr. David Alan Gilbert" <dave@treblig.org>, Manos Pitsidianakis <manos.pitsidianakis@linaro.org>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Stefan Weil <sw@weilnetz.de>
[PATCH v4 16/23] util: fix interleaving of error & trace output
Posted by Daniel P. Berrangé via Devel 3 days, 5 hours ago
The monitor_cur_hmp() function will acquire/release mutex locks, which
will trigger trace probes, which can in turn trigger qemu_log() calls.
vreport() calls monitor_cur() multiple times through its execution
both directly and indirectly via error_vprintf().

The result is that the prefix information printed by vreport() gets
interleaved with qemu_log() output, when run outside the context of
an HMP command dispatcher. This can be seen with:

 $ qemu-system-x86_64 \
     -msg timestamp=on,guest-name=on \
     -display none \
     -object tls-creds-x509,id=f,dir=fish \
     -name fish \
     -d trace:qemu_mutex*
   2025-09-10T16:30:42.514374Z qemu_mutex_unlock released mutex 0x560b0339b4c0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:30:42.514400Z qemu_mutex_lock waiting on mutex 0x560b033983e0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:30:42.514402Z qemu_mutex_locked taken mutex 0x560b033983e0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:30:42.514404Z qemu_mutex_unlock released mutex 0x560b033983e0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:30:42.516716Z qemu_mutex_lock waiting on mutex 0x560b03398560 (../monitor/monitor.c:91)
   2025-09-10T16:30:42.516723Z qemu_mutex_locked taken mutex 0x560b03398560 (../monitor/monitor.c:91)
   2025-09-10T16:30:42.516726Z qemu_mutex_unlock released mutex 0x560b03398560 (../monitor/monitor.c:96)
   2025-09-10T16:30:42.516728Z qemu_mutex_lock waiting on mutex 0x560b03398560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842057Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842058Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
   2025-09-10T16:31:04.842055Z 2025-09-10T16:31:04.842060Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842061Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842062Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
   2025-09-10T16:31:04.842064Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842065Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842066Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
   fish 2025-09-10T16:31:04.842068Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842069Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842070Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
   2025-09-10T16:31:04.842072Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842097Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842099Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
   qemu-system-x86_64:2025-09-10T16:31:04.842100Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842102Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842103Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
    2025-09-10T16:31:04.842105Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842106Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842107Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)
   Unable to access credentials fish/ca-cert.pem: No such file or directory2025-09-10T16:31:04.842109Z qemu_mutex_lock waiting on mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842110Z qemu_mutex_locked taken mutex 0x564f5e401560 (../monitor/monitor.c:91)
   2025-09-10T16:31:04.842111Z qemu_mutex_unlock released mutex 0x564f5e401560 (../monitor/monitor.c:96)

To avoid this interleaving (as well as reduce the huge number of
mutex lock/unlock calls) we need to ensure that monitor_cur_is_hmp()
is only called once at the start of vreport(), and if no HMP is
present, no further monitor APIs can be called.

This implies error_[v]printf() cannot be called from vreport(),
instead we must introduce error_[v]printf_mon() which accept a
pre-acquired Monitor object.

 $ qemu-system-x86_64 \
     -msg timestamp=on,guest-name=on \
     -display none \
     -object tls-creds-x509,id=f,dir=fish \
     -name fish \
     -d trace:qemu_mutex*
   2025-09-10T16:31:22.701691Z qemu_mutex_unlock released mutex 0x5626fd3b84c0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:31:22.701728Z qemu_mutex_lock waiting on mutex 0x5626fd3b53e0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:31:22.701730Z qemu_mutex_locked taken mutex 0x5626fd3b53e0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:31:22.701732Z qemu_mutex_unlock released mutex 0x5626fd3b53e0 (/var/home/berrange/src/virt/qemu/include/qemu/lockable.h:56)
   2025-09-10T16:31:22.703989Z qemu_mutex_lock waiting on mutex 0x5626fd3b5560 (../monitor/monitor.c:91)
   2025-09-10T16:31:22.703996Z qemu_mutex_locked taken mutex 0x5626fd3b5560 (../monitor/monitor.c:91)
   2025-09-10T16:31:22.703999Z qemu_mutex_unlock released mutex 0x5626fd3b5560 (../monitor/monitor.c:96)
   2025-09-10T16:31:22.704000Z fish qemu-system-x86_64: Unable to access credentials fish/ca-cert.pem: No such file or directory

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 util/error-report.c | 67 +++++++++++++++++++++++++++++++--------------
 1 file changed, 46 insertions(+), 21 deletions(-)

diff --git a/util/error-report.c b/util/error-report.c
index 872158ad90..222c40d0d1 100644
--- a/util/error-report.c
+++ b/util/error-report.c
@@ -29,9 +29,12 @@ bool message_with_timestamp;
 bool error_with_guestname;
 const char *error_guest_name;
 
-int error_vprintf(const char *fmt, va_list ap)
+/*
+ * Print to current monitor if we have one, else to stderr.
+ */
+static int G_GNUC_PRINTF(2, 0)
+error_vprintf_mon(Monitor *cur_mon, const char *fmt, va_list ap)
 {
-    Monitor *cur_mon = monitor_cur();
     /*
      * This will return -1 if 'cur_mon' is NULL, or is QMP.
      * IOW this will only print if in HMP, otherwise we
@@ -44,13 +47,30 @@ int error_vprintf(const char *fmt, va_list ap)
     return ret;
 }
 
+static int G_GNUC_PRINTF(2, 3)
+error_printf_mon(Monitor *cur_mon, const char *fmt, ...)
+{
+    va_list ap;
+    int ret;
+
+    va_start(ap, fmt);
+    ret = error_vprintf_mon(cur_mon, fmt, ap);
+    va_end(ap);
+    return ret;
+}
+
+int error_vprintf(const char *fmt, va_list ap)
+{
+    return error_vprintf_mon(monitor_cur(), fmt, ap);
+}
+
 int error_printf(const char *fmt, ...)
 {
     va_list ap;
     int ret;
 
     va_start(ap, fmt);
-    ret = error_vprintf(fmt, ap);
+    ret = error_vprintf_mon(monitor_cur(), fmt, ap);
     va_end(ap);
     return ret;
 }
@@ -153,34 +173,34 @@ void loc_set_file(const char *fname, int lno)
 /*
  * Print current location to current monitor if we have one, else to stderr.
  */
-static void print_loc(void)
+static void print_loc(Monitor *cur)
 {
     const char *sep = "";
     int i;
     const char *const *argp;
 
-    if (!monitor_cur_is_hmp() && g_get_prgname()) {
-        error_printf("%s:", g_get_prgname());
+    if (!cur && g_get_prgname()) {
+        error_printf_mon(NULL, "%s:", g_get_prgname());
         sep = " ";
     }
     switch (cur_loc->kind) {
     case LOC_CMDLINE:
         argp = cur_loc->ptr;
         for (i = 0; i < cur_loc->num; i++) {
-            error_printf("%s%s", sep, argp[i]);
+            error_printf_mon(cur, "%s%s", sep, argp[i]);
             sep = " ";
         }
-        error_printf(": ");
+        error_printf_mon(cur, ": ");
         break;
     case LOC_FILE:
-        error_printf("%s:", (const char *)cur_loc->ptr);
+        error_printf_mon(cur, "%s:", (const char *)cur_loc->ptr);
         if (cur_loc->num) {
-            error_printf("%d:", cur_loc->num);
+            error_printf_mon(cur, "%d:", cur_loc->num);
         }
-        error_printf(" ");
+        error_printf_mon(cur, " ");
         break;
     default:
-        error_printf("%s", sep);
+        error_printf_mon(cur, "%s", sep);
     }
 }
 
@@ -201,34 +221,39 @@ real_time_iso8601(void)
 G_GNUC_PRINTF(2, 0)
 static void vreport(report_type type, const char *fmt, va_list ap)
 {
+    Monitor *cur = NULL;
     gchar *timestr;
 
-    if (message_with_timestamp && !monitor_cur_is_hmp()) {
+    if (monitor_cur_is_hmp()) {
+        cur = monitor_cur();
+    }
+
+    if (message_with_timestamp && !cur) {
         timestr = real_time_iso8601();
-        error_printf("%s ", timestr);
+        error_printf_mon(NULL, "%s ", timestr);
         g_free(timestr);
     }
 
     /* Only prepend guest name if -msg guest-name and -name guest=... are set */
-    if (error_with_guestname && error_guest_name && !monitor_cur_is_hmp()) {
-        error_printf("%s ", error_guest_name);
+    if (error_with_guestname && error_guest_name && !cur) {
+        error_printf_mon(NULL, "%s ", error_guest_name);
     }
 
-    print_loc();
+    print_loc(cur);
 
     switch (type) {
     case REPORT_TYPE_ERROR:
         break;
     case REPORT_TYPE_WARNING:
-        error_printf("warning: ");
+        error_printf_mon(cur, "warning: ");
         break;
     case REPORT_TYPE_INFO:
-        error_printf("info: ");
+        error_printf_mon(cur, "info: ");
         break;
     }
 
-    error_vprintf(fmt, ap);
-    error_printf("\n");
+    error_vprintf_mon(cur, fmt, ap);
+    error_printf_mon(cur, "\n");
 }
 
 /*
-- 
2.50.1

Re: [PATCH v4 16/23] util: fix interleaving of error & trace output
Posted by Richard Henderson 2 days, 18 hours ago
On 9/25/25 02:44, Daniel P. Berrangé wrote:
> +    if (monitor_cur_is_hmp()) {
> +        cur = monitor_cur();
> +    }

Didn't your last patch set return Montor* from monitor_cur_is_hmp?
Because this takes the locks in monitor_cur() twice.

> +    if (message_with_timestamp && !cur) {
>          timestr = real_time_iso8601();
> -        error_printf("%s ", timestr);
> +        error_printf_mon(NULL, "%s ", timestr);

Passing NULL to error_printf_mon is fprintf to stderr.


r~

Re: [PATCH v4 16/23] util: fix interleaving of error & trace output
Posted by Daniel P. Berrangé 2 days ago
On Thu, Sep 25, 2025 at 12:54:13PM -0700, Richard Henderson wrote:
> On 9/25/25 02:44, Daniel P. Berrangé wrote:
> > +    if (monitor_cur_is_hmp()) {
> > +        cur = monitor_cur();
> > +    }
> 
> Didn't your last patch set return Montor* from monitor_cur_is_hmp?
> Because this takes the locks in monitor_cur() twice.

Yes, however, the intent of both this & the previous version is to
optimize the scenario where the monitor is NULL or QMP. If we take
an extra lock in the HMP case that's acceptable and doesn't cause
ill effets.

> 
> > +    if (message_with_timestamp && !cur) {
> >          timestr = real_time_iso8601();
> > -        error_printf("%s ", timestr);
> > +        error_printf_mon(NULL, "%s ", timestr);
> 
> Passing NULL to error_printf_mon is fprintf to stderr.

True, I could have rewritten this to just fprintf. It goes away
later in the series when its all moved into the new qcontext API
call, which is given stderr directly. I guess making it clear it
is stderr at this point in the series, gives reassurance that the
later qcontext conversion hasn't changed the output target.


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|