trace/simple.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-)
When compiling QEMU with --enable-trace-backends=simple , the
iotest 233 is currently hanging. This happens because qemu-nbd
calls trace_init_backends() first - which causes simpletrace to
install its writer thread and the atexit() handler - before
calling fork(). But the simpletrace writer thread is then only
available in the parent process, not in the child process anymore.
Thus when the child process exits, its atexit handler waits forever
on the trace_empty_cond condition to be set by the non-existing
writer thread, so the process never finishes.
Fix it by installing a pthread_atfork() handler, too, which
makes sure that the trace_writeout_enabled variable gets set
to false again in the child process, so we can use it in the
atexit() handler to check whether we still need to wait on the
writer thread or not.
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
trace/simple.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/trace/simple.c b/trace/simple.c
index c0aba00cb7f..269bbda69f1 100644
--- a/trace/simple.c
+++ b/trace/simple.c
@@ -380,8 +380,22 @@ void st_print_trace_file_status(void)
void st_flush_trace_buffer(void)
{
- flush_trace_file(true);
+ flush_trace_file(trace_writeout_enabled);
+}
+
+#ifndef _WIN32
+static void trace_thread_atfork(void)
+{
+ /*
+ * If we fork, the writer thread does not exist in the child, so
+ * make sure to allow st_flush_trace_buffer() to clean up correctly.
+ */
+ g_mutex_lock(&trace_lock);
+ trace_writeout_enabled = false;
+ g_cond_signal(&trace_empty_cond);
+ g_mutex_unlock(&trace_lock);
}
+#endif
/* Helper function to create a thread with signals blocked. Use glib's
* portable threads since QEMU abstractions cannot be used due to reentrancy in
@@ -396,6 +410,7 @@ static GThread *trace_thread_create(GThreadFunc fn)
sigfillset(&set);
pthread_sigmask(SIG_SETMASK, &set, &oldset);
+ pthread_atfork(NULL, NULL, trace_thread_atfork);
#endif
thread = g_thread_new("trace-thread", fn, NULL);
--
2.48.1
On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote: > When compiling QEMU with --enable-trace-backends=simple , the > iotest 233 is currently hanging. This happens because qemu-nbd > calls trace_init_backends() first - which causes simpletrace to > install its writer thread and the atexit() handler - before > calling fork(). But the simpletrace writer thread is then only > available in the parent process, not in the child process anymore. > Thus when the child process exits, its atexit handler waits forever > on the trace_empty_cond condition to be set by the non-existing > writer thread, so the process never finishes. > > Fix it by installing a pthread_atfork() handler, too, which > makes sure that the trace_writeout_enabled variable gets set > to false again in the child process, so we can use it in the > atexit() handler to check whether we still need to wait on the > writer thread or not. > > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > trace/simple.c | 17 ++++++++++++++++- > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/trace/simple.c b/trace/simple.c > index c0aba00cb7f..269bbda69f1 100644 > --- a/trace/simple.c > +++ b/trace/simple.c > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void) > > void st_flush_trace_buffer(void) > { > - flush_trace_file(true); > + flush_trace_file(trace_writeout_enabled); > +} > + > +#ifndef _WIN32 > +static void trace_thread_atfork(void) > +{ > + /* > + * If we fork, the writer thread does not exist in the child, so > + * make sure to allow st_flush_trace_buffer() to clean up correctly. > + */ > + g_mutex_lock(&trace_lock); And are we sure trace_lock was previously initialized in memory visible to the thread that is calling the after-fork handler here? POSIX admits that pthread_atfork() is generally too hard to use successfully (there are just too many corner cases), and instead recommends that the only portable thing a multi-threaded app can do after forking is limit itself to async-signal-safe functions until it execs a new program. The other common reason for forking is for daemonization; there, the advice is to be completely single-threaded until after the fork() has set up the right environment for the daemon, at which point the child can then finally go multi-threaded. And with either of those solutions (Using fork() to spawn a child process? do nothing unsafe after fork except for what it takes to reach exec. Using fork() to daemonize? do nothing unsafe before fork) eliminates the need for using pthread_atfork() in the first place. At any rate, per your plea elsewhere in the thread, I'll take a look at deferring the logging init until after qemu-nbd has daemonized. -- Eric Blake, Principal Software Engineer Red Hat, Inc. Virtualization: qemu.org | libguestfs.org
Am 26.02.2025 um 09:50 hat Thomas Huth geschrieben: > When compiling QEMU with --enable-trace-backends=simple , the > iotest 233 is currently hanging. This happens because qemu-nbd > calls trace_init_backends() first - which causes simpletrace to > install its writer thread and the atexit() handler - before > calling fork(). But the simpletrace writer thread is then only > available in the parent process, not in the child process anymore. > Thus when the child process exits, its atexit handler waits forever > on the trace_empty_cond condition to be set by the non-existing > writer thread, so the process never finishes. > > Fix it by installing a pthread_atfork() handler, too, which > makes sure that the trace_writeout_enabled variable gets set > to false again in the child process, so we can use it in the > atexit() handler to check whether we still need to wait on the > writer thread or not. > > Signed-off-by: Thomas Huth <thuth@redhat.com> I can see how this would fix the hang, but do we actually get the trace events written out somewhere then? Or do we need to make sure that the child process has a writer thread, too? Of course, the question would then be how the two processes writing into the same trace file interact. Kevin
On Wed, Feb 26, 2025 at 10:29:59AM +0100, Kevin Wolf wrote: > Am 26.02.2025 um 09:50 hat Thomas Huth geschrieben: > > When compiling QEMU with --enable-trace-backends=simple , the > > iotest 233 is currently hanging. This happens because qemu-nbd > > calls trace_init_backends() first - which causes simpletrace to > > install its writer thread and the atexit() handler - before > > calling fork(). But the simpletrace writer thread is then only > > available in the parent process, not in the child process anymore. > > Thus when the child process exits, its atexit handler waits forever > > on the trace_empty_cond condition to be set by the non-existing > > writer thread, so the process never finishes. > > > > Fix it by installing a pthread_atfork() handler, too, which > > makes sure that the trace_writeout_enabled variable gets set > > to false again in the child process, so we can use it in the > > atexit() handler to check whether we still need to wait on the > > writer thread or not. > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > I can see how this would fix the hang, but do we actually get the trace > events written out somewhere then? Or do we need to make sure that the > child process has a writer thread, too? > > Of course, the question would then be how the two processes writing into > the same trace file interact. In st_set_trace_file, if no explicit filename is given by the user we use a default filename "trace-$PID", so *if* the atfork handler in the child were to reset its file after fork it'll get a distinct filename. Happy days. If the user provided an explicit filename though, we just use that as is and would end up with dualling writers and likely data corruption. While we could blindly append a pid to the user's filename it is somewhat unhelpful in the daemonize case to split it across files. Whether we actually want tracing in a child also possibly depends on the reason for fork. If we're forking to daemonize, then we (mostly) want tracing in the child and (mostly) no longer care about the parent. If we're forking to exec another app, then we (mostly) want tracing in the parent and (mostly) no longer care about the child. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote: > When compiling QEMU with --enable-trace-backends=simple , the > iotest 233 is currently hanging. This happens because qemu-nbd > calls trace_init_backends() first - which causes simpletrace to > install its writer thread and the atexit() handler - before > calling fork(). But the simpletrace writer thread is then only > available in the parent process, not in the child process anymore. > Thus when the child process exits, its atexit handler waits forever > on the trace_empty_cond condition to be set by the non-existing > writer thread, so the process never finishes. > > Fix it by installing a pthread_atfork() handler, too, which > makes sure that the trace_writeout_enabled variable gets set > to false again in the child process, so we can use it in the > atexit() handler to check whether we still need to wait on the > writer thread or not. > > Signed-off-by: Thomas Huth <thuth@redhat.com> > --- > trace/simple.c | 17 ++++++++++++++++- > 1 file changed, 16 insertions(+), 1 deletion(-) > > diff --git a/trace/simple.c b/trace/simple.c > index c0aba00cb7f..269bbda69f1 100644 > --- a/trace/simple.c > +++ b/trace/simple.c > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void) > > void st_flush_trace_buffer(void) > { > - flush_trace_file(true); > + flush_trace_file(trace_writeout_enabled); > +} > + > +#ifndef _WIN32 > +static void trace_thread_atfork(void) > +{ > + /* > + * If we fork, the writer thread does not exist in the child, so > + * make sure to allow st_flush_trace_buffer() to clean up correctly. > + */ > + g_mutex_lock(&trace_lock); > + trace_writeout_enabled = false; > + g_cond_signal(&trace_empty_cond); > + g_mutex_unlock(&trace_lock); > } > +#endif This doesn't seem right to me. This is being run in the child and while it may avoid the hang when the child exits, surely it still leaves tracing non-functional in the child as we're lacking the thread to write out the trace data. > > /* Helper function to create a thread with signals blocked. Use glib's > * portable threads since QEMU abstractions cannot be used due to reentrancy in > @@ -396,6 +410,7 @@ static GThread *trace_thread_create(GThreadFunc fn) > > sigfillset(&set); > pthread_sigmask(SIG_SETMASK, &set, &oldset); > + pthread_atfork(NULL, NULL, trace_thread_atfork); > #endif > > thread = g_thread_new("trace-thread", fn, NULL); > -- > 2.48.1 > > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 26/02/2025 10.15, Daniel P. Berrangé wrote: > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote: >> When compiling QEMU with --enable-trace-backends=simple , the >> iotest 233 is currently hanging. This happens because qemu-nbd >> calls trace_init_backends() first - which causes simpletrace to >> install its writer thread and the atexit() handler - before >> calling fork(). But the simpletrace writer thread is then only >> available in the parent process, not in the child process anymore. >> Thus when the child process exits, its atexit handler waits forever >> on the trace_empty_cond condition to be set by the non-existing >> writer thread, so the process never finishes. >> >> Fix it by installing a pthread_atfork() handler, too, which >> makes sure that the trace_writeout_enabled variable gets set >> to false again in the child process, so we can use it in the >> atexit() handler to check whether we still need to wait on the >> writer thread or not. >> >> Signed-off-by: Thomas Huth <thuth@redhat.com> >> --- >> trace/simple.c | 17 ++++++++++++++++- >> 1 file changed, 16 insertions(+), 1 deletion(-) >> >> diff --git a/trace/simple.c b/trace/simple.c >> index c0aba00cb7f..269bbda69f1 100644 >> --- a/trace/simple.c >> +++ b/trace/simple.c >> @@ -380,8 +380,22 @@ void st_print_trace_file_status(void) >> >> void st_flush_trace_buffer(void) >> { >> - flush_trace_file(true); >> + flush_trace_file(trace_writeout_enabled); >> +} >> + >> +#ifndef _WIN32 >> +static void trace_thread_atfork(void) >> +{ >> + /* >> + * If we fork, the writer thread does not exist in the child, so >> + * make sure to allow st_flush_trace_buffer() to clean up correctly. >> + */ >> + g_mutex_lock(&trace_lock); >> + trace_writeout_enabled = false; >> + g_cond_signal(&trace_empty_cond); >> + g_mutex_unlock(&trace_lock); >> } >> +#endif > > This doesn't seem right to me. This is being run in the child and while > it may avoid the hang when the child exits, surely it still leaves tracing > non-functional in the child as we're lacking the thread to write out the > trace data. Well, you cannot write to the same file from the parent and child at the same time, so one of both needs to be shut up AFAIU. And the simpletrace code cannot now which one of the two processes should be allowed to continue with the logging, so we either have to disable tracing in one of the two processes, or think of something completely different, e.g. using pthread_atfork(abort, NULL, NULL) to make people aware that they are not allowed to start tracing before calling fork()...? But in that case we still need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the trace backend before calling fork(). Thomas
On Wed, Feb 26, 2025 at 10:38:56AM +0100, Thomas Huth wrote: > On 26/02/2025 10.15, Daniel P. Berrangé wrote: > > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote: > > > When compiling QEMU with --enable-trace-backends=simple , the > > > iotest 233 is currently hanging. This happens because qemu-nbd > > > calls trace_init_backends() first - which causes simpletrace to > > > install its writer thread and the atexit() handler - before > > > calling fork(). But the simpletrace writer thread is then only > > > available in the parent process, not in the child process anymore. > > > Thus when the child process exits, its atexit handler waits forever > > > on the trace_empty_cond condition to be set by the non-existing > > > writer thread, so the process never finishes. > > > > > > Fix it by installing a pthread_atfork() handler, too, which > > > makes sure that the trace_writeout_enabled variable gets set > > > to false again in the child process, so we can use it in the > > > atexit() handler to check whether we still need to wait on the > > > writer thread or not. > > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > > --- > > > trace/simple.c | 17 ++++++++++++++++- > > > 1 file changed, 16 insertions(+), 1 deletion(-) > > > > > > diff --git a/trace/simple.c b/trace/simple.c > > > index c0aba00cb7f..269bbda69f1 100644 > > > --- a/trace/simple.c > > > +++ b/trace/simple.c > > > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void) > > > void st_flush_trace_buffer(void) > > > { > > > - flush_trace_file(true); > > > + flush_trace_file(trace_writeout_enabled); > > > +} > > > + > > > +#ifndef _WIN32 > > > +static void trace_thread_atfork(void) > > > +{ > > > + /* > > > + * If we fork, the writer thread does not exist in the child, so > > > + * make sure to allow st_flush_trace_buffer() to clean up correctly. > > > + */ > > > + g_mutex_lock(&trace_lock); > > > + trace_writeout_enabled = false; > > > + g_cond_signal(&trace_empty_cond); > > > + g_mutex_unlock(&trace_lock); > > > } > > > +#endif > > > > This doesn't seem right to me. This is being run in the child and while > > it may avoid the hang when the child exits, surely it still leaves tracing > > non-functional in the child as we're lacking the thread to write out the > > trace data. > > Well, you cannot write to the same file from the parent and child at the > same time, so one of both needs to be shut up AFAIU. And the simpletrace > code cannot now which one of the two processes should be allowed to continue > with the logging, so we either have to disable tracing in one of the two > processes, or think of something completely different, e.g. using > pthread_atfork(abort, NULL, NULL) to make people aware that they are not > allowed to start tracing before calling fork()...? But in that case we still > need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the > trace backend before calling fork(). As precedent, in system/vl.c we delay trace_init() until after daemonizing which is the simple way to avoid the worst of the danger. It would still be nice to have an atfork() handler to fully eliminate the danger though With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Wed, Feb 26, 2025 at 09:53:53AM +0000, Daniel P. Berrangé wrote: > On Wed, Feb 26, 2025 at 10:38:56AM +0100, Thomas Huth wrote: > > On 26/02/2025 10.15, Daniel P. Berrangé wrote: > > > On Wed, Feb 26, 2025 at 09:50:15AM +0100, Thomas Huth wrote: > > > > When compiling QEMU with --enable-trace-backends=simple , the > > > > iotest 233 is currently hanging. This happens because qemu-nbd > > > > calls trace_init_backends() first - which causes simpletrace to > > > > install its writer thread and the atexit() handler - before > > > > calling fork(). But the simpletrace writer thread is then only > > > > available in the parent process, not in the child process anymore. > > > > Thus when the child process exits, its atexit handler waits forever > > > > on the trace_empty_cond condition to be set by the non-existing > > > > writer thread, so the process never finishes. > > > > > > > > Fix it by installing a pthread_atfork() handler, too, which > > > > makes sure that the trace_writeout_enabled variable gets set > > > > to false again in the child process, so we can use it in the > > > > atexit() handler to check whether we still need to wait on the > > > > writer thread or not. > > > > > > > > Signed-off-by: Thomas Huth <thuth@redhat.com> > > > > --- > > > > trace/simple.c | 17 ++++++++++++++++- > > > > 1 file changed, 16 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/trace/simple.c b/trace/simple.c > > > > index c0aba00cb7f..269bbda69f1 100644 > > > > --- a/trace/simple.c > > > > +++ b/trace/simple.c > > > > @@ -380,8 +380,22 @@ void st_print_trace_file_status(void) > > > > void st_flush_trace_buffer(void) > > > > { > > > > - flush_trace_file(true); > > > > + flush_trace_file(trace_writeout_enabled); > > > > +} > > > > + > > > > +#ifndef _WIN32 > > > > +static void trace_thread_atfork(void) > > > > +{ > > > > + /* > > > > + * If we fork, the writer thread does not exist in the child, so > > > > + * make sure to allow st_flush_trace_buffer() to clean up correctly. > > > > + */ > > > > + g_mutex_lock(&trace_lock); > > > > + trace_writeout_enabled = false; > > > > + g_cond_signal(&trace_empty_cond); > > > > + g_mutex_unlock(&trace_lock); > > > > } > > > > +#endif > > > > > > This doesn't seem right to me. This is being run in the child and while > > > it may avoid the hang when the child exits, surely it still leaves tracing > > > non-functional in the child as we're lacking the thread to write out the > > > trace data. > > > > Well, you cannot write to the same file from the parent and child at the > > same time, so one of both needs to be shut up AFAIU. And the simpletrace > > code cannot now which one of the two processes should be allowed to continue > > with the logging, so we either have to disable tracing in one of the two > > processes, or think of something completely different, e.g. using > > pthread_atfork(abort, NULL, NULL) to make people aware that they are not > > allowed to start tracing before calling fork()...? But in that case we still > > need a qemu-nbd expert to fix qemu-nbd, so that it does not initialize the > > trace backend before calling fork(). > > As precedent, in system/vl.c we delay trace_init() until after daemonizing > which is the simple way to avoid the worst of the danger. That sounds good to me. Adding Daniel Henrique Barboza because he fixed a similar issue in commit 10b6ee1616f9 ("vl.c: do not execute trace_init_backends() before daemonizing"). Stefan
© 2016 - 2025 Red Hat, Inc.