[PULL 3/5] linux-user/syscall.c: Prevent acquiring clone_lock while fork()

Richard Henderson posted 5 patches 4 days ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Riku Voipio <riku.voipio@iki.fi>, Laurent Vivier <laurent@vivier.eu>, Pierrick Bouvier <pierrick.bouvier@linaro.org>
There is a newer version of this series
Posted by Richard Henderson 4 days ago
From: Aleksandr Sergeev <sergeev0xef@gmail.com>

Per POSIX, fork() copies only the thread that calls it.
So it may happen that, while one thread is doing a fork,
another thread is holding the `clone_lock` mutex
(e.g. doing a `fork()` or `exit()`).
The child process is then born with the mutex held,
and there is nobody left to release it.
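
A minimal standalone sketch of the hazard, not QEMU code: `held_lock`,
`holder()` and `child_sees_held_lock()` are hypothetical names, and the
child uses pthread_mutex_trylock() only to observe the inherited state
(strictly speaking it is not async-signal-safe after fork()).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for clone_lock. */
static pthread_mutex_t held_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int holder_ready;
static atomic_int holder_release;

/* Second thread: takes the lock and keeps it until told to let go. */
static void *holder(void *arg)
{
    int rc = pthread_mutex_lock(&held_lock);

    (void)arg;
    assert(rc == 0);
    (void)rc;
    atomic_store(&holder_ready, 1);
    while (!atomic_load(&holder_release)) {
        usleep(1000);
    }
    pthread_mutex_unlock(&held_lock);
    return NULL;
}

/*
 * Fork while another thread owns the lock.
 * Returns 1 if the child found the lock held, 0 if it was free,
 * and -1 on error.
 */
int child_sees_held_lock(void)
{
    pthread_t tid;
    pid_t pid;
    int status;

    atomic_store(&holder_ready, 0);
    atomic_store(&holder_release, 0);
    if (pthread_create(&tid, NULL, holder, NULL) != 0) {
        return -1;
    }
    while (!atomic_load(&holder_ready)) {
        usleep(1000);
    }

    pid = fork();
    if (pid == 0) {
        /* Only the forking thread exists here; holder() was not copied. */
        int rc = pthread_mutex_trylock(&held_lock);
        _exit(rc == EBUSY ? 1 : 0);
    }

    /* Parent: let the holder finish, then collect the child's verdict. */
    atomic_store(&holder_release, 1);
    pthread_join(tid, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a blocking pthread_mutex_lock() in place of the trylock, the child
would be expected to wait forever, since the owning thread does not exist
in the child.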

Because a thread executing do_syscall() is not considered running,
start_exclusive() does not wait for it and so does not protect us
from this case.
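
The pattern used by the fix (take the lock before fork(), reinitialize it
in the child, unlock it in the parent) can be sketched outside QEMU as
follows. This is an illustration under assumptions, not the patched code:
`guarded_lock`, `demo_fork_start()`, `demo_fork_end()` and
`fork_with_lock_held()` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for clone_lock. */
static pthread_mutex_t guarded_lock = PTHREAD_MUTEX_INITIALIZER;

/* Before fork(): own the lock so no other thread can hold it. */
static void demo_fork_start(void)
{
    int rc = pthread_mutex_lock(&guarded_lock);

    assert(rc == 0);
    (void)rc;
}

/* After fork(): the child gets a fresh mutex, the parent unlocks. */
static void demo_fork_end(bool child)
{
    if (child) {
        pthread_mutex_init(&guarded_lock, NULL);
    } else {
        pthread_mutex_unlock(&guarded_lock);
    }
}

/*
 * Fork with the lock held across the call.
 * Returns 0 if the child could take the lock, 1 if it could not,
 * and -1 on error.
 */
int fork_with_lock_held(void)
{
    pid_t pid;
    int status;

    demo_fork_start();
    pid = fork();
    demo_fork_end(pid == 0);

    if (pid == 0) {
        /* The lock was reinitialized above, so this should succeed. */
        int rc = pthread_mutex_trylock(&guarded_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    if (pid < 0 || waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the forking thread itself owns the lock at the moment of fork(),
no other thread can be inside the critical section, and the child never
inherits a mutex owned by a thread that no longer exists.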

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3226
Signed-off-by: Aleksandr Sergeev <sergeev0xef@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260126151612.2176451-1-sergeev0xef@gmail.com>
---
 linux-user/user-internals.h |  2 ++
 linux-user/main.c           |  2 ++
 linux-user/syscall.c        | 14 ++++++++++++++
 3 files changed, 18 insertions(+)

diff --git a/linux-user/user-internals.h b/linux-user/user-internals.h
index 067c02bb93..24d35998f0 100644
--- a/linux-user/user-internals.h
+++ b/linux-user/user-internals.h
@@ -69,6 +69,8 @@ abi_long get_errno(abi_long ret);
 const char *target_strerror(int err);
 int get_osversion(void);
 void init_qemu_uname_release(void);
+void clone_fork_start(void);
+void clone_fork_end(bool child);
 void fork_start(void);
 void fork_end(pid_t pid);
 
diff --git a/linux-user/main.c b/linux-user/main.c
index db751c0757..c49d1e91d2 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -145,6 +145,7 @@ unsigned long guest_stack_size = TARGET_DEFAULT_STACK_SIZE;
 void fork_start(void)
 {
     start_exclusive();
+    clone_fork_start();
     mmap_fork_start();
     cpu_list_lock();
     qemu_plugin_user_prefork_lock();
@@ -174,6 +175,7 @@ void fork_end(pid_t pid)
         cpu_list_unlock();
     }
     gdbserver_fork_end(thread_cpu, pid);
+    clone_fork_end(child);
     /*
      * qemu_init_cpu_list() reinitialized the child exclusive state, but we
      * also need to keep current_cpu consistent, so call end_exclusive() for
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3e6a56aa0f..d466d0e32f 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -6856,6 +6856,20 @@ static void *clone_func(void *arg)
     return NULL;
 }
 
+void clone_fork_start(void)
+{
+    pthread_mutex_lock(&clone_lock);
+}
+
+void clone_fork_end(bool child)
+{
+    if (child) {
+        pthread_mutex_init(&clone_lock, NULL);
+    } else {
+        pthread_mutex_unlock(&clone_lock);
+    }
+}
+
 /* do_fork() Must return host values and target errnos (unlike most
    do_*() functions). */
 static int do_fork(CPUArchState *env, unsigned int flags, abi_ulong newsp,
-- 
2.43.0