We already have some tracing tools for mutexes, but they are not easy to
use for debugging e.g. deadlocks. Let's provide a "--enable-debug-mutex"
configure option that lets QemuMutex store the file and line of the last
caller that took the lock. This should make deadlocks much easier to
debug: as long as we can attach gdb to the process, we can directly see
who took the lock.
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
v2:
- fix "--disable-debug-mutex" not working..
---
configure | 10 ++++++++++
include/qemu/thread-posix.h | 4 ++++
util/qemu-thread-posix.c | 4 ++++
3 files changed, 18 insertions(+)
diff --git a/configure b/configure
index 0a19b033bc..a80af735b2 100755
--- a/configure
+++ b/configure
@@ -451,6 +451,7 @@ jemalloc="no"
replication="yes"
vxhs=""
libxml2=""
+debug_mutex="no"
supported_cpu="no"
supported_os="no"
@@ -1374,6 +1375,10 @@ for opt do
;;
--disable-git-update) git_update=no
;;
+ --enable-debug-mutex) debug_mutex=yes
+ ;;
+ --disable-debug-mutex) debug_mutex=no
+ ;;
*)
echo "ERROR: unknown option $opt"
echo "Try '$0 --help' for more information"
@@ -1631,6 +1636,7 @@ disabled with --disable-FEATURE, default is enabled if available:
crypto-afalg Linux AF_ALG crypto backend driver
vhost-user vhost-user support
capstone capstone disassembler support
+ debug-mutex mutex debugging support
NOTE: The object files are built at the place where configure is launched
EOF
@@ -5874,6 +5880,7 @@ echo "avx2 optimization $avx2_opt"
echo "replication support $replication"
echo "VxHS block device $vxhs"
echo "capstone $capstone"
+echo "mutex debugging $debug_mutex"
if test "$sdl_too_old" = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -6602,6 +6609,9 @@ fi
if test "$capstone" != "no" ; then
echo "CONFIG_CAPSTONE=y" >> $config_host_mak
fi
+if test "$debug_mutex" = "yes" ; then
+ echo "CONFIG_DEBUG_MUTEX=y" >> $config_host_mak
+fi
# Hold two types of flag:
# CONFIG_THREAD_SETNAME_BYTHREAD - we've got a way of setting the name on
diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f3f47e426f..fd27b34128 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;

 struct QemuMutex {
     pthread_mutex_t lock;
+#ifdef CONFIG_DEBUG_MUTEX
+    const char *file;
+    int line;
+#endif
     bool initialized;
 };
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index b789cf32e9..905a816af6 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -68,6 +68,10 @@ void qemu_mutex_lock_impl(QemuMutex *mutex, const char *file, const int line)
     if (err)
         error_exit(err, __func__);

+#ifdef CONFIG_DEBUG_MUTEX
+    mutex->file = file;
+    mutex->line = line;
+#endif
     trace_qemu_mutex_locked(mutex, file, line);
 }
--
2.14.3
On Wed, 04/18 17:06, Peter Xu wrote:
(snip)
> diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
> index f3f47e426f..fd27b34128 100644
> --- a/include/qemu/thread-posix.h
> +++ b/include/qemu/thread-posix.h
> @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
>
>  struct QemuMutex {
>      pthread_mutex_t lock;
> +#ifdef CONFIG_DEBUG_MUTEX
> +    const char *file;
> +    int line;
> +#endif

These look quite cheap, why we need a configure option to enable it?

>      bool initialized;
>  };
(snip)

Fam
On Thu, Apr 19, 2018 at 09:56:31AM +0800, Fam Zheng wrote:
> On Wed, 04/18 17:06, Peter Xu wrote:
(snip)
> > diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
> > index f3f47e426f..fd27b34128 100644
> > --- a/include/qemu/thread-posix.h
> > +++ b/include/qemu/thread-posix.h
> > @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
> >
> >  struct QemuMutex {
> >      pthread_mutex_t lock;
> > +#ifdef CONFIG_DEBUG_MUTEX
> > +    const char *file;
> > +    int line;
> > +#endif
>
> These look quite cheap, why we need a configure option to enable it?

Yeah; I am not 100% sure about whether it's cheap or not, hence with
that. I can remove that part if we are sure we want it always.

--
Peter Xu
On Thu, Apr 19, 2018 at 11:13:35 +0800, Peter Xu wrote:
> On Thu, Apr 19, 2018 at 09:56:31AM +0800, Fam Zheng wrote:
(snip)
> > > @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
> > >
> > >  struct QemuMutex {
> > >      pthread_mutex_t lock;
> > > +#ifdef CONFIG_DEBUG_MUTEX
> > > +    const char *file;
> > > +    int line;
> > > +#endif
> >
> > These look quite cheap, why we need a configure option to enable it?
>
> Yeah; I am not 100% sure about whether it's cheap or not, hence with
> that. I can remove that part if we are sure we want it always.

I can think of a few good reasons not to enable these by default.

* Adds 12 bytes to struct QemuMutex on 64-bit hosts.

* Increases slightly the critical region (the assignment happens
  once the lock has been acquired)
  + This is measurable for a single-thread with a microbenchmark:
    Before (no --enable-debug-mutex):
    $ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
    Parameters:
     # of threads:  1
     duration:      10
     ops' range:    1024
    Results:
    Duration:            10 s
     Throughput:         57.59 Mops/s
     Throughput/thread:  57.59 Mops/s/thread

    After (with --enable-debug-mutex):
    $ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
    Parameters:
     # of threads:  1
     duration:      10
     ops' range:    1024
    Results:
    Duration:            10 s
     Throughput:         56.25 Mops/s
     Throughput/thread:  56.25 Mops/s/thread

    NB. The -m option for atomic_add-bench is not upstream yet, but
    feel free to cherry-pick this commit:
      https://github.com/cota/qemu/commit/f04f34df

  + A longer critical section can impact scalability, especially
    for large core counts.

Also note that there are some alternatives to this.
On POSIX systems when I need to debug mutexes I just revert
24fa904 ("qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK",
2015-03-10). Note that this doesn't work well with fork() in
linux-user, but I rarely need that.

Another alternative is to trace mutex_lock, that will give
you the same info although at higher overhead (in both runtime
and disk usage).

So I'm not against this, but please keep it configured out.
BTW you might also want to add the file/line pair to
qemu-thread-win32.c, or hide the configure option to Windows
users.

Thanks,

		Emilio
On Thu, Apr 19, 2018 at 15:43:01 -0400, Emilio G. Cota wrote:
> BTW you might also want to add the file/line pair to
> qemu-thread-win32.c, or hide the configure option to Windows
> users.

A few other call sites are missing as well, e.g. when returning from
pthread_cond_wait or with _trylock. If you grep for
trace_qemu_mutex_locked you'll find all the places where the file/line
update should occur.

		Emilio
On Thu, Apr 19, 2018 at 03:43:01PM -0400, Emilio G. Cota wrote:
> On Thu, Apr 19, 2018 at 11:13:35 +0800, Peter Xu wrote:
> > On Thu, Apr 19, 2018 at 09:56:31AM +0800, Fam Zheng wrote:
> (snip)
> > > > @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
> > > >
> > > >  struct QemuMutex {
> > > >      pthread_mutex_t lock;
> > > > +#ifdef CONFIG_DEBUG_MUTEX
> > > > +    const char *file;
> > > > +    int line;
> > > > +#endif
> > >
> > > These look quite cheap, why we need a configure option to enable it?
> >
> > Yeah; I am not 100% sure about whether it's cheap or not, hence with
> > that. I can remove that part if we are sure we want it always.
>
> I can think of a few good reasons not to enable these by default.
>
> * Adds 12 bytes to struct QemuMutex on 64-bit hosts.

Yes, I worried about this too. Mutexes are still widely used, and
there can be a large number of locks that are rarely used but still
consume the space.

> * Increases slightly the critical region (the assignment happens
>   once the lock has been acquired)
>   + This is measurable for a single-thread with a microbenchmark:
>     Before (no --enable-debug-mutex):
>     $ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
>     Parameters:
>      # of threads:  1
>      duration:      10
>      ops' range:    1024
>     Results:
>     Duration:            10 s
>      Throughput:         57.59 Mops/s
>      Throughput/thread:  57.59 Mops/s/thread
>
>     After (with --enable-debug-mutex):
>     $ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
>     Parameters:
>      # of threads:  1
>      duration:      10
>      ops' range:    1024
>     Results:
>     Duration:            10 s
>      Throughput:         56.25 Mops/s
>      Throughput/thread:  56.25 Mops/s/thread
>
>     NB. The -m option for atomic_add-bench is not upstream yet, but
>     feel free to cherry-pick this commit:
>       https://github.com/cota/qemu/commit/f04f34df
>
>   + A longer critical section can impact scalability, especially
>     for large core counts.

Thanks for the testing report. Then I think I'll keep the configure
part, and keep it off by default.

>
> Also note that there are some alternatives to this.
> On POSIX systems when I need to debug mutexes I just revert
> 24fa904 ("qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK",
> 2015-03-10). Note that this doesn't work well with fork() in
> linux-user, but I rarely need that.

Note that this patch can also help to debug missing-unlock paths (when
the owner of a lock forgets to release it after use). AFAIU
PTHREAD_MUTEX_ERRORCHECK can't do that: it only provides the ID of the
thread that has taken the lock (and the lock type), so we can't tell
where the lock was actually taken. Or is there a way that I missed?

>
> Another alternative is to trace mutex_lock, that will give
> you the same info although at higher overhead (in both runtime
> and disk usage).
>
> So I'm not against this, but please keep it configured out.
> BTW you might also want to add the file/line pair to
> qemu-thread-win32.c, or hide the configure option to Windows
> users.

Yes, I missed some other paths that take the lock, and it won't be hard
to support Windows too.

Thanks for your input.

--
Peter Xu