[Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex

Peter Xu posted 1 patch 5 years, 11 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20180418090622.20073-1-peterx@redhat.com
[Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex
Posted by Peter Xu 5 years, 11 months ago
We already have some tracing tools for mutexes, but they are not easy
to use for debugging e.g. deadlocks.  Let's provide an
"--enable-debug-mutex" configure option that lets QemuMutex store the
file and line of the last caller that took a specific lock.  This
should make deadlocks much easier to debug, since we can see directly
who took the lock as long as we can attach gdb to the process.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
v2:
- fix "--disable-debug-mutex" not working..
---
 configure                   | 10 ++++++++++
 include/qemu/thread-posix.h |  4 ++++
 util/qemu-thread-posix.c    |  4 ++++
 3 files changed, 18 insertions(+)

diff --git a/configure b/configure
index 0a19b033bc..a80af735b2 100755
--- a/configure
+++ b/configure
@@ -451,6 +451,7 @@ jemalloc="no"
 replication="yes"
 vxhs=""
 libxml2=""
+debug_mutex="no"
 
 supported_cpu="no"
 supported_os="no"
@@ -1374,6 +1375,10 @@ for opt do
   ;;
   --disable-git-update) git_update=no
   ;;
+  --enable-debug-mutex) debug_mutex=yes
+  ;;
+  --disable-debug-mutex) debug_mutex=no
+  ;;
   *)
       echo "ERROR: unknown option $opt"
       echo "Try '$0 --help' for more information"
@@ -1631,6 +1636,7 @@ disabled with --disable-FEATURE, default is enabled if available:
   crypto-afalg    Linux AF_ALG crypto backend driver
   vhost-user      vhost-user support
   capstone        capstone disassembler support
+  debug-mutex     mutex debugging support
 
 NOTE: The object files are built at the place where configure is launched
 EOF
@@ -5874,6 +5880,7 @@ echo "avx2 optimization $avx2_opt"
 echo "replication support $replication"
 echo "VxHS block device $vxhs"
 echo "capstone          $capstone"
+echo "mutex debugging   $debug_mutex"
 
 if test "$sdl_too_old" = "yes"; then
 echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -6602,6 +6609,9 @@ fi
 if test "$capstone" != "no" ; then
   echo "CONFIG_CAPSTONE=y" >> $config_host_mak
 fi
+if test "$debug_mutex" = "yes" ; then
+  echo "CONFIG_DEBUG_MUTEX=y" >> $config_host_mak
+fi
 
 # Hold two types of flag:
 #   CONFIG_THREAD_SETNAME_BYTHREAD  - we've got a way of setting the name on
diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f3f47e426f..fd27b34128 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
 
 struct QemuMutex {
     pthread_mutex_t lock;
+#ifdef CONFIG_DEBUG_MUTEX
+    const char *file;
+    int line;
+#endif
     bool initialized;
 };
 
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index b789cf32e9..905a816af6 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -68,6 +68,10 @@ void qemu_mutex_lock_impl(QemuMutex *mutex, const char *file, const int line)
     if (err)
         error_exit(err, __func__);
 
+#ifdef CONFIG_DEBUG_MUTEX
+    mutex->file = file;
+    mutex->line = line;
+#endif
     trace_qemu_mutex_locked(mutex, file, line);
 }
 
-- 
2.14.3
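
A minimal, self-contained sketch of the idea in the patch above, for
readers who want to try it outside the QEMU tree.  The names DbgMutex,
dbg_mutex_init, dbg_mutex_lock_impl, dbg_mutex_lock and
dbg_mutex_unlock are hypothetical stand-ins, not QEMU APIs; only the
pthread calls are real.  It assumes, as the patch does, that the fields
are written only after the lock has been acquired.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QemuMutex; not the real QEMU definition. */
typedef struct DbgMutex {
    pthread_mutex_t lock;
    const char *file;   /* source file of the most recent successful lock */
    int line;           /* source line of the most recent successful lock */
    bool initialized;
} DbgMutex;

static inline void dbg_mutex_init(DbgMutex *m)
{
    int err = pthread_mutex_init(&m->lock, NULL);
    assert(err == 0);
    (void)err;
    m->file = NULL;
    m->line = 0;
    m->initialized = true;
}

static inline void dbg_mutex_lock_impl(DbgMutex *m, const char *file, int line)
{
    assert(m->initialized);
    int err = pthread_mutex_lock(&m->lock);
    assert(err == 0);
    (void)err;
    /* Written only while the lock is held, so no extra synchronization. */
    m->file = file;
    m->line = line;
}

static inline void dbg_mutex_unlock(DbgMutex *m)
{
    assert(m->initialized);
    int err = pthread_mutex_unlock(&m->lock);
    assert(err == 0);
    (void)err;
    /* file/line are left as-is: they describe the last locker. */
}

/* Capture the call site, in the same spirit as the _impl wrappers. */
#define dbg_mutex_lock(m) dbg_mutex_lock_impl((m), __FILE__, __LINE__)
```

With a debugger attached to a hung process, printing the mutex a thread
is blocked on (for example with gdb's "print *mutex" in the blocked
frame) would then show the file and line of the last successful lock.
Since unlock does not clear the fields, they name the last locker, which
is the current owner only while the mutex is actually held.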


Re: [Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex
Posted by Fam Zheng 5 years, 11 months ago
On Wed, 04/18 17:06, Peter Xu wrote:
> We already have some tracing tools for mutexes, but they are not easy
> to use for debugging e.g. deadlocks.  Let's provide an
> "--enable-debug-mutex" configure option that lets QemuMutex store the
> file and line of the last caller that took a specific lock.  This
> should make deadlocks much easier to debug, since we can see directly
> who took the lock as long as we can attach gdb to the process.
> 
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Fam Zheng <famz@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> v2:
> - fix "--disable-debug-mutex" not working..
> ---
>  configure                   | 10 ++++++++++
>  include/qemu/thread-posix.h |  4 ++++
>  util/qemu-thread-posix.c    |  4 ++++
>  3 files changed, 18 insertions(+)
> 
> diff --git a/configure b/configure
> index 0a19b033bc..a80af735b2 100755
> --- a/configure
> +++ b/configure
> @@ -451,6 +451,7 @@ jemalloc="no"
>  replication="yes"
>  vxhs=""
>  libxml2=""
> +debug_mutex="no"
>  
>  supported_cpu="no"
>  supported_os="no"
> @@ -1374,6 +1375,10 @@ for opt do
>    ;;
>    --disable-git-update) git_update=no
>    ;;
> +  --enable-debug-mutex) debug_mutex=yes
> +  ;;
> +  --disable-debug-mutex) debug_mutex=no
> +  ;;
>    *)
>        echo "ERROR: unknown option $opt"
>        echo "Try '$0 --help' for more information"
> @@ -1631,6 +1636,7 @@ disabled with --disable-FEATURE, default is enabled if available:
>    crypto-afalg    Linux AF_ALG crypto backend driver
>    vhost-user      vhost-user support
>    capstone        capstone disassembler support
> +  debug-mutex     mutex debugging support
>  
>  NOTE: The object files are built at the place where configure is launched
>  EOF
> @@ -5874,6 +5880,7 @@ echo "avx2 optimization $avx2_opt"
>  echo "replication support $replication"
>  echo "VxHS block device $vxhs"
>  echo "capstone          $capstone"
> +echo "mutex debugging   $debug_mutex"
>  
>  if test "$sdl_too_old" = "yes"; then
>  echo "-> Your SDL version is too old - please upgrade to have SDL support"
> @@ -6602,6 +6609,9 @@ fi
>  if test "$capstone" != "no" ; then
>    echo "CONFIG_CAPSTONE=y" >> $config_host_mak
>  fi
> +if test "$debug_mutex" = "yes" ; then
> +  echo "CONFIG_DEBUG_MUTEX=y" >> $config_host_mak
> +fi
>  
>  # Hold two types of flag:
>  #   CONFIG_THREAD_SETNAME_BYTHREAD  - we've got a way of setting the name on
> diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
> index f3f47e426f..fd27b34128 100644
> --- a/include/qemu/thread-posix.h
> +++ b/include/qemu/thread-posix.h
> @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
>  
>  struct QemuMutex {
>      pthread_mutex_t lock;
> +#ifdef CONFIG_DEBUG_MUTEX
> +    const char *file;
> +    int line;
> +#endif

These look quite cheap; why do we need a configure option to enable them?

>      bool initialized;
>  };
>  
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index b789cf32e9..905a816af6 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -68,6 +68,10 @@ void qemu_mutex_lock_impl(QemuMutex *mutex, const char *file, const int line)
>      if (err)
>          error_exit(err, __func__);
>  
> +#ifdef CONFIG_DEBUG_MUTEX
> +    mutex->file = file;
> +    mutex->line = line;
> +#endif
>      trace_qemu_mutex_locked(mutex, file, line);
>  }
>  
> -- 
> 2.14.3
> 

Fam

Re: [Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex
Posted by Peter Xu 5 years, 11 months ago
On Thu, Apr 19, 2018 at 09:56:31AM +0800, Fam Zheng wrote:
> On Wed, 04/18 17:06, Peter Xu wrote:
> > We already have some tracing tools for mutexes, but they are not easy
> > to use for debugging e.g. deadlocks.  Let's provide an
> > "--enable-debug-mutex" configure option that lets QemuMutex store the
> > file and line of the last caller that took a specific lock.  This
> > should make deadlocks much easier to debug, since we can see directly
> > who took the lock as long as we can attach gdb to the process.
> > 
> > CC: Paolo Bonzini <pbonzini@redhat.com>
> > CC: Stefan Hajnoczi <stefanha@redhat.com>
> > CC: Fam Zheng <famz@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > v2:
> > - fix "--disable-debug-mutex" not working..
> > ---
> >  configure                   | 10 ++++++++++
> >  include/qemu/thread-posix.h |  4 ++++
> >  util/qemu-thread-posix.c    |  4 ++++
> >  3 files changed, 18 insertions(+)
> > 
> > diff --git a/configure b/configure
> > index 0a19b033bc..a80af735b2 100755
> > --- a/configure
> > +++ b/configure
> > @@ -451,6 +451,7 @@ jemalloc="no"
> >  replication="yes"
> >  vxhs=""
> >  libxml2=""
> > +debug_mutex="no"
> >  
> >  supported_cpu="no"
> >  supported_os="no"
> > @@ -1374,6 +1375,10 @@ for opt do
> >    ;;
> >    --disable-git-update) git_update=no
> >    ;;
> > +  --enable-debug-mutex) debug_mutex=yes
> > +  ;;
> > +  --disable-debug-mutex) debug_mutex=no
> > +  ;;
> >    *)
> >        echo "ERROR: unknown option $opt"
> >        echo "Try '$0 --help' for more information"
> > @@ -1631,6 +1636,7 @@ disabled with --disable-FEATURE, default is enabled if available:
> >    crypto-afalg    Linux AF_ALG crypto backend driver
> >    vhost-user      vhost-user support
> >    capstone        capstone disassembler support
> > +  debug-mutex     mutex debugging support
> >  
> >  NOTE: The object files are built at the place where configure is launched
> >  EOF
> > @@ -5874,6 +5880,7 @@ echo "avx2 optimization $avx2_opt"
> >  echo "replication support $replication"
> >  echo "VxHS block device $vxhs"
> >  echo "capstone          $capstone"
> > +echo "mutex debugging   $debug_mutex"
> >  
> >  if test "$sdl_too_old" = "yes"; then
> >  echo "-> Your SDL version is too old - please upgrade to have SDL support"
> > @@ -6602,6 +6609,9 @@ fi
> >  if test "$capstone" != "no" ; then
> >    echo "CONFIG_CAPSTONE=y" >> $config_host_mak
> >  fi
> > +if test "$debug_mutex" = "yes" ; then
> > +  echo "CONFIG_DEBUG_MUTEX=y" >> $config_host_mak
> > +fi
> >  
> >  # Hold two types of flag:
> >  #   CONFIG_THREAD_SETNAME_BYTHREAD  - we've got a way of setting the name on
> > diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
> > index f3f47e426f..fd27b34128 100644
> > --- a/include/qemu/thread-posix.h
> > +++ b/include/qemu/thread-posix.h
> > @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
> >  
> >  struct QemuMutex {
> >      pthread_mutex_t lock;
> > +#ifdef CONFIG_DEBUG_MUTEX
> > +    const char *file;
> > +    int line;
> > +#endif
> 
> These look quite cheap; why do we need a configure option to enable them?

Yeah; I am not 100% sure whether it's cheap or not, hence the configure
option.  I can remove that part if we are sure we always want it.

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex
Posted by Emilio G. Cota 5 years, 11 months ago
On Thu, Apr 19, 2018 at 11:13:35 +0800, Peter Xu wrote:
> On Thu, Apr 19, 2018 at 09:56:31AM +0800, Fam Zheng wrote:
(snip)
> > > @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
> > >  
> > >  struct QemuMutex {
> > >      pthread_mutex_t lock;
> > > +#ifdef CONFIG_DEBUG_MUTEX
> > > +    const char *file;
> > > +    int line;
> > > +#endif
> > 
> > These look quite cheap; why do we need a configure option to enable them?
> 
> Yeah; I am not 100% sure whether it's cheap or not, hence the configure
> option.  I can remove that part if we are sure we always want it.

I can think of a few good reasons not to enable these by default.

* Adds 12 bytes to struct QemuMutex on 64-bit hosts.
* Slightly lengthens the critical section (the assignment happens
  once the lock has been acquired)
  + This is measurable for a single thread with a microbenchmark:
Before (no --enable-debug-mutex):
$ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
Parameters:
 # of threads:      1
 duration:          10
 ops' range:        1024
Results:
Duration:            10 s
 Throughput:         57.59 Mops/s
 Throughput/thread:  57.59 Mops/s/thread

After (with --enable-debug-mutex):
$ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
Parameters:
 # of threads:      1
 duration:          10
 ops' range:        1024
Results:
Duration:            10 s
 Throughput:         56.25 Mops/s
 Throughput/thread:  56.25 Mops/s/thread

NB. The -m option for atomic_add-bench is not upstream yet, but
feel free to cherry-pick this commit: 
  https://github.com/cota/qemu/commit/f04f34df

  + A longer critical section can impact scalability, especially
    for large core counts.

Also note that there are some alternatives to this.
On POSIX systems when I need to debug mutexes I just revert
24fa904 ("qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK",
2015-03-10). Note that this doesn't work well with fork() in
linux-user, but I rarely need that.

Another alternative is to trace mutex_lock, which will give
you the same info, although at higher overhead (in both runtime
and disk usage).

So I'm not against this, but please keep it configured out.
BTW you might also want to add the file/line pair to
qemu-thread-win32.c, or hide the configure option from Windows
users.

Thanks,

		Emilio
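
A small sketch for checking the size cost on a given host.  The two
added fields total sizeof(const char *) + sizeof(int), which is 12
bytes on a typical 64-bit host, but the growth in sizeof of the whole
struct also depends on alignment padding and may come out different.
PlainMutex and DebugMutex are hypothetical mirrors of the struct with
and without CONFIG_DEBUG_MUTEX, not the QEMU definitions themselves.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Mirror of the struct without the debug fields (assumed layout). */
struct PlainMutex {
    pthread_mutex_t lock;
    bool initialized;
};

/* Mirror of the struct with the debug fields (assumed layout). */
struct DebugMutex {
    pthread_mutex_t lock;
    const char *file;
    int line;
    bool initialized;
};

/* Growth of sizeof, padding included, when the debug fields are added. */
static inline size_t debug_mutex_growth(void)
{
    assert(sizeof(struct DebugMutex) >= sizeof(struct PlainMutex));
    return sizeof(struct DebugMutex) - sizeof(struct PlainMutex);
}

/* Print the raw field cost next to the real struct sizes for this host. */
static inline void print_mutex_sizes(void)
{
    printf("added fields: %zu bytes\n", sizeof(const char *) + sizeof(int));
    printf("plain struct: %zu bytes\n", sizeof(struct PlainMutex));
    printf("debug struct: %zu bytes\n", sizeof(struct DebugMutex));
    printf("growth:       %zu bytes\n", debug_mutex_growth());
}
```

Calling print_mutex_sizes() from a test program on the target host
gives the per-mutex memory cost to weigh against the debugging benefit.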

Re: [Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex
Posted by Emilio G. Cota 5 years, 11 months ago
On Thu, Apr 19, 2018 at 15:43:01 -0400, Emilio G. Cota wrote:
> BTW you might also want to add the file/line pair to
> qemu-thread-win32.c, or hide the configure option from Windows
> users.

A few other call sites are missing as well, e.g. when returning from
pthread_cond_wait or with _trylock.

If you grep for trace_qemu_mutex_locked you'll find all the
places where the file/line update should occur.

		Emilio
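
The point about the other acquisition paths can be sketched as follows:
any path that ends with the caller holding the mutex should record the
call site, including a successful trylock and the return from a
condition-variable wait.  DbgMutex, dbg_mutex_record,
dbg_mutex_trylock_impl and dbg_cond_wait_impl are hypothetical names
for illustration, not the QEMU functions.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical mutex with last-locker bookkeeping (not QemuMutex). */
typedef struct DbgMutex {
    pthread_mutex_t lock;
    const char *file;
    int line;
} DbgMutex;

/* Must be called only while m->lock is held by the calling thread. */
static inline void dbg_mutex_record(DbgMutex *m, const char *file, int line)
{
    m->file = file;
    m->line = line;
}

/* Returns 0 if the lock was taken, EBUSY if it was already held. */
static inline int dbg_mutex_trylock_impl(DbgMutex *m, const char *file,
                                         int line)
{
    int err = pthread_mutex_trylock(&m->lock);
    if (err == 0) {
        dbg_mutex_record(m, file, line);   /* record only on success */
    } else {
        assert(err == EBUSY);
    }
    return err;
}

static inline void dbg_cond_wait_impl(pthread_cond_t *cond, DbgMutex *m,
                                      const char *file, int line)
{
    int err = pthread_cond_wait(cond, &m->lock);
    assert(err == 0);
    (void)err;
    /* pthread_cond_wait returns with the mutex re-acquired. */
    dbg_mutex_record(m, file, line);
}

#define dbg_mutex_trylock(m) dbg_mutex_trylock_impl((m), __FILE__, __LINE__)
#define dbg_cond_wait(c, m)  dbg_cond_wait_impl((c), (m), __FILE__, __LINE__)
```

A failed trylock deliberately leaves the fields untouched, so they keep
pointing at the thread that actually holds the lock.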

Re: [Qemu-devel] [PATCH v2] QemuMutex: support --enable-debug-mutex
Posted by Peter Xu 5 years, 11 months ago
On Thu, Apr 19, 2018 at 03:43:01PM -0400, Emilio G. Cota wrote:
> On Thu, Apr 19, 2018 at 11:13:35 +0800, Peter Xu wrote:
> > On Thu, Apr 19, 2018 at 09:56:31AM +0800, Fam Zheng wrote:
> (snip)
> > > > @@ -12,6 +12,10 @@ typedef QemuMutex QemuRecMutex;
> > > >  
> > > >  struct QemuMutex {
> > > >      pthread_mutex_t lock;
> > > > +#ifdef CONFIG_DEBUG_MUTEX
> > > > +    const char *file;
> > > > +    int line;
> > > > +#endif
> > > 
> > > > These look quite cheap; why do we need a configure option to enable them?
> > 
> > > Yeah; I am not 100% sure whether it's cheap or not, hence the configure
> > > option.  I can remove that part if we are sure we always want it.
> 
> I can think of a few good reasons not to enable these by default.
> 
> * Adds 12 bytes to struct QemuMutex on 64-bit hosts.

Yes, I worried about this too.  Mutexes are still widely used, and
there can be a large number of locks that are rarely taken but still
consume the space.

> * Slightly lengthens the critical section (the assignment happens
>   once the lock has been acquired)
>   + This is measurable for a single thread with a microbenchmark:
> Before (no --enable-debug-mutex):
> $ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
> Parameters:
>  # of threads:      1
>  duration:          10
>  ops' range:        1024
> Results:
> Duration:            10 s
>  Throughput:         57.59 Mops/s
>  Throughput/thread:  57.59 Mops/s/thread
> 
> After (with --enable-debug-mutex):
> $ taskset -c 0 tests/atomic_add-bench -n 1 -m -d 10
> Parameters:
>  # of threads:      1
>  duration:          10
>  ops' range:        1024
> Results:
> Duration:            10 s
>  Throughput:         56.25 Mops/s
>  Throughput/thread:  56.25 Mops/s/thread
> 
> NB. The -m option for atomic_add-bench is not upstream yet, but
> feel free to cherry-pick this commit: 
>   https://github.com/cota/qemu/commit/f04f34df
> 
>   + A longer critical section can impact scalability, especially
>     for large core counts.

Thanks for the testing report.  Then I think I'll keep the configure
part, and keep it off by default.

> 
> Also note that there are some alternatives to this.
> On POSIX systems when I need to debug mutexes I just revert
> 24fa904 ("qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK",
> 2015-03-10). Note that this doesn't work well with fork() in
> linux-user, but I rarely need that.

Note that this patch can even help to debug missing-unlock paths (when
the owner of a lock forgets to release it after use).  AFAIU
PTHREAD_MUTEX_ERRORCHECK can't do that: it only provides the ID of the
thread that took the lock (and the lock type), so we can't tell where
the lock was actually taken.  Or is there a way that I missed?

> 
> Another alternative is to trace mutex_lock, which will give
> you the same info, although at higher overhead (in both runtime
> and disk usage).
> 
> So I'm not against this, but please keep it configured out.
> BTW you might also want to add the file/line pair to
> qemu-thread-win32.c, or hide the configure option from Windows
> users.

Yes, I missed some other paths that will take the lock, and it won't
be hard to support Windows too.  Thanks for your input.

-- 
Peter Xu
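
For comparison, a minimal sketch of what PTHREAD_MUTEX_ERRORCHECK gives
on its own.  An error-checking mutex reports a relock from the owning
thread as EDEADLK instead of hanging, but the error carries no call-site
information, which is the gap the file/line fields are meant to fill.
errorcheck_mutex_init and relock_result are hypothetical helper names;
the pthread calls are standard POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialize *m as an error-checking mutex; returns 0 or an errno value. */
static inline int errorcheck_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err != 0) {
        return err;
    }
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (err == 0) {
        err = pthread_mutex_init(m, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return err;
}

/* Lock twice from the same thread and return the second call's result. */
static inline int relock_result(void)
{
    pthread_mutex_t m;
    int err = errorcheck_mutex_init(&m);
    assert(err == 0);
    (void)err;

    err = pthread_mutex_lock(&m);
    assert(err == 0);

    /* With a default mutex this could block forever; here it fails. */
    int second = pthread_mutex_lock(&m);

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

The returned error says that the calling thread already owns the mutex,
not where in the source it was taken, so it detects self-deadlock but
does not by itself point at a forgotten unlock in another thread.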