`pthread_mutex_destroy`, `pthread_mutex_lock` and `pthread_mutex_unlock`
return an error code that is currently ignored.
Add debug information if one of these operations failed, e.g. when there
is an attempt to destroy a still locked mutex or unlock an already
unlocked mutex. Both scenarios are considered undefined behavior.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
---
src/util/virthread.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/util/virthread.c b/src/util/virthread.c
index 5422bb74fd..14116a2221 100644
--- a/src/util/virthread.c
+++ b/src/util/virthread.c
@@ -35,7 +35,10 @@
#include "viralloc.h"
#include "virthreadjob.h"
+#include "virlog.h"
+#define VIR_FROM_THIS VIR_FROM_THREAD
+VIR_LOG_INIT("util.thread");
int virOnce(virOnceControl *once, virOnceFunc init)
{
@@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
void virMutexDestroy(virMutex *m)
{
- pthread_mutex_destroy(&m->lock);
+ if (pthread_mutex_destroy(&m->lock)) {
+ VIR_WARN("Failed to destroy mutex=%p", m);
+ }
}
void virMutexLock(virMutex *m)
{
- pthread_mutex_lock(&m->lock);
+ if (pthread_mutex_lock(&m->lock)) {
+ VIR_WARN("Failed to lock mutex=%p", m);
+ }
}
void virMutexUnlock(virMutex *m)
{
- pthread_mutex_unlock(&m->lock);
+ if (pthread_mutex_unlock(&m->lock)) {
+ VIR_WARN("Failed to unlock mutex=%p", m);
+ }
}
virLockGuard virLockGuardLock(virMutex *m)
--
2.43.0
On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote:
> `pthread_mutex_destroy`, `pthread_mutex_lock` and `pthread_mutex_unlock`
> return an error code that is currently ignored.
>
> Add debug information if one of these operations failed, e.g. when there
> is an attempt to destroy a still locked mutex or unlock an already
> unlocked mutex. Both scenarios are considered undefined behavior.
>
> Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
> ---
> src/util/virthread.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/src/util/virthread.c b/src/util/virthread.c
> index 5422bb74fd..14116a2221 100644
> --- a/src/util/virthread.c
> +++ b/src/util/virthread.c
> @@ -35,7 +35,10 @@
>
> #include "viralloc.h"
> #include "virthreadjob.h"
> +#include "virlog.h"
>
> +#define VIR_FROM_THIS VIR_FROM_THREAD
> +VIR_LOG_INIT("util.thread");
>
> int virOnce(virOnceControl *once, virOnceFunc init)
> {
> @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
>
> void virMutexDestroy(virMutex *m)
> {
> - pthread_mutex_destroy(&m->lock);
> + if (pthread_mutex_destroy(&m->lock)) {
> + VIR_WARN("Failed to destroy mutex=%p", m);
> + }
> }
>
> void virMutexLock(virMutex *m)
> {
> - pthread_mutex_lock(&m->lock);
> + if (pthread_mutex_lock(&m->lock)) {
> + VIR_WARN("Failed to lock mutex=%p", m);
> + }
> }
>
> void virMutexUnlock(virMutex *m)
> {
> - pthread_mutex_unlock(&m->lock);
> + if (pthread_mutex_unlock(&m->lock)) {
> + VIR_WARN("Failed to unlock mutex=%p", m);
> + }
> }
I'd be surprised if these lock/unlock warnings ever trigger, since IIUC
they would need us to be using an error checking mutex, not a regular
mutex. IOW, aren't these just adding condition test overhead + unreachable
code to the lock calls ?
The 2nd patch shows failures in the destroy calls IIUC.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, 2024-07-04 at 08:10 +0100, Daniel P. Berrangé wrote:
> On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote:
> > `pthread_mutex_destroy`, `pthread_mutex_lock` and
> > `pthread_mutex_unlock`
> > return an error code that is currently ignored.
> >
> > Add debug information if one of these operations failed, e.g. when
> > there
> > is an attempt to destroy a still locked mutex or unlock an already
> > unlocked mutex. Both scenarios are considered undefined behavior.
> >
> > Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
> > ---
> > src/util/virthread.c | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/util/virthread.c b/src/util/virthread.c
> > index 5422bb74fd..14116a2221 100644
> > --- a/src/util/virthread.c
> > +++ b/src/util/virthread.c
> > @@ -35,7 +35,10 @@
> >
> > #include "viralloc.h"
> > #include "virthreadjob.h"
> > +#include "virlog.h"
> >
> > +#define VIR_FROM_THIS VIR_FROM_THREAD
> > +VIR_LOG_INIT("util.thread");
> >
> > int virOnce(virOnceControl *once, virOnceFunc init)
> > {
> > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
> >
> > void virMutexDestroy(virMutex *m)
> > {
> > - pthread_mutex_destroy(&m->lock);
> > + if (pthread_mutex_destroy(&m->lock)) {
> > + VIR_WARN("Failed to destroy mutex=%p", m);
> > + }
> > }
> >
> > void virMutexLock(virMutex *m)
> > {
> > - pthread_mutex_lock(&m->lock);
> > + if (pthread_mutex_lock(&m->lock)) {
> > + VIR_WARN("Failed to lock mutex=%p", m);
> > + }
> > }
> >
> > void virMutexUnlock(virMutex *m)
> > {
> > - pthread_mutex_unlock(&m->lock);
> > + if (pthread_mutex_unlock(&m->lock)) {
> > + VIR_WARN("Failed to unlock mutex=%p", m);
> > + }
> > }
>
> I'd be surprised if these lock/unlock warnings ever trigger, since
> IIUC
> they would need us to be using an error checking mutex, not a regular
> mutex. IOW, aren't these just adding condition test overhead +
> unreachable
> code to the lock calls ?
>
> The 2nd patch shows failures in the destroy calls IIUC.
>
> With regards,
> Daniel
I have looked more closely into the issue now. pthread_mutex_lock and
pthread_mutex_unlock do indeed not return a non-zero value over us not
using error checking mutexes.
During my last attempt at fixing the issues I had a patch that would
count lockings and unlockings of mutexes explicitly, and I believe I
recall seeing problems in that area as well. Sadly, I cannot reproduce
that now, at least not reliably: Ignoring the warnings for
pthread_mutex_destroy, virnetdaemontest does seem to trigger my "number
of locks == number of unlocks" check in about 3 out of 10.000 runs. And
sometimes with a frequency of 1 in 10. Sometimes not at all. In any
case: I do not consider the checks for locking / unlocking dead code.
So far I have been using the test suite to check for obvious issues,
but I cannot rule out that libvirt itself has race conditions too.
I would advocate for merging this patch as is, and add a patch to
enable error checking for the mutexes.
Regards,
Tim
On Thu, Jul 18, 2024 at 03:50:27PM +0200, Tim Wiederhake wrote:
> On Thu, 2024-07-04 at 08:10 +0100, Daniel P. Berrangé wrote:
> > On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote:
> > > `pthread_mutex_destroy`, `pthread_mutex_lock` and
> > > `pthread_mutex_unlock`
> > > return an error code that is currently ignored.
> > >
> > > Add debug information if one of these operations failed, e.g. when
> > > there
> > > is an attempt to destroy a still locked mutex or unlock an already
> > > unlocked mutex. Both scenarios are considered undefined behavior.
> > >
> > > Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
> > > ---
> > > src/util/virthread.c | 15 ++++++++++++---
> > > 1 file changed, 12 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/src/util/virthread.c b/src/util/virthread.c
> > > index 5422bb74fd..14116a2221 100644
> > > --- a/src/util/virthread.c
> > > +++ b/src/util/virthread.c
> > > @@ -35,7 +35,10 @@
> > >
> > > #include "viralloc.h"
> > > #include "virthreadjob.h"
> > > +#include "virlog.h"
> > >
> > > +#define VIR_FROM_THIS VIR_FROM_THREAD
> > > +VIR_LOG_INIT("util.thread");
> > >
> > > int virOnce(virOnceControl *once, virOnceFunc init)
> > > {
> > > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
> > >
> > > void virMutexDestroy(virMutex *m)
> > > {
> > > - pthread_mutex_destroy(&m->lock);
> > > + if (pthread_mutex_destroy(&m->lock)) {
> > > + VIR_WARN("Failed to destroy mutex=%p", m);
> > > + }
> > > }
> > >
> > > void virMutexLock(virMutex *m)
> > > {
> > > - pthread_mutex_lock(&m->lock);
> > > + if (pthread_mutex_lock(&m->lock)) {
> > > + VIR_WARN("Failed to lock mutex=%p", m);
> > > + }
> > > }
> > >
> > > void virMutexUnlock(virMutex *m)
> > > {
> > > - pthread_mutex_unlock(&m->lock);
> > > + if (pthread_mutex_unlock(&m->lock)) {
> > > + VIR_WARN("Failed to unlock mutex=%p", m);
> > > + }
> > > }
> >
> > I'd be surprised if these lock/unlock warnings ever trigger, since
> > IIUC
> > they would need us to be using an error checking mutex, not a regular
> > mutex. IOW, aren't these just adding condition test overhead +
> > unreachable
> > code to the lock calls ?
> >
> > The 2nd patch shows failures in the destroy calls IIUC.
>
> I have looked more closely into the issue now. pthread_mutex_lock and
> pthread_mutex_unlock do indeed not return a non-zero value over us not
> using error checking mutexes.
>
> During my last attempt at fixing the issues I had a patch that would
> count lockings and unlockings of mutexes explicitly, and I believe I
> recall seeing problems in that area as well. Sadly, I cannot reproduce
> that now, at least not reliably: Ignoring the warnings for
> pthread_mutex_destroy, virnetdaemontest does seem to trigger my "number
> of locks == number of unlocks" check in about 3 out of 10.000 runs. And
> sometimes with a frequency of 1 in 10. Sometimes not at all. In any
> case: I do not consider the checks for locking / unlocking dead code.
>
> So far I have been using the test suite to check for obvious issues,
> but I cannot rule out that libvirt itself has race conditions too.
But if the checks never fire, due to the mutex type, this isn't
helping us diagnose anything surely ?
> I would advocate for merging this patch as is, and add a patch to
> enable error checking for the mutexes.
Error checking mutexes aren't something we can easily enable, because
they break any code which needs to hold a mutex across fork() and
unlock in the child. IOW, we would need two different types of mutex
and pick which to use, where.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 7/3/24 14:44, Tim Wiederhake wrote:
> `pthread_mutex_destroy`, `pthread_mutex_lock` and `pthread_mutex_unlock`
> return an error code that is currently ignored.
>
> Add debug information if one of these operations failed, e.g. when there
> is an attempt to destroy a still locked mutex or unlock an already
> unlocked mutex. Both scenarios are considered undefined behavior.
>
> Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
> ---
> src/util/virthread.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/src/util/virthread.c b/src/util/virthread.c
> index 5422bb74fd..14116a2221 100644
> --- a/src/util/virthread.c
> +++ b/src/util/virthread.c
> @@ -35,7 +35,10 @@
>
> #include "viralloc.h"
> #include "virthreadjob.h"
> +#include "virlog.h"
>
> +#define VIR_FROM_THIS VIR_FROM_THREAD
> +VIR_LOG_INIT("util.thread");
>
> int virOnce(virOnceControl *once, virOnceFunc init)
> {
> @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
>
> void virMutexDestroy(virMutex *m)
> {
> - pthread_mutex_destroy(&m->lock);
> + if (pthread_mutex_destroy(&m->lock)) {
> + VIR_WARN("Failed to destroy mutex=%p", m);
> + }
> }
>
> void virMutexLock(virMutex *m)
> {
> - pthread_mutex_lock(&m->lock);
> + if (pthread_mutex_lock(&m->lock)) {
> + VIR_WARN("Failed to lock mutex=%p", m);
> + }
> }
>
> void virMutexUnlock(virMutex *m)
> {
> - pthread_mutex_unlock(&m->lock);
> + if (pthread_mutex_unlock(&m->lock)) {
> + VIR_WARN("Failed to unlock mutex=%p", m);
> + }
> }
>
> virLockGuard virLockGuardLock(virMutex *m)
I'm a bit worried that without additional info (e.g. stacktrace) this
info is going to be useless. Though, even stacktrace might be not enough
:( But this can at least let us know there's something suspicious going on.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michal
© 2016 - 2026 Red Hat, Inc.