`pthread_mutex_destroy`, `pthread_mutex_lock` and `pthread_mutex_unlock`
return an error code that is currently ignored.
Add debug information if one of these operations failed, e.g. when there
is an attempt to destroy a still locked mutex or unlock an already
unlocked mutex. Both scenarios are considered undefined behavior.
Signed-off-by: Tim Wiederhake <twiederh@redhat.com>
---
src/util/virthread.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/util/virthread.c b/src/util/virthread.c
index 5422bb74fd..14116a2221 100644
--- a/src/util/virthread.c
+++ b/src/util/virthread.c
@@ -35,7 +35,10 @@
#include "viralloc.h"
#include "virthreadjob.h"
+#include "virlog.h"
+#define VIR_FROM_THIS VIR_FROM_THREAD
+VIR_LOG_INIT("util.thread");
int virOnce(virOnceControl *once, virOnceFunc init)
{
@@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m)
void virMutexDestroy(virMutex *m)
{
- pthread_mutex_destroy(&m->lock);
+ if (pthread_mutex_destroy(&m->lock)) {
+ VIR_WARN("Failed to destroy mutex=%p", m);
+ }
}
void virMutexLock(virMutex *m)
{
- pthread_mutex_lock(&m->lock);
+ if (pthread_mutex_lock(&m->lock)) {
+ VIR_WARN("Failed to lock mutex=%p", m);
+ }
}
void virMutexUnlock(virMutex *m)
{
- pthread_mutex_unlock(&m->lock);
+ if (pthread_mutex_unlock(&m->lock)) {
+ VIR_WARN("Failed to unlock mutex=%p", m);
+ }
}
virLockGuard virLockGuardLock(virMutex *m)
--
2.43.0
On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote: > `pthread_mutex_destroy`, `pthread_mutex_lock` and `pthread_mutex_unlock` > return an error code that is currently ignored. > > Add debug information if one of these operations failed, e.g. when there > is an attempt to destroy a still locked mutex or unlock an already > unlocked mutex. Both scenarios are considered undefined behavior. > > Signed-off-by: Tim Wiederhake <twiederh@redhat.com> > --- > src/util/virthread.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/src/util/virthread.c b/src/util/virthread.c > index 5422bb74fd..14116a2221 100644 > --- a/src/util/virthread.c > +++ b/src/util/virthread.c > @@ -35,7 +35,10 @@ > > #include "viralloc.h" > #include "virthreadjob.h" > +#include "virlog.h" > > +#define VIR_FROM_THIS VIR_FROM_THREAD > +VIR_LOG_INIT("util.thread"); > > int virOnce(virOnceControl *once, virOnceFunc init) > { > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m) > > void virMutexDestroy(virMutex *m) > { > - pthread_mutex_destroy(&m->lock); > + if (pthread_mutex_destroy(&m->lock)) { > + VIR_WARN("Failed to destroy mutex=%p", m); > + } > } > > void virMutexLock(virMutex *m) > { > - pthread_mutex_lock(&m->lock); > + if (pthread_mutex_lock(&m->lock)) { > + VIR_WARN("Failed to lock mutex=%p", m); > + } > } > > void virMutexUnlock(virMutex *m) > { > - pthread_mutex_unlock(&m->lock); > + if (pthread_mutex_unlock(&m->lock)) { > + VIR_WARN("Failed to unlock mutex=%p", m); > + } > } I'd be surprised if these lock/unlock warnings ever trigger, since IIUC they would need us to be using an error checking mutex, not a regular mutex. IOW, aren't these just adding condition test overhead + unreachable code to the lock calls ? The 2nd patch shows failures in the destroy calls IIUC. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, 2024-07-04 at 08:10 +0100, Daniel P. Berrangé wrote: > On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote: > > `pthread_mutex_destroy`, `pthread_mutex_lock` and > > `pthread_mutex_unlock` > > return an error code that is currently ignored. > > > > Add debug information if one of these operations failed, e.g. when > > there > > is an attempt to destroy a still locked mutex or unlock an already > > unlocked mutex. Both scenarios are considered undefined behavior. > > > > Signed-off-by: Tim Wiederhake <twiederh@redhat.com> > > --- > > src/util/virthread.c | 15 ++++++++++++--- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/src/util/virthread.c b/src/util/virthread.c > > index 5422bb74fd..14116a2221 100644 > > --- a/src/util/virthread.c > > +++ b/src/util/virthread.c > > @@ -35,7 +35,10 @@ > > > > #include "viralloc.h" > > #include "virthreadjob.h" > > +#include "virlog.h" > > > > +#define VIR_FROM_THIS VIR_FROM_THREAD > > +VIR_LOG_INIT("util.thread"); > > > > int virOnce(virOnceControl *once, virOnceFunc init) > > { > > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m) > > > > void virMutexDestroy(virMutex *m) > > { > > - pthread_mutex_destroy(&m->lock); > > + if (pthread_mutex_destroy(&m->lock)) { > > + VIR_WARN("Failed to destroy mutex=%p", m); > > + } > > } > > > > void virMutexLock(virMutex *m) > > { > > - pthread_mutex_lock(&m->lock); > > + if (pthread_mutex_lock(&m->lock)) { > > + VIR_WARN("Failed to lock mutex=%p", m); > > + } > > } > > > > void virMutexUnlock(virMutex *m) > > { > > - pthread_mutex_unlock(&m->lock); > > + if (pthread_mutex_unlock(&m->lock)) { > > + VIR_WARN("Failed to unlock mutex=%p", m); > > + } > > } > > I'd be surprised if these lock/unlock warnings ever trigger, since > IIUC > they would need us to be using an error checking mutex, not a regular > mutex. IOW, aren't these just adding condition test overhead + > unreachable > code to the lock calls ? > > The 2nd patch shows failures in the destroy calls IIUC. > > With regards, > Daniel I have looked more closely into the issue now. pthread_mutex_lock and pthread_mutex_unlock do indeed not return a non-zero value over us not using error checking mutexes. During my last attempt at fixing the issues I had a patch that would count lockings and unlockings of mutexes explicitly, and I believe I recall seeing problems in that area as well. Sadly, I cannot reproduce that now, at least not reliably: Ignoring the warnings for pthread_mutex_destroy, virnetdaemontest does seem to trigger my "number of locks == number of unlocks" check in about 3 out of 10.000 runs. And sometimes with a frequency of 1 in 10. Sometimes not at all. In any case: I do not consider the checks for locking / unlocking dead code. So far I have been using the test suite to check for obvious issues, but I cannot rule out that libvirt itself has race conditions too. I would advocate for merging this patch as is, and add a patch to enable error checking for the mutexes. Regards, Tim
On Thu, Jul 18, 2024 at 03:50:27PM +0200, Tim Wiederhake wrote: > On Thu, 2024-07-04 at 08:10 +0100, Daniel P. Berrangé wrote: > > On Wed, Jul 03, 2024 at 02:44:37PM +0200, Tim Wiederhake wrote: > > > `pthread_mutex_destroy`, `pthread_mutex_lock` and > > > `pthread_mutex_unlock` > > > return an error code that is currently ignored. > > > > > > Add debug information if one of these operations failed, e.g. when > > > there > > > is an attempt to destroy a still locked mutex or unlock an already > > > unlocked mutex. Both scenarios are considered undefined behavior. > > > > > > Signed-off-by: Tim Wiederhake <twiederh@redhat.com> > > > --- > > > src/util/virthread.c | 15 ++++++++++++--- > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > diff --git a/src/util/virthread.c b/src/util/virthread.c > > > index 5422bb74fd..14116a2221 100644 > > > --- a/src/util/virthread.c > > > +++ b/src/util/virthread.c > > > @@ -35,7 +35,10 @@ > > > > > > #include "viralloc.h" > > > #include "virthreadjob.h" > > > +#include "virlog.h" > > > > > > +#define VIR_FROM_THIS VIR_FROM_THREAD > > > +VIR_LOG_INIT("util.thread"); > > > > > > int virOnce(virOnceControl *once, virOnceFunc init) > > > { > > > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m) > > > > > > void virMutexDestroy(virMutex *m) > > > { > > > - pthread_mutex_destroy(&m->lock); > > > + if (pthread_mutex_destroy(&m->lock)) { > > > + VIR_WARN("Failed to destroy mutex=%p", m); > > > + } > > > } > > > > > > void virMutexLock(virMutex *m) > > > { > > > - pthread_mutex_lock(&m->lock); > > > + if (pthread_mutex_lock(&m->lock)) { > > > + VIR_WARN("Failed to lock mutex=%p", m); > > > + } > > > } > > > > > > void virMutexUnlock(virMutex *m) > > > { > > > - pthread_mutex_unlock(&m->lock); > > > + if (pthread_mutex_unlock(&m->lock)) { > > > + VIR_WARN("Failed to unlock mutex=%p", m); > > > + } > > > } > > > > I'd be surprised if these lock/unlock warnings ever trigger, since > > IIUC > > they would need us to be using an error checking mutex, not a regular > > mutex. IOW, aren't these just adding condition test overhead + > > unreachable > > code to the lock calls ? > > > > The 2nd patch shows failures in the destroy calls IIUC. > > I have looked more closely into the issue now. pthread_mutex_lock and > pthread_mutex_unlock do indeed not return a non-zero value over us not > using error checking mutexes. > > During my last attempt at fixing the issues I had a patch that would > count lockings and unlockings of mutexes explicitly, and I believe I > recall seeing problems in that area as well. Sadly, I cannot reproduce > that now, at least not reliably: Ignoring the warnings for > pthread_mutex_destroy, virnetdaemontest does seem to trigger my "number > of locks == number of unlocks" check in about 3 out of 10.000 runs. And > sometimes with a frequency of 1 in 10. Sometimes not at all. In any > case: I do not consider the checks for locking / unlocking dead code. > > So far I have been using the test suite to check for obvious issues, > but I cannot rule out that libvirt itself has race conditions too. But if the checks never fire, due to the mutex type, this isn't helping us diagnose anything surely ? > I would advocate for merging this patch as is, and add a patch to > enable error checking for the mutexes. Error checking mutexes aren't something we can easily enable, because they break any code which needs to hold a mutex across fork() and unlock in the child. IOW, we would need two different types of mutex and pick which to use, where. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On 7/3/24 14:44, Tim Wiederhake wrote: > `pthread_mutex_destroy`, `pthread_mutex_lock` and `pthread_mutex_unlock` > return an error code that is currently ignored. > > Add debug information if one of these operations failed, e.g. when there > is an attempt to destroy a still locked mutex or unlock an already > unlocked mutex. Both scenarios are considered undefined behavior. > > Signed-off-by: Tim Wiederhake <twiederh@redhat.com> > --- > src/util/virthread.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/src/util/virthread.c b/src/util/virthread.c > index 5422bb74fd..14116a2221 100644 > --- a/src/util/virthread.c > +++ b/src/util/virthread.c > @@ -35,7 +35,10 @@ > > #include "viralloc.h" > #include "virthreadjob.h" > +#include "virlog.h" > > +#define VIR_FROM_THIS VIR_FROM_THREAD > +VIR_LOG_INIT("util.thread"); > > int virOnce(virOnceControl *once, virOnceFunc init) > { > @@ -83,17 +86,23 @@ int virMutexInitRecursive(virMutex *m) > > void virMutexDestroy(virMutex *m) > { > - pthread_mutex_destroy(&m->lock); > + if (pthread_mutex_destroy(&m->lock)) { > + VIR_WARN("Failed to destroy mutex=%p", m); > + } > } > > void virMutexLock(virMutex *m) > { > - pthread_mutex_lock(&m->lock); > + if (pthread_mutex_lock(&m->lock)) { > + VIR_WARN("Failed to lock mutex=%p", m); > + } > } > > void virMutexUnlock(virMutex *m) > { > - pthread_mutex_unlock(&m->lock); > + if (pthread_mutex_unlock(&m->lock)) { > + VIR_WARN("Failed to unlock mutex=%p", m); > + } > } > > virLockGuard virLockGuardLock(virMutex *m) I'm a bit worried that without additional info (e.g. stacktrace) this info is going to be useless. Though, even stacktrace might be not enough :( But this can at least let us know there's something suspicious going on. Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal
© 2016 - 2024 Red Hat, Inc.