I have seen several cases of attempts to use mutex_unlock() to release an
object such that the object can then be freed by another task.
My understanding is that this is not safe because mutex_unlock(), in the
MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
structure after having marked it as unlocked; so mutex_unlock() requires
its caller to ensure that the mutex stays alive until mutex_unlock()
returns.
If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
have to keep the mutex alive, I think; but we could have a spurious
MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
between the points where __mutex_unlock_slowpath() did the cmpxchg
reading the flags and where it acquired the wait_lock.
(With spinlocks, that kind of code pattern is allowed and, from what I
remember, used in several places in the kernel.)
If my understanding of this is correct, we should probably document this -
I think such a semantic difference between mutexes and spinlocks is fairly
unintuitive.
Signed-off-by: Jann Horn <jannh@google.com>
---
I hope for some thorough review on this patch to make sure the comments
I'm adding are actually true, and to confirm that mutexes intentionally
do not support this usage pattern.
Documentation/locking/mutex-design.rst | 6 ++++++
kernel/locking/mutex.c | 5 +++++
2 files changed, 11 insertions(+)
diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 78540cd7f54b..087716bfa7b2 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
- Detects multi-task circular deadlocks and prints out all affected
locks and tasks (and only those tasks).
+Releasing a mutex is not an atomic operation: Once a mutex release operation
+has begun, another context may be able to acquire the mutex before the release
+operation has completed. The mutex user must ensure that the mutex is not
+destroyed while a release operation is still in progress - in other words,
+callers of 'mutex_unlock' must ensure that the mutex stays alive until
+'mutex_unlock' has returned.
Interfaces
----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 2deeeca3e71b..4c6b83bab643 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -532,6 +532,11 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
* This function must not be used in interrupt context. Unlocking
* of a not locked mutex is not allowed.
*
+ * The caller must ensure that the mutex stays alive until this function has
+ * returned - mutex_unlock() can NOT directly be used to release an object such
+ * that another concurrent task can free it.
+ * Mutexes are different from spinlocks in this aspect.
+ *
* This function is similar to (but not equivalent to) up().
*/
void __sched mutex_unlock(struct mutex *lock)
base-commit: 3b47bc037bd44f142ac09848e8d3ecccc726be99
--
2.43.0.rc2.451.g8631bc7472-goog
On Thu, Nov 30, 2023 at 09:48:17PM +0100, Jann Horn wrote:
> I have seen several cases of attempts to use mutex_unlock() to release an
> object such that the object can then be freed by another task.
> My understanding is that this is not safe because mutex_unlock(), in the
> MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
> structure after having marked it as unlocked; so mutex_unlock() requires
> its caller to ensure that the mutex stays alive until mutex_unlock()
> returns.
>
> If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
> have to keep the mutex alive, I think; but we could have a spurious
> MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
> between the points where __mutex_unlock_slowpath() did the cmpxchg
> reading the flags and where it acquired the wait_lock.
>
> (With spinlocks, that kind of code pattern is allowed and, from what I
> remember, used in several places in the kernel.)
>
> If my understanding of this is correct, we should probably document this -
> I think such a semantic difference between mutexes and spinlocks is fairly
> unintuitive.

IIRC this is true of all sleeping locks, and I think completion was the
explicit exception here, but it's been a while.

> +Releasing a mutex is not an atomic operation: Once a mutex release operation

Well, it very much is an atomic store-release. That is, I object to your
confusing use of atomic here :-)
On Fri, Dec 1, 2023 at 10:10 AM Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Nov 30, 2023 at 09:48:17PM +0100, Jann Horn wrote:
> > If my understanding of this is correct, we should probably document this -
> > I think such a semantic difference between mutexes and spinlocks is fairly
> > unintuitive.
>
> IIRC this is true of all sleeping locks, and I think completion was the
> explicit exception here, but it's been a while.

In addition to completions, I think this also applies to up()? But I don't
know if that's intentionally supported or just an implementation detail.

Is there some central place where this should be documented instead of
Documentation/locking/mutex-design.rst as a more general kernel locking
design thing? Maybe Documentation/locking/locktypes.rst? I think it should
also be documented on top of the relevant locking function(s) though, since
I don't think everyone who uses locking functions necessarily reads the
separate documentation files first.

Mutexes kind of stand out as the most common locking type, but I guess to
be consistent, we'd have to put the same comment on functions like
up_read() and up_write()? And maybe drop the "Mutexes are different from
spinlocks in this aspect" part?

(Sidenote: Someone pointed out to me that an additional source of confusion
could be that userspace POSIX mutexes support this usage pattern.)

> > +Releasing a mutex is not an atomic operation: Once a mutex release operation
>
> Well, it very much is an atomic store-release. That is, I object to your
> confusing use of atomic here :-)

I'd say it involves an atomic store-release, but the whole operation is not
atomic. :P

But yeah, I see how this is confusing wording, and I'm not particularly
attached to my specific choice of words.
On 11/30/23 15:48, Jann Horn wrote:
> If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
> have to keep the mutex alive, I think; but we could have a spurious
> MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
> between the points where __mutex_unlock_slowpath() did the cmpxchg
> reading the flags and where it acquired the wait_lock.
>
> (With spinlocks, that kind of code pattern is allowed and, from what I
> remember, used in several places in the kernel.)

Spinlocks are fair. So doing a lock/unlock sequence will make sure that
all the previously waiting waiters are done with the lock. Para-virtual
spinlocks, however, can be a bit unfair so doing a lock/unlock sequence
may not be enough to guarantee there is no waiter. The same is true for
mutex. Adding a spin_is_locked() or mutex_is_locked() check can make
sure that all the waiters are gone.

Also the term "non-atomic" is kind of ambiguous as to what is the exact
meaning of this word.

Cheers,
Longman
On Fri, Dec 1, 2023 at 1:33 AM Waiman Long <longman@redhat.com> wrote:
> Spinlocks are fair. So doing a lock/unlock sequence will make sure that
> all the previously waiting waiters are done with the lock. Para-virtual
> spinlocks, however, can be a bit unfair so doing a lock/unlock sequence
> may not be enough to guarantee there is no waiter. The same is true for
> mutex. Adding a spin_is_locked() or mutex_is_locked() check can make
> sure that all the waiters are gone.

I think this pattern anyway only works when you're only trying to wait for
the current holder of the lock, not tasks that are queued up on the lock as
waiters - so a task initially holds a stable reference to some object, then
acquires the object's lock, then drops the original reference, and then
later drops the lock.

You can see an example of such mutex usage (which is explicitly legal with
userspace POSIX mutexes, but is forbidden with kernel mutexes) at the
bottom of the POSIX manpage for pthread_mutex_destroy() at
<https://pubs.opengroup.org/onlinepubs/007904875/functions/pthread_mutex_destroy.html>,
in the section "Destroying Mutexes".

(I think trying to wait for pending waiters before destroying a mutex
wouldn't make sense because if there can still be pending waiters, there
can almost always also be tasks that are about to _become_ pending waiters
but that haven't called mutex_lock() yet.)
On 11/30/23 15:48, Jann Horn wrote:
> If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
> have to keep the mutex alive, I think; but we could have a spurious
> MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
> between the points where __mutex_unlock_slowpath() did the cmpxchg
> reading the flags and where it acquired the wait_lock.

Could you clarify under what condition a concurrent task can decide to free
the object holding the mutex? Is it !mutex_is_locked() or after a
mutex_lock()/mutex_unlock() sequence?

mutex_is_locked() will return true if the mutex has waiters even if it is
currently free.

Cheers,
Longman
* Waiman Long <longman@redhat.com> wrote:

> Could you clarify under what condition a concurrent task can decide to
> free the object holding the mutex? Is it !mutex_is_locked() or after a
> mutex_lock()/mutex_unlock() sequence?
>
> mutex_is_locked() will return true if the mutex has waiters even if it
> is currently free.

I believe the correct condition is what the changelog already says:
"until mutex_unlock() returns".

What happens within mutex_unlock() is kernel implementation specific and
once a caller has called mutex_unlock(), the mutex must remain alive until
it returns. No other call can substitute for this: neither
mutex_is_locked(), nor some sort of mutex_lock()+mutex_unlock() sequence.

Thanks,

	Ingo
On Thu, Nov 30, 2023 at 10:53 PM Waiman Long <longman@redhat.com> wrote:
> Could you clarify under what condition a concurrent task can decide to
> free the object holding the mutex? Is it !mutex_is_locked() or after a
> mutex_lock()/mutex_unlock() sequence?

I mean a mutex_lock()+mutex_unlock() sequence.

> mutex_is_locked() will return true if the mutex has waiters even if it
> is currently free.

I don't understand your point, and maybe I also don't understand what you
mean by "free". Isn't mutex_is_locked() defined such that it only looks at
whether a mutex has an owner, and doesn't look at the waiter list?
On 11/30/23 17:24, Jann Horn wrote:
>> Could you clarify under what condition a concurrent task can decide to
>> free the object holding the mutex? Is it !mutex_is_locked() or after a
>> mutex_lock()/mutex_unlock() sequence?
> I mean a mutex_lock()+mutex_unlock() sequence.

Because of optimistic spinning, a mutex_lock()/mutex_unlock() can succeed
even if there are still waiters waiting for the lock.

>> mutex_is_locked() will return true if the mutex has waiters even if it
>> is currently free.
> I don't understand your point, and maybe I also don't understand what
> you mean by "free". Isn't mutex_is_locked() defined such that it only
> looks at whether a mutex has an owner, and doesn't look at the waiter
> list?

What I mean is that the mutex is in an unlocked state ready to be acquired
by another locker. mutex_is_locked() considers the state of the mutex as
locked if any of the owner flags is set.

Beside the mutex_lock()/mutex_unlock() sequence, I will suggest adding a
mutex_is_locked() check just to be sure.

Cheers,
Longman
* Jann Horn <jannh@google.com> wrote:
> On Thu, Nov 30, 2023 at 10:53 PM Waiman Long <longman@redhat.com> wrote:
> > On 11/30/23 15:48, Jann Horn wrote:
> > > I have seen several cases of attempts to use mutex_unlock() to release an
> > > object such that the object can then be freed by another task.
> > > My understanding is that this is not safe because mutex_unlock(), in the
> > > MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
> > > structure after having marked it as unlocked; so mutex_unlock() requires
> > > its caller to ensure that the mutex stays alive until mutex_unlock()
> > > returns.
> > >
> > > If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
> > > have to keep the mutex alive, I think; but we could have a spurious
> > > MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
> > > between the points where __mutex_unlock_slowpath() did the cmpxchg
> > > reading the flags and where it acquired the wait_lock.
> >
> > Could you clarify under what condition a concurrent task can decide to
> > free the object holding the mutex? Is it !mutex_is_locked() or after a
> > mutex_lock()/mutex_unlock sequence?
>
> I mean a mutex_lock()+mutex_unlock() sequence.
>
> > mutex_is_locked() will return true if the mutex has waiter even if it
> > is currently free.
>
> I don't understand your point, and maybe I also don't understand what
> you mean by "free". Isn't mutex_is_locked() defined such that it only
> looks at whether a mutex has an owner, and doesn't look at the waiter
> list?
Yeah, mutex_is_locked() is not a sufficient check - and mutexes have no
implicit refcount properties like spinlocks. Once you call a mutex API, you
have to guarantee the lifetime of the object until the function returns.
I.e. entering a mutex_lock()-ed critical section cannot be used to
guarantee that all mutex_unlock() instances have stopped using the mutex.
I agree that this is a bit unintuitive, and differs from spinlocks.
I've clarified all this a bit more in the final patch (added a 'fully'
qualifier, etc.), and made the changelog more assertive - see the attached
patch.
Thanks,
Ingo
=======================>
From: Jann Horn <jannh@google.com>
Date: Thu, 30 Nov 2023 21:48:17 +0100
Subject: [PATCH] locking/mutex: Document that mutex_unlock() is non-atomic
I have seen several cases of attempts to use mutex_unlock() to release an
object such that the object can then be freed by another task.
This is not safe because mutex_unlock(), in the
MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
structure after having marked it as unlocked; so mutex_unlock() requires
its caller to ensure that the mutex stays alive until mutex_unlock()
returns.
If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
have to keep the mutex alive, but we could have a spurious
MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
between the points where __mutex_unlock_slowpath() did the cmpxchg
reading the flags and where it acquired the wait_lock.
( With spinlocks, that kind of code pattern is allowed and, from what I
remember, used in several places in the kernel. )
Document this, such a semantic difference between mutexes and spinlocks
is fairly unintuitive.
[ mingo: Made the changelog a bit more assertive, refined the comments. ]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231130204817.2031407-1-jannh@google.com
---
Documentation/locking/mutex-design.rst | 6 ++++++
kernel/locking/mutex.c | 5 +++++
2 files changed, 11 insertions(+)
diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 78540cd7f54b..7572339b2f12 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
- Detects multi-task circular deadlocks and prints out all affected
locks and tasks (and only those tasks).
+Releasing a mutex is not an atomic operation: Once a mutex release operation
+has begun, another context may be able to acquire the mutex before the release
+operation has fully completed. The mutex user must ensure that the mutex is not
+destroyed while a release operation is still in progress - in other words,
+callers of mutex_unlock() must ensure that the mutex stays alive until
+mutex_unlock() has returned.
Interfaces
----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 2deeeca3e71b..cbae8c0b89ab 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -532,6 +532,11 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
* This function must not be used in interrupt context. Unlocking
* of a not locked mutex is not allowed.
*
+ * The caller must ensure that the mutex stays alive until this function has
+ * returned - mutex_unlock() can NOT directly be used to release an object such
+ * that another concurrent task can free it.
+ * Mutexes are different from spinlocks & refcounts in this aspect.
+ *
* This function is similar to (but not equivalent to) up().
*/
void __sched mutex_unlock(struct mutex *lock)
On Fri, Dec 01, 2023 at 11:33:19AM +0100, Ingo Molnar wrote:
> From: Jann Horn <jannh@google.com>
> Date: Thu, 30 Nov 2023 21:48:17 +0100
> Subject: [PATCH] locking/mutex: Document that mutex_unlock() is non-atomic
> [...]

Hi Ingo and Jann, thanks for the patch.

The patch LGTM, thanks!

Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>

-- 
An old man doll... just what I always wanted! - Clara
The following commit has been merged into the locking/core branch of tip:
Commit-ID: a51749ab34d9e5dec548fe38ede7e01e8bb26454
Gitweb: https://git.kernel.org/tip/a51749ab34d9e5dec548fe38ede7e01e8bb26454
Author: Jann Horn <jannh@google.com>
AuthorDate: Thu, 30 Nov 2023 21:48:17 +01:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 01 Dec 2023 11:27:43 +01:00
locking/mutex: Document that mutex_unlock() is non-atomic
I have seen several cases of attempts to use mutex_unlock() to release an
object such that the object can then be freed by another task.
This is not safe because mutex_unlock(), in the
MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
structure after having marked it as unlocked; so mutex_unlock() requires
its caller to ensure that the mutex stays alive until mutex_unlock()
returns.
If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
have to keep the mutex alive, but we could have a spurious
MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
between the points where __mutex_unlock_slowpath() did the cmpxchg
reading the flags and where it acquired the wait_lock.
( With spinlocks, that kind of code pattern is allowed and, from what I
remember, used in several places in the kernel. )
Document this; such a semantic difference between mutexes and spinlocks
is fairly unintuitive.
[ mingo: Made the changelog a bit more assertive, refined the comments. ]
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231130204817.2031407-1-jannh@google.com
---
Documentation/locking/mutex-design.rst | 6 ++++++
kernel/locking/mutex.c | 5 +++++
2 files changed, 11 insertions(+)
diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 78540cd..7572339 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
- Detects multi-task circular deadlocks and prints out all affected
locks and tasks (and only those tasks).
+Releasing a mutex is not an atomic operation: Once a mutex release operation
+has begun, another context may be able to acquire the mutex before the release
+operation has fully completed. The mutex user must ensure that the mutex is not
+destroyed while a release operation is still in progress - in other words,
+callers of mutex_unlock() must ensure that the mutex stays alive until
+mutex_unlock() has returned.
Interfaces
----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 2deeeca..cbae8c0 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -532,6 +532,11 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
* This function must not be used in interrupt context. Unlocking
* of a not locked mutex is not allowed.
*
+ * The caller must ensure that the mutex stays alive until this function has
+ * returned - mutex_unlock() can NOT directly be used to release an object such
+ * that another concurrent task can free it.
+ * Mutexes are different from spinlocks & refcounts in this aspect.
+ *
* This function is similar to (but not equivalent to) up().
*/
void __sched mutex_unlock(struct mutex *lock)
On Fri, Dec 01, 2023 at 10:44:09AM -0000, tip-bot2 for Jann Horn wrote:
> --- a/Documentation/locking/mutex-design.rst
> +++ b/Documentation/locking/mutex-design.rst
> @@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
>  - Detects multi-task circular deadlocks and prints out all affected
>    locks and tasks (and only those tasks).
>
> +Releasing a mutex is not an atomic operation: Once a mutex release operation

I still object to this confusing usage of atomic. Also all this also
applies to all sleeping locks, rwsem etc. I don't see why we need to
special case mutex here.

Also completion_done() has an explicit lock+unlock on wait.lock to deal
with this there.
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Dec 01, 2023 at 10:44:09AM -0000, tip-bot2 for Jann Horn wrote:
>
> > --- a/Documentation/locking/mutex-design.rst
> > +++ b/Documentation/locking/mutex-design.rst
> > @@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
> > - Detects multi-task circular deadlocks and prints out all affected
> > locks and tasks (and only those tasks).
> >
> > +Releasing a mutex is not an atomic operation: Once a mutex release operation
>
> I still object to this confusing usage of atomic. Also all this also
> applies to all sleeping locks, rwsem etc. I don't see why we need to
> special case mutex here.
>
> Also completion_done() has an explicit lock+unlock on wait.lock to
> deal with this there.
Fair enough - but Jann's original observation stands: mutexes are the
sleeping locks most similar to spinlocks, so the locking & object lifetime
pattern that works under spinlocks cannot be carried over to mutexes in all
cases, and it's fair to warn about this pitfall.

We single out mutexes because they are the most similar in behavior to
spinlocks, and because this concern isn't hypothetical: it has been
observed in the wild with mutex users.
How about the language in the attached patch?
Thanks,
Ingo
================>
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 8 Jan 2024 09:31:16 +0100
Subject: [PATCH] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, cannot be used to reference-count objects
Clarify the mutex_unlock() lock lifetime rules a bit more.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
---
Documentation/locking/mutex-design.rst | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)
diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7572339b2f12..f5270323cf0b 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,12 +101,21 @@ features that make lock debugging easier and faster:
- Detects multi-task circular deadlocks and prints out all affected
locks and tasks (and only those tasks).
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+A mutex - and most other sleeping locks like rwsems - do not provide an
+implicit refcount for the memory they occupy, which could then be released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which APIs can
+ be used to guarantee that the memory is not touched by the lock implementation
+ after spin_unlock() releases the lock. ]
+
+Once a mutex release operation has begun within mutex_unlock(), another context
+may be able to acquire the mutex before the release operation has fully completed,
+and it's not safe to free the object then.
+
+The mutex user must ensure that the mutex is not destroyed while a release operation
+is still in progress - in other words, callers of mutex_unlock() must ensure that
+the mutex stays alive until mutex_unlock() has returned.
Interfaces
----------
On Mon, Jan 8, 2024 at 9:45 AM Ingo Molnar <mingo@kernel.org> wrote:
> How about the language in the attached patch?
In case you missed it, I sent this rewritten documentation patch in
response to the feedback I got, intended to replace the patch that is now
sitting in the tip tree (but I don't know how that works procedurally for
something that's already in the tip tree, whether you'd want to just swap
out the patch with a forced update, or revert out the old version, or
something else):

<https://lore.kernel.org/all/20231204132259.112152-1-jannh@google.com/>

Since there were comments on how this is really a more general rule than a
mutex-specific one, that version doesn't touch
Documentation/locking/mutex-design.rst and instead documents the rule in
Documentation/locking/locktypes.rst; and then it adds comments above some
of the most common unlock-type functions that would be affected.
* Ingo Molnar <mingo@kernel.org> wrote:
> > > +Releasing a mutex is not an atomic operation: Once a mutex release operation
> >
> > I still object to this confusing usage of atomic. Also all this also
> > applies to all sleeping locks, rwsem etc. I don't see why we need to
> > special case mutex here.
> >
> > Also completion_done() has an explicit lock+unlock on wait.lock to deal
> > with this there.
>
> Fair enough - but Jan's original observation stands: mutexes are the
> sleeping locks most similar to spinlocks, so the locking & object
> lifetime pattern that works under spinlocks cannot be carried over to
> mutexes in all cases, and it's fair to warn about this pitfall.
>
> We single out mutex_lock(), because they are the most similar in behavior
> to spinlocks, and because this concern isn't hypothethical, it has been
> observed in the wild with mutex users.
>
> How about the language in the attached patch?
Refined the language a bit more in the -v2 patch below.
Thanks,
Ingo
=============>
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 8 Jan 2024 09:31:16 +0100
Subject: [PATCH] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked
Clarify the mutex lock lifetime rules a bit more.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
---
Documentation/locking/mutex-design.rst | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7572339b2f12..7c30b4aa5e28 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,12 +101,24 @@ features that make lock debugging easier and faster:
- Detects multi-task circular deadlocks and prints out all affected
locks and tasks (and only those tasks).
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+Mutexes - and most other sleeping locks like rwsems - do not provide an
+implicit reference for the memory they occupy, which reference is released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which
+ APIs can be used to guarantee that the memory is not touched by the
+ lock implementation after spin_unlock()/completion_done() releases
+ the lock. ]
+
+mutex_unlock() may access the mutex structure even after it has internally
+released the lock already - so it's not safe for another context to
+acquire the mutex and assume that the mutex_unlock() context is not using
+the structure anymore.
+
+The mutex user must ensure that the mutex is not destroyed while a
+release operation is still in progress - in other words, callers of
+mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
+has returned.
Interfaces
----------
The following commit has been merged into the locking/core branch of tip:
Commit-ID: 2b9d9e0a9ba0e24cb9c78336481f0ed8b2bc1ff2
Gitweb: https://git.kernel.org/tip/2b9d9e0a9ba0e24cb9c78336481f0ed8b2bc1ff2
Author: Ingo Molnar <mingo@kernel.org>
AuthorDate: Mon, 08 Jan 2024 09:31:16 +01:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 08 Jan 2024 09:55:31 +01:00
locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked
Clarify the mutex lock lifetime rules a bit more.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
---
Documentation/locking/mutex-design.rst | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7572339..7c30b4a 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,12 +101,24 @@ features that make lock debugging easier and faster:
- Detects multi-task circular deadlocks and prints out all affected
locks and tasks (and only those tasks).
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+Mutexes - and most other sleeping locks like rwsems - do not provide an
+implicit reference for the memory they occupy, which reference is released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which
+ APIs can be used to guarantee that the memory is not touched by the
+ lock implementation after spin_unlock()/completion_done() releases
+ the lock. ]
+
+mutex_unlock() may access the mutex structure even after it has internally
+released the lock already - so it's not safe for another context to
+acquire the mutex and assume that the mutex_unlock() context is not using
+the structure anymore.
+
+The mutex user must ensure that the mutex is not destroyed while a
+release operation is still in progress - in other words, callers of
+mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
+has returned.
Interfaces
----------