[v1] locking/mutex: Document that mutex_unlock() is non-atomic

[tip: locking/core] locking/mutex: Document that mutex_unlock() is non-atomic

Posted by tip-bot2 for Jann Horn 2 years, 4 months ago

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     a51749ab34d9e5dec548fe38ede7e01e8bb26454
Gitweb:        https://git.kernel.org/tip/a51749ab34d9e5dec548fe38ede7e01e8bb26454
Author:        Jann Horn <jannh@google.com>
AuthorDate:    Thu, 30 Nov 2023 21:48:17 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 01 Dec 2023 11:27:43 +01:00

locking/mutex: Document that mutex_unlock() is non-atomic

I have seen several cases of attempts to use mutex_unlock() to release an
object such that the object can then be freed by another task.

This is not safe because mutex_unlock(), in the
MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
structure after having marked it as unlocked; so mutex_unlock() requires
its caller to ensure that the mutex stays alive until mutex_unlock()
returns.

If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
have to keep the mutex alive, but we could have a spurious
MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
between the points where __mutex_unlock_slowpath() did the cmpxchg
reading the flags and where it acquired the wait_lock.

( With spinlocks, that kind of code pattern is allowed and, from what I
  remember, used in several places in the kernel. )

Document this: such a semantic difference between mutexes and spinlocks
is fairly unintuitive.
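
For illustration only (not part of the commit), here is a minimal userspace
sketch of the ownership rule described above: the unlocking context keeps its
own reference across the unlock call and drops it only after the unlock has
returned. pthread mutexes and C11 atomics merely stand in for the kernel
primitives; struct obj, obj_new(), obj_put(), worker() and run_demo() are
hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object: a lock plus an explicit reference count. */
struct obj {
	pthread_mutex_t lock;
	atomic_int refs;
	int payload;
};

static atomic_int obj_frees;	/* counts frees, for the demo only */

static struct obj *obj_new(int refs)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	pthread_mutex_init(&o->lock, NULL);
	atomic_init(&o->refs, refs);
	o->payload = 0;
	return o;
}

static void obj_put(struct obj *o)
{
	/* atomic_fetch_sub() returns the previous value. */
	int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);
	if (old == 1) {
		/* Last reference: nobody can still be inside an unlock. */
		pthread_mutex_destroy(&o->lock);
		free(o);
		atomic_fetch_add(&obj_frees, 1);
	}
}

static void *worker(void *arg)
{
	struct obj *o = arg;

	pthread_mutex_lock(&o->lock);
	o->payload++;
	pthread_mutex_unlock(&o->lock);
	/* The reference is dropped only after the unlock has returned. */
	obj_put(o);
	return NULL;
}

/* Returns the number of frees observed, or -1 on setup failure. */
static int run_demo(void)
{
	pthread_t t[2];
	struct obj *o = obj_new(2);	/* one reference per worker */

	if (!o)
		return -1;
	atomic_store(&obj_frees, 0);
	for (int i = 0; i < 2; i++)
		if (pthread_create(&t[i], NULL, worker, o))
			return -1;
	for (int i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return atomic_load(&obj_frees);
}
```

The lock is never what decides the object's lifetime here; the separate
reference count is. The unsafe variant would have one task unlock and rely on
the next lock holder to free the object.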

[ mingo: Made the changelog a bit more assertive, refined the comments. ]

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231130204817.2031407-1-jannh@google.com
---
 Documentation/locking/mutex-design.rst | 6 ++++++
 kernel/locking/mutex.c                 | 5 +++++
 2 files changed, 11 insertions(+)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 78540cd..7572339 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
     - Detects multi-task circular deadlocks and prints out all affected
       locks and tasks (and only those tasks).
 
+Releasing a mutex is not an atomic operation: Once a mutex release operation
+has begun, another context may be able to acquire the mutex before the release
+operation has fully completed. The mutex user must ensure that the mutex is not
+destroyed while a release operation is still in progress - in other words,
+callers of mutex_unlock() must ensure that the mutex stays alive until
+mutex_unlock() has returned.
 
 Interfaces
 ----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 2deeeca..cbae8c0 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -532,6 +532,11 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
  * This function must not be used in interrupt context. Unlocking
  * of a not locked mutex is not allowed.
  *
+ * The caller must ensure that the mutex stays alive until this function has
+ * returned - mutex_unlock() can NOT directly be used to release an object such
+ * that another concurrent task can free it.
+ * Mutexes are different from spinlocks & refcounts in this aspect.
+ *
  * This function is similar to (but not equivalent to) up().
  */
 void __sched mutex_unlock(struct mutex *lock)

Re: [tip: locking/core] locking/mutex: Document that mutex_unlock() is non-atomic

Posted by Peter Zijlstra 2 years, 4 months ago

On Fri, Dec 01, 2023 at 10:44:09AM -0000, tip-bot2 for Jann Horn wrote:

> --- a/Documentation/locking/mutex-design.rst
> +++ b/Documentation/locking/mutex-design.rst
> @@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
>      - Detects multi-task circular deadlocks and prints out all affected
>        locks and tasks (and only those tasks).
>  
> +Releasing a mutex is not an atomic operation: Once a mutex release operation

I still object to this confusing usage of atomic. Also all this also
applies to all sleeping locks, rwsem etc. I don't see why we need to
special case mutex here.

Also completion_done() has an explicit lock+unlock on wait.lock to
deal with this there.
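
A userspace model of that lock+unlock idea, as a sketch only: the kernel's
completion_done() is not reproduced here, and every name below (toy_lock(),
toy_unlock(), struct toy_completion, toy_complete(), toy_completion_done(),
run_completion_demo()) is hypothetical. The spinlock is a C11 atomic_flag
whose unlock is a single store, so it does not touch the object after
releasing.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_completion {
	atomic_flag lock;	/* stands in for wait.lock */
	atomic_bool done;
	int wakeups;		/* written by the signaller after done is set */
};

static void toy_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		sched_yield();
}

static void toy_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);	/* single store, last access to the lock */
}

static void toy_complete(struct toy_completion *c)
{
	toy_lock(&c->lock);
	assert(!atomic_load(&c->done));
	atomic_store(&c->done, true);
	c->wakeups++;		/* still uses *c after publishing done */
	toy_unlock(&c->lock);
}

static bool toy_completion_done(struct toy_completion *c)
{
	if (!atomic_load(&c->done))
		return false;
	/* Wait out a signaller that is still inside its critical section. */
	toy_lock(&c->lock);
	toy_unlock(&c->lock);
	return true;
}

static void *signaller(void *arg)
{
	toy_complete(arg);
	return NULL;
}

/* Returns the wakeup count seen before freeing, or -1 on setup failure. */
static int run_completion_demo(void)
{
	pthread_t t;
	int seen;
	struct toy_completion *c = malloc(sizeof(*c));

	if (!c)
		return -1;
	atomic_flag_clear(&c->lock);
	atomic_init(&c->done, false);
	c->wakeups = 0;
	if (pthread_create(&t, NULL, signaller, c)) {
		free(c);
		return -1;
	}
	while (!toy_completion_done(c))
		sched_yield();
	seen = c->wakeups;
	free(c);	/* the signaller no longer references *c */
	pthread_join(t, NULL);
	return seen;
}
```

Without the lock+unlock pair in toy_completion_done(), the waiter could see
done set and free the object while the signaller is still updating wakeups.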

[PATCH] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, cannot be used to reference-count objects

Posted by Ingo Molnar 2 years, 2 months ago


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Dec 01, 2023 at 10:44:09AM -0000, tip-bot2 for Jann Horn wrote:
> 
> > --- a/Documentation/locking/mutex-design.rst
> > +++ b/Documentation/locking/mutex-design.rst
> > @@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
> >      - Detects multi-task circular deadlocks and prints out all affected
> >        locks and tasks (and only those tasks).
> >  
> > +Releasing a mutex is not an atomic operation: Once a mutex release operation
> 
> I still object to this confusing usage of atomic. Also all this also
> applies to all sleeping locks, rwsem etc. I don't see why we need to
> special case mutex here.
> 
> Also completion_done() has an explicit lock+unlock on wait.lock to
> deal with this there.

Fair enough - but Jann's original observation stands: mutexes are the
sleeping locks most similar to spinlocks, so the locking & object lifetime
pattern that works under spinlocks cannot be carried over to mutexes in all
cases, and it's fair to warn about this pitfall.

We single out mutexes because they are the most similar in behavior to
spinlocks, and because this concern isn't hypothetical: it has been
observed in the wild with mutex users.

How about the language in the attached patch?

Thanks,

	Ingo

================>
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 8 Jan 2024 09:31:16 +0100
Subject: [PATCH] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, cannot be used to reference-count objects

Clarify the mutex_unlock() lock lifetime rules a bit more.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
---
 Documentation/locking/mutex-design.rst | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7572339b2f12..f5270323cf0b 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,12 +101,21 @@ features that make lock debugging easier and faster:
     - Detects multi-task circular deadlocks and prints out all affected
       locks and tasks (and only those tasks).
 
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+A mutex - and most other sleeping locks like rwsems - do not provide an
+implicit refcount for the memory they occupy, which could then be released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which APIs can
+  be used to guarantee that the memory is not touched by the lock implementation
+  after spin_unlock() releases the lock. ]
+
+Once a mutex release operation has begun within mutex_unlock(), another context
+may be able to acquire the mutex before the release operation has fully completed,
+and it's not safe to free the object then.
+
+The mutex user must ensure that the mutex is not destroyed while a release operation
+is still in progress - in other words, callers of mutex_unlock() must ensure that
+the mutex stays alive until mutex_unlock() has returned.
 
 Interfaces
 ----------

Re: [PATCH] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, cannot be used to reference-count objects

Posted by Jann Horn 2 years, 2 months ago

On Mon, Jan 8, 2024 at 9:45 AM Ingo Molnar <mingo@kernel.org> wrote:
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > On Fri, Dec 01, 2023 at 10:44:09AM -0000, tip-bot2 for Jann Horn wrote:
> >
> > > --- a/Documentation/locking/mutex-design.rst
> > > +++ b/Documentation/locking/mutex-design.rst
> > > @@ -101,6 +101,12 @@ features that make lock debugging easier and faster:
> > >      - Detects multi-task circular deadlocks and prints out all affected
> > >        locks and tasks (and only those tasks).
> > >
> > > +Releasing a mutex is not an atomic operation: Once a mutex release operation
> >
> > I still object to this confusing usage of atomic. Also all this also
> > applies to all sleeping locks, rwsem etc. I don't see why we need to
> > special case mutex here.
> >
> > Also completion_done() has an explicit lock+unlock on wait.lock to
> > deal with this there.
>
> Fair enough - but Jann's original observation stands: mutexes are the
> sleeping locks most similar to spinlocks, so the locking & object lifetime
> pattern that works under spinlocks cannot be carried over to mutexes in all
> cases, and it's fair to warn about this pitfall.
>
> We single out mutexes because they are the most similar in behavior to
> spinlocks, and because this concern isn't hypothetical: it has been
> observed in the wild with mutex users.
>
> How about the language in the attached patch?

In case you missed it, I sent this rewritten documentation patch in
response to the feedback I got, intended to replace the patch that is
now sitting in the tip tree (but I don't know how that works
procedurally for something that's already in the tip tree, whether
you'd want to just swap out the patch with a forced update, or revert
out the old version, or something else):
<https://lore.kernel.org/all/20231204132259.112152-1-jannh@google.com/>

Since there were comments on how this is really a more general rule
than a mutex-specific one, that version doesn't touch
Documentation/locking/mutex-design.rst and instead documents the rule
in Documentation/locking/locktypes.rst; and then it adds comments
above some of the most common unlock-type functions that would be
affected.

[PATCH -v2] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked

Posted by Ingo Molnar 2 years, 2 months ago


* Ingo Molnar <mingo@kernel.org> wrote:

> > > +Releasing a mutex is not an atomic operation: Once a mutex release operation
> > 
> > I still object to this confusing usage of atomic. Also all this also 
> > applies to all sleeping locks, rwsem etc. I don't see why we need to 
> > special case mutex here.
> > 
> > Also completion_done() has an explicit lock+unlock on wait.lock to deal 
> > with this there.
> 
> Fair enough - but Jann's original observation stands: mutexes are the
> sleeping locks most similar to spinlocks, so the locking & object
> lifetime pattern that works under spinlocks cannot be carried over to
> mutexes in all cases, and it's fair to warn about this pitfall.
>
> We single out mutexes because they are the most similar in behavior to
> spinlocks, and because this concern isn't hypothetical: it has been
> observed in the wild with mutex users.
> 
> How about the language in the attached patch?

Refined the language a bit more in the -v2 patch below.

Thanks,

	Ingo

=============>
From: Ingo Molnar <mingo@kernel.org>
Date: Mon, 8 Jan 2024 09:31:16 +0100
Subject: [PATCH] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked

Clarify the mutex lock lifetime rules a bit more.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
---
 Documentation/locking/mutex-design.rst | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7572339b2f12..7c30b4aa5e28 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,12 +101,24 @@ features that make lock debugging easier and faster:
     - Detects multi-task circular deadlocks and prints out all affected
       locks and tasks (and only those tasks).
 
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+Mutexes - and most other sleeping locks like rwsems - do not provide an
+implicit reference for the memory they occupy, which reference is released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which
+  APIs can be used to guarantee that the memory is not touched by the
+  lock implementation after spin_unlock()/completion_done() releases
+  the lock. ]
+
+mutex_unlock() may access the mutex structure even after it has internally
+released the lock already - so it's not safe for another context to
+acquire the mutex and assume that the mutex_unlock() context is not using
+the structure anymore.
+
+The mutex user must ensure that the mutex is not destroyed while a
+release operation is still in progress - in other words, callers of
+mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
+has returned.
 
 Interfaces
 ----------

[tip: locking/core] locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked

Posted by tip-bot2 for Ingo Molnar 2 years, 2 months ago

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     2b9d9e0a9ba0e24cb9c78336481f0ed8b2bc1ff2
Gitweb:        https://git.kernel.org/tip/2b9d9e0a9ba0e24cb9c78336481f0ed8b2bc1ff2
Author:        Ingo Molnar <mingo@kernel.org>
AuthorDate:    Mon, 08 Jan 2024 09:31:16 +01:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Mon, 08 Jan 2024 09:55:31 +01:00

locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked

Clarify the mutex lock lifetime rules a bit more.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20231201121808.GL3818@noisy.programming.kicks-ass.net
---
 Documentation/locking/mutex-design.rst | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7572339..7c30b4a 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -101,12 +101,24 @@ features that make lock debugging easier and faster:
     - Detects multi-task circular deadlocks and prints out all affected
       locks and tasks (and only those tasks).
 
-Releasing a mutex is not an atomic operation: Once a mutex release operation
-has begun, another context may be able to acquire the mutex before the release
-operation has fully completed. The mutex user must ensure that the mutex is not
-destroyed while a release operation is still in progress - in other words,
-callers of mutex_unlock() must ensure that the mutex stays alive until
-mutex_unlock() has returned.
+Mutexes - and most other sleeping locks like rwsems - do not provide an
+implicit reference for the memory they occupy, which reference is released
+with mutex_unlock().
+
+[ This is in contrast with spin_unlock() [or completion_done()], which
+  APIs can be used to guarantee that the memory is not touched by the
+  lock implementation after spin_unlock()/completion_done() releases
+  the lock. ]
+
+mutex_unlock() may access the mutex structure even after it has internally
+released the lock already - so it's not safe for another context to
+acquire the mutex and assume that the mutex_unlock() context is not using
+the structure anymore.
+
+The mutex user must ensure that the mutex is not destroyed while a
+release operation is still in progress - in other words, callers of
+mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
+has returned.
 
 Interfaces
 ----------