locking: mutex: Fix proxy-exec potentially deactivating tasks marked TASK_RUNNING

[tip: sched/core] locking: mutex: Fix proxy-exec potentially deactivating tasks marked TASK_RUNNING
Posted by tip-bot2 for John Stultz 3 days, 11 hours ago
The following commit has been merged into the sched/core branch of tip:

Commit-ID:     bdaf235913e1f31453c6e0e109d797269f9f0a37
Gitweb:        https://git.kernel.org/tip/bdaf235913e1f31453c6e0e109d797269f9f0a37
Author:        John Stultz <jstultz@google.com>
AuthorDate:    Thu, 30 Apr 2026 21:50:47 
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 02 Jun 2026 12:26:05 +02:00

locking: mutex: Fix proxy-exec potentially deactivating tasks marked TASK_RUNNING

Vineeth found came up with a test driver that could trip up
workqueue stalls. After fixing one issue this test found,
Vineeth reported the test was still failing.

Greatly simplified, a task that tries to take a mutex already
owned by another task that is sleeping, can hit a edge case in
the mutex_lock_common() case.

If the task fails to get the lock, calls into schedule, but gets
a spurious wakeup, it will find that it is first waiter, and
go into the mutex_optimistic_spin() logic. Though before calling
mutex_optimistic_spin(), we clear task blocked_on state, since
mutex_optimistic_spin() may call schedule() if need_resched() is
set.

After mutex_optimistic_spin() fails, we set blocked_on again,
restart the main mutex loop, try to take the lock and call into
schedule_preempt_disabled().

>From there, with proxy-execution, we'll see the task is
blocked_on, follow the chain, see the owner is sleeping and
dequeue the waiting task from the runqueue.

This all sounds fine and reasonable.  But what I had missed is
that in mutex_optimistic_spin(), not only do we call schedule()
but we set TASK_RUNNABLE right before doing so.

This is ok for that invocation of schedule(). But when we come
back we re-set the blocked_on we had just cleared, but we do not
re-set the task state to TASK_INTERRUPTIBLE/UNINTERRUPTIBLE.

This means we have a task that is blocked_on & TASK_RUNNABLE,
so when the proxy execution code dequeues the task, we are
in trouble since future wakeups will be shortcut by the
ttwu_state_match() check.

Thus, to avoid this, after mutex_optimistic_spin(), set the task
state back when we set blocked_on.

Many many thanks again to Vineeth for his very useful testing
driver that uncovered this long hidden bug, that I hadn't
tripped in all my testing! Very impressed with the problems he's
uncovered!

Reported-by: Vineeth Pillai <vineethrp@google.com>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Vineeth Pillai <vineethrp@google.com>
Link: https://patch.msgid.link/20260430215103.2978955-3-jstultz@google.com
---
 kernel/locking/mutex.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 0953462..a93d4c6 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -763,6 +763,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 			raw_spin_lock_irqsave(&lock->wait_lock, flags);
 			raw_spin_lock(&current->blocked_lock);
 			__set_task_blocked_on(current, lock);
+			set_current_state(state);
 
 			if (opt_acquired)
 				break;