syzbot managed to trigger the following race:

	T1				T2

 futex_wait_requeue_pi()
   futex_do_wait()
     schedule()
				futex_requeue()
				  futex_proxy_trylock_atomic()
				    futex_requeue_pi_prepare()
				    requeue_pi_wake_futex()
				      futex_requeue_pi_complete()
				      /* preempt */

	 * timeout/ signal wakes T1 *

   futex_requeue_pi_wakeup_sync() // Q_REQUEUE_PI_LOCKED
   futex_hash_put()
   // back to userland, on stack futex_q is garbage

				      /* back */
				      wake_up_state(q->task, TASK_NORMAL);

In this scenario futex_wait_requeue_pi() is able to leave without using
futex_q::lock_ptr for synchronization.
This can be prevented by reading futex_q::task before updating the
futex_q::requeue_state. A reference on the task_struct is not needed
because requeue_pi_wake_futex() is invoked with a spinlock_t held which
implies an RCU read-side critical section. Even if T1 terminates
immediately after, the task_struct will remain valid during T2's
wake_up_state().
A READ_ONCE on futex_q::task before futex_requeue_pi_complete() is
enough because it ensures that the variable is read before the state is
updated.
Read futex_q::task before updating the requeue state and use it for the
following wakeup.
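
The ordering argument above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct task`, `struct waiter`, `waiter_leaves()`, `wake_buggy()` and `wake_fixed()` are hypothetical stand-ins and not kernel code, the preemption window is simulated by a direct call, and a cleared pointer stands in for the garbage on-stack futex_q. It does not model task_struct lifetime, which the commit message attributes to RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct task { int id; };

enum { Q_IN_PROGRESS = 0, Q_LOCKED = 1 };

struct waiter {
	struct task *task;	/* models futex_q::task */
	atomic_int state;	/* models futex_q::requeue_state */
};

/*
 * Models T1 noticing the locked state and returning to userland:
 * from then on the waiter's fields can no longer be trusted.
 */
static void waiter_leaves(struct waiter *w)
{
	if (atomic_load(&w->state) == Q_LOCKED)
		w->task = NULL;	/* stands in for on-stack garbage */
}

/* Old ordering: publish the state first, read the task afterwards. */
static struct task *wake_buggy(struct waiter *w)
{
	int expected = Q_IN_PROGRESS;

	atomic_compare_exchange_strong(&w->state, &expected, Q_LOCKED);
	waiter_leaves(w);	/* simulated preemption window */
	return w->task;		/* reads the waiter after it may have left */
}

/* New ordering: snapshot the task first, then publish the state. */
static struct task *wake_fixed(struct waiter *w)
{
	struct task *task = w->task;	/* read before the state update */
	int expected = Q_IN_PROGRESS;

	atomic_compare_exchange_strong(&w->state, &expected, Q_LOCKED);
	waiter_leaves(w);	/* simulated preemption window */
	return task;		/* the snapshot is unaffected */
}

/* Runs both orderings against the same simulated window. */
static void demo(void)
{
	struct task t = { .id = 1 };
	struct waiter w;

	w.task = &t;
	atomic_init(&w.state, Q_IN_PROGRESS);
	assert(wake_fixed(&w) == &t);

	w.task = &t;
	atomic_store(&w.state, Q_IN_PROGRESS);
	assert(wake_buggy(&w) == NULL);
}
```

In this model the only difference between the two wakers is where the task pointer is read relative to the state update, which mirrors the two hunks of the patch.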
Fixes: 07d91ef510fb1 ("futex: Prevent requeue_pi() lock nesting issue on RT")
Reported-by: syzbot+034246a838a10d181e78@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68b75989.050a0220.3db4df.01dd.GAE@google.com/
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/futex/requeue.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c
index c716a66f86929..d818b4d47f1ba 100644
--- a/kernel/futex/requeue.c
+++ b/kernel/futex/requeue.c
@@ -230,8 +230,9 @@ static inline
 void requeue_pi_wake_futex(struct futex_q *q, union futex_key *key,
 			   struct futex_hash_bucket *hb)
 {
-	q->key = *key;
+	struct task_struct *task;
 
+	q->key = *key;
 	__futex_unqueue(q);
 
 	WARN_ON(!q->rt_waiter);
@@ -243,10 +244,11 @@ void requeue_pi_wake_futex(struct futex_q *q, union futex_key *key,
 	futex_hash_get(hb);
 	q->drop_hb_ref = true;
 	q->lock_ptr = &hb->lock;
+	task = READ_ONCE(q->task);
 
 	/* Signal locked state to the waiter */
 	futex_requeue_pi_complete(q, 1);
-	wake_up_state(q->task, TASK_NORMAL);
+	wake_up_state(task, TASK_NORMAL);
 }
 
 /**
/**
--
2.51.0

On 2025-09-10 12:42:45 [+0200], To Thomas Gleixner wrote:
> --- a/kernel/futex/requeue.c
> +++ b/kernel/futex/requeue.c
> @@ -243,10 +244,11 @@ void requeue_pi_wake_futex(struct futex_q *q, union futex_key *key,
>  	futex_hash_get(hb);
>  	q->drop_hb_ref = true;
>  	q->lock_ptr = &hb->lock;
> +	task = READ_ONCE(q->task);
>  
>  	/* Signal locked state to the waiter */
>  	futex_requeue_pi_complete(q, 1);

once understood, adding an mdelay(500) here greatly improves the chances
to trigger.
futex_requeue_pi_complete() uses atomic_try_cmpxchg() which has full
ordering. This means that the q->drop_hb_ref assignment earlier is
visible to the other thread after that cmpxchg, correct?

> -	wake_up_state(q->task, TASK_NORMAL);
> +	wake_up_state(task, TASK_NORMAL);
>  }

Sebastian
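
The visibility question above has a well-known userspace analogue, sketched here with C11 atomics and pthreads. This is an assumption-laden illustration and does not settle the question for the kernel: `struct waiter`, `requeue_side()`, `waiter_side()` and `run_once()` are hypothetical names, the C11 memory model is not the kernel's memory model, and kernel conditional atomics such as atomic_try_cmpxchg() are documented as ordered only when they succeed. In the C11 model, a plain store made before a successful seq_cst compare-exchange is visible to a thread that observes the new value through an atomic load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the futex_q fields discussed above. */
struct waiter {
	bool drop_ref;		/* plain field, models q->drop_hb_ref */
	atomic_int state;	/* models requeue_state: 0 in progress, 1 locked */
	bool seen;		/* what the waiter side observed */
};

/* Requeue side: plain store first, then publish with a seq_cst cmpxchg. */
static void *requeue_side(void *arg)
{
	struct waiter *w = arg;
	int expected = 0;

	w->drop_ref = true;
	atomic_compare_exchange_strong(&w->state, &expected, 1);
	return NULL;
}

/* Waiter side: wait for the published state, then read the plain field. */
static void *waiter_side(void *arg)
{
	struct waiter *w = arg;

	while (atomic_load(&w->state) != 1)
		;	/* spin; the seq_cst load pairs with the cmpxchg */
	w->seen = w->drop_ref;
	return NULL;
}

/* Runs both sides once and reports whether the plain store was seen. */
static bool run_once(void)
{
	struct waiter w = { .drop_ref = false, .seen = false };
	pthread_t requeuer, waiter;
	int rc;

	atomic_init(&w.state, 0);
	rc = pthread_create(&waiter, NULL, waiter_side, &w);
	assert(rc == 0);
	rc = pthread_create(&requeuer, NULL, requeue_side, &w);
	assert(rc == 0);
	(void)rc;
	pthread_join(requeuer, NULL);
	pthread_join(waiter, NULL);
	return w.seen;
}
```

Calling run_once() repeatedly should always return true under the C11 model, because the waiter side reads drop_ref only after it has observed the state written by the compare-exchange.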