[PATCH v2] locking/rtmutex: Always use trylock in rt_mutex_trylock()

Posted by Waiman Long 1 month, 3 weeks ago
One reason to use a trylock is to avoid an ABBA deadlock when a lock
has to be taken outside the expected locking order. That requires
using trylock all the way through the out-of-order locking sequence.
Unfortunately, rt_mutex_trylock() does not do that, as it uses
raw_spin_lock_irqsave() to acquire the lock->wait_lock.

There are just a few rt_mutex_trylock() call sites in the stock
kernel. On a PREEMPT_RT kernel, however, every spin_trylock() call
becomes an rt_mutex_trylock() call, and there are a few hundred of
them. That makes it much easier to trigger a circular locking lockdep
splat like the following.

[   63.695668] -> #0 (&lock->wait_lock){-...}-{2:2}:
[   63.695674]        check_prev_add+0x1bd/0x1310
[   63.695678]        validate_chain+0x6cf/0x7c0
[   63.695682]        __lock_acquire+0x879/0xf40
[   63.695686]        lock_acquire.part.0+0xfa/0x2d0
[   63.695690]        _raw_spin_lock_irqsave+0x46/0x90
[   63.695695]        rt_mutex_slowtrylock+0x3f/0xb0
[   63.695699]        rt_spin_trylock+0x13/0xc0
[   63.695702]        rmqueue_pcplist+0x5b/0x180
[   63.695705]        rmqueue+0xb01/0x1040
     :
[   63.695840] other info that might help us debug this:
[   63.695840]
[   63.695842] Chain exists of:
[   63.695842]   &lock->wait_lock --> &p->pi_lock --> &rq->__lock
[   63.695842]
[   63.695850]  Possible unsafe locking scenario:
[   63.695850]
[   63.695851]        CPU0                    CPU1
[   63.695852]        ----                    ----
[   63.695854]   lock(&rq->__lock);
[   63.695857]                                lock(&p->pi_lock);
[   63.695861]                                lock(&rq->__lock);
[   63.695864]   lock(&lock->wait_lock);
[   63.695868]
[   63.695868]  *** DEADLOCK ***

Fix it by using raw_spin_trylock_irqsave() instead.

Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/locking/rtmutex.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ebebd0eec7f6..a32bc2bb5d5e 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1381,10 +1381,13 @@ static int __sched rt_mutex_slowtrylock(struct rt_mutex_base *lock)
 		return 0;
 
 	/*
-	 * The mutex has currently no owner. Lock the wait lock and try to
-	 * acquire the lock. We use irqsave here to support early boot calls.
+	 * The mutex has currently no owner. Try to lock the wait lock first.
+	 * If successful, try to acquire the lock. We use irqsave here to
+	 * support early boot calls. Trylock is used all the way to avoid
+	 * circular lock dependency.
 	 */
-	raw_spin_lock_irqsave(&lock->wait_lock, flags);
+	if (!raw_spin_trylock_irqsave(&lock->wait_lock, flags))
+		return 0;
 
 	ret = __rt_mutex_slowtrylock(lock);
 
-- 
2.43.5
Re: [PATCH v2] locking/rtmutex: Always use trylock in rt_mutex_trylock()
Posted by Peter Zijlstra 1 month, 3 weeks ago
On Wed, Oct 02, 2024 at 03:01:08PM -0400, Waiman Long wrote:
> One reason to use a trylock is to avoid an ABBA deadlock when a lock
> has to be taken outside the expected locking order. That requires
> using trylock all the way through the out-of-order locking sequence.
> Unfortunately, rt_mutex_trylock() does not do that, as it uses
> raw_spin_lock_irqsave() to acquire the lock->wait_lock.

This does not explain anything. lock->wait_lock only serializes the
lock's internal state and should be fine to take like this.

> There are just a few rt_mutex_trylock() call sites in the stock
> kernel. On a PREEMPT_RT kernel, however, every spin_trylock() call
> becomes an rt_mutex_trylock() call, and there are a few hundred of
> them. That makes it much easier to trigger a circular locking lockdep
> splat like the following.
> 
> [   63.695668] -> #0 (&lock->wait_lock){-...}-{2:2}:
> [   63.695674]        check_prev_add+0x1bd/0x1310
> [   63.695678]        validate_chain+0x6cf/0x7c0
> [   63.695682]        __lock_acquire+0x879/0xf40
> [   63.695686]        lock_acquire.part.0+0xfa/0x2d0
> [   63.695690]        _raw_spin_lock_irqsave+0x46/0x90
> [   63.695695]        rt_mutex_slowtrylock+0x3f/0xb0
> [   63.695699]        rt_spin_trylock+0x13/0xc0
> [   63.695702]        rmqueue_pcplist+0x5b/0x180
> [   63.695705]        rmqueue+0xb01/0x1040
>      :
> [   63.695840] other info that might help us debug this:
> [   63.695840]
> [   63.695842] Chain exists of:
> [   63.695842]   &lock->wait_lock --> &p->pi_lock --> &rq->__lock
> [   63.695842]
> [   63.695850]  Possible unsafe locking scenario:
> [   63.695850]
> [   63.695851]        CPU0                    CPU1
> [   63.695852]        ----                    ----
> [   63.695854]   lock(&rq->__lock);
> [   63.695857]                                lock(&p->pi_lock);
> [   63.695861]                                lock(&rq->__lock);
> [   63.695864]   lock(&lock->wait_lock);
> [   63.695868]
> [   63.695868]  *** DEADLOCK ***
> 

This is still useless crap. Please properly describe the locking problem
without truncated lockdep crud.
Re: [PATCH v2] locking/rtmutex: Always use trylock in rt_mutex_trylock()
Posted by Thomas Gleixner 1 month, 3 weeks ago
On Wed, Oct 02 2024 at 15:01, Waiman Long wrote:
> One reason to use a trylock is to avoid an ABBA deadlock when a lock
> has to be taken outside the expected locking order. That requires
> using trylock all the way through the out-of-order locking sequence.
> Unfortunately, rt_mutex_trylock() does not do that, as it uses
> raw_spin_lock_irqsave() to acquire the lock->wait_lock.
>
> [...]
>
> Fix it by using raw_spin_trylock_irqsave() instead.
>
> Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
> Signed-off-by: Waiman Long <longman@redhat.com>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>