From nobody Thu Feb 12 04:51:22 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88D2DC77B73 for ; Thu, 27 Apr 2023 11:20:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243704AbjD0LUM (ORCPT ); Thu, 27 Apr 2023 07:20:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243620AbjD0LTs (ORCPT ); Thu, 27 Apr 2023 07:19:48 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69F7B4C2B for ; Thu, 27 Apr 2023 04:19:45 -0700 (PDT) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1682594383; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nAuLWDQMB55lcnDvQBz27uhWcaqTSuhCvrx4iirveNU=; b=BmkA1tCRKTr0kyPsvHU5Dx2g8M0ALymErOFL9wJnFzO+sFQX9zAW50FDQ6L+v5BlNr180F h/yyXzQs3uUjMuEpDYrNzEJMXW9k33YY3s6f/HN+9dj+yfr+6CZicnl82w8xi1IINnwqZo KynDhU/i5Sl6SrfL8yh7366jKIoQ6zeAnzqB4MzH2i6fhBH/w/Alk/0IZA/gLB1seLiuHS M7eRYEVwI7wieHXdEH6c7CyySRdzp70Jm4oTJvb2i7wpW1p07San6NlgxBhYUqBIHSMBgA sYSI9VDdjVS3rG4v1zEPM9dWBRhBh/U/i8slHSCDU/Ce+3prrQ5YKELIKVjfng== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1682594383; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nAuLWDQMB55lcnDvQBz27uhWcaqTSuhCvrx4iirveNU=; 
b=bs2EIpLBPRPQy4OQl6bI+zwRt6XERNPrd+Q1YXGQ6JTJocNYKkDZVwlZMK047VgIn5/kja YCoZjm/To83YB5BQ==
To: linux-kernel@vger.kernel.org
Cc: Ben Segall , Boqun Feng , Crystal Wood , Daniel Bristot de Oliveira , Dietmar Eggemann , Ingo Molnar , John Stultz , Juri Lelli , Mel Gorman , Peter Zijlstra , Steven Rostedt , Thomas Gleixner , Valentin Schneider , Vincent Guittot , Waiman Long , Will Deacon , Sebastian Andrzej Siewior
Subject: [PATCH v2 1/4] sched/core: Provide sched_rtmutex() and expose sched work helpers
Date: Thu, 27 Apr 2023 13:19:34 +0200
Message-Id: <20230427111937.2745231-2-bigeasy@linutronix.de>
In-Reply-To: <20230427111937.2745231-1-bigeasy@linutronix.de>
References: <20230427111937.2745231-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

schedule() invokes sched_submit_work() before scheduling and
sched_update_worker() afterwards to ensure that queued block requests are
flushed and the (IO)worker machineries can instantiate new workers if
required. This avoids deadlocks and starvation.

With rt_mutexes this can lead to a subtle problem:

  When rtmutex blocks current::pi_blocked_on points to the rtmutex it
  blocks on. When one of the functions in sched_submit/resume_work()
  contends on a rtmutex based lock then that would corrupt
  current::pi_blocked_on.

Make it possible to let rtmutex issue the calls outside of the slowpath,
i.e. when it is guaranteed that current::pi_blocked_on is NULL, by:

  - Exposing sched_submit_work() and moving the task_is_running()
    condition into schedule()

  - Renaming sched_update_worker() to sched_resume_work() and exposing it
    too.

  - Providing schedule_rtmutex() which just does the inner loop of
    scheduling until need_resched() is no longer set. Split out the loop
    so this does not create yet another copy.

Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/sched.h |  5 +++++
 kernel/sched/core.c   | 40 ++++++++++++++++++++++------------------
 2 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 675298d6eb362..ff1ce66d8b6e3 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -304,6 +304,11 @@ extern long schedule_timeout_idle(long timeout);
 asmlinkage void schedule(void);
 extern void schedule_preempt_disabled(void);
 asmlinkage void preempt_schedule_irq(void);
+
+extern void sched_submit_work(void);
+extern void sched_resume_work(void);
+extern void schedule_rtmutex(void);
+
 #ifdef CONFIG_PREEMPT_RT
 extern void schedule_rtlock(void);
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c415418b0b847..7c5cfae086c78 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6690,14 +6690,11 @@ void __noreturn do_task_dead(void)
 		cpu_relax();
 }
 
-static inline void sched_submit_work(struct task_struct *tsk)
+void sched_submit_work(void)
 {
-	unsigned int task_flags;
+	struct task_struct *tsk = current;
+	unsigned int task_flags = tsk->flags;
 
-	if (task_is_running(tsk))
-		return;
-
-	task_flags = tsk->flags;
 	/*
 	 * If a worker goes to sleep, notify and ask workqueue whether it
 	 * wants to wake up a task to maintain concurrency.
@@ -6723,8 +6720,10 @@ static inline void sched_submit_work(struct task_struct *tsk)
 		blk_flush_plug(tsk->plug, true);
 }
 
-static void sched_update_worker(struct task_struct *tsk)
+void sched_resume_work(void)
 {
+	struct task_struct *tsk = current;
+
 	if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
 		if (tsk->flags & PF_WQ_WORKER)
 			wq_worker_running(tsk);
@@ -6733,20 +6732,29 @@ static void sched_update_worker(struct task_struct *tsk)
 	}
 }
 
-asmlinkage __visible void __sched schedule(void)
+static void schedule_loop(unsigned int sched_mode)
 {
-	struct task_struct *tsk = current;
-
-	sched_submit_work(tsk);
 	do {
 		preempt_disable();
-		__schedule(SM_NONE);
+		__schedule(sched_mode);
 		sched_preempt_enable_no_resched();
 	} while (need_resched());
-	sched_update_worker(tsk);
+}
+
+asmlinkage __visible void __sched schedule(void)
+{
+	if (!task_is_running(current))
+		sched_submit_work();
+	schedule_loop(SM_NONE);
+	sched_resume_work();
 }
 EXPORT_SYMBOL(schedule);
 
+void schedule_rtmutex(void)
+{
+	schedule_loop(SM_NONE);
+}
+
 /*
  * synchronize_rcu_tasks() makes sure that no task is stuck in preempted
  * state (have scheduled out non-voluntarily) by making sure that all
@@ -6806,11 +6814,7 @@ void __sched schedule_preempt_disabled(void)
 #ifdef CONFIG_PREEMPT_RT
 void __sched notrace schedule_rtlock(void)
 {
-	do {
-		preempt_disable();
-		__schedule(SM_RTLOCK_WAIT);
-		sched_preempt_enable_no_resched();
-	} while (need_resched());
+	schedule_loop(SM_RTLOCK_WAIT);
 }
 NOKPROBE_SYMBOL(schedule_rtlock);
 #endif
-- 
2.40.1

From nobody Thu Feb 12 04:51:22 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57B1FC77B61 for ; Thu, 27 Apr 2023 11:20:04 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243666AbjD0LUD (ORCPT ); Thu, 27 Apr 2023 07:20:03 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243559AbjD0LTs (ORCPT ); Thu, 27 Apr 2023 07:19:48 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A81A85244 for ; Thu, 27 Apr 2023 04:19:45 -0700 (PDT) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1682594384; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ORxxMGTCqHCTuORmt0+70Mqp063X5vrWfREdzEDLbVo=; b=Gck4aSMK7OncrecOxwvKthriuTS5Inm5bndSjBt2etXMraZlGX5LwTrbipYyQ6g+FnQu7h D6VO9eIYi36JjwEB3OCkJEUB+hzWOrMTB9oFx68JZ0joHQ/j1+OXUA23QKOJD/Irt90G4m AJ1/SGpsJ8EzpzsVkIInAWp1z1bKq3hyaegCCjfuUiaPcd7mKKHZACmSqT7tg26AeLB58k GIp+cL1HPMrZgUpwvV3i4rZhC1h1KNAqecJ+RWhFLIyfm+s2Eztv1AUGFfvWjdxVEFewXi f1H2Nnqw3ZVdehzB5XEfdU1SKHPeX65QMa4VXdOe+kbrNXEmJiIXPJt/dJXczw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1682594384; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ORxxMGTCqHCTuORmt0+70Mqp063X5vrWfREdzEDLbVo=; b=kTs6IfxYm3EWHR6adfgx1jpJcNw3wEw7YFL0cGjAZtgwI9WLXPj1tvs402v/43ghvrNppo dLNHVFDlqOO0g8BQ== To: linux-kernel@vger.kernel.org Cc: Ben Segall , Boqun Feng , Crystal Wood , Daniel Bristot de Oliveira , Dietmar Eggemann , Ingo Molnar , John Stultz , Juri Lelli , Mel Gorman , Peter Zijlstra , Steven Rostedt , Thomas Gleixner , Valentin Schneider , Vincent Guittot , Waiman Long , Will Deacon , Sebastian Andrzej Siewior Subject: [PATCH v2 2/4] locking/rtmutex: Submit/resume work explicitly before/after blocking 
Date: Thu, 27 Apr 2023 13:19:35 +0200
Message-Id: <20230427111937.2745231-3-bigeasy@linutronix.de>
In-Reply-To: <20230427111937.2745231-1-bigeasy@linutronix.de>
References: <20230427111937.2745231-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

schedule() invokes sched_submit_work() before scheduling and
sched_resume_work() afterwards to ensure that queued block requests are
flushed and the (IO)worker machineries can instantiate new workers if
required. This avoids deadlocks and starvation.

With rt_mutexes this can lead to a subtle problem:

  When rtmutex blocks current::pi_blocked_on points to the rtmutex it
  blocks on. When one of the functions in sched_submit/resume_work()
  contends on a rtmutex based lock then that would corrupt
  current::pi_blocked_on.

Let rtmutex and the RT lock variants which are based on it invoke
sched_submit/resume_work() explicitly before and after the slowpath so
it's guaranteed that current::pi_blocked_on cannot be corrupted by blocking
on two locks.

This does not apply to the PREEMPT_RT variants of spinlock_t and rwlock_t
as their scheduling slowpath is separate and cannot invoke the work related
functions due to potential deadlocks anyway.

[ tglx: Make it explicit and symmetric.
Massage changelog ]

Fixes: e17ba59b7e8e1 ("locking/rtmutex: Guard regular sleeping locks specific functions")
Reported-by: Crystal Wood
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Thomas Gleixner
Link: https://lore.kernel.org/4b4ab374d3e24e6ea8df5cadc4297619a6d945af.camel@redhat.com
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/locking/rtmutex.c     | 11 +++++++++--
 kernel/locking/rwbase_rt.c   | 18 ++++++++++++++++--
 kernel/locking/rwsem.c       |  6 ++++++
 kernel/locking/spinlock_rt.c |  3 +++
 4 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 728f434de2bbf..aa66a3c5950a7 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1555,7 +1555,7 @@ static int __sched rt_mutex_slowlock_block(struct rt_mutex_base *lock,
 		raw_spin_unlock_irq(&lock->wait_lock);
 
 		if (!owner || !rtmutex_spin_on_owner(lock, waiter, owner))
-			schedule();
+			schedule_rtmutex();
 
 		raw_spin_lock_irq(&lock->wait_lock);
 		set_current_state(state);
@@ -1584,7 +1584,7 @@ static void __sched rt_mutex_handle_deadlock(int res, int detect_deadlock,
 	WARN(1, "rtmutex deadlock detected\n");
 	while (1) {
 		set_current_state(TASK_INTERRUPTIBLE);
-		schedule();
+		schedule_rtmutex();
 	}
 }
 
@@ -1679,6 +1679,12 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 	unsigned long flags;
 	int ret;
 
+	/*
+	 * The task is about to sleep. Invoke sched_submit_work() before
+	 * blocking as that might take locks and corrupt tsk::pi_blocked_on.
+	 */
+	sched_submit_work();
+
 	/*
 	 * Technically we could use raw_spin_[un]lock_irq() here, but this can
 	 * be called in early boot if the cmpxchg() fast path is disabled
@@ -1691,6 +1697,7 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state);
 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 
+	sched_resume_work();
 	return ret;
 }
 
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 25ec0239477c2..945d474f5d27f 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -131,10 +131,21 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 static __always_inline int rwbase_read_lock(struct rwbase_rt *rwb,
 					    unsigned int state)
 {
+	int ret;
+
 	if (rwbase_read_trylock(rwb))
 		return 0;
 
-	return __rwbase_read_lock(rwb, state);
+	/*
+	 * The task is about to sleep. For rwsems this submits work as that
+	 * might take locks and corrupt tsk::pi_blocked_on. Must be
+	 * explicit here because __rwbase_read_lock() cannot invoke
+	 * rt_mutex_slowlock(). NOP for rwlocks.
+	 */
+	rwbase_sched_submit_work();
+	ret = __rwbase_read_lock(rwb, state);
+	rwbase_sched_resume_work();
+	return ret;
 }
 
 static void __sched __rwbase_read_unlock(struct rwbase_rt *rwb,
@@ -230,7 +241,10 @@ static int __sched rwbase_write_lock(struct rwbase_rt *rwb,
 	struct rt_mutex_base *rtm = &rwb->rtmutex;
 	unsigned long flags;
 
-	/* Take the rtmutex as a first step */
+	/*
+	 * Take the rtmutex as a first step. For rwsem this will also
+	 * invoke sched_submit_work() to flush IO and workers.
+	 */
 	if (rwbase_rtmutex_lock_state(rtm, state))
 		return -EINTR;
 
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index acb5a50309a18..aca266006ad47 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1415,6 +1415,12 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 #define rwbase_rtmutex_lock_state(rtm, state)		\
 	__rt_mutex_lock(rtm, state)
 
+#define rwbase_sched_submit_work()			\
+	sched_submit_work()
+
+#define rwbase_sched_resume_work()			\
+	sched_resume_work()
+
 #define rwbase_rtmutex_slowlock_locked(rtm, state)	\
 	__rt_mutex_slowlock_locked(rtm, NULL, state)
 
diff --git a/kernel/locking/spinlock_rt.c b/kernel/locking/spinlock_rt.c
index 48a19ed8486d8..62c4a6866087a 100644
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -159,6 +159,9 @@ rwbase_rtmutex_lock_state(struct rt_mutex_base *rtm, unsigned int state)
 	return 0;
 }
 
+static __always_inline void rwbase_sched_submit_work(void) { }
+static __always_inline void rwbase_sched_resume_work(void) { }
+
 static __always_inline int
 rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state)
 {
-- 
2.40.1

From nobody Thu Feb 12 04:51:22 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DCEDC77B61 for ; Thu, 27 Apr 2023 11:20:08 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243681AbjD0LUG (ORCPT ); Thu, 27 Apr 2023 07:20:06 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243563AbjD0LTs (ORCPT ); Thu, 27 Apr 2023 07:19:48 -0400
Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 599F55263 for ;
Thu, 27 Apr 2023 04:19:46 -0700 (PDT) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1682594384; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ss/NEuQeuI1dqjAHKhSBAWsmc81HxasKaGCODSbI4ek=; b=eKuk40/REqcGKS0ii2ASevLY3qsTGyQpKqrPZwA4rFyEtiPLBKFamvgwzZkSxqe7rqHJW4 V7/MswqlsuXFxfaXz+WGA8BnoIoKM0NWYcRsY4uUScr1eHt/XaoltlRlkQHJmv71ER4MpW Y4yRirWQjv0+Bp/p6rJymqKaruezQ0uaq36fumnq5Gr2iTQ6zqwhLrGMV0LZWfvnJVZDqn 870vy+aipzwSfn/RWuhqJ52LM8KuvjglSYoQ8uMOv8Z7P+y1qMYXsWNOUEsRwebiQZSCjw 1vCPylGEcln7uR18jlws4iC9N/tvEzwV4OobbW8in4VPgrhbef9vmrVSVvqneg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1682594384; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ss/NEuQeuI1dqjAHKhSBAWsmc81HxasKaGCODSbI4ek=; b=odiuxUs1bodRYWfNKXGv8oSn53TaRU36lEYWWoaUUgo6tmXnDI/rNSWy8xH4gAaRrZ0T6h 4d7RhFEJLodYovAg== To: linux-kernel@vger.kernel.org Cc: Ben Segall , Boqun Feng , Crystal Wood , Daniel Bristot de Oliveira , Dietmar Eggemann , Ingo Molnar , John Stultz , Juri Lelli , Mel Gorman , Peter Zijlstra , Steven Rostedt , Thomas Gleixner , Valentin Schneider , Vincent Guittot , Waiman Long , Will Deacon , Sebastian Andrzej Siewior Subject: [PATCH v2 3/4] locking/rtmutex: Avoid pointless blk_flush_plug() invocations Date: Thu, 27 Apr 2023 13:19:36 +0200 Message-Id: <20230427111937.2745231-4-bigeasy@linutronix.de> In-Reply-To: <20230427111937.2745231-1-bigeasy@linutronix.de> References: <20230427111937.2745231-1-bigeasy@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: 
text/plain; charset="utf-8"

With DEBUG_RT_MUTEXES enabled the fast-path rt_mutex_cmpxchg_acquire()
always fails and all lock operations take the slow path, which leads to the
invocation of blk_flush_plug() even if the lock is not contended. That is
unnecessary and prevents batch processing of requests.

Provide a new inline helper rt_mutex_try_acquire() which maps to
rt_mutex_cmpxchg_acquire() in the non-debug case. For the debug case it
invokes rt_mutex_slowtrylock() which can acquire a non-contended rtmutex
under full debug coverage.

Replace the rt_mutex_cmpxchg_acquire() invocations in __rt_mutex_lock() and
__ww_rt_mutex_lock() with the new helper function, which avoids the
blk_flush_plug() for the non-contended case and preserves the debug
mechanism.

[ tglx: Created a new helper and massaged changelog ]

Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/locking/rtmutex.c     | 25 ++++++++++++++++++++++++-
 kernel/locking/ww_rt_mutex.c |  2 +-
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index aa66a3c5950a7..dd76c1b9b7d21 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -218,6 +218,11 @@ static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
 	return try_cmpxchg_acquire(&lock->owner, &old, new);
 }
 
+static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
+{
+	return rt_mutex_cmpxchg_acquire(lock, NULL, current);
+}
+
 static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
 						     struct task_struct *old,
 						     struct task_struct *new)
@@ -297,6 +302,24 @@ static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
 
 }
 
+static int __sched rt_mutex_slowtrylock(struct rt_mutex_base *lock);
+
+static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
+{
+	/*
+	 * With debug enabled rt_mutex_cmpxchg_acquire() will always fail,
+	 * which will unconditionally invoke sched_submit/resume_work() in
+	 * the slow path of __rt_mutex_lock() and __ww_rt_mutex_lock() even
+	 * in the non-contended case.
+	 *
+	 * Avoid that by using rt_mutex_slowtrylock() which is covered by
+	 * the debug code and can acquire a non-contended rtmutex. On
+	 * success the callsite avoids the sched_submit/resume_work()
+	 * dance.
+	 */
+	return rt_mutex_slowtrylock(lock);
+}
+
 static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
 						     struct task_struct *old,
 						     struct task_struct *new)
@@ -1704,7 +1727,7 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 					   unsigned int state)
 {
-	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
+	if (likely(rt_mutex_try_acquire(lock)))
 		return 0;
 
 	return rt_mutex_slowlock(lock, NULL, state);
diff --git a/kernel/locking/ww_rt_mutex.c b/kernel/locking/ww_rt_mutex.c
index d1473c624105c..c7196de838edc 100644
--- a/kernel/locking/ww_rt_mutex.c
+++ b/kernel/locking/ww_rt_mutex.c
@@ -62,7 +62,7 @@ __ww_rt_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx,
 	}
 	mutex_acquire_nest(&rtm->dep_map, 0, 0, nest_lock, ip);
 
-	if (likely(rt_mutex_cmpxchg_acquire(&rtm->rtmutex, NULL, current))) {
+	if (likely(rt_mutex_try_acquire(&rtm->rtmutex))) {
 		if (ww_ctx)
 			ww_mutex_set_context_fastpath(lock, ww_ctx);
 		return 0;
-- 
2.40.1

From nobody Thu Feb 12 04:51:22 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96707C77B73 for ; Thu, 27 Apr 2023 11:20:00 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243509AbjD0LT6 (ORCPT ); Thu, 27 Apr 2023 07:19:58 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59822 "EHLO
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243561AbjD0LTs (ORCPT ); Thu, 27 Apr 2023 07:19:48 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A52E5277 for ; Thu, 27 Apr 2023 04:19:46 -0700 (PDT) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1682594385; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w7SiOThIQ863K6oJvVNAv7De2opm8Uk3DJjr8iS90qc=; b=oZYijA09gs1W+HixBhBUTywY91zPkNMZlMSfNHnaMbMWBseV8AYEnY8zYx9B5isQZHVMCX wpkZtNYaHOVxJUJv+gSd7PBNmFI0uW3UmwnF1ZQNVLN1D7Cdqtzy9kGlOC88T2wMKycg7D +GpP+cfi5IupFM7/tyvniKKryq9KV/+B7ln8iCcSZMM+W4JOa4P9PqDljCFZpoYTwkPLYk Rv0dTfXJSG5EcrXcXTpJUowENq6TFP8DN/zpXH3Wz0SUJYCX7oEelnKQQh58fcNMIJvYFV 4EsLWM8uF6VqbSolkXfIdmakwlFazEBCDC+UX1LNc2POR4Z/m/ge3vFtdSIKag== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1682594385; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w7SiOThIQ863K6oJvVNAv7De2opm8Uk3DJjr8iS90qc=; b=YOAwTPHx6SRktxKl/lL048FMWg95bASatMmM3bNYK79/tErexC/Gtrfjto2afxjuVJNGwx TnjHTOIEoStIXICQ== To: linux-kernel@vger.kernel.org Cc: Ben Segall , Boqun Feng , Crystal Wood , Daniel Bristot de Oliveira , Dietmar Eggemann , Ingo Molnar , John Stultz , Juri Lelli , Mel Gorman , Peter Zijlstra , Steven Rostedt , Thomas Gleixner , Valentin Schneider , Vincent Guittot , Waiman Long , Will Deacon , Sebastian Andrzej Siewior Subject: [PATCH v2 4/4] locking/rtmutex: Add a lockdep assert to catch potential nested blocking Date: Thu, 27 Apr 2023 13:19:37 +0200 Message-Id: 
<20230427111937.2745231-5-bigeasy@linutronix.de>
In-Reply-To: <20230427111937.2745231-1-bigeasy@linutronix.de>
References: <20230427111937.2745231-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Thomas Gleixner

There used to be a BUG_ON(current->pi_blocked_on) in the lock acquisition
functions, but that vanished in one of the rtmutex overhauls.

Bring it back in the form of a lockdep assert to catch code paths which
take rtmutex based locks with current::pi_blocked_on != NULL.

Reported-by: Crystal Wood
Signed-off-by: Thomas Gleixner
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/locking/rtmutex.c     | 2 ++
 kernel/locking/rwbase_rt.c   | 2 ++
 kernel/locking/spinlock_rt.c | 2 ++
 3 files changed, 6 insertions(+)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index dd76c1b9b7d21..479a9487edcc2 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1727,6 +1727,8 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 					   unsigned int state)
 {
+	lockdep_assert(!current->pi_blocked_on);
+
 	if (likely(rt_mutex_try_acquire(lock)))
 		return 0;
 
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 945d474f5d27f..5be92ca5afabc 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -133,6 +133,8 @@ static __always_inline int rwbase_read_lock(struct rwbase_rt *rwb,
 {
 	int ret;
 
+	lockdep_assert(!current->pi_blocked_on);
+
 	if (rwbase_read_trylock(rwb))
 		return 0;
 
diff --git a/kernel/locking/spinlock_rt.c b/kernel/locking/spinlock_rt.c
index 62c4a6866087a..9fe282cd145d9 100644
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -37,6 +37,8 @@
 
 static __always_inline void rtlock_lock(struct rt_mutex_base *rtm)
 {
+	lockdep_assert(!current->pi_blocked_on);
+
 	if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
 		rtlock_slowlock(rtm);
 }
-- 
2.40.1
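
The ordering problem that the series addresses can be looked at in isolation
with a small userspace sketch. This is not kernel code and not part of the
series: struct model_task, struct model_lock and all model_* functions are
hypothetical stand-ins invented for illustration. The model assumes only what
the changelogs state, namely that blocking on an rtmutex based lock records
it in pi_blocked_on, and that the work submitted around scheduling may itself
contend on such a lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_lock { const char *name; };

struct model_task {
	struct model_lock *pi_blocked_on;	/* lock the task currently blocks on */
	bool corrupted;				/* set when a nested block overwrote it */
};

/* Model of blocking on an rtmutex based lock: record what we block on. */
static void model_block_on(struct model_task *t, struct model_lock *l)
{
	if (t->pi_blocked_on)		/* already blocked: nested blocking */
		t->corrupted = true;
	t->pi_blocked_on = l;
}

/* Model of acquiring the lock after the wait: the field is cleared. */
static void model_unblock(struct model_task *t)
{
	t->pi_blocked_on = NULL;
}

/* Model of sched_submit/resume_work(): may contend on an inner lock. */
static void model_submit_work(struct model_task *t, struct model_lock *inner)
{
	model_block_on(t, inner);
	model_unblock(t);
}

/* Old ordering: work is submitted from schedule(), while already blocked. */
bool old_ordering_corrupts(void)
{
	struct model_lock outer = { "outer" }, inner = { "inner" };
	struct model_task t = { NULL, false };

	model_block_on(&t, &outer);	/* slowpath enqueued the waiter */
	model_submit_work(&t, &inner);	/* reached via schedule() */
	model_unblock(&t);
	return t.corrupted;
}

/* New ordering: work is submitted before and resumed after the slowpath. */
bool new_ordering_corrupts(void)
{
	struct model_lock outer = { "outer" }, inner = { "inner" };
	struct model_task t = { NULL, false };

	assert(!t.pi_blocked_on);	/* mirrors the idea of the lockdep assert */
	model_submit_work(&t, &inner);	/* before the slowpath */
	model_block_on(&t, &outer);	/* slowpath; sleeping submits no work */
	model_unblock(&t);
	model_submit_work(&t, &inner);	/* stands in for the resume step */
	return t.corrupted;
}
```

In this model old_ordering_corrupts() reports a nested block and
new_ordering_corrupts() does not, which is the property the explicit
submit/resume calls around the slowpath are meant to provide. The real code
has to deal with wait_lock, priority inheritance and task state, none of
which the sketch attempts to represent.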