From nobody Wed Dec 17 04:37:12 2025
Message-ID: <20230815111430.154558666@infradead.org>
Date: Tue, 15 Aug 2023 13:01:22 +0200
From: Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 1/6] sched: Constrain locks in sched_submit_work()
References: <20230815110121.117752409@infradead.org>

Even though sched_submit_work() is run from preemptible context, it
must not use blocking locks: blocking there re-enters schedule() and
thus sched_submit_work() itself, with recursion as the result. Enforce
this with a lockdep wait-type override.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/sched/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6737,11 +6737,18 @@ void __noreturn do_task_dead(void)
 
 static inline void sched_submit_work(struct task_struct *tsk)
 {
+	static DEFINE_WAIT_OVERRIDE_MAP(sched_map, LD_WAIT_CONFIG);
 	unsigned int task_flags;
 
 	if (task_is_running(tsk))
 		return;
 
+	/*
+	 * Establish LD_WAIT_CONFIG context to ensure none of the code called
+	 * will use a blocking primitive -- which would lead to recursion.
+	 */
+	lock_map_acquire_try(&sched_map);
+
 	task_flags = tsk->flags;
 	/*
 	 * If a worker goes to sleep, notify and ask workqueue whether it
@@ -6766,6 +6773,8 @@ static inline void sched_submit_work(str
 	 * make sure to submit it to avoid deadlocks.
 	 */
 	blk_flush_plug(tsk->plug, true);
+
+	lock_map_release(&sched_map);
 }
 
 static void sched_update_worker(struct task_struct *tsk)
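
[[ editor's illustration, not part of the patch: what the wait-type
   override enforces. Between lock_map_acquire_try() and
   lock_map_release(), lockdep checks every acquired lock against
   LD_WAIT_CONFIG. Only the lockdep API below is real; the locks and
   the function are made up for the example. ]]

static DEFINE_RAW_SPINLOCK(my_raw_lock);	/* LD_WAIT_SPIN */
static DEFINE_MUTEX(my_mutex);			/* LD_WAIT_SLEEP */

static DEFINE_WAIT_OVERRIDE_MAP(my_map, LD_WAIT_CONFIG);

static void my_nonblocking_region(void)
{
	lock_map_acquire_try(&my_map);

	/* LD_WAIT_SPIN <= LD_WAIT_CONFIG: allowed */
	raw_spin_lock(&my_raw_lock);
	raw_spin_unlock(&my_raw_lock);

	/* LD_WAIT_SLEEP > LD_WAIT_CONFIG: triggers a lockdep splat */
	mutex_lock(&my_mutex);
	mutex_unlock(&my_mutex);

	lock_map_release(&my_map);
}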
From nobody Wed Dec 17 04:37:12 2025
Message-ID: <20230815111430.220899937@infradead.org>
Date: Tue, 15 Aug 2023 13:01:23 +0200
From: Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 2/6] locking/rtmutex: Avoid unconditional slowpath for DEBUG_RT_MUTEXES
References: <20230815110121.117752409@infradead.org>

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

With DEBUG_RT_MUTEXES enabled the fast-path rt_mutex_cmpxchg_acquire()
always fails and all lock operations take the slow path.

Provide a new inline helper, rt_mutex_try_acquire(), which maps to
rt_mutex_cmpxchg_acquire() in the non-debug case. For the debug case
it invokes rt_mutex_slowtrylock(), which can acquire a non-contended
rtmutex under full debug coverage.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230427111937.2745231-4-bigeasy@linutronix.de
---
 kernel/locking/rtmutex.c     | 21 ++++++++++++++++++++-
 kernel/locking/ww_rt_mutex.c |  2 +-
 2 files changed, 21 insertions(+), 2 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -218,6 +218,11 @@ static __always_inline bool rt_mutex_cmp
 	return try_cmpxchg_acquire(&lock->owner, &old, new);
 }
 
+static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
+{
+	return rt_mutex_cmpxchg_acquire(lock, NULL, current);
+}
+
 static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
 						     struct task_struct *old,
 						     struct task_struct *new)
@@ -297,6 +302,20 @@ static __always_inline bool rt_mutex_cmp
 
 }
 
+static int __sched rt_mutex_slowtrylock(struct rt_mutex_base *lock);
+
+static __always_inline bool rt_mutex_try_acquire(struct rt_mutex_base *lock)
+{
+	/*
+	 * With debug enabled the rt_mutex_cmpxchg_acquire() fast path will
+	 * always fail.
+	 *
+	 * Avoid unconditionally taking the slow path by using
+	 * rt_mutex_slowtrylock() which is covered by the debug code and can
+	 * acquire a non-contended rtmutex.
+	 */
+	return rt_mutex_slowtrylock(lock);
+}
+
 static __always_inline bool rt_mutex_cmpxchg_release(struct rt_mutex_base *lock,
 						     struct task_struct *old,
 						     struct task_struct *new)
@@ -1755,7 +1774,7 @@ static int __sched rt_mutex_slowlock(str
 static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 					   unsigned int state)
 {
-	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
+	if (likely(rt_mutex_try_acquire(lock)))
 		return 0;
 
 	return rt_mutex_slowlock(lock, NULL, state);
--- a/kernel/locking/ww_rt_mutex.c
+++ b/kernel/locking/ww_rt_mutex.c
@@ -62,7 +62,7 @@ __ww_rt_mutex_lock(struct ww_mutex *lock
 	}
 	mutex_acquire_nest(&rtm->dep_map, 0, 0, nest_lock, ip);
 
-	if (likely(rt_mutex_cmpxchg_acquire(&rtm->rtmutex, NULL, current))) {
+	if (likely(rt_mutex_try_acquire(&rtm->rtmutex))) {
 		if (ww_ctx)
 			ww_mutex_set_context_fastpath(lock, ww_ctx);
 		return 0;
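
[[ editor's illustration, not part of the patch: the resulting
   __rt_mutex_lock() fast path in both configurations, restated with
   comments. ]]

static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
					   unsigned int state)
{
	/*
	 * !CONFIG_DEBUG_RT_MUTEXES: rt_mutex_try_acquire() is a single
	 * try_cmpxchg_acquire() of lock->owner, NULL -> current.
	 *
	 * CONFIG_DEBUG_RT_MUTEXES: the cmpxchg helper is hard-wired to
	 * fail, so rt_mutex_try_acquire() calls rt_mutex_slowtrylock(),
	 * which takes lock->wait_lock, runs the debug checks and still
	 * acquires a non-contended lock.
	 */
	if (likely(rt_mutex_try_acquire(lock)))
		return 0;

	/* Contended: full slow path with waiter enqueue and PI walk. */
	return rt_mutex_slowlock(lock, NULL, state);
}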
From nobody Wed Dec 17 04:37:12 2025
Message-ID: <20230815111430.288063671@infradead.org>
Date: Tue, 15 Aug 2023 13:01:24 +0200
From: Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 3/6] sched: Extract __schedule_loop()
References: <20230815110121.117752409@infradead.org>

From: Thomas Gleixner <tglx@linutronix.de>

There are currently two implementations of this basic __schedule()
loop, and there is soon to be a third.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230427111937.2745231-2-bigeasy@linutronix.de
---
 kernel/sched/core.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6787,16 +6787,21 @@ static void sched_update_worker(struct t
 	}
 }
 
-asmlinkage __visible void __sched schedule(void)
+static __always_inline void __schedule_loop(unsigned int sched_mode)
 {
-	struct task_struct *tsk = current;
-
-	sched_submit_work(tsk);
 	do {
 		preempt_disable();
-		__schedule(SM_NONE);
+		__schedule(sched_mode);
 		sched_preempt_enable_no_resched();
 	} while (need_resched());
+}
+
+asmlinkage __visible void __sched schedule(void)
+{
+	struct task_struct *tsk = current;
+
+	sched_submit_work(tsk);
+	__schedule_loop(SM_NONE);
 	sched_update_worker(tsk);
 }
 EXPORT_SYMBOL(schedule);
@@ -6860,11 +6865,7 @@ void __sched schedule_preempt_disabled(v
 #ifdef CONFIG_PREEMPT_RT
 void __sched notrace schedule_rtlock(void)
 {
-	do {
-		preempt_disable();
-		__schedule(SM_RTLOCK_WAIT);
-		sched_preempt_enable_no_resched();
-	} while (need_resched());
+	__schedule_loop(SM_RTLOCK_WAIT);
 }
 NOKPROBE_SYMBOL(schedule_rtlock);
 #endif
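
[[ editor's note, not part of the patch: the extracted loop again, with
   comments on the subtle bit. ]]

static __always_inline void __schedule_loop(unsigned int sched_mode)
{
	do {
		/* No preemption window between the need_resched() test
		 * below and the next __schedule(). */
		preempt_disable();
		__schedule(sched_mode);
		/*
		 * Re-enable preemption without the usual need_resched()
		 * check: the loop condition performs that check itself,
		 * so preempting here would only schedule redundantly.
		 */
		sched_preempt_enable_no_resched();
	} while (need_resched());
}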
From nobody Wed Dec 17 04:37:12 2025
Message-ID: <20230815111430.355375399@infradead.org>
Date: Tue, 15 Aug 2023 13:01:25 +0200
From: Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 4/6] sched: Provide rt_mutex specific scheduler helpers
References: <20230815110121.117752409@infradead.org>

With PREEMPT_RT there is an rt_mutex recursion problem where
sched_submit_work() can use an rtlock (aka spinlock_t). More
specifically what happens is:

  mutex_lock() /* really rt_mutex */
    ...
      __rt_mutex_slowlock_locked()
        task_blocks_on_rt_mutex()
          // enqueue current task as waiter
          // do PI chain walk
        rt_mutex_slowlock_block()
          schedule()
            sched_submit_work()
              ...
              spin_lock() /* really rtlock */
                ...
                  __rt_mutex_slowlock_locked()
                    task_blocks_on_rt_mutex()
                      // enqueue current task as waiter *AGAIN*
                      // *CONFUSION*

Fix this by making rt_mutex do the sched_submit_work() early, before
it enqueues itself as a waiter -- before it even knows *if* it will
wait.

[[ basically Thomas' patch but with different naming and a few asserts
   added ]]

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/sched.h    |  3 +++
 include/linux/sched/rt.h |  4 ++++
 kernel/sched/core.c      | 36 ++++++++++++++++++++++++++++++----
 3 files changed, 39 insertions(+), 4 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -906,6 +906,9 @@ struct task_struct {
 	 * ->sched_remote_wakeup gets used, so it can be in this word.
 	 */
 	unsigned			sched_remote_wakeup:1;
+#ifdef CONFIG_RT_MUTEXES
+	unsigned			sched_rt_mutex:1;
+#endif
 
 	/* Bit to tell LSMs we're in execve(): */
 	unsigned			in_execve:1;
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -30,6 +30,10 @@ static inline bool task_is_realtime(stru
 }
 
 #ifdef CONFIG_RT_MUTEXES
+extern void rt_mutex_pre_schedule(void);
+extern void rt_mutex_schedule(void);
+extern void rt_mutex_post_schedule(void);
+
 /*
  * Must hold either p->pi_lock or task_rq(p)->lock.
  */
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6738,9 +6738,6 @@ static inline void sched_submit_work(str
 	static DEFINE_WAIT_OVERRIDE_MAP(sched_map, LD_WAIT_CONFIG);
 	unsigned int task_flags;
 
-	if (task_is_running(tsk))
-		return;
-
 	/*
 	 * Establish LD_WAIT_CONFIG context to ensure none of the code called
 	 * will use a blocking primitive -- which would lead to recursion.
@@ -6798,7 +6795,12 @@ asmlinkage __visible void __sched schedu
 {
 	struct task_struct *tsk = current;
 
-	sched_submit_work(tsk);
+#ifdef CONFIG_RT_MUTEXES
+	lockdep_assert(!tsk->sched_rt_mutex);
+#endif
+
+	if (!task_is_running(tsk))
+		sched_submit_work(tsk);
 	__schedule_loop(SM_NONE);
 	sched_update_worker(tsk);
 }
@@ -7059,6 +7061,32 @@ static void __setscheduler_prio(struct t
 
 #ifdef CONFIG_RT_MUTEXES
 
+/*
+ * Would be more useful with typeof()/auto_type but they don't mix with
+ * bit-fields. Since it's a local thing, use int. Keep the generic sounding
+ * name such that if someone were to implement this function we get to
+ * compare notes.
+ */
+#define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
+
+void rt_mutex_pre_schedule(void)
+{
+	lockdep_assert(!fetch_and_set(current->sched_rt_mutex, 1));
+	sched_submit_work(current);
+}
+
+void rt_mutex_schedule(void)
+{
+	lockdep_assert(current->sched_rt_mutex);
+	__schedule_loop(SM_NONE);
+}
+
+void rt_mutex_post_schedule(void)
+{
+	sched_update_worker(current);
+	lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
+}
+
 static inline int __rt_effective_prio(struct task_struct *pi_task, int prio)
 {
 	if (pi_task)
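
[[ editor's sketch, not part of the patch: the protocol the three
   helpers impose on a blocking rt_mutex slowpath.
   blocking_lock_slowpath() is made up; the real call sites follow in
   patch 5. ]]

static void blocking_lock_slowpath(void)
{
	/*
	 * 1) Flush the sched_submit_work() work (workqueue notification,
	 *    block plug) while current is not yet a PI waiter. The assert
	 *    flips sched_rt_mutex 0 -> 1 and fires on nesting.
	 */
	rt_mutex_pre_schedule();

	/* ... take wait_lock, enqueue the waiter, walk the PI chain ... */

	/*
	 * 2) Block. Plain __schedule_loop(SM_NONE); submit work is NOT
	 *    re-run because schedule() itself is bypassed.
	 */
	rt_mutex_schedule();

	/* ... dequeue the waiter, fix up state ... */

	/* 3) Run sched_update_worker() and flip sched_rt_mutex 1 -> 0. */
	rt_mutex_post_schedule();
}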
From nobody Wed Dec 17 04:37:12 2025
Message-ID: <20230815111430.421408298@infradead.org>
Date: Tue, 15 Aug 2023 13:01:26 +0200
From: Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 5/6] locking/rtmutex: Use rt_mutex specific scheduler helpers
References: <20230815110121.117752409@infradead.org>

From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Have rt_mutex use the rt_mutex specific scheduler helpers to avoid
recursion vs rtlock on the PI state.

[[ peterz: adapted to new names ]]

Reported-by: Crystal Wood <swood@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/locking/rtmutex.c     | 14 ++++++++++++--
 kernel/locking/rwbase_rt.c   |  2 ++
 kernel/locking/rwsem.c       |  8 +++++++-
 kernel/locking/spinlock_rt.c |  4 ++++
 4 files changed, 25 insertions(+), 3 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1636,7 +1636,7 @@ static int __sched rt_mutex_slowlock_blo
 		raw_spin_unlock_irq(&lock->wait_lock);
 
 		if (!owner || !rtmutex_spin_on_owner(lock, waiter, owner))
-			schedule();
+			rt_mutex_schedule();
 
 		raw_spin_lock_irq(&lock->wait_lock);
 		set_current_state(state);
@@ -1665,7 +1665,7 @@ static void __sched rt_mutex_handle_dead
 	WARN(1, "rtmutex deadlock detected\n");
 	while (1) {
 		set_current_state(TASK_INTERRUPTIBLE);
-		schedule();
+		rt_mutex_schedule();
 	}
 }
 
@@ -1761,6 +1761,15 @@ static int __sched rt_mutex_slowlock(str
 	int ret;
 
 	/*
+	 * Do all pre-schedule work here, before we queue a waiter and invoke
+	 * PI -- any such work that trips on rtlock (PREEMPT_RT spinlock) would
+	 * otherwise recurse back into task_blocks_on_rt_mutex() through
+	 * rtlock_slowlock() and will then enqueue a second waiter for this
+	 * same task and things get really confusing real fast.
+	 */
+	rt_mutex_pre_schedule();
+
+	/*
 	 * Technically we could use raw_spin_[un]lock_irq() here, but this can
 	 * be called in early boot if the cmpxchg() fast path is disabled
	 * (debug, no architecture support). In this case we will acquire the
@@ -1771,6 +1780,7 @@ static int __sched rt_mutex_slowlock(str
 	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state);
 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+	rt_mutex_post_schedule();
 
 	return ret;
 }
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -71,6 +71,7 @@ static int __sched __rwbase_read_lock(st
 	struct rt_mutex_base *rtm = &rwb->rtmutex;
 	int ret;
 
+	rwbase_pre_schedule();
 	raw_spin_lock_irq(&rtm->wait_lock);
 
 	/*
@@ -125,6 +126,7 @@ static int __sched __rwbase_read_lock(st
 	rwbase_rtmutex_unlock(rtm);
 
 	trace_contention_end(rwb, ret);
+	rwbase_post_schedule();
 	return ret;
 }
 
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1427,8 +1427,14 @@ static inline void __downgrade_write(str
 #define rwbase_signal_pending_state(state, current)	\
 	signal_pending_state(state, current)
 
+#define rwbase_pre_schedule()				\
+	rt_mutex_pre_schedule()
+
 #define rwbase_schedule()				\
-	schedule()
+	rt_mutex_schedule()
+
+#define rwbase_post_schedule()				\
+	rt_mutex_post_schedule()
 
 #include "rwbase_rt.c"
 
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -184,9 +184,13 @@ static __always_inline int rwbase_rtmut
 
 #define rwbase_signal_pending_state(state, current)	(0)
 
+#define rwbase_pre_schedule()
+
 #define rwbase_schedule()				\
 	schedule_rtlock()
 
+#define rwbase_post_schedule()
+
 #include "rwbase_rt.c"
 /*
  * The common functions which get wrapped into the rwlock API.
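
[[ editor's summary, not part of the patch: rwbase_rt.c is compiled
   twice, and each flavour supplies its own hook definitions. The two
   sets below are collected here from the two files; they never coexist
   in one translation unit. ]]

/* kernel/locking/rwsem.c -- RT rwsem, a sleeping lock: full protocol */
#define rwbase_pre_schedule()	rt_mutex_pre_schedule()
#define rwbase_schedule()	rt_mutex_schedule()
#define rwbase_post_schedule()	rt_mutex_post_schedule()

/* kernel/locking/spinlock_rt.c -- rtlock: schedule_rtlock() never runs
 * sched_submit_work(), so the pre/post hooks stay empty */
#define rwbase_pre_schedule()
#define rwbase_schedule()	schedule_rtlock()
#define rwbase_post_schedule()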
From nobody Wed Dec 17 04:37:12 2025
Message-ID: <20230815111430.488430699@infradead.org>
Date: Tue, 15 Aug 2023 13:01:27 +0200
From: Peter Zijlstra <peterz@infradead.org>
Subject: [PATCH 6/6] locking/rtmutex: Add a lockdep assert to catch potential nested blocking
References: <20230815110121.117752409@infradead.org>

From: Thomas Gleixner <tglx@linutronix.de>

There used to be a BUG_ON(current->pi_blocked_on) in the lock
acquisition functions, but that vanished in one of the rtmutex
overhauls.

Bring it back in the form of a lockdep assert to catch code paths
which take rtmutex based locks with current::pi_blocked_on != NULL.

Reported-by: Crystal Wood <swood@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20230427111937.2745231-5-bigeasy@linutronix.de
---
 kernel/locking/rtmutex.c     | 2 ++
 kernel/locking/rwbase_rt.c   | 2 ++
 kernel/locking/spinlock_rt.c | 2 ++
 3 files changed, 6 insertions(+)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1774,6 +1774,8 @@ static int __sched rt_mutex_slowlock(str
 static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 					   unsigned int state)
 {
+	lockdep_assert(!current->pi_blocked_on);
+
 	if (likely(rt_mutex_try_acquire(lock)))
 		return 0;
 
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -131,6 +131,8 @@ static int __sched __rwbase_read_lock(st
 static __always_inline int rwbase_read_lock(struct rwbase_rt *rwb,
 					    unsigned int state)
 {
+	lockdep_assert(!current->pi_blocked_on);
+
 	if (rwbase_read_trylock(rwb))
 		return 0;
 
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -37,6 +37,8 @@
 
 static __always_inline void rtlock_lock(struct rt_mutex_base *rtm)
 {
+	lockdep_assert(!current->pi_blocked_on);
+
 	if (unlikely(!rt_mutex_cmpxchg_acquire(rtm, NULL, current)))
 		rtlock_slowlock(rtm);
 }
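
[[ editor's sketch, not part of the patch: the nested-blocking pattern
   the assert catches on PREEMPT_RT, in the style of the call chain
   from patch 4. ]]

  rt_mutex_slowlock(A)
    task_blocks_on_rt_mutex(A)
      // current->pi_blocked_on = &waiter_A
    ...
    /* anything here that takes another rtmutex based lock: */
    spin_lock(B) /* really rtlock */
      rtlock_lock(B)
        lockdep_assert(!current->pi_blocked_on); /* <-- fires */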