Date: Mon, 1 Apr 2024 16:44:23 -0700
From: John Stultz
To: LKML
Cc: Peter Zijlstra, Joel Fernandes, Qais Yousef, Ingo Molnar, Juri Lelli,
    Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
    Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    "Paul E. McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak,
    Thomas Gleixner, kernel-team@android.com, Davidlohr Bueso, John Stultz
Subject: [RESEND][PATCH v9 1/7] locking/mutex: Remove wakeups from under mutex::wait_lock
Message-ID: <20240401234439.834544-2-jstultz@google.com>
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>

From: Peter Zijlstra

In preparation to nest mutex::wait_lock under rq::lock we need to remove
wakeups from under it.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Acked-by: Davidlohr Bueso
Signed-off-by: Peter Zijlstra (Intel)
[Heavily changed after 55f036ca7e74 ("locking: WW mutex cleanup") and
 08295b3b5bee ("locking: Implement an algorithm choice for Wound-Wait
 mutexes")]
Signed-off-by: Juri Lelli
[jstultz: rebased to mainline, added extra wake_up_q & init to avoid
 hangs, similar to Connor's rework of this patch]
Signed-off-by: John Stultz
[Resend: collected the Acked-by tags I received & rebased to 6.9-rc2]
---
v5:
* Reverted back to an earlier version of this patch to undo
  the change that kept the wake_q in the ctx structure, as that
  broke the rule that the wake_q must always be on the stack, as
  it's not safe for concurrency.
v6:
* Made tweaks suggested by Waiman Long
v7:
* Fixups to pass wake_qs down for PREEMPT_RT logic
---
 kernel/locking/mutex.c       | 17 +++++++++++++----
 kernel/locking/rtmutex.c     | 26 +++++++++++++++++---------
 kernel/locking/rwbase_rt.c   |  4 +++-
 kernel/locking/rwsem.c       |  4 ++--
 kernel/locking/spinlock_rt.c |  3 ++-
 kernel/locking/ww_mutex.h    | 29 ++++++++++++++++++-----------
 6 files changed, 55 insertions(+), 28 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index cbae8c0b89ab..980ce630232c 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -575,6 +575,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		    struct lockdep_map *nest_lock, unsigned long ip,
 		    struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
+	DEFINE_WAKE_Q(wake_q);
 	struct mutex_waiter waiter;
 	struct ww_mutex *ww;
 	int ret;
@@ -625,7 +626,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	 */
 	if (__mutex_trylock(lock)) {
 		if (ww_ctx)
-			__ww_mutex_check_waiters(lock, ww_ctx);
+			__ww_mutex_check_waiters(lock, ww_ctx, &wake_q);
 
 		goto skip_wait;
 	}
@@ -645,7 +646,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		 * Add in stamp order, waking up waiters that must kill
 		 * themselves.
 		 */
-		ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx);
+		ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx, &wake_q);
 		if (ret)
 			goto err_early_kill;
 	}
@@ -681,6 +682,11 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		}
 
 		raw_spin_unlock(&lock->wait_lock);
+		/* Make sure we do wakeups before calling schedule */
+		if (!wake_q_empty(&wake_q)) {
+			wake_up_q(&wake_q);
+			wake_q_init(&wake_q);
+		}
 		schedule_preempt_disabled();
 
 		first = __mutex_waiter_is_first(lock, &waiter);
@@ -714,7 +720,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		 */
 		if (!ww_ctx->is_wait_die &&
 		    !__mutex_waiter_is_first(lock, &waiter))
-			__ww_mutex_check_waiters(lock, ww_ctx);
+			__ww_mutex_check_waiters(lock, ww_ctx, &wake_q);
 	}
 
 	__mutex_remove_waiter(lock, &waiter);
@@ -730,6 +736,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		ww_mutex_lock_acquired(ww, ww_ctx);
 
 	raw_spin_unlock(&lock->wait_lock);
+	wake_up_q(&wake_q);
 	preempt_enable();
 	return 0;
 
@@ -741,6 +748,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	raw_spin_unlock(&lock->wait_lock);
 	debug_mutex_free_waiter(&waiter);
 	mutex_release(&lock->dep_map, ip);
+	wake_up_q(&wake_q);
 	preempt_enable();
 	return ret;
 }
@@ -934,6 +942,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 		}
 	}
 
+	preempt_disable();
 	raw_spin_lock(&lock->wait_lock);
 	debug_mutex_unlock(lock);
 	if (!list_empty(&lock->wait_list)) {
@@ -952,8 +961,8 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 		__mutex_handoff(lock, next);
 
 	raw_spin_unlock(&lock->wait_lock);
-
 	wake_up_q(&wake_q);
+	preempt_enable();
 }
 
 #ifndef CONFIG_DEBUG_LOCK_ALLOC
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 88d08eeb8bc0..59f17e7ccf89 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -34,13 +34,15 @@
 
 static inline int __ww_mutex_add_waiter(struct rt_mutex_waiter *waiter,
 					struct rt_mutex *lock,
-					struct ww_acquire_ctx *ww_ctx)
+					struct ww_acquire_ctx *ww_ctx,
+					struct wake_q_head *wake_q)
 {
 	return 0;
 }
 
 static inline void
 __ww_mutex_check_waiters(struct rt_mutex *lock,
-			 struct ww_acquire_ctx *ww_ctx)
+			 struct ww_acquire_ctx *ww_ctx,
+			 struct wake_q_head *wake_q)
 {
 }
 
@@ -1207,6 +1209,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
 	struct rt_mutex_waiter *top_waiter = waiter;
 	struct rt_mutex_base *next_lock;
 	int chain_walk = 0, res;
+	DEFINE_WAKE_Q(wake_q);
 
 	lockdep_assert_held(&lock->wait_lock);
 
@@ -1245,7 +1248,8 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
 
 		/* Check whether the waiter should back out immediately */
 		rtm = container_of(lock, struct rt_mutex, rtmutex);
-		res = __ww_mutex_add_waiter(waiter, rtm, ww_ctx);
+		res = __ww_mutex_add_waiter(waiter, rtm, ww_ctx, &wake_q);
+		wake_up_q(&wake_q);
 		if (res) {
 			raw_spin_lock(&task->pi_lock);
 			rt_mutex_dequeue(lock, waiter);
@@ -1678,7 +1682,8 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 				       struct ww_acquire_ctx *ww_ctx,
 				       unsigned int state,
 				       enum rtmutex_chainwalk chwalk,
-				       struct rt_mutex_waiter *waiter)
+				       struct rt_mutex_waiter *waiter,
+				       struct wake_q_head *wake_q)
 {
 	struct rt_mutex *rtm = container_of(lock, struct rt_mutex, rtmutex);
 	struct ww_mutex *ww = ww_container_of(rtm);
@@ -1689,7 +1694,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 	/* Try to acquire the lock again: */
 	if (try_to_take_rt_mutex(lock, current, NULL)) {
 		if (build_ww_mutex() && ww_ctx) {
-			__ww_mutex_check_waiters(rtm, ww_ctx);
+			__ww_mutex_check_waiters(rtm, ww_ctx, wake_q);
 			ww_mutex_lock_acquired(ww, ww_ctx);
 		}
 		return 0;
@@ -1707,7 +1712,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 		/* acquired the lock */
 		if (build_ww_mutex() && ww_ctx) {
 			if (!ww_ctx->is_wait_die)
-				__ww_mutex_check_waiters(rtm, ww_ctx);
+				__ww_mutex_check_waiters(rtm, ww_ctx, wake_q);
 			ww_mutex_lock_acquired(ww, ww_ctx);
 		}
 	} else {
@@ -1729,7 +1734,8 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 
 static inline int __rt_mutex_slowlock_locked(struct rt_mutex_base *lock,
 					     struct ww_acquire_ctx *ww_ctx,
-					     unsigned int state)
+					     unsigned int state,
+					     struct wake_q_head *wake_q)
 {
 	struct rt_mutex_waiter waiter;
 	int ret;
@@ -1738,7 +1744,7 @@ static inline int __rt_mutex_slowlock_locked(struct rt_mutex_base *lock,
 	waiter.ww_ctx = ww_ctx;
 
 	ret = __rt_mutex_slowlock(lock, ww_ctx, state, RT_MUTEX_MIN_CHAINWALK,
-				  &waiter);
+				  &waiter, wake_q);
 
 	debug_rt_mutex_free_waiter(&waiter);
 	return ret;
@@ -1754,6 +1760,7 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 				     struct ww_acquire_ctx *ww_ctx,
 				     unsigned int state)
 {
+	DEFINE_WAKE_Q(wake_q);
 	unsigned long flags;
 	int ret;
 
@@ -1775,8 +1782,9 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 	 * irqsave/restore variants.
 	 */
 	raw_spin_lock_irqsave(&lock->wait_lock, flags);
-	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state);
+	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state, &wake_q);
 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+	wake_up_q(&wake_q);
 	rt_mutex_post_schedule();
 
 	return ret;
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 34a59569db6b..e9d2f38b70f3 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -69,6 +69,7 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 				      unsigned int state)
 {
 	struct rt_mutex_base *rtm = &rwb->rtmutex;
+	DEFINE_WAKE_Q(wake_q);
 	int ret;
 
 	rwbase_pre_schedule();
@@ -110,7 +111,7 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 	 * For rwlocks this returns 0 unconditionally, so the below
 	 * !ret conditionals are optimized out.
 	 */
-	ret = rwbase_rtmutex_slowlock_locked(rtm, state);
+	ret = rwbase_rtmutex_slowlock_locked(rtm, state, &wake_q);
 
 	/*
 	 * On success the rtmutex is held, so there can't be a writer
@@ -122,6 +123,7 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 	if (!ret)
 		atomic_inc(&rwb->readers);
 	raw_spin_unlock_irq(&rtm->wait_lock);
+	wake_up_q(&wake_q);
 	if (!ret)
 		rwbase_rtmutex_unlock(rtm);
 
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index c6d17aee4209..79ab7b8df5c1 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1415,8 +1415,8 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 #define rwbase_rtmutex_lock_state(rtm, state)		\
 	__rt_mutex_lock(rtm, state)
 
-#define rwbase_rtmutex_slowlock_locked(rtm, state)	\
-	__rt_mutex_slowlock_locked(rtm, NULL, state)
+#define rwbase_rtmutex_slowlock_locked(rtm, state, wq)	\
+	__rt_mutex_slowlock_locked(rtm, NULL, state, wq)
 
 #define rwbase_rtmutex_unlock(rtm)			\
 	__rt_mutex_unlock(rtm)
diff --git a/kernel/locking/spinlock_rt.c b/kernel/locking/spinlock_rt.c
index 38e292454fcc..fb1810a14c9d 100644
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -162,7 +162,8 @@ rwbase_rtmutex_lock_state(struct rt_mutex_base *rtm, unsigned int state)
 }
 
 static __always_inline int
-rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state)
+rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state,
+			       struct wake_q_head *wake_q)
 {
 	rtlock_slowlock_locked(rtm);
 	return 0;
diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
index 3ad2cc4823e5..7189c6631d90 100644
--- a/kernel/locking/ww_mutex.h
+++ b/kernel/locking/ww_mutex.h
@@ -275,7 +275,7 @@ __ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
  */
 static bool
 __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
-	       struct ww_acquire_ctx *ww_ctx)
+	       struct ww_acquire_ctx *ww_ctx, struct wake_q_head *wake_q)
 {
 	if (!ww_ctx->is_wait_die)
 		return false;
@@ -284,7 +284,7 @@ __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
 #ifndef WW_RT
 		debug_mutex_wake_waiter(lock, waiter);
 #endif
-		wake_up_process(waiter->task);
+		wake_q_add(wake_q, waiter->task);
 	}
 
 	return true;
@@ -299,7 +299,8 @@ __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
  */
 static bool __ww_mutex_wound(struct MUTEX *lock,
 			     struct ww_acquire_ctx *ww_ctx,
-			     struct ww_acquire_ctx *hold_ctx)
+			     struct ww_acquire_ctx *hold_ctx,
+			     struct wake_q_head *wake_q)
 {
 	struct task_struct *owner = __ww_mutex_owner(lock);
 
@@ -331,7 +332,7 @@ static bool __ww_mutex_wound(struct MUTEX *lock,
 		 * wakeup pending to re-read the wounded state.
 		 */
 		if (owner != current)
-			wake_up_process(owner);
+			wake_q_add(wake_q, owner);
 
 		return true;
 	}
 
@@ -352,7 +353,8 @@ static bool __ww_mutex_wound(struct MUTEX *lock,
  * The current task must not be on the wait list.
  */
 static void
-__ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
+__ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx,
+			 struct wake_q_head *wake_q)
 {
 	struct MUTEX_WAITER *cur;
 
@@ -364,8 +366,8 @@ __ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
 		if (!cur->ww_ctx)
 			continue;
 
-		if (__ww_mutex_die(lock, cur, ww_ctx) ||
-		    __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx))
+		if (__ww_mutex_die(lock, cur, ww_ctx, wake_q) ||
+		    __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx, wake_q))
 			break;
 	}
 }
@@ -377,6 +379,8 @@ __ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
 static __always_inline void
 ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+	DEFINE_WAKE_Q(wake_q);
+
 	ww_mutex_lock_acquired(lock, ctx);
 
 	/*
@@ -405,8 +409,10 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 	 * die or wound us.
 	 */
 	lock_wait_lock(&lock->base);
-	__ww_mutex_check_waiters(&lock->base, ctx);
+	__ww_mutex_check_waiters(&lock->base, ctx, &wake_q);
 	unlock_wait_lock(&lock->base);
+
+	wake_up_q(&wake_q);
 }
 
 static __always_inline int
@@ -488,7 +494,8 @@ __ww_mutex_check_kill(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
 static inline int
 __ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
 		      struct MUTEX *lock,
-		      struct ww_acquire_ctx *ww_ctx)
+		      struct ww_acquire_ctx *ww_ctx,
+		      struct wake_q_head *wake_q)
 {
 	struct MUTEX_WAITER *cur, *pos = NULL;
 	bool is_wait_die;
@@ -532,7 +539,7 @@ __ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
 			pos = cur;
 
 		/* Wait-Die: ensure younger waiters die. */
-		__ww_mutex_die(lock, cur, ww_ctx);
+		__ww_mutex_die(lock, cur, ww_ctx, wake_q);
 	}
 
 	__ww_waiter_add(lock, waiter, pos);
@@ -550,7 +557,7 @@ __ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
 		 * such that either we or the fastpath will wound @ww->ctx.
 		 */
 		smp_mb();
-		__ww_mutex_wound(lock, ww_ctx, ww->ctx);
+		__ww_mutex_wound(lock, ww_ctx, ww->ctx, wake_q);
 	}
 
 	return 0;

-- 
2.44.0.478.gd926399ef9-goog
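
A note on the pattern this patch introduces: because a wakeup may take
scheduler locks, issuing it while still holding wait_lock would create
exactly the nesting this series needs to invert, so would-be wakeups are
recorded on a stack-local queue and flushed only after the lock is
dropped. Below is a minimal userspace analogue of that idea (a sketch
only: the pthread names and the wake_list type are illustrative
stand-ins, not the kernel's wake_q API):

#include <pthread.h>

#define MAX_WAKE 8

/* Bounded stand-in for the kernel's wake_q; the real one is a list. */
struct wake_list {
	pthread_cond_t *conds[MAX_WAKE];
	int n;
};

static void wake_defer(struct wake_list *wl, pthread_cond_t *c)
{
	if (wl->n < MAX_WAKE)
		wl->conds[wl->n++] = c;	/* record only; no wakeup yet */
}

static void wake_flush(struct wake_list *wl)
{
	/* Issue the deferred wakeups now that no lock is held. */
	for (int i = 0; i < wl->n; i++)
		pthread_cond_signal(wl->conds[i]);
	wl->n = 0;
}

static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t waiter = PTHREAD_COND_INITIALIZER;

int main(void)
{
	struct wake_list wl = { .n = 0 };  /* on the stack, like DEFINE_WAKE_Q */

	pthread_mutex_lock(&wait_lock);
	/* ... decide which waiters must die or be wounded ... */
	wake_defer(&wl, &waiter);  /* a real program would have a thread waiting */
	pthread_mutex_unlock(&wait_lock);

	wake_flush(&wl);  /* wakeups happen outside the critical section */
	return 0;
}

As in the patch, the queue lives on the caller's stack; the v5 note above
explains why it must (a wake_q kept in a shared ctx structure is not safe
for concurrent use).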
Date: Mon, 1 Apr 2024 16:44:24 -0700
From: John Stultz
To: LKML
Cc: Juri Lelli, Joel Fernandes, Qais Yousef, Ingo Molnar, Peter Zijlstra,
    Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
    Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    "Paul E. McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak,
    Thomas Gleixner, kernel-team@android.com, "Connor O'Brien", John Stultz
Subject: [RESEND][PATCH v9 2/7] locking/mutex: Make mutex::wait_lock irq safe
Message-ID: <20240401234439.834544-3-jstultz@google.com>
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>

From: Juri Lelli

mutex::wait_lock might be nested under rq->lock. Make it irq safe then.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Signed-off-by: Juri Lelli
Signed-off-by: Peter Zijlstra (Intel)
[rebase & fix {un,}lock_wait_lock helpers in ww_mutex.h]
Signed-off-by: Connor O'Brien
Signed-off-by: John Stultz
[Resend: collected the Acked-by tags I received & rebased to 6.9-rc2]
Reviewed-by: Valentin Schneider
---
v3:
* Re-added this patch after it was dropped in v2, which caused
  lockdep warnings to trip.
v7:
* Fix function definition for PREEMPT_RT case, as pointed out by
  Metin Kaya.
* Fix incorrect flags handling in PREEMPT_RT case as found by
  Metin Kaya
---
 kernel/locking/mutex.c    | 18 ++++++++++--------
 kernel/locking/ww_mutex.h | 22 +++++++++++-----------
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 980ce630232c..7de72c610c65 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -578,6 +578,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	DEFINE_WAKE_Q(wake_q);
 	struct mutex_waiter waiter;
 	struct ww_mutex *ww;
+	unsigned long flags;
 	int ret;
 
 	if (!use_ww_ctx)
@@ -620,7 +621,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		return 0;
 	}
 
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	/*
 	 * After waiting to acquire the wait_lock, try again.
 	 */
@@ -681,7 +682,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 			goto err;
 		}
 
-		raw_spin_unlock(&lock->wait_lock);
+		raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 		/* Make sure we do wakeups before calling schedule */
 		if (!wake_q_empty(&wake_q)) {
 			wake_up_q(&wake_q);
@@ -707,9 +708,9 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 			trace_contention_begin(lock, LCB_F_MUTEX);
 		}
 
-		raw_spin_lock(&lock->wait_lock);
+		raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	}
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 acquired:
 	__set_current_state(TASK_RUNNING);
 
@@ -735,7 +736,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	if (ww_ctx)
 		ww_mutex_lock_acquired(ww, ww_ctx);
 
-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	wake_up_q(&wake_q);
 	preempt_enable();
 	return 0;
@@ -745,7 +746,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	__mutex_remove_waiter(lock, &waiter);
 err_early_kill:
 	trace_contention_end(lock, ret);
-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	debug_mutex_free_waiter(&waiter);
 	mutex_release(&lock->dep_map, ip);
 	wake_up_q(&wake_q);
@@ -916,6 +917,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	struct task_struct *next = NULL;
 	DEFINE_WAKE_Q(wake_q);
 	unsigned long owner;
+	unsigned long flags;
 
 	mutex_release(&lock->dep_map, ip);
 
@@ -943,7 +945,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	}
 
 	preempt_disable();
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	debug_mutex_unlock(lock);
 	if (!list_empty(&lock->wait_list)) {
 		/* get the first entry from the wait-list: */
@@ -960,7 +962,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	if (owner & MUTEX_FLAG_HANDOFF)
 		__mutex_handoff(lock, next);
 
-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	wake_up_q(&wake_q);
 	preempt_enable();
 }
diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
index 7189c6631d90..9facc0ddfdd3 100644
--- a/kernel/locking/ww_mutex.h
+++ b/kernel/locking/ww_mutex.h
@@ -70,14 +70,14 @@ __ww_mutex_has_waiters(struct mutex *lock)
 	return atomic_long_read(&lock->owner) & MUTEX_FLAG_WAITERS;
 }
 
-static inline void lock_wait_lock(struct mutex *lock)
+static inline void lock_wait_lock(struct mutex *lock, unsigned long *flags)
 {
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, *flags);
 }
 
-static inline void unlock_wait_lock(struct mutex *lock)
+static inline void unlock_wait_lock(struct mutex *lock, unsigned long *flags)
 {
-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, *flags);
 }
 
 static inline void lockdep_assert_wait_lock_held(struct mutex *lock)
@@ -144,14 +144,14 @@ __ww_mutex_has_waiters(struct rt_mutex *lock)
 	return rt_mutex_has_waiters(&lock->rtmutex);
 }
 
-static inline void lock_wait_lock(struct rt_mutex *lock)
+static inline void lock_wait_lock(struct rt_mutex *lock, unsigned long *flags)
 {
-	raw_spin_lock(&lock->rtmutex.wait_lock);
+	raw_spin_lock_irqsave(&lock->rtmutex.wait_lock, *flags);
 }
 
-static inline void unlock_wait_lock(struct rt_mutex *lock)
+static inline void unlock_wait_lock(struct rt_mutex *lock, unsigned long *flags)
 {
-	raw_spin_unlock(&lock->rtmutex.wait_lock);
+	raw_spin_unlock_irqrestore(&lock->rtmutex.wait_lock, *flags);
 }
 
 static inline void lockdep_assert_wait_lock_held(struct rt_mutex *lock)
@@ -380,6 +380,7 @@ static __always_inline void
 ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
 	DEFINE_WAKE_Q(wake_q);
+	unsigned long flags;
 
 	ww_mutex_lock_acquired(lock, ctx);
 
@@ -408,10 +409,9 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 	 * Uh oh, we raced in fastpath, check if any of the waiters need to
 	 * die or wound us.
 	 */
-	lock_wait_lock(&lock->base);
+	lock_wait_lock(&lock->base, &flags);
 	__ww_mutex_check_waiters(&lock->base, ctx, &wake_q);
-	unlock_wait_lock(&lock->base);
-
+	unlock_wait_lock(&lock->base, &flags);
 	wake_up_q(&wake_q);
 }
 
-- 
2.44.0.478.gd926399ef9-goog
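
The mechanical shape of this patch: every wait_lock critical section now
brackets itself with irqsave/irqrestore so the lock can be taken from
contexts where interrupts must stay off, and the helpers grow a flags
pointer so they can save state on the caller's behalf. A rough userspace
sketch of that save/restore discipline, using signal masks as a stand-in
for interrupt state (an analogy only, not the kernel implementation):

#include <pthread.h>
#include <signal.h>
#include <stdio.h>

static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;

/* Like lock_wait_lock(lock, &flags): save previous state into *flags. */
static void lock_irqsave(sigset_t *flags)
{
	sigset_t block;

	sigfillset(&block);
	pthread_sigmask(SIG_BLOCK, &block, flags);  /* old mask -> *flags */
	pthread_mutex_lock(&wait_lock);
}

/* Like unlock_wait_lock(lock, &flags): restore exactly what was saved. */
static void unlock_irqrestore(const sigset_t *flags)
{
	pthread_mutex_unlock(&wait_lock);
	pthread_sigmask(SIG_SETMASK, flags, NULL);
}

int main(void)
{
	sigset_t flags;

	lock_irqsave(&flags);
	puts("critical section; 'interrupts' are masked");
	unlock_irqrestore(&flags);
	return 0;
}

Passing flags by pointer, as the reworked lock_wait_lock() and
unlock_wait_lock() helpers do, lets a helper save and restore its
caller's state rather than its own.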
Date: Mon, 1 Apr 2024 16:44:25 -0700
From: John Stultz
To: LKML
Cc: Juri Lelli, Joel Fernandes, Qais Yousef, Ingo Molnar, Peter Zijlstra,
    Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
    Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    "Paul E. McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak,
    Thomas Gleixner, kernel-team@android.com, Valentin Schneider,
    "Connor O'Brien", John Stultz
Subject: [RESEND][PATCH v9 3/7] locking/mutex: Expose __mutex_owner()
Message-ID: <20240401234439.834544-4-jstultz@google.com>
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>

From: Juri Lelli

Implementing proxy execution requires that scheduler code be able to
identify the current owner of a mutex. Expose __mutex_owner() for
this purpose (alone!).

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Signed-off-by: Juri Lelli
[Removed the EXPORT_SYMBOL]
Signed-off-by: Valentin Schneider
Signed-off-by: Connor O'Brien
[jstultz: Reworked per Peter's suggestions]
Signed-off-by: John Stultz
[Resend: collected the Acked-by tags I received & rebased to 6.9-rc2]
Reviewed-by: Valentin Schneider
---
v4:
* Move __mutex_owner() to kernel/locking/mutex.h instead of adding
  a new globally available accessor function to keep the exposure
  of this low, along with keeping it an inline function, as
  suggested by PeterZ
---
 kernel/locking/mutex.c | 25 -------------------------
 kernel/locking/mutex.h | 25 +++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 7de72c610c65..5741641be914 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -56,31 +56,6 @@ __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
 }
 EXPORT_SYMBOL(__mutex_init);
 
-/*
- * @owner: contains: 'struct task_struct *' to the current lock owner,
- * NULL means not owned. Since task_struct pointers are aligned at
- * at least L1_CACHE_BYTES, we have low bits to store extra state.
- *
- * Bit0 indicates a non-empty waiter list; unlock must issue a wakeup.
- * Bit1 indicates unlock needs to hand the lock to the top-waiter
- * Bit2 indicates handoff has been done and we're waiting for pickup.
- */
-#define MUTEX_FLAG_WAITERS	0x01
-#define MUTEX_FLAG_HANDOFF	0x02
-#define MUTEX_FLAG_PICKUP	0x04
-
-#define MUTEX_FLAGS		0x07
-
-/*
- * Internal helper function; C doesn't allow us to hide it :/
- *
- * DO NOT USE (outside of mutex code).
- */
-static inline struct task_struct *__mutex_owner(struct mutex *lock)
-{
-	return (struct task_struct *)(atomic_long_read(&lock->owner) & ~MUTEX_FLAGS);
-}
-
 static inline struct task_struct *__owner_task(unsigned long owner)
 {
 	return (struct task_struct *)(owner & ~MUTEX_FLAGS);
diff --git a/kernel/locking/mutex.h b/kernel/locking/mutex.h
index 0b2a79c4013b..1c7d3d32def8 100644
--- a/kernel/locking/mutex.h
+++ b/kernel/locking/mutex.h
@@ -20,6 +20,31 @@ struct mutex_waiter {
 #endif
 };
 
+/*
+ * @owner: contains: 'struct task_struct *' to the current lock owner,
+ * NULL means not owned. Since task_struct pointers are aligned at
+ * at least L1_CACHE_BYTES, we have low bits to store extra state.
+ *
+ * Bit0 indicates a non-empty waiter list; unlock must issue a wakeup.
+ * Bit1 indicates unlock needs to hand the lock to the top-waiter
+ * Bit2 indicates handoff has been done and we're waiting for pickup.
+ */
+#define MUTEX_FLAG_WAITERS	0x01
+#define MUTEX_FLAG_HANDOFF	0x02
+#define MUTEX_FLAG_PICKUP	0x04
+
+#define MUTEX_FLAGS		0x07
+
+/*
+ * Internal helper function; C doesn't allow us to hide it :/
+ *
+ * DO NOT USE (outside of mutex & scheduler code).
+ */
+static inline struct task_struct *__mutex_owner(struct mutex *lock)
+{
+	return (struct task_struct *)(atomic_long_read(&lock->owner) & ~MUTEX_FLAGS);
+}
+
 #ifdef CONFIG_DEBUG_MUTEXES
 extern void debug_mutex_lock_common(struct mutex *lock,
 				    struct mutex_waiter *waiter);
-- 
2.44.0.478.gd926399ef9-goog
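
What makes this move safe is documented in the comment block itself:
mutex::owner packs an aligned task_struct pointer together with three
flag bits in its low bits, so consumers must mask with ~MUTEX_FLAGS
before dereferencing. A self-contained demonstration of that low-bit
tagging trick (userspace stand-ins for the kernel types; illustrative
only):

#include <stdio.h>

#define MUTEX_FLAG_WAITERS	0x01UL
#define MUTEX_FLAG_HANDOFF	0x02UL
#define MUTEX_FLAG_PICKUP	0x04UL
#define MUTEX_FLAGS		0x07UL

struct task { char name[16]; };

/* Mirrors __mutex_owner(): strip the flag bits to recover the pointer. */
static struct task *owner_task(unsigned long owner)
{
	return (struct task *)(owner & ~MUTEX_FLAGS);
}

int main(void)
{
	/* _Alignas keeps the low bits of &t zero; the kernel gets this
	 * from task_struct's L1_CACHE_BYTES alignment. */
	static _Alignas(8) struct task t = { "worker" };
	unsigned long owner = (unsigned long)&t | MUTEX_FLAG_WAITERS;

	printf("owner=%s waiters=%lu\n",
	       owner_task(owner)->name, owner & MUTEX_FLAG_WAITERS);
	return 0;
}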
Date: Mon, 1 Apr 2024 16:44:26 -0700
From: John Stultz
To: LKML
Cc: "Connor O'Brien", Joel Fernandes, Qais Yousef, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
    Valentin Schneider, Steven Rostedt, Ben Segall, Zimuzo Ezeozue,
    Youssef Esmat, Mel Gorman, Daniel Bristot de Oliveira, Will Deacon,
    Waiman Long, Boqun Feng, "Paul E. McKenney", Metin Kaya, Xuewen Yan,
    K Prateek Nayak, Thomas Gleixner, kernel-team@android.com, John Stultz
Subject: [RESEND][PATCH v9 4/7] sched: Add do_push_task helper
Message-ID: <20240401234439.834544-5-jstultz@google.com>
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>

From: Connor O'Brien

Switch logic that deactivates, sets the task cpu, and reactivates a
task on a different rq to use a helper that will be later extended to
push entire blocked task chains.

This patch was broken out from a larger chain migration patch
originally by Connor O'Brien.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Signed-off-by: Connor O'Brien
[jstultz: split out from larger chain migration patch]
Signed-off-by: John Stultz
[Resend: collected the Acked-by tags I received & rebased to 6.9-rc2]
Reviewed-by: Valentin Schneider
---
v8:
* Renamed from push_task_chain to do_push_task so it makes more
  sense without proxy-execution
---
 kernel/sched/core.c     | 4 +---
 kernel/sched/deadline.c | 8 ++------
 kernel/sched/rt.c       | 8 ++------
 kernel/sched/sched.h    | 9 +++++++++
 4 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7019a40457a6..586a3f8186bd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2712,9 +2712,7 @@ int push_cpu_stop(void *arg)
 
 	// XXX validate p is still the highest prio task
 	if (task_rq(p) == rq) {
-		deactivate_task(rq, p, 0);
-		set_task_cpu(p, lowest_rq->cpu);
-		activate_task(lowest_rq, p, 0);
+		do_push_task(rq, lowest_rq, p);
 		resched_curr(lowest_rq);
 	}
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a04a436af8cc..e68d88963e89 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2443,9 +2443,7 @@ static int push_dl_task(struct rq *rq)
 		goto retry;
 	}
 
-	deactivate_task(rq, next_task, 0);
-	set_task_cpu(next_task, later_rq->cpu);
-	activate_task(later_rq, next_task, 0);
+	do_push_task(rq, later_rq, next_task);
 	ret = 1;
 
 	resched_curr(later_rq);
@@ -2531,9 +2529,7 @@ static void pull_dl_task(struct rq *this_rq)
 			if (is_migration_disabled(p)) {
 				push_task = get_push_task(src_rq);
 			} else {
-				deactivate_task(src_rq, p, 0);
-				set_task_cpu(p, this_cpu);
-				activate_task(this_rq, p, 0);
+				do_push_task(src_rq, this_rq, p);
 				dmin = p->dl.deadline;
 				resched = true;
 			}
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 3261b067b67e..dd072d11cc02 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2106,9 +2106,7 @@ static int push_rt_task(struct rq *rq, bool pull)
 		goto retry;
 	}
 
-	deactivate_task(rq, next_task, 0);
-	set_task_cpu(next_task, lowest_rq->cpu);
-	activate_task(lowest_rq, next_task, 0);
+	do_push_task(rq, lowest_rq, next_task);
 	resched_curr(lowest_rq);
 	ret = 1;
 
@@ -2379,9 +2377,7 @@ static void pull_rt_task(struct rq *this_rq)
 			if (is_migration_disabled(p)) {
 				push_task = get_push_task(src_rq);
 			} else {
-				deactivate_task(src_rq, p, 0);
-				set_task_cpu(p, this_cpu);
-				activate_task(this_rq, p, 0);
+				do_push_task(src_rq, this_rq, p);
 				resched = true;
 			}
 			/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index d2242679239e..16057de24ecd 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3472,5 +3472,14 @@ static inline void init_sched_mm_cid(struct task_struct *t) { }
 
 extern u64 avg_vruntime(struct cfs_rq *cfs_rq);
 extern int entity_eligible(struct cfs_rq *cfs_rq, struct sched_entity *se);
+#ifdef CONFIG_SMP
+static inline
+void do_push_task(struct rq *rq, struct rq *dst_rq, struct task_struct *task)
+{
+	deactivate_task(rq, task, 0);
+	set_task_cpu(task, dst_rq->cpu);
+	activate_task(dst_rq, task, 0);
+}
+#endif
 
 #endif /* _KERNEL_SCHED_SCHED_H */
-- 
2.44.0.478.gd926399ef9-goog
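
The refactor above is behavior-preserving: each call site previously
open-coded the same three-step migration (deactivate on the source rq,
retarget the task, activate on the destination rq). A toy model of that
step, with array-backed "run queues" standing in for the kernel's rq
structures (names are illustrative, not scheduler API):

#include <stdio.h>

#define QLEN 4

struct rq { int cpu; int tasks[QLEN]; int n; };

static void do_push_task(struct rq *src, struct rq *dst, int idx)
{
	int task = src->tasks[idx];

	src->tasks[idx] = src->tasks[--src->n];	/* deactivate_task() */
	/* set_task_cpu(task, dst->cpu) would record the new CPU here */
	dst->tasks[dst->n++] = task;		/* activate_task() */
}

int main(void)
{
	struct rq rq0 = { .cpu = 0, .tasks = { 41, 42 }, .n = 2 };
	struct rq rq1 = { .cpu = 1, .n = 0 };

	do_push_task(&rq0, &rq1, 0);
	printf("rq0.n=%d rq1.tasks[0]=%d\n", rq0.n, rq1.tasks[0]);
	return 0;
}

Centralizing the triple is what lets the later proxy-execution patches
teach this one spot to migrate an entire chain of blocked tasks, per the
commit message.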
Date: Mon, 1 Apr 2024 16:44:27 -0700
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>
Message-ID: <20240401234439.834544-6-jstultz@google.com>
Subject: [RESEND][PATCH v9 5/7] sched: Consolidate pick_*_task to task_is_pushable helper
From: John Stultz
To: LKML
Cc: "Connor O'Brien", Joel Fernandes, Qais Yousef, Ingo Molnar,
    Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
    Valentin Schneider, Steven Rostedt, Ben Segall, Zimuzo Ezeozue,
    Youssef Esmat, Mel Gorman, Daniel Bristot de Oliveira, Will Deacon,
    Waiman Long, Boqun Feng, "Paul E. McKenney", Metin Kaya, Xuewen Yan,
    K Prateek Nayak, Thomas Gleixner, kernel-team@android.com, John Stultz

From: Connor O'Brien

This patch consolidates the rt and deadline pick_*_task functions
into a task_is_pushable() helper.

This patch was broken out from a larger chain migration patch
originally by Connor O'Brien.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Signed-off-by: Connor O'Brien
[jstultz: split out from larger chain migration patch, renamed helper
 function]
Signed-off-by: John Stultz
[Resend: collected the Acked-by tags I received & rebased to 6.9-rc2]
Reviewed-by: Valentin Schneider
---
v7:
* Split from chain migration patch
* Renamed function
---
 kernel/sched/deadline.c | 10 +---------
 kernel/sched/rt.c       | 11 +----------
 kernel/sched/sched.h    | 10 ++++++++++
 3 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e68d88963e89..1b9cdb507498 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2179,14 +2179,6 @@ static void task_fork_dl(struct task_struct *p)
 /* Only try algorithms three times */
 #define DL_MAX_TRIES 3
 
-static int pick_dl_task(struct rq *rq, struct task_struct *p, int cpu)
-{
-	if (!task_on_cpu(rq, p) &&
-	    cpumask_test_cpu(cpu, &p->cpus_mask))
-		return 1;
-	return 0;
-}
-
 /*
  * Return the earliest pushable rq's task, which is suitable to be executed
  * on the CPU, NULL otherwise:
@@ -2205,7 +2197,7 @@ static struct task_struct *pick_earliest_pushable_dl_task(struct rq *rq, int cpu
 	if (next_node) {
 		p = __node_2_pdl(next_node);
 
-		if (pick_dl_task(rq, p, cpu))
+		if (task_is_pushable(rq, p, cpu) == 1)
 			return p;
 
 		next_node = rb_next(next_node);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index dd072d11cc02..638e7c158ae4 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1791,15 +1791,6 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 /* Only try algorithms three times */
 #define RT_MAX_TRIES 3
 
-static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
-{
-	if (!task_on_cpu(rq, p) &&
-	    cpumask_test_cpu(cpu, &p->cpus_mask))
-		return 1;
-
-	return 0;
-}
-
 /*
  * Return the highest pushable rq's task, which is suitable to be executed
  * on the CPU, NULL otherwise
@@ -1813,7 +1804,7 @@ static struct task_struct *pick_highest_pushable_task(struct rq *rq, int cpu)
 		return NULL;
 
 	plist_for_each_entry(p, head, pushable_tasks) {
-		if (pick_rt_task(rq, p, cpu))
+		if (task_is_pushable(rq, p, cpu) == 1)
 			return p;
 	}
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 16057de24ecd..f25eec405df9 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3480,6 +3480,16 @@ void do_push_task(struct rq *rq, struct rq *dst_rq, struct task_struct *task)
 	set_task_cpu(task, dst_rq->cpu);
 	activate_task(dst_rq, task, 0);
 }
+
+static inline
+int task_is_pushable(struct rq *rq, struct task_struct *p, int cpu)
+{
+	if (!task_on_cpu(rq, p) &&
+	    cpumask_test_cpu(cpu, &p->cpus_mask))
+		return 1;
+
+	return 0;
+}
 #endif
 
 #endif /* _KERNEL_SCHED_SCHED_H */
-- 
2.44.0.478.gd926399ef9-goog
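
The predicate being consolidated is small enough to restate: a task can
be pushed to a CPU iff it is not currently running and that CPU is set
in its affinity mask. A compilable sketch with userspace stand-ins for
the kernel types (the unsigned long bitmask plays the role of cpus_mask):

#include <stdbool.h>
#include <stdio.h>

struct task { bool on_cpu; unsigned long cpus_mask; };

/* Same shape as the new helper: 1 if pushable to @cpu, else 0. */
static int task_is_pushable(const struct task *p, int cpu)
{
	if (!p->on_cpu && (p->cpus_mask & (1UL << cpu)))
		return 1;

	return 0;
}

int main(void)
{
	struct task p = { .on_cpu = false, .cpus_mask = 0x3 };	/* CPUs 0-1 */

	printf("push to cpu1? %d  push to cpu2? %d\n",
	       task_is_pushable(&p, 1), task_is_pushable(&p, 2));
	return 0;
}

Keeping an int return (with callers comparing == 1) rather than a plain
bool plausibly leaves room for the helper to report additional states in
the later chain-migration patches.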
Date: Mon, 1 Apr 2024 16:44:28 -0700
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>
X-Mailer: git-send-email 2.44.0.478.gd926399ef9-goog
Message-ID: <20240401234439.834544-7-jstultz@google.com>
Subject: [RESEND][PATCH v9 6/7] sched: Split out __schedule() deactivate task logic into a helper
From: John Stultz <jstultz@google.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: John Stultz, Joel Fernandes, Qais Yousef, Ingo Molnar, Peter Zijlstra,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Valentin Schneider,
    Steven Rostedt, Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    "Paul E. McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak,
    Thomas Gleixner, kernel-team@android.com

As we're going to re-use the deactivation logic, split it into a helper.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Signed-off-by: John Stultz
[Added Acked-by tags I received & rebased to 6.9-rc2]
---
v6:
* Define function as static to avoid "no previous prototype" warnings,
  as reported by the kernel test robot
v7:
* Rename state to task_state to be more clear, as suggested by Metin Kaya
---
 kernel/sched/core.c | 72 +++++++++++++++++++++++++++------------------
 1 file changed, 43 insertions(+), 29 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 586a3f8186bd..0eaa0855ef86 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6572,6 +6572,48 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 # define SM_MASK_PREEMPT	SM_PREEMPT
 #endif
 
+/*
+ * Helper function for __schedule()
+ *
+ * If the task has no signals pending, deactivates it and returns true.
+ * Otherwise marks the task's __state as RUNNING and returns false.
+ */
+static bool try_to_deactivate_task(struct rq *rq, struct task_struct *p,
+				   unsigned long task_state)
+{
+	if (signal_pending_state(task_state, p)) {
+		WRITE_ONCE(p->__state, TASK_RUNNING);
+	} else {
+		p->sched_contributes_to_load =
+			(task_state & TASK_UNINTERRUPTIBLE) &&
+			!(task_state & TASK_NOLOAD) &&
+			!(task_state & TASK_FROZEN);
+
+		if (p->sched_contributes_to_load)
+			rq->nr_uninterruptible++;
+
+		/*
+		 * __schedule()			ttwu()
+		 *   prev_state = prev->state;    if (p->on_rq && ...)
+		 *   if (prev_state)		    goto out;
+		 *     p->on_rq = 0;		  smp_acquire__after_ctrl_dep();
+		 *				  p->state = TASK_WAKING
+		 *
+		 * Where __schedule() and ttwu() have matching control dependencies.
+		 *
+		 * After this, schedule() must not care about p->state any more.
+		 */
+		deactivate_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);
+
+		if (p->in_iowait) {
+			atomic_inc(&rq->nr_iowait);
+			delayacct_blkio_start();
+		}
+		return true;
+	}
+	return false;
+}
+
 /*
  * __schedule() is the main scheduler function.
  *
@@ -6665,35 +6707,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	 */
 	prev_state = READ_ONCE(prev->__state);
 	if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) {
-		if (signal_pending_state(prev_state, prev)) {
-			WRITE_ONCE(prev->__state, TASK_RUNNING);
-		} else {
-			prev->sched_contributes_to_load =
-				(prev_state & TASK_UNINTERRUPTIBLE) &&
-				!(prev_state & TASK_NOLOAD) &&
-				!(prev_state & TASK_FROZEN);
-
-			if (prev->sched_contributes_to_load)
-				rq->nr_uninterruptible++;
-
-			/*
-			 * __schedule()			ttwu()
-			 *   prev_state = prev->state;    if (p->on_rq && ...)
-			 *   if (prev_state)		    goto out;
-			 *     p->on_rq = 0;		  smp_acquire__after_ctrl_dep();
-			 *				  p->state = TASK_WAKING
-			 *
-			 * Where __schedule() and ttwu() have matching control dependencies.
-			 *
-			 * After this, schedule() must not care about p->state any more.
-			 */
-			deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);
-
-			if (prev->in_iowait) {
-				atomic_inc(&rq->nr_iowait);
-				delayacct_blkio_start();
-			}
-		}
+		try_to_deactivate_task(rq, prev, prev_state);
 		switch_count = &prev->nvcsw;
 	}
 
-- 
2.44.0.478.gd926399ef9-goog
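[Editorial aside] The move is pure code motion here, but the split only pays
off once something other than __schedule() calls the helper. A hypothetical
second caller might look like the sketch below; the name proxy_deactivate()
and the surrounding logic are illustrative assumptions on my part, not code
from this series:

/*
 * Illustrative sketch only (not part of this patch): the kind of caller
 * that the split enables, deactivating a blocked task outside of
 * __schedule()'s main path.
 */
static bool proxy_deactivate(struct rq *rq, struct task_struct *donor)
{
	unsigned long state = READ_ONCE(donor->__state);

	/* Nothing to do if the task already went back to TASK_RUNNING. */
	if (state == TASK_RUNNING)
		return false;

	/* Reuse the logic that __schedule() now delegates to the helper. */
	return try_to_deactivate_task(rq, donor, state);
}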
Date: Mon, 1 Apr 2024 16:44:29 -0700
In-Reply-To: <20240401234439.834544-1-jstultz@google.com>
References: <20240401234439.834544-1-jstultz@google.com>
X-Mailer: git-send-email 2.44.0.478.gd926399ef9-goog
Message-ID: <20240401234439.834544-8-jstultz@google.com>
Subject: [RESEND][PATCH v9 7/7] sched: Split scheduler and execution contexts
From: John Stultz <jstultz@google.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra, Joel Fernandes, Qais Yousef, Ingo Molnar, Juri Lelli,
    Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
    Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    "Paul E. McKenney", Xuewen Yan, K Prateek Nayak, Metin Kaya,
    Thomas Gleixner, kernel-team@android.com, "Connor O'Brien", John Stultz

From: Peter Zijlstra

Let's define the scheduling context as all the scheduler state in
task_struct for the task selected to run, and the execution context as
all state required to actually run the task.

Currently both are intertwined in task_struct. We want to logically
split these such that we can use the scheduling context of the task
selected to be scheduled, but use the execution context of a different
task to actually be run.

To this end, introduce an rq_selected() macro that points to the
task_struct selected from the runqueue by the scheduler and is used for
scheduler state, while rq->curr is preserved to indicate the execution
context of the task that will actually be run.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Metin Kaya
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Tested-by: K Prateek Nayak
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Juri Lelli
Signed-off-by: Peter Zijlstra (Intel)
Link: https://lkml.kernel.org/r/20181009092434.26221-5-juri.lelli@redhat.com
[add additional comments and update more sched_class code to use rq::proxy]
Signed-off-by: Connor O'Brien
[jstultz: Rebased and resolved minor collisions, reworked to use accessors,
 tweaked update_curr_common to use rq_proxy fixing rt scheduling issues]
Signed-off-by: John Stultz
[Added Acked-by tags I received & rebased to 6.9-rc2]
---
v2:
* Reworked to use accessors
* Fixed update_curr_common to use proxy instead of curr
v3:
* Tweaked wrapper names
* Swapped proxy for selected for clarity
v4:
* Minor variable name tweaks for readability
* Use a macro instead of an inline function and drop other helper
  functions, as suggested by Peter
* Remove verbose comments/questions to avoid review distractions, as
  suggested by Dietmar
v5:
* Add CONFIG_PROXY_EXEC option to this patch so the new logic can be
  tested with this change
* Minor fix to grab rq_selected when holding the rq lock
v7:
* Minor spelling fix and unused argument fixes suggested by Metin Kaya
* Switch to curr_selected for consistency, and minor rewording of
  commit message for clarity
* Rename variables to selected instead of curr where we're using
  rq_selected()
* Reduce macros in CONFIG_SCHED_PROXY_EXEC ifdef sections, as suggested
  by Metin Kaya
v8:
* Use rq->curr, not rq_selected with task_tick, as suggested by Valentin
* Minor rework to reorder this with CONFIG_SCHED_PROXY_EXEC patch
---
 kernel/sched/core.c     | 46 ++++++++++++++++++++++++---------------
 kernel/sched/deadline.c | 35 ++++++++++++++++---------------
 kernel/sched/fair.c     | 18 ++++++++--------
 kernel/sched/rt.c       | 40 +++++++++++++++++------------------
 kernel/sched/sched.h    | 25 ++++++++++++++++++++--
 5 files changed, 99 insertions(+), 65 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0eaa0855ef86..3a21f5e903bf 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -794,7 +794,7 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 
 	rq_lock(rq, &rf);
 	update_rq_clock(rq);
-	rq->curr->sched_class->task_tick(rq, rq->curr, 1);
+	rq_selected(rq)->sched_class->task_tick(rq, rq->curr, 1);
 	rq_unlock(rq, &rf);
 
 	return HRTIMER_NORESTART;
@@ -2236,16 +2236,18 @@ static inline void check_class_changed(struct rq *rq, struct task_struct *p,
 
 void wakeup_preempt(struct rq *rq, struct task_struct *p, int flags)
 {
-	if (p->sched_class == rq->curr->sched_class)
-		rq->curr->sched_class->wakeup_preempt(rq, p, flags);
-	else if (sched_class_above(p->sched_class, rq->curr->sched_class))
+	struct task_struct *selected = rq_selected(rq);
+
+	if (p->sched_class == selected->sched_class)
+		selected->sched_class->wakeup_preempt(rq, p, flags);
+	else if (sched_class_above(p->sched_class, selected->sched_class))
 		resched_curr(rq);
 
 	/*
 	 * A queue event has occurred, and we're going to schedule.  In
 	 * this case, we can save a useless back to back clock update.
 	 */
-	if (task_on_rq_queued(rq->curr) && test_tsk_need_resched(rq->curr))
+	if (task_on_rq_queued(selected) && test_tsk_need_resched(rq->curr))
 		rq_clock_skip_update(rq);
 }
 
@@ -2772,7 +2774,7 @@ __do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
 	lockdep_assert_held(&p->pi_lock);
 
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 
 	if (queued) {
 		/*
@@ -5596,7 +5598,7 @@ unsigned long long task_sched_runtime(struct task_struct *p)
 	 * project cycles that may never be accounted to this
 	 * thread, breaking clock_gettime().
	 */
-	if (task_current(rq, p) && task_on_rq_queued(p)) {
+	if (task_current_selected(rq, p) && task_on_rq_queued(p)) {
 		prefetch_curr_exec_start(p);
 		update_rq_clock(rq);
 		p->sched_class->update_curr(rq);
@@ -5664,7 +5666,8 @@ void scheduler_tick(void)
 {
 	int cpu = smp_processor_id();
 	struct rq *rq = cpu_rq(cpu);
-	struct task_struct *curr = rq->curr;
+	/* accounting goes to the selected task */
+	struct task_struct *selected;
 	struct rq_flags rf;
 	unsigned long thermal_pressure;
 	u64 resched_latency;
@@ -5675,16 +5678,17 @@ void scheduler_tick(void)
 	sched_clock_tick();
 
 	rq_lock(rq, &rf);
+	selected = rq_selected(rq);
 
 	update_rq_clock(rq);
 	thermal_pressure = arch_scale_thermal_pressure(cpu_of(rq));
 	update_thermal_load_avg(rq_clock_thermal(rq), rq, thermal_pressure);
-	curr->sched_class->task_tick(rq, curr, 0);
+	selected->sched_class->task_tick(rq, selected, 0);
 	if (sched_feat(LATENCY_WARN))
 		resched_latency = cpu_resched_latency(rq);
 	calc_global_load_tick(rq);
 	sched_core_tick(rq);
-	task_tick_mm_cid(rq, curr);
+	task_tick_mm_cid(rq, selected);
 
 	rq_unlock(rq, &rf);
 
@@ -5693,8 +5697,8 @@ void scheduler_tick(void)
 
 	perf_event_task_tick();
 
-	if (curr->flags & PF_WQ_WORKER)
-		wq_worker_tick(curr);
+	if (selected->flags & PF_WQ_WORKER)
+		wq_worker_tick(selected);
 
 #ifdef CONFIG_SMP
 	rq->idle_balance = idle_cpu(cpu);
@@ -5759,6 +5763,12 @@ static void sched_tick_remote(struct work_struct *work)
 		struct task_struct *curr = rq->curr;
 
 		if (cpu_online(cpu)) {
+			/*
+			 * Since this is a remote tick for full dynticks mode,
+			 * we are always sure that there is no proxy (only a
+			 * single task is running).
+			 */
+			SCHED_WARN_ON(rq->curr != rq_selected(rq));
 			update_rq_clock(rq);
 
 			if (!is_idle_task(curr)) {
@@ -6712,6 +6722,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	}
 
 	next = pick_next_task(rq, prev, &rf);
+	rq_set_selected(rq, next);
 	clear_tsk_need_resched(prev);
 	clear_preempt_need_resched();
 #ifdef CONFIG_SCHED_DEBUG
@@ -7222,7 +7233,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
 
 	prev_class = p->sched_class;
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 	if (queued)
 		dequeue_task(rq, p, queue_flag);
 	if (running)
@@ -7312,7 +7323,7 @@ void set_user_nice(struct task_struct *p, long nice)
 	}
 
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 	if (queued)
 		dequeue_task(rq, p, DEQUEUE_SAVE | DEQUEUE_NOCLOCK);
 	if (running)
@@ -7891,7 +7902,7 @@ static int __sched_setscheduler(struct task_struct *p,
 	}
 
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 	if (queued)
 		dequeue_task(rq, p, queue_flags);
 	if (running)
@@ -9318,6 +9329,7 @@ void __init init_idle(struct task_struct *idle, int cpu)
 	rcu_read_unlock();
 
 	rq->idle = idle;
+	rq_set_selected(rq, idle);
 	rcu_assign_pointer(rq->curr, idle);
 	idle->on_rq = TASK_ON_RQ_QUEUED;
 #ifdef CONFIG_SMP
@@ -9407,7 +9419,7 @@ void sched_setnuma(struct task_struct *p, int nid)
 
 	rq = task_rq_lock(p, &rf);
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 
 	if (queued)
 		dequeue_task(rq, p, DEQUEUE_SAVE);
@@ -10512,7 +10524,7 @@ void sched_move_task(struct task_struct *tsk)
 
 	update_rq_clock(rq);
 
-	running = task_current(rq, tsk);
+	running = task_current_selected(rq, tsk);
 	queued = task_on_rq_queued(tsk);
 
 	if (queued)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 1b9cdb507498..c30b592d6e9d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1218,7 +1218,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 #endif
 
 	enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
-	if (dl_task(rq->curr))
+	if (dl_task(rq_selected(rq)))
 		wakeup_preempt_dl(rq, p, 0);
 	else
 		resched_curr(rq);
@@ -1442,7 +1442,7 @@ void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
  */
 static void update_curr_dl(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	struct sched_dl_entity *dl_se = &curr->dl;
 	s64 delta_exec;
 
@@ -1899,7 +1899,7 @@ static int find_later_rq(struct task_struct *task);
 static int
 select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 {
-	struct task_struct *curr;
+	struct task_struct *curr, *selected;
 	bool select_rq;
 	struct rq *rq;
 
@@ -1910,6 +1910,7 @@ select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 
 	rcu_read_lock();
 	curr = READ_ONCE(rq->curr); /* unlocked access */
+	selected = READ_ONCE(rq_selected(rq));
 
 	/*
 	 * If we are dealing with a -deadline task, we must
@@ -1920,9 +1921,9 @@ select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 	 * other hand, if it has a shorter deadline, we
 	 * try to make it stay here, it might be important.
 	 */
-	select_rq = unlikely(dl_task(curr)) &&
+	select_rq = unlikely(dl_task(selected)) &&
 		    (curr->nr_cpus_allowed < 2 ||
-		     !dl_entity_preempt(&p->dl, &curr->dl)) &&
+		     !dl_entity_preempt(&p->dl, &selected->dl)) &&
 		    p->nr_cpus_allowed > 1;
 
 	/*
@@ -1985,7 +1986,7 @@ static void check_preempt_equal_dl(struct rq *rq, struct task_struct *p)
 	 * let's hope p can move out.
 	 */
 	if (rq->curr->nr_cpus_allowed == 1 ||
-	    !cpudl_find(&rq->rd->cpudl, rq->curr, NULL))
+	    !cpudl_find(&rq->rd->cpudl, rq_selected(rq), NULL))
 		return;
 
 	/*
@@ -2024,7 +2025,7 @@ static int balance_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
 static void wakeup_preempt_dl(struct rq *rq, struct task_struct *p,
 			      int flags)
 {
-	if (dl_entity_preempt(&p->dl, &rq->curr->dl)) {
+	if (dl_entity_preempt(&p->dl, &rq_selected(rq)->dl)) {
 		resched_curr(rq);
 		return;
 	}
@@ -2034,7 +2035,7 @@ static void wakeup_preempt_dl(struct rq *rq, struct task_struct *p,
 	 * In the unlikely case current and p have the same deadline
 	 * let us try to decide what's the best thing to do...
 	 */
-	if ((p->dl.deadline == rq->curr->dl.deadline) &&
+	if ((p->dl.deadline == rq_selected(rq)->dl.deadline) &&
 	    !test_tsk_need_resched(rq->curr))
 		check_preempt_equal_dl(rq, p);
 #endif /* CONFIG_SMP */
@@ -2066,7 +2067,7 @@ static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first)
 	if (!first)
 		return;
 
-	if (rq->curr->sched_class != &dl_sched_class)
+	if (rq_selected(rq)->sched_class != &dl_sched_class)
 		update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 0);
 
 	deadline_queue_push_tasks(rq);
@@ -2391,8 +2392,8 @@ static int push_dl_task(struct rq *rq)
 	 * can move away, it makes sense to just reschedule
 	 * without going further in pushing next_task.
	 */
-	if (dl_task(rq->curr) &&
-	    dl_time_before(next_task->dl.deadline, rq->curr->dl.deadline) &&
+	if (dl_task(rq_selected(rq)) &&
+	    dl_time_before(next_task->dl.deadline, rq_selected(rq)->dl.deadline) &&
 	    rq->curr->nr_cpus_allowed > 1) {
 		resched_curr(rq);
 		return 0;
@@ -2515,7 +2516,7 @@ static void pull_dl_task(struct rq *this_rq)
 			 *  deadline than the current task of its runqueue.
 			 */
 			if (dl_time_before(p->dl.deadline,
-					   src_rq->curr->dl.deadline))
+					   rq_selected(src_rq)->dl.deadline))
 				goto skip;
 
 			if (is_migration_disabled(p)) {
@@ -2554,9 +2555,9 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p)
 	if (!task_on_cpu(rq, p) &&
 	    !test_tsk_need_resched(rq->curr) &&
 	    p->nr_cpus_allowed > 1 &&
-	    dl_task(rq->curr) &&
+	    dl_task(rq_selected(rq)) &&
 	    (rq->curr->nr_cpus_allowed < 2 ||
-	     !dl_entity_preempt(&p->dl, &rq->curr->dl))) {
+	     !dl_entity_preempt(&p->dl, &rq_selected(rq)->dl))) {
 		push_dl_tasks(rq);
 	}
 }
@@ -2731,12 +2732,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 		return;
 	}
 
-	if (rq->curr != p) {
+	if (rq_selected(rq) != p) {
 #ifdef CONFIG_SMP
 		if (p->nr_cpus_allowed > 1 && rq->dl.overloaded)
 			deadline_queue_push_tasks(rq);
 #endif
-		if (dl_task(rq->curr))
+		if (dl_task(rq_selected(rq)))
 			wakeup_preempt_dl(rq, p, 0);
 		else
 			resched_curr(rq);
@@ -2765,7 +2766,7 @@ static void prio_changed_dl(struct rq *rq, struct task_struct *p,
 		if (!rq->dl.overloaded)
 			deadline_queue_pull_task(rq);
 
-		if (task_current(rq, p)) {
+		if (task_current_selected(rq, p)) {
 			/*
 			 * If we now have a earlier deadline task than p,
 			 * then reschedule, provided p is still on this
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 03be0d1330a6..46f87fd47a33 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1140,7 +1140,7 @@ static inline void update_curr_task(struct task_struct *p, s64 delta_exec)
  */
 s64 update_curr_common(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	s64 delta_exec;
 
 	delta_exec = update_curr_se(rq, &curr->se);
@@ -1177,7 +1177,7 @@ static void update_curr(struct cfs_rq *cfs_rq)
 
 static void update_curr_fair(struct rq *rq)
 {
-	update_curr(cfs_rq_of(&rq->curr->se));
+	update_curr(cfs_rq_of(&rq_selected(rq)->se));
 }
 
 static inline void
@@ -6633,7 +6633,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
 		s64 delta = slice - ran;
 
 		if (delta < 0) {
-			if (task_current(rq, p))
+			if (task_current_selected(rq, p))
 				resched_curr(rq);
 			return;
 		}
@@ -6648,7 +6648,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
  */
 static void hrtick_update(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 
 	if (!hrtick_enabled_fair(rq) || curr->sched_class != &fair_sched_class)
 		return;
@@ -8279,7 +8279,7 @@ static void set_next_buddy(struct sched_entity *se)
  */
 static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int wake_flags)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	struct sched_entity *se = &curr->se, *pse = &p->se;
 	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
 	int cse_is_idle, pse_is_idle;
@@ -8310,7 +8310,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 	 * prevents us from potentially nominating it as a false LAST_BUDDY
 	 * below.
	 */
-	if (test_tsk_need_resched(curr))
+	if (test_tsk_need_resched(rq->curr))
 		return;
 
 	/* Idle tasks are by definition preempted by non-idle tasks. */
@@ -9292,7 +9292,7 @@ static bool __update_blocked_others(struct rq *rq, bool *done)
 	 * update_load_avg() can call cpufreq_update_util(). Make sure that RT,
 	 * DL and IRQ signals have been updated before updating CFS.
 	 */
-	curr_class = rq->curr->sched_class;
+	curr_class = rq_selected(rq)->sched_class;
 
 	thermal_pressure = arch_scale_thermal_pressure(cpu_of(rq));
 
@@ -12661,7 +12661,7 @@ prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio)
 	 * our priority decreased, or if we are not currently running on
 	 * this runqueue and our priority is higher than the current's
 	 */
-	if (task_current(rq, p)) {
+	if (task_current_selected(rq, p)) {
 		if (p->prio > oldprio)
 			resched_curr(rq);
 	} else
@@ -12764,7 +12764,7 @@ static void switched_to_fair(struct rq *rq, struct task_struct *p)
 		 * kick off the schedule if running, otherwise just see
 		 * if we can still preempt the current task.
 		 */
-		if (task_current(rq, p))
+		if (task_current_selected(rq, p))
 			resched_curr(rq);
 		else
 			wakeup_preempt(rq, p, 0);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 638e7c158ae4..48fc7a198f1a 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -530,7 +530,7 @@ static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 
 static void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
 {
-	struct task_struct *curr = rq_of_rt_rq(rt_rq)->curr;
+	struct task_struct *curr = rq_selected(rq_of_rt_rq(rt_rq));
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 	struct sched_rt_entity *rt_se;
 
@@ -1000,7 +1000,7 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
 */
 static void update_curr_rt(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	struct sched_rt_entity *rt_se = &curr->rt;
 	s64 delta_exec;
 
@@ -1543,7 +1543,7 @@ static int find_lowest_rq(struct task_struct *task);
 static int
 select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 {
-	struct task_struct *curr;
+	struct task_struct *curr, *selected;
 	struct rq *rq;
 	bool test;
 
@@ -1555,6 +1555,7 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 
 	rcu_read_lock();
 	curr = READ_ONCE(rq->curr); /* unlocked access */
+	selected = READ_ONCE(rq_selected(rq));
 
 	/*
 	 * If the current task on @p's runqueue is an RT task, then
@@ -1583,8 +1584,8 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 	 * systems like big.LITTLE.
 	 */
 	test = curr &&
-	       unlikely(rt_task(curr)) &&
-	       (curr->nr_cpus_allowed < 2 || curr->prio <= p->prio);
+	       unlikely(rt_task(selected)) &&
+	       (curr->nr_cpus_allowed < 2 || selected->prio <= p->prio);
 
 	if (test || !rt_task_fits_capacity(p, cpu)) {
 		int target = find_lowest_rq(p);
@@ -1614,12 +1615,8 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 
 static void check_preempt_equal_prio(struct rq *rq, struct task_struct *p)
 {
-	/*
-	 * Current can't be migrated, useless to reschedule,
-	 * let's hope p can move out.
-	 */
 	if (rq->curr->nr_cpus_allowed == 1 ||
-	    !cpupri_find(&rq->rd->cpupri, rq->curr, NULL))
+	    !cpupri_find(&rq->rd->cpupri, rq_selected(rq), NULL))
 		return;
 
 	/*
@@ -1662,7 +1659,9 @@ static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
 */
 static void wakeup_preempt_rt(struct rq *rq, struct task_struct *p, int flags)
 {
-	if (p->prio < rq->curr->prio) {
+	struct task_struct *curr = rq_selected(rq);
+
+	if (p->prio < curr->prio) {
 		resched_curr(rq);
 		return;
 	}
@@ -1680,7 +1679,7 @@ static void wakeup_preempt_rt(struct rq *rq, struct task_struct *p, int flags)
 	 * to move current somewhere else, making room for our non-migratable
 	 * task.
 	 */
-	if (p->prio == rq->curr->prio && !test_tsk_need_resched(rq->curr))
+	if (p->prio == curr->prio && !test_tsk_need_resched(rq->curr))
 		check_preempt_equal_prio(rq, p);
 #endif
 }
@@ -1705,7 +1704,7 @@ static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
 	 * utilization. We only care of the case where we start to schedule a
 	 * rt task
 	 */
-	if (rq->curr->sched_class != &rt_sched_class)
+	if (rq_selected(rq)->sched_class != &rt_sched_class)
 		update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 0);
 
 	rt_queue_push_tasks(rq);
@@ -1977,6 +1976,7 @@ static struct task_struct *pick_next_pushable_task(struct rq *rq)
 
 	BUG_ON(rq->cpu != task_cpu(p));
 	BUG_ON(task_current(rq, p));
+	BUG_ON(task_current_selected(rq, p));
 	BUG_ON(p->nr_cpus_allowed <= 1);
 
 	BUG_ON(!task_on_rq_queued(p));
@@ -2009,7 +2009,7 @@ static int push_rt_task(struct rq *rq, bool pull)
 	 * higher priority than current. If that's the case
 	 * just reschedule current.
 	 */
-	if (unlikely(next_task->prio < rq->curr->prio)) {
+	if (unlikely(next_task->prio < rq_selected(rq)->prio)) {
 		resched_curr(rq);
 		return 0;
 	}
@@ -2362,7 +2362,7 @@ static void pull_rt_task(struct rq *this_rq)
 			 * p if it is lower in priority than the
 			 * current task on the run queue
 			 */
-			if (p->prio < src_rq->curr->prio)
+			if (p->prio < rq_selected(src_rq)->prio)
 				goto skip;
 
 			if (is_migration_disabled(p)) {
@@ -2404,9 +2404,9 @@ static void task_woken_rt(struct rq *rq, struct task_struct *p)
 	bool need_to_push = !task_on_cpu(rq, p) &&
 			    !test_tsk_need_resched(rq->curr) &&
 			    p->nr_cpus_allowed > 1 &&
-			    (dl_task(rq->curr) || rt_task(rq->curr)) &&
+			    (dl_task(rq_selected(rq)) || rt_task(rq_selected(rq))) &&
 			    (rq->curr->nr_cpus_allowed < 2 ||
-			     rq->curr->prio <= p->prio);
+			     rq_selected(rq)->prio <= p->prio);
 
 	if (need_to_push)
 		push_rt_tasks(rq);
@@ -2490,7 +2490,7 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p)
 		if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
 			rt_queue_push_tasks(rq);
 #endif /* CONFIG_SMP */
-		if (p->prio < rq->curr->prio && cpu_online(cpu_of(rq)))
+		if (p->prio < rq_selected(rq)->prio && cpu_online(cpu_of(rq)))
 			resched_curr(rq);
 	}
 }
@@ -2505,7 +2505,7 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio)
 	if (!task_on_rq_queued(p))
 		return;
 
-	if (task_current(rq, p)) {
+	if (task_current_selected(rq, p)) {
 #ifdef CONFIG_SMP
 		/*
 		 * If our priority decreases while running, we
@@ -2531,7 +2531,7 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio)
 		 * greater than the current running task
 		 * then reschedule.
		 */
-		if (p->prio < rq->curr->prio)
+		if (p->prio < rq_selected(rq)->prio)
 			resched_curr(rq);
 	}
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index f25eec405df9..3c64a875f0ea 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1030,7 +1030,7 @@ struct rq {
 	 */
 	unsigned int		nr_uninterruptible;
 
-	struct task_struct __rcu	*curr;
+	struct task_struct __rcu	*curr; /* Execution context */
 	struct task_struct	*idle;
 	struct task_struct	*stop;
 	unsigned long		next_balance;
@@ -1225,6 +1225,13 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 #define raw_rq()		raw_cpu_ptr(&runqueues)
 
+/* For now, rq_selected == rq->curr */
+#define rq_selected(rq)		((rq)->curr)
+static inline void rq_set_selected(struct rq *rq, struct task_struct *t)
+{
+	/* Do nothing */
+}
+
 struct sched_group;
 #ifdef CONFIG_SCHED_CORE
 static inline struct cpumask *sched_group_span(struct sched_group *sg);
@@ -2148,11 +2155,25 @@ static inline u64 global_rt_runtime(void)
 	return (u64)sysctl_sched_rt_runtime * NSEC_PER_USEC;
 }
 
+/*
+ * Is p the current execution context?
+ */
 static inline int task_current(struct rq *rq, struct task_struct *p)
 {
 	return rq->curr == p;
 }
 
+/*
+ * Is p the current scheduling context?
+ *
+ * Note that it might be the current execution context at the same time if
+ * rq->curr == rq_selected() == p.
+ */
+static inline int task_current_selected(struct rq *rq, struct task_struct *p)
+{
+	return rq_selected(rq) == p;
+}
+
 static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
 {
 #ifdef CONFIG_SMP
@@ -2322,7 +2343,7 @@ struct sched_class {
 
 static inline void put_prev_task(struct rq *rq, struct task_struct *prev)
 {
-	WARN_ON_ONCE(rq->curr != prev);
+	WARN_ON_ONCE(rq_selected(rq) != prev);
 	prev->sched_class->put_prev_task(rq, prev);
 }
 
-- 
2.44.0.478.gd926399ef9-goog
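[Editorial aside] Since rq_selected() currently just aliases rq->curr and
rq_set_selected() is a no-op, it may help to picture where the accessors are
headed. The sketch below is a hypothetical future shape, assuming a dedicated
field for the scheduling context; the field name curr_selected is an
assumption on my part (the v7 notes above mention switching "to curr_selected
for consistency"), and none of this is part of the patch as posted:

/*
 * Hypothetical sketch, not code from this series: once proxy execution
 * exists, the placeholder accessors could resolve to a real field rather
 * than aliasing rq->curr.
 */
struct rq {
	/* ... other fields ... */
	struct task_struct __rcu	*curr;		/* Execution context */
	struct task_struct		*curr_selected;	/* Scheduling context */
	/* ... other fields ... */
};

#define rq_selected(rq)		((rq)->curr_selected)

static inline void rq_set_selected(struct rq *rq, struct task_struct *t)
{
	rq->curr_selected = t;
}

With such a field, task_current() and task_current_selected() from the
sched.h hunk above would genuinely diverge: during proxy execution, rq->curr
could point at the task being run on behalf of the selected task, while
rq_selected() would still point at the scheduler's choice.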