From nobody Sat Feb 7 15:05:56 2026
Date: Thu, 14 Mar 2024 21:39:46 -0700
Subject: [PATCH v9 1/7] locking/mutex: Remove wakeups from under mutex::wait_lock
Message-ID: <20240315044007.2778856-2-jstultz@google.com>
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
From: John Stultz
To: LKML

From: Peter Zijlstra

In preparation to nest mutex::wait_lock under rq::lock we need to
remove wakeups from under it.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: Peter Zijlstra (Intel)
[Heavily changed after 55f036ca7e74 ("locking: WW mutex cleanup") and
08295b3b5bee ("locking: Implement an algorithm choice for Wound-Wait
mutexes")]
Signed-off-by: Juri Lelli
[jstultz: rebased to mainline, added extra wake_up_q & init to avoid
hangs, similar to Connor's rework of this patch]
Signed-off-by: John Stultz
Acked-by: Davidlohr Bueso
Reviewed-by: Metin Kaya
Tested-by: K Prateek Nayak
Tested-by: Metin Kaya
---
v5: * Reverted back to an earlier version of this patch to undo the
      change that kept the wake_q in the ctx structure, as that broke
      the rule that the wake_q must always be on the stack, as it's
      not safe for concurrency.
v6: * Made tweaks suggested by Waiman Long
v7: * Fixups to pass wake_qs down for PREEMPT_RT logic
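For reference, this is the deferred-wakeup pattern the patch applies
throughout, shown as a minimal standalone sketch (not part of the
patch; example_unlock_path() and its arguments are hypothetical names,
only the wake_q API from <linux/sched/wake_q.h> is assumed):

#include <linux/sched/wake_q.h>
#include <linux/spinlock.h>

static void example_unlock_path(raw_spinlock_t *wait_lock,
				struct task_struct *waiter_task)
{
	DEFINE_WAKE_Q(wake_q);	/* must live on the stack */

	raw_spin_lock(wait_lock);
	/*
	 * While holding wait_lock, only *record* the task to wake:
	 * wake_q_add() takes a reference but does not call into the
	 * scheduler, so no rq->lock is taken here.
	 */
	wake_q_add(&wake_q, waiter_task);
	raw_spin_unlock(wait_lock);

	/* The actual wakeup (and rq->lock acquisition) happens here. */
	wake_up_q(&wake_q);
}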
---
 kernel/locking/mutex.c       | 17 +++++++++++++----
 kernel/locking/rtmutex.c     | 26 +++++++++++++++++---------
 kernel/locking/rwbase_rt.c   |  4 +++-
 kernel/locking/rwsem.c       |  4 ++--
 kernel/locking/spinlock_rt.c |  3 ++-
 kernel/locking/ww_mutex.h    | 29 ++++++++++++++++++-----------
 6 files changed, 55 insertions(+), 28 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index cbae8c0b89ab..980ce630232c 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -575,6 +575,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		    struct lockdep_map *nest_lock, unsigned long ip,
 		    struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
+	DEFINE_WAKE_Q(wake_q);
 	struct mutex_waiter waiter;
 	struct ww_mutex *ww;
 	int ret;
@@ -625,7 +626,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	 */
 	if (__mutex_trylock(lock)) {
 		if (ww_ctx)
-			__ww_mutex_check_waiters(lock, ww_ctx);
+			__ww_mutex_check_waiters(lock, ww_ctx, &wake_q);

 		goto skip_wait;
 	}
@@ -645,7 +646,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		 * Add in stamp order, waking up waiters that must kill
 		 * themselves.
 		 */
-		ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx);
+		ret = __ww_mutex_add_waiter(&waiter, lock, ww_ctx, &wake_q);
 		if (ret)
 			goto err_early_kill;
 	}
@@ -681,6 +682,11 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		}

 		raw_spin_unlock(&lock->wait_lock);
+		/* Make sure we do wakeups before calling schedule */
+		if (!wake_q_empty(&wake_q)) {
+			wake_up_q(&wake_q);
+			wake_q_init(&wake_q);
+		}
 		schedule_preempt_disabled();

 		first = __mutex_waiter_is_first(lock, &waiter);
@@ -714,7 +720,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		 */
 		if (!ww_ctx->is_wait_die &&
 		    !__mutex_waiter_is_first(lock, &waiter))
-			__ww_mutex_check_waiters(lock, ww_ctx);
+			__ww_mutex_check_waiters(lock, ww_ctx, &wake_q);
 	}

 	__mutex_remove_waiter(lock, &waiter);
@@ -730,6 +736,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		ww_mutex_lock_acquired(ww, ww_ctx);

 	raw_spin_unlock(&lock->wait_lock);
+	wake_up_q(&wake_q);
 	preempt_enable();
 	return 0;

@@ -741,6 +748,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	raw_spin_unlock(&lock->wait_lock);
 	debug_mutex_free_waiter(&waiter);
 	mutex_release(&lock->dep_map, ip);
+	wake_up_q(&wake_q);
 	preempt_enable();
 	return ret;
 }
@@ -934,6 +942,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 		}
 	}

+	preempt_disable();
 	raw_spin_lock(&lock->wait_lock);
 	debug_mutex_unlock(lock);
 	if (!list_empty(&lock->wait_list)) {
@@ -952,8 +961,8 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 		__mutex_handoff(lock, next);

 	raw_spin_unlock(&lock->wait_lock);
-
 	wake_up_q(&wake_q);
+	preempt_enable();
 }

 #ifndef CONFIG_DEBUG_LOCK_ALLOC
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 4a10e8c16fd2..eaac8b196a69 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -34,13 +34,15 @@

 static inline int __ww_mutex_add_waiter(struct rt_mutex_waiter *waiter,
 					struct rt_mutex *lock,
-					struct ww_acquire_ctx *ww_ctx)
+					struct ww_acquire_ctx *ww_ctx,
+					struct wake_q_head *wake_q)
 {
 	return 0;
 }

 static inline void
 __ww_mutex_check_waiters(struct rt_mutex *lock,
-			 struct ww_acquire_ctx *ww_ctx)
+			 struct ww_acquire_ctx *ww_ctx,
+			 struct wake_q_head *wake_q)
 {
 }

@@ -1206,6 +1208,7 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
 	struct rt_mutex_waiter *top_waiter = waiter;
 	struct rt_mutex_base *next_lock;
 	int chain_walk = 0, res;
+	DEFINE_WAKE_Q(wake_q);

 	lockdep_assert_held(&lock->wait_lock);

@@ -1244,7 +1247,8 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,

 	/* Check whether the waiter should back out immediately */
 	rtm = container_of(lock, struct rt_mutex, rtmutex);
-	res = __ww_mutex_add_waiter(waiter, rtm, ww_ctx);
+	res = __ww_mutex_add_waiter(waiter, rtm, ww_ctx, &wake_q);
+	wake_up_q(&wake_q);
 	if (res) {
 		raw_spin_lock(&task->pi_lock);
 		rt_mutex_dequeue(lock, waiter);
@@ -1677,7 +1681,8 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 				       struct ww_acquire_ctx *ww_ctx,
 				       unsigned int state,
 				       enum rtmutex_chainwalk chwalk,
-				       struct rt_mutex_waiter *waiter)
+				       struct rt_mutex_waiter *waiter,
+				       struct wake_q_head *wake_q)
 {
 	struct rt_mutex *rtm = container_of(lock, struct rt_mutex, rtmutex);
 	struct ww_mutex *ww = ww_container_of(rtm);
@@ -1688,7 +1693,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 	/* Try to acquire the lock again: */
 	if (try_to_take_rt_mutex(lock, current, NULL)) {
 		if (build_ww_mutex() && ww_ctx) {
-			__ww_mutex_check_waiters(rtm, ww_ctx);
+			__ww_mutex_check_waiters(rtm, ww_ctx, wake_q);
 			ww_mutex_lock_acquired(ww, ww_ctx);
 		}
 		return 0;
@@ -1706,7 +1711,7 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,
 		/* acquired the lock */
 		if (build_ww_mutex() && ww_ctx) {
 			if (!ww_ctx->is_wait_die)
-				__ww_mutex_check_waiters(rtm, ww_ctx);
+				__ww_mutex_check_waiters(rtm, ww_ctx, wake_q);
 			ww_mutex_lock_acquired(ww, ww_ctx);
 		}
 	} else {
@@ -1728,7 +1733,8 @@ static int __sched __rt_mutex_slowlock(struct rt_mutex_base *lock,

 static inline int __rt_mutex_slowlock_locked(struct rt_mutex_base *lock,
 					     struct ww_acquire_ctx *ww_ctx,
-					     unsigned int state)
+					     unsigned int state,
+					     struct wake_q_head *wake_q)
 {
 	struct rt_mutex_waiter waiter;
 	int ret;
@@ -1737,7 +1743,7 @@ static inline int __rt_mutex_slowlock_locked(struct rt_mutex_base *lock,
 	waiter.ww_ctx = ww_ctx;

 	ret = __rt_mutex_slowlock(lock, ww_ctx, state, RT_MUTEX_MIN_CHAINWALK,
-				  &waiter);
+				  &waiter, wake_q);

 	debug_rt_mutex_free_waiter(&waiter);
 	return ret;
@@ -1753,6 +1759,7 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 				     struct ww_acquire_ctx *ww_ctx,
 				     unsigned int state)
 {
+	DEFINE_WAKE_Q(wake_q);
 	unsigned long flags;
 	int ret;

@@ -1774,8 +1781,9 @@ static int __sched rt_mutex_slowlock(struct rt_mutex_base *lock,
 	 * irqsave/restore variants.
 	 */
 	raw_spin_lock_irqsave(&lock->wait_lock, flags);
-	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state);
+	ret = __rt_mutex_slowlock_locked(lock, ww_ctx, state, &wake_q);
 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
+	wake_up_q(&wake_q);
 	rt_mutex_post_schedule();

 	return ret;
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index 34a59569db6b..e9d2f38b70f3 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -69,6 +69,7 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 				      unsigned int state)
 {
 	struct rt_mutex_base *rtm = &rwb->rtmutex;
+	DEFINE_WAKE_Q(wake_q);
 	int ret;

 	rwbase_pre_schedule();
@@ -110,7 +111,7 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 	 * For rwlocks this returns 0 unconditionally, so the below
 	 * !ret conditionals are optimized out.
 	 */
-	ret = rwbase_rtmutex_slowlock_locked(rtm, state);
+	ret = rwbase_rtmutex_slowlock_locked(rtm, state, &wake_q);

 	/*
 	 * On success the rtmutex is held, so there can't be a writer
@@ -122,6 +123,7 @@ static int __sched __rwbase_read_lock(struct rwbase_rt *rwb,
 	if (!ret)
 		atomic_inc(&rwb->readers);
 	raw_spin_unlock_irq(&rtm->wait_lock);
+	wake_up_q(&wake_q);
 	if (!ret)
 		rwbase_rtmutex_unlock(rtm);

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 2340b6d90ec6..74ebb2915d63 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -1415,8 +1415,8 @@ static inline void __downgrade_write(struct rw_semaphore *sem)
 #define rwbase_rtmutex_lock_state(rtm, state)		\
 	__rt_mutex_lock(rtm, state)

-#define rwbase_rtmutex_slowlock_locked(rtm, state)	\
-	__rt_mutex_slowlock_locked(rtm, NULL, state)
+#define rwbase_rtmutex_slowlock_locked(rtm, state, wq)	\
+	__rt_mutex_slowlock_locked(rtm, NULL, state, wq)

 #define rwbase_rtmutex_unlock(rtm)			\
 	__rt_mutex_unlock(rtm)
diff --git a/kernel/locking/spinlock_rt.c b/kernel/locking/spinlock_rt.c
index 38e292454fcc..fb1810a14c9d 100644
--- a/kernel/locking/spinlock_rt.c
+++ b/kernel/locking/spinlock_rt.c
@@ -162,7 +162,8 @@ rwbase_rtmutex_lock_state(struct rt_mutex_base *rtm, unsigned int state)
 }

 static __always_inline int
-rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state)
+rwbase_rtmutex_slowlock_locked(struct rt_mutex_base *rtm, unsigned int state,
+			       struct wake_q_head *wake_q)
 {
 	rtlock_slowlock_locked(rtm);
 	return 0;
diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
index 3ad2cc4823e5..7189c6631d90 100644
--- a/kernel/locking/ww_mutex.h
+++ b/kernel/locking/ww_mutex.h
@@ -275,7 +275,7 @@ __ww_ctx_less(struct ww_acquire_ctx *a, struct ww_acquire_ctx *b)
  */
 static bool
 __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
-	       struct ww_acquire_ctx *ww_ctx)
+	       struct ww_acquire_ctx *ww_ctx, struct wake_q_head *wake_q)
 {
 	if (!ww_ctx->is_wait_die)
 		return false;
@@ -284,7 +284,7 @@ __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
 #ifndef WW_RT
 		debug_mutex_wake_waiter(lock, waiter);
 #endif
-		wake_up_process(waiter->task);
+		wake_q_add(wake_q, waiter->task);
 	}

 	return true;
@@ -299,7 +299,8 @@ __ww_mutex_die(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
  */
 static bool __ww_mutex_wound(struct MUTEX *lock,
 			     struct ww_acquire_ctx *ww_ctx,
-			     struct ww_acquire_ctx *hold_ctx)
+			     struct ww_acquire_ctx *hold_ctx,
+			     struct wake_q_head *wake_q)
 {
 	struct task_struct *owner = __ww_mutex_owner(lock);

@@ -331,7 +332,7 @@ static bool __ww_mutex_wound(struct MUTEX *lock,
 		 * wakeup pending to re-read the wounded state.
 		 */
 		if (owner != current)
-			wake_up_process(owner);
+			wake_q_add(wake_q, owner);

 	return true;
 }
@@ -352,7 +353,8 @@ static bool __ww_mutex_wound(struct MUTEX *lock,
  * The current task must not be on the wait list.
  */
 static void
-__ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
+__ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx,
+			 struct wake_q_head *wake_q)
 {
 	struct MUTEX_WAITER *cur;

@@ -364,8 +366,8 @@ __ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
 		if (!cur->ww_ctx)
 			continue;

-		if (__ww_mutex_die(lock, cur, ww_ctx) ||
-		    __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx))
+		if (__ww_mutex_die(lock, cur, ww_ctx, wake_q) ||
+		    __ww_mutex_wound(lock, cur->ww_ctx, ww_ctx, wake_q))
 			break;
 	}
 }
@@ -377,6 +379,8 @@ __ww_mutex_check_waiters(struct MUTEX *lock, struct ww_acquire_ctx *ww_ctx)
 static __always_inline void
 ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+	DEFINE_WAKE_Q(wake_q);
+
 	ww_mutex_lock_acquired(lock, ctx);

 	/*
@@ -405,8 +409,10 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 	 * die or wound us.
 	 */
 	lock_wait_lock(&lock->base);
-	__ww_mutex_check_waiters(&lock->base, ctx);
+	__ww_mutex_check_waiters(&lock->base, ctx, &wake_q);
 	unlock_wait_lock(&lock->base);
+
+	wake_up_q(&wake_q);
 }

 static __always_inline int
@@ -488,7 +494,8 @@ __ww_mutex_check_kill(struct MUTEX *lock, struct MUTEX_WAITER *waiter,
 static inline int
 __ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
 		      struct MUTEX *lock,
-		      struct ww_acquire_ctx *ww_ctx)
+		      struct ww_acquire_ctx *ww_ctx,
+		      struct wake_q_head *wake_q)
 {
 	struct MUTEX_WAITER *cur, *pos = NULL;
 	bool is_wait_die;
@@ -532,7 +539,7 @@ __ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
 			pos = cur;

 		/* Wait-Die: ensure younger waiters die. */
-		__ww_mutex_die(lock, cur, ww_ctx);
+		__ww_mutex_die(lock, cur, ww_ctx, wake_q);
 	}

 	__ww_waiter_add(lock, waiter, pos);
@@ -550,7 +557,7 @@ __ww_mutex_add_waiter(struct MUTEX_WAITER *waiter,
 		 * such that either we or the fastpath will wound @ww->ctx.
 		 */
 		smp_mb();
-		__ww_mutex_wound(lock, ww_ctx, ww->ctx);
+		__ww_mutex_wound(lock, ww_ctx, ww->ctx, wake_q);
 	}

 	return 0;
--
2.44.0.291.gc1ea87d7ee-goog

From nobody Sat Feb 7 15:05:56 2026
Date: Thu, 14 Mar 2024 21:39:47 -0700
Subject: [PATCH v9 2/7] locking/mutex: Make mutex::wait_lock irq safe
Message-ID: <20240315044007.2778856-3-jstultz@google.com>
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
From: John Stultz
To: LKML

From: Juri Lelli

mutex::wait_lock might be nested under rq->lock. Make it irq safe then.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: Juri Lelli
Signed-off-by: Peter Zijlstra (Intel)
[rebase & fix {un,}lock_wait_lock helpers in ww_mutex.h]
Signed-off-by: Connor O'Brien
Signed-off-by: John Stultz
Reviewed-by: Metin Kaya
Tested-by: K Prateek Nayak
Tested-by: Metin Kaya
---
v3: * Re-added this patch after it was dropped in v2, which caused
      lockdep warnings to trip.
v7: * Fix function definition for PREEMPT_RT case, as pointed out by
      Metin Kaya.
    * Fix incorrect flags handling in PREEMPT_RT case as found by
      Metin Kaya
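For reference, the lock-variant conversion this patch performs looks
like the following minimal sketch (illustrative only; example_fn() is
a hypothetical caller):

#include <linux/spinlock.h>

static void example_fn(raw_spinlock_t *wait_lock)
{
	unsigned long flags;

	/*
	 * Plain raw_spin_lock() becomes unsafe once wait_lock can nest
	 * inside rq->lock: an interrupt taken while holding wait_lock
	 * could wake a task and take rq->lock, inverting the lock
	 * order. The irqsave variant disables interrupts and stashes
	 * their prior state in 'flags'.
	 */
	raw_spin_lock_irqsave(wait_lock, flags);
	/* ... critical section touching the wait list ... */
	raw_spin_unlock_irqrestore(wait_lock, flags);
}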
---
 kernel/locking/mutex.c    | 18 ++++++++++--------
 kernel/locking/ww_mutex.h | 22 +++++++++++-----------
 2 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 980ce630232c..7de72c610c65 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -578,6 +578,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	DEFINE_WAKE_Q(wake_q);
 	struct mutex_waiter waiter;
 	struct ww_mutex *ww;
+	unsigned long flags;
 	int ret;

 	if (!use_ww_ctx)
@@ -620,7 +621,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 		return 0;
 	}

-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	/*
 	 * After waiting to acquire the wait_lock, try again.
 	 */
@@ -681,7 +682,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 			goto err;
 		}

-		raw_spin_unlock(&lock->wait_lock);
+		raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 		/* Make sure we do wakeups before calling schedule */
 		if (!wake_q_empty(&wake_q)) {
 			wake_up_q(&wake_q);
@@ -707,9 +708,9 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 			trace_contention_begin(lock, LCB_F_MUTEX);
 		}

-		raw_spin_lock(&lock->wait_lock);
+		raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	}
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 acquired:
 	__set_current_state(TASK_RUNNING);

@@ -735,7 +736,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	if (ww_ctx)
 		ww_mutex_lock_acquired(ww, ww_ctx);

-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	wake_up_q(&wake_q);
 	preempt_enable();
 	return 0;
@@ -745,7 +746,7 @@ __mutex_lock_common(struct mutex *lock, unsigned int state, unsigned int subclas
 	__mutex_remove_waiter(lock, &waiter);
 err_early_kill:
 	trace_contention_end(lock, ret);
-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	debug_mutex_free_waiter(&waiter);
 	mutex_release(&lock->dep_map, ip);
 	wake_up_q(&wake_q);
@@ -916,6 +917,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	struct task_struct *next = NULL;
 	DEFINE_WAKE_Q(wake_q);
 	unsigned long owner;
+	unsigned long flags;

 	mutex_release(&lock->dep_map, ip);

@@ -943,7 +945,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	}

 	preempt_disable();
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	debug_mutex_unlock(lock);
 	if (!list_empty(&lock->wait_list)) {
 		/* get the first entry from the wait-list: */
@@ -960,7 +962,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	if (owner & MUTEX_FLAG_HANDOFF)
 		__mutex_handoff(lock, next);

-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	wake_up_q(&wake_q);
 	preempt_enable();
 }
diff --git a/kernel/locking/ww_mutex.h b/kernel/locking/ww_mutex.h
index 7189c6631d90..9facc0ddfdd3 100644
--- a/kernel/locking/ww_mutex.h
+++ b/kernel/locking/ww_mutex.h
@@ -70,14 +70,14 @@ __ww_mutex_has_waiters(struct mutex *lock)
 	return atomic_long_read(&lock->owner) & MUTEX_FLAG_WAITERS;
 }

-static inline void lock_wait_lock(struct mutex *lock)
+static inline void lock_wait_lock(struct mutex *lock, unsigned long *flags)
 {
-	raw_spin_lock(&lock->wait_lock);
+	raw_spin_lock_irqsave(&lock->wait_lock, *flags);
 }

-static inline void unlock_wait_lock(struct mutex *lock)
+static inline void unlock_wait_lock(struct mutex *lock, unsigned long *flags)
 {
-	raw_spin_unlock(&lock->wait_lock);
+	raw_spin_unlock_irqrestore(&lock->wait_lock, *flags);
 }

 static inline void lockdep_assert_wait_lock_held(struct mutex *lock)
@@ -144,14 +144,14 @@ __ww_mutex_has_waiters(struct rt_mutex *lock)
 	return rt_mutex_has_waiters(&lock->rtmutex);
 }

-static inline void lock_wait_lock(struct rt_mutex *lock)
+static inline void lock_wait_lock(struct rt_mutex *lock, unsigned long *flags)
 {
-	raw_spin_lock(&lock->rtmutex.wait_lock);
+	raw_spin_lock_irqsave(&lock->rtmutex.wait_lock, *flags);
 }

-static inline void unlock_wait_lock(struct rt_mutex *lock)
+static inline void unlock_wait_lock(struct rt_mutex *lock, unsigned long *flags)
 {
-	raw_spin_unlock(&lock->rtmutex.wait_lock);
+	raw_spin_unlock_irqrestore(&lock->rtmutex.wait_lock, *flags);
 }

 static inline void lockdep_assert_wait_lock_held(struct rt_mutex *lock)
@@ -380,6 +380,7 @@ static __always_inline void
 ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
 	DEFINE_WAKE_Q(wake_q);
+	unsigned long flags;

 	ww_mutex_lock_acquired(lock, ctx);

@@ -408,10 +409,9 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 	 * Uh oh, we raced in fastpath, check if any of the waiters need to
 	 * die or wound us.
 	 */
-	lock_wait_lock(&lock->base);
+	lock_wait_lock(&lock->base, &flags);
 	__ww_mutex_check_waiters(&lock->base, ctx, &wake_q);
-	unlock_wait_lock(&lock->base);
-
+	unlock_wait_lock(&lock->base, &flags);
 	wake_up_q(&wake_q);
 }

--
2.44.0.291.gc1ea87d7ee-goog

From nobody Sat Feb 7 15:05:56 2026
Date: Thu, 14 Mar 2024 21:39:48 -0700
Subject: [PATCH v9 3/7] locking/mutex: Expose __mutex_owner()
Message-ID: <20240315044007.2778856-4-jstultz@google.com>
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
From: John Stultz
To: LKML

From: Juri Lelli

Implementing proxy execution requires that scheduler code be able to
identify the current owner of a mutex. Expose __mutex_owner() for
this purpose (alone!).

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: kernel-team@android.com Signed-off-by: Juri Lelli [Removed the EXPORT_SYMBOL] Signed-off-by: Valentin Schneider Signed-off-by: Connor O'Brien [jstultz: Reworked per Peter's suggestions] Signed-off-by: John Stultz Reviewed-by: Metin Kaya Tested-by: K Prateek Nayak Tested-by: Metin Kaya --- v4: * Move __mutex_owner() to kernel/locking/mutex.h instead of adding a new globally available accessor function to keep the exposure of this low, along with keeping it an inline function, as suggested by PeterZ --- kernel/locking/mutex.c | 25 ------------------------- kernel/locking/mutex.h | 25 +++++++++++++++++++++++++ 2 files changed, 25 insertions(+), 25 deletions(-) diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 7de72c610c65..5741641be914 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -56,31 +56,6 @@ __mutex_init(struct mutex *lock, const char *name, struc= t lock_class_key *key) } EXPORT_SYMBOL(__mutex_init); =20 -/* - * @owner: contains: 'struct task_struct *' to the current lock owner, - * NULL means not owned. Since task_struct pointers are aligned at - * at least L1_CACHE_BYTES, we have low bits to store extra state. - * - * Bit0 indicates a non-empty waiter list; unlock must issue a wakeup. - * Bit1 indicates unlock needs to hand the lock to the top-waiter - * Bit2 indicates handoff has been done and we're waiting for pickup. - */ -#define MUTEX_FLAG_WAITERS 0x01 -#define MUTEX_FLAG_HANDOFF 0x02 -#define MUTEX_FLAG_PICKUP 0x04 - -#define MUTEX_FLAGS 0x07 - -/* - * Internal helper function; C doesn't allow us to hide it :/ - * - * DO NOT USE (outside of mutex code). - */ -static inline struct task_struct *__mutex_owner(struct mutex *lock) -{ - return (struct task_struct *)(atomic_long_read(&lock->owner) & ~MUTEX_FLA= GS); -} - static inline struct task_struct *__owner_task(unsigned long owner) { return (struct task_struct *)(owner & ~MUTEX_FLAGS); diff --git a/kernel/locking/mutex.h b/kernel/locking/mutex.h index 0b2a79c4013b..1c7d3d32def8 100644 --- a/kernel/locking/mutex.h +++ b/kernel/locking/mutex.h @@ -20,6 +20,31 @@ struct mutex_waiter { #endif }; =20 +/* + * @owner: contains: 'struct task_struct *' to the current lock owner, + * NULL means not owned. Since task_struct pointers are aligned at + * at least L1_CACHE_BYTES, we have low bits to store extra state. + * + * Bit0 indicates a non-empty waiter list; unlock must issue a wakeup. + * Bit1 indicates unlock needs to hand the lock to the top-waiter + * Bit2 indicates handoff has been done and we're waiting for pickup. + */ +#define MUTEX_FLAG_WAITERS 0x01 +#define MUTEX_FLAG_HANDOFF 0x02 +#define MUTEX_FLAG_PICKUP 0x04 + +#define MUTEX_FLAGS 0x07 + +/* + * Internal helper function; C doesn't allow us to hide it :/ + * + * DO NOT USE (outside of mutex & scheduler code). 
---
 kernel/locking/mutex.c | 25 -------------------------
 kernel/locking/mutex.h | 25 +++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 7de72c610c65..5741641be914 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -56,31 +56,6 @@ __mutex_init(struct mutex *lock, const char *name, struct lock_class_key *key)
 }
 EXPORT_SYMBOL(__mutex_init);

-/*
- * @owner: contains: 'struct task_struct *' to the current lock owner,
- * NULL means not owned. Since task_struct pointers are aligned at
- * at least L1_CACHE_BYTES, we have low bits to store extra state.
- *
- * Bit0 indicates a non-empty waiter list; unlock must issue a wakeup.
- * Bit1 indicates unlock needs to hand the lock to the top-waiter
- * Bit2 indicates handoff has been done and we're waiting for pickup.
- */
-#define MUTEX_FLAG_WAITERS	0x01
-#define MUTEX_FLAG_HANDOFF	0x02
-#define MUTEX_FLAG_PICKUP	0x04
-
-#define MUTEX_FLAGS		0x07
-
-/*
- * Internal helper function; C doesn't allow us to hide it :/
- *
- * DO NOT USE (outside of mutex code).
- */
-static inline struct task_struct *__mutex_owner(struct mutex *lock)
-{
-	return (struct task_struct *)(atomic_long_read(&lock->owner) & ~MUTEX_FLAGS);
-}
-
 static inline struct task_struct *__owner_task(unsigned long owner)
 {
 	return (struct task_struct *)(owner & ~MUTEX_FLAGS);
diff --git a/kernel/locking/mutex.h b/kernel/locking/mutex.h
index 0b2a79c4013b..1c7d3d32def8 100644
--- a/kernel/locking/mutex.h
+++ b/kernel/locking/mutex.h
@@ -20,6 +20,31 @@ struct mutex_waiter {
 #endif
 };

+/*
+ * @owner: contains: 'struct task_struct *' to the current lock owner,
+ * NULL means not owned. Since task_struct pointers are aligned at
+ * at least L1_CACHE_BYTES, we have low bits to store extra state.
+ *
+ * Bit0 indicates a non-empty waiter list; unlock must issue a wakeup.
+ * Bit1 indicates unlock needs to hand the lock to the top-waiter
+ * Bit2 indicates handoff has been done and we're waiting for pickup.
+ */
+#define MUTEX_FLAG_WAITERS	0x01
+#define MUTEX_FLAG_HANDOFF	0x02
+#define MUTEX_FLAG_PICKUP	0x04
+
+#define MUTEX_FLAGS		0x07
+
+/*
+ * Internal helper function; C doesn't allow us to hide it :/
+ *
+ * DO NOT USE (outside of mutex & scheduler code).
+ */
+static inline struct task_struct *__mutex_owner(struct mutex *lock)
+{
+	return (struct task_struct *)(atomic_long_read(&lock->owner) & ~MUTEX_FLAGS);
+}
+
 #ifdef CONFIG_DEBUG_MUTEXES
 extern void debug_mutex_lock_common(struct mutex *lock,
 				    struct mutex_waiter *waiter);
--
2.44.0.291.gc1ea87d7ee-goog

From nobody Sat Feb 7 15:05:56 2026
Date: Thu, 14 Mar 2024 21:39:49 -0700
Subject: [PATCH v9 4/7] sched: Add do_push_task helper
Message-ID: <20240315044007.2778856-5-jstultz@google.com>
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
From: John Stultz
To: LKML

From: Connor O'Brien

Switch logic that deactivates, sets the task cpu, and reactivates a
task on a different rq to use a helper that will later be extended to
push entire blocked task chains.

This patch was broken out from a larger chain migration patch
originally by Connor O'Brien.
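For readers skimming, the helper being factored out is just the
three-step migration sequence, shown here with explanatory comments
(same code that lands in kernel/sched/sched.h in the diff below; both
call sites are assumed to hold the relevant rq locks, as the existing
open-coded copies do):

static inline void do_push_task(struct rq *rq, struct rq *dst_rq,
				struct task_struct *task)
{
	deactivate_task(rq, task, 0);		/* dequeue from the source rq */
	set_task_cpu(task, dst_rq->cpu);	/* update the task's CPU binding */
	activate_task(dst_rq, task, 0);		/* enqueue on the destination rq */
}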
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: kernel-team@android.com Signed-off-by: Connor O'Brien [jstultz: split out from larger chain migration patch] Signed-off-by: John Stultz Reviewed-by: Metin Kaya Tested-by: K Prateek Nayak Tested-by: Metin Kaya --- v8: * Renamed from push_task_chain to do_push_task so it makes more sense without proxy-execution --- kernel/sched/core.c | 4 +--- kernel/sched/deadline.c | 8 ++------ kernel/sched/rt.c | 8 ++------ kernel/sched/sched.h | 9 +++++++++ 4 files changed, 14 insertions(+), 15 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9116bcc90346..ad4748327651 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2714,9 +2714,7 @@ int push_cpu_stop(void *arg) =20 // XXX validate p is still the highest prio task if (task_rq(p) =3D=3D rq) { - deactivate_task(rq, p, 0); - set_task_cpu(p, lowest_rq->cpu); - activate_task(lowest_rq, p, 0); + do_push_task(rq, lowest_rq, p); resched_curr(lowest_rq); } =20 diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index a04a436af8cc..e68d88963e89 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2443,9 +2443,7 @@ static int push_dl_task(struct rq *rq) goto retry; } =20 - deactivate_task(rq, next_task, 0); - set_task_cpu(next_task, later_rq->cpu); - activate_task(later_rq, next_task, 0); + do_push_task(rq, later_rq, next_task); ret =3D 1; =20 resched_curr(later_rq); @@ -2531,9 +2529,7 @@ static void pull_dl_task(struct rq *this_rq) if (is_migration_disabled(p)) { push_task =3D get_push_task(src_rq); } else { - deactivate_task(src_rq, p, 0); - set_task_cpu(p, this_cpu); - activate_task(this_rq, p, 0); + do_push_task(src_rq, this_rq, p); dmin =3D p->dl.deadline; resched =3D true; } diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 3261b067b67e..dd072d11cc02 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -2106,9 +2106,7 @@ static int push_rt_task(struct rq *rq, bool pull) goto retry; } =20 - deactivate_task(rq, next_task, 0); - set_task_cpu(next_task, lowest_rq->cpu); - activate_task(lowest_rq, next_task, 0); + do_push_task(rq, lowest_rq, next_task); resched_curr(lowest_rq); ret =3D 1; =20 @@ -2379,9 +2377,7 @@ static void pull_rt_task(struct rq *this_rq) if (is_migration_disabled(p)) { push_task =3D get_push_task(src_rq); } else { - deactivate_task(src_rq, p, 0); - set_task_cpu(p, this_cpu); - activate_task(this_rq, p, 0); + do_push_task(src_rq, this_rq, p); resched =3D true; } /* diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 001fe047bd5d..6ca83837e0f4 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3472,5 +3472,14 @@ static inline void init_sched_mm_cid(struct task_str= uct *t) { } =20 extern u64 avg_vruntime(struct cfs_rq *cfs_rq); extern int entity_eligible(struct cfs_rq *cfs_rq, struct sched_entity *se); +#ifdef CONFIG_SMP +static inline +void do_push_task(struct rq *rq, struct rq *dst_rq, struct task_struct *ta= sk) +{ + deactivate_task(rq, task, 0); + set_task_cpu(task, dst_rq->cpu); + activate_task(dst_rq, task, 0); +} +#endif =20 #endif /* _KERNEL_SCHED_SCHED_H */ --=20 2.44.0.291.gc1ea87d7ee-goog From nobody Sat Feb 7 15:05:56 2026 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B57DE134A6 for ; Fri, 15 Mar 2024 04:40:29 +0000 (UTC) Authentication-Results: 
Date: Thu, 14 Mar 2024 21:39:50 -0700
Subject: [PATCH v9 5/7] sched: Consolidate pick_*_task to task_is_pushable helper
Message-ID: <20240315044007.2778856-6-jstultz@google.com>
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
From: John Stultz
To: LKML

From: Connor O'Brien

This patch consolidates the rt and deadline pick_*_task functions
into a task_is_pushable() helper.

This patch was broken out from a larger chain migration patch
originally by Connor O'Brien.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: Connor O'Brien
[jstultz: split out from larger chain migration patch, renamed helper
function]
Signed-off-by: John Stultz
Reviewed-by: Metin Kaya
Tested-by: K Prateek Nayak
Tested-by: Metin Kaya
---
v7: * Split from chain migration patch
    * Renamed function
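Standalone sketch of the consolidated check, with explanatory comments
(the real helper is added to kernel/sched/sched.h in the diff below):
a task may be pushed to 'cpu' only if it is not currently executing
and 'cpu' is allowed by its affinity mask.

static inline
int task_is_pushable(struct rq *rq, struct task_struct *p, int cpu)
{
	if (!task_on_cpu(rq, p) &&			/* not running right now */
	    cpumask_test_cpu(cpu, &p->cpus_mask))	/* cpu is permitted */
		return 1;

	return 0;
}

Note that the call sites in the diff compare the result against 1
explicitly ("== 1"), presumably leaving room for additional return
values in later patches of the series.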
---
 kernel/sched/deadline.c | 10 +---------
 kernel/sched/rt.c       | 11 +----------
 kernel/sched/sched.h    | 10 ++++++++++
 3 files changed, 12 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index e68d88963e89..1b9cdb507498 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2179,14 +2179,6 @@ static void task_fork_dl(struct task_struct *p)
 /* Only try algorithms three times */
 #define DL_MAX_TRIES 3

-static int pick_dl_task(struct rq *rq, struct task_struct *p, int cpu)
-{
-	if (!task_on_cpu(rq, p) &&
-	    cpumask_test_cpu(cpu, &p->cpus_mask))
-		return 1;
-	return 0;
-}
-
 /*
  * Return the earliest pushable rq's task, which is suitable to be executed
  * on the CPU, NULL otherwise:
@@ -2205,7 +2197,7 @@ static struct task_struct *pick_earliest_pushable_dl_task(struct rq *rq, int cpu
 	if (next_node) {
 		p = __node_2_pdl(next_node);

-		if (pick_dl_task(rq, p, cpu))
+		if (task_is_pushable(rq, p, cpu) == 1)
 			return p;

 		next_node = rb_next(next_node);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index dd072d11cc02..638e7c158ae4 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1791,15 +1791,6 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 /* Only try algorithms three times */
 #define RT_MAX_TRIES 3

-static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
-{
-	if (!task_on_cpu(rq, p) &&
-	    cpumask_test_cpu(cpu, &p->cpus_mask))
-		return 1;
-
-	return 0;
-}
-
 /*
  * Return the highest pushable rq's task, which is suitable to be executed
  * on the CPU, NULL otherwise
@@ -1813,7 +1804,7 @@ static struct task_struct *pick_highest_pushable_task(struct rq *rq, int cpu)
 		return NULL;

 	plist_for_each_entry(p, head, pushable_tasks) {
-		if (pick_rt_task(rq, p, cpu))
+		if (task_is_pushable(rq, p, cpu) == 1)
 			return p;
 	}

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6ca83837e0f4..c83e5e0672dc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3480,6 +3480,16 @@ void do_push_task(struct rq *rq, struct rq *dst_rq, struct task_struct *task)
 	set_task_cpu(task, dst_rq->cpu);
 	activate_task(dst_rq, task, 0);
 }
+
+static inline
+int task_is_pushable(struct rq *rq, struct task_struct *p, int cpu)
+{
+	if (!task_on_cpu(rq, p) &&
+	    cpumask_test_cpu(cpu, &p->cpus_mask))
+		return 1;
+
+	return 0;
+}
 #endif

 #endif /* _KERNEL_SCHED_SCHED_H */
--
2.44.0.291.gc1ea87d7ee-goog

From nobody Sat Feb 7 15:05:56 2026
From nobody Sat Feb 7 15:05:56 2026
Date: Thu, 14 Mar 2024 21:39:51 -0700
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
Mime-Version: 1.0
References: <20240315044007.2778856-1-jstultz@google.com>
Message-ID: <20240315044007.2778856-7-jstultz@google.com>
Subject: [PATCH v9 6/7] sched: Split out __schedule() deactivate task logic into a helper
From: John Stultz
To: LKML
Cc: John Stultz, Joel Fernandes, Qais Yousef, Ingo Molnar, Peter Zijlstra,
	Juri Lelli, Vincent Guittot, Dietmar Eggemann, Valentin Schneider,
	Steven Rostedt, Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
	Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
	"Paul E. McKenney", Metin Kaya, Xuewen Yan, K Prateek Nayak,
	Thomas Gleixner, kernel-team@android.com

As we're going to re-use the deactivation logic, split it into a helper.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: kernel-team@android.com Signed-off-by: John Stultz Reviewed-by: Metin Kaya Tested-by: K Prateek Nayak Tested-by: Metin Kaya --- v6: * Define function as static to avoid "no previous prototype" warnings as Reported-by: kernel test robot v7: * Rename state task_state to be more clear, as suggested by Metin Kaya --- kernel/sched/core.c | 72 +++++++++++++++++++++++++++------------------ 1 file changed, 43 insertions(+), 29 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index ad4748327651..b537e5f501ea 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6563,6 +6563,48 @@ pick_next_task(struct rq *rq, struct task_struct *pr= ev, struct rq_flags *rf) # define SM_MASK_PREEMPT SM_PREEMPT #endif =20 +/* + * Helper function for __schedule() + * + * If a task does not have signals pending, deactivate it and return true + * Otherwise marks the task's __state as RUNNING and returns false + */ +static bool try_to_deactivate_task(struct rq *rq, struct task_struct *p, + unsigned long task_state) +{ + if (signal_pending_state(task_state, p)) { + WRITE_ONCE(p->__state, TASK_RUNNING); + } else { + p->sched_contributes_to_load =3D + (task_state & TASK_UNINTERRUPTIBLE) && + !(task_state & TASK_NOLOAD) && + !(task_state & TASK_FROZEN); + + if (p->sched_contributes_to_load) + rq->nr_uninterruptible++; + + /* + * __schedule() ttwu() + * prev_state =3D prev->state; if (p->on_rq && ...) + * if (prev_state) goto out; + * p->on_rq =3D 0; smp_acquire__after_ctrl_dep(); + * p->state =3D TASK_WAKING + * + * Where __schedule() and ttwu() have matching control dependencies. + * + * After this, schedule() must not care about p->state any more. + */ + deactivate_task(rq, p, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK); + + if (p->in_iowait) { + atomic_inc(&rq->nr_iowait); + delayacct_blkio_start(); + } + return true; + } + return false; +} + /* * __schedule() is the main scheduler function. * @@ -6654,35 +6696,7 @@ static void __sched notrace __schedule(unsigned int = sched_mode) */ prev_state =3D READ_ONCE(prev->__state); if (!(sched_mode & SM_MASK_PREEMPT) && prev_state) { - if (signal_pending_state(prev_state, prev)) { - WRITE_ONCE(prev->__state, TASK_RUNNING); - } else { - prev->sched_contributes_to_load =3D - (prev_state & TASK_UNINTERRUPTIBLE) && - !(prev_state & TASK_NOLOAD) && - !(prev_state & TASK_FROZEN); - - if (prev->sched_contributes_to_load) - rq->nr_uninterruptible++; - - /* - * __schedule() ttwu() - * prev_state =3D prev->state; if (p->on_rq && ...) - * if (prev_state) goto out; - * p->on_rq =3D 0; smp_acquire__after_ctrl_dep(); - * p->state =3D TASK_WAKING - * - * Where __schedule() and ttwu() have matching control dependencies. - * - * After this, schedule() must not care about p->state any more. 
-			 */
-			deactivate_task(rq, prev, DEQUEUE_SLEEP | DEQUEUE_NOCLOCK);
-
-			if (prev->in_iowait) {
-				atomic_inc(&rq->nr_iowait);
-				delayacct_blkio_start();
-			}
-		}
+		try_to_deactivate_task(rq, prev, prev_state);
 		switch_count = &prev->nvcsw;
 	}
 
-- 
2.44.0.291.gc1ea87d7ee-goog
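The helper factored out above has exactly two exits, which the following
stand-alone sketch models: a pending signal aborts the sleep and the task
is put back to TASK_RUNNING (returning false), otherwise the task is
deactivated (returning true; in the kernel this is the dequeue plus iowait
accounting). Types and names here are simplified stand-ins for the
kernel's, kept only to show the control flow.

  #include <stdbool.h>
  #include <stdio.h>

  enum state { TASK_RUNNING, TASK_INTERRUPTIBLE };

  struct task {
  	enum state state;
  	bool signal_pending;   /* models signal_pending_state() */
  };

  static bool try_to_deactivate_task(struct task *p)
  {
  	if (p->signal_pending) {
  		/* A pending signal aborts the sleep: stay runnable. */
  		p->state = TASK_RUNNING;
  		return false;
  	}
  	/* No signal: the task really goes to sleep (dequeued in the kernel). */
  	return true;
  }

  int main(void)
  {
  	struct task t = { TASK_INTERRUPTIBLE, false };

  	printf("deactivated: %d\n", try_to_deactivate_task(&t)); /* 1 */
  	t.signal_pending = true;
  	printf("deactivated: %d\n", try_to_deactivate_task(&t)); /* 0 */
  	return 0;
  }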
From nobody Sat Feb 7 15:05:56 2026
Date: Thu, 14 Mar 2024 21:39:52 -0700
In-Reply-To: <20240315044007.2778856-1-jstultz@google.com>
Mime-Version: 1.0
References: <20240315044007.2778856-1-jstultz@google.com>
Message-ID: <20240315044007.2778856-8-jstultz@google.com>
Subject: [PATCH v9 7/7] sched: Split scheduler and execution contexts
From: John Stultz
To: LKML
Cc: Peter Zijlstra, Joel Fernandes, Qais Yousef, Ingo Molnar, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
	Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
	Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
	"Paul E. McKenney", Xuewen Yan, K Prateek Nayak, Metin Kaya,
	Thomas Gleixner, kernel-team@android.com, "Connor O'Brien", John Stultz

From: Peter Zijlstra

Let's define the scheduling context as all the scheduler state in
task_struct for the task selected to run, and the execution context as
all state required to actually run the task.

Currently both are intertwined in task_struct. We want to logically
split these such that we can use the scheduling context of the task
selected to be scheduled, but use the execution context of a different
task to actually be run.

To this end, introduce the rq_selected() macro, which points to the
task_struct the scheduler selected from the runqueue and is used for
scheduler state, while rq->curr is preserved to indicate the execution
context of the task that will actually be run.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Metin Kaya
Cc: Thomas Gleixner
Cc: kernel-team@android.com
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Juri Lelli
Link: https://lkml.kernel.org/r/20181009092434.26221-5-juri.lelli@redhat.com
[add additional comments and update more sched_class code to use
 rq::proxy]
Signed-off-by: Connor O'Brien
[jstultz: Rebased and resolved minor collisions, reworked to use
 accessors, tweaked update_curr_common to use rq_proxy fixing rt
 scheduling issues]
Signed-off-by: John Stultz
Reviewed-by: Metin Kaya
Tested-by: K Prateek Nayak
Tested-by: Metin Kaya
---
v2:
* Reworked to use accessors
* Fixed update_curr_common to use proxy instead of curr
v3:
* Tweaked wrapper names
* Swapped proxy for selected for clarity
v4:
* Minor variable name tweaks for readability
* Use a macro instead of an inline function and drop other helper
  functions, as suggested by Peter.
* Remove verbose comments/questions to avoid review distractions, as
  suggested by Dietmar
v5:
* Add CONFIG_PROXY_EXEC option to this patch so the new logic can be
  tested with this change
* Minor fix to grab rq_selected when holding the rq lock
v7:
* Minor spelling fix and unused argument fixes suggested by Metin Kaya
* Switch to curr_selected for consistency, and minor rewording of
  commit message for clarity
* Rename variables selected instead of curr when we're using
  rq_selected()
* Reduce macros in CONFIG_SCHED_PROXY_EXEC ifdef sections, as
  suggested by Metin Kaya
v8:
* Use rq->curr, not rq_selected with task_tick, as suggested by
  Valentin
* Minor rework to reorder this with CONFIG_SCHED_PROXY_EXEC patch
---
 kernel/sched/core.c     | 46 ++++++++++++++++++++++++---------------
 kernel/sched/deadline.c | 35 ++++++++++++++++---------------
 kernel/sched/fair.c     | 18 ++++++++--------
 kernel/sched/rt.c       | 40 +++++++++++++++++------------------
 kernel/sched/sched.h    | 25 ++++++++++++++++++++--
 5 files changed, 99 insertions(+), 65 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b537e5f501ea..c17f91d6ceba 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -794,7 +794,7 @@ static enum hrtimer_restart hrtick(struct hrtimer *timer)
 
 	rq_lock(rq, &rf);
 	update_rq_clock(rq);
-	rq->curr->sched_class->task_tick(rq, rq->curr, 1);
+	rq_selected(rq)->sched_class->task_tick(rq, rq->curr, 1);
 	rq_unlock(rq, &rf);
 
 	return HRTIMER_NORESTART;
@@ -2238,16 +2238,18 @@ static inline void check_class_changed(struct rq *rq, struct task_struct *p,
 
 void wakeup_preempt(struct rq *rq, struct task_struct *p, int flags)
 {
-	if (p->sched_class == rq->curr->sched_class)
-		rq->curr->sched_class->wakeup_preempt(rq, p, flags);
-	else if (sched_class_above(p->sched_class, rq->curr->sched_class))
+	struct task_struct *selected = rq_selected(rq);
+
+	if (p->sched_class == selected->sched_class)
+		selected->sched_class->wakeup_preempt(rq, p, flags);
+	else if (sched_class_above(p->sched_class, selected->sched_class))
 		resched_curr(rq);
 
 	/*
 	 * A queue event has occurred, and we're going to schedule. In
 	 * this case, we can save a useless back to back clock update.
 	 */
-	if (task_on_rq_queued(rq->curr) && test_tsk_need_resched(rq->curr))
+	if (task_on_rq_queued(selected) && test_tsk_need_resched(rq->curr))
 		rq_clock_skip_update(rq);
 }
 
@@ -2774,7 +2776,7 @@ __do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx)
 	lockdep_assert_held(&p->pi_lock);
 
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 
 	if (queued) {
 		/*
@@ -5587,7 +5589,7 @@ unsigned long long task_sched_runtime(struct task_struct *p)
 	 * project cycles that may never be accounted to this
 	 * thread, breaking clock_gettime().
 	 */
-	if (task_current(rq, p) && task_on_rq_queued(p)) {
+	if (task_current_selected(rq, p) && task_on_rq_queued(p)) {
 		prefetch_curr_exec_start(p);
 		update_rq_clock(rq);
 		p->sched_class->update_curr(rq);
@@ -5655,7 +5657,8 @@ void scheduler_tick(void)
 {
 	int cpu = smp_processor_id();
 	struct rq *rq = cpu_rq(cpu);
-	struct task_struct *curr = rq->curr;
+	/* accounting goes to the selected task */
+	struct task_struct *selected;
 	struct rq_flags rf;
 	unsigned long thermal_pressure;
 	u64 resched_latency;
@@ -5666,16 +5669,17 @@ void scheduler_tick(void)
 	sched_clock_tick();
 
 	rq_lock(rq, &rf);
+	selected = rq_selected(rq);
 
 	update_rq_clock(rq);
 	thermal_pressure = arch_scale_thermal_pressure(cpu_of(rq));
 	update_thermal_load_avg(rq_clock_thermal(rq), rq, thermal_pressure);
-	curr->sched_class->task_tick(rq, curr, 0);
+	selected->sched_class->task_tick(rq, selected, 0);
 	if (sched_feat(LATENCY_WARN))
 		resched_latency = cpu_resched_latency(rq);
 	calc_global_load_tick(rq);
 	sched_core_tick(rq);
-	task_tick_mm_cid(rq, curr);
+	task_tick_mm_cid(rq, selected);
 
 	rq_unlock(rq, &rf);
 
@@ -5684,8 +5688,8 @@ void scheduler_tick(void)
 
 	perf_event_task_tick();
 
-	if (curr->flags & PF_WQ_WORKER)
-		wq_worker_tick(curr);
+	if (selected->flags & PF_WQ_WORKER)
+		wq_worker_tick(selected);
 
 #ifdef CONFIG_SMP
 	rq->idle_balance = idle_cpu(cpu);
@@ -5750,6 +5754,12 @@ static void sched_tick_remote(struct work_struct *work)
 		struct task_struct *curr = rq->curr;
 
 		if (cpu_online(cpu)) {
+			/*
+			 * Since this is a remote tick for full dynticks mode,
+			 * we are always sure that there is no proxy (only a
+			 * single task is running).
+			 */
+			SCHED_WARN_ON(rq->curr != rq_selected(rq));
 			update_rq_clock(rq);
 
 			if (!is_idle_task(curr)) {
@@ -6701,6 +6711,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	}
 
 	next = pick_next_task(rq, prev, &rf);
+	rq_set_selected(rq, next);
 	clear_tsk_need_resched(prev);
 	clear_preempt_need_resched();
 #ifdef CONFIG_SCHED_DEBUG
@@ -7201,7 +7212,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
 
 	prev_class = p->sched_class;
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 	if (queued)
 		dequeue_task(rq, p, queue_flag);
 	if (running)
@@ -7291,7 +7302,7 @@ void set_user_nice(struct task_struct *p, long nice)
 	}
 
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 	if (queued)
 		dequeue_task(rq, p, DEQUEUE_SAVE | DEQUEUE_NOCLOCK);
 	if (running)
@@ -7870,7 +7881,7 @@ static int __sched_setscheduler(struct task_struct *p,
 	}
 
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 	if (queued)
 		dequeue_task(rq, p, queue_flags);
 	if (running)
@@ -9297,6 +9308,7 @@ void __init init_idle(struct task_struct *idle, int cpu)
 	rcu_read_unlock();
 
 	rq->idle = idle;
+	rq_set_selected(rq, idle);
 	rcu_assign_pointer(rq->curr, idle);
 	idle->on_rq = TASK_ON_RQ_QUEUED;
 #ifdef CONFIG_SMP
@@ -9386,7 +9398,7 @@ void sched_setnuma(struct task_struct *p, int nid)
 
 	rq = task_rq_lock(p, &rf);
 	queued = task_on_rq_queued(p);
-	running = task_current(rq, p);
+	running = task_current_selected(rq, p);
 
 	if (queued)
 		dequeue_task(rq, p, DEQUEUE_SAVE);
@@ -10491,7 +10503,7 @@ void sched_move_task(struct task_struct *tsk)
 
 	update_rq_clock(rq);
 
-	running = task_current(rq, tsk);
+	running = task_current_selected(rq, tsk);
 	queued = task_on_rq_queued(tsk);
 
 	if (queued)
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 1b9cdb507498..c30b592d6e9d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1218,7 +1218,7 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 #endif
 
 	enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
-	if (dl_task(rq->curr))
+	if (dl_task(rq_selected(rq)))
 		wakeup_preempt_dl(rq, p, 0);
 	else
 		resched_curr(rq);
@@ -1442,7 +1442,7 @@ void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
  */
 static void update_curr_dl(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	struct sched_dl_entity *dl_se = &curr->dl;
 	s64 delta_exec;
 
@@ -1899,7 +1899,7 @@ static int find_later_rq(struct task_struct *task);
 static int
 select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 {
-	struct task_struct *curr;
+	struct task_struct *curr, *selected;
 	bool select_rq;
 	struct rq *rq;
 
@@ -1910,6 +1910,7 @@ select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 
 	rcu_read_lock();
 	curr = READ_ONCE(rq->curr); /* unlocked access */
+	selected = READ_ONCE(rq_selected(rq));
 
 	/*
 	 * If we are dealing with a -deadline task, we must
@@ -1920,9 +1921,9 @@ select_task_rq_dl(struct task_struct *p, int cpu, int flags)
 	 * other hand, if it has a shorter deadline, we
 	 * try to make it stay here, it might be important.
 	 */
-	select_rq = unlikely(dl_task(curr)) &&
+	select_rq = unlikely(dl_task(selected)) &&
 		    (curr->nr_cpus_allowed < 2 ||
-		     !dl_entity_preempt(&p->dl, &curr->dl)) &&
+		     !dl_entity_preempt(&p->dl, &selected->dl)) &&
 		    p->nr_cpus_allowed > 1;
 
 	/*
@@ -1985,7 +1986,7 @@ static void check_preempt_equal_dl(struct rq *rq, struct task_struct *p)
 	 * let's hope p can move out.
 	 */
 	if (rq->curr->nr_cpus_allowed == 1 ||
-	    !cpudl_find(&rq->rd->cpudl, rq->curr, NULL))
+	    !cpudl_find(&rq->rd->cpudl, rq_selected(rq), NULL))
 		return;
 
 	/*
@@ -2024,7 +2025,7 @@ static int balance_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
 static void wakeup_preempt_dl(struct rq *rq, struct task_struct *p, int flags)
 {
-	if (dl_entity_preempt(&p->dl, &rq->curr->dl)) {
+	if (dl_entity_preempt(&p->dl, &rq_selected(rq)->dl)) {
 		resched_curr(rq);
 		return;
 	}
@@ -2034,7 +2035,7 @@ static void wakeup_preempt_dl(struct rq *rq, struct task_struct *p,
 	 * In the unlikely case current and p have the same deadline
 	 * let us try to decide what's the best thing to do...
 	 */
-	if ((p->dl.deadline == rq->curr->dl.deadline) &&
+	if ((p->dl.deadline == rq_selected(rq)->dl.deadline) &&
 	    !test_tsk_need_resched(rq->curr))
 		check_preempt_equal_dl(rq, p);
 #endif /* CONFIG_SMP */
@@ -2066,7 +2067,7 @@ static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool first)
 	if (!first)
 		return;
 
-	if (rq->curr->sched_class != &dl_sched_class)
+	if (rq_selected(rq)->sched_class != &dl_sched_class)
 		update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 0);
 
 	deadline_queue_push_tasks(rq);
@@ -2391,8 +2392,8 @@ static int push_dl_task(struct rq *rq)
 	 * can move away, it makes sense to just reschedule
 	 * without going further in pushing next_task.
 	 */
-	if (dl_task(rq->curr) &&
-	    dl_time_before(next_task->dl.deadline, rq->curr->dl.deadline) &&
+	if (dl_task(rq_selected(rq)) &&
+	    dl_time_before(next_task->dl.deadline, rq_selected(rq)->dl.deadline) &&
 	    rq->curr->nr_cpus_allowed > 1) {
 		resched_curr(rq);
 		return 0;
@@ -2515,7 +2516,7 @@ static void pull_dl_task(struct rq *this_rq)
 			 * deadline than the current task of its runqueue.
 			 */
 			if (dl_time_before(p->dl.deadline,
-					   src_rq->curr->dl.deadline))
+					   rq_selected(src_rq)->dl.deadline))
 				goto skip;
 
 			if (is_migration_disabled(p)) {
@@ -2554,9 +2555,9 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p)
 	if (!task_on_cpu(rq, p) &&
 	    !test_tsk_need_resched(rq->curr) &&
 	    p->nr_cpus_allowed > 1 &&
-	    dl_task(rq->curr) &&
+	    dl_task(rq_selected(rq)) &&
 	    (rq->curr->nr_cpus_allowed < 2 ||
-	     !dl_entity_preempt(&p->dl, &rq->curr->dl))) {
+	     !dl_entity_preempt(&p->dl, &rq_selected(rq)->dl))) {
 		push_dl_tasks(rq);
 	}
@@ -2731,12 +2732,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 		return;
 	}
 
-	if (rq->curr != p) {
+	if (rq_selected(rq) != p) {
 #ifdef CONFIG_SMP
 		if (p->nr_cpus_allowed > 1 && rq->dl.overloaded)
 			deadline_queue_push_tasks(rq);
 #endif
-		if (dl_task(rq->curr))
+		if (dl_task(rq_selected(rq)))
 			wakeup_preempt_dl(rq, p, 0);
 		else
 			resched_curr(rq);
@@ -2765,7 +2766,7 @@ static void prio_changed_dl(struct rq *rq, struct task_struct *p,
 	if (!rq->dl.overloaded)
 		deadline_queue_pull_task(rq);
 
-	if (task_current(rq, p)) {
+	if (task_current_selected(rq, p)) {
 		/*
 		 * If we now have a earlier deadline task than p,
 		 * then reschedule, provided p is still on this
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 533547e3c90a..dc342e1fc420 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1140,7 +1140,7 @@ static inline void update_curr_task(struct task_struct *p, s64 delta_exec)
  */
 s64 update_curr_common(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	s64 delta_exec;
 
 	delta_exec = update_curr_se(rq, &curr->se);
@@ -1177,7 +1177,7 @@ static void update_curr(struct cfs_rq *cfs_rq)
 
 static void update_curr_fair(struct rq *rq)
 {
-	update_curr(cfs_rq_of(&rq->curr->se));
+	update_curr(cfs_rq_of(&rq_selected(rq)->se));
 }
 
@@ -6627,7 +6627,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
 		s64 delta = slice - ran;
 
 		if (delta < 0) {
-			if (task_current(rq, p))
+			if (task_current_selected(rq, p))
 				resched_curr(rq);
 			return;
 		}
@@ -6642,7 +6642,7 @@ static void hrtick_start_fair(struct rq *rq, struct task_struct *p)
  */
 static void hrtick_update(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 
 	if (!hrtick_enabled_fair(rq) || curr->sched_class != &fair_sched_class)
 		return;
@@ -8267,7 +8267,7 @@ static void set_next_buddy(struct sched_entity *se)
  */
 static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int wake_flags)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	struct sched_entity *se = &curr->se, *pse = &p->se;
 	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
 	int cse_is_idle, pse_is_idle;
@@ -8298,7 +8298,7 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
 	 * prevents us from potentially nominating it as a false LAST_BUDDY
 	 * below.
 	 */
-	if (test_tsk_need_resched(curr))
+	if (test_tsk_need_resched(rq->curr))
 		return;
 
 	/* Idle tasks are by definition preempted by non-idle tasks. */
@@ -9282,7 +9282,7 @@ static bool __update_blocked_others(struct rq *rq, bool *done)
 	 * update_load_avg() can call cpufreq_update_util(). Make sure that RT,
 	 * DL and IRQ signals have been updated before updating CFS.
 	 */
-	curr_class = rq->curr->sched_class;
+	curr_class = rq_selected(rq)->sched_class;
 
 	thermal_pressure = arch_scale_thermal_pressure(cpu_of(rq));
 
@@ -12673,7 +12673,7 @@ prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio)
 	 * our priority decreased, or if we are not currently running on
 	 * this runqueue and our priority is higher than the current's
 	 */
-	if (task_current(rq, p)) {
+	if (task_current_selected(rq, p)) {
 		if (p->prio > oldprio)
 			resched_curr(rq);
 	} else
@@ -12776,7 +12776,7 @@ static void switched_to_fair(struct rq *rq, struct task_struct *p)
 		 * kick off the schedule if running, otherwise just see
 		 * if we can still preempt the current task.
 		 */
-		if (task_current(rq, p))
+		if (task_current_selected(rq, p))
 			resched_curr(rq);
 		else
 			wakeup_preempt(rq, p, 0);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 638e7c158ae4..48fc7a198f1a 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -530,7 +530,7 @@ static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 
 static void sched_rt_rq_enqueue(struct rt_rq *rt_rq)
 {
-	struct task_struct *curr = rq_of_rt_rq(rt_rq)->curr;
+	struct task_struct *curr = rq_selected(rq_of_rt_rq(rt_rq));
 	struct rq *rq = rq_of_rt_rq(rt_rq);
 	struct sched_rt_entity *rt_se;
 
@@ -1000,7 +1000,7 @@ static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq)
  */
 static void update_curr_rt(struct rq *rq)
 {
-	struct task_struct *curr = rq->curr;
+	struct task_struct *curr = rq_selected(rq);
 	struct sched_rt_entity *rt_se = &curr->rt;
 	s64 delta_exec;
 
@@ -1543,7 +1543,7 @@ static int find_lowest_rq(struct task_struct *task);
 static int
 select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 {
-	struct task_struct *curr;
+	struct task_struct *curr, *selected;
 	struct rq *rq;
 	bool test;
 
@@ -1555,6 +1555,7 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 
 	rcu_read_lock();
 	curr = READ_ONCE(rq->curr); /* unlocked access */
+	selected = READ_ONCE(rq_selected(rq));
 
 	/*
 	 * If the current task on @p's runqueue is an RT task, then
@@ -1583,8 +1584,8 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 	 * systems like big.LITTLE.
 	 */
 	test = curr &&
-	       unlikely(rt_task(curr)) &&
-	       (curr->nr_cpus_allowed < 2 || curr->prio <= p->prio);
+	       unlikely(rt_task(selected)) &&
+	       (curr->nr_cpus_allowed < 2 || selected->prio <= p->prio);
 
 	if (test || !rt_task_fits_capacity(p, cpu)) {
 		int target = find_lowest_rq(p);
@@ -1614,12 +1615,8 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 
 static void check_preempt_equal_prio(struct rq *rq, struct task_struct *p)
 {
-	/*
-	 * Current can't be migrated, useless to reschedule,
-	 * let's hope p can move out.
-	 */
 	if (rq->curr->nr_cpus_allowed == 1 ||
-	    !cpupri_find(&rq->rd->cpupri, rq->curr, NULL))
+	    !cpupri_find(&rq->rd->cpupri, rq_selected(rq), NULL))
 		return;
 
 	/*
@@ -1662,7 +1659,9 @@ static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
 */
 static void wakeup_preempt_rt(struct rq *rq, struct task_struct *p, int flags)
 {
-	if (p->prio < rq->curr->prio) {
+	struct task_struct *curr = rq_selected(rq);
+
+	if (p->prio < curr->prio) {
 		resched_curr(rq);
 		return;
 	}
@@ -1680,7 +1679,7 @@ static void wakeup_preempt_rt(struct rq *rq, struct task_struct *p, int flags)
 	 * to move current somewhere else, making room for our non-migratable
 	 * task.
 	 */
-	if (p->prio == rq->curr->prio && !test_tsk_need_resched(rq->curr))
+	if (p->prio == curr->prio && !test_tsk_need_resched(rq->curr))
 		check_preempt_equal_prio(rq, p);
 #endif
 }
@@ -1705,7 +1704,7 @@ static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
 	 * utilization. We only care of the case where we start to schedule a
 	 * rt task
 	 */
-	if (rq->curr->sched_class != &rt_sched_class)
+	if (rq_selected(rq)->sched_class != &rt_sched_class)
 		update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 0);
 
 	rt_queue_push_tasks(rq);
@@ -1977,6 +1976,7 @@ static struct task_struct *pick_next_pushable_task(struct rq *rq)
 
 	BUG_ON(rq->cpu != task_cpu(p));
 	BUG_ON(task_current(rq, p));
+	BUG_ON(task_current_selected(rq, p));
 	BUG_ON(p->nr_cpus_allowed <= 1);
 
 	BUG_ON(!task_on_rq_queued(p));
@@ -2009,7 +2009,7 @@ static int push_rt_task(struct rq *rq, bool pull)
 	 * higher priority than current. If that's the case
 	 * just reschedule current.
 	 */
-	if (unlikely(next_task->prio < rq->curr->prio)) {
+	if (unlikely(next_task->prio < rq_selected(rq)->prio)) {
 		resched_curr(rq);
 		return 0;
 	}
@@ -2362,7 +2362,7 @@ static void pull_rt_task(struct rq *this_rq)
 			 * p if it is lower in priority than the
 			 * current task on the run queue
 			 */
-			if (p->prio < src_rq->curr->prio)
+			if (p->prio < rq_selected(src_rq)->prio)
 				goto skip;
 
 			if (is_migration_disabled(p)) {
@@ -2404,9 +2404,9 @@ static void task_woken_rt(struct rq *rq, struct task_struct *p)
 	bool need_to_push = !task_on_cpu(rq, p) &&
 			    !test_tsk_need_resched(rq->curr) &&
 			    p->nr_cpus_allowed > 1 &&
-			    (dl_task(rq->curr) || rt_task(rq->curr)) &&
+			    (dl_task(rq_selected(rq)) || rt_task(rq_selected(rq))) &&
 			    (rq->curr->nr_cpus_allowed < 2 ||
-			     rq->curr->prio <= p->prio);
+			     rq_selected(rq)->prio <= p->prio);
 
 	if (need_to_push)
 		push_rt_tasks(rq);
@@ -2490,7 +2490,7 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p)
 		if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
 			rt_queue_push_tasks(rq);
 #endif /* CONFIG_SMP */
-		if (p->prio < rq->curr->prio && cpu_online(cpu_of(rq)))
+		if (p->prio < rq_selected(rq)->prio && cpu_online(cpu_of(rq)))
 			resched_curr(rq);
 	}
 }
@@ -2505,7 +2505,7 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio)
 	if (!task_on_rq_queued(p))
 		return;
 
-	if (task_current(rq, p)) {
+	if (task_current_selected(rq, p)) {
 #ifdef CONFIG_SMP
 		/*
 		 * If our priority decreases while running, we
@@ -2531,7 +2531,7 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio)
 		 * greater than the current running task
 		 * then reschedule.
 		 */
-		if (p->prio < rq->curr->prio)
+		if (p->prio < rq_selected(rq)->prio)
 			resched_curr(rq);
 	}
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c83e5e0672dc..808d6ee8ae33 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1030,7 +1030,7 @@ struct rq {
 	 */
 	unsigned int		nr_uninterruptible;
 
-	struct task_struct __rcu	*curr;
+	struct task_struct __rcu	*curr; /* Execution context */
 	struct task_struct	*idle;
 	struct task_struct	*stop;
 	unsigned long		next_balance;
@@ -1225,6 +1225,13 @@ DECLARE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 #define raw_rq()		raw_cpu_ptr(&runqueues)
 
+/* For now, rq_selected == rq->curr */
+#define rq_selected(rq)		((rq)->curr)
+static inline void rq_set_selected(struct rq *rq, struct task_struct *t)
+{
+	/* Do nothing */
+}
+
 struct sched_group;
 #ifdef CONFIG_SCHED_CORE
 static inline struct cpumask *sched_group_span(struct sched_group *sg);
@@ -2148,11 +2155,25 @@ static inline u64 global_rt_runtime(void)
 	return (u64)sysctl_sched_rt_runtime * NSEC_PER_USEC;
 }
 
+/*
+ * Is p the current execution context?
+ */
 static inline int task_current(struct rq *rq, struct task_struct *p)
 {
 	return rq->curr == p;
 }
 
+/*
+ * Is p the current scheduling context?
+ *
+ * Note that it might be the current execution context at the same time if
+ * rq->curr == rq_selected() == p.
+ */
+static inline int task_current_selected(struct rq *rq, struct task_struct *p)
+{
+	return rq_selected(rq) == p;
+}
+
 static inline int task_on_cpu(struct rq *rq, struct task_struct *p)
 {
 #ifdef CONFIG_SMP
@@ -2322,7 +2343,7 @@ struct sched_class {
 
 static inline void put_prev_task(struct rq *rq, struct task_struct *prev)
 {
-	WARN_ON_ONCE(rq->curr != prev);
+	WARN_ON_ONCE(rq_selected(rq) != prev);
 	prev->sched_class->put_prev_task(rq, prev);
 }
 
-- 
2.44.0.291.gc1ea87d7ee-goog
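The intent of the split can be pictured with a small stand-alone model:
scheduler state (priority, runtime accounting) follows the selected task,
while "who is on the CPU" follows curr. In this patch the two still
coincide (rq_selected() is simply defined as rq->curr); the sketch below
shows the divergence a later proxy-execution patch would enable. All
names here are illustrative stand-ins, not the kernel's types.

  #include <stdio.h>

  struct task { const char *name; int prio; };

  struct rq {
  	struct task *curr;      /* execution context: what actually runs */
  	struct task *selected;  /* scheduling context: donor of sched state */
  };

  /* Accounting and preemption decisions follow the selected task... */
  static int rq_prio(const struct rq *rq) { return rq->selected->prio; }
  /* ...while "who is on the CPU" questions follow curr. */
  static const char *rq_running(const struct rq *rq) { return rq->curr->name; }

  int main(void)
  {
  	struct task owner = { "lock_owner", 120 };
  	struct task waiter = { "rt_waiter", 10 };
  	/* Proxy execution: run the owner using the waiter's sched state. */
  	struct rq rq = { .curr = &owner, .selected = &waiter };

  	printf("running %s at prio %d\n", rq_running(&rq), rq_prio(&rq));
  	return 0;
  }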