From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A3CBC33F5B9 for ; Sat, 4 Apr 2026 05:36:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281005; cv=none; b=kOSqzqQEBzqMNhUkvfU/ds2HAT9MgnovwrYwpsxoSe6VywjI+/mU1ZeSSF+HZU8WIMzliHylVUvXmuBG0gmJrBIQU5ivrd6Em4n7Tq6J3FaGELmIEBsJJplzYyI6oGFeEcNX8Vps+xT+WAuoSLvIn+Q1Lw/zdlWlsPUGmT3r60U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281005; c=relaxed/simple; bh=F28EPZQCQ4tn6kf812QuP2Zs7jpnRL5y3AEn1HmHCmk=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=NWOclH6n56dHo84hwEPyKbdx2hpzf5E0n3nVMJMa839/8JyyX3/Wk2fMwWImkMQoU4IhYh1tiAMHGShBQemsSpZ9P4YxB2Fr1WlODJhUHTLfJ6i8Dw4vCeP7gPAxKYmPE2B47UwhjCJ6lLZOsOHdl22T0syLyByjqHuolBq7nvw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=FzCPX+YT; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="FzCPX+YT" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2b24305cb3cso30020095ad.2 for ; Fri, 03 Apr 2026 22:36:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281003; x=1775885803; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ZXo/AxJOuI/DokbpiWHMF1CpoLIvV/d7kD4GGDbrjcc=; b=FzCPX+YTJceXfG7JR8kpD6BKoFy0tBN+2RRde717M6darjHPikxr++BKbm74XACteJ mW74r/1PNhFXj/L7avMwmkb7x2yCWqF1HIWZ+MbjHqCHylDzTg5s02jmhpr4UQodfldZ wYP1JR+RdwhEqTWGWabaLxXFQ36QPgXRjHj1+g2bjNw1ja5yQ5fqLYZ1qY3AtOHnVCTH qTeBuckHz40vnfbgCfVMspJ7O0CklBc/3vpqm0aqQ3ivdn5hQAVfHUB2qaKMZancvU95 VB9TW8knpDbv4L2iLozMB4Y8oEawb/u+KxyQUDzJ9eiPvSFcj6KjLQUzi5yjN4ySamUj LzGA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281003; x=1775885803; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ZXo/AxJOuI/DokbpiWHMF1CpoLIvV/d7kD4GGDbrjcc=; b=tBXd/vQESh15i7ejRV+6QEXqEQxlkNBULDpzk301Fvi2UC964CRVA9jaQxYsKbTAJu yMkCCAUxu+RAO7qnhsLAJhy/LFAtWea8AUuBWc8UyIfgb+3Sv3TRwGo43TGWdPa7AZ8X HImZXaRwMNyxgRqYdLuouRxwZ8yciG9WL1zahLrilB81PFCeNee4adyXgAg8LT5VR+zx 2sn1XqaDA/pySvb89HASBqXag9bw9eDHGzZOUpBfIqFvEGnD0bN721mX36JpwaTYNlJA w26Q5A30Li/+ftKTKOp0pjpKHVNdEYr2ceQbJyWXv6IoiANTiYcYd55BhMxNRBgO7YIs KUUA== X-Gm-Message-State: AOJu0YxG590e+JMTNXoimnhimKrN/tM7Nv9QCDi9VOQBSGQYJ7KfkaXA CSqjnSI3MP8HwT+6POxmovtCKuTJOn291bJgI1ccrHIIIHCF4gmddfFZpQw4rFWu0FvaztFh2DI ZR4ppsb+D1Y/jrnKRlk93tgIcBxihQJNkesFoKG583fXKeA8W2HFKobYHJpTxv7iO5H1qxesDzE pB2Vm8F+apyqow23vlSZHpi05RIm/XAbI5aAPdmXHDdR6fIYqg X-Received: from pgkk66.prod.google.com ([2002:a63:2445:0:b0:c76:3cb2:929c]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:12c3:b0:398:7982:21df with SMTP id adf61e73a8af0-39f2eda85d8mr5037968637.9.1775281002193; Fri, 03 Apr 2026 22:36:42 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:18 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: 
<20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-2-jstultz@google.com> Subject: [PATCH v27 01/10] sched: Rework pick_next_task() and prev_balance() to avoid stale prev references From: John Stultz To: LKML Cc: John Stultz , Joel Fernandes , Qais Yousef , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Historically, the prev value from __schedule() was the rq->curr. This prev value is passed down through numerous functions, and used in the class scheduler implementations. The fact that prev was on_cpu until the end of __schedule() meant it was stable across the rq lock drops that the class->pick_next_task() and ->balance() implementations often do.

However, with proxy-exec, the prev passed to functions called by __schedule() is rq->donor, which may not be the same as rq->curr and may not be on_cpu. This makes the prev value potentially unstable across rq lock drops.

A recently found issue with proxy-exec is that when we begin doing return migration from try_to_wake_up(), it's possible we may be waking up the rq->donor. When we do this, we call proxy_resched_idle() to put_prev_set_next(), setting the rq->donor to rq->idle, allowing the rq->donor to be return migrated and allowed to run.

This however runs into trouble, as on another cpu we might be in the middle of calling __schedule(). Conceptually the rq lock is held for the majority of the time, but in calling pick_next_task() it's possible the class->pick_next_task() handler or the ->balance() call may briefly drop the rq lock.
This opens a window for try_to_wake_up() to wake and return migrate the rq->donor before the class logic reacquires the rq lock.

Unfortunately pick_next_task() and prev_balance() pass in a prev argument, to which we pass rq->donor. However this prev value can now become stale and incorrect across a rq lock drop.

So, to correct this, rework the pick_next_task() and prev_balance() calls so that they do not take a "prev" argument.

Also rework the class ->pick_next_task() and ->balance() implementations to drop the prev argument and, in the cases where it was used, have the class functions reference rq->donor directly, without saving the value across rq lock drops, so that we don't end up with stale references.

Signed-off-by: John Stultz
---
Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. 
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- kernel/sched/core.c | 37 ++++++++++++++++++------------------- kernel/sched/deadline.c | 8 +++++++- kernel/sched/fair.c | 9 +++++++-- kernel/sched/idle.c | 2 +- kernel/sched/rt.c | 8 +++++++- kernel/sched/sched.h | 10 ++++------ kernel/sched/stop_task.c | 2 +- 7 files changed, 45 insertions(+), 31 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index c15c9865299e7..9c8a769a6d109 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5907,10 +5907,9 @@ static inline void schedule_debug(struct task_struct= *prev, bool preempt) schedstat_inc(this_rq()->sched_count); } =20 -static void prev_balance(struct rq *rq, struct task_struct *prev, - struct rq_flags *rf) +static void prev_balance(struct rq *rq, struct rq_flags *rf) { - const struct sched_class *start_class =3D prev->sched_class; + const struct sched_class *start_class =3D rq->donor->sched_class; const struct sched_class *class; =20 /* @@ -5922,7 +5921,7 @@ static void prev_balance(struct rq *rq, struct task_s= truct *prev, * a runnable task of @class priority or higher. */ for_active_class_range(class, start_class, &idle_sched_class) { - if (class->balance && class->balance(rq, prev, rf)) + if (class->balance && class->balance(rq, rf)) break; } } @@ -5931,7 +5930,7 @@ static void prev_balance(struct rq *rq, struct task_s= truct *prev, * Pick up the highest-prio task: */ static inline struct task_struct * -__pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags = *rf) +__pick_next_task(struct rq *rq, struct rq_flags *rf) __must_hold(__rq_lockp(rq)) { const struct sched_class *class; @@ -5948,28 +5947,28 @@ __pick_next_task(struct rq *rq, struct task_struct = *prev, struct rq_flags *rf) * higher scheduling class, because otherwise those lose the * opportunity to pull in more work from other CPUs. 
*/ - if (likely(!sched_class_above(prev->sched_class, &fair_sched_class) && + if (likely(!sched_class_above(rq->donor->sched_class, &fair_sched_class) = && rq->nr_running =3D=3D rq->cfs.h_nr_queued)) { =20 - p =3D pick_next_task_fair(rq, prev, rf); + p =3D pick_next_task_fair(rq, rf); if (unlikely(p =3D=3D RETRY_TASK)) goto restart; =20 /* Assume the next prioritized class is idle_sched_class */ if (!p) { p =3D pick_task_idle(rq, rf); - put_prev_set_next_task(rq, prev, p); + put_prev_set_next_task(rq, rq->donor, p); } =20 return p; } =20 restart: - prev_balance(rq, prev, rf); + prev_balance(rq, rf); =20 for_each_active_class(class) { if (class->pick_next_task) { - p =3D class->pick_next_task(rq, prev, rf); + p =3D class->pick_next_task(rq, rf); if (unlikely(p =3D=3D RETRY_TASK)) goto restart; if (p) @@ -5979,7 +5978,7 @@ __pick_next_task(struct rq *rq, struct task_struct *p= rev, struct rq_flags *rf) if (unlikely(p =3D=3D RETRY_TASK)) goto restart; if (p) { - put_prev_set_next_task(rq, prev, p); + put_prev_set_next_task(rq, rq->donor, p); return p; } } @@ -6032,7 +6031,7 @@ extern void task_vruntime_update(struct rq *rq, struc= t task_struct *p, bool in_f static void queue_core_balance(struct rq *rq); =20 static struct task_struct * -pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *r= f) +pick_next_task(struct rq *rq, struct rq_flags *rf) __must_hold(__rq_lockp(rq)) { struct task_struct *next, *p, *max; @@ -6045,7 +6044,7 @@ pick_next_task(struct rq *rq, struct task_struct *pre= v, struct rq_flags *rf) bool need_sync; =20 if (!sched_core_enabled(rq)) - return __pick_next_task(rq, prev, rf); + return __pick_next_task(rq, rf); =20 cpu =3D cpu_of(rq); =20 @@ -6058,7 +6057,7 @@ pick_next_task(struct rq *rq, struct task_struct *pre= v, struct rq_flags *rf) */ rq->core_pick =3D NULL; rq->core_dl_server =3D NULL; - return __pick_next_task(rq, prev, rf); + return __pick_next_task(rq, rf); } =20 /* @@ -6082,7 +6081,7 @@ pick_next_task(struct rq *rq, 
struct task_struct *pre= v, struct rq_flags *rf) goto out_set_next; } =20 - prev_balance(rq, prev, rf); + prev_balance(rq, rf); =20 smt_mask =3D cpu_smt_mask(cpu); need_sync =3D !!rq->core->core_cookie; @@ -6264,7 +6263,7 @@ pick_next_task(struct rq *rq, struct task_struct *pre= v, struct rq_flags *rf) } =20 out_set_next: - put_prev_set_next_task(rq, prev, next); + put_prev_set_next_task(rq, rq->donor, next); if (rq->core->core_forceidle_count && next =3D=3D rq->idle) queue_core_balance(rq); =20 @@ -6487,10 +6486,10 @@ static inline void sched_core_cpu_deactivate(unsign= ed int cpu) {} static inline void sched_core_cpu_dying(unsigned int cpu) {} =20 static struct task_struct * -pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *r= f) +pick_next_task(struct rq *rq, struct rq_flags *rf) __must_hold(__rq_lockp(rq)) { - return __pick_next_task(rq, prev, rf); + return __pick_next_task(rq, rf); } =20 #endif /* !CONFIG_SCHED_CORE */ @@ -7038,7 +7037,7 @@ static void __sched notrace __schedule(int sched_mode) =20 pick_again: assert_balance_callbacks_empty(rq); - next =3D pick_next_task(rq, rq->donor, &rf); + next =3D pick_next_task(rq, &rf); rq->next_class =3D next->sched_class; if (sched_proxy_exec()) { struct task_struct *prev_donor =3D rq->donor; diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 27359a1e995f9..7352506208287 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2509,8 +2509,14 @@ static void check_preempt_equal_dl(struct rq *rq, st= ruct task_struct *p) resched_curr(rq); } =20 -static int balance_dl(struct rq *rq, struct task_struct *p, struct rq_flag= s *rf) +static int balance_dl(struct rq *rq, struct rq_flags *rf) { + /* + * Note, rq->donor may change during rq lock drops, + * so don't re-use prev across lock drops + */ + struct task_struct *p =3D rq->donor; + if (!on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) { /* * This is OK, because current is on_cpu, which avoids it being diff --git 
a/kernel/sched/fair.c b/kernel/sched/fair.c index 597ce5b718d26..4a6669c517dae 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9153,14 +9153,19 @@ static void __set_next_task_fair(struct rq *rq, str= uct task_struct *p, bool firs static void set_next_task_fair(struct rq *rq, struct task_struct *p, bool = first); =20 struct task_struct * -pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_fla= gs *rf) +pick_next_task_fair(struct rq *rq, struct rq_flags *rf) __must_hold(__rq_lockp(rq)) { struct sched_entity *se; - struct task_struct *p; + struct task_struct *p, *prev; int new_tasks; =20 again: + /* + * Re-read rq->donor at the top as it may have + * changed across a rq lock drop + */ + prev =3D rq->donor; p =3D pick_task_fair(rq, rf); if (!p) goto idle; diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index a83be0c834ddb..ff39120d723a9 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -462,7 +462,7 @@ select_task_rq_idle(struct task_struct *p, int cpu, int= flags) } =20 static int -balance_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) +balance_idle(struct rq *rq, struct rq_flags *rf) { return WARN_ON_ONCE(1); } diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 4e5f1957b91b1..3fd03a836731e 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1596,8 +1596,14 @@ static void check_preempt_equal_prio(struct rq *rq, = struct task_struct *p) resched_curr(rq); } =20 -static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flag= s *rf) +static int balance_rt(struct rq *rq, struct rq_flags *rf) { + /* + * Note, rq->donor may change during rq lock drops, + * so don't re-use p across lock drops + */ + struct task_struct *p =3D rq->donor; + if (!on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) { /* * This is OK, because current is on_cpu, which avoids it being diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 9594355a36811..8ee82b03a8a10 100644 --- a/kernel/sched/sched.h 
+++ b/kernel/sched/sched.h @@ -2550,7 +2550,7 @@ struct sched_class { /* * schedule/pick_next_task/prev_balance: rq->lock */ - int (*balance)(struct rq *rq, struct task_struct *prev, struct rq_flags *= rf); + int (*balance)(struct rq *rq, struct rq_flags *rf); =20 /* * schedule/pick_next_task: rq->lock @@ -2561,12 +2561,11 @@ struct sched_class { * * next =3D pick_task(); * if (next) { - * put_prev_task(prev); + * put_prev_task(rq->donor); * set_next_task_first(next); * } */ - struct task_struct *(*pick_next_task)(struct rq *rq, struct task_struct *= prev, - struct rq_flags *rf); + struct task_struct *(*pick_next_task)(struct rq *rq, struct rq_flags *rf); =20 /* * sched_change: @@ -2790,8 +2789,7 @@ static inline bool sched_fair_runnable(struct rq *rq) return rq->cfs.nr_queued > 0; } =20 -extern struct task_struct *pick_next_task_fair(struct rq *rq, struct task_= struct *prev, - struct rq_flags *rf); +extern struct task_struct *pick_next_task_fair(struct rq *rq, struct rq_fl= ags *rf); extern struct task_struct *pick_task_idle(struct rq *rq, struct rq_flags *= rf); =20 #define SCA_CHECK 0x01 diff --git a/kernel/sched/stop_task.c b/kernel/sched/stop_task.c index f95798baddebb..c909ca0d8c87c 100644 --- a/kernel/sched/stop_task.c +++ b/kernel/sched/stop_task.c @@ -16,7 +16,7 @@ select_task_rq_stop(struct task_struct *p, int cpu, int f= lags) } =20 static int -balance_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf) +balance_stop(struct rq *rq, struct rq_flags *rf) { return sched_stop_runnable(rq); } --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F01EF373BFC for ; Sat, 4 Apr 2026 05:36:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281006; cv=none; b=tK+4bm//3ZDjPRXsnmdoBHtUTN9cGVSo/EcmqeGHu6X37UorJWvkdfnK6SJrr2n3lStWYjanvd6RC039R+OxuSbQkQxIPZa1Tp9NRJ5MpawzBN7fCANBgbXpnrEAqTXeY+EsEiEok+vQldailIGsWS5Eg7pUCtX4YkeWbr56fMg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281006; c=relaxed/simple; bh=DSLfSyHQj1/pfjJ0/flR1vfGkic31tLO0vddP3e0HBQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=suLhGWMevcBV7EjcclDUH0b2fMAjyve7b7CUcIkzADXf1QkZ/tZI6DhnqlWW3lNsdr44UmVaWwUmnCMYgKe8/HKNsn0qmhaLmB+Pj+Aq3LZuZLuPpT5CXkCCwq7apIA0yook+2qCuRtwcNEN66jCiNwhBhFa6JhhH6p1RstH9Xs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=NgWjB2F1; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="NgWjB2F1" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2b0c92ff4ebso31474255ad.2 for ; Fri, 03 Apr 2026 22:36:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281004; x=1775885804; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ds56DNff610cUML0mVwRrV+nM7vhJRfBIMktQ5hdFtQ=; b=NgWjB2F1t2rEXDjzChiqTLNvX81kGeUA6o35dWi/w2mA29JQRvG6uut4+g2gz2sAIQ 3zlkg8aeK3UHSQj1UTQxDH0TnDvxxf8FFuUQJohVFQY8nIRJuICUbqMC2hEfBWUxubUQ qu9m63zC/z8nPd27UWexcXUYr2U+Wpy0kkMwYrsb+1qk0F0EABiVdcPxcHcbDEHdOCXH 
Qwl3tY3K0Z4+JBTl3t6odxWOGcifZaKawIpSWjLXDVyy/ysBDP4FWDcIU6p7zZZs+Ged DulOW6YyiaauTfOU8BpMkc6WyrVca/maIxUiQdbfqDUbIZM9uzlcLliM9BDgyrWDEDLd 6Jew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281004; x=1775885804; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ds56DNff610cUML0mVwRrV+nM7vhJRfBIMktQ5hdFtQ=; b=GTmK42z5d77LhzqW6hlglhwuWa7JCVoVJY34n1lSSgGe/4HV8wmLbD6WSNRHR0VL8f yhRGUtGfDgeR0v3zyX7iW5Gi321TM4nE0E0oYq6XJH+YIVSGT3y6ccK/XB8wBn4WwZWS p5aZa2vGVar9HtrZuvInshFPSOOFL95Z3BmSm9EvsOx5hCMc6UaTEMMEeHjoCx36p2fF QdlIO4aDaPDe6mY6avCeny58HVcPrXJH53lwvn5mmTwIRURj9QypOPJHyWLAjuYYk70U 0LAWeYWyQpWVDwSFtxCjhDHKgQQxrkivlT2PiWUnbD+C10RuidY6d1fwTGu2rCMrBZSL 63Lw== X-Gm-Message-State: AOJu0YwBPFJMmebXAsngI4sDxXzWGUloY2WS3YU0WxgFDp/WHpj5GjtO xbW9qn9Ky09BJRJ4/DVxenscJ5Wz4DdSbF9F0IDZ851NPo225vYz0pQ3YhKR9nLRi6gQMHtmJDp tSXbBhmn3LeS+A5xAS/Exz2eY7HgPMn3jQ0ik4UY8cUzkl7jYWmaV/d7cDh9qMrF3CDbtWwQCS0 YqBchE/dE87o2a5HRy1ilnJYR9Im41b3XuiHUqiMacIZx+PJbB X-Received: from pgkb17.prod.google.com ([2002:a63:eb51:0:b0:c76:6a98:b776]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:7348:b0:39b:9644:6ea4 with SMTP id adf61e73a8af0-39f2ee63d6dmr5395232637.16.1775281003685; Fri, 03 Apr 2026 22:36:43 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:19 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-3-jstultz@google.com> Subject: [PATCH v27 02/10] sched: Avoid donor->sched_class->yield_task() null traversal From: John Stultz To: LKML Cc: John Stultz , Joel Fernandes , Qais Yousef , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , 
Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

With proxy-exec, once we do return migration from ttwu(), if a task is proxying for a waiting donor, and the donor is woken up, we switch the rq->donor to point to idle briefly until we can re-enter __schedule().

However, if a task that was acting as a proxy calls into yield() right after the donor is switched to idle, it may trip a null pointer traversal, because the idle task doesn't have a yield_task() pointer.

So add a conditional to ensure we don't try to call the yield_task() pointer in that case.

This was only recently found because prior to commit 127b90315ca07 ("sched/proxy: Yield the donor task") do_sched_yield() incorrectly called current->sched_class->yield_task() instead of using rq->donor.

Signed-off-by: John Stultz
---
Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. 
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- kernel/sched/syscalls.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index b215b0ead9a60..e3e4fd674ed63 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -1340,7 +1340,8 @@ static void do_sched_yield(void) rq =3D this_rq_lock_irq(&rf); =20 schedstat_inc(rq->yld_count); - rq->donor->sched_class->yield_task(rq); + if (rq->donor->sched_class->yield_task) + rq->donor->sched_class->yield_task(rq); =20 preempt_disable(); rq_unlock_irq(rq, &rf); --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pf1-f202.google.com (mail-pf1-f202.google.com [209.85.210.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 62B40372671 for ; Sat, 4 Apr 2026 05:36:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281007; cv=none; b=hixlqtDSnNv/iQ6S3YRUgb9giRSTvPi9Fv2x8rCv0v7STyE6UgJR232PIdFXiMfyj4QZGTr2hOkyO5YJ2vRlfp75GyUUb3C9U5jVrF2GY/re05npOVW74OUtWxkfhlj14ftxEyP+b64kqLmvhcUhF9vK3z7mk4hZ3dD5OAYDt34= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281007; c=relaxed/simple; bh=zJmssLbWxheN8lTEHo5/guRE7fIFkkHcLBJ9e6RvevA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Daa5qEPBHTh/K9iv/x+coSJBYb5wGsXbjp8YJxWJ7naVKlm2Mog2q9LI2VIxcg86s6G4BQUN9JCOlawI4UZmfvBsLIye9Tkjdv7DTaM28lT2dZ8WXoj8KYoomKv37AhavL2NMh3xjkosDFYQifw3mivhSvwUuvx7SdqmOXIe8D4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass 
smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=kQPnqx6J; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="kQPnqx6J" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-82cf362659eso1683236b3a.2 for ; Fri, 03 Apr 2026 22:36:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281006; x=1775885806; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=SS/m1t0uYn/C8977KxUQKuI5bWfm1tXXwup9Gurcm64=; b=kQPnqx6J4nXbkjWGYrF2UR4vOeTsbKAIqk0WRfeRSQj6yxiES/vT16LUB8zuv8MdCC cAzyMewSoHWUiM0sRx622e3Hylj7Gw9sJT/WKND1/BoeDaZcGJdxk9wjxcriWrPSHHZy M55rHFVmaufPrmGLBc6WWTDGh/ZdQVLuouZvfIuTngAXUyUoXreBgoSVupRqGkJXsxdv hP1L/YkDhT5c4qRPklhsJg0nhtxhNsyM1ISmYHIJep0GulczjYyo1vmNMfF2s4tMr6Xk Uw3vgiI8wsy88lArn1sFeJCJI4B3HMYOj0YSJph+ITIoOQV4B0xF5rGEFSKROCPPUsXO Hyqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281006; x=1775885806; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=SS/m1t0uYn/C8977KxUQKuI5bWfm1tXXwup9Gurcm64=; b=pr4z5lv6QWdw6MGotkZfufSErJ3+jAXS6mcKQ6mYdjfCXMqgyAp7IUEqirnfoW6qwE yGd4tIK9DkhQ5aYAE9PR+tyvXitAIM3DtaJlS4sR3eDbi8waz7nDBaEvT1nC9QrZMUVT z2hsjTnjKEKQJh6WnfZKSBTWrjmKp8Cxih9Gj1RXIVesp6zk8/f82LMMmnTlQE//8Dqu HnKuOSXK6A3MsQpsTT/ANO6gp/iuBe/4dJABlDRtiF0hj3SXyMv3FcV+4xSeIPnLuPhQ aSelL8D2cNQFHJkE/gKA/oPy3vzvRtwwnRrfk0t2kMVHutFa9p9BAMPUfTqCuSCD+YgK T1Bw== X-Gm-Message-State: 
AOJu0YzMlnSXHEzKGIsd8IpYa1WOUoKzQC2lNZi6QaNZfp1SUQYUKowz EiVHnGGSj45DSOiLxasq/RYIozLGIdcVX4KwCGLwP5WSk2bETvwcoOgbtmTnYf91kOAB6j0B8m1 rVTqjuaBifwcVwJtiuQb6hUe3T+RCfJLJrxOUUvBTUWvRTwhosb/IN2nExgUcbm8vDW+F4cbk+I VBKFBBbVlp8YrVCT+RU1ld7qXJNy9MK7dd4r7z/acVidcixfKp X-Received: from pgko24.prod.google.com ([2002:a63:f158:0:b0:c6e:8dba:9fcb]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:2da8:b0:398:7cb3:2bf0 with SMTP id adf61e73a8af0-39f2f02b7a1mr3897912637.36.1775281005118; Fri, 03 Apr 2026 22:36:45 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:20 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-4-jstultz@google.com> Subject: [PATCH v27 03/10] sched: deadline: Add some helper variables to cleanup deadline logic From: John Stultz To: LKML Cc: John Stultz , Peter Zijlstra , Joel Fernandes , Qais Yousef , Ingo Molnar , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" As part of an improvement to handling pushable deadline tasks, Peter suggested this cleanup[1], to use helper values for dl_entity and dl_rq in the enqueue_task_dl() and put_prev_task_dl() functions. There should be no functional change from this patch. To make sure this cleanup change doesn't obscure later logic changes, I've split it into its own patch. 
[1]: https://lore.kernel.org/lkml/20260304095123.GP606826@noisy.programming= .kicks-ass.net/ Suggested-by: Peter Zijlstra Signed-off-by: John Stultz --- Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- kernel/sched/deadline.c | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 7352506208287..60ccb492c4427 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2295,7 +2295,10 @@ static void dequeue_dl_entity(struct sched_dl_entity= *dl_se, int flags) =20 static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flag= s) { - if (is_dl_boosted(&p->dl)) { + struct sched_dl_entity *dl_se =3D &p->dl; + struct dl_rq *dl_rq =3D &rq->dl; + + if (is_dl_boosted(dl_se)) { /* * Because of delays in the detection of the overrun of a * thread's runtime, it might be the case that a thread @@ -2308,14 +2311,14 @@ static void enqueue_task_dl(struct rq *rq, struct t= ask_struct *p, int flags) * * In this case, the boost overrides the throttle. */ - if (p->dl.dl_throttled) { + if (dl_se->dl_throttled) { /* * The replenish timer needs to be canceled. No * problem if it fires concurrently: boosted threads * are ignored in dl_task_timer(). 
*/ - cancel_replenish_timer(&p->dl); - p->dl.dl_throttled =3D 0; + cancel_replenish_timer(dl_se); + dl_se->dl_throttled =3D 0; } } else if (!dl_prio(p->normal_prio)) { /* @@ -2327,7 +2330,7 @@ static void enqueue_task_dl(struct rq *rq, struct tas= k_struct *p, int flags) * being boosted again with no means to replenish the runtime and clear * the throttle. */ - p->dl.dl_throttled =3D 0; + dl_se->dl_throttled =3D 0; if (!(flags & ENQUEUE_REPLENISH)) printk_deferred_once("sched: DL de-boosted task PID %d: REPLENISH flag = missing\n", task_pid_nr(p)); @@ -2336,14 +2339,14 @@ static void enqueue_task_dl(struct rq *rq, struct t= ask_struct *p, int flags) } =20 check_schedstat_required(); - update_stats_wait_start_dl(dl_rq_of_se(&p->dl), &p->dl); + update_stats_wait_start_dl(dl_rq, dl_se); =20 if (p->on_rq =3D=3D TASK_ON_RQ_MIGRATING) flags |=3D ENQUEUE_MIGRATING; =20 - enqueue_dl_entity(&p->dl, flags); + enqueue_dl_entity(dl_se, flags); =20 - if (dl_server(&p->dl)) + if (dl_server(dl_se)) return; =20 if (task_is_blocked(p)) @@ -2646,7 +2649,7 @@ static void put_prev_task_dl(struct rq *rq, struct ta= sk_struct *p, struct task_s struct sched_dl_entity *dl_se =3D &p->dl; struct dl_rq *dl_rq =3D &rq->dl; =20 - if (on_dl_rq(&p->dl)) + if (on_dl_rq(dl_se)) update_stats_wait_start_dl(dl_rq, dl_se); =20 update_curr_dl(rq); @@ -2656,7 +2659,7 @@ static void put_prev_task_dl(struct rq *rq, struct ta= sk_struct *p, struct task_s if (task_is_blocked(p)) return; =20 - if (on_dl_rq(&p->dl) && p->nr_cpus_allowed > 1) + if (on_dl_rq(dl_se) && p->nr_cpus_allowed > 1) enqueue_pushable_dl_task(rq, p); } =20 --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F27D133506C for ; Sat, 4 Apr 2026 05:36:48 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281010; cv=none; b=DucF5pBb03KF2X5wNYe13zjczkoQTKOGV1d3trAQzNGlv4NUP3rIrSDtWsm+jn0GYNYe87PggFklDgbMggJcCAXhY3YrgCLHQxNSTM+y5FiEl0SjG2BDJOcWAs5XG86LPQpZzrvtco+WAT5viAagBogZvsSDKfWQmBp3VPMmxe8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281010; c=relaxed/simple; bh=pTSPEL6AHZdvVxI/gJ2g16EBrOQoIsK9A3Ps2Ju1JXA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=pvz/5yyenvepcaanrwKSoe1Yd0Jh80Drm0DTjKMFRD89IMoW40wA5IVDX0keMqEcGyJtSQhMTD1f3aHWef4QcwJOxwQKmSR96x9fAy5eGP9XwBNAx1+UQm6Voa+fNZ2waqJlpeoan25mbdZqjEDNUuv1Geq7BvUorJo/4z+WvP0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=fz9qlngs; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="fz9qlngs" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2b0f4e632caso35326455ad.3 for ; Fri, 03 Apr 2026 22:36:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281008; x=1775885808; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Ztbl+3T3jxHP87PGN7Exc4Dus6q/0xepccEC4U2P7C0=; b=fz9qlngswH2kFPETe10CtSfiAQ+tuRFZu7q5/ZzfQU13Tkcu6ngYNAram9YSiRc578 zECS87P0ZCecEuSfNxq6Xqvlh3ULMU+vjOCiq9qZ7Wk7TvuwpD+5B48eRuY/oM8hmH2C 
6lkU/9YDR3ygus0JjdfeYaobyJi/XCEFA3mBosRQATFUeyEP65FmYMI2apaA+SBpzn1K fJCwRvJlvhD8jZO4hi72kL6HWAI2np5HfeY7b37z0cc9I9bX7vBWfIT3csd3w84fp0/e Ai/GzLknIBDiAk1ZSOX2+lcZ6rbc71rb15lp92SvhLpQnkdmj0RHsf/E8XU3rZz1T9C/ d0nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281008; x=1775885808; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Ztbl+3T3jxHP87PGN7Exc4Dus6q/0xepccEC4U2P7C0=; b=bmwNc2R7ODLWYg68ZhDIMUirzbEBWOc5PxMt/wqmXDHTbaCgeujU5p2RlAd4bPFhvy UZDgzY4Jhe8mwfJDdFI1H/zC/XCVbr8EArdONv6CvgkZtgPTPjey3uEFEIv5/+2udJYo sRhskd5AQdgmWqpdQeEbLy8v4Pua43+gR6PvYRksdGwBpmaCe/4Ojo0qfKY2jwUEg+vR ORNPJnbsqegoiUvhM5lucnEh1ryIcaeXu+r+dh6ccB/csRo+qFGwBnxi5ggnXk+7S0BL F9Z4G3u8vCiu8AB0aZ2yIeLUB/pJ0II2v8q94CQ5tzcXgXPL4cYCNk1hO2ENjfA6SZ1Z LH1Q== X-Gm-Message-State: AOJu0Yx8ecddXUcJZhthl8ukT72+nsdc+baBkufDWvmuaKUMwVolmEwg FCKNBE27f446OSkXxMSQ8JD9+18fiMHzsOKmpXDaOEuB+IP+rEKMoYgoR+rq4RELpTlhA0Gq5Nc K4v3ZAp+EL7qq1iRZd+FDtkBeYy4w8vmEQOVr52IZPnRc/rYhD40l0gIypT/Ugl4uYrNYlGKqoV oyR6Jy6TUTjL+HEfayxAhiEv4Pv77bjhBmLRccuZ87USMdSJPl X-Received: from pgib11.prod.google.com ([2002:a63:e70b:0:b0:c6e:4b29:e099]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:3396:b0:398:aea8:a9c0 with SMTP id adf61e73a8af0-39f2eff034amr5328612637.19.1775281006636; Fri, 03 Apr 2026 22:36:46 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:21 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-5-jstultz@google.com> Subject: [PATCH v27 04/10] sched: deadline: Add dl_rq->curr pointer to address issues with Proxy Exec From: John Stultz To: LKML Cc: John Stultz , Peter Zijlstra , Joel Fernandes , 
Qais Yousef , Ingo Molnar , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The DL scheduler keeps the current task in the rbtree, since the deadline value isn't usually changed while the task is runnable. This makes set_next_task() and put_prev_task() simpler, but unfortunately it causes complexity elsewhere. Specifically, when update_curr_dl() updates the deadline, it has to dequeue and then enqueue the task. From put_prev_task_dl(), we first call update_curr_dl(), and then call enqueue_pushable_dl_task().

However, with Proxy Exec this goes awry: when a mutex is released, we might wake the waiting rq->donor. This will cause put_prev_task() to be called on the donor to take it off the cpu for return migration. At that point, from put_prev_task_dl() the update_curr_dl() logic will dequeue & enqueue the task, and the enqueue function will call enqueue_pushable_dl_task() (since the task_current() check won't prevent it). Then, back up the callstack in put_prev_task_dl(), we'll end up calling enqueue_pushable_dl_task() again, tripping the !RB_EMPTY_NODE(&p->pushable_dl_tasks) warning.

So to avoid this, use Peter's suggested[1] approach, and add a dl_rq->curr pointer that is set/cleared from set_next_task()/put_prev_task(), which effectively tracks the rq->donor. We can then use this to avoid adding the active donor to the pushable list from enqueue_task_dl().
[1]: https://lore.kernel.org/lkml/20260304095123.GP606826@noisy.programming.kicks-ass.net/

Suggested-by: Peter Zijlstra
Signed-off-by: John Stultz
---
Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Mel Gorman
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: Daniel Lezcano
Cc: Suleiman Souhlal
Cc: kuyo chang
Cc: hupu
Cc: kernel-team@android.com
---
 kernel/sched/deadline.c | 13 +++++++++++++
 kernel/sched/sched.h    |  1 +
 2 files changed, 14 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 60ccb492c4427..3ba5f8deb3687 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2352,6 +2352,9 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 	if (task_is_blocked(p))
 		return;
 
+	if (dl_rq->curr == dl_se)
+		return;
+
 	if (!task_current(rq, p) && !p->dl.dl_throttled && p->nr_cpus_allowed > 1)
 		enqueue_pushable_dl_task(rq, p);
 }
@@ -2574,6 +2577,10 @@ static void start_hrtick_dl(struct rq *rq, struct sched_dl_entity *dl_se)
 }
 #endif /* !CONFIG_SCHED_HRTICK */
 
+/*
+ * DL keeps current in tree, because ->deadline is not typically changed while
+ * a task is runnable.
+ */ static void set_next_task_dl(struct rq *rq, struct task_struct *p, bool fi= rst) { struct sched_dl_entity *dl_se =3D &p->dl; @@ -2586,6 +2593,9 @@ static void set_next_task_dl(struct rq *rq, struct ta= sk_struct *p, bool first) /* You can't push away the running task */ dequeue_pushable_dl_task(rq, p); =20 + WARN_ON_ONCE(dl_rq->curr); + dl_rq->curr =3D dl_se; + if (!first) return; =20 @@ -2656,6 +2666,9 @@ static void put_prev_task_dl(struct rq *rq, struct ta= sk_struct *p, struct task_s =20 update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 1); =20 + WARN_ON_ONCE(dl_rq->curr !=3D dl_se); + dl_rq->curr =3D NULL; + if (task_is_blocked(p)) return; =20 diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 8ee82b03a8a10..adefea777e0a5 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -880,6 +880,7 @@ struct dl_rq { =20 bool overloaded; =20 + struct sched_dl_entity *curr; /* * Tasks on this rq that can be pushed away. They are kept in * an rb-tree, ordered by tasks' deadlines, with caching --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 71FBB34A3A5 for ; Sat, 4 Apr 2026 05:36:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281010; cv=none; b=HlnNnkE6Q5UXYtTqALTIIwgfD1PDZlxZHT9PuG+fmgb2L/cjursWulFrGA8f6JrDgkRUfWAJyk9fCLPbMR1P9LnVoNU/MMKuj53uCQS90gJ5+7v/6Zt6itLhs6HCyODHTNGMsExmEPhNzh+6SHVc9GcMjhBro/epNUNEAEUrJS4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281010; c=relaxed/simple; bh=XlW9p26G5sW8xpujfOCavtuqXZ864bkciDyBKiMbNik=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; 
b=ZcV0i//CnQGADXIZwC8GWJaL7IhwQyKeK0+GPpPbcwVDpYoIt0ZkiZuW7LTfiLiP2ao5E52LgLUO8NaoXVg04EaVadEdYUbGOBun8Ll1mLnteBeI4X9PqG2hWuqhPZVvh/TguiYvoQ+9PtkRrC/azjTVxJQiPFduO15P3Wd9H9I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Pn+Mtty4; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Pn+Mtty4" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2b249975139so60944885ad.0 for ; Fri, 03 Apr 2026 22:36:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281009; x=1775885809; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=aPxiEVgXKSmdKrjgrCQi/zOjLhPhgBmFhxWG4tzc/xQ=; b=Pn+Mtty4BdDSrbge7iV2aWoCZvYEGfiK4SPmZlI9DvgWwRcvPMEcGJ1nLGnXSxupRA M7IHt9sRV0kBPMQYN1oWvmI5LiapYc9Mgikwlkgl4L4r+QZx9CKz0Jh+mAXBrZCK4zXg w8cjBhTBOeimG7ZblDPhzF17dL+Mpql76wHHq8Axi9LFsywE2nMGLHuWPNe3fKlwq2BA 5QWH0HDAK6/9pC+QT7kzk+NJ1bPPp0RHXgo/VTnHXRjZznmKE4LkL4gtaBZjeF4JWPYp F1Jm1u2LKCr4YCzDTcd5Pnjyqb+Qm9Jwz4tZrCXapXcdHqvZn27wQTDerp0d1yLgl+TF lUuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281009; x=1775885809; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=aPxiEVgXKSmdKrjgrCQi/zOjLhPhgBmFhxWG4tzc/xQ=; b=ljc5oAAJ2Whz4sA7J2FylXLHegccrYY25VtLSA8ggTk8DEq5tMFfmkeNYAAMrdjtdn 
uaSSHiLlcmhfj7hbftVcid55bNMhTSkYKP4yX8thS7+ynCQzrH5iGme4Xl+867SsF9b5 PMWcXmNieSwN5Ij4CFQ1hSkJ/HDljEzz2+b6TZN9ZWBS+W68rga7b6J0HBC0mocIIPyw ONeJ6n6QFsdKszcLbat64dbkjm5VJw8L3D0XR980PkSZCMl4jEdEHI/f01D0LDp5Xl3Q gLir6h2SnoHJxl0w2SDFKl0Fls0cl1AWnJXlMvuusdiPGEd6bDtxsuduEDf5M1ZVJRbi diXQ== X-Gm-Message-State: AOJu0YxlWJpt6jTT+vNKSoZdB0zNFXdiMffg+ER5cZ2Fdfrx6ARxAn/R sVbAL32Goy0uCDls/pSzNGOhjg6nYytHU+AXSB6MQj5SnNwMwVv+oorfuPqOgM11y5ofe5k+pZ2 Hpdc/yKGe/JVRG1UXsl7khmaus0XcwXDuH/ISavePujraWTrtVchAqgycekPGei/GVFYckv6QfG B62LZOkRU5ai+901kxOmLuM+97ifEWeDCPSi9Tw+M2ndSL5C7W X-Received: from pgcv14.prod.google.com ([2002:a05:6a02:530e:b0:c73:9264:56a5]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:7f91:b0:398:71f2:59d8 with SMTP id adf61e73a8af0-39f2f1296b9mr5246612637.56.1775281008195; Fri, 03 Apr 2026 22:36:48 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:22 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-6-jstultz@google.com> Subject: [PATCH v27 05/10] sched: Rework block_task so it can be directly called From: John Stultz To: LKML Cc: John Stultz , Peter Zijlstra , Joel Fernandes , Qais Yousef , Ingo Molnar , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. 
McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Pull most of the logic out of try_to_block_task() and put it into block_task() directly, so that we can call block_task() and not have to worry about the failing cases in try_to_block_task().

Suggested-by: Peter Zijlstra
Signed-off-by: John Stultz
---
Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Mel Gorman
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: Metin Kaya
Cc: Xuewen Yan
Cc: K Prateek Nayak
Cc: Thomas Gleixner
Cc: Daniel Lezcano
Cc: Suleiman Souhlal
Cc: kuyo chang
Cc: hupu
Cc: kernel-team@android.com
---
 kernel/sched/core.c | 45 +++++++++++++++++++++++----------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9c8a769a6d109..8f1b14a830851 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2160,8 +2160,29 @@ void deactivate_task(struct rq *rq, struct task_struct *p, int flags)
 	dequeue_task(rq, p, flags);
 }
 
-static void block_task(struct rq *rq, struct task_struct *p, int flags)
+static void block_task(struct rq *rq, struct task_struct *p, unsigned long task_state)
 {
+	int flags = DEQUEUE_NOCLOCK;
+
+	p->sched_contributes_to_load =
+		(task_state & TASK_UNINTERRUPTIBLE) &&
+		!(task_state & TASK_NOLOAD) &&
+		!(task_state & TASK_FROZEN);
+
+	if (unlikely(is_special_task_state(task_state)))
+		flags |= DEQUEUE_SPECIAL;
+
+	/*
+	 * __schedule()			ttwu()
+	 *   prev_state = prev->state;    if (p->on_rq && ...)
+	 *   if (prev_state)		    goto out;
+	 *     p->on_rq = 0;		  smp_acquire__after_ctrl_dep();
+	 *				  p->state = TASK_WAKING
+	 *
+	 * Where __schedule() and ttwu() have matching control dependencies.
+	 *
+	 * After this, schedule() must not care about p->state any more.
+	 */
 	if (dequeue_task(rq, p, DEQUEUE_SLEEP | flags))
 		__block_task(rq, p);
 }
@@ -6517,7 +6538,6 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 			      unsigned long *task_state_p, bool should_block)
 {
 	unsigned long task_state = *task_state_p;
-	int flags = DEQUEUE_NOCLOCK;
 
 	if (signal_pending_state(task_state, p)) {
 		WRITE_ONCE(p->__state, TASK_RUNNING);
@@ -6537,26 +6557,7 @@ static bool try_to_block_task(struct rq *rq, struct task_struct *p,
 	if (!should_block)
 		return false;
 
-	p->sched_contributes_to_load =
-		(task_state & TASK_UNINTERRUPTIBLE) &&
-		!(task_state & TASK_NOLOAD) &&
-		!(task_state & TASK_FROZEN);
-
-	if (unlikely(is_special_task_state(task_state)))
-		flags |= DEQUEUE_SPECIAL;
-
-	/*
-	 * __schedule()			ttwu()
-	 *   prev_state = prev->state;    if (p->on_rq && ...)
-	 *   if (prev_state)		    goto out;
-	 *     p->on_rq = 0;		  smp_acquire__after_ctrl_dep();
-	 *				  p->state = TASK_WAKING
-	 *
-	 * Where __schedule() and ttwu() have matching control dependencies.
-	 *
-	 * After this, schedule() must not care about p->state any more.
- */ - block_task(rq, p, flags); + block_task(rq, p, task_state); return true; } =20 --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11DF137755A for ; Sat, 4 Apr 2026 05:36:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281018; cv=none; b=ZdNeQfrO1SUlLCZmDVKcj81oMvNpYnS4lI2jt5BByd/IoqWA47qsVhDNMX39pl2tK88pdFUqM/SuZoXYd+j+R5uepSCAe4+8x681CA7c4m7kak6q+wSwD9UEdzOPf3CRXs/jPk77CW8qCHGCNndVfnh2/z2mEP/7mu4tacPgZ/Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281018; c=relaxed/simple; bh=6MGeMiTdWXeKHl0bix/fOT8GJOFL6I6BQlZjZc5W4T8=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=rE4jLc9eOlnxKYi4Zg4lPwFOV1BdoCQlNUskoPRSEf6nxAUc1pGoAqg/mUyIYNFmQtPWy22hlwoIwztbVkYBPkD2E96VXyBnjZPD885vrdjbjl2Mj9+FpzQYLvMSZxDcHzgJ3F+PL36qI9i5lQdtCDcsBRjOn8ko5bJmjeMmi8M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=n0DwI8Aq; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="n0DwI8Aq" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-358df8fbd1cso2768101a91.0 for ; Fri, 03 Apr 2026 22:36:50 -0700 
(PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281010; x=1775885810; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=PxI7okwmhW5yjaVNEGP0PdvDUJuVFZiHzIx4cRjEVu4=; b=n0DwI8AqN2GozibneOjAHiJh2M9RWXjo7to/nwbBkoo3FYagf7Ga6CEMHOh8UIp3NI TgEAfa/T69RJZOhEdgD3xEKf9+9IlGxQi7vhQYzFYkDOQw8oYnZ3TMLaYsSwys7s3M2e p2rCErVb5mXl9B/Vzr9s0gj3KONCoURkyw5+94G7f0VEw59VvJG+noUjq/k8PRtKfI4q ohgq40/+xDR0WRtK9DK6k0dplrGw5O+UUNNExs3g/2Yfk7o5KDIOFVCE889zDG8JJ7hR 8YuJZG51QsgWSOn0o6KJpyDVAqKGxyjBdTayFkQjW1aSGRL2uiIJ/oe7Bltdwpzf0Bd+ XWqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281010; x=1775885810; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PxI7okwmhW5yjaVNEGP0PdvDUJuVFZiHzIx4cRjEVu4=; b=gX99Oyn+O2aGCPRVvBQQiEKloSm+6cwncKVm1nhD4o20k0nQGVrHlVYzhoc9HOSvLL 9W8LXTYau19+dPJKLqcRca9QGQUlzhh321I0ANy9yZsURJfaxLJPSZbC67fXzeCpzsIW PDfH+5hMJQIUuNPoJ2Te8wcMIMvvucuL0YkD879Hk54sJTy3hVZDCHy4h/XkjFjnu1Pt JiuUnLGktCi+vQAD3/5DvctqJEnMWNPCahViLxxUK4iia9BvYQ76+W4cY2eA5G4wKlAv U6P+m3CmP9Xxt6I2NyN7IZb0t72RaWJBwWhy1m1yb1S9KNo7dQveu7SupeOnkao5vNiu KXmQ== X-Gm-Message-State: AOJu0YxZWrv0P/wfaDD7xr6/ApBtvEZpKhglZanXUOqCCq1CI8GXGsJu 5cEYoP7kybXvFM6ifef6WZZ8PFHOeToficCBX2gON75dZ2YmpYgNs94lk5Bf5Y+8rm9lggkX3kg xsgpxW5c+BUUX1KWls4+Pvfnf4M+7u5e3NzR28dQZm3kcp/pWku2wVKAmPDdx9EDrt2JwnQxUNP XPbHN/uXtB8oPNuhKaw8A9PIvF5QNx6fmljIHH6n/JAIyDVj6E X-Received: from pgbcq14.prod.google.com ([2002:a05:6a02:408e:b0:c74:1d79:2cfa]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:3fcf:b0:35d:a3b4:2ef6 with SMTP id 98e67ed59e1d1-35de6946a99mr5236438a91.21.1775281010094; Fri, 03 Apr 2026 22:36:50 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:23 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20260404053632.1729280-1-jstultz@google.com>
X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog
Message-ID: <20260404053632.1729280-7-jstultz@google.com>
Subject: [PATCH v27 06/10] sched: Have try_to_wake_up() handle return-migration for PROXY_WAKING case
From: John Stultz
To: LKML
Cc: John Stultz , Joel Fernandes , Qais Yousef , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This patch adds logic so try_to_wake_up() will notice if we are waking a task where blocked_on == PROXY_WAKING, and if necessary dequeue the task so the wakeup will naturally return-migrate the donor task back to a cpu it can run on.

This helps performance, as we do the dequeue and wakeup under the locks normally taken in try_to_wake_up(), and it avoids having to do proxy_force_return() from __schedule(), which has to re-take similar locks and then force a pick-again loop.

This was split out from the larger proxy patch, and significantly reworked.
Credits for the original patch go to:
  Peter Zijlstra (Intel)
  Juri Lelli
  Valentin Schneider
  Connor O'Brien

Signed-off-by: John Stultz
---
v24:
* Reworked proxy_needs_return() so it's less nested, as suggested by K Prateek
* Switch to using block_task with DEQUEUE_SPECIAL as suggested by K Prateek
* Fix edge case to reset wake_cpu if select_task_rq() chooses the current rq and we skip set_task_cpu()
v26:
* Handle both blocked and PROXY_WAKING tasks in proxy_needs_return(), as suggested by K Prateek
* Try to handle signal edge case in ttwu that K Prateek pointed out
v27:
* Integrate simplifications to proxy_needs_return() suggested by K Prateek
* Rework ttwu_runnable() to align with ACQUIRE(__task_rq_lock, guard)(p) usage as suggested by Peter
* Major rework suggested by Peter to get rid of proxy_force_return() completely, using proxy_deactivate() and allowing ttwu to handle all the return migration. Lots of helpful improvements suggested by K Prateek are included here as well.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Mel Gorman
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E.
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- include/linux/sched.h | 2 +- kernel/sched/core.c | 194 +++++++++++++++++++++--------------------- 2 files changed, 96 insertions(+), 100 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 8ec3b6d7d718b..3ae1330801157 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -161,7 +161,7 @@ struct user_event_mm; */ #define is_special_task_state(state) \ ((state) & (__TASK_STOPPED | __TASK_TRACED | TASK_PARKED | \ - TASK_DEAD | TASK_FROZEN)) + TASK_DEAD | TASK_WAKING | TASK_FROZEN)) =20 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP # define debug_normal_state_change(state_value) \ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8f1b14a830851..2b5f9f905afe1 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3659,6 +3659,44 @@ void update_rq_avg_idle(struct rq *rq) rq->idle_stamp =3D 0; } =20 +#ifdef CONFIG_SCHED_PROXY_EXEC +static inline struct task_struct *proxy_resched_idle(struct rq *rq); + +/* + * Checks to see if task p has been proxy-migrated to another rq + * and needs to be returned. 
If so, we deactivate the task here + * so that it can be properly woken up on the p->wake_cpu + * (or whichever cpu select_task_rq() picks at the bottom of + * try_to_wake_up() + */ +static inline bool proxy_needs_return(struct rq *rq, struct task_struct *p) +{ + if (!task_is_blocked(p)) + return false; + + guard(raw_spinlock)(&p->blocked_lock); + + /* Task is waking up; clear any blocked_on relationship */ + __clear_task_blocked_on(p, NULL); + + /* If already current, don't need to return migrate */ + if (task_current(rq, p)) + return false; + + /* If we're return migrating the rq->donor, switch it out for idle */ + if (task_current_donor(rq, p)) + proxy_resched_idle(rq); + + block_task(rq, p, TASK_WAKING); + return true; +} +#else /* !CONFIG_SCHED_PROXY_EXEC */ +static inline bool proxy_needs_return(struct rq *rq, struct task_struct *p) +{ + return false; +} +#endif /* CONFIG_SCHED_PROXY_EXEC */ + static void ttwu_do_activate(struct rq *rq, struct task_struct *p, int wake_flags, struct rq_flags *rf) @@ -3723,28 +3761,26 @@ ttwu_do_activate(struct rq *rq, struct task_struct = *p, int wake_flags, */ static int ttwu_runnable(struct task_struct *p, int wake_flags) { - struct rq_flags rf; - struct rq *rq; - int ret =3D 0; + ACQUIRE(__task_rq_lock, guard)(p); + struct rq *rq =3D guard.rq; =20 - rq =3D __task_rq_lock(p, &rf); - if (task_on_rq_queued(p)) { - update_rq_clock(rq); - if (p->se.sched_delayed) - enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_DELAYED); - if (!task_on_cpu(rq, p)) { - /* - * When on_rq && !on_cpu the task is preempted, see if - * it should preempt the task that is current now. 
- */ - wakeup_preempt(rq, p, wake_flags); - } - ttwu_do_wakeup(p); - ret =3D 1; - } - __task_rq_unlock(rq, p, &rf); + if (!task_on_rq_queued(p)) + return 0; =20 - return ret; + update_rq_clock(rq); + if (p->se.sched_delayed) + enqueue_task(rq, p, ENQUEUE_NOCLOCK | ENQUEUE_DELAYED); + if (proxy_needs_return(rq, p)) + return 0; + if (!task_on_cpu(rq, p)) { + /* + * When on_rq && !on_cpu the task is preempted, see if + * it should preempt the task that is current now. + */ + wakeup_preempt(rq, p, wake_flags); + } + ttwu_do_wakeup(p); + return 1; } =20 void sched_ttwu_pending(void *arg) @@ -4131,6 +4167,8 @@ int try_to_wake_up(struct task_struct *p, unsigned in= t state, int wake_flags) * it disabling IRQs (this allows not taking ->pi_lock). */ WARN_ON_ONCE(p->se.sched_delayed); + /* If p is current, we know we can run here, so clear blocked_on */ + clear_task_blocked_on(p, NULL); if (!ttwu_state_match(p, state, &success)) goto out; =20 @@ -4147,6 +4185,15 @@ int try_to_wake_up(struct task_struct *p, unsigned i= nt state, int wake_flags) */ scoped_guard (raw_spinlock_irqsave, &p->pi_lock) { smp_mb__after_spinlock(); + + /* + * We could get a wakeup from a signal which wouldn't + * mark the blocked_on state as PROXY_WAKING. So + * set the woken task as PROXY_WAKING here so we are + * sure the task will wake and run. + */ + set_task_blocked_on_waking(p, NULL); + if (!ttwu_state_match(p, state, &success)) break; =20 @@ -4211,6 +4258,14 @@ int try_to_wake_up(struct task_struct *p, unsigned i= nt state, int wake_flags) */ WRITE_ONCE(p->__state, TASK_WAKING); =20 + /* + * We never clear the blocked_on relation on proxy_deactivate. + * If we don't clear it here, we have TASK_RUNNING + p->blocked_on + * when waking up. Since this is a fully blocked, off CPU task + * waking up, it should be safe to clear the blocked_on relation. 
+ */ + if (task_is_blocked(p)) + clear_task_blocked_on(p, NULL); /* * If the owning (remote) CPU is still in the middle of schedule() with * this task as prev, considering queueing p on the remote CPUs wake_list @@ -4255,6 +4310,16 @@ int try_to_wake_up(struct task_struct *p, unsigned i= nt state, int wake_flags) wake_flags |=3D WF_MIGRATED; psi_ttwu_dequeue(p); set_task_cpu(p, cpu); + } else if (cpu !=3D p->wake_cpu) { + /* + * If we were proxy-migrated to cpu, then + * select_task_rq() picks cpu instead of wake_cpu + * to return to, we won't call set_task_cpu(), + * leaving a stale wake_cpu pointing to where we + * proxy-migrated from. So just fixup wake_cpu here + * if its not correct + */ + p->wake_cpu =3D cpu; } =20 ttwu_queue(p, cpu, wake_flags); @@ -6542,7 +6607,7 @@ static bool try_to_block_task(struct rq *rq, struct t= ask_struct *p, if (signal_pending_state(task_state, p)) { WRITE_ONCE(p->__state, TASK_RUNNING); *task_state_p =3D TASK_RUNNING; - set_task_blocked_on_waking(p, NULL); + clear_task_blocked_on(p, NULL); =20 return false; } @@ -6585,13 +6650,11 @@ static inline struct task_struct *proxy_resched_idl= e(struct rq *rq) return rq->idle; } =20 -static bool proxy_deactivate(struct rq *rq, struct task_struct *donor) +static void proxy_deactivate(struct rq *rq, struct task_struct *donor) { unsigned long state =3D READ_ONCE(donor->__state); =20 - /* Don't deactivate if the state has been changed to TASK_RUNNING */ - if (state =3D=3D TASK_RUNNING) - return false; + WARN_ON_ONCE(state =3D=3D TASK_RUNNING); /* * Because we got donor from pick_next_task(), it is *crucial* * that we call proxy_resched_idle() before we deactivate it. @@ -6602,7 +6665,7 @@ static bool proxy_deactivate(struct rq *rq, struct ta= sk_struct *donor) * need to be changed from next *before* we deactivate. 
*/ proxy_resched_idle(rq); - return try_to_block_task(rq, donor, &state, true); + block_task(rq, donor, state); } =20 static inline void proxy_release_rq_lock(struct rq *rq, struct rq_flags *r= f) @@ -6676,71 +6739,6 @@ static void proxy_migrate_task(struct rq *rq, struct= rq_flags *rf, proxy_reacquire_rq_lock(rq, rf); } =20 -static void proxy_force_return(struct rq *rq, struct rq_flags *rf, - struct task_struct *p) - __must_hold(__rq_lockp(rq)) -{ - struct rq *task_rq, *target_rq =3D NULL; - int cpu, wake_flag =3D WF_TTWU; - - lockdep_assert_rq_held(rq); - WARN_ON(p =3D=3D rq->curr); - - if (p =3D=3D rq->donor) - proxy_resched_idle(rq); - - proxy_release_rq_lock(rq, rf); - /* - * We drop the rq lock, and re-grab task_rq_lock to get - * the pi_lock (needed for select_task_rq) as well. - */ - scoped_guard (task_rq_lock, p) { - task_rq =3D scope.rq; - - /* - * Since we let go of the rq lock, the task may have been - * woken or migrated to another rq before we got the - * task_rq_lock. So re-check we're on the same RQ. If - * not, the task has already been migrated and that CPU - * will handle any futher migrations. - */ - if (task_rq !=3D rq) - break; - - /* - * Similarly, if we've been dequeued, someone else will - * wake us - */ - if (!task_on_rq_queued(p)) - break; - - /* - * Since we should only be calling here from __schedule() - * -> find_proxy_task(), no one else should have - * assigned current out from under us. But check and warn - * if we see this, then bail. 
- */ - if (task_current(task_rq, p) || task_on_cpu(task_rq, p)) { - WARN_ONCE(1, "%s rq: %i current/on_cpu task %s %d on_cpu: %i\n", - __func__, cpu_of(task_rq), - p->comm, p->pid, p->on_cpu); - break; - } - - update_rq_clock(task_rq); - deactivate_task(task_rq, p, DEQUEUE_NOCLOCK); - cpu =3D select_task_rq(p, p->wake_cpu, &wake_flag); - set_task_cpu(p, cpu); - target_rq =3D cpu_rq(cpu); - clear_task_blocked_on(p, NULL); - } - - if (target_rq) - attach_one_task(target_rq, p); - - proxy_reacquire_rq_lock(rq, rf); -} - /* * Find runnable lock owner to proxy for mutex blocked donor * @@ -6776,7 +6774,7 @@ find_proxy_task(struct rq *rq, struct task_struct *do= nor, struct rq_flags *rf) clear_task_blocked_on(p, PROXY_WAKING); return p; } - goto force_return; + goto deactivate; } =20 /* @@ -6811,7 +6809,7 @@ find_proxy_task(struct rq *rq, struct task_struct *do= nor, struct rq_flags *rf) __clear_task_blocked_on(p, NULL); return p; } - goto force_return; + goto deactivate; } =20 if (!READ_ONCE(owner->on_rq) || owner->se.sched_delayed) { @@ -6890,12 +6888,7 @@ find_proxy_task(struct rq *rq, struct task_struct *d= onor, struct rq_flags *rf) return owner; =20 deactivate: - if (proxy_deactivate(rq, donor)) - return NULL; - /* If deactivate fails, force return */ - p =3D donor; -force_return: - proxy_force_return(rq, rf, p); + proxy_deactivate(rq, p); return NULL; migrate_task: proxy_migrate_task(rq, rf, p, owner_cpu); @@ -7043,6 +7036,9 @@ static void __sched notrace __schedule(int sched_mode) if (sched_proxy_exec()) { struct task_struct *prev_donor =3D rq->donor; =20 + if (!prev_state && prev->blocked_on) + clear_task_blocked_on(prev, NULL); + rq_set_donor(rq, next); if (unlikely(next->blocked_on)) { next =3D find_proxy_task(rq, next, &rf); --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client 
certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DEACC23C512 for ; Sat, 4 Apr 2026 05:36:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281019; cv=none; b=AI506DNK2MRFMERsir+o+MA2F2S9E8+21OwVzSYrKvfVlR8HdYBGB/QtijkMMXvTzvQ5NRyqSF17Cx0MgRsHGdqaM2eXmaEjUoANCNp2G2qSUzLWuthzThiCXDOZmJTvaB+QSSA7YFerkuqeYeqbWP6zJ0OiApMIykeFc7k5opY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281019; c=relaxed/simple; bh=b/9eHiGacbALE105KAuHV4Osx9E6Rdf2s9xicbX86OA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=RvEMgHgnLJ7GBEpHm/y37bBpLRK3mYLZqDrl8t4QB6yLPWnjhVsjemqVKyoUbLhSShXUG54Tsu13ChP4OcvrD35Y7uImkr8RBbi/1dH7e1FPsud80m7h6z6WUV4iiNVohYr3NnMxJRDIlGxNx3J6BFF6kwRrf8GKslwdAIFcY0c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=nS9uVECs; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="nS9uVECs" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-35d98c6ab60so3379476a91.1 for ; Fri, 03 Apr 2026 22:36:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281012; x=1775885812; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=KE6WFf4coARBUHATRGpmX7xHKBPsyY3zBSyjTWWGLwM=; 
b=nS9uVECsUbJBkhOfDOSeEWpwheC0l0u2FeMK2VEKLIrYIk2MvR6C0C2ZXKxEjsJl7k 3Jnr+kdfCyQzGgFs7WgT1tlxk8xSyeBLAQo5M5Gn+7zyb4NlBMOdbhHPpPFD5b/MIcZf yVjUmbQ04oqZpa7EcAPWT1OxaehDXrTEQs+izswMLWjrFA4I+rFeR3grFRZZYfD9M/r6 19sOkFZgGsRJBzMh+F2A0sJ6bwi/RoQAhMTX6DPz6GKBHb1YrIpRY0C1XwqaH19tApYh f4DX/jS+tdg07ZYSzV0xb1mVUi1xugd4ZzhK4dGxC/c+RR1oOAFzoWR5sCiEG/sIs1VP 5IyA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281012; x=1775885812; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=KE6WFf4coARBUHATRGpmX7xHKBPsyY3zBSyjTWWGLwM=; b=B7Q0MAjEfU/oronbKiAgJbXRRQUwJs1XMIsn9/gfs43pMRk/VX8NoRAFJG+Ig81XtW TNiQIpm2Spd9n2psK4okyH7F8u3SX76FMrcIr+1KFS96kdVelXR6M9vAIg1T7PAnLdHL NuqAIjOhP1wcMGyXa0ulIuSArRr19sS3Un+uJp7E8OVkWJig2EeZOrNh0iG6QTaGBeKF SElgOl5Td1L35vSOrnN2G59VxnRZLc9wXACAyTHevovN6X5aS4UXT+QacXMAEOPtryV/ gYMpFFuFrbLKax48WKNHDS8gIEJWrjTFE2gFkC1QCaP3cxLwiyy5J3BHFpws+46Hwp33 vvjQ== X-Gm-Message-State: AOJu0Yyv4ALC5UqvcDHDaNx/eszeUgO0KUnlFe8bkEt4s7x6VrVSPGCU jHV20CPWyKJ9fnfR/qMP4wknzkGS0StKyk2nmMAau1vCh382AM2/DA4KkmD1fw3h8hMyQEo+JRu jwnQVcRCaMKhqMJ+HlCFErfPWCqNqhmq4F01fbId656/BcOHJkQsJnuO8Fim83O/e7kw9+rjJwQ gV4NcmYsGFsUX5N/a7/1rMgGyMb55/Es4VTAlOwoQnwrABwZzm X-Received: from pgib11.prod.google.com ([2002:a63:e70b:0:b0:c6e:4b29:e099]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:4885:b0:39b:e3aa:df2d with SMTP id adf61e73a8af0-39f16ebf8e2mr6063578637.7.1775281011838; Fri, 03 Apr 2026 22:36:51 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:24 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-8-jstultz@google.com> Subject: [PATCH v27 07/10] sched/core: Reset 
the donor to current task when donor is woken From: John Stultz To: LKML Cc: K Prateek Nayak , John Stultz , Joel Fernandes , Qais Yousef , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: K Prateek Nayak Introduce proxy_reset_donor() to reset the donor to the current task when the donor is woken up. This avoids having to run with rq->idle as the donor when proxy_needs_return() hits the donor, which can cause a different set of headaches. Signed-off-by: K Prateek Nayak Signed-off-by: John Stultz --- XXX: Confirm with Juri if there is any side-effect of temporarily doing a set_next_task() on a throttled DL task. Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. 
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- kernel/sched/core.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 2b5f9f905afe1..a0d55225a62c3 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -3660,7 +3660,17 @@ void update_rq_avg_idle(struct rq *rq) } =20 #ifdef CONFIG_SCHED_PROXY_EXEC -static inline struct task_struct *proxy_resched_idle(struct rq *rq); +static void zap_balance_callbacks(struct rq *rq); + +static inline void proxy_reset_donor(struct rq *rq) +{ + WARN_ON_ONCE(rq->donor =3D=3D rq->curr); + + put_prev_set_next_task(rq, rq->donor, rq->curr); + rq_set_donor(rq, rq->curr); + zap_balance_callbacks(rq); + resched_curr(rq); +} =20 /* * Checks to see if task p has been proxy-migrated to another rq @@ -3685,7 +3695,7 @@ static inline bool proxy_needs_return(struct rq *rq, = struct task_struct *p) =20 /* If we're return migrating the rq->donor, switch it out for idle */ if (task_current_donor(rq, p)) - proxy_resched_idle(rq); + proxy_reset_donor(rq); =20 block_task(rq, p, TASK_WAKING); return true; --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9DA8637267D for ; Sat, 4 Apr 2026 05:36:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281019; cv=none; b=tDC4gL+6rBDX1xrecj/6J6tKW99TmidNV0xJ29QJsdLRDyOvYZ8RKLxTgv+X1U7fu0ji8x8vICFl0hGdTqlQZDVUKArPQ2Vb0DG+2Jh8/scAe2ql5rqNoEUUs/W6Y+ZjLa/DJzK6QXe/caK7dhiuilWqW2TCN3Cu3hba096kb48= ARC-Message-Signature: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281019; c=relaxed/simple; bh=8DF1CAjyMoRMqx9bLLlT9MasF75wWUAIPKyM43lBm0w=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=aiUa4WGDFAUhgho5bqShLTVR/AGiX6IQXgNLCHmQOQnBE//SQ3cLKJa6AABg79kY5Aht2XQ7CwTDiOBzLfvJ3Qwc75odw/eF/BQxD9IixlR+IpxAwFpmf4+CVp+Bo8wFqKv2vdZ1+W3rv0iKTXmPew54Zb0cJKqC1wXMwuDaGbM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Pod54vhK; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Pod54vhK" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-35d93a8149bso4002186a91.0 for ; Fri, 03 Apr 2026 22:36:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281015; x=1775885815; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=fdvh9En/gGOuEWyTn5zdfN4mzmJEQdNkdBVEu82mKWM=; b=Pod54vhKoSc9JUH36ZkQ0zhH6FxafhZOwYqzBCdnGDI8v/XC5CDxrk1FDRECvTdCge Jq40CZNCa7UD7K/d+ZJHZf3VjCVI4jBJ50PX0568izyku3X12eFvCfBH586xq1VODezC T39XkxWraBVmLlXoLAVGn/YKRWStsUjOzigaMmiL+iqxt4GXx8AQWrDfgdM7flcyuaa1 LU/tf77/7fAixCdGhxfBM8LCGJI+TH3zZ7vExhr4i1vSIfwZryYtQUslrns7lxbuLHPR aOPNTIBsd/JnFMfQhw22JjHtdajFHnBlxSr9dc6GlPa+Di6T8noje2aoDU799/t1lNQB 5pmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281015; x=1775885815; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=fdvh9En/gGOuEWyTn5zdfN4mzmJEQdNkdBVEu82mKWM=; b=kpOI/ufUWuqNnX/MDaJERIKo+idPGn4YZ4za+G5TaCZfVQ8RrXuCHrUUJutTrBq1QU BDeu6no4AeytRYgK69NX5nEKO1Xz6ZsZdNGL7IVQ8uqJzISPQKua8lIhEFP3iA8JJDzP aHd6YkYPeODPy+Fu3rWnwKYbcsZpuFCCWeH24vbPZBgck0nPe2QhOKgJ3/30lPP2WGHW EITzjESL7NXTAUIB/4eMB9D6E07kLy9O3QPTwzs3tnP0k01OXsvjdzeUzzEnK+r+heus 8u6tNCnKTHC9oH94GV5X4pA5zv//ViOwtatYlERuX+tqDmrNJD69GcGB/ltdJR0FqoLP vxhg== X-Gm-Message-State: AOJu0YzqZfojWqhTZGHpmpXKlF8Ns58ok1tTfKfx0Olk37mirhuTw06N VASEkLLPc49Nnp4Dt5w4FQGiHJDf29pH88d4Rjsb7E0jUyNyEkwBRHBd+7gMCd7hPm5bQIZYSC9 L9BOzJCqeLY1StOqcNkh5XDRYQT8zSwLamQ0dRSp9zTXNoAxNwSF7iqVjB2zI8I1pbhd1W1FfYO 1gxXubcpVaSd9YUH0pBTNb3QKgedGUZY1WHHInIasQXjLGhkFZ X-Received: from pga9.prod.google.com ([2002:a05:6a02:4f89:b0:c5e:84e5:d15c]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:6d8c:b0:398:8ea8:5f9f with SMTP id adf61e73a8af0-39f16f8dbd9mr7875029637.16.1775281013795; Fri, 03 Apr 2026 22:36:53 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:25 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-9-jstultz@google.com> Subject: [PATCH v27 08/10] sched: Add blocked_donor link to task for smarter mutex handoffs From: John Stultz To: LKML Cc: Peter Zijlstra , Juri Lelli , Valentin Schneider , "Connor O'Brien" , John Stultz , Joel Fernandes , Qais Yousef , Ingo Molnar , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. 
McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Peter Zijlstra Add a link to the task this task is proxying for, and use it so the mutex owner can do an intelligent hand-off of the mutex to the task on whose behalf the owner is running. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Juri Lelli Signed-off-by: Valentin Schneider Signed-off-by: Connor O'Brien [jstultz: This patch was split out from larger proxy patch] Signed-off-by: John Stultz --- v5: * Split out from larger proxy patch v6: * Moved proxied value from earlier patch to this one where it is actually used * Rework logic to check sched_proxy_exec() instead of using ifdefs * Moved comment change to this patch where it makes sense v7: * Use more descriptive term than "us" in comments, as suggested by Metin Kaya. * Minor typo fixup from Metin Kaya * Reworked proxied variable to prev_not_proxied to simplify usage v8: * Use helper for donor blocked_on_state transition v9: * Re-add mutex lock handoff in the unlock path, but only when we have a blocked donor * Slight reword of commit message suggested by Metin v18: * Add task_init initialization for blocked_donor, suggested by Suleiman v23: * Reworks for PROXY_WAKING approach suggested by PeterZ v25: * Simplified some logic now we don't have proxy_tag_curr() Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. 
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- include/linux/sched.h | 1 + init/init_task.c | 1 + kernel/fork.c | 1 + kernel/locking/mutex.c | 44 +++++++++++++++++++++++++++++++++++++++--- kernel/sched/core.c | 14 +++++++++++++- 5 files changed, 57 insertions(+), 4 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 3ae1330801157..18665b4b973e2 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1238,6 +1238,7 @@ struct task_struct { #endif =20 struct mutex *blocked_on; /* lock we're blocked on */ + struct task_struct *blocked_donor; /* task that is boosting this task */ raw_spinlock_t blocked_lock; =20 #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER diff --git a/init/init_task.c b/init/init_task.c index b5f48ebdc2b6e..41c19670c8f6b 100644 --- a/init/init_task.c +++ b/init/init_task.c @@ -200,6 +200,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = =3D { .mems_allowed_seq =3D SEQCNT_SPINLOCK_ZERO(init_task.mems_allowed_seq, &init_task.alloc_lock), #endif + .blocked_donor =3D NULL, #ifdef CONFIG_RT_MUTEXES .pi_waiters =3D RB_ROOT_CACHED, .pi_top_task =3D NULL, diff --git a/kernel/fork.c b/kernel/fork.c index 079802cb61002..a3d2cd4395791 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2178,6 +2178,7 @@ __latent_entropy struct task_struct *copy_process( lockdep_init_task(p); =20 p->blocked_on =3D NULL; /* not blocked yet */ + p->blocked_donor =3D NULL; /* nobody is boosting p yet */ =20 #ifdef CONFIG_BCACHE p->sequential_io =3D 0; diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index 7d359647156df..65f0f35b88972 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -942,7 +942,7 @@ EXPORT_SYMBOL_GPL(ww_mutex_lock_interruptible); */ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, u= nsigned long ip) { - struct task_struct *next =3D NULL; + 
struct task_struct *donor, *next =3D NULL; DEFINE_WAKE_Q(wake_q); unsigned long owner; unsigned long flags; @@ -961,6 +961,12 @@ static noinline void __sched __mutex_unlock_slowpath(s= truct mutex *lock, unsigne MUTEX_WARN_ON(__owner_task(owner) !=3D current); MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP); =20 + if (sched_proxy_exec() && current->blocked_donor) { + /* force handoff if we have a blocked_donor */ + owner =3D MUTEX_FLAG_HANDOFF; + break; + } + if (owner & MUTEX_FLAG_HANDOFF) break; =20 @@ -974,7 +980,34 @@ static noinline void __sched __mutex_unlock_slowpath(s= truct mutex *lock, unsigne =20 raw_spin_lock_irqsave(&lock->wait_lock, flags); debug_mutex_unlock(lock); - if (!list_empty(&lock->wait_list)) { + + if (sched_proxy_exec()) { + raw_spin_lock(&current->blocked_lock); + /* + * If we have a task boosting current, and that task was boosting + * current through this lock, hand the lock to that task, as that + * is the highest waiter, as selected by the scheduling function. + */ + donor =3D current->blocked_donor; + if (donor) { + struct mutex *next_lock; + + raw_spin_lock_nested(&donor->blocked_lock, SINGLE_DEPTH_NESTING); + next_lock =3D __get_task_blocked_on(donor); + if (next_lock =3D=3D lock) { + next =3D donor; + __set_task_blocked_on_waking(donor, next_lock); + wake_q_add(&wake_q, donor); + current->blocked_donor =3D NULL; + } + raw_spin_unlock(&donor->blocked_lock); + } + } + + /* + * Failing that, pick any on the wait list. 
+ */ + if (!next && !list_empty(&lock->wait_list)) { /* get the first entry from the wait-list: */ struct mutex_waiter *waiter =3D list_first_entry(&lock->wait_list, @@ -982,14 +1015,19 @@ static noinline void __sched __mutex_unlock_slowpath= (struct mutex *lock, unsigne =20 next =3D waiter->task; =20 + raw_spin_lock_nested(&next->blocked_lock, SINGLE_DEPTH_NESTING); debug_mutex_wake_waiter(lock, waiter); - set_task_blocked_on_waking(next, lock); + __set_task_blocked_on_waking(next, lock); + raw_spin_unlock(&next->blocked_lock); wake_q_add(&wake_q, next); + } =20 if (owner & MUTEX_FLAG_HANDOFF) __mutex_handoff(lock, next); =20 + if (sched_proxy_exec()) + raw_spin_unlock(&current->blocked_lock); raw_spin_unlock_irqrestore_wake(&lock->wait_lock, flags, &wake_q); } =20 diff --git a/kernel/sched/core.c b/kernel/sched/core.c index a0d55225a62c3..9197b4274de8c 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6753,7 +6753,17 @@ static void proxy_migrate_task(struct rq *rq, struct= rq_flags *rf, * Find runnable lock owner to proxy for mutex blocked donor * * Follow the blocked-on relation: - * task->blocked_on -> mutex->owner -> task... + * + * ,-> task + * | | blocked-on + * | v + * blocked_donor | mutex + * | | owner + * | v + * `-- task + * + * and set the blocked_donor relation, this latter is used by the mutex + * code to find which (blocked) task to hand-off to. * * Lock order: * @@ -6893,6 +6903,7 @@ find_proxy_task(struct rq *rq, struct task_struct *do= nor, struct rq_flags *rf) * rq, therefore holding @rq->lock is sufficient to * guarantee its existence, as per ttwu_remote(). 
*/ + owner->blocked_donor =3D p; } WARN_ON_ONCE(owner && !owner->on_rq); return owner; @@ -7050,6 +7061,7 @@ static void __sched notrace __schedule(int sched_mode) clear_task_blocked_on(prev, NULL); =20 rq_set_donor(rq, next); + next->blocked_donor =3D NULL; if (unlikely(next->blocked_on)) { next =3D find_proxy_task(rq, next, &rf); if (!next) { --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 59EFC34A3A5 for ; Sat, 4 Apr 2026 05:36:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281018; cv=none; b=Q7zLEb1JZMRCZo3qxKwR1n6pg/Iqwe62GcbMYNzw+R84tqNH5Lpta5jIBSgXQEb++pW9asOjvKesenJG9RzFdRzIibexbHNiW4LTvVStdqgvBoSuriXudQCeLVLc0BcNw3C7obuc40+l6oouBSBjUv5evD7ZdiUG7wPNucvVIGI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281018; c=relaxed/simple; bh=uGnqBjxqD9968pbq5J8npALEql4QzUTA7OuFcZx36Ys=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=IvO+1KA8GsgtTwwMjzFI/Rs8MTMiDgSG9+Hw7rNhOCrNVFQN0+9wwyRvAICE7v3tNrVDC1mY/OEP7U8SV6GSBgzqgkOrle4C9gHQb8Kg0sLHUrNjIpQ3iapx/j7XpHJFMjOR4ur/81xtyCga4TWkB+amSQQ+4gSi69WJUTQu3L4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=tigCOEqU; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="tigCOEqU" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-c7422397574so3453310a12.0 for ; Fri, 03 Apr 2026 22:36:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281016; x=1775885816; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=BbleK2rQ9CB1joEUhU8swEcN4Lv7Hyg2pgINrudzdSE=; b=tigCOEqUHhqei3ZF4PxzXsXgZTkEn4BdMWcyTeG+mOTvA8a+C7TRQ50CWRUenDkj3v dY0dpM2omfx7IQ5xBOt9/VBeT0DZiewNaNwlcbQXJMKz2zCiguwBix+nSEGES4/xSvh4 qet8j1Kj0OKzKKNAY+yUq58ztmUKAKks/wfYgcWoVJxIWt9kpcgjCgZlBLVULySCB+r9 52yecGavZHfUOyq5JxSxe2p2ZjK4wlWNWCdLDsMl5ohZtMJP28UGFUpKYYuD4X2CLMWy 9rmHpbVcSToGXpTorKTCj88R33ks6BAtO68OvJsjdCBZCfnMLeLSY/UG0KxB2TeuOmV7 BvYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281016; x=1775885816; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=BbleK2rQ9CB1joEUhU8swEcN4Lv7Hyg2pgINrudzdSE=; b=RscpS3h19qn8huevnrO+Dw+me55WsDQiIdsOiOwfACu4FMyEtv6v+PI4Jf9h2GrVly BIaXXfTrx2vp5gYOyx6AF8pJLGgNiDGDgDTlajv3Z7/qGaFi1PrCgnOXfk6D5WyXfnaO +cSdzrOdOutf5GmKQIlTKK+kF5jA20xX9h4gfOVGlkKn2O2CPy6y3voYqHY9LNxXfvcj idT03EQv4Tjxik4DDeG5sxISULpUeCt+e2BD/oyKDZtyZAadLEl/yXHBXDYBx6J+1P+7 ZTHA/oRhJcoRWX5ZkKyroP9dHZEP7CDLWYodkZpkmb5wRrClpnIRGvRASxcNkKsjOOa1 ERiw== X-Gm-Message-State: AOJu0YwABYH1nUOtleoZ3WNpYbV9LLu+3JvSZs5VDoUHFt4vMr71M9G7 sp/jC20RHVKmqJryxB3LwiKH9XXQ7tjo11W0Y1UbP5cAqEEGdmSznUJ6Ds8vwG+aJd1PgdQogev +7fMyUUKUxNI8jerXzFs23m27AXSZLVG5szu1EKPnBany18m8RjJ1+KcSS6nqZ6aVAT4eJEYc9O hh4VyhMyadfJLgASo9wI6PGziXTydbMAuqntyuGmFKJSEgIb3G X-Received: from pgbbz35.prod.google.com ([2002:a05:6a02:623:b0:c74:2040:e525]) (user=jstultz 
job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:7f8a:b0:398:8e5c:4a94 with SMTP id adf61e73a8af0-39f2f11e04cmr5351925637.54.1775281015378; Fri, 03 Apr 2026 22:36:55 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:26 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: <20260404053632.1729280-10-jstultz@google.com> Subject: [PATCH v27 09/10] sched: Break out core of attach_tasks() helper into sched.h From: John Stultz To: LKML Cc: John Stultz , K Prateek Nayak , Joel Fernandes , Qais Yousef , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Pull the core of attach_tasks() out into sched.h so it can be used more generically. Suggested-by: K Prateek Nayak Signed-off-by: John Stultz --- Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. 
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- kernel/sched/fair.c | 16 +--------------- kernel/sched/sched.h | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+), 15 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 4a6669c517dae..9964670621d8a 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9969,21 +9969,7 @@ static int detach_tasks(struct lb_env *env) */ static void attach_tasks(struct lb_env *env) { - struct list_head *tasks =3D &env->tasks; - struct task_struct *p; - struct rq_flags rf; - - rq_lock(env->dst_rq, &rf); - update_rq_clock(env->dst_rq); - - while (!list_empty(tasks)) { - p =3D list_first_entry(tasks, struct task_struct, se.group_node); - list_del_init(&p->se.group_node); - - attach_task(env->dst_rq, p); - } - - rq_unlock(env->dst_rq, &rf); + __attach_tasks(env->dst_rq, &env->tasks); } =20 #ifdef CONFIG_NO_HZ_COMMON diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index adefea777e0a5..02e28fd211b76 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3034,6 +3034,25 @@ static inline void attach_one_task(struct rq *rq, st= ruct task_struct *p) attach_task(rq, p); } =20 +/* + * __attach_tasks() - attaches a list of tasks (using se.group_node) to + * the new rq + */ +static inline void __attach_tasks(struct rq *rq, struct list_head *tasks) +{ + guard(rq_lock)(rq); + update_rq_clock(rq); + + while (!list_empty(tasks)) { + struct task_struct *p; + + p =3D list_first_entry(tasks, struct task_struct, se.group_node); + list_del_init(&p->se.group_node); + + attach_task(rq, p); + } +} + #ifdef CONFIG_PREEMPT_RT # define SCHED_NR_MIGRATE_BREAK 8 #else --=20 2.53.0.1213.gd9a14994de-goog From nobody Sun Jun 14 17:32:41 2026 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 
bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4BAFC358376 for ; Sat, 4 Apr 2026 05:36:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281019; cv=none; b=PMD+r6JxBUXABwZ4WSdsUW9bnH/216eAux1UyJyI7Wix/is7NGQ9erUWcytA1ZFFX7v42uwN9yQgy9KiC3MEDnDam/4d0Rbc0w4u6JyDmZ6jzNgOCp7CNda1/1GMRtSoAFNxE9Kj0N6q1T9ZcwUcXny+mSIuPsdxRfUIy4ngyaE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775281019; c=relaxed/simple; bh=pS62swtbY1WzP8+s4ohSpg6RpjgQH9a6cVLQmrMfVzc=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=C5TlskjCmHEmCNDvFoXCu9j07na9ysqNaqX32Vxv5EhPkFgFYbPXISUcg0ggHdygeFJaJ3Ngq9RUweFaeoEBlIK2iYRORcxdUWx4i3g1v9KJeip3YGSZcIPRK8pDnWHUSeA4SSA+K8iSYDEhbf2xPv1KaJj1QqcTc/hUlasidXk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=E/q+jMbx; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jstultz.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="E/q+jMbx" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2b256ed2cc8so27171865ad.3 for ; Fri, 03 Apr 2026 22:36:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20251104; t=1775281018; x=1775885818; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; 
bh=F5iXkOwPiQFobESqRJsKiFf0yXE4Dgq0YFJVQa0OGWU=; b=E/q+jMbxJeR7JEfA5S3ZJ8eWYezufCg+ZgPiB2ctRPWOeu1rsw9hPLI2MV567/sRGy gzPeeuQMwjNUmk75mTCmupDw9xFMF1+j11CJmOorIY/opsWNEzO2xKle7+b2Y7S7T69a N1NyHtckfXjEkF9PyWJD8J4PSpPMu5xY94k07o7IEuWYTb7re5KIcYlg/oOzJ21lDdka OmZIesJFb7OJvwYyzZqLs1/ePIlYZy+CJgRSFEd/9yYDY3+fJa1f8g59VHRsNeHfdIlV dEjJ8gdxv9K/eHSePgQ0OFnT0BDASsXJyj54wrEnJY4ZleH4QQRf8lIaT+P7XD6+X6me yb6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775281018; x=1775885818; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=F5iXkOwPiQFobESqRJsKiFf0yXE4Dgq0YFJVQa0OGWU=; b=sD4/3CqNhJoY+pzSFoChSS0PP8Q+g2jD8lfSswSqP1zxLwK7Ao2m5Qhx8FhtYlw38B +USn09cYjoTvekttRP0BQy6/DkGfprWPIftsBY3tgcxKZa6T9ytWfQy5cZCR0R3RAKuf k9Lwk6hzmi5zpEoNXsl8hO7yS5z37ysHPUj1M9CcfCy1w4aJ96fwFT8nTIOakmgmHCq6 6O7CI9d7a/Sf9KxZ2/ep1tbCh8iPQZmJggnaY/AE+8sl2cDyg1uK8MsMm41zGRusfqeh iemrnf12jy12gvEwhvHhAAeBQOWe1Pi3Cjf3xQ3wjLM81zfx2Epw7Kuj+RXJJl4NqWQI DhOg== X-Gm-Message-State: AOJu0YxTPKV8TcxgtjuagYS7JcLbn4maQsAwaZ9yEax/qaHQZDYfuIFg wOSNx5WHChTtL8SKVG6DPpSk6J3guc0w4svBnPVh/NspQh5eBPG2lrz1gLiPbDi+NDEoxQUwq3n t9h6AuxooTYzL9Y7rTJPr6de8V5CXCBuirmoqOo+2oSBXh2pk7Opt3fILSDF7YmNL8nXZ/PYC4N KEPG+CqPRBTNm762xMDFZYqDIaD0d88N/xoB0jEv/KspnVjOzP X-Received: from plks18.prod.google.com ([2002:a17:903:2d2:b0:2b2:52a7:1000]) (user=jstultz job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:ef4f:b0:2b0:663f:6b53 with SMTP id d9443c01a7336-2b28180e4cfmr51304625ad.13.1775281017067; Fri, 03 Apr 2026 22:36:57 -0700 (PDT) Date: Sat, 4 Apr 2026 05:36:27 +0000 In-Reply-To: <20260404053632.1729280-1-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20260404053632.1729280-1-jstultz@google.com> X-Mailer: git-send-email 2.53.0.1213.gd9a14994de-goog Message-ID: 
<20260404053632.1729280-11-jstultz@google.com> Subject: [PATCH v27 10/10] sched: Migrate whole chain in proxy_migrate_task() From: John Stultz To: LKML Cc: John Stultz , Joel Fernandes , Qais Yousef , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Valentin Schneider , Steven Rostedt , Ben Segall , Zimuzo Ezeozue , Mel Gorman , Will Deacon , Waiman Long , Boqun Feng , "Paul E. McKenney" , Metin Kaya , Xuewen Yan , K Prateek Nayak , Thomas Gleixner , Daniel Lezcano , Suleiman Souhlal , kuyo chang , hupu , kernel-team@android.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Instead of migrating one task each time through find_proxy_task(), we can walk up the blocked_donor ptrs and migrate the entire current chain in one go. This was broken out of earlier patches and held back while the series was being stabilized, but I wanted to re-introduce it. Signed-off-by: John Stultz --- v12: * Earlier this was re-using blocked_node, but I hit a race with activating blocked entities, and to avoid it introduced a new migration_node listhead v18: * Add init_task initialization of migration_node as suggested by Suleiman v22: * Move migration_node under CONFIG_SCHED_PROXY_EXEC as suggested by K Prateek v25: * Use se.group_node instead of adding migration_node, as suggested by K Prateek * Integrated attach_tasks() cleanups suggested by K Prateek Cc: Joel Fernandes Cc: Qais Yousef Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Juri Lelli Cc: Vincent Guittot Cc: Dietmar Eggemann Cc: Valentin Schneider Cc: Steven Rostedt Cc: Ben Segall Cc: Zimuzo Ezeozue Cc: Mel Gorman Cc: Will Deacon Cc: Waiman Long Cc: Boqun Feng Cc: "Paul E. 
McKenney" Cc: Metin Kaya Cc: Xuewen Yan Cc: K Prateek Nayak Cc: Thomas Gleixner Cc: Daniel Lezcano Cc: Suleiman Souhlal Cc: kuyo chang Cc: hupu Cc: kernel-team@android.com --- kernel/sched/core.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9197b4274de8c..164429de8dc1f 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6723,9 +6723,9 @@ static void proxy_migrate_task(struct rq *rq, struct = rq_flags *rf, __must_hold(__rq_lockp(rq)) { struct rq *target_rq =3D cpu_rq(target_cpu); + LIST_HEAD(migrate_list); =20 lockdep_assert_rq_held(rq); - WARN_ON(p =3D=3D rq->curr); /* * Since we are migrating a blocked donor, it could be rq->donor, * and we want to make sure there aren't any references from this @@ -6738,13 +6738,20 @@ static void proxy_migrate_task(struct rq *rq, struc= t rq_flags *rf, * before we release the lock. */ proxy_resched_idle(rq); - - deactivate_task(rq, p, DEQUEUE_NOCLOCK); - proxy_set_task_cpu(p, target_cpu); - + for (; p; p =3D p->blocked_donor) { + WARN_ON(p =3D=3D rq->curr); + deactivate_task(rq, p, DEQUEUE_NOCLOCK); + proxy_set_task_cpu(p, target_cpu); + /* + * We can re-use se.group_node to migrate the thing, + * because @p is deactivated (won't be balanced) and + * we hold the rq_lock. + */ + list_add(&p->se.group_node, &migrate_list); + } proxy_release_rq_lock(rq, rf); =20 - attach_one_task(target_rq, p); + __attach_tasks(target_rq, &migrate_list); =20 proxy_reacquire_rq_lock(rq, rf); } --=20 2.53.0.1213.gd9a14994de-goog
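Illustrative note on the chain walk in patch 10/10: the following is a minimal user-space sketch, not kernel code, of the "walk up the blocked_donor pointers and migrate the whole chain in one go" idea. struct toy_task, toy_collect_chain() and toy_attach_all() are hypothetical stand-ins invented for this sketch; the next_migrate field only plays the role that se.group_node plays in the patch, and all locking, rq clock updates and the WARN_ON(p == rq->curr) check are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct task_struct. */
struct toy_task {
	int cpu;                        /* CPU the task is assigned to */
	int queued;                     /* 1 while on a runqueue, 0 while detached */
	struct toy_task *blocked_donor; /* task boosting this one, or NULL */
	struct toy_task *next_migrate;  /* stands in for se.group_node */
};

/*
 * Walk from @p up the blocked_donor links. Each task is dequeued,
 * retargeted to @target_cpu and pushed onto the head of a private list,
 * mirroring the for-loop in the patched proxy_migrate_task().
 * Returns the head of the collected list.
 */
static struct toy_task *toy_collect_chain(struct toy_task *p, int target_cpu)
{
	struct toy_task *head = NULL;

	for (; p; p = p->blocked_donor) {
		p->queued = 0;          /* like deactivate_task() */
		p->cpu = target_cpu;    /* like proxy_set_task_cpu() */
		p->next_migrate = head; /* head insertion, like list_add() */
		head = p;
	}
	return head;
}

/*
 * Drain the collected list and re-queue every task, in the spirit of
 * __attach_tasks(). Returns the number of tasks attached.
 */
static int toy_attach_all(struct toy_task *head)
{
	int attached = 0;

	while (head) {
		struct toy_task *p = head;

		head = p->next_migrate;
		p->next_migrate = NULL; /* like list_del_init() */
		p->queued = 1;          /* like attach_task() */
		attached++;
	}
	return attached;
}
```

In this sketch the collection step and the attach step are separate because, in the patch, the source rq lock is dropped between them (proxy_release_rq_lock()) before the target rq lock is taken inside __attach_tasks(); the private list is what carries the detached tasks across that gap.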