From nobody Mon Jun 8 07:24:55 2026 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BE718494A02; Thu, 4 Jun 2026 18:45:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780598749; cv=none; b=MgE8b5wbKpz0+Pm6Ra+ZmPZLHYfUgqoDBpdCxBjk5bJOf0rweQQNOrj2nAY7kmjZLq5lrZjYGqe97xC/Mo8vSdBQBb+gTdmGTMZmi097zvQF7x7MTxDlPcoxiR1A/lMZEdmtPcvmKxsdZXDfUMetT4xYUE7TZlDYNvQIpIpuL2w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780598749; c=relaxed/simple; bh=HPZ6wwJIHU6Fw3kR+Z6Q+QhqORiCvvbK15628+CITjE=; h=Date:From:To:Subject:Cc:In-Reply-To:References:MIME-Version: Message-ID:Content-Type; b=mEGba9aMYcWpubmzLf4Z1U06WCZ2fG8D8uJOT0jRZicvZtu/e8zqGND05p4QL1O7unPHDiSTrY97qM7UM4O9ZfhVHkhkdYnlHPTHnfNrTYMoaURaZiGa7m8HaUB0IthLzgVGIxfJ+dOsoIlQNskmeXMNb1uvfBDvBNWGq/xLlMY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=Q2MxYmnA; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=X/HukvEO; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="Q2MxYmnA"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="X/HukvEO" Date: Thu, 04 Jun 2026 18:45:44 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=linutronix.de; s=2020; t=1780598746; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IPa+AwgxFn0++E/Wl0dSzH19iluiu4+kg5IG8vfo2Tw=; b=Q2MxYmnAA5mMn2uIS41JO5HfZr8ImCcJl+zxm7KfhFB0hLdojjW6yqJiAimnJ+kj4Xfl34 T76iRx4Xgr3g6+DuPRX4KIUD5orqbT+SKGXuGVf7y/YQiiufPhhHiQMnCjCCxPAEFTkI/A VjTQTVsYogaSJdvgEyoHLaeaJQK64mch/zBxlrMKCeUMsYNfYgvVGFTREwO422SQ668vye rQ5wI3eGzs2UFgc207ex+ZWxESKuDHY1qCMjVW4jUmQxtvDQdQ6qDr4sdTuXqphd9k57HS hXxwItoDMkVdspl0CoJs4pOm8gea+k5cMD8kmg7dAaXqLL6ydj3FYsv66Cp4Eg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1780598746; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IPa+AwgxFn0++E/Wl0dSzH19iluiu4+kg5IG8vfo2Tw=; b=X/HukvEOsVJIQHg4BTGLhhRor2y0wRCT2mfNTOCcuEwdTMLdln5hTfbw9YepDbgVAP64ah 2R8YvCMuyDjpSwCw== From: "tip-bot2 for Peter Zijlstra" Sender: tip-bot2@linutronix.de Reply-to: linux-kernel@vger.kernel.org To: linux-tip-commits@vger.kernel.org Subject: [tip: sched/core] sched: Add blocked_donor link to task for smarter mutex handoffs Cc: "Peter Zijlstra (Intel)" , Juri Lelli , Valentin Schneider , "Connor O'Brien" , John Stultz , x86@kernel.org, linux-kernel@vger.kernel.org In-Reply-To: <20260512025635.2840817-8-jstultz@google.com> References: <20260512025635.2840817-8-jstultz@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-ID: <178059874448.710.10771232578900090474.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails Precedence: bulk Content-Type: 
text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable The following commit has been merged into the sched/core branch of tip: Commit-ID: 1628b25248d0742b2ce9c7cfa59cd183e35f37e1 Gitweb: https://git.kernel.org/tip/1628b25248d0742b2ce9c7cfa59cd183e= 35f37e1 Author: Peter Zijlstra AuthorDate: Tue, 12 May 2026 02:56:17=20 Committer: Peter Zijlstra CommitterDate: Tue, 02 Jun 2026 12:26:07 +02:00 sched: Add blocked_donor link to task for smarter mutex handoffs Add link to the task this task is proxying for, and use it so the mutex owner can do an intelligent hand-off of the mutex to the task that the owner is running on behalf. [jstultz: This patch was split out from larger proxy patch] Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Juri Lelli Signed-off-by: Valentin Schneider Signed-off-by: Connor O'Brien Signed-off-by: John Stultz Signed-off-by: Peter Zijlstra (Intel) Link: https://patch.msgid.link/20260512025635.2840817-8-jstultz@google.com --- include/linux/sched.h | 7 +++++- init/init_task.c | 1 +- kernel/fork.c | 1 +- kernel/locking/mutex.c | 60 ++++++++++++++++++++++++++++++++++++----- kernel/sched/core.c | 14 +++++++++- 5 files changed, 75 insertions(+), 8 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index ec17066..e2f127a 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1250,6 +1250,13 @@ struct task_struct { struct mutex *blocked_on; /* lock we're blocked on */ raw_spinlock_t blocked_lock; =20 + /* + * The task that is boosting this task; a back link for the current + * donor stack. Set in schedule() -> find_proxy_task() and only stable + * under preempt_disable(). 
+ */ + struct task_struct *blocked_donor; + #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER /* * Encoded lock address causing task block (lower 2 bits =3D type from diff --git a/init/init_task.c b/init/init_task.c index 3ecd66f..674d174 100644 --- a/init/init_task.c +++ b/init/init_task.c @@ -200,6 +200,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = =3D { .mems_allowed_seq =3D SEQCNT_SPINLOCK_ZERO(init_task.mems_allowed_seq, &init_task.alloc_lock), #endif + .blocked_donor =3D NULL, #ifdef CONFIG_RT_MUTEXES .pi_waiters =3D RB_ROOT_CACHED, .pi_top_task =3D NULL, diff --git a/kernel/fork.c b/kernel/fork.c index a679b24..6fcca1d 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2224,6 +2224,7 @@ __latent_entropy struct task_struct *copy_process( lockdep_init_task(p); =20 p->blocked_on =3D NULL; /* not blocked yet */ + p->blocked_donor =3D NULL; /* nobody is boosting p yet */ =20 #ifdef CONFIG_BCACHE p->sequential_io =3D 0; diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c index a93d4c6..2867716 100644 --- a/kernel/locking/mutex.c +++ b/kernel/locking/mutex.c @@ -981,9 +981,8 @@ EXPORT_SYMBOL_GPL(ww_mutex_lock_interruptible); static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, u= nsigned long ip) __releases(lock) { - struct task_struct *next =3D NULL; + struct task_struct *donor, *next =3D NULL; struct mutex_waiter *waiter; - DEFINE_WAKE_Q(wake_q); unsigned long owner; unsigned long flags; =20 @@ -991,6 +990,14 @@ static noinline void __sched __mutex_unlock_slowpath(s= truct mutex *lock, unsigne __release(lock); =20 /* + * Ensures the proxy donor stack is stable across unlock and handoff. + * Specifically, it avoids the case where current->blocked_donor is + * NULL when it is inspected while doing the unlock, but a preemption + * before taking the wake_lock would make it set and a hand-off is + * missed. 
+ */ + guard(preempt)(); + /* * Release the lock before (potentially) taking the spinlock such that * other contenders can get on with things ASAP. * @@ -1002,6 +1009,12 @@ static noinline void __sched __mutex_unlock_slowpath= (struct mutex *lock, unsigne MUTEX_WARN_ON(__owner_task(owner) !=3D current); MUTEX_WARN_ON(owner & MUTEX_FLAG_PICKUP); =20 + if (sched_proxy_exec() && current->blocked_donor) { + /* force handoff if we have a blocked_donor */ + owner =3D MUTEX_FLAG_HANDOFF; + break; + } + if (owner & MUTEX_FLAG_HANDOFF) break; =20 @@ -1014,20 +1027,53 @@ static noinline void __sched __mutex_unlock_slowpat= h(struct mutex *lock, unsigne } =20 raw_spin_lock_irqsave(&lock->wait_lock, flags); + raw_spin_lock(&current->blocked_lock); debug_mutex_unlock(lock); + + if (sched_proxy_exec()) { + /* + * If we have a task boosting current, and that task was boosting + * current through this lock, hand the lock to that task, as that + * is the highest waiter, as selected by the scheduling function. + */ + donor =3D current->blocked_donor; + if (donor) { + struct mutex *next_lock; + + raw_spin_lock_nested(&donor->blocked_lock, SINGLE_DEPTH_NESTING); + next_lock =3D __get_task_blocked_on(donor); + if (next_lock =3D=3D lock) { + next =3D get_task_struct(donor); + __set_task_blocked_on_waking(donor, next_lock); + current->blocked_donor =3D NULL; + } + raw_spin_unlock(&donor->blocked_lock); + } + } + + /* + * Failing that, pick first on the wait list. 
+ */ waiter =3D lock->first_waiter; - if (waiter) { - next =3D waiter->task; + if (!next && waiter) { + next =3D get_task_struct(waiter->task); =20 + raw_spin_lock_nested(&next->blocked_lock, SINGLE_DEPTH_NESTING); debug_mutex_wake_waiter(lock, waiter); - set_task_blocked_on_waking(next, lock); - wake_q_add(&wake_q, next); + __set_task_blocked_on_waking(next, lock); + raw_spin_unlock(&next->blocked_lock); + } =20 if (owner & MUTEX_FLAG_HANDOFF) __mutex_handoff(lock, next); =20 - raw_spin_unlock_irqrestore_wake(&lock->wait_lock, flags, &wake_q); + raw_spin_unlock(&current->blocked_lock); + raw_spin_unlock_irqrestore(&lock->wait_lock, flags); + if (next) { + wake_up_process(next); + put_task_struct(next); + } } =20 #ifndef CONFIG_DEBUG_LOCK_ALLOC diff --git a/kernel/sched/core.c b/kernel/sched/core.c index c755286..4c6ceff 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6827,7 +6827,17 @@ static void proxy_migrate_task(struct rq *rq, struct= rq_flags *rf, * Find runnable lock owner to proxy for mutex blocked donor * * Follow the blocked-on relation: - * task->blocked_on -> mutex->owner -> task... + * + * ,-> task + * | | blocked-on + * | v + * blocked_donor | mutex + * | | owner + * | v + * `-- task + * + * and set the blocked_donor relation, this latter is used by the mutex + * code to find which (blocked) task to hand-off to. * * Lock order: * @@ -6969,6 +6979,7 @@ find_proxy_task(struct rq *rq, struct task_struct *do= nor, struct rq_flags *rf) * rq, therefore holding @rq->lock is sufficient to * guarantee its existence, as per ttwu_remote(). */ + owner->blocked_donor =3D p; } WARN_ON_ONCE(owner && !owner->on_rq); return owner; @@ -7125,6 +7136,7 @@ pick_again: clear_task_blocked_on(prev, NULL); =20 rq_set_donor(rq, next); + next->blocked_donor =3D NULL; if (unlikely(next->is_blocked && next->blocked_on)) { next =3D find_proxy_task(rq, next, &rf); if (!next) {