Date: Mon, 6 Nov 2023 19:35:02 +0000
In-Reply-To: <20231106193524.866104-1-jstultz@google.com>
References: <20231106193524.866104-1-jstultz@google.com>
X-Mailer: git-send-email 2.42.0.869.gea05f2083d-goog
Message-ID: <20231106193524.866104-20-jstultz@google.com>
Subject: [PATCH v6 19/20] sched: Add blocked_donor link to task for smarter mutex handoffs
From: John Stultz <jstultz@google.com>
To: LKML
Cc: Peter Zijlstra, Joel Fernandes, Qais Yousef, Ingo Molnar, Juri Lelli,
    Vincent Guittot, Dietmar Eggemann, Valentin Schneider, Steven Rostedt,
    Ben Segall, Zimuzo Ezeozue, Youssef Esmat, Mel Gorman,
    Daniel Bristot de Oliveira, Will Deacon, Waiman Long, Boqun Feng,
    "Paul E. McKenney", kernel-team@android.com, Connor O'Brien, John Stultz
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Zijlstra

Add a link to the task this task is proxying for, and use it to do an
intelligent hand-off of the owned mutex to the task we're running on
behalf of.

Cc: Joel Fernandes
Cc: Qais Yousef
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Juri Lelli
Cc: Vincent Guittot
Cc: Dietmar Eggemann
Cc: Valentin Schneider
Cc: Steven Rostedt
Cc: Ben Segall
Cc: Zimuzo Ezeozue
Cc: Youssef Esmat
Cc: Mel Gorman
Cc: Daniel Bristot de Oliveira
Cc: Will Deacon
Cc: Waiman Long
Cc: Boqun Feng
Cc: "Paul E. McKenney"
Cc: kernel-team@android.com
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Juri Lelli
Signed-off-by: Valentin Schneider
Signed-off-by: Connor O'Brien
[jstultz: This patch was split out from the larger proxy patch]
Signed-off-by: John Stultz
---
v5:
* Split out from larger proxy patch

v6:
* Moved proxied value from earlier patch to this one where it is
  actually used
* Reworked logic to check sched_proxy_exec() instead of using ifdefs
* Moved comment change to this patch where it makes sense
---
 include/linux/sched.h  |  1 +
 kernel/fork.c          |  1 +
 kernel/locking/mutex.c | 35 ++++++++++++++++++++++++++++++++---
 kernel/sched/core.c    | 19 +++++++++++++++++--
 4 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 47c7095b918a..9bff2f123207 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1145,6 +1145,7 @@ struct task_struct {
 	struct rt_mutex_waiter		*pi_blocked_on;
 #endif
 
+	struct task_struct	*blocked_donor;	/* task that is boosting us */
 	struct mutex		*blocked_on;	/* lock we're blocked on */
 	bool			blocked_on_waking;	/* blocked on, but waking */
 	raw_spinlock_t		blocked_lock;
diff --git a/kernel/fork.c b/kernel/fork.c
index 930947bf4569..6604e0472da0 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2456,6 +2456,7 @@ __latent_entropy struct task_struct *copy_process(
 	lockdep_init_task(p);
 #endif
 
+	p->blocked_donor = NULL; /* nobody is boosting us yet */
 	p->blocked_on = NULL; /* not blocked yet */
 	p->blocked_on_waking = false; /* not blocked yet */
 
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 5394a3c4b5d9..f7187a247482 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -907,7 +907,7 @@ EXPORT_SYMBOL_GPL(ww_mutex_lock_interruptible);
  */
 static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigned long ip)
 {
-	struct task_struct *next = NULL;
+	struct task_struct *donor, *next = NULL;
 	DEFINE_WAKE_Q(wake_q);
 	unsigned long owner;
 	unsigned long flags;
@@ -945,7 +945,34 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	preempt_disable();
 	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 	debug_mutex_unlock(lock);
-	if (!list_empty(&lock->wait_list)) {
+
+	if (sched_proxy_exec()) {
+		raw_spin_lock(&current->blocked_lock);
+		/*
+		 * If we have a task boosting us, and that task was boosting us through
+		 * this lock, hand the lock to that task, as that is the highest
+		 * waiter, as selected by the scheduling function.
+		 */
+		donor = current->blocked_donor;
+		if (donor) {
+			struct mutex *next_lock;
+
+			raw_spin_lock_nested(&donor->blocked_lock, SINGLE_DEPTH_NESTING);
+			next_lock = get_task_blocked_on(donor);
+			if (next_lock == lock) {
+				next = donor;
+				donor->blocked_on_waking = true;
+				wake_q_add(&wake_q, donor);
+				current->blocked_donor = NULL;
+			}
+			raw_spin_unlock(&donor->blocked_lock);
+		}
+	}
+
+	/*
+	 * Failing that, pick any on the wait list.
+	 */
+	if (!next && !list_empty(&lock->wait_list)) {
 		/* get the first entry from the wait-list: */
 		struct mutex_waiter *waiter =
 			list_first_entry(&lock->wait_list,
@@ -954,7 +981,7 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 		next = waiter->task;
 
 		debug_mutex_wake_waiter(lock, waiter);
-		raw_spin_lock(&next->blocked_lock);
+		raw_spin_lock_nested(&next->blocked_lock, SINGLE_DEPTH_NESTING);
 		WARN_ON(next->blocked_on != lock);
 		next->blocked_on_waking = true;
 		raw_spin_unlock(&next->blocked_lock);
@@ -964,6 +991,8 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	if (owner & MUTEX_FLAG_HANDOFF)
 		__mutex_handoff(lock, next);
 
+	if (sched_proxy_exec())
+		raw_spin_unlock(&current->blocked_lock);
 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 	wake_up_q(&wake_q);
 	preempt_enable();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 760e2753a24c..6ac7a241dacc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6782,7 +6782,17 @@ static inline bool proxy_return_migration(struct rq *rq, struct rq_flags *rf,
  * Find who @next (currently blocked on a mutex) can proxy for.
  *
  * Follow the blocked-on relation:
- *   task->blocked_on -> mutex->owner -> task...
+ *
+ *                ,-> task
+ *                |     | blocked-on
+ *                |     v
+ *  blocked_donor |   mutex
+ *                |     | owner
+ *                |     v
+ *                `-- task
+ *
+ * and set the blocked_donor relation; the latter is used by the mutex
+ * code to find which (blocked) task to hand off to.
  *
  * Lock order:
  *
@@ -6919,6 +6929,8 @@ proxy(struct rq *rq, struct task_struct *next, struct rq_flags *rf)
 		 */
 		raw_spin_unlock(&p->blocked_lock);
 		raw_spin_unlock(&mutex->wait_lock);
+
+		owner->blocked_donor = p;
 	}
 
 	WARN_ON_ONCE(owner && !owner->on_rq);
@@ -7003,6 +7015,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 	unsigned long prev_state;
 	struct rq_flags rf;
 	struct rq *rq;
+	bool proxied;
 	int cpu;
 	bool preserve_need_resched = false;
 
@@ -7053,9 +7066,11 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		switch_count = &prev->nvcsw;
 	}
 
+	proxied = !!prev->blocked_donor;
 pick_again:
 	next = pick_next_task(rq, rq_selected(rq), &rf);
 	rq_set_selected(rq, next);
+	next->blocked_donor = NULL;
 	if (unlikely(task_is_blocked(next))) {
 		next = proxy(rq, next, &rf);
 		if (!next) {
@@ -7119,7 +7134,7 @@ static void __sched notrace __schedule(unsigned int sched_mode)
 		rq = context_switch(rq, prev, next, &rf);
 	} else {
 		/* In case next was already curr but just got blocked_donor */
-		if (unlikely(!task_current_selected(rq, next)))
+		if (unlikely(!proxied && next->blocked_donor))
 			proxy_tag_curr(rq, next);
 
 		rq->clock_update_flags &= ~(RQCF_ACT_SKIP|RQCF_REQ_SKIP);
-- 
2.42.0.869.gea05f2083d-goog