From nobody Thu Apr 9 16:33:21 2026
From: soolaugust@gmail.com
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
	vschneid@redhat.com, jstultz@google.com, kprateek.nayak@amd.com,
	zhidao su
Subject: [PATCH] sched/proxy_exec: Optimize proxy_tag_curr() pushable removal
Date: Tue, 3 Mar 2026 19:57:18 +0800
Message-ID: <20260303115718.278608-1-soolaugust@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: zhidao su

proxy_tag_curr() is called on every proxy-execution activation to
prevent the owner task from being pushed to another CPU while the
donor runs on its behalf. The current implementation uses a full
dequeue/enqueue (SAVE/RESTORE) cycle. This is unnecessarily expensive:

RT owner: O(log n) run-list manipulation and plist update, plus
uclamp/PSI/sched_core bookkeeping, all with zero net effect except
for the single dequeue_pushable_task() call buried inside
dequeue_task_rt().

DL owner: same, plus DEQUEUE_SAVE triggers sub_running_bw() +
sub_rq_bw(), which ENQUEUE_RESTORE immediately undoes via
add_rq_bw() + add_running_bw(). The bandwidth counters of an owner
that never left the runqueue should not be perturbed.

CFS owner: the whole cycle is a no-op; CFS has no pushable list.
can_migrate_task() already rejects blocked owners via
task_is_blocked(), making migration impossible.
The SAVE/RESTORE cycle achieves pushable removal only indirectly:
enqueue_task_rt/dl() suppresses re-enqueue into the pushable list
when task_is_blocked(owner) is true. The same result is obtained more
directly by calling dequeue_pushable_task() or
dequeue_pushable_dl_task() once, without any of the side effects.

Replace the workaround with per-class direct calls:

  RT:  dequeue_pushable_task(rq, owner) -- O(1) plist remove
  DL:  dequeue_pushable_dl_task(rq, owner) -- O(log n) rb_erase, but
       avoids the bandwidth counter churn entirely
  CFS: no-op (no pushable list; task_is_blocked() suffices)

Both functions are promoted from static and declared in sched.h.
deadline.c also gains the missing isolation.h include required by
dl_get_task_effective_cpus().

Signed-off-by: zhidao su
---
 kernel/sched/core.c     | 28 +++++++++++++++++++---------
 kernel/sched/deadline.c |  3 ++-
 kernel/sched/rt.c       |  2 +-
 kernel/sched/sched.h    |  2 ++
 4 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dc9f17b35e4..2aba15d84b7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6728,16 +6728,26 @@ static inline void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
 	if (!sched_proxy_exec())
 		return;
 	/*
-	 * pick_next_task() calls set_next_task() on the chosen task
-	 * at some point, which ensures it is not push/pullable.
-	 * However, the chosen/donor task *and* the mutex owner form an
-	 * atomic pair wrt push/pull.
+	 * The donor goes through set_next_task() which calls
+	 * dequeue_pushable_task() making it non-pushable. The owner
+	 * does not go through that path, so we must remove it from
+	 * the pushable list explicitly.
 	 *
-	 * Make sure owner we run is not pushable. Unfortunately we can
-	 * only deal with that by means of a dequeue/enqueue cycle. :-/
-	 */
-	dequeue_task(rq, owner, DEQUEUE_NOCLOCK | DEQUEUE_SAVE);
-	enqueue_task(rq, owner, ENQUEUE_NOCLOCK | ENQUEUE_RESTORE);
+	 * For RT tasks: remove from the plist directly.
+	 * For DL tasks: remove from the rb-tree directly.
+	 * For CFS tasks: no pushable list exists; can_migrate_task()
+	 * already rejects blocked owners via task_is_blocked().
+	 *
+	 * The prior dequeue/enqueue(SAVE/RESTORE) cycle achieved the
+	 * same result by relying on task_is_blocked() suppressing the
+	 * re-enqueue into the pushable list, but it carried O(log n)
+	 * overhead and, for DL owners, triggered sub_running_bw() +
+	 * sub_rq_bw() -- bandwidth counter churn with no net effect.
+	 */
+	if (rt_task(owner))
+		dequeue_pushable_task(rq, owner);
+	else if (dl_task(owner))
+		dequeue_pushable_dl_task(rq, owner);
 }
 
 /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d08b0042932..a43557fd84b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -18,6 +18,7 @@
 
 #include
 #include
+#include <linux/sched/isolation.h>
 #include
 #include "sched.h"
 #include "pelt.h"
@@ -593,7 +594,7 @@ static void enqueue_pushable_dl_task(struct rq *rq, struct task_struct *p)
 	}
 }
 
-static void dequeue_pushable_dl_task(struct rq *rq, struct task_struct *p)
+void dequeue_pushable_dl_task(struct rq *rq, struct task_struct *p)
 {
 	struct dl_rq *dl_rq = &rq->dl;
 	struct rb_root_cached *root = &dl_rq->pushable_dl_tasks_root;
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index f69e1f16d92..5a124ee3114 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -410,7 +410,7 @@ static void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
 	}
 }
 
-static void dequeue_pushable_task(struct rq *rq, struct task_struct *p)
+void dequeue_pushable_task(struct rq *rq, struct task_struct *p)
 {
 	plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks);
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 43bbf0693cc..dd5e187e6c7 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -362,6 +362,7 @@ extern bool dl_param_changed(struct task_struct *p, const struct sched_attr *attr);
 extern int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
 extern int dl_bw_deactivate(int cpu);
 extern s64 dl_scaled_delta_exec(struct rq *rq, struct sched_dl_entity *dl_se, s64 delta_exec);
+extern void dequeue_pushable_dl_task(struct rq *rq, struct task_struct *p);
 /*
  * SCHED_DEADLINE supports servers (nested scheduling) with the following
  * interface:
@@ -3329,6 +3330,7 @@ print_numa_stats(struct seq_file *m, int node, unsigned long tsf,
 
 extern void init_cfs_rq(struct cfs_rq *cfs_rq);
 extern void init_rt_rq(struct rt_rq *rt_rq);
+extern void dequeue_pushable_task(struct rq *rq, struct task_struct *p);
 extern void init_dl_rq(struct dl_rq *dl_rq);
 
 extern void cfs_bandwidth_usage_inc(void);
-- 
2.43.0