From nobody Mon Dec 1 22:02:10 2025
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v4 22/28] sched/deadline: Introduce dl_server_try_pull_f
Date: Mon, 1 Dec 2025 13:41:55 +0100
Message-ID: <20251201124205.11169-23-yurand2000@gmail.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251201124205.11169-1-yurand2000@gmail.com>
References: <20251201124205.11169-1-yurand2000@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce a new deadline server callback, dl_server_try_pull_f, that
attempts to pull some tasks from other runqueues and returns true if the
runqueue is not empty after this operation.

This function is needed by some scheduling algorithms to guarantee that
they are work-conserving (i.e. whenever there is an idle CPU and a ready
task, the task must be scheduled there immediately) or to enforce some
other property of the scheduling algorithm used on the served runqueue
(for example, a fixed-priority scheduler must schedule the m
highest-priority tasks).

The function is called whenever the dl_server_timer (the runtime
replenishment timer) expires and the deadline server is recharged.
The idea behind this callback is that, since the deadline server is being
unthrottled and is becoming able to serve its runqueue, it should pull
tasks from the other runqueues (if there are no runnable tasks on its own
runqueue, or if the tasks on its runqueue have low priority).

The function takes a sched_dl_entity pointer, i.e. the deadline server,
and is expected to return true if there are runnable tasks on that server
(including tasks just pulled from other runqueues), and false otherwise.

If there are no runnable tasks for a given server, the replenishment timer
callback replenishes the server's bandwidth and then stops the server.

This callback is not relevant for fair deadline servers.

This fixes the test case where a single hog process runs in a cgroup with
a 10ms/100ms reservation. Without this patch, the process runs on only one
CPU. With this patch it runs on all the CPUs of the machine (up to a
utilization of 100ms/100ms), as the definition of the global Fixed
Priority scheduling algorithm requires.

Co-developed-by: luca abeni
Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 include/linux/sched.h   |  3 ++-
 kernel/sched/deadline.c | 12 ++++++++++++
 kernel/sched/fair.c     |  8 +++++++-
 kernel/sched/rt.c       | 13 ++++++++++++-
 kernel/sched/sched.h    |  6 ++++++
 5 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9ef7797983..62b8586d4f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -633,7 +633,7 @@ struct sched_rt_entity {
 #endif
 } __randomize_layout;
 
-typedef bool (*dl_server_has_tasks_f)(struct sched_dl_entity *);
+typedef bool (*dl_server_try_pull_f)(struct sched_dl_entity *);
 typedef struct task_struct *(*dl_server_pick_f)(struct sched_dl_entity *);
 
 struct sched_dl_entity {
@@ -734,6 +734,7 @@ struct sched_dl_entity {
 	struct dl_rq *dl_rq;
 	struct rq *my_q;
 	dl_server_pick_f server_pick_task;
+	dl_server_try_pull_f server_try_pull_task;
 
 #ifdef CONFIG_RT_MUTEXES
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 082bccc30b..a588fe3bbf 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1295,6 +1295,7 @@ static const u64 dl_server_min_res = 1 * NSEC_PER_MSEC;
 static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_dl_entity *dl_se)
 {
 	struct rq *rq = rq_of_dl_se(dl_se);
+	bool is_active;
 	u64 fw;
 
 	scoped_guard (rq_lock, rq) {
@@ -1309,6 +1310,15 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
 		if (!dl_se->dl_runtime)
 			return HRTIMER_NORESTART;
 
+		rq_unpin_lock(rq, rf);
+		is_active = dl_se->server_try_pull_task(dl_se);
+		rq_repin_lock(rq, rf);
+		if (!is_active) {
+			replenish_dl_entity(dl_se);
+			dl_server_stop(dl_se);
+			return HRTIMER_NORESTART;
+		}
+
 		if (dl_se->dl_defer_armed) {
 			/*
 			 * First check if the server could consume runtime in background.
@@ -1712,10 +1722,12 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
 
 void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq,
 		    struct rq *served_rq,
+		    dl_server_try_pull_f try_pull_task,
 		    dl_server_pick_f pick_task)
 {
 	dl_se->dl_rq = dl_rq;
 	dl_se->my_q = served_rq;
+	dl_se->server_try_pull_task = try_pull_task;
 	dl_se->server_pick_task = pick_task;
 }
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9c724d8232..dad46f6bd4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8957,6 +8957,11 @@ static struct task_struct *__pick_next_task_fair(struct rq *rq, struct task_stru
 	return pick_next_task_fair(rq, prev, NULL);
 }
 
+static bool fair_server_try_pull_task(struct sched_dl_entity *dl_se)
+{
+	return true;
+}
+
 static struct task_struct *fair_server_pick_task(struct sched_dl_entity *dl_se)
 {
 	return pick_task_fair(dl_se->my_q);
@@ -8968,7 +8973,8 @@ void fair_server_init(struct rq *rq)
 
 	init_dl_entity(dl_se);
 
-	dl_server_init(dl_se, &rq->dl, rq, fair_server_pick_task);
+	dl_server_init(dl_se, &rq->dl, rq,
+		       fair_server_try_pull_task, fair_server_pick_task);
 }
 
 /*
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index e2b67f8309..80580b48ab 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -134,6 +134,16 @@ void free_rt_sched_group(struct task_group *tg)
 static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq);
 static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first);
 
+static bool rt_server_try_pull(struct sched_dl_entity *dl_se)
+{
+	struct rt_rq *rt_rq = &dl_se->my_q->rt;
+
+	if (dl_se->my_q->rt.rt_nr_running == 0)
+		group_pull_rt_task(rt_rq);
+
+	return dl_se->my_q->rt.rt_nr_running > 0;
+}
+
 static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se)
 {
 	struct rt_rq *rt_rq = &dl_se->my_q->rt;
@@ -235,7 +245,8 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
 		dl_se->dl_density = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
 		dl_se->dl_server = 1;
 
-		dl_server_init(dl_se, &cpu_rq(i)->dl, s_rq, rt_server_pick);
+		dl_server_init(dl_se, &cpu_rq(i)->dl, s_rq,
+			       rt_server_try_pull, rt_server_pick);
 	}
 
 	return 1;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c8eac719eb..c069f6fef0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -363,6 +363,11 @@ extern s64 dl_scaled_delta_exec(struct rq *rq, struct sched_dl_entity *dl_se, s6
  *
  *   dl_se::rq -- runqueue we belong to.
  *
+ *   dl_se::server_try_pull() -- used on bandwidth enforcement; the server has a
+ *                               chance to pull tasks from the other runqueues,
+ *                               otherwise it is stopped if there is no task to
+ *                               run.
+ *
  *   dl_se::server_pick() -- nested pick_next_task(); we yield the period if this
  *                           returns NULL.
  *
@@ -408,6 +413,7 @@ extern void dl_server_start(struct sched_dl_entity *dl_se);
 extern void dl_server_stop(struct sched_dl_entity *dl_se);
 extern void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq,
 		    struct rq *served_rq,
+		    dl_server_try_pull_f try_pull_task,
 		    dl_server_pick_f pick_task);
 extern void sched_init_dl_servers(void);
 extern int dl_check_tg(unsigned long total);
-- 
2.51.0