From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v4 12/28] sched/rt: Implement dl-server operations for rt-cgroups.
Date: Mon, 1 Dec 2025 13:41:45 +0100
Message-ID: <20251201124205.11169-13-yurand2000@gmail.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20251201124205.11169-1-yurand2000@gmail.com>
References: <20251201124205.11169-1-yurand2000@gmail.com>

- Implement rt_server_pick(), the callback that deadline servers use to
  pick a task to schedule: pick the next runnable rt task and tell the
  scheduler that it is going to run next.

- Let enqueue_task_rt()/dequeue_task_rt() start/stop the attached
  deadline server when the first/last task is enqueued/dequeued on a
  given rq/server.

- Change update_curr_rt() to perform a deadline server update if the
  updated task is served by a non-root group.

- Update inc/dec_dl_tasks() to account for the number of active tasks on
  the local runqueue of rt-cgroup servers: their local runqueue is
  distinct from the global runqueue, so when an rt-group server is
  activated/deactivated the number of tasks it serves must be added
  to/subtracted from the global runqueue. nr_running is used here to
  stay compatible with future dl-server interfaces.

- Update inc/dec_rt_prio_smp() to change an rq's cpupri only if the
  rt_rq is the global runqueue, since cgroups are scheduled through
  their dl-server priority.

- Update inc/dec_rt_tasks() to account for waking/sleeping tasks on the
  global runqueue when the task runs in the root cgroup or when its
  local dl-server is active. No accounting is done while a server is
  throttled, since the server adds/subtracts its number of running
  tasks when it gets enqueued/dequeued. For rt cgroups, also account
  for the number of active tasks in the nr_running field of the local
  runqueue (add/sub_nr_running), as this number is used when a
  dl-server is enqueued/dequeued.

- Update set_task_rq() to record the dl_rq, tracking which deadline
  server manages a task.

- Update set_task_rq() to stop using the parent field, as it is unused
  by this patchset's code, and remove the now-unused parent field from
  sched_rt_entity.
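The dl-server start/stop side of the above can be summarized by the
following sketch, simplified from the enqueue_task_rt()/dequeue_task_rt()
hunks below and reusing the is_dl_group()/dl_group_of() helpers
introduced earlier in this series (the real code additionally guards
these paths with IS_ENABLED(CONFIG_RT_GROUP_SCHED)):

	/* enqueue: first rt task entering an idle group starts its server */
	if (is_dl_group(rt_rq) && rt_rq->rt_nr_running == 0)
		dl_server_start(dl_group_of(rt_rq));
	enqueue_rt_entity(rt_se, flags);

	/* dequeue: last rt task leaving the group stops its server */
	dequeue_rt_entity(rt_se, flags);
	if (is_dl_group(rt_rq) && rt_rq->rt_nr_running == 0)
		dl_server_stop(dl_group_of(rt_rq));

While the server is running, rt_server_pick() feeds it the next runnable
rt task from the group's local runqueue.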
Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: luca abeni
Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 include/linux/sched.h   |  1 -
 kernel/sched/deadline.c |  8 +++++
 kernel/sched/rt.c       | 68 ++++++++++++++++++++++++++++++++++++++---
 kernel/sched/sched.h    |  8 ++++-
 4 files changed, 79 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 000aa3b2b1..3f1f15b6d2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -629,7 +629,6 @@ struct sched_rt_entity {
 
 	struct sched_rt_entity		*back;
 #ifdef CONFIG_RT_GROUP_SCHED
-	struct sched_rt_entity		*parent;
 	/* rq on which this entity is (to be) queued: */
 	struct rt_rq			*rt_rq;
 	/* rq "owned" by this entity/group: */
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 089fd2c9b7..b890fdd4b2 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1847,6 +1847,10 @@ void inc_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 
 	if (!dl_server(dl_se))
 		add_nr_running(rq_of_dl_rq(dl_rq), 1);
+	else if (rq_of_dl_se(dl_se) != dl_se->my_q) {
+		WARN_ON(dl_se->my_q->rt.rt_nr_running != dl_se->my_q->nr_running);
+		add_nr_running(rq_of_dl_rq(dl_rq), dl_se->my_q->nr_running);
+	}
 
 	inc_dl_deadline(dl_rq, deadline);
 }
@@ -1859,6 +1863,10 @@ void dec_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 
 	if (!dl_server(dl_se))
 		sub_nr_running(rq_of_dl_rq(dl_rq), 1);
+	else if (rq_of_dl_se(dl_se) != dl_se->my_q) {
+		WARN_ON(dl_se->my_q->rt.rt_nr_running != dl_se->my_q->nr_running);
+		sub_nr_running(rq_of_dl_rq(dl_rq), dl_se->my_q->nr_running);
+	}
 
 	dec_dl_deadline(dl_rq, dl_se->deadline);
 }
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2301efc03f..7ec117a18d 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -128,9 +128,22 @@ void free_rt_sched_group(struct task_group *tg)
 	kfree(tg->dl_se);
 }
 
+static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq);
+static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first);
+
 static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se)
 {
-	return NULL;
+	struct rt_rq *rt_rq = &dl_se->my_q->rt;
+	struct rq *rq = rq_of_rt_rq(rt_rq);
+	struct task_struct *p;
+
+	if (dl_se->my_q->rt.rt_nr_running == 0)
+		return NULL;
+
+	p = rt_task_of(pick_next_rt_entity(rt_rq));
+	set_next_task_rt(rq, p, true);
+
+	return p;
 }
 
 static inline void __rt_rq_free(struct rt_rq **rt_rq)
@@ -435,6 +448,7 @@ static inline int rt_se_prio(struct sched_rt_entity *rt_se)
 static void update_curr_rt(struct rq *rq)
 {
 	struct task_struct *donor = rq->donor;
+	struct rt_rq *rt_rq;
 	s64 delta_exec;
 
 	if (donor->sched_class != &rt_sched_class)
@@ -444,8 +458,18 @@ static void update_curr_rt(struct rq *rq)
 	if (unlikely(delta_exec <= 0))
 		return;
 
-	if (!rt_bandwidth_enabled())
+	if (!rt_group_sched_enabled())
 		return;
+
+	if (!dl_bandwidth_enabled())
+		return;
+
+	rt_rq = rt_rq_of_se(&donor->rt);
+	if (is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		dl_server_update(dl_se, delta_exec);
+	}
 }
 
 static void
@@ -456,7 +480,7 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 	/*
 	 * Change rq's cpupri only if rt_rq is the top queue.
 	 */
-	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt != rt_rq)
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq))
 		return;
 
 	if (rq->online && prio < prev_prio)
@@ -471,7 +495,7 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio)
 	/*
 	 * Change rq's cpupri only if rt_rq is the top queue.
 	 */
-	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt != rt_rq)
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq))
 		return;
 
 	if (rq->online && rt_rq->highest_prio.curr != prev_prio)
@@ -534,6 +558,16 @@ void inc_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
 	rt_rq->rr_nr_running += is_rr_task(rt_se);
 
 	inc_rt_prio(rt_rq, rt_se_prio(rt_se));
+
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		if (!dl_se->dl_throttled)
+			add_nr_running(rq_of_rt_rq(rt_rq), 1);
+		add_nr_running(served_rq_of_rt_rq(rt_rq), 1);
+	} else {
+		add_nr_running(rq_of_rt_rq(rt_rq), 1);
+	}
 }
 
 static inline
@@ -544,6 +578,16 @@ void dec_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
 	rt_rq->rr_nr_running -= is_rr_task(rt_se);
 
 	dec_rt_prio(rt_rq, rt_se_prio(rt_se));
+
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		if (!dl_se->dl_throttled)
+			sub_nr_running(rq_of_rt_rq(rt_rq), 1);
+		sub_nr_running(served_rq_of_rt_rq(rt_rq), 1);
+	} else {
+		sub_nr_running(rq_of_rt_rq(rt_rq), 1);
+	}
 }
 
 /*
@@ -725,6 +769,14 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
 	check_schedstat_required();
 	update_stats_wait_start_rt(rt_rq_of_se(rt_se), rt_se);
 
+	/* Task arriving in an idle group of tasks. */
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) &&
+	    is_dl_group(rt_rq) && rt_rq->rt_nr_running == 0) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		dl_server_start(dl_se);
+	}
+
 	enqueue_rt_entity(rt_se, flags);
 
 	if (task_is_blocked(p))
@@ -744,6 +796,14 @@ static bool dequeue_task_rt(struct rq *rq, struct task_struct *p, int flags)
 
 	dequeue_pushable_task(rt_rq, p);
 
+	/* Last task of the task group. */
+	if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) &&
+	    is_dl_group(rt_rq) && rt_rq->rt_nr_running == 0) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		dl_server_stop(dl_se);
+	}
+
 	return true;
 }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index f42bef06a9..fb4dcb4551 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2203,7 +2203,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
 	if (!rt_group_sched_enabled())
 		tg = &root_task_group;
 	p->rt.rt_rq  = tg->rt_rq[cpu];
-	p->rt.parent = tg->rt_se[cpu];
+	p->dl.dl_rq = &cpu_rq(cpu)->dl;
 #endif /* CONFIG_RT_GROUP_SCHED */
 }
 
@@ -2750,6 +2750,9 @@ static inline void add_nr_running(struct rq *rq, unsigned count)
 	unsigned prev_nr = rq->nr_running;
 	rq->nr_running = prev_nr + count;
 
+	if (rq != cpu_rq(rq->cpu))
+		return;
+
 	if (trace_sched_update_nr_running_tp_enabled()) {
 		call_trace_sched_update_nr_running(rq, count);
 	}
@@ -2763,6 +2766,9 @@ static inline void add_nr_running(struct rq *rq, unsigned count)
 static inline void sub_nr_running(struct rq *rq, unsigned count)
 {
 	rq->nr_running -= count;
+	if (rq != cpu_rq(rq->cpu))
+		return;
+
 	if (trace_sched_update_nr_running_tp_enabled()) {
 		call_trace_sched_update_nr_running(rq, -count);
 	}
-- 
2.51.0