From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 1/9] sched/deadline: Do not access dl_se->rq directly
Date: Thu, 5 Jun 2025 09:14:04 +0200
Message-ID: <20250605071412.139240-2-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

From: luca abeni

Make the deadline.c code access the runqueue of a scheduling entity through
the pointer saved in the sched_dl_entity data structure, instead of
dereferencing dl_se->rq directly. This allows future patches to store
runqueues other than the global ones in sched_dl_entity.

Signed-off-by: luca abeni
---
 kernel/sched/deadline.c | 25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index ad45a8fea..26cd0c559 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -935,7 +935,7 @@ static void replenish_dl_entity(struct sched_dl_entity *dl_se)
 	 * and arm the defer timer.
 	 */
 	if (dl_se->dl_defer && !dl_se->dl_defer_running &&
-	    dl_time_before(rq_clock(dl_se->rq), dl_se->deadline - dl_se->runtime)) {
+	    dl_time_before(rq_clock(rq), dl_se->deadline - dl_se->runtime)) {
 		if (!is_dl_boosted(dl_se) && dl_se->server_has_tasks(dl_se)) {
 
 			/*
@@ -1244,11 +1244,11 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_
 	 * of time. The dl_server_min_res serves as a limit to avoid
 	 * forwarding the timer for a too small amount of time.
*/ - if (dl_time_before(rq_clock(dl_se->rq), + if (dl_time_before(rq_clock(rq), (dl_se->deadline - dl_se->runtime - dl_server_min_res))) { =20 /* reset the defer timer */ - fw =3D dl_se->deadline - rq_clock(dl_se->rq) - dl_se->runtime; + fw =3D dl_se->deadline - rq_clock(rq) - dl_se->runtime; =20 hrtimer_forward_now(timer, ns_to_ktime(fw)); return HRTIMER_RESTART; @@ -1259,7 +1259,7 @@ static enum hrtimer_restart dl_server_timer(struct hr= timer *timer, struct sched_ =20 enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH); =20 - if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &dl_se->rq->cu= rr->dl)) + if (!dl_task(rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) resched_curr(rq); =20 __push_dl_task(rq, rf); @@ -1527,7 +1527,7 @@ static void update_curr_dl_se(struct rq *rq, struct s= ched_dl_entity *dl_se, s64 =20 hrtimer_try_to_cancel(&dl_se->dl_timer); =20 - replenish_dl_new_period(dl_se, dl_se->rq); + replenish_dl_new_period(dl_se, rq); =20 /* * Not being able to start the timer seems problematic. If it could not @@ -1640,12 +1640,12 @@ void dl_server_update(struct sched_dl_entity *dl_se= , s64 delta_exec) { /* 0 runtime =3D fair server disabled */ if (dl_se->dl_runtime) - update_curr_dl_se(dl_se->rq, dl_se, delta_exec); + update_curr_dl_se(rq_of_dl_se(dl_se), dl_se, delta_exec); } =20 void dl_server_start(struct sched_dl_entity *dl_se) { - struct rq *rq =3D dl_se->rq; + struct rq *rq; =20 /* * XXX: the apply do not work fine at the init phase for the @@ -1656,9 +1656,9 @@ void dl_server_start(struct sched_dl_entity *dl_se) u64 runtime =3D 50 * NSEC_PER_MSEC; u64 period =3D 1000 * NSEC_PER_MSEC; =20 + dl_se->dl_server =3D 1; dl_server_apply_params(dl_se, runtime, period, 1); =20 - dl_se->dl_server =3D 1; dl_se->dl_defer =3D 1; setup_new_dl_entity(dl_se); } @@ -1668,8 +1668,9 @@ void dl_server_start(struct sched_dl_entity *dl_se) =20 dl_se->dl_server_active =3D 1; enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP); - if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) - resched_curr(dl_se->rq); + rq =3D rq_of_dl_se(dl_se); + if (!dl_task(rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) + resched_curr(rq); } =20 void dl_server_stop(struct sched_dl_entity *dl_se) @@ -1712,7 +1713,7 @@ int dl_server_apply_params(struct sched_dl_entity *dl= _se, u64 runtime, u64 perio { u64 old_bw =3D init ? 
0 : to_ratio(dl_se->dl_period, dl_se->dl_runtime);
 	u64 new_bw = to_ratio(period, runtime);
-	struct rq *rq = dl_se->rq;
+	struct rq *rq = rq_of_dl_se(dl_se);
 	int cpu = cpu_of(rq);
 	struct dl_bw *dl_b;
 	unsigned long cap;
@@ -1789,7 +1790,7 @@ static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
 		p = dl_task_of(dl_se);
 		rq = task_rq_lock(p, &rf);
 	} else {
-		rq = dl_se->rq;
+		rq = rq_of_dl_se(dl_se);
 		rq_lock(rq, &rf);
 	}
 
-- 
2.49.0
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 2/9] sched/deadline: Make a distinction between dl_rq and my_q
Date: Thu, 5 Jun 2025 09:14:05 +0200
Message-ID: <20250605071412.139240-3-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

From: luca abeni

Create two fields for runqueues in sched_dl_entity to make a distinction
between the global runqueue and the runqueue which the dl_server serves.

Signed-off-by: luca abeni
---
 include/linux/sched.h   |  6 ++++--
 kernel/sched/deadline.c | 11 +++++++----
 kernel/sched/fair.c     |  6 +++---
 kernel/sched/sched.h    |  3 ++-
 4 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4f78a64be..6dd86d13c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -725,12 +725,14 @@ struct sched_dl_entity {
 	 * Bits for DL-server functionality. Also see the comment near
 	 * dl_server_update().
 	 *
-	 * @rq the runqueue this server is for
+	 * @dl_rq the runqueue on which this entity is (to be) queued
+	 * @my_q the runqueue "owned" by this entity
 	 *
 	 * @server_has_tasks() returns true if @server_pick return a
 	 * runnable task.
*/ - struct rq *rq; + struct dl_rq *dl_rq; + struct rq *my_q; dl_server_has_tasks_f server_has_tasks; dl_server_pick_f server_pick_task; =20 diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 26cd0c559..7736a625f 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -71,11 +71,12 @@ static inline struct rq *rq_of_dl_rq(struct dl_rq *dl_r= q) =20 static inline struct rq *rq_of_dl_se(struct sched_dl_entity *dl_se) { - struct rq *rq =3D dl_se->rq; + struct rq *rq; =20 if (!dl_server(dl_se)) rq =3D task_rq(dl_task_of(dl_se)); - + else + rq =3D container_of(dl_se->dl_rq, struct rq, dl); return rq; } =20 @@ -1685,11 +1686,13 @@ void dl_server_stop(struct sched_dl_entity *dl_se) dl_se->dl_server_active =3D 0; } =20 -void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq, +void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq, + struct rq *served_rq, dl_server_has_tasks_f has_tasks, dl_server_pick_f pick_task) { - dl_se->rq =3D rq; + dl_se->dl_rq =3D dl_rq; + dl_se->my_q =3D served_rq; dl_se->server_has_tasks =3D has_tasks; dl_se->server_pick_task =3D pick_task; } diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 7a14da539..f489e0419 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8976,12 +8976,12 @@ static struct task_struct *__pick_next_task_fair(st= ruct rq *rq, struct task_stru =20 static bool fair_server_has_tasks(struct sched_dl_entity *dl_se) { - return !!dl_se->rq->cfs.nr_queued; + return !!dl_se->my_q->cfs.nr_queued; } =20 static struct task_struct *fair_server_pick_task(struct sched_dl_entity *d= l_se) { - return pick_task_fair(dl_se->rq); + return pick_task_fair(dl_se->my_q); } =20 void fair_server_init(struct rq *rq) @@ -8990,7 +8990,7 @@ void fair_server_init(struct rq *rq) =20 init_dl_entity(dl_se); =20 - dl_server_init(dl_se, rq, fair_server_has_tasks, fair_server_pick_task); + dl_server_init(dl_se, &rq->dl, rq, fair_server_has_tasks, fair_server_pic= k_task); } =20 /* diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 475bb5998..755ff5734 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -381,7 +381,8 @@ extern s64 dl_scaled_delta_exec(struct rq *rq, struct s= ched_dl_entity *dl_se, s6 extern void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec= ); extern void dl_server_start(struct sched_dl_entity *dl_se); extern void dl_server_stop(struct sched_dl_entity *dl_se); -extern void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq, +extern void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl= _rq, + struct rq *served_rq, dl_server_has_tasks_f has_tasks, dl_server_pick_f pick_task); =20 --=20 2.49.0 From nobody Fri Dec 19 19:07:39 2025 Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B3241FDA82 for ; Thu, 5 Jun 2025 07:14:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749107667; cv=none; b=gaGfUyWohGqLMJktcKBnjqnb/D6VW5pTZfUtKaGMJXJaDXLv6kKmH4S1LhNXN1kBIFKjpF4e0KIIUSJ6Tfpk+znq4cOOPs1mK7Gg1XqomdzAS+VdalRKEQCRHifNxRDe2w1sI+IBRBJN0ay8rQOHMSLCyDNrgx5tgkGbydhswsI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749107667; c=relaxed/simple; bh=UoC0y3XzkHe79zWkNZv4k4ntOhTVcahJJcfVmOpGZqI=; 
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni
, Yuri Andriaccio Subject: [RFC PATCH 3/9] sched/rt: Pass an rt_rq instead of an rq where needed Date: Thu, 5 Jun 2025 09:14:06 +0200 Message-ID: <20250605071412.139240-4-yurand2000@gmail.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com> References: <20250605071412.139240-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: luca abeni Make rt.c code access the runqueue through the rt_rq data structure rather = than passing an rq pointer directly. This allows future patches to define rt_rq = data structures which do not refer only to the global runqueue, but also to local cgroup runqueues (rt_rq is not always equal to &rq->rt). Signed-off-by: luca abeni --- kernel/sched/rt.c | 87 +++++++++++++++++++++++++---------------------- 1 file changed, 46 insertions(+), 41 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index e40422c37..046a89fc7 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -371,9 +371,9 @@ static inline void rt_clear_overload(struct rq *rq) cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask); } =20 -static inline int has_pushable_tasks(struct rq *rq) +static inline int has_pushable_tasks(struct rt_rq *rt_rq) { - return !plist_head_empty(&rq->rt.pushable_tasks); + return !plist_head_empty(&rt_rq->pushable_tasks); } =20 static DEFINE_PER_CPU(struct balance_callback, rt_push_head); @@ -384,7 +384,7 @@ static void pull_rt_task(struct rq *); =20 static inline void rt_queue_push_tasks(struct rq *rq) { - if (!has_pushable_tasks(rq)) + if (!has_pushable_tasks(&rq->rt)) return; =20 queue_balance_callback(rq, &per_cpu(rt_push_head, rq->cpu), push_rt_tasks= ); @@ -395,48 +395,48 @@ static inline void rt_queue_pull_task(struct rq *rq) queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task); } =20 -static void enqueue_pushable_task(struct rq *rq, struct task_struct *p) +static void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_struct = *p) { - plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks); + plist_del(&p->pushable_tasks, &rt_rq->pushable_tasks); plist_node_init(&p->pushable_tasks, p->prio); - plist_add(&p->pushable_tasks, &rq->rt.pushable_tasks); + plist_add(&p->pushable_tasks, &rt_rq->pushable_tasks); =20 /* Update the highest prio pushable task */ - if (p->prio < rq->rt.highest_prio.next) - rq->rt.highest_prio.next =3D p->prio; + if (p->prio < rt_rq->highest_prio.next) + rt_rq->highest_prio.next =3D p->prio; =20 - if (!rq->rt.overloaded) { - rt_set_overload(rq); - rq->rt.overloaded =3D 1; + if (!rt_rq->overloaded) { + rt_set_overload(rq_of_rt_rq(rt_rq)); + rt_rq->overloaded =3D 1; } } =20 -static void dequeue_pushable_task(struct rq *rq, struct task_struct *p) +static void dequeue_pushable_task(struct rt_rq *rt_rq, struct task_struct = *p) { - plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks); + plist_del(&p->pushable_tasks, &rt_rq->pushable_tasks); =20 /* Update the new highest prio pushable task */ - if (has_pushable_tasks(rq)) { - p =3D plist_first_entry(&rq->rt.pushable_tasks, + if (has_pushable_tasks(rt_rq)) { + p =3D plist_first_entry(&rt_rq->pushable_tasks, struct task_struct, pushable_tasks); - rq->rt.highest_prio.next =3D p->prio; + rt_rq->highest_prio.next =3D p->prio; } else { - rq->rt.highest_prio.next =3D MAX_RT_PRIO-1; + rt_rq->highest_prio.next =3D MAX_RT_PRIO-1; =20 - if 
(rq->rt.overloaded) { - rt_clear_overload(rq); - rq->rt.overloaded =3D 0; + if (rt_rq->overloaded) { + rt_clear_overload(rq_of_rt_rq(rt_rq)); + rt_rq->overloaded =3D 0; } } } =20 #else =20 -static inline void enqueue_pushable_task(struct rq *rq, struct task_struct= *p) +static inline void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_= struct *p) { } =20 -static inline void dequeue_pushable_task(struct rq *rq, struct task_struct= *p) +static inline void dequeue_pushable_task(struct rt_rq *rt_rq, struct task_= struct *p) { } =20 @@ -1479,6 +1479,7 @@ static void enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags) { struct sched_rt_entity *rt_se =3D &p->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); =20 if (flags & ENQUEUE_WAKEUP) rt_se->timeout =3D 0; @@ -1489,17 +1490,18 @@ enqueue_task_rt(struct rq *rq, struct task_struct *= p, int flags) enqueue_rt_entity(rt_se, flags); =20 if (!task_current(rq, p) && p->nr_cpus_allowed > 1) - enqueue_pushable_task(rq, p); + enqueue_pushable_task(rt_rq, p); } =20 static bool dequeue_task_rt(struct rq *rq, struct task_struct *p, int flag= s) { struct sched_rt_entity *rt_se =3D &p->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); =20 update_curr_rt(rq); dequeue_rt_entity(rt_se, flags); =20 - dequeue_pushable_task(rq, p); + dequeue_pushable_task(rt_rq, p); =20 return true; } @@ -1688,14 +1690,14 @@ static void wakeup_preempt_rt(struct rq *rq, struct= task_struct *p, int flags) static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, = bool first) { struct sched_rt_entity *rt_se =3D &p->rt; - struct rt_rq *rt_rq =3D &rq->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); =20 p->se.exec_start =3D rq_clock_task(rq); if (on_rt_rq(&p->rt)) update_stats_wait_end_rt(rt_rq, rt_se); =20 /* The running task is never eligible for pushing */ - dequeue_pushable_task(rq, p); + dequeue_pushable_task(rt_rq, p); =20 if (!first) return; @@ -1759,7 +1761,7 @@ static struct task_struct *pick_task_rt(struct rq *rq) static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct = task_struct *next) { struct sched_rt_entity *rt_se =3D &p->rt; - struct rt_rq *rt_rq =3D &rq->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); =20 if (on_rt_rq(&p->rt)) update_stats_wait_start_rt(rt_rq, rt_se); @@ -1773,7 +1775,7 @@ static void put_prev_task_rt(struct rq *rq, struct ta= sk_struct *p, struct task_s * if it is still active */ if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1) - enqueue_pushable_task(rq, p); + enqueue_pushable_task(rt_rq, p); } =20 #ifdef CONFIG_SMP @@ -1785,16 +1787,16 @@ static void put_prev_task_rt(struct rq *rq, struct = task_struct *p, struct task_s * Return the highest pushable rq's task, which is suitable to be executed * on the CPU, NULL otherwise */ -static struct task_struct *pick_highest_pushable_task(struct rq *rq, int c= pu) +static struct task_struct *pick_highest_pushable_task(struct rt_rq *rt_rq,= int cpu) { - struct plist_head *head =3D &rq->rt.pushable_tasks; + struct plist_head *head =3D &rt_rq->pushable_tasks; struct task_struct *p; =20 - if (!has_pushable_tasks(rq)) + if (!has_pushable_tasks(rt_rq)) return NULL; =20 plist_for_each_entry(p, head, pushable_tasks) { - if (task_is_pushable(rq, p, cpu)) + if (task_is_pushable(rq_of_rt_rq(rt_rq), p, cpu)) return p; } =20 @@ -1894,14 +1896,15 @@ static int find_lowest_rq(struct task_struct *task) return -1; } =20 -static struct task_struct *pick_next_pushable_task(struct rq *rq) +static struct task_struct *pick_next_pushable_task(struct rt_rq *rt_rq) { + 
struct rq *rq =3D rq_of_rt_rq(rt_rq); struct task_struct *p; =20 - if (!has_pushable_tasks(rq)) + if (!has_pushable_tasks(rt_rq)) return NULL; =20 - p =3D plist_first_entry(&rq->rt.pushable_tasks, + p =3D plist_first_entry(&rt_rq->pushable_tasks, struct task_struct, pushable_tasks); =20 BUG_ON(rq->cpu !=3D task_cpu(p)); @@ -1954,7 +1957,7 @@ static struct rq *find_lock_lowest_rq(struct task_str= uct *task, struct rq *rq) */ if (unlikely(is_migration_disabled(task) || !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) || - task !=3D pick_next_pushable_task(rq))) { + task !=3D pick_next_pushable_task(&rq->rt))) { =20 double_unlock_balance(rq, lowest_rq); lowest_rq =3D NULL; @@ -1988,7 +1991,7 @@ static int push_rt_task(struct rq *rq, bool pull) if (!rq->rt.overloaded) return 0; =20 - next_task =3D pick_next_pushable_task(rq); + next_task =3D pick_next_pushable_task(&rq->rt); if (!next_task) return 0; =20 @@ -2063,7 +2066,7 @@ static int push_rt_task(struct rq *rq, bool pull) * run-queue and is also still the next task eligible for * pushing. */ - task =3D pick_next_pushable_task(rq); + task =3D pick_next_pushable_task(&rq->rt); if (task =3D=3D next_task) { /* * The task hasn't migrated, and is still the next @@ -2251,7 +2254,7 @@ void rto_push_irq_work_func(struct irq_work *work) * We do not need to grab the lock to check for has_pushable_tasks. * When it gets updated, a check is made if a push is possible. */ - if (has_pushable_tasks(rq)) { + if (has_pushable_tasks(&rq->rt)) { raw_spin_rq_lock(rq); while (push_rt_task(rq, true)) ; @@ -2280,6 +2283,7 @@ static void pull_rt_task(struct rq *this_rq) int this_cpu =3D this_rq->cpu, cpu; bool resched =3D false; struct task_struct *p, *push_task; + struct rt_rq *src_rt_rq; struct rq *src_rq; int rt_overload_count =3D rt_overloaded(this_rq); =20 @@ -2309,6 +2313,7 @@ static void pull_rt_task(struct rq *this_rq) continue; =20 src_rq =3D cpu_rq(cpu); + src_rt_rq =3D &src_rq->rt; =20 /* * Don't bother taking the src_rq->lock if the next highest @@ -2317,7 +2322,7 @@ static void pull_rt_task(struct rq *this_rq) * logically higher, the src_rq will push this task away. * And if its going logically lower, we do not care */ - if (src_rq->rt.highest_prio.next >=3D + if (src_rt_rq->highest_prio.next >=3D this_rq->rt.highest_prio.curr) continue; =20 @@ -2333,7 +2338,7 @@ static void pull_rt_task(struct rq *this_rq) * We can pull only a task, which is pushable * on its rq, and no others. 
 		 */
-		p = pick_highest_pushable_task(src_rq, this_cpu);
+		p = pick_highest_pushable_task(src_rt_rq, this_cpu);
 
 		/*
 		 * Do we have an RT task that preempts
-- 
2.49.0
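To make the interface change of patch 3/9 concrete: helpers such as
has_pushable_tasks() and pick_highest_pushable_task() now take a struct
rt_rq * instead of a struct rq *, so the same code can later be reused for
per-cgroup RT runqueues that are not embedded in a CPU runqueue. The
stand-alone sketch below is only an illustrative user-space model (the
structures merely mimic the kernel layout, and nr_pushable stands in for
the pushable_tasks plist); it is not kernel code.

/*
 * User-space model: pass the RT sub-runqueue, not the CPU runqueue.
 * Both the embedded global rt_rq and a separately allocated per-group
 * rt_rq can then go through the same helpers.
 */
#include <stdio.h>
#include <stdbool.h>

struct rq;

struct rt_rq {
	int nr_pushable;	/* stand-in for the pushable_tasks plist */
	struct rq *rq;		/* back-pointer to the owning CPU runqueue */
};

struct rq {
	int cpu;
	struct rt_rq rt;	/* global RT runqueue embedded in the CPU rq */
};

static bool has_pushable_tasks(struct rt_rq *rt_rq)
{
	return rt_rq->nr_pushable > 0;	/* kernel: !plist_head_empty(...) */
}

static struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
{
	return rt_rq->rq;
}

int main(void)
{
	struct rq cpu0 = { .cpu = 0 };
	struct rt_rq group_rt = { .nr_pushable = 2, .rq = &cpu0 };

	cpu0.rt.rq = &cpu0;	/* the global rt_rq points back at its rq */

	/* Both queues are handled by the same helper. */
	printf("global pushable: %d, group pushable: %d (cpu %d)\n",
	       has_pushable_tasks(&cpu0.rt),
	       has_pushable_tasks(&group_rt),
	       rq_of_rt_rq(&group_rt)->cpu);
	return 0;
}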
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 4/9] sched/rt: Move some inline functions from rt.c to sched.h
Date: Thu, 5 Jun 2025 09:14:07 +0200
Message-ID: <20250605071412.139240-5-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

From: luca abeni

Move the following inline functions from rt.c to sched.h, so that they can
also be used in other source files:

- rt_task_of()
- rq_of_rt_rq()
- rt_rq_of_se()
- rq_of_rt_se()

There are no functional changes. This is needed by future patches.
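In the non-cgroup case (the #else branch added below), rq_of_rt_rq()
recovers the enclosing CPU runqueue from the embedded rt_rq member with
container_of(). The following stand-alone user-space sketch models that
pointer arithmetic with a simplified container_of() macro; it is
illustrative only, not the kernel implementation.

/* User-space model of container_of() as used by rq_of_rt_rq(). */
#include <stdio.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct rt_rq { int rt_nr_running; };

struct rq {
	int cpu;
	struct rt_rq rt;	/* embedded member, as in struct rq */
};

static struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
{
	return container_of(rt_rq, struct rq, rt);
}

int main(void)
{
	struct rq r = { .cpu = 3, .rt = { .rt_nr_running = 1 } };

	/* Subtracting offsetof(struct rq, rt) from &r.rt yields &r again. */
	printf("cpu = %d\n", rq_of_rt_rq(&r.rt)->cpu);
	return 0;
}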
Signed-off-by: luca abeni --- kernel/sched/rt.c | 52 -------------------------------------------- kernel/sched/sched.h | 51 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 51 insertions(+), 52 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 046a89fc7..382126274 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -167,34 +167,6 @@ static void destroy_rt_bandwidth(struct rt_bandwidth *= rt_b) =20 #define rt_entity_is_task(rt_se) (!(rt_se)->my_q) =20 -static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se) -{ - WARN_ON_ONCE(!rt_entity_is_task(rt_se)); - - return container_of(rt_se, struct task_struct, rt); -} - -static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq) -{ - /* Cannot fold with non-CONFIG_RT_GROUP_SCHED version, layout */ - WARN_ON(!rt_group_sched_enabled() && rt_rq->tg !=3D &root_task_group); - return rt_rq->rq; -} - -static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se) -{ - WARN_ON(!rt_group_sched_enabled() && rt_se->rt_rq->tg !=3D &root_task_gro= up); - return rt_se->rt_rq; -} - -static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se) -{ - struct rt_rq *rt_rq =3D rt_se->rt_rq; - - WARN_ON(!rt_group_sched_enabled() && rt_rq->tg !=3D &root_task_group); - return rt_rq->rq; -} - void unregister_rt_sched_group(struct task_group *tg) { if (!rt_group_sched_enabled()) @@ -295,30 +267,6 @@ int alloc_rt_sched_group(struct task_group *tg, struct= task_group *parent) =20 #define rt_entity_is_task(rt_se) (1) =20 -static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se) -{ - return container_of(rt_se, struct task_struct, rt); -} - -static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq) -{ - return container_of(rt_rq, struct rq, rt); -} - -static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se) -{ - struct task_struct *p =3D rt_task_of(rt_se); - - return task_rq(p); -} - -static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se) -{ - struct rq *rq =3D rq_of_rt_se(rt_se); - - return &rq->rt; -} - void unregister_rt_sched_group(struct task_group *tg) { } =20 void free_rt_sched_group(struct task_group *tg) { } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 755ff5734..439a95239 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -3128,6 +3128,57 @@ static inline void double_rq_unlock(struct rq *rq1, = struct rq *rq2) =20 #endif /* !CONFIG_SMP */ =20 +#ifdef CONFIG_RT_GROUP_SCHED +static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se) +{ +#ifdef CONFIG_SCHED_DEBUG + WARN_ON_ONCE(rt_se->my_q); +#endif + return container_of(rt_se, struct task_struct, rt); +} + +static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq) +{ + return rt_rq->rq; +} + +static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se) +{ + return rt_se->rt_rq; +} + +static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se) +{ + struct rt_rq *rt_rq =3D rt_se->rt_rq; + + return rt_rq->rq; +} +#else +static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se) +{ + return container_of(rt_se, struct task_struct, rt); +} + +static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq) +{ + return container_of(rt_rq, struct rq, rt); +} + +static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se) +{ + struct task_struct *p =3D rt_task_of(rt_se); + + return task_rq(p); +} + +static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se) +{ + struct rq *rq =3D rq_of_rt_se(rt_se); + + return &rq->rt; 
+}
+#endif
+
 DEFINE_LOCK_GUARD_2(double_rq_lock, struct rq,
 		    double_rq_lock(_T->lock, _T->lock2),
 		    double_rq_unlock(_T->lock, _T->lock2))
-- 
2.49.0
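The next patch in the series builds RT-cgroup servers on top of the
deadline-server bandwidth accounting, in which a server's CPU share is the
ratio runtime/period kept in fixed point (BW_SHIFT, 20 bits, in
kernel/sched/sched.h). The sketch below is a rough user-space model of that
arithmetic, not kernel code; the 100 ms / 500 ms group parameters are
purely hypothetical, and the admission check is a simplified single-CPU
version of what __dl_overflow() does with capacity scaling and root-domain
totals.

/* User-space model of the runtime/period bandwidth arithmetic. */
#include <stdio.h>
#include <stdint.h>

#define BW_SHIFT	20
#define BW_UNIT		(1ULL << BW_SHIFT)
#define NSEC_PER_MSEC	1000000ULL

/* Mirrors to_ratio(period, runtime): utilization in 1/BW_UNIT units. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	return (runtime << BW_SHIFT) / period;
}

int main(void)
{
	/* Fair-server defaults used earlier in the series: 50 ms / 1000 ms. */
	uint64_t fair_bw = to_ratio(1000 * NSEC_PER_MSEC, 50 * NSEC_PER_MSEC);
	/* A hypothetical RT cgroup server: 100 ms every 500 ms. */
	uint64_t grp_bw = to_ratio(500 * NSEC_PER_MSEC, 100 * NSEC_PER_MSEC);

	printf("fair server:  %.1f%% of a CPU\n", 100.0 * fair_bw / BW_UNIT);
	printf("group server: %.1f%% of a CPU\n", 100.0 * grp_bw / BW_UNIT);

	/* Simplified admission test: total utilization must not exceed 100%. */
	printf("admit both on one CPU? %s\n",
	       fair_bw + grp_bw <= BW_UNIT ? "yes" : "no");
	return 0;
}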
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 5/9] sched/deadline: Hierarchical scheduling with DL on top of RT
Date: Thu, 5 Jun 2025 09:14:08 +0200
Message-ID: <20250605071412.139240-6-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

From: luca abeni

Implement the hierarchical scheduling mechanism:

- Enforce the runtime of RT tasks controlled by cgroups through the
  dl_server mechanism, based on the runtime and period parameters (the
  deadline is set equal to the period).
- Make sched_dl_entity store the cgroup's local RT runqueue, and provide
  an rt_rq for this runqueue.
- Allow zeroing the runtime of an RT cgroup.

Update the dl_server code:

- Check the return value of dl_server_apply_params when initializing the
  fair server.
- In dl_server_start, initialize the dl_server only if it is a fair
  server; do not initialize other types of servers, regardless of their
  period value.
- Make inc_dl_tasks and dec_dl_tasks adjust the number of running tasks
  by one when a fair server starts or stops, while ignoring rt-cgroup
  dl_servers in that accounting.

Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/autogroup.c |   4 +-
 kernel/sched/core.c      |  15 +-
 kernel/sched/deadline.c  | 145 ++++++++++++--
 kernel/sched/rt.c        | 415 ++++++++++++++++++++++++++------------
 kernel/sched/sched.h     |  59 +++++-
 kernel/sched/syscalls.c  |   4 +-
 6 files changed, 471 insertions(+), 171 deletions(-)

diff --git a/kernel/sched/autogroup.c b/kernel/sched/autogroup.c
index 2b331822c..a647c9265 100644
--- a/kernel/sched/autogroup.c
+++ b/kernel/sched/autogroup.c
@@ -49,7 +49,7 @@ static inline void autogroup_destroy(struct kref *kref)
 
 #ifdef CONFIG_RT_GROUP_SCHED
 	/* We've redirected RT tasks to the root task group... */
-	ag->tg->rt_se = NULL;
+	ag->tg->dl_se = NULL;
 	ag->tg->rt_rq = NULL;
 #endif
 	sched_release_group(ag->tg);
@@ -106,7 +106,7 @@ static inline struct autogroup *autogroup_create(void)
 	 * the policy change to proceed.
*/ free_rt_sched_group(tg); - tg->rt_se =3D root_task_group.rt_se; + tg->dl_se =3D root_task_group.dl_se; tg->rt_rq =3D root_task_group.rt_rq; #endif tg->autogroup =3D ag; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index dce50fa57..c07fddbf2 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2196,6 +2196,9 @@ void wakeup_preempt(struct rq *rq, struct task_struct= *p, int flags) { struct task_struct *donor =3D rq->donor; =20 + if (is_dl_group(rt_rq_of_se(&p->rt)) && task_has_rt_policy(p)) + resched_curr(rq); + if (p->sched_class =3D=3D donor->sched_class) donor->sched_class->wakeup_preempt(rq, p, flags); else if (sched_class_above(p->sched_class, donor->sched_class)) @@ -8548,7 +8551,7 @@ void __init sched_init(void) root_task_group.scx_weight =3D CGROUP_WEIGHT_DFL; #endif /* CONFIG_EXT_GROUP_SCHED */ #ifdef CONFIG_RT_GROUP_SCHED - root_task_group.rt_se =3D (struct sched_rt_entity **)ptr; + root_task_group.dl_se =3D (struct sched_dl_entity **)ptr; ptr +=3D nr_cpu_ids * sizeof(void **); =20 root_task_group.rt_rq =3D (struct rt_rq **)ptr; @@ -8562,7 +8565,7 @@ void __init sched_init(void) #endif =20 #ifdef CONFIG_RT_GROUP_SCHED - init_rt_bandwidth(&root_task_group.rt_bandwidth, + init_dl_bandwidth(&root_task_group.dl_bandwidth, global_rt_period(), global_rt_runtime()); #endif /* CONFIG_RT_GROUP_SCHED */ =20 @@ -8618,7 +8621,7 @@ void __init sched_init(void) * yet. */ rq->rt.rt_runtime =3D global_rt_runtime(); - init_tg_rt_entry(&root_task_group, &rq->rt, NULL, i, NULL); + init_tg_rt_entry(&root_task_group, rq, NULL, i, NULL); #endif #ifdef CONFIG_SMP rq->sd =3D NULL; @@ -9125,6 +9128,12 @@ cpu_cgroup_css_alloc(struct cgroup_subsys_state *par= ent_css) return &root_task_group.css; } =20 + /* Do not allow cpu_cgroup hierachies with depth greater than 2. */ +#ifdef CONFIG_RT_GROUP_SCHED + if (parent !=3D &root_task_group) + return ERR_PTR(-EINVAL); +#endif + tg =3D sched_create_group(parent); if (IS_ERR(tg)) return ERR_PTR(-ENOMEM); diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 7736a625f..6589077c0 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -239,8 +239,15 @@ void __dl_add(struct dl_bw *dl_b, u64 tsk_bw, int cpus) static inline bool __dl_overflow(struct dl_bw *dl_b, unsigned long cap, u64 old_bw, u64 new_b= w) { + u64 dl_groups_root =3D 0; + +#ifdef CONFIG_RT_GROUP_SCHED + dl_groups_root =3D to_ratio(root_task_group.dl_bandwidth.dl_period, + root_task_group.dl_bandwidth.dl_runtime); +#endif return dl_b->bw !=3D -1 && - cap_scale(dl_b->bw, cap) < dl_b->total_bw - old_bw + new_bw; + cap_scale(dl_b->bw, cap) < dl_b->total_bw - old_bw + new_bw + + cap_scale(dl_groups_root, cap); } =20 static inline @@ -366,6 +373,93 @@ void cancel_inactive_timer(struct sched_dl_entity *dl_= se) cancel_dl_timer(dl_se, &dl_se->inactive_timer); } =20 +/* + * Used for dl_bw check and update, used under sched_rt_handler()::mutex a= nd + * sched_domains_mutex. 
+ */ +u64 dl_cookie; + +#ifdef CONFIG_RT_GROUP_SCHED +int dl_check_tg(unsigned long total) +{ + unsigned long flags; + int which_cpu; + int cpus; + struct dl_bw *dl_b; + u64 gen =3D ++dl_cookie; + + for_each_possible_cpu(which_cpu) { + rcu_read_lock_sched(); + + if (!dl_bw_visited(which_cpu, gen)) { + cpus =3D dl_bw_cpus(which_cpu); + dl_b =3D dl_bw_of(which_cpu); + + raw_spin_lock_irqsave(&dl_b->lock, flags); + + if (dl_b->bw !=3D -1 && + dl_b->bw * cpus < dl_b->total_bw + total * cpus) { + raw_spin_unlock_irqrestore(&dl_b->lock, flags); + rcu_read_unlock_sched(); + + return 0; + } + + raw_spin_unlock_irqrestore(&dl_b->lock, flags); + } + + rcu_read_unlock_sched(); + } + + return 1; +} + +int dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_perio= d) +{ + struct rq *rq =3D container_of(dl_se->dl_rq, struct rq, dl); + int is_active; + u64 old_runtime; + + /* + * Since we truncate DL_SCALE bits, make sure we're at least + * that big. + */ + if (rt_runtime !=3D 0 && rt_runtime < (1ULL << DL_SCALE)) + return 0; + + /* + * Since we use the MSB for wrap-around and sign issues, make + * sure it's not set (mind that period can be equal to zero). + */ + if (rt_period & (1ULL << 63)) + return 0; + + raw_spin_rq_lock_irq(rq); + is_active =3D dl_se->my_q->rt.rt_nr_running > 0; + old_runtime =3D dl_se->dl_runtime; + dl_se->dl_runtime =3D rt_runtime; + dl_se->dl_period =3D rt_period; + dl_se->dl_deadline =3D dl_se->dl_period; + if (is_active) { + sub_running_bw(dl_se, dl_se->dl_rq); + } else if (dl_se->dl_non_contending) { + sub_running_bw(dl_se, dl_se->dl_rq); + dl_se->dl_non_contending =3D 0; + hrtimer_try_to_cancel(&dl_se->inactive_timer); + } + __sub_rq_bw(dl_se->dl_bw, dl_se->dl_rq); + dl_se->dl_bw =3D to_ratio(dl_se->dl_period, dl_se->dl_runtime); + __add_rq_bw(dl_se->dl_bw, dl_se->dl_rq); + + if (is_active) + add_running_bw(dl_se, dl_se->dl_rq); + + raw_spin_rq_unlock_irq(rq); + + return 1; +} +#endif + static void dl_change_utilization(struct task_struct *p, u64 new_bw) { WARN_ON_ONCE(p->dl.flags & SCHED_FLAG_SUGOV); @@ -539,6 +633,14 @@ static inline int is_leftmost(struct sched_dl_entity *= dl_se, struct dl_rq *dl_rq =20 static void init_dl_rq_bw_ratio(struct dl_rq *dl_rq); =20 +void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime) +{ + raw_spin_lock_init(&dl_b->dl_runtime_lock); + dl_b->dl_period =3D period; + dl_b->dl_runtime =3D runtime; +} + + void init_dl_bw(struct dl_bw *dl_b) { raw_spin_lock_init(&dl_b->lock); @@ -1493,6 +1595,9 @@ static void update_curr_dl_se(struct rq *rq, struct s= ched_dl_entity *dl_se, s64 { s64 scaled_delta_exec; =20 + if (dl_server(dl_se) && !on_dl_rq(dl_se)) + return; + if (unlikely(delta_exec <=3D 0)) { if (unlikely(dl_se->dl_yielded)) goto throttle; @@ -1654,13 +1759,15 @@ void dl_server_start(struct sched_dl_entity *dl_se) * this before getting generic. 
*/ if (!dl_server(dl_se)) { - u64 runtime =3D 50 * NSEC_PER_MSEC; - u64 period =3D 1000 * NSEC_PER_MSEC; - dl_se->dl_server =3D 1; - dl_server_apply_params(dl_se, runtime, period, 1); + if (dl_se =3D=3D &rq_of_dl_se(dl_se)->fair_server) { + u64 runtime =3D 50 * NSEC_PER_MSEC; + u64 period =3D 1000 * NSEC_PER_MSEC; + + BUG_ON(dl_server_apply_params(dl_se, runtime, period, 1)); =20 - dl_se->dl_defer =3D 1; + dl_se->dl_defer =3D 1; + } setup_new_dl_entity(dl_se); } =20 @@ -1669,13 +1776,14 @@ void dl_server_start(struct sched_dl_entity *dl_se) =20 dl_se->dl_server_active =3D 1; enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP); - rq =3D rq_of_dl_se(dl_se); + rq =3D rq_of_dl_se(dl_se); if (!dl_task(rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) resched_curr(rq); } =20 void dl_server_stop(struct sched_dl_entity *dl_se) { +// if (!dl_server(dl_se)) return; TODO: Check if the following is equivale= nt to this!!! if (!dl_se->dl_runtime) return; =20 @@ -1898,7 +2006,13 @@ void inc_dl_tasks(struct sched_dl_entity *dl_se, str= uct dl_rq *dl_rq) u64 deadline =3D dl_se->deadline; =20 dl_rq->dl_nr_running++; - add_nr_running(rq_of_dl_rq(dl_rq), 1); + if (!dl_server(dl_se) || dl_se =3D=3D &rq_of_dl_rq(dl_rq)->fair_server) { + add_nr_running(rq_of_dl_rq(dl_rq), 1); + } else { + struct rt_rq *rt_rq =3D &dl_se->my_q->rt; + + add_nr_running(rq_of_dl_rq(dl_rq), rt_rq->rt_nr_running); + } =20 inc_dl_deadline(dl_rq, deadline); } @@ -1908,7 +2022,13 @@ void dec_dl_tasks(struct sched_dl_entity *dl_se, str= uct dl_rq *dl_rq) { WARN_ON(!dl_rq->dl_nr_running); dl_rq->dl_nr_running--; - sub_nr_running(rq_of_dl_rq(dl_rq), 1); + if ((!dl_server(dl_se)) || dl_se =3D=3D &rq_of_dl_rq(dl_rq)->fair_server)= { + sub_nr_running(rq_of_dl_rq(dl_rq), 1); + } else { + struct rt_rq *rt_rq =3D &dl_se->my_q->rt; + + sub_nr_running(rq_of_dl_rq(dl_rq), rt_rq->rt_nr_running); + } =20 dec_dl_deadline(dl_rq, dl_se->deadline); } @@ -2445,6 +2565,7 @@ static struct task_struct *__pick_task_dl(struct rq *= rq) } goto again; } + BUG_ON(!p); rq->dl_server =3D dl_se; } else { p =3D dl_task_of(dl_se); @@ -3177,12 +3298,6 @@ DEFINE_SCHED_CLASS(dl) =3D { #endif }; =20 -/* - * Used for dl_bw check and update, used under sched_rt_handler()::mutex a= nd - * sched_domains_mutex. - */ -u64 dl_cookie; - int sched_dl_global_validate(void) { u64 runtime =3D global_rt_runtime(); diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 382126274..e348b8aba 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1,3 +1,4 @@ +#pragma GCC diagnostic ignored "-Wunused-function" // SPDX-License-Identifier: GPL-2.0 /* * Real-Time Scheduling Class (mapped to the SCHED_FIFO and SCHED_RR @@ -184,81 +185,122 @@ void free_rt_sched_group(struct task_group *tg) return; =20 for_each_possible_cpu(i) { - if (tg->rt_rq) - kfree(tg->rt_rq[i]); - if (tg->rt_se) - kfree(tg->rt_se[i]); + if (tg->dl_se) { + unsigned long flags; + + /* + * Since the dl timer is going to be cancelled, + * we risk to never decrease the running bw... + * Fix this issue by changing the group runtime + * to 0 immediately before freeing it. 
+ */ + BUG_ON(!dl_init_tg(tg->dl_se[i], 0, tg->dl_se[i]->dl_period)); + raw_spin_rq_lock_irqsave(cpu_rq(i), flags); + BUG_ON(tg->rt_rq[i]->rt_nr_running); + raw_spin_rq_unlock_irqrestore(cpu_rq(i), flags); + + hrtimer_cancel(&tg->dl_se[i]->dl_timer); + kfree(tg->dl_se[i]); + } + if (tg->rt_rq) { + struct rq *served_rq; + + served_rq =3D container_of(tg->rt_rq[i], struct rq, rt); + kfree(served_rq); + } } =20 kfree(tg->rt_rq); - kfree(tg->rt_se); + kfree(tg->dl_se); } =20 -void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq, - struct sched_rt_entity *rt_se, int cpu, - struct sched_rt_entity *parent) +void init_tg_rt_entry(struct task_group *tg, struct rq *served_rq, + struct sched_dl_entity *dl_se, int cpu, + struct sched_dl_entity *parent) { struct rq *rq =3D cpu_rq(cpu); =20 - rt_rq->highest_prio.curr =3D MAX_RT_PRIO-1; - rt_rq->rt_nr_boosted =3D 0; - rt_rq->rq =3D rq; - rt_rq->tg =3D tg; + served_rq->rt.highest_prio.curr =3D MAX_RT_PRIO-1; + served_rq->rt.rq =3D rq; + served_rq->rt.tg =3D tg; =20 - tg->rt_rq[cpu] =3D rt_rq; - tg->rt_se[cpu] =3D rt_se; + tg->rt_rq[cpu] =3D &served_rq->rt; + tg->dl_se[cpu] =3D dl_se; =20 - if (!rt_se) + if (!dl_se) return; =20 - if (!parent) - rt_se->rt_rq =3D &rq->rt; - else - rt_se->rt_rq =3D parent->my_q; + dl_se->dl_rq =3D &rq->dl; + dl_se->my_q =3D served_rq; +} =20 - rt_se->my_q =3D rt_rq; - rt_se->parent =3D parent; - INIT_LIST_HEAD(&rt_se->run_list); +static bool rt_server_has_tasks(struct sched_dl_entity *dl_se) +{ + return !!dl_se->my_q->rt.rt_nr_running; +} + +static struct task_struct *_pick_next_task_rt(struct rt_rq *rt_rq); +static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, = bool first); +static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se) +{ + struct rt_rq *rt_rq =3D &dl_se->my_q->rt; + struct rq *rq =3D rq_of_rt_rq(rt_rq); + struct task_struct *p; + + if (dl_se->my_q->rt.rt_nr_running =3D=3D 0) + return NULL; + + p =3D _pick_next_task_rt(rt_rq); + set_next_task_rt(rq, p, true); + + return p; } =20 int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) { - struct rt_rq *rt_rq; - struct sched_rt_entity *rt_se; + struct rq *s_rq; + struct sched_dl_entity *dl_se; int i; =20 if (!rt_group_sched_enabled()) return 1; =20 - tg->rt_rq =3D kcalloc(nr_cpu_ids, sizeof(rt_rq), GFP_KERNEL); + tg->rt_rq =3D kcalloc(nr_cpu_ids, sizeof(struct rt_rq *), GFP_KERNEL); if (!tg->rt_rq) goto err; - tg->rt_se =3D kcalloc(nr_cpu_ids, sizeof(rt_se), GFP_KERNEL); - if (!tg->rt_se) + tg->dl_se =3D kcalloc(nr_cpu_ids, sizeof(dl_se), GFP_KERNEL); + if (!tg->dl_se) goto err; =20 - init_rt_bandwidth(&tg->rt_bandwidth, ktime_to_ns(global_rt_period()), 0); + init_dl_bandwidth(&tg->dl_bandwidth, 0, 0); =20 for_each_possible_cpu(i) { - rt_rq =3D kzalloc_node(sizeof(struct rt_rq), + s_rq =3D kzalloc_node(sizeof(struct rq), GFP_KERNEL, cpu_to_node(i)); - if (!rt_rq) + if (!s_rq) goto err; =20 - rt_se =3D kzalloc_node(sizeof(struct sched_rt_entity), + dl_se =3D kzalloc_node(sizeof(struct sched_dl_entity), GFP_KERNEL, cpu_to_node(i)); - if (!rt_se) + if (!dl_se) goto err_free_rq; =20 - init_rt_rq(rt_rq); - rt_rq->rt_runtime =3D tg->rt_bandwidth.rt_runtime; - init_tg_rt_entry(tg, rt_rq, rt_se, i, parent->rt_se[i]); + init_rt_rq(&s_rq->rt); + init_dl_entity(dl_se); + dl_se->dl_runtime =3D tg->dl_bandwidth.dl_runtime; + dl_se->dl_period =3D tg->dl_bandwidth.dl_period; + dl_se->dl_deadline =3D dl_se->dl_period; + dl_se->dl_bw =3D to_ratio(dl_se->dl_period, dl_se->dl_runtime); + + dl_server_init(dl_se, 
&cpu_rq(i)->dl, s_rq, rt_server_has_tasks, rt_serv= er_pick); + + init_tg_rt_entry(tg, s_rq, dl_se, i, parent->dl_se[i]); } =20 return 1; =20 err_free_rq: - kfree(rt_rq); + kfree(s_rq); err: return 0; } @@ -391,6 +433,10 @@ static inline void dequeue_pushable_task(struct rt_rq = *rt_rq, struct task_struct static inline void rt_queue_push_tasks(struct rq *rq) { } + +static inline void rt_queue_pull_task(struct rq *rq) +{ +} #endif /* CONFIG_SMP */ =20 static void enqueue_top_rt_rq(struct rt_rq *rt_rq); @@ -449,7 +495,7 @@ static inline u64 sched_rt_runtime(struct rt_rq *rt_rq) =20 static inline u64 sched_rt_period(struct rt_rq *rt_rq) { - return ktime_to_ns(rt_rq->tg->rt_bandwidth.rt_period); + return ktime_to_ns(rt_rq->tg->dl_bandwidth.dl_period); } =20 typedef struct task_group *rt_rq_iter_t; @@ -952,6 +998,9 @@ static void update_curr_rt(struct rq *rq) { struct task_struct *donor =3D rq->donor; s64 delta_exec; +#ifdef CONFIG_RT_GROUP_SCHED + struct rt_rq *rt_rq; +#endif =20 if (donor->sched_class !=3D &rt_sched_class) return; @@ -961,25 +1010,17 @@ static void update_curr_rt(struct rq *rq) return; =20 #ifdef CONFIG_RT_GROUP_SCHED - struct sched_rt_entity *rt_se =3D &donor->rt; + if (!rt_group_sched_enabled()) + return; =20 - if (!rt_bandwidth_enabled()) + if (!dl_bandwidth_enabled()) return; =20 - for_each_sched_rt_entity(rt_se) { - struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); - int exceeded; + rt_rq =3D rt_rq_of_se(&donor->rt); + if (is_dl_group(rt_rq)) { + struct sched_dl_entity *dl_se =3D dl_group_of(rt_rq); =20 - if (sched_rt_runtime(rt_rq) !=3D RUNTIME_INF) { - raw_spin_lock(&rt_rq->rt_runtime_lock); - rt_rq->rt_time +=3D delta_exec; - exceeded =3D sched_rt_runtime_exceeded(rt_rq); - if (exceeded) - resched_curr(rq); - raw_spin_unlock(&rt_rq->rt_runtime_lock); - if (exceeded) - do_start_rt_bandwidth(sched_rt_bandwidth(rt_rq)); - } + dl_server_update(dl_se, delta_exec); } #endif } @@ -1033,7 +1074,7 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int pr= ev_prio) /* * Change rq's cpupri only if rt_rq is the top queue. */ - if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt !=3D rt_rq) + if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq)) return; =20 if (rq->online && prio < prev_prio) @@ -1048,7 +1089,7 @@ dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int pr= ev_prio) /* * Change rq's cpupri only if rt_rq is the top queue. 
*/ - if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt !=3D rt_rq) + if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq)) return; =20 if (rq->online && rt_rq->highest_prio.curr !=3D prev_prio) @@ -1177,19 +1218,34 @@ void inc_rt_tasks(struct sched_rt_entity *rt_se, st= ruct rt_rq *rt_rq) rt_rq->rr_nr_running +=3D rt_se_rr_nr_running(rt_se); =20 inc_rt_prio(rt_rq, prio); - inc_rt_group(rt_se, rt_rq); + + if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq)) { + struct sched_dl_entity *dl_se =3D dl_group_of(rt_rq); + + if (!dl_se->dl_throttled) + add_nr_running(rq_of_rt_rq(rt_rq), 1); + } else { + add_nr_running(rq_of_rt_rq(rt_rq), 1); + } } =20 static inline void dec_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) { WARN_ON(!rt_prio(rt_se_prio(rt_se))); - WARN_ON(!rt_rq->rt_nr_running); rt_rq->rt_nr_running -=3D rt_se_nr_running(rt_se); rt_rq->rr_nr_running -=3D rt_se_rr_nr_running(rt_se); =20 dec_rt_prio(rt_rq, rt_se_prio(rt_se)); - dec_rt_group(rt_se, rt_rq); + + if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && is_dl_group(rt_rq)) { + struct sched_dl_entity *dl_se =3D dl_group_of(rt_rq); + + if (!dl_se->dl_throttled) + sub_nr_running(rq_of_rt_rq(rt_rq), 1); + } else { + sub_nr_running(rq_of_rt_rq(rt_rq), 1); + } } =20 /* @@ -1323,21 +1379,8 @@ static void __enqueue_rt_entity(struct sched_rt_enti= ty *rt_se, unsigned int flag { struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); struct rt_prio_array *array =3D &rt_rq->active; - struct rt_rq *group_rq =3D group_rt_rq(rt_se); struct list_head *queue =3D array->queue + rt_se_prio(rt_se); =20 - /* - * Don't enqueue the group if its throttled, or when empty. - * The latter is a consequence of the former when a child group - * get throttled and the current group doesn't have any other - * active members. - */ - if (group_rq && (rt_rq_throttled(group_rq) || !group_rq->rt_nr_running)) { - if (rt_se->on_list) - __delist_rt_entity(rt_se, array); - return; - } - if (move_entity(flags)) { WARN_ON_ONCE(rt_se->on_list); if (flags & ENQUEUE_HEAD) @@ -1393,31 +1436,16 @@ static void dequeue_rt_stack(struct sched_rt_entity= *rt_se, unsigned int flags) =20 static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags) { - struct rq *rq =3D rq_of_rt_se(rt_se); - update_stats_enqueue_rt(rt_rq_of_se(rt_se), rt_se, flags); =20 - dequeue_rt_stack(rt_se, flags); - for_each_sched_rt_entity(rt_se) - __enqueue_rt_entity(rt_se, flags); - enqueue_top_rt_rq(&rq->rt); + __enqueue_rt_entity(rt_se, flags); } =20 static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags) { - struct rq *rq =3D rq_of_rt_se(rt_se); - update_stats_dequeue_rt(rt_rq_of_se(rt_se), rt_se, flags); =20 - dequeue_rt_stack(rt_se, flags); - - for_each_sched_rt_entity(rt_se) { - struct rt_rq *rt_rq =3D group_rt_rq(rt_se); - - if (rt_rq && rt_rq->rt_nr_running) - __enqueue_rt_entity(rt_se, flags); - } - enqueue_top_rt_rq(&rq->rt); + __enqueue_rt_entity(rt_se, flags); } =20 /* @@ -1435,6 +1463,15 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p= , int flags) check_schedstat_required(); update_stats_wait_start_rt(rt_rq_of_se(rt_se), rt_se); =20 +#ifdef CONFIG_RT_GROUP_SCHED + /* Task arriving in an idle group of tasks. 
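+	 * Start the group's deadline server, so that the newly
+	 * enqueued task can actually be picked and run.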
*/ + if (is_dl_group(rt_rq) && (rt_rq->rt_nr_running =3D=3D 0)) { + struct sched_dl_entity *dl_se =3D dl_group_of(rt_rq); + + dl_server_start(dl_se); + } +#endif + enqueue_rt_entity(rt_se, flags); =20 if (!task_current(rq, p) && p->nr_cpus_allowed > 1) @@ -1451,6 +1488,15 @@ static bool dequeue_task_rt(struct rq *rq, struct ta= sk_struct *p, int flags) =20 dequeue_pushable_task(rt_rq, p); =20 +#ifdef CONFIG_RT_GROUP_SCHED + /* Last task of the task group. */ + if (is_dl_group(rt_rq) && !rt_rq->rt_nr_running) { + struct sched_dl_entity *dl_se =3D dl_group_of(rt_rq); + + dl_server_stop(dl_se); + } +#endif + return true; } =20 @@ -1477,10 +1523,8 @@ static void requeue_task_rt(struct rq *rq, struct ta= sk_struct *p, int head) struct sched_rt_entity *rt_se =3D &p->rt; struct rt_rq *rt_rq; =20 - for_each_sched_rt_entity(rt_se) { - rt_rq =3D rt_rq_of_se(rt_se); - requeue_rt_entity(rt_rq, rt_se, head); - } + rt_rq =3D rt_rq_of_se(rt_se); + requeue_rt_entity(rt_rq, rt_se, head); } =20 static void yield_task_rt(struct rq *rq) @@ -1612,6 +1656,36 @@ static void wakeup_preempt_rt(struct rq *rq, struct = task_struct *p, int flags) { struct task_struct *donor =3D rq->donor; =20 +#ifdef CONFIG_RT_GROUP_SCHED + if (!rt_group_sched_enabled()) + goto no_group_sched; + + if (is_dl_group(rt_rq_of_se(&p->rt)) && + is_dl_group(rt_rq_of_se(&rq->curr->rt))) { + struct sched_dl_entity *dl_se, *curr_dl_se; + + dl_se =3D dl_group_of(rt_rq_of_se(&p->rt)); + curr_dl_se =3D dl_group_of(rt_rq_of_se(&rq->curr->rt)); + + if (dl_entity_preempt(dl_se, curr_dl_se)) { + resched_curr(rq); + return; + } else if (!dl_entity_preempt(curr_dl_se, dl_se)) { + if (p->prio < rq->curr->prio) { + resched_curr(rq); + return; + } + } + return; + } else if (is_dl_group(rt_rq_of_se(&p->rt))) { + resched_curr(rq); + return; + } else if (is_dl_group(rt_rq_of_se(&rq->curr->rt))) { + return; + } +#endif + +no_group_sched: if (p->prio < donor->prio) { resched_curr(rq); return; @@ -1679,17 +1753,12 @@ static struct sched_rt_entity *pick_next_rt_entity(= struct rt_rq *rt_rq) return next; } =20 -static struct task_struct *_pick_next_task_rt(struct rq *rq) +static struct task_struct *_pick_next_task_rt(struct rt_rq *rt_rq) { struct sched_rt_entity *rt_se; - struct rt_rq *rt_rq =3D &rq->rt; =20 - do { - rt_se =3D pick_next_rt_entity(rt_rq); - if (unlikely(!rt_se)) - return NULL; - rt_rq =3D group_rt_rq(rt_se); - } while (rt_rq); + rt_se =3D pick_next_rt_entity(rt_rq); + BUG_ON(!rt_se); =20 return rt_task_of(rt_se); } @@ -1701,7 +1770,7 @@ static struct task_struct *pick_task_rt(struct rq *rq) if (!sched_rt_runnable(rq)) return NULL; =20 - p =3D _pick_next_task_rt(rq); + p =3D _pick_next_task_rt(&rq->rt); =20 return p; } @@ -2337,12 +2406,36 @@ static void pull_rt_task(struct rq *this_rq) resched_curr(this_rq); } =20 +#ifdef CONFIG_RT_GROUP_SCHED +static int group_push_rt_task(struct rt_rq *rt_rq) +{ + struct rq *rq =3D rq_of_rt_rq(rt_rq); + + if (is_dl_group(rt_rq)) + return 0; + + return push_rt_task(rq, false); +} + +static void group_push_rt_tasks(struct rt_rq *rt_rq) +{ + while (group_push_rt_task(rt_rq)) + ; +} +#else +static void group_push_rt_tasks(struct rt_rq *rt_rq) +{ + push_rt_tasks(rq_of_rt_rq(rt_rq)); +} +#endif + /* * If we are not running and we are not going to reschedule soon, we should * try to push tasks away now */ static void task_woken_rt(struct rq *rq, struct task_struct *p) { + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); bool need_to_push =3D !task_on_cpu(rq, p) && !test_tsk_need_resched(rq->curr) && p->nr_cpus_allowed > 1 
&& @@ -2351,7 +2444,7 @@ static void task_woken_rt(struct rq *rq, struct task_= struct *p) rq->donor->prio <=3D p->prio); =20 if (need_to_push) - push_rt_tasks(rq); + group_push_rt_tasks(rt_rq); } =20 /* Assumes rq->lock is held */ @@ -2360,8 +2453,6 @@ static void rq_online_rt(struct rq *rq) if (rq->rt.overloaded) rt_set_overload(rq); =20 - __enable_runtime(rq); - cpupri_set(&rq->rd->cpupri, rq->cpu, rq->rt.highest_prio.curr); } =20 @@ -2371,8 +2462,6 @@ static void rq_offline_rt(struct rq *rq) if (rq->rt.overloaded) rt_clear_overload(rq); =20 - __disable_runtime(rq); - cpupri_set(&rq->rd->cpupri, rq->cpu, CPUPRI_INVALID); } =20 @@ -2382,6 +2471,8 @@ static void rq_offline_rt(struct rq *rq) */ static void switched_from_rt(struct rq *rq, struct task_struct *p) { + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); + /* * If there are other RT tasks then we will reschedule * and the scheduling of the other RT tasks will handle @@ -2389,10 +2480,12 @@ static void switched_from_rt(struct rq *rq, struct = task_struct *p) * we may need to handle the pulling of RT tasks * now. */ - if (!task_on_rq_queued(p) || rq->rt.rt_nr_running) + if (!task_on_rq_queued(p) || rt_rq->rt_nr_running) return; =20 +#ifndef CONFIG_RT_GROUP_SCHED rt_queue_pull_task(rq); +#endif } =20 void __init init_sched_rt_class(void) @@ -2429,8 +2522,16 @@ static void switched_to_rt(struct rq *rq, struct tas= k_struct *p) */ if (task_on_rq_queued(p)) { #ifdef CONFIG_SMP +#ifndef CONFIG_RT_GROUP_SCHED if (p->nr_cpus_allowed > 1 && rq->rt.overloaded) rt_queue_push_tasks(rq); +#else + if (rt_rq_of_se(&p->rt)->overloaded) { + } else { + if (p->prio < rq->curr->prio) + resched_curr(rq); + } +#endif #endif /* CONFIG_SMP */ if (p->prio < rq->donor->prio && cpu_online(cpu_of(rq))) resched_curr(rq); @@ -2444,6 +2545,10 @@ static void switched_to_rt(struct rq *rq, struct tas= k_struct *p) static void prio_changed_rt(struct rq *rq, struct task_struct *p, int oldprio) { +#ifdef CONFIG_SMP + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); +#endif + if (!task_on_rq_queued(p)) return; =20 @@ -2453,14 +2558,16 @@ prio_changed_rt(struct rq *rq, struct task_struct *= p, int oldprio) * If our priority decreases while running, we * may need to pull tasks to this runqueue. */ +#ifndef CONFIG_RT_GROUP_SCHED if (oldprio < p->prio) rt_queue_pull_task(rq); +#endif =20 /* * If there's a higher priority task waiting to run * then reschedule. */ - if (p->prio > rq->rt.highest_prio.curr) + if (p->prio > rt_rq->highest_prio.curr) resched_curr(rq); #else /* For UP simply resched on drop of prio */ @@ -2468,6 +2575,15 @@ prio_changed_rt(struct rq *rq, struct task_struct *p= , int oldprio) resched_curr(rq); #endif /* CONFIG_SMP */ } else { + /* + * This task is not running, thus we check against the currently + * running task for preemption. We can preempt only if both tasks are + * in the same cgroup or on the global runqueue. 
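+	 * Tasks in different groups are served by different deadline
+	 * servers, so their RT priorities are not directly comparable;
+	 * that decision is left to the servers themselves.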
+ */ + if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && + rt_rq_of_se(&p->rt)->tg !=3D rt_rq_of_se(&rq->curr->rt)->tg) + return; + /* * This task is not running, but if it is * greater than the current running task @@ -2539,12 +2655,12 @@ static void task_tick_rt(struct rq *rq, struct task= _struct *p, int queued) * Requeue to the end of queue if we (and all of our ancestors) are not * the only element on the queue */ - for_each_sched_rt_entity(rt_se) { - if (rt_se->run_list.prev !=3D rt_se->run_list.next) { - requeue_task_rt(rq, p, 0); - resched_curr(rq); - return; - } + if (rt_se->run_list.prev !=3D rt_se->run_list.next) { + requeue_task_rt(rq, p, 0); + resched_curr(rq); + // set_tsk_need_resched(p); + + return; } } =20 @@ -2562,16 +2678,16 @@ static unsigned int get_rr_interval_rt(struct rq *r= q, struct task_struct *task) #ifdef CONFIG_SCHED_CORE static int task_is_throttled_rt(struct task_struct *p, int cpu) { - struct rt_rq *rt_rq; - #ifdef CONFIG_RT_GROUP_SCHED // XXX maybe add task_rt_rq(), see also sched= _rt_period_rt_rq + struct rt_rq *rt_rq; +=09 rt_rq =3D task_group(p)->rt_rq[cpu]; WARN_ON(!rt_group_sched_enabled() && rt_rq->tg !=3D &root_task_group); + + return dl_group_of(rt_rq)->dl_throttled; #else - rt_rq =3D &cpu_rq(cpu)->rt; + return 0; #endif - - return rt_rq_throttled(rt_rq); } #endif =20 @@ -2655,8 +2771,8 @@ static int tg_rt_schedulable(struct task_group *tg, v= oid *data) unsigned long total, sum =3D 0; u64 period, runtime; =20 - period =3D ktime_to_ns(tg->rt_bandwidth.rt_period); - runtime =3D tg->rt_bandwidth.rt_runtime; + period =3D tg->dl_bandwidth.dl_period; + runtime =3D tg->dl_bandwidth.dl_runtime; =20 if (tg =3D=3D d->tg) { period =3D d->rt_period; @@ -2672,8 +2788,7 @@ static int tg_rt_schedulable(struct task_group *tg, v= oid *data) /* * Ensure we don't starve existing RT tasks if runtime turns zero. */ - if (rt_bandwidth_enabled() && !runtime && - tg->rt_bandwidth.rt_runtime && tg_has_rt_tasks(tg)) + if (dl_bandwidth_enabled() && !runtime && tg_has_rt_tasks(tg)) return -EBUSY; =20 if (WARN_ON(!rt_group_sched_enabled() && tg !=3D &root_task_group)) @@ -2687,12 +2802,17 @@ static int tg_rt_schedulable(struct task_group *tg,= void *data) if (total > to_ratio(global_rt_period(), global_rt_runtime())) return -EINVAL; =20 + if (tg =3D=3D &root_task_group) { + if (!dl_check_tg(total)) + return -EBUSY; + } + /* * The sum of our children's runtime should not exceed our own. */ list_for_each_entry_rcu(child, &tg->children, siblings) { - period =3D ktime_to_ns(child->rt_bandwidth.rt_period); - runtime =3D child->rt_bandwidth.rt_runtime; + period =3D child->dl_bandwidth.dl_period; + runtime =3D child->dl_bandwidth.dl_runtime; =20 if (child =3D=3D d->tg) { period =3D d->rt_period; @@ -2718,6 +2838,20 @@ static int __rt_schedulable(struct task_group *tg, u= 64 period, u64 runtime) .rt_runtime =3D runtime, }; =20 + /* + * Since we truncate DL_SCALE bits, make sure we're at least + * that big. + */ + if (runtime !=3D 0 && runtime < (1ULL << DL_SCALE)) + return -EINVAL; + + /* + * Since we use the MSB for wrap-around and sign issues, make + * sure it's not set (mind that period can be equal to zero). 
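+	 * These checks mirror the admission checks done on SCHED_DEADLINE
+	 * task parameters, since the group runtime and period are handed
+	 * to a deadline server.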
+ */ + if (period & (1ULL << 63)) + return -EINVAL; + rcu_read_lock(); ret =3D walk_tg_tree(tg_rt_schedulable, tg_nop, &data); rcu_read_unlock(); @@ -2752,18 +2886,21 @@ static int tg_set_rt_bandwidth(struct task_group *t= g, if (err) goto unlock; =20 - raw_spin_lock_irq(&tg->rt_bandwidth.rt_runtime_lock); - tg->rt_bandwidth.rt_period =3D ns_to_ktime(rt_period); - tg->rt_bandwidth.rt_runtime =3D rt_runtime; + raw_spin_lock_irq(&tg->dl_bandwidth.dl_runtime_lock); + tg->dl_bandwidth.dl_period =3D rt_period; + tg->dl_bandwidth.dl_runtime =3D rt_runtime; =20 - for_each_possible_cpu(i) { - struct rt_rq *rt_rq =3D tg->rt_rq[i]; + if (tg =3D=3D &root_task_group) + goto unlock_bandwidth; =20 - raw_spin_lock(&rt_rq->rt_runtime_lock); - rt_rq->rt_runtime =3D rt_runtime; - raw_spin_unlock(&rt_rq->rt_runtime_lock); + for_each_possible_cpu(i) { + if (!dl_init_tg(tg->dl_se[i], rt_runtime, rt_period)) { + err =3D -EINVAL; + break; + } } - raw_spin_unlock_irq(&tg->rt_bandwidth.rt_runtime_lock); +unlock_bandwidth: + raw_spin_unlock_irq(&tg->dl_bandwidth.dl_runtime_lock); unlock: mutex_unlock(&rt_constraints_mutex); =20 @@ -2774,7 +2911,7 @@ int sched_group_set_rt_runtime(struct task_group *tg,= long rt_runtime_us) { u64 rt_runtime, rt_period; =20 - rt_period =3D ktime_to_ns(tg->rt_bandwidth.rt_period); + rt_period =3D tg->dl_bandwidth.dl_period; rt_runtime =3D (u64)rt_runtime_us * NSEC_PER_USEC; if (rt_runtime_us < 0) rt_runtime =3D RUNTIME_INF; @@ -2788,10 +2925,10 @@ long sched_group_rt_runtime(struct task_group *tg) { u64 rt_runtime_us; =20 - if (tg->rt_bandwidth.rt_runtime =3D=3D RUNTIME_INF) + if (tg->dl_bandwidth.dl_runtime =3D=3D RUNTIME_INF) return -1; =20 - rt_runtime_us =3D tg->rt_bandwidth.rt_runtime; + rt_runtime_us =3D tg->dl_bandwidth.dl_runtime; do_div(rt_runtime_us, NSEC_PER_USEC); return rt_runtime_us; } @@ -2804,7 +2941,7 @@ int sched_group_set_rt_period(struct task_group *tg, = u64 rt_period_us) return -EINVAL; =20 rt_period =3D rt_period_us * NSEC_PER_USEC; - rt_runtime =3D tg->rt_bandwidth.rt_runtime; + rt_runtime =3D tg->dl_bandwidth.dl_runtime; =20 return tg_set_rt_bandwidth(tg, rt_period, rt_runtime); } @@ -2813,7 +2950,7 @@ long sched_group_rt_period(struct task_group *tg) { u64 rt_period_us; =20 - rt_period_us =3D ktime_to_ns(tg->rt_bandwidth.rt_period); + rt_period_us =3D tg->dl_bandwidth.dl_period; do_div(rt_period_us, NSEC_PER_USEC); return rt_period_us; } @@ -2834,7 +2971,7 @@ static int sched_rt_global_constraints(void) int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk) { /* Don't accept real-time tasks when there is no way for them to run */ - if (rt_group_sched_enabled() && rt_task(tsk) && tg->rt_bandwidth.rt_runti= me =3D=3D 0) + if (rt_group_sched_enabled() && rt_task(tsk) && tg->dl_bandwidth.dl_runti= me =3D=3D 0) return 0; =20 return 1; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 439a95239..c7227a510 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -318,6 +318,13 @@ struct rt_bandwidth { unsigned int rt_period_active; }; =20 +struct dl_bandwidth { + raw_spinlock_t dl_runtime_lock; + u64 dl_runtime; + u64 dl_period; +}; + + static inline int dl_bandwidth_enabled(void) { return sysctl_sched_rt_runtime >=3D 0; @@ -385,6 +392,8 @@ extern void dl_server_init(struct sched_dl_entity *dl_s= e, struct dl_rq *dl_rq, struct rq *served_rq, dl_server_has_tasks_f has_tasks, dl_server_pick_f pick_task); +int dl_check_tg(unsigned long total); +int dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_perio= d); =20 
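+/*
+ * dl_check_tg() and dl_init_tg() above are the group-scheduling hooks
+ * into deadline.c: the former checks whether the requested total
+ * bandwidth fits, the latter applies a new runtime and period to the
+ * per-CPU servers of a task group.
+ */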
extern void dl_server_update_idle_time(struct rq *rq, struct task_struct *p); @@ -455,9 +464,15 @@ struct task_group { =20 #ifdef CONFIG_RT_GROUP_SCHED struct sched_rt_entity **rt_se; + /* + * The scheduling entities for the task group are managed as a single + * sched_dl_entity, each of them sharing the same dl_bandwidth. + */ + struct sched_dl_entity **dl_se; struct rt_rq **rt_rq; =20 struct rt_bandwidth rt_bandwidth; + struct dl_bandwidth dl_bandwidth; #endif =20 #ifdef CONFIG_EXT_GROUP_SCHED @@ -552,9 +567,9 @@ extern void start_cfs_bandwidth(struct cfs_bandwidth *c= fs_b); extern void unthrottle_cfs_rq(struct cfs_rq *cfs_rq); extern bool cfs_task_bw_constrained(struct task_struct *p); =20 -extern void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq, - struct sched_rt_entity *rt_se, int cpu, - struct sched_rt_entity *parent); +extern void init_tg_rt_entry(struct task_group *tg, struct rq *s_rq, + struct sched_dl_entity *rt_se, int cpu, + struct sched_dl_entity *parent); extern int sched_group_set_rt_runtime(struct task_group *tg, long rt_runti= me_us); extern int sched_group_set_rt_period(struct task_group *tg, u64 rt_period_= us); extern long sched_group_rt_runtime(struct task_group *tg); @@ -784,7 +799,7 @@ struct scx_rq { =20 static inline int rt_bandwidth_enabled(void) { - return sysctl_sched_rt_runtime >=3D 0; + return 0; } =20 /* RT IPI pull logic requires IRQ_WORK */ @@ -820,12 +835,12 @@ struct rt_rq { raw_spinlock_t rt_runtime_lock; =20 unsigned int rt_nr_boosted; - - struct rq *rq; /* this is always top-level rq, cache? */ #endif #ifdef CONFIG_CGROUP_SCHED struct task_group *tg; /* this tg has "this" rt_rq on given CPU for runna= ble entities */ #endif + + struct rq *rq; /* this is always top-level rq, cache? */ }; =20 static inline bool rt_rq_is_runnable(struct rt_rq *rt_rq) @@ -2174,7 +2189,7 @@ static inline void set_task_rq(struct task_struct *p,= unsigned int cpu) if (!rt_group_sched_enabled()) tg =3D &root_task_group; p->rt.rt_rq =3D tg->rt_rq[cpu]; - p->rt.parent =3D tg->rt_se[cpu]; + p->dl.dl_rq =3D &cpu_rq(cpu)->dl; #endif } =20 @@ -2702,6 +2717,7 @@ extern void resched_cpu(int cpu); extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 r= untime); extern bool sched_rt_bandwidth_account(struct rt_rq *rt_rq); =20 +void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime); extern void init_dl_entity(struct sched_dl_entity *dl_se); =20 #define BW_SHIFT 20 @@ -2760,6 +2776,7 @@ static inline void add_nr_running(struct rq *rq, unsi= gned count) =20 static inline void sub_nr_running(struct rq *rq, unsigned count) { + BUG_ON(rq->nr_running < count); rq->nr_running -=3D count; if (trace_sched_update_nr_running_tp_enabled()) { call_trace_sched_update_nr_running(rq, -count); @@ -3131,9 +3148,6 @@ static inline void double_rq_unlock(struct rq *rq1, s= truct rq *rq2) #ifdef CONFIG_RT_GROUP_SCHED static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se) { -#ifdef CONFIG_SCHED_DEBUG - WARN_ON_ONCE(rt_se->my_q); -#endif return container_of(rt_se, struct task_struct, rt); } =20 @@ -3153,6 +3167,21 @@ static inline struct rq *rq_of_rt_se(struct sched_rt= _entity *rt_se) =20 return rt_rq->rq; } + +static inline int is_dl_group(struct rt_rq *rt_rq) +{ + return rt_rq->tg !=3D &root_task_group; +} + +/* + * Return the scheduling entity of this group of tasks. 
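+ * This is the per-CPU deadline server of the task group owning the
+ * given rt_rq; it must only be called on group runqueues, see
+ * is_dl_group().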
+ */ +static inline struct sched_dl_entity *dl_group_of(struct rt_rq *rt_rq) +{ + BUG_ON(!is_dl_group(rt_rq)); + + return rt_rq->tg->dl_se[cpu_of(rt_rq->rq)]; +} #else static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se) { @@ -3177,6 +3206,16 @@ static inline struct rt_rq *rt_rq_of_se(struct sched= _rt_entity *rt_se) =20 return &rq->rt; } + +static inline int is_dl_group(struct rt_rq *rt_rq) +{ + return 0; +} + +static inline struct sched_dl_entity *dl_group_of(struct rt_rq *rt_rq) +{ + return NULL; +} #endif =20 DEFINE_LOCK_GUARD_2(double_rq_lock, struct rq, diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index 547c1f05b..6c6666b39 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -635,8 +635,8 @@ int __sched_setscheduler(struct task_struct *p, * assigned. */ if (rt_group_sched_enabled() && - rt_bandwidth_enabled() && rt_policy(policy) && - task_group(p)->rt_bandwidth.rt_runtime =3D=3D 0 && + dl_bandwidth_enabled() && rt_policy(policy) && + task_group(p)->dl_bandwidth.dl_runtime =3D=3D 0 && !task_group_is_autogroup(task_group(p))) { retval =3D -EPERM; goto unlock; --=20 2.49.0 From nobody Fri Dec 19 19:07:39 2025 Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EFF77205ABA for ; Thu, 5 Jun 2025 07:14:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749107673; cv=none; b=sxnJ3kIYLkee+H02E4TaRU/dLZo5fomfvwrjYILOfRxWCRPfQTc/pEpK3HjejsjVp7KNc8wV7P3s9MsRwyMrkfZbv/2qonm7rgCZJ+qr9n4JG5eauBG8mInigvQOrApEb2h6i+t44UyOgPr9eKaxNeMohALgsbZ1QGshCyD5cNg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749107673; c=relaxed/simple; bh=HlDS33kOBGBT4LPfu7vpBEDWwYjNO8ZrN08EzF9IAnM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YDWd5bOL3LMlccMyu4AHriJyi0yQ+UV9z+QEJgrcVxmxKwq027XgHNgV8EKjIdGsyzps5GIGbgAltBEK8W2yy4JVhAnbq6KIU4E14PsqfJCdHwj56R5u4SKP2W8vAHgkNoY6uCDkIwkvcECbJHm9KVFpw6qMOF0kuEVRtMI/Dto= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=UJnEUn6G; arc=none smtp.client-ip=209.85.128.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UJnEUn6G" Received: by mail-wm1-f42.google.com with SMTP id 5b1f17b1804b1-450dd065828so4223275e9.2 for ; Thu, 05 Jun 2025 00:14:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749107669; x=1749712469; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hIucJawQcNvXGPrYMEIEFQvoe2AQAVhdYJJdukK2UUk=; b=UJnEUn6GE0DX2D3uxeUVs3NmpM1yluAg126HEuYiL7ZfZ1mYjELJXt7Yix31U3QF8+ RN1++LPO7RaP0nk/l2zAne+tNhHOdxOcmHlao0+tUPKJM6qN3H+H3F8whHcZQRDYPu3s P93KCgn0l8DqLOit0obNrDNuxAQw0RlzfyGWOA6eJF7Pannr7VCIvQCUP4s7KTo4qfPb 
2bNvYKl4twNyPo2NV+s8wQWEjMVDIF6h+YGKqasWLjHutfLoavZlbPci+yn4k4rgMTGK vObNRGtoq9KGTn02OBNKPSd9mo3RyAopi625kKV0bBSrBF3OSfducBO9uWDLVYHQAo1i Zkmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1749107669; x=1749712469; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hIucJawQcNvXGPrYMEIEFQvoe2AQAVhdYJJdukK2UUk=; b=eRA/kZnNfE2CThpu02FRrZ0/erGaxYT/rhfv+H820XqDhDiuu+729ptf+XlKN9ZdUA GyOIDs6d5hO8xIg9qS7JGoFqtZs7j2hCaCEMfk7K72oecTsSxDcz9kUl1loe0ZZcqKS0 ttzQre0gtIS6ERA+CnaQG5xzEWC1psx+KSofUU8ulSAJHTZ/0It7cN/9sgCNr8x5M7+0 L4WJwGHY1h8vbR0zrBsn1p/kIu5mNIxrvIhv0AwX0wgMsHJNAqFZz6VkyYK6dQLi2EFV rnKANOMF7sbw/Koo6+KiFiRngxOaW9br7OX5Vfv/LS/KeIQA97/uguZ+JlFS8AlRYXH+ EA6w== X-Gm-Message-State: AOJu0YyZf6ab6WyiwmNvn3Lus5juW/vrPkVXVsJz4E1wAx0p7Orqsm1T Jz9Ju8Jn1+5u2SU4NgOouJs/M6uuxD4hbQxx4knHGFWQvOaEstmYiVWT X-Gm-Gg: ASbGnctmXCj0SH6ZUNG45rYh3lbuRpHsUy+MoDSpQwG3exseF4Se+MMC8pdjVo8xcLJ yPRhJyRp7/zbUaRSOnp7mJkmrho8PFy7Nqk1z3OPT+etYFogdB2qT2eVTRzYhfk+r8SVFFUm4Il sl0mq00qnv+KaLfTdAp7vhc6nki7xS865FsNmEOp3s/2FqoYAB6irpBmjCDg2ngeKikFNd376b2 tNAd7uE76bXsBv1V2qHo1smWAIk6/sHxezHsVSwNKZsUEWQYBMo830efBev2jQd2tOrfoi9Fvc7 /qsocLONhQ+MsddwPuFpAmhVsi7fRVhPOmTxQpKjh3LldQrO76BeNySGOlKJCO7vJTaYVAeLNy3 ZzN4rBFc1dg== X-Google-Smtp-Source: AGHT+IHppqw7anAAdOJNEQBWhVN9zAEwM3lpZJ6731hs9YWSSkwCDOBMt26c0EbhM6ehcXkH0n3q/A== X-Received: by 2002:a05:600c:1d1:b0:43c:efed:732c with SMTP id 5b1f17b1804b1-451f50a2953mr28997065e9.28.1749107668757; Thu, 05 Jun 2025 00:14:28 -0700 (PDT) Received: from localhost.localdomain ([78.210.56.234]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-3a4f00972b5sm23885431f8f.76.2025.06.05.00.14.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Jun 2025 00:14:28 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider Cc: linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH 6/9] sched/rt: Remove unused code Date: Thu, 5 Jun 2025 09:14:09 +0200 Message-ID: <20250605071412.139240-7-yurand2000@gmail.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com> References: <20250605071412.139240-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: luca abeni Remove the old RT_GROUP_SCHED code, as it is not used anymore. 
Signed-off-by: luca abeni --- include/linux/sched.h | 4 - kernel/sched/core.c | 1 - kernel/sched/deadline.c | 34 -- kernel/sched/debug.c | 6 - kernel/sched/rt.c | 698 +--------------------------------------- kernel/sched/sched.h | 32 +- 6 files changed, 11 insertions(+), 764 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 6dd86d13c..d03190526 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -623,13 +623,9 @@ struct sched_rt_entity { unsigned short on_rq; unsigned short on_list; =20 - struct sched_rt_entity *back; #ifdef CONFIG_RT_GROUP_SCHED - struct sched_rt_entity *parent; /* rq on which this entity is (to be) queued: */ struct rt_rq *rt_rq; - /* rq "owned" by this entity/group: */ - struct rt_rq *my_q; #endif } __randomize_layout; =20 diff --git a/kernel/sched/core.c b/kernel/sched/core.c index c07fddbf2..e90b3608a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8620,7 +8620,6 @@ void __init sched_init(void) * starts working after scheduler_running, which is not the case * yet. */ - rq->rt.rt_runtime =3D global_rt_runtime(); init_tg_rt_entry(&root_task_group, rq, NULL, i, NULL); #endif #ifdef CONFIG_SMP diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 6589077c0..b07abbb60 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1671,40 +1671,6 @@ static void update_curr_dl_se(struct rq *rq, struct = sched_dl_entity *dl_se, s64 if (!is_leftmost(dl_se, &rq->dl)) resched_curr(rq); } - - /* - * The fair server (sole dl_server) does not account for real-time - * workload because it is running fair work. - */ - if (dl_se =3D=3D &rq->fair_server) - return; - -#ifdef CONFIG_RT_GROUP_SCHED - /* - * Because -- for now -- we share the rt bandwidth, we need to - * account our runtime there too, otherwise actual rt tasks - * would be able to exceed the shared quota. - * - * Account to the root rt group for now. - * - * The solution we're working towards is having the RT groups scheduled - * using deadline servers -- however there's a few nasties to figure - * out before that can happen. - */ - if (rt_bandwidth_enabled()) { - struct rt_rq *rt_rq =3D &rq->rt; - - raw_spin_lock(&rt_rq->rt_runtime_lock); - /* - * We'll let actual RT tasks worry about the overflow here, we - * have our own CBS to keep us inline; only account when RT - * bandwidth is relevant. 
- */ - if (sched_rt_bandwidth_account(rt_rq)) - rt_rq->rt_time +=3D delta_exec; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - } -#endif } =20 /* diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index 9d71baf08..524ae64f0 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -907,12 +907,6 @@ void print_rt_rq(struct seq_file *m, int cpu, struct r= t_rq *rt_rq) =20 PU(rt_nr_running); =20 -#ifdef CONFIG_RT_GROUP_SCHED - P(rt_throttled); - PN(rt_time); - PN(rt_runtime); -#endif - #undef PN #undef PU #undef P diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index e348b8aba..9d17bda66 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1,4 +1,3 @@ -#pragma GCC diagnostic ignored "-Wunused-function" // SPDX-License-Identifier: GPL-2.0 /* * Real-Time Scheduling Class (mapped to the SCHED_FIFO and SCHED_RR @@ -82,99 +81,12 @@ void init_rt_rq(struct rt_rq *rt_rq) rt_rq->overloaded =3D 0; plist_head_init(&rt_rq->pushable_tasks); #endif /* CONFIG_SMP */ - /* We start is dequeued state, because no RT tasks are queued */ - rt_rq->rt_queued =3D 0; - -#ifdef CONFIG_RT_GROUP_SCHED - rt_rq->rt_time =3D 0; - rt_rq->rt_throttled =3D 0; - rt_rq->rt_runtime =3D 0; - raw_spin_lock_init(&rt_rq->rt_runtime_lock); - rt_rq->tg =3D &root_task_group; -#endif } =20 #ifdef CONFIG_RT_GROUP_SCHED =20 -static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun= ); - -static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer) -{ - struct rt_bandwidth *rt_b =3D - container_of(timer, struct rt_bandwidth, rt_period_timer); - int idle =3D 0; - int overrun; - - raw_spin_lock(&rt_b->rt_runtime_lock); - for (;;) { - overrun =3D hrtimer_forward_now(timer, rt_b->rt_period); - if (!overrun) - break; - - raw_spin_unlock(&rt_b->rt_runtime_lock); - idle =3D do_sched_rt_period_timer(rt_b, overrun); - raw_spin_lock(&rt_b->rt_runtime_lock); - } - if (idle) - rt_b->rt_period_active =3D 0; - raw_spin_unlock(&rt_b->rt_runtime_lock); - - return idle ? HRTIMER_NORESTART : HRTIMER_RESTART; -} - -void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime) -{ - rt_b->rt_period =3D ns_to_ktime(period); - rt_b->rt_runtime =3D runtime; - - raw_spin_lock_init(&rt_b->rt_runtime_lock); - - hrtimer_setup(&rt_b->rt_period_timer, sched_rt_period_timer, CLOCK_MONOTO= NIC, - HRTIMER_MODE_REL_HARD); -} - -static inline void do_start_rt_bandwidth(struct rt_bandwidth *rt_b) -{ - raw_spin_lock(&rt_b->rt_runtime_lock); - if (!rt_b->rt_period_active) { - rt_b->rt_period_active =3D 1; - /* - * SCHED_DEADLINE updates the bandwidth, as a run away - * RT task with a DL task could hog a CPU. But DL does - * not reset the period. If a deadline task was running - * without an RT task running, it can cause RT tasks to - * throttle when they start up. Kick the timer right away - * to update the period. 
- */ - hrtimer_forward_now(&rt_b->rt_period_timer, ns_to_ktime(0)); - hrtimer_start_expires(&rt_b->rt_period_timer, - HRTIMER_MODE_ABS_PINNED_HARD); - } - raw_spin_unlock(&rt_b->rt_runtime_lock); -} - -static void start_rt_bandwidth(struct rt_bandwidth *rt_b) -{ - if (!rt_bandwidth_enabled() || rt_b->rt_runtime =3D=3D RUNTIME_INF) - return; - - do_start_rt_bandwidth(rt_b); -} - -static void destroy_rt_bandwidth(struct rt_bandwidth *rt_b) -{ - hrtimer_cancel(&rt_b->rt_period_timer); -} - -#define rt_entity_is_task(rt_se) (!(rt_se)->my_q) - void unregister_rt_sched_group(struct task_group *tg) { - if (!rt_group_sched_enabled()) - return; - - if (tg->rt_se) - destroy_rt_bandwidth(&tg->rt_bandwidth); } =20 void free_rt_sched_group(struct task_group *tg) @@ -307,8 +219,6 @@ int alloc_rt_sched_group(struct task_group *tg, struct = task_group *parent) =20 #else /* CONFIG_RT_GROUP_SCHED */ =20 -#define rt_entity_is_task(rt_se) (1) - void unregister_rt_sched_group(struct task_group *tg) { } =20 void free_rt_sched_group(struct task_group *tg) { } @@ -439,9 +349,6 @@ static inline void rt_queue_pull_task(struct rq *rq) } #endif /* CONFIG_SMP */ =20 -static void enqueue_top_rt_rq(struct rt_rq *rt_rq); -static void dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count); - static inline int on_rt_rq(struct sched_rt_entity *rt_se) { return rt_se->on_rq; @@ -488,16 +395,6 @@ static inline bool rt_task_fits_capacity(struct task_s= truct *p, int cpu) =20 #ifdef CONFIG_RT_GROUP_SCHED =20 -static inline u64 sched_rt_runtime(struct rt_rq *rt_rq) -{ - return rt_rq->rt_runtime; -} - -static inline u64 sched_rt_period(struct rt_rq *rt_rq) -{ - return ktime_to_ns(rt_rq->tg->dl_bandwidth.dl_period); -} - typedef struct task_group *rt_rq_iter_t; =20 static inline struct task_group *next_task_group(struct task_group *tg) @@ -523,407 +420,9 @@ static inline struct task_group *next_task_group(stru= ct task_group *tg) iter && (rt_rq =3D iter->rt_rq[cpu_of(rq)]); \ iter =3D next_task_group(iter)) =20 -#define for_each_sched_rt_entity(rt_se) \ - for (; rt_se; rt_se =3D rt_se->parent) - -static inline struct rt_rq *group_rt_rq(struct sched_rt_entity *rt_se) -{ - return rt_se->my_q; -} - static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags); static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags); =20 -static void sched_rt_rq_enqueue(struct rt_rq *rt_rq) -{ - struct task_struct *donor =3D rq_of_rt_rq(rt_rq)->donor; - struct rq *rq =3D rq_of_rt_rq(rt_rq); - struct sched_rt_entity *rt_se; - - int cpu =3D cpu_of(rq); - - rt_se =3D rt_rq->tg->rt_se[cpu]; - - if (rt_rq->rt_nr_running) { - if (!rt_se) - enqueue_top_rt_rq(rt_rq); - else if (!on_rt_rq(rt_se)) - enqueue_rt_entity(rt_se, 0); - - if (rt_rq->highest_prio.curr < donor->prio) - resched_curr(rq); - } -} - -static void sched_rt_rq_dequeue(struct rt_rq *rt_rq) -{ - struct sched_rt_entity *rt_se; - int cpu =3D cpu_of(rq_of_rt_rq(rt_rq)); - - rt_se =3D rt_rq->tg->rt_se[cpu]; - - if (!rt_se) { - dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running); - /* Kick cpufreq (see the comment in kernel/sched/sched.h). 
*/ - cpufreq_update_util(rq_of_rt_rq(rt_rq), 0); - } - else if (on_rt_rq(rt_se)) - dequeue_rt_entity(rt_se, 0); -} - -static inline int rt_rq_throttled(struct rt_rq *rt_rq) -{ - return rt_rq->rt_throttled && !rt_rq->rt_nr_boosted; -} - -static int rt_se_boosted(struct sched_rt_entity *rt_se) -{ - struct rt_rq *rt_rq =3D group_rt_rq(rt_se); - struct task_struct *p; - - if (rt_rq) - return !!rt_rq->rt_nr_boosted; - - p =3D rt_task_of(rt_se); - return p->prio !=3D p->normal_prio; -} - -#ifdef CONFIG_SMP -static inline const struct cpumask *sched_rt_period_mask(void) -{ - return this_rq()->rd->span; -} -#else -static inline const struct cpumask *sched_rt_period_mask(void) -{ - return cpu_online_mask; -} -#endif - -static inline -struct rt_rq *sched_rt_period_rt_rq(struct rt_bandwidth *rt_b, int cpu) -{ - return container_of(rt_b, struct task_group, rt_bandwidth)->rt_rq[cpu]; -} - -static inline struct rt_bandwidth *sched_rt_bandwidth(struct rt_rq *rt_rq) -{ - return &rt_rq->tg->rt_bandwidth; -} - -bool sched_rt_bandwidth_account(struct rt_rq *rt_rq) -{ - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - - return (hrtimer_active(&rt_b->rt_period_timer) || - rt_rq->rt_time < rt_b->rt_runtime); -} - -#ifdef CONFIG_SMP -/* - * We ran out of runtime, see if we can borrow some from our neighbours. - */ -static void do_balance_runtime(struct rt_rq *rt_rq) -{ - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - struct root_domain *rd =3D rq_of_rt_rq(rt_rq)->rd; - int i, weight; - u64 rt_period; - - weight =3D cpumask_weight(rd->span); - - raw_spin_lock(&rt_b->rt_runtime_lock); - rt_period =3D ktime_to_ns(rt_b->rt_period); - for_each_cpu(i, rd->span) { - struct rt_rq *iter =3D sched_rt_period_rt_rq(rt_b, i); - s64 diff; - - if (iter =3D=3D rt_rq) - continue; - - raw_spin_lock(&iter->rt_runtime_lock); - /* - * Either all rqs have inf runtime and there's nothing to steal - * or __disable_runtime() below sets a specific rq to inf to - * indicate its been disabled and disallow stealing. - */ - if (iter->rt_runtime =3D=3D RUNTIME_INF) - goto next; - - /* - * From runqueues with spare time, take 1/n part of their - * spare time, but no more than our period. - */ - diff =3D iter->rt_runtime - iter->rt_time; - if (diff > 0) { - diff =3D div_u64((u64)diff, weight); - if (rt_rq->rt_runtime + diff > rt_period) - diff =3D rt_period - rt_rq->rt_runtime; - iter->rt_runtime -=3D diff; - rt_rq->rt_runtime +=3D diff; - if (rt_rq->rt_runtime =3D=3D rt_period) { - raw_spin_unlock(&iter->rt_runtime_lock); - break; - } - } -next: - raw_spin_unlock(&iter->rt_runtime_lock); - } - raw_spin_unlock(&rt_b->rt_runtime_lock); -} - -/* - * Ensure this RQ takes back all the runtime it lend to its neighbours. - */ -static void __disable_runtime(struct rq *rq) -{ - struct root_domain *rd =3D rq->rd; - rt_rq_iter_t iter; - struct rt_rq *rt_rq; - - if (unlikely(!scheduler_running)) - return; - - for_each_rt_rq(rt_rq, iter, rq) { - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - s64 want; - int i; - - raw_spin_lock(&rt_b->rt_runtime_lock); - raw_spin_lock(&rt_rq->rt_runtime_lock); - /* - * Either we're all inf and nobody needs to borrow, or we're - * already disabled and thus have nothing to do, or we have - * exactly the right amount of runtime to take out. 
- */ - if (rt_rq->rt_runtime =3D=3D RUNTIME_INF || - rt_rq->rt_runtime =3D=3D rt_b->rt_runtime) - goto balanced; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - - /* - * Calculate the difference between what we started out with - * and what we current have, that's the amount of runtime - * we lend and now have to reclaim. - */ - want =3D rt_b->rt_runtime - rt_rq->rt_runtime; - - /* - * Greedy reclaim, take back as much as we can. - */ - for_each_cpu(i, rd->span) { - struct rt_rq *iter =3D sched_rt_period_rt_rq(rt_b, i); - s64 diff; - - /* - * Can't reclaim from ourselves or disabled runqueues. - */ - if (iter =3D=3D rt_rq || iter->rt_runtime =3D=3D RUNTIME_INF) - continue; - - raw_spin_lock(&iter->rt_runtime_lock); - if (want > 0) { - diff =3D min_t(s64, iter->rt_runtime, want); - iter->rt_runtime -=3D diff; - want -=3D diff; - } else { - iter->rt_runtime -=3D want; - want -=3D want; - } - raw_spin_unlock(&iter->rt_runtime_lock); - - if (!want) - break; - } - - raw_spin_lock(&rt_rq->rt_runtime_lock); - /* - * We cannot be left wanting - that would mean some runtime - * leaked out of the system. - */ - WARN_ON_ONCE(want); -balanced: - /* - * Disable all the borrow logic by pretending we have inf - * runtime - in which case borrowing doesn't make sense. - */ - rt_rq->rt_runtime =3D RUNTIME_INF; - rt_rq->rt_throttled =3D 0; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - raw_spin_unlock(&rt_b->rt_runtime_lock); - - /* Make rt_rq available for pick_next_task() */ - sched_rt_rq_enqueue(rt_rq); - } -} - -static void __enable_runtime(struct rq *rq) -{ - rt_rq_iter_t iter; - struct rt_rq *rt_rq; - - if (unlikely(!scheduler_running)) - return; - - /* - * Reset each runqueue's bandwidth settings - */ - for_each_rt_rq(rt_rq, iter, rq) { - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - - raw_spin_lock(&rt_b->rt_runtime_lock); - raw_spin_lock(&rt_rq->rt_runtime_lock); - rt_rq->rt_runtime =3D rt_b->rt_runtime; - rt_rq->rt_time =3D 0; - rt_rq->rt_throttled =3D 0; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - raw_spin_unlock(&rt_b->rt_runtime_lock); - } -} - -static void balance_runtime(struct rt_rq *rt_rq) -{ - if (!sched_feat(RT_RUNTIME_SHARE)) - return; - - if (rt_rq->rt_time > rt_rq->rt_runtime) { - raw_spin_unlock(&rt_rq->rt_runtime_lock); - do_balance_runtime(rt_rq); - raw_spin_lock(&rt_rq->rt_runtime_lock); - } -} -#else /* !CONFIG_SMP */ -static inline void balance_runtime(struct rt_rq *rt_rq) {} -#endif /* CONFIG_SMP */ - -static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun) -{ - int i, idle =3D 1, throttled =3D 0; - const struct cpumask *span; - - span =3D sched_rt_period_mask(); - - /* - * FIXME: isolated CPUs should really leave the root task group, - * whether they are isolcpus or were isolated via cpusets, lest - * the timer run on a CPU which does not service all runqueues, - * potentially leaving other CPUs indefinitely throttled. If - * isolation is really required, the user will turn the throttle - * off to kill the perturbations it causes anyway. Meanwhile, - * this maintains functionality for boot and/or troubleshooting. - */ - if (rt_b =3D=3D &root_task_group.rt_bandwidth) - span =3D cpu_online_mask; - - for_each_cpu(i, span) { - int enqueue =3D 0; - struct rt_rq *rt_rq =3D sched_rt_period_rt_rq(rt_b, i); - struct rq *rq =3D rq_of_rt_rq(rt_rq); - struct rq_flags rf; - int skip; - - /* - * When span =3D=3D cpu_online_mask, taking each rq->lock - * can be time-consuming. Try to avoid it when possible. 
- */ - raw_spin_lock(&rt_rq->rt_runtime_lock); - if (!sched_feat(RT_RUNTIME_SHARE) && rt_rq->rt_runtime !=3D RUNTIME_INF) - rt_rq->rt_runtime =3D rt_b->rt_runtime; - skip =3D !rt_rq->rt_time && !rt_rq->rt_nr_running; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - if (skip) - continue; - - rq_lock(rq, &rf); - update_rq_clock(rq); - - if (rt_rq->rt_time) { - u64 runtime; - - raw_spin_lock(&rt_rq->rt_runtime_lock); - if (rt_rq->rt_throttled) - balance_runtime(rt_rq); - runtime =3D rt_rq->rt_runtime; - rt_rq->rt_time -=3D min(rt_rq->rt_time, overrun*runtime); - if (rt_rq->rt_throttled && rt_rq->rt_time < runtime) { - rt_rq->rt_throttled =3D 0; - enqueue =3D 1; - - /* - * When we're idle and a woken (rt) task is - * throttled wakeup_preempt() will set - * skip_update and the time between the wakeup - * and this unthrottle will get accounted as - * 'runtime'. - */ - if (rt_rq->rt_nr_running && rq->curr =3D=3D rq->idle) - rq_clock_cancel_skipupdate(rq); - } - if (rt_rq->rt_time || rt_rq->rt_nr_running) - idle =3D 0; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - } else if (rt_rq->rt_nr_running) { - idle =3D 0; - if (!rt_rq_throttled(rt_rq)) - enqueue =3D 1; - } - if (rt_rq->rt_throttled) - throttled =3D 1; - - if (enqueue) - sched_rt_rq_enqueue(rt_rq); - rq_unlock(rq, &rf); - } - - if (!throttled && (!rt_bandwidth_enabled() || rt_b->rt_runtime =3D=3D RUN= TIME_INF)) - return 1; - - return idle; -} - -static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq) -{ - u64 runtime =3D sched_rt_runtime(rt_rq); - - if (rt_rq->rt_throttled) - return rt_rq_throttled(rt_rq); - - if (runtime >=3D sched_rt_period(rt_rq)) - return 0; - - balance_runtime(rt_rq); - runtime =3D sched_rt_runtime(rt_rq); - if (runtime =3D=3D RUNTIME_INF) - return 0; - - if (rt_rq->rt_time > runtime) { - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - - /* - * Don't actually throttle groups that have no runtime assigned - * but accrue some time due to boosting. - */ - if (likely(rt_b->rt_runtime)) { - rt_rq->rt_throttled =3D 1; - printk_deferred_once("sched: RT throttling activated\n"); - } else { - /* - * In case we did anyway, make it go away, - * replenishment is a joke, since it will replenish us - * with exactly 0 ns. 
- */ - rt_rq->rt_time =3D 0; - } - - if (rt_rq_throttled(rt_rq)) { - sched_rt_rq_dequeue(rt_rq); - return 1; - } - } - - return 0; -} - #else /* !CONFIG_RT_GROUP_SCHED */ =20 typedef struct rt_rq *rt_rq_iter_t; @@ -931,62 +430,10 @@ typedef struct rt_rq *rt_rq_iter_t; #define for_each_rt_rq(rt_rq, iter, rq) \ for ((void) iter, rt_rq =3D &rq->rt; rt_rq; rt_rq =3D NULL) =20 -#define for_each_sched_rt_entity(rt_se) \ - for (; rt_se; rt_se =3D NULL) - -static inline struct rt_rq *group_rt_rq(struct sched_rt_entity *rt_se) -{ - return NULL; -} - -static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq) -{ - struct rq *rq =3D rq_of_rt_rq(rt_rq); - - if (!rt_rq->rt_nr_running) - return; - - enqueue_top_rt_rq(rt_rq); - resched_curr(rq); -} - -static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq) -{ - dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running); -} - -static inline int rt_rq_throttled(struct rt_rq *rt_rq) -{ - return false; -} - -static inline const struct cpumask *sched_rt_period_mask(void) -{ - return cpu_online_mask; -} - -static inline -struct rt_rq *sched_rt_period_rt_rq(struct rt_bandwidth *rt_b, int cpu) -{ - return &cpu_rq(cpu)->rt; -} - -#ifdef CONFIG_SMP -static void __enable_runtime(struct rq *rq) { } -static void __disable_runtime(struct rq *rq) { } -#endif - #endif /* CONFIG_RT_GROUP_SCHED */ =20 static inline int rt_se_prio(struct sched_rt_entity *rt_se) { -#ifdef CONFIG_RT_GROUP_SCHED - struct rt_rq *rt_rq =3D group_rt_rq(rt_se); - - if (rt_rq) - return rt_rq->highest_prio.curr; -#endif - return rt_task_of(rt_se)->prio; } =20 @@ -1025,45 +472,6 @@ static void update_curr_rt(struct rq *rq) #endif } =20 -static void -dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count) -{ - struct rq *rq =3D rq_of_rt_rq(rt_rq); - - BUG_ON(&rq->rt !=3D rt_rq); - - if (!rt_rq->rt_queued) - return; - - BUG_ON(!rq->nr_running); - - sub_nr_running(rq, count); - rt_rq->rt_queued =3D 0; - -} - -static void -enqueue_top_rt_rq(struct rt_rq *rt_rq) -{ - struct rq *rq =3D rq_of_rt_rq(rt_rq); - - BUG_ON(&rq->rt !=3D rt_rq); - - if (rt_rq->rt_queued) - return; - - if (rt_rq_throttled(rt_rq)) - return; - - if (rt_rq->rt_nr_running) { - add_nr_running(rq, rt_rq->rt_nr_running); - rt_rq->rt_queued =3D 1; - } - - /* Kick cpufreq (see the comment in kernel/sched/sched.h). 
*/ - cpufreq_update_util(rq, 0); -} - #if defined CONFIG_SMP =20 static void @@ -1151,58 +559,17 @@ static inline void dec_rt_prio(struct rt_rq *rt_rq, = int prio) {} =20 #endif /* CONFIG_SMP || CONFIG_RT_GROUP_SCHED */ =20 -#ifdef CONFIG_RT_GROUP_SCHED - -static void -inc_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) -{ - if (rt_se_boosted(rt_se)) - rt_rq->rt_nr_boosted++; - - start_rt_bandwidth(&rt_rq->tg->rt_bandwidth); -} - -static void -dec_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) -{ - if (rt_se_boosted(rt_se)) - rt_rq->rt_nr_boosted--; - - WARN_ON(!rt_rq->rt_nr_running && rt_rq->rt_nr_boosted); -} - -#else /* CONFIG_RT_GROUP_SCHED */ - -static void -inc_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) -{ -} - -static inline -void dec_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) {} - -#endif /* CONFIG_RT_GROUP_SCHED */ - static inline unsigned int rt_se_nr_running(struct sched_rt_entity *rt_se) { - struct rt_rq *group_rq =3D group_rt_rq(rt_se); - - if (group_rq) - return group_rq->rt_nr_running; - else - return 1; + return 1; } =20 static inline unsigned int rt_se_rr_nr_running(struct sched_rt_entity *rt_se) { - struct rt_rq *group_rq =3D group_rt_rq(rt_se); struct task_struct *tsk; =20 - if (group_rq) - return group_rq->rr_nr_running; - tsk =3D rt_task_of(rt_se); =20 return (tsk->policy =3D=3D SCHED_RR) ? 1 : 0; @@ -1274,10 +641,6 @@ static void __delist_rt_entity(struct sched_rt_entity= *rt_se, struct rt_prio_arr static inline struct sched_statistics * __schedstats_from_rt_se(struct sched_rt_entity *rt_se) { - /* schedstats is not supported for rt group. */ - if (!rt_entity_is_task(rt_se)) - return NULL; - return &rt_task_of(rt_se)->stats; } =20 @@ -1290,9 +653,7 @@ update_stats_wait_start_rt(struct rt_rq *rt_rq, struct= sched_rt_entity *rt_se) if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); stats =3D __schedstats_from_rt_se(rt_se); if (!stats) return; @@ -1309,9 +670,7 @@ update_stats_enqueue_sleeper_rt(struct rt_rq *rt_rq, s= truct sched_rt_entity *rt_ if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); stats =3D __schedstats_from_rt_se(rt_se); if (!stats) return; @@ -1339,9 +698,7 @@ update_stats_wait_end_rt(struct rt_rq *rt_rq, struct s= ched_rt_entity *rt_se) if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); stats =3D __schedstats_from_rt_se(rt_se); if (!stats) return; @@ -1358,9 +715,7 @@ update_stats_dequeue_rt(struct rt_rq *rt_rq, struct sc= hed_rt_entity *rt_se, if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); if ((flags & DEQUEUE_SLEEP) && p) { unsigned int state; =20 @@ -1410,30 +765,6 @@ static void __dequeue_rt_entity(struct sched_rt_entit= y *rt_se, unsigned int flag dec_rt_tasks(rt_se, rt_rq); } =20 -/* - * Because the prio of an upper entry depends on the lower - * entries, we must remove entries top - down. 
- */ -static void dequeue_rt_stack(struct sched_rt_entity *rt_se, unsigned int f= lags) -{ - struct sched_rt_entity *back =3D NULL; - unsigned int rt_nr_running; - - for_each_sched_rt_entity(rt_se) { - rt_se->back =3D back; - back =3D rt_se; - } - - rt_nr_running =3D rt_rq_of_se(back)->rt_nr_running; - - for (rt_se =3D back; rt_se; rt_se =3D rt_se->back) { - if (on_rt_rq(rt_se)) - __dequeue_rt_entity(rt_se, flags); - } - - dequeue_top_rt_rq(rt_rq_of_se(back), rt_nr_running); -} - static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags) { update_stats_enqueue_rt(rt_rq_of_se(rt_se), rt_se, flags); @@ -1445,7 +776,7 @@ static void dequeue_rt_entity(struct sched_rt_entity *= rt_se, unsigned int flags) { update_stats_dequeue_rt(rt_rq_of_se(rt_se), rt_se, flags); =20 - __enqueue_rt_entity(rt_se, flags); + __dequeue_rt_entity(rt_se, flags); } =20 /* @@ -2453,6 +1784,7 @@ static void rq_online_rt(struct rq *rq) if (rq->rt.overloaded) rt_set_overload(rq); =20 + /*FIXME: Enable the dl server! */ cpupri_set(&rq->rd->cpupri, rq->cpu, rq->rt.highest_prio.curr); } =20 @@ -2462,6 +1794,7 @@ static void rq_offline_rt(struct rq *rq) if (rq->rt.overloaded) rt_clear_overload(rq); =20 + /* FIXME: Disable the dl server! */ cpupri_set(&rq->rd->cpupri, rq->cpu, CPUPRI_INVALID); } =20 @@ -2958,13 +2291,7 @@ long sched_group_rt_period(struct task_group *tg) #ifdef CONFIG_SYSCTL static int sched_rt_global_constraints(void) { - int ret =3D 0; - - mutex_lock(&rt_constraints_mutex); - ret =3D __rt_schedulable(NULL, 0, 0); - mutex_unlock(&rt_constraints_mutex); - - return ret; + return 0; } #endif /* CONFIG_SYSCTL */ =20 @@ -2999,10 +2326,6 @@ static int sched_rt_global_validate(void) return 0; } =20 -static void sched_rt_do_global(void) -{ -} - static int sched_rt_handler(const struct ctl_table *table, int write, void= *buffer, size_t *lenp, loff_t *ppos) { @@ -3029,9 +2352,6 @@ static int sched_rt_handler(const struct ctl_table *t= able, int write, void *buff ret =3D sched_rt_global_constraints(); if (ret) goto undo; - - sched_rt_do_global(); - sched_dl_do_global(); } if (0) { undo: diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index c7227a510..686578666 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -309,15 +309,6 @@ struct rt_prio_array { struct list_head queue[MAX_RT_PRIO]; }; =20 -struct rt_bandwidth { - /* nests inside the rq lock: */ - raw_spinlock_t rt_runtime_lock; - ktime_t rt_period; - u64 rt_runtime; - struct hrtimer rt_period_timer; - unsigned int rt_period_active; -}; - struct dl_bandwidth { raw_spinlock_t dl_runtime_lock; u64 dl_runtime; @@ -471,7 +462,6 @@ struct task_group { struct sched_dl_entity **dl_se; struct rt_rq **rt_rq; =20 - struct rt_bandwidth rt_bandwidth; struct dl_bandwidth dl_bandwidth; #endif =20 @@ -797,11 +787,6 @@ struct scx_rq { }; #endif /* CONFIG_SCHED_CLASS_EXT */ =20 -static inline int rt_bandwidth_enabled(void) -{ - return 0; -} - /* RT IPI pull logic requires IRQ_WORK */ #if defined(CONFIG_IRQ_WORK) && defined(CONFIG_SMP) # define HAVE_RT_PUSH_IPI @@ -825,17 +810,7 @@ struct rt_rq { struct plist_head pushable_tasks; =20 #endif /* CONFIG_SMP */ - int rt_queued; - -#ifdef CONFIG_RT_GROUP_SCHED - int rt_throttled; - u64 rt_time; /* consumed RT time, goes up in update_curr_rt */ - u64 rt_runtime; /* allotted RT time, "slice" from rt_bandwidth, RT shar= ing/balancing */ - /* Nests inside the rq lock: */ - raw_spinlock_t rt_runtime_lock; =20 - unsigned int rt_nr_boosted; -#endif #ifdef CONFIG_CGROUP_SCHED struct task_group *tg; 
/* this tg has "this" rt_rq on given CPU for runna= ble entities */ #endif @@ -845,7 +820,7 @@ struct rt_rq { =20 static inline bool rt_rq_is_runnable(struct rt_rq *rt_rq) { - return rt_rq->rt_queued && rt_rq->rt_nr_running; + return rt_rq->rt_nr_running; } =20 /* Deadline class' related fields in a runqueue */ @@ -2581,7 +2556,7 @@ static inline bool sched_dl_runnable(struct rq *rq) =20 static inline bool sched_rt_runnable(struct rq *rq) { - return rq->rt.rt_queued > 0; + return rq->rt.rt_nr_running > 0; } =20 static inline bool sched_fair_runnable(struct rq *rq) @@ -2714,9 +2689,6 @@ extern void resched_curr(struct rq *rq); extern void resched_curr_lazy(struct rq *rq); extern void resched_cpu(int cpu); =20 -extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 r= untime); -extern bool sched_rt_bandwidth_account(struct rt_rq *rt_rq); - void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime); extern void init_dl_entity(struct sched_dl_entity *dl_se); =20 --=20 2.49.0 From nobody Fri Dec 19 19:07:39 2025 Received: from mail-wr1-f50.google.com (mail-wr1-f50.google.com [209.85.221.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3EE3C207A18 for ; Thu, 5 Jun 2025 07:14:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749107673; cv=none; b=uUIEK7JR6hnFb8vf2Vjbi/Wj5g5ObI3B3bmdiuY0jjOLwkitdTVLPx8rsuVOn1RDbBLgVVFal0ffZxxM8PKqu2bt6YYEPCgxhForfd6GoGD2nhJ/V7UIL13RkSZGNKPcQKA1KROGiKpIqWm06UInVQaoBssPfWZNiYPIr2gX5a4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749107673; c=relaxed/simple; bh=jbp15bWLpp9sT0GFj9cr7VfzaeOBTsXq09F5x17dSkg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Jjr2FxlNOvosvtK05MLXBP52U+REVH4kIuAVXufeL0sa2TCTkAUE+VT0y+jquol4jhtlV3lD4Fq4EblDFXi3HnRx+gY7vAeTpNhCPqP3svRRhIlW4bs1e8o74cy4q6nhlsuptmTGFKZCfzMBjR1ZBL0zO/vajIXOnsXhCmOO78w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=LKagQt6i; arc=none smtp.client-ip=209.85.221.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="LKagQt6i" Received: by mail-wr1-f50.google.com with SMTP id ffacd0b85a97d-3a510432236so494678f8f.0 for ; Thu, 05 Jun 2025 00:14:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1749107670; x=1749712470; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=jVE56mb1hZ5CiRyJyad1+1WeIf3NNpyTOYbe0lP1OI4=; b=LKagQt6iYVedfWCxz3cFKUd16DM8fVw7kXtcu5U29ML94PVtNl5x7JB37eACJO+piA pJb6GhKHTu+nuqGvyK4fNdmcOE3igcS/e9efIxRgUvP3lkI0NKYU/WVb9+WriV7O54vL h7CcUFCWeauHQU4p314xYKtT7+0l8iv7XcivbMpTybgvGPbnUKT1NIVgtNU+yRtdwUFs Cfg+lPVXVlwTjP7mQNHwW2DFb9UqG60VBUKCqN70ZYpB6Rb7iNlAcfe+TfalI0mrR+mj OZS26QK5Wc7O1aYW05B0g7uzQuODEO9MEwXf49tYLBtuWWXN+pDMOUNN4XcmjWMZr/aB w5mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 7/9] sched/core: Cgroup v2 support
Date: Thu, 5 Jun 2025 09:14:10 +0200
Message-ID: <20250605071412.139240-8-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

From: luca abeni

Make the rt_runtime_us and rt_period_us virtual files accessible to the
cgroup v2 controller as well, effectively enabling the RT_GROUP_SCHED
mechanism on cgroups v2.
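For illustration only (not part of this patch): since cgroup v2 prefixes
controller interface files with the controller name, the new knobs are
expected to appear as cpu.rt_runtime_us and cpu.rt_period_us. A minimal
user-space sketch, assuming a v2 hierarchy mounted at /sys/fs/cgroup and an
already-created child cgroup named "rtgroup" (both names are illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Write a single value into a cgroup interface file. */
    static int cg_write(const char *path, const char *val)
    {
            int fd = open(path, O_WRONLY);

            if (fd < 0)
                    return -1;
            if (write(fd, val, strlen(val)) < 0) {
                    close(fd);
                    return -1;
            }
            return close(fd);
    }

    int main(void)
    {
            /* Reserve 10ms of RT runtime every 100ms for "rtgroup". */
            if (cg_write("/sys/fs/cgroup/rtgroup/cpu.rt_period_us", "100000"))
                    perror("cpu.rt_period_us");
            if (cg_write("/sys/fs/cgroup/rtgroup/cpu.rt_runtime_us", "10000"))
                    perror("cpu.rt_runtime_us");
            return 0;
    }

The exact file names and mount point depend on the system configuration.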
Signed-off-by: luca abeni
---
 kernel/sched/core.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e90b3608a..cad2963a2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10134,6 +10134,18 @@ static struct cftype cpu_files[] = {
 		.seq_show = cpu_uclamp_max_show,
 		.write = cpu_uclamp_max_write,
 	},
+#endif
+#ifdef CONFIG_RT_GROUP_SCHED
+	{
+		.name = "rt_runtime_us",
+		.read_s64 = cpu_rt_runtime_read,
+		.write_s64 = cpu_rt_runtime_write,
+	},
+	{
+		.name = "rt_period_us",
+		.read_u64 = cpu_rt_period_read_uint,
+		.write_u64 = cpu_rt_period_write_uint,
+	},
 #endif
 	{ }	/* terminate */
 };
-- 
2.49.0

From nobody Fri Dec 19 19:07:39 2025
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 8/9] sched/rt: Remove support for cgroups-v1
Date: Thu, 5 Jun 2025 09:14:11 +0200
Message-ID: <20250605071412.139240-9-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

Disable the control files for cgroups-v1 and allow only cgroups-v2. This
should simplify maintaining the code, also because cgroups-v1 is deprecated.

Set the default rt-cgroups runtime to zero, otherwise a cgroup-v1 kernel
will not be able to start SCHED_DEADLINE tasks.

Allow zeroing the runtime of the root control group. This runtime only
affects the available bandwidth of the rt-cgroup hierarchy, but not the
SCHED_FIFO / SCHED_RR tasks on the global runqueue.

Notes:
Disabling the root control group bandwidth should not cause any side
effect, as SCHED_FIFO / SCHED_RR tasks have not depended on it since the
introduction of fair_servers.
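For illustration only (not part of this patch): because the root group is
now exempted from the bandwidth check, a task that stays in the root cgroup
can still switch to SCHED_FIFO even though the root group's RT runtime
defaults to zero. A minimal sketch (the priority value is illustrative):

    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
            struct sched_param sp = { .sched_priority = 10 };

            /*
             * Succeeds for a task in the root cgroup (given CAP_SYS_NICE or
             * an adequate RLIMIT_RTPRIO), regardless of the root group's RT
             * runtime setting.
             */
            if (sched_setscheduler(0, SCHED_FIFO, &sp))
                    perror("sched_setscheduler");
            else
                    puts("now running as SCHED_FIFO");
            return 0;
    }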
Signed-off-by: Yuri Andriaccio
---
 kernel/sched/core.c     | 22 ++--------------------
 kernel/sched/rt.c       | 13 +++++--------
 kernel/sched/syscalls.c |  2 +-
 3 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index cad2963a2..9c8bc9728 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8566,7 +8566,7 @@ void __init sched_init(void)
 
 #ifdef CONFIG_RT_GROUP_SCHED
 	init_dl_bandwidth(&root_task_group.dl_bandwidth,
-			global_rt_period(), global_rt_runtime());
+			global_rt_period(), 0);
 #endif /* CONFIG_RT_GROUP_SCHED */
 
 #ifdef CONFIG_CGROUP_SCHED
@@ -9198,7 +9198,7 @@ static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 		goto scx_check;
 
 	cgroup_taskset_for_each(task, css, tset) {
-		if (!sched_rt_can_attach(css_tg(css), task))
+		if (rt_task(task) && !sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
 	}
 scx_check:
@@ -9873,20 +9873,6 @@ static struct cftype cpu_legacy_files[] = {
 };
 
 #ifdef CONFIG_RT_GROUP_SCHED
-static struct cftype rt_group_files[] = {
-	{
-		.name = "rt_runtime_us",
-		.read_s64 = cpu_rt_runtime_read,
-		.write_s64 = cpu_rt_runtime_write,
-	},
-	{
-		.name = "rt_period_us",
-		.read_u64 = cpu_rt_period_read_uint,
-		.write_u64 = cpu_rt_period_write_uint,
-	},
-	{ } /* Terminate */
-};
-
 # ifdef CONFIG_RT_GROUP_SCHED_DEFAULT_DISABLED
 DEFINE_STATIC_KEY_FALSE(rt_group_sched);
 # else
@@ -9912,10 +9898,6 @@ __setup("rt_group_sched=", setup_rt_group_sched);
 
 static int __init cpu_rt_group_init(void)
 {
-	if (!rt_group_sched_enabled())
-		return 0;
-
-	WARN_ON(cgroup_add_legacy_cftypes(&cpu_cgrp_subsys, rt_group_files));
 	return 0;
 }
 subsys_initcall(cpu_rt_group_init);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 9d17bda66..ce3320f12 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2197,13 +2197,6 @@ static int tg_set_rt_bandwidth(struct task_group *tg,
 {
 	int i, err = 0;
 
-	/*
-	 * Disallowing the root group RT runtime is BAD, it would disallow the
-	 * kernel creating (and or operating) RT threads.
-	 */
-	if (tg == &root_task_group && rt_runtime == 0)
-		return -EINVAL;
-
 	/* No period doesn't make any sense. */
 	if (rt_period == 0)
 		return -EINVAL;
@@ -2297,8 +2290,12 @@ static int sched_rt_global_constraints(void)
 
 int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk)
 {
+	/* Allow executing in the root cgroup regardless of allowed bandwidth */
+	if (tg == &root_task_group)
+		return 1;
+
 	/* Don't accept real-time tasks when there is no way for them to run */
-	if (rt_group_sched_enabled() && rt_task(tsk) && tg->dl_bandwidth.dl_runtime == 0)
+	if (rt_group_sched_enabled() && tg->dl_bandwidth.dl_runtime == 0)
 		return 0;
 
 	return 1;
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 6c6666b39..45a38fe5e 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -636,7 +636,7 @@ int __sched_setscheduler(struct task_struct *p,
 	 */
 	if (rt_group_sched_enabled() &&
 			dl_bandwidth_enabled() && rt_policy(policy) &&
-			task_group(p)->dl_bandwidth.dl_runtime == 0 &&
+			!sched_rt_can_attach(task_group(p), p) &&
 			!task_group_is_autogroup(task_group(p))) {
 		retval = -EPERM;
 		goto unlock;
-- 
2.49.0

From nobody Fri Dec 19 19:07:39 2025
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH 9/9] sched/deadline: Allow deeper hierarchies of RT cgroups
Date: Thu, 5 Jun 2025 09:14:12 +0200
Message-ID: <20250605071412.139240-10-yurand2000@gmail.com>
In-Reply-To: <20250605071412.139240-1-yurand2000@gmail.com>
References: <20250605071412.139240-1-yurand2000@gmail.com>

From: luca abeni

Allow the creation of cgroup hierarchies with depth greater than two.

Add a check to prevent attaching tasks to a child cgroup of an active
cgroup (i.e. one with a running FIFO/RR task).

Add a check to prevent attaching tasks to cgroups which have children with
non-zero runtime.

Update the rt-cgroups allocated-bandwidth accounting for nested cgroup
hierarchies.
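For illustration only (not from this patch): the per-group bandwidth that
dl_init_tg() charges against each runqueue is the fixed-point ratio
runtime/period, as computed by the kernel's to_ratio() helper (BW_SHIFT is
20 in kernel/sched/sched.h). The user-space sketch below mimics that
arithmetic with illustrative numbers; a child reserving 10ms every 100ms
accounts 0.1 of a CPU against its parent's reservation:

    #include <stdint.h>
    #include <stdio.h>

    #define BW_SHIFT 20
    #define BW_UNIT  (1ULL << BW_SHIFT)

    /* Fixed-point runtime/period ratio, mirroring the kernel's to_ratio(). */
    static uint64_t to_ratio(uint64_t period_ns, uint64_t runtime_ns)
    {
            if (period_ns == 0)
                    return 0;
            return (runtime_ns << BW_SHIFT) / period_ns;
    }

    int main(void)
    {
            uint64_t parent = to_ratio(100000000ULL, 50000000ULL); /* 50ms/100ms */
            uint64_t child  = to_ratio(100000000ULL, 10000000ULL); /* 10ms/100ms */

            printf("parent bw: %.3f CPU\n", (double)parent / BW_UNIT);
            printf("child  bw: %.3f CPU\n", (double)child / BW_UNIT);
            printf("child fits within parent: %s\n",
                   child <= parent ? "yes" : "no");
            return 0;
    }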
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/core.c     |  6 ----
 kernel/sched/deadline.c | 69 ++++++++++++++++++++++++++++++++++-------
 kernel/sched/rt.c       | 25 +++++++++++++--
 kernel/sched/sched.h    |  2 +-
 kernel/sched/syscalls.c |  4 +++
 5 files changed, 84 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9c8bc9728..c02cdeccf 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9127,12 +9127,6 @@ cpu_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 		return &root_task_group.css;
 	}
 
-	/* Do not allow cpu_cgroup hierachies with depth greater than 2. */
-#ifdef CONFIG_RT_GROUP_SCHED
-	if (parent != &root_task_group)
-		return ERR_PTR(-EINVAL);
-#endif
-
 	tg = sched_create_group(parent);
 	if (IS_ERR(tg))
 		return ERR_PTR(-ENOMEM);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b07abbb60..b405b0724 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -414,10 +414,39 @@ int dl_check_tg(unsigned long total)
 	return 1;
 }
 
-int dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period)
+static inline bool is_active_sched_group(struct task_group *tg)
 {
+	struct task_group *child;
+	bool is_active = 1;
+
+	// if there are no children, this is a leaf group, thus it is active
+	list_for_each_entry_rcu(child, &tg->children, siblings) {
+		if (child->dl_bandwidth.dl_runtime > 0) {
+			is_active = 0;
+		}
+	}
+	return is_active;
+}
+
+static inline bool sched_group_has_active_siblings(struct task_group *tg)
+{
+	struct task_group *child;
+	bool has_active_siblings = 0;
+
+	// if there are no children, this is a leaf group, thus it is active
+	list_for_each_entry_rcu(child, &tg->parent->children, siblings) {
+		if (child != tg && child->dl_bandwidth.dl_runtime > 0) {
+			has_active_siblings = 1;
+		}
+	}
+	return has_active_siblings;
+}
+
+int dl_init_tg(struct task_group *tg, int cpu, u64 rt_runtime, u64 rt_period)
+{
+	struct sched_dl_entity *dl_se = tg->dl_se[cpu];
 	struct rq *rq = container_of(dl_se->dl_rq, struct rq, dl);
-	int is_active;
+	int is_active, is_active_group;
 	u64 old_runtime;
 
 	/*
@@ -434,24 +463,40 @@ int dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period)
 	if (rt_period & (1ULL << 63))
 		return 0;
 
+	is_active_group = is_active_sched_group(tg);
+
 	raw_spin_rq_lock_irq(rq);
 	is_active = dl_se->my_q->rt.rt_nr_running > 0;
 	old_runtime = dl_se->dl_runtime;
 	dl_se->dl_runtime = rt_runtime;
 	dl_se->dl_period = rt_period;
 	dl_se->dl_deadline = dl_se->dl_period;
-	if (is_active) {
-		sub_running_bw(dl_se, dl_se->dl_rq);
-	} else if (dl_se->dl_non_contending) {
-		sub_running_bw(dl_se, dl_se->dl_rq);
-		dl_se->dl_non_contending = 0;
-		hrtimer_try_to_cancel(&dl_se->inactive_timer);
+	if (is_active_group) {
+		if (is_active) {
+			sub_running_bw(dl_se, dl_se->dl_rq);
+		} else if (dl_se->dl_non_contending) {
+			sub_running_bw(dl_se, dl_se->dl_rq);
+			dl_se->dl_non_contending = 0;
+			hrtimer_try_to_cancel(&dl_se->inactive_timer);
+		}
+		__sub_rq_bw(dl_se->dl_bw, dl_se->dl_rq);
+		dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
+		__add_rq_bw(dl_se->dl_bw, dl_se->dl_rq);
+	} else {
+		dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
+	}
+
+	// add/remove the parent's bw
+	if (tg->parent && tg->parent != &root_task_group)
+	{
+		if (rt_runtime == 0 && old_runtime != 0 && !sched_group_has_active_siblings(tg)) {
+			__add_rq_bw(tg->parent->dl_se[cpu]->dl_bw, dl_se->dl_rq);
+		} else if (rt_runtime != 0 && old_runtime == 0 && !sched_group_has_active_siblings(tg)) {
+			__sub_rq_bw(tg->parent->dl_se[cpu]->dl_bw, dl_se->dl_rq);
+		}
 	}
-	__sub_rq_bw(dl_se->dl_bw, dl_se->dl_rq);
-	dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-	__add_rq_bw(dl_se->dl_bw, dl_se->dl_rq);
 
-	if (is_active)
+	if (is_active_group && is_active)
 		add_running_bw(dl_se, dl_se->dl_rq);
 
 	raw_spin_rq_unlock_irq(rq);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index ce3320f12..225684450 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -106,7 +106,8 @@ void free_rt_sched_group(struct task_group *tg)
 	 * Fix this issue by changing the group runtime
 	 * to 0 immediately before freeing it.
 	 */
-	BUG_ON(!dl_init_tg(tg->dl_se[i], 0, tg->dl_se[i]->dl_period));
+	if (tg->dl_se[i]->dl_runtime)
+		BUG_ON(!dl_init_tg(tg, i, 0, tg->dl_se[i]->dl_period));
 	raw_spin_rq_lock_irqsave(cpu_rq(i), flags);
 	BUG_ON(tg->rt_rq[i]->rt_nr_running);
 	raw_spin_rq_unlock_irqrestore(cpu_rq(i), flags);
@@ -2197,6 +2198,14 @@ static int tg_set_rt_bandwidth(struct task_group *tg,
 {
 	int i, err = 0;
 
+	/*
+	 * Do not allow to set a RT runtime > 0 if the parent has RT tasks
+	 * (and is not the root group)
+	 */
+	if (rt_runtime && (tg != &root_task_group) && (tg->parent != &root_task_group) && tg_has_rt_tasks(tg->parent)) {
+		return -EINVAL;
+	}
+
 	/* No period doesn't make any sense. */
 	if (rt_period == 0)
 		return -EINVAL;
@@ -2220,7 +2229,7 @@ static int tg_set_rt_bandwidth(struct task_group *tg,
 		goto unlock_bandwidth;
 
 	for_each_possible_cpu(i) {
-		if (!dl_init_tg(tg->dl_se[i], rt_runtime, rt_period)) {
+		if (!dl_init_tg(tg, i, rt_runtime, rt_period)) {
 			err = -EINVAL;
 			break;
 		}
@@ -2290,6 +2299,9 @@ static int sched_rt_global_constraints(void)
 
 int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk)
 {
+	struct task_group *child;
+	int can_attach = 1;
+
 	/* Allow executing in the root cgroup regardless of allowed bandwidth */
 	if (tg == &root_task_group)
 		return 1;
@@ -2298,7 +2310,14 @@ int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk)
 	if (rt_group_sched_enabled() && tg->dl_bandwidth.dl_runtime == 0)
 		return 0;
 
-	return 1;
+	/* If one of the children has runtime > 0, cannot attach RT tasks! */
+	list_for_each_entry_rcu(child, &tg->children, siblings) {
+		if (child->dl_bandwidth.dl_runtime) {
+			can_attach = 0;
+		}
+	}
+
+	return can_attach;
 }
 
 #else /* !CONFIG_RT_GROUP_SCHED */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 686578666..fde133f9c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -384,7 +384,7 @@ extern void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq,
 		    dl_server_has_tasks_f has_tasks,
 		    dl_server_pick_f pick_task);
 int dl_check_tg(unsigned long total);
-int dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period);
+int dl_init_tg(struct task_group *tg, int cpu, u64 rt_runtime, u64 rt_period);
 
 extern void dl_server_update_idle_time(struct rq *rq,
 				       struct task_struct *p);
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 45a38fe5e..7e5e6de92 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -630,6 +630,10 @@ int __sched_setscheduler(struct task_struct *p,
 
 	if (user) {
 #ifdef CONFIG_RT_GROUP_SCHED
+		if (dl_bandwidth_enabled() && rt_policy(policy) && !sched_rt_can_attach(task_group(p), p)) {
+			retval = -EPERM;
+			goto unlock;
+		}
 		/*
 		 * Do not allow real-time tasks into groups that have no runtime
 		 * assigned.
-- 
2.49.0