From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wm1-f50.google.com (mail-wm1-f50.google.com [209.85.128.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F6BB377016 for ; Mon, 8 Jun 2026 12:15:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920951; cv=none; b=GD+T3aiqw2XEjbcwZzWL70gE5eDuebSkPll1357tcR3EIuDjGOYYG1Jw7wWxrnPdqn3Idy0Uaa2+RFIq+ReFlh4Zseyd5C5+dv9WY5xBh18P+t6ajgCKDTuAUfdlhr7gkbIXla5DPSmdEVLCGmiRd+FHqENuv6ZqCBLbZqo3CZo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920951; c=relaxed/simple; bh=cExBacDTus5EuPLcE+dMq0EfEQ2Qzf4UcPMcwgVhM7g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TNznSCbeOwvhdaoXvgHwMhL9lRrkLajhHuzV7FhXHvOHzYJQqBGKlQsXZsGGyzwwSPOCTN1wd6nb9E8EEFvB0OKYwav6Q+rqAomJ6lkGYP3dUiOWSgfY2WWXzvKLLWuFWaGcsjGQXfltd7gD8L4/Ij5KCDWZg//iMN193NDXHbM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=WkCpvkXV; arc=none smtp.client-ip=209.85.128.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WkCpvkXV" Received: by mail-wm1-f50.google.com with SMTP id 5b1f17b1804b1-490cf3000f0so11508385e9.1 for ; Mon, 08 Jun 2026 05:15:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920949; x=1781525749; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I4r5mAas9Db2M4QBCfy/IlZpu64a+mCWZ5iexSneZTA=; b=WkCpvkXVdszcsngvyc1b8TyXaWpj/5bmtl4/PpjC3HVLCvCT5EI15AUUxw25z3PWu5 0VSWsu5bgR6lnX7Q69rIVX3dXZBxT2sBtzJ1S7LHeabbl0eoxsZmXS0lsdhwRCFvsqJ7 4OQ6xnPVfL2Vf1eCPVBvm59LfYT2onnqaaYolE7QWs/YE8aw61Mf4sSmwJldaiiJZYJf FfKeoy6vAz86k+Rn2u5hOKQVP/amV7AAQQ9u7tQ+NFGRcNIBuj1wKon6ho5M6+v/2dch JoMigPs+UjdzVJslzcED+LXrsaz+ck8MN8iO6QWbHV6cc2TP0OY+vMPhBORewoQfmuhY ZRtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920949; x=1781525749; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=I4r5mAas9Db2M4QBCfy/IlZpu64a+mCWZ5iexSneZTA=; b=I5ZggDT8HGD1rYAIrvAeHP1CsRgrpwYWou0jEKHaln0or7eKiiTz+0tNNOeV+F3VXe Vit4nLXRblMD1ccvUPK+hM19VS8Dtui+bA8IZmm4kCNPzO6XdRIrtSuWZgOoudbgYJ3a Ya7T2+GNEB5qMel8Hv7o/fM6sV4/j7LD2tvgvAcM76iy3GK9GyA8sfAQ9cvn+ZeVCxzF CALC7exPRGTDLhPmWoTttKR/tCVlDezpat+Z/xQRnHQiuYB2E2qJvHt/wBPd0QY3hNDt GUifr/K53NQH+sHYIGgxkd3IXznbO7LzNjahFFRZekKMLyYVOgjoCedejO9tXbpQHrZ2 cNzw== X-Forwarded-Encrypted: i=1; AFNElJ+ATbdKK0fvY+JPTGB1L7d6NkFAQTME8UkGg8GU0s1Z3aS+cL+HFnCHyk5jAJ/ojNkRToEaD9Osm44q9XI=@vger.kernel.org X-Gm-Message-State: AOJu0YzAxLcutrmQdqBKJusT2S+KVphrIByQC/G0wZHQ/8JKxcI7KuYs ydBiFIGVRqFq2PVzXR8d/PJ0OyK1dncJ7+kiym4M6VPskDTjiM0DCHV9 X-Gm-Gg: Acq92OEeSf/uPhb4PQ2mKUsdGSYW65ABXsIM1zSyMYmW0HnEL++RksUWrl4WMDUyvOR CuwKN8IZldE8zMexZtzlyag6dnH0UfOhNy806950BBpA57C36UcThOgGdE/b1sGt29px5uYjcO3 kzk+v16ZEglhL8RxpTS3gtOg+eP2o+nc0L1JBTfgn1aMezWslNCej1BI2ETF5AVNZagoMM06ExF WRrA9Aw5RXZIIY+bqGbbGP7XXrs1vOc4uNBgnIB57ZhZFZ/yBxKOQWrs9/Y+whvHr9H8D+QtcsI AIHSNOT3uh13qb3xXZ9v/VUtZ7IEGUH+dNkja1SKmNV9xju18mEFX9dMWGXmsrtc8GMygPAjoMw huOJjsj8t7f0UN0hcthyaPFTGGEKYDt3QD7tmt9r7Q0dDVLQIx2W26bNmmEp4a+XZIn4euPdPs4 YZdNsh35odkxZaCuHyB88lJuh/YOTCCB4= 
X-Received: by 2002:a05:600c:c16e:b0:490:b8c0:d470 with SMTP id 5b1f17b1804b1-490c2622d03mr266687855e9.19.1780920948601; Mon, 08 Jun 2026 05:15:48 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.15.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:15:48 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 01/25] sched/deadline: Fix replenishment logic for non-deferred servers Date: Mon, 8 Jun 2026 14:15:20 +0200 Message-ID: <20260608121546.69910-2-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Enqueue and replenish non-deferred deadline servers when their runtime is exhausted and the replenishment timer could not be started because it is too close to the wake-up instant. 
--- Already merged in sched/tip: https://git.kernel.org/tip/eecd5e117cfa63a353f4c69fdcea5d9b14af698e Signed-off-by: Yuri Andriaccio --- kernel/sched/deadline.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 7db4c87df83b..ddfd6bc63ab1 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1515,8 +1515,12 @@ static void update_curr_dl_se(struct rq *rq, struct = sched_dl_entity *dl_se, s64 if (unlikely(is_dl_boosted(dl_se) || !start_dl_timer(dl_se))) { if (dl_server(dl_se)) { - replenish_dl_new_period(dl_se, rq); - start_dl_timer(dl_se); + if (dl_se->dl_defer) { + replenish_dl_new_period(dl_se, rq); + start_dl_timer(dl_se); + } else { + enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH); + } } else { enqueue_task_dl(rq, dl_task_of(dl_se), ENQUEUE_REPLENISH); } -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1FECB3783C0 for ; Mon, 8 Jun 2026 12:15:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920952; cv=none; b=tcljj8I64LPspn8L+KdcZH+QIL6AH7MiFABDhz7vm6ySUlDgo6JELVkbNSbIrRnV4vm+RbUfysFqlQFJfad2vLuJC5AcwwWq6ubtrwAwEE9LRd9FaO75eYuCnb2T22OO7bly22DTEzhaALie70XuQdzOqZ0AfP8MwYHnPfb89Ik= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920952; c=relaxed/simple; bh=UjEBDxtPRqozcWdlJuwbLUvPglANfSeYQER9BO4aiPM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=hhGqIrsfb8pAwAX09qAJgHc0nCm6tdIB9lgyL/M4tmRXcDfPyOOwsgbZXwXjmZLZYiJaw88nYiuREYIZoAj7TZcEZnfDe5zsV52pp7kO++evimGiDRUvWSdX0I6RRn4SdUfDL+DRAIHP6uUQagso2RPfzQHnv+ov3v0uO/PJHVg= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=sncBl5Bt; arc=none smtp.client-ip=209.85.221.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="sncBl5Bt" Received: by mail-wr1-f52.google.com with SMTP id ffacd0b85a97d-45ef29c5561so2178704f8f.0 for ; Mon, 08 Jun 2026 05:15:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920949; x=1781525749; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Aa6pcb3XbDlbPOw5oJXZLx2AdQ+6MRmY7aNw+gfgVdc=; b=sncBl5BtdK1nsqEbteCy70OKWgtnEYj6CHYB2xzZuJgx7PRLexqr26/CB3Ss32RT3Q VJN41aBe3VRVTLeTBTIWXg3sB5/fZ7ajSjkx9i5lkKMlVnC8KCevBcb8nfXvcxilEDrb +a/ItWrzFObTQbOZaOY5K5DzSoPavTJtVUsxtz/V6Yqo14FyB2xWhSnsHTh2yYAQ6g6P dkPdOcR1fwGkDj7xR4tYLLlFBiPR9EfZJCwkW3anHF5vv0b29I5c2+5D6waj7KCS/YCg hsIIXItaZuvX8P6TzhtiURsE18ceJORGncK8mOiLfQTXMdfI0JoYcOooS7v6dUiT9BQL /kdg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920949; x=1781525749; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Aa6pcb3XbDlbPOw5oJXZLx2AdQ+6MRmY7aNw+gfgVdc=; b=TqmqxH0y48tMFAyPzHVU7g1uBUKKGe9kocSLU5o4jsONKGOJLv2Tox0UbjNneeMNxo tdzZmSFHqgz6WgLyfSh2REXLOtlYg9U+0nFBFle6HuUzeFvsblI1pRQeNwHq3odmr/tD YkYeHE0IcWJgDRVv8ZZWL5ZjEU+tNkfx0Zm+AyF4jEm6HlHzGXW698i4wYtp018GDKuS vFhlJoigrNrYX1lJCxB+sSnL2CvE36N4RSeWMZfCnPeE+tUWdJMiu26Zv45+vmcQRwqf 
lW/9lsiF7xiRVN+xG+ByqFaIFIj7kRDoqP2JXtYToTLxeaA7ptjbwZPjdbi66lFEzy45 bTDg== X-Forwarded-Encrypted: i=1; AFNElJ+SAEPHAKhBLoQah+TlZKbQQoektYr3VQClPkRE1PEcgjYH1ciuNU9f+lUuVjq34TEPqMBPYjtGgJilHqM=@vger.kernel.org X-Gm-Message-State: AOJu0YyaJMsEsIC0cabQhgY/UQj13U/ZFOyBCNodmsibPPMkNn23lE5c iFmExDBH+RhXIdsvXqk/LPk9MKPPolCSxT0z7MzI3M5EmpOOf8d3QWI7 X-Gm-Gg: Acq92OHk/THlpgqNM1/flLKo2Nz2hwEjcw4gH//SFqMpVpFOVvFPB8e1kZXsqZN/Nv5 +So4CXHxY3QcEztxqIKwJjp4tZzz1SNz+jHuHAtIvkUHrTGYtgNtVUacTEtqSoUZfk1915XdXA6 pru+66bZesSrJpcWrKtWukg2ZCtKDKMmF45wGZe4LRYkE03FDSh0EYmvxssmsbNdpSljPkWY36M TSEj9np7nlxcJVE+cRVKPHVa6r0q1LzyBRcF6Nf2u+8egiWVm5+SfVRmqI9dZeXLh0ZV61gezP/ UJ1kHRMu3uRMc8XQ8qC4eTyNtQoM5EobAG2QQLg9m7jgOO9SXolQkFGQWL93nCiHqAr+FF5tDAK uDhjEQaQkQdYjzFYHvLE3Ql8XpbvTZTYBSqdanNqTA7ubScxvrN4xuTRJ82ruSZu/DIhTJzHVuh iKUHpZpUcR3rOAj0uBBKRAp7tyLmfVJoc= X-Received: by 2002:adf:d02f:0:b0:460:f36:79b0 with SMTP id ffacd0b85a97d-460304fda0amr17776245f8f.19.1780920949486; Mon, 08 Jun 2026 05:15:49 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.15.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:15:49 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 02/25] sched/rt: Update default bandwidth for real-time tasks to ONE Date: Mon, 8 Jun 2026 14:15:21 +0200 Message-ID: <20260608121546.69910-3-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: 
MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Set the default total bandwidth for SCHED_DEADLINE tasks and servers to ONE. FIFO/RR tasks are already throttled by fair-servers and ext-servers, and the sysctl_sched_rt_runtime parameter now only defines the total bw that is allowed to deadline entities. --- Already merged in sched/tip: https://git.kernel.org/tip/c2e390197ad1360db6686a8c89abaafaf83adf72 Signed-off-by: Yuri Andriaccio --- kernel/sched/rt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 4ee8faf01441..e6ea728f519e 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -19,9 +19,9 @@ int sysctl_sched_rt_period =3D 1000000; /* * part of the period that we allow rt tasks to run in us. - * default: 0.95s + * default: 1s */ -int sysctl_sched_rt_runtime =3D 950000; +int sysctl_sched_rt_runtime =3D 1000000; #ifdef CONFIG_SYSCTL static int sysctl_sched_rr_timeslice =3D (MSEC_PER_SEC * RR_TIMESLICE) / H= Z; -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E42DB372B26 for ; Mon, 8 Jun 2026 12:15:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920953; cv=none; b=K8jjiOao5s2Jz6dbtafus10my3m0s6Z2EcV2OwPEXcLrZWwr0RhY9EJMLYnI3BY6ShKodmqViHfaYTKPxxUxZAWwUQAtqqxbqbjjVNsMLwJFwCWKJkBJQaMT2smBAMecvXCWUwkNign6/zHPEqlmRI7o/dcPfQVH9iUPPhJgVS4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920953; c=relaxed/simple; bh=rgXOYVc0DI3aXh/5+n+s9LzuQgPbNm+N8WAVL2wfAT8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=qsWw20GMDhLXOpnusOGSI1wF1rvaKj9g6clbc1VOoORBbIleb/R7pGvJVVZ9xpGi7S2Wb81ATVKyZUNm38dZwnUtDyzqrpnmuVuje9YjYUzygvTz5C8XQfEI5VKASp6dJEZc5geM5zK+/SvQacCZs17rPaKkRu+BxrM8DaN93tM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=MBmR/Zkb; arc=none smtp.client-ip=209.85.221.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MBmR/Zkb" Received: by mail-wr1-f52.google.com with SMTP id ffacd0b85a97d-460166910e6so2137150f8f.2 for ; Mon, 08 Jun 2026 05:15:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920950; x=1781525750; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bLJJ4ZrsmzvR8fpp1788yY7mK5hc0zSHwTxr8F7vRWU=; b=MBmR/ZkbyuiXYKlyyr5sP/v2WnJYNMKO+zHrd6lWQ3GHEbWSzFQuKM+AcHYObymleB JX+3j7JAJ5Dp9F1sebFJC2QClOfdT0WIOGi5f9gm/3UsFdtTX2eVkEtM9+6cbdR7kGYU IRhiYApV/IGhoZ77eyL13hocRbS9/v4CGFkHMY4X9RnA5bILAZa6idAgXQ5EroKak0ez X5kusjt75XHS8SAHZSiyN+IgqoDmI8DUoHdjgy1PL1+sNFeqK02YKQWMIo7g1zd7Qdbg eMzjfb647u9NHllKpff28w3NA5wGU6Uci9Xgq9qiRsESNGEpDBDztvAjCAlbo0iKnlAe cYIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920950; x=1781525750; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=bLJJ4ZrsmzvR8fpp1788yY7mK5hc0zSHwTxr8F7vRWU=; b=XpPTxPuCuD9x5I67tf0Biq8UgLUMdKga1EDRrmKQab1LUnAleVyjyA2PNaVihB4FZU 
g8dSed83gcLajOVSRt9IbyZ+mhaLM3FMayCZuIMoHEDaEGF+A24PJfQmmXDLSiHp/SXy sV9TLJPMGlavYUeW2NiIfEIXbr/uBzpmFa4yZXuEC/BrCOimXOutsDi29mWPKcsAfMsJ GZcU5dacHMFVRoDKLioLbh5Nhwqglyadkjp/Qo/Kl3yRiEPLBpqfPh4UBWaGa++MB0MV FImryH600gODRTWO7tovo83h5+Bq/W0yyBd4oBq6ccvoLIR3PYcP6hGDtnb7ar13sECU 3vQQ== X-Forwarded-Encrypted: i=1; AFNElJ9eoqi4OTS/wuMejupuT9kZsYaT4BqXYR6pa5ZUwPIbGJmXbOUks4ktXdrE6HOPO0H8/C+4uFwN1CMOnUw=@vger.kernel.org X-Gm-Message-State: AOJu0Yx38l14/WEWoE6sQRQs4/5yr4Q0hQaZ6j8w5FOfThP7jLxMAwpG GPjzIA8mUhj6EzZ9te5SiTiWTc4dIsrKp8bXtBhdoqivIRQMHRvM6lPC X-Gm-Gg: Acq92OEPU7sna3Gfwbi5Ebnxnv7nq2aub7GCE8OtE05c16Z1uqfIFgo2taLQ9H14CH1 oQXaC06hleSHLTqQzof7A96w9xxjafHs7xJqpCwZGhQo9u9zpQNGjfkTe5XqdeTJbB6F4otCB9R PgUHADY2wxU6tfVUrqumx9Yax3WSx1hJQEoAJS6szsTKecL+tqnUYYHQAezRUJHpsE21adg2W3c 3RARnXMAPQ/JXH7WtOFcqptFwQK2OYEbjyLQTGQ/lDgUT0q5s3foJ37ySaNWb4l9sXG6bXIXZG9 YPW4Mn8TNYxOy7ltlYcY94Sn2IqqFsypHBwDhWC5AlgJ+3tNZ3hp6guPf41p5kJghijAejte2Sc HwB8VyfIvV8DeeWPvT6H4uk5CsfXU0IH6tWRu6aI8D/k8Bj1FUaVVIPZCP74e979FgfDyLjLgtc 3Qp+5zhwOSb688FXuNEIIxU6L2gm/ieBk= X-Received: by 2002:adf:e848:0:b0:460:cfc:eb24 with SMTP id ffacd0b85a97d-460304fec2bmr19705651f8f.22.1780920950442; Mon, 08 Jun 2026 05:15:50 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.15.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:15:50 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 03/25] sched/deadline: Do not access dl_se->rq directly Date: Mon, 8 Jun 2026 14:15:22 +0200 Message-ID: <20260608121546.69910-4-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: 
<20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: luca abeni Make deadline.c code access the runqueue of a scheduling entity saved in the sched_dl_entity data structure. This allows future patches to save different runqueues in sched_dl_entity other than the global runqueues. Move dl_server_apply_params call in sched_init_dl_servers as the rq_of_dl_se function will return the correct deadline entity only if the dl_server flag is set. Add a WARN_ON on the return value of dl_server_apply_params in sched_init_dl_servers as this function may fail if the kernel is not configured correctly. Signed-off-by: luca abeni Signed-off-by: Yuri Andriaccio --- kernel/sched/deadline.c | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index ddfd6bc63ab1..63e88ecdd5ed 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -869,7 +869,7 @@ static void replenish_dl_entity(struct sched_dl_entity = *dl_se) * and arm the defer timer. */ if (dl_se->dl_defer && !dl_se->dl_defer_running && - dl_time_before(rq_clock(dl_se->rq), dl_se->deadline - dl_se->runtime)= ) { + dl_time_before(rq_clock(rq), dl_se->deadline - dl_se->runtime)) { if (!is_dl_boosted(dl_se)) { =20 /* @@ -1170,11 +1170,11 @@ static enum hrtimer_restart dl_server_timer(struct = hrtimer *timer, struct sched_ * of time. The dl_server_min_res serves as a limit to avoid * forwarding the timer for a too small amount of time. 
*/ - if (dl_time_before(rq_clock(dl_se->rq), + if (dl_time_before(rq_clock(rq), (dl_se->deadline - dl_se->runtime - dl_server_min_res))) { /* reset the defer timer */ - fw = dl_se->deadline - rq_clock(dl_se->rq) - dl_se->runtime; + fw = dl_se->deadline - rq_clock(rq) - dl_se->runtime; hrtimer_forward_now(timer, ns_to_ktime(fw)); return HRTIMER_RESTART; @@ -1185,7 +1185,7 @@ static enum hrtimer_restart dl_server_timer(struct hrtimer *timer, struct sched_ enqueue_dl_entity(dl_se, ENQUEUE_REPLENISH); - if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &dl_se->rq->curr->dl)) + if (!dl_task(rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) resched_curr(rq); __push_dl_task(rq, rf); @@ -1481,7 +1481,7 @@ static void update_curr_dl_se(struct rq *rq, struct sched_dl_entity *dl_se, s64 hrtimer_try_to_cancel(&dl_se->dl_timer); - replenish_dl_new_period(dl_se, dl_se->rq); + replenish_dl_new_period(dl_se, rq); if (idle) dl_se->dl_defer_idle = 1; @@ -1578,14 +1578,14 @@ static void update_curr_dl_se(struct rq *rq, struct sched_dl_entity *dl_se, s64 void dl_server_update_idle(struct sched_dl_entity *dl_se, s64 delta_exec) { if (dl_se->dl_server_active && dl_se->dl_runtime && dl_se->dl_defer) - update_curr_dl_se(dl_se->rq, dl_se, delta_exec); + update_curr_dl_se(rq_of_dl_se(dl_se), dl_se, delta_exec); } void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec) { /* 0 runtime = fair server disabled */ if (dl_se->dl_server_active && dl_se->dl_runtime) - update_curr_dl_se(dl_se->rq, dl_se, delta_exec); + update_curr_dl_se(rq_of_dl_se(dl_se), dl_se, delta_exec); } /* @@ -1794,7 +1794,7 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec) */ void dl_server_start(struct sched_dl_entity *dl_se) { - struct rq *rq = dl_se->rq; + struct rq *rq; dl_se->dl_defer_idle = 0; if (!dl_server(dl_se) || dl_se->dl_server_active || !dl_se->dl_runtime) @@ -1803,16 +1803,16 @@ void
dl_server_start(struct sched_dl_entity *dl_se) /* * Update the current task to 'now'. */ + rq = rq_of_dl_se(dl_se); rq->donor->sched_class->update_curr(rq); - if (WARN_ON_ONCE(!cpu_online(cpu_of(rq)))) return; trace_sched_dl_server_start_tp(dl_se, cpu_of(rq), dl_get_type(dl_se, rq)); dl_se->dl_server_active = 1; enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP); - if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) - resched_curr(dl_se->rq); + if (!dl_task(rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl)) + resched_curr(rq); } void dl_server_stop(struct sched_dl_entity *dl_se) @@ -1856,9 +1856,9 @@ void sched_init_dl_servers(void) WARN_ON(dl_server(dl_se)); - dl_server_apply_params(dl_se, runtime, period, 1); - dl_se->dl_server = 1; + WARN_ON(dl_server_apply_params(dl_se, runtime, period, 1)); + dl_se->dl_defer = 1; setup_new_dl_entity(dl_se); @@ -1867,9 +1867,9 @@ void sched_init_dl_servers(void) WARN_ON(dl_server(dl_se)); - dl_server_apply_params(dl_se, runtime, period, 1); - dl_se->dl_server = 1; + WARN_ON(dl_server_apply_params(dl_se, runtime, period, 1)); + dl_se->dl_defer = 1; setup_new_dl_entity(dl_se); #endif @@ -1895,7 +1895,7 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio { u64 old_bw = init ?
0 : to_ratio(dl_se->dl_period, dl_se->dl_runtime); u64 new_bw =3D to_ratio(period, runtime); - struct rq *rq =3D dl_se->rq; + struct rq *rq =3D rq_of_dl_se(dl_se); int cpu =3D cpu_of(rq); struct dl_bw *dl_b; unsigned long cap; @@ -1971,7 +1971,7 @@ static enum hrtimer_restart inactive_task_timer(struc= t hrtimer *timer) p =3D dl_task_of(dl_se); rq =3D task_rq_lock(p, &rf); } else { - rq =3D dl_se->rq; + rq =3D rq_of_dl_se(dl_se); rq_lock(rq, &rf); } =20 --=20 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wm1-f42.google.com (mail-wm1-f42.google.com [209.85.128.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BCFF7379EF7 for ; Mon, 8 Jun 2026 12:15:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920956; cv=none; b=ArNp4noDMi42cACj+L3PfsgL8c+PnPa1XCo3pudWQzIf1hEjAe8+4sGuBXdzww36YI27MpkrhquASkUOcSkM00cclo2Whskv4A4/x/dljDuah9bzz973qFMH2Chqx0p9zmpVa1fqWzyxlObzzNg6xprk5riOxUf3Sbtquz49Qdg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920956; c=relaxed/simple; bh=Fk6uyIEWShVizL/+ulsie+vw2RG/38LGEd2khZSyBdo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VKhyG7AyzLYdLqd5r5m63HTroIX+QF/UXLle6rundCYp/ucvQO1awAhs/DvMFGJEzNo0tTQuDqnODusJHpBWdRxvwjfhQ1fVNkVINobmKrDb0HoZSqSlKEtDqqllIiGHagqvPzc4LFTnH1mPUNQYsIzSd1kb1WJnHpd8KZwxwmY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JBl8UfJf; arc=none smtp.client-ip=209.85.128.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JBl8UfJf" Received: by mail-wm1-f42.google.com with SMTP id 5b1f17b1804b1-490b8a97b11so46689945e9.0 for ; Mon, 08 Jun 2026 05:15:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920951; x=1781525751; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I/+5kyst+IWx470x1SiW1/Gwa7Y3Y1pD8ZiaQxMDkvk=; b=JBl8UfJf56erOYn+VsJ/JLcGYdi7hfxUFMoK6OsnlwZwiUC1Q/d6Xo4LckHHn7wJR+ 8hdZGU5i1UEz7OjCcrNMmRmbc16GEuvOgs94RGMYwqwg6BeqrFgT8yh4ifq0TUbLeJ/i swSteMK0qYNl/Z6ifkUKmG7SpWw4/NGKoAqWrEGwvN7d1oO7ezB04/Wo/quUHqAC0JZu j4KRenEN5PvgpdZBkAit4UO6qBvZZEtyLQ0cNMsGhmksJDa9XiQv8sHb5W7+G1SE1c3c 59QTmi8rPhOHjHNXEYkzgMS9/wSiVG7iUSE6G0oGrnKGCn/GCTrbHARrf4TNEayL8qeE jldw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920951; x=1781525751; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=I/+5kyst+IWx470x1SiW1/Gwa7Y3Y1pD8ZiaQxMDkvk=; b=sw++ybXFu0CJI3xCJXVV+TQ5HCnxrxKBvsvSP2LXusMZeJN87Ss2Xn0rDewUxVPjx8 MDMkrDEerr0JekoIRlmJ79NIbF1oxmuccsIsFtpnfdTobvFxk1P5f5Rt6+sWBEwAX4mI 6OqOiEZpu/foB0jdUmOzm1PWZtIFRQICKtLNUuXV8QszbCTgdmtbCCwZoscz048dy1nR k/tlDgtM1YgO1uA/UaeA9PVJo2T2ZeJi914aQ6zVT8dl4wEgL07y5AdTC+SvqyEocPfY YkoY//kjWgkBab1sPLL9ftEV3q1EdO4MuS/EO288tbcygMFrHak1QIWMo7zZFYxx3suc KwNA== X-Forwarded-Encrypted: i=1; AFNElJ80ecrmqckLOI2/A4FAOU+23rxGphcHq73XWEMGBMMOJ1KCj3l+gsvQLl7NqoE9uXxlmsXBaMsyX8ZyZKg=@vger.kernel.org X-Gm-Message-State: AOJu0YzlRvHUShMjaMUr7rmv9W844qX2YjynwATIuSsx16KbXx1nBe6J GrIao/TPFyyYh7wbI8vXOzaSNgg6MGKGl/rsqmBvlEfR/8W3r0bmYxEG X-Gm-Gg: 
Acq92OEl3XOQNMIsZ0Ivcx7HGtoPJn/ihVjLQRziYBme4xuSsfOqSWCAi6wUqtAkCZj ukzlGRTKLPb2rt4yJld9yTbeRA1zkBZETufFpOS0ZX0pq4bQ5SQeSlS+v9fXCduPa0G0+3IuNNn 9Knc4A6hb4peVfXtzKZQvSe67qNRI+Z4B3UTQ+TZ9wTGFJknf8Bk19vcAIa+JMM9ebX+XR5e8uu JxIg/LDfII/XEvMWgyTOXBmYgxpEV/Fe89AvU5oMb2n5A5NOxFYaFPWokoOkE6LL2QCxu4Csqzg eCn0wKCPWazVUiVb1A2U8UOS7hLKLX6cG9z19tYk7yI6GX97D38ecnf/v4ROy4PoZcK6LneZCe6 VIkqTFAsuv6dwIEhw07s9x2cvVjdOTw0kHb2ev36Acr6PgIZTxtC6NHLXSAo4KagOE2JUW5TBzB H4/MyThf+rtsCA3lGVjzi6bFZ+kWZiepA= X-Received: by 2002:a05:600d:848a:b0:490:b4e5:ce7e with SMTP id 5b1f17b1804b1-490c26e1afbmr142063105e9.25.1780920951249; Mon, 08 Jun 2026 05:15:51 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.15.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:15:51 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 04/25] sched/deadline: Distinguish between dl_rq and my_q Date: Mon, 8 Jun 2026 14:15:23 +0200 Message-ID: <20260608121546.69910-5-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: luca abeni Split the single runqueue pointer in sched_dl_entity into two separate pointers, following the existing pattern used by sched_rt_entity: - dl_rq: Points to the deadline runqueue where this entity is queued (global runqueue). 
- my_q: Points to the runqueue that this entity serves (for servers). This distinction is currently redundant for the fair_server and ext_servers (both point to the same CPU's structures), but is essential for future RT cgroup support where deadline servers will be queued on the global dl_rq while serving tasks from cgroup-specific runqueues. Update rq_of_dl_se() to use container_of_const() to recover the global rq from dl_rq, and update fair.c and ext.c to explicitly use my_q (local rq) when accessing the served runqueue. Update dl_server_init() to take a dl_rq pointer (used to retrieve the global runqueue where the dl_server is scheduled) and an rq pointer (for the local runqueue served by the server). Signed-off-by: luca abeni Signed-off-by: Yuri Andriaccio --- include/linux/sched.h | 6 ++++-- kernel/sched/deadline.c | 17 ++++++++++++----- kernel/sched/ext.c | 4 ++-- kernel/sched/fair.c | 4 ++-- kernel/sched/sched.h | 3 ++- 5 files changed, 22 insertions(+), 12 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index ee06cba5c6f5..411ffe9b34b3 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -733,9 +733,11 @@ struct sched_dl_entity { * Bits for DL-server functionality. Also see the comment near * dl_server_update().
* - @rq the runqueue this server is for + * @dl_rq the runqueue on which this entity is (to be) queued + * @my_q the runqueue "owned" by this entity */ - struct rq *rq; + struct dl_rq *dl_rq; + struct rq *my_q; dl_server_pick_f server_pick_task; #ifdef CONFIG_RT_MUTEXES diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 63e88ecdd5ed..b3059658a74a 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -65,10 +65,12 @@ static inline struct rq *rq_of_dl_rq(struct dl_rq *dl_rq) static inline struct rq *rq_of_dl_se(struct sched_dl_entity *dl_se) { - struct rq *rq = dl_se->rq; + struct rq *rq; if (!dl_server(dl_se)) rq = task_rq(dl_task_of(dl_se)); + else + rq = container_of_const(dl_se->dl_rq, struct rq, dl); return rq; } @@ -1817,11 +1819,14 @@ void dl_server_start(struct sched_dl_entity *dl_se) void dl_server_stop(struct sched_dl_entity *dl_se) { + struct rq *rq; + if (!dl_server(dl_se) || !dl_server_active(dl_se)) return; - trace_sched_dl_server_stop_tp(dl_se, cpu_of(dl_se->rq), - dl_get_type(dl_se, dl_se->rq)); + rq = rq_of_dl_se(dl_se); + trace_sched_dl_server_stop_tp(dl_se, cpu_of(rq), + dl_get_type(dl_se, rq)); dequeue_dl_entity(dl_se, DEQUEUE_SLEEP); hrtimer_try_to_cancel(&dl_se->dl_timer); dl_se->dl_defer_armed = 0; @@ -1830,10 +1835,12 @@ void dl_server_stop(struct sched_dl_entity *dl_se) dl_se->dl_server_active = 0; } -void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq, +void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq, + struct rq *served_rq, dl_server_pick_f pick_task) { - dl_se->rq = rq; + dl_se->dl_rq = dl_rq; + dl_se->my_q = served_rq; dl_se->server_pick_task = pick_task; } diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 65631e577ee9..306bd22a4731 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -3252,7 +3252,7 @@ ext_server_pick_task(struct sched_dl_entity *dl_se, struct rq_flags *rf) if (!scx_enabled()) return NULL; - return
do_pick_task_scx(dl_se->rq, rf, true); + return do_pick_task_scx(dl_se->my_q, rf, true); } /* @@ -3264,7 +3264,7 @@ void ext_server_init(struct rq *rq) init_dl_entity(dl_se); - dl_server_init(dl_se, rq, ext_server_pick_task); + dl_server_init(dl_se, &rq->dl, rq, ext_server_pick_task); } #ifdef CONFIG_SCHED_CORE diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 3ebec186f982..2bc749ae9203 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9315,7 +9315,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct= *prev, struct rq_flags *rf static struct task_struct * fair_server_pick_task(struct sched_dl_entity *dl_se, struct rq_flags *rf) { - return pick_task_fair(dl_se->rq, rf); + return pick_task_fair(dl_se->my_q, rf); } void fair_server_init(struct rq *rq) @@ -9324,7 +9324,7 @@ void fair_server_init(struct rq *rq) init_dl_entity(dl_se); - dl_server_init(dl_se, rq, fair_server_pick_task); + dl_server_init(dl_se, &rq->dl, rq, fair_server_pick_task); } /* diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 9f63b15d309d..970386ce4dbf 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -412,7 +412,8 @@ extern void dl_server_update_idle(struct sched_dl_entit= y *dl_se, s64 delta_exec) extern void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec= ); extern void dl_server_start(struct sched_dl_entity *dl_se); extern void dl_server_stop(struct sched_dl_entity *dl_se); -extern void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq, +extern void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl= _rq, + struct rq *served_rq, dl_server_pick_f pick_task); extern void sched_init_dl_servers(void); -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f49.google.com (mail-wr1-f49.google.com [209.85.221.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E09D33793B5 for ; Mon, 
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v6 05/25] sched/rt: Pass an rt_rq instead of an rq where needed
Date: Mon, 8 Jun 2026 14:15:24 +0200
Message-ID: <20260608121546.69910-6-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Make the rt.c code access the runqueue through the rt_rq data structure
rather than through an rq pointer passed in directly.

This allows future patches to define rt_rq data structures that refer not
only to the global runqueue but also to local cgroup runqueues (so an rt_rq
will not always be equal to &rq->rt).
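Illustration for readers (not part of the patch): a minimal user-space
sketch of the pattern described above, assuming the rt_rq is embedded in its
rq so that the owner can be recovered with container_of()-style pointer
arithmetic. The demo_* names and the tiny structs are hypothetical stand-ins,
not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct demo_rt_rq {
	int overloaded;
	int nr_pushable;
};

struct demo_rq {
	int cpu;
	struct demo_rt_rq rt;	/* embedded, like rq->rt for the global runqueue */
};

/* Recover the enclosing object from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Only valid while every demo_rt_rq is embedded in a demo_rq. */
static struct demo_rq *demo_rq_of_rt_rq(struct demo_rt_rq *rt_rq)
{
	return demo_container_of(rt_rq, struct demo_rq, rt);
}

/* Helpers take the rt_rq and look up the rq only when they need it. */
static int demo_has_pushable_tasks(const struct demo_rt_rq *rt_rq)
{
	return rt_rq->nr_pushable > 0;
}

/* Returns the CPU a push would be queued on, or -1 if nothing is pushable. */
static int demo_queue_push_cpu(struct demo_rt_rq *rt_rq)
{
	if (!demo_has_pushable_tasks(rt_rq))
		return -1;

	return demo_rq_of_rt_rq(rt_rq)->cpu;
}
```

Once an rt_rq may belong to a cgroup rather than being embedded in the rq,
the pointer arithmetic no longer holds, which is presumably why the series
funnels such lookups through a single helper.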
Co-developed-by: Yuri Andriaccio Signed-off-by: Yuri Andriaccio Signed-off-by: luca abeni --- kernel/sched/rt.c | 169 +++++++++++++++++++++++++++------------------- 1 file changed, 100 insertions(+), 69 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index e6ea728f519e..0f0d9c283bd4 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -370,9 +370,9 @@ static inline void rt_clear_overload(struct rq *rq) cpumask_clear_cpu(rq->cpu, rq->rd->rto_mask); } -static inline int has_pushable_tasks(struct rq *rq) +static inline int has_pushable_tasks(struct rt_rq *rt_rq) { - return !plist_head_empty(&rq->rt.pushable_tasks); + return !plist_head_empty(&rt_rq->pushable_tasks); } static DEFINE_PER_CPU(struct balance_callback, rt_push_head); @@ -381,50 +381,66 @@ static DEFINE_PER_CPU(struct balance_callback, rt_pul= l_head); static void push_rt_tasks(struct rq *); static void pull_rt_task(struct rq *); -static inline void rt_queue_push_tasks(struct rq *rq) +static inline void rt_queue_push_tasks(struct rt_rq *rt_rq) { - if (!has_pushable_tasks(rq)) + struct rq *rq =3D container_of_const(rt_rq, struct rq, rt); + + if (!has_pushable_tasks(rt_rq)) return; queue_balance_callback(rq, &per_cpu(rt_push_head, rq->cpu), push_rt_tasks= ); } -static inline void rt_queue_pull_task(struct rq *rq) +static inline void rt_queue_pull_task(struct rt_rq *rt_rq) { + struct rq *rq =3D container_of_const(rt_rq, struct rq, rt); + queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task); } -static void enqueue_pushable_task(struct rq *rq, struct task_struct *p) +static void push_rt_rq_tasks(struct rt_rq *rt_rq); + +static void push_rt_tasks(struct rq *global_rq) { + push_rt_rq_tasks(&global_rq->rt); +} + +static void pull_rt_rq_task(struct rt_rq *this_rt_rq); + +static void pull_rt_task(struct rq *global_rq) { + pull_rt_rq_task(&global_rq->rt); +} + +static void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_struct = *p) { - plist_del(&p->pushable_tasks, 
&rq->rt.pushable_tasks); + plist_del(&p->pushable_tasks, &rt_rq->pushable_tasks); plist_node_init(&p->pushable_tasks, p->prio); - plist_add(&p->pushable_tasks, &rq->rt.pushable_tasks); + plist_add(&p->pushable_tasks, &rt_rq->pushable_tasks); /* Update the highest prio pushable task */ - if (p->prio < rq->rt.highest_prio.next) - rq->rt.highest_prio.next =3D p->prio; + if (p->prio < rt_rq->highest_prio.next) + rt_rq->highest_prio.next =3D p->prio; - if (!rq->rt.overloaded) { - rt_set_overload(rq); - rq->rt.overloaded =3D 1; + if (!rt_rq->overloaded) { + rt_set_overload(rq_of_rt_rq(rt_rq)); + rt_rq->overloaded =3D 1; } } -static void dequeue_pushable_task(struct rq *rq, struct task_struct *p) +static void dequeue_pushable_task(struct rt_rq *rt_rq, struct task_struct = *p) { - plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks); + plist_del(&p->pushable_tasks, &rt_rq->pushable_tasks); /* Update the new highest prio pushable task */ - if (has_pushable_tasks(rq)) { - p =3D plist_first_entry(&rq->rt.pushable_tasks, + if (has_pushable_tasks(rt_rq)) { + p =3D plist_first_entry(&rt_rq->pushable_tasks, struct task_struct, pushable_tasks); - rq->rt.highest_prio.next =3D p->prio; + rt_rq->highest_prio.next =3D p->prio; } else { - rq->rt.highest_prio.next =3D MAX_RT_PRIO-1; + rt_rq->highest_prio.next =3D MAX_RT_PRIO-1; - if (rq->rt.overloaded) { - rt_clear_overload(rq); - rq->rt.overloaded =3D 0; + if (rt_rq->overloaded) { + rt_clear_overload(rq_of_rt_rq(rt_rq)); + rt_rq->overloaded =3D 0; } } } @@ -1436,6 +1452,7 @@ static void enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags) { struct sched_rt_entity *rt_se =3D &p->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); if (flags & ENQUEUE_WAKEUP) rt_se->timeout =3D 0; @@ -1449,17 +1466,18 @@ enqueue_task_rt(struct rq *rq, struct task_struct *= p, int flags) return; if (!task_current(rq, p) && p->nr_cpus_allowed > 1) - enqueue_pushable_task(rq, p); + enqueue_pushable_task(rt_rq, p); } static bool 
dequeue_task_rt(struct rq *rq, struct task_struct *p, int flag= s) { struct sched_rt_entity *rt_se =3D &p->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); update_curr_rt(rq); dequeue_rt_entity(rt_se, flags); - dequeue_pushable_task(rq, p); + dequeue_pushable_task(rt_rq, p); return true; } @@ -1498,7 +1516,7 @@ static void yield_task_rt(struct rq *rq) requeue_task_rt(rq, rq->donor, 0); } -static int find_lowest_rq(struct task_struct *task); +static int find_lowest_rt_rq(struct task_struct *task); static int select_task_rq_rt(struct task_struct *p, int cpu, int flags) @@ -1548,7 +1566,7 @@ select_task_rq_rt(struct task_struct *p, int cpu, int= flags) (curr->nr_cpus_allowed < 2 || donor->prio <=3D p->prio); if (test || !rt_task_fits_capacity(p, cpu)) { - int target =3D find_lowest_rq(p); + int target =3D find_lowest_rt_rq(p); /* * Bail out if we were forcing a migration to find a better @@ -1606,7 +1624,7 @@ static int balance_rt(struct rq *rq, struct task_stru= ct *p, struct rq_flags *rf) * not yet started the picking loop. 
*/ rq_unpin_lock(rq, rf); - pull_rt_task(rq); + pull_rt_rq_task(&rq->rt); rq_repin_lock(rq, rf); } @@ -1650,14 +1668,14 @@ static void wakeup_preempt_rt(struct rq *rq, struct= task_struct *p, int flags) static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, = bool first) { struct sched_rt_entity *rt_se =3D &p->rt; - struct rt_rq *rt_rq =3D &rq->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); p->se.exec_start =3D rq_clock_task(rq); if (on_rt_rq(&p->rt)) update_stats_wait_end_rt(rt_rq, rt_se); /* The running task is never eligible for pushing */ - dequeue_pushable_task(rq, p); + dequeue_pushable_task(rt_rq, p); if (!first) return; @@ -1670,7 +1688,7 @@ static inline void set_next_task_rt(struct rq *rq, st= ruct task_struct *p, bool f if (rq->donor->sched_class !=3D &rt_sched_class) update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 0); - rt_queue_push_tasks(rq); + rt_queue_push_tasks(rt_rq); } static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq) @@ -1721,7 +1739,7 @@ static struct task_struct *pick_task_rt(struct rq *rq= , struct rq_flags *rf) static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct = task_struct *next) { struct sched_rt_entity *rt_se =3D &p->rt; - struct rt_rq *rt_rq =3D &rq->rt; + struct rt_rq *rt_rq =3D rt_rq_of_se(&p->rt); if (on_rt_rq(&p->rt)) update_stats_wait_start_rt(rt_rq, rt_se); @@ -1737,7 +1755,7 @@ static void put_prev_task_rt(struct rq *rq, struct ta= sk_struct *p, struct task_s * if it is still active */ if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1) - enqueue_pushable_task(rq, p); + enqueue_pushable_task(rt_rq, p); } /* Only try algorithms three times */ @@ -1747,16 +1765,16 @@ static void put_prev_task_rt(struct rq *rq, struct = task_struct *p, struct task_s * Return the highest pushable rq's task, which is suitable to be executed * on the CPU, NULL otherwise */ -static struct task_struct *pick_highest_pushable_task(struct rq *rq, int c= pu) +static struct task_struct 
*pick_highest_pushable_task(struct rt_rq *rt_rq,= int cpu) { - struct plist_head *head =3D &rq->rt.pushable_tasks; + struct plist_head *head =3D &rt_rq->pushable_tasks; struct task_struct *p; - if (!has_pushable_tasks(rq)) + if (!has_pushable_tasks(rt_rq)) return NULL; plist_for_each_entry(p, head, pushable_tasks) { - if (task_is_pushable(rq, p, cpu)) + if (task_is_pushable(rq_of_rt_rq(rt_rq), p, cpu)) return p; } @@ -1765,7 +1783,7 @@ static struct task_struct *pick_highest_pushable_task= (struct rq *rq, int cpu) static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask); -static int find_lowest_rq(struct task_struct *task) +static int find_lowest_rt_rq(struct task_struct *task) { struct sched_domain *sd; struct cpumask *lowest_mask =3D this_cpu_cpumask_var_ptr(local_cpu_mask); @@ -1856,12 +1874,13 @@ static int find_lowest_rq(struct task_struct *task) return -1; } -static struct task_struct *pick_next_pushable_task(struct rq *rq) +static struct task_struct *pick_next_pushable_task(struct rt_rq *rt_rq) { - struct plist_head *head =3D &rq->rt.pushable_tasks; + struct rq *rq =3D rq_of_rt_rq(rt_rq); + struct plist_head *head =3D &rt_rq->pushable_tasks; struct task_struct *i, *p =3D NULL; - if (!has_pushable_tasks(rq)) + if (!has_pushable_tasks(rt_rq)) return NULL; plist_for_each_entry(i, head, pushable_tasks) { @@ -1887,14 +1906,15 @@ static struct task_struct *pick_next_pushable_task(= struct rq *rq) } /* Will lock the rq it finds */ -static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq = *rq) +static struct rt_rq *find_lock_lowest_rt_rq(struct task_struct *task, stru= ct rt_rq *rt_rq) { + struct rq *rq =3D rq_of_rt_rq(rt_rq); struct rq *lowest_rq =3D NULL; int tries; int cpu; for (tries =3D 0; tries < RT_MAX_TRIES; tries++) { - cpu =3D find_lowest_rq(task); + cpu =3D find_lowest_rt_rq(task); if ((cpu =3D=3D -1) || (cpu =3D=3D rq->cpu)) break; @@ -1925,7 +1945,7 @@ static struct rq *find_lock_lowest_rq(struct task_str= uct *task, struct rq *rq) */ if 
(unlikely(is_migration_disabled(task) || !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) || - task !=3D pick_next_pushable_task(rq))) { + task !=3D pick_next_pushable_task(rt_rq))) { double_unlock_balance(rq, lowest_rq); lowest_rq =3D NULL; @@ -1942,7 +1962,13 @@ static struct rq *find_lock_lowest_rq(struct task_st= ruct *task, struct rq *rq) lowest_rq =3D NULL; } - return lowest_rq; + return &lowest_rq->rt; +} + +static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq = *rq) { + struct rt_rq *rt_rq =3D find_lock_lowest_rt_rq(task, &rq->rt); + + return rq_of_rt_rq(rt_rq); } /* @@ -1950,16 +1976,17 @@ static struct rq *find_lock_lowest_rq(struct task_s= truct *task, struct rq *rq) * running task can migrate over to a CPU that is running a task * of lesser priority. */ -static int push_rt_task(struct rq *rq, bool pull) +static int push_rt_rq_task(struct rt_rq *rt_rq, bool pull) { struct task_struct *next_task; - struct rq *lowest_rq; + struct rq *lowest_rq, *rq =3D rq_of_rt_rq(rt_rq); + struct rt_rq *lowest_rt_rq; int ret =3D 0; - if (!rq->rt.overloaded) + if (!rt_rq->overloaded) return 0; - next_task =3D pick_next_pushable_task(rq); + next_task =3D pick_next_pushable_task(rt_rq); if (!next_task) return 0; @@ -1982,7 +2009,7 @@ static int push_rt_task(struct rq *rq, bool pull) return 0; /* - * Invoking find_lowest_rq() on anything but an RT task doesn't + * Invoking find_lowest_rt_rq() on anything but an RT task doesn't * make sense. Per the above priority check, curr has to * be of higher priority than next_task, so no need to * reschedule when bailing out. 
@@ -1993,7 +2020,7 @@ static int push_rt_task(struct rq *rq, bool pull) if (rq->donor->sched_class !=3D &rt_sched_class) return 0; - cpu =3D find_lowest_rq(rq->curr); + cpu =3D find_lowest_rt_rq(rq->curr); if (cpu =3D=3D -1 || cpu =3D=3D rq->cpu) return 0; @@ -2022,19 +2049,19 @@ static int push_rt_task(struct rq *rq, bool pull) /* We might release rq lock */ get_task_struct(next_task); - /* find_lock_lowest_rq locks the rq if found */ - lowest_rq =3D find_lock_lowest_rq(next_task, rq); - if (!lowest_rq) { + /* find_lock_lowest_rt_rq locks the rq if found */ + lowest_rt_rq =3D find_lock_lowest_rt_rq(next_task, rt_rq); + if (!lowest_rt_rq) { struct task_struct *task; /* - * find_lock_lowest_rq releases rq->lock + * find_lock_lowest_rt_rq releases rq->lock * so it is possible that next_task has migrated. * * We need to make sure that the task is still on the same * run-queue and is also still the next task eligible for * pushing. */ - task =3D pick_next_pushable_task(rq); + task =3D pick_next_pushable_task(rt_rq); if (task =3D=3D next_task) { /* * The task hasn't migrated, and is still the next @@ -2057,6 +2084,7 @@ static int push_rt_task(struct rq *rq, bool pull) goto retry; } + lowest_rq =3D rq_of_rt_rq(lowest_rt_rq); move_queued_task_locked(rq, lowest_rq, next_task); resched_curr(lowest_rq); ret =3D 1; @@ -2068,10 +2096,10 @@ static int push_rt_task(struct rq *rq, bool pull) return ret; } -static void push_rt_tasks(struct rq *rq) +static void push_rt_rq_tasks(struct rt_rq *rt_rq) { - /* push_rt_task will return true if it moved an RT */ - while (push_rt_task(rq, false)) + /* push_rt_rq_task will return true if it moved an RT */ + while (push_rt_rq_task(rt_rq, false)) ; } @@ -2227,9 +2255,9 @@ void rto_push_irq_work_func(struct irq_work *work) * We do not need to grab the lock to check for has_pushable_tasks. * When it gets updated, a check is made if a push is possible. 
*/ - if (has_pushable_tasks(rq)) { + if (has_pushable_tasks(&rq->rt)) { raw_spin_rq_lock(rq); - while (push_rt_task(rq, true)) + while (push_rt_rq_task(&rq->rt, true)) ; raw_spin_rq_unlock(rq); } @@ -2251,11 +2279,13 @@ void rto_push_irq_work_func(struct irq_work *work) } #endif /* HAVE_RT_PUSH_IPI */ -static void pull_rt_task(struct rq *this_rq) +static void pull_rt_rq_task(struct rt_rq *this_rt_rq) { + struct rq *this_rq =3D rq_of_rt_rq(this_rt_rq); int this_cpu =3D this_rq->cpu, cpu; bool resched =3D false; struct task_struct *p, *push_task; + struct rt_rq *src_rt_rq; struct rq *src_rq; int rt_overload_count =3D rt_overloaded(this_rq); @@ -2285,6 +2315,7 @@ static void pull_rt_task(struct rq *this_rq) continue; src_rq =3D cpu_rq(cpu); + src_rt_rq =3D &src_rq->rt; /* * Don't bother taking the src_rq->lock if the next highest @@ -2293,8 +2324,8 @@ static void pull_rt_task(struct rq *this_rq) * logically higher, the src_rq will push this task away. * And if its going logically lower, we do not care */ - if (src_rq->rt.highest_prio.next >=3D - this_rq->rt.highest_prio.curr) + if (src_rt_rq->highest_prio.next >=3D + this_rt_rq->highest_prio.curr) continue; /* @@ -2309,13 +2340,13 @@ static void pull_rt_task(struct rq *this_rq) * We can pull only a task, which is pushable * on its rq, and no others. */ - p =3D pick_highest_pushable_task(src_rq, this_cpu); + p =3D pick_highest_pushable_task(src_rt_rq, this_cpu); /* * Do we have an RT task that preempts * the to-be-scheduled task? 
*/ - if (p && (p->prio < this_rq->rt.highest_prio.curr)) { + if (p && (p->prio < this_rt_rq->highest_prio.curr)) { WARN_ON(p =3D=3D src_rq->curr); WARN_ON(!task_on_rq_queued(p)); @@ -2374,7 +2405,7 @@ static void task_woken_rt(struct rq *rq, struct task_= struct *p) rq->donor->prio <=3D p->prio); if (need_to_push) - push_rt_tasks(rq); + push_rt_rq_tasks(rt_rq_of_se(&p->rt)); } /* Assumes rq->lock is held */ @@ -2415,7 +2446,7 @@ static void switched_from_rt(struct rq *rq, struct ta= sk_struct *p) if (!task_on_rq_queued(p) || rq->rt.rt_nr_running) return; - rt_queue_pull_task(rq); + rt_queue_pull_task(rt_rq_of_se(&p->rt)); } void __init init_sched_rt_class(void) @@ -2451,7 +2482,7 @@ static void switched_to_rt(struct rq *rq, struct task= _struct *p) */ if (task_on_rq_queued(p)) { if (p->nr_cpus_allowed > 1 && rq->rt.overloaded) - rt_queue_push_tasks(rq); + rt_queue_push_tasks(rt_rq_of_se(&p->rt)); if (p->prio < rq->donor->prio && cpu_online(cpu_of(rq))) resched_curr(rq); } @@ -2476,7 +2507,7 @@ prio_changed_rt(struct rq *rq, struct task_struct *p,= u64 oldprio) * may need to pull tasks to this runqueue. 
*/ if (oldprio < p->prio) - rt_queue_pull_task(rq); + rt_queue_pull_task(rt_rq_of_se(&p->rt)); /* * If there's a higher priority task waiting to run -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16CF037B00F for ; Mon, 8 Jun 2026 12:15:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920957; cv=none; b=IBn4Fnt8tD4RElXDOlSW3GXL2Ll1NvXLObNPWfPK4OZ31t3jjDBFutMzyU/0/C0bquAMG+ASI/4qCG4Fxv1Hqu+AxgWz4MNSBTfInSt7lcOWnEgPUWE/XyI/bYPpdAjU2K9z1tf1DZ7Hd2cppEJwNjspXKWCzIS9H+Q9h4aoxlg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920957; c=relaxed/simple; bh=iwi1n/GaegwIdKWm3vvM+QonyYzEKh7dlyHKXurUVmE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=H842oHT3Hj5FDigDD4ALai4LeqCa9lduyCf1UBrdibilyJaqEI1/sHsfvE64NbRGKo4IqKvH3BSbSYyVMjPKxpv5CtAzM8gu+ZYU4dqnr+L215Ye6p4S7sYHHaaKSQhbKJz7KKRYEobRnfSSp3HI9Fpf5WCbF3yMQv/pr/gZiNI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=p0ny/cJQ; arc=none smtp.client-ip=209.85.221.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="p0ny/cJQ" Received: by mail-wr1-f47.google.com with SMTP id ffacd0b85a97d-46013161068so2052145f8f.2 for ; Mon, 08 Jun 2026 05:15:54 -0700 (PDT) DKIM-Signature: v=1; 
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v6 06/25] sched/rt: Move functions from rt.c to sched.h
Date: Mon, 8 Jun 2026 14:15:25 +0200
Message-ID: <20260608121546.69910-7-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Move the following functions/macros from rt.c to sched.h, so that they can
also be used in other source files:
- rt_entity_is_task()
- rt_task_of()
- rq_of_rt_rq()
- rt_rq_of_se()
- rq_of_rt_se()

There are no functional changes, apart from the use of container_of_const()
instead of container_of() where applicable. This is needed by future patches.
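Illustration for readers (not part of the patch): a rough user-space sketch
of what a const-preserving container_of looks like, to show the one
behavioural difference mentioned above. It assumes C11 _Generic and the GNU
__typeof__ extension (gcc or clang). The demo_* names and structs are
hypothetical, and the macro is a simplified approximation rather than the
kernel's definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures are far larger. */
struct demo_rt_rq {
	int rt_nr_running;
};

struct demo_rq {
	int cpu;
	struct demo_rt_rq rt;
};

/* Plain variant: always yields a non-const pointer. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Const-preserving variant: a pointer to const yields a pointer to const,
 * any other pointer yields a plain pointer.
 */
#define demo_container_of_const(ptr, type, member)			\
	_Generic((ptr),							\
		const __typeof__(*(ptr)) *:				\
			(const type *)demo_container_of(ptr, type, member), \
		default:						\
			(type *)demo_container_of(ptr, type, member))

/* Evaluates to 1 if the expression has type "const struct demo_rq *". */
#define demo_is_const_rq_ptr(expr) \
	_Generic((expr), const struct demo_rq *: 1, default: 0)

/* A read-only helper can take a const rt_rq and still find its owner. */
static int demo_cpu_of_rt_rq(const struct demo_rt_rq *rt_rq)
{
	const struct demo_rq *rq =
		demo_container_of_const(rt_rq, struct demo_rq, rt);

	return rq->cpu;
}
```

With the plain variant, the same helper would silently drop the const
qualifier of its argument; the const-aware form keeps it in the result type.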
Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 kernel/sched/rt.c    | 56 ------------------------------------------
 kernel/sched/sched.h | 58 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 56 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0f0d9c283bd4..fe5b58f8fc69 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -166,36 +166,6 @@ static void destroy_rt_bandwidth(struct rt_bandwidth *rt_b)
 	hrtimer_cancel(&rt_b->rt_period_timer);
 }
 
-#define rt_entity_is_task(rt_se) (!(rt_se)->my_q)
-
-static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
-{
-	WARN_ON_ONCE(!rt_entity_is_task(rt_se));
-
-	return container_of(rt_se, struct task_struct, rt);
-}
-
-static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
-{
-	/* Cannot fold with non-CONFIG_RT_GROUP_SCHED version, layout */
-	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
-	return rt_rq->rq;
-}
-
-static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
-{
-	WARN_ON(!rt_group_sched_enabled() && rt_se->rt_rq->tg != &root_task_group);
-	return rt_se->rt_rq;
-}
-
-static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se)
-{
-	struct rt_rq *rt_rq = rt_se->rt_rq;
-
-	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
-	return rt_rq->rq;
-}
-
 void unregister_rt_sched_group(struct task_group *tg)
 {
 	if (!rt_group_sched_enabled())
@@ -294,32 +264,6 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
 
 #else /* !CONFIG_RT_GROUP_SCHED: */
 
-#define rt_entity_is_task(rt_se) (1)
-
-static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
-{
-	return container_of(rt_se, struct task_struct, rt);
-}
-
-static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
-{
-	return container_of(rt_rq, struct rq, rt);
-}
-
-static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se)
-{
-	struct task_struct *p = rt_task_of(rt_se);
-
-	return task_rq(p);
-}
-
-static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
-{
-	struct rq *rq = rq_of_rt_se(rt_se);
-
-	return &rq->rt;
-}
-
 void unregister_rt_sched_group(struct task_group *tg) { }
 
 void free_rt_sched_group(struct task_group *tg) { }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 970386ce4dbf..a03866f68a3b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3332,6 +3332,64 @@ extern void set_rq_offline(struct rq *rq);
 
 extern bool sched_smp_initialized;
 
+#ifdef CONFIG_RT_GROUP_SCHED
+#define rt_entity_is_task(rt_se) (!(rt_se)->my_q)
+
+static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
+{
+	WARN_ON_ONCE(!rt_entity_is_task(rt_se));
+
+	return container_of_const(rt_se, struct task_struct, rt);
+}
+
+static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
+{
+	/* Cannot fold with non-CONFIG_RT_GROUP_SCHED version, layout */
+	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
+	return rt_rq->rq;
+}
+
+static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
+{
+	WARN_ON(!rt_group_sched_enabled() && rt_se->rt_rq->tg != &root_task_group);
+	return rt_se->rt_rq;
+}
+
+static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se)
+{
+	struct rt_rq *rt_rq = rt_se->rt_rq;
+
+	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
+	return rt_rq->rq;
+}
+#else
+#define rt_entity_is_task(rt_se) (1)
+
+static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
+{
+	return container_of_const(rt_se, struct task_struct, rt);
+}
+
+static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
+{
+	return container_of_const(rt_rq, struct rq, rt);
+}
+
+static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se)
+{
+	struct task_struct *p = rt_task_of(rt_se);
+
+	return task_rq(p);
+}
+
+static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
+{
+	struct rq *rq = rq_of_rt_se(rt_se);
+
+	return &rq->rt;
+}
+#endif
+
 DEFINE_LOCK_GUARD_2(double_rq_lock, struct rq,
 		    double_rq_lock(_T->lock, _T->lock2),
 		    double_rq_unlock(_T->lock, _T->lock2))
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920955; x=1781525755; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=364uttzh73Oa+/XNtb6jhsl3muZzgrKhNSL18k3lXqY=; b=QpNGNNHTsz4Q0pkSKos12VjaqVe/UgcC+Kg36bUjo8z7F2S2LpTgTw7vvkC4+LgfVQ MoXOs+VuSPpn10mYO+vFaEfTmSCbCf9xsopSgFy2WIEhuEPKl0OR6JzxFgmoHyp+UYTB zykHf2i5NUBLEGgeQ09iTiGJpEm7pMFy7dNRZfPv3SBlobET6tjOg9fn5wLJ89DCcwIW fdwfbi6z291H1IFY8xnqsS2587/IIS1PiIeenA0VnL6GmK6pdTYKfDIkUi3UwagTxFS5 CIIHQrdCHN41O9QOMEL3czEAjL/xOOTBcS4+glOZLqeH7H4CZbxDqx6QZnzggxi176pu WSRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920955; x=1781525755; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=364uttzh73Oa+/XNtb6jhsl3muZzgrKhNSL18k3lXqY=; b=rjxQU4zgh+18o+VhYzb058KgEvi2CPOieGhA3d/xIZM0Pj45JfLXQfnlx2+pZ23Mdf 08HOAMEubQRkmPnPc1PNPZ5APwYpHG5Z66IcF7yPjfAE5N3RjR03cU8a9fcxv1sieFWf sN0cVfzjv/SJLRu48yfJEakrZYCCHIZf+S+aqvj3g/+x7ZxXsDeoFY+zrA383tA3AZaF cnBXrK8nA1wmrTj4oNTALFbo4UB8CrOd6G5H6dRmIzY71Xpeed4fSYqbPUlGev2GIBck ahtvBJvHMdQTMw4sbHtzFs4KrDyb7F6UaMGpHj2YLQt2jSEHvsPTCkT3vJNgNs7UF+0f N4fA== X-Forwarded-Encrypted: i=1; AFNElJ+8E66R+YcvXdjSkppb0uP7ijWgfZ2bmZ5DYsTFP/8x3ZcEFDyXhcxzaVv+KPm8j67F0Gr2wR9lpNvdkUw=@vger.kernel.org X-Gm-Message-State: AOJu0YwT+M4eIMFhJgcbTTA4mP7/m5Df1dGoowxOVa52aJHqQB8eqZj9 ZFnixi/vd1VfU60Jt+OmilPN7Cy7Ecg//vWUnPF5AVDWd0DSkXiZFWqh X-Gm-Gg: Acq92OG8rZpiqjHJMwyUP1Fe/PVRQJySsq71GO2UY6jO8nF+MVGADjIX8ypxaWkqqju HQ/j1etnTAyIa+o1EarlAWPmwCxGmzEdI23+FflrZjPkaYFnbB2dUm3z73UTQBwOQLOsn4ItVNm hey8a5omNV2X5bbbiXXcWzvE64EVKFyP1OiGQ/2mh0Huf0Dh8TrJv54pouEr60qoBy4uJiLAM6G zsuWH/3MDShULPy+J3DNdBPrs2535AIVTAVtRF+nAsP4Pe0X+wqJV2dYXkbVqdzXUsreZP8r1U5 
d/cSmZ0CaoGhcd365+Puu9Ag86g/WS3/lloMfWN2xfu9MhkJ3OzbmCdv2JjCFLYZ7GsNKGZcif4 94iojf2TfoOlGs+O+IPWQ8EJvjydhM1ZyIBhfmWbWFP/9OmSIGRYLXo7m4RMZ+UnIOZoWAUeAow mckJdNVdI5pqlTH39xgmPBSFE153tJVLNGf2zCBSeHeQ== X-Received: by 2002:a05:600c:8b67:b0:490:d354:bcef with SMTP id 5b1f17b1804b1-490d354be2emr8449925e9.33.1780920954386; Mon, 08 Jun 2026 05:15:54 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.15.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:15:54 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 07/25] sched/rt: Disable RT_GROUP_SCHED Date: Mon, 8 Jun 2026 14:15:26 +0200 Message-ID: <20260608121546.69910-8-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Disable the old RT_GROUP_SCHED scheduler. Note that this does not completely remove all the RT_GROUP_SCHED functionality, just unhooks it and removes most of the relevant functions. Some of the RT_GROUP_SCHED functions are kept because they will be adapted for the HCBS scheduling. Most notably: - Disable the initialization of the rt_bandwidth for group scheduling. - Unhook any functionality for RT_GROUP_SCHED in normal rt.c code, leaving only non-group functionality. - Remove group related field initialization in init_rt_rq(). 
- Remove all the unhooked (and so unused) functions from RT_GROUP_SCHED. - Remove all allocation/deallocation code for rt-groups, always returning failure on allocation. - Update inc/dec_rt_tasks active tasks' counters, as rt scheduling entities now only represent a single task, and not a group of tasks anymore. - Remove unused rq_of_rt_se function. Signed-off-by: Yuri Andriaccio --- kernel/sched/core.c | 6 - kernel/sched/deadline.c | 34 -- kernel/sched/debug.c | 6 - kernel/sched/rt.c | 852 ++-------------------------------------- kernel/sched/sched.h | 32 +- kernel/sched/syscalls.c | 13 - 6 files changed, 28 insertions(+), 915 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b8871449d3c6..e38ca8192d2d 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8922,11 +8922,6 @@ void __init sched_init(void) =20 init_defrootdomain(); =20 -#ifdef CONFIG_RT_GROUP_SCHED - init_rt_bandwidth(&root_task_group.rt_bandwidth, - global_rt_period(), global_rt_runtime()); -#endif /* CONFIG_RT_GROUP_SCHED */ - #ifdef CONFIG_CGROUP_SCHED task_group_cache =3D KMEM_CACHE(task_group, 0); =20 @@ -8978,7 +8973,6 @@ void __init sched_init(void) * starts working after scheduler_running, which is not the case * yet. */ - rq->rt.rt_runtime =3D global_rt_runtime(); init_tg_rt_entry(&root_task_group, &rq->rt, NULL, i, NULL); #endif rq->next_class =3D &idle_sched_class; diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index b3059658a74a..c12882348a03 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1533,40 +1533,6 @@ static void update_curr_dl_se(struct rq *rq, struct = sched_dl_entity *dl_se, s64 } else { trace_sched_dl_update_tp(dl_se, cpu_of(rq), dl_get_type(dl_se, rq)); } - - /* - * The dl_server does not account for real-time workload because it - * is running fair work. 
- */ - if (dl_se->dl_server) - return; - -#ifdef CONFIG_RT_GROUP_SCHED - /* - * Because -- for now -- we share the rt bandwidth, we need to - * account our runtime there too, otherwise actual rt tasks - * would be able to exceed the shared quota. - * - * Account to the root rt group for now. - * - * The solution we're working towards is having the RT groups scheduled - * using deadline servers -- however there's a few nasties to figure - * out before that can happen. - */ - if (rt_bandwidth_enabled()) { - struct rt_rq *rt_rq =3D &rq->rt; - - raw_spin_lock(&rt_rq->rt_runtime_lock); - /* - * We'll let actual RT tasks worry about the overflow here, we - * have our own CBS to keep us inline; only account when RT - * bandwidth is relevant. - */ - if (sched_rt_bandwidth_account(rt_rq)) - rt_rq->rt_time +=3D delta_exec; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - } -#endif /* CONFIG_RT_GROUP_SCHED */ } =20 /* diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c index 74c1617cf652..40cc905a65b7 100644 --- a/kernel/sched/debug.c +++ b/kernel/sched/debug.c @@ -1009,12 +1009,6 @@ void print_rt_rq(struct seq_file *m, int cpu, struct= rt_rq *rt_rq) =20 PU(rt_nr_running); =20 -#ifdef CONFIG_RT_GROUP_SCHED - P(rt_throttled); - PN(rt_time); - PN(rt_runtime); -#endif - #undef PN #undef PU #undef P diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index fe5b58f8fc69..7b526a86083c 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -82,115 +82,19 @@ void init_rt_rq(struct rt_rq *rt_rq) rt_rq->highest_prio.next =3D MAX_RT_PRIO-1; rt_rq->overloaded =3D 0; plist_head_init(&rt_rq->pushable_tasks); - /* We start is dequeued state, because no RT tasks are queued */ - rt_rq->rt_queued =3D 0; - -#ifdef CONFIG_RT_GROUP_SCHED - rt_rq->rt_time =3D 0; - rt_rq->rt_throttled =3D 0; - rt_rq->rt_runtime =3D 0; - raw_spin_lock_init(&rt_rq->rt_runtime_lock); - rt_rq->tg =3D &root_task_group; -#endif } =20 #ifdef CONFIG_RT_GROUP_SCHED =20 -static int do_sched_rt_period_timer(struct 
rt_bandwidth *rt_b, int overrun= ); - -static enum hrtimer_restart sched_rt_period_timer(struct hrtimer *timer) -{ - struct rt_bandwidth *rt_b =3D - container_of(timer, struct rt_bandwidth, rt_period_timer); - int idle =3D 0; - int overrun; - - raw_spin_lock(&rt_b->rt_runtime_lock); - for (;;) { - overrun =3D hrtimer_forward_now(timer, rt_b->rt_period); - if (!overrun) - break; - - raw_spin_unlock(&rt_b->rt_runtime_lock); - idle =3D do_sched_rt_period_timer(rt_b, overrun); - raw_spin_lock(&rt_b->rt_runtime_lock); - } - if (idle) - rt_b->rt_period_active =3D 0; - raw_spin_unlock(&rt_b->rt_runtime_lock); - - return idle ? HRTIMER_NORESTART : HRTIMER_RESTART; -} - -void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime) -{ - rt_b->rt_period =3D ns_to_ktime(period); - rt_b->rt_runtime =3D runtime; - - raw_spin_lock_init(&rt_b->rt_runtime_lock); - - hrtimer_setup(&rt_b->rt_period_timer, sched_rt_period_timer, CLOCK_MONOTO= NIC, - HRTIMER_MODE_REL_HARD); -} - -static inline void do_start_rt_bandwidth(struct rt_bandwidth *rt_b) -{ - raw_spin_lock(&rt_b->rt_runtime_lock); - if (!rt_b->rt_period_active) { - rt_b->rt_period_active =3D 1; - /* - * SCHED_DEADLINE updates the bandwidth, as a run away - * RT task with a DL task could hog a CPU. But DL does - * not reset the period. If a deadline task was running - * without an RT task running, it can cause RT tasks to - * throttle when they start up. Kick the timer right away - * to update the period. 
- */ - hrtimer_forward_now(&rt_b->rt_period_timer, ns_to_ktime(0)); - hrtimer_start_expires(&rt_b->rt_period_timer, - HRTIMER_MODE_ABS_PINNED_HARD); - } - raw_spin_unlock(&rt_b->rt_runtime_lock); -} - -static void start_rt_bandwidth(struct rt_bandwidth *rt_b) -{ - if (!rt_bandwidth_enabled() || rt_b->rt_runtime =3D=3D RUNTIME_INF) - return; - - do_start_rt_bandwidth(rt_b); -} - -static void destroy_rt_bandwidth(struct rt_bandwidth *rt_b) -{ - hrtimer_cancel(&rt_b->rt_period_timer); -} - void unregister_rt_sched_group(struct task_group *tg) { - if (!rt_group_sched_enabled()) - return; =20 - if (tg->rt_se) - destroy_rt_bandwidth(&tg->rt_bandwidth); } =20 void free_rt_sched_group(struct task_group *tg) { - int i; - if (!rt_group_sched_enabled()) return; - - for_each_possible_cpu(i) { - if (tg->rt_rq) - kfree(tg->rt_rq[i]); - if (tg->rt_se) - kfree(tg->rt_se[i]); - } - - kfree(tg->rt_rq); - kfree(tg->rt_se); } =20 void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq, @@ -200,66 +104,19 @@ void init_tg_rt_entry(struct task_group *tg, struct r= t_rq *rt_rq, struct rq *rq =3D cpu_rq(cpu); =20 rt_rq->highest_prio.curr =3D MAX_RT_PRIO-1; - rt_rq->rt_nr_boosted =3D 0; rt_rq->rq =3D rq; rt_rq->tg =3D tg; =20 tg->rt_rq[cpu] =3D rt_rq; tg->rt_se[cpu] =3D rt_se; - - if (!rt_se) - return; - - if (!parent) - rt_se->rt_rq =3D &rq->rt; - else - rt_se->rt_rq =3D parent->my_q; - - rt_se->my_q =3D rt_rq; - rt_se->parent =3D parent; - INIT_LIST_HEAD(&rt_se->run_list); } =20 int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) { - struct rt_rq *rt_rq; - struct sched_rt_entity *rt_se; - int i; - if (!rt_group_sched_enabled()) return 1; =20 - tg->rt_rq =3D kzalloc_objs(rt_rq, nr_cpu_ids); - if (!tg->rt_rq) - goto err; - tg->rt_se =3D kzalloc_objs(rt_se, nr_cpu_ids); - if (!tg->rt_se) - goto err; - - init_rt_bandwidth(&tg->rt_bandwidth, ktime_to_ns(global_rt_period()), 0); - - for_each_possible_cpu(i) { - rt_rq =3D kzalloc_node(sizeof(struct rt_rq), - 
GFP_KERNEL, cpu_to_node(i)); - if (!rt_rq) - goto err; - - rt_se =3D kzalloc_node(sizeof(struct sched_rt_entity), - GFP_KERNEL, cpu_to_node(i)); - if (!rt_se) - goto err_free_rq; - - init_rt_rq(rt_rq); - rt_rq->rt_runtime =3D tg->rt_bandwidth.rt_runtime; - init_tg_rt_entry(tg, rt_rq, rt_se, i, parent->rt_se[i]); - } - return 1; - -err_free_rq: - kfree(rt_rq); -err: - return 0; } =20 #else /* !CONFIG_RT_GROUP_SCHED: */ @@ -389,9 +246,6 @@ static void dequeue_pushable_task(struct rt_rq *rt_rq, = struct task_struct *p) } } =20 -static void enqueue_top_rt_rq(struct rt_rq *rt_rq); -static void dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count); - static inline int on_rt_rq(struct sched_rt_entity *rt_se) { return rt_se->on_rq; @@ -438,16 +292,6 @@ static inline bool rt_task_fits_capacity(struct task_s= truct *p, int cpu) =20 #ifdef CONFIG_RT_GROUP_SCHED =20 -static inline u64 sched_rt_runtime(struct rt_rq *rt_rq) -{ - return rt_rq->rt_runtime; -} - -static inline u64 sched_rt_period(struct rt_rq *rt_rq) -{ - return ktime_to_ns(rt_rq->tg->rt_bandwidth.rt_period); -} - typedef struct task_group *rt_rq_iter_t; =20 static inline struct task_group *next_task_group(struct task_group *tg) @@ -473,457 +317,20 @@ static inline struct task_group *next_task_group(str= uct task_group *tg) iter && (rt_rq =3D iter->rt_rq[cpu_of(rq)]); \ iter =3D next_task_group(iter)) =20 -#define for_each_sched_rt_entity(rt_se) \ - for (; rt_se; rt_se =3D rt_se->parent) - -static inline struct rt_rq *group_rt_rq(struct sched_rt_entity *rt_se) -{ - return rt_se->my_q; -} - static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags); static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int = flags); =20 -static void sched_rt_rq_enqueue(struct rt_rq *rt_rq) -{ - struct task_struct *donor =3D rq_of_rt_rq(rt_rq)->donor; - struct rq *rq =3D rq_of_rt_rq(rt_rq); - struct sched_rt_entity *rt_se; - - int cpu =3D cpu_of(rq); - - rt_se =3D rt_rq->tg->rt_se[cpu]; - 
- if (rt_rq->rt_nr_running) { - if (!rt_se) - enqueue_top_rt_rq(rt_rq); - else if (!on_rt_rq(rt_se)) - enqueue_rt_entity(rt_se, 0); - - if (rt_rq->highest_prio.curr < donor->prio) - resched_curr(rq); - } -} - -static void sched_rt_rq_dequeue(struct rt_rq *rt_rq) -{ - struct sched_rt_entity *rt_se; - int cpu =3D cpu_of(rq_of_rt_rq(rt_rq)); - - rt_se =3D rt_rq->tg->rt_se[cpu]; - - if (!rt_se) { - dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running); - /* Kick cpufreq (see the comment in kernel/sched/sched.h). */ - cpufreq_update_util(rq_of_rt_rq(rt_rq), 0); - } - else if (on_rt_rq(rt_se)) - dequeue_rt_entity(rt_se, 0); -} - -static inline int rt_rq_throttled(struct rt_rq *rt_rq) -{ - return rt_rq->rt_throttled && !rt_rq->rt_nr_boosted; -} - -static int rt_se_boosted(struct sched_rt_entity *rt_se) -{ - struct rt_rq *rt_rq =3D group_rt_rq(rt_se); - struct task_struct *p; - - if (rt_rq) - return !!rt_rq->rt_nr_boosted; - - p =3D rt_task_of(rt_se); - return p->prio !=3D p->normal_prio; -} - -static inline const struct cpumask *sched_rt_period_mask(void) -{ - return this_rq()->rd->span; -} - -static inline -struct rt_rq *sched_rt_period_rt_rq(struct rt_bandwidth *rt_b, int cpu) -{ - return container_of(rt_b, struct task_group, rt_bandwidth)->rt_rq[cpu]; -} - -static inline struct rt_bandwidth *sched_rt_bandwidth(struct rt_rq *rt_rq) -{ - return &rt_rq->tg->rt_bandwidth; -} - -bool sched_rt_bandwidth_account(struct rt_rq *rt_rq) -{ - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - - return (hrtimer_active(&rt_b->rt_period_timer) || - rt_rq->rt_time < rt_b->rt_runtime); -} - -/* - * We ran out of runtime, see if we can borrow some from our neighbours. 
- */ -static void do_balance_runtime(struct rt_rq *rt_rq) -{ - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - struct root_domain *rd =3D rq_of_rt_rq(rt_rq)->rd; - int i, weight; - u64 rt_period; - - weight =3D cpumask_weight(rd->span); - - raw_spin_lock(&rt_b->rt_runtime_lock); - rt_period =3D ktime_to_ns(rt_b->rt_period); - for_each_cpu(i, rd->span) { - struct rt_rq *iter =3D sched_rt_period_rt_rq(rt_b, i); - s64 diff; - - if (iter =3D=3D rt_rq) - continue; - - raw_spin_lock(&iter->rt_runtime_lock); - /* - * Either all rqs have inf runtime and there's nothing to steal - * or __disable_runtime() below sets a specific rq to inf to - * indicate its been disabled and disallow stealing. - */ - if (iter->rt_runtime =3D=3D RUNTIME_INF) - goto next; - - /* - * From runqueues with spare time, take 1/n part of their - * spare time, but no more than our period. - */ - diff =3D iter->rt_runtime - iter->rt_time; - if (diff > 0) { - diff =3D div_u64((u64)diff, weight); - if (rt_rq->rt_runtime + diff > rt_period) - diff =3D rt_period - rt_rq->rt_runtime; - iter->rt_runtime -=3D diff; - rt_rq->rt_runtime +=3D diff; - if (rt_rq->rt_runtime =3D=3D rt_period) { - raw_spin_unlock(&iter->rt_runtime_lock); - break; - } - } -next: - raw_spin_unlock(&iter->rt_runtime_lock); - } - raw_spin_unlock(&rt_b->rt_runtime_lock); -} - -/* - * Ensure this RQ takes back all the runtime it lend to its neighbours. - */ -static void __disable_runtime(struct rq *rq) -{ - struct root_domain *rd =3D rq->rd; - rt_rq_iter_t iter; - struct rt_rq *rt_rq; - - if (unlikely(!scheduler_running)) - return; - - for_each_rt_rq(rt_rq, iter, rq) { - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - s64 want; - int i; - - raw_spin_lock(&rt_b->rt_runtime_lock); - raw_spin_lock(&rt_rq->rt_runtime_lock); - /* - * Either we're all inf and nobody needs to borrow, or we're - * already disabled and thus have nothing to do, or we have - * exactly the right amount of runtime to take out. 
- */ - if (rt_rq->rt_runtime =3D=3D RUNTIME_INF || - rt_rq->rt_runtime =3D=3D rt_b->rt_runtime) - goto balanced; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - - /* - * Calculate the difference between what we started out with - * and what we current have, that's the amount of runtime - * we lend and now have to reclaim. - */ - want =3D rt_b->rt_runtime - rt_rq->rt_runtime; - - /* - * Greedy reclaim, take back as much as we can. - */ - for_each_cpu(i, rd->span) { - struct rt_rq *iter =3D sched_rt_period_rt_rq(rt_b, i); - s64 diff; - - /* - * Can't reclaim from ourselves or disabled runqueues. - */ - if (iter =3D=3D rt_rq || iter->rt_runtime =3D=3D RUNTIME_INF) - continue; - - raw_spin_lock(&iter->rt_runtime_lock); - if (want > 0) { - diff =3D min_t(s64, iter->rt_runtime, want); - iter->rt_runtime -=3D diff; - want -=3D diff; - } else { - iter->rt_runtime -=3D want; - want -=3D want; - } - raw_spin_unlock(&iter->rt_runtime_lock); - - if (!want) - break; - } - - raw_spin_lock(&rt_rq->rt_runtime_lock); - /* - * We cannot be left wanting - that would mean some runtime - * leaked out of the system. - */ - WARN_ON_ONCE(want); -balanced: - /* - * Disable all the borrow logic by pretending we have inf - * runtime - in which case borrowing doesn't make sense. 
- */ - rt_rq->rt_runtime =3D RUNTIME_INF; - rt_rq->rt_throttled =3D 0; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - raw_spin_unlock(&rt_b->rt_runtime_lock); - - /* Make rt_rq available for pick_next_task() */ - sched_rt_rq_enqueue(rt_rq); - } -} - -static void __enable_runtime(struct rq *rq) -{ - rt_rq_iter_t iter; - struct rt_rq *rt_rq; - - if (unlikely(!scheduler_running)) - return; - - /* - * Reset each runqueue's bandwidth settings - */ - for_each_rt_rq(rt_rq, iter, rq) { - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - - raw_spin_lock(&rt_b->rt_runtime_lock); - raw_spin_lock(&rt_rq->rt_runtime_lock); - rt_rq->rt_runtime =3D rt_b->rt_runtime; - rt_rq->rt_time =3D 0; - rt_rq->rt_throttled =3D 0; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - raw_spin_unlock(&rt_b->rt_runtime_lock); - } -} - -static void balance_runtime(struct rt_rq *rt_rq) -{ - if (!sched_feat(RT_RUNTIME_SHARE)) - return; - - if (rt_rq->rt_time > rt_rq->rt_runtime) { - raw_spin_unlock(&rt_rq->rt_runtime_lock); - do_balance_runtime(rt_rq); - raw_spin_lock(&rt_rq->rt_runtime_lock); - } -} - -static int do_sched_rt_period_timer(struct rt_bandwidth *rt_b, int overrun) -{ - int i, idle =3D 1, throttled =3D 0; - const struct cpumask *span; - - span =3D sched_rt_period_mask(); - - /* - * FIXME: isolated CPUs should really leave the root task group, - * whether they are isolcpus or were isolated via cpusets, lest - * the timer run on a CPU which does not service all runqueues, - * potentially leaving other CPUs indefinitely throttled. If - * isolation is really required, the user will turn the throttle - * off to kill the perturbations it causes anyway. Meanwhile, - * this maintains functionality for boot and/or troubleshooting. 
- */ - if (rt_b =3D=3D &root_task_group.rt_bandwidth) - span =3D cpu_online_mask; - - for_each_cpu(i, span) { - int enqueue =3D 0; - struct rt_rq *rt_rq =3D sched_rt_period_rt_rq(rt_b, i); - struct rq *rq =3D rq_of_rt_rq(rt_rq); - struct rq_flags rf; - int skip; - - /* - * When span =3D=3D cpu_online_mask, taking each rq->lock - * can be time-consuming. Try to avoid it when possible. - */ - raw_spin_lock(&rt_rq->rt_runtime_lock); - if (!sched_feat(RT_RUNTIME_SHARE) && rt_rq->rt_runtime !=3D RUNTIME_INF) - rt_rq->rt_runtime =3D rt_b->rt_runtime; - skip =3D !rt_rq->rt_time && !rt_rq->rt_nr_running; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - if (skip) - continue; - - rq_lock(rq, &rf); - update_rq_clock(rq); - - if (rt_rq->rt_time) { - u64 runtime; - - raw_spin_lock(&rt_rq->rt_runtime_lock); - if (rt_rq->rt_throttled) - balance_runtime(rt_rq); - runtime =3D rt_rq->rt_runtime; - rt_rq->rt_time -=3D min(rt_rq->rt_time, overrun*runtime); - if (rt_rq->rt_throttled && rt_rq->rt_time < runtime) { - rt_rq->rt_throttled =3D 0; - enqueue =3D 1; - - /* - * When we're idle and a woken (rt) task is - * throttled wakeup_preempt() will set - * skip_update and the time between the wakeup - * and this unthrottle will get accounted as - * 'runtime'. 
- */ - if (rt_rq->rt_nr_running && rq->curr =3D=3D rq->idle) - rq_clock_cancel_skipupdate(rq); - } - if (rt_rq->rt_time || rt_rq->rt_nr_running) - idle =3D 0; - raw_spin_unlock(&rt_rq->rt_runtime_lock); - } else if (rt_rq->rt_nr_running) { - idle =3D 0; - if (!rt_rq_throttled(rt_rq)) - enqueue =3D 1; - } - if (rt_rq->rt_throttled) - throttled =3D 1; - - if (enqueue) - sched_rt_rq_enqueue(rt_rq); - rq_unlock(rq, &rf); - } - - if (!throttled && (!rt_bandwidth_enabled() || rt_b->rt_runtime =3D=3D RUN= TIME_INF)) - return 1; - - return idle; -} - -static int sched_rt_runtime_exceeded(struct rt_rq *rt_rq) -{ - u64 runtime =3D sched_rt_runtime(rt_rq); - - if (rt_rq->rt_throttled) - return rt_rq_throttled(rt_rq); - - if (runtime >=3D sched_rt_period(rt_rq)) - return 0; - - balance_runtime(rt_rq); - runtime =3D sched_rt_runtime(rt_rq); - if (runtime =3D=3D RUNTIME_INF) - return 0; - - if (rt_rq->rt_time > runtime) { - struct rt_bandwidth *rt_b =3D sched_rt_bandwidth(rt_rq); - - /* - * Don't actually throttle groups that have no runtime assigned - * but accrue some time due to boosting. - */ - if (likely(rt_b->rt_runtime)) { - rt_rq->rt_throttled =3D 1; - printk_deferred_once("sched: RT throttling activated\n"); - } else { - /* - * In case we did anyway, make it go away, - * replenishment is a joke, since it will replenish us - * with exactly 0 ns. 
- */ - rt_rq->rt_time =3D 0; - } - - if (rt_rq_throttled(rt_rq)) { - sched_rt_rq_dequeue(rt_rq); - return 1; - } - } - - return 0; -} - -#else /* !CONFIG_RT_GROUP_SCHED: */ +#else /* !CONFIG_RT_GROUP_SCHED */ =20 typedef struct rt_rq *rt_rq_iter_t; =20 #define for_each_rt_rq(rt_rq, iter, rq) \ for ((void) iter, rt_rq =3D &rq->rt; rt_rq; rt_rq =3D NULL) =20 -#define for_each_sched_rt_entity(rt_se) \ - for (; rt_se; rt_se =3D NULL) - -static inline struct rt_rq *group_rt_rq(struct sched_rt_entity *rt_se) -{ - return NULL; -} - -static inline void sched_rt_rq_enqueue(struct rt_rq *rt_rq) -{ - struct rq *rq =3D rq_of_rt_rq(rt_rq); - - if (!rt_rq->rt_nr_running) - return; - - enqueue_top_rt_rq(rt_rq); - resched_curr(rq); -} - -static inline void sched_rt_rq_dequeue(struct rt_rq *rt_rq) -{ - dequeue_top_rt_rq(rt_rq, rt_rq->rt_nr_running); -} - -static inline int rt_rq_throttled(struct rt_rq *rt_rq) -{ - return false; -} - -static inline const struct cpumask *sched_rt_period_mask(void) -{ - return cpu_online_mask; -} - -static inline -struct rt_rq *sched_rt_period_rt_rq(struct rt_bandwidth *rt_b, int cpu) -{ - return &cpu_rq(cpu)->rt; -} - -static void __enable_runtime(struct rq *rq) { } -static void __disable_runtime(struct rq *rq) { } - -#endif /* !CONFIG_RT_GROUP_SCHED */ +#endif /* CONFIG_RT_GROUP_SCHED */ =20 static inline int rt_se_prio(struct sched_rt_entity *rt_se) { -#ifdef CONFIG_RT_GROUP_SCHED - struct rt_rq *rt_rq =3D group_rt_rq(rt_se); - - if (rt_rq) - return rt_rq->highest_prio.curr; -#endif - return rt_task_of(rt_se)->prio; } =20 @@ -943,67 +350,8 @@ static void update_curr_rt(struct rq *rq) if (unlikely(delta_exec <=3D 0)) return; =20 -#ifdef CONFIG_RT_GROUP_SCHED - struct sched_rt_entity *rt_se =3D &donor->rt; - if (!rt_bandwidth_enabled()) return; - - for_each_sched_rt_entity(rt_se) { - struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); - int exceeded; - - if (sched_rt_runtime(rt_rq) !=3D RUNTIME_INF) { - raw_spin_lock(&rt_rq->rt_runtime_lock); - 
rt_rq->rt_time +=3D delta_exec; - exceeded =3D sched_rt_runtime_exceeded(rt_rq); - if (exceeded) - resched_curr(rq); - raw_spin_unlock(&rt_rq->rt_runtime_lock); - if (exceeded) - do_start_rt_bandwidth(sched_rt_bandwidth(rt_rq)); - } - } -#endif /* CONFIG_RT_GROUP_SCHED */ -} - -static void -dequeue_top_rt_rq(struct rt_rq *rt_rq, unsigned int count) -{ - struct rq *rq =3D rq_of_rt_rq(rt_rq); - - BUG_ON(&rq->rt !=3D rt_rq); - - if (!rt_rq->rt_queued) - return; - - BUG_ON(!rq->nr_running); - - sub_nr_running(rq, count); - rt_rq->rt_queued =3D 0; - -} - -static void -enqueue_top_rt_rq(struct rt_rq *rt_rq) -{ - struct rq *rq =3D rq_of_rt_rq(rt_rq); - - BUG_ON(&rq->rt !=3D rt_rq); - - if (rt_rq->rt_queued) - return; - - if (rt_rq_throttled(rt_rq)) - return; - - if (rt_rq->rt_nr_running) { - add_nr_running(rq, rt_rq->rt_nr_running); - rt_rq->rt_queued =3D 1; - } - - /* Kick cpufreq (see the comment in kernel/sched/sched.h). */ - cpufreq_update_util(rq, 0); } =20 static void @@ -1074,58 +422,11 @@ dec_rt_prio(struct rt_rq *rt_rq, int prio) dec_rt_prio_smp(rt_rq, prio, prev_prio); } =20 -#ifdef CONFIG_RT_GROUP_SCHED - -static void -inc_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) -{ - if (rt_se_boosted(rt_se)) - rt_rq->rt_nr_boosted++; - - start_rt_bandwidth(&rt_rq->tg->rt_bandwidth); -} - -static void -dec_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) -{ - if (rt_se_boosted(rt_se)) - rt_rq->rt_nr_boosted--; - - WARN_ON(!rt_rq->rt_nr_running && rt_rq->rt_nr_boosted); -} - -#else /* !CONFIG_RT_GROUP_SCHED: */ - -static void -inc_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) -{ -} - -static inline -void dec_rt_group(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) {} - -#endif /* !CONFIG_RT_GROUP_SCHED */ - static inline -unsigned int rt_se_nr_running(struct sched_rt_entity *rt_se) +unsigned int is_rr_task(struct sched_rt_entity *rt_se) { - struct rt_rq *group_rq =3D group_rt_rq(rt_se); - - if (group_rq) - return 
group_rq->rt_nr_running; - else - return 1; -} - -static inline -unsigned int rt_se_rr_nr_running(struct sched_rt_entity *rt_se) -{ - struct rt_rq *group_rq =3D group_rt_rq(rt_se); struct task_struct *tsk; =20 - if (group_rq) - return group_rq->rr_nr_running; - tsk =3D rt_task_of(rt_se); =20 return (tsk->policy =3D=3D SCHED_RR) ? 1 : 0; @@ -1134,26 +435,21 @@ unsigned int rt_se_rr_nr_running(struct sched_rt_ent= ity *rt_se) static inline void inc_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) { - int prio =3D rt_se_prio(rt_se); - - WARN_ON(!rt_prio(prio)); - rt_rq->rt_nr_running +=3D rt_se_nr_running(rt_se); - rt_rq->rr_nr_running +=3D rt_se_rr_nr_running(rt_se); + WARN_ON(!rt_prio(rt_se_prio(rt_se))); + rt_rq->rt_nr_running +=3D 1; + rt_rq->rr_nr_running +=3D is_rr_task(rt_se); =20 - inc_rt_prio(rt_rq, prio); - inc_rt_group(rt_se, rt_rq); + inc_rt_prio(rt_rq, rt_se_prio(rt_se)); } =20 static inline void dec_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) { WARN_ON(!rt_prio(rt_se_prio(rt_se))); - WARN_ON(!rt_rq->rt_nr_running); - rt_rq->rt_nr_running -=3D rt_se_nr_running(rt_se); - rt_rq->rr_nr_running -=3D rt_se_rr_nr_running(rt_se); + rt_rq->rt_nr_running -=3D 1; + rt_rq->rr_nr_running -=3D is_rr_task(rt_se); =20 dec_rt_prio(rt_rq, rt_se_prio(rt_se)); - dec_rt_group(rt_se, rt_rq); } =20 /* @@ -1182,10 +478,6 @@ static void __delist_rt_entity(struct sched_rt_entity= *rt_se, struct rt_prio_arr static inline struct sched_statistics * __schedstats_from_rt_se(struct sched_rt_entity *rt_se) { - /* schedstats is not supported for rt group. 
*/ - if (!rt_entity_is_task(rt_se)) - return NULL; - return &rt_task_of(rt_se)->stats; } =20 @@ -1198,9 +490,7 @@ update_stats_wait_start_rt(struct rt_rq *rt_rq, struct= sched_rt_entity *rt_se) if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); stats =3D __schedstats_from_rt_se(rt_se); if (!stats) return; @@ -1217,9 +507,7 @@ update_stats_enqueue_sleeper_rt(struct rt_rq *rt_rq, s= truct sched_rt_entity *rt_ if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); stats =3D __schedstats_from_rt_se(rt_se); if (!stats) return; @@ -1247,9 +535,7 @@ update_stats_wait_end_rt(struct rt_rq *rt_rq, struct s= ched_rt_entity *rt_se) if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) - p =3D rt_task_of(rt_se); - + p =3D rt_task_of(rt_se); stats =3D __schedstats_from_rt_se(rt_se); if (!stats) return; @@ -1267,12 +553,10 @@ update_stats_dequeue_rt(struct rt_rq *rt_rq, struct = sched_rt_entity *rt_se, if (!schedstat_enabled()) return; =20 - if (rt_entity_is_task(rt_se)) { - p =3D rt_task_of(rt_se); + p =3D rt_task_of(rt_se); =20 - if (p !=3D rq->curr) - update_stats_wait_end_rt(rt_rq, rt_se); - } + if (p !=3D rq->curr) + update_stats_wait_end_rt(rt_rq, rt_se); =20 if ((flags & DEQUEUE_SLEEP) && p) { unsigned int state; @@ -1292,21 +576,8 @@ static void __enqueue_rt_entity(struct sched_rt_entit= y *rt_se, unsigned int flag { struct rt_rq *rt_rq =3D rt_rq_of_se(rt_se); struct rt_prio_array *array =3D &rt_rq->active; - struct rt_rq *group_rq =3D group_rt_rq(rt_se); struct list_head *queue =3D array->queue + rt_se_prio(rt_se); =20 - /* - * Don't enqueue the group if its throttled, or when empty. - * The latter is a consequence of the former when a child group - * get throttled and the current group doesn't have any other - * active members. 
-	 */
-	if (group_rq && (rt_rq_throttled(group_rq) || !group_rq->rt_nr_running)) {
-		if (rt_se->on_list)
-			__delist_rt_entity(rt_se, array);
-		return;
-	}
-
 	if (move_entity(flags)) {
 		WARN_ON_ONCE(rt_se->on_list);
 		if (flags & ENQUEUE_HEAD)
@@ -1336,57 +607,18 @@ static void __dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flag
 	dec_rt_tasks(rt_se, rt_rq);
 }
 
-/*
- * Because the prio of an upper entry depends on the lower
- * entries, we must remove entries top - down.
- */
-static void dequeue_rt_stack(struct sched_rt_entity *rt_se, unsigned int flags)
-{
-	struct sched_rt_entity *back = NULL;
-	unsigned int rt_nr_running;
-
-	for_each_sched_rt_entity(rt_se) {
-		rt_se->back = back;
-		back = rt_se;
-	}
-
-	rt_nr_running = rt_rq_of_se(back)->rt_nr_running;
-
-	for (rt_se = back; rt_se; rt_se = rt_se->back) {
-		if (on_rt_rq(rt_se))
-			__dequeue_rt_entity(rt_se, flags);
-	}
-
-	dequeue_top_rt_rq(rt_rq_of_se(back), rt_nr_running);
-}
-
 static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
-	struct rq *rq = rq_of_rt_se(rt_se);
-
 	update_stats_enqueue_rt(rt_rq_of_se(rt_se), rt_se, flags);
 
-	dequeue_rt_stack(rt_se, flags);
-	for_each_sched_rt_entity(rt_se)
-		__enqueue_rt_entity(rt_se, flags);
-	enqueue_top_rt_rq(&rq->rt);
+	__enqueue_rt_entity(rt_se, flags);
 }
 
 static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
-	struct rq *rq = rq_of_rt_se(rt_se);
-
 	update_stats_dequeue_rt(rt_rq_of_se(rt_se), rt_se, flags);
 
-	dequeue_rt_stack(rt_se, flags);
-
-	for_each_sched_rt_entity(rt_se) {
-		struct rt_rq *rt_rq = group_rt_rq(rt_se);
-
-		if (rt_rq && rt_rq->rt_nr_running)
-			__enqueue_rt_entity(rt_se, flags);
-	}
-	enqueue_top_rt_rq(&rq->rt);
+	__dequeue_rt_entity(rt_se, flags);
 }
 
 /*
@@ -1446,13 +678,7 @@ requeue_rt_entity(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se, int head)
 
 static void requeue_task_rt(struct rq *rq, struct task_struct *p, int head)
 {
-	struct sched_rt_entity *rt_se = &p->rt;
-	struct rt_rq *rt_rq;
-
-	for_each_sched_rt_entity(rt_se) {
-		rt_rq = rt_rq_of_se(rt_se);
-		requeue_rt_entity(rt_rq, rt_se, head);
-	}
+	requeue_rt_entity(rt_rq_of_se(&p->rt), &p->rt, head);
 }
 
 static void yield_task_rt(struct rq *rq)
@@ -1653,21 +879,6 @@ static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq)
 	return next;
 }
 
-static struct task_struct *_pick_next_task_rt(struct rq *rq)
-{
-	struct sched_rt_entity *rt_se;
-	struct rt_rq *rt_rq = &rq->rt;
-
-	do {
-		rt_se = pick_next_rt_entity(rt_rq);
-		if (unlikely(!rt_se))
-			return NULL;
-		rt_rq = group_rt_rq(rt_se);
-	} while (rt_rq);
-
-	return rt_task_of(rt_se);
-}
-
 static struct task_struct *pick_task_rt(struct rq *rq, struct rq_flags *rf)
 {
 	struct task_struct *p;
@@ -1675,7 +886,7 @@ static struct task_struct *pick_task_rt(struct rq *rq, struct rq_flags *rf)
 	if (!sched_rt_runnable(rq))
 		return NULL;
 
-	p = _pick_next_task_rt(rq);
+	p = rt_task_of(pick_next_rt_entity(&rq->rt));
 
 	return p;
 }
@@ -2358,8 +1569,6 @@ static void rq_online_rt(struct rq *rq)
 	if (rq->rt.overloaded)
 		rt_set_overload(rq);
 
-	__enable_runtime(rq);
-
 	cpupri_set(&rq->rd->cpupri, rq->cpu, rq->rt.highest_prio.curr);
 }
 
@@ -2369,8 +1578,6 @@ static void rq_offline_rt(struct rq *rq)
 	if (rq->rt.overloaded)
 		rt_clear_overload(rq);
 
-	__disable_runtime(rq);
-
 	cpupri_set(&rq->rd->cpupri, rq->cpu, CPUPRI_INVALID);
 }
 
@@ -2531,12 +1738,10 @@ static void task_tick_rt(struct rq *rq, struct task_struct *p, int queued)
 	 * Requeue to the end of queue if we (and all of our ancestors) are not
 	 * the only element on the queue
 	 */
-	for_each_sched_rt_entity(rt_se) {
-		if (rt_se->run_list.prev != rt_se->run_list.next) {
-			requeue_task_rt(rq, p, 0);
-			resched_curr(rq);
-			return;
-		}
+	if (rt_se->run_list.prev != rt_se->run_list.next) {
+		requeue_task_rt(rq, p, 0);
+		resched_curr(rq);
+		return;
 	}
 }
 
@@ -2554,16 +1759,7 @@ static unsigned int get_rr_interval_rt(struct rq *rq, struct task_struct *task)
 #ifdef CONFIG_SCHED_CORE
 static int task_is_throttled_rt(struct task_struct *p, int cpu)
 {
-	struct rt_rq *rt_rq;
-
-#ifdef CONFIG_RT_GROUP_SCHED // XXX maybe add task_rt_rq(), see also sched_rt_period_rt_rq
-	rt_rq = task_group(p)->rt_rq[cpu];
-	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
-#else
-	rt_rq = &cpu_rq(cpu)->rt;
-#endif
-
-	return rt_rq_throttled(rt_rq);
+	return 0;
 }
 #endif /* CONFIG_SCHED_CORE */
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a03866f68a3b..a217c4ab6660 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -827,7 +827,7 @@ struct scx_rq {
 
 static inline int rt_bandwidth_enabled(void)
 {
-	return sysctl_sched_rt_runtime >= 0;
+	return 0;
 }
 
 /* RT IPI pull logic requires IRQ_WORK */
@@ -867,7 +867,7 @@ struct rt_rq {
 
 static inline bool rt_rq_is_runnable(struct rt_rq *rt_rq)
 {
-	return rt_rq->rt_queued && rt_rq->rt_nr_running;
+	return rt_rq->rt_nr_running;
 }
 
 /* Deadline class' related fields in a runqueue */
@@ -2794,7 +2794,7 @@ static inline bool sched_dl_runnable(struct rq *rq)
 
 static inline bool sched_rt_runnable(struct rq *rq)
 {
-	return rq->rt.rt_queued > 0;
+	return rq->rt.rt_nr_running > 0;
 }
 
 static inline bool sched_fair_runnable(struct rq *rq)
@@ -2906,9 +2906,6 @@ extern void resched_curr(struct rq *rq);
 extern void resched_curr_lazy(struct rq *rq);
 extern void resched_cpu(int cpu);
 
-extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime);
-extern bool sched_rt_bandwidth_account(struct rt_rq *rt_rq);
-
 extern void init_dl_entity(struct sched_dl_entity *dl_se);
 
 extern void init_cfs_throttle_work(struct task_struct *p);
@@ -3333,12 +3330,8 @@ extern void set_rq_offline(struct rq *rq);
 extern bool sched_smp_initialized;
 
 #ifdef CONFIG_RT_GROUP_SCHED
-#define rt_entity_is_task(rt_se) (!(rt_se)->my_q)
-
 static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
 {
-	WARN_ON_ONCE(!rt_entity_is_task(rt_se));
-
 	return container_of_const(rt_se, struct task_struct, rt);
 }
 
@@ -3354,17 +3347,7 @@ static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 	WARN_ON(!rt_group_sched_enabled() && rt_se->rt_rq->tg != &root_task_group);
 	return rt_se->rt_rq;
 }
-
-static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se)
-{
-	struct rt_rq *rt_rq = rt_se->rt_rq;
-
-	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
-	return rt_rq->rq;
-}
 #else
-#define rt_entity_is_task(rt_se) (1)
-
 static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
 {
 	return container_of_const(rt_se, struct task_struct, rt);
@@ -3375,16 +3358,9 @@ static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
 	return container_of_const(rt_rq, struct rq, rt);
 }
 
-static inline struct rq *rq_of_rt_se(struct sched_rt_entity *rt_se)
-{
-	struct task_struct *p = rt_task_of(rt_se);
-
-	return task_rq(p);
-}
-
 static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 {
-	struct rq *rq = rq_of_rt_se(rt_se);
+	struct rq *rq = task_rq(rt_task_of(rt_se));
 
 	return &rq->rt;
 }
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index b215b0ead9a6..9c1ba10ea5a7 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -606,19 +606,6 @@ int __sched_setscheduler(struct task_struct *p,
 change:
 
 	if (user) {
-#ifdef CONFIG_RT_GROUP_SCHED
-		/*
-		 * Do not allow real-time tasks into groups that have no runtime
-		 * assigned.
-		 */
-		if (rt_group_sched_enabled() &&
-				rt_bandwidth_enabled() && rt_policy(policy) &&
-				task_group(p)->rt_bandwidth.rt_runtime == 0 &&
-				!task_group_is_autogroup(task_group(p))) {
-			retval = -EPERM;
-			goto unlock;
-		}
-#endif /* CONFIG_RT_GROUP_SCHED */
 		if (dl_bandwidth_enabled() && dl_policy(policy) &&
 				!(attr->sched_flags & SCHED_FLAG_SUGOV)) {
 			cpumask_t *span = rq->rd->span;
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 08/25] sched/rt: Remove unnecessary runqueue pointer in struct rt_rq
Date: Mon, 8 Jun 2026 14:15:27 +0200
Message-ID: <20260608121546.69910-9-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Remove the rq field in struct rt_rq. The field now merely caches the
pointer to the global runqueue of the given rt_rq, so it is unnecessary:
the global runqueue can be retrieved in other ways.

Introduce global_rq_of_rt_rq to retrieve the global runqueue which serves
a rt_rq's dl_server.
Rework rq_of_rt_rq to retrieve the runqueue a rt_rq is serving.

Signed-off-by: Yuri Andriaccio
---
 kernel/sched/rt.c    | 21 +++++++++------------
 kernel/sched/sched.h | 16 ++++++++++++----
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7b526a86083c..4575c234ae46 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -101,10 +101,7 @@ void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq,
 		      struct sched_rt_entity *rt_se, int cpu,
 		      struct sched_rt_entity *parent)
 {
-	struct rq *rq = cpu_rq(cpu);
-
 	rt_rq->highest_prio.curr = MAX_RT_PRIO-1;
-	rt_rq->rq = rq;
 	rt_rq->tg = tg;
 
 	tg->rt_rq[cpu] = rt_rq;
@@ -184,7 +181,7 @@ static void pull_rt_task(struct rq *);
 
 static inline void rt_queue_push_tasks(struct rt_rq *rt_rq)
 {
-	struct rq *rq = container_of_const(rt_rq, struct rq, rt);
+	struct rq *rq = global_rq_of_rt_rq(rt_rq);
 
 	if (!has_pushable_tasks(rt_rq))
 		return;
@@ -194,7 +191,7 @@ static inline void rt_queue_push_tasks(struct rt_rq *rt_rq)
 
 static inline void rt_queue_pull_task(struct rt_rq *rt_rq)
 {
-	struct rq *rq = container_of_const(rt_rq, struct rq, rt);
+	struct rq *rq = global_rq_of_rt_rq(rt_rq);
 
 	queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task);
 }
@@ -222,7 +219,7 @@ static void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_struct *p)
 		rt_rq->highest_prio.next = p->prio;
 
 	if (!rt_rq->overloaded) {
-		rt_set_overload(rq_of_rt_rq(rt_rq));
+		rt_set_overload(global_rq_of_rt_rq(rt_rq));
 		rt_rq->overloaded = 1;
 	}
 }
@@ -240,7 +237,7 @@ static void dequeue_pushable_task(struct rt_rq *rt_rq, struct task_struct *p)
 		rt_rq->highest_prio.next = MAX_RT_PRIO-1;
 
 		if (rt_rq->overloaded) {
-			rt_clear_overload(rq_of_rt_rq(rt_rq));
+			rt_clear_overload(global_rq_of_rt_rq(rt_rq));
 			rt_rq->overloaded = 0;
 		}
 	}
@@ -495,7 +492,7 @@ update_stats_wait_start_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
 	if (!stats)
 		return;
 
-	__update_stats_wait_start(rq_of_rt_rq(rt_rq), p, stats);
+	__update_stats_wait_start(global_rq_of_rt_rq(rt_rq), p, stats);
 }
 
 static inline void
@@ -512,7 +509,7 @@ update_stats_enqueue_sleeper_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_
 	if (!stats)
 		return;
 
-	__update_stats_enqueue_sleeper(rq_of_rt_rq(rt_rq), p, stats);
+	__update_stats_enqueue_sleeper(global_rq_of_rt_rq(rt_rq), p, stats);
 }
 
 static inline void
@@ -540,7 +537,7 @@ update_stats_wait_end_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
 	if (!stats)
 		return;
 
-	__update_stats_wait_end(rq_of_rt_rq(rt_rq), p, stats);
+	__update_stats_wait_end(global_rq_of_rt_rq(rt_rq), p, stats);
 }
 
 static inline void
@@ -564,11 +561,11 @@ update_stats_dequeue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
 		state = READ_ONCE(p->__state);
 		if (state & TASK_INTERRUPTIBLE)
 			__schedstat_set(p->stats.sleep_start,
-					rq_clock(rq_of_rt_rq(rt_rq)));
+					rq_clock(global_rq_of_rt_rq(rt_rq)));
 
 		if (state & TASK_UNINTERRUPTIBLE)
 			__schedstat_set(p->stats.block_start,
-					rq_clock(rq_of_rt_rq(rt_rq)));
+					rq_clock(global_rq_of_rt_rq(rt_rq)));
 	}
 }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a217c4ab6660..3aa29fe932fc 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -857,8 +857,6 @@ struct rt_rq {
 	raw_spinlock_t		rt_runtime_lock;
 
 	unsigned int		rt_nr_boosted;
-
-	struct rq		*rq; /* this is always top-level rq, cache? */
 #endif
 #ifdef CONFIG_CGROUP_SCHED
 	struct task_group	*tg; /* this tg has "this" rt_rq on given CPU for runnable entities */
@@ -3337,9 +3335,14 @@ static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
 
 static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
 {
-	/* Cannot fold with non-CONFIG_RT_GROUP_SCHED version, layout */
 	WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group);
-	return rt_rq->rq;
+	return container_of_const(rt_rq, struct rq, rt);
+}
+
+static inline struct rq *global_rq_of_rt_rq(struct rt_rq *rt_rq)
+{
+	/* Cannot fold with non-CONFIG_RT_GROUP_SCHED version, layout */
+	return cpu_rq(rq_of_rt_rq(rt_rq)->cpu);
 }
 
 static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
@@ -3358,6 +3361,11 @@ static inline struct rq *rq_of_rt_rq(struct rt_rq *rt_rq)
 	return container_of_const(rt_rq, struct rq, rt);
 }
 
+static inline struct rq *global_rq_of_rt_rq(struct rt_rq *rt_rq)
+{
+	return container_of_const(rt_rq, struct rq, rt);
+}
+
 static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 {
 	struct rq *rq = task_rq(rt_task_of(rt_se));
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 09/25] sched/rt: Introduce HCBS specific structs in task_group
Date: Mon, 8 Jun 2026 14:15:28 +0200
Message-ID: <20260608121546.69910-10-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Add an array of sched_dl_entity objects in task_group.
Create the dl_bandwidth struct and add a field for it in task_group.
Add a rq pointer field in struct rt_rq.

---

For each CPU on the host system, the task_group manages a sched_dl_entity
and a rt_rq object, which in turn keeps a pointer to its locally managed
runqueue. The sched_dl_entity object manages the deadline server which
will be scheduled for execution on the CPU, while the rt_rq object
references the local runqueue's data and entities and is used to select
the actual task to run once the CPU is given to the dl_server.

The dl_bandwidth object keeps track of the bandwidth currently allocated
to the cgroup and of the currently active context. RT-cgroups can either
run tasks themselves or delegate the scheduling of their tasks to their
parent; the active_context field keeps track of which cgroup is serving
the tasks.
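
The layout described above can be pictured with a small userspace model. This is only a sketch: every *_model type and every tg_model_* helper below is a hypothetical stand-in, not kernel code. The dl_bandwidth field names follow the struct added by this patch (minus the spinlock), and the policy that a freshly initialized group serves itself is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4			/* hypothetical CPU count */

struct task_group_model;

struct rq_model { int cpu; };		/* stand-in for struct rq */

/* Stand-in for struct sched_dl_entity: one deadline server per CPU. */
struct dl_entity_model { uint64_t dl_runtime, dl_period; };

/* Stand-in for struct rt_rq: points at the runqueue it serves. */
struct rt_rq_model {
	struct task_group_model *tg;
	struct rq_model *rq;
};

/* Same fields as struct dl_bandwidth in the patch, without the lock. */
struct dl_bandwidth_model {
	uint64_t dl_runtime;
	uint64_t dl_internal_runtime;
	uint64_t dl_period;
	struct task_group_model *active_context;
};

struct task_group_model {
	struct task_group_model *parent;
	struct dl_entity_model *dl_se[NR_CPUS_MODEL];	/* per-CPU servers */
	struct rt_rq_model *rt_rq[NR_CPUS_MODEL];	/* per-CPU runqueues */
	struct dl_bandwidth_model dl_bandwidth;		/* shared by all servers */
};

/* Assumption: a new group starts out serving its own tasks. */
void tg_model_init(struct task_group_model *tg, struct task_group_model *parent,
		   uint64_t period, uint64_t runtime)
{
	assert(tg != NULL);
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		tg->dl_se[cpu] = NULL;
		tg->rt_rq[cpu] = NULL;
	}
	tg->parent = parent;
	tg->dl_bandwidth.dl_runtime = runtime;
	tg->dl_bandwidth.dl_internal_runtime = 0;
	tg->dl_bandwidth.dl_period = period;
	tg->dl_bandwidth.active_context = tg;
}

/* Hand the scheduling of this group's tasks over to its parent, if any. */
void tg_model_delegate_to_parent(struct task_group_model *tg)
{
	assert(tg != NULL);
	if (tg->parent)
		tg->dl_bandwidth.active_context = tg->parent;
}

/* Which group is currently serving this group's tasks? */
struct task_group_model *tg_model_server(const struct task_group_model *tg)
{
	assert(tg != NULL);
	return tg->dl_bandwidth.active_context;
}
```

The point of the model is that the bandwidth object is per group while the servers and runqueues are per CPU, so a single active_context pointer is enough to redirect all of a group's CPUs at once.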

Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/sched.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3aa29fe932fc..f3c259ab9344 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -322,6 +322,15 @@ struct rt_bandwidth {
 	unsigned int		rt_period_active;
 };
 
+struct dl_bandwidth {
+	raw_spinlock_t		dl_runtime_lock;
+	u64			dl_runtime;
+	u64			dl_internal_runtime;
+	u64			dl_period;
+	struct task_group	*active_context;
+};
+
+
 static inline int dl_bandwidth_enabled(void)
 {
 	return sysctl_sched_rt_runtime >= 0;
@@ -495,10 +504,17 @@ struct task_group {
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 
 #ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Each task group manages a different scheduling entity per CPU, i.e. a
+	 * different deadline server, and a runqueue per CPU. All the dl-servers
+	 * share the same dl_bandwidth object.
+	 */
 	struct sched_rt_entity	**rt_se;
+	struct sched_dl_entity	**dl_se;
 	struct rt_rq		**rt_rq;
 
 	struct rt_bandwidth	rt_bandwidth;
+	struct dl_bandwidth	dl_bandwidth;
 #endif
 
 	struct scx_task_group	scx;
@@ -861,6 +877,12 @@ struct rt_rq {
 #ifdef CONFIG_CGROUP_SCHED
 	struct task_group	*tg; /* this tg has "this" rt_rq on given CPU for runnable entities */
 #endif
+
+	/*
+	 * The cgroup's served runqueue if the rt_rq entity belongs to a cgroup,
+	 * otherwise the top-level global runqueue.
+	 */
+	struct rq		*rq;
 };
 
 static inline bool rt_rq_is_runnable(struct rt_rq *rt_rq)
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 10/25] sched/core: Initialize HCBS specific structures.
Date: Mon, 8 Jun 2026 14:15:29 +0200
Message-ID: <20260608121546.69910-11-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Update autogroups' creation/destruction to use the new data structures.
Initialize the default bandwidth for rt-cgroups (sched_init).
Initialize rt-scheduler's specific data structures for the root control
group (sched_init).
Remove init_tg_rt_entry in favour of manual setup of the necessary data
structures in sched_init.
Add utility functions to check whether a rt_rq is connected to a
rt-cgroup (is_dl_group) and to get that group's scheduling entity
(dl_group_of).
Add read/write accessors for dl_bandwidth.
Add the dl_bw_lock_of_tg macro to reference a task group's dl_bandwidth
spinlock.
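
The locking rule behind the two accessors can be illustrated with a userspace sketch. In the patch, dl_bandwidth_read asserts through lockdep that at least one of rt_constraints_mutex and the group's dl_runtime_lock is held, while dl_bandwidth_write asserts that both are held. The model below is hypothetical throughout: struct guarded_bw, bw_read and bw_write are illustrative names, the two booleans stand in for real lock state, and returning NULL replaces the lockdep assertions of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct bw_model {
	uint64_t runtime;
	uint64_t period;
};

struct guarded_bw {
	bool mutex_held;	/* stands in for rt_constraints_mutex */
	bool spin_held;		/* stands in for the group's dl_runtime_lock */
	struct bw_model bw;
};

/* Read access: allowed while holding at least one of the two locks. */
const struct bw_model *bw_read(const struct guarded_bw *g)
{
	assert(g != NULL);
	if (!(g->mutex_held || g->spin_held))
		return NULL;	/* the kernel version trips a lockdep assert */
	return &g->bw;
}

/* Write access: allowed only while holding both locks. */
struct bw_model *bw_write(struct guarded_bw *g)
{
	assert(g != NULL);
	if (!(g->mutex_held && g->spin_held))
		return NULL;	/* the kernel version trips a lockdep assert */
	return &g->bw;
}
```

Because a writer must hold both locks, a reader holding either one of them cannot observe a concurrent update, which is presumably why the read accessor can return a const pointer under the weaker condition.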

Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/autogroup.c |  4 ++--
 kernel/sched/core.c      | 11 ++++++++--
 kernel/sched/deadline.c  | 11 ++++++++++
 kernel/sched/rt.c        | 45 ++++++++++++++++++++++++++++------------
 kernel/sched/sched.h     | 38 ++++++++++++++++++++++++++++++---
 5 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/kernel/sched/autogroup.c b/kernel/sched/autogroup.c
index e380cf9372bb..2122a0740a19 100644
--- a/kernel/sched/autogroup.c
+++ b/kernel/sched/autogroup.c
@@ -52,7 +52,7 @@ static inline void autogroup_destroy(struct kref *kref)
 
 #ifdef CONFIG_RT_GROUP_SCHED
 	/* We've redirected RT tasks to the root task group... */
-	ag->tg->rt_se = NULL;
+	ag->tg->dl_se = NULL;
 	ag->tg->rt_rq = NULL;
 #endif
 	sched_release_group(ag->tg);
@@ -109,7 +109,7 @@ static inline struct autogroup *autogroup_create(void)
 	 * the policy change to proceed.
 	 */
 	free_rt_sched_group(tg);
-	tg->rt_se = root_task_group.rt_se;
+	tg->dl_se = root_task_group.dl_se;
 	tg->rt_rq = root_task_group.rt_rq;
 #endif /* CONFIG_RT_GROUP_SCHED */
 	tg->autogroup = ag;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e38ca8192d2d..9e47a02cfaf7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8911,7 +8911,7 @@ void __init sched_init(void)
 		scx_tg_init(&root_task_group);
 #endif /* CONFIG_EXT_GROUP_SCHED */
 #ifdef CONFIG_RT_GROUP_SCHED
-		root_task_group.rt_se = (struct sched_rt_entity **)ptr;
+		root_task_group.dl_se = (struct sched_dl_entity **)ptr;
 		ptr += nr_cpu_ids * sizeof(void **);
 
 		root_task_group.rt_rq = (struct rt_rq **)ptr;
@@ -8922,6 +8922,11 @@ void __init sched_init(void)
 
 	init_defrootdomain();
 
+#ifdef CONFIG_RT_GROUP_SCHED
+	init_dl_bandwidth(&root_task_group.dl_bandwidth,
+			global_rt_period(), 0, &root_task_group);
+#endif /* CONFIG_RT_GROUP_SCHED */
+
 #ifdef CONFIG_CGROUP_SCHED
 	task_group_cache = KMEM_CACHE(task_group, 0);
 
@@ -8973,7 +8978,9 @@ void __init sched_init(void)
 		 * starts working after scheduler_running, which is not the case
 		 * yet.
 		 */
-		init_tg_rt_entry(&root_task_group, &rq->rt, NULL, i, NULL);
+		rq->rt.tg = &root_task_group;
+		root_task_group.rt_rq[i] = &rq->rt;
+		root_task_group.dl_se[i] = NULL;
 #endif
 		rq->next_class = &idle_sched_class;
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index c12882348a03..673c6f2b5ece 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -508,6 +508,17 @@ static inline int is_leftmost(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq
 
 static void init_dl_rq_bw_ratio(struct dl_rq *dl_rq);
 
+void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime,
+		       struct task_group *active_context)
+{
+	raw_spin_lock_init(&dl_b->dl_runtime_lock);
+	dl_b->dl_period = period;
+	dl_b->dl_runtime = runtime;
+	dl_b->dl_internal_runtime = 0;
+	dl_b->active_context = active_context;
+}
+
+
 void init_dl_bw(struct dl_bw *dl_b)
 {
 	raw_spin_lock_init(&dl_b->lock);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4575c234ae46..dbba7a57d6f1 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -86,26 +86,47 @@ void init_rt_rq(struct rt_rq *rt_rq)
 
 #ifdef CONFIG_RT_GROUP_SCHED
 
-void unregister_rt_sched_group(struct task_group *tg)
+DEFINE_MUTEX(rt_constraints_mutex);
+
+const struct dl_bandwidth *dl_bandwidth_read(struct task_group *tg)
 {
+	int held;
+
+	if (IS_ENABLED(CONFIG_LOCKDEP) && debug_locks) {
+		held = 0;
+		if (lockdep_is_held(&rt_constraints_mutex)) {
+			__assume_ctx_lock(&rt_constraints_mutex);
+			held = 1;
+		}
+
+		if (lockdep_is_held(dl_bw_lock_of_tg(tg))) {
+			__assume_ctx_lock(dl_bw_lock_of_tg(tg));
+			held = 1;
+		}
 
+		lockdep_assert(held);
+	}
+
+	return (const struct dl_bandwidth *)&tg->dl_bandwidth;
 }
 
-void free_rt_sched_group(struct task_group *tg)
+struct dl_bandwidth *dl_bandwidth_write(struct task_group *tg)
 {
-	if (!rt_group_sched_enabled())
-		return;
+	lockdep_assert_held(&rt_constraints_mutex);
+	lockdep_assert_held(dl_bw_lock_of_tg(tg));
+
+	return &tg->dl_bandwidth;
 }
 
-void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq,
-		      struct sched_rt_entity *rt_se, int cpu,
-		      struct sched_rt_entity *parent)
+void unregister_rt_sched_group(struct task_group *tg)
 {
-	rt_rq->highest_prio.curr = MAX_RT_PRIO-1;
-	rt_rq->tg = tg;
 
-	tg->rt_rq[cpu] = rt_rq;
-	tg->rt_se[cpu] = rt_se;
+}
+
+void free_rt_sched_group(struct task_group *tg)
+{
+	if (!rt_group_sched_enabled())
+		return;
 }
 
 int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
@@ -1802,8 +1823,6 @@ DEFINE_SCHED_CLASS(rt) = {
 /*
  * Ensure that the real time constraints are schedulable.
  */
-static DEFINE_MUTEX(rt_constraints_mutex);
-
 static inline int tg_has_rt_tasks(struct task_group *tg)
 {
 	struct task_struct *task;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index f3c259ab9344..0ba87be1c98f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -606,9 +606,6 @@ extern void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
 extern void unthrottle_cfs_rq(struct cfs_rq *cfs_rq);
 extern bool cfs_task_bw_constrained(struct task_struct *p);
 
-extern void init_tg_rt_entry(struct task_group *tg, struct rt_rq *rt_rq,
-		struct sched_rt_entity *rt_se, int cpu,
-		struct sched_rt_entity *parent);
 extern int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us);
 extern int sched_group_set_rt_period(struct task_group *tg, u64 rt_period_us);
 extern long sched_group_rt_runtime(struct task_group *tg);
@@ -2926,6 +2923,8 @@ extern void resched_curr(struct rq *rq);
 extern void resched_curr_lazy(struct rq *rq);
 extern void resched_cpu(int cpu);
 
+extern void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime,
+			      struct task_group *active_context);
 extern void init_dl_entity(struct sched_dl_entity *dl_se);
 
 extern void init_cfs_throttle_work(struct task_struct *p);
@@ -3349,6 +3348,9 @@ extern void set_rq_offline(struct rq *rq);
 
 extern bool sched_smp_initialized;
 
+extern const struct dl_bandwidth *dl_bandwidth_read(struct task_group *tg);
+extern struct dl_bandwidth *dl_bandwidth_write(struct task_group *tg);
+
 #ifdef CONFIG_RT_GROUP_SCHED
 static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
 {
@@ -3372,6 +3374,24 @@ static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 	WARN_ON(!rt_group_sched_enabled() && rt_se->rt_rq->tg != &root_task_group);
 	return rt_se->rt_rq;
 }
+
+static inline int is_dl_group(struct rt_rq *rt_rq)
+{
+	return rt_rq->tg != &root_task_group;
+}
+
+/*
+ * Return the scheduling entity of this group of tasks.
+ */
+static inline struct sched_dl_entity *dl_group_of(struct rt_rq *rt_rq)
+{
+	if (WARN_ON_ONCE(!is_dl_group(rt_rq)))
+		return NULL;
+
+	return rt_rq->tg->dl_se[rq_of_rt_rq(rt_rq)->cpu];
+}
+
+#define dl_bw_lock_of_tg(tg) (&(tg)->dl_bandwidth.dl_runtime_lock)
 #else
 static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
 {
@@ -3394,6 +3414,18 @@ static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 
 	return &rq->rt;
 }
+
+static inline int is_dl_group(struct rt_rq *rt_rq)
+{
+	return 0;
+}
+
+static inline struct sched_dl_entity *dl_group_of(struct rt_rq *rt_rq)
+{
+	return NULL;
+}
+
+#define dl_bw_lock_of_tg(tg) ((raw_spinlock_t*)NULL)
 #endif
 
 DEFINE_LOCK_GUARD_2(double_rq_lock, struct rq,
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v6 11/25] sched/deadline: Add dl_init_tg
Date: Mon, 8 Jun 2026 14:15:30 +0200
Message-ID: <20260608121546.69910-12-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Add dl_init_tg to initialize and/or update an rt-cgroup's dl_server and to
account for its allocated bandwidth. This function is currently unhooked and
will later be used to allocate bandwidth to rt-cgroups.

Add a lock guard for raw_spin_rq_lock_irq for cleaner code.

Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/deadline.c | 31 +++++++++++++++++++++++++++++++
 kernel/sched/sched.h    |  5 +++++
 2 files changed, 36 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 673c6f2b5ece..afadc3521bc0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -335,6 +335,37 @@ void cancel_inactive_timer(struct sched_dl_entity *dl_se)
 	cancel_dl_timer(dl_se, &dl_se->inactive_timer);
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
+void dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period)
+{
+	struct rq *rq = container_of_const(dl_se->dl_rq, struct rq, dl);
+	int is_active;
+	u64 new_bw;
+
+	guard(raw_spin_rq_lock_irq)(rq);
+	is_active = dl_se->my_q->rt.rt_nr_running > 0;
+
+	update_rq_clock(rq);
+	dl_server_stop(dl_se);
+
+	new_bw = to_ratio(rt_period, rt_runtime);
+	dl_rq_change_utilization(rq, dl_se, new_bw);
+
+	dl_se->dl_runtime = rt_runtime;
+	dl_se->dl_deadline = rt_period;
+	dl_se->dl_period = rt_period;
+
+	dl_se->runtime = 0;
+	dl_se->deadline = 0;
+
+	dl_se->dl_bw = new_bw;
+	dl_se->dl_density = new_bw;
+
+	if (is_active)
+		dl_server_start(dl_se);
+}
+#endif
+
 static void dl_change_utilization(struct task_struct *p, u64 new_bw)
 {
 	WARN_ON_ONCE(p->dl.flags & SCHED_FLAG_SUGOV);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 0ba87be1c98f..58f67093145e 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -425,6 +425,7 @@ extern void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq,
 			   struct rq *served_rq,
 			   dl_server_pick_f pick_task);
 extern void sched_init_dl_servers(void);
+extern void dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period);
 
 extern void fair_server_init(struct rq *rq);
 extern void ext_server_init(struct rq *rq);
@@ -2044,6 +2045,10 @@ static inline struct rq *_this_rq_lock_irq(struct rq_flags *rf) __acquires_ret
 	return rq;
 }
 
+DEFINE_LOCK_GUARD_1(raw_spin_rq_lock_irq, struct rq,
+		    raw_spin_rq_lock_irq(_T->lock),
+		    raw_spin_rq_unlock_irq(_T->lock))
+
 #ifdef CONFIG_NUMA
 
 enum numa_topology_type {
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject:
[RFC PATCH v6 12/25] sched/rt: Add {alloc/unregister/free}_rt_sched_group Date: Mon, 8 Jun 2026 14:15:31 +0200 Message-ID: <20260608121546.69910-13-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add allocation and deallocation code for rt-cgroups. Declare dl_server specific functions (only skeleton, but no implementation yet), needed by the deadline servers to be called when trying to schedule. Initialize a cgroup's active context to that of its parent. Co-developed-by: Alessio Balsini Signed-off-by: Alessio Balsini Co-developed-by: Andrea Parri Signed-off-by: Andrea Parri Co-developed-by: luca abeni Signed-off-by: luca abeni Signed-off-by: Yuri Andriaccio --- kernel/sched/rt.c | 156 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 154 insertions(+), 2 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index dbba7a57d6f1..a6adf21772a6 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -120,24 +120,176 @@ struct dl_bandwidth *dl_bandwidth_write(struct task_= group *tg) void unregister_rt_sched_group(struct task_group *tg) { + int i; + + if (!rt_group_sched_enabled()) + return; + + if (!tg->dl_se || !tg->rt_rq) + return; + for_each_possible_cpu(i) { + if (!tg->dl_se[i] || !tg->rt_rq[i]) + continue; + + if (tg->dl_se[i]->dl_runtime) + dl_init_tg(tg->dl_se[i], 0, tg->dl_se[i]->dl_period); + } } void free_rt_sched_group(struct task_group *tg) { + int i; + unsigned long flags; + if (!rt_group_sched_enabled()) return; + + if (!tg->dl_se || !tg->rt_rq) + return; + + for_each_possible_cpu(i) { + if (!tg->dl_se[i] || !tg->rt_rq[i]) + continue; + + /* + * Shutdown the dl_server and free it + * + * Since the dl timer 
is going to be cancelled, + * we risk to never decrease the running bw... + * Fix this issue by changing the group runtime + * to 0 immediately before freeing it. + */ + if (tg->dl_se[i]->dl_runtime) + dl_init_tg(tg->dl_se[i], 0, tg->dl_se[i]->dl_period); + + raw_spin_rq_lock_irqsave(cpu_rq(i), flags); + hrtimer_cancel(&tg->dl_se[i]->dl_timer); + raw_spin_rq_unlock_irqrestore(cpu_rq(i), flags); + kfree(tg->dl_se[i]); + + /* Free the local per-cpu runqueue */ + kfree(rq_of_rt_rq(tg->rt_rq[i])); + } + + kfree(tg->rt_rq); + kfree(tg->dl_se); } +static inline void __rt_rq_free(struct rt_rq **rt_rq) +{ + int i; + + for_each_possible_cpu(i) { + kfree(rq_of_rt_rq(rt_rq[i])); + } + + kfree(rt_rq); +} + +DEFINE_FREE(rt_rq_free, struct rt_rq **, if (_T) __rt_rq_free(_T)) + +static inline void __dl_se_free(struct sched_dl_entity **dl_se) +{ + int i; + + for_each_possible_cpu(i) { + kfree(dl_se[i]); + } + + kfree(dl_se); +} + +DEFINE_FREE(dl_se_free, struct sched_dl_entity **, if (_T) __dl_se_free(_T= )) + +static int __alloc_rt_sched_group_data(struct task_group *tg) { + /* Instantiate automatic cleanup in event of kalloc fail */ + struct rt_rq **tg_rt_rq __free(rt_rq_free) =3D NULL; + struct sched_dl_entity **tg_dl_se __free(dl_se_free) =3D NULL; + struct sched_dl_entity *dl_se __free(kfree) =3D NULL; + struct rq *s_rq __free(kfree) =3D NULL; + int i; + + tg_rt_rq =3D kcalloc(nr_cpu_ids, sizeof(struct rt_rq *), GFP_KERNEL); + if (!tg_rt_rq) + return 0; + + tg_dl_se =3D kcalloc(nr_cpu_ids, + sizeof(struct sched_dl_entity *), GFP_KERNEL); + if (!tg_dl_se) + return 0; + + for_each_possible_cpu(i) { + s_rq =3D kzalloc_node(sizeof(struct rq), + GFP_KERNEL, cpu_to_node(i)); + if (!s_rq) + return 0; + + dl_se =3D kzalloc_node(sizeof(struct sched_dl_entity), + GFP_KERNEL, cpu_to_node(i)); + if (!dl_se) + return 0; + + tg_rt_rq[i] =3D &no_free_ptr(s_rq)->rt; + tg_dl_se[i] =3D no_free_ptr(dl_se); + } + + tg->rt_rq =3D no_free_ptr(tg_rt_rq); + tg->dl_se =3D no_free_ptr(tg_dl_se); + + 
return 1; +} + +static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se, s= truct rq_flags *rf); + int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent) { + struct sched_dl_entity *dl_se; + struct rq *s_rq; + int i; + if (!rt_group_sched_enabled()) return 1; + /* Allocate all necessary resources beforehand */ + if (!__alloc_rt_sched_group_data(tg)) + return 0; + + /* Initialize the allocated resources now. */ + scoped_guard(raw_spinlock_irq, dl_bw_lock_of_tg(parent)) { + init_dl_bandwidth(&tg->dl_bandwidth, 0, RUNTIME_INF, + dl_bandwidth_read(parent)->active_context); + } + + for_each_possible_cpu(i) { + s_rq =3D rq_of_rt_rq(tg->rt_rq[i]); + dl_se =3D tg->dl_se[i]; + + init_rt_rq(&s_rq->rt); + s_rq->cpu =3D i; + s_rq->rt.tg =3D tg; + + init_dl_entity(dl_se); + dl_se->dl_runtime =3D 0; + dl_se->dl_deadline =3D 0; + dl_se->dl_period =3D 0; + dl_se->runtime =3D 0; + dl_se->deadline =3D 0; + dl_se->dl_bw =3D to_ratio(dl_se->dl_period, dl_se->dl_runtime); + dl_se->dl_density =3D to_ratio(dl_se->dl_deadline, dl_se->dl_runtime); + dl_se->dl_server =3D 1; + dl_server_init(dl_se, &cpu_rq(i)->dl, s_rq, rt_server_pick); + } + return 1; } -#else /* !CONFIG_RT_GROUP_SCHED: */ +static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se, s= truct rq_flags *rf) +{ + return NULL; +} + +#else /* !CONFIG_RT_GROUP_SCHED */ void unregister_rt_sched_group(struct task_group *tg) { } @@ -147,7 +299,7 @@ int alloc_rt_sched_group(struct task_group *tg, struct = task_group *parent) { return 1; } -#endif /* !CONFIG_RT_GROUP_SCHED */ +#endif /* CONFIG_RT_GROUP_SCHED */ static inline bool need_pull_rt_task(struct rq *rq, struct task_struct *pr= ev) { -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D394037AA79 
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v6 13/25] sched/deadline: Account rt-cgroups bandwidth in deadline tasks schedulability tests.
Date: Mon, 8 Jun 2026 14:15:32 +0200
Message-ID: <20260608121546.69910-14-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Account for the bandwidth reserved by the rt-cgroups hierarchy in the
schedulability test of deadline entities. This allows a portion of the
rt-bandwidth to be fully reserved for rt-cgroups, even if they do not use
all of it.

Also account for the rt-cgroups' reserved bandwidth when changing the total
bandwidth dedicated to real-time tasks.
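For illustration only, the admission inequality described above can be
modelled in user space. This is a minimal sketch, not the kernel code:
to_ratio() and cap_scale() are simplified re-implementations, BW_SHIFT = 20
and SCHED_CAPACITY_SHIFT = 10 are assumed to match the kernel's fixed-point
conventions, dl_overflow_with_groups() is a hypothetical name, and the
"bw == -1" (no limit) case is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT		20	/* assumed fixed-point shift for bandwidths */
#define SCHED_CAPACITY_SHIFT	10	/* assumed: capacity 1024 == one full CPU */

/* runtime/period as a BW_SHIFT fixed-point fraction (simplified) */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	return (runtime << BW_SHIFT) / period;
}

/* scale a bandwidth by the capacity of the CPUs it is spread over */
static uint64_t cap_scale(uint64_t v, uint64_t cap)
{
	return (v * cap) >> SCHED_CAPACITY_SHIFT;
}

/*
 * Hypothetical model of the updated test: the bandwidth reserved for
 * rt-cgroups (groups_bw) is added to the demand side, so it counts as
 * used even when the groups are idle.
 */
static bool dl_overflow_with_groups(uint64_t limit_bw, uint64_t cap,
				    uint64_t total_bw, uint64_t old_bw,
				    uint64_t new_bw, uint64_t groups_bw)
{
	return cap_scale(limit_bw, cap) <
	       total_bw - old_bw + new_bw + cap_scale(groups_bw, cap);
}
```

With a 95% global limit and 50% reserved for rt-cgroups, a new deadline
entity asking for 40% still fits, while one asking for 50% no longer does.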
Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 kernel/sched/deadline.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index afadc3521bc0..166d23f45cab 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -205,11 +205,22 @@ void __dl_add(struct dl_bw *dl_b, u64 tsk_bw, int cpus)
 	__dl_update(dl_b, -((s32)tsk_bw / cpus));
 }
 
+static inline u64 get_dl_groups_bw(void)
+{
+#ifdef CONFIG_RT_GROUP_SCHED
+	return to_ratio(root_task_group.dl_bandwidth.dl_period,
+			root_task_group.dl_bandwidth.dl_runtime);
+#else
+	return 0;
+#endif
+}
+
 static inline
 bool __dl_overflow(struct dl_bw *dl_b, unsigned long cap, u64 old_bw, u64 new_bw)
 {
 	return dl_b->bw != -1 &&
-	       cap_scale(dl_b->bw, cap) < dl_b->total_bw - old_bw + new_bw;
+	       cap_scale(dl_b->bw, cap) < dl_b->total_bw - old_bw + new_bw
+					+ cap_scale(get_dl_groups_bw(), cap);
 }
 
 static inline
@@ -3490,8 +3501,9 @@ int sched_dl_global_validate(void)
 	u64 period = global_rt_period();
 	u64 new_bw = to_ratio(period, runtime);
 	u64 cookie = ++dl_cookie;
+	u64 dl_groups_root = get_dl_groups_bw();
 	struct dl_bw *dl_b;
-	int cpu, cpus, ret = 0;
+	int cpu, cap, cpus, ret = 0;
 	unsigned long flags;
 
 	/*
@@ -3506,10 +3518,12 @@ int sched_dl_global_validate(void)
 			goto next;
 
 		dl_b = dl_bw_of(cpu);
+		cap = dl_bw_capacity(cpu);
 		cpus = dl_bw_cpus(cpu);
 
 		raw_spin_lock_irqsave(&dl_b->lock, flags);
-		if (new_bw * cpus < dl_b->total_bw)
+		if (new_bw * cpus < dl_b->total_bw +
+					cap_scale(dl_groups_root, cap))
 			ret = -EBUSY;
 		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v6 14/25] sched/rt: Implement dl-server operations for rt-cgroups.
Date: Mon, 8 Jun 2026 14:15:33 +0200
Message-ID: <20260608121546.69910-15-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Implement rt_server_pick, the callback that deadline servers use to pick a
task to schedule.

rt_server_pick(): pick the next runnable rt task and tell the scheduler that
it is going to be scheduled next.

Let the enqueue_task_rt function start the attached deadline server when the
first task is enqueued on a specific rq/server. The server is not
symmetrically stopped in dequeue_task_rt, as it is stopped when
server_pick_task returns NULL (see deadline.c).

Change update_curr_rt to perform a deadline server update if the updated
task is served by a non-root group.

Update {enqueue/dequeue}_pushable_task and rt_{set/clear}_overload to set
the CPU-wise overload flag only if the root runqueues are overloaded, not
for HCBS runqueues.


Update inc/dec_dl_tasks() to account for the number of active tasks in
the local runqueue of rt-cgroup servers: their local runqueue is
different from the global runqueue, so when an rt-group server is
activated/deactivated, the number of served tasks must be
added/removed. This uses nr_running to be compatible with future
dl-server interfaces. Also account for the deadline server itself, so
that it is picked for shutdown when its runqueue is empty (future
patches will try to pull tasks before stopping).

Update inc/dec_rt_prio_smp() to change a rq's cpupri only if the rt_rq
is the global runqueue, since cgroups are scheduled via their dl-server
priority.

Update inc/dec_rt_tasks() to account for waking/sleeping tasks on the
global runqueue when the task runs in the root cgroup or its local dl
server is active. The accounting is not done when servers are
throttled, as they will add/sub the number of running tasks when they
get enqueued/dequeued.

For rt cgroups, account for the number of active tasks in the
nr_running field of the local runqueue (add/sub_nr_running), as this
number is used when a dl server is enqueued/dequeued.

Update set_task_rq() to record the rt_rq of the cgroup's
active_context, tracking where to schedule the given task. Update
set_task_rq() to record the dl_rq, tracking which deadline server
manages a task. Update set_task_rq() to no longer use the parent field,
as it is unused by this patchset's code.

Remove the unused parent field from sched_rt_entity.
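
The nr_running accounting described above can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions, not the kernel code: the `toy_*` names are hypothetical, the `enqueued` flag stands in for "the server is active and not throttled", and the extra count that the patch adds for the server itself is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a runqueue. */
struct toy_rq {
	unsigned int nr_running;	/* tasks visible on this runqueue */
};

/* Hypothetical stand-in for an rt-cgroup served by a dl-server. */
struct toy_server {
	struct toy_rq *global;		/* per-CPU (root) runqueue */
	struct toy_rq local;		/* runqueue served by the dl-server */
	bool enqueued;			/* server active and not throttled */
};

/* A task wakes up inside the group. */
static void toy_task_enqueue(struct toy_server *srv)
{
	srv->local.nr_running++;
	/* The task is visible globally only while the server is enqueued. */
	if (srv->enqueued)
		srv->global->nr_running++;
}

/* A task of the group goes to sleep. */
static void toy_task_dequeue(struct toy_server *srv)
{
	assert(srv->local.nr_running > 0);
	srv->local.nr_running--;
	if (srv->enqueued)
		srv->global->nr_running--;
}

/* The server becomes eligible: publish all of its tasks at once. */
static void toy_server_enqueue(struct toy_server *srv)
{
	srv->enqueued = true;
	srv->global->nr_running += srv->local.nr_running;
}

/* The server is throttled or stopped: hide all of its tasks at once. */
static void toy_server_dequeue(struct toy_server *srv)
{
	srv->global->nr_running -= srv->local.nr_running;
	srv->enqueued = false;
}
```

In this model the global count changes per task only while the server is enqueued, and changes in bulk when the server itself is enqueued or dequeued, which is the invariant the commit message relies on.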

Co-developed-by: Alessio Balsini Signed-off-by: Alessio Balsini Co-developed-by: Andrea Parri Signed-off-by: Andrea Parri Co-developed-by: luca abeni Signed-off-by: luca abeni Signed-off-by: Yuri Andriaccio --- include/linux/sched.h | 1 - kernel/sched/deadline.c | 8 +++++ kernel/sched/rt.c | 70 ++++++++++++++++++++++++++++++++++++----- kernel/sched/sched.h | 11 +++++-- 4 files changed, 79 insertions(+), 11 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 411ffe9b34b3..b20451fcda55 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -630,7 +630,6 @@ struct sched_rt_entity { struct sched_rt_entity *back; #ifdef CONFIG_RT_GROUP_SCHED - struct sched_rt_entity *parent; /* rq on which this entity is (to be) queued: */ struct rt_rq *rt_rq; /* rq "owned" by this entity/group: */ diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 166d23f45cab..a63253ec6441 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2096,6 +2096,10 @@ void inc_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq) if (!dl_server(dl_se)) add_nr_running(rq_of_dl_rq(dl_rq), 1); + else if (rq_of_dl_se(dl_se) != dl_se->my_q) { + WARN_ON(dl_se->my_q->rt.rt_nr_running != dl_se->my_q->nr_running); + add_nr_running(rq_of_dl_rq(dl_rq), dl_se->my_q->nr_running + 1); + } inc_dl_deadline(dl_rq, deadline); } @@ -2108,6 +2112,10 @@ void dec_dl_tasks(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq) if (!dl_server(dl_se)) sub_nr_running(rq_of_dl_rq(dl_rq), 1); + else if (rq_of_dl_se(dl_se) != dl_se->my_q) { + WARN_ON(dl_se->my_q->rt.rt_nr_running != dl_se->my_q->nr_running); + sub_nr_running(rq_of_dl_rq(dl_rq), dl_se->my_q->nr_running - 1); + } dec_dl_deadline(dl_rq, dl_se->deadline); } diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index a6adf21772a6..61e9dab894d1 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -284,9 +284,19 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group
*parent) return 1; } +static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq); + static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se, struct rq_flags *rf) { - return NULL; + struct rt_rq *rt_rq = &dl_se->my_q->rt; + struct task_struct *p; + + if (!sched_rt_runnable(dl_se->my_q)) + return NULL; + + p = rt_task_of(pick_next_rt_entity(rt_rq)); + + return p; } #else /* !CONFIG_RT_GROUP_SCHED */ @@ -314,6 +324,9 @@ static inline int rt_overloaded(struct rq *rq) static inline void rt_set_overload(struct rq *rq) { + if (is_dl_group(&rq->rt)) + return; + if (!rq->online) return; @@ -333,6 +346,9 @@ static inline void rt_set_overload(struct rq *rq) static inline void rt_clear_overload(struct rq *rq) { + if (is_dl_group(&rq->rt)) + return; + if (!rq->online) return; @@ -392,7 +408,7 @@ static void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_struct *p) rt_rq->highest_prio.next = p->prio; if (!rt_rq->overloaded) { - rt_set_overload(global_rq_of_rt_rq(rt_rq)); + rt_set_overload(rq_of_rt_rq(rt_rq)); rt_rq->overloaded = 1; } } @@ -410,7 +426,7 @@ static void dequeue_pushable_task(struct rt_rq *rt_rq, struct task_struct *p) rt_rq->highest_prio.next = MAX_RT_PRIO-1; if (rt_rq->overloaded) { - rt_clear_overload(global_rq_of_rt_rq(rt_rq)); + rt_clear_overload(rq_of_rt_rq(rt_rq)); rt_rq->overloaded = 0; } } @@ -511,6 +527,7 @@ static inline int rt_se_prio(struct sched_rt_entity *rt_se) static void update_curr_rt(struct rq *rq) { struct task_struct *donor = rq->donor; + struct rt_rq *rt_rq; s64 delta_exec; if (donor->sched_class != &rt_sched_class) @@ -520,21 +537,32 @@ static void update_curr_rt(struct rq *rq) if (unlikely(delta_exec <= 0)) return; - if (!rt_bandwidth_enabled()) + if (!rt_group_sched_enabled()) + return; + + if (!dl_bandwidth_enabled()) return; + + rt_rq = rt_rq_of_se(&donor->rt); + if (is_dl_group(rt_rq)) { + struct sched_dl_entity *dl_se = dl_group_of(rt_rq); + + dl_server_update(dl_se,
delta_exec); + } } static void inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio) { - struct rq *rq = rq_of_rt_rq(rt_rq); + struct rq *rq; /* * Change rq's cpupri only if rt_rq is the top queue. */ - if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt != rt_rq) + if (is_dl_group(rt_rq)) return; + rq = rq_of_rt_rq(rt_rq); if (rq->online && prio < prev_prio) cpupri_set(&rq->rd->cpupri, rq->cpu, prio); } @@ -542,14 +570,15 @@ inc_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio) static void dec_rt_prio_smp(struct rt_rq *rt_rq, int prio, int prev_prio) { - struct rq *rq = rq_of_rt_rq(rt_rq); + struct rq *rq; /* * Change rq's cpupri only if rt_rq is the top queue. */ - if (IS_ENABLED(CONFIG_RT_GROUP_SCHED) && &rq->rt != rt_rq) + if (is_dl_group(rt_rq)) return; + rq = rq_of_rt_rq(rt_rq); if (rq->online && rt_rq->highest_prio.curr != prev_prio) cpupri_set(&rq->rd->cpupri, rq->cpu, rt_rq->highest_prio.curr); } @@ -610,6 +639,15 @@ void inc_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) rt_rq->rr_nr_running += is_rr_task(rt_se); inc_rt_prio(rt_rq, rt_se_prio(rt_se)); + + if (is_dl_group(rt_rq)) { + struct sched_dl_entity *dl_se = dl_group_of(rt_rq); + + if (!dl_se->dl_throttled) + add_nr_running(global_rq_of_rt_rq(rt_rq), 1); + } + + add_nr_running(rq_of_rt_rq(rt_rq), 1); } static inline @@ -620,6 +658,15 @@ void dec_rt_tasks(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq) rt_rq->rr_nr_running -= is_rr_task(rt_se); dec_rt_prio(rt_rq, rt_se_prio(rt_se)); + + if (is_dl_group(rt_rq)) { + struct sched_dl_entity *dl_se = dl_group_of(rt_rq); + + if (!dl_se->dl_throttled) + sub_nr_running(global_rq_of_rt_rq(rt_rq), 1); + } + + sub_nr_running(rq_of_rt_rq(rt_rq), 1); } /* @@ -806,6 +853,13 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags) check_schedstat_required(); update_stats_wait_start_rt(rt_rq_of_se(rt_se), rt_se); + /* Task arriving in an idle group of tasks.
*/ + if (is_dl_group(rt_rq) && rt_rq->rt_nr_running == 0) { + struct sched_dl_entity *dl_se = dl_group_of(rt_rq); + + dl_server_start(dl_se); + } + enqueue_rt_entity(rt_se, flags); if (task_is_blocked(p)) diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 58f67093145e..66d5bd1aa4f1 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -2310,10 +2310,11 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu) * root_task_group's rt_rq than switching in rt_rq_of_se() * Clobbers tg(!) */ + guard(raw_spinlock_irqsave)(&tg->dl_bandwidth.dl_runtime_lock); if (!rt_group_sched_enabled()) tg = &root_task_group; - p->rt.rt_rq = tg->rt_rq[cpu]; - p->rt.parent = tg->rt_se[cpu]; + p->rt.rt_rq = tg->dl_bandwidth.active_context->rt_rq[cpu]; + p->dl.dl_rq = &cpu_rq(cpu)->dl; #endif /* CONFIG_RT_GROUP_SCHED */ } @@ -2976,6 +2977,9 @@ static inline void add_nr_running(struct rq *rq, unsigned count) unsigned prev_nr = rq->nr_running; rq->nr_running = prev_nr + count; + if (rq != cpu_rq(rq->cpu)) + return; + if (trace_sched_update_nr_running_tp_enabled()) { call_trace_sched_update_nr_running(rq, count); } @@ -2989,6 +2993,9 @@ static inline void add_nr_running(struct rq *rq, unsigned count) static inline void sub_nr_running(struct rq *rq, unsigned count) { rq->nr_running -= count; + if (rq != cpu_rq(rq->cpu)) + return; + if (trace_sched_update_nr_running_tp_enabled()) { call_trace_sched_update_nr_running(rq, -count); } -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C9F103815DB for ; Mon, 8 Jun 2026 12:16:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
t=1780920966; cv=none; b=Df9zjbYZ+oRt4zbcvyT8RMngi0ReVO18hjhEQXXJK9urUI+0Z+h9XFX03jn20mcK/IbFUId07QslHvjS5TJbgV64zphIxwe5e3Zp76NBb+I+Cmu7lBV0X2dTRx2j62cLK1fThhfQxH3P10D0JQq2TL1g9kzmKSCvqPc3gN1JXL8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920966; c=relaxed/simple; bh=bFpBJKY+dY1TRn8e9BmV1Ywb6ktkznGZhF8ataM/d3c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lj6tl2I1Td+HJI0ZXTuST0Ym+TJ1NGKhVYswsE4Ccg67fBTeBGhf2Ww4CSaQghE5uludlPPh7GlLTRA9X16RW7vkpqATfScy+/djYO7+yHeOpAOgRj5kA9I+LSulIi3c87pPz/zZa4w7icYHfnS/7tGwqTEfT2nbpcnRbx2PcsA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=dOlAHn18; arc=none smtp.client-ip=209.85.221.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dOlAHn18" Received: by mail-wr1-f48.google.com with SMTP id ffacd0b85a97d-45e9f4a3510so1925375f8f.1 for ; Mon, 08 Jun 2026 05:16:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920962; x=1781525762; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=u4pbifYzHaqVoXX1q5IVhVdB74Q/h2EMaW5Z7DLocPs=; b=dOlAHn18fkvkPv5DVbs9H2FuIi73oh6Ow6aSyPh8Yzj6QkZwuGsYOl0OkZeJ1OpK7E pW4xEoO2kmmPPKekSh9KZKcokPwWheAPFu5fhN4C6iZq2vYenrgQls4fv9ynjdI6NMNc sGiYHjvCiVA4nlaRy36XRK6wIPKYaNR0YL2PfAKJphUPf/0xfXUCie70ydkH5iUXnwo+ 8/QLGQjCiUtfE5X6nt/VYhgZjAyoIhs/45Uig+MLeMVf0PMpjGSKOLHI+a43tJ6Zzy5Y eN3Mm3/0rImAuqpJlOR95U7HfNaYr3st+LmXCFMo8yWAqBmvZtyGPjqAHlw785MWwKrO 
Hq4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920962; x=1781525762; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=u4pbifYzHaqVoXX1q5IVhVdB74Q/h2EMaW5Z7DLocPs=; b=DrU6KJlp36z74vCXFyOfrJ2hYYiplcHrgN8APUm8cDbPlWMhjw/q8n7RPSHQ/2ctNf i5/0UggIH7jVYmWvThvBRyDq5EqMrLGlHyHn8/9p4wY3VOn6nEt+qAhpsBzveab3G37v Hbmy4p36/XCHMUYjhSCdzG1V8W+n7qCbbzx/2orFX9kEw9KOP6kea5nYdVIQAwcjt3RP csTHY2JZDZ8uWpiXwvltZ2oTumDQhkWhhLyPqIhvNWP9oJYK2dhJX69Ufc1D8VqENput 8jX8Wsz7B8FyvU8QBOyaFuqHvS7hECIK4mZ0ntQJNGAIG+aonaJTnPqAY0qf5tSeamQ4 TmDQ== X-Forwarded-Encrypted: i=1; AFNElJ/PPJaam7Mt4TTvcSWLV8Fx5Jvns9ZmffpOvKOV2XCpuOLqR5UHEsTq5j5x7SXRQhHSmANwfRuNz/gxpnw=@vger.kernel.org X-Gm-Message-State: AOJu0YzXSZ5jrqKV1XNv+9KirvEmac/+50eB6ljH+E9dXxCAMYVDNqwn Zc7SGB6FQqW4cOKNmtsRdQ30tOb6PnQ8xFAk+bPsiD2QM84AK3eVrmvF X-Gm-Gg: Acq92OGPfFOmfc2wYzBY7WUZ5W+t5WZ3bXUTc8zcHIWR2UN0jRgk9BOngBAum9eljSX hdJ5kXR+V+K2LxP1XltLvRYisfWL0pv70b7S18xkR7i8lA7n0PaBkAHTbfjvhd4RXEjyIfBqzos jkjLsduohYA1GglBdLKIKvGylgk7SzbV/11zrjI9fx8D7+OO6UWLvJfRB3IH0M0b8l3R5PcydDu 5gZUkHWoNNEk1aISkdKE4vxe2LFWeznpbdRO5Sl6lALJymBObJK1S90OeDtPP5yCJJ1765yDkHb 5+UQvGMYAY3Jbc72mLHMwmravyN7P49S2UrWxKG4USc2XZ1KnKipk6o7UUfWT5l3UG8LFHorHlr no9/vBG9NPRr9uH8FK8S/936zlKsL1tv+2+nCgnNuRz4oJ9uJg6gJEs6uKLD3MZD6Jd7w6dlrof 6HPCRQRxtNfUMu0evv182/YlnGKn2b1JE= X-Received: by 2002:a5d:5a4c:0:b0:452:273:5cd6 with SMTP id ffacd0b85a97d-460302e3749mr17960879f8f.1.1780920962145; Mon, 08 Jun 2026 05:16:02 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.16.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:16:01 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , 
Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 15/25] sched/rt: Update task event callbacks for HCBS scheduling Date: Mon, 8 Jun 2026 14:15:34 +0200 Message-ID: <20260608121546.69910-16-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Update wakeup_preempt_rt(), switched_{from/to}_rt() and
prio_changed_rt() with rt-cgroup-specific preemption rules:

- In wakeup_preempt_rt(), whenever a task wakes up, check whether it is
  served by a deadline server or lives on the global runqueue. The
  preemption rules (documented in the function) depend on the runqueues
  of the current donor and of the woken task:
  - If both tasks are FIFO/RR tasks on the global runqueue, or in the
    same cgroup, run the standard checks.
  - If the woken task is inside a cgroup and donor is a FIFO/RR task on
    the global runqueue, always preempt. If donor is a DEADLINE task,
    check whether the woken task's dl server preempts donor.
  - If both tasks are FIFO/RR tasks served by different groups, check
    whether the woken task's server preempts the donor's server.
- In prio_changed_rt(), if the task is not running, run the preemption
  checks only if the running task belongs to the same task group as the
  task that changed priority.

Update sched_rt_can_attach() to check whether a task can be attached to
a given cgroup. For now, the check only verifies that the group has
non-zero bandwidth. Remove the tsk argument from sched_rt_can_attach(),
as it is unused.

Change cpu_cgroup_can_attach() to check whether the attachee is a
FIFO/RR task before attaching it to a cgroup.
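
The wakeup preemption rules listed in this commit message can be sketched as a small decision function. This is a hedged user-space illustration, not the patch's code: the `toy_*` names are hypothetical, a NULL pointer stands for "FIFO/RR task on the global runqueue", and the deadline comparison is a plain earliest-deadline-first check rather than the kernel's dl_entity_preempt().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a deadline entity (a dl-server or a DEADLINE task). */
struct toy_dl_entity {
	uint64_t deadline;	/* absolute deadline; the earlier one wins */
};

enum toy_decision {
	TOY_NO_PREEMPT,		/* keep running the donor */
	TOY_PREEMPT,		/* reschedule immediately */
	TOY_STANDARD_RT,	/* fall back to the usual FIFO/RR priority check */
};

/* Wrap-safe "a has an earlier deadline than b" comparison. */
static bool toy_dl_preempts(const struct toy_dl_entity *a,
			    const struct toy_dl_entity *b)
{
	return (int64_t)(a->deadline - b->deadline) < 0;
}

/*
 * woken: server of the woken task, or NULL if it is on the global runqueue.
 * donor: server of the donor (or the donor's own entity if it is a DEADLINE
 *        task), or NULL if the donor is a FIFO/RR task on the global runqueue.
 */
static enum toy_decision
toy_wakeup_preempt(const struct toy_dl_entity *woken,
		   const struct toy_dl_entity *donor)
{
	if (woken && donor) {
		/* Same group: the standard priority rules apply. */
		if (woken == donor)
			return TOY_STANDARD_RT;
		/* Different deadline entities: compare them. */
		return toy_dl_preempts(woken, donor) ? TOY_PREEMPT
						     : TOY_NO_PREEMPT;
	}
	/* A served task always preempts a global FIFO/RR donor. */
	if (woken)
		return TOY_PREEMPT;
	/* A global FIFO/RR task never preempts a deadline entity. */
	if (donor)
		return TOY_NO_PREEMPT;
	/* Both on the global runqueue. */
	return TOY_STANDARD_RT;
}
```

Encoding "global runqueue" as NULL keeps the four cases visible as a two-by-two table, which mirrors the order of the branches in the commit message.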

Update __sched_setscheduler() to perform checks when switching a task
inside a cgroup to FIFO/RR, as the group needs to have runtime
allocated.

Update task_is_throttled_rt() for SCHED_CORE to return the throttled
state of the server if present; global rt tasks are never throttled.

Update the migration functions to skip cgroup runqueues; cgroup
migration is to be implemented in later patches.

Co-developed-by: Alessio Balsini Signed-off-by: Alessio Balsini Co-developed-by: Andrea Parri Signed-off-by: Andrea Parri Co-developed-by: luca abeni Signed-off-by: luca abeni Signed-off-by: Yuri Andriaccio --- kernel/sched/core.c | 2 +- kernel/sched/rt.c | 98 ++++++++++++++++++++++++++++++++++++++--- kernel/sched/sched.h | 2 +- kernel/sched/syscalls.c | 12 +++++ 4 files changed, 105 insertions(+), 9 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9e47a02cfaf7..1252f45feda0 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -9545,7 +9545,7 @@ static int cpu_cgroup_can_attach(struct cgroup_taskset *tset) goto scx_check; cgroup_taskset_for_each(task, css, tset) { - if (!sched_rt_can_attach(css_tg(css), task)) + if (rt_task(task) && !sched_rt_can_attach(css_tg(css))) return -EINVAL; } scx_check: diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 61e9dab894d1..168a92945b4a 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -372,6 +372,9 @@ static inline void rt_queue_push_tasks(struct rt_rq *rt_rq) { struct rq *rq = global_rq_of_rt_rq(rt_rq); + if (is_dl_group(rt_rq)) + return; + if (!has_pushable_tasks(rt_rq)) return; @@ -382,6 +385,9 @@ static inline void rt_queue_pull_task(struct rt_rq *rt_rq) { struct rq *rq = global_rq_of_rt_rq(rt_rq); + if (is_dl_group(rt_rq)) + return; + queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task); } @@ -1031,7 +1037,55 @@ static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf) static void wakeup_preempt_rt(struct rq *rq, struct task_struct
*p, int flags) { struct task_struct *donor = rq->donor; + struct sched_dl_entity *woken_dl_se = NULL; + struct sched_dl_entity *donor_dl_se = NULL; + + if (!rt_group_sched_enabled()) + goto same_group_sched; + + /* + * Preemption checks are different if the waking task and the current donor + * are running on the global runqueue or in a cgroup. The following rules + * apply: + * - dl-tasks (and equally dl_servers) always preempt FIFO/RR tasks. + * - if donor is a FIFO/RR task inside a cgroup (i.e. run by a + * dl_server), or donor is a DEADLINE task and waking is a FIFO/RR + * task on the root cgroup, do nothing. + * - if waking is inside a cgroup but donor is a FIFO/RR task in the + * root cgroup, always reschedule. + * - if they are both on the global runqueue or in the same cgroup, run + * the standard code. + * - if they are both in a cgroup, but not the same one, check whether the + * woken task's dl_server preempts the donor's dl_server. + * - if donor is a DEADLINE task and waking is in a cgroup, check whether + * the woken task's server preempts donor. + */ + if (is_dl_group(rt_rq_of_se(&p->rt))) + woken_dl_se = dl_group_of(rt_rq_of_se(&p->rt)); + if (is_dl_group(rt_rq_of_se(&donor->rt))) + donor_dl_se = dl_group_of(rt_rq_of_se(&donor->rt)); + else if (task_has_dl_policy(donor)) + donor_dl_se = &donor->dl; + + if (woken_dl_se != NULL && donor_dl_se != NULL) { + if (woken_dl_se == donor_dl_se) { + goto same_group_sched; + } + + if (dl_entity_preempt(woken_dl_se, donor_dl_se)) + resched_curr(rq); + + return; + + } else if (woken_dl_se != NULL) { + resched_curr(rq); + return; + + } else if (donor_dl_se != NULL) { + return; + } +same_group_sched: /* * XXX If we're preempted by DL, queue a push? */ @@ -1055,7 +1109,8 @@ static void wakeup_preempt_rt(struct rq *rq, struct task_struct *p, int flags) * to move current somewhere else, making room for our non-migratable * task.
*/ - if (p->prio == donor->prio && !test_tsk_need_resched(rq->curr)) + if (!is_dl_group(rt_rq_of_se(&p->rt)) && + p->prio == donor->prio && !test_tsk_need_resched(rq->curr)) check_preempt_equal_prio(rq, p); } @@ -1362,6 +1417,9 @@ static int push_rt_rq_task(struct rt_rq *rt_rq, bool pull) struct rt_rq *lowest_rt_rq; int ret = 0; + if (is_dl_group(rt_rq)) + return 0; + if (!rt_rq->overloaded) return 0; @@ -1668,6 +1726,9 @@ static void pull_rt_rq_task(struct rt_rq *this_rt_rq) struct rq *src_rq; int rt_overload_count = rt_overloaded(this_rq); + if (is_dl_group(&this_rq->rt)) + return; + if (likely(!rt_overload_count)) return; @@ -1811,6 +1872,8 @@ static void rq_offline_rt(struct rq *rq) */ static void switched_from_rt(struct rq *rq, struct task_struct *p) { + struct rt_rq *rt_rq = rt_rq_of_se(&p->rt); + /* * If there are other RT tasks then we will reschedule * and the scheduling of the other RT tasks will handle @@ -1818,10 +1881,10 @@ static void switched_from_rt(struct rq *rq, struct task_struct *p) * we may need to handle the pulling of RT tasks * now. */ - if (!task_on_rq_queued(p) || rq->rt.rt_nr_running) + if (!task_on_rq_queued(p) || rt_rq->rt_nr_running) return; - rt_queue_pull_task(rt_rq_of_se(&p->rt)); + rt_queue_pull_task(rt_rq); } void __init init_sched_rt_class(void) @@ -1858,6 +1921,7 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p) if (task_on_rq_queued(p)) { if (p->nr_cpus_allowed > 1 && rq->rt.overloaded) rt_queue_push_tasks(rt_rq_of_se(&p->rt)); + if (p->prio < rq->donor->prio && cpu_online(cpu_of(rq))) resched_curr(rq); } @@ -1870,6 +1934,8 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p) static void prio_changed_rt(struct rq *rq, struct task_struct *p, u64 oldprio) { + struct rt_rq *rt_rq = rt_rq_of_se(&p->rt); + if (!task_on_rq_queued(p)) return; @@ -1882,15 +1948,24 @@ prio_changed_rt(struct rq *rq, struct task_struct *p, u64 oldprio) * may need to pull tasks to this runqueue.
*/ if (oldprio < p->prio) - rt_queue_pull_task(rt_rq_of_se(&p->rt)); + rt_queue_pull_task(rt_rq); /* * If there's a higher priority task waiting to run * then reschedule. */ - if (p->prio > rq->rt.highest_prio.curr) + if (p->prio > rt_rq->highest_prio.curr) resched_curr(rq); } else { + /* + * This task is not running, thus we check against the currently + * running task for preemption. We can preempt only if both tasks are + * in the same cgroup or on the global runqueue. + */ + if (rt_group_sched_enabled() && + rt_rq->tg != rt_rq_of_se(&rq->curr->rt)->tg) + return; + /* * This task is not running, but if it is * greater than the current running task @@ -1983,7 +2058,16 @@ static unsigned int get_rr_interval_rt(struct rq *rq, struct task_struct *task) #ifdef CONFIG_SCHED_CORE static int task_is_throttled_rt(struct task_struct *p, int cpu) { +#ifdef CONFIG_RT_GROUP_SCHED + struct rt_rq *rt_rq; + + rt_rq = task_group(p)->rt_rq[cpu]; + WARN_ON(!rt_group_sched_enabled() && rt_rq->tg != &root_task_group); + + return dl_group_of(rt_rq)->dl_throttled; +#else return 0; +#endif } #endif /* CONFIG_SCHED_CORE */ @@ -2222,10 +2306,10 @@ long sched_group_rt_period(struct task_group *tg) return rt_period_us; } -int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk) +int sched_rt_can_attach(struct task_group *tg) { /* Don't accept real-time tasks when there is no way for them to run */ - if (rt_group_sched_enabled() && rt_task(tsk) && tg->rt_bandwidth.rt_runtime == 0) + if (rt_group_sched_enabled() && tg->dl_bandwidth.dl_runtime == 0) return 0; return 1; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 66d5bd1aa4f1..bde49f216081 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -611,7 +611,7 @@ extern int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us) extern int sched_group_set_rt_period(struct task_group *tg, u64 rt_period_us); extern long sched_group_rt_runtime(struct task_group *tg);
extern long sched_group_rt_period(struct task_group *tg); -extern int sched_rt_can_attach(struct task_group *tg, struct task_struct *tsk); +extern int sched_rt_can_attach(struct task_group *tg); extern struct task_group *sched_create_group(struct task_group *parent); extern void sched_online_group(struct task_group *tg, diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c index 9c1ba10ea5a7..773f744c0460 100644 --- a/kernel/sched/syscalls.c +++ b/kernel/sched/syscalls.c @@ -606,6 +606,18 @@ int __sched_setscheduler(struct task_struct *p, change: if (user) { + /* + * Do not allow real-time tasks into groups that have no runtime + * assigned. + */ + if (rt_group_sched_enabled() && + dl_bandwidth_enabled() && rt_policy(policy) && + !sched_rt_can_attach(task_group(p)) && + !task_group_is_autogroup(task_group(p))) { + retval = -EPERM; + goto unlock; + } + if (dl_bandwidth_enabled() && dl_policy(policy) && !(attr->sched_flags & SCHED_FLAG_SUGOV)) { cpumask_t *span = rq->rd->span; -- 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7E5F3803C4 for ; Mon, 8 Jun 2026 12:16:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920966; cv=none; b=D56EGPzfumERl7VE/OHX8Z1ClcPteuOi4Uf3dXalkXWd3maXeaiDjC6D+h7AiNcZfOwkkM169M8qpcGJyi/GZc0F7TtBeejQjBn+CUbSQ+CQTLO8ZFnYYr+CS1n8qNx+CuxgPW2bk7TXDfHbxczLNGRp8js+FSKyvcutrM67UdQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920966; c=relaxed/simple; bh=/esGex96/7K5WixXjZCTWYkRUCk4zCR/RLJQIooAqyY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=J5fhQSkkkSMlu5qMZXQC6IaP8Olgf5vJD3DipopLfQPq4VDsTDLpddeOIFeRCJBl/jvpwM7lR7QQWaqRjerhCJqEawQEa9Aqrx9CN+8rLlmv25XwpBVxIQ3NEOkch5dDE2Lc3qhv1OL+IOct2ZVH6/rtJjbONYf2VQhwyv/Syf8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=s+nwOnoq; arc=none smtp.client-ip=209.85.221.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="s+nwOnoq" Received: by mail-wr1-f48.google.com with SMTP id ffacd0b85a97d-45fe59255beso2116758f8f.1 for ; Mon, 08 Jun 2026 05:16:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920963; x=1781525763; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=eR4oYliqi6ykFwo23mj4CnogVrnrasJzQru04cwA8RQ=; b=s+nwOnoqN8RLanMGtpmidL6pvcqhAqsBQkZJtBcTBwafnwiJGp0M5gUZygRbq8CFPC ZE8Tb3JuWZSrMkgbSnkBpPbamoeOnc2dLepXgp7o2gGhVpwMMS775TFEmTo8N42vHLpU n8W8ENzyD56v9mtDgg99jDs++XqXj3EIsBJVffkFli03AyWvMUMdHgf+1Fd1zXBMOHLD 15hRX02sH/CDN4oPo96HMBuEZH020cdKqtSMkSmype2ARd8ToZeVGAQd5NCgJhbtkqVq 5yM8iI1JrbLm7f3vMCVl0mCK6h3BNYaA2jJvxTtUeEb6kphVgfCX17CbpQWtUjvbnO3A SRZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920963; x=1781525763; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=eR4oYliqi6ykFwo23mj4CnogVrnrasJzQru04cwA8RQ=; b=fjUb9jNXa8kLS81prO4ROIYAA1nix5bTkNMDv5PkkiJPrwt36zUGD/oYe3eD22sVJw 
fqJOPYQSug7lsrqtXzOTDaAjVEGuVv4+ruHDMSN15Mfph+yWN9nuRrYq/UEjh9dzRK0m 3/qeTUcrINU27slYr6yRY2DdIfe3OaJ6GwcuVZZlYF62MmZ4Rj5yVhnua8uTE3rx/cU0 03Vot7gWg1MMn8o7mrD1c0xDpO4YDA8ePzEnCcONwBxzss4k8ZEmB1F+O0jiI3wx/b08 tHY61dkkq3vzSNGFBnN01khQO7sTQOr7NNQUczIUDg4IRI3wQ3CVodvArJTxcpWAJtQR +Ibg== X-Forwarded-Encrypted: i=1; AFNElJ84rv9qprytWevzUWtPqisrhBZ83uIBnMKAnAltzV6euFs/An23U890l/N5gv/6w7szza3Asc1EFI/5IC8=@vger.kernel.org X-Gm-Message-State: AOJu0Yz1Xaa2I70nVaN92wzbv5TdUnRisnudo+SVnT1TS0nADYlCHaNe en6bMSA6ZXAMZ2b9RulFGlmFFfQHRJojKFfLOs/TiLc0Opor3PZWIiYv X-Gm-Gg: Acq92OFvsfZ4KP5In+bgIgUL+3Hp1o1dktpclq8zvRT0cid25nMvWYZKTVaUmx263Cb j4ksUquWkktg1xyzm3xG+UbWqKG0YuiFn1G15lh90d6ruAMt06zJbcYzOJOGUPq6zRxKNKKOBog 3eKpqxE4cFE7SMh6vt3gVUNskP0s8VzfevUBjN9VGeSovNkTIG0UxpxvsEyrBx1H1cvWJuXKEJD 5wX1a//QizvJTNFIOfscysTiudowUJtlGtI6DhFvmFpV68b/t1oRqSA1Di/w/yGrb3+9hIC+1dn icfY4/Kwxyxy1teg2X8aizyaxvPuua5EVLUpOhzAE3VlxSeG3RTNiK9Uk2ljOcyERYS1+Az39w4 xlam899D6cyYPifFdfyzQjPW517dEEaT6ZzRSILzh2NSub54uL9CKhPtEupjrhRO4sVeKR3eFBr n2BnYek2f8sUVto7jhY5LbT/Bk5pFcRNQ= X-Received: by 2002:adf:f2cf:0:b0:45e:f381:cd85 with SMTP id ffacd0b85a97d-4603050230cmr19831400f8f.20.1780920963272; Mon, 08 Jun 2026 05:16:03 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.16.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:16:02 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 16/25] sched/rt: Remove support for cgroups-v1 Date: Mon, 8 Jun 2026 14:15:35 +0200 Message-ID: <20260608121546.69910-17-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: 
<20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Disable the control files for cgroups-v1. Remove the
cpu_rt_group_init() function and the functions related to the cgroup-v1
control files 'rt_runtime_us' and 'rt_period_us'.

Signed-off-by: Yuri Andriaccio --- kernel/sched/core.c | 50 -------------------------------------------- kernel/sched/rt.c | 49 +------------------------------------------ kernel/sched/sched.h | 4 ---- 3 files changed, 1 insertion(+), 102 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 1252f45feda0..a8a81c69b3d3 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -10150,32 +10150,6 @@ static int cpu_burst_write_u64(struct cgroup_subsys_state *css, } #endif /* CONFIG_GROUP_SCHED_BANDWIDTH */ -#ifdef CONFIG_RT_GROUP_SCHED -static int cpu_rt_runtime_write(struct cgroup_subsys_state *css, - struct cftype *cft, s64 val) -{ - return sched_group_set_rt_runtime(css_tg(css), val); -} - -static s64 cpu_rt_runtime_read(struct cgroup_subsys_state *css, - struct cftype *cft) -{ - return sched_group_rt_runtime(css_tg(css)); -} - -static int cpu_rt_period_write_uint(struct cgroup_subsys_state *css, - struct cftype *cftype, u64 rt_period_us) -{ - return sched_group_set_rt_period(css_tg(css), rt_period_us); -} - -static u64 cpu_rt_period_read_uint(struct cgroup_subsys_state *css, - struct cftype *cft) -{ - return sched_group_rt_period(css_tg(css)); -} -#endif /* CONFIG_RT_GROUP_SCHED */ - #ifdef CONFIG_GROUP_SCHED_WEIGHT static s64 cpu_idle_read_s64(struct cgroup_subsys_state *css, struct cftype *cft) @@ -10253,20 +10227,6 @@ static struct cftype cpu_legacy_files[] = { }; #ifdef CONFIG_RT_GROUP_SCHED -static struct cftype rt_group_files[] = { - { - .name =
"rt_runtime_us", - .read_s64 = cpu_rt_runtime_read, - .write_s64 = cpu_rt_runtime_write, - }, - { - .name = "rt_period_us", - .read_u64 = cpu_rt_period_read_uint, - .write_u64 = cpu_rt_period_write_uint, - }, - { } /* Terminate */ -}; - # ifdef CONFIG_RT_GROUP_SCHED_DEFAULT_DISABLED DEFINE_STATIC_KEY_FALSE(rt_group_sched); # else @@ -10289,16 +10249,6 @@ static int __init setup_rt_group_sched(char *str) return 1; } __setup("rt_group_sched=", setup_rt_group_sched); - -static int __init cpu_rt_group_init(void) -{ - if (!rt_group_sched_enabled()) - return 0; - - WARN_ON(cgroup_add_legacy_cftypes(&cpu_cgrp_subsys, rt_group_files)); - return 0; -} -subsys_initcall(cpu_rt_group_init); #endif /* CONFIG_RT_GROUP_SCHED */ static int cpu_extra_stat_show(struct seq_file *sf, diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 168a92945b4a..4f1e7af2e88d 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1,3 +1,4 @@ +#pragma GCC diagnostic ignored "-Wunused-function" // SPDX-License-Identifier: GPL-2.0 /* * Real-Time Scheduling Class (mapped to the SCHED_FIFO and SCHED_RR @@ -2258,54 +2259,6 @@ static int tg_set_rt_bandwidth(struct task_group *tg, return err; } -int sched_group_set_rt_runtime(struct task_group *tg, long rt_runtime_us) -{ - u64 rt_runtime, rt_period; - - rt_period = ktime_to_ns(tg->rt_bandwidth.rt_period); - rt_runtime = (u64)rt_runtime_us * NSEC_PER_USEC; - if (rt_runtime_us < 0) - rt_runtime = RUNTIME_INF; - else if ((u64)rt_runtime_us > U64_MAX / NSEC_PER_USEC) - return -EINVAL; - - return tg_set_rt_bandwidth(tg, rt_period, rt_runtime); -} - -long sched_group_rt_runtime(struct task_group *tg) -{ - u64 rt_runtime_us; - - if (tg->rt_bandwidth.rt_runtime == RUNTIME_INF) - return -1; - - rt_runtime_us = tg->rt_bandwidth.rt_runtime; - do_div(rt_runtime_us, NSEC_PER_USEC); - return rt_runtime_us; -} - -int sched_group_set_rt_period(struct task_group *tg, u64 rt_period_us) -{ - u64 rt_runtime, rt_period; - - if
(rt_period_us > U64_MAX / NSEC_PER_USEC) - return -EINVAL; - - rt_period =3D rt_period_us * NSEC_PER_USEC; - rt_runtime =3D tg->rt_bandwidth.rt_runtime; - - return tg_set_rt_bandwidth(tg, rt_period, rt_runtime); -} - -long sched_group_rt_period(struct task_group *tg) -{ - u64 rt_period_us; - - rt_period_us =3D ktime_to_ns(tg->rt_bandwidth.rt_period); - do_div(rt_period_us, NSEC_PER_USEC); - return rt_period_us; -} - int sched_rt_can_attach(struct task_group *tg) { /* Don't accept real-time tasks when there is no way for them to run */ diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index bde49f216081..efe52e162ba5 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -607,10 +607,6 @@ extern void start_cfs_bandwidth(struct cfs_bandwidth *= cfs_b); extern void unthrottle_cfs_rq(struct cfs_rq *cfs_rq); extern bool cfs_task_bw_constrained(struct task_struct *p); =20 -extern int sched_group_set_rt_runtime(struct task_group *tg, long rt_runti= me_us); -extern int sched_group_set_rt_period(struct task_group *tg, u64 rt_period_= us); -extern long sched_group_rt_runtime(struct task_group *tg); -extern long sched_group_rt_period(struct task_group *tg); extern int sched_rt_can_attach(struct task_group *tg); =20 extern struct task_group *sched_create_group(struct task_group *parent); --=20 2.54.0 From nobody Sun Jun 14 02:12:34 2026 Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1A4C33B0AE7 for ; Mon, 8 Jun 2026 12:16:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920970; cv=none; 
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
    Yuri Andriaccio
Subject: [RFC PATCH v6 17/25] sched/rt: Update rt-cgroup schedulability checks
Date: Mon, 8 Jun 2026 14:15:36 +0200
Message-ID: <20260608121546.69910-18-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>

From: luca abeni

Introduce cgroup-v2 control files:
- cpu.rt.max: Get/set the bandwidth of the given cgroup, or inherit it
  from the parent.
- cpu.rt.internal: Get the actual remaining bandwidth for the group,
  i.e. its own bandwidth minus the bw of the group's children.

Introduce a number of functions to update the cgroup settings across the
whole hierarchy:
- tg_subtree_has_rt_tasks()
  Checks if the active context rooted at tg is running rt workload.
  Child groups which do not share the same active context are ignored.
- tg_compute_children_bw()
  Computes the total bandwidth of the active context rooted at tg, minus
  the root of the context itself.
- tg_rt_schedulable()
  Runs admission tests for the current cgroup tree and the given
  bandwidth update.
- tg_update_active_context()
  Updates the active context of a given subtree with a new one.
- tg_rt_bandwidth() / tg_rt_internal_bandwidth()
  Read the max (internal) bandwidth set for the cgroup.
- tg_set_rt_bandwidth()
  Set the bandwidth of the group.

Update sched_rt_can_attach to accept tasks only in the root cgroup or in
HCBS cgroups which have non-zero runtime.

Update and reuse __checkparam_dl to check for numerical issues regarding
the dl_server's parameters.

Add from_ratio function to convert from period and bw to runtime, the
inverse of the to_ratio function.
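
For readers following the to_ratio()/from_ratio() pairing, here is a
minimal userspace sketch of the fixed-point round trip. It is not the
kernel code: the values of BW_SHIFT, BW_UNIT and RUNTIME_INF are assumed
to match the upstream definitions, to_ratio() is a plain-C stand-in that
replaces div64_u64() with ordinary division, and round_trip_loss() is a
hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, believed to mirror kernel/sched/sched.h upstream. */
#define BW_SHIFT	20
#define BW_UNIT		(1ULL << BW_SHIFT)
#define RUNTIME_INF	((uint64_t)~0ULL)

/* Userspace stand-in for to_ratio(): fixed-point runtime / period. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (runtime == RUNTIME_INF)
		return BW_UNIT;
	if (period == 0)
		return 0;
	return (runtime << BW_SHIFT) / period;
}

/* Same arithmetic as the from_ratio() added by this patch. */
static uint64_t from_ratio(uint64_t period, uint64_t bw)
{
	if (bw == BW_UNIT)
		return RUNTIME_INF;
	return (bw * period) >> BW_SHIFT;
}

/*
 * Hypothetical helper: nanoseconds of runtime lost by one
 * to_ratio()/from_ratio() round trip. Assumes runtime < period, since
 * runtime == period maps to BW_UNIT and comes back as RUNTIME_INF.
 */
static uint64_t round_trip_loss(uint64_t period, uint64_t runtime)
{
	uint64_t back = from_ratio(period, to_ratio(period, runtime));

	assert(back <= runtime);	/* truncation never adds runtime */
	return runtime - back;
}
```

Both conversions truncate, so a runtime recovered through from_ratio()
can be slightly smaller than the value originally written; with a 100ms
period and a 30ms runtime the sketch loses a few tens of nanoseconds.
Whether this matters for cpu.rt.internal is for the patch authors to
judge.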

Add dl_check_tg(), which performs an admission control test similar to
__dl_overflow, but this time we are updating the cgroup's total bandwidth
rather than scheduling a new DEADLINE task or updating a non-cgroup
deadline server.

Add rcu_sched lock guard for rcu_read_{lock/unlock}_sched.
Add sched_domains lock guard for sched_domains_mutex_{lock/unlock}.
Add lock/unlock methods for sched_rt_handler_mutex and its lock guard.
Add asserts for held sched_domains_mutex and sched_rt_handler_mutex.

Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: luca abeni
Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 include/linux/rcupdate.h |   1 +
 include/linux/sched.h    |   2 +
 kernel/sched/core.c      |  55 ++++++
 kernel/sched/deadline.c  |  60 ++++--
 kernel/sched/rt.c        | 393 +++++++++++++++++++++++++++++++--------
 kernel/sched/sched.h     |  18 +-
 kernel/sched/syscalls.c  |   2 +-
 7 files changed, 445 insertions(+), 86 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index bfa765132de8..70432ca3dbb9 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -1179,6 +1179,7 @@ extern int rcu_expedited;
 extern int rcu_normal;
 
 DEFINE_LOCK_GUARD_0(rcu, rcu_read_lock(), rcu_read_unlock())
+DEFINE_LOCK_GUARD_0(rcu_sched, rcu_read_lock_sched(), rcu_read_unlock_sched())
 DECLARE_LOCK_GUARD_0_ATTRS(rcu, __acquires_shared(RCU), __releases_shared(RCU))
 
 #endif /* __LINUX_RCUPDATE_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index b20451fcda55..0021069581c2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2522,4 +2522,6 @@ extern void migrate_enable(void);
 
 DEFINE_LOCK_GUARD_0(migrate, migrate_disable(), migrate_enable())
 
+DEFINE_LOCK_GUARD_0(sched_domains, sched_domains_mutex_lock(), sched_domains_mutex_unlock())
+
 #endif
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a8a81c69b3d3..1ad1efe1dca7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4815,6 +4815,14 @@ u64 to_ratio(u64 period, u64 runtime)
 	return div64_u64(runtime << BW_SHIFT, period);
 }
 
+u64 from_ratio(u64 period, u64 bw)
+{
+	if (bw == BW_UNIT)
+		return RUNTIME_INF;
+
+	return (bw * period) >> BW_SHIFT;
+}
+
 /*
  * wake_up_new_task - wake up a newly created task for the first time.
  *
@@ -10415,6 +10423,41 @@ static ssize_t cpu_max_write(struct kernfs_open_file *of,
 }
 #endif /* CONFIG_CFS_BANDWIDTH */
 
+#ifdef CONFIG_RT_GROUP_SCHED
+static int cpu_rt_max_show(struct seq_file *sf, void *v)
+{
+	struct task_group *tg = css_tg(seq_css(sf));
+	long period_us, runtime_us;
+
+	tg_rt_bandwidth(tg, &period_us, &runtime_us);
+	cpu_period_quota_print(sf, period_us, runtime_us);
+	return 0;
+}
+
+static int cpu_rt_internal_show(struct seq_file *sf, void *v)
+{
+	struct task_group *tg = css_tg(seq_css(sf));
+	long period_us, runtime_us;
+
+	tg_rt_internal_bandwidth(tg, &period_us, &runtime_us);
+	cpu_period_quota_print(sf, period_us, runtime_us);
+	return 0;
+}
+
+static ssize_t cpu_rt_max_write(struct kernfs_open_file *of,
+				char *buf, size_t nbytes, loff_t off)
+{
+	struct task_group *tg = css_tg(of_css(of));
+	u64 period_us, runtime_us;
+	int ret;
+
+	ret = cpu_period_quota_parse(buf, &period_us, &runtime_us);
+	if (!ret)
+		ret = tg_set_rt_bandwidth(tg, period_us, runtime_us);
+	return ret ?: nbytes;
+}
+#endif /* CONFIG_RT_GROUP_SCHED */
+
 static struct cftype cpu_files[] = {
 #ifdef CONFIG_GROUP_SCHED_WEIGHT
 	{
@@ -10450,6 +10493,18 @@ static struct cftype cpu_files[] = {
 		.write_u64 = cpu_burst_write_u64,
 	},
 #endif /* CONFIG_CFS_BANDWIDTH */
+#ifdef CONFIG_RT_GROUP_SCHED
+	{
+		.name = "rt.max",
+		.seq_show = cpu_rt_max_show,
+		.write = cpu_rt_max_write,
+	},
+	{
+		.name = "rt.internal",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = cpu_rt_internal_show,
+	},
+#endif /* CONFIG_RT_GROUP_SCHED */
 #ifdef CONFIG_UCLAMP_TASK_GROUP
 	{
 		.name = "uclamp.min",
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a63253ec6441..b7102f643171 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -346,10 +346,45 @@ void cancel_inactive_timer(struct sched_dl_entity *dl_se)
 	cancel_dl_timer(dl_se, &dl_se->inactive_timer);
 }
 
+/*
+ * Used for dl_bw check and update, used under sched_rt_handler()::mutex and
+ * sched_domains_mutex.
+ */
+u64 dl_cookie;
+
 #ifdef CONFIG_RT_GROUP_SCHED
+int dl_check_tg(unsigned long total)
+{
+	int which_cpu;
+	int cap;
+	struct dl_bw *dl_b;
+	u64 gen = ++dl_cookie;
+
+	lockdep_assert_held(&sched_domains_mutex);
+	lockdep_assert_held(&sched_rt_handler_mutex);
+
+	for_each_possible_cpu(which_cpu) {
+		guard(rcu_sched)();
+
+		if (!dl_bw_visited(which_cpu, gen)) {
+			cap = dl_bw_capacity(which_cpu);
+			dl_b = dl_bw_of(which_cpu);
+
+			guard(raw_spinlock_irqsave)(&dl_b->lock);
+
+			if (dl_b->bw != -1 &&
+			    cap_scale(dl_b->bw, cap) < dl_b->total_bw + cap_scale(total, cap))
+				return 0;
+		}
+
+	}
+
+	return 1;
+}
+
 void dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period)
 {
-	struct rq *rq = container_of_const(dl_se->dl_rq, struct rq, dl);
+	struct rq *rq = rq_of_dl_se(dl_se);
 	int is_active;
 	u64 new_bw;
 
@@ -3497,12 +3532,6 @@ DEFINE_SCHED_CLASS(dl) = {
 #endif
 };
 
-/*
- * Used for dl_bw check and update, used under sched_rt_handler()::mutex and
- * sched_domains_mutex.
- */
-u64 dl_cookie;
-
 int sched_dl_global_validate(void)
 {
 	u64 runtime = global_rt_runtime();
@@ -3514,6 +3543,9 @@ int sched_dl_global_validate(void)
 	int cpu, cap, cpus, ret = 0;
 	unsigned long flags;
 
+	lockdep_assert_held(&sched_domains_mutex);
+	lockdep_assert_held(&sched_rt_handler_mutex);
+
 	/*
 	 * Here we want to check the bandwidth not being set to some
 	 * value smaller than the currently allocated bandwidth in
@@ -3566,6 +3598,9 @@ void sched_dl_do_global(void)
 	int cpu;
 	unsigned long flags;
 
+	lockdep_assert_held(&sched_domains_mutex);
+	lockdep_assert_held(&sched_rt_handler_mutex);
+
 	if (global_rt_runtime() != RUNTIME_INF)
 		new_bw = to_ratio(global_rt_period(), global_rt_runtime());
 
@@ -3711,7 +3746,7 @@ void __getparam_dl(struct task_struct *p, struct sched_attr *attr, unsigned int
  * below 2^63 ns (we have to check both sched_deadline and
  * sched_period, as the latter can be zero).
  */
-bool __checkparam_dl(const struct sched_attr *attr)
+bool __checkparam_dl(const struct sched_attr *attr, bool allow_zero_runtime)
 {
 	u64 period, max, min;
 
@@ -3720,14 +3755,16 @@ bool __checkparam_dl(const struct sched_attr *attr)
 		return true;
 
 	/* deadline != 0 */
-	if (attr->sched_deadline == 0)
+	if ((!allow_zero_runtime || attr->sched_runtime != 0) &&
+	    attr->sched_deadline == 0)
 		return false;
 
 	/*
 	 * Since we truncate DL_SCALE bits, make sure we're at least
 	 * that big.
 	 */
-	if (attr->sched_runtime < (1ULL << DL_SCALE))
+	if ((!allow_zero_runtime || attr->sched_runtime != 0) &&
+	    attr->sched_runtime < (1ULL << DL_SCALE))
 		return false;
 
 	/*
@@ -3750,7 +3787,8 @@ bool __checkparam_dl(const struct sched_attr *attr)
 	max = (u64)READ_ONCE(sysctl_sched_dl_period_max) * NSEC_PER_USEC;
 	min = (u64)READ_ONCE(sysctl_sched_dl_period_min) * NSEC_PER_USEC;
 
-	if (period < min || period > max)
+	if ((!allow_zero_runtime || period != 0) &&
+	    (period < min || period > max))
 		return false;
 
 	return true;
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4f1e7af2e88d..a32b1f68e645 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1,4 +1,3 @@
-#pragma GCC diagnostic ignored "-Wunused-function"
 // SPDX-License-Identifier: GPL-2.0
 /*
  * Real-Time Scheduling Class (mapped to the SCHED_FIFO and SCHED_RR
@@ -2111,9 +2110,6 @@ DEFINE_SCHED_CLASS(rt) = {
 };
 
 #ifdef CONFIG_RT_GROUP_SCHED
-/*
- * Ensure that the real time constraints are schedulable.
- */
 static inline int tg_has_rt_tasks(struct task_group *tg)
 {
 	struct task_struct *task;
@@ -2134,38 +2130,114 @@ static inline int tg_has_rt_tasks(struct task_group *tg)
 	return ret;
 }
 
-struct rt_schedulable_data {
+static int __tg_subtree_has_rt_tasks(struct task_group *tg, void *data) {
+	struct task_group *ctx = data;
+
+	if (dl_bandwidth_read(tg)->active_context == ctx && tg_has_rt_tasks(tg))
+		return 1;
+	else
+		return 0;
+}
+
+static int tg_subtree_has_rt_tasks(struct task_group *tg) {
+	lockdep_assert(rcu_read_lock_held());
+	return walk_tg_tree_from(tg, __tg_subtree_has_rt_tasks, tg_nop,
+				 dl_bandwidth_read(tg)->active_context);
+}
+
+struct tg_update_data {
 	struct task_group *tg;
 	u64 rt_period;
 	u64 rt_runtime;
 };
 
-static int tg_rt_schedulable(struct task_group *tg, void *data)
+struct tg_compute_children_bw_data {
+	struct tg_update_data update;
+	struct task_group *active_context;
+	u64 bw_sum;
+};
+
+static int __tg_compute_children_bw(struct task_group *tg, void *data) {
+	struct tg_compute_children_bw_data *d = data;
+	const struct dl_bandwidth *dl_b = dl_bandwidth_read(tg);
+	u64 period, runtime;
+
+	/* Skip the current task group from the sum. */
+	if (tg == d->active_context)
+		return 0;
+
+	period = dl_b->dl_period;
+	runtime = dl_b->dl_runtime;
+	if (tg == d->update.tg) {
+		period = d->update.rt_period;
+		runtime = d->update.rt_runtime;
+	}
+
+	if (runtime == RUNTIME_INF ||
+	    dl_bandwidth_read(tg->parent)->active_context != d->active_context)
+		return 0;
+
+	d->bw_sum += to_ratio(period, runtime);
+	return 0;
+}
+
+static unsigned long tg_compute_children_bw(struct task_group *tg,
+					    struct tg_update_data *data)
+{
+	struct tg_compute_children_bw_data sum_data = {
+		.active_context = tg,
+		.bw_sum = 0,
+		.update = (struct tg_update_data) {
+			.tg = data->tg,
+			.rt_period = data->rt_period,
+			.rt_runtime = data->rt_runtime,
+		}
+	};
+
+	lockdep_assert(rcu_read_lock_held());
+	walk_tg_tree_from(tg, __tg_compute_children_bw, tg_nop, &sum_data);
+	return sum_data.bw_sum;
+}
+
+struct rt_schedulable_data {
+	struct tg_update_data update;
+	u64 rt_runtime_remainder;
+};
+
+static int __tg_rt_schedulable(struct task_group *tg, void *data)
 {
 	struct rt_schedulable_data *d = data;
-	struct task_group *child;
+	const struct dl_bandwidth *dl_b;
 	u64 total, sum = 0;
 	u64 period, runtime;
 
-	period = ktime_to_ns(tg->rt_bandwidth.rt_period);
-	runtime = tg->rt_bandwidth.rt_runtime;
+	dl_b = dl_bandwidth_read(tg);
+	period = dl_b->dl_period;
+	runtime = dl_b->dl_runtime;
 
-	if (tg == d->tg) {
-		period = d->rt_period;
-		runtime = d->rt_runtime;
+	if (tg == d->update.tg) {
+		period = d->update.rt_period;
+		runtime = d->update.rt_runtime;
 	}
 
+	/*
+	 * "max" groups are always schedulable, as they defer their access
+	 * control to their first non-max parent.
+	 */
+	if (runtime == RUNTIME_INF)
+		return 0;
+
 	/*
 	 * Cannot have more runtime than the period.
 	 */
-	if (runtime > period && runtime != RUNTIME_INF)
+	if (runtime > period)
 		return -EINVAL;
 
 	/*
 	 * Ensure we don't starve existing RT tasks if runtime turns zero.
 	 */
-	if (rt_bandwidth_enabled() && !runtime &&
-	    tg->rt_bandwidth.rt_runtime && tg_has_rt_tasks(tg))
+	if (dl_bandwidth_enabled() && !runtime && tg != &root_task_group &&
+	    tg_subtree_has_rt_tasks(tg))
 		return -EBUSY;
 
 	total = to_ratio(period, runtime);
@@ -2176,58 +2248,146 @@ static int tg_rt_schedulable(struct task_group *tg, void *data)
 	if (total > to_ratio(global_rt_period(), global_rt_runtime()))
 		return -EINVAL;
 
+	if (tg == &root_task_group) {
+		if (!dl_check_tg(total))
+			return -EBUSY;
+	}
+
 	/*
-	 * The sum of our children's runtime should not exceed our own.
+	 * The sum of our children's runtime, plus our own bw, should not
+	 * exceed our own max.
 	 */
-	list_for_each_entry_rcu(child, &tg->children, siblings) {
-		period = ktime_to_ns(child->rt_bandwidth.rt_period);
-		runtime = child->rt_bandwidth.rt_runtime;
+	sum = tg_compute_children_bw(tg, &d->update);
+	if (sum > total)
+		return -EINVAL;
 
-		if (child == d->tg) {
-			period = d->rt_period;
-			runtime = d->rt_runtime;
-		}
+	/*
+	 * Compute remaining runtime
+	 */
+	if (tg == d->update.tg)
+		d->rt_runtime_remainder = from_ratio(period, total - sum);
+
+	return 0;
+}
 
-		sum += to_ratio(period, runtime);
+static int tg_rt_schedulable(struct tg_update_data *data, u64 *remainder_runtime)
+{
+	int err;
+	struct rt_schedulable_data d = {
+		.update = (struct tg_update_data) {
+			.tg = data->tg,
+			.rt_period = data->rt_period,
+			.rt_runtime = data->rt_runtime,
+		},
+		.rt_runtime_remainder = 0,
+	};
+
+	/*
+	 * Walk the cgroup tree and check schedulability constraints.
+	 */
+	lockdep_assert(rcu_read_lock_held());
+	err = walk_tg_tree(__tg_rt_schedulable, tg_nop, &d);
+	if (err)
+		return err;
+
+	*remainder_runtime = d.rt_runtime_remainder;
+	return 0;
+}
+
+struct tg_update_active_context_data {
+	struct task_group *new_active_context;
+	struct task_group *old_active_context;
+};
+
+static int __tg_update_active_context(struct task_group *tg, void *data) {
+	struct tg_update_active_context_data *d = data;
+
+	if (dl_bandwidth_read(tg)->active_context == d->old_active_context) {
+		guard(raw_spinlock_irq)(dl_bw_lock_of_tg(tg));
+		dl_bandwidth_write(tg)->active_context = d->new_active_context;
 	}
 
-	if (sum > total)
-		return -EINVAL;
+	return 0;
+}
+
+static void tg_update_active_context(struct task_group *tg,
+				     struct task_group *old_context,
+				     struct task_group *new_context)
+{
+	struct tg_update_active_context_data data = {
+		.new_active_context = new_context,
+		.old_active_context = old_context,
+	};
+	lockdep_assert(rcu_read_lock_held());
+	walk_tg_tree_from(tg, __tg_update_active_context, tg_nop, &data);
+}
+
+int tg_rt_bandwidth(struct task_group *tg,
+		    long *rt_period_us, long *rt_runtime_us)
+{
+	const struct dl_bandwidth *dl_b;
+
+	guard(raw_spinlock_irq)(dl_bw_lock_of_tg(tg));
+	dl_b = dl_bandwidth_read(tg);
+
+	*rt_runtime_us = -1;
+	if (dl_b->dl_runtime != RUNTIME_INF) {
+		*rt_runtime_us = dl_b->dl_runtime;
+		do_div(*rt_runtime_us, NSEC_PER_USEC);
+	}
+
+	*rt_period_us = dl_b->dl_period;
+	do_div(*rt_period_us, NSEC_PER_USEC);
 
 	return 0;
 }
 
-static int __rt_schedulable(struct task_group *tg, u64 period, u64 runtime)
+int tg_rt_internal_bandwidth(struct task_group *tg,
+			     long *rt_period_us, long *rt_runtime_us)
 {
-	int ret;
+	const struct dl_bandwidth *dl_b;
 
-	struct rt_schedulable_data data = {
-		.tg = tg,
-		.rt_period = period,
-		.rt_runtime = runtime,
-	};
+	guard(raw_spinlock_irq)(dl_bw_lock_of_tg(tg));
+	dl_b = dl_bandwidth_read(tg);
 
-	rcu_read_lock();
-	ret = walk_tg_tree(tg_rt_schedulable, tg_nop, &data);
-	rcu_read_unlock();
+	*rt_runtime_us = dl_b->dl_internal_runtime;
+	do_div(*rt_runtime_us, NSEC_PER_USEC);
 
-	return ret;
+	*rt_period_us = dl_b->dl_period;
+	do_div(*rt_period_us, NSEC_PER_USEC);
+
+	return 0;
 }
 
-static int tg_set_rt_bandwidth(struct task_group *tg,
-			       u64 rt_period, u64 rt_runtime)
+int tg_set_rt_bandwidth(struct task_group *tg,
+			u64 rt_period_us, u64 rt_runtime_us)
 {
-	int i, err = 0;
+	struct tg_update_data update;
+	struct task_group *parent_ctx;
+	struct dl_bandwidth *dl_b;
+	u64 rt_period, rt_runtime, old_rt_runtime;
+	u64 rt_actual_runtime = 0;
+	u64 bw, children_bw;
+	struct sched_attr attr;
+	int err, i;
 
-	/*
-	 * Disallowing the root group RT runtime is BAD, it would disallow the
-	 * kernel creating (and or operating) RT threads.
-	 */
-	if (tg == &root_task_group && rt_runtime == 0)
+	if (rt_runtime_us == RUNTIME_INF)
+		rt_runtime = RUNTIME_INF;
+	else if ((u64)rt_runtime_us > U64_MAX / NSEC_PER_USEC)
 		return -EINVAL;
+	else
+		rt_runtime = (u64)rt_runtime_us * NSEC_PER_USEC;
 
-	/* No period doesn't make any sense. */
-	if (rt_period == 0)
+	if ((u64)rt_period_us > U64_MAX / NSEC_PER_USEC)
+		return -EINVAL;
+	else
+		rt_period = (u64)rt_period_us * NSEC_PER_USEC;
+
+	/*
+	 * The root_task_group bandwidth settings are only used to reserve bw
+	 * for HCBS cgroups; runtime == "max" has no meaning there.
+	 */
+	if (rt_runtime == RUNTIME_INF && tg == &root_task_group)
 		return -EINVAL;
 
 	/*
@@ -2236,34 +2396,119 @@ static int tg_set_rt_bandwidth(struct task_group *tg,
 	if (rt_runtime != RUNTIME_INF && rt_runtime > max_rt_runtime)
 		return -EINVAL;
 
-	mutex_lock(&rt_constraints_mutex);
-	err = __rt_schedulable(tg, rt_period, rt_runtime);
+	/*
+	 * Check if the runtime and period min and max values are admissible.
+	 */
+	attr = (struct sched_attr){
+		.sched_flags = 0,
+		.sched_runtime = rt_runtime,
+		.sched_deadline = rt_period,
+		.sched_period = rt_period,
+	};
+
+	if (rt_runtime != RUNTIME_INF && !__checkparam_dl(&attr, true))
+		return -EINVAL;
+
+	update = (struct tg_update_data) {
+		.tg = tg,
+		.rt_period = rt_period,
+		.rt_runtime = rt_runtime,
+	};
+
+	guard(mutex)(&rt_constraints_mutex);
+	old_rt_runtime = dl_bandwidth_read(tg)->dl_runtime;
+
+	/*
+	 * Disallow changing from/to "max" and a HCBS reservation if the group
+	 * and all of its "max" children have active tasks.
+	 */
+	guard(sched_rt_handler)();
+	guard(sched_domains)();
+	guard(rcu)();
+	if (((rt_runtime == RUNTIME_INF && old_rt_runtime != RUNTIME_INF) ||
+	     (rt_runtime != RUNTIME_INF && old_rt_runtime == RUNTIME_INF)) &&
+	    tg_subtree_has_rt_tasks(tg))
+		return -EINVAL;
+
+	err = tg_rt_schedulable(&update, &rt_actual_runtime);
 	if (err)
-		goto unlock;
+		return err;
 
-	raw_spin_lock_irq(&tg->rt_bandwidth.rt_runtime_lock);
-	tg->rt_bandwidth.rt_period = ns_to_ktime(rt_period);
-	tg->rt_bandwidth.rt_runtime = rt_runtime;
+	scoped_guard(raw_spinlock_irq, dl_bw_lock_of_tg(tg)) {
+		dl_b = dl_bandwidth_write(tg);
+		dl_b->dl_period = rt_period;
+		dl_b->dl_runtime = rt_runtime;
+		dl_b->dl_internal_runtime = rt_actual_runtime;
+	}
+
+	if (tg == &root_task_group)
+		return 0;
 
+	parent_ctx = dl_bandwidth_read(tg->parent)->active_context;
+
+	/*
+	 * If changing from/to "max" and a HCBS reservation, must update the
+	 * active_context of self and all of its subtree.
+	 */
+	if ((rt_runtime == RUNTIME_INF && old_rt_runtime != RUNTIME_INF) ||
+	    (rt_runtime != RUNTIME_INF && old_rt_runtime == RUNTIME_INF))
+	{
+		if (rt_runtime == RUNTIME_INF)
+			tg_update_active_context(tg, dl_b->active_context, parent_ctx);
+		else
+			tg_update_active_context(tg, dl_b->active_context, tg);
+
+	}
+
+	WARN_ON(rt_runtime == RUNTIME_INF && rt_actual_runtime != 0);
 	for_each_possible_cpu(i) {
-		struct rt_rq *rt_rq = tg->rt_rq[i];
+		dl_init_tg(tg->dl_se[i], rt_actual_runtime, rt_period);
+	}
+
+	/*
+	 * Update the dl_servers of the parent's active context
+	 */
+	if (parent_ctx == &root_task_group)
+		return 0;
+
+	scoped_guard(raw_spinlock_irq, dl_bw_lock_of_tg(parent_ctx)) {
+		dl_b = dl_bandwidth_write(parent_ctx);
 
-		raw_spin_lock(&rt_rq->rt_runtime_lock);
-		rt_rq->rt_runtime = rt_runtime;
-		raw_spin_unlock(&rt_rq->rt_runtime_lock);
+		bw = to_ratio(dl_b->dl_period, dl_b->dl_runtime);
+		children_bw = tg_compute_children_bw(parent_ctx, &update);
+
+		rt_period = dl_b->dl_period;
+		rt_actual_runtime = from_ratio(rt_period, bw - children_bw);
+		dl_b->dl_internal_runtime = rt_actual_runtime;
 	}
-	raw_spin_unlock_irq(&tg->rt_bandwidth.rt_runtime_lock);
-unlock:
-	mutex_unlock(&rt_constraints_mutex);
 
-	return err;
+	for_each_possible_cpu(i) {
+		dl_init_tg(parent_ctx->dl_se[i], rt_actual_runtime, rt_period);
+	}
+
+	return 0;
 }
 
 int sched_rt_can_attach(struct task_group *tg)
 {
+	struct task_group *ctx;
+
+	/* If rt group sched is disabled, tasks are always run in the root rq */
+	if (!rt_group_sched_enabled())
+		return 1;
+
+	/* Can always run on the root task group */
+	scoped_guard(raw_spinlock_irqsave, dl_bw_lock_of_tg(tg)) {
+		ctx = dl_bandwidth_read(tg)->active_context;
+		if (ctx == &root_task_group)
+			return 1;
+	}
+
 	/* Don't accept real-time tasks when there is no way for them to run */
-	if (rt_group_sched_enabled() && tg->dl_bandwidth.dl_runtime == 0)
-		return 0;
+	scoped_guard(raw_spinlock_irqsave, dl_bw_lock_of_tg(ctx)) {
+		if (dl_bandwidth_read(ctx)->dl_runtime == 0)
+			return 0;
+	}
 
 	return 1;
 }
@@ -2279,24 +2524,26 @@ static int sched_rt_global_validate(void)
 			NSEC_PER_USEC > max_rt_runtime)))
 		return -EINVAL;
 
-#ifdef CONFIG_RT_GROUP_SCHED
-	if (!rt_group_sched_enabled())
-		return 0;
-
-	scoped_guard(mutex, &rt_constraints_mutex)
-		return __rt_schedulable(NULL, 0, 0);
-#endif
 	return 0;
 }
 
+DEFINE_MUTEX(sched_rt_handler_mutex);
+
+void sched_rt_handler_mutex_lock() {
+	mutex_lock(&sched_rt_handler_mutex);
+}
+
+void sched_rt_handler_mutex_unlock() {
+	mutex_unlock(&sched_rt_handler_mutex);
+}
+
 static int sched_rt_handler(const struct ctl_table *table, int write, void *buffer,
 		size_t *lenp, loff_t *ppos)
 {
 	int old_period, old_runtime;
-	static DEFINE_MUTEX(mutex);
 	int ret;
 
-	mutex_lock(&mutex);
+	sched_rt_handler_mutex_lock();
 	sched_domains_mutex_lock();
 	old_period = sysctl_sched_rt_period;
 	old_runtime = sysctl_sched_rt_runtime;
@@ -2320,7 +2567,7 @@ static int sched_rt_handler(const struct ctl_table *table, int write, void *buff
 		sysctl_sched_rt_runtime = old_runtime;
 	}
 	sched_domains_mutex_unlock();
-	mutex_unlock(&mutex);
+	sched_rt_handler_mutex_unlock();
 
 	/*
 	 * After changing maximum available bandwidth for DEADLINE, we need to
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index efe52e162ba5..394f40dc26db 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -366,7 +366,7 @@ extern void sched_dl_do_global(void);
 extern int sched_dl_overflow(struct task_struct *p, int policy, const struct sched_attr *attr);
 extern void __setparam_dl(struct task_struct *p, const struct sched_attr *attr);
 extern void __getparam_dl(struct task_struct *p, struct sched_attr *attr, unsigned int flags);
-extern bool __checkparam_dl(const struct sched_attr *attr);
+extern bool __checkparam_dl(const struct sched_attr *attr, bool allow_zero_runtime);
 extern bool dl_param_changed(struct task_struct *p, const struct sched_attr *attr);
 extern int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
 extern int dl_bw_deactivate(int cpu);
@@ -425,6 +425,7 @@ extern void dl_server_init(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq,
 		    struct rq *served_rq,
 		    dl_server_pick_f pick_task);
 extern void sched_init_dl_servers(void);
+extern int dl_check_tg(unsigned long total);
 extern void dl_init_tg(struct sched_dl_entity *dl_se, u64 rt_runtime, u64 rt_period);
 
 extern void fair_server_init(struct rq *rq);
@@ -607,6 +608,12 @@ extern void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
 extern void unthrottle_cfs_rq(struct cfs_rq *cfs_rq);
 extern bool cfs_task_bw_constrained(struct task_struct *p);
 
+extern int tg_rt_bandwidth(struct task_group *tg,
+			   long *rt_period_us, long *rt_runtime_us);
+extern int tg_rt_internal_bandwidth(struct task_group *tg,
+				    long *rt_period_us, long *rt_runtime_us);
+extern int tg_set_rt_bandwidth(struct task_group *tg,
+			       u64 rt_period_us, u64 rt_runtime_us);
 extern int sched_rt_can_attach(struct task_group *tg);
 
 extern struct task_group *sched_create_group(struct task_group *parent);
@@ -2045,6 +2052,14 @@ DEFINE_LOCK_GUARD_1(raw_spin_rq_lock_irq, struct rq,
 		    raw_spin_rq_lock_irq(_T->lock),
 		    raw_spin_rq_unlock_irq(_T->lock))
 
+extern struct mutex sched_rt_handler_mutex;
+extern void sched_rt_handler_mutex_lock(void);
+extern void sched_rt_handler_mutex_unlock(void);
+
+DEFINE_LOCK_GUARD_0(sched_rt_handler,
+		    sched_rt_handler_mutex_lock(),
+		    sched_rt_handler_mutex_unlock())
+
 #ifdef CONFIG_NUMA
 
 enum numa_topology_type {
@@ -2938,6 +2953,7 @@ extern void init_cfs_throttle_work(struct task_struct *p);
 #define MAX_BW ((1ULL << MAX_BW_BITS) - 1)
 
 extern u64 to_ratio(u64 period, u64 runtime);
+extern u64 from_ratio(u64 period, u64 bw);
 
 extern void init_entity_runnable_average(struct sched_entity *se);
 extern void post_init_entity_util_avg(struct task_struct *p);
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 773f744c0460..e5b8d2f42ea8 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -528,7 +528,7 @@ int __sched_setscheduler(struct task_struct *p,
 	 */
 	if (attr->sched_priority > MAX_RT_PRIO-1)
 		return -EINVAL;
-	if ((dl_policy(policy) && !__checkparam_dl(attr)) ||
+	if ((dl_policy(policy) && !__checkparam_dl(attr, false)) ||
 	    (rt_policy(policy) != (attr->sched_priority != 0)))
 		return -EINVAL;
 
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
Acq92OH9K8NsLl/8a80Bqq5teTaChoUA87GsLX6MfxgNzx3zOMC9Jw85ZqAB//QUWjn zUnrbDpwv4lrzuObY+aX6dignW5draGgx0stVOgXUC9xSPgZnh1TOqewSv3rDAH7+jSbI/qDBvp BONyzPpILoRb67ZfTP/HF9J6c/VLheI0REPERUHA8TnvzfooDD0lo8pIN77p0rY50WyZ4ZN5yjz Rt5TpB1E1zSIHwl1M7O3eDbNR3ZJ9GmKQPthJ62fUXfNyN82Pli4yz9Uc1V6yv3uxcRuSoc1IbQ JcXdc+yNFW5UptgTztKZ6iKFMQbj/GqqiG6tQIE62Z2mZQY+/dE6Zyf+EwDxRaxPfWzmKFxuCG2 pWxdkO7f+eDoWuqLTov/ANdpbJQwoDPQGniijn9EmOda+964/IDCNvhm4NpwBiuKpHYfvxmNaGA HSzIyXwt1qfq11BmHQ+wa3917DkFPzSpfKBTfNuRspWA== X-Received: by 2002:a05:600c:8b83:b0:490:b106:4fe8 with SMTP id 5b1f17b1804b1-490c25e268dmr253370545e9.33.1780920965147; Mon, 08 Jun 2026 05:16:05 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.16.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:16:04 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 18/25] sched/rt: Update task's RT runqueue when switching scheduling class Date: Mon, 8 Jun 2026 14:15:37 +0200 Message-ID: <20260608121546.69910-19-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Signed-off-by: Yuri Andriaccio --- kernel/sched/rt.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index a32b1f68e645..fc7af6bda3f8 100644 --- a/kernel/sched/rt.c +++ 
b/kernel/sched/rt.c
@@ -1897,6 +1897,25 @@ void __init init_sched_rt_class(void)
 	}
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
+static void switching_to_rt(struct rq *rq, struct task_struct *p)
+{
+	struct task_group *tg = p->sched_task_group;
+	int cpu = rq->cpu;
+
+	if (tg == &root_task_group)
+		return;
+
+	guard(raw_spinlock_irqsave)(dl_bw_lock_of_tg(tg));
+	if (!rt_group_sched_enabled())
+		tg = &root_task_group;
+
+	p->rt.rt_rq = dl_bandwidth_read(tg)->active_context->rt_rq[cpu];
+}
+#else
+static void switching_to_rt(struct rq *rq, struct task_struct *p) {}
+#endif
+
 /*
  * When switching a task to RT, we may overload the runqueue
  * with RT tasks. In this case we try to push them off to
@@ -2095,6 +2114,7 @@ DEFINE_SCHED_CLASS(rt) = {
 
 	.get_rr_interval	= get_rr_interval_rt,
 
+	.switching_to		= switching_to_rt,
 	.switched_to		= switched_to_rt,
 	.prio_changed		= prio_changed_rt,
 
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C71E23B892D for ; Mon, 8 Jun 2026 12:16:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920969; cv=none; b=fWUFD1SQ0iEf4cqhhKF4cVzm6htEGhIAWs361ZuEXHLsqK+aIXXIwY8iRZLuKUsbqouh/Pjz7+Wt+7AhO/nIJFdnakUDZjUkVZZyAkqguxhXbPA9HmK8U1RITCgYO1eImdaOXc++RTwoM7sjpZ6ivXdqgbJUKjPTJeTfWnjdTVs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920969; c=relaxed/simple; bh=O4QRi/1KDDjUv1bgVSbxKdT6dZaaPpKxQaswvFSPefI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
QLyXR42hJ1NyfiMLuknnfp72Fmr18welMBLHZHKjNuUDvAmSYtALcr8F0R2b5aXv7c55 SXhjdphpOSp4pmVKN0kOdTWAdNgSbC+yM7vl7urkSpbBNcoZrVc923owQx0lxYqaK3mM ORjHDo4iyM+T18gRKsykKINNZFBue65scfkkO6rfZ1UpoNixmyG+DniSjCN12xfAr6n5 6f4BFL0jjsrNSZDfyqt2dT3zDuey1TXWM7ODPq0B3Zyc1NquqOuOGeiHT8GFI0e6mg7o NqcA== X-Forwarded-Encrypted: i=1; AFNElJ8dOYu3CIbNMWWzBb/fRYSeNLJZkSRivzuZRku1VY6n98LLnphavlotjzqQupukbrxg0z7+mTLgQpffFLw=@vger.kernel.org X-Gm-Message-State: AOJu0YzJJRAifxOGD1WEFwEU81hGwsYYnMGMIns9T+vvPtHDnBykfnkw xKfYwt9jzhi2AMxOMXMHQ4bg5N1AkNIXUZFzF7h/wsFvra4vKfTg7jJ4 X-Gm-Gg: Acq92OG7GdRu01VH+Fqo3M0DAb2IiQH7gKr4p7Aphg1nMBVdBqQJX142UuMMS2nt9B5 EbxG3jBpImHobpZFTyXUpmj9Th7hpnsuABjAZLkv6ZMEPGkSNISIfB/pPfGxSJ7PXcl1m+aUGBQ g1cbl0xy49cF0ITwsnwlu+W7XPLxZoNtN89Mu4VwT1eNcDH+hTxrcoeCGbJUhqPvQGVBgWHRtSi 1mOuGu/4vHbb0vi6OXU/xoGvJDj+8Ho+6bbaCLU2Gp90k4iCvJVyNfZ9b3e1VSPu34cLXNkWyZy iF2bUcbpWYT4aAVExB+8Btoq4KjPtX//+GVjB7WGcQUqzxUqWLKYU6ArCOS81RBOm6EYBW98MV6 ag6pjViRrk1n8FC5fICvCE9cBWyiCct/vsmayi8jHR33DGpNZBSqjEaojyS3sW/bcYAOXAg6oUr SWiihnmR11WisyD4adkC3xy5cLCjoiyKk= X-Received: by 2002:a05:6000:46cd:b0:460:1492:4f0d with SMTP id ffacd0b85a97d-4603063c591mr15526771f8f.34.1780920965986; Mon, 08 Jun 2026 05:16:05 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.16.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:16:05 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 19/25] sched/rt: Remove old RT_GROUP_SCHED data structures Date: Mon, 8 Jun 2026 14:15:38 +0200 Message-ID: <20260608121546.69910-20-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: 
<20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Completely remove the old RT_GROUP_SCHED functions and data structures:
- Remove the fields back and my_q from sched_rt_entity.
- Remove the rt_bandwidth data structure.
- Remove the fields rt_se and rt_bandwidth from task_group.
- Remove the rt_bandwidth_enabled function.
- Remove the fields rt_queued, rt_throttled, rt_time, rt_runtime,
  rt_runtime_lock and rt_nr_boosted from rt_rq.

All of the removed fields and data have equivalents in the previously
added fields of rq, rt_rq and dl_bandwidth, and in the dl servers
themselves.

Co-developed-by: Yuri Andriaccio
Signed-off-by: Yuri Andriaccio
Signed-off-by: luca abeni
---
 include/linux/sched.h |  3 ---
 kernel/sched/sched.h  | 33 ---------------------------------
 2 files changed, 36 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0021069581c2..e934ec9fc3a9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -628,12 +628,9 @@ struct sched_rt_entity {
 	unsigned short			on_rq;
 	unsigned short			on_list;
 
-	struct sched_rt_entity		*back;
 #ifdef CONFIG_RT_GROUP_SCHED
 	/* rq on which this entity is (to be) queued: */
 	struct rt_rq			*rt_rq;
-	/* rq "owned" by this entity/group: */
-	struct rt_rq			*my_q;
 #endif
 } __randomize_layout;
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 394f40dc26db..53248cbbeaf8 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -313,15 +313,6 @@ struct rt_prio_array {
 	struct list_head queue[MAX_RT_PRIO];
 };
 
-struct rt_bandwidth {
-	/* nests inside the rq lock: */
-	raw_spinlock_t		rt_runtime_lock;
-	ktime_t			rt_period;
-	u64			rt_runtime;
-	struct hrtimer		rt_period_timer;
-	unsigned int		rt_period_active;
-};
-
 struct dl_bandwidth {
 	raw_spinlock_t		dl_runtime_lock;
 	u64			dl_runtime;
@@ -343,12 +334,6 @@ static inline int dl_bandwidth_enabled(void)
  *  - cache the fraction of bandwidth that is currently allocated in
  *    each root domain;
  *
- * This is all done in the data structure below. It is similar to the
- * one used for RT-throttling (rt_bandwidth), with the main difference
- * that, since here we are only interested in admission control, we
- * do not decrease any runtime while the group "executes", neither we
- * need a timer to replenish it.
- *
  * With respect to SMP, bandwidth is given on a per root domain basis,
  * meaning that:
  *  - bw (< 100%) is the deadline bandwidth of each CPU;
@@ -511,11 +496,9 @@ struct task_group {
 	 * different deadline server, and a runqueue per CPU. All the dl-servers
 	 * share the same dl_bandwidth object.
 	 */
-	struct sched_rt_entity	**rt_se;
 	struct sched_dl_entity	**dl_se;
 	struct rt_rq		**rt_rq;
 
-	struct rt_bandwidth	rt_bandwidth;
 	struct dl_bandwidth	dl_bandwidth;
 #endif
 
@@ -842,11 +825,6 @@ struct scx_rq {
 };
 #endif /* CONFIG_SCHED_CLASS_EXT */
 
-static inline int rt_bandwidth_enabled(void)
-{
-	return 0;
-}
-
 /* RT IPI pull logic requires IRQ_WORK */
 #if defined(CONFIG_IRQ_WORK) && defined(CONFIG_SMP)
 # define HAVE_RT_PUSH_IPI
@@ -864,17 +842,6 @@ struct rt_rq {
 	bool			overloaded;
 	struct plist_head	pushable_tasks;
 
-	int			rt_queued;
-
-#ifdef CONFIG_RT_GROUP_SCHED
-	int			rt_throttled;
-	u64			rt_time; /* consumed RT time, goes up in update_curr_rt */
-	u64			rt_runtime; /* allotted RT time, "slice" from rt_bandwidth, RT sharing/balancing */
-	/* Nests inside the rq lock: */
-	raw_spinlock_t		rt_runtime_lock;
-
-	unsigned int		rt_nr_boosted;
-#endif
 #ifdef CONFIG_CGROUP_SCHED
 	struct task_group	*tg; /* this tg has "this" rt_rq on given CPU for runnable entities */
 #endif
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
Received: from mail-wr1-f51.google.com (mail-wr1-f51.google.com [209.85.221.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No
([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.16.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:16:06 -0700 (PDT)
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni, Yuri Andriaccio
Subject: [RFC PATCH v6 20/25] sched/rt: Add HCBS migration code to related functions
Date: Mon, 8 Jun 2026 14:15:39 +0200
Message-ID: <20260608121546.69910-21-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Update rt_queue_push_tasks() and rt_queue_pull_task() to differentiate
between cgroup and global runqueue balancing. Introduce new balance
callbacks for cgroup migration.

Add the rq_to_push_from and rq_to_pull_to fields to struct rq for
cgroup-related migration. Balance callbacks are only called on the root
runqueues, so it is necessary to store which non-root runqueues need to
be balanced.

Update the migration functions to specialize for cgroup migration:
- find_lowest_rt_rq(): Scan all the CPUs to get the cgroup-specific
  lowest_mask.
- find_lock_lowest_rt_rq(): Use the appropriate rt_rqs to differentiate
  the cgroup being checked. Prevent migration for throttled cgroups.
- push_rt_rq_task(): Allow pushing away for migration-disabled tasks
  only if the tasks belong to the same cgroup.
- pull_rt_rq_task(): Use the appropriate rt_rqs, and push away for
  migration-disabled tasks only if the task to pull and curr are in the
  same runqueue.

Add tg_of_se to get the task group a scheduling entity is assigned to. This is different from the active context of the group. Add new macros for field access and non-CONFIG_RT_GROUP_SCHED code. Co-developed-by: Alessio Balsini Signed-off-by: Alessio Balsini Co-developed-by: Andrea Parri Signed-off-by: Andrea Parri Co-developed-by: luca abeni Signed-off-by: luca abeni Signed-off-by: Yuri Andriaccio --- kernel/sched/rt.c | 263 ++++++++++++++++++++++++++++++++----------- kernel/sched/sched.h | 26 +++++ 2 files changed, 221 insertions(+), 68 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index fc7af6bda3f8..276eebe8d0a9 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -364,31 +364,61 @@ static inline int has_pushable_tasks(struct rt_rq *rt= _rq) static DEFINE_PER_CPU(struct balance_callback, rt_push_head); static DEFINE_PER_CPU(struct balance_callback, rt_pull_head); +static DEFINE_PER_CPU(struct balance_callback, rt_group_push_head); +static DEFINE_PER_CPU(struct balance_callback, rt_group_pull_head); static void push_rt_tasks(struct rq *); static void pull_rt_task(struct rq *); +static void push_group_rt_tasks(struct rq *); +static void pull_group_rt_task(struct rq *); static inline void rt_queue_push_tasks(struct rt_rq *rt_rq) { - struct rq *rq =3D global_rq_of_rt_rq(rt_rq); - - if (is_dl_group(rt_rq)) - return; + struct rq *rq =3D rq_of_rt_rq(rt_rq); + struct rq *global_rq =3D global_rq_of_rt_rq(rt_rq); if (!has_pushable_tasks(rt_rq)) return; - queue_balance_callback(rq, &per_cpu(rt_push_head, rq->cpu), push_rt_tasks= ); + if (!rt_group_sched_enabled() || !is_dl_group(rt_rq)) { + + queue_balance_callback(global_rq, + &per_cpu(rt_push_head, rq->cpu), + push_rt_tasks); + } else { + + if (rq_to_push_from(global_rq)) + return; + + rq_to_push_from(global_rq) =3D rq; + queue_balance_callback(global_rq, + &per_cpu(rt_group_push_head, global_rq->cpu), + push_group_rt_tasks); + } } static inline void rt_queue_pull_task(struct rt_rq *rt_rq) { - 
struct rq *rq =3D global_rq_of_rt_rq(rt_rq); + struct rq *rq =3D rq_of_rt_rq(rt_rq); + struct rq *global_rq =3D global_rq_of_rt_rq(rt_rq); + struct sched_dl_entity *dl_se; - if (is_dl_group(rt_rq)) - return; + if (!rt_group_sched_enabled() || !is_dl_group(rt_rq)) { - queue_balance_callback(rq, &per_cpu(rt_pull_head, rq->cpu), pull_rt_task); + queue_balance_callback(global_rq, + &per_cpu(rt_pull_head, rq->cpu), + pull_rt_task); + } else { + + dl_se =3D dl_group_of(rt_rq); + if (dl_se->dl_throttled || rq_to_pull_to(global_rq)) + return; + + rq_to_pull_to(global_rq) =3D rq; + queue_balance_callback(global_rq, + &per_cpu(rt_group_pull_head, global_rq->cpu), + pull_group_rt_task); + } } static void push_rt_rq_tasks(struct rt_rq *rt_rq); @@ -403,6 +433,27 @@ static void pull_rt_task(struct rq *global_rq) { pull_rt_rq_task(&global_rq->rt); } +static void push_group_rt_tasks(struct rq *global_rq) +{ + struct rq *rq =3D rq_to_push_from(global_rq); + struct rt_rq *rt_rq =3D &rq->rt; + + if (rt_rq->rt_nr_running <=3D 1 && !dl_group_of(rt_rq)->dl_throttled) + return; + + push_rt_rq_tasks(rt_rq); + rq_to_push_from(global_rq) =3D NULL; +} + +static void pull_group_rt_task(struct rq *global_rq) +{ + struct rq *rq =3D rq_to_pull_to(global_rq); + struct rt_rq *rt_rq =3D &rq->rt; + + pull_rt_rq_task(rt_rq); + rq_to_pull_to(global_rq) =3D NULL; +} + static void enqueue_pushable_task(struct rt_rq *rt_rq, struct task_struct = *p) { plist_del(&p->pushable_tasks, &rt_rq->pushable_tasks); @@ -1220,35 +1271,71 @@ static DEFINE_PER_CPU(cpumask_var_t, local_cpu_mask= ); static int find_lowest_rt_rq(struct task_struct *task) { struct sched_domain *sd; - struct cpumask *lowest_mask =3D this_cpu_cpumask_var_ptr(local_cpu_mask); - int this_cpu =3D smp_processor_id(); - int cpu =3D task_cpu(task); - int ret; - - /* Make sure the mask is initialized first */ - if (unlikely(!lowest_mask)) - return -1; + struct cpumask mask, *lowest_mask; + struct sched_dl_entity *dl_se; + struct rt_rq *rt_rq, 
*task_rt_rq =3D rt_rq_of_se(&task->rt); + int cpu, this_cpu =3D smp_processor_id(); + int ret, prio, lowest_prio; if (task->nr_cpus_allowed =3D=3D 1) return -1; /* No other targets possible */ - /* - * If we're on asym system ensure we consider the different capacities - * of the CPUs when searching for the lowest_mask. - */ - if (sched_asym_cpucap_active()) { + if (!rt_group_sched_enabled() || !is_dl_group(task_rt_rq)) { + + lowest_mask =3D this_cpu_cpumask_var_ptr(local_cpu_mask); - ret =3D cpupri_find_fitness(&task_rq(task)->rd->cpupri, - task, lowest_mask, - rt_task_fits_capacity); + /* Make sure the mask is initialized first */ + if (unlikely(!lowest_mask)) + return -1; + + /* + * If we're on asym system ensure we consider the different + * capacities of the CPUs when searching for the lowest_mask. + */ + if (sched_asym_cpucap_active()) { + + ret =3D cpupri_find_fitness(&task_rq(task)->rd->cpupri, + task, lowest_mask, + rt_task_fits_capacity); + } else { + + ret =3D cpupri_find(&task_rq(task)->rd->cpupri, + task, lowest_mask); + } + + if (!ret) + return -1; /* No targets found */ } else { - ret =3D cpupri_find(&task_rq(task)->rd->cpupri, - task, lowest_mask); + lowest_prio =3D task->prio - 1; + lowest_mask =3D &mask; + cpumask_clear(lowest_mask); + for_each_cpu_and(cpu, cpu_online_mask, task->cpus_ptr) { + dl_se =3D dl_se_of_tg(task_rt_rq->tg, cpu); + rt_rq =3D &dl_se->my_q->rt; + prio =3D rt_rq->highest_prio.curr; + + /* + * If we're on asym system ensure we consider the + * different capacities of the CPUs when searching for + * the lowest_mask. 
+ */ + if (dl_se->dl_throttled || !rt_task_fits_capacity(task, cpu)) + continue; + + if (prio >=3D lowest_prio) { + if (prio > lowest_prio) { + cpumask_clear(lowest_mask); + lowest_prio =3D prio; + } + + cpumask_set_cpu(cpu, lowest_mask); + } + } } - if (!ret) - return -1; /* No targets found */ + if (cpumask_empty(lowest_mask)) + return -1; /* * At this point we have built a mask of CPUs representing the @@ -1258,6 +1345,7 @@ static int find_lowest_rt_rq(struct task_struct *task) * We prioritize the last CPU that the task executed on since * it is most likely cache-hot in that location. */ + cpu =3D task_cpu(task); if (cpumask_test_cpu(cpu, lowest_mask)) return cpu; @@ -1268,30 +1356,27 @@ static int find_lowest_rt_rq(struct task_struct *ta= sk) if (!cpumask_test_cpu(this_cpu, lowest_mask)) this_cpu =3D -1; /* Skip this_cpu opt if not among lowest */ - rcu_read_lock(); - for_each_domain(cpu, sd) { - if (sd->flags & SD_WAKE_AFFINE) { + scoped_guard(rcu) { + for_each_domain(cpu, sd) { int best_cpu; + if (!(sd->flags & SD_WAKE_AFFINE)) + continue; + /* * "this_cpu" is cheaper to preempt than a * remote processor. 
*/ if (this_cpu !=3D -1 && - cpumask_test_cpu(this_cpu, sched_domain_span(sd))) { - rcu_read_unlock(); + cpumask_test_cpu(this_cpu, sched_domain_span(sd))) return this_cpu; - } best_cpu =3D cpumask_any_and_distribute(lowest_mask, sched_domain_span(sd)); - if (best_cpu < nr_cpu_ids) { - rcu_read_unlock(); + if (best_cpu < nr_cpu_ids) return best_cpu; - } } } - rcu_read_unlock(); /* * And finally, if there were no matches within the domains @@ -1342,27 +1427,35 @@ static struct task_struct *pick_next_pushable_task(= struct rt_rq *rt_rq) /* Will lock the rq it finds */ static struct rt_rq *find_lock_lowest_rt_rq(struct task_struct *task, stru= ct rt_rq *rt_rq) { - struct rq *rq =3D rq_of_rt_rq(rt_rq); - struct rq *lowest_rq =3D NULL; - int tries; - int cpu; + struct rq *lowest_rq, *rq =3D global_rq_of_rt_rq(rt_rq); + struct rt_rq *lowest_rt_rq; + struct sched_dl_entity *lowest_dl_se; + int tries, cpu; + bool dl_group; + + dl_group =3D rt_group_sched_enabled() && is_dl_group(rt_rq); for (tries =3D 0; tries < RT_MAX_TRIES; tries++) { cpu =3D find_lowest_rt_rq(task); if ((cpu =3D=3D -1) || (cpu =3D=3D rq->cpu)) - break; + return NULL; lowest_rq =3D cpu_rq(cpu); + if (dl_group) { + lowest_dl_se =3D dl_se_of_tg(rt_rq->tg, cpu); + lowest_rt_rq =3D &lowest_dl_se->my_q->rt; + } else { + lowest_rt_rq =3D &lowest_rq->rt; + } - if (lowest_rq->rt.highest_prio.curr <=3D task->prio) { + if (lowest_rt_rq->highest_prio.curr <=3D task->prio) { /* * Target rq has tasks of equal or higher priority, * retrying does not release any lock and is unlikely * to yield a different result. */ - lowest_rq =3D NULL; - break; + return NULL; } /* if the prio of this runqueue changed, try again */ @@ -1378,25 +1471,24 @@ static struct rt_rq *find_lock_lowest_rt_rq(struct = task_struct *task, struct rt_ * check the task migration disable flag here too. 
 			 */
 			if (unlikely(is_migration_disabled(task) ||
+				     (dl_group && lowest_dl_se->dl_throttled) ||
 				     !cpumask_test_cpu(lowest_rq->cpu, &task->cpus_mask) ||
 				     task != pick_next_pushable_task(rt_rq))) {

 				double_unlock_balance(rq, lowest_rq);
-				lowest_rq = NULL;
-				break;
+				return NULL;
 			}
 		}

 		/* If this rq is still suitable use it. */
-		if (lowest_rq->rt.highest_prio.curr > task->prio)
-			break;
+		if (lowest_rt_rq->highest_prio.curr > task->prio)
+			return lowest_rt_rq;

 		/* try again */
 		double_unlock_balance(rq, lowest_rq);
-		lowest_rq = NULL;
 	}

-	return &lowest_rq->rt;
+	return NULL;
 }

 static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq) {
@@ -1413,12 +1505,10 @@ static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq) {
 static int push_rt_rq_task(struct rt_rq *rt_rq, bool pull)
 {
 	struct task_struct *next_task;
-	struct rq *lowest_rq, *rq = rq_of_rt_rq(rt_rq);
+	struct rq *lowest_rq, *rq = global_rq_of_rt_rq(rt_rq);
 	struct rt_rq *lowest_rt_rq;
 	int ret = 0;
-
-	if (is_dl_group(rt_rq))
-		return 0;
+	bool dl_group;

 	if (!rt_rq->overloaded)
 		return 0;
@@ -1433,7 +1523,8 @@ static int push_rt_rq_task(struct rt_rq *rt_rq, bool pull)
 	 * higher priority than current. If that's the case
 	 * just reschedule current.
 	 */
-	if (unlikely(next_task->prio < rq->donor->prio)) {
+	dl_group = rt_group_sched_enabled() && is_dl_group(rt_rq);
+	if (!dl_group && unlikely(next_task->prio < rq->donor->prio)) {
 		resched_curr(rq);
 		return 0;
 	}
@@ -1445,6 +1536,13 @@ static int push_rt_rq_task(struct rt_rq *rt_rq, bool pull)
 	if (!pull || rq->push_busy)
 		return 0;

+	/*
+	 * If the current task does not belong to the same task group
+	 * we cannot push it away.
+	 */
+	if (dl_group && rq->donor->sched_task_group != rt_rq->tg)
+		return 0;
+
 	/*
 	 * Invoking find_lowest_rt_rq() on anything but an RT task doesn't
 	 * make sense. Per the above priority check, curr has to
@@ -1521,7 +1619,7 @@ static int push_rt_rq_task(struct rt_rq *rt_rq, bool pull)
 		goto retry;
 	}

-	lowest_rq = rq_of_rt_rq(lowest_rt_rq);
+	lowest_rq = global_rq_of_rt_rq(lowest_rt_rq);
 	move_queued_task_locked(rq, lowest_rq, next_task);
 	resched_curr(lowest_rq);
 	ret = 1;
@@ -1718,16 +1816,22 @@ void rto_push_irq_work_func(struct irq_work *work)

 static void pull_rt_rq_task(struct rt_rq *this_rt_rq)
 {
-	struct rq *this_rq = rq_of_rt_rq(this_rt_rq);
+	struct rq *this_rq = global_rq_of_rt_rq(this_rt_rq);
 	int this_cpu = this_rq->cpu, cpu;
 	bool resched = false;
-	struct task_struct *p, *push_task;
+	struct task_struct *p, *push_task = NULL;
+	struct sched_dl_entity *src_dl_se;
 	struct rt_rq *src_rt_rq;
 	struct rq *src_rq;
-	int rt_overload_count = rt_overloaded(this_rq);
+	int rt_overload_count;
+	const struct cpumask *cpu_mask;
+	bool dl_group;

-	if (is_dl_group(&this_rq->rt))
-		return;
+	dl_group = rt_group_sched_enabled() && is_dl_group(this_rt_rq);
+	if (dl_group)
+		goto group_sched;
+
+	rt_overload_count = rt_overloaded(this_rq);

 	if (likely(!rt_overload_count))
 		return;
@@ -1750,12 +1854,26 @@ static void pull_rt_rq_task(struct rt_rq *this_rt_rq)
 	}
 #endif

-	for_each_cpu(cpu, this_rq->rd->rto_mask) {
+group_sched:
+	if (!dl_group)
+		cpu_mask = this_rq->rd->rto_mask;
+	else
+		cpu_mask = cpu_online_mask;
+
+	for_each_cpu(cpu, cpu_mask) {
 		if (this_cpu == cpu)
 			continue;

 		src_rq = cpu_rq(cpu);
-		src_rt_rq = &src_rq->rt;
+		if (!dl_group) {
+			src_rt_rq = &src_rq->rt;
+		} else {
+			src_dl_se = dl_se_of_tg(this_rt_rq->tg, cpu);
+			src_rt_rq = &src_dl_se->my_q->rt;
+
+			if (src_rt_rq->rt_nr_running <= 1 && !src_dl_se->dl_throttled)
+				continue;
+		}

 		/*
 		 * Don't bother taking the src_rq->lock if the next highest
@@ -1796,12 +1914,21 @@ static void pull_rt_rq_task(struct rt_rq *this_rt_rq)
 			 * This is just that p is waking up and hasn't
 			 * had a chance to schedule. We only pull
 			 * p if it is lower in priority than the
-			 * current task on the run queue
+			 * current task on the run queue and p is
+			 * in the same runqueue as donor.
 			 */
-			if (p->prio < src_rq->donor->prio)
+			if (tg_of_se(&src_rq->donor->rt) == this_rt_rq->tg &&
+			    p->prio < src_rq->donor->prio)
 				goto skip;

 			if (is_migration_disabled(p)) {
+				/*
+				 * If the current task does not belong to the
+				 * same task group we cannot push it away.
+				 */
+				if (tg_of_se(&src_rq->donor->rt) != this_rt_rq->tg)
+					goto skip;
+
 				push_task = get_push_task(src_rq);
 			} else {
 				move_queued_task_locked(src_rq, this_rq, p);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 53248cbbeaf8..3acc88a035a5 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1341,6 +1341,16 @@ struct rq {
 	struct list_head cfsb_csd_list;
 #endif

+#ifdef CONFIG_RT_GROUP_SCHED
+	/*
+	 * Balance callbacks operate only on global runqueues.
+	 * These pointers allow referencing cgroup specific runqueues
+	 * for balancing operations.
+	 */
+	struct rq *rq_to_push_from;
+	struct rq *rq_to_pull_to;
+#endif
+
 	atomic_t nr_iowait;
 } __no_randomize_layout;

@@ -3366,6 +3376,11 @@ static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 	return rt_se->rt_rq;
 }

+static inline struct task_group *tg_of_se(struct sched_rt_entity *rt_se)
+{
+	return rt_rq_of_se(rt_se)->tg;
+}
+
 static inline int is_dl_group(struct rt_rq *rt_rq)
 {
 	return rt_rq->tg != &root_task_group;
@@ -3382,6 +3397,9 @@ static inline struct sched_dl_entity *dl_group_of(struct rt_rq *rt_rq)
 	return rt_rq->tg->dl_se[rq_of_rt_rq(rt_rq)->cpu];
 }

+#define rq_to_push_from(rq) ((rq)->rq_to_push_from)
+#define rq_to_pull_to(rq) ((rq)->rq_to_pull_to)
+#define dl_se_of_tg(tg, cpu) ((tg)->dl_se[(cpu)])
 #define dl_bw_lock_of_tg(tg) (&(tg)->dl_bandwidth.dl_runtime_lock)
 #else
 static inline struct task_struct *rt_task_of(struct sched_rt_entity *rt_se)
@@ -3406,6 +3424,11 @@ static inline struct rt_rq *rt_rq_of_se(struct sched_rt_entity *rt_se)
 	return &rq->rt;
 }

+static inline struct task_group *tg_of_se(struct sched_rt_entity *rt_se)
+{
+	return &root_task_group;
+}
+
 static inline int is_dl_group(struct rt_rq *rt_rq)
 {
 	return 0;
@@ -3416,6 +3439,9 @@ static inline struct sched_dl_entity *dl_group_of(struct rt_rq *rt_rq)
 	return NULL;
 }

+#define rq_to_push_from(rq) (rq)
+#define rq_to_pull_to(rq) (rq)
+#define dl_se_of_tg(tg, cpu) ((struct sched_dl_entity*)NULL)
 #define dl_bw_lock_of_tg(tg) ((raw_spinlock_t*)NULL)
 #endif

-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 21/25] sched/rt: Hook HCBS migration functions
Date: Mon, 8 Jun 2026 14:15:40 +0200
Message-ID: <20260608121546.69910-22-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Hook rt-cgroup migration functions:
- select_task_rq_rt
  For tasks inside a task group, always return the CPU the task is
  already on (task_cpu).
- balance_rt
- put_prev_task_rt
  If a server is throttled, put_prev_task_rt is invoked and a push is
  necessary so that the task can keep running on another server if
  possible.
- switched_to_rt
  Keep track of the deadline server that is assigned to the task
  switching to FIFO/RR priority.

Co-developed-by: Alessio Balsini
Signed-off-by: Alessio Balsini
Co-developed-by: Andrea Parri
Signed-off-by: Andrea Parri
Co-developed-by: luca abeni
Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 kernel/sched/rt.c | 38 +++++++++++++++++++++++++++++++-------
 1 file changed, 31 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 276eebe8d0a9..964704d88ba1 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -976,6 +976,10 @@ select_task_rq_rt(struct task_struct *p, int cpu, int flags)
 	struct rq *rq;
 	bool test;

+	/* Just return the task_cpu for processes inside task groups */
+	if (is_dl_group(rt_rq_of_se(&p->rt)))
+		goto out;
+
 	/* For anything but wake ups, just return the task_cpu */
 	if (!(flags & (WF_TTWU | WF_FORK)))
 		goto out;
@@ -1065,21 +1069,25 @@ static void check_preempt_equal_prio(struct rq *rq, struct task_struct *p)
 	resched_curr(rq);
 }

-static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+static int balance_rt(struct rq *global_rq, struct task_struct *p, struct rq_flags *rf)
 {
-	if (!on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
+	struct rt_rq *rt_rq = rt_rq_of_se(&p->rt);
+
+	if (!on_rt_rq(&p->rt) && need_pull_rt_task(rq_of_rt_rq(rt_rq), p)) {
 		/*
 		 * This is OK, because current is on_cpu, which avoids it being
 		 * picked for load-balance and preemption/IRQs are still
 		 * disabled avoiding further scheduler activity on it and we've
 		 * not yet started the picking loop.
 		 */
-		rq_unpin_lock(rq, rf);
-		pull_rt_rq_task(&rq->rt);
-		rq_repin_lock(rq, rf);
+		rq_unpin_lock(global_rq, rf);
+		pull_rt_rq_task(rt_rq);
+		rq_repin_lock(global_rq, rf);
 	}

-	return sched_stop_runnable(rq) || sched_dl_runnable(rq) || sched_rt_runnable(rq);
+	return sched_stop_runnable(global_rq) ||
+	       sched_dl_runnable(global_rq) ||
+	       sched_rt_runnable(global_rq);
 }

 /*
@@ -1241,6 +1249,13 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct task_s
 	 */
 	if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
 		enqueue_pushable_task(rt_rq, p);
+
+	if (is_dl_group(rt_rq)) {
+		struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+		if (dl_se->dl_throttled)
+			rt_queue_push_tasks(rt_rq);
+	}
 }

 /* Only try algorithms three times */
@@ -2050,12 +2065,21 @@ static void switching_to_rt(struct rq *rq, struct task_struct *p) {}
  */
 static void switched_to_rt(struct rq *rq, struct task_struct *p)
 {
+	struct rt_rq *rt_rq = rt_rq_of_se(&p->rt);
+
 	/*
 	 * If we are running, update the avg_rt tracking, as the running time
 	 * will now on be accounted into the latter.
 	 */
 	if (task_current(rq, p)) {
 		update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 0);
+
+		if (is_dl_group(rt_rq)) {
+			struct sched_dl_entity *dl_se = dl_group_of(rt_rq);
+
+			p->dl_server = dl_se;
+		}
+
 		return;
 	}

@@ -2066,7 +2090,7 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p)
 	 */
 	if (task_on_rq_queued(p)) {
 		if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
-			rt_queue_push_tasks(rt_rq_of_se(&p->rt));
+			rt_queue_push_tasks(rt_rq);

 		if (p->prio < rq->donor->prio && cpu_online(cpu_of(rq)))
 			resched_curr(rq);
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 22/25] sched/core: Execute enqueued balance callbacks when changing allowed CPUs
Date: Mon, 8 Jun 2026 14:15:41 +0200
Message-ID: <20260608121546.69910-23-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: luca abeni

Execute balancing callbacks when setting the affinity of a task, since the
HCBS scheduler may request balancing of throttled dl_servers to fully
utilize the server's bandwidth.

Signed-off-by: luca abeni
Signed-off-by: Yuri Andriaccio
---
 kernel/sched/core.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1ad1efe1dca7..9e337f0090b3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2933,6 +2933,7 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 	if (cpumask_test_cpu(task_cpu(p), &p->cpus_mask) ||
 	    (task_current_donor(rq, p) && !task_current(rq, p))) {
 		struct task_struct *push_task = NULL;
+		struct balance_callback *head;

 		if ((flags & SCA_MIGRATE_ENABLE) &&
 		    (p->migration_flags & MDF_PUSH) && !rq->push_busy) {
@@ -2951,11 +2952,13 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 		}

 		preempt_disable();
+		head = splice_balance_callbacks(rq);
 		task_rq_unlock(rq, p, rf);
 		if (push_task) {
 			stop_one_cpu_nowait(rq->cpu, push_cpu_stop,
 					    p, &rq->push_work);
 		}
+		balance_callbacks(rq, head);
 		preempt_enable();

 		if (complete)
@@ -3010,6 +3013,8 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 	}

 	if (task_on_cpu(rq, p) || READ_ONCE(p->__state) == TASK_WAKING) {
+		struct balance_callback *head;
+
 		/*
 		 * MIGRATE_ENABLE gets here because 'p == current', but for
 		 * anything else we cannot do is_migration_disabled(), punt
@@ -3023,16 +3028,19 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 			p->migration_flags &= ~MDF_PUSH;

 		preempt_disable();
+		head = splice_balance_callbacks(rq);
 		task_rq_unlock(rq, p, rf);
 		if (!stop_pending) {
 			stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop,
 					    &pending->arg, &pending->stop_work);
 		}
+		balance_callbacks(rq, head);
 		preempt_enable();

 		if (flags & SCA_MIGRATE_ENABLE)
 			return 0;
 	} else {
+		struct balance_callback *head;

 		if (!is_migration_disabled(p)) {
 			if (task_on_rq_queued(p))
@@ -3043,7 +3051,12 @@ static int affine_move_task(struct rq *rq, struct task_struct *p, struct rq_flag
 				complete = true;
 			}
 		}
+
+		preempt_disable();
+		head = splice_balance_callbacks(rq);
 		task_rq_unlock(rq, p, rf);
+		balance_callbacks(rq, head);
+		preempt_enable();

 		if (complete)
 			complete_all(&pending->done);
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 23/25] sched/rt: Try pull task on empty server pick.
Date: Mon, 8 Jun 2026 14:15:42 +0200
Message-ID: <20260608121546.69910-24-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Try to pull a task on a server with an empty runqueue before returning NULL
(and thus shutting down).

---

When all the servers of a cgroup are throttled, work is pending, and any one
of the servers is replenished, it may happen that its runqueue is empty and
thus the replenished server is immediately shut down.

The server may try to pull a task so that the cgroup can consume its
allocated runtime as soon as it is replenished.

Signed-off-by: Yuri Andriaccio
---
 kernel/sched/rt.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 964704d88ba1..f672ef17e5d1 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -285,14 +285,22 @@ int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
 }

 static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq);
+static void pull_rt_task(struct rq *);

 static struct task_struct *rt_server_pick(struct sched_dl_entity *dl_se, struct rq_flags *rf)
 {
 	struct rt_rq *rt_rq = &dl_se->my_q->rt;
+	struct rq *global_rq = global_rq_of_rt_rq(rt_rq);
 	struct task_struct *p;

-	if (!sched_rt_runnable(dl_se->my_q))
-		return NULL;
+	if (!sched_rt_runnable(dl_se->my_q)) {
+		rq_unpin_lock(global_rq, rf);
+		pull_rt_task(rq_of_rt_rq(rt_rq));
+		rq_repin_lock(global_rq, rf);
+
+		if (!sched_rt_runnable(dl_se->my_q))
+			return NULL;
+	}

 	p = rt_task_of(pick_next_rt_entity(rt_rq));

-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
From: Yuri Andriaccio
To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Valentin Schneider, Tejun Heo, Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni,
 Yuri Andriaccio
Subject: [RFC PATCH v6 24/25] sched/core: Execute enqueued balance callbacks after migrate_disable_switch
Date: Mon, 8 Jun 2026 14:15:43 +0200
Message-ID: <20260608121546.69910-25-yurand2000@gmail.com>
X-Mailer: git-send-email 2.54.0
In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com>
References: <20260608121546.69910-1-yurand2000@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Execute balance callbacks after migrate_disable_switch.

Balancing may be requested on the __schedule path, in
migrate_disable_switch, when the running task is throttled and then pushed
away from its runqueue.

Signed-off-by: Yuri Andriaccio
---
 kernel/sched/core.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9e337f0090b3..1d458638aab9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2410,6 +2410,9 @@ do_set_cpus_allowed(struct task_struct *p, struct affinity_context *ctx);

 static void migrate_disable_switch(struct rq *rq, struct task_struct *p)
 {
+	struct rq_flags rf;
+	struct balance_callback *head;
+
 	struct affinity_context ac = {
 		.new_mask = cpumask_of(rq->cpu),
 		.flags = SCA_MIGRATE_DISABLE,
@@ -2421,8 +2424,13 @@ static void migrate_disable_switch(struct rq *rq, struct task_struct *p)
 	if (p->cpus_ptr != &p->cpus_mask)
 		return;

-	scoped_guard (task_rq_lock, p)
-		do_set_cpus_allowed(p, &ac);
+	rq = task_rq_lock(p, &rf);
+
+	do_set_cpus_allowed(p, &ac);
+
+	head = splice_balance_callbacks(rq);
+	task_rq_unlock(rq, p, &rf);
+	balance_callbacks(rq, head);
 }

 void ___migrate_enable(void)
-- 
2.54.0

From nobody Sun Jun 14 02:12:34 2026
d=subspace.kernel.org; s=arc-20240116; t=1780920976; cv=none; b=Pvp4PgHF72zJVdMAy0dhR1h/7b9syi1Cj5+T9csEQ7jLiXp1dZPH/jFmZFyZ/VzLG7MU54lB8g4fovzB+VXTYad7pODXSlwdp1ya+IC+jUt0umSCbbyUqnsOIAnGsI2EcShfrck8/2sVeHWmUYpzZaCv2ld2InNh6hyRP7Bg8io= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780920976; c=relaxed/simple; bh=CqaDjjQjOkeL8yfGY4Eb5JSwo7hijLXuwcbTorJIl7E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=n3bcXr3c8BgV1+9LrNxW1lLpV03yLKwFyHt+o80kgTL7ojnU8SOgKmLgZSWSHJqt8/eoTroB+uie78gnwoTrOxHz20+2FdX2kQ7nFIyzUiRMk1MVpMsokaxu8XAxA0YXC8xJg/uKNV3n+WjipaBo7T8S1Iya8dv3JH3hqIAidFo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=N0JjK+yh; arc=none smtp.client-ip=209.85.221.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="N0JjK+yh" Received: by mail-wr1-f53.google.com with SMTP id ffacd0b85a97d-45ef779c1c2so2859186f8f.1 for ; Mon, 08 Jun 2026 05:16:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780920971; x=1781525771; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=G+Hu7j6VzNGO+KsqpoMk4Vp1tTzaI24I1m+yxcC7P6A=; b=N0JjK+yhCbLcXTXdml3xq4nEOYuD6mOW2X+ma1dcCbmlTyOZVOtwEehld76VOyCTXc usyLn/Rrz+eOFAcT2b6tdcJye/9ted8bPnLXv+3Uol8NowUGZzFcNuDSVM/Z7r1rnG/9 GNBY3aQ1v7wShuvqnV3Rmj11FC/mhVf/QWCVFzrx4WMkwQNZe7ac8toDBbRFWx+ppr58 WYhqfnuGUrVMdNG80YBydvRT5D+HwyJtGWgX4OgWIcBiAhAd5h7DVnlvBnys4Ztn42Xf 
ARISl0NhiF86gyP5BOJI3YWh0pCVSG9gU2l3n37zaSeomJ00Xq4+59egJwCOIsdoRl4Y 1ymA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780920971; x=1781525771; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=G+Hu7j6VzNGO+KsqpoMk4Vp1tTzaI24I1m+yxcC7P6A=; b=qQXv6EROXQb4GXzxJdZ9UVDDnHJWxsjcRrqOR4o8BHkCUxOCSRO2gdOY9HSyTmCbTw i+mhEu7T0fuXAkbtJ1t4qopvj2hQ7SKPymmVcba4oVL5HEmK9Qx0J8araG67q3z6yiu4 87nhENXFWy0iWLGpUZqBEWSUi9DvFkL5RZjzvNHJq2/XkK2aMlOQgebNcVKfrOMN+KZU byhnSRdcpAY8rEQnmnEcnJKh2KCoO9eVphe895La60FPQU7Lxso9M1g7TOJ7DZUjtRL8 IEzb2mkgS9d6079c7wIsaE/VQQahn66vCvU4GkpXb1D/lckMWFA3ATLL5DvukVDu8prT 8TYQ== X-Forwarded-Encrypted: i=1; AFNElJ8Gtod13LL+PmWDD/VjGk+dYASygcN4C42Hx0EOpBPJxhPSfJ1CEMhLCPSKrlCJwHZGiqpXWvspTlYl2w8=@vger.kernel.org X-Gm-Message-State: AOJu0YyhshQe8jg/bsQ2RavE5aOWcGwAG6fEWJxpbuYZLJ50WIPDaEYD 852C71m0xr6DWNmmvrDE+Wb5ybpxfvfkgJ1YNM93gTycrnp1G8OYpDL0 X-Gm-Gg: Acq92OFqUtkWYJKMBXcvkLKw19zGsypvoiCdx+wpVUcfskyHDJB5f8pJIbYFUqX/03o oUlXd09T/DyAKVvY2FBm7RsICauuPWjms+irGvodC54v2NMGLNJWr1Tnhws/69knrqubuPqV2pV MuxjJS1JxIkRk3gdvf4s94n+EPzXl7qm8+75oMjv/89MfH2Uo304+rs9spNqB0IHY92LDX3IdKT Szq5eUdO9/pfrgAUaFa/pmWeNyPHLduOrCRg9P1KRWpERfW4OgF2d+1AhjSOC+ghqxbsZFQGMhM 7574gceCekiIonIuRuBhEmLy/7+k4ZOcjrXIMqYZhkB/3XOrgYa6rXttrisDjAUlWrTzXpaw5UU LJAf9msnZve/uORrN3VAReiIVdNRyjaCa0qNKEmVvsUg3kOtFT1y9PPr/cnDZjYxDGU/Y4u+whB 86ujkJeyeEz4oponAZG6koAaH7+si3yIL5xk3e7GMiCw== X-Received: by 2002:a05:6000:c41:b0:455:70bc:216d with SMTP id ffacd0b85a97d-460304fc121mr18378365f8f.12.1780920971112; Mon, 08 Jun 2026 05:16:11 -0700 (PDT) Received: from victus-lab ([193.205.81.5]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-4601f2ec711sm50644906f8f.12.2026.06.08.05.16.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Jun 2026 05:16:10 -0700 (PDT) From: Yuri Andriaccio To: Ingo Molnar , Peter Zijlstra , Juri 
Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Luca Abeni , Yuri Andriaccio Subject: [RFC PATCH v6 25/25] Documentation: Update documentation for real-time cgroups Date: Mon, 8 Jun 2026 14:15:44 +0200 Message-ID: <20260608121546.69910-26-yurand2000@gmail.com> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260608121546.69910-1-yurand2000@gmail.com> References: <20260608121546.69910-1-yurand2000@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Update the RT_GROUP_SCHED specific documentation. Give a brief theoretical background for Hierarchical Constant Bandwidth Server (HCBS). Document how the HCBS is implemented in the kernel and how the RT_GROUP_SCHED behaves now compared to the version which this patchset replaces. Signed-off-by: Yuri Andriaccio --- Documentation/scheduler/sched-rt-group.rst | 470 +++++++++++++++++---- 1 file changed, 393 insertions(+), 77 deletions(-) diff --git a/Documentation/scheduler/sched-rt-group.rst b/Documentation/sch= eduler/sched-rt-group.rst index ab464335d320..f00bec718d67 100644 --- a/Documentation/scheduler/sched-rt-group.rst +++ b/Documentation/scheduler/sched-rt-group.rst @@ -53,9 +53,12 @@ CPU time is divided by means of specifying how much time= can be spent running in a given period. We allocate this "run time" for each real-time group wh= ich the other real-time groups will not be permitted to use. -Any time not allocated to a real-time group will be used to run normal pri= ority -tasks (SCHED_OTHER). Any allocated run time not used will also be picked u= p by -SCHED_OTHER. 
+Each real-time group runs at the same priority as SCHED_DEADLINE, thus they
+share and contend for the bandwidth allowed to SCHED_DEADLINE. Any time not
+allocated to a real-time group (and SCHED_DEADLINE tasks) will be used to run
+SCHED_FIFO/SCHED_RR tasks, normal priority tasks (SCHED_OTHER), and SCHED_EXT
+tasks, following the usual priorities. Any allocated run time not used will
+also be picked up by the other scheduling classes, in the same order as before.
 
 Let's consider an example: a frame fixed real-time renderer must deliver 25
 frames a second, which yields a period of 0.04s per frame. Now say it will also
@@ -73,10 +76,6 @@ The remaining CPU time will be used for user input and other tasks. Because
 real-time tasks have explicitly allocated the CPU time they need to perform
 their tasks, buffer underruns in the graphics or audio can be eliminated.
 
-NOTE: the above example is not fully implemented yet. We still
-lack an EDF scheduler to make non-uniform periods usable.
-
-
 2. The Interface
 ================
 
@@ -86,105 +85,422 @@ lack an EDF scheduler to make non-uniform periods usable.
 The system wide settings are configured under the /proc virtual file system:
 
-/proc/sys/kernel/sched_rt_period_us:
+``/proc/sys/kernel/sched_rt_period_us``:
   The scheduling period that is equivalent to 100% CPU bandwidth.
 
-/proc/sys/kernel/sched_rt_runtime_us:
-  A global limit on how much time real-time scheduling may use. This is always
-  less or equal to the period_us, as it denotes the time allocated from the
-  period_us for the real-time tasks. Without CONFIG_RT_GROUP_SCHED enabled,
-  this only serves for admission control of deadline tasks. With
-  CONFIG_RT_GROUP_SCHED=y it also signifies the total bandwidth available to
-  all real-time groups.
+``/proc/sys/kernel/sched_rt_runtime_us``:
+  A global limit on how much time real-time scheduling may use (SCHED_DEADLINE
+  tasks + real-time groups). This is always less than or equal to period_us,
+  as it denotes the time allocated from the period_us for the real-time tasks.
+  Without **CONFIG_RT_GROUP_SCHED** enabled, this only serves for admission
+  control of deadline tasks. With **CONFIG_RT_GROUP_SCHED=y** it also signifies
+  the total bandwidth available to both real-time groups and deadline tasks.
 
   * Time is specified in us because the interface is s32. This gives an
     operating range from 1us to about 35 minutes.
-  * sched_rt_period_us takes values from 1 to INT_MAX.
-  * sched_rt_runtime_us takes values from -1 to sched_rt_period_us.
-  * A run time of -1 specifies runtime == period, ie. no limit.
-  * sched_rt_runtime_us/sched_rt_period_us > 0.05 inorder to preserve
-    bandwidth for fair dl_server. For accurate value check average of
-    runtime/period in /sys/kernel/debug/sched/fair_server/cpuX/
+  * ``sched_rt_period_us`` takes values from 1 to INT_MAX.
+  * ``sched_rt_runtime_us`` takes values from -1 to ``sched_rt_period_us``.
+  * A run time of -1 specifies runtime == period, i.e., no limit, but also
+    disables admission tests for SCHED_DEADLINE.
+
+The default value for both ``sched_rt_period_us`` and ``sched_rt_runtime_us`` is
+1000000 (or 1s), while fair-servers and ext-servers have a default runtime of
+50ms and a default period of 1s. This leaves a minimum of 0.05s for
+SCHED_FIFO/SCHED_RR and non-RT tasks (SCHED_OTHER, SCHED_EXT), and a maximum of
+0.95s for SCHED_DEADLINE tasks and, if enabled, rt-cgroups.
+
+2.2 Cgroup settings
+-------------------
+
+Enabling **CONFIG_RT_GROUP_SCHED** lets you explicitly allocate real CPU
+bandwidth to task groups.
+
+  .. warning::
+    Real Time Cgroups are only available for cgroups-v2.
+  ..
+
+This uses the cgroup virtual file system and the CPU controller for cgroups.
+Enabling the controller for the hierarchy creates two files:
+
+* ``/cpu.rt.max``, which specifies the runtime and period of the group.
+  The file also accepts a runtime of 'max', specifying that its tasks must be
+  scheduled using the nearest configured ancestor (or the root cgroup if it is
+  the nearest non-max ancestor).
+* ``/cpu.rt.internal``, read-only, returns the runtime and period
+  actually allocated to the group, excluding that of its children.
+
+  .. tip::
+    For more information on working with control groups, you should read
+    *Documentation/admin-guide/cgroup-v2.rst*.
+  ..
+
+By default the root cgroup has the same period as
+``/proc/sys/kernel/sched_rt_period_us``, which is 1s, and a runtime of zero, so
+that rt-cgroup is *soft-disabled* by default, and all the runtime is available
+for SCHED_DEADLINE tasks only. New groups instead get a period of zero and
+runtime of 'max' (essentially delegating their tasks' scheduling to the nearest
+configured ancestor).
+
+3. Theoretical Background
+=========================
+
+
+ .. BIG FAT WARNING ******************************************************
+
+ .. warning::
+
+   This section contains a (not-thorough) summary of deadline/hierarchical
+   scheduling theory, and how it applies to real-time control groups.
+   The reader can "safely" skip to Section 4 if only interested in seeing
+   how the scheduling policy can be used. Anyway, we strongly recommend
+   to come back here and continue reading (once the urge for testing is
+   satisfied :P) to be sure of fully understanding all technical details.
+
+ .. ************************************************************************
+
+The real-time cgroup scheduler is based upon the **Hierarchical Constant
+Bandwidth Server** (HCBS) [1] *Compositional Scheduling Framework* (CSF). A
+**CSF** is a framework where global (system-level) timing properties can be
+established by composing independently (specified and) analyzed local
+(component-level) timing properties [5].
+
+For HCBS (related to the Linux kernel), the compositional framework consists of
+two parts:
+
+* The *scheduling components*, which are the basic units of the scheduling. In
+  the kernel these are the single cgroups along with the tasks that must be run
+  inside.
+
+* The *scheduling resources*, which are the CPUs of the machine.
+
+HCBS is a *hierarchical scheduling framework*, where the scheduling components
+form a hierarchy and resources are allocated from parent components to their
+child components in the hierarchy.
+
+This chapter is organized as follows: **Section 3.1** gives basic real-time
+theory definitions that are used throughout the whole section. **Section 3.2**
+talks about the HCBS framework, giving a general idea on how this is structured.
+**Section 3.3** introduces the MPR model, one of the many models which may be
+used for the analysis of the scheduling components and the computation of the
+minimum required scheduling resources for a given component. **Section 3.4**
+shows the schedulability test for MPR on the HCBS framework. **Section 3.5**
+shows how to convert an MPR interface to an HCBS compatible resource reservation
+for a component. Finally, **Section 3.6** lists other interesting models which
+could be used for the component analysis in HCBS.
+
+3.1 Basic Definitions
+---------------------
+*We borrow the same definitions given in the* ``sched_deadline`` *document, which
+are very briefly summarized here, and new ones, needed by the following content,
+are added.*
+
+A typical real-time task is composed of a repetition of computation phases (task
+instances, or jobs) which are activated in a periodic or sporadic fashion. For
+our purposes, real-time tasks are characterized by three parameters:
+
+* Worst Case Execution Time (WCET, C): the maximum execution time among all jobs.
+* Relative Deadline (D): the maximum time within which each job must be
+  completed, relative to the release time of the job.
+* Inter-Arrival Period (P): the exact/minimum (for periodic/sporadic tasks) time
+  between consecutive jobs.
+
+3.2 Hierarchical Constant Bandwidth Server (HCBS) [1]
+-----------------------------------------------------
+
+As mentioned, HCBS is a *hierarchical scheduling framework*:
+
+* The framework hierarchy follows the same hierarchy of cgroups. Cgroups may
+  have two roles, either bandwidth reservation for child cgroups, or they may
+  be *live*, i.e. run tasks (but not both). The root cgroup, for the kernel's
+  implementation of HCBS, acts only as bandwidth reservation (but as written in
+  this document it also has other uses outside of the hierarchical
+  framework).
+* The cgroup tree is internally flattened, for ease of scheduling, to a
+  two-level hierarchy, since only the *live* groups are of interest and all the
+  necessary information for their scheduling lies in their interface (there is
+  no need for the reservation components).
+* The hierarchical framework, now on two levels, consists then of a first level
+  of cgroups, and a second level of tasks that are run inside these groups.
+* The scheduling of components is performed using global Earliest Deadline First
+  (gEDF), SCHED_DEADLINE in the kernel, following the bandwidth reservation of
+  each group.
+* Whenever a component is scheduled, a local scheduler picks which of the tasks
+  of the cgroup to run. The scheduling policy is global Fixed Priority (gFP),
+  SCHED_FIFO/SCHED_RR in the kernel.
 
-2.2 Default behaviour
----------------------
+3.3 Multiprocessor Periodic Resource (MPR) model
+------------------------------------------------
+
+A Multiprocessor Periodic Resource (MPR) model [2] **u = <Pi, Theta, m'>**
+specifies that an identical, unit-capacity multiprocessor platform collectively
+provides **Theta** units of resource every **Pi** time units, where the
+**Theta** time units are supplied with concurrency at most **m'**.
+
+This theoretical model is one of the many models that can abstract the
+interface of our real-time cgroups: let **m'** be the number of CPUs of the
+machine, let **Theta** be **m' * /cpu.rt_runtime_us** and **Pi** be
+**/cpu.rt_period_us**.
+
+Let's introduce the concept of Supply Bound Function (SBF). An SBF is a function
+which outputs a lower bound for the processor supply provided in a given time
+interval, given a resource supply model. For a completely dedicated CPU, the SBF
+function is simply the identity function, as it will always provide **t** units
+of computation for an interval of length **t**. The situation gets slightly more
+complicated for the MPR model or any of the other models listed in section 3.6.
+
+The **SBF(t)** for an MPR model **u = <Pi, Theta, m'>** is::
+
+             | 0                                        if t' < 0
+             |
+  SBF_u(t) = | floor(t' / Pi) * Theta
+             |   + max(0, m' * x - (m' * Pi - Theta))   if t' >= 0 and 1 <= x <= y
+             |
+             | floor(t' / Pi) * Theta
+             |   + max(0, m' * x - (m' * Pi - Theta))   else
+             |   - (m' - beta)
+
+where::
+
+  alpha = floor(Theta / m')
+  beta  = Theta - m' * alpha
+  t'    = t - (Pi - ceil(Theta / m'))
+  x     = t' - (Pi * floor(t' / Pi))
+  y     = Pi - floor(Theta / m')
+
+Briefly, this function models that the server's bandwidth is given as late as
+possible, thus describing the worst case possible for the supplied bandwidth.
+
+3.4 Schedulability for MPR on global Fixed-Priority
+---------------------------------------------------
+
+Let's introduce the concept of Demand Bound Function (DBF). A DBF is a function
+that, given a taskset, a scheduling algorithm and an interval of time, outputs
+the worst resource demand for that interval of time.
+
+It is easy to see that, given a DBF and an SBF, we can deem a component/taskset
+schedulable if, for every time interval t >= 0, it is possible to demonstrate
+that:
+
+  DBF(t) <= SBF(t)
+
+We have the Supply Bound Function for our given MPR model, so we are missing the
+Demand Bound Function for a given taskset that is being scheduled using global
+Fixed Priority.
+
+3.4.1 Schedulability Analysis for global Fixed Priority
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Bertogna, Cirinei and Lipari [6] have derived a schedulability test for global
+Fixed Priority (gFP) on multi-processor platforms. In this test (called
+*BCL_gFP* test) we can consider all the CPUs to be dedicated to the scheduling.
+
+  A taskset **Tau** is schedulable with gFP on a multiprocessor platform
+  composed of **m'** identical processors if for each task **tau_k in Tau**:
+
+    Sum(for i < k)( min(W_i(D_k), D_k - C_k + 1) ) < m' * (D_k - C_k + 1)
+
+  where **W_i(t)** is the workload of task **tau_i** over a time interval **t**:
+
+    W_i(t) = N_i(t) * C_i + min(C_i, t + D_i - C_i - N_i(t) * P_i)
 
-The default values for sched_rt_period_us (1000000 or 1s) and
-sched_rt_runtime_us (950000 or 0.95s). This gives 0.05s to be used by
-SCHED_OTHER (non-RT tasks). These defaults were chosen so that a run-away
-real-time tasks will not lock up the machine but leave a little time to recover
-it. By setting runtime to -1 you'd get the old behaviour back.
+  and **N_i(t)** is the number of activations of task **tau_i** that complete in
+  a time interval **t**:
 
-By default all bandwidth is assigned to the root group and new groups get the
-period from /proc/sys/kernel/sched_rt_period_us and a run time of 0. If you
-want to assign bandwidth to another group, reduce the root group's bandwidth
-and assign some or all of the difference to another group.
+    N_i(t) = floor( (t + D_i - C_i) / P_i )
 
-Real-time group scheduling means you have to assign a portion of total CPU
-bandwidth to the group before it will accept real-time tasks. Therefore you will
-not be able to run real-time tasks as any user other than root until you have
-done that, even if the user has the rights to run processes with real-time
-priority!
+  while the **min** term is the contribution of the carried-out job in the
+  interval **t**, i.e. the job that does not completely fit in the interval
+  **t**, but starts inside the interval after all the jobs that complete.
+
+3.4.2 From BCL_gFP to the Demand Bound Function
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We can then derive the DBF from this test:
+
+  DBF_gFP(tau_k) = Sum(for i < k)( min(W_i(D_k), D_k - C_k + 1) ) + m' * (C_k - 1)
+
+Briefly, the first sum component, the same as in the BCL_gFP test, describes the
+maximum interference that higher priority tasks cause to the analysed task. The
+workload is upper-bounded by ``(D_k - C_k + 1)`` because we are only interested
+in the interference in the slack time, while for the ``C_k`` time we are
+requiring that all the CPUs are fully available, as the single job needs ``C_k``
+(non-overlapping) time units to run.
+
+The demand bound function from Bertogna et al. is only defined on a single time
+(i.e. the deadline of the task in analysis) instead of all possible times as
+this is the minimum argument to demonstrate schedulability on global Fixed
+Priority.
+
+3.4.3 Putting it all together
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A component **C**, on **m'** processors, running a taskset **Tau = { tau_1 =
+(C_1, D_1, P_1), ..., tau_n = (C_n, D_n, P_n) }** of **n** sporadic tasks, is
+schedulable under gFP using an MPR model **u = <Pi, Theta, m'>**, if for all
+tasks **tau_k in Tau**:
+
+  DBF_gFP(tau_k) <= SBF_u(D_k)
+
+3.5 From MPR to deadline servers
+--------------------------------
+
+Since there exists no algorithm to schedule MPR interfaces, a technique was
+developed to transform MPR interfaces into periodic tasks, so that a
+number of periodic servers which respect the tasks' requirements can be used for
+the scheduling of the MPR interface and associated tasks.
+
+Let **u = <Pi, Theta, m'>** be an MPR interface, let **a = Theta - m' * floor(Theta
+/ m')**, let **k = floor(a)**. Define a transformation from **u** to a periodic
+taskset **Tau_u = { tau_1 = (C_1, D_1, P_1), ..., tau_m' = (C_m', D_m', P_m')
+}**, where:
+
+  **tau_1 = ... = tau_k = (floor(Theta / m') + 1, Pi, Pi)**
+
+  **tau_k+1 = (floor(Theta / m') + a - k * floor(a/k), Pi, Pi)**
+
+  **tau_k+2 = ... = tau_m' = (floor(Theta / m'), Pi, Pi)**
+
+This periodic taskset of servers **Tau_u** can be scheduled on any number of
+processors with concurrency at most **m'**.
+
+For real-time control groups, it is possible to just consider a slightly more
+demanding taskset **Tau_u'**, where each task **tau_i** is defined as follows:
+
+  **tau_i = (ceil(Theta / m'), Pi, Pi)**
+
+3.6 Other models
+----------------
+
+There exist many other theoretical models in literature which are used to
+describe a hierarchical scheduling framework on multi-core architectures.
+Notable examples are the Multi Supply Function (MSF) abstraction [3], the
+Parallel Supply Function (PSF) abstraction [4] and the Bounded Delay
+Multipartition (BDM) [7].
+
+3.7 References
+--------------
+  1 - L. Abeni, A. Balsini, and T. Cucinotta, “Container-based real-time
+      scheduling in the Linux kernel,” SIGBED Rev., vol. 16, no. 3, pp. 33-38,
+      Nov. 2019, doi: 10.1145/3373400.3373405.
+  2 - A. Easwaran, I. Shin, and I. Lee, “Optimal virtual cluster-based
+      multiprocessor scheduling,” Real-Time Syst, vol. 43, no. 1, pp. 25-59,
+      Sept. 2009, doi: 10.1007/s11241-009-9073-x.
+  3 - E. Bini, G. Buttazzo, and M. Bertogna, “The Multi Supply Function
+      Abstraction for Multiprocessors,” in 2009 15th IEEE International
+      Conference on Embedded and Real-Time Computing Systems and Applications,
+      Aug. 2009, pp. 294-302. doi: 10.1109/RTCSA.2009.39.
+  4 - E. Bini, M. Bertogna, and S. K. Baruah, “The Parallel Supply Function
+      Abstraction for a Virtual Multiprocessor,” in Scheduling, S. Albers, S. K.
+      Baruah, R. H. Möhring, and K. Pruhs, Eds., in Dagstuhl Seminar Proceedings
+      (DagSemProc), vol. 10071. Dagstuhl, Germany: Schloss Dagstuhl -
+      Leibniz-Zentrum für Informatik, 2010, pp. 1-14. doi:
+      10.4230/DagSemProc.10071.14.
+  5 - I. Shin and I. Lee, “Compositional real-time scheduling framework,” in
+      25th IEEE International Real-Time Systems Symposium, Dec. 2004, pp. 57-67.
+      doi: 10.1109/REAL.2004.15.
+  6 - M. Bertogna, M. Cirinei, and G. Lipari, “Schedulability Analysis of Global
+      Scheduling Algorithms on Multiprocessor Platforms,” IEEE Transactions on
+      Parallel and Distributed Systems, vol. 20, no. 4, pp. 553-566, Apr. 2009,
+      doi: 10.1109/TPDS.2008.129.
+  7 - G. Lipari and E. Bini, “A Framework for Hierarchical Scheduling on
+      Multiprocessors: From Application Requirements to Run-Time Allocation,” in
+      2010 31st IEEE Real-Time Systems Symposium, Nov. 2010, pp. 249-258. doi:
+      10.1109/RTSS.2010.12.
+
+
+4. Using Real-Time cgroups
+==========================
+
+4.1 CGroup Setup
+----------------
+The following is a brief guide to the use of Real-Time Control Groups.
 
-2.3 Basis for grouping tasks
-----------------------------
+Of course, real-time control groups require mounting of the cgroup file system.
+We have decided to only support cgroups v2, so make sure you mount the v2
+controller for the cgroup hierarchy.
 
-Enabling CONFIG_RT_GROUP_SCHED lets you explicitly allocate real
-CPU bandwidth to task groups.
+Additionally the real-time cgroups require the CPU controller for the cgroups to
+be enabled::
 
-This uses the cgroup virtual file system and "/cpu.rt_runtime_us"
-to control the CPU time reserved for each control group.
+  # Assume the cgroup file system is mounted at /sys/fs/cgroup
+  > echo "+cpu" > /sys/fs/cgroup/cgroup.subtree_control
 
-For more information on working with control groups, you should read
-Documentation/admin-guide/cgroup-v1/cgroups.rst as well.
+The CPU controller can only be enabled if there is no SCHED_FIFO/SCHED_RR task
+scheduled in any cgroup other than the root control group.
 
-Group settings are checked against the following limits in order to keep the
-configuration schedulable:
+The root control group has no bandwidth allocated by default, so make sure to
+allocate some bandwidth so that it can be used by the other cgroups. More on
+that in the following section...
 
-   \Sum_{i} runtime_{i} / global_period <= global_runtime / global_period
+4.2 Bandwidth Allocation for groups
+-----------------------------------
 
-For now, this can be simplified to just the following (but see Future plans):
+Allocating bandwidth to a cgroup is a fundamental step to run real-time
+workload. The cgroup filesystem exposes two files:
 
-   \Sum_{i} runtime_{i} <= global_runtime
+* ``/cpu.rt.max``: which specifies the cgroups' runtime and period in
+  microseconds.
+* ``/cpu.rt.internal``: read-only, get the cgroups' actual runtime and
+  period in microseconds, without its children's bandwidth.
 
+By definition, the specified runtime must always be less than or equal to the
+period. Additionally, an admission test checks if the bandwidth invariant is
+respected (i.e. sum of children's bandwidth <= parent's bandwidth).
 
-3. Future plans
-===============
+The root control group files instead control and reserve the SCHED_DEADLINE
+bandwidth allocated to real-time cgroups, since real-time groups compete and
+share the same bandwidth allocated to SCHED_DEADLINE tasks.
 
-There is work in progress to make the scheduling period for each group
-("/cpu.rt_period_us") configurable as well.
+4.3 Running real-time tasks in groups
+-------------------------------------
 
-The constraint on the period is that a subgroup must have a smaller or
-equal period to its parent. But realistically its not very useful _yet_
-as its prone to starvation without deadline scheduling.
+To run tasks in real-time groups it is just necessary to change a task's
+scheduling policy to SCHED_FIFO/SCHED_RR and migrate it into the group. If the
+group is not allowed to run real-time tasks because of incorrect configuration,
+either migrating a SCHED_FIFO/SCHED_RR task into the group or changing the
+scheduling policy of a task already inside the group will fail::
 
-Consider two sibling groups A and B; both have 50% bandwidth, but A's
-period is twice the length of B's.
+  # assume there is a task of PID 42 running
+  # change its scheduling policy to SCHED_FIFO, priority 99
+  > chrt -f -p 99 42
 
-* group A: period=100000us, runtime=50000us
+  # migrate the task to a cgroup
+  > echo 42 > /sys/fs/cgroup//cgroup.procs
 
-	- this runs for 0.05s once every 0.1s
+4.4 Special case: the root control group
+----------------------------------------
 
-* group B: period= 50000us, runtime=25000us
+The root cgroup is special, compared to the other cgroups, as its tasks are not
+managed by the HCBS algorithm, rather they just use the original
+SCHED_FIFO/SCHED_RR policies (as if CONFIG_RT_GROUP_SCHED was disabled). As
+mentioned, its bandwidth files are just used to control how much of the
+SCHED_DEADLINE bandwidth is allocated to cgroups.
 
-	- this runs for 0.025s twice every 0.1s (or once every 0.05 sec).
+Any non-root cgroup configured as 'max' that has the root cgroup as its nearest
+non-max ancestor will run its tasks in the root runqueue.
 
-This means that currently a while (1) loop in A will run for the full period of
-B and can starve B's tasks (assuming they are of lower priority) for a whole
-period.
+4.5 Guarantees and Special Behaviours
+-------------------------------------
 
-The next project will be SCHED_EDF (Earliest Deadline First scheduling) to bring
-full deadline scheduling to the linux kernel. Deadline scheduling the above
-groups and treating end of the period as a deadline will ensure that they both
-get their allocated time.
+Real-time cgroups are run at the same priority level as SCHED_DEADLINE tasks.
+Since this is the highest priority scheduling policy, and since the Constant
+Bandwidth Server (CBS) enforces that the specified bandwidth requirements for
+both groups and tasks cannot be overrun, real-time groups have the same
+guarantees that SCHED_DEADLINE tasks have, i.e. they will necessarily be
+supplied with the amount of bandwidth requested (whenever the admission tests
+pass).
 
-Implementing SCHED_EDF might take a while to complete. Priority Inheritance is
-the biggest challenge as the current linux PI infrastructure is geared towards
-the limited static priority levels 0-99. With deadline scheduling you need to
-do deadline inheritance (since priority is inversely proportional to the
-deadline delta (deadline - now)).
+This means that, since SCHED_FIFO/SCHED_RR tasks (scheduled in the root control
+group) are not subject to bandwidth controls, they are run at a lower priority
+than the cgroups' counterparts. Nonetheless, a minimum amount of bandwidth, if
+reserved, will always be available to run SCHED_FIFO/SCHED_RR workloads in the
+root cgroup, while they will be able to use more runtime if any of the
+SCHED_DEADLINE tasks or servers use less than their specified amount of
+bandwidth. SCHED_OTHER tasks are instead scheduled as normal, at lower priority
+than real-time workloads.
 
-This means the whole PI machinery will have to be reworked - and that is one of
-the most complex pieces of code we have.
+The aforementioned behaviour differs from the preceding RT_GROUP_SCHED
+implementation, but this is necessary to give actual guarantees on the amount of
+bandwidth given to rt-cgroups.
\ No newline at end of file
-- 
2.54.0
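
Reader's note, not part of the patch: the schedulability condition of Section
3.4.3 (DBF_gFP(tau_k) <= SBF_u(D_k) for every task) can be tried out
numerically. The Python sketch below transcribes SBF_u(t) from Section 3.3,
W_i(t), N_i(t) and DBF_gFP from Sections 3.4.1 and 3.4.2, and the simplified
per-CPU server of Section 3.5, as they are written in the document. Several
things are assumptions of this sketch and not of the patch: time is counted in
integer units, tasks are listed from highest to lowest priority (so "i < k"
means higher priority), the SBF result is clamped at zero because the
three-case expression can evaluate below zero for small t', and every
function and type name is made up for illustration. It is not the kernel's
admission test.

```python
from typing import NamedTuple, Sequence


class Task(NamedTuple):
    """Sporadic task (C, D, P); the field names are illustrative only."""
    wcet: int      # C: worst case execution time
    deadline: int  # D: relative deadline
    period: int    # P: minimum inter-arrival time


def sbf(t: int, pi: int, theta: int, m: int) -> int:
    """SBF_u(t) for the MPR model u = <Pi, Theta, m'> (Section 3.3)."""
    alpha = theta // m                   # floor(Theta / m')
    beta = theta - m * alpha
    ceil_share = -(-theta // m)          # ceil(Theta / m') on integers
    t1 = t - (pi - ceil_share)           # t'
    if t1 < 0:
        return 0
    periods = t1 // pi                   # floor(t' / Pi)
    x = t1 - pi * periods
    y = pi - alpha
    supply = periods * theta + max(0, m * x - (m * pi - theta))
    if not 1 <= x <= y:
        supply -= m - beta               # third case of the definition
    # Clamp at zero: an assumption of this sketch, see the note above.
    return max(0, supply)


def workload(task: Task, t: int) -> int:
    """W_i(t): workload of a task over an interval of length t."""
    n = (t + task.deadline - task.wcet) // task.period       # N_i(t)
    carry = t + task.deadline - task.wcet - n * task.period
    return n * task.wcet + min(task.wcet, carry)


def dbf_gfp(tasks: Sequence[Task], k: int, m: int) -> int:
    """DBF_gFP(tau_k); tasks[0] is assumed to have the highest priority."""
    tk = tasks[k]
    slack = tk.deadline - tk.wcet + 1
    interference = sum(min(workload(ti, tk.deadline), slack)
                       for ti in tasks[:k])
    return interference + m * (tk.wcet - 1)


def schedulable(tasks: Sequence[Task], pi: int, theta: int, m: int) -> bool:
    """Check DBF_gFP(tau_k) <= SBF_u(D_k) for all tasks (Section 3.4.3)."""
    return all(dbf_gfp(tasks, k, m) <= sbf(tasks[k].deadline, pi, theta, m)
               for k in range(len(tasks)))


def server_params(pi: int, theta: int, m: int) -> tuple:
    """(runtime, period) of each server in the simplified Tau_u' (Section 3.5)."""
    return (-(-theta // m), pi)
```

For instance, with the two tasks (C=1, D=4, P=4) and (C=2, D=6, P=6) on m' = 2,
the interface u = <2, 3, 2> passes the check, while u = <5, 3, 2> does not:
the demand of the second task is 5, but SBF_u(6) is 0 for the latter
interface because the supply may arrive as late as possible in the period.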