From: Vincent Guittot
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, kprateek.nayak@amd.com, linux-kernel@vger.kernel.org, shubhang@os.amperecomputing.com
Cc: Vincent Guittot
Subject: [PATCH v2] sched: fair: Prevent negative lag increase during delayed dequeue
Date: Tue, 31 Mar 2026 18:23:52 +0200
Message-ID: <20260331162352.551501-1-vincent.guittot@linaro.org>

The delayed dequeue feature aims to reduce the negative lag of a dequeued
task while it sleeps, but newly enqueued tasks can move the avg_vruntime
backward and increase that negative lag. When the delayed-dequeue task
wakes up, it then has more negative lag than if it had been dequeued
immediately, or than other tasks that were dequeued just before these new
enqueues.

Ensure that the negative lag of a delayed-dequeue task does not increase
during its delayed-dequeue phase while it waits for its negative lag to
disappear. Similarly, remove any positive lag that the delayed-dequeue
task could have gained during this period. Short-slice tasks are
particularly impacted on overloaded systems.
Test on snapdragon rb5:
  hackbench -T -p -l 16000000 -g 2 1> /dev/null &
  cyclictest -t 1 -i 2777 -D 333 --policy=fair --mlock -h 20000 -q

The scheduling latency of cyclictest is:

                       | tip/sched/core  tip/sched/core  +this patch
cyclictest slice (ms)  | (default) 2.8   8               8
hackbench slice (ms)   | (default) 2.8   20              20
Total Samples          | 115632          119733          119806
Average (us)           | 364             64 (-82%)       61 (- 5%)
Median (P50) (us)      | 60              56 (- 7%)       56 (  0%)
90th Percentile (us)   | 1166            62 (-95%)       62 (  0%)
99th Percentile (us)   | 4192            73 (-98%)       72 (- 1%)
99.9th Percentile (us) | 8528            2707 (-68%)     1300 (-52%)
Maximum (us)           | 17735           14273 (-20%)    13525 (- 5%)

Signed-off-by: Vincent Guittot
---
Since v1:
 - Embedded the check of lag evolution of delayed dequeue entities in
   update_entity_lag() to include all cases.

 kernel/sched/fair.c | 53 ++++++++++++++++++++++++++-------------------
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 226509231e67..c1ffe86bf78d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -840,11 +840,30 @@ static s64 entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se, u64 avrunt
 	return clamp(vlag, -limit, limit);
 }
 
-static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
+/*
+ * Delayed dequeue aims to reduce the negative lag of a dequeued task.
+ * While updating the lag of an entity, check that negative lag didn't increase
+ * during the delayed dequeue period which would be unfair.
+ * Similarly, check that the entity didn't gain positive lag when DELAY_ZERO is
+ * set.
+ *
+ * Return true if the lag has been adjusted.
+ */
+static bool update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	s64 vlag;
+
 	WARN_ON_ONCE(!se->on_rq);
 
-	se->vlag = entity_lag(cfs_rq, se, avg_vruntime(cfs_rq));
+	vlag = entity_lag(cfs_rq, se, avg_vruntime(cfs_rq));
+
+	if (se->sched_delayed)
+		/* previous vlag < 0 otherwise se would not be delayed */
+		se->vlag = clamp(vlag, se->vlag, sched_feat(DELAY_ZERO) ? 0 : S64_MAX);
+	else
+		se->vlag = vlag;
+
+	return (vlag != se->vlag);
 }
 
 /*
@@ -5563,13 +5582,6 @@ static void clear_delayed(struct sched_entity *se)
 	}
 }
 
-static inline void finish_delayed_dequeue_entity(struct sched_entity *se)
-{
-	clear_delayed(se);
-	if (sched_feat(DELAY_ZERO) && se->vlag > 0)
-		se->vlag = 0;
-}
-
 static bool
 dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
@@ -5595,6 +5607,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		if (sched_feat(DELAY_DEQUEUE) && delay &&
 		    !entity_eligible(cfs_rq, se)) {
 			update_load_avg(cfs_rq, se, 0);
+			update_entity_lag(cfs_rq, se);
 			set_delayed(se);
 			return false;
 		}
@@ -5634,7 +5647,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	update_cfs_group(se);
 
 	if (flags & DEQUEUE_DELAYED)
-		finish_delayed_dequeue_entity(se);
+		clear_delayed(se);
 
 	if (cfs_rq->nr_queued == 0) {
 		update_idle_cfs_rq_clock_pelt(cfs_rq);
@@ -7088,18 +7101,14 @@ requeue_delayed_entity(struct sched_entity *se)
 	WARN_ON_ONCE(!se->sched_delayed);
 	WARN_ON_ONCE(!se->on_rq);
 
-	if (sched_feat(DELAY_ZERO)) {
-		update_entity_lag(cfs_rq, se);
-		if (se->vlag > 0) {
-			cfs_rq->nr_queued--;
-			if (se != cfs_rq->curr)
-				__dequeue_entity(cfs_rq, se);
-			se->vlag = 0;
-			place_entity(cfs_rq, se, 0);
-			if (se != cfs_rq->curr)
-				__enqueue_entity(cfs_rq, se);
-			cfs_rq->nr_queued++;
-		}
+	if (update_entity_lag(cfs_rq, se)) {
+		cfs_rq->nr_queued--;
+		if (se != cfs_rq->curr)
+			__dequeue_entity(cfs_rq, se);
+		place_entity(cfs_rq, se, 0);
+		if (se != cfs_rq->curr)
+			__enqueue_entity(cfs_rq, se);
+		cfs_rq->nr_queued++;
 	}
 
 	update_load_avg(cfs_rq, se, 0);
-- 
2.43.0