From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080624.438854780@infradead.org>
Date: Thu, 19 Feb 2026 08:58:41 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com
Subject: [PATCH v2 1/7] sched/fair: Fix zero_vruntime tracking
References: <20260219075840.162631716@infradead.org>

It turns out that zero_vruntime tracking is broken when there is but a
single task running. Current update paths are through
__{en,de}queue_entity(), and when there is but a single task,
pick_next_task() will always return that one task, and
put_prev_set_next_task() will end up in neither function.

This can cause entity_key() to grow indefinitely large and cause overflows,
leading to much pain and suffering.
Furthermore, doing update_zero_vruntime() from __{de,en}queue_entity(),
which are called from {set_next,put_prev}_entity(), has problems because:

 - set_next_entity() calls __dequeue_entity() before it does
   cfs_rq->curr = se. This means avg_vruntime() will see the removal but
   not current, missing the entity for accounting.

 - put_prev_entity() calls __enqueue_entity() before it does
   cfs_rq->curr = NULL. This means avg_vruntime() will see the addition
   *and* current, leading to double accounting.

Both cases are incorrect/inconsistent.

Noting that avg_vruntime() is already called on each {en,de}queue, remove
the explicit update_zero_vruntime() calls (which removes an extra 64-bit
division for each {en,de}queue) and have avg_vruntime() update
zero_vruntime itself.

Additionally, have the tick call avg_vruntime() -- discarding the result,
but for the side effect of updating zero_vruntime.

While there, optimize avg_vruntime() by noting that the average of one
value is rather trivial to compute.
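To illustrate the bookkeeping described above, here is a user-space toy model (assumed names and simplified types, not the kernel code): keys are kept relative to zero_vruntime, and the avg_vruntime() analogue both computes the weighted average and folds the delta back into zero_vruntime, so keys stay small even when the queue composition never changes.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the cfs_rq fields involved (names are illustrative). */
struct toy_cfs_rq {
	int64_t zero_vruntime;  /* approximate zero-lag point */
	int64_t sum_w_vruntime; /* sum of weight * key over queued entities */
	int64_t sum_weight;     /* sum of weights */
};

/* entity_key(): vruntime relative to zero_vruntime, kept small */
static int64_t toy_key(struct toy_cfs_rq *rq, int64_t vruntime)
{
	return vruntime - rq->zero_vruntime;
}

static void toy_enqueue(struct toy_cfs_rq *rq, int64_t vruntime, int64_t w)
{
	rq->sum_w_vruntime += toy_key(rq, vruntime) * w;
	rq->sum_weight += w;
}

/*
 * avg_vruntime() analogue: compute the weighted average with a left bias
 * and pull zero_vruntime along, per  v' = v + d ==> sum' = sum - d*weight.
 */
static int64_t toy_avg_vruntime(struct toy_cfs_rq *rq)
{
	int64_t delta = 0;

	if (rq->sum_weight) {
		int64_t sum = rq->sum_w_vruntime;

		if (sum < 0)			/* sign flips the floor */
			sum -= rq->sum_weight - 1;
		delta = sum / rq->sum_weight;
	}

	rq->sum_w_vruntime -= delta * rq->sum_weight;
	rq->zero_vruntime += delta;
	return rq->zero_vruntime;
}
```

With entities at vruntime 100 (weight 1) and 200 (weight 3), the weighted average is (100 + 600) / 4 = 175; after the call, zero_vruntime has moved to 175 and the weighted sum is re-centered at 0, so repeated calls with an unchanged queue no longer let the keys drift.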
Test case:

  # taskset -c -p 1 $$
  # taskset -c 2 bash -c 'while :; do :; done&'
  # cat /sys/kernel/debug/sched/debug | awk '/^cpu#/ {P=0} /^cpu#2,/ {P=1} {if (P) print $0}' | grep -e zero_vruntime -e "^>"

PRE:

  .zero_vruntime                 : 31316.407903
  >R bash    487  50787.345112 E  50789.145972  2.800000  50780.298364  16  120  0.000000  0.000000  0.000000 /
  .zero_vruntime                 : 382548.253179
  >R bash    487 427275.204288 E 427276.003584  2.800000 427268.157540  23  120  0.000000  0.000000  0.000000 /

POST:

  .zero_vruntime                 : 17259.709467
  >R bash    526  17259.709467 E  17262.509467  2.800000  16915.031624   9  120  0.000000  0.000000  0.000000 /
  .zero_vruntime                 : 18702.723356
  >R bash    526  18702.723356 E  18705.523356  2.800000  18358.045513   9  120  0.000000  0.000000  0.000000 /

Fixes: 79f3f9bedd14 ("sched/eevdf: Fix min_vruntime vs avg_vruntime")
Reported-by: K Prateek Nayak
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
Reviewed-by: Vincent Guittot
Tested-by: John Stultz
---
 kernel/sched/fair.c |   84 +++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 57 insertions(+), 27 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -589,6 +589,21 @@ static inline bool entity_before(const s
 	return vruntime_cmp(a->deadline, "<", b->deadline);
 }
 
+/*
+ * Per avg_vruntime() below, cfs_rq::zero_vruntime is only slightly stale
+ * and this value should be no more than two lag bounds. Which puts it in the
+ * general order of:
+ *
+ *   (slice + TICK_NSEC) << NICE_0_LOAD_SHIFT
+ *
+ * which is around 44 bits in size (on 64bit); that is 20 for
+ * NICE_0_LOAD_SHIFT, another 20 for NSEC_PER_MSEC and then a handful for
+ * however many msec the actual slice+tick ends up being.
+ *
+ * (disregarding the actual divide-by-weight part makes for the worst case
+ * weight of 2, which nicely cancels vs the fuzz in zero_vruntime not actually
+ * being the zero-lag point).
+ */
 static inline s64 entity_key(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	return vruntime_op(se->vruntime, "-", cfs_rq->zero_vruntime);
@@ -676,39 +691,61 @@ sum_w_vruntime_sub(struct cfs_rq *cfs_rq
 }
 
 static inline
-void sum_w_vruntime_update(struct cfs_rq *cfs_rq, s64 delta)
+void update_zero_vruntime(struct cfs_rq *cfs_rq, s64 delta)
 {
 	/*
-	 * v' = v + d ==> sum_w_vruntime' = sum_runtime - d*sum_weight
+	 * v' = v + d ==> sum_w_vruntime' = sum_w_vruntime - d*sum_weight
 	 */
 	cfs_rq->sum_w_vruntime -= cfs_rq->sum_weight * delta;
+	cfs_rq->zero_vruntime += delta;
 }
 
 /*
- * Specifically: avg_runtime() + 0 must result in entity_eligible() := true
+ * Specifically: avg_vruntime() + 0 must result in entity_eligible() := true
  * For this to be so, the result of this function must have a left bias.
+ *
+ * Called in:
+ *  - place_entity() -- before enqueue
+ *  - update_entity_lag() -- before dequeue
+ *  - entity_tick()
+ *
+ * This means it is one entry 'behind' but that puts it close enough to where
+ * the bound on entity_key() is at most two lag bounds.
  */
 u64 avg_vruntime(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *curr = cfs_rq->curr;
-	s64 avg = cfs_rq->sum_w_vruntime;
-	long load = cfs_rq->sum_weight;
+	long weight = cfs_rq->sum_weight;
+	s64 delta = 0;
 
-	if (curr && curr->on_rq) {
-		unsigned long weight = scale_load_down(curr->load.weight);
+	if (curr && !curr->on_rq)
+		curr = NULL;
 
-		avg += entity_key(cfs_rq, curr) * weight;
-		load += weight;
-	}
+	if (weight) {
+		s64 runtime = cfs_rq->sum_w_vruntime;
+
+		if (curr) {
+			unsigned long w = scale_load_down(curr->load.weight);
+
+			runtime += entity_key(cfs_rq, curr) * w;
+			weight += w;
+		}
 
-	if (load) {
 		/* sign flips effective floor / ceiling */
-		if (avg < 0)
-			avg -= (load - 1);
-		avg = div_s64(avg, load);
+		if (runtime < 0)
+			runtime -= (weight - 1);
+
+		delta = div_s64(runtime, weight);
+	} else if (curr) {
+		/*
+		 * When there is but one element, it is the average.
+		 */
+		delta = curr->vruntime - cfs_rq->zero_vruntime;
 	}
 
-	return cfs_rq->zero_vruntime + avg;
+	update_zero_vruntime(cfs_rq, delta);
+
+	return cfs_rq->zero_vruntime;
 }
 
 /*
@@ -777,16 +814,6 @@ int entity_eligible(struct cfs_r
 	return vruntime_eligible(cfs_rq, se->vruntime);
 }
 
-static void update_zero_vruntime(struct cfs_rq *cfs_rq)
-{
-	u64 vruntime = avg_vruntime(cfs_rq);
-	s64 delta = vruntime_op(vruntime, "-", cfs_rq->zero_vruntime);
-
-	sum_w_vruntime_update(cfs_rq, delta);
-
-	cfs_rq->zero_vruntime = vruntime;
-}
-
 static inline u64 cfs_rq_min_slice(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *root = __pick_root_entity(cfs_rq);
@@ -856,7 +883,6 @@ RB_DECLARE_CALLBACKS(static, min_vruntim
 static void __enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	sum_w_vruntime_add(cfs_rq, se);
-	update_zero_vruntime(cfs_rq);
 	se->min_vruntime = se->vruntime;
 	se->min_slice = se->slice;
 	rb_add_augmented_cached(&se->run_node, &cfs_rq->tasks_timeline,
@@ -868,7 +894,6 @@ static void __dequeue_entity(struct cfs_
 	rb_erase_augmented_cached(&se->run_node, &cfs_rq->tasks_timeline,
 				  &min_vruntime_cb);
 	sum_w_vruntime_sub(cfs_rq, se);
-	update_zero_vruntime(cfs_rq);
 }
 
 struct sched_entity *__pick_root_entity(struct cfs_rq *cfs_rq)
@@ -5524,6 +5549,11 @@ entity_tick(struct cfs_rq *cfs_rq, struc
 	update_load_avg(cfs_rq, curr, UPDATE_TG);
 	update_cfs_group(curr);
 
+	/*
+	 * Pulls along cfs_rq::zero_vruntime.
+	 */
+	avg_vruntime(cfs_rq);
+
 #ifdef CONFIG_SCHED_HRTICK
 	/*
 	 * queued ticks are scheduled to match the slice, so don't bother

From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080624.561421378@infradead.org>
Date: Thu, 19 Feb 2026 08:58:42 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com
Subject: [PATCH v2 2/7] sched/fair: Only set slice protection at pick time
References: <20260219075840.162631716@infradead.org>
We should not (re)set slice protection in the sched_change pattern which
calls put_prev_task() / set_next_task().

Fixes: 63304558ba5d ("sched/eevdf: Curb wakeup-preemption")
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Vincent Guittot
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
---
 kernel/sched/fair.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5444,7 +5444,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 }
 
 static void
-set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
+set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, bool first)
 {
 	clear_buddies(cfs_rq, se);
 
@@ -5459,7 +5459,8 @@ set_next_entity(struct cfs_rq *cfs_rq, s
 		__dequeue_entity(cfs_rq, se);
 		update_load_avg(cfs_rq, se, UPDATE_TG);
 
-		set_protect_slice(cfs_rq, se);
+		if (first)
+			set_protect_slice(cfs_rq, se);
 	}
 
 	update_stats_curr_start(cfs_rq, se);
@@ -8977,13 +8978,13 @@ pick_next_task_fair(struct rq *rq, struc
 			pse = parent_entity(pse);
 		}
 		if (se_depth >= pse_depth) {
-			set_next_entity(cfs_rq_of(se), se);
+			set_next_entity(cfs_rq_of(se), se, true);
 			se = parent_entity(se);
 		}
 	}
 
 	put_prev_entity(cfs_rq, pse);
-	set_next_entity(cfs_rq, se);
+	set_next_entity(cfs_rq, se, true);
 
 	__set_next_task_fair(rq, p, true);
 }
@@ -13597,7 +13598,7 @@ static void set_next_task_fair(struct rq
 	for_each_sched_entity(se) {
 		struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
-		set_next_entity(cfs_rq, se);
+		set_next_entity(cfs_rq, se, first);
 		/* ensure bandwidth has been allocated on our new cfs_rq */
 		account_cfs_rq_runtime(cfs_rq, 0);
 	}

From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080624.715897842@infradead.org>
Date: Thu, 19 Feb 2026 08:58:43 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com, Zhang Qiao
Subject: [PATCH v2 3/7] sched/eevdf: Update se->vprot in reweight_entity()
References: <20260219075840.162631716@infradead.org>

From: Wang Tao

In the EEVDF framework with Run-to-Parity protection, `se->vprot` is an
independent variable defining the virtual protection timestamp.

When `reweight_entity()` is called (e.g., via nice/renice), it performs the
following actions to preserve lag consistency:

1. Scales `se->vlag` based on the new weight.
2. Calls `place_entity()`, which recalculates `se->vruntime` based on the
   new weight and scaled lag.

However, the current implementation fails to update `se->vprot`, leading to
mismatches between the task's actual runtime and its expected duration.
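As a sketch of the invariant involved (toy code with made-up names, not the kernel implementation): spans measured in virtual time scale inversely with the entity's weight, so any *relative* virtual span -- here the distance from vruntime to the protection timestamp -- must be rescaled by old_weight/new_weight on reweight, mirroring what is already done for the relative deadline.

```c
#include <stdint.h>

/* A virtual-time span scales inversely with weight: span' = span * w / w'. */
static int64_t rescale_vspan(int64_t vspan, int64_t old_w, int64_t new_w)
{
	return vspan * old_w / new_w;
}

/*
 * Toy reweight of a protection timestamp: make it relative to the old
 * vruntime, rescale the span, then re-anchor it on the new vruntime
 * (the pattern the patch applies to se->vprot).
 */
static uint64_t reweight_vprot(uint64_t vprot, uint64_t old_vruntime,
			       uint64_t new_vruntime,
			       int64_t old_w, int64_t new_w)
{
	int64_t span = (int64_t)(vprot - old_vruntime);

	span = rescale_vspan(span, old_w, new_w);
	return new_vruntime + (uint64_t)span;
}
```

For example, doubling the weight (1024 -> 2048) halves a 6 ms virtual protection span to 3 ms, keeping the wall-clock protection window unchanged.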
Fixes: 63304558ba5d ("sched/eevdf: Curb wakeup-preemption")
Suggested-by: Zhang Qiao
Signed-off-by: Wang Tao
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Vincent Guittot
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
Link: https://patch.msgid.link/20260120123113.3518950-1-wangtao554@huawei.com
---
 kernel/sched/fair.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3814,6 +3814,8 @@ static void reweight_entity(struct cfs_r
 			    unsigned long weight)
 {
 	bool curr = cfs_rq->curr == se;
+	bool rel_vprot = false;
+	u64 vprot;
 
 	if (se->on_rq) {
 		/* commit outstanding execution time */
@@ -3821,6 +3823,11 @@ static void reweight_entity(struct cfs_r
 		update_entity_lag(cfs_rq, se);
 		se->deadline -= se->vruntime;
 		se->rel_deadline = 1;
+		if (curr && protect_slice(se)) {
+			vprot = se->vprot - se->vruntime;
+			rel_vprot = true;
+		}
+
 		cfs_rq->nr_queued--;
 		if (!curr)
 			__dequeue_entity(cfs_rq, se);
@@ -3836,6 +3843,9 @@ static void reweight_entity(struct cfs_r
 	if (se->rel_deadline)
 		se->deadline = div_s64(se->deadline * se->load.weight, weight);
 
+	if (rel_vprot)
+		vprot = div_s64(vprot * se->load.weight, weight);
+
 	update_load_set(&se->load, weight);
 
 	do {
@@ -3847,6 +3857,8 @@ static void reweight_entity(struct cfs_r
 	enqueue_load_avg(cfs_rq, se);
 	if (se->on_rq) {
 		place_entity(cfs_rq, se, 0);
+		if (rel_vprot)
+			se->vprot = se->vruntime + vprot;
 		update_load_add(&cfs_rq->load, se->load.weight);
 		if (!curr)
 			__enqueue_entity(cfs_rq, se);

From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080624.830623197@infradead.org>
Date: Thu, 19 Feb 2026 08:58:44 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com
Subject: [PATCH v2 4/7] sched/fair: Fix lag clamp
References: <20260219075840.162631716@infradead.org>

Vincent reported that he was seeing undue lag clamping in a mixed slice
workload. Implement the max_slice tracking as per the todo comment.
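A toy model of the new clamp (assumed names; the real code uses cfs_rq_max_slice() and calc_delta_fair() on `struct sched_entity`): the lag limit is now derived from the largest slice present on the runqueue plus one tick, scaled into virtual time by the entity's weight, instead of twice the entity's own slice.

```c
#include <stdint.h>

#define TOY_NICE_0_LOAD 1024LL	/* stand-in for NICE_0_LOAD */

/* weight-scaled virtual-time delta, in the spirit of calc_delta_fair() */
static int64_t toy_calc_delta_fair(int64_t delta, int64_t weight)
{
	return delta * TOY_NICE_0_LOAD / weight;
}

/*
 * Clamp an entity's virtual lag to +/- the limit derived from the largest
 * slice on the queue plus a tick (the bound this patch implements).
 */
static int64_t toy_clamp_lag(int64_t vlag, int64_t max_slice_ns,
			     int64_t tick_ns, int64_t weight)
{
	int64_t limit = toy_calc_delta_fair(max_slice_ns + tick_ns, weight);

	if (vlag < -limit)
		vlag = -limit;
	if (vlag > limit)
		vlag = limit;
	return vlag;
}
```

With a 3 ms max slice and a 1 ms tick at nice-0 weight, lag is confined to +/- 4 ms of virtual time, so an entity with a small slice is no longer clamped harder than the mixed-slice workload warrants.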
Fixes: 147f3efaa241 ("sched/fair: Implement an EEVDF-like scheduling policy")
Reported-by: Vincent Guittot
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Vincent Guittot
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
Link: https://patch.msgid.link/20250422101628.GA33555@noisy.programming.kicks-ass.net
Reviewed-by: Vincent Guittot
---
 include/linux/sched.h |    1 +
 kernel/sched/fair.c   |   39 +++++++++++++++++++++++++++++++++++----
 2 files changed, 36 insertions(+), 4 deletions(-)

--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -574,6 +574,7 @@ struct sched_entity {
 	u64				deadline;
 	u64				min_vruntime;
 	u64				min_slice;
+	u64				max_slice;
 
 	struct list_head		group_node;
 	unsigned char			on_rq;
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -748,6 +748,8 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
 	return cfs_rq->zero_vruntime;
 }
 
+static inline u64 cfs_rq_max_slice(struct cfs_rq *cfs_rq);
+
 /*
  * lag_i = S - s_i = w_i * (V - v_i)
  *
@@ -761,17 +763,16 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
  * EEVDF gives the following limit for a steady state system:
  *
  *   -r_max < lag < max(r_max, q)
- *
- * XXX could add max_slice to the augmented data to track this.
  */
 static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	u64 max_slice = cfs_rq_max_slice(cfs_rq) + TICK_NSEC;
 	s64 vlag, limit;
 
 	WARN_ON_ONCE(!se->on_rq);
 
 	vlag = avg_vruntime(cfs_rq) - se->vruntime;
-	limit = calc_delta_fair(max_t(u64, 2*se->slice, TICK_NSEC), se);
+	limit = calc_delta_fair(max_slice, se);
 
 	se->vlag = clamp(vlag, -limit, limit);
 }
@@ -829,6 +830,21 @@ static inline u64 cfs_rq_min_slice(struc
 	return min_slice;
 }
 
+static inline u64 cfs_rq_max_slice(struct cfs_rq *cfs_rq)
+{
+	struct sched_entity *root = __pick_root_entity(cfs_rq);
+	struct sched_entity *curr = cfs_rq->curr;
+	u64 max_slice = 0ULL;
+
+	if (curr && curr->on_rq)
+		max_slice = curr->slice;
+
+	if (root)
+		max_slice = max(max_slice, root->max_slice);
+
+	return max_slice;
+}
+
 static inline bool __entity_less(struct rb_node *a, const struct rb_node *b)
 {
 	return entity_before(__node_2_se(a), __node_2_se(b));
@@ -853,6 +869,15 @@ static inline void __min_slice_update(st
 	}
 }
 
+static inline void __max_slice_update(struct sched_entity *se, struct rb_node *node)
+{
+	if (node) {
+		struct sched_entity *rse = __node_2_se(node);
+		if (rse->max_slice > se->max_slice)
+			se->max_slice = rse->max_slice;
+	}
+}
+
 /*
  * se->min_vruntime = min(se->vruntime, {left,right}->min_vruntime)
  */
@@ -860,6 +885,7 @@ static inline bool min_vruntime_update(s
 {
 	u64 old_min_vruntime = se->min_vruntime;
 	u64 old_min_slice = se->min_slice;
+	u64 old_max_slice = se->max_slice;
 	struct rb_node *node = &se->run_node;
 
 	se->min_vruntime = se->vruntime;
@@ -870,8 +896,13 @@ static inline bool min_vruntime_update(s
 	__min_slice_update(se, node->rb_right);
 	__min_slice_update(se, node->rb_left);
 
+	se->max_slice = se->slice;
+	__max_slice_update(se, node->rb_right);
+	__max_slice_update(se, node->rb_left);
+
 	return se->min_vruntime == old_min_vruntime &&
-	       se->min_slice == old_min_slice;
+	       se->min_slice == old_min_slice &&
+	       se->max_slice == old_max_slice;
 }
 
 RB_DECLARE_CALLBACKS(static, min_vruntime_cb, struct sched_entity,

From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080624.942813440@infradead.org>
Date: Thu, 19 Feb 2026 08:58:45 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com
Subject: [PATCH v2 5/7] sched/fair: Increase weight bits for avg_vruntime
References: <20260219075840.162631716@infradead.org>

Due to the zero_vruntime patch, the deltas are now a lot smaller and
measurements with kernel-build and hackbench runs show about 45 bits used.

This ensures avg_vruntime() tracks the full weight range, reducing
numerical artifacts in reweight and the like.
Also, let's keep the paranoid debug code around for now.

Signed-off-by: Peter Zijlstra (Intel)
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
---
 kernel/sched/debug.c    |   14 ++++++-
 kernel/sched/fair.c     |   91 ++++++++++++++++++++++++++++++++++++++----------
 kernel/sched/features.h |    2 +
 kernel/sched/sched.h    |    3 +
 4 files changed, 90 insertions(+), 20 deletions(-)

--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -8,6 +8,7 @@
  */
 #include
 #include
+#include
 #include "sched.h"

 /*
@@ -901,10 +902,13 @@ static void print_rq(struct seq_file *m,

 void print_cfs_rq(struct seq_file *m, int cpu, struct cfs_rq *cfs_rq)
 {
-	s64 left_vruntime = -1, zero_vruntime, right_vruntime = -1, left_deadline = -1, spread;
+	s64 left_vruntime = -1, right_vruntime = -1, left_deadline = -1, spread;
+	s64 zero_vruntime = -1, sum_w_vruntime = -1;
 	struct sched_entity *last, *first, *root;
 	struct rq *rq = cpu_rq(cpu);
+	unsigned int sum_shift;
 	unsigned long flags;
+	u64 sum_weight;

 #ifdef CONFIG_FAIR_GROUP_SCHED
 	SEQ_printf(m, "\n");
@@ -925,6 +929,9 @@ void print_cfs_rq(struct seq_file *m, in
 	if (last)
 		right_vruntime = last->vruntime;
 	zero_vruntime = cfs_rq->zero_vruntime;
+	sum_w_vruntime = cfs_rq->sum_w_vruntime;
+	sum_weight = cfs_rq->sum_weight;
+	sum_shift = cfs_rq->sum_shift;
 	raw_spin_rq_unlock_irqrestore(rq, flags);

 	SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "left_deadline",
@@ -933,6 +940,11 @@ void print_cfs_rq(struct seq_file *m, in
 			SPLIT_NS(left_vruntime));
 	SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "zero_vruntime",
 			SPLIT_NS(zero_vruntime));
+	SEQ_printf(m, " .%-30s: %Ld (%d bits)\n", "sum_w_vruntime",
+			sum_w_vruntime, ilog2(abs(sum_w_vruntime)));
+	SEQ_printf(m, " .%-30s: %Lu\n", "sum_weight",
+			sum_weight);
+	SEQ_printf(m, " .%-30s: %u\n", "sum_shift", sum_shift);
 	SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "avg_vruntime",
 			SPLIT_NS(avg_vruntime(cfs_rq)));
 	SEQ_printf(m, " .%-30s: %Ld.%06ld\n", "right_vruntime",
--- a/kernel/sched/fair.c
+++
b/kernel/sched/fair.c
@@ -665,15 +665,20 @@ static inline s64 entity_key(struct cfs_
  * Since zero_vruntime closely tracks the per-task service, these
  * deltas: (v_i - v0), will be in the order of the maximal (virtual) lag
  * induced in the system due to quantisation.
- *
- * Also, we use scale_load_down() to reduce the size.
- *
- * As measured, the max (key * weight) value was ~44 bits for a kernel build.
  */
-static void
-sum_w_vruntime_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
+static inline unsigned long avg_vruntime_weight(struct cfs_rq *cfs_rq, unsigned long w)
+{
+#ifdef CONFIG_64BIT
+	if (cfs_rq->sum_shift)
+		w = max(2UL, w >> cfs_rq->sum_shift);
+#endif
+	return w;
+}
+
+static inline void
+__sum_w_vruntime_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	unsigned long weight = scale_load_down(se->load.weight);
+	unsigned long weight = avg_vruntime_weight(cfs_rq, se->load.weight);
 	s64 key = entity_key(cfs_rq, se);

 	cfs_rq->sum_w_vruntime += key * weight;
@@ -681,9 +686,59 @@ sum_w_vruntime_add(struct cfs_rq *cfs_rq
 }

 static void
+sum_w_vruntime_add_paranoid(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	unsigned long weight;
+	s64 key, tmp;
+
+again:
+	weight = avg_vruntime_weight(cfs_rq, se->load.weight);
+	key = entity_key(cfs_rq, se);
+
+	if (check_mul_overflow(key, weight, &key))
+		goto overflow;
+
+	if (check_add_overflow(cfs_rq->sum_w_vruntime, key, &tmp))
+		goto overflow;
+
+	cfs_rq->sum_w_vruntime = tmp;
+	cfs_rq->sum_weight += weight;
+	return;
+
+overflow:
+	/*
+	 * There's gotta be a limit -- if we're still failing at this point
+	 * there's really nothing much to be done about things.
+	 */
+	BUG_ON(cfs_rq->sum_shift >= 10);
+	cfs_rq->sum_shift++;
+
+	/*
+	 * Note: \Sum (k_i * (w_i >> 1)) != (\Sum (k_i * w_i)) >> 1
+	 */
+	cfs_rq->sum_w_vruntime = 0;
+	cfs_rq->sum_weight = 0;
+
+	for (struct rb_node *node = cfs_rq->tasks_timeline.rb_leftmost;
+	     node; node = rb_next(node))
+		__sum_w_vruntime_add(cfs_rq, __node_2_se(node));
+
+	goto again;
+}
+
+static void
+sum_w_vruntime_add(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	if (sched_feat(PARANOID_AVG))
+		return sum_w_vruntime_add_paranoid(cfs_rq, se);
+
+	__sum_w_vruntime_add(cfs_rq, se);
+}
+
+static void
 sum_w_vruntime_sub(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	unsigned long weight = scale_load_down(se->load.weight);
+	unsigned long weight = avg_vruntime_weight(cfs_rq, se->load.weight);
 	s64 key = entity_key(cfs_rq, se);

 	cfs_rq->sum_w_vruntime -= key * weight;
@@ -725,7 +780,7 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
 	s64 runtime = cfs_rq->sum_w_vruntime;

 	if (curr) {
-		unsigned long w = scale_load_down(curr->load.weight);
+		unsigned long w = avg_vruntime_weight(cfs_rq, curr->load.weight);

 		runtime += entity_key(cfs_rq, curr) * w;
 		weight += w;
@@ -735,7 +790,7 @@ u64 avg_vruntime(struct cfs_rq *cfs_rq)
 		if (runtime < 0)
 			runtime -= (weight - 1);

-		delta = div_s64(runtime, weight);
+		delta = div64_long(runtime, weight);
 	} else if (curr) {
 		/*
 		 * When there is but one element, it is the average.
@@ -801,7 +856,7 @@ static int vruntime_eligible(struct cfs_
 	long load = cfs_rq->sum_weight;

 	if (curr && curr->on_rq) {
-		unsigned long weight = scale_load_down(curr->load.weight);
+		unsigned long weight = avg_vruntime_weight(cfs_rq, curr->load.weight);

 		avg += entity_key(cfs_rq, curr) * weight;
 		load += weight;
@@ -3871,12 +3926,12 @@ static void reweight_entity(struct cfs_r
 	 * Because we keep se->vlag = V - v_i, while: lag_i = w_i*(V - v_i),
 	 * we need to scale se->vlag when w_i changes.
	 */
-	se->vlag = div_s64(se->vlag * se->load.weight, weight);
+	se->vlag = div64_long(se->vlag * se->load.weight, weight);
 	if (se->rel_deadline)
-		se->deadline = div_s64(se->deadline * se->load.weight, weight);
+		se->deadline = div64_long(se->deadline * se->load.weight, weight);

 	if (rel_vprot)
-		vprot = div_s64(vprot * se->load.weight, weight);
+		vprot = div64_long(vprot * se->load.weight, weight);

 	update_load_set(&se->load, weight);

@@ -5180,7 +5235,7 @@ place_entity(struct cfs_rq *cfs_rq, stru
 	 */
 	if (sched_feat(PLACE_LAG) && cfs_rq->nr_queued && se->vlag) {
 		struct sched_entity *curr = cfs_rq->curr;
-		unsigned long load;
+		long load;

 		lag = se->vlag;

@@ -5238,12 +5293,12 @@ place_entity(struct cfs_rq *cfs_rq, stru
 		 */
 		load = cfs_rq->sum_weight;
 		if (curr && curr->on_rq)
-			load += scale_load_down(curr->load.weight);
+			load += avg_vruntime_weight(cfs_rq, curr->load.weight);

-		lag *= load + scale_load_down(se->load.weight);
+		lag *= load + avg_vruntime_weight(cfs_rq, se->load.weight);
 		if (WARN_ON_ONCE(!load))
 			load = 1;
-		lag = div_s64(lag, load);
+		lag = div64_long(lag, load);
 	}

 	se->vruntime = vruntime - lag;
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -58,6 +58,8 @@ SCHED_FEAT(CACHE_HOT_BUDDY, true)
 SCHED_FEAT(DELAY_DEQUEUE, true)
 SCHED_FEAT(DELAY_ZERO, true)

+SCHED_FEAT(PARANOID_AVG, false)
+
 /*
  * Allow wakeup-time preemption of the current task:
  */
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -684,8 +684,9 @@ struct cfs_rq {

 	s64			sum_w_vruntime;
 	u64			sum_weight;
-	u64			zero_vruntime;
+	unsigned int		sum_shift;
+
 #ifdef CONFIG_SCHED_CORE
 	unsigned int		forceidle_seq;
 	u64			zero_vruntime_fi;

From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080625.066102672@infradead.org>
User-Agent: quilt/0.68
Date: Thu, 19 Feb 2026 08:58:46 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com
Subject: [PATCH v2 6/7] sched/fair: Revert 6d71a9c61604 ("sched/fair: Fix EEVDF entity placement bug causing scheduling lag")
References: <20260219075840.162631716@infradead.org>

Zicheng Qu reported that place_entity() doesn't work right, because
avg_vruntime() always includes cfs_rq->curr when it is ->on_rq.

Specifically, the lag scaling in place_entity() relies on avg_vruntime()
being the state *before* placement of the new entity. However, in this
case avg_vruntime() will actually already include the entity, which
breaks things.

Also, Zicheng Qu argues that avg_vruntime should be invariant under
reweight.

IOW commit 6d71a9c61604 ("sched/fair: Fix EEVDF entity placement bug
causing scheduling lag") was wrong!
The issue reported in 6d71a9c61604 could possibly be explained by rounding
artifacts -- notably the extreme weight '2' is outside the range of
avg_vruntime/sum_w_vruntime, since that uses scale_load_down(). By scaling
vruntime by the real weight, but accounting it in avg_vruntime with a
weight a factor 1024 coarser, the average moves significantly. However,
that is now cured.

Tested by reverting 66951e4860d3 ("sched/fair: Fix update_cfs_group() vs
DELAY_DEQUEUE") and tracing vruntime and vlag figures again.

Reported-by: Zicheng Qu
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
Reviewed-by: Vincent Guittot
---
 kernel/sched/fair.c |  148 +++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 124 insertions(+), 24 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -819,17 +819,22 @@ static inline u64 cfs_rq_max_slice(struc
 *
 * -r_max < lag < max(r_max, q)
 */
-static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
+static s64 entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se, u64 avruntime)
 {
 	u64 max_slice = cfs_rq_max_slice(cfs_rq) + TICK_NSEC;
 	s64 vlag, limit;

-	WARN_ON_ONCE(!se->on_rq);
-
-	vlag = avg_vruntime(cfs_rq) - se->vruntime;
+	vlag = avruntime - se->vruntime;
 	limit = calc_delta_fair(max_slice, se);

-	se->vlag = clamp(vlag, -limit, limit);
+	return clamp(vlag, -limit, limit);
+}
+
+static void update_entity_lag(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	WARN_ON_ONCE(!se->on_rq);
+
+	se->vlag = entity_lag(cfs_rq, se, avg_vruntime(cfs_rq));
 }

 /*
@@ -3895,23 +3900,125 @@ dequeue_load_avg(struct cfs_rq *cfs_rq,
 			se_weight(se) * -se->avg.load_sum);
 }

-static void place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags);
+static void
+rescale_entity(struct sched_entity *se, unsigned long weight, bool rel_vprot)
+{
+	unsigned long old_weight = se->load.weight;
+
+	/*
+	 * VRUNTIME
+	 * --------
+	 *
+	 * COROLLARY #1: The virtual runtime of the entity needs to be
+	 * adjusted if re-weight at !0-lag point.
+	 *
+	 * Proof: For contradiction assume this is not true, so we can
+	 * re-weight without changing vruntime at !0-lag point.
+	 *
+	 *             Weight   VRuntime   Avg-VRuntime
+	 *     before     w         v            V
+	 *      after     w'        v'           V'
+	 *
+	 * Since lag needs to be preserved through re-weight:
+	 *
+	 *	lag = (V - v)*w = (V'- v')*w', where v = v'
+	 *	==>	V' = (V - v)*w/w' + v		(1)
+	 *
+	 * Let W be the total weight of the entities before reweight,
+	 * since V' is the new weighted average of entities:
+	 *
+	 *	V' = (WV + w'v - wv) / (W + w' - w)	(2)
+	 *
+	 * by using (1) & (2) we obtain:
+	 *
+	 *	(WV + w'v - wv) / (W + w' - w) = (V - v)*w/w' + v
+	 *	==> (WV-Wv+Wv+w'v-wv)/(W+w'-w) = (V - v)*w/w' + v
+	 *	==> (WV - Wv)/(W + w' - w) + v = (V - v)*w/w' + v
+	 *	==> (V - v)*W/(W + w' - w) = (V - v)*w/w'	(3)
+	 *
+	 * Since we are doing at !0-lag point which means V != v, we
+	 * can simplify (3):
+	 *
+	 *	==>	W / (W + w' - w) = w / w'
+	 *	==>	Ww' = Ww + ww' - ww
+	 *	==>	W * (w' - w) = w * (w' - w)
+	 *	==>	W = w	(re-weight indicates w' != w)
+	 *
+	 * So the cfs_rq contains only one entity, hence vruntime of
+	 * the entity @v should always equal to the cfs_rq's weighted
+	 * average vruntime @V, which means we will always re-weight
+	 * at 0-lag point, thus breach assumption. Proof completed.
+	 *
+	 *
+	 * COROLLARY #2: Re-weight does NOT affect weighted average
+	 * vruntime of all the entities.
+	 *
+	 * Proof: According to corollary #1, Eq.
(1) should be:
+	 *
+	 *	(V - v)*w = (V' - v')*w'
+	 *	==>	v' = V' - (V - v)*w/w'		(4)
+	 *
+	 * According to the weighted average formula, we have:
+	 *
+	 *	V' = (WV - wv + w'v') / (W - w + w')
+	 *	   = (WV - wv + w'(V' - (V - v)w/w')) / (W - w + w')
+	 *	   = (WV - wv + w'V' - Vw + wv) / (W - w + w')
+	 *	   = (WV + w'V' - Vw) / (W - w + w')
+	 *
+	 *	==>	V'*(W - w + w') = WV + w'V' - Vw
+	 *	==>	V' * (W - w) = (W - w) * V	(5)
+	 *
+	 * If the entity is the only one in the cfs_rq, then reweight
+	 * always occurs at 0-lag point, so V won't change. Or else
+	 * there are other entities, hence W != w, then Eq. (5) turns
+	 * into V' = V. So V won't change in either case, proof done.
+	 *
+	 *
+	 * So according to corollary #1 & #2, the effect of re-weight
+	 * on vruntime should be:
+	 *
+	 *	v' = V' - (V - v) * w / w'	(4)
+	 *	   = V  - (V - v) * w / w'
+	 *	   = V  - vl * w / w'
+	 *	   = V  - vl'
+	 */
+	se->vlag = div64_long(se->vlag * old_weight, weight);
+
+	/*
+	 * DEADLINE
+	 * --------
+	 *
+	 * When the weight changes, the virtual time slope changes and
+	 * we should adjust the relative virtual deadline accordingly.
+	 *
+	 *	d' = v' + (d - v)*w/w'
+	 *	   = V' - (V - v)*w/w' + (d - v)*w/w'
+	 *	   = V  - (V - v)*w/w' + (d - v)*w/w'
+	 *	   = V  + (d - V)*w/w'
+	 */
+	if (se->rel_deadline)
+		se->deadline = div64_long(se->deadline * old_weight, weight);
+
+	if (rel_vprot)
+		se->vprot = div64_long(se->vprot * old_weight, weight);
+}

 static void reweight_entity(struct cfs_rq *cfs_rq, struct sched_entity *se,
 			    unsigned long weight)
 {
 	bool curr = cfs_rq->curr == se;
 	bool rel_vprot = false;
-	u64 vprot;
+	u64 avruntime = 0;

 	if (se->on_rq) {
 		/* commit outstanding execution time */
 		update_curr(cfs_rq);
-		update_entity_lag(cfs_rq, se);
-		se->deadline -= se->vruntime;
+		avruntime = avg_vruntime(cfs_rq);
+		se->vlag = entity_lag(cfs_rq, se, avruntime);
+		se->deadline -= avruntime;
 		se->rel_deadline = 1;
 		if (curr && protect_slice(se)) {
-			vprot = se->vprot - se->vruntime;
+			se->vprot -= avruntime;
 			rel_vprot = true;
 		}

@@ -3922,30 +4029,23 @@ static void reweight_entity(struct cfs_r
 	}
 	dequeue_load_avg(cfs_rq, se);

-	/*
-	 * Because we keep se->vlag = V - v_i, while: lag_i = w_i*(V - v_i),
-	 * we need to scale se->vlag when w_i changes.
-	 */
-	se->vlag = div64_long(se->vlag * se->load.weight, weight);
-	if (se->rel_deadline)
-		se->deadline = div64_long(se->deadline * se->load.weight, weight);
-
-	if (rel_vprot)
-		vprot = div64_long(vprot * se->load.weight, weight);
+	rescale_entity(se, weight, rel_vprot);

 	update_load_set(&se->load, weight);

 	do {
 		u32 divider = get_pelt_divider(&se->avg);
-
 		se->avg.load_avg = div_u64(se_weight(se) * se->avg.load_sum, divider);
 	} while (0);

 	enqueue_load_avg(cfs_rq, se);
 	if (se->on_rq) {
-		place_entity(cfs_rq, se, 0);
 		if (rel_vprot)
-			se->vprot = se->vruntime + vprot;
+			se->vprot += avruntime;
+		se->deadline += avruntime;
+		se->rel_deadline = 0;
+		se->vruntime = avruntime - se->vlag;
+
 		update_load_add(&cfs_rq->load, se->load.weight);
 		if (!curr)
 			__enqueue_entity(cfs_rq, se);
@@ -5303,7 +5403,7 @@ place_entity(struct cfs_rq *cfs_rq, stru

 	se->vruntime = vruntime - lag;

-	if (se->rel_deadline) {
+	if (sched_feat(PLACE_REL_DEADLINE) && se->rel_deadline) {
 		se->deadline += se->vruntime;
 		se->rel_deadline = 0;
 		return;

From nobody Thu Apr 2 19:00:24 2026
Message-ID: <20260219080625.183283814@infradead.org>
User-Agent: quilt/0.68
Date: Thu, 19 Feb 2026 08:58:47 +0100
From: Peter Zijlstra
To: mingo@kernel.org
Cc: peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, linux-kernel@vger.kernel.org, wangtao554@huawei.com, quzicheng@huawei.com, kprateek.nayak@amd.com, dsmythies@telus.net, shubhang@os.amperecomputing.com
Subject: [PATCH v2 7/7] sched/fair: Use full weight to __calc_delta()
References: <20260219075840.162631716@infradead.org>

Since we now use the full weight for avg_vruntime(), also make
__calc_delta() use the full value.

Since weight is effectively NICE_0_LOAD, this is 20 bits on 64bit. This
leaves 44 bits for delta_exec, which is ~16k seconds, way longer than any
one tick would ever be, so there is no worry about overflow.

Signed-off-by: Peter Zijlstra (Intel)
Tested-by: K Prateek Nayak
Tested-by: Shubhang Kaushik
Reviewed-by: Vincent Guittot
---
 kernel/sched/fair.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -225,6 +225,7 @@ void __init sched_init_granularity(void)
 	update_sysctl();
 }

+#ifndef CONFIG_64BIT
 #define WMULT_CONST	(~0U)
 #define WMULT_SHIFT	32

@@ -283,6 +284,12 @@ static u64 __calc_delta(u64 delta_exec,

 	return mul_u64_u32_shr(delta_exec, fact, shift);
 }
+#else
+static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
+{
+	return (delta_exec * weight) / lw->weight;
+}
+#endif

 /*
  * delta /= w