From: "tip-bot2 for Peter Zijlstra"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Cc: "Peter Zijlstra (Intel)", Ingo Molnar, x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [tip: sched/core] sched/fair: Commit to EEVDF
Date: Thu, 10 Aug 2023 07:10:54 -0000
Message-ID: <169165145409.27769.15158486372345244495.tip-bot2@tip-bot2>
In-Reply-To: <20230531124604.137187212@infradead.org>
References: <20230531124604.137187212@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     5e963f2bd4654a202a8a05aa3a86cb0300b10e6c
Gitweb:        https://git.kernel.org/tip/5e963f2bd4654a202a8a05aa3a86cb0300b10e6c
Author:        Peter Zijlstra
AuthorDate:    Wed, 31 May 2023 13:58:47 +02:00
Committer:     Ingo Molnar
CommitterDate: Wed, 19 Jul 2023 09:43:58 +02:00

sched/fair: Commit to EEVDF

EEVDF is a better-defined scheduling policy; as a result it has fewer
heuristics/tunables. There is no compelling reason to keep CFS around.
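For readers new to EEVDF: each entity i gets a virtual deadline
vd_i = ve_i + r_i / w_i, where r_i is the requested slice
(sysctl_sched_min_granularity below) and w_i the load weight, and the
scheduler runs the eligible entity with the earliest virtual deadline.
The following standalone sketch of just the deadline rule is illustrative
only; the weights are taken from the kernel's nice-to-weight table, but the
program, its names and its numbers are not kernel code:

#include <stdio.h>

struct entity {
	const char *name;
	double vruntime;	/* ve_i: virtual time already received */
	double weight;		/* w_i: load weight (nice 0 == 1024) */
	double slice;		/* r_i: requested service, in nanoseconds */
};

int main(void)
{
	/* hypothetical snapshot of two runnable entities */
	struct entity tasks[] = {
		{ "nice0",  1000000.0, 1024.0, 750000.0 },
		{ "nice-5", 1000000.0, 3121.0, 750000.0 },
	};

	for (unsigned int i = 0; i < sizeof(tasks) / sizeof(tasks[0]); i++) {
		/* vd_i = ve_i + r_i / w_i, scaled by the nice-0 weight */
		double deadline = tasks[i].vruntime +
				  tasks[i].slice * 1024.0 / tasks[i].weight;

		printf("%-7s vruntime=%.0f deadline=%.0f\n",
		       tasks[i].name, tasks[i].vruntime, deadline);
	}
	return 0;
}

The heavier (nice -5) entity's deadline moves out by a smaller amount, which
is what gives it a proportionally larger share of CPU time.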
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Ingo Molnar
Link: https://lore.kernel.org/r/20230531124604.137187212@infradead.org
---
 kernel/sched/debug.c    |   6 +-
 kernel/sched/fair.c     | 465 +++------------------------------------
 kernel/sched/features.h |  12 +-
 kernel/sched/sched.h    |   5 +-
 4 files changed, 38 insertions(+), 450 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 18efc6d..f8d190c 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -347,10 +347,7 @@ static __init int sched_init_debug(void)
 	debugfs_create_file("preempt", 0644, debugfs_sched, NULL, &sched_dynamic_fops);
 #endif
 
-	debugfs_create_u32("latency_ns", 0644, debugfs_sched, &sysctl_sched_latency);
 	debugfs_create_u32("min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_min_granularity);
-	debugfs_create_u32("idle_min_granularity_ns", 0644, debugfs_sched, &sysctl_sched_idle_min_granularity);
-	debugfs_create_u32("wakeup_granularity_ns", 0644, debugfs_sched, &sysctl_sched_wakeup_granularity);
 
 	debugfs_create_u32("latency_warn_ms", 0644, debugfs_sched, &sysctl_resched_latency_warn_ms);
 	debugfs_create_u32("latency_warn_once", 0644, debugfs_sched, &sysctl_resched_latency_warn_once);
@@ -866,10 +863,7 @@ static void sched_debug_header(struct seq_file *m)
 	SEQ_printf(m, "  .%-40s: %Ld\n", #x, (long long)(x))
 #define PN(x) \
 	SEQ_printf(m, "  .%-40s: %Ld.%06ld\n", #x, SPLIT_NS(x))
-	PN(sysctl_sched_latency);
 	PN(sysctl_sched_min_granularity);
-	PN(sysctl_sched_idle_min_granularity);
-	PN(sysctl_sched_wakeup_granularity);
 	P(sysctl_sched_child_runs_first);
 	P(sysctl_sched_features);
 #undef PN
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 57e8bc1..0605eb4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -58,22 +58,6 @@
 #include "autogroup.h"
 
 /*
- * Targeted preemption latency for CPU-bound tasks:
- *
- * NOTE: this latency value is not the same as the concept of
- * 'timeslice length' - timeslices in CFS are of variable length
- * and have no persistent notion like in traditional, time-slice
- * based scheduling concepts.
- *
- * (to see the precise effective timeslice length of your workload,
- *  run vmstat and monitor the context-switches (cs) field)
- *
- * (default: 6ms * (1 + ilog(ncpus)), units: nanoseconds)
- */
-unsigned int sysctl_sched_latency			= 6000000ULL;
-static unsigned int normalized_sysctl_sched_latency	= 6000000ULL;
-
-/*
  * The initial- and re-scaling of tunables is configurable
  *
  * Options are:
@@ -95,36 +79,11 @@ unsigned int sysctl_sched_min_granularity		= 750000ULL;
 static unsigned int normalized_sysctl_sched_min_granularity	= 750000ULL;
 
 /*
- * Minimal preemption granularity for CPU-bound SCHED_IDLE tasks.
- * Applies only when SCHED_IDLE tasks compete with normal tasks.
- *
- * (default: 0.75 msec)
- */
-unsigned int sysctl_sched_idle_min_granularity	= 750000ULL;
-
-/*
- * This value is kept at sysctl_sched_latency/sysctl_sched_min_granularity
- */
-static unsigned int sched_nr_latency = 8;
-
-/*
  * After fork, child runs first. If set to 0 (default) then
  * parent will (try to) run first.
  */
 unsigned int sysctl_sched_child_runs_first __read_mostly;
 
-/*
- * SCHED_OTHER wake-up granularity.
- *
- * This option delays the preemption effects of decoupled workloads
- * and reduces their over-scheduling. Synchronous workloads will still
- * have immediate wakeup/sleep latencies.
- *
- * (default: 1 msec * (1 + ilog(ncpus)), units: nanoseconds)
- */
-unsigned int sysctl_sched_wakeup_granularity		= 1000000UL;
-static unsigned int normalized_sysctl_sched_wakeup_granularity	= 1000000UL;
-
 const_debug unsigned int sysctl_sched_migration_cost	= 500000UL;
 
 int sched_thermal_decay_shift;
@@ -279,8 +238,6 @@ static void update_sysctl(void)
 #define SET_SYSCTL(name) \
 	(sysctl_##name = (factor) * normalized_sysctl_##name)
 	SET_SYSCTL(sched_min_granularity);
-	SET_SYSCTL(sched_latency);
-	SET_SYSCTL(sched_wakeup_granularity);
 #undef SET_SYSCTL
 }
 
@@ -888,30 +845,6 @@ struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
 	return __node_2_se(left);
 }
 
-static struct sched_entity *__pick_next_entity(struct sched_entity *se)
-{
-	struct rb_node *next = rb_next(&se->run_node);
-
-	if (!next)
-		return NULL;
-
-	return __node_2_se(next);
-}
-
-static struct sched_entity *pick_cfs(struct cfs_rq *cfs_rq, struct sched_entity *curr)
-{
-	struct sched_entity *left = __pick_first_entity(cfs_rq);
-
-	/*
-	 * If curr is set we have to see if its left of the leftmost entity
-	 * still in the tree, provided there was anything in the tree at all.
-	 */
-	if (!left || (curr && entity_before(curr, left)))
-		left = curr;
-
-	return left;
-}
-
 /*
  * Earliest Eligible Virtual Deadline First
  *
@@ -1008,85 +941,15 @@ int sched_update_scaling(void)
 {
 	unsigned int factor = get_update_sysctl_factor();
 
-	sched_nr_latency = DIV_ROUND_UP(sysctl_sched_latency,
-					sysctl_sched_min_granularity);
-
 #define WRT_SYSCTL(name) \
 	(normalized_sysctl_##name = sysctl_##name / (factor))
 	WRT_SYSCTL(sched_min_granularity);
-	WRT_SYSCTL(sched_latency);
-	WRT_SYSCTL(sched_wakeup_granularity);
 #undef WRT_SYSCTL
 
 	return 0;
 }
 #endif
 
-/*
- * The idea is to set a period in which each task runs once.
- *
- * When there are too many tasks (sched_nr_latency) we have to stretch
- * this period because otherwise the slices get too small.
- *
- * p = (nr <= nl) ? l : l*nr/nl
- */
-static u64 __sched_period(unsigned long nr_running)
-{
-	if (unlikely(nr_running > sched_nr_latency))
-		return nr_running * sysctl_sched_min_granularity;
-	else
-		return sysctl_sched_latency;
-}
-
-static bool sched_idle_cfs_rq(struct cfs_rq *cfs_rq);
-
-/*
- * We calculate the wall-time slice from the period by taking a part
- * proportional to the weight.
- *
- * s = p*P[w/rw]
- */
-static u64 sched_slice(struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-	unsigned int nr_running = cfs_rq->nr_running;
-	struct sched_entity *init_se = se;
-	unsigned int min_gran;
-	u64 slice;
-
-	if (sched_feat(ALT_PERIOD))
-		nr_running = rq_of(cfs_rq)->cfs.h_nr_running;
-
-	slice = __sched_period(nr_running + !se->on_rq);
-
-	for_each_sched_entity(se) {
-		struct load_weight *load;
-		struct load_weight lw;
-		struct cfs_rq *qcfs_rq;
-
-		qcfs_rq = cfs_rq_of(se);
-		load = &qcfs_rq->load;
-
-		if (unlikely(!se->on_rq)) {
-			lw = qcfs_rq->load;
-
-			update_load_add(&lw, se->load.weight);
-			load = &lw;
-		}
-		slice = __calc_delta(slice, se->load.weight, load);
-	}
-
-	if (sched_feat(BASE_SLICE)) {
-		if (se_is_idle(init_se) && !sched_idle_cfs_rq(cfs_rq))
-			min_gran = sysctl_sched_idle_min_granularity;
-		else
-			min_gran = sysctl_sched_min_granularity;
-
-		slice = max_t(u64, slice, min_gran);
-	}
-
-	return slice;
-}
-
 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se);
 
 /*
@@ -1098,35 +961,25 @@ static void update_deadline(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	if ((s64)(se->vruntime - se->deadline) < 0)
 		return;
 
-	if (sched_feat(EEVDF)) {
-		/*
-		 * For EEVDF the virtual time slope is determined by w_i (iow.
-		 * nice) while the request time r_i is determined by
-		 * sysctl_sched_min_granularity.
-		 */
-		se->slice = sysctl_sched_min_granularity;
-
-		/*
-		 * The task has consumed its request, reschedule.
-		 */
-		if (cfs_rq->nr_running > 1) {
-			resched_curr(rq_of(cfs_rq));
-			clear_buddies(cfs_rq, se);
-		}
-	} else {
-		/*
-		 * When many tasks blow up the sched_period; it is possible
-		 * that sched_slice() reports unusually large results (when
-		 * many tasks are very light for example). Therefore impose a
-		 * maximum.
-		 */
-		se->slice = min_t(u64, sched_slice(cfs_rq, se), sysctl_sched_latency);
-	}
+	/*
+	 * For EEVDF the virtual time slope is determined by w_i (iow.
+	 * nice) while the request time r_i is determined by
+	 * sysctl_sched_min_granularity.
+	 */
+	se->slice = sysctl_sched_min_granularity;
 
 	/*
 	 * EEVDF: vd_i = ve_i + r_i / w_i
 	 */
 	se->deadline = se->vruntime + calc_delta_fair(se->slice, se);
+
+	/*
+	 * The task has consumed its request, reschedule.
+	 */
+	if (cfs_rq->nr_running > 1) {
+		resched_curr(rq_of(cfs_rq));
+		clear_buddies(cfs_rq, se);
+	}
 }
 
 #include "pelt.h"
@@ -5055,19 +4908,6 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq) {}
 
 #endif /* CONFIG_SMP */
 
-static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
-{
-#ifdef CONFIG_SCHED_DEBUG
-	s64 d = se->vruntime - cfs_rq->min_vruntime;
-
-	if (d < 0)
-		d = -d;
-
-	if (d > 3*sysctl_sched_latency)
-		schedstat_inc(cfs_rq->nr_spread_over);
-#endif
-}
-
 static void
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
@@ -5219,7 +5059,6 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 
 	check_schedstat_required();
 	update_stats_enqueue_fair(cfs_rq, se, flags);
-	check_spread(cfs_rq, se);
 	if (!curr)
 		__enqueue_entity(cfs_rq, se);
 	se->on_rq = 1;
@@ -5241,17 +5080,6 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	}
 }
 
-static void __clear_buddies_last(struct sched_entity *se)
-{
-	for_each_sched_entity(se) {
-		struct cfs_rq *cfs_rq = cfs_rq_of(se);
-		if (cfs_rq->last != se)
-			break;
-
-		cfs_rq->last = NULL;
-	}
-}
-
 static void __clear_buddies_next(struct sched_entity *se)
 {
 	for_each_sched_entity(se) {
@@ -5263,27 +5091,10 @@ static void __clear_buddies_next(struct sched_entity *se)
 	}
 }
 
-static void __clear_buddies_skip(struct sched_entity *se)
-{
-	for_each_sched_entity(se) {
-		struct cfs_rq *cfs_rq = cfs_rq_of(se);
-		if (cfs_rq->skip != se)
-			break;
-
-		cfs_rq->skip = NULL;
-	}
-}
-
 static void clear_buddies(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	if (cfs_rq->last == se)
-		__clear_buddies_last(se);
-
 	if (cfs_rq->next == se)
 		__clear_buddies_next(se);
-
-	if (cfs_rq->skip == se)
-		__clear_buddies_skip(se);
 }
 
 static __always_inline void return_cfs_rq_runtime(struct cfs_rq *cfs_rq);
@@ -5341,45 +5152,6 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	update_idle_cfs_rq_clock_pelt(cfs_rq);
 }
 
-/*
- * Preempt the current task with a newly woken task if needed:
- */
-static void
-check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
-{
-	unsigned long delta_exec;
-	struct sched_entity *se;
-	s64 delta;
-
-	delta_exec = curr->sum_exec_runtime - curr->prev_sum_exec_runtime;
-	if (delta_exec > curr->slice) {
-		resched_curr(rq_of(cfs_rq));
-		/*
-		 * The current task ran long enough, ensure it doesn't get
-		 * re-elected due to buddy favours.
-		 */
-		clear_buddies(cfs_rq, curr);
-		return;
-	}
-
-	/*
-	 * Ensure that a task that missed wakeup preemption by a
-	 * narrow margin doesn't have to wait for a full slice.
-	 * This also mitigates buddy induced latencies under load.
-	 */
-	if (delta_exec < sysctl_sched_min_granularity)
-		return;
-
-	se = __pick_first_entity(cfs_rq);
-	delta = curr->vruntime - se->vruntime;
-
-	if (delta < 0)
-		return;
-
-	if (delta > curr->slice)
-		resched_curr(rq_of(cfs_rq));
-}
-
 static void
 set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
@@ -5418,9 +5190,6 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	se->prev_sum_exec_runtime = se->sum_exec_runtime;
 }
 
-static int
-wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
-
 /*
  * Pick the next process, keeping these things in mind, in this order:
  * 1) keep things fair between processes/task groups
@@ -5431,53 +5200,14 @@ wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se);
 static struct sched_entity *
 pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 {
-	struct sched_entity *left, *se;
-
-	if (sched_feat(EEVDF)) {
-		/*
-		 * Enabling NEXT_BUDDY will affect latency but not fairness.
-		 */
-		if (sched_feat(NEXT_BUDDY) &&
-		    cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next))
-			return cfs_rq->next;
-
-		return pick_eevdf(cfs_rq);
-	}
-
-	se = left = pick_cfs(cfs_rq, curr);
-
 	/*
-	 * Avoid running the skip buddy, if running something else can
-	 * be done without getting too unfair.
+	 * Enabling NEXT_BUDDY will affect latency but not fairness.
 	 */
-	if (cfs_rq->skip && cfs_rq->skip == se) {
-		struct sched_entity *second;
+	if (sched_feat(NEXT_BUDDY) &&
+	    cfs_rq->next && entity_eligible(cfs_rq, cfs_rq->next))
+		return cfs_rq->next;
 
-		if (se == curr) {
-			second = __pick_first_entity(cfs_rq);
-		} else {
-			second = __pick_next_entity(se);
-			if (!second || (curr && entity_before(curr, second)))
-				second = curr;
-		}
-
-		if (second && wakeup_preempt_entity(second, left) < 1)
-			se = second;
-	}
-
-	if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
-		/*
-		 * Someone really wants this to run. If it's not unfair, run it.
-		 */
-		se = cfs_rq->next;
-	} else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1) {
-		/*
-		 * Prefer last buddy, try to return the CPU to a preempted task.
-		 */
-		se = cfs_rq->last;
-	}
-
-	return se;
+	return pick_eevdf(cfs_rq);
 }
 
 static bool check_cfs_rq_runtime(struct cfs_rq *cfs_rq);
@@ -5494,8 +5224,6 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 	/* throttle cfs_rqs exceeding runtime */
 	check_cfs_rq_runtime(cfs_rq);
 
-	check_spread(cfs_rq, prev);
-
 	if (prev->on_rq) {
 		update_stats_wait_start_fair(cfs_rq, prev);
 		/* Put 'current' back into the tree. */
@@ -5536,9 +5264,6 @@ entity_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr, int queued)
 	    hrtimer_active(&rq_of(cfs_rq)->hrtick_timer))
 		return;
 #endif
-
-	if (!sched_feat(EEVDF) && cfs_rq->nr_running > 1)
-		check_preempt_tick(cfs_rq, curr);
 }
 
 
@@ -6610,8 +6335,7 @@ static void hrtick_update(struct rq *rq)
 	if (!hrtick_enabled_fair(rq) || curr->sched_class != &fair_sched_class)
 		return;
 
-	if (cfs_rq_of(&curr->se)->nr_running < sched_nr_latency)
-		hrtick_start_fair(rq, curr);
+	hrtick_start_fair(rq, curr);
 }
 #else /* !CONFIG_SCHED_HRTICK */
 static inline void
@@ -6652,17 +6376,6 @@ static int sched_idle_rq(struct rq *rq)
 			rq->nr_running);
 }
 
-/*
- * Returns true if cfs_rq only has SCHED_IDLE entities enqueued. Note the use
- * of idle_nr_running, which does not consider idle descendants of normal
- * entities.
- */
-static bool sched_idle_cfs_rq(struct cfs_rq *cfs_rq)
-{
-	return cfs_rq->nr_running &&
-		cfs_rq->nr_running == cfs_rq->idle_nr_running;
-}
-
 #ifdef CONFIG_SMP
 static int sched_idle_cpu(int cpu)
 {
@@ -8205,66 +7918,6 @@ balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
 }
 #endif /* CONFIG_SMP */
 
-static unsigned long wakeup_gran(struct sched_entity *se)
-{
-	unsigned long gran = sysctl_sched_wakeup_granularity;
-
-	/*
-	 * Since its curr running now, convert the gran from real-time
-	 * to virtual-time in his units.
-	 *
-	 * By using 'se' instead of 'curr' we penalize light tasks, so
-	 * they get preempted easier. That is, if 'se' < 'curr' then
-	 * the resulting gran will be larger, therefore penalizing the
-	 * lighter, if otoh 'se' > 'curr' then the resulting gran will
-	 * be smaller, again penalizing the lighter task.
-	 *
-	 * This is especially important for buddies when the leftmost
-	 * task is higher priority than the buddy.
-	 */
-	return calc_delta_fair(gran, se);
-}
-
-/*
- * Should 'se' preempt 'curr'.
- *
- *             |s1
- *        |s2
- *     |s3
- *         g
- *      |<--->|c
- *
- *  w(c, s1) = -1
- *  w(c, s2) =  0
- *  w(c, s3) =  1
- *
- */
-static int
-wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se)
-{
-	s64 gran, vdiff = curr->vruntime - se->vruntime;
-
-	if (vdiff <= 0)
-		return -1;
-
-	gran = wakeup_gran(se);
-	if (vdiff > gran)
-		return 1;
-
-	return 0;
-}
-
-static void set_last_buddy(struct sched_entity *se)
-{
-	for_each_sched_entity(se) {
-		if (SCHED_WARN_ON(!se->on_rq))
-			return;
-		if (se_is_idle(se))
-			return;
-		cfs_rq_of(se)->last = se;
-	}
-}
-
 static void set_next_buddy(struct sched_entity *se)
 {
 	for_each_sched_entity(se) {
@@ -8276,12 +7929,6 @@ static void set_next_buddy(struct sched_entity *se)
 	}
 }
 
-static void set_skip_buddy(struct sched_entity *se)
-{
-	for_each_sched_entity(se)
-		cfs_rq_of(se)->skip = se;
-}
-
 /*
  * Preempt the current task with a newly woken task if needed:
  */
@@ -8290,7 +7937,6 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
 	struct task_struct *curr = rq->curr;
 	struct sched_entity *se = &curr->se, *pse = &p->se;
 	struct cfs_rq *cfs_rq = task_cfs_rq(curr);
-	int scale = cfs_rq->nr_running >= sched_nr_latency;
 	int next_buddy_marked = 0;
 	int cse_is_idle, pse_is_idle;
 
@@ -8306,7 +7952,7 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
 	if (unlikely(throttled_hierarchy(cfs_rq_of(pse))))
 		return;
 
-	if (sched_feat(NEXT_BUDDY) && scale && !(wake_flags & WF_FORK)) {
+	if (sched_feat(NEXT_BUDDY) && !(wake_flags & WF_FORK)) {
 		set_next_buddy(pse);
 		next_buddy_marked = 1;
 	}
@@ -8354,44 +8000,16 @@ static void check_preempt_wakeup(struct rq *rq, struct task_struct *p, int wake_
 	cfs_rq = cfs_rq_of(se);
 	update_curr(cfs_rq);
 
-	if (sched_feat(EEVDF)) {
-		/*
-		 * XXX pick_eevdf(cfs_rq) != se ?
-		 */
-		if (pick_eevdf(cfs_rq) == pse)
-			goto preempt;
-
-		return;
-	}
-
-	if (wakeup_preempt_entity(se, pse) == 1) {
-		/*
-		 * Bias pick_next to pick the sched entity that is
-		 * triggering this preemption.
-		 */
-		if (!next_buddy_marked)
-			set_next_buddy(pse);
+	/*
+	 * XXX pick_eevdf(cfs_rq) != se ?
+	 */
+	if (pick_eevdf(cfs_rq) == pse)
 		goto preempt;
-	}
 
 	return;
 
 preempt:
 	resched_curr(rq);
-	/*
-	 * Only set the backward buddy when the current task is still
-	 * on the rq. This can happen when a wakeup gets interleaved
-	 * with schedule on the ->pre_schedule() or idle_balance()
-	 * point, either of which can * drop the rq lock.
-	 *
-	 * Also, during early boot the idle thread is in the fair class,
-	 * for obvious reasons its a bad idea to schedule back to it.
-	 */
-	if (unlikely(!se->on_rq || curr == rq->idle))
-		return;
-
-	if (sched_feat(LAST_BUDDY) && scale && entity_is_task(se))
-		set_last_buddy(se);
 }
 
 #ifdef CONFIG_SMP
@@ -8592,8 +8210,6 @@ static void put_prev_task_fair(struct rq *rq, struct task_struct *prev)
 
 /*
  * sched_yield() is very simple
- *
- * The magic of dealing with the ->skip buddy is in pick_next_entity.
  */
 static void yield_task_fair(struct rq *rq)
 {
@@ -8609,23 +8225,19 @@ static void yield_task_fair(struct rq *rq)
 
 	clear_buddies(cfs_rq, se);
 
-	if (sched_feat(EEVDF) || curr->policy != SCHED_BATCH) {
-		update_rq_clock(rq);
-		/*
-		 * Update run-time statistics of the 'current'.
-		 */
-		update_curr(cfs_rq);
-		/*
-		 * Tell update_rq_clock() that we've just updated,
-		 * so we don't do microscopic update in schedule()
-		 * and double the fastpath cost.
-		 */
-		rq_clock_skip_update(rq);
-	}
-	if (sched_feat(EEVDF))
-		se->deadline += calc_delta_fair(se->slice, se);
+	update_rq_clock(rq);
+	/*
+	 * Update run-time statistics of the 'current'.
+	 */
+	update_curr(cfs_rq);
+	/*
+	 * Tell update_rq_clock() that we've just updated,
+	 * so we don't do microscopic update in schedule()
+	 * and double the fastpath cost.
+	 */
+	rq_clock_skip_update(rq);
 
-	set_skip_buddy(se);
+	se->deadline += calc_delta_fair(se->slice, se);
 }
 
 static bool yield_to_task_fair(struct rq *rq, struct task_struct *p)
@@ -8873,8 +8485,7 @@ static int task_hot(struct task_struct *p, struct lb_env *env)
 	 * Buddy candidates are cache hot:
 	 */
 	if (sched_feat(CACHE_HOT_BUDDY) && env->dst_rq->nr_running &&
-	    (&p->se == cfs_rq_of(&p->se)->next ||
-	     &p->se == cfs_rq_of(&p->se)->last))
+	    (&p->se == cfs_rq_of(&p->se)->next))
 		return 1;
 
 	if (sysctl_sched_migration_cost == -1)
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 2a830ec..54334ca 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -15,13 +15,6 @@ SCHED_FEAT(PLACE_DEADLINE_INITIAL, true)
 SCHED_FEAT(NEXT_BUDDY, false)
 
 /*
- * Prefer to schedule the task that ran last (when we did
- * wake-preempt) as that likely will touch the same data, increases
- * cache locality.
- */
-SCHED_FEAT(LAST_BUDDY, true)
-
-/*
  * Consider buddies to be cache hot, decreases the likeliness of a
  * cache buddy being migrated away, increases cache locality.
  */
@@ -93,8 +86,3 @@ SCHED_FEAT(UTIL_EST, true)
 SCHED_FEAT(UTIL_EST_FASTUP, true)
 
 SCHED_FEAT(LATENCY_WARN, false)
-
-SCHED_FEAT(ALT_PERIOD, true)
-SCHED_FEAT(BASE_SLICE, true)
-
-SCHED_FEAT(EEVDF, true)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index aa5b293..f814bb7 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -570,8 +570,6 @@ struct cfs_rq {
 	 */
 	struct sched_entity	*curr;
 	struct sched_entity	*next;
-	struct sched_entity	*last;
-	struct sched_entity	*skip;
 
 #ifdef CONFIG_SCHED_DEBUG
 	unsigned int		nr_spread_over;
@@ -2508,9 +2506,6 @@ extern const_debug unsigned int sysctl_sched_migration_cost;
 extern unsigned int sysctl_sched_min_granularity;
 
 #ifdef CONFIG_SCHED_DEBUG
-extern unsigned int sysctl_sched_latency;
-extern unsigned int sysctl_sched_idle_min_granularity;
-extern unsigned int sysctl_sched_wakeup_granularity;
 extern int sysctl_resched_latency_warn_ms;
 extern int sysctl_resched_latency_warn_once;
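
The pick_eevdf() call that pick_next_entity() above now relies on
exclusively chooses, among the entities whose vruntime has not run ahead of
the queue's weighted-average vruntime (the "eligible" ones), the entity with
the earliest virtual deadline. A minimal illustrative sketch of that rule,
with made-up numbers and a linear scan instead of the kernel's augmented
rbtree:

#include <stdio.h>
#include <stddef.h>

struct entity {
	const char *name;
	double vruntime;	/* virtual time already received */
	double deadline;	/* virtual deadline, see vd_i above */
	double weight;		/* load weight */
};

/* pick the eligible entity with the earliest virtual deadline */
static const struct entity *pick_eevdf_sketch(const struct entity *es, size_t n)
{
	double vsum = 0.0, wsum = 0.0, avg;
	const struct entity *best = NULL;
	size_t i;

	/* weighted average vruntime stands in for the queue's virtual time V */
	for (i = 0; i < n; i++) {
		vsum += es[i].vruntime * es[i].weight;
		wsum += es[i].weight;
	}
	avg = vsum / wsum;

	for (i = 0; i < n; i++) {
		if (es[i].vruntime > avg)	/* ineligible: already ran ahead */
			continue;
		if (!best || es[i].deadline < best->deadline)
			best = &es[i];
	}
	return best;
}

int main(void)
{
	const struct entity es[] = {
		{ "A", 100.0, 160.0, 1024.0 },
		{ "B",  90.0, 180.0, 1024.0 },
		{ "C", 140.0, 150.0, 1024.0 },	/* earliest deadline, but ineligible */
	};
	const struct entity *p = pick_eevdf_sketch(es, 3);

	printf("picked %s\n", p ? p->name : "none");	/* picks A */
	return 0;
}

Entity C has the earliest deadline but has already run ahead of the average,
so A is picked; this eligibility check is what bounds how far any task can
get ahead of its fair share without the wakeup-granularity heuristics that
this patch removes.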