From nobody Fri Jun 19 23:41:51 2026
From: Tim Chen
To: Peter Zijlstra, Vincent Guittot, Ingo Molnar, Juri Lelli
Cc: Yu Chen, Walter Mack, Mel Gorman, linux-kernel@vger.kernel.org, Tim Chen
Subject: [PATCH 1/2] sched/fair: Don't rely on ->exec_start for migration
Date: Fri, 25 Mar 2022 15:54:16 -0700
Message-Id: <68832dfbb60fda030540b5f4e39c5801942689b1.1648228023.git.tim.c.chen@linux.intel.com>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Zijlstra (Intel)

Currently migrate_task_rq_fair() (ab)uses se->exec_start to make
task_hot() fail. In order to preserve ->exec_start, add a ->migrated
flag to sched_entity.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Tim Chen
---
 include/linux/sched.h | 1 +
 kernel/sched/fair.c   | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 75ba8aa60248..0edf16b4d40a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -541,6 +541,7 @@ struct sched_entity {
 	struct rb_node			run_node;
 	struct list_head		group_node;
 	unsigned int			on_rq;
+	unsigned int			migrated;
 
 	u64				exec_start;
 	u64				sum_exec_runtime;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5146163bfabb..2498e97804fd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1004,6 +1004,7 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	/*
 	 * We are starting a new run period:
 	 */
+	se->migrated = 0;
 	se->exec_start = rq_clock_task(rq_of(cfs_rq));
 }
 
@@ -6979,7 +6980,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 	p->se.avg.last_update_time = 0;
 
 	/* We have migrated, no longer consider this task hot */
-	p->se.exec_start = 0;
+	p->se.migrated = 1;
 
 	update_scan_period(p, new_cpu);
 }
@@ -7665,6 +7666,9 @@ static int task_hot(struct task_struct *p, struct lb_env *env)
 	if (sysctl_sched_migration_cost == 0)
 		return 0;
 
+	if (p->se.migrated)
+		return 0;
+
 	delta = rq_clock_task(env->src_rq) - p->se.exec_start;
 
 	return delta < (s64)sysctl_sched_migration_cost;
-- 
2.32.0

From nobody Fri Jun 19 23:41:51 2026
From: Tim Chen
To: Peter Zijlstra, Vincent Guittot, Ingo Molnar, Juri Lelli
Cc: Yu Chen, Walter Mack, Mel Gorman, linux-kernel@vger.kernel.org, Tim Chen
Subject: [PATCH 2/2] sched/fair: Simple runqueue order on migrate
Date: Fri, 25 Mar 2022 15:54:17 -0700
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peter Zijlstra (Intel)

There are a number of problems with SMP migration of fair tasks, but
basically it boils down to a task not receiving equal service on each
runqueue (consider the trivial 3 tasks, 2 CPUs infeasible weight
scenario).

Fully solving that with vruntime placement is 'hard', not least because
a task might be very under-served on a busy runqueue and would need to
be placed so far left on the new runqueue that it would significantly
impact latency for the existing tasks.

Instead, do minimal / basic placement: when moving to a less busy
queue, place at the front of the queue to receive time sooner. When
moving to a busier queue, place at the end of the queue to receive time
later.
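
For illustration only, the placement rule described above can be
sketched in plain C outside the kernel. The toy_* names are
hypothetical stand-ins, not kernel API, and the caller passes in the
virtual slice that the patch itself obtains from sched_vslice(). The
sketch assumes, as the patch does, that ->migrated holds the source
runqueue length recorded at migration time and that "end of the queue"
is approximated by one virtual slice past min_vruntime.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for struct cfs_rq and struct sched_entity. */
struct toy_rq {
	unsigned int nr_running;	/* entities queued on the destination runqueue */
	uint64_t min_vruntime;		/* smallest virtual runtime on that runqueue */
};

struct toy_entity {
	unsigned int migrated;		/* source queue length at migration, 0 if not migrated */
	uint64_t vruntime;
};

/*
 * Place a migrated entity on its new runqueue:
 *  - destination shorter than the source was: go to the front;
 *  - otherwise: go one virtual slice behind the front.
 * Entities that were not migrated are left untouched.
 */
static void toy_place_migrate(const struct toy_rq *rq, struct toy_entity *se,
			      uint64_t vslice)
{
	if (!se->migrated)
		return;

	if (rq->nr_running < se->migrated)
		se->vruntime = rq->min_vruntime;
	else
		se->vruntime = rq->min_vruntime + vslice;
}
```

Note that the sketch overwrites vruntime rather than taking a max
against the old value; this mirrors what the diff below does and is the
reason the placement is "simple" rather than fair in the strict sense.
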
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Tim Chen
Tested-by: Chen Yu
Tested-by: Walter Mack
Reported-by: kernel test robot
---
 kernel/sched/fair.c     | 33 +++++++++++++++++++++++++++++----
 kernel/sched/features.h |  2 ++
 2 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2498e97804fd..c5d2cb3a8f42 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4223,6 +4223,27 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 	se->vruntime = max_vruntime(se->vruntime, vruntime);
 }
 
+static void place_entity_migrate(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	if (!sched_feat(PLACE_MIGRATE))
+		return;
+
+	if (cfs_rq->nr_running < se->migrated) {
+		/*
+		 * Migrated to a shorter runqueue, go first because
+		 * we were under-served on the old runqueue.
+		 */
+		se->vruntime = cfs_rq->min_vruntime;
+		return;
+	}
+
+	/*
+	 * Migrated to a longer runqueue, go last because
+	 * we got over-served on the old runqueue.
+	 */
+	se->vruntime = cfs_rq->min_vruntime + sched_vslice(cfs_rq, se);
+}
+
 static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
 
 static inline bool cfs_bandwidth_used(void);
@@ -4296,6 +4317,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 
 	if (flags & ENQUEUE_WAKEUP)
 		place_entity(cfs_rq, se, 0);
+	else if (se->migrated)
+		place_entity_migrate(cfs_rq, se);
 
 	check_schedstat_required();
 	update_stats_enqueue_fair(cfs_rq, se, flags);
@@ -6930,6 +6953,7 @@ static void detach_entity_cfs_rq(struct sched_entity *se);
  */
 static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 {
+	struct sched_entity *se = &p->se;
 	/*
 	 * As blocked tasks retain absolute vruntime the migration needs to
 	 * deal with this by subtracting the old and adding the new
@@ -6962,7 +6986,7 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 		 * rq->lock and can modify state directly.
 		 */
 		lockdep_assert_rq_held(task_rq(p));
-		detach_entity_cfs_rq(&p->se);
+		detach_entity_cfs_rq(se);
 
 	} else {
 		/*
@@ -6973,14 +6997,15 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
 		 * wakee task is less decayed, but giving the wakee more load
 		 * sounds not bad.
 		 */
-		remove_entity_load_avg(&p->se);
+		remove_entity_load_avg(se);
 	}
 
 	/* Tell new CPU we are migrated */
-	p->se.avg.last_update_time = 0;
+	se->avg.last_update_time = 0;
 
 	/* We have migrated, no longer consider this task hot */
-	p->se.migrated = 1;
+	for_each_sched_entity(se)
+		se->migrated = READ_ONCE(cfs_rq_of(se)->nr_running) + !se->on_rq;
 
 	update_scan_period(p, new_cpu);
 }
diff --git a/kernel/sched/features.h b/kernel/sched/features.h
index 1cf435bbcd9c..681c84fd062c 100644
--- a/kernel/sched/features.h
+++ b/kernel/sched/features.h
@@ -100,3 +100,5 @@ SCHED_FEAT(LATENCY_WARN, false)
 
 SCHED_FEAT(ALT_PERIOD, true)
 SCHED_FEAT(BASE_SLICE, true)
+
+SCHED_FEAT(PLACE_MIGRATE, true)
-- 
2.32.0