From nobody Wed Apr 8 22:50:09 2026
From: Zhang Qiao
Subject: [PATCH next 1/2] sched: Init new task's vruntime after select cpu
Date: Mon, 31 Oct 2022 20:51:12 +0800
Message-ID: <20221031125113.72980-2-zhangqiao22@huawei.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221031125113.72980-1-zhangqiao22@huawei.com>
References: <20221031125113.72980-1-zhangqiao22@huawei.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

When creating a new task, we initialize its vruntime in
sched_cgroup_fork(). However, this happens too early and may be
incorrect: the vruntime is initialized from the current CPU, but the
new task actually runs on the CPU assigned in wake_up_new_task().

So call task_fork() after the fork CPU has been selected, and use that
CPU (the one the child will run on) to initialize the new task.

Signed-off-by: Zhang Qiao
---
 kernel/sched/core.c | 7 ++++++-
 kernel/sched/fair.c | 7 +------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e4ce124ec701..ca5677206efd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4627,9 +4627,13 @@ void sched_cgroup_fork(struct task_struct *p, struct kernel_clone_args *kargs)
 	 * so use __set_task_cpu().
 	 */
 	__set_task_cpu(p, smp_processor_id());
+	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+}
+
+void sched_task_fork(struct task_struct *p)
+{
 	if (p->sched_class->task_fork)
 		p->sched_class->task_fork(p);
-	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
 }
 
 void sched_post_fork(struct task_struct *p)
@@ -4682,6 +4686,7 @@ void wake_up_new_task(struct task_struct *p)
 #endif
 	rq = __task_rq_lock(p, &rf);
 	update_rq_clock(rq);
+	sched_task_fork(p);
 	post_init_entity_util_avg(p);
 
 	activate_task(rq, p, ENQUEUE_NOCLOCK);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e4a0b8bd941c..34845d425180 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11603,12 +11603,8 @@ static void task_fork_fair(struct task_struct *p)
 	struct cfs_rq *cfs_rq;
 	struct sched_entity *se = &p->se, *curr;
 	struct rq *rq = this_rq();
-	struct rq_flags rf;
 
-	rq_lock(rq, &rf);
-	update_rq_clock(rq);
-
-	cfs_rq = task_cfs_rq(current);
+	cfs_rq = task_cfs_rq(p);
 	curr = cfs_rq->curr;
 	if (curr) {
 		update_curr(cfs_rq);
@@ -11626,7 +11622,6 @@ static void task_fork_fair(struct task_struct *p)
 	}
 
 	se->vruntime -= cfs_rq->min_vruntime;
-	rq_unlock(rq, &rf);
 }
 
 /*
-- 
2.17.1

From nobody Wed Apr 8 22:50:09 2026
From: Zhang Qiao
Subject: [PATCH next 2/2] sched: Fix sched_child_runs_first
Date: Mon, 31 Oct 2022 20:51:13 +0800
Message-ID: <20221031125113.72980-3-zhangqiao22@huawei.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20221031125113.72980-1-zhangqiao22@huawei.com>
References: <20221031125113.72980-1-zhangqiao22@huawei.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

There are two cases in which sched_child_runs_first may not work
correctly:

1) When clone3() is called with the CLONE_INTO_CGROUP flag, the child
   task is created in a cgroup different from the parent's, so the
   child and the parent are on different cfs_rqs.
2) Fork balancing assigns a different CPU to the new task.

In both cases the child and the parent are attached to different
cfs_rqs (and possibly different CPUs), so their vruntimes cannot be
swapped. I think the swap should only be done when the parent and the
child are on the same cfs_rq.

Add a cfs_rq check before swapping the vruntimes.

Signed-off-by: Zhang Qiao
---
 kernel/sched/fair.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 34845d425180..6061ceb1b7cb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11612,7 +11612,9 @@ static void task_fork_fair(struct task_struct *p)
 	}
 	place_entity(cfs_rq, se, 1);
 
-	if (sysctl_sched_child_runs_first && curr && entity_before(curr, se)) {
+	if (sysctl_sched_child_runs_first &&
+	    cfs_rq == task_cfs_rq(current) &&
+	    curr && entity_before(curr, se)) {
 		/*
 		 * Upon rescheduling, sched_class::put_prev_task() will place
 		 * 'current' within the tree based on its new key value.
-- 
2.17.1
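
Illustration only, not part of the series: the condition that patch 2/2
adds can be modeled in user space. The sketch below is a toy under stated
assumptions, not kernel code. Every toy_* type and function is a
hypothetical stand-in invented for this example. Only the wrap-safe
signed comparison mirrors what entity_before() does, and the same-cfs_rq
test mirrors the new "cfs_rq == task_cfs_rq(current)" check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the kernel types; NOT the real definitions. */
struct toy_cfs_rq {
	int cpu;	/* CPU this runqueue belongs to */
	int cgroup_id;	/* hypothetical id of the owning task group */
};

struct toy_entity {
	uint64_t vruntime;
	const struct toy_cfs_rq *cfs_rq;	/* runqueue the entity is attached to */
};

/* Wrap-safe "a sorts before b" test, in the style of entity_before(). */
static inline bool toy_entity_before(const struct toy_entity *a,
				     const struct toy_entity *b)
{
	return (int64_t)(a->vruntime - b->vruntime) < 0;
}

/*
 * Swap vruntimes so that the child sorts ahead of the parent, but only
 * when both entities are attached to the same cfs_rq. vruntime values
 * taken from different runqueues are not comparable, so the swap is
 * skipped in that case. Returns true if a swap happened.
 */
static inline bool toy_child_runs_first(struct toy_entity *parent,
					struct toy_entity *child,
					bool child_runs_first)
{
	uint64_t tmp;

	if (!child_runs_first)
		return false;
	if (parent->cfs_rq != child->cfs_rq)
		return false;
	if (!toy_entity_before(parent, child))
		return false;

	tmp = parent->vruntime;
	parent->vruntime = child->vruntime;
	child->vruntime = tmp;
	return true;
}
```

With parent and child on the same toy_cfs_rq and the parent's vruntime
smaller, the call swaps the two values. Once the child points at another
toy_cfs_rq (different CPU or different cgroup), the call leaves both
vruntimes untouched, which is the behaviour the patch aims for.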