From: Chengming Zhou <zhouchengming@bytedance.com>
To: mingo@redhat.com, peterz@infradead.org, vincent.guittot@linaro.org,
        dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
        vschneid@redhat.com
Cc: linux-kernel@vger.kernel.org,
        Chengming Zhou <zhouchengming@bytedance.com>
Subject: [PATCH v3 07/10] sched/fair: allow changing cgroup of new forked task
Date: Mon,  1 Aug 2022 12:27:42 +0800
Message-Id: <20220801042745.7794-8-zhouchengming@bytedance.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220801042745.7794-1-zhouchengming@bytedance.com>
References: <20220801042745.7794-1-zhouchengming@bytedance.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Commit 7dc603c9028e ("sched/fair: Fix PELT integrity for new tasks")
introduced a TASK_NEW state and, with it, an unnecessary limitation:
changing the cgroup of a newly forked task fails.

At that time, task_change_group_fair() could not handle a newly forked
fair task that had not yet been woken up by wake_up_new_task(), because
it would detach the sched_avg of a task that had never been attached.

This patch removes the limitation by adding a check before calling
attach_entity_cfs_rq().

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
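Illustration (not part of the commit message): the guard added to
attach_task_cfs_rq() can be pictured with a small user-space model.
This is only a sketch under assumptions. struct toy_task,
toy_attach_task(), toy_wake_up_new_task() and toy_demo() are
hypothetical names that do not exist in the kernel, attach_count stands
in for the load attach done by attach_entity_cfs_rq(), and the model
assumes that the first wakeup is where a new task's load gets attached.

```c
/* Toy user-space model of the guard; this is NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

struct toy_task {
	bool on_rq;                          /* models p->on_rq */
	unsigned long long sum_exec_runtime; /* models se->sum_exec_runtime */
	int attach_count;                    /* times load was attached */
};

/* Models the guarded call in attach_task_cfs_rq(). */
void toy_attach_task(struct toy_task *p)
{
	/*
	 * A task that is not queued and has never run is assumed to be
	 * a fork that has not seen its first wakeup: skip the attach.
	 */
	if (p->on_rq || p->sum_exec_runtime)
		p->attach_count++;
}

/* Models the first wakeup (assumption: load is attached here). */
void toy_wake_up_new_task(struct toy_task *p)
{
	p->attach_count++;
	p->on_rq = true;
}

/* Walks both cases and returns 0 if the model behaves as described. */
int toy_demo(void)
{
	struct toy_task fresh = { .on_rq = false, .sum_exec_runtime = 0 };
	struct toy_task slept = { .on_rq = false, .sum_exec_runtime = 1000 };

	/* cgroup move between fork and first wakeup: nothing attached */
	toy_attach_task(&fresh);
	assert(fresh.attach_count == 0);

	/* the first wakeup then attaches exactly once */
	toy_wake_up_new_task(&fresh);
	assert(fresh.attach_count == 1);

	/* a sleeping task that has already run is attached on a move */
	toy_attach_task(&slept);
	assert(slept.attach_count == 1);

	return 0;
}
```

In this model a cgroup move that lands between fork and the first
wakeup is a no-op for load tracking, which is why the TASK_NEW check in
cpu_cgroup_can_attach() is no longer needed.
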
 include/linux/sched.h |  5 ++---
 kernel/sched/core.c   | 30 +++++++-----------------------
 kernel/sched/fair.c   |  7 ++++++-
 3 files changed, 15 insertions(+), 27 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 88b8817b827d..b504e55bbf7a 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -95,10 +95,9 @@ struct task_group;
 #define TASK_WAKEKILL			0x0100
 #define TASK_WAKING			0x0200
 #define TASK_NOLOAD			0x0400
-#define TASK_NEW			0x0800
 /* RT specific auxilliary flag to mark RT lock waiters */
-#define TASK_RTLOCK_WAIT		0x1000
-#define TASK_STATE_MAX			0x2000
+#define TASK_RTLOCK_WAIT		0x0800
+#define TASK_STATE_MAX			0x1000
 
 /* Convenience macros for the sake of set_current_state: */
 #define TASK_KILLABLE			(TASK_WAKEKILL | TASK_UNINTERRUPTIBLE)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 614d7180c99e..220bce5e73e0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4500,11 +4500,11 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
 {
 	__sched_fork(clone_flags, p);
 	/*
-	 * We mark the process as NEW here. This guarantees that
+	 * We mark the process as running here. This guarantees that
 	 * nobody will actually run it, and a signal or other external
 	 * event cannot wake it up and insert it on the runqueue either.
 	 */
-	p->__state = TASK_NEW;
+	p->__state = TASK_RUNNING;
 
 	/*
 	 * Make sure we do not leak PI boosting priority to the child.
@@ -4622,7 +4622,6 @@ void wake_up_new_task(struct task_struct *p)
 	struct rq *rq;
 
 	raw_spin_lock_irqsave(&p->pi_lock, rf.flags);
-	WRITE_ONCE(p->__state, TASK_RUNNING);
 #ifdef CONFIG_SMP
 	/*
 	 * Fork balancing, do it here and not earlier because:
@@ -10249,36 +10248,19 @@ static void cpu_cgroup_css_free(struct cgroup_subsys_state *css)
 	sched_unregister_group(tg);
 }
 
+#ifdef CONFIG_RT_GROUP_SCHED
 static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct cgroup_subsys_state *css;
-	int ret = 0;
 
 	cgroup_taskset_for_each(task, css, tset) {
-#ifdef CONFIG_RT_GROUP_SCHED
 		if (!sched_rt_can_attach(css_tg(css), task))
 			return -EINVAL;
-#endif
-		/*
-		 * Serialize against wake_up_new_task() such that if it's
-		 * running, we're sure to observe its full state.
-		 */
-		raw_spin_lock_irq(&task->pi_lock);
-		/*
-		 * Avoid calling sched_move_task() before wake_up_new_task()
-		 * has happened. This would lead to problems with PELT, due to
-		 * move wanting to detach+attach while we're not attached yet.
-		 */
-		if (READ_ONCE(task->__state) == TASK_NEW)
-			ret = -EINVAL;
-		raw_spin_unlock_irq(&task->pi_lock);
-
-		if (ret)
-			break;
 	}
-	return ret;
+	return 0;
 }
+#endif
 
 static void cpu_cgroup_attach(struct cgroup_taskset *tset)
 {
@@ -11114,7 +11096,9 @@ struct cgroup_subsys cpu_cgrp_subsys = {
 	.css_released	= cpu_cgroup_css_released,
 	.css_free	= cpu_cgroup_css_free,
 	.css_extra_stat_show = cpu_extra_stat_show,
+#ifdef CONFIG_RT_GROUP_SCHED
 	.can_attach	= cpu_cgroup_can_attach,
+#endif
 	.attach		= cpu_cgroup_attach,
 	.legacy_cftypes	= cpu_legacy_files,
 	.dfl_cftypes	= cpu_files,
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a32da4e71ddf..ad20a939227d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11603,7 +11603,12 @@ static void attach_task_cfs_rq(struct task_struct *p)
 	struct sched_entity *se = &p->se;
 	struct cfs_rq *cfs_rq = cfs_rq_of(se);
 
-	attach_entity_cfs_rq(se);
+	/*
+	 * We can't detach or attach a forked task that
+	 * hasn't been woken up by wake_up_new_task() yet.
+	 */
+	if (p->on_rq || se->sum_exec_runtime)
+		attach_entity_cfs_rq(se);
 
 	if (!vruntime_normalized(p))
 		se->vruntime += cfs_rq->min_vruntime;
-- 
2.36.1