[PATCH v4 2/3] cgroup: retry find task if threadgroup leader changed

Yi Tao posted 3 patches 3 weeks, 2 days ago
There is a newer version of this series
Posted by Yi Tao 3 weeks, 2 days ago
Between obtaining the threadgroup leader via PID and acquiring the
cgroup attach lock, the threadgroup leader may change, which could lead
to incorrect cgroup migration. Therefore, after acquiring the cgroup
attach lock, we check whether the threadgroup leader has changed, and if
so, retry the operation.

Signed-off-by: Yi Tao <escape@linux.alibaba.com>
---
 kernel/cgroup/cgroup.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index b53ae8fd9681..6e90b79e8fa3 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3010,6 +3010,7 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
 	if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0)
 		return ERR_PTR(-EINVAL);
 
+retry_find_task:
 	rcu_read_lock();
 	if (pid) {
 		tsk = find_task_by_vpid(pid);
@@ -3051,6 +3052,21 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
 
 	cgroup_attach_lock(*threadgroup_locked, tsk);
 
+	if (threadgroup) {
+		if (!thread_group_leader(tsk)) {
+			/*
+			 * A race with de_thread from another thread's exec()
+			 * may strip us of our leadership. If this happens,
+			 * there is no choice but to throw this task away and
+			 * try again; this is
+			 * "double-double-toil-and-trouble-check locking".
+			 */
+			cgroup_attach_unlock(*threadgroup_locked, tsk);
+			put_task_struct(tsk);
+			goto retry_find_task;
+		}
+	}
+
 	return tsk;
 
 out_unlock_rcu:
-- 
2.32.0.3.g01195cf9f
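
The lookup, lock, recheck, retry sequence in the patch above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: `struct group`, `lock_group_leader()`, `change_leader()`, `race_hook` and `demo()` are hypothetical names invented for this illustration, a pthread mutex stands in for the cgroup attach lock, and an integer id stands in for the task. Unlike the patch, which calls thread_group_leader() on the task it already holds, the sketch revalidates by comparing ids.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread group; not a kernel structure. */
struct group {
	pthread_mutex_t lock;	/* plays the role of the attach lock */
	atomic_int leader;	/* id of the current group leader */
};

/* Test hook run between lookup and lock to simulate a racing exec(). */
static void (*race_hook)(struct group *g);

/* Leadership only changes while the lock is held. */
static void change_leader(struct group *g, int new_leader)
{
	pthread_mutex_lock(&g->lock);
	atomic_store(&g->leader, new_leader);
	pthread_mutex_unlock(&g->lock);
}

/* Simulated exec() from a non-leader thread; fires only once. */
static void exec_once(struct group *g)
{
	race_hook = NULL;
	change_leader(g, 2);
}

/*
 * Return the leader id with g->lock held. If the leader changed between
 * the unlocked lookup and taking the lock, drop the lock and look again.
 * *retries counts how many times the lookup was repeated.
 */
static int lock_group_leader(struct group *g, int *retries)
{
	int id;

	assert(g && retries);
	*retries = 0;
	for (;;) {
		id = atomic_load(&g->leader);		/* unlocked lookup */
		if (race_hook)
			race_hook(g);			/* race window */
		pthread_mutex_lock(&g->lock);
		if (id == atomic_load(&g->leader))	/* revalidate */
			return id;			/* lock stays held */
		pthread_mutex_unlock(&g->lock);		/* stale: retry */
		(*retries)++;
	}
}

/* Leader starts as 1; the simulated exec() makes 2 the leader once. */
static int demo(int *retries)
{
	struct group g;
	int id;

	pthread_mutex_init(&g.lock, NULL);
	atomic_init(&g.leader, 1);
	race_hook = exec_once;
	id = lock_group_leader(&g, retries);
	pthread_mutex_unlock(&g.lock);
	pthread_mutex_destroy(&g.lock);
	return id;
}
```

In this sketch the first pass sees leader 1, loses the simulated race, and drops the lock; the second pass sees leader 2 and returns with the lock held. The recheck is only meaningful because leadership cannot change while the lock is held, which is the property the patch relies on for the attach lock.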
Re: [PATCH v4 2/3] cgroup: retry find task if threadgroup leader changed
Posted by Tejun Heo 3 weeks, 2 days ago
On Tue, Sep 09, 2025 at 03:55:29PM +0800, Yi Tao wrote:
> Between obtaining the threadgroup leader via PID and acquiring the
> cgroup attach lock, the threadgroup leader may change, which could lead
> to incorrect cgroup migration. Therefore, after acquiring the cgroup
> attach lock, we check whether the threadgroup leader has changed, and if
> so, retry the operation.
> 
> Signed-off-by: Yi Tao <escape@linux.alibaba.com>

This patch and the thread group lock relocation should come before the
patch that requires them, not after.

Thanks.

-- 
tejun