[PATCH cgroup/for-next v2 0/5] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()

Waiman Long posted 5 patches 1 week, 2 days ago
[PATCH cgroup/for-next v2 0/5] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
Posted by Waiman Long 1 week, 2 days ago
A Sashiko AI review of another cpuset patch found that cpuset_attach()
and cpuset_can_attach() can be passed a cgroup_taskset with tasks
migrating from one source cpuset to multiple destination cpusets and
vice versa.  Further testing of the cpuset code indicates that this is
indeed the case when the v2 cpuset controller is enabled or disabled.

Unfortunately, cpuset_attach() and cpuset_can_attach() still assume that
there will be one source and one destination cpuset, which may result in
incorrect behavior.

This patch series fixes this issue. The first 2 patches are preparatory
and make the remaining patches easier to review.

Patch 3 adds a new attach_old_cs field to task_struct to track the
old cpuset for use when cpuset_migrate_mm() needs to be called in
cpuset_attach().

Patch 4 moves the mpol_rebind_mm() and cpuset_migrate_mm() calls inside
cpuset_attach_task() so that the CLONE_INTO_CGROUP flag of clone(2) works
more like moving a task from one cpuset to another, while also making it
easier to support multiple source and destination cpusets.

Patch 5 makes the necessary changes to support multiple source and
destination cpusets by keeping all the source and destination cpusets
found during task iteration in two singly linked lists, one for source
cpusets and one for destination cpusets.
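
The list bookkeeping described for patch 5 can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the code from the patch: struct cpuset here is a stand-in, and the attach_next link, struct cs_list, cs_list_add() and cs_list_reset() are hypothetical names invented for illustration. Because the model uses a single link per cpuset, a cpuset can sit on only one list at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct cpuset (hypothetical layout). */
struct cpuset {
	const char *name;
	struct cpuset *attach_next;	/* hypothetical singly linked list link */
};

/* One list head; a migration would keep one for sources, one for destinations. */
struct cs_list {
	struct cpuset *head;
	int count;
};

/*
 * Record cs on the list the first time it is seen during task iteration.
 * Returns 1 if it was added, 0 if it was already on the list.
 */
static int cs_list_add(struct cs_list *list, struct cpuset *cs)
{
	struct cpuset *pos;

	assert(list != NULL && cs != NULL);
	for (pos = list->head; pos; pos = pos->attach_next)
		if (pos == cs)
			return 0;
	cs->attach_next = list->head;
	list->head = cs;
	list->count++;
	return 1;
}

/* Unlink every cpuset so the list can be reused for the next migration. */
static void cs_list_reset(struct cs_list *list)
{
	struct cpuset *cs = list->head;

	while (cs) {
		struct cpuset *next = cs->attach_next;

		cs->attach_next = NULL;
		cs = next;
	}
	list->head = NULL;
	list->count = 0;
}
```

In this model, each task visited in the taskset loop would call cs_list_add() once for its old cpuset and once for its new one, so every distinct cpuset is processed exactly once afterwards.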

Waiman Long (5):
  cgroup/cpuset: Add a cpuset_reserve_dl_bw() helper
  cgroup/cpuset: Expand the scope of cpuset_can_attach_check()
  cgroup/cpuset: Replace cpuset_attach_old_cs by a new attach_old_cs
    field in task_struct
  cgroup/cpuset: Move mpol_rebind_mm/cpuset_migrate_mm() calls inside
    cpuset_attach_task()
  cgroup/cpuset: Support multiple source/destination cpusets for
    cpuset_*attach()

 include/linux/sched.h           |   3 +
 kernel/cgroup/cpuset-internal.h |   6 +
 kernel/cgroup/cpuset.c          | 358 +++++++++++++++++++++-----------
 3 files changed, 249 insertions(+), 118 deletions(-)

-- 
2.54.0
Re: [PATCH cgroup/for-next v2 0/5] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
Posted by Ridong Chen 4 days, 22 hours ago

On 2026/5/16 12:24, Waiman Long wrote:
> Sashiko AI review of another cpuset patch had found that cpuset_attach()
> and cpuset_can_attach() can be passed a cgroup_taskset with tasks
> migrating from one source cpuset to multiple destination cpusets and
> vice versa.  Further testing of the cpuset code indicates that this is
> indeed the case when the v2 cpuset controller is enabled or disabled.
> 
> Unfortunately, cpuset_attach() and cpuset_can_attach() still assume that
> there will be one source and one destination cpuset which may result in
> incorrect behavior.
> 

Hi Longman,

I am wondering whether we can use the pids subsystem's approach to solve
this issue, which I think could be much simpler.

For the DL task accounting, we can handle it the same way 
pids_can_attach() does - just call task_cs(task) for each task 
individually inside the can_attach() loop and do the nr_deadline_tasks 
adjustment right there. This eliminates the need to pass per-task source 
cpuset information to the attach() callback entirely for DL accounting 
purposes.
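
A rough userspace model of that per-task accounting idea is sketched below. It is not kernel code: the struct layouts, struct migration, move_dl_accounting() and cancel_dl_accounting() are hypothetical names for illustration, and the cs field merely stands in for what task_cs(task) would return. The undo helper is an assumption about what a cancel path would need if can_attach() adjusts the counters early.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct cpuset; only the DL counter is modeled. */
struct cpuset {
	int nr_deadline_tasks;
};

/* Stand-in for a task; cs models the result of task_cs(task). */
struct task {
	struct cpuset *cs;
	bool is_dl;
};

/* One entry of a hypothetical taskset: a task and its destination cpuset. */
struct migration {
	struct task *task;
	struct cpuset *dst;
};

/* Adjust the DL counters task by task, looking up each task's own source. */
static void move_dl_accounting(struct migration *set, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct cpuset *old = set[i].task->cs;

		assert(old != NULL && set[i].dst != NULL);
		if (!set[i].task->is_dl || old == set[i].dst)
			continue;
		old->nr_deadline_tasks--;
		set[i].dst->nr_deadline_tasks++;
	}
}

/* Reverse the adjustment, as a cancel path would have to do in this model. */
static void cancel_dl_accounting(struct migration *set, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct cpuset *old = set[i].task->cs;

		if (!set[i].task->is_dl || old == set[i].dst)
			continue;
		old->nr_deadline_tasks++;
		set[i].dst->nr_deadline_tasks--;
	}
}
```

Since the source cpuset is read from each task inside the loop, the model needs no shared "old cpuset" state for the DL counters, whether the taskset has one source or several.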

For cpuset_migrate_mm(), I don't think we need per-task oldcs storage in 
task_struct either. The scenarios where multiple source cpusets are 
involved are:

Enabling the cpuset controller: child cpusets inherit the parent's
effective_mems, so attach_mems_updated is false and cpuset_migrate_mm()
is never called.

Disabling the cpuset controller: tasks move from the children to the
parent. Since a child's effective_mems is always a subset of the parent's
effective_mems, even if cpuset_migrate_mm() is triggered, it is
effectively a no-op (no pages need to move from a subset to its superset).

A cgroup.procs write with threads in different cpusets: this is a
many-to-one migration of a single process, so there is only one
group_leader and one mm. We only need to record the leader's oldcs,
which a single static variable can handle.

So in all cases, the migration path only needs one oldcs for the leader. 
We don't need to add a field to task_struct.
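
The subset argument in the controller-disable case can be checked with a small bitmask model. This is a deliberately simplified sketch, not the kernel's page-migration logic: nodemask64, mems_subset() and nodes_to_vacate() are hypothetical names, and the model only assumes that pages need to move off nodes that were allowed before and are not allowed afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical node mask: bit n set means memory node n is allowed. */
typedef uint64_t nodemask64;

/* True when every node in a is also in b. */
static bool mems_subset(nodemask64 a, nodemask64 b)
{
	return (a & ~b) == 0;
}

/*
 * Nodes that pages would have to leave in this model: allowed under the
 * old mask but no longer allowed under the new one.
 */
static nodemask64 nodes_to_vacate(nodemask64 old_mems, nodemask64 new_mems)
{
	return old_mems & ~new_mems;
}
```

With a child allowed nodes {0,1} (mask 0x3) and a parent allowed nodes {0..3} (mask 0xf), mems_subset(0x3, 0xf) holds and nodes_to_vacate(0x3, 0xf) is empty, which matches the claim that moving from a subset to its superset leaves nothing to migrate. The reverse direction, nodes_to_vacate(0xf, 0x3), is non-empty.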

What do you think?



> This patch series is created to fix this issue. The first 2 patches are
> just preparatory patches to make the remaining patches easier to review.
> 
> Patch 3 adds a new attach_old_cs field into task_struct to track the
> old cpuset to be used in case when cpuset_migrate_mm() needs to be
> called in cpuset_attach().
> 
> Patch 4 moves the mpol_rebind_mm() and cpuset_migrate_mm() calls inside
> cpuset_attach_task() so that the CLONE_INTO_CGROUP flag of clone(2) works
> more like moving a task from one cpuset to another, while also making it
> easier to support multiple source and destination cpusets.
> 
> Patch 5 makes the necessary changes to enable the support of multiple
> source and destination cpusets by keeping all the source and destination
> cpusets found during task iterations in two singly linked lists for
> source and destination cpusets respectively.
> 
> Waiman Long (5):
>    cgroup/cpuset: Add a cpuset_reserve_dl_bw() helper
>    cgroup/cpuset: Expand the scope of cpuset_can_attach_check()
>    cgroup/cpuset: Replace cpuset_attach_old_cs by a new attach_old_cs
>      field in task_struct
>    cgroup/cpuset: Move mpol_rebind_mm/cpuset_migrate_mm() calls inside
>      cpuset_attach_task()
>    cgroup/cpuset: Support multiple source/destination cpusets for
>      cpuset_*attach()
> 
>   include/linux/sched.h           |   3 +
>   kernel/cgroup/cpuset-internal.h |   6 +
>   kernel/cgroup/cpuset.c          | 358 +++++++++++++++++++++-----------
>   3 files changed, 249 insertions(+), 118 deletions(-)
> 

-- 
Best regards,
Ridong
Re: [PATCH cgroup/for-next v2 0/5] cgroup/cpuset: Support multiple source/destination cpusets for cpuset_*attach()
Posted by Waiman Long 1 week, 2 days ago
On 5/16/26 12:24 AM, Waiman Long wrote:
> Sashiko AI review of another cpuset patch had found that cpuset_attach()
> and cpuset_can_attach() can be passed a cgroup_taskset with tasks
> migrating from one source cpuset to multiple destination cpusets and
> vice versa.  Further testing of the cpuset code indicates that this is
> indeed the case when the v2 cpuset controller is enabled or disabled.
>
> Unfortunately, cpuset_attach() and cpuset_can_attach() still assume that
> there will be one source and one destination cpuset which may result in
> incorrect behavior.
>
> This patch series is created to fix this issue. The first 2 patches are
> just preparatory patches to make the remaining patches easier to review.
>
> Patch 3 adds a new attach_old_cs field into task_struct to track the
> old cpuset to be used in case when cpuset_migrate_mm() needs to be
> called in cpuset_attach().
>
> Patch 4 moves the mpol_rebind_mm() and cpuset_migrate_mm() calls inside
> cpuset_attach_task() so that the CLONE_INTO_CGROUP flag of clone(2) works
> more like moving a task from one cpuset to another, while also making it
> easier to support multiple source and destination cpusets.
>
> Patch 5 makes the necessary changes to enable the support of multiple
> source and destination cpusets by keeping all the source and destination
> cpusets found during task iterations in two singly linked lists for
> source and destination cpusets respectively.

Sorry, I forgot to add the change log. This version basically addresses
all the AI review comments [1] on my v1 patch.

The 2 major changes are moving cpuset_migrate_mm() into
cpuset_attach_task() and adding a new field in task_struct to record the
old cpuset to be used by cpuset_migrate_mm(). There are also some other
minor changes.

[1] https://lore.kernel.org/lkml/4f49602d35d987e029b8e92a577f0c60@kernel.org/

Cheers,
Longman

>
> Waiman Long (5):
>    cgroup/cpuset: Add a cpuset_reserve_dl_bw() helper
>    cgroup/cpuset: Expand the scope of cpuset_can_attach_check()
>    cgroup/cpuset: Replace cpuset_attach_old_cs by a new attach_old_cs
>      field in task_struct
>    cgroup/cpuset: Move mpol_rebind_mm/cpuset_migrate_mm() calls inside
>      cpuset_attach_task()
>    cgroup/cpuset: Support multiple source/destination cpusets for
>      cpuset_*attach()
>
>   include/linux/sched.h           |   3 +
>   kernel/cgroup/cpuset-internal.h |   6 +
>   kernel/cgroup/cpuset.c          | 358 +++++++++++++++++++++-----------
>   3 files changed, 249 insertions(+), 118 deletions(-)
>