cgroup: Return error when attempting to migrate a zombie process

[RFC PATCH] cgroup: Return error when attempting to migrate a zombie process

Posted by Michal Koutný 2 years, 9 months ago

Zombies aren't migrated. However, the return value of a migration write may
suggest that a zombie process was migrated, causing confusion about the lack
of a cgroup.events:populated change in the origin and target cgroups (e.g.
target cgroup rmdir).

Notify users via the return value that their action had no effect.
(update_dfl_csses migration of zombies still silently passes since it is
not meant to be user-visible migration anyway.)

Suggested-by: Benjamin Berg <benjamin@sipsolutions.net>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cgroup.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Reasons for RFC:
1) Some users may notice the change,
2) EINVAL vs ESRCH,
3) add a selftest?

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 625d7483951c..306547dd7b76 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -2968,7 +2968,8 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
 	 * become trapped in a cpuset, or RT kthread may be born in a
 	 * cgroup with no rt_runtime allocated.  Just say no.
 	 */
-	if (tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY)) {
+	if (tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY) ||
+	    !atomic_read(&tsk->signal->live)) {
 		tsk = ERR_PTR(-EINVAL);
 		goto out_unlock_threadgroup;
 	}
-- 
2.40.1
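
For illustration, a minimal userspace sketch of where the changed return value would become visible. The helper name cg_migrate_pid() and the path passed to it are hypothetical and not part of the patch; which errno a zombie yields (EINVAL or ESRCH) is exactly the open RFC question, so the sketch only propagates whatever the kernel reports.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write @pid into the cgroup.procs file at
 * @procs_path. Returns 0 on success or a negative errno value.
 * Without the patch, a write naming a zombie returns success even
 * though nothing was migrated; with it, the caller gets an error.
 */
static int cg_migrate_pid(const char *procs_path, pid_t pid)
{
	char buf[32];
	ssize_t written;
	int fd, len, ret = 0;

	fd = open(procs_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	len = snprintf(buf, sizeof(buf), "%d\n", (int)pid);
	written = write(fd, buf, (size_t)len);
	if (written < 0)
		ret = -errno;		/* e.g. the error added by the patch */
	else if (written != len)
		ret = -EIO;		/* short write, treated as a failure */

	close(fd);
	return ret;
}
```

A caller would pass something like "<cgroup dir>/cgroup.procs" and treat a negative return as "the process was not moved", instead of waiting for a populated event that never arrives.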

Re: [RFC PATCH] cgroup: Return error when attempting to migrate a zombie process

Posted by Tejun Heo 2 years, 9 months ago

Hello,

On Wed, May 03, 2023 at 02:53:59PM +0200, Michal Koutný wrote:
> Zombies aren't migrated. However, the return value of a migration write may
> suggest that a zombie process was migrated, causing confusion about the lack
> of a cgroup.events:populated change in the origin and target cgroups (e.g.
> target cgroup rmdir).
> 
> Notify users via the return value that their action had no effect.
> (update_dfl_csses migration of zombies still silently passes since it is
> not meant to be user-visible migration anyway.)
> 
> Suggested-by: Benjamin Berg <benjamin@sipsolutions.net>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  kernel/cgroup/cgroup.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> Reasons for RFC:
> 1) Some users may notice the change,
> 2) EINVAL vs ESRCH,
> 3) add a selftest?
> 
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 625d7483951c..306547dd7b76 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -2968,7 +2968,8 @@ struct task_struct *cgroup_procs_write_start(char *buf, bool threadgroup,
>  	 * become trapped in a cpuset, or RT kthread may be born in a
>  	 * cgroup with no rt_runtime allocated.  Just say no.
>  	 */
> -	if (tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY)) {
> +	if (tsk->no_cgroup_migration || (tsk->flags & PF_NO_SETAFFINITY) ||
> +	    !atomic_read(&tsk->signal->live)) {

This seems racy to me. The liveness state can change between here and the
PF_EXITING check in cgroup_migrate_add_task(), right? Wouldn't it be better
to just track how many tasks are migrated and return -ESRCH if none was
migrated?

Thanks.

-- 
tejun
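
A rough userspace model of the counting idea, not kernel code: the struct, the MODEL_PF_EXITING flag and model_migrate() are stand-ins invented for illustration. It assumes only what the thread states, namely that exiting tasks are skipped at the PF_EXITING check, and derives the error from the number of tasks that actually got through.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PF_EXITING 0x1	/* stand-in for the kernel's PF_EXITING */

struct model_task {
	unsigned int flags;
};

/*
 * Hypothetical model of a migration pass: skip exiting tasks the way
 * the PF_EXITING check does, count the rest, and report -ESRCH when
 * nothing was left to migrate.
 */
static int model_migrate(const struct model_task *tasks, size_t n)
{
	size_t i, nr_migrated = 0;

	for (i = 0; i < n; i++) {
		if (tasks[i].flags & MODEL_PF_EXITING)
			continue;	/* exiting tasks are not migrated */
		nr_migrated++;
	}

	return nr_migrated ? 0 : -ESRCH;
}
```

Because the result is computed from what the loop actually did, there is no separate early check whose answer could go stale before the migration runs.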

Re: [RFC PATCH] cgroup: Return error when attempting to migrate a zombie process

Posted by Michal Koutný 2 years, 9 months ago

On Fri, May 05, 2023 at 09:04:37AM -1000, Tejun Heo <tj@kernel.org> wrote:
> This seems racy to me. The liveness state can change between here and the
> PF_EXITING check in cgroup_migrate_add_task(), right?

You're right, the threadgroup lock won't prevent that (as I wrongly assumed
in the patch):

  cgroup_procs_write_start                                 
                                                     do_exit
                                                       exit_signals
                                                         cgroup_threadgroup_change_begin
                                                           tsk->flags |= PF_EXITING
                                                         cgroup_threadgroup_change_end
    percpu_down_write(&cgroup_threadgroup_rwsem)           
      ...                                                  
      atomic_read(&live)                                   
                                                       ...
                                                       atomic_dec_and_test(live)
      ...                                                  
      cgroup_migrate_add_task                              
      ...                                                  
    percpu_up_write(&cgroup_threadgroup_rwsem)
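
The interleaving above can be replayed deterministically in a small userspace model. This is only a sketch: struct race_proc, RACE_PF_EXITING and the helper names are made up for illustration, and the two "sides" are run in a fixed order in one thread rather than as real concurrent tasks.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RACE_PF_EXITING 0x1	/* stand-in for the kernel's PF_EXITING */

struct race_proc {
	atomic_int live;	/* models signal->live */
	unsigned int flags;	/* models tsk->flags */
};

/* Models the early check added by the patch. */
static bool race_early_check_passes(struct race_proc *p)
{
	return atomic_load(&p->live) != 0;
}

/* Models the later PF_EXITING check in the migration path. */
static bool race_add_task_accepts(const struct race_proc *p)
{
	return !(p->flags & RACE_PF_EXITING);
}

/*
 * Replays the diagram step by step. Returns true when the early check
 * passed although the task ended up not being migrated.
 */
static bool race_replay(void)
{
	struct race_proc p;
	bool early_ok, migrated;

	atomic_init(&p.live, 1);
	p.flags = 0;

	p.flags |= RACE_PF_EXITING;		/* exiter: exit_signals() */
	early_ok = race_early_check_passes(&p);	/* migrator: live still 1 */
	atomic_fetch_sub(&p.live, 1);		/* exiter: live drops to 0 */
	migrated = race_add_task_accepts(&p);	/* migrator: task skipped */

	return early_ok && !migrated;
}
```

In this ordering the write would still report success for a task that was skipped, which is the case the patch was meant to catch.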


> Wouldn't it be better to just track how many tasks are migrated and
> return -ESRCH if none was migrated?

Thanks, such an aggregate check sounds better; I'll look into it.

Michal