[PATCH RFC 0/2] pidfd: add CLONE_AUTOREAP

Christian Brauner posted 2 patches 1 month, 2 weeks ago
[PATCH RFC 0/2] pidfd: add CLONE_AUTOREAP
Posted by Christian Brauner 1 month, 2 weeks ago
Add a new clone3() flag CLONE_AUTOREAP that makes a child process
auto-reap on exit without ever becoming a zombie. This is a per-process
property in contrast to the existing auto-reap mechanism via
SA_NOCLDWAIT or SIG_IGN for SIGCHLD which applies to all children of a
given parent.

With pidfds this is very useful as the parent can monitor the pidfd via
poll and retrieve the exit status from the pidfd.

Currently the only way to automatically reap children is to set
SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped property
affecting all children which makes it unsuitable for libraries or
applications that need selective auto-reaping of specific children while
still being able to wait() on others.

CLONE_AUTOREAP stores an autoreap flag in the child's signal_struct.
When the child exits do_notify_parent() checks this flag and returns
autoreap=true causing exit_notify() to transition the task directly to
EXIT_DEAD. Since the flag lives on the child it survives reparenting: if
the original parent exits and the child is reparented to a subreaper or
init the child still auto-reaps when it eventually exits. This is
cleaner than forcing the subreaper to get SIGCHLD and then reaping it.
If the parent doesn't care the subreaper won't care. If there's a
subreaper that would care it would be easy enough to add a prctl() that
either just turns back on SIGCHLD and turns off auto-reaping or a prctl()
that just notifies the subreaper whenever a child is reparented to it.

CLONE_AUTOREAP requires CLONE_PIDFD because the process will never be
visible to wait(). The parent must use the pidfd to monitor exit via
poll() and retrieve exit status via PIDFD_GET_INFO. No exit signal is
delivered so exit_signal must be zero.
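
The poll() half of this flow can already be tried with interfaces that
exist today. The sketch below is an assumption-laden illustration, not
the selftest from this series: it deliberately does not use
CLONE_AUTOREAP (its value is defined by the patch), so the child is
created with a normal exit_signal and the status is collected with
waitid(P_PIDFD). Under the proposal, exit_signal would be zero and that
last step would be replaced by PIDFD_GET_INFO. The helper name
spawn_and_wait_via_pidfd() is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/sched.h>	/* struct clone_args, CLONE_PIDFD */
#include <poll.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef P_PIDFD
#define P_PIDFD 3	/* waitid() idtype for a pidfd; missing in older libcs */
#endif

/*
 * Illustrative helper: create a child with clone3(CLONE_PIDFD), wait for
 * it to exit by polling the pidfd, then fetch its exit code.
 * Returns the exit code, or -1 if any step fails (e.g. no clone3()).
 */
static int spawn_and_wait_via_pidfd(int exit_code)
{
	struct clone_args args;
	struct pollfd pfd;
	siginfo_t info;
	int pidfd = -1;
	int ret;
	pid_t pid;

	memset(&args, 0, sizeof(args));
	args.flags = CLONE_PIDFD;
	args.pidfd = (uint64_t)(uintptr_t)&pidfd;	/* kernel stores the fd here */
	args.exit_signal = SIGCHLD;	/* today's behaviour: child stays waitable */

	pid = (pid_t)syscall(SYS_clone3, &args, sizeof(args));
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(exit_code);

	/* A pidfd becomes readable (POLLIN) once the process has exited. */
	pfd.fd = pidfd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	do {
		ret = poll(&pfd, 1, -1);
	} while (ret < 0 && errno == EINTR);
	if (ret != 1) {
		close(pidfd);
		return -1;
	}

	/* Without autoreap the zombie still has to be collected explicitly. */
	memset(&info, 0, sizeof(info));
	ret = waitid(P_PIDFD, (id_t)pidfd, &info, WEXITED);
	close(pidfd);
	if (ret < 0 || info.si_code != CLD_EXITED)
		return -1;
	return info.si_status;
}
```

On a kernel with clone3() and pidfd support,
spawn_and_wait_via_pidfd(7) is expected to return 7; it returns -1
where clone3() is unavailable or filtered.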

The flag is not inherited by the autoreap process's own children. Each
child that should be autoreaped must be explicitly created with
CLONE_AUTOREAP.

(Later on we can augment this with another addition CLONE_PIDFD_AUTOKILL
 which would SIGKILL the child process when the pidfd that was returned
 from clone3() is closed. Specifically, when the file referenced by the
 fd from clone3() is closed. The wrinkle here is that it would either
 have to be reset on privilege gaining exec - like pdeath signal - or we
 enforce that autokill only works when no-new-privileges is set.)

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
Christian Brauner (2):
      clone: add CLONE_AUTOREAP
      selftests/pidfd: add CLONE_AUTOREAP tests

 include/linux/sched/signal.h                       |   1 +
 include/uapi/linux/sched.h                         |   1 +
 kernel/fork.c                                      |  16 +-
 kernel/ptrace.c                                    |   3 +-
 kernel/signal.c                                    |   4 +
 tools/testing/selftests/pidfd/.gitignore           |   1 +
 tools/testing/selftests/pidfd/Makefile             |   2 +-
 .../testing/selftests/pidfd/pidfd_autoreap_test.c  | 475 +++++++++++++++++++++
 8 files changed, 500 insertions(+), 3 deletions(-)
---
base-commit: 72c395024dac5e215136cbff793455f065603b06
change-id: 20260214-work-pidfs-autoreap-3ee677e240a8
Re: [PATCH RFC 0/2] pidfd: add CLONE_AUTOREAP
Posted by Linus Torvalds 1 month, 2 weeks ago
On Mon, 16 Feb 2026 at 05:49, Christian Brauner <brauner@kernel.org> wrote:
>
> CLONE_AUTOREAP requires CLONE_PIDFD because the process will never be
> visible to wait().

This seems an unnecessary and counter-productive limitation.

The very *traditional* unix way to do auto-reaping is to fork twice,
and have the "middle" parent just exit.

That makes the final child be re-parented to init, and it is invisible
to wait() - all very much on purpose.

This was (perhaps still is?) very commonly used for starting up
background daemons (together with disassociating from the tty etc).
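
A minimal sketch of that double-fork pattern, for readers who have not
seen it written out. The helper name spawn_detached() and the pipe used
to report the grandchild's pid are illustrative choices, and the tty
and session handling that a real daemon needs is omitted:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper: fork twice and let the "middle" process exit, so
 * the grandchild is reparented away from the caller. The grandchild runs
 * work() if it is non-NULL. On success returns 0 and stores the
 * grandchild's pid in *grandchild; returns -1 on error.
 */
static int spawn_detached(void (*work)(void), pid_t *grandchild)
{
	int pipefd[2];
	int status;
	pid_t middle, gc = -1;
	ssize_t n;

	if (pipe(pipefd) < 0)
		return -1;

	middle = fork();
	if (middle < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	if (middle == 0) {
		/* Middle process: fork the real child, report its pid, exit. */
		close(pipefd[0]);
		gc = fork();
		if (gc == 0) {
			close(pipefd[1]);
			if (work)
				work();
			_exit(0);
		}
		n = write(pipefd[1], &gc, sizeof(gc));
		_exit((gc > 0 && n == (ssize_t)sizeof(gc)) ? 0 : 1);
	}

	/* Original parent: learn the grandchild pid, then reap the middle. */
	close(pipefd[1]);
	n = read(pipefd[0], &gc, sizeof(gc));
	close(pipefd[0]);

	while (waitpid(middle, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	if (n != (ssize_t)sizeof(gc) || gc <= 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	*grandchild = gc;
	return 0;
}
```

After spawn_detached() returns, a waitpid() on the reported pid fails
with ECHILD in the caller, since the grandchild was never its child.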

So I don't mind the new flag, but I think the restriction is
unnecessary and not logical. Sometimes you simply don't *want*
processes visible to wait - or care about a pidfd.

            Linus
Re: [PATCH RFC 0/2] pidfd: add CLONE_AUTOREAP
Posted by Christian Brauner 1 month, 2 weeks ago
On Mon, Feb 16, 2026 at 07:25:48AM -0800, Linus Torvalds wrote:
> On Mon, 16 Feb 2026 at 05:49, Christian Brauner <brauner@kernel.org> wrote:
> >
> > CLONE_AUTOREAP requires CLONE_PIDFD because the process will never be
> > visible to wait().
> 
> This seems an unnecessary and counter-productive limitation.
> 
> The very *traditional* unix way to do auto-reaping is to fork twice,
> and have the "middle" parent just exit.
> 
> That makes the final child be re-parented to init, and it is invisible
> to wait() - all very much on purpose.
> 
> This was (perhaps still is?) very commonly used for starting up
> background daemons (together with disassociating from the tty etc).
> 
> So I don't mind the new flag, but I think the restriction is
> unnecessary and not logical. Sometimes you simply don't *want*
> processes visible to wait - or care about a pidfd.

I'm completely fine removing that restriction and supporting autoreap
without pidfd.