From: Christian Brauner
Date: Thu, 24 Apr 2025 14:24:36 +0200
Subject: [PATCH RFC 3/4] pidfs: get rid of __pidfd_prepare()
Message-Id: <20250424-work-pidfs-net-v1-3-0dc97227d854@kernel.org>
References: <20250424-work-pidfs-net-v1-0-0dc97227d854@kernel.org>
In-Reply-To: <20250424-work-pidfs-net-v1-0-0dc97227d854@kernel.org>
To: Oleg Nesterov, Kuniyuki Iwashima, "David S. Miller", Eric Dumazet,
    Jakub Kicinski, Paolo Abeni, Simon Horman
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    netdev@vger.kernel.org, David Rheinsberg, Jan Kara,
    Alexander Mikhalitsyn, Luca Boccassi, Lennart Poettering,
    Daan De Meyer, Mike Yuan, Christian Brauner

Fold __pidfd_prepare() into pidfd_prepare() and rename PIDFD_CLONE to
PIDFD_STALE to indicate that the passed pid might not have task linkage
and that no explicit check for task linkage should be performed.

Signed-off-by: Christian Brauner
---
 fs/pidfs.c                 | 12 +++----
 include/uapi/linux/pidfd.h |  2 +-
 kernel/fork.c              | 78 ++++++++++++++--------------------------------
 3 files changed, 31 insertions(+), 61 deletions(-)

diff --git a/fs/pidfs.c b/fs/pidfs.c
index 8e6c11774c60..3199ec02aaec 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -768,7 +768,7 @@ static inline bool pidfs_pid_valid(struct pid *pid, const struct path *path,
 {
 	enum pid_type type;
 
-	if (flags & PIDFD_CLONE)
+	if (flags & PIDFD_STALE)
 		return true;
 
 	/*
@@ -777,7 +777,7 @@ static inline bool pidfs_pid_valid(struct pid *pid, const struct path *path,
 	 * pidfd has been allocated perform another check that the pid
 	 * is still alive. If it is exit information is available even
 	 * if the task gets reaped before the pidfd is returned to
-	 * userspace. The only exception is PIDFD_CLONE where no task
+	 * userspace. The only exception is PIDFD_STALE where no task
 	 * linkage has been established for @pid yet and the kernel is
 	 * in the middle of process creation so there's nothing for
 	 * pidfs to miss.
@@ -874,11 +874,11 @@ struct file *pidfs_alloc_file(struct pid *pid, unsigned int flags)
 	int ret;
 
 	/*
-	 * Ensure that PIDFD_CLONE can be passed as a flag without
+	 * Ensure that PIDFD_STALE can be passed as a flag without
 	 * overloading other uapi pidfd flags.
 	 */
-	BUILD_BUG_ON(PIDFD_CLONE == PIDFD_THREAD);
-	BUILD_BUG_ON(PIDFD_CLONE == PIDFD_NONBLOCK);
+	BUILD_BUG_ON(PIDFD_STALE == PIDFD_THREAD);
+	BUILD_BUG_ON(PIDFD_STALE == PIDFD_NONBLOCK);
 
 	ret = path_from_stashed(&pid->stashed, pidfs_mnt, get_pid(pid), &path);
 	if (ret < 0)
@@ -887,7 +887,7 @@ struct file *pidfs_alloc_file(struct pid *pid, unsigned int flags)
 	if (!pidfs_pid_valid(pid, &path, flags))
 		return ERR_PTR(-ESRCH);
 
-	flags &= ~PIDFD_CLONE;
+	flags &= ~PIDFD_STALE;
 	pidfd_file = dentry_open(&path, flags, current_cred());
 	/* Raise PIDFD_THREAD explicitly as do_dentry_open() strips it. */
 	if (!IS_ERR(pidfd_file))
diff --git a/include/uapi/linux/pidfd.h b/include/uapi/linux/pidfd.h
index 2970ef44655a..8c1511edd0e9 100644
--- a/include/uapi/linux/pidfd.h
+++ b/include/uapi/linux/pidfd.h
@@ -12,7 +12,7 @@
 #define PIDFD_THREAD O_EXCL
 #ifdef __KERNEL__
 #include
-#define PIDFD_CLONE CLONE_PIDFD
+#define PIDFD_STALE CLONE_PIDFD
 #endif
 
 /* Flags for pidfd_send_signal(). */
diff --git a/kernel/fork.c b/kernel/fork.c
index f7403e1fb0d4..365687e1698f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2035,50 +2035,6 @@ static inline void rcu_copy_process(struct task_struct *p)
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }
 
-/**
- * __pidfd_prepare - allocate a new pidfd_file and reserve a pidfd
- * @pid: the struct pid for which to create a pidfd
- * @flags: flags of the new @pidfd
- * @ret: Where to return the file for the pidfd.
- *
- * Allocate a new file that stashes @pid and reserve a new pidfd number in the
- * caller's file descriptor table. The pidfd is reserved but not installed yet.
- *
- * The helper doesn't perform checks on @pid which makes it useful for pidfds
- * created via CLONE_PIDFD where @pid has no task attached when the pidfd and
- * pidfd file are prepared.
- *
- * If this function returns successfully the caller is responsible to either
- * call fd_install() passing the returned pidfd and pidfd file as arguments in
- * order to install the pidfd into its file descriptor table or they must use
- * put_unused_fd() and fput() on the returned pidfd and pidfd file
- * respectively.
- *
- * This function is useful when a pidfd must already be reserved but there
- * might still be points of failure afterwards and the caller wants to ensure
- * that no pidfd is leaked into its file descriptor table.
- *
- * Return: On success, a reserved pidfd is returned from the function and a new
- * pidfd file is returned in the last argument to the function. On
- * error, a negative error code is returned from the function and the
- * last argument remains unchanged.
- */
-static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
-{
-	struct file *pidfd_file;
-
-	CLASS(get_unused_fd, pidfd)(O_CLOEXEC);
-	if (pidfd < 0)
-		return pidfd;
-
-	pidfd_file = pidfs_alloc_file(pid, flags | O_RDWR);
-	if (IS_ERR(pidfd_file))
-		return PTR_ERR(pidfd_file);
-
-	*ret = pidfd_file;
-	return take_fd(pidfd);
-}
-
 /**
  * pidfd_prepare - allocate a new pidfd_file and reserve a pidfd
  * @pid: the struct pid for which to create a pidfd
@@ -2108,14 +2064,19 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
  */
 int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
 {
-	/*
-	 * While holding the pidfd waitqueue lock removing the task
-	 * linkage for the thread-group leader pid (PIDTYPE_TGID) isn't
-	 * possible. Thus, if there's still task linkage for PIDTYPE_PID
-	 * not having thread-group leader linkage for the pid means it
-	 * wasn't a thread-group leader in the first place.
-	 */
-	scoped_guard(spinlock_irq, &pid->wait_pidfd.lock) {
+	struct file *pidfd_file;
+
+	if (!(flags & PIDFD_STALE)) {
+		/*
+		 * While holding the pidfd waitqueue lock removing the
+		 * task linkage for the thread-group leader pid
+		 * (PIDTYPE_TGID) isn't possible. Thus, if there's still
+		 * task linkage for PIDTYPE_PID not having thread-group
+		 * leader linkage for the pid means it wasn't a
+		 * thread-group leader in the first place.
+		 */
+		guard(spinlock_irq)(&pid->wait_pidfd.lock);
+
 		/* Task has already been reaped. */
 		if (!pid_has_task(pid, PIDTYPE_PID))
 			return -ESRCH;
@@ -2128,7 +2089,16 @@ int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
 			return -ENOENT;
 	}
 
-	return __pidfd_prepare(pid, flags, ret);
+	CLASS(get_unused_fd, pidfd)(O_CLOEXEC);
+	if (pidfd < 0)
+		return pidfd;
+
+	pidfd_file = pidfs_alloc_file(pid, flags | O_RDWR);
+	if (IS_ERR(pidfd_file))
+		return PTR_ERR(pidfd_file);
+
+	*ret = pidfd_file;
+	return take_fd(pidfd);
 }
 
 static void __delayed_free_task(struct rcu_head *rhp)
@@ -2477,7 +2447,7 @@ __latent_entropy struct task_struct *copy_process(
 		 * Note that no task has been attached to @pid yet indicate
 		 * that via CLONE_PIDFD.
 		 */
-		retval = __pidfd_prepare(pid, flags | PIDFD_CLONE, &pidfile);
+		retval = pidfd_prepare(pid, flags | PIDFD_STALE, &pidfile);
 		if (retval < 0)
 			goto bad_fork_free_pid;
 		pidfd = retval;
-- 
2.47.2
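
Illustrative note (not part of the patch): the decision logic that the
merged pidfd_prepare() applies before reserving an fd can be approximated
by the userspace sketch below. It is a model under stated assumptions,
not kernel code. struct model_pid, the MODEL_* flag values and
model_pidfd_prepare() are hypothetical stand-ins, locking is left out
entirely, and the condition guarding -ENOENT is an assumption because the
hunk above only shows the return statement itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real flags are O_EXCL and CLONE_PIDFD. */
#define MODEL_PIDFD_THREAD 0x1u
#define MODEL_PIDFD_STALE  0x2u

/* Models the two task-linkage questions asked via pid_has_task(). */
struct model_pid {
	bool has_pid_task;  /* linkage for PIDTYPE_PID */
	bool has_tgid_task; /* linkage for PIDTYPE_TGID */
};

/*
 * Returns 0 where the real function would go on to reserve an fd and
 * allocate the pidfd file, or a negative errno value otherwise.
 */
static int model_pidfd_prepare(const struct model_pid *pid, unsigned int flags)
{
	if (!(flags & MODEL_PIDFD_STALE)) {
		/* Task has already been reaped. */
		if (!pid->has_pid_task)
			return -ESRCH;

		/*
		 * Assumed condition: a thread-group leader pidfd was
		 * requested for a pid that is not a thread-group leader.
		 */
		if (!(flags & MODEL_PIDFD_THREAD) && !pid->has_tgid_task)
			return -ENOENT;
	}

	/* With the stale flag set, no linkage checks are performed. */
	return 0;
}
```

In this model a pid without any task linkage, which is the situation in
copy_process(), passes only when the stale flag is set; without the flag
the same pid yields -ESRCH.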