From nobody Tue Apr 7 14:04:17 2026
From: Christian Brauner
Date: Thu, 26 Feb 2026 00:22:44 +0100
Subject: [PATCH RFC v4 1/2] pidfs: add inode ownership and permission checks
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260226-work-pidfs-inode-owner-v4-1-990032ec9700@kernel.org>
References: <20260226-work-pidfs-inode-owner-v4-0-990032ec9700@kernel.org>
In-Reply-To: <20260226-work-pidfs-inode-owner-v4-0-990032ec9700@kernel.org>
To: linux-fsdevel@vger.kernel.org, Jann Horn
Cc: Kees Cook, Andy Lutomirski, Alexander Viro, Jan Kara,
 linux-kernel@vger.kernel.org, Christian Brauner

Right now we only support trusted.* xattrs, which require CAP_SYS_ADMIN,
so no meaningful permission checking has been needed. But in order to
support user.* xattrs and custom pidfs.* xattrs in the future we need
permission checking for pidfs inodes. Add baseline permission checking
that can later be extended with additional write-time checks for
specific pidfs.* xattrs.

Make the {u,g}id of the task the owner of the pidfs inode. The ownership
is set when the dentry is first stashed and reported dynamically via
getattr since credentials may change due to setuid() and similar
operations.
For kernel threads use root, for exited tasks use the credentials saved
at exit time.

The inode's ownership is dynamically updated via pidfs_update_owner()
which is called from the getattr() and permission() callbacks. It writes
the uid/gid directly to the inode via WRITE_ONCE(). This doesn't
serialize against inode->i_op->setattr() but since pidfs rejects
setattr() this isn't currently an issue. A seqcount-based approach can
be used if setattr() support is added in the future [1].

Save the task's credentials and thread group pid inode number at exit
time so that ownership and permission checks remain functional after
the task has been reaped.

The permission callback updates the inode's ownership via
pidfs_update_owner() and then performs standard POSIX permission
checking via generic_permission() against the inode's ownership and
mode bits (S_IRWXU / 0700).

This is intentionally less strict than ptrace_may_access() because pidfs
currently does not allow operating on data that is completely private to
the process such as its mm or file descriptors. Additional checks will
be needed once that changes.
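
For review purposes, here is a rough userspace model of the credential
selection order described above. It is only a sketch: every struct and
function name below (model_cred, model_pid, model_inode,
model_update_owner) is made up for illustration and is not a kernel
symbol, and locking, RCU and WRITE_ONCE() are deliberately left out.
The real logic is pidfs_update_owner() in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these types exist in the kernel. */
struct model_cred {
	unsigned int uid;
	unsigned int gid;
};

struct model_pid {
	bool is_kthread;			/* models PIDFS_ATTR_BIT_KTHREAD */
	const struct model_cred *live_cred;	/* non-NULL while the task is attached */
	const struct model_cred *exit_cred;	/* non-NULL once exit creds were saved */
};

struct model_inode {
	unsigned int uid;
	unsigned int gid;
};

/*
 * Selection order: kernel threads leave the inode's existing owner
 * untouched, a live task supplies its current credentials, an exited
 * task supplies the credentials saved at exit, and otherwise (task not
 * yet attached during CLONE_PIDFD) the caller's credentials are used.
 */
static void model_update_owner(struct model_inode *inode,
			       const struct model_pid *pid,
			       const struct model_cred *caller)
{
	const struct model_cred *cred;

	assert(inode && pid && caller);

	if (pid->is_kthread)
		return;

	if (pid->live_cred)
		cred = pid->live_cred;
	else if (pid->exit_cred)
		cred = pid->exit_cred;
	else
		cred = caller;

	inode->uid = cred->uid;
	inode->gid = cred->gid;
}
```

Calling model_update_owner() again after the modelled task changes its
credentials overwrites the stored owner, which is the "reported
dynamically" behaviour the commit message refers to.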

Link: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git/log/?h=work.inode.seqcount [1]
Signed-off-by: Christian Brauner
---
 fs/pidfs.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 93 insertions(+), 17 deletions(-)

diff --git a/fs/pidfs.c b/fs/pidfs.c
index 318253344b5c..4f480a814c5a 100644
--- a/fs/pidfs.c
+++ b/fs/pidfs.c
@@ -42,21 +42,30 @@ void pidfs_get_root(struct path *path)
 }
 
 enum pidfs_attr_mask_bits {
-	PIDFS_ATTR_BIT_EXIT	= 0,
-	PIDFS_ATTR_BIT_COREDUMP	= 1,
+	PIDFS_ATTR_BIT_EXIT	= (1U << 0),
+	PIDFS_ATTR_BIT_COREDUMP	= (1U << 1),
+	PIDFS_ATTR_BIT_KTHREAD	= (1U << 2),
 };
 
-struct pidfs_attr {
-	unsigned long attr_mask;
-	struct simple_xattrs *xattrs;
-	struct /* exit info */ {
-		__u64 cgroupid;
-		__s32 exit_code;
-	};
+struct pidfs_exit_attr {
+	__u64 cgroupid;
+	__s32 exit_code;
+	const struct cred *exit_cred;
+	u64 exit_tgid_ino;
+};
+
+struct pidfs_coredump_attr {
 	__u32 coredump_mask;
 	__u32 coredump_signal;
 };
 
+struct pidfs_attr {
+	atomic_t attr_mask;
+	struct simple_xattrs *xattrs;
+	struct pidfs_exit_attr;
+	struct pidfs_coredump_attr;
+};
+
 static struct rhashtable pidfs_ino_ht;
 
 static const struct rhashtable_params pidfs_ino_ht_params = {
@@ -200,6 +209,7 @@ void pidfs_free_pid(struct pid *pid)
 	if (IS_ERR(attr))
 		return;
 
+	put_cred(attr->exit_cred);
 	xattrs = no_free_ptr(attr->xattrs);
 	if (xattrs)
 		simple_xattrs_free(xattrs, NULL);
@@ -364,7 +374,7 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
 
 	attr = READ_ONCE(pid->attr);
 	if (mask & PIDFD_INFO_EXIT) {
-		if (test_bit(PIDFS_ATTR_BIT_EXIT, &attr->attr_mask)) {
+		if (atomic_read(&attr->attr_mask) & PIDFS_ATTR_BIT_EXIT) {
 			smp_rmb();
 			kinfo.mask |= PIDFD_INFO_EXIT;
 #ifdef CONFIG_CGROUPS
@@ -376,7 +386,7 @@ static long pidfd_info(struct file *file, unsigned int cmd, unsigned long arg)
 	}
 
 	if (mask & PIDFD_INFO_COREDUMP) {
-		if (test_bit(PIDFS_ATTR_BIT_COREDUMP, &attr->attr_mask)) {
+		if (atomic_read(&attr->attr_mask) & PIDFS_ATTR_BIT_COREDUMP) {
 			smp_rmb();
 			kinfo.mask |= PIDFD_INFO_COREDUMP | PIDFD_INFO_COREDUMP_SIGNAL;
 			kinfo.coredump_mask = attr->coredump_mask;
@@ -674,6 +684,7 @@ void pidfs_exit(struct task_struct *tsk)
 {
 	struct pid *pid = task_pid(tsk);
 	struct pidfs_attr *attr;
+	unsigned int mask;
 #ifdef CONFIG_CGROUPS
 	struct cgroup *cgrp;
 #endif
@@ -703,17 +714,22 @@ void pidfs_exit(struct task_struct *tsk)
 	 * is put
 	 */
 
-#ifdef CONFIG_CGROUPS
 	rcu_read_lock();
+#ifdef CONFIG_CGROUPS
 	cgrp = task_dfl_cgroup(tsk);
 	attr->cgroupid = cgroup_id(cgrp);
-	rcu_read_unlock();
 #endif
+	attr->exit_cred = get_cred(__task_cred(tsk));
+	rcu_read_unlock();
+	attr->exit_tgid_ino = task_tgid(tsk)->ino;
 	attr->exit_code = tsk->exit_code;
 
 	/* Ensure that PIDFD_GET_INFO sees either all or nothing. */
 	smp_wmb();
-	set_bit(PIDFS_ATTR_BIT_EXIT, &attr->attr_mask);
+	mask = PIDFS_ATTR_BIT_EXIT;
+	if (unlikely(tsk->flags & PF_KTHREAD))
+		mask |= PIDFS_ATTR_BIT_KTHREAD;
+	atomic_or(mask, &attr->attr_mask);
 }
 
 #ifdef CONFIG_COREDUMP
@@ -735,12 +751,49 @@ void pidfs_coredump(const struct coredump_params *cprm)
 	/* Expose the signal number that caused the coredump. */
 	attr->coredump_signal = cprm->siginfo->si_signo;
 	smp_wmb();
-	set_bit(PIDFS_ATTR_BIT_COREDUMP, &attr->attr_mask);
+	atomic_or(PIDFS_ATTR_BIT_COREDUMP, &attr->attr_mask);
 }
 #endif
 
 static struct vfsmount *pidfs_mnt __ro_after_init;
 
+static void pidfs_update_owner(struct inode *inode)
+{
+	struct pid *pid = inode->i_private;
+	struct task_struct *task;
+	struct pidfs_attr *attr;
+	const struct cred *cred;
+
+	VFS_WARN_ON_ONCE(!pid);
+
+	attr = READ_ONCE(pid->attr);
+	VFS_WARN_ON_ONCE(!attr);
+
+	if (unlikely(atomic_read(&attr->attr_mask) & PIDFS_ATTR_BIT_KTHREAD))
+		return;
+
+	guard(rcu)();
+	task = pid_task(pid, PIDTYPE_PID);
+	if (task) {
+		cred = __task_cred(task);
+		WRITE_ONCE(inode->i_uid, cred->uid);
+		WRITE_ONCE(inode->i_gid, cred->gid);
+		return;
+	}
+
+	/*
+	 * During copy_process() with CLONE_PIDFD the task hasn't been
+	 * attached to the pid yet so pid_task() returns NULL and
+	 * there's no exit_cred as the task obviously hasn't exited. Use
+	 * the parent's credentials.
+	 */
+	cred = attr->exit_cred;
+	if (!cred)
+		cred = current_cred();
+	WRITE_ONCE(inode->i_uid, cred->uid);
+	WRITE_ONCE(inode->i_gid, cred->gid);
+}
+
 /*
  * The vfs falls back to simple_setattr() if i_op->setattr() isn't
  * implemented. Let's reject it completely until we have a clean
@@ -756,6 +809,9 @@ static int pidfs_getattr(struct mnt_idmap *idmap, const struct path *path,
 			 struct kstat *stat, u32 request_mask,
 			 unsigned int query_flags)
 {
+	struct inode *inode = d_inode(path->dentry);
+
+	pidfs_update_owner(inode);
 	return anon_inode_getattr(idmap, path, stat, request_mask, query_flags);
 }
 
@@ -773,10 +829,24 @@ static ssize_t pidfs_listxattr(struct dentry *dentry, char *buf, size_t size)
 	return simple_xattr_list(inode, xattrs, buf, size);
 }
 
+static int pidfs_permission(struct mnt_idmap *idmap, struct inode *inode,
+			    int mask)
+{
+	struct pid *pid = inode->i_private;
+	struct pidfs_attr *attr = READ_ONCE(pid->attr);
+
+	if (unlikely(atomic_read(&attr->attr_mask) & PIDFS_ATTR_BIT_KTHREAD))
+		return -EPERM;
+
+	pidfs_update_owner(inode);
+	return generic_permission(&nop_mnt_idmap, inode, mask);
+}
+
 static const struct inode_operations pidfs_inode_operations = {
 	.getattr	= pidfs_getattr,
 	.setattr	= pidfs_setattr,
 	.listxattr	= pidfs_listxattr,
+	.permission	= pidfs_permission,
 };
 
 static void pidfs_evict_inode(struct inode *inode)
@@ -835,7 +905,7 @@ static struct pid *pidfs_ino_get_pid(u64 ino)
 	attr = READ_ONCE(pid->attr);
 	if (IS_ERR_OR_NULL(attr))
 		return NULL;
-	if (test_bit(PIDFS_ATTR_BIT_EXIT, &attr->attr_mask))
+	if (atomic_read(&attr->attr_mask) & PIDFS_ATTR_BIT_EXIT)
 		return NULL;
 	/* Within our pid namespace hierarchy? */
 	if (pid_vnr(pid) == 0)
@@ -949,6 +1019,7 @@ static void pidfs_put_data(void *data)
 int pidfs_register_pid(struct pid *pid)
 {
 	struct pidfs_attr *new_attr __free(kfree) = NULL;
+	struct task_struct *task;
 	struct pidfs_attr *attr;
 
 	might_sleep();
@@ -975,6 +1046,9 @@ int pidfs_register_pid(struct pid *pid)
 	if (unlikely(attr))
 		return 0;
 
+	task = pid_task(pid, PIDTYPE_PID);
+	if (task && (task->flags & PF_KTHREAD))
+		atomic_or(PIDFS_ATTR_BIT_KTHREAD, &new_attr->attr_mask);
 	pid->attr = no_free_ptr(new_attr);
 	return 0;
 }
@@ -983,7 +1057,8 @@ static struct dentry *pidfs_stash_dentry(struct dentry **stashed,
 					 struct dentry *dentry)
 {
 	int ret;
-	struct pid *pid = d_inode(dentry)->i_private;
+	struct inode *inode = d_inode(dentry);
+	struct pid *pid = inode->i_private;
 
 	VFS_WARN_ON_ONCE(stashed != &pid->stashed);
 
@@ -991,6 +1066,7 @@ static struct dentry *pidfs_stash_dentry(struct dentry **stashed,
 	if (ret)
 		return ERR_PTR(ret);
 
+	pidfs_update_owner(inode);
 	return stash_dentry(stashed, dentry);
 }
 
-- 
2.47.3

From nobody Tue Apr 7 14:04:17 2026
From: Christian Brauner
Date: Thu, 26 Feb 2026 00:22:45 +0100
Subject: [PATCH RFC v4 2/2] selftests/pidfd: add inode ownership and permission tests
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260226-work-pidfs-inode-owner-v4-2-990032ec9700@kernel.org>
References: <20260226-work-pidfs-inode-owner-v4-0-990032ec9700@kernel.org>
In-Reply-To: <20260226-work-pidfs-inode-owner-v4-0-990032ec9700@kernel.org>
To: linux-fsdevel@vger.kernel.org, Jann Horn
Cc: Kees Cook, Andy Lutomirski, Alexander Viro, Jan Kara,
 linux-kernel@vger.kernel.org, Christian Brauner

Test the pidfs inode ownership reporting (via fstat) and the permission
model (via user.* xattr operations that trigger pidfs_permission()):

Ownership tests:
- owner_self: own pidfd reports caller's uid/gid
- owner_child: child pidfd reports correct ownership
- owner_child_changed_uid: ownership tracks live credential changes
- owner_exited_child: ownership persists after exit and reap
- owner_exited_child_changed_uid: exit_cred preserves changed credentials
- owner_kthread: kernel thread pidfd reports root ownership

Permission tests:
- permission_same_user: same-user xattr access succeeds (EOPNOTSUPP)
- permission_different_user_denied: cross-user access denied (EACCES)

The user.* xattr namespace is used to exercise pidfs_permission() from
userspace: xattr_permission() calls inode_permission() for user.* on
S_IFREG inodes, so fgetxattr() returns EOPNOTSUPP when permission is
granted (no handler) and EACCES when denied.

Tests requiring root skip gracefully via SKIP().
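
Outside the selftest harness, the errno-based probe described above
could look roughly like the sketch below. The names probe_result,
classify_xattr_errno() and probe_pidfd_permission(), as well as the
"user.probe" attribute name, are illustrative assumptions and not part
of this series. On a kernel without patch 1/2 the result is not
meaningful, so anything other than EOPNOTSUPP or EACCES is reported as
unknown.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/xattr.h>
#include <unistd.h>

enum probe_result {
	PROBE_GRANTED,	/* permission passed, no user.* handler: EOPNOTSUPP */
	PROBE_DENIED,	/* permission callback refused: EACCES */
	PROBE_UNKNOWN,	/* anything else, e.g. a kernel without this series */
};

/* Classify the errno value left behind by a failed fgetxattr() on a pidfd. */
static enum probe_result classify_xattr_errno(int err)
{
	if (err == EOPNOTSUPP)
		return PROBE_GRANTED;
	if (err == EACCES)
		return PROBE_DENIED;
	return PROBE_UNKNOWN;
}

/* Open a pidfd for @pid and probe it with an arbitrary user.* name. */
static enum probe_result probe_pidfd_permission(pid_t pid)
{
#ifdef SYS_pidfd_open
	char buf;
	int fd, err;

	fd = (int)syscall(SYS_pidfd_open, pid, 0);
	if (fd < 0)
		return PROBE_UNKNOWN;

	/* Only the failure errno matters; the attribute value is never used. */
	err = fgetxattr(fd, "user.probe", &buf, sizeof(buf)) < 0 ? errno : 0;
	close(fd);
	return classify_xattr_errno(err);
#else
	/* Headers too old to know pidfd_open(); nothing to probe. */
	(void)pid;
	return PROBE_UNKNOWN;
#endif
}
```

A caller would run probe_pidfd_permission(getpid()) for the same-user
case and expect PROBE_GRANTED on a patched kernel. The cross-user case
needs the uid drop that permission_different_user_denied performs.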

Signed-off-by: Christian Brauner
---
 tools/testing/selftests/pidfd/.gitignore           |   1 +
 tools/testing/selftests/pidfd/Makefile             |   2 +-
 .../selftests/pidfd/pidfd_inode_owner_test.c       | 314 +++++++++++++++++++++
 3 files changed, 316 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/pidfd/.gitignore b/tools/testing/selftests/pidfd/.gitignore
index 144e7ff65d6a..1981d39fe3dc 100644
--- a/tools/testing/selftests/pidfd/.gitignore
+++ b/tools/testing/selftests/pidfd/.gitignore
@@ -12,3 +12,4 @@ pidfd_info_test
 pidfd_exec_helper
 pidfd_xattr_test
 pidfd_setattr_test
+pidfd_inode_owner_test
diff --git a/tools/testing/selftests/pidfd/Makefile b/tools/testing/selftests/pidfd/Makefile
index 764a8f9ecefa..904c9fd595c1 100644
--- a/tools/testing/selftests/pidfd/Makefile
+++ b/tools/testing/selftests/pidfd/Makefile
@@ -4,7 +4,7 @@ CFLAGS += -g $(KHDR_INCLUDES) $(TOOLS_INCLUDES) -pthread -Wall
 TEST_GEN_PROGS := pidfd_test pidfd_fdinfo_test pidfd_open_test \
 	pidfd_poll_test pidfd_wait pidfd_getfd_test pidfd_setns_test \
 	pidfd_file_handle_test pidfd_bind_mount pidfd_info_test \
-	pidfd_xattr_test pidfd_setattr_test
+	pidfd_xattr_test pidfd_setattr_test pidfd_inode_owner_test
 
 TEST_GEN_PROGS_EXTENDED := pidfd_exec_helper
 
diff --git a/tools/testing/selftests/pidfd/pidfd_inode_owner_test.c b/tools/testing/selftests/pidfd/pidfd_inode_owner_test.c
new file mode 100644
index 000000000000..0c15d0ccaafc
--- /dev/null
+++ b/tools/testing/selftests/pidfd/pidfd_inode_owner_test.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "pidfd.h"
+#include "kselftest_harness.h"
+
+FIXTURE(pidfs_inode_owner)
+{
+	pid_t child_pid;
+	int child_pidfd;
+};
+
+FIXTURE_SETUP(pidfs_inode_owner)
+{
+	int pipe_fds[2];
+	char buf;
+
+	self->child_pid = -1;
+	self->child_pidfd = -1;
+
+	ASSERT_EQ(pipe(pipe_fds), 0);
+
+	self->child_pid = create_child(&self->child_pidfd, 0);
+	ASSERT_GE(self->child_pid, 0);
+
+	if (self->child_pid == 0) {
+		close(pipe_fds[0]);
+		write_nointr(pipe_fds[1], "c", 1);
+		close(pipe_fds[1]);
+		pause();
+		_exit(EXIT_SUCCESS);
+	}
+
+	close(pipe_fds[1]);
+	ASSERT_EQ(read_nointr(pipe_fds[0], &buf, 1), 1);
+	close(pipe_fds[0]);
+}
+
+FIXTURE_TEARDOWN(pidfs_inode_owner)
+{
+	if (self->child_pid > 0) {
+		kill(self->child_pid, SIGKILL);
+		sys_waitid(P_PID, self->child_pid, NULL, WEXITED);
+	}
+	if (self->child_pidfd >= 0)
+		close(self->child_pidfd);
+}
+
+/* Own pidfd reports correct ownership. */
+TEST_F(pidfs_inode_owner, owner_self)
+{
+	int pidfd;
+	struct stat st;
+
+	pidfd = sys_pidfd_open(getpid(), 0);
+	ASSERT_GE(pidfd, 0);
+
+	ASSERT_EQ(fstat(pidfd, &st), 0);
+	EXPECT_EQ(st.st_uid, getuid());
+	EXPECT_EQ(st.st_gid, getgid());
+
+	close(pidfd);
+}
+
+/* Child pidfd reports correct ownership. */
+TEST_F(pidfs_inode_owner, owner_child)
+{
+	struct stat st;
+
+	ASSERT_EQ(fstat(self->child_pidfd, &st), 0);
+	EXPECT_EQ(st.st_uid, getuid());
+	EXPECT_EQ(st.st_gid, getgid());
+}
+
+/* Ownership tracks credential changes in a live task. */
+TEST_F(pidfs_inode_owner, owner_child_changed_uid)
+{
+	pid_t pid;
+	int pidfd, pipe_fds[2];
+	struct stat st;
+	char buf;
+
+	if (getuid() != 0)
+		SKIP(return, "Test requires root");
+
+	ASSERT_EQ(pipe(pipe_fds), 0);
+
+	pid = create_child(&pidfd, 0);
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		close(pipe_fds[0]);
+		if (setresgid(65534, 65534, 65534))
+			_exit(PIDFD_ERROR);
+		if (setresuid(65534, 65534, 65534))
+			_exit(PIDFD_ERROR);
+		write_nointr(pipe_fds[1], "c", 1);
+		close(pipe_fds[1]);
+		pause();
+		_exit(EXIT_SUCCESS);
+	}
+
+	close(pipe_fds[1]);
+	ASSERT_EQ(read_nointr(pipe_fds[0], &buf, 1), 1);
+	close(pipe_fds[0]);
+
+	ASSERT_EQ(fstat(pidfd, &st), 0);
+	EXPECT_EQ(st.st_uid, (uid_t)65534);
+	EXPECT_EQ(st.st_gid, (gid_t)65534);
+
+	kill(pid, SIGKILL);
+	sys_waitid(P_PID, pid, NULL, WEXITED);
+	close(pidfd);
+}
+
+/* Ownership persists after the child exits and is reaped. */
+TEST_F(pidfs_inode_owner, owner_exited_child)
+{
+	pid_t pid;
+	int pidfd;
+	struct stat st;
+
+	pid = create_child(&pidfd, 0);
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0)
+		_exit(EXIT_SUCCESS);
+
+	ASSERT_EQ(sys_waitid(P_PID, pid, NULL, WEXITED), 0);
+
+	ASSERT_EQ(fstat(pidfd, &st), 0);
+	EXPECT_EQ(st.st_uid, getuid());
+	EXPECT_EQ(st.st_gid, getgid());
+
+	close(pidfd);
+}
+
+/* Exit credentials preserve changed credentials. */
+TEST_F(pidfs_inode_owner, owner_exited_child_changed_uid)
+{
+	pid_t pid;
+	int pidfd;
+	struct stat st;
+
+	if (getuid() != 0)
+		SKIP(return, "Test requires root");
+
+	pid = create_child(&pidfd, 0);
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		if (setresgid(65534, 65534, 65534))
+			_exit(PIDFD_ERROR);
+		if (setresuid(65534, 65534, 65534))
+			_exit(PIDFD_ERROR);
+		_exit(EXIT_SUCCESS);
+	}
+
+	ASSERT_EQ(sys_waitid(P_PID, pid, NULL, WEXITED), 0);
+
+	ASSERT_EQ(fstat(pidfd, &st), 0);
+	EXPECT_EQ(st.st_uid, (uid_t)65534);
+	EXPECT_EQ(st.st_gid, (gid_t)65534);
+
+	close(pidfd);
+}
+
+/* Same-user cross-process permission check succeeds. */
+TEST_F(pidfs_inode_owner, permission_same_user)
+{
+	pid_t pid;
+	int pidfd;
+	pid_t parent_pid = getpid();
+
+	pid = create_child(&pidfd, 0);
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		int fd;
+		char buf;
+
+		fd = sys_pidfd_open(parent_pid, 0);
+		if (fd < 0)
+			_exit(PIDFD_ERROR);
+
+		/*
+		 * user.* xattr access triggers pidfs_permission().
+		 * Same user's FSUID matches target's RUID, so
+		 * generic_permission() passes and we get EOPNOTSUPP
+		 * (no user.* xattr handler) instead of EACCES.
+		 */
+		if (fgetxattr(fd, "user.test", &buf, sizeof(buf)) < 0 &&
+		    errno == EOPNOTSUPP) {
+			close(fd);
+			_exit(PIDFD_PASS);
+		}
+
+		close(fd);
+		_exit(PIDFD_FAIL);
+	}
+
+	ASSERT_EQ(wait_for_pid(pid), PIDFD_PASS);
+	close(pidfd);
+}
+
+/* Cross-user access is denied when FSUID doesn't match target's RUID. */
+TEST_F(pidfs_inode_owner, permission_different_user_denied)
+{
+	pid_t pid;
+	int pidfd;
+
+	if (getuid() != 0)
+		SKIP(return, "Test requires root");
+
+	pid = create_child(&pidfd, 0);
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		int fd;
+		struct stat init_st;
+		char buf;
+
+		/* Open pidfd for init (uid 0). */
+		fd = sys_pidfd_open(1, 0);
+		if (fd < 0)
+			_exit(PIDFD_ERROR);
+
+		/* Verify init is actually uid 0 (may not be in all namespaces). */
+		if (fstat(fd, &init_st) || init_st.st_uid != 0) {
+			close(fd);
+			_exit(PIDFD_SKIP);
+		}
+
+		/* Drop to uid/gid 65534 and lose all capabilities. */
+		if (setresgid(65534, 65534, 65534))
+			_exit(PIDFD_ERROR);
+		if (setresuid(65534, 65534, 65534))
+			_exit(PIDFD_ERROR);
+
+		/*
+		 * FSUID 65534 doesn't match target's RUID 0, and
+		 * no CAP_DAC_OVERRIDE, so generic_permission()
+		 * returns -EACCES.
+		 */
+		if (fgetxattr(fd, "user.test", &buf, sizeof(buf)) < 0 &&
+		    errno == EACCES) {
+			close(fd);
+			_exit(PIDFD_PASS);
+		}
+
+		close(fd);
+		_exit(PIDFD_FAIL);
+	}
+
+	{
+		int ret = wait_for_pid(pid);
+		if (ret == PIDFD_SKIP)
+			SKIP(goto out, "pid 1 is not uid 0 (not in init PID namespace?)");
+		ASSERT_EQ(ret, PIDFD_PASS);
+	}
+out:
+	close(pidfd);
+}
+
+/* Kernel thread pidfd reports root ownership. */
+TEST_F(pidfs_inode_owner, owner_kthread)
+{
+	int pidfd;
+	struct stat st;
+	char comm[16] = {};
+	FILE *f;
+
+	/*
+	 * pid 2 is kthreadd only in the init PID namespace.
+	 * Skip if we're in a different PID namespace.
+	 */
+	f = fopen("/proc/2/comm", "r");
+	if (!f)
+		SKIP(return, "Cannot read /proc/2/comm");
+	if (!fgets(comm, sizeof(comm), f)) {
+		fclose(f);
+		SKIP(return, "Cannot read /proc/2/comm");
+	}
+	fclose(f);
+	comm[strcspn(comm, "\n")] = '\0';
+	if (strcmp(comm, "kthreadd") != 0)
+		SKIP(return, "pid 2 is not kthreadd (not in init PID namespace?)");
+
+	pidfd = sys_pidfd_open(2, 0);
+	ASSERT_GE(pidfd, 0);
+
+	ASSERT_EQ(fstat(pidfd, &st), 0);
+	EXPECT_EQ(st.st_uid, (uid_t)0);
+	EXPECT_EQ(st.st_gid, (gid_t)0);
+
+	close(pidfd);
+}
+
+TEST_HARNESS_MAIN
-- 
2.47.3